Discrete non-parametric kernel estimation for global sensitivity analysis
International Nuclear Information System (INIS)
Senga Kiessé, Tristan; Ventura, Anne
2016-01-01
This work investigates the discrete kernel approach for evaluating the contribution of the variance of discrete input variables to the variance of model output, via analysis of variance (ANOVA) decomposition. Until recently only the continuous kernel approach has been applied as a metamodeling approach within sensitivity analysis framework, for both discrete and continuous input variables. Now the discrete kernel estimation is known to be suitable for smoothing discrete functions. We present a discrete non-parametric kernel estimator of ANOVA decomposition of a given model. An estimator of sensitivity indices is also presented with its asymtotic convergence rate. Some simulations on a test function analysis and a real case study from agricultural have shown that the discrete kernel approach outperforms the continuous kernel one for evaluating the contribution of moderate or most influential discrete parameters to the model output. - Highlights: • We study a discrete kernel estimation for sensitivity analysis of a model. • A discrete kernel estimator of ANOVA decomposition of the model is presented. • Sensitivity indices are calculated for discrete input parameters. • An estimator of sensitivity indices is also presented with its convergence rate. • An application is realized for improving the reliability of environmental models.
Kernel bandwidth estimation for non-parametric density estimation: a comparative study
CSIR Research Space (South Africa)
Van der Walt, CM
2013-12-01
Full Text Available We investigate the performance of conventional bandwidth estimators for non-parametric kernel density estimation on a number of representative pattern-recognition tasks, to gain a better understanding of the behaviour of these estimators in high...
Ambrogioni, Luca; Güçlü, Umut; van Gerven, Marcel A. J.; Maris, Eric
2017-01-01
This paper introduces the kernel mixture network, a new method for nonparametric estimation of conditional probability densities using neural networks. We model arbitrarily complex conditional densities as linear combinations of a family of kernel functions centered at a subset of training points. The weights are determined by the outer layer of a deep neural network, trained by minimizing the negative log likelihood. This generalizes the popular quantized softmax approach, which can be seen ...
Panel data specifications in nonparametric kernel regression
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...
Low default credit scoring using two-class non-parametric kernel density estimation
CSIR Research Space (South Africa)
Rademeyer, E
2016-12-01
Full Text Available This paper investigates the performance of two-class classification credit scoring data sets with low default ratios. The standard two-class parametric Gaussian and non-parametric Parzen classifiers are extended, using Bayes’ rule, to include either...
The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard
and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...... within a nonparametric panel data regression framework. The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric......This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...
Consistent Estimation of Pricing Kernels from Noisy Price Data
Vladislav Kargin
2003-01-01
If pricing kernels are assumed non-negative then the inverse problem of finding the pricing kernel is well-posed. The constrained least squares method provides a consistent estimate of the pricing kernel. When the data are limited, a new method is suggested: relaxed maximization of the relative entropy. This estimator is also consistent. Keywords: $\\epsilon$-entropy, non-parametric estimation, pricing kernel, inverse problems.
Variable Kernel Density Estimation
Terrell, George R.; Scott, David W.
1992-01-01
We investigate some of the possibilities for improvement of univariate and multivariate kernel density estimates by varying the window over the domain of estimation, pointwise and globally. Two general approaches are to vary the window width by the point of estimation and by point of the sample observation. The first possibility is shown to be of little efficacy in one variable. In particular, nearest-neighbor estimators in all versions perform poorly in one and two dimensions, but begin to b...
Nonparametric e-Mixture Estimation.
Takano, Ken; Hino, Hideitsu; Akaho, Shotaro; Murata, Noboru
2016-12-01
This study considers the common situation in data analysis when there are few observations of the distribution of interest or the target distribution, while abundant observations are available from auxiliary distributions. In this situation, it is natural to compensate for the lack of data from the target distribution by using data sets from these auxiliary distributions-in other words, approximating the target distribution in a subspace spanned by a set of auxiliary distributions. Mixture modeling is one of the simplest ways to integrate information from the target and auxiliary distributions in order to express the target distribution as accurately as possible. There are two typical mixtures in the context of information geometry: the [Formula: see text]- and [Formula: see text]-mixtures. The [Formula: see text]-mixture is applied in a variety of research fields because of the presence of the well-known expectation-maximazation algorithm for parameter estimation, whereas the [Formula: see text]-mixture is rarely used because of its difficulty of estimation, particularly for nonparametric models. The [Formula: see text]-mixture, however, is a well-tempered distribution that satisfies the principle of maximum entropy. To model a target distribution with scarce observations accurately, this letter proposes a novel framework for a nonparametric modeling of the [Formula: see text]-mixture and a geometrically inspired estimation algorithm. As numerical examples of the proposed framework, a transfer learning setup is considered. The experimental results show that this framework works well for three types of synthetic data sets, as well as an EEG real-world data set.
Global Polynomial Kernel Hazard Estimation
DEFF Research Database (Denmark)
Hiabu, Munir; Miranda, Maria Dolores Martínez; Nielsen, Jens Perch
2015-01-01
This paper introduces a new bias reducing method for kernel hazard estimation. The method is called global polynomial adjustment (GPA). It is a global correction which is applicable to any kernel hazard estimator. The estimator works well from a theoretical point of view as it asymptotically redu...
On Improving Convergence Rates for Nonnegative Kernel Density Estimators
Terrell, George R.; Scott, David W.
1980-01-01
To improve the rate of decrease of integrated mean square error for nonparametric kernel density estimators beyond $0(n^{-\\frac{4}{5}}),$ we must relax the constraint that the density estimate be a bonafide density function, that is, be nonnegative and integrate to one. All current methods for kernel (and orthogonal series) estimators relax the nonnegativity constraint. In this paper we show how to achieve similar improvement by relaxing the integral constraint only. This is important in appl...
Nonparametric methods for volatility density estimation
Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.
2009-01-01
Stochastic volatility modelling of financial processes has become increasingly popular. The proposed models usually contain a stationary volatility process. We will motivate and review several nonparametric methods for estimation of the density of the volatility process. Both models based on
The Kernel Estimation in Biosystems Engineering
Directory of Open Access Journals (Sweden)
Esperanza Ayuga Téllez
2008-04-01
Full Text Available In many fields of biosystems engineering, it is common to find works in which statistical information is analysed that violates the basic hypotheses necessary for the conventional forecasting methods. For those situations, it is necessary to find alternative methods that allow the statistical analysis considering those infringements. Non-parametric function estimation includes methods that fit a target function locally, using data from a small neighbourhood of the point. Weak assumptions, such as continuity and differentiability of the target function, are rather used than "a priori" assumption of the global target function shape (e.g., linear or quadratic. In this paper a few basic rules of decision are enunciated, for the application of the non-parametric estimation method. These statistical rules set up the first step to build an interface usermethod for the consistent application of kernel estimation for not expert users. To reach this aim, univariate and multivariate estimation methods and density function were analysed, as well as regression estimators. In some cases the models to be applied in different situations, based on simulations, were defined. Different biosystems engineering applications of the kernel estimation are also analysed in this review.
Nonparametric Regression Estimation for Multivariate Null Recurrent Processes
Directory of Open Access Journals (Sweden)
Biqing Cai
2015-04-01
Full Text Available This paper discusses nonparametric kernel regression with the regressor being a \\(d\\-dimensional \\(\\beta\\-null recurrent process in presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate \\(\\sqrt{n(Th^{d}}\\, where \\(n(T\\ is the number of regenerations for a \\(\\beta\\-null recurrent process and the limiting distribution (with proper normalization is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity of the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than the linear model.
Non-Parametric Estimation of Correlation Functions
DEFF Research Database (Denmark)
Brincker, Rune; Rytter, Anders; Krenk, Steen
In this paper three methods of non-parametric correlation function estimation are reviewed and evaluated: the direct method, estimation by the Fast Fourier Transform and finally estimation by the Random Decrement technique. The basic ideas of the techniques are reviewed, sources of bias are point...
Nonparametric estimation in models for unobservable heterogeneity
Hohmann, Daniel
2014-01-01
Nonparametric models which allow for data with unobservable heterogeneity are studied. The first publication introduces new estimators and their asymptotic properties for conditional mixture models. The second publication considers estimation of a function from noisy observations of its Radon transform in a Gaussian white noise model.
Adaptive Nonparametric Variance Estimation for a Ratio Estimator ...
African Journals Online (AJOL)
Kernel estimators for smooth curves require modifications when estimating near end points of the support, both for practical and asymptotic reasons. The construction of such boundary kernels as solutions of variational problem is a difficult exercise. For estimating the error variance of a ratio estimator, we suggest an ...
Nonparametric estimation of location and scale parameters
Potgieter, C.J.; Lombard, F.
2012-01-01
Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal
Genomic breeding value estimation using nonparametric additive regression models
Directory of Open Access Journals (Sweden)
Solberg Trygve
2009-01-01
Full Text Available Abstract Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped was predicted using data from the next last generation (genotyped and phenotyped. The results show moderate to high accuracies of the predicted breeding values. A determination of predictor specific degree of smoothing increased the accuracy.
Nonparametric estimation of location and scale parameters
Potgieter, C.J.
2012-12-01
Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal assumptions regarding the form of the distribution functions of X and Y. We discuss an approach to the estimation problem that is based on asymptotic likelihood considerations. Our results enable us to provide a methodology that can be implemented easily and which yields estimators that are often near optimal when compared to fully parametric methods. We evaluate the performance of the estimators in a series of Monte Carlo simulations. © 2012 Elsevier B.V. All rights reserved.
Adaptive nonparametric estimation for L\\'evy processes observed at low frequency
Kappus, Johanna
2013-01-01
This article deals with adaptive nonparametric estimation for L\\'evy processes observed at low frequency. For general linear functionals of the L\\'evy measure, we construct kernel estimators, provide upper risk bounds and derive rates of convergence under regularity assumptions. Our focus lies on the adaptive choice of the bandwidth, using model selection techniques. We face here a non-standard problem of model selection with unknown variance. A new approach towards this problem is proposed, ...
Nonparametric Bayesian density estimation on manifolds with applications to planar shapes.
Bhattacharya, Abhishek; Dunson, David B
2010-12-01
Statistical analysis on landmark-based shape spaces has diverse applications in morphometrics, medical diagnostics, machine vision and other areas. These shape spaces are non-Euclidean quotient manifolds. To conduct nonparametric inferences, one may define notions of centre and spread on this manifold and work with their estimates. However, it is useful to consider full likelihood-based methods, which allow nonparametric estimation of the probability density. This article proposes a broad class of mixture models constructed using suitable kernels on a general compact metric space and then on the planar shape space in particular. Following a Bayesian approach with a nonparametric prior on the mixing distribution, conditions are obtained under which the Kullback-Leibler property holds, implying large support and weak posterior consistency. Gibbs sampling methods are developed for posterior computation, and the methods are applied to problems in density estimation and classification with shape-based predictors. Simulation studies show improved estimation performance relative to existing approaches.
portfolio optimization based on nonparametric estimation methods
Directory of Open Access Journals (Sweden)
mahsa ghandehari
2017-03-01
Full Text Available One of the major issues investors are facing with in capital markets is decision making about select an appropriate stock exchange for investing and selecting an optimal portfolio. This process is done through the risk and expected return assessment. On the other hand in portfolio selection problem if the assets expected returns are normally distributed, variance and standard deviation are used as a risk measure. But, the expected returns on assets are not necessarily normal and sometimes have dramatic differences from normal distribution. This paper with the introduction of conditional value at risk ( CVaR, as a measure of risk in a nonparametric framework, for a given expected return, offers the optimal portfolio and this method is compared with the linear programming method. The data used in this study consists of monthly returns of 15 companies selected from the top 50 companies in Tehran Stock Exchange during the winter of 1392 which is considered from April of 1388 to June of 1393. The results of this study show the superiority of nonparametric method over the linear programming method and the nonparametric method is much faster than the linear programming method.
Nonparametric Inference of Doubly Stochastic Poisson Process Data via the Kernel Method.
Zhang, Tingting; Kou, S C
2010-01-01
Doubly stochastic Poisson processes, also known as the Cox processes, frequently occur in various scientific fields. In this article, motivated primarily by analyzing Cox process data in biophysics, we propose a nonparametric kernel-based inference method. We conduct a detailed study, including an asymptotic analysis, of the proposed method, and provide guidelines for its practical use, introducing a fast and stable regression method for bandwidth selection. We apply our method to real photon arrival data from recent single-molecule biophysical experiments, investigating proteins' conformational dynamics. Our result shows that conformational fluctuation is widely present in protein systems, and that the fluctuation covers a broad range of time scales, highlighting the dynamic and complex nature of proteins' structure.
A nonparametric mixture model for cure rate estimation.
Peng, Y; Dear, K B
2000-03-01
Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.
Optimal Bandwidth Selection for Kernel Density Functionals Estimation
Directory of Open Access Journals (Sweden)
Su Chen
2015-01-01
Full Text Available The choice of bandwidth is crucial to the kernel density estimation (KDE and kernel based regression. Various bandwidth selection methods for KDE and local least square regression have been developed in the past decade. It has been known that scale and location parameters are proportional to density functionals ∫γ(xf2(xdx with appropriate choice of γ(x and furthermore equality of scale and location tests can be transformed to comparisons of the density functionals among populations. ∫γ(xf2(xdx can be estimated nonparametrically via kernel density functionals estimation (KDFE. However, the optimal bandwidth selection for KDFE of ∫γ(xf2(xdx has not been examined. We propose a method to select the optimal bandwidth for the KDFE. The idea underlying this method is to search for the optimal bandwidth by minimizing the mean square error (MSE of the KDFE. Two main practical bandwidth selection techniques for the KDFE of ∫γ(xf2(xdx are provided: Normal scale bandwidth selection (namely, “Rule of Thumb” and direct plug-in bandwidth selection. Simulation studies display that our proposed bandwidth selection methods are superior to existing density estimation bandwidth selection methods in estimating density functionals.
Panel data nonparametric estimation of production risk and risk preferences
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
approaches for obtaining firm-specific measures of risk attitudes. We found that Polish dairy farmers are risk averse regarding production risk and price uncertainty. According to our results, Polish dairy farmers perceive the production risk as being more significant than the risk related to output price......We apply nonparametric panel data kernel regression to investigate production risk, out-put price uncertainty, and risk attitudes of Polish dairy farms based on a firm-level unbalanced panel data set that covers the period 2004–2010. We compare different model specifications and different...
Nonparametric evaluation of dynamic disease risk: a spatio-temporal kernel approach.
Directory of Open Access Journals (Sweden)
Zhijie Zhang
Full Text Available Quantifying the distributions of disease risk in space and time jointly is a key element for understanding spatio-temporal phenomena while also having the potential to enhance our understanding of epidemiologic trajectories. However, most studies to date have neglected time dimension and focus instead on the "average" spatial pattern of disease risk, thereby masking time trajectories of disease risk. In this study we propose a new idea titled "spatio-temporal kernel density estimation (stKDE" that employs hybrid kernel (i.e., weight functions to evaluate the spatio-temporal disease risks. This approach not only can make full use of sample data but also "borrows" information in a particular manner from neighboring points both in space and time via appropriate choice of kernel functions. Monte Carlo simulations show that the proposed method performs substantially better than the traditional (i.e., frequency-based kernel density estimation (trKDE which has been used in applied settings while two illustrative examples demonstrate that the proposed approach can yield superior results compared to the popular trKDE approach. In addition, there exist various possibilities for improving and extending this method.
Nonparametric NAR-ARCH Modelling of Stock Prices by the Kernel Methodology
Directory of Open Access Journals (Sweden)
Mohamed Chikhi
2018-02-01
Full Text Available This paper analyses cyclical behaviour of Orange stock price listed in French stock exchange over 01/03/2000 to 02/02/2017 by testing the nonlinearities through a class of conditional heteroscedastic nonparametric models. The linearity and Gaussianity assumptions are rejected for Orange Stock returns and informational shocks have transitory effects on returns and volatility. The forecasting results show that Orange stock prices are short-term predictable and nonparametric NAR-ARCH model has better performance over parametric MA-APARCH model for short horizons. Plus, the estimates of this model are also better comparing to the predictions of the random walk model. This finding provides evidence for weak form of inefficiency in Paris stock market with limited rationality, thus it emerges arbitrage opportunities.
A survey of kernel-type estimators for copula and their applications
Sumarjaya, I. W.
2017-10-01
Copulas have been widely used to model nonlinear dependence structure. Main applications of copulas include areas such as finance, insurance, hydrology, rainfall to name but a few. The flexibility of copula allows researchers to model dependence structure beyond Gaussian distribution. Basically, a copula is a function that couples multivariate distribution functions to their one-dimensional marginal distribution functions. In general, there are three methods to estimate copula. These are parametric, nonparametric, and semiparametric method. In this article we survey kernel-type estimators for copula such as mirror reflection kernel, beta kernel, transformation method and local likelihood transformation method. Then, we apply these kernel methods to three stock indexes in Asia. The results of our analysis suggest that, albeit variation in information criterion values, the local likelihood transformation method performs better than the other kernel methods.
Nonparametric Collective Spectral Density Estimation and Clustering
Maadooliat, Mehdi
2017-04-12
In this paper, we develop a method for the simultaneous estimation of spectral density functions (SDFs) for a collection of stationary time series that share some common features. Due to the similarities among the SDFs, the log-SDF can be represented using a common set of basis functions. The basis shared by the collection of the log-SDFs is estimated as a low-dimensional manifold of a large space spanned by a pre-specified rich basis. A collective estimation approach pools information and borrows strength across the SDFs to achieve better estimation efficiency. Also, each estimated spectral density has a concise representation using the coefficients of the basis expansion, and these coefficients can be used for visualization, clustering, and classification purposes. The Whittle pseudo-maximum likelihood approach is used to fit the model and an alternating blockwise Newton-type algorithm is developed for the computation. A web-based shiny App found at
Nonparametric Collective Spectral Density Estimation and Clustering
Maadooliat, Mehdi; Sun, Ying; Chen, Tianbo
2017-01-01
In this paper, we develop a method for the simultaneous estimation of spectral density functions (SDFs) for a collection of stationary time series that share some common features. Due to the similarities among the SDFs, the log-SDF can be represented using a common set of basis functions. The basis shared by the collection of the log-SDFs is estimated as a low-dimensional manifold of a large space spanned by a pre-specified rich basis. A collective estimation approach pools information and borrows strength across the SDFs to achieve better estimation efficiency. Also, each estimated spectral density has a concise representation using the coefficients of the basis expansion, and these coefficients can be used for visualization, clustering, and classification purposes. The Whittle pseudo-maximum likelihood approach is used to fit the model and an alternating blockwise Newton-type algorithm is developed for the computation. A web-based shiny App found at
Variable kernel density estimation in high-dimensional feature spaces
CSIR Research Space (South Africa)
Van der Walt, Christiaan M
2017-02-01
Full Text Available Estimating the joint probability density function of a dataset is a central task in many machine learning applications. In this work we address the fundamental problem of kernel bandwidth estimation for variable kernel density estimation in high...
High throughput nonparametric probability density estimation.
Farmer, Jenny; Jacobs, Donald
2018-01-01
In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under and over fitting data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference.
Nonparametric volatility density estimation for discrete time models
Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.
2005-01-01
We consider discrete time models for asset prices with a stationary volatility process. We aim at estimating the multivariate density of this process at a set of consecutive time instants. A Fourier-type deconvolution kernel density estimator based on the logarithm of the squared process is proposed
On the robust nonparametric regression estimation for a functional regressor
Azzedine , Nadjia; Laksaci , Ali; Ould-Saïd , Elias
2009-01-01
On the robust nonparametric regression estimation for a functional regressor correspondance: Corresponding author. (Ould-Said, Elias) (Azzedine, Nadjia) (Laksaci, Ali) (Ould-Said, Elias) Departement de Mathematiques--> , Univ. Djillali Liabes--> , BP 89--> , 22000 Sidi Bel Abbes--> - ALGERIA (Azzedine, Nadjia) Departement de Mathema...
Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.
Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen
2011-04-01
Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.
Estimation of Stochastic Volatility Models by Nonparametric Filtering
DEFF Research Database (Denmark)
Kanaya, Shin; Kristensen, Dennis
2016-01-01
/estimated volatility process replacing the latent process. Our estimation strategy is applicable to both parametric and nonparametric stochastic volatility models, and can handle both jumps and market microstructure noise. The resulting estimators of the stochastic volatility model will carry additional biases...... and variances due to the first-step estimation, but under regularity conditions we show that these vanish asymptotically and our estimators inherit the asymptotic properties of the infeasible estimators based on observations of the volatility process. A simulation study examines the finite-sample properties...
Investigation of MLE in nonparametric estimation methods of reliability function
International Nuclear Information System (INIS)
Ahn, Kwang Won; Kim, Yoon Ik; Chung, Chang Hyun; Kim, Kil Yoo
2001-01-01
There have been lots of trials to estimate a reliability function. In the ESReDA 20 th seminar, a new method in nonparametric way was proposed. The major point of that paper is how to use censored data efficiently. Generally there are three kinds of approach to estimate a reliability function in nonparametric way, i.e., Reduced Sample Method, Actuarial Method and Product-Limit (PL) Method. The above three methods have some limits. So we suggest an advanced method that reflects censored information more efficiently. In many instances there will be a unique maximum likelihood estimator (MLE) of an unknown parameter, and often it may be obtained by the process of differentiation. It is well known that the three methods generally used to estimate a reliability function in nonparametric way have maximum likelihood estimators that are uniquely exist. So, MLE of the new method is derived in this study. The procedure to calculate a MLE is similar just like that of PL-estimator. The difference of the two is that in the new method, the mass (or weight) of each has an influence of the others but the mass in PL-estimator not
Bhattacharya, Abhishek; Dunson, David B
2012-08-01
This article considers a broad class of kernel mixture density models on compact metric spaces and manifolds. Following a Bayesian approach with a nonparametric prior on the location mixing distribution, sufficient conditions are obtained on the kernel, prior and the underlying space for strong posterior consistency at any continuous density. The prior is also allowed to depend on the sample size n and sufficient conditions are obtained for weak and strong consistency. These conditions are verified on compact Euclidean spaces using multivariate Gaussian kernels, on the hypersphere using a von Mises-Fisher kernel and on the planar shape space using complex Watson kernels.
Non-parametric estimation of the individual's utility map
Noguchi, Takao; Sanborn, Adam N.; Stewart, Neil
2013-01-01
Models of risky choice have attracted much attention in behavioural economics. Previous research has repeatedly demonstrated that individuals' choices are not well explained by expected utility theory, and a number of alternative models have been examined using carefully selected sets of choice alternatives. The model performance however, can depend on which choice alternatives are being tested. Here we develop a non-parametric method for estimating the utility map over the wide range of choi...
A non-parametric framework for estimating threshold limit values
Directory of Open Access Journals (Sweden)
Ulm Kurt
2005-11-01
Full Text Available Abstract Background To estimate a threshold limit value for a compound known to have harmful health effects, an 'elbow' threshold model is usually applied. We are interested on non-parametric flexible alternatives. Methods We describe how a step function model fitted by isotonic regression can be used to estimate threshold limit values. This method returns a set of candidate locations, and we discuss two algorithms to select the threshold among them: the reduced isotonic regression and an algorithm considering the closed family of hypotheses. We assess the performance of these two alternative approaches under different scenarios in a simulation study. We illustrate the framework by analysing the data from a study conducted by the German Research Foundation aiming to set a threshold limit value in the exposure to total dust at workplace, as a causal agent for developing chronic bronchitis. Results In the paper we demonstrate the use and the properties of the proposed methodology along with the results from an application. The method appears to detect the threshold with satisfactory success. However, its performance can be compromised by the low power to reject the constant risk assumption when the true dose-response relationship is weak. Conclusion The estimation of thresholds based on isotonic framework is conceptually simple and sufficiently powerful. Given that in threshold value estimation context there is not a gold standard method, the proposed model provides a useful non-parametric alternative to the standard approaches and can corroborate or challenge their findings.
A Bayesian nonparametric estimation of distributions and quantiles
International Nuclear Information System (INIS)
Poern, K.
1988-11-01
The report describes a Bayesian, nonparametric method for the estimation of a distribution function and its quantiles. The method, presupposing random sampling, is nonparametric, so the user has to specify a prior distribution on a space of distributions (and not on a parameter space). In the current application, where the method is used to estimate the uncertainty of a parametric calculational model, the Dirichlet prior distribution is to a large extent determined by the first batch of Monte Carlo-realizations. In this case the results of the estimation technique is very similar to the conventional empirical distribution function. The resulting posterior distribution is also Dirichlet, and thus facilitates the determination of probability (confidence) intervals at any given point in the space of interest. Another advantage is that also the posterior distribution of a specified quantitle can be derived and utilized to determine a probability interval for that quantile. The method was devised for use in the PROPER code package for uncertainty and sensitivity analysis. (orig.)
Nonparametric Estimation of Distributions in Random Effects Models
Hart, Jeffrey D.
2011-01-01
We propose using minimum distance to obtain nonparametric estimates of the distributions of components in random effects models. A main setting considered is equivalent to having a large number of small datasets whose locations, and perhaps scales, vary randomly, but which otherwise have a common distribution. Interest focuses on estimating the distribution that is common to all datasets, knowledge of which is crucial in multiple testing problems where a location/scale invariant test is applied to every small dataset. A detailed algorithm for computing minimum distance estimates is proposed, and the usefulness of our methodology is illustrated by a simulation study and an analysis of microarray data. Supplemental materials for the article, including R-code and a dataset, are available online. © 2011 American Statistical Association.
Nonparametric autocovariance estimation from censored time series by Gaussian imputation.
Park, Jung Wook; Genton, Marc G; Ghosh, Sujit K
2009-02-01
One of the most frequently used methods to model the autocovariance function of a second-order stationary time series is to use the parametric framework of autoregressive and moving average models developed by Box and Jenkins. However, such parametric models, though very flexible, may not always be adequate to model autocovariance functions with sharp changes. Furthermore, if the data do not follow the parametric model and are censored at a certain value, the estimation results may not be reliable. We develop a Gaussian imputation method to estimate an autocovariance structure via nonparametric estimation of the autocovariance function in order to address both censoring and incorrect model specification. We demonstrate the effectiveness of the technique in terms of bias and efficiency with simulations under various rates of censoring and underlying models. We describe its application to a time series of silicon concentrations in the Arctic.
Nonparametric estimation of benchmark doses in environmental risk assessment
Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen
2013-01-01
Summary An important statistical objective in environmental risk analysis is estimation of minimum exposure levels, called benchmark doses (BMDs), that induce a pre-specified benchmark response in a dose-response experiment. In such settings, representations of the risk are traditionally based on a parametric dose-response model. It is a well-known concern, however, that if the chosen parametric form is misspecified, inaccurate and possibly unsafe low-dose inferences can result. We apply a nonparametric approach for calculating benchmark doses, based on an isotonic regression method for dose-response estimation with quantal-response data (Bhattacharya and Kong, 2007). We determine the large-sample properties of the estimator, develop bootstrap-based confidence limits on the BMDs, and explore the confidence limits’ small-sample properties via a short simulation study. An example from cancer risk assessment illustrates the calculations. PMID:23914133
Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines
Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.
2011-01-01
Summary Background Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications. PMID:21915433
Nonparametric estimation of stochastic differential equations with sparse Gaussian processes.
García, Constantino A; Otero, Abraham; Félix, Paulo; Presedo, Jesús; Márquez, David G
2017-08-01
The application of stochastic differential equations (SDEs) to the analysis of temporal data has attracted increasing attention, due to their ability to describe complex dynamics with physically interpretable equations. In this paper, we introduce a nonparametric method for estimating the drift and diffusion terms of SDEs from a densely observed discrete time series. The use of Gaussian processes as priors permits working directly in a function-space view and thus the inference takes place directly in this space. To cope with the computational complexity that requires the use of Gaussian processes, a sparse Gaussian process approximation is provided. This approximation permits the efficient computation of predictions for the drift and diffusion terms by using a distribution over a small subset of pseudosamples. The proposed method has been validated using both simulated data and real data from economy and paleoclimatology. The application of the method to real data demonstrates its ability to capture the behavior of complex systems.
Bayesian Nonparametric Model for Estimating Multistate Travel Time Distribution
Directory of Open Access Journals (Sweden)
Emmanuel Kidando
2017-01-01
Full Text Available Multistate models, that is, models with more than two distributions, are preferred over single-state probability models in modeling the distribution of travel time. Literature review indicated that the finite multistate modeling of travel time using lognormal distribution is superior to other probability functions. In this study, we extend the finite multistate lognormal model of estimating the travel time distribution to unbounded lognormal distribution. In particular, a nonparametric Dirichlet Process Mixture Model (DPMM with stick-breaking process representation was used. The strength of the DPMM is that it can choose the number of components dynamically as part of the algorithm during parameter estimation. To reduce computational complexity, the modeling process was limited to a maximum of six components. Then, the Markov Chain Monte Carlo (MCMC sampling technique was employed to estimate the parameters’ posterior distribution. Speed data from nine links of a freeway corridor, aggregated on a 5-minute basis, were used to calculate the corridor travel time. The results demonstrated that this model offers significant flexibility in modeling to account for complex mixture distributions of the travel time without specifying the number of components. The DPMM modeling further revealed that freeway travel time is characterized by multistate or single-state models depending on the inclusion of onset and offset of congestion periods.
Directory of Open Access Journals (Sweden)
Ignacio Santamaría
2008-04-01
Full Text Available This paper treats the identification of nonlinear systems that consist of a cascade of a linear channel and a nonlinearity, such as the well-known Wiener and Hammerstein systems. In particular, we follow a supervised identification approach that simultaneously identifies both parts of the nonlinear system. Given the correct restrictions on the identification problem, we show how kernel canonical correlation analysis (KCCA emerges as the logical solution to this problem. We then extend the proposed identification algorithm to an adaptive version allowing to deal with time-varying systems. In order to avoid overfitting problems, we discuss and compare three possible regularization techniques for both the batch and the adaptive versions of the proposed algorithm. Simulations are included to demonstrate the effectiveness of the presented algorithm.
Improved Variable Window Kernel Estimates of Probability Densities
Hall, Peter; Hu, Tien Chung; Marron, J. S.
1995-01-01
Variable window width kernel density estimators, with the width varying proportionally to the square root of the density, have been thought to have superior asymptotic properties. The rate of convergence has been claimed to be as good as those typical for higher-order kernels, which makes the variable width estimators more attractive because no adjustment is needed to handle the negativity usually entailed by the latter. However, in a recent paper, Terrell and Scott show that these results ca...
On convergence of kernel learning estimators
Norkin, V.I.; Keyzer, M.A.
2009-01-01
The paper studies convex stochastic optimization problems in a reproducing kernel Hilbert space (RKHS). The objective (risk) functional depends on functions from this RKHS and takes the form of a mathematical expectation (integral) of a nonnegative integrand (loss function) over a probability
DEFF Research Database (Denmark)
Effraimidis, Georgios; Dahl, Christian Møller
In this paper, we develop a fully nonparametric approach for the estimation of the cumulative incidence function with Missing At Random right-censored competing risks data. We obtain results on the pointwise asymptotic normality as well as the uniform convergence rate of the proposed nonparametric...
Directory of Open Access Journals (Sweden)
Luis García Domínguez
2006-03-01
Full Text Available ABSTRACTKernel nonparametric nonlinear autoregression was applied to measles data from the pre-vaccination era (1944-1966. A slowly sliding time window covered 20 overlapping segments of the series. In the case of data from Birmingham the order of the model was higher than 22 for all windows and the reconstructed noise free realizations were periodic with the most probable period being equal to 3 years, though values of 2, 4 and 6 years were also obtained.For London data 6 windows were with low orders (below 5. Low order noise free realizations were chaotic. The rest presented periodic solutions corresponding to 1, 2, and 3-years cycles. Our results are consistent with views about dynamical transitions among measles data. The method is reliable and puts practically no restrictions regarding data properties. We recommend its use for further exploration of epidemic data from different origin. RESUMENPROPIEDADES NO LINEALES DE DATOS EPIDEMIOLÓGICOS DE SARAMPIÓN EVALUADAS MEDIANTE UN ENFOQUE DE IDENTIFICACIÓN NO LINEAL POR NÚCLEOS.Se aplicó un método de auto-regresión no lineal por núcleos a datos de incidencia de sarampión correspondientes a la época previa a la vacunación (1944-1966. Una ventana de tiempo que se desplazaba lentamente cubrió 20 segmentos de serie temporal que se solapaban. En el caso de los datos correspondientes a Birmingham el orden del modelo era mayor de 22 para todas las ventanas y las realizaciones libres de ruido reconstruidas eran periódicas con la duración del periodo más probable igual a 3 años, aunque también se obtuvieron valores de 2, 4 y 6 años.Para los datos de Londres, se observaron 6 ventanas con órdenes inferiores a 5. Las realizaciones libres de ruido con órdenes bajos eran caóticas. El resto de las ventanas mostraron ciclos de 1, 2 y tres años. Nuestros resultados son concordantes con la idea de la presencia de transiciones de fase en series de sarampión. El método es confiable y no
Kernel PLS Estimation of Single-trial Event-related Potentials
Rosipal, Roman; Trejo, Leonard J.
2004-01-01
Nonlinear kernel partial least squaes (KPLS) regressior, is a novel smoothing approach to nonparametric regression curve fitting. We have developed a KPLS approach to the estimation of single-trial event related potentials (ERPs). For improved accuracy of estimation, we also developed a local KPLS method for situations in which there exists prior knowledge about the approximate latency of individual ERP components. To assess the utility of the KPLS approach, we compared non-local KPLS and local KPLS smoothing with other nonparametric signal processing and smoothing methods. In particular, we examined wavelet denoising, smoothing splines, and localized smoothing splines. We applied these methods to the estimation of simulated mixtures of human ERPs and ongoing electroencephalogram (EEG) activity using a dipole simulator (BESA). In this scenario we considered ongoing EEG to represent spatially and temporally correlated noise added to the ERPs. This simulation provided a reasonable but simplified model of real-world ERP measurements. For estimation of the simulated single-trial ERPs, local KPLS provided a level of accuracy that was comparable with or better than the other methods. We also applied the local KPLS method to the estimation of human ERPs recorded in an experiment on co,onitive fatigue. For these data, the local KPLS method provided a clear improvement in visualization of single-trial ERPs as well as their averages. The local KPLS method may serve as a new alternative to the estimation of single-trial ERPs and improvement of ERP averages.
Rigorous home range estimation with movement data: a new autocorrelated kernel density estimator.
Fleming, C H; Fagan, W F; Mueller, T; Olson, K A; Leimgruber, P; Calabrese, J M
2015-05-01
Quantifying animals' home ranges is a key problem in ecology and has important conservation and wildlife management applications. Kernel density estimation (KDE) is a workhorse technique for range delineation problems that is both statistically efficient and nonparametric. KDE assumes that the data are independent and identically distributed (IID). However, animal tracking data, which are routinely used as inputs to KDEs, are inherently autocorrelated and violate this key assumption. As we demonstrate, using realistically autocorrelated data in conventional KDEs results in grossly underestimated home ranges. We further show that the performance of conventional KDEs actually degrades as data quality improves, because autocorrelation strength increases as movement paths become more finely resolved. To remedy these flaws with the traditional KDE method, we derive an autocorrelated KDE (AKDE) from first principles to use autocorrelated data, making it perfectly suited for movement data sets. We illustrate the vastly improved performance of AKDE using analytical arguments, relocation data from Mongolian gazelles, and simulations based upon the gazelle's observed movement process. By yielding better minimum area estimates for threatened wildlife populations, we believe that future widespread use of AKDE will have significant impact on ecology and conservation biology.
Prior processes and their applications nonparametric Bayesian estimation
Phadia, Eswar G
2016-01-01
This book presents a systematic and comprehensive treatment of various prior processes that have been developed over the past four decades for dealing with Bayesian approach to solving selected nonparametric inference problems. This revised edition has been substantially expanded to reflect the current interest in this area. After an overview of different prior processes, it examines the now pre-eminent Dirichlet process and its variants including hierarchical processes, then addresses new processes such as dependent Dirichlet, local Dirichlet, time-varying and spatial processes, all of which exploit the countable mixture representation of the Dirichlet process. It subsequently discusses various neutral to right type processes, including gamma and extended gamma, beta and beta-Stacy processes, and then describes the Chinese Restaurant, Indian Buffet and infinite gamma-Poisson processes, which prove to be very useful in areas such as machine learning, information retrieval and featural modeling. Tailfree and P...
Directory of Open Access Journals (Sweden)
Shanshan Yang
Full Text Available Detection of dysphonia is useful for monitoring the progression of phonatory impairment for patients with Parkinson's disease (PD, and also helps assess the disease severity. This paper describes the statistical pattern analysis methods to study different vocal measurements of sustained phonations. The feature dimension reduction procedure was implemented by using the sequential forward selection (SFS and kernel principal component analysis (KPCA methods. Four selected vocal measures were projected by the KPCA onto the bivariate feature space, in which the class-conditional feature densities can be approximated with the nonparametric kernel density estimation technique. In the vocal pattern classification experiments, Fisher's linear discriminant analysis (FLDA was applied to perform the linear classification of voice records for healthy control subjects and PD patients, and the maximum a posteriori (MAP decision rule and support vector machine (SVM with radial basis function kernels were employed for the nonlinear classification tasks. Based on the KPCA-mapped feature densities, the MAP classifier successfully distinguished 91.8% voice records, with a sensitivity rate of 0.986, a specificity rate of 0.708, and an area value of 0.94 under the receiver operating characteristic (ROC curve. The diagnostic performance provided by the MAP classifier was superior to those of the FLDA and SVM classifiers. In addition, the classification results indicated that gender is insensitive to dysphonia detection, and the sustained phonations of PD patients with minimal functional disability are more difficult to be correctly identified.
Moderate deviations principles for the kernel estimator of ...
African Journals Online (AJOL)
Abstract. The aim of this paper is to provide pointwise and uniform moderate deviations principles for the kernel estimator of a nonrandom regression function. Moreover, we give an application of these moderate deviations principles to the construction of condence regions for the regression function. Resume. L'objectif de ...
Corruption clubs: empirical evidence from kernel density estimates
Herzfeld, T.; Weiss, Ch.
2007-01-01
A common finding of many analytical models is the existence of multiple equilibria of corruption. Countries characterized by the same economic, social and cultural background do not necessarily experience the same levels of corruption. In this article, we use Kernel Density Estimation techniques to
A NONPARAMETRIC HYPOTHESIS TEST VIA THE BOOTSTRAP RESAMPLING
Temel, Tugrul T.
2001-01-01
This paper adapts an already existing nonparametric hypothesis test to the bootstrap framework. The test utilizes the nonparametric kernel regression method to estimate a measure of distance between the models stated under the null hypothesis. The bootstraped version of the test allows to approximate errors involved in the asymptotic hypothesis test. The paper also develops a Mathematica Code for the test algorithm.
Semi-Nonparametric Estimation and Misspecification Testing of Diffusion Models
DEFF Research Database (Denmark)
Kristensen, Dennis
of the estimators and tests under the null are derived, and the power properties are analyzed by considering contiguous alternatives. Test directly comparing the drift and diffusion estimators under the relevant null and alternative are also analyzed. Markov Bootstrap versions of the test statistics are proposed...... to improve on the finite-sample approximations. The finite sample properties of the estimators are examined in a simulation study....
Nonparametric Estimation of Distributions in Random Effects Models
Hart, Jeffrey D.; Cañ ette, Isabel
2011-01-01
to every small dataset. A detailed algorithm for computing minimum distance estimates is proposed, and the usefulness of our methodology is illustrated by a simulation study and an analysis of microarray data. Supplemental materials for the article
Hamilton's gradient estimate for the heat kernel on complete manifolds
Kotschwar, Brett
2007-01-01
In this paper we extend a gradient estimate of R. Hamilton for positive solutions to the heat equation on closed manifolds to bounded positive solutions on complete, non-compact manifolds with $Rc \\geq -Kg$. We accomplish this extension via a maximum principle of L. Karp and P. Li and a Bernstein-type estimate on the gradient of the solution. An application of our result, together with the bounds of P. Li and S.T. Yau, yields an estimate on the gradient of the heat kernel for complete manifol...
Bayesian nonparametric estimation of hazard rate in monotone Aalen model
Czech Academy of Sciences Publication Activity Database
Timková, Jana
2014-01-01
Roč. 50, č. 6 (2014), s. 849-868 ISSN 0023-5954 Institutional support: RVO:67985556 Keywords : Aalen model * Bayesian estimation * MCMC Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.541, year: 2014 http://library.utia.cas.cz/separaty/2014/SI/timkova-0438210.pdf
Multivariate and semiparametric kernel regression
Härdle, Wolfgang; Müller, Marlene
1997-01-01
The paper gives an introduction to theory and application of multivariate and semiparametric kernel smoothing. Multivariate nonparametric density estimation is an often used pilot tool for examining the structure of data. Regression smoothing helps in investigating the association between covariates and responses. We concentrate on kernel smoothing using local polynomial fitting which includes the Nadaraya-Watson estimator. Some theory on the asymptotic behavior and bandwidth selection is pro...
Nonparametric Estimation of Interval Reliability for Discrete-Time Semi-Markov Systems
DEFF Research Database (Denmark)
Georgiadis, Stylianos; Limnios, Nikolaos
2016-01-01
In this article, we consider a repairable discrete-time semi-Markov system with finite state space. The measure of the interval reliability is given as the probability of the system being operational over a given finite-length time interval. A nonparametric estimator is proposed for the interval...
The Support Reduction Algorithm for Computing Non-Parametric Function Estimates in Mixture Models
GROENEBOOM, PIET; JONGBLOED, GEURT; WELLNER, JON A.
2008-01-01
In this paper, we study an algorithm (which we call the support reduction algorithm) that can be used to compute non-parametric M-estimators in mixture models. The algorithm is compared with natural competitors in the context of convex regression and the ‘Aspect problem’ in quantum physics.
Non-parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods
DEFF Research Database (Denmark)
Høg, Esben
In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example the Ornstein-Uhlenbeck process (the simplest mean-reverting process) has been used to model non-speculative price processes. We discuss non--parametric estimation of these processes...
Non-Parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods
DEFF Research Database (Denmark)
Høg, Esben
2003-01-01
In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example the Ornstein-Uhlenbeck process (the simplest mean--reverting process) has been used to model non-speculative price processes. We discuss non--parametric estimation of these processes...
Nonparametric estimation of the stationary M/G/1 workload distribution function
DEFF Research Database (Denmark)
Hansen, Martin Bøgsted
2005-01-01
In this paper it is demonstrated how a nonparametric estimator of the stationary workload distribution function of the M/G/1-queue can be obtained by systematic sampling the workload process. Weak convergence results and bootstrap methods for empirical distribution functions for stationary associ...
Nonparametric estimation in an "illness-death" model when all transition times are interval censored
DEFF Research Database (Denmark)
Frydman, Halina; Gerds, Thomas; Grøn, Randi
2013-01-01
We develop nonparametric maximum likelihood estimation for the parameters of an irreversible Markov chain on states {0,1,2} from the observations with interval censored times of 0 → 1, 0 → 2 and 1 → 2 transitions. The distinguishing aspect of the data is that, in addition to all transition times ...
Nonparametric Estimation of Regression Parameters in Measurement Error Models
Czech Academy of Sciences Publication Activity Database
Ehsanes Saleh, A.K.M.D.; Picek, J.; Kalina, Jan
2009-01-01
Roč. 67, č. 2 (2009), s. 177-200 ISSN 0026-1424 Grant - others:GA AV ČR(CZ) IAA101120801; GA MŠk(CZ) LC06024 Institutional research plan: CEZ:AV0Z10300504 Keywords : asymptotic relative efficiency(ARE) * asymptotic theory * emaculate mode * Me model * R-estimation * Reliabilty ratio(RR) Subject RIV: BB - Applied Statistics, Operational Research
MAP estimators and their consistency in Bayesian nonparametric inverse problems
Dashti, M.
2013-09-01
We consider the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map applied to u. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field μ0. We work under a natural set of conditions on the likelihood which implies the existence of a well-posed posterior measure, μy. Under these conditions, we show that the maximum a posteriori (MAP) estimator is well defined as the minimizer of an Onsager-Machlup functional defined on the Cameron-Martin space of the prior; thus, we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency for the MAP estimator. We also prove a similar result for the case where the observation of can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier-Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics. © 2013 IOP Publishing Ltd.
MAP estimators and their consistency in Bayesian nonparametric inverse problems
International Nuclear Information System (INIS)
Dashti, M; Law, K J H; Stuart, A M; Voss, J
2013-01-01
We consider the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map G applied to u. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field μ 0 . We work under a natural set of conditions on the likelihood which implies the existence of a well-posed posterior measure, μ y . Under these conditions, we show that the maximum a posteriori (MAP) estimator is well defined as the minimizer of an Onsager–Machlup functional defined on the Cameron–Martin space of the prior; thus, we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency for the MAP estimator. We also prove a similar result for the case where the observation of G(u) can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier–Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics. (paper)
Nonparametric conditional predictive regions for time series
de Gooijer, J.G.; Zerom Godefay, D.
2000-01-01
Several nonparametric predictors based on the Nadaraya-Watson kernel regression estimator have been proposed in the literature. They include the conditional mean, the conditional median, and the conditional mode. In this paper, we consider three types of predictive regions for these predictors — the
Dai, Wenlin; Tong, Tiejun; Zhu, Lixing
2017-01-01
Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework has greatly enriched the existing literature on variance estimation that includes most existing estimators as special cases. More importantly, the unified framework has also provided a smart way to solve the challenging difference sequence selection problem that remains a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend to use the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.
Dai, Wenlin
2017-09-01
Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework has greatly enriched the existing literature on variance estimation that includes most existing estimators as special cases. More importantly, the unified framework has also provided a smart way to solve the challenging difference sequence selection problem that remains a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend to use the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.
Nonparametric adaptive estimation of linear functionals for low frequency observed Lévy processes
Kappus, Johanna
2012-01-01
For a Lévy process X having finite variation on compact sets and finite first moments, Âµ( dx) = xv( dx) is a finite signed measure which completely describes the jump dynamics. We construct kernel estimators for linear functionals of Âµ and provide rates of convergence under regularity assumptions. Moreover, we consider adaptive estimation via model selection and propose a new strategy for the data driven choice of the smoothing parameter.
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Wang, Yuanjia; Garcia, Tanya P; Ma, Yanyuan
2012-01-01
This work presents methods for estimating genotype-specific distributions from genetic epidemiology studies where the event times are subject to right censoring, the genotypes are not directly observed, and the data arise from a mixture of scientifically meaningful subpopulations. Examples of such studies include kin-cohort studies and quantitative trait locus (QTL) studies. Current methods for analyzing censored mixture data include two types of nonparametric maximum likelihood estimators (NPMLEs) which do not make parametric assumptions on the genotype-specific density functions. Although both NPMLEs are commonly used, we show that one is inefficient and the other inconsistent. To overcome these deficiencies, we propose three classes of consistent nonparametric estimators which do not assume parametric density models and are easy to implement. They are based on the inverse probability weighting (IPW), augmented IPW (AIPW), and nonparametric imputation (IMP). The AIPW achieves the efficiency bound without additional modeling assumptions. Extensive simulation experiments demonstrate satisfactory performance of these estimators even when the data are heavily censored. We apply these estimators to the Cooperative Huntington's Observational Research Trial (COHORT), and provide age-specific estimates of the effect of mutation in the Huntington gene on mortality using a sample of family members. The close approximation of the estimated non-carrier survival rates to that of the U.S. population indicates small ascertainment bias in the COHORT family sample. Our analyses underscore an elevated risk of death in Huntington gene mutation carriers compared to non-carriers for a wide age range, and suggest that the mutation equally affects survival rates in both genders. The estimated survival rates are useful in genetic counseling for providing guidelines on interpreting the risk of death associated with a positive genetic testing, and in facilitating future subjects at risk
Smooth semi-nonparametric (SNP) estimation of the cumulative incidence function.
Duc, Anh Nguyen; Wolbers, Marcel
2017-08-15
This paper presents a novel approach to estimation of the cumulative incidence function in the presence of competing risks. The underlying statistical model is specified via a mixture factorization of the joint distribution of the event type and the time to the event. The time to event distributions conditional on the event type are modeled using smooth semi-nonparametric densities. One strength of this approach is that it can handle arbitrary censoring and truncation while relying on mild parametric assumptions. A stepwise forward algorithm for model estimation and adaptive selection of smooth semi-nonparametric polynomial degrees is presented, implemented in the statistical software R, evaluated in a sequence of simulation studies, and applied to data from a clinical trial in cryptococcal meningitis. The simulations demonstrate that the proposed method frequently outperforms both parametric and nonparametric alternatives. They also support the use of 'ad hoc' asymptotic inference to derive confidence intervals. An extension to regression modeling is also presented, and its potential and challenges are discussed. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Bayesian Nonparametric Mixture Estimation for Time-Indexed Functional Data in R
Directory of Open Access Journals (Sweden)
Terrance D. Savitsky
2016-08-01
Full Text Available We present growfunctions for R that offers Bayesian nonparametric estimation models for analysis of dependent, noisy time series data indexed by a collection of domains. This data structure arises from combining periodically published government survey statistics, such as are reported in the Current Population Study (CPS. The CPS publishes monthly, by-state estimates of employment levels, where each state expresses a noisy time series. Published state-level estimates from the CPS are composed from household survey responses in a model-free manner and express high levels of volatility due to insufficient sample sizes. Existing software solutions borrow information over a modeled time-based dependence to extract a de-noised time series for each domain. These solutions, however, ignore the dependence among the domains that may be additionally leveraged to improve estimation efficiency. The growfunctions package offers two fully nonparametric mixture models that simultaneously estimate both a time and domain-indexed dependence structure for a collection of time series: (1 A Gaussian process (GP construction, which is parameterized through the covariance matrix, estimates a latent function for each domain. The covariance parameters of the latent functions are indexed by domain under a Dirichlet process prior that permits estimation of the dependence among functions across the domains: (2 An intrinsic Gaussian Markov random field prior construction provides an alternative to the GP that expresses different computation and estimation properties. In addition to performing denoised estimation of latent functions from published domain estimates, growfunctions allows estimation of collections of functions for observation units (e.g., households, rather than aggregated domains, by accounting for an informative sampling design under which the probabilities for inclusion of observation units are related to the response variable. growfunctions includes plot
Sardet, Laure; Patilea, Valentin
When pricing a specific insurance premium, actuary needs to evaluate the claims cost distribution for the warranty. Traditional actuarial methods use parametric specifications to model claims distribution, like lognormal, Weibull and Pareto laws. Mixtures of such distributions allow to improve the flexibility of the parametric approach and seem to be quite well-adapted to capture the skewness, the long tails as well as the unobserved heterogeneity among the claims. In this paper, instead of looking for a finely tuned mixture with many components, we choose a parsimonious mixture modeling, typically a two or three-component mixture. Next, we use the mixture cumulative distribution function (CDF) to transform data into the unit interval where we apply a beta-kernel smoothing procedure. A bandwidth rule adapted to our methodology is proposed. Finally, the beta-kernel density estimate is back-transformed to recover an estimate of the original claims density. The beta-kernel smoothing provides an automatic fine-tuning of the parsimonious mixture and thus avoids inference in more complex mixture models with many parameters. We investigate the empirical performance of the new method in the estimation of the quantiles with simulated nonnegative data and the quantiles of the individual claims distribution in a non-life insurance application.
Kernel density estimation-based real-time prediction for respiratory motion
International Nuclear Information System (INIS)
Ruan, Dan
2010-01-01
Effective delivery of adaptive radiotherapy requires locating the target with high precision in real time. System latency caused by data acquisition, streaming, processing and delivery control necessitates prediction. Prediction is particularly challenging for highly mobile targets such as thoracic and abdominal tumors undergoing respiration-induced motion. The complexity of the respiratory motion makes it difficult to build and justify explicit models. In this study, we honor the intrinsic uncertainties in respiratory motion and propose a statistical treatment of the prediction problem. Instead of asking for a deterministic covariate-response map and a unique estimate value for future target position, we aim to obtain a distribution of the future target position (response variable) conditioned on the observed historical sample values (covariate variable). The key idea is to estimate the joint probability distribution (pdf) of the covariate and response variables using an efficient kernel density estimation method. Then, the problem of identifying the distribution of the future target position reduces to identifying the section in the joint pdf based on the observed covariate. Subsequently, estimators are derived based on this estimated conditional distribution. This probabilistic perspective has some distinctive advantages over existing deterministic schemes: (1) it is compatible with potentially inconsistent training samples, i.e., when close covariate variables correspond to dramatically different response values; (2) it is not restricted by any prior structural assumption on the map between the covariate and the response; (3) the two-stage setup allows much freedom in choosing statistical estimates and provides a full nonparametric description of the uncertainty for the resulting estimate. We evaluated the prediction performance on ten patient RPM traces, using the root mean squared difference between the prediction and the observed value normalized by the
Robustifying Bayesian nonparametric mixtures for count data.
Canale, Antonio; Prünster, Igor
2017-03-01
Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to popular Dirichlet process mixture of Poisson kernels by increasing the model flexibility at the level both of the kernel and the nonparametric mixing measure. This allows to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study for different scenarios yield also some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data one has to enrich the model simultaneously at both levels, the kernel and the mixing measure. © 2016, The International Biometric Society.
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Adaptive Metric Kernel Regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
1998-01-01
Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...
Yang, Shanshan; Zheng, Fang; Luo, Xin; Cai, Suxian; Wu, Yunfeng; Liu, Kaizhi; Wu, Meihong; Chen, Jian; Krishnan, Sridhar
2014-01-01
Detection of dysphonia is useful for monitoring the progression of phonatory impairment for patients with Parkinson’s disease (PD), and also helps assess the disease severity. This paper describes the statistical pattern analysis methods to study different vocal measurements of sustained phonations. The feature dimension reduction procedure was implemented by using the sequential forward selection (SFS) and kernel principal component analysis (KPCA) methods. Four selected vocal measures were projected by the KPCA onto the bivariate feature space, in which the class-conditional feature densities can be approximated with the nonparametric kernel density estimation technique. In the vocal pattern classification experiments, Fisher’s linear discriminant analysis (FLDA) was applied to perform the linear classification of voice records for healthy control subjects and PD patients, and the maximum a posteriori (MAP) decision rule and support vector machine (SVM) with radial basis function kernels were employed for the nonlinear classification tasks. Based on the KPCA-mapped feature densities, the MAP classifier successfully distinguished 91.8% voice records, with a sensitivity rate of 0.986, a specificity rate of 0.708, and an area value of 0.94 under the receiver operating characteristic (ROC) curve. The diagnostic performance provided by the MAP classifier was superior to those of the FLDA and SVM classifiers. In addition, the classification results indicated that gender is insensitive to dysphonia detection, and the sustained phonations of PD patients with minimal functional disability are more difficult to be correctly identified. PMID:24586406
Non-parametric estimation of the availability in a general repairable system
International Nuclear Information System (INIS)
Gamiz, M.L.; Roman, Y.
2008-01-01
This work deals with repairable systems with unknown failure and repair time distributions. We focus on the estimation of the instantaneous availability, that is, the probability that the system is functioning at a given time, which we consider as the most significant measure for evaluating the effectiveness of a repairable system. The estimation of the availability function is not, in general, an easy task, i.e., analytical techniques are difficult to apply. We propose a smooth estimation of the availability based on kernel estimator of the cumulative distribution functions (CDF) of the failure and repair times, for which the bandwidth parameters are obtained by bootstrap procedures. The consistency properties of the availability estimator are established by using techniques based on the Laplace transform
Non-parametric estimation of the availability in a general repairable system
Energy Technology Data Exchange (ETDEWEB)
Gamiz, M.L. [Departamento de Estadistica e I.O., Facultad de Ciencias, Universidad de Granada, Granada 18071 (Spain)], E-mail: mgamiz@ugr.es; Roman, Y. [Departamento de Estadistica e I.O., Facultad de Ciencias, Universidad de Granada, Granada 18071 (Spain)
2008-08-15
This work deals with repairable systems with unknown failure and repair time distributions. We focus on the estimation of the instantaneous availability, that is, the probability that the system is functioning at a given time, which we consider as the most significant measure for evaluating the effectiveness of a repairable system. The estimation of the availability function is not, in general, an easy task, i.e., analytical techniques are difficult to apply. We propose a smooth estimation of the availability based on kernel estimator of the cumulative distribution functions (CDF) of the failure and repair times, for which the bandwidth parameters are obtained by bootstrap procedures. The consistency properties of the availability estimator are established by using techniques based on the Laplace transform.
Semi-nonparametric estimates of interfuel substitution in US energy demand
Energy Technology Data Exchange (ETDEWEB)
Serletis, A.; Shahmoradi, A. [University of Calgary, Calgary, AB (Canada). Dept. of Economics
2008-09-15
This paper focuses on the demand for crude oil, natural gas, and coal in the United States in the context of two globally flexible functional forms - the Fourier and the Asymptotically Ideal Model (AIM) - estimated subject to full regularity, using methods suggested over 20 years ago by Gallant and Golub (Gallant, A. Ronald and Golub, Gene H. Imposing Curvature Restrictions on Flexible Functional Forms. Journal of Econometrics 26 (1984), 295-321) and recently used by Serletis and Shahmoradi (Serletis, A., Shahmoradi, A., 2005. Semi-nonparametric estimates of the demand for money in the United States. Macroeconomic Dynamics 9, 542-559) in the monetary demand systems literature. We provide a comparison in terms of a full set of elasticities and also a policy perspective, using (for the first time) parameter estimates that are consistent with global regularity.
Semi-nonparametric estimates of interfuel substitution in U.S. energy demand
Energy Technology Data Exchange (ETDEWEB)
Serletis, Apostolos [Department of Economics, University of Calgary, Calgary, Alberta (Canada); Shahmoradi, Asghar [Faculty of Economics, University of Tehran, Tehran (Iran)
2008-09-15
This paper focuses on the demand for crude oil, natural gas, and coal in the United States in the context of two globally flexible functional forms - the Fourier and the Asymptotically Ideal Model (AIM) - estimated subject to full regularity, using methods suggested over 20 years ago by Gallant and Golub [Gallant, A. Ronald and Golub, Gene H. Imposing Curvature Restrictions on Flexible Functional Forms. Journal of Econometrics 26 (1984), 295-321] and recently used by Serletis and Shahmoradi [Serletis, A., Shahmoradi, A., 2005. Semi-nonparametric estimates of the demand for money in the United States. Macroeconomic Dynamics 9, 542-559] in the monetary demand systems literature. We provide a comparison in terms of a full set of elasticities and also a policy perspective, using (for the first time) parameter estimates that are consistent with global regularity. (author)
Transformation-invariant and nonparametric monotone smooth estimation of ROC curves.
Du, Pang; Tang, Liansheng
2009-01-30
When a new diagnostic test is developed, it is of interest to evaluate its accuracy in distinguishing diseased subjects from non-diseased subjects. The accuracy of the test is often evaluated by receiver operating characteristic (ROC) curves. Smooth ROC estimates are often preferable for continuous test results when the underlying ROC curves are in fact continuous. Nonparametric and parametric methods have been proposed by various authors to obtain smooth ROC curve estimates. However, there are certain drawbacks with the existing methods. Parametric methods need specific model assumptions. Nonparametric methods do not always satisfy the inherent properties of the ROC curves, such as monotonicity and transformation invariance. In this paper we propose a monotone spline approach to obtain smooth monotone ROC curves. Our method ensures important inherent properties of the underlying ROC curves, which include monotonicity, transformation invariance, and boundary constraints. We compare the finite sample performance of the newly proposed ROC method with other ROC smoothing methods in large-scale simulation studies. We illustrate our method through a real life example. Copyright (c) 2008 John Wiley & Sons, Ltd.
Carroll, Raymond J.
2011-03-01
In many applications we can expect that, or are interested to know if, a density function or a regression curve satisfies some specific shape constraints. For example, when the explanatory variable, X, represents the value taken by a treatment or dosage, the conditional mean of the response, Y , is often anticipated to be a monotone function of X. Indeed, if this regression mean is not monotone (in the appropriate direction) then the medical or commercial value of the treatment is likely to be significantly curtailed, at least for values of X that lie beyond the point at which monotonicity fails. In the case of a density, common shape constraints include log-concavity and unimodality. If we can correctly guess the shape of a curve, then nonparametric estimators can be improved by taking this information into account. Addressing such problems requires a method for testing the hypothesis that the curve of interest satisfies a shape constraint, and, if the conclusion of the test is positive, a technique for estimating the curve subject to the constraint. Nonparametric methodology for solving these problems already exists, but only in cases where the covariates are observed precisely. However in many problems, data can only be observed with measurement errors, and the methods employed in the error-free case typically do not carry over to this error context. In this paper we develop a novel approach to hypothesis testing and function estimation under shape constraints, which is valid in the context of measurement errors. Our method is based on tilting an estimator of the density or the regression mean until it satisfies the shape constraint, and we take as our test statistic the distance through which it is tilted. Bootstrap methods are used to calibrate the test. The constrained curve estimators that we develop are also based on tilting, and in that context our work has points of contact with methodology in the error-free case.
International Nuclear Information System (INIS)
Morio, Jerome
2011-01-01
Importance sampling (IS) is a useful simulation technique to estimate critical probability with a better accuracy than Monte Carlo methods. It consists in generating random weighted samples from an auxiliary distribution rather than the distribution of interest. The crucial part of this algorithm is the choice of an efficient auxiliary PDF that has to be able to simulate more rare random events. The optimisation of this auxiliary distribution is often in practice very difficult. In this article, we propose to approach the IS optimal auxiliary density with non-parametric adaptive importance sampling (NAIS). We apply this technique for the probability estimation of spatial launcher impact position since it has currently become a more and more important issue in the field of aeronautics.
Nonparametric estimation of age-specific reference percentile curves with radial smoothing.
Wan, Xiaohai; Qu, Yongming; Huang, Yao; Zhang, Xiao; Song, Hanping; Jiang, Honghua
2012-01-01
Reference percentile curves represent the covariate-dependent distribution of a quantitative measurement and are often used to summarize and monitor dynamic processes such as human growth. We propose a new nonparametric method based on a radial smoothing (RS) technique to estimate age-specific reference percentile curves assuming the underlying distribution is relatively close to normal. We compared the RS method with both the LMS and the generalized additive models for location, scale and shape (GAMLSS) methods using simulated data and found that our method has smaller estimation error than the two existing methods. We also applied the new method to analyze height growth data from children being followed in a clinical observational study of growth hormone treatment, and compared the growth curves between those with growth disorders and the general population. Copyright © 2011 Elsevier Inc. All rights reserved.
Non-parametric PSF estimation from celestial transit solar images using blind deconvolution
Directory of Open Access Journals (Sweden)
González Adriana
2016-01-01
Full Text Available Context: Characterization of instrumental effects in astronomical imaging is important in order to extract accurate physical information from the observations. The measured image in a real optical instrument is usually represented by the convolution of an ideal image with a Point Spread Function (PSF. Additionally, the image acquisition process is also contaminated by other sources of noise (read-out, photon-counting. The problem of estimating both the PSF and a denoised image is called blind deconvolution and is ill-posed. Aims: We propose a blind deconvolution scheme that relies on image regularization. Contrarily to most methods presented in the literature, our method does not assume a parametric model of the PSF and can thus be applied to any telescope. Methods: Our scheme uses a wavelet analysis prior model on the image and weak assumptions on the PSF. We use observations from a celestial transit, where the occulting body can be assumed to be a black disk. These constraints allow us to retain meaningful solutions for the filter and the image, eliminating trivial, translated, and interchanged solutions. Under an additive Gaussian noise assumption, they also enforce noise canceling and avoid reconstruction artifacts by promoting the whiteness of the residual between the blurred observations and the cleaned data. Results: Our method is applied to synthetic and experimental data. The PSF is estimated for the SECCHI/EUVI instrument using the 2007 Lunar transit, and for SDO/AIA using the 2012 Venus transit. Results show that the proposed non-parametric blind deconvolution method is able to estimate the core of the PSF with a similar quality to parametric methods proposed in the literature. We also show that, if these parametric estimations are incorporated in the acquisition model, the resulting PSF outperforms both the parametric and non-parametric methods.
Modeling reactive transport with particle tracking and kernel estimators
Rahbaralam, Maryam; Fernandez-Garcia, Daniel; Sanchez-Vila, Xavier
2015-04-01
Groundwater reactive transport models are useful to assess and quantify the fate and transport of contaminants in subsurface media and are an essential tool for the analysis of coupled physical, chemical, and biological processes in Earth Systems. Particle Tracking Method (PTM) provides a computationally efficient and adaptable approach to solve the solute transport partial differential equation. On a molecular level, chemical reactions are the result of collisions, combinations, and/or decay of different species. For a well-mixed system, the chem- ical reactions are controlled by the classical thermodynamic rate coefficient. Each of these actions occurs with some probability that is a function of solute concentrations. PTM is based on considering that each particle actually represents a group of molecules. To properly simulate this system, an infinite number of particles is required, which is computationally unfeasible. On the other hand, a finite number of particles lead to a poor-mixed system which is limited by diffusion. Recent works have used this effect to actually model incomplete mix- ing in naturally occurring porous media. In this work, we demonstrate that this effect in most cases should be attributed to a defficient estimation of the concentrations and not to the occurrence of true incomplete mixing processes in porous media. To illustrate this, we show that a Kernel Density Estimation (KDE) of the concentrations can approach the well-mixed solution with a limited number of particles. KDEs provide weighting functions of each particle mass that expands its region of influence, hence providing a wider region for chemical reactions with time. Simulation results show that KDEs are powerful tools to improve state-of-the-art simulations of chemical reactions and indicates that incomplete mixing in diluted systems should be modeled based on alternative conceptual models and not on a limited number of particles.
Lee, Soojeong; Rajan, Sreeraman; Jeon, Gwanggil; Chang, Joon-Hyuk; Dajani, Hilmi R; Groza, Voicu Z
2017-06-01
Blood pressure (BP) is one of the most important vital indicators and plays a key role in determining the cardiovascular activity of patients. This paper proposes a hybrid approach consisting of nonparametric bootstrap (NPB) and machine learning techniques to obtain the characteristic ratios (CR) used in the blood pressure estimation algorithm to improve the accuracy of systolic blood pressure (SBP) and diastolic blood pressure (DBP) estimates and obtain confidence intervals (CI). The NPB technique is used to circumvent the requirement for large sample set for obtaining the CI. A mixture of Gaussian densities is assumed for the CRs and Gaussian mixture model (GMM) is chosen to estimate the SBP and DBP ratios. The K-means clustering technique is used to obtain the mixture order of the Gaussian densities. The proposed approach achieves grade "A" under British Society of Hypertension testing protocol and is superior to the conventional approach based on maximum amplitude algorithm (MAA) that uses fixed CR ratios. The proposed approach also yields a lower mean error (ME) and the standard deviation of the error (SDE) in the estimates when compared to the conventional MAA method. In addition, CIs obtained through the proposed hybrid approach are also narrower with a lower SDE. The proposed approach combining the NPB technique with the GMM provides a methodology to derive individualized characteristic ratio. The results exhibit that the proposed approach enhances the accuracy of SBP and DBP estimation and provides narrower confidence intervals for the estimates. Copyright © 2015 Elsevier Ltd. All rights reserved.
Le Bihan, Nicolas; Margerin, Ludovic
2009-07-01
In this paper, we present a nonparametric method to estimate the heterogeneity of a random medium from the angular distribution of intensity of waves transmitted through a slab of random material. Our approach is based on the modeling of forward multiple scattering using compound Poisson processes on compact Lie groups. The estimation technique is validated through numerical simulations based on radiative transfer theory.
Genomic outlier profile analysis: mixture models, null hypotheses, and nonparametric estimation.
Ghosh, Debashis; Chinnaiyan, Arul M
2009-01-01
In most analyses of large-scale genomic data sets, differential expression analysis is typically assessed by testing for differences in the mean of the distributions between 2 groups. A recent finding by Tomlins and others (2005) is of a different type of pattern of differential expression in which a fraction of samples in one group have overexpression relative to samples in the other group. In this work, we describe a general mixture model framework for the assessment of this type of expression, called outlier profile analysis. We start by considering the single-gene situation and establishing results on identifiability. We propose 2 nonparametric estimation procedures that have natural links to familiar multiple testing procedures. We then develop multivariate extensions of this methodology to handle genome-wide measurements. The proposed methodologies are compared using simulation studies as well as data from a prostate cancer gene expression study.
Nonparametric estimates of drift and diffusion profiles via Fokker-Planck algebra.
Lund, Steven P; Hubbard, Joseph B; Halter, Michael
2014-11-06
Diffusion processes superimposed upon deterministic motion play a key role in understanding and controlling the transport of matter, energy, momentum, and even information in physics, chemistry, material science, biology, and communications technology. Given functions defining these random and deterministic components, the Fokker-Planck (FP) equation is often used to model these diffusive systems. Many methods exist for estimating the drift and diffusion profiles from one or more identifiable diffusive trajectories; however, when many identical entities diffuse simultaneously, it may not be possible to identify individual trajectories. Here we present a method capable of simultaneously providing nonparametric estimates for both drift and diffusion profiles from evolving density profiles, requiring only the validity of Langevin/FP dynamics. This algebraic FP manipulation provides a flexible and robust framework for estimating stationary drift and diffusion coefficient profiles, is not based on fluctuation theory or solved diffusion equations, and may facilitate predictions for many experimental systems. We illustrate this approach on experimental data obtained from a model lipid bilayer system exhibiting free diffusion and electric field induced drift. The wide range over which this approach provides accurate estimates for drift and diffusion profiles is demonstrated through simulation.
On Convergence of Kernel Density Estimates in Particle Filtering
Czech Academy of Sciences Publication Activity Database
Coufal, David
2016-01-01
Roč. 52, č. 5 (2016), s. 735-756 ISSN 0023-5954 Grant - others:GA ČR(CZ) GA16-03708S; SVV(CZ) 260334/2016 Institutional support: RVO:67985807 Keywords : Fourier analysis * kernel methods * particle filter Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.379, year: 2016
Two-component mixture cure rate model with spline estimated nonparametric components.
Wang, Lu; Du, Pang; Liang, Hua
2012-09-01
In some survival analysis of medical studies, there are often long-term survivors who can be considered as permanently cured. The goals in these studies are to estimate the noncured probability of the whole population and the hazard rate of the susceptible subpopulation. When covariates are present as often happens in practice, to understand covariate effects on the noncured probability and hazard rate is of equal importance. The existing methods are limited to parametric and semiparametric models. We propose a two-component mixture cure rate model with nonparametric forms for both the cure probability and the hazard rate function. Identifiability of the model is guaranteed by an additive assumption that allows no time-covariate interactions in the logarithm of hazard rate. Estimation is carried out by an expectation-maximization algorithm on maximizing a penalized likelihood. For inferential purpose, we apply the Louis formula to obtain point-wise confidence intervals for noncured probability and hazard rate. Asymptotic convergence rates of our function estimates are established. We then evaluate the proposed method by extensive simulations. We analyze the survival data from a melanoma study and find interesting patterns for this study. © 2011, The International Biometric Society.
Xu, Yonghong; Gao, Xiaohuan; Wang, Zhengxi
2014-04-01
Missing data represent a general problem in many scientific fields, especially in medical survival analysis. Dealing with censored data, interpolation method is one of important methods. However, most of the interpolation methods replace the censored data with the exact data, which will distort the real distribution of the censored data and reduce the probability of the real data falling into the interpolation data. In order to solve this problem, we in this paper propose a nonparametric method of estimating the survival function of right-censored and interval-censored data and compare its performance to SC (self-consistent) algorithm. Comparing to the average interpolation and the nearest neighbor interpolation method, the proposed method in this paper replaces the right-censored data with the interval-censored data, and greatly improves the probability of the real data falling into imputation interval. Then it bases on the empirical distribution theory to estimate the survival function of right-censored and interval-censored data. The results of numerical examples and a real breast cancer data set demonstrated that the proposed method had higher accuracy and better robustness for the different proportion of the censored data. This paper provides a good method to compare the clinical treatments performance with estimation of the survival data of the patients. This pro vides some help to the medical survival data analysis.
Chen, Rongda; Wang, Ze
2013-01-01
Recovery rate is essential to the estimation of the portfolio's loss and economic capital. Neglecting the randomness of the distribution of recovery rate may underestimate the risk. The study introduces two kinds of models of distribution, Beta distribution estimation and kernel density distribution estimation, to simulate the distribution of recovery rates of corporate loans and bonds. As is known, models based on Beta distribution are common in daily usage, such as CreditMetrics by J.P. Morgan, Portfolio Manager by KMV and Losscalc by Moody's. However, it has a fatal defect that it can't fit the bimodal or multimodal distributions such as recovery rates of corporate loans and bonds as Moody's new data show. In order to overcome this flaw, the kernel density estimation is introduced and we compare the simulation results by histogram, Beta distribution estimation and kernel density estimation to reach the conclusion that the Gaussian kernel density distribution really better imitates the distribution of the bimodal or multimodal data samples of corporate loans and bonds. Finally, a Chi-square test of the Gaussian kernel density estimation proves that it can fit the curve of recovery rates of loans and bonds. So using the kernel density distribution to precisely delineate the bimodal recovery rates of bonds is optimal in credit risk management.
Directory of Open Access Journals (Sweden)
Rongda Chen
Full Text Available Recovery rate is essential to the estimation of the portfolio's loss and economic capital. Neglecting the randomness of the distribution of recovery rate may underestimate the risk. The study introduces two kinds of models of distribution, Beta distribution estimation and kernel density distribution estimation, to simulate the distribution of recovery rates of corporate loans and bonds. As is known, models based on Beta distribution are common in daily usage, such as CreditMetrics by J.P. Morgan, Portfolio Manager by KMV and Losscalc by Moody's. However, it has a fatal defect that it can't fit the bimodal or multimodal distributions such as recovery rates of corporate loans and bonds as Moody's new data show. In order to overcome this flaw, the kernel density estimation is introduced and we compare the simulation results by histogram, Beta distribution estimation and kernel density estimation to reach the conclusion that the Gaussian kernel density distribution really better imitates the distribution of the bimodal or multimodal data samples of corporate loans and bonds. Finally, a Chi-square test of the Gaussian kernel density estimation proves that it can fit the curve of recovery rates of loans and bonds. So using the kernel density distribution to precisely delineate the bimodal recovery rates of bonds is optimal in credit risk management.
Chen, Rongda; Wang, Ze
2013-01-01
Recovery rate is essential to the estimation of the portfolio’s loss and economic capital. Neglecting the randomness of the distribution of recovery rate may underestimate the risk. The study introduces two kinds of models of distribution, Beta distribution estimation and kernel density distribution estimation, to simulate the distribution of recovery rates of corporate loans and bonds. As is known, models based on Beta distribution are common in daily usage, such as CreditMetrics by J.P. Morgan, Portfolio Manager by KMV and Losscalc by Moody’s. However, it has a fatal defect that it can’t fit the bimodal or multimodal distributions such as recovery rates of corporate loans and bonds as Moody’s new data show. In order to overcome this flaw, the kernel density estimation is introduced and we compare the simulation results by histogram, Beta distribution estimation and kernel density estimation to reach the conclusion that the Gaussian kernel density distribution really better imitates the distribution of the bimodal or multimodal data samples of corporate loans and bonds. Finally, a Chi-square test of the Gaussian kernel density estimation proves that it can fit the curve of recovery rates of loans and bonds. So using the kernel density distribution to precisely delineate the bimodal recovery rates of bonds is optimal in credit risk management. PMID:23874558
Sueiro, Manuel J.; Abad, Francisco J.
2011-01-01
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Graziani, Rebecca; Guindani, Michele; Thall, Peter F.
2015-01-01
Summary The effect of a targeted agent on a cancer patient's clinical outcome putatively is mediated through the agent's effect on one or more early biological events. This is motivated by pre-clinical experiments with cells or animals that identify such events, represented by binary or quantitative biomarkers. When evaluating targeted agents in humans, central questions are whether the distribution of a targeted biomarker changes following treatment, the nature and magnitude of this change, and whether it is associated with clinical outcome. Major difficulties in estimating these effects are that a biomarker's distribution may be complex, vary substantially between patients, and have complicated relationships with clinical outcomes. We present a probabilistically coherent framework for modeling and estimation in this setting, including a hierarchical Bayesian nonparametric mixture model for biomarkers that we use to define a functional profile of pre-versus-post treatment biomarker distribution change. The functional is similar to the receiver operating characteristic used in diagnostic testing. The hierarchical model yields clusters of individual patient biomarker profile functionals, and we use the profile as a covariate in a regression model for clinical outcome. The methodology is illustrated by analysis of a dataset from a clinical trial in prostate cancer using imatinib to target platelet-derived growth factor, with the clinical aim to improve progression-free survival time. PMID:25319212
International Nuclear Information System (INIS)
Bobrowski, Sebastian; Chen, Hong; Döring, Maik; Jensen, Uwe; Schinköthe, Wolfgang
2015-01-01
In practice manufacturers may have lots of failure data of similar products using the same technology basis under different operating conditions. Thus, one can try to derive predictions for the distribution of the lifetime of newly developed components or new application environments through the existing data using regression models based on covariates. Three categories of such regression models are considered: a parametric, a semiparametric and a nonparametric approach. First, we assume that the lifetime is Weibull distributed, where its parameters are modelled as linear functions of the covariate. Second, the Cox proportional hazards model, well-known in Survival Analysis, is applied. Finally, a kernel estimator is used to interpolate between empirical distribution functions. In particular the last case is new in the context of reliability analysis. We propose a goodness of fit measure (GoF), which can be applied to all three types of regression models. Using this GoF measure we discuss a new model selection procedure. To illustrate this method of reliability prediction, the three classes of regression models are applied to real test data of motor experiments. Further the performance of the approaches is investigated by Monte Carlo simulations. - Highlights: • We estimate the lifetime distribution in the presence of a covariate. • Three types of regression models are considered and compared. • A new nonparametric estimator based on our particular data structure is introduced. • We propose a goodness of fit measure and show a new model selection procedure. • A case study with real data and Monte Carlo simulations are performed
Nonparametric Identification and Estimation of Finite Mixture Models of Dynamic Discrete Choices
Hiroyuki Kasahara; Katsumi Shimotsu
2006-01-01
In dynamic discrete choice analysis, controlling for unobserved heterogeneity is an important issue, and finite mixture models provide flexible ways to account for unobserved heterogeneity. This paper studies nonparametric identifiability of type probabilities and type-specific component distributions in finite mixture models of dynamic discrete choices. We derive sufficient conditions for nonparametric identification for various finite mixture models of dynamic discrete choices used in appli...
Directory of Open Access Journals (Sweden)
YU Wenhao
2015-01-01
Full Text Available The distribution pattern and the distribution density of urban facility POIs are of great significance in the fields of infrastructure planning and urban spatial analysis. The kernel density estimation, which has been usually utilized for expressing these spatial characteristics, is superior to other density estimation methods (such as Quadrat analysis, Voronoi-based method, for that the Kernel density estimation considers the regional impact based on the first law of geography. However, the traditional kernel density estimation is mainly based on the Euclidean space, ignoring the fact that the service function and interrelation of urban feasibilities is carried out on the network path distance, neither than conventional Euclidean distance. Hence, this research proposed a computational model of network kernel density estimation, and the extension type of model in the case of adding constraints. This work also discussed the impacts of distance attenuation threshold and height extreme to the representation of kernel density. The large-scale actual data experiment for analyzing the different POIs' distribution patterns (random type, sparse type, regional-intensive type, linear-intensive type discusses the POI infrastructure in the city on the spatial distribution of characteristics, influence factors, and service functions.
Essays on parametric and nonparametric modeling and estimation with applications to energy economics
Gao, Weiyu
My dissertation research is composed of two parts: a theoretical part on semiparametric efficient estimation and an applied part in energy economics under different dynamic settings. The essays are related in terms of their applications as well as the way in which models are constructed and estimated. In the first essay, efficient estimation of the partially linear model is studied. We work out the efficient score functions and efficiency bounds under four stochastic restrictions---independence, conditional symmetry, conditional zero mean, and partially conditional zero mean. A feasible efficient estimation method for the linear part of the model is developed based on the efficient score. A battery of specification test that allows for choosing between the alternative assumptions is provided. A Monte Carlo simulation is also conducted. The second essay presents a dynamic optimization model for a stylized oilfield resembling the largest developed light oil field in Saudi Arabia, Ghawar. We use data from different sources to estimate the oil production cost function and the revenue function. We pay particular attention to the dynamic aspect of the oil production by employing petroleum-engineering software to simulate the interaction between control variables and reservoir state variables. Optimal solutions are studied under different scenarios to account for the possible changes in the exogenous variables and the uncertainty about the forecasts. The third essay examines the effect of oil price volatility on the level of innovation displayed by the U.S. economy. A measure of innovation is calculated by decomposing an output-based Malmquist index. We also construct a nonparametric measure for oil price volatility. Technical change and oil price volatility are then placed in a VAR system with oil price and a variable indicative of monetary policy. The system is estimated and analyzed for significant relationships. We find that oil price volatility displays a significant
DEFF Research Database (Denmark)
Rasmussen, Peter Mondrup; Abrahamsen, Trine Julie; Madsen, Kristoffer Hougaard
2012-01-01
We investigate the use of kernel principal component analysis (PCA) and the inverse problem known as pre-image estimation in neuroimaging: i) We explore kernel PCA and pre-image estimation as a means for image denoising as part of the image preprocessing pipeline. Evaluation of the denoising...... procedure is performed within a data-driven split-half evaluation framework. ii) We introduce manifold navigation for exploration of a nonlinear data manifold, and illustrate how pre-image estimation can be used to generate brain maps in the continuum between experimentally defined brain states/classes. We...
Kernel and wavelet density estimators on manifolds and more general metric spaces
DEFF Research Database (Denmark)
Cleanthous, G.; Georgiadis, Athanasios; Kerkyacharian, G.
We consider the problem of estimating the density of observations taking values in classical or nonclassical spaces such as manifolds and more general metric spaces. Our setting is quite general but also sufficiently rich in allowing the development of smooth functional calculus with well localized...... spectral kernels, Besov regularity spaces, and wavelet type systems. Kernel and both linear and nonlinear wavelet density estimators are introduced and studied. Convergence rates for these estimators are established, which are analogous to the existing results in the classical setting of real...
Interpolation of Missing Precipitation Data Using Kernel Estimations for Hydrologic Modeling
Directory of Open Access Journals (Sweden)
Hyojin Lee
2015-01-01
Full Text Available Precipitation is the main factor that drives hydrologic modeling; therefore, missing precipitation data can cause malfunctions in hydrologic modeling. Although interpolation of missing precipitation data is recognized as an important research topic, only a few methods follow a regression approach. In this study, daily precipitation data were interpolated using five different kernel functions, namely, Epanechnikov, Quartic, Triweight, Tricube, and Cosine, to estimate missing precipitation data. This study also presents an assessment that compares estimation of missing precipitation data through Kth nearest neighborhood (KNN regression to the five different kernel estimations and their performance in simulating streamflow using the Soil Water Assessment Tool (SWAT hydrologic model. The results show that the kernel approaches provide higher quality interpolation of precipitation data compared with the KNN regression approach, in terms of both statistical data assessment and hydrologic modeling performance.
Clustering via Kernel Decomposition
DEFF Research Database (Denmark)
Have, Anna Szynkowiak; Girolami, Mark A.; Larsen, Jan
2006-01-01
Methods for spectral clustering have been proposed recently which rely on the eigenvalue decomposition of an affinity matrix. In this work it is proposed that the affinity matrix is created based on the elements of a non-parametric density estimator. This matrix is then decomposed to obtain...... posterior probabilities of class membership using an appropriate form of nonnegative matrix factorization. The troublesome selection of hyperparameters such as kernel width and number of clusters can be obtained using standard cross-validation methods as is demonstrated on a number of diverse data sets....
International Nuclear Information System (INIS)
Uchida, Isao; Yamada, Yasuhiko; Yamashita, Takashi; Okigaki, Shigeyasu; Oyamada, Hiyoshimaru; Ito, Akira.
1995-01-01
In radiotherapy with radiopharmaceuticals, more accurate estimates of the three-dimensional (3-D) distribution of absorbed dose is important in specifying the activity to be administered to patients to deliver a prescribed absorbed dose to target volumes without exceeding the toxicity limit of normal tissues in the body. A calculation algorithm for the purpose has already been developed by the authors. An accurate 3-D distribution of absorbed dose based on the algorithm is given by convolution of the 3-D dose matrix for a unit cubic voxel containing unit cumulated activity, which is obtained by transforming a dose point kernel into a 3-D cubic dose matrix, with the 3-D cumulated activity distribution given by the same voxel size. However, beta-dose point kernels affecting accurate estimates of the 3-D absorbed dose distribution have been different among the investigators. The purpose of this study is to elucidate how different beta-dose point kernels in water influence on the estimates of the absorbed dose distribution due to the dose point kernel convolution method by the authors. Computer simulations were performed using the MIRD thyroid and lung phantoms under assumption of uniform activity distribution of 32 P. Using beta-dose point kernels derived from Monte Carlo simulations (EGS-4 or ACCEPT computer code), the differences among their point kernels gave little differences for the mean and maximum absorbed dose estimates for the MIRD phantoms used. In the estimates of mean and maximum absorbed doses calculated using different cubic voxel sizes (4x4x4 mm and 8x8x8 mm) for the MIRD thyroid phantom, the maximum absorbed doses for the 4x4x4 mm-voxel were estimated approximately 7% greater than the cases of the 8x8x8 mm-voxel. They were found in every beta-dose point kernel used in this study. On the other hand, the percentage difference of the mean absorbed doses in the both voxel sizes for each beta-dose point kernel was less than approximately 0.6%. (author)
Verrelst, Jochem; Rivera, Juan Pablo; Veroustraete, Frank; Muñoz-Marí, Jordi; Clevers, J.G.P.W.; Camps-Valls, Gustau; Moreno, José
2015-01-01
Given the forthcoming availability of Sentinel-2 (S2) images, this paper provides a systematic comparison of retrieval accuracy and processing speed of a multitude of parametric, non-parametric and physically-based retrieval methods using simulated S2 data. An experimental field dataset (SPARC),
Wang, Yuanjia; Garcia, Tanya P.; Ma, Yanyuan
2012-01-01
This work presents methods for estimating genotype-specific distributions from genetic epidemiology studies where the event times are subject to right censoring, the genotypes are not directly observed, and the data arise from a mixture of scientifically meaningful subpopulations. Examples of such studies include kin-cohort studies and quantitative trait locus (QTL) studies. Current methods for analyzing censored mixture data include two types of nonparametric maximum likelihood estimators (NPMLEs) which do not make parametric assumptions on the genotype-specific density functions. Although both NPMLEs are commonly used, we show that one is inefficient and the other inconsistent. To overcome these deficiencies, we propose three classes of consistent nonparametric estimators which do not assume parametric density models and are easy to implement. They are based on the inverse probability weighting (IPW), augmented IPW (AIPW), and nonparametric imputation (IMP). The AIPW achieves the efficiency bound without additional modeling assumptions. Extensive simulation experiments demonstrate satisfactory performance of these estimators even when the data are heavily censored. We apply these estimators to the Cooperative Huntington’s Observational Research Trial (COHORT), and provide age-specific estimates of the effect of mutation in the Huntington gene on mortality using a sample of family members. The close approximation of the estimated non-carrier survival rates to that of the U.S. population indicates small ascertainment bias in the COHORT family sample. Our analyses underscore an elevated risk of death in Huntington gene mutation carriers compared to non-carriers for a wide age range, and suggest that the mutation equally affects survival rates in both genders. The estimated survival rates are useful in genetic counseling for providing guidelines on interpreting the risk of death associated with a positive genetic testing, and in facilitating future subjects at risk
Regularized Pre-image Estimation for Kernel PCA De-noising
DEFF Research Database (Denmark)
Abrahamsen, Trine Julie; Hansen, Lars Kai
2011-01-01
The main challenge in de-noising by kernel Principal Component Analysis (PCA) is the mapping of de-noised feature space points back into input space, also referred to as “the pre-image problem”. Since the feature space mapping is typically not bijective, pre-image estimation is inherently illposed...
Bornkamp, Björn; Ickstadt, Katja
2009-03-01
In this article, we consider monotone nonparametric regression in a Bayesian framework. The monotone function is modeled as a mixture of shifted and scaled parametric probability distribution functions, and a general random probability measure is assumed as the prior for the mixing distribution. We investigate the choice of the underlying parametric distribution function and find that the two-sided power distribution function is well suited both from a computational and mathematical point of view. The model is motivated by traditional nonlinear models for dose-response analysis, and provides possibilities to elicitate informative prior distributions on different aspects of the curve. The method is compared with other recent approaches to monotone nonparametric regression in a simulation study and is illustrated on a data set from dose-response analysis.
Directory of Open Access Journals (Sweden)
Z. Nematollahi
2016-03-01
Full Text Available Introduction: Due to existence of the risk and uncertainty in agriculture, risk management is crucial for management in agriculture. Therefore the present study was designed to determine the risk aversion coefficient for Esfarayens farmers. Materials and Methods: The following approaches have been utilized to assess risk attitudes: (1 direct elicitation of utility functions, (2 experimental procedures in which individuals are presented with hypothetical questionnaires regarding risky alternatives with or without real payments and (3: Inference from observation of economic behavior. In this paper, we focused on approach (3: inference from observation of economic behavior, based on this assumption of existence of the relationship between the actual behavior of a decision maker and the behavior predicted from empirically specified models. A new non-parametric method and the QP method were used to calculate the coefficient of risk aversion. We maximized the decision maker expected utility with the E-V formulation (Freund, 1956. Ideally, in constructing a QP model, the variance-covariance matrix should be formed for each individual farmer. For this purpose, a sample of 100 farmers was selected using random sampling and their data about 14 products of years 2008- 2012 were assembled. The lowlands of Esfarayen were used since within this area, production possibilities are rather homogeneous. Results and Discussion: The results of this study showed that there was low correlation between some of the activities, which implies opportunities for income stabilization through diversification. With respect to transitory income, Ra, vary from 0.000006 to 0.000361 and the absolute coefficient of risk aversion in our sample were 0.00005. The estimated Ra values vary considerably from farm to farm. The results showed that the estimated Ra for the subsample existing of 'non-wealthy' farmers was 0.00010. The subsample with farmers in the 'wealthy' group had an
Sharma, Andy
2017-06-01
The purpose of this study was to showcase an advanced methodological approach to model disability and institutional entry. Both of these are important areas to investigate given the on-going aging of the United States population. By 2020, approximately 15% of the population will be 65 years and older. Many of these older adults will experience disability and require formal care. A probit analysis was employed to determine which disabilities were associated with admission into an institution (i.e. long-term care). Since this framework imposes strong distributional assumptions, misspecification leads to inconsistent estimators. To overcome such a short-coming, this analysis extended the probit framework by employing an advanced semi-nonparamertic maximum likelihood estimation utilizing Hermite polynomial expansions. Specification tests show semi-nonparametric estimation is preferred over probit. In terms of the estimates, semi-nonparametric ratios equal 42 for cognitive difficulty, 64 for independent living, and 111 for self-care disability while probit yields much smaller estimates of 19, 30, and 44, respectively. Public health professionals can use these results to better understand why certain interventions have not shown promise. Equally important, healthcare workers can use this research to evaluate which type of treatment plans may delay institutionalization and improve the quality of life for older adults. Implications for rehabilitation With on-going global aging, understanding the association between disability and institutional entry is important in devising successful rehabilitation interventions. Semi-nonparametric is preferred to probit and shows ambulatory and cognitive impairments present high risk for institutional entry (long-term care). Informal caregiving and home-based care require further examination as forms of rehabilitation/therapy for certain types of disabilities.
Martinez Manzanera, Octavio; Elting, Jan Willem; van der Hoeven, Johannes H.; Maurits, Natasha M.
2016-01-01
In the clinic, tremor is diagnosed during a time-limited process in which patients are observed and the characteristics of tremor are visually assessed. For some tremor disorders, a more detailed analysis of these characteristics is needed. Accelerometry and electromyography can be used to obtain a better insight into tremor. Typically, routine clinical assessment of accelerometry and electromyography data involves visual inspection by clinicians and occasionally computational analysis to obtain objective characteristics of tremor. However, for some tremor disorders these characteristics may be different during daily activity. This variability in presentation between the clinic and daily life makes a differential diagnosis more difficult. A long-term recording of tremor by accelerometry and/or electromyography in the home environment could help to give a better insight into the tremor disorder. However, an evaluation of such recordings using routine clinical standards would take too much time. We evaluated a range of techniques that automatically detect tremor segments in accelerometer data, as accelerometer data is more easily obtained in the home environment than electromyography data. Time can be saved if clinicians only have to evaluate the tremor characteristics of segments that have been automatically detected in longer daily activity recordings. We tested four non-parametric methods and five parametric methods on clinical accelerometer data from 14 patients with different tremor disorders. The consensus between two clinicians regarding the presence or absence of tremor on 3943 segments of accelerometer data was employed as reference. The nine methods were tested against this reference to identify their optimal parameters. Non-parametric methods generally performed better than parametric methods on our dataset when optimal parameters were used. However, one parametric method, employing the high frequency content of the tremor bandwidth under consideration
Directory of Open Access Journals (Sweden)
Haorui Liu
2016-01-01
Full Text Available In the car control systems, it is hard to measure some key vehicle states directly and accurately when running on the road and the cost of the measurement is high as well. To address these problems, a vehicle state estimation method based on the kernel principal component analysis and the improved Elman neural network is proposed. Combining with nonlinear vehicle model of three degrees of freedom (3 DOF, longitudinal, lateral, and yaw motion, this paper applies the method to the soft sensor of the vehicle states. The simulation results of the double lane change tested by Matlab/SIMULINK cosimulation prove the KPCA-IENN algorithm (kernel principal component algorithm and improved Elman neural network to be quick and precise when tracking the vehicle states within the nonlinear area. This algorithm method can meet the software performance requirements of the vehicle states estimation in precision, tracking speed, noise suppression, and other aspects.
DEFF Research Database (Denmark)
Gimperlein, Heiko; Grubb, Gerd
2014-01-01
The purpose of this article is to establish upper and lower estimates for the integral kernel of the semigroup exp(−t P) associated to a classical, strongly elliptic pseudodifferential operator P of positive order on a closed manifold. The Poissonian bounds generalize those obtained for perturbat......The purpose of this article is to establish upper and lower estimates for the integral kernel of the semigroup exp(−t P) associated to a classical, strongly elliptic pseudodifferential operator P of positive order on a closed manifold. The Poissonian bounds generalize those obtained...... for perturbations of fractional powers of the Laplacian. In the selfadjoint case, extensions to t∈C+ are studied. In particular, our results apply to the Dirichlet-to-Neumann semigroup....
Automated voxelization of 3D atom probe data through kernel density estimation
International Nuclear Information System (INIS)
Srinivasan, Srikant; Kaluskar, Kaustubh; Dumpala, Santoshrupa; Broderick, Scott; Rajan, Krishna
2015-01-01
Identifying nanoscale chemical features from atom probe tomography (APT) data routinely involves adjustment of voxel size as an input parameter, through visual supervision, making the final outcome user dependent, reliant on heuristic knowledge and potentially prone to error. This work utilizes Kernel density estimators to select an optimal voxel size in an unsupervised manner to perform feature selection, in particular targeting resolution of interfacial features and chemistries. The capability of this approach is demonstrated through analysis of the γ / γ’ interface in a Ni–Al–Cr superalloy. - Highlights: • Develop approach for standardizing aspects of atom probe reconstruction. • Use Kernel density estimators to select optimal voxel sizes in an unsupervised manner. • Perform interfacial analysis of Ni–Al–Cr superalloy, using new automated approach. • Optimize voxel size to preserve the feature of interest and minimizing loss / noise.
Michalski, Andrew S; Edwards, W Brent; Boyd, Steven K
2017-10-17
Quantitative computed tomography has been posed as an alternative imaging modality to investigate osteoporosis. We examined the influence of computed tomography convolution back-projection reconstruction kernels on the analysis of bone quantity and estimated mechanical properties in the proximal femur. Eighteen computed tomography scans of the proximal femur were reconstructed using both a standard smoothing reconstruction kernel and a bone-sharpening reconstruction kernel. Following phantom-based density calibration, we calculated typical bone quantity outcomes of integral volumetric bone mineral density, bone volume, and bone mineral content. Additionally, we performed finite element analysis in a standard sideways fall on the hip loading configuration. Significant differences for all outcome measures, except integral bone volume, were observed between the 2 reconstruction kernels. Volumetric bone mineral density measured using images reconstructed by the standard kernel was significantly lower (6.7%, p kernel. Furthermore, the whole-bone stiffness and the failure load measured in images reconstructed by the standard kernel were significantly lower (16.5%, p kernel. These data suggest that for future quantitative computed tomography studies, a standardized reconstruction kernel will maximize reproducibility, independent of the use of a quantitative calibration phantom. Copyright © 2017 The International Society for Clinical Densitometry. Published by Elsevier Inc. All rights reserved.
Estimation of the applicability domain of kernel-based machine learning models for virtual screening
Directory of Open Access Journals (Sweden)
Fechner Nikolas
2010-03-01
Full Text Available Abstract Background The virtual screening of large compound databases is an important application of structural-activity relationship models. Due to the high structural diversity of these data sets, it is impossible for machine learning based QSAR models, which rely on a specific training set, to give reliable results for all compounds. Thus, it is important to consider the subset of the chemical space in which the model is applicable. The approaches to this problem that have been published so far mostly use vectorial descriptor representations to define this domain of applicability of the model. Unfortunately, these cannot be extended easily to structured kernel-based machine learning models. For this reason, we propose three approaches to estimate the domain of applicability of a kernel-based QSAR model. Results We evaluated three kernel-based applicability domain estimations using three different structured kernels on three virtual screening tasks. Each experiment consisted of the training of a kernel-based QSAR model using support vector regression and the ranking of a disjoint screening data set according to the predicted activity. For each prediction, the applicability of the model for the respective compound is quantitatively described using a score obtained by an applicability domain formulation. The suitability of the applicability domain estimation is evaluated by comparing the model performance on the subsets of the screening data sets obtained by different thresholds for the applicability scores. This comparison indicates that it is possible to separate the part of the chemspace, in which the model gives reliable predictions, from the part consisting of structures too dissimilar to the training set to apply the model successfully. A closer inspection reveals that the virtual screening performance of the model is considerably improved if half of the molecules, those with the lowest applicability scores, are omitted from the screening
Fechner, Nikolas; Jahn, Andreas; Hinselmann, Georg; Zell, Andreas
2010-03-11
The virtual screening of large compound databases is an important application of structural-activity relationship models. Due to the high structural diversity of these data sets, it is impossible for machine learning based QSAR models, which rely on a specific training set, to give reliable results for all compounds. Thus, it is important to consider the subset of the chemical space in which the model is applicable. The approaches to this problem that have been published so far mostly use vectorial descriptor representations to define this domain of applicability of the model. Unfortunately, these cannot be extended easily to structured kernel-based machine learning models. For this reason, we propose three approaches to estimate the domain of applicability of a kernel-based QSAR model. We evaluated three kernel-based applicability domain estimations using three different structured kernels on three virtual screening tasks. Each experiment consisted of the training of a kernel-based QSAR model using support vector regression and the ranking of a disjoint screening data set according to the predicted activity. For each prediction, the applicability of the model for the respective compound is quantitatively described using a score obtained by an applicability domain formulation. The suitability of the applicability domain estimation is evaluated by comparing the model performance on the subsets of the screening data sets obtained by different thresholds for the applicability scores. This comparison indicates that it is possible to separate the part of the chemspace, in which the model gives reliable predictions, from the part consisting of structures too dissimilar to the training set to apply the model successfully. A closer inspection reveals that the virtual screening performance of the model is considerably improved if half of the molecules, those with the lowest applicability scores, are omitted from the screening. The proposed applicability domain formulations
Bayesian Bandwidth Selection for a Nonparametric Regression Model with Mixed Types of Regressors
Directory of Open Access Journals (Sweden)
Xibin Zhang
2016-04-01
Full Text Available This paper develops a sampling algorithm for bandwidth estimation in a nonparametric regression model with continuous and discrete regressors under an unknown error density. The error density is approximated by the kernel density estimator of the unobserved errors, while the regression function is estimated using the Nadaraya-Watson estimator admitting continuous and discrete regressors. We derive an approximate likelihood and posterior for bandwidth parameters, followed by a sampling algorithm. Simulation results show that the proposed approach typically leads to better accuracy of the resulting estimates than cross-validation, particularly for smaller sample sizes. This bandwidth estimation approach is applied to nonparametric regression model of the Australian All Ordinaries returns and the kernel density estimation of gross domestic product (GDP growth rates among the organisation for economic co-operation and development (OECD and non-OECD countries.
Nonparametric Estimation of ATE and QTE: An Application of Fractile Graphical Analysis
Directory of Open Access Journals (Sweden)
Gabriel V. Montes-Rojas
2011-01-01
characteristics. The proposed method has two steps: first, the propensity score is estimated, and, second, a blocking estimation procedure using this estimate is used to compute treatment effects. In both cases, the estimators are proved to be consistent. Monte Carlo results show a better performance than other procedures based on the propensity score. Finally, these estimators are applied to a job training dataset.
Energy Technology Data Exchange (ETDEWEB)
Constantinescu, C C; Yoder, K K; Normandin, M D; Morris, E D [Department of Radiology, Indiana University School of Medicine, Indianapolis, IN (United States); Kareken, D A [Department of Neurology, Indiana University School of Medicine, Indianapolis, IN (United States); Bouman, C A [Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN (United States); O' Connor, S J [Department of Psychiatry, Indiana University School of Medicine, Indianapolis, IN (United States)], E-mail: emorris@iupui.edu
2008-03-07
We previously developed a model-independent technique (non-parametric ntPET) for extracting the transient changes in neurotransmitter concentration from paired (rest and activation) PET studies with a receptor ligand. To provide support for our method, we introduced three hypotheses of validation based on work by Endres and Carson (1998 J. Cereb. Blood Flow Metab. 18 1196-210) and Yoder et al (2004 J. Nucl. Med. 45 903-11), and tested them on experimental data. All three hypotheses describe relationships between the estimated free (synaptic) dopamine curves (F{sup DA}(t)) and the change in binding potential ({delta}BP). The veracity of the F{sup DA}(t) curves recovered by nonparametric ntPET is supported when the data adhere to the following hypothesized behaviors: (1) {delta}BP should decline with increasing DA peak time, (2) {delta}BP should increase as the strength of the temporal correlation between F{sup DA}(t) and the free raclopride (F{sup RAC}(t)) curve increases, (3) {delta}BP should decline linearly with the effective weighted availability of the receptor sites. We analyzed regional brain data from 8 healthy subjects who received two [{sup 11}C]raclopride scans: one at rest, and one during which unanticipated IV alcohol was administered to stimulate dopamine release. For several striatal regions, nonparametric ntPET was applied to recover F{sup DA}(t), and binding potential values were determined. Kendall rank-correlation analysis confirmed that the F{sup DA}(t) data followed the expected trends for all three validation hypotheses. Our findings lend credence to our model-independent estimates of F{sup DA}(t). Application of nonparametric ntPET may yield important insights into how alterations in timing of dopaminergic neurotransmission are involved in the pathologies of addiction and other psychiatric disorders.
Calculation of the time resolution of the J-PET tomograph using kernel density estimation
Raczyński, L.; Wiślicki, W.; Krzemień, W.; Kowalski, P.; Alfs, D.; Bednarski, T.; Białas, P.; Curceanu, C.; Czerwiński, E.; Dulski, K.; Gajos, A.; Głowacz, B.; Gorgol, M.; Hiesmayr, B.; Jasińska, B.; Kamińska, D.; Korcyl, G.; Kozik, T.; Krawczyk, N.; Kubicz, E.; Mohammed, M.; Pawlik-Niedźwiecka, M.; Niedźwiecki, S.; Pałka, M.; Rudy, Z.; Rundel, O.; Sharma, N. G.; Silarski, M.; Smyrski, J.; Strzelecki, A.; Wieczorek, A.; Zgardzińska, B.; Zieliński, M.; Moskal, P.
2017-06-01
In this paper we estimate the time resolution of the J-PET scanner built from plastic scintillators. We incorporate the method of signal processing using the Tikhonov regularization framework and the kernel density estimation method. We obtain simple, closed-form analytical formulae for time resolution. The proposed method is validated using signals registered by means of the single detection unit of the J-PET tomograph built from a 30 cm long plastic scintillator strip. It is shown that the experimental and theoretical results obtained for the J-PET scanner equipped with vacuum tube photomultipliers are consistent.
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected
Nonparametric Transfer Function Models
Liu, Jun M.; Chen, Rong; Yao, Qiwei
2009-01-01
In this paper a class of nonparametric transfer function models is proposed to model nonlinear relationships between ‘input’ and ‘output’ time series. The transfer function is smooth with unknown functional forms, and the noise is assumed to be a stationary autoregressive-moving average (ARMA) process. The nonparametric transfer function is estimated jointly with the ARMA parameters. By modeling the correlation in the noise, the transfer function can be estimated more efficiently. The parsimonious ARMA structure improves the estimation efficiency in finite samples. The asymptotic properties of the estimators are investigated. The finite-sample properties are illustrated through simulations and one empirical example. PMID:20628584
Using kernel density estimates to investigate lymphatic filariasis in northeast Brazil
Medeiros, Zulma; Bonfim, Cristine; Brandão, Eduardo; Netto, Maria José Evangelista; Vasconcellos, Lucia; Ribeiro, Liany; Portugal, José Luiz
2012-01-01
After more than 10 years of the Global Program to Eliminate Lymphatic Filariasis (GPELF) in Brazil, advances have been seen, but the endemic disease persists as a public health problem. The aim of this study was to describe the spatial distribution of lymphatic filariasis in the municipality of Jaboatão dos Guararapes, Pernambuco, Brazil. An epidemiological survey was conducted in the municipality, and positive filariasis cases identified in this survey were georeferenced in point form, using the GPS. A kernel intensity estimator was applied to identify clusters with greater intensity of cases. We examined 23 673 individuals and 323 individuals with microfilaremia were identified, representing a mean prevalence rate of 1.4%. Around 88% of the districts surveyed presented cases of filarial infection, with prevalences of 0–5.6%. The male population was more affected by the infection, with 63.8% of the cases (P<0.005). Positive cases were found in all age groups examined. The kernel intensity estimator identified the areas of greatest intensity and least intensity of filarial infection cases. The case distribution was heterogeneous across the municipality. The kernel estimator identified spatial clusters of cases, thus indicating locations with greater intensity of transmission. The main advantage of this type of analysis lies in its ability to rapidly and easily show areas with the highest concentration of cases, thereby contributing towards planning, monitoring, and surveillance of filariasis elimination actions. Incorporation of geoprocessing and spatial analysis techniques constitutes an important tool for use within the GPELF. PMID:22943547
Analytical Plug-In Method for Kernel Density Estimator Applied to Genetic Neutrality Study
Troudi, Molka; Alimi, Adel M.; Saoudi, Samir
2008-12-01
The plug-in method enables optimization of the bandwidth of the kernel density estimator in order to estimate probability density functions (pdfs). Here, a faster procedure than that of the common plug-in method is proposed. The mean integrated square error (MISE) depends directly upon [InlineEquation not available: see fulltext.] which is linked to the second-order derivative of the pdf. As we intend to introduce an analytical approximation of [InlineEquation not available: see fulltext.], the pdf is estimated only once, at the end of iterations. These two kinds of algorithm are tested on different random variables having distributions known for their difficult estimation. Finally, they are applied to genetic data in order to provide a better characterisation in the mean of neutrality of Tunisian Berber populations.
Analytical Plug-In Method for Kernel Density Estimator Applied to Genetic Neutrality Study
Directory of Open Access Journals (Sweden)
Samir Saoudi
2008-07-01
Full Text Available The plug-in method enables optimization of the bandwidth of the kernel density estimator in order to estimate probability density functions (pdfs. Here, a faster procedure than that of the common plug-in method is proposed. The mean integrated square error (MISE depends directly upon J(f which is linked to the second-order derivative of the pdf. As we intend to introduce an analytical approximation of J(f, the pdf is estimated only once, at the end of iterations. These two kinds of algorithm are tested on different random variables having distributions known for their difficult estimation. Finally, they are applied to genetic data in order to provide a better characterisation in the mean of neutrality of Tunisian Berber populations.
Efficient 3D movement-based kernel density estimator and application to wildlife ecology
Tracey-PR, Jeff; Sheppard, James K.; Lockwood, Glenn K.; Chourasia, Amit; Tatineni, Mahidhar; Fisher, Robert N.; Sinkovits, Robert S.
2014-01-01
We describe an efficient implementation of a 3D movement-based kernel density estimator for determining animal space use from discrete GPS measurements. This new method provides more accurate results, particularly for species that make large excursions in the vertical dimension. The downside of this approach is that it is much more computationally expensive than simpler, lower-dimensional models. Through a combination of code restructuring, parallelization and performance optimization, we were able to reduce the time to solution by up to a factor of 1000x, thereby greatly improving the applicability of the method.
Afshinpour, Babak; Hossein-Zadeh, Gholam-Ali; Soltanian-Zadeh, Hamid
2008-06-30
Unknown low frequency fluctuations called "trend" are observed in noisy time-series measured for different applications. In some disciplines, they carry primary information while in other fields such as functional magnetic resonance imaging (fMRI) they carry nuisance effects. In all cases, however, it is necessary to estimate them accurately. In this paper, a method for estimating trend in the presence of fractal noise is proposed and applied to fMRI time-series. To this end, a partly linear model (PLM) is fitted to each time-series. The parametric and nonparametric parts of PLM are considered as contributions of hemodynamic response and trend, respectively. Using the whitening property of wavelet transform, the unknown components of the model are estimated in the wavelet domain. The results of the proposed method are compared to those of other parametric trend-removal approaches such as spline and polynomial models. It is shown that the proposed method improves activation detection and decreases variance of the estimated parameters relative to the other methods.
Type I Error Rates and Power Estimates of Selected Parametric and Nonparametric Tests of Scale.
Olejnik, Stephen F.; Algina, James
1987-01-01
Estimated Type I Error rates and power are reported for the Brown-Forsythe, O'Brien, Klotz, and Siegal-Tukey procedures. The effect of aligning the data using deviations from group means or group medians is investigated. (RB)
International Nuclear Information System (INIS)
Trapero, Juan R.
2016-01-01
In order to integrate solar energy into the grid it is important to predict the solar radiation accurately, where forecast errors can lead to significant costs. Recently, the increasing statistical approaches that cope with this problem is yielding a prolific literature. In general terms, the main research discussion is centred on selecting the “best” forecasting technique in accuracy terms. However, the need of the users of such forecasts require, apart from point forecasts, information about the variability of such forecast to compute prediction intervals. In this work, we will analyze kernel density estimation approaches, volatility forecasting models and combination of both of them in order to improve the prediction intervals performance. The results show that an optimal combination in terms of prediction interval statistical tests can achieve the desired confidence level with a lower average interval width. Data from a facility located in Spain are used to illustrate our methodology. - Highlights: • This work explores uncertainty forecasting models to build prediction intervals. • Kernel density estimators, exponential smoothing and GARCH models are compared. • An optimal combination of methods provides the best results. • A good compromise between coverage and average interval width is shown.
Adaptive Estimation of Heteroscedastic Money Demand Model of Pakistan
Directory of Open Access Journals (Sweden)
Muhammad Aslam
2007-07-01
Full Text Available For the problem of estimation of Money demand model of Pakistan, money supply (M1 shows heteroscedasticity of the unknown form. For estimation of such model we compare two adaptive estimators with ordinary least squares estimator and show the attractive performance of the adaptive estimators, namely, nonparametric kernel estimator and nearest neighbour regression estimator. These comparisons are made on the basis standard errors of the estimated coefficients, standard error of regression, Akaike Information Criteria (AIC value, and the Durban-Watson statistic for autocorrelation. We further show that nearest neighbour regression estimator performs better when comparing with the other nonparametric kernel estimator.
Hrycik, Janelle M.; Chassé, Joël; Ruddick, Barry R.; Taggart, Christopher T.
2013-11-01
Early life-stage dispersal influences recruitment and is of significance in explaining the distribution and connectivity of marine species. Motivations for quantifying dispersal range from biodiversity conservation to the design of marine reserves and the mitigation of species invasions. Here we compare estimates of real particle dispersion in a coastal marine environment with similar estimates provided by hydrodynamic modelling. We do so by using a system of magnetically attractive particles (MAPs) and a magnetic-collector array that provides measures of Lagrangian dispersion based on the time-integration of MAPs dispersing through the array. MAPs released as a point source in a coastal marine location dispersed through the collector array over a 5-7 d period. A virtual release and observed (real-time) environmental conditions were used in a high-resolution three-dimensional hydrodynamic model to estimate the dispersal of virtual particles (VPs). The number of MAPs captured throughout the collector array and the number of VPs that passed through each corresponding model location were enumerated and compared. Although VP dispersal reflected several aspects of the observed MAP dispersal, the comparisons demonstrated model sensitivity to the small-scale (random-walk) particle diffusivity parameter (Kp). The one-dimensional dispersal kernel for the MAPs had an e-folding scale estimate in the range of 5.19-11.44 km, while those from the model simulations were comparable at 1.89-6.52 km, and also demonstrated sensitivity to Kp. Variations among comparisons are related to the value of Kp used in modelling and are postulated to be related to MAP losses from the water column and (or) shear dispersion acting on the MAPs; a process that is constrained in the model. Our demonstration indicates a promising new way of 1) quantitatively and empirically estimating the dispersal kernel in aquatic systems, and 2) quantitatively assessing and (or) improving regional hydrodynamic
Nonparametric bootstrap procedures for predictive inference based on recursive estimation schemes
Corradi, Valentina; Swanson, Norman R.
2005-01-01
Our objectives in this paper are twofold. First, we introduce block bootstrap techniques that are (first order) valid in recursive estimation frameworks. Thereafter, we present two examples where predictive accuracy tests are made operational using our new bootstrap procedures. In one application, we outline a consistent test for out-of-sample nonlinear Granger causality, and in the other we outline a test for selecting amongst multiple alternative forecasting models, all of which are possibl...
International Nuclear Information System (INIS)
Hanachi, Houman; Liu, Jie; Banerjee, Avisekh; Chen, Ying
2015-01-01
Modern health management approaches for gas turbine engines (GTEs) aim to precisely estimate the health state of the GTE components to optimize maintenance decisions with respect to both economy and safety. In this research, we propose an advanced framework to identify the most likely degradation state of the turbine section in a GTE for prognostics and health management (PHM) applications. A novel nonlinear thermodynamic model is used to predict the performance parameters of the GTE given the measurements. The ratio between real efficiency of the GTE and simulated efficiency in the newly installed condition is defined as the health indicator and provided at each sequence. The symptom of nonrecoverable degradations in the turbine section, i.e. loss of turbine efficiency, is assumed to be the internal degradation state. A regularized auxiliary particle filter (RAPF) is developed to sequentially estimate the internal degradation state in nonuniform time sequences upon receiving sets of new measurements. The effectiveness of the technique is examined using the operating data over an entire time-between-overhaul cycle of a simple-cycle industrial GTE. The results clearly show the trend of degradation in the turbine section and the occasional fluctuations, which are well supported by the service history of the GTE. The research also suggests the efficacy of the proposed technique to monitor the health state of the turbine section of a GTE by implementing model-based PHM without the need for additional instrumentation. (paper)
International Nuclear Information System (INIS)
Stevens, D.L.; Dagle, G.E.
1986-01-01
Retention and translocation of inhaled radionuclides are often estimated from the sacrifice of multiple animals at different time points. The data for each time point can be averaged and a smooth curve fitted to the mean values, or a smooth curve may be fitted to the entire data set. However, an analysis based on means may not be the most appropriate if there is substantial variation in the initial amount of the radionuclide inhaled or if the data are subject to outliers. A method has been developed that takes account of these problems. The body burden is viewed as a compartmental system, with the compartments identified with body organs. A median polish is applied to the multiple logistic transform of the compartmental fractions (compartment burden/total burden) at each time point. A smooth function is fitted to the results of the median polish. This technique was applied to data from beagles exposed to an aerosol of 239 Pu(NO 3 ) 4 . Models of retention and translocation for lungs, skeleton, liver, kidneys, and tracheobronchial lymph nodes were developed and used to estimate dose. 4 refs., 3 figs., 4 tabs
de-Graft Acquah, Henry
2014-01-01
This paper highlights the sensitivity of technical efficiency estimates to estimation approaches using empirical data. Firm specific technical efficiency and mean technical efficiency are estimated using the non parametric Data Envelope Analysis (DEA) and the parametric Corrected Ordinary Least Squares (COLS) and Stochastic Frontier Analysis (SFA) approaches. Mean technical efficiency is found to be sensitive to the choice of estimation technique. Analysis of variance and Tukeyâ€™s test sugge...
Near-native protein loop sampling using nonparametric density estimation accommodating sparcity.
Joo, Hyun; Chavan, Archana G; Day, Ryan; Lennox, Kristin P; Sukhanov, Paul; Dahl, David B; Vannucci, Marina; Tsai, Jerry
2011-10-01
Unlike the core structural elements of a protein like regular secondary structure, template based modeling (TBM) has difficulty with loop regions due to their variability in sequence and structure as well as the sparse sampling from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM). Models are quickly generated based on samples from these distributions and were enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For the canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD 7.0 Å), this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct test of sampling to the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method in loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near native loop structure. The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/.
Near-native protein loop sampling using nonparametric density estimation accommodating sparcity.
Directory of Open Access Journals (Sweden)
Hyun Joo
2011-10-01
Full Text Available Unlike the core structural elements of a protein like regular secondary structure, template based modeling (TBM has difficulty with loop regions due to their variability in sequence and structure as well as the sparse sampling from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM. Models are quickly generated based on samples from these distributions and were enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For the canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD 7.0 Å, this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct test of sampling to the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method in loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near native loop structure. The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/.
Near-Native Protein Loop Sampling Using Nonparametric Density Estimation Accommodating Sparcity
Day, Ryan; Lennox, Kristin P.; Sukhanov, Paul; Dahl, David B.; Vannucci, Marina; Tsai, Jerry
2011-01-01
Unlike the core structural elements of a protein like regular secondary structure, template based modeling (TBM) has difficulty with loop regions due to their variability in sequence and structure as well as the sparse sampling from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM). Models are quickly generated based on samples from these distributions and were enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For the canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD 7.0 Å), this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct test of sampling to the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method in loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near native loop structure. The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/. PMID:22028638
Kernel density estimation and transition maps of Moldavian Neolithic and Eneolithic settlement
Directory of Open Access Journals (Sweden)
Robin Brigand
2018-04-01
Full Text Available The data presented in this article are related to the research article entitled “Neo-Eneolithic settlement pattern and salt exploitation in Romanian Moldavia” (Brigand and Weller, 2018 [1]. Kernel density estimation (KDE is used in order to move beyond the discrete distribution of sites and to enable us to work on a continuous surface that reflects the intensity of the occupation in the space. Maps of density per period – Neolithic I (Cris, Neolithic II (LBK, Eneolithic I (Precucuteni, Eneolithic II (Cucuteni A, Eneolithic III-IV (Cucuteni A-B and B – are used to create maps of density difference (Figs. 1–4 in order to analyse the dynamic (either non-existent, negative or positive between two chronological sequences.
Zhao, Zhibiao
2011-06-01
We address the nonparametric model validation problem for hidden Markov models with partially observable variables and hidden states. We achieve this goal by constructing a nonparametric simultaneous confidence envelope for transition density function of the observable variables and checking whether the parametric density estimate is contained within such an envelope. Our specification test procedure is motivated by a functional connection between the transition density of the observable variables and the Markov transition kernel of the hidden states. Our approach is applicable for continuous time diffusion models, stochastic volatility models, nonlinear time series models, and models with market microstructure noise.
Pencil kernel correction and residual error estimation for quality-index-based dose calculations
International Nuclear Information System (INIS)
Nyholm, Tufve; Olofsson, Joergen; Ahnesjoe, Anders; Georg, Dietmar; Karlsson, Mikael
2006-01-01
Experimental data from 593 photon beams were used to quantify the errors in dose calculations using a previously published pencil kernel model. A correction of the kernel was derived in order to remove the observed systematic errors. The remaining residual error for individual beams was modelled through uncertainty associated with the kernel model. The methods were tested against an independent set of measurements. No significant systematic error was observed in the calculations using the derived correction of the kernel and the remaining random errors were found to be adequately predicted by the proposed method
Network Kernel Density Estimation for the Analysis of Facility POI Hotspots
Directory of Open Access Journals (Sweden)
YU Wenhao
2015-12-01
Full Text Available The distribution pattern of urban facility POIs (points of interest usually forms clusters (i.e. "hotspots" in urban geographic space. To detect such type of hotspot, the methods mostly employ spatial density estimation based on Euclidean distance, ignoring the fact that the service function and interrelation of urban feasibilities is carried out on the network path distance, neither than conventional Euclidean distance. By using these methods, it is difficult to exactly and objectively delimitate the shape and the size of hotspot. Therefore, this research adopts the kernel density estimation based on the network distance to compute the density of hotspot and proposes a simple and efficient algorithm. The algorithm extends the 2D dilation operator to the 1D morphological operator, thus computing the density of network unit. Through evaluation experiment, it is suggested that the algorithm is more efficient and scalable than the existing algorithms. Based on the case study on real POI data, the range of hotspot can highlight the spatial characteristic of urban functions along traffic routes, in order to provide valuable spatial knowledge and information services for the applications of region planning, navigation and geographic information inquiring.
Lévy matters VI Lévy-type processes moments, construction and heat kernel estimates
Kühn, Franziska
2017-01-01
Presenting some recent results on the construction and the moments of Lévy-type processes, the focus of this volume is on a new existence theorem, which is proved using a parametrix construction. Applications range from heat kernel estimates for a class of Lévy-type processes to existence and uniqueness theorems for Lévy-driven stochastic differential equations with Hölder continuous coefficients. Moreover, necessary and sufficient conditions for the existence of moments of Lévy-type processes are studied and some estimates on moments are derived. Lévy-type processes behave locally like Lévy processes but, in contrast to Lévy processes, they are not homogeneous in space. Typical examples are processes with varying index of stability and solutions of Lévy-driven stochastic differential equations. This is the sixth volume in a subseries of the Lecture Notes in Mathematics called Lévy Matters. Each volume describes a number of important topics in the theory or applicati ons of Lévy processes and pays ...
Considering a non-polynomial basis for local kernel regression problem
Silalahi, Divo Dharma; Midi, Habshah
2017-01-01
A common used as solution for local kernel nonparametric regression problem is given using polynomial regression. In this study, we demonstrated the estimator and properties using maximum likelihood estimator for a non-polynomial basis such B-spline to replacing the polynomial basis. This estimator allows for flexibility in the selection of a bandwidth and a knot. The best estimator was selected by finding an optimal bandwidth and knot through minimizing the famous generalized validation function.
2009-01-01
Abstract A kernel estimator of the conditional quantile is defined for a scalar response variable given a covariate taking values in a semi-metric space. The approach generalizes the median?s L1-norm estimator. The almost complete consistency and asymptotic normality are stated. correspondance: Corresponding author. Tel: +33 320 964 933; fax: +33 320 964 704. (Lemdani, Mohamed) (Laksaci, Ali) mohamed.lemdani@univ-lill...
DEFF Research Database (Denmark)
Varneskov, Rasmus T.
. Lastly, two small empirical applications to high frequency stock market data illustrate the bias reduction relative to competing estimators in estimating correlations, realized betas, and mean-variance frontiers, as well as the use of the new estimators in the dynamics of hedging....... problems. These transformations are all shown to inherit the desirable asymptotic properties of the generalized at-top realized kernels. A simulation study shows that the class of estimators has a superior finite sample tradeoff between bias and root mean squared error relative to competing estimators...
International Nuclear Information System (INIS)
Li Heng; Mohan, Radhe; Zhu, X Ronald
2008-01-01
The clinical applications of kilovoltage x-ray cone-beam computed tomography (CBCT) have been compromised by the limited quality of CBCT images, which typically is due to a substantial scatter component in the projection data. In this paper, we describe an experimental method of deriving the scatter kernel of a CBCT imaging system. The estimated scatter kernel can be used to remove the scatter component from the CBCT projection images, thus improving the quality of the reconstructed image. The scattered radiation was approximated as depth-dependent, pencil-beam kernels, which were derived using an edge-spread function (ESF) method. The ESF geometry was achieved with a half-beam block created by a 3 mm thick lead sheet placed on a stack of slab solid-water phantoms. Measurements for ten water-equivalent thicknesses (WET) ranging from 0 cm to 41 cm were taken with (half-blocked) and without (unblocked) the lead sheet, and corresponding pencil-beam scatter kernels or point-spread functions (PSFs) were then derived without assuming any empirical trial function. The derived scatter kernels were verified with phantom studies. Scatter correction was then incorporated into the reconstruction process to improve image quality. For a 32 cm diameter cylinder phantom, the flatness of the reconstructed image was improved from 22% to 5%. When the method was applied to CBCT images for patients undergoing image-guided therapy of the pelvis and lung, the variation in selected regions of interest (ROIs) was reduced from >300 HU to <100 HU. We conclude that the scatter reduction technique utilizing the scatter kernel effectively suppresses the artifact caused by scatter in CBCT.
International Nuclear Information System (INIS)
Kuosmanen, Timo
2012-01-01
Electricity distribution network is a prime example of a natural local monopoly. In many countries, electricity distribution is regulated by the government. Many regulators apply frontier estimation techniques such as data envelopment analysis (DEA) or stochastic frontier analysis (SFA) as an integral part of their regulatory framework. While more advanced methods that combine nonparametric frontier with stochastic error term are known in the literature, in practice, regulators continue to apply simplistic methods. This paper reports the main results of the project commissioned by the Finnish regulator for further development of the cost frontier estimation in their regulatory framework. The key objectives of the project were to integrate a stochastic SFA-style noise term to the nonparametric, axiomatic DEA-style cost frontier, and to take the heterogeneity of firms and their operating environments better into account. To achieve these objectives, a new method called stochastic nonparametric envelopment of data (StoNED) was examined. Based on the insights and experiences gained in the empirical analysis using the real data of the regulated networks, the Finnish regulator adopted the StoNED method in use from 2012 onwards.
Irvine, Michael A; Hollingsworth, T Déirdre
2018-05-26
Fitting complex models to epidemiological data is a challenging problem: methodologies can be inaccessible to all but specialists, there may be challenges in adequately describing uncertainty in model fitting, the complex models may take a long time to run, and it can be difficult to fully capture the heterogeneity in the data. We develop an adaptive approximate Bayesian computation scheme to fit a variety of epidemiologically relevant data with minimal hyper-parameter tuning by using an adaptive tolerance scheme. We implement a novel kernel density estimation scheme to capture both dispersed and multi-dimensional data, and directly compare this technique to standard Bayesian approaches. We then apply the procedure to a complex individual-based simulation of lymphatic filariasis, a human parasitic disease. The procedure and examples are released alongside this article as an open access library, with examples to aid researchers to rapidly fit models to data. This demonstrates that an adaptive ABC scheme with a general summary and distance metric is capable of performing model fitting for a variety of epidemiological data. It also does not require significant theoretical background to use and can be made accessible to the diverse epidemiological research community. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
International Nuclear Information System (INIS)
Mekaroonreung, Maethee; Johnson, Andrew L.
2012-01-01
Weak disposability between outputs and pollutants, defined as a simultaneous proportional reduction of both outputs and pollutants, assumes that pollutants are byproducts of the output generation process and that a firm can “freely dispose” of both by scaling down production levels, leaving some inputs idle. Based on the production axioms of monotonicity, convexity and weak disposability, we formulate a convex nonparametric least squares (CNLS) quadratic optimization problem to estimate a frontier production function assuming either a deterministic disturbance term consisting only of inefficiency, or a composite disturbance term composed of both inefficiency and noise. The suggested methodology extends the stochastic semi-nonparametric envelopment of data (StoNED) described in Kuosmanen and Kortelainen (2011). Applying the method to estimate the shadow prices of SO 2 and NO x generated by U.S. coal power plants, we conclude that the weak disposability StoNED method provides more consistent estimates of market prices. - Highlights: ► Develops methodology to estimate shadow prices for SO 2 and NO x in the U.S. coal power plants. ► Extends CNLS and StoNED methods to include the weak disposability assumption. ► Estimates the range of SO 2 and NO x shadow prices as 201–343 $/ton and 409–1352 $/ton. ► StoNED method provides more accurate estimates of shadow prices than deterministic frontier.
Siciliani, Luigi
2006-01-01
Policy makers are increasingly interested in developing performance indicators that measure hospital efficiency. These indicators may give the purchasers of health services an additional regulatory tool to contain health expenditure. Using panel data, this study compares different parametric (econometric) and non-parametric (linear programming) techniques for the measurement of a hospital's technical efficiency. This comparison was made using a sample of 17 Italian hospitals in the years 1996-9. Highest correlations are found in the efficiency scores between the non-parametric data envelopment analysis under the constant returns to scale assumption (DEA-CRS) and several parametric models. Correlation reduces markedly when using more flexible non-parametric specifications such as data envelopment analysis under the variable returns to scale assumption (DEA-VRS) and the free disposal hull (FDH) model. Correlation also generally reduces when moving from one output to two-output specifications. This analysis suggests that there is scope for developing performance indicators at hospital level using panel data, but it is important that extensive sensitivity analysis is carried out if purchasers wish to make use of these indicators in practice.
Energy Technology Data Exchange (ETDEWEB)
Kropat, Georg, E-mail: georg.kropat@chuv.ch [Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne (Switzerland); Bochud, Francois [Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne (Switzerland); Jaboyedoff, Michel [Faculty of Geosciences and Environment, University of Lausanne, GEOPOLIS — 3793, 1015 Lausanne (Switzerland); Laedermann, Jean-Pascal [Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne (Switzerland); Murith, Christophe; Palacios, Martha [Swiss Federal Office of Public Health, Schwarzenburgstrasse 165, 3003 Berne (Switzerland); Baechler, Sébastien [Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne (Switzerland); Swiss Federal Office of Public Health, Schwarzenburgstrasse 165, 3003 Berne (Switzerland)
2015-02-01
Purpose: The aim of this study was to develop models based on kernel regression and probability estimation in order to predict and map IRC in Switzerland by taking into account all of the following: architectural factors, spatial relationships between the measurements, as well as geological information. Methods: We looked at about 240 000 IRC measurements carried out in about 150 000 houses. As predictor variables we included: building type, foundation type, year of construction, detector type, geographical coordinates, altitude, temperature and lithology into the kernel estimation models. We developed predictive maps as well as a map of the local probability to exceed 300 Bq/m{sup 3}. Additionally, we developed a map of a confidence index in order to estimate the reliability of the probability map. Results: Our models were able to explain 28% of the variations of IRC data. All variables added information to the model. The model estimation revealed a bandwidth for each variable, making it possible to characterize the influence of each variable on the IRC estimation. Furthermore, we assessed the mapping characteristics of kernel estimation overall as well as by municipality. Overall, our model reproduces spatial IRC patterns which were already obtained earlier. On the municipal level, we could show that our model accounts well for IRC trends within municipal boundaries. Finally, we found that different building characteristics result in different IRC maps. Maps corresponding to detached houses with concrete foundations indicate systematically smaller IRC than maps corresponding to farms with earth foundation. Conclusions: IRC mapping based on kernel estimation is a powerful tool to predict and analyze IRC on a large-scale as well as on a local level. This approach enables to develop tailor-made maps for different architectural elements and measurement conditions and to account at the same time for geological information and spatial relations between IRC measurements
Kernel-based tests for joint independence
DEFF Research Database (Denmark)
Pfister, Niklas; Bühlmann, Peter; Schölkopf, Bernhard
2018-01-01
if the $d$ variables are jointly independent, as long as the kernel is characteristic. Based on an empirical estimate of dHSIC, we define three different non-parametric hypothesis tests: a permutation test, a bootstrap test and a test based on a Gamma approximation. We prove that the permutation test......We investigate the problem of testing whether $d$ random variables, which may or may not be continuous, are jointly (or mutually) independent. Our method builds on ideas of the two variable Hilbert-Schmidt independence criterion (HSIC) but allows for an arbitrary number of variables. We embed...... the $d$-dimensional joint distribution and the product of the marginals into a reproducing kernel Hilbert space and define the $d$-variable Hilbert-Schmidt independence criterion (dHSIC) as the squared distance between the embeddings. In the population case, the value of dHSIC is zero if and only...
What is hypomania? Tetrachoric factor analysis and kernel estimation of DSM-IV hypomanic symptoms.
Benazzi, Franco
2009-11-01
The DSM-IV definition of hypomania, which relies on clinical consensus and historical tradition, includes several "nonspecific" symptoms. The aim of this study was to identify the core symptoms of DSM-IV hypomania. In an outpatient private practice, 266 bipolar II disorder (BP-II) and 138 major depressive disorder (MDD) remitted patients were interviewed by a bipolar-trained psychiatrist, for different study goals. Patients were questioned, using the Structured Clinical Interview for DSM-IV, about the most common symptoms and duration of recent threshold and subthreshold hypomanic episodes. Data were recorded between 2002 and 2006. Four different samples, assessed with the same methodology, were pooled for the present analyses. Tetrachoric factor analysis was used to identify core hypomanic symptoms. Distribution of symptoms by kernel estimation was inspected for bimodality. Validity of core hypomania was tested by receiver operating characteristic (ROC) analysis. The distribution of subthreshold and threshold hypomanic episodes did not show bimodality. Tetrachoric factor analysis found 2 uncorrelated factors: factor 1 included the "classic" symptoms elevated mood, inflated self-esteem, decreased need for sleep, talkativeness, and increase in goal-directed activity (overactivity); factor 2 included the "nonspecific" symptoms irritable mood, racing/crowded thoughts, and distractibility. Factor 1 discriminatory accuracy for distinguishing BP-II versus MDD was high (ROC area = 0.94). The distribution of the 5-symptom episodes of factor 1 showed clear-cut bimodality. Similar results were found for episodes limited to 3 behavioral symptoms of factor 1 (decreased need for sleep, talkativeness, and overactivity) and 4 behavioral symptoms of factor 1 (adding elevated mood), with high discriminatory accuracy. A core, categorical DSM-IV hypomania was found that included 3 to 5 symptoms, ie, behavioral symptoms and elevated mood. Behavioral symptoms (overactivity domain
Nonparametric Identification of Glucose-Insulin Process in IDDM Patient with Multi-meal Disturbance
Bhattacharjee, A.; Sutradhar, A.
2012-12-01
Modern close loop control for blood glucose level in a diabetic patient necessarily uses an explicit model of the process. A fixed parameter full order or reduced order model does not characterize the inter-patient and intra-patient parameter variability. This paper deals with a frequency domain nonparametric identification of the nonlinear glucose-insulin process in an insulin dependent diabetes mellitus patient that captures the process dynamics in presence of uncertainties and parameter variations. An online frequency domain kernel estimation method has been proposed that uses the input-output data from the 19th order first principle model of the patient in intravenous route. Volterra equations up to second order kernels with extended input vector for a Hammerstein model are solved online by adaptive recursive least square (ARLS) algorithm. The frequency domain kernels are estimated using the harmonic excitation input data sequence from the virtual patient model. A short filter memory length of M = 2 was found sufficient to yield acceptable accuracy with lesser computation time. The nonparametric models are useful for closed loop control, where the frequency domain kernels can be directly used as the transfer function. The validation results show good fit both in frequency and time domain responses with nominal patient as well as with parameter variations.
Influence Function and Robust Variant of Kernel Canonical Correlation Analysis
Alam, Md. Ashad; Fukumizu, Kenji; Wang, Yu-Ping
2017-01-01
Many unsupervised kernel methods rely on the estimation of the kernel covariance operator (kernel CO) or kernel cross-covariance operator (kernel CCO). Both kernel CO and kernel CCO are sensitive to contaminated data, even when bounded positive definite kernels are used. To the best of our knowledge, there are few well-founded robust kernel methods for statistical unsupervised learning. In addition, while the influence function (IF) of an estimator can characterize its robustness, asymptotic ...
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2011-01-01
In this paper, two non-parametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a more viable alternative to existing kernel-based approaches. The second estimator
DEFF Research Database (Denmark)
Linnet, Kristian
2005-01-01
Bootstrap, HPLC, limit of blank, limit of detection, non-parametric statistics, type I and II errors......Bootstrap, HPLC, limit of blank, limit of detection, non-parametric statistics, type I and II errors...
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole Eiler; Hansen, Peter Reinhard; Lunde, Asger
2011-01-01
In a recent paper we have introduced the class of realised kernel estimators of the increments of quadratic variation in the presence of noise. We showed that this estimator is consistent and derived its limit distribution under various assumptions on the kernel weights. In this paper we extend our...... that subsampling is impotent, in the sense that subsampling has no effect on the asymptotic distribution. Perhaps surprisingly, for the efficient smooth kernels, such as the Parzen kernel, we show that subsampling is harmful as it increases the asymptotic variance. We also study the performance of subsampled...
Generalized Jackknife Estimators of Weighted Average Derivatives
DEFF Research Database (Denmark)
Cattaneo, Matias D.; Crump, Richard K.; Jansson, Michael
With the aim of improving the quality of asymptotic distributional approximations for nonlinear functionals of nonparametric estimators, this paper revisits the large-sample properties of an important member of that class, namely a kernel-based weighted average derivative estimator. Asymptotic...
Nonparametric statistical inference
Gibbons, Jean Dickinson
2014-01-01
Thoroughly revised and reorganized, the fourth edition presents in-depth coverage of the theory and methods of the most widely used nonparametric procedures in statistical analysis and offers example applications appropriate for all areas of the social, behavioral, and life sciences. The book presents new material on the quantiles, the calculation of exact and simulated power, multiple comparisons, additional goodness-of-fit tests, methods of analysis of count data, and modern computer applications using MINITAB, SAS, and STATXACT. It includes tabular guides for simplified applications of tests and finding P values and confidence interval estimates.
A shortest-path graph kernel for estimating gene product semantic similarity
Directory of Open Access Journals (Sweden)
Alvarez Marco A
2011-07-01
Full Text Available Abstract Background Existing methods for calculating semantic similarity between gene products using the Gene Ontology (GO often rely on external resources, which are not part of the ontology. Consequently, changes in these external resources like biased term distribution caused by shifting of hot research topics, will affect the calculation of semantic similarity. One way to avoid this problem is to use semantic methods that are "intrinsic" to the ontology, i.e. independent of external knowledge. Results We present a shortest-path graph kernel (spgk method that relies exclusively on the GO and its structure. In spgk, a gene product is represented by an induced subgraph of the GO, which consists of all the GO terms annotating it. Then a shortest-path graph kernel is used to compute the similarity between two graphs. In a comprehensive evaluation using a benchmark dataset, spgk compares favorably with other methods that depend on external resources. Compared with simUI, a method that is also intrinsic to GO, spgk achieves slightly better results on the benchmark dataset. Statistical tests show that the improvement is significant when the resolution and EC similarity correlation coefficient are used to measure the performance, but is insignificant when the Pfam similarity correlation coefficient is used. Conclusions Spgk uses a graph kernel method in polynomial time to exploit the structure of the GO to calculate semantic similarity between gene products. It provides an alternative to both methods that use external resources and "intrinsic" methods with comparable performance.
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2009-01-01
In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2010-01-01
In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By
Nonparametric correlation models for portfolio allocation
DEFF Research Database (Denmark)
Aslanidis, Nektarios; Casas, Isabel
2013-01-01
This article proposes time-varying nonparametric and semiparametric estimators of the conditional cross-correlation matrix in the context of portfolio allocation. Simulations results show that the nonparametric and semiparametric models are best in DGPs with substantial variability or structural ...... currencies. Results show the nonparametric model generally dominates the others when evaluating in-sample. However, the semiparametric model is best for out-of-sample analysis....
Optimized Kernel Entropy Components.
Izquierdo-Verdiguier, Emma; Laparra, Valero; Jenssen, Robert; Gomez-Chova, Luis; Camps-Valls, Gustau
2017-06-01
This brief addresses two main issues of the standard kernel entropy component analysis (KECA) algorithm: the optimization of the kernel decomposition and the optimization of the Gaussian kernel parameter. KECA roughly reduces to a sorting of the importance of kernel eigenvectors by entropy instead of variance, as in the kernel principal components analysis. In this brief, we propose an extension of the KECA method, named optimized KECA (OKECA), that directly extracts the optimal features retaining most of the data entropy by means of compacting the information in very few features (often in just one or two). The proposed method produces features which have higher expressive power. In particular, it is based on the independent component analysis framework, and introduces an extra rotation to the eigen decomposition, which is optimized via gradient-ascent search. This maximum entropy preservation suggests that OKECA features are more efficient than KECA features for density estimation. In addition, a critical issue in both the methods is the selection of the kernel parameter, since it critically affects the resulting performance. Here, we analyze the most common kernel length-scale selection criteria. The results of both the methods are illustrated in different synthetic and real problems. Results show that OKECA returns projections with more expressive power than KECA, the most successful rule for estimating the kernel parameter is based on maximum likelihood, and OKECA is more robust to the selection of the length-scale parameter in kernel density estimation.
Nonparametric Bayes Classification and Hypothesis Testing on Manifolds
Bhattacharya, Abhishek; Dunson, David
2012-01-01
Our first focus is prediction of a categorical response variable using features that lie on a general manifold. For example, the manifold may correspond to the surface of a hypersphere. We propose a general kernel mixture model for the joint distribution of the response and predictors, with the kernel expressed in product form and dependence induced through the unknown mixing measure. We provide simple sufficient conditions for large support and weak and strong posterior consistency in estimating both the joint distribution of the response and predictors and the conditional distribution of the response. Focusing on a Dirichlet process prior for the mixing measure, these conditions hold using von Mises-Fisher kernels when the manifold is the unit hypersphere. In this case, Bayesian methods are developed for efficient posterior computation using slice sampling. Next we develop Bayesian nonparametric methods for testing whether there is a difference in distributions between groups of observations on the manifold having unknown densities. We prove consistency of the Bayes factor and develop efficient computational methods for its calculation. The proposed classification and testing methods are evaluated using simulation examples and applied to spherical data applications. PMID:22754028
Donald Gagliasso; Susan Hummel; Hailemariam. Temesgen
2014-01-01
Various methods have been used to estimate the amount of above ground forest biomass across landscapes and to create biomass maps for specific stands or pixels across ownership or project areas. Without an accurate estimation method, land managers might end up with incorrect biomass estimate maps, which could lead them to make poorer decisions in their future...
Directory of Open Access Journals (Sweden)
Delong Feng
2016-05-01
Full Text Available Remaining useful life estimation of the prognostics and health management technique is a complicated and difficult research question for maintenance. In this article, we consider the problem of prognostics modeling and estimation of the turbofan engine under complicated circumstances and propose a kernel principal component analysis–based degradation model and remaining useful life estimation method for such aircraft engine. We first analyze the output data created by the turbofan engine thermodynamic simulation that is based on the kernel principal component analysis method and then distinguish the qualitative and quantitative relationships between the key factors. Next, we build a degradation model for the engine fault based on the following assumptions: the engine has only had constant failure (i.e. no sudden failure is included, and the engine has a Wiener process, which is a covariate stand for the engine system drift. To predict the remaining useful life of the turbofan engine, we built a health index based on the degradation model and used the method of maximum likelihood and the data from the thermodynamic simulation model to estimate the parameters of this degradation model. Through the data analysis, we obtained a trend model of the regression curve line that fits with the actual statistical data. Based on the predicted health index model and the data trend model, we estimate the remaining useful life of the aircraft engine as the index reaches zero. At last, a case study involving engine simulation data demonstrates the precision and performance advantages of this prediction method that we propose. At last, a case study involving engine simulation data demonstrates the precision and performance advantages of this proposed method, the precision of the method can reach to 98.9% and the average precision is 95.8%.
de Uña-Álvarez, Jacobo; Meira-Machado, Luís
2015-06-01
Multi-state models are often used for modeling complex event history data. In these models the estimation of the transition probabilities is of particular interest, since they allow for long-term predictions of the process. These quantities have been traditionally estimated by the Aalen-Johansen estimator, which is consistent if the process is Markov. Several non-Markov estimators have been proposed in the recent literature, and their superiority with respect to the Aalen-Johansen estimator has been proved in situations in which the Markov condition is strongly violated. However, the existing estimators have the drawback of requiring that the support of the censoring distribution contains the support of the lifetime distribution, which is not often the case. In this article, we propose two new methods for estimating the transition probabilities in the progressive illness-death model. Some asymptotic results are derived. The proposed estimators are consistent regardless the Markov condition and the referred assumption about the censoring support. We explore the finite sample behavior of the estimators through simulations. The main conclusion of this piece of research is that the proposed estimators are much more efficient than the existing non-Markov estimators in most cases. An application to a clinical trial on colon cancer is included. Extensions to progressive processes beyond the three-state illness-death model are discussed. © 2015, The International Biometric Society.
Nonparametric Inference for Periodic Sequences
Sun, Ying
2012-02-01
This article proposes a nonparametric method for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV) and complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both theoretically and by simulation.We also propose a nonparametric test of the null hypothesis that the data have constantmean against the alternative that the sequence of means is periodic. Finally, our methodology is demonstrated on three well-known time series: the sunspots and lynx trapping data, and the El Niño series of sea surface temperatures. © 2012 American Statistical Association and the American Society for Quality.
Ingacheva, Anastasia; Chukalina, Marina; Khanipov, Timur; Nikolaev, Dmitry
2018-04-01
Motion blur caused by camera vibration is a common source of degradation in photographs. In this paper we study the problem of finding the point spread function (PSF) of a blurred image using the tomography technique. The PSF reconstruction result strongly depends on the particular tomography technique used. We present a tomography algorithm with regularization adapted specifically for this task. We use the algebraic reconstruction technique (ART algorithm) as the starting algorithm and introduce regularization. We use the conjugate gradient method for numerical implementation of the proposed approach. The algorithm is tested using a dataset which contains 9 kernels extracted from real photographs by the Adobe corporation where the point spread function is known. We also investigate influence of noise on the quality of image reconstruction and investigate how the number of projections influence the magnitude change of the reconstruction error.
Nonparametric statistical inference
Gibbons, Jean Dickinson
2010-01-01
Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference.-Eugenia Stoimenova, Journal of Applied Statistics, June 2012… one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students.-Biometrics, 67, September 2011This excellently presente
International Nuclear Information System (INIS)
Shin, Ho Cheol; Park, Moon Ghu; You, Skin
2006-01-01
Recently, many on-line approaches to instrument channel surveillance (drift monitoring and fault detection) have been reported worldwide. On-line monitoring (OLM) method evaluates instrument channel performance by assessing its consistency with other plant indications through parametric or non-parametric models. The heart of an OLM system is the model giving an estimate of the true process parameter value against individual measurements. This model gives process parameter estimate calculated as a function of other plant measurements which can be used to identify small sensor drifts that would require the sensor to be manually calibrated or replaced. This paper describes an improvement of auto associative kernel regression (AAKR) by introducing a correlation coefficient weighting on kernel distances. The prediction performance of the developed method is compared with conventional auto-associative kernel regression
Directory of Open Access Journals (Sweden)
J. Bohlin
2012-07-01
Full Text Available The recent development in software for automatic photogrammetric processing of multispectral aerial imagery, and the growing nation-wide availability of Digital Elevation Model (DEM data, are about to revolutionize data capture for forest management planning in Scandinavia. Using only already available aerial imagery and ALS-assessed DEM data, raster estimates of the forest variables mean tree height, basal area, total stem volume, and species-specific stem volumes were produced and evaluated. The study was conducted at a coniferous hemi-boreal test site in southern Sweden (lat. 58° N, long. 13° E. Digital aerial images from the Zeiss/Intergraph Digital Mapping Camera system were used to produce 3D point-cloud data with spectral information. Metrics were calculated for 696 field plots (10 m radius from point-cloud data and used in k-MSN to estimate forest variables. For these stands, the tree height ranged from 1.4 to 33.0 m (18.1 m mean, stem volume from 0 to 829 m3 ha-1 (249 m3 ha-1 mean and basal area from 0 to 62.2 m2 ha-1 (26.1 m2 ha-1 mean, with mean stand size of 2.8 ha. Estimates made using digital aerial images corresponding to the standard acquisition of the Swedish National Land Survey (Lantmäteriet showed RMSEs (in percent of the surveyed stand mean of 7.5% for tree height, 11.4% for basal area, 13.2% for total stem volume, 90.6% for pine stem volume, 26.4 for spruce stem volume, and 72.6% for deciduous stem volume. The results imply that photogrammetric matching of digital aerial images has significant potential for operational use in forestry.
Rafal Podlaski; Francis A. Roesch
2014-01-01
Two-component mixtures of either the Weibull distribution or the gamma distribution and the kernel density estimator were used for describing the diameter at breast height (dbh) empirical distributions of two-cohort stands. The data consisted of study plots from the Å wietokrzyski National Park (central Poland) and areas close to and including the North Carolina section...
Choi, Sae Il
2009-01-01
This study used simulation (a) to compare the kernel equating method to traditional equipercentile equating methods under the equivalent-groups (EG) design and the nonequivalent-groups with anchor test (NEAT) design and (b) to apply the parametric bootstrap method for estimating standard errors of equating. A two-parameter logistic item response…
Asymptotic normality of kernel estimator of $\\psi$-regression function for functional ergodic data
Laksaci ALI; Benziadi Fatima; Gheriballak Abdelkader
2016-01-01
In this paper we consider the problem of the estimation of the $\\psi$-regression function when the covariates take values in an infinite dimensional space. Our main aim is to establish, under a stationary ergodic process assumption, the asymptotic normality of this estimate.
Coburn, T.C.; Freeman, P.A.; Attanasi, E.D.
2012-01-01
The primary objectives of this research were to (1) investigate empirical methods for establishing regional trends in unconventional gas resources as exhibited by historical production data and (2) determine whether or not incorporating additional knowledge of a regional trend in a suite of previously established local nonparametric resource prediction algorithms influences assessment results. Three different trend detection methods were applied to publicly available production data (well EUR aggregated to 80-acre cells) from the Devonian Antrim Shale gas play in the Michigan Basin. This effort led to the identification of a southeast-northwest trend in cell EUR values across the play that, in a very general sense, conforms to the primary fracture and structural orientations of the province. However, including this trend in the resource prediction algorithms did not lead to improved results. Further analysis indicated the existence of clustering among cell EUR values that likely dampens the contribution of the regional trend. The reason for the clustering, a somewhat unexpected result, is not completely understood, although the geological literature provides some possible explanations. With appropriate data, a better understanding of this clustering phenomenon may lead to important information about the factors and their interactions that control Antrim Shale gas production, which may, in turn, help establish a more general protocol for better estimating resources in this and other shale gas plays. ?? 2011 International Association for Mathematical Geology (outside the USA).
Nonparametric functional mapping of quantitative trait loci.
Yang, Jie; Wu, Rongling; Casella, George
2009-03-01
Functional mapping is a useful tool for mapping quantitative trait loci (QTL) that control dynamic traits. It incorporates mathematical aspects of biological processes into the mixture model-based likelihood setting for QTL mapping, thus increasing the power of QTL detection and the precision of parameter estimation. However, in many situations there is no obvious functional form and, in such cases, this strategy will not be optimal. Here we propose to use nonparametric function estimation, typically implemented with B-splines, to estimate the underlying functional form of phenotypic trajectories, and then construct a nonparametric test to find evidence of existing QTL. Using the representation of a nonparametric regression as a mixed model, the final test statistic is a likelihood ratio test. We consider two types of genetic maps: dense maps and general maps, and the power of nonparametric functional mapping is investigated through simulation studies and demonstrated by examples.
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole; Hansen, Peter Reinhard; Lunde, Asger
We propose a multivariate realised kernel to estimate the ex-post covariation of log-prices. We show this new consistent estimator is guaranteed to be positive semi-definite and is robust to measurement noise of certain types and can also handle non-synchronous trading. It is the first estimator...
DEFF Research Database (Denmark)
Varneskov, Rasmus T.
2014-01-01
-top estimators are shown to be consistent, asymptotically unbiased, and mixed Gaussian at the optimal rate of convergence, n1/4. Exact bound on lower order terms are obtained using maximal inequalities and these are used to derive a conservative, MSE-optimal flat-top shrinkage. Additionally, bounds...
On Cooper's Nonparametric Test.
Schmeidler, James
1978-01-01
The basic assumption of Cooper's nonparametric test for trend (EJ 125 069) is questioned. It is contended that the proper assumption alters the distribution of the statistic and reduces its usefulness. (JKS)
Directory of Open Access Journals (Sweden)
Wenhao Yu
Full Text Available The urban facility, one of the most important service providers is usually represented by sets of points in GIS applications using POI (Point of Interest model associated with certain human social activities. The knowledge about distribution intensity and pattern of facility POIs is of great significance in spatial analysis, including urban planning, business location choosing and social recommendations. Kernel Density Estimation (KDE, an efficient spatial statistics tool for facilitating the processes above, plays an important role in spatial density evaluation, because KDE method considers the decay impact of services and allows the enrichment of the information from a very simple input scatter plot to a smooth output density surface. However, the traditional KDE is mainly based on the Euclidean distance, ignoring the fact that in urban street network the service function of POI is carried out over a network-constrained structure, rather than in a Euclidean continuous space. Aiming at this question, this study proposes a computational method of KDE on a network and adopts a new visualization method by using 3-D "wall" surface. Some real conditional factors are also taken into account in this study, such as traffic capacity, road direction and facility difference. In practical works the proposed method is implemented in real POI data in Shenzhen city, China to depict the distribution characteristic of services under impacts of multi-factors.
Galindo, I.; Romero, M. C.; Sánchez, N.; Morales, J. M.
2016-06-01
Risk management stakeholders in high-populated volcanic islands should be provided with the latest high-quality volcanic information. We present here the first volcanic susceptibility map of Lanzarote and Chinijo Islands and their submarine flanks based on updated chronostratigraphical and volcano structural data, as well as on the geomorphological analysis of the bathymetric data of the submarine flanks. The role of the structural elements in the volcanic susceptibility analysis has been reviewed: vents have been considered since they indicate where previous eruptions took place; eruptive fissures provide information about the stress field as they are the superficial expression of the dyke conduit; eroded dykes have been discarded since they are single non-feeder dykes intruded in deep parts of Miocene-Pliocene volcanic edifices; main faults have been taken into account only in those cases where they could modified the superficial movement of magma. The application of kernel density estimation via a linear diffusion process for the volcanic susceptibility assessment has been applied successfully to Lanzarote and could be applied to other fissure volcanic fields worldwide since the results provide information about the probable area where an eruption could take place but also about the main direction of the probable volcanic fissures.
Curceac, S.; Ternynck, C.; Ouarda, T.
2015-12-01
Over the past decades, a substantial amount of research has been conducted to model and forecast climatic variables. In this study, Nonparametric Functional Data Analysis (NPFDA) methods are applied to forecast air temperature and wind speed time series in Abu Dhabi, UAE. The dataset consists of hourly measurements recorded for a period of 29 years, 1982-2010. The novelty of the Functional Data Analysis approach is in expressing the data as curves. In the present work, the focus is on daily forecasting and the functional observations (curves) express the daily measurements of the above mentioned variables. We apply a non-linear regression model with a functional non-parametric kernel estimator. The computation of the estimator is performed using an asymmetrical quadratic kernel function for local weighting based on the bandwidth obtained by a cross validation procedure. The proximities between functional objects are calculated by families of semi-metrics based on derivatives and Functional Principal Component Analysis (FPCA). Additionally, functional conditional mode and functional conditional median estimators are applied and the advantages of combining their results are analysed. A different approach employs a SARIMA model selected according to the minimum Akaike (AIC) and Bayessian (BIC) Information Criteria and based on the residuals of the model. The performance of the models is assessed by calculating error indices such as the root mean square error (RMSE), relative RMSE, BIAS and relative BIAS. The results indicate that the NPFDA models provide more accurate forecasts than the SARIMA models. Key words: Nonparametric functional data analysis, SARIMA, time series forecast, air temperature, wind speed
Reproducing kernel Hilbert spaces of Gaussian priors
Vaart, van der A.W.; Zanten, van J.H.; Clarke, B.; Ghosal, S.
2008-01-01
We review definitions and properties of reproducing kernel Hilbert spaces attached to Gaussian variables and processes, with a view to applications in nonparametric Bayesian statistics using Gaussian priors. The rate of contraction of posterior distributions based on Gaussian priors can be described
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole Eiler; Hansen, P. Reinhard; Lunde, Asger
2009-01-01
and find a remarkable level of agreement. We identify some features of the high-frequency data, which are challenging for realized kernels. They are when there are local trends in the data, over periods of around 10 minutes, where the prices and quotes are driven up or down. These can be associated......Realized kernels use high-frequency data to estimate daily volatility of individual stock prices. They can be applied to either trade or quote data. Here we provide the details of how we suggest implementing them in practice. We compare the estimates based on trade and quote data for the same stock...
Chen, Y.; Ho, C.; Chang, L.
2011-12-01
In previous decades, the climate change caused by global warming increases the occurrence frequency of extreme hydrological events. Water supply shortages caused by extreme events create great challenges for water resource management. To evaluate future climate variations, general circulation models (GCMs) are the most wildly known tools which shows possible weather conditions under pre-defined CO2 emission scenarios announced by IPCC. Because the study area of GCMs is the entire earth, the grid sizes of GCMs are much larger than the basin scale. To overcome the gap, a statistic downscaling technique can transform the regional scale weather factors into basin scale precipitations. The statistic downscaling technique can be divided into three categories include transfer function, weather generator and weather type. The first two categories describe the relationships between the weather factors and precipitations respectively based on deterministic algorithms, such as linear or nonlinear regression and ANN, and stochastic approaches, such as Markov chain theory and statistical distributions. In the weather type, the method has ability to cluster weather factors, which are high dimensional and continuous variables, into weather types, which are limited number of discrete states. In this study, the proposed downscaling model integrates the weather type, using the K-means clustering algorithm, and the weather generator, using the kernel density estimation. The study area is Shihmen basin in northern of Taiwan. In this study, the research process contains two steps, a calibration step and a synthesis step. Three sub-steps were used in the calibration step. First, weather factors, such as pressures, humidities and wind speeds, obtained from NCEP and the precipitations observed from rainfall stations were collected for downscaling. Second, the K-means clustering grouped the weather factors into four weather types. Third, the Markov chain transition matrixes and the
Dickhaus, Thorsten
2018-01-01
This textbook provides a self-contained presentation of the main concepts and methods of nonparametric statistical testing, with a particular focus on the theoretical foundations of goodness-of-fit tests, rank tests, resampling tests, and projection tests. The substitution principle is employed as a unified approach to the nonparametric test problems discussed. In addition to mathematical theory, it also includes numerous examples and computer implementations. The book is intended for advanced undergraduate, graduate, and postdoc students as well as young researchers. Readers should be familiar with the basic concepts of mathematical statistics typically covered in introductory statistics courses.
Bayesian nonparametric data analysis
Müller, Peter; Jara, Alejandro; Hanson, Tim
2015-01-01
This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.
Schaeck, S.; Karspeck, T.; Ott, C.; Weckler, M.; Stoermer, A. O.
2011-03-01
In March 2007 the BMW Group has launched the micro-hybrid functions brake energy regeneration (BER) and automatic start and stop function (ASSF). Valve-regulated lead-acid (VRLA) batteries in absorbent glass mat (AGM) technology are applied in vehicles with micro-hybrid power system (MHPS). In both part I and part II of this publication vehicles with MHPS and AGM batteries are subject to a field operational test (FOT). Test vehicles with conventional power system (CPS) and flooded batteries were used as a reference. In the FOT sample batteries were mounted several times and electrically tested in the laboratory intermediately. Vehicle- and battery-related diagnosis data were read out for each test run and were matched with laboratory data in a data base. The FOT data were analyzed by the use of two-dimensional, nonparametric kernel estimation for clear data presentation. The data show that capacity loss in the MHPS is comparable to the CPS. However, the influence of mileage performance, which cannot be separated, suggests that battery stress is enhanced in the MHPS although a battery refresh function is applied. Anyway, the FOT demonstrates the unsuitability of flooded batteries for the MHPS because of high early capacity loss due to acid stratification and because of vanishing cranking performance due to increasing internal resistance. Furthermore, the lack of dynamic charge acceptance for high energy regeneration efficiency is illustrated. Under the presented FOT conditions charge acceptance of lead-acid (LA) batteries decreases to less than one third for about half of the sample batteries compared to new battery condition. In part II of this publication FOT data are presented by multiple regression analysis (Schaeck et al., submitted for publication [1]).
Bayesian nonparametric hierarchical modeling.
Dunson, David B
2009-04-01
In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.
Bayesian Kernel Mixtures for Counts.
Canale, Antonio; Dunson, David B
2011-12-01
Although Bayesian nonparametric mixture models for continuous data are well developed, there is a limited literature on related approaches for count data. A common strategy is to use a mixture of Poissons, which unfortunately is quite restrictive in not accounting for distributions having variance less than the mean. Other approaches include mixing multinomials, which requires finite support, and using a Dirichlet process prior with a Poisson base measure, which does not allow smooth deviations from the Poisson. As a broad class of alternative models, we propose to use nonparametric mixtures of rounded continuous kernels. An efficient Gibbs sampler is developed for posterior computation, and a simulation study is performed to assess performance. Focusing on the rounded Gaussian case, we generalize the modeling framework to account for multivariate count data, joint modeling with continuous and categorical variables, and other complications. The methods are illustrated through applications to a developmental toxicity study and marketing data. This article has supplementary material online.
Quantal Response: Nonparametric Modeling
2017-01-01
capture the behavior of observed phenomena. Higher-order polynomial and finite-dimensional spline basis models allow for more complicated responses as the...flexibility as these are nonparametric (not constrained to any particular functional form). These should be useful in identifying nonstandard behavior via... deviance ∆ = −2 log(Lreduced/Lfull) is defined in terms of the likelihood function L. For normal error, Lfull = 1, and based on Eq. A-2, we have log
Nonparametric Bayesian inference for multidimensional compound Poisson processes
Gugushvili, S.; van der Meulen, F.; Spreij, P.
2015-01-01
Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context,
A Structural Labor Supply Model with Nonparametric Preferences
van Soest, A.H.O.; Das, J.W.M.; Gong, X.
2000-01-01
Nonparametric techniques are usually seen as a statistic device for data description and exploration, and not as a tool for estimating models with a richer economic structure, which are often required for policy analysis.This paper presents an example where nonparametric flexibility can be attained
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole Eiler; Hansen, Peter Reinhard; Lunde, Asger
2011-01-01
We propose a multivariate realised kernel to estimate the ex-post covariation of log-prices. We show this new consistent estimator is guaranteed to be positive semi-definite and is robust to measurement error of certain types and can also handle non-synchronous trading. It is the first estimator...... which has these three properties which are all essential for empirical work in this area. We derive the large sample asymptotics of this estimator and assess its accuracy using a Monte Carlo study. We implement the estimator on some US equity data, comparing our results to previous work which has used...
Higher-Order Hybrid Gaussian Kernel in Meshsize Boosting Algorithm
African Journals Online (AJOL)
In this paper, we shall use higher-order hybrid Gaussian kernel in a meshsize boosting algorithm in kernel density estimation. Bias reduction is guaranteed in this scheme like other existing schemes but uses the higher-order hybrid Gaussian kernel instead of the regular fixed kernels. A numerical verification of this scheme ...
Adaptive Kernel in Meshsize Boosting Algorithm in KDE ...
African Journals Online (AJOL)
This paper proposes the use of adaptive kernel in a meshsize boosting algorithm in kernel density estimation. The algorithm is a bias reduction scheme like other existing schemes but uses adaptive kernel instead of the regular fixed kernels. An empirical study for this scheme is conducted and the findings are comparatively ...
Adaptive Kernel In The Bootstrap Boosting Algorithm In KDE ...
African Journals Online (AJOL)
This paper proposes the use of adaptive kernel in a bootstrap boosting algorithm in kernel density estimation. The algorithm is a bias reduction scheme like other existing schemes but uses adaptive kernel instead of the regular fixed kernels. An empirical study for this scheme is conducted and the findings are comparatively ...
Directory of Open Access Journals (Sweden)
Zhi-Sai Ma
2017-01-01
Full Text Available Modal parameter estimation plays an important role in vibration-based damage detection and is worth more attention and investigation, as changes in modal parameters are usually being used as damage indicators. This paper focuses on the problem of output-only modal parameter recursive estimation of time-varying structures based upon parameterized representations of the time-dependent autoregressive moving average (TARMA. A kernel ridge regression functional series TARMA (FS-TARMA recursive identification scheme is proposed and subsequently employed for the modal parameter estimation of a numerical three-degree-of-freedom time-varying structural system and a laboratory time-varying structure consisting of a simply supported beam and a moving mass sliding on it. The proposed method is comparatively assessed against an existing recursive pseudolinear regression FS-TARMA approach via Monte Carlo experiments and shown to be capable of accurately tracking the time-varying dynamics in a recursive manner.
On Improving Density Estimators which are not Bona Fide Functions
Gajek, Leslaw
1986-01-01
In order to improve the rate of decrease of the IMSE for nonparametric kernel density estimators with nonrandom bandwidth beyond $O(n^{-4/5})$ all current methods must relax the constraint that the density estimate be a bona fide function, that is, be nonnegative and integrate to one. In this paper we show how to achieve similar improvement without relaxing any of these constraints. The method can also be applied for orthogonal series, adaptive orthogonal series, spline, jackknife, and other ...
International Nuclear Information System (INIS)
Boehlke, S.; Niegoth, H.
2012-01-01
In the nuclear power plant Leibstadt (KKL) during the next year large components will be dismantled and stored for final disposal within the interim storage facility ZENT at the NPP site. Before construction of ZENT appropriate estimations of the local dose rate inside and outside the building and the collective dose for the normal operation have to be performed. The shielding calculations are based on the properties of the stored components and radiation sources and on the concepts for working place requirements. The installation of control and monitoring areas will depend on these calculations. For the determination of the shielding potential of concrete walls and steel doors with the defined boundary conditions point-kernel codes like MICROSHIELd registered are used. Complex problems cannot be modeled with this code. Therefore the point-kernel code VISIPLAN registered was developed for the determination of the local dose distribution functions in 3D models. The possibility of motion sequence inputs allows an optimization of collective dose estimations for the operational phases of a nuclear facility.
Nonparametric Forecasting for Biochar Utilization in Poyang Lake Eco-Economic Zone in China
Directory of Open Access Journals (Sweden)
Meng-Shiuh Chang
2014-01-01
Full Text Available Agriculture is the least profitable industry in China. However, even with large financial subsidies from the government, farmers’ living standards have had no significant impact so far due to the historical, geographical, climatic factors. The study examines and quantifies the net economic and environmental benefits by utilizing biochar as a soil amendment in eleven counties in the Poyang Lake Eco-Economic Zone. A nonparametric kernel regression model is employed to estimate the relation between the scaled environmental and economic factors, which are determined as regression variables. In addition, the partial linear and single index regression models are used for comparison. In terms of evaluations of mean squared errors, the kernel estimator, exceeding the other estimators, is employed to forecast benefits of using biochar under various scenarios. The results indicate that biochar utilization can potentially increase farmers’ income if rice is planted and the net economic benefits can be achieved up to ¥114,900. The net economic benefits are higher when the pyrolysis plant is built in the south of Poyang Lake Eco-Economic Zone than when it is built in the north as the southern land is relatively barren, and biochar can save more costs on irrigation and fertilizer use.
Partial Deconvolution with Inaccurate Blur Kernel.
Ren, Dongwei; Zuo, Wangmeng; Zhang, David; Xu, Jun; Zhang, Lei
2017-10-17
Most non-blind deconvolution methods are developed under the error-free kernel assumption, and are not robust to inaccurate blur kernel. Unfortunately, despite the great progress in blind deconvolution, estimation error remains inevitable during blur kernel estimation. Consequently, severe artifacts such as ringing effects and distortions are likely to be introduced in the non-blind deconvolution stage. In this paper, we tackle this issue by suggesting: (i) a partial map in the Fourier domain for modeling kernel estimation error, and (ii) a partial deconvolution model for robust deblurring with inaccurate blur kernel. The partial map is constructed by detecting the reliable Fourier entries of estimated blur kernel. And partial deconvolution is applied to wavelet-based and learning-based models to suppress the adverse effect of kernel estimation error. Furthermore, an E-M algorithm is developed for estimating the partial map and recovering the latent sharp image alternatively. Experimental results show that our partial deconvolution model is effective in relieving artifacts caused by inaccurate blur kernel, and can achieve favorable deblurring quality on synthetic and real blurry images.Most non-blind deconvolution methods are developed under the error-free kernel assumption, and are not robust to inaccurate blur kernel. Unfortunately, despite the great progress in blind deconvolution, estimation error remains inevitable during blur kernel estimation. Consequently, severe artifacts such as ringing effects and distortions are likely to be introduced in the non-blind deconvolution stage. In this paper, we tackle this issue by suggesting: (i) a partial map in the Fourier domain for modeling kernel estimation error, and (ii) a partial deconvolution model for robust deblurring with inaccurate blur kernel. The partial map is constructed by detecting the reliable Fourier entries of estimated blur kernel. And partial deconvolution is applied to wavelet-based and learning
Smoothed Conditional Scale Function Estimation in AR(1-ARCH(1 Processes
Directory of Open Access Journals (Sweden)
Lema Logamou Seknewna
2018-01-01
Full Text Available The estimation of the Smoothed Conditional Scale Function for time series was taken out under the conditional heteroscedastic innovations by imitating the kernel smoothing in nonparametric QAR-QARCH scheme. The estimation was taken out based on the quantile regression methodology proposed by Koenker and Bassett. And the proof of the asymptotic properties of the Conditional Scale Function estimator for this type of process was given and its consistency was shown.
Parametric and Non-Parametric System Modelling
DEFF Research Database (Denmark)
Nielsen, Henrik Aalborg
1999-01-01
the focus is on combinations of parametric and non-parametric methods of regression. This combination can be in terms of additive models where e.g. one or more non-parametric term is added to a linear regression model. It can also be in terms of conditional parametric models where the coefficients...... considered. It is shown that adaptive estimation in conditional parametric models can be performed by combining the well known methods of local polynomial regression and recursive least squares with exponential forgetting. The approach used for estimation in conditional parametric models also highlights how...... networks is included. In this paper, neural networks are used for predicting the electricity production of a wind farm. The results are compared with results obtained using an adaptively estimated ARX-model. Finally, two papers on stochastic differential equations are included. In the first paper, among...
Nonparametric combinatorial sequence models.
Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa
2011-11-01
This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.
2nd Conference of the International Society for Nonparametric Statistics
Manteiga, Wenceslao; Romo, Juan
2016-01-01
This volume collects selected, peer-reviewed contributions from the 2nd Conference of the International Society for Nonparametric Statistics (ISNPS), held in Cádiz (Spain) between June 11–16 2014, and sponsored by the American Statistical Association, the Institute of Mathematical Statistics, the Bernoulli Society for Mathematical Statistics and Probability, the Journal of Nonparametric Statistics and Universidad Carlos III de Madrid. The 15 articles are a representative sample of the 336 contributed papers presented at the conference. They cover topics such as high-dimensional data modelling, inference for stochastic processes and for dependent data, nonparametric and goodness-of-fit testing, nonparametric curve estimation, object-oriented data analysis, and semiparametric inference. The aim of the ISNPS 2014 conference was to bring together recent advances and trends in several areas of nonparametric statistics in order to facilitate the exchange of research ideas, promote collaboration among researchers...
Nonparametric tests for censored data
Bagdonavicus, Vilijandas; Nikulin, Mikhail
2013-01-01
This book concerns testing hypotheses in non-parametric models. Generalizations of many non-parametric tests to the case of censored and truncated data are considered. Most of the test results are proved and real applications are illustrated using examples. Theories and exercises are provided. The incorrect use of many tests applying most statistical software is highlighted and discussed.
Granato, Gregory E.
2006-01-01
The Kendall-Theil Robust Line software (KTRLine-version 1.0) is a Visual Basic program that may be used with the Microsoft Windows operating system to calculate parameters for robust, nonparametric estimates of linear-regression coefficients between two continuous variables. The KTRLine software was developed by the U.S. Geological Survey, in cooperation with the Federal Highway Administration, for use in stochastic data modeling with local, regional, and national hydrologic data sets to develop planning-level estimates of potential effects of highway runoff on the quality of receiving waters. The Kendall-Theil robust line was selected because this robust nonparametric method is resistant to the effects of outliers and nonnormality in residuals that commonly characterize hydrologic data sets. The slope of the line is calculated as the median of all possible pairwise slopes between points. The intercept is calculated so that the line will run through the median of input data. A single-line model or a multisegment model may be specified. The program was developed to provide regression equations with an error component for stochastic data generation because nonparametric multisegment regression tools are not available with the software that is commonly used to develop regression models. The Kendall-Theil robust line is a median line and, therefore, may underestimate total mass, volume, or loads unless the error component or a bias correction factor is incorporated into the estimate. Regression statistics such as the median error, the median absolute deviation, the prediction error sum of squares, the root mean square error, the confidence interval for the slope, and the bias correction factor for median estimates are calculated by use of nonparametric methods. These statistics, however, may be used to formulate estimates of mass, volume, or total loads. The program is used to read a two- or three-column tab-delimited input file with variable names in the first row and
A note on Nonparametric Confidence Interval for a Shift Parameter ...
African Journals Online (AJOL)
The method is illustrated using the Cauchy distribution as a location model. The kernel-based method is found to have a shorter interval for the shift parameter between two Cauchy distributions than the one based on the Mann-Whitney test statistic. Keywords: Best Asymptotic Normal; Cauchy distribution; Kernel estimates; ...
McCullagh, Laura; Schmitz, Susanne; Barry, Michael; Walsh, Cathal
2017-11-01
In Ireland, all new drugs for which reimbursement by the healthcare payer is sought undergo a health technology assessment by the National Centre for Pharmacoeconomics. The National Centre for Pharmacoeconomics estimate expected value of perfect information but not partial expected value of perfect information (owing to computational expense associated with typical methodologies). The objective of this study was to examine the feasibility and utility of estimating partial expected value of perfect information via a computationally efficient, non-parametric regression approach. This was a retrospective analysis of evaluations on drugs for cancer that had been submitted to the National Centre for Pharmacoeconomics (January 2010 to December 2014 inclusive). Drugs were excluded if cost effective at the submitted price. Drugs were excluded if concerns existed regarding the validity of the applicants' submission or if cost-effectiveness model functionality did not allow required modifications to be made. For each included drug (n = 14), value of information was estimated at the final reimbursement price, at a threshold equivalent to the incremental cost-effectiveness ratio at that price. The expected value of perfect information was estimated from probabilistic analysis. Partial expected value of perfect information was estimated via a non-parametric approach. Input parameters with a population value at least €1 million were identified as potential targets for research. All partial estimates were determined within minutes. Thirty parameters (across nine models) each had a value of at least €1 million. These were categorised. Collectively, survival analysis parameters were valued at €19.32 million, health state utility parameters at €15.81 million and parameters associated with the cost of treating adverse effects at €6.64 million. Those associated with drug acquisition costs and with the cost of care were valued at €6.51 million and €5.71
Efficient nonparametric n -body force fields from machine learning
Glielmo, Aldo; Zeni, Claudio; De Vita, Alessandro
2018-05-01
We provide a definition and explicit expressions for n -body Gaussian process (GP) kernels, which can learn any interatomic interaction occurring in a physical system, up to n -body contributions, for any value of n . The series is complete, as it can be shown that the "universal approximator" squared exponential kernel can be written as a sum of n -body kernels. These recipes enable the choice of optimally efficient force models for each target system, as confirmed by extensive testing on various materials. We furthermore describe how the n -body kernels can be "mapped" on equivalent representations that provide database-size-independent predictions and are thus crucially more efficient. We explicitly carry out this mapping procedure for the first nontrivial (three-body) kernel of the series, and we show that this reproduces the GP-predicted forces with meV /Å accuracy while being orders of magnitude faster. These results pave the way to using novel force models (here named "M-FFs") that are computationally as fast as their corresponding standard parametrized n -body force fields, while retaining the nonparametric character, the ease of training and validation, and the accuracy of the best recently proposed machine-learning potentials.
Kang, Youngok; Cho, Nahye; Son, Serin
2018-01-01
The purpose of this study is to analyze how the spatiotemporal characteristics of traffic accidents involving the elderly population in Seoul are changing by time period. We applied kernel density estimation and hotspot analyses to analyze the spatial characteristics of elderly people's traffic accidents, and the space-time cube, emerging hotspot, and space-time kernel density estimation analyses to analyze the spatiotemporal characteristics. In addition, we analyzed elderly people's traffic accidents by dividing cases into those in which the drivers were elderly people and those in which elderly people were victims of traffic accidents, and used the traffic accidents data in Seoul for 2013 for analysis. The main findings were as follows: (1) the hotspots for elderly people's traffic accidents differed according to whether they were drivers or victims. (2) The hourly analysis showed that the hotspots for elderly drivers' traffic accidents are in specific areas north of the Han River during the period from morning to afternoon, whereas the hotspots for elderly victims are distributed over a wide area from daytime to evening. (3) Monthly analysis showed that the hotspots are weak during winter and summer, whereas they are strong in the hiking and climbing areas in Seoul during spring and fall. Further, elderly victims' hotspots are more sporadic than elderly drivers' hotspots. (4) The analysis for the entire period of 2013 indicates that traffic accidents involving elderly people are increasing in specific areas on the north side of the Han River. We expect the results of this study to aid in reducing the number of traffic accidents involving elderly people in the future.
Cho, Nahye; Son, Serin
2018-01-01
The purpose of this study is to analyze how the spatiotemporal characteristics of traffic accidents involving the elderly population in Seoul are changing by time period. We applied kernel density estimation and hotspot analyses to analyze the spatial characteristics of elderly people’s traffic accidents, and the space-time cube, emerging hotspot, and space-time kernel density estimation analyses to analyze the spatiotemporal characteristics. In addition, we analyzed elderly people’s traffic accidents by dividing cases into those in which the drivers were elderly people and those in which elderly people were victims of traffic accidents, and used the traffic accidents data in Seoul for 2013 for analysis. The main findings were as follows: (1) the hotspots for elderly people’s traffic accidents differed according to whether they were drivers or victims. (2) The hourly analysis showed that the hotspots for elderly drivers’ traffic accidents are in specific areas north of the Han River during the period from morning to afternoon, whereas the hotspots for elderly victims are distributed over a wide area from daytime to evening. (3) Monthly analysis showed that the hotspots are weak during winter and summer, whereas they are strong in the hiking and climbing areas in Seoul during spring and fall. Further, elderly victims’ hotspots are more sporadic than elderly drivers’ hotspots. (4) The analysis for the entire period of 2013 indicates that traffic accidents involving elderly people are increasing in specific areas on the north side of the Han River. We expect the results of this study to aid in reducing the number of traffic accidents involving elderly people in the future. PMID:29768453
Bayesian Nonparametric Longitudinal Data Analysis.
Quintana, Fernando A; Johnson, Wesley O; Waetjen, Elaine; Gold, Ellen
2016-01-01
Practical Bayesian nonparametric methods have been developed across a wide variety of contexts. Here, we develop a novel statistical model that generalizes standard mixed models for longitudinal data that include flexible mean functions as well as combined compound symmetry (CS) and autoregressive (AR) covariance structures. AR structure is often specified through the use of a Gaussian process (GP) with covariance functions that allow longitudinal data to be more correlated if they are observed closer in time than if they are observed farther apart. We allow for AR structure by considering a broader class of models that incorporates a Dirichlet Process Mixture (DPM) over the covariance parameters of the GP. We are able to take advantage of modern Bayesian statistical methods in making full predictive inferences and about characteristics of longitudinal profiles and their differences across covariate combinations. We also take advantage of the generality of our model, which provides for estimation of a variety of covariance structures. We observe that models that fail to incorporate CS or AR structure can result in very poor estimation of a covariance or correlation matrix. In our illustration using hormone data observed on women through the menopausal transition, biology dictates the use of a generalized family of sigmoid functions as a model for time trends across subpopulation categories.
Directory of Open Access Journals (Sweden)
Yang Zhang
2015-11-01
Full Text Available Prognostics is necessary to ensure the reliability and safety of lithium-ion batteries for hybrid electric vehicles or satellites. This process can be achieved by capacity estimation, which is a direct fading indicator for assessing the state of health of a battery. However, the capacity of a lithium-ion battery onboard is difficult to monitor. This paper presents a data-driven approach for online capacity estimation. First, six novel features are extracted from cyclic charge/discharge cycles and used as indirect health indicators. An adaptive multi-kernel relevance machine (MKRVM based on accelerated particle swarm optimization algorithm is used to determine the optimal parameters of MKRVM and characterize the relationship between extracted features and battery capacity. The overall estimation process comprises offline and online stages. A supervised learning step in the offline stage is established for model verification to ensure the generalizability of MKRVM for online application. Cross-validation is further conducted to validate the performance of the proposed model. Experiment and comparison results show the effectiveness, accuracy, efficiency, and robustness of the proposed approach for online capacity estimation of lithium-ion batteries.
Chen, Tai-Been; Chen, Jyh-Cheng; Lu, Henry Horng-Shing
2012-01-01
Segmentation of positron emission tomography (PET) is typically achieved using the K-Means method or other approaches. In preclinical and clinical applications, the K-Means method needs a prior estimation of parameters such as the number of clusters and appropriate initialized values. This work segments microPET images using a hybrid method combining the Gaussian mixture model (GMM) with kernel density estimation. Segmentation is crucial to registration of disordered 2-deoxy-2-fluoro-D-glucose (FDG) accumulation locations with functional diagnosis and to estimate standardized uptake values (SUVs) of region of interests (ROIs) in PET images. Therefore, simulation studies are conducted to apply spherical targets to evaluate segmentation accuracy based on Tanimoto's definition of similarity. The proposed method generates a higher degree of similarity than the K-Means method. The PET images of a rat brain are used to compare the segmented shape and area of the cerebral cortex by the K-Means method and the proposed method by volume rendering. The proposed method provides clearer and more detailed activity structures of an FDG accumulation location in the cerebral cortex than those by the K-Means method.
Nonparametric identification of copula structures
Li, Bo; Genton, Marc G.
2013-01-01
We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric
Bivariate discrete beta Kernel graduation of mortality data.
Mazza, Angelo; Punzo, Antonio
2015-07-01
Various parametric/nonparametric techniques have been proposed in literature to graduate mortality data as a function of age. Nonparametric approaches, as for example kernel smoothing regression, are often preferred because they do not assume any particular mortality law. Among the existing kernel smoothing approaches, the recently proposed (univariate) discrete beta kernel smoother has been shown to provide some benefits. Bivariate graduation, over age and calendar years or durations, is common practice in demography and actuarial sciences. In this paper, we generalize the discrete beta kernel smoother to the bivariate case, and we introduce an adaptive bandwidth variant that may provide additional benefits when data on exposures to the risk of death are available; furthermore, we outline a cross-validation procedure for bandwidths selection. Using simulations studies, we compare the bivariate approach proposed here with its corresponding univariate formulation and with two popular nonparametric bivariate graduation techniques, based on Epanechnikov kernels and on P-splines. To make simulations realistic, a bivariate dataset, based on probabilities of dying recorded for the US males, is used. Simulations have confirmed the gain in performance of the new bivariate approach with respect to both the univariate and the bivariate competitors.
Speaker Linking and Applications using Non-Parametric Hashing Methods
2016-09-08
nonparametric estimate of a multivariate density function,” The Annals of Math- ematical Statistics , vol. 36, no. 3, pp. 1049–1051, 1965. [9] E. A. Patrick...Speaker Linking and Applications using Non-Parametric Hashing Methods† Douglas Sturim and William M. Campbell MIT Lincoln Laboratory, Lexington, MA...with many approaches [1, 2]. For this paper, we focus on using i-vectors [2], but the methods apply to any embedding. For the task of speaker QBE and
CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions
Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.
Rock, N. M. S.
ROBUST calculates 53 statistics, plus significance levels for 6 hypothesis tests, on each of up to 52 variables. These together allow the following properties of the data distribution for each variable to be examined in detail: (1) Location. Three means (arithmetic, geometric, harmonic) are calculated, together with the midrange and 19 high-performance robust L-, M-, and W-estimates of location (combined, adaptive, trimmed estimates, etc.) (2) Scale. The standard deviation is calculated along with the H-spread/2 (≈ semi-interquartile range), the mean and median absolute deviations from both mean and median, and a biweight scale estimator. The 23 location and 6 scale estimators programmed cover all possible degrees of robustness. (3) Normality: Distributions are tested against the null hypothesis that they are normal, using the 3rd (√ h1) and 4th ( b 2) moments, Geary's ratio (mean deviation/standard deviation), Filliben's probability plot correlation coefficient, and a more robust test based on the biweight scale estimator. These statistics collectively are sensitive to most usual departures from normality. (4) Presence of outliers. The maximum and minimum values are assessed individually or jointly using Grubbs' maximum Studentized residuals, Harvey's and Dixon's criteria, and the Studentized range. For a single input variable, outliers can be either winsorized or eliminated and all estimates recalculated iteratively as desired. The following data-transformations also can be applied: linear, log 10, generalized Box Cox power (including log, reciprocal, and square root), exponentiation, and standardization. For more than one variable, all results are tabulated in a single run of ROBUST. Further options are incorporated to assess ratios (of two variables) as well as discrete variables, and be concerned with missing data. Cumulative S-plots (for assessing normality graphically) also can be generated. The mutual consistency or inconsistency of all these measures
Energy Technology Data Exchange (ETDEWEB)
Fan, J; Fan, J; Hu, W; Wang, J [Fudan University Shanghai Cancer Center, Shanghai, Shanghai (China)
2016-06-15
Purpose: To develop a fast automatic algorithm based on the two dimensional kernel density estimation (2D KDE) to predict the dose-volume histogram (DVH) which can be employed for the investigation of radiotherapy quality assurance and automatic treatment planning. Methods: We propose a machine learning method that uses previous treatment plans to predict the DVH. The key to the approach is the framing of DVH in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of the dose and the predictive features. The joint distribution provides an estimation of the conditional probability of the dose given the values of the predictive features. For the new patient, the prediction consists of estimating the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimation of the DVH. The 2D KDE is implemented to predict the joint probability distribution of the training set and the distribution of the predictive features for the new patient. Two variables, including the signed minimal distance from each OAR (organs at risk) voxel to the target boundary and its opening angle with respect to the origin of voxel coordinate, are considered as the predictive features to represent the OAR-target spatial relationship. The feasibility of our method has been demonstrated with the rectum, breast and head-and-neck cancer cases by comparing the predicted DVHs with the planned ones. Results: The consistent result has been found between these two DVHs for each cancer and the average of relative point-wise differences is about 5% within the clinical acceptable extent. Conclusion: According to the result of this study, our method can be used to predict the clinical acceptable DVH and has ability to evaluate the quality and consistency of the treatment planning.
International Nuclear Information System (INIS)
Fan, J; Fan, J; Hu, W; Wang, J
2016-01-01
Purpose: To develop a fast automatic algorithm based on the two dimensional kernel density estimation (2D KDE) to predict the dose-volume histogram (DVH) which can be employed for the investigation of radiotherapy quality assurance and automatic treatment planning. Methods: We propose a machine learning method that uses previous treatment plans to predict the DVH. The key to the approach is the framing of DVH in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of the dose and the predictive features. The joint distribution provides an estimation of the conditional probability of the dose given the values of the predictive features. For the new patient, the prediction consists of estimating the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimation of the DVH. The 2D KDE is implemented to predict the joint probability distribution of the training set and the distribution of the predictive features for the new patient. Two variables, including the signed minimal distance from each OAR (organs at risk) voxel to the target boundary and its opening angle with respect to the origin of voxel coordinate, are considered as the predictive features to represent the OAR-target spatial relationship. The feasibility of our method has been demonstrated with the rectum, breast and head-and-neck cancer cases by comparing the predicted DVHs with the planned ones. Results: The consistent result has been found between these two DVHs for each cancer and the average of relative point-wise differences is about 5% within the clinical acceptable extent. Conclusion: According to the result of this study, our method can be used to predict the clinical acceptable DVH and has ability to evaluate the quality and consistency of the treatment planning.
Analog forecasting with dynamics-adapted kernels
Zhao, Zhizhen; Giannakis, Dimitrios
2016-09-01
Analog forecasting is a nonparametric technique introduced by Lorenz in 1969 which predicts the evolution of states of a dynamical system (or observables defined on the states) by following the evolution of the sample in a historical record of observations which most closely resembles the current initial data. Here, we introduce a suite of forecasting methods which improve traditional analog forecasting by combining ideas from kernel methods developed in harmonic analysis and machine learning and state-space reconstruction for dynamical systems. A key ingredient of our approach is to replace single-analog forecasting with weighted ensembles of analogs constructed using local similarity kernels. The kernels used here employ a number of dynamics-dependent features designed to improve forecast skill, including Takens’ delay-coordinate maps (to recover information in the initial data lost through partial observations) and a directional dependence on the dynamical vector field generating the data. Mathematically, our approach is closely related to kernel methods for out-of-sample extension of functions, and we discuss alternative strategies based on the Nyström method and the multiscale Laplacian pyramids technique. We illustrate these techniques in applications to forecasting in a low-order deterministic model for atmospheric dynamics with chaotic metastability, and interannual-scale forecasting in the North Pacific sector of a comprehensive climate model. We find that forecasts based on kernel-weighted ensembles have significantly higher skill than the conventional approach following a single analog.
Kernel regression with functional response
Ferraty, Frédéric; Laksaci, Ali; Tadj, Amel; Vieu, Philippe
2011-01-01
We consider kernel regression estimate when both the response variable and the explanatory one are functional. The rates of uniform almost complete convergence are stated as function of the small ball probability of the predictor and as function of the entropy of the set on which uniformity is obtained.
Predictive Model Equations for Palm Kernel (Elaeis guneensis J ...
African Journals Online (AJOL)
Estimated error of ± 0.18 and ± 0.2 are envisaged while applying the models for predicting palm kernel and sesame oil colours respectively. Keywords: Palm kernel, Sesame, Palm kernel, Oil Colour, Process Parameters, Model. Journal of Applied Science, Engineering and Technology Vol. 6 (1) 2006 pp. 34-38 ...
Decision support using nonparametric statistics
Beatty, Warren
2018-01-01
This concise volume covers nonparametric statistics topics that most are most likely to be seen and used from a practical decision support perspective. While many degree programs require a course in parametric statistics, these methods are often inadequate for real-world decision making in business environments. Much of the data collected today by business executives (for example, customer satisfaction opinions) requires nonparametric statistics for valid analysis, and this book provides the reader with a set of tools that can be used to validly analyze all data, regardless of type. Through numerous examples and exercises, this book explains why nonparametric statistics will lead to better decisions and how they are used to reach a decision, with a wide array of business applications. Online resources include exercise data, spreadsheets, and solutions.
Ngan, Henry Y. T.; Yung, Nelson H. C.; Yeh, Anthony G. O.
2015-02-01
This paper aims at presenting a comparative study of outlier detection (OD) for large-scale traffic data. The traffic data nowadays are massive in scale and collected in every second throughout any modern city. In this research, the traffic flow dynamic is collected from one of the busiest 4-armed junction in Hong Kong in a 31-day sampling period (with 764,027 vehicles in total). The traffic flow dynamic is expressed in a high dimension spatial-temporal (ST) signal format (i.e. 80 cycles) which has a high degree of similarities among the same signal and across different signals in one direction. A total of 19 traffic directions are identified in this junction and lots of ST signals are collected in the 31-day period (i.e. 874 signals). In order to reduce its dimension, the ST signals are firstly undergone a principal component analysis (PCA) to represent as (x,y)-coordinates. Then, these PCA (x,y)-coordinates are assumed to be conformed as Gaussian distributed. With this assumption, the data points are further to be evaluated by (a) a correlation study with three variant coefficients, (b) one-class support vector machine (SVM) and (c) kernel density estimation (KDE). The correlation study could not give any explicit OD result while the one-class SVM and KDE provide average 59.61% and 95.20% DSRs, respectively.
RTOS kernel in portable electrocardiograph
Centeno, C. A.; Voos, J. A.; Riva, G. G.; Zerbini, C.; Gonzalez, E. A.
2011-12-01
This paper presents the use of a Real Time Operating System (RTOS) on a portable electrocardiograph based on a microcontroller platform. All medical device digital functions are performed by the microcontroller. The electrocardiograph CPU is based on the 18F4550 microcontroller, in which an uCOS-II RTOS can be embedded. The decision associated with the kernel use is based on its benefits, the license for educational use and its intrinsic time control and peripherals management. The feasibility of its use on the electrocardiograph is evaluated based on the minimum memory requirements due to the kernel structure. The kernel's own tools were used for time estimation and evaluation of resources used by each process. After this feasibility analysis, the migration from cyclic code to a structure based on separate processes or tasks able to synchronize events is used; resulting in an electrocardiograph running on one Central Processing Unit (CPU) based on RTOS.
RTOS kernel in portable electrocardiograph
International Nuclear Information System (INIS)
Centeno, C A; Voos, J A; Riva, G G; Zerbini, C; Gonzalez, E A
2011-01-01
This paper presents the use of a Real Time Operating System (RTOS) on a portable electrocardiograph based on a microcontroller platform. All medical device digital functions are performed by the microcontroller. The electrocardiograph CPU is based on the 18F4550 microcontroller, in which an uCOS-II RTOS can be embedded. The decision associated with the kernel use is based on its benefits, the license for educational use and its intrinsic time control and peripherals management. The feasibility of its use on the electrocardiograph is evaluated based on the minimum memory requirements due to the kernel structure. The kernel's own tools were used for time estimation and evaluation of resources used by each process. After this feasibility analysis, the migration from cyclic code to a structure based on separate processes or tasks able to synchronize events is used; resulting in an electrocardiograph running on one Central Processing Unit (CPU) based on RTOS.
Zheng, Yinggan; Gierl, Mark J.; Cui, Ying
2010-01-01
This study combined the kernel smoothing procedure and a nonparametric differential item functioning statistic--Cochran's Z--to statistically test the difference between the kernel-smoothed item response functions for reference and focal groups. Simulation studies were conducted to investigate the Type I error and power of the proposed…
A ¤nonparametric dynamic additive regression model for longitudinal data
DEFF Research Database (Denmark)
Martinussen, T.; Scheike, T. H.
2000-01-01
dynamic linear models, estimating equations, least squares, longitudinal data, nonparametric methods, partly conditional mean models, time-varying-coefficient models......dynamic linear models, estimating equations, least squares, longitudinal data, nonparametric methods, partly conditional mean models, time-varying-coefficient models...
Alam, Md. Ashad; Fukumizu, Kenji; Wang, Yu-Ping
2016-01-01
To the best of our knowledge, there are no general well-founded robust methods for statistical unsupervised learning. Most of the unsupervised methods explicitly or implicitly depend on the kernel covariance operator (kernel CO) or kernel cross-covariance operator (kernel CCO). They are sensitive to contaminated data, even when using bounded positive definite kernels. First, we propose robust kernel covariance operator (robust kernel CO) and robust kernel crosscovariance operator (robust kern...
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Testing discontinuities in nonparametric regression
Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun
2017-01-01
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Approximate kernel competitive learning.
Wu, Jian-Sheng; Zheng, Wei-Shi; Lai, Jian-Huang
2015-03-01
Kernel competitive learning has been successfully used to achieve robust clustering. However, kernel competitive learning (KCL) is not scalable for large scale data processing, because (1) it has to calculate and store the full kernel matrix that is too large to be calculated and kept in the memory and (2) it cannot be computed in parallel. In this paper we develop a framework of approximate kernel competitive learning for processing large scale dataset. The proposed framework consists of two parts. First, it derives an approximate kernel competitive learning (AKCL), which learns kernel competitive learning in a subspace via sampling. We provide solid theoretical analysis on why the proposed approximation modelling would work for kernel competitive learning, and furthermore, we show that the computational complexity of AKCL is largely reduced. Second, we propose a pseudo-parallelled approximate kernel competitive learning (PAKCL) based on a set-based kernel competitive learning strategy, which overcomes the obstacle of using parallel programming in kernel competitive learning and significantly accelerates the approximate kernel competitive learning for large scale clustering. The empirical evaluation on publicly available datasets shows that the proposed AKCL and PAKCL can perform comparably as KCL, with a large reduction on computational cost. Also, the proposed methods achieve more effective clustering performance in terms of clustering precision against related approximate clustering approaches. Copyright © 2014 Elsevier Ltd. All rights reserved.
Efficient nonparametric estimation of causal mediation effects
Chan, K. C. G.; Imai, K.; Yam, S. C. P.; Zhang, Z.
2016-01-01
An essential goal of program evaluation and scientific research is the investigation of causal mechanisms. Over the past several decades, causal mediation analysis has been used in medical and social sciences to decompose the treatment effect into the natural direct and indirect effects. However, all of the existing mediation analysis methods rely on parametric modeling assumptions in one way or another, typically requiring researchers to specify multiple regression models involving the treat...
International Nuclear Information System (INIS)
Massaro, F.; Funk, S.; D'Abrusco, R.; Paggi, A.; Smith, Howard A.; Masetti, N.; Giroletti, M.; Tosti, G.
2013-01-01
Nearly one-third of the γ-ray sources detected by Fermi are still unidentified, despite significant recent progress in this area. However, all of the γ-ray extragalactic sources associated in the second Fermi-LAT catalog have a radio counterpart. Motivated by this observational evidence, we investigate all the radio sources of the major radio surveys that lie within the positional uncertainty region of the unidentified γ-ray sources (UGSs) at a 95% level of confidence. First, we search for their infrared counterparts in the all-sky survey performed by the Wide-field Infrared Survey Explorer (WISE) and then we analyze their IR colors in comparison with those of the known γ-ray blazars. We propose a new approach, on the basis of a two-dimensional kernel density estimation technique in the single [3.4] – [4.6] – [12] μm WISE color-color plot, replacing the constraint imposed in our previous investigations on the detection at 22 μm of each potential IR counterpart of the UGSs with associated radio emission. The main goal of this analysis is to find distant γ-ray blazar candidates that, being too faint at 22 μm, are not detected by WISE and thus are not selected by our purely IR-based methods. We find 55 UGSs that likely correspond to radio sources with blazar-like IR signatures. An additional 11 UGSs that have blazar-like IR colors have been found within the sample of sources found with deep recent Australia Telescope Compact Array observations
Nonparametric factor analysis of time series
Rodríguez-Poo, Juan M.; Linton, Oliver Bruce
1998-01-01
We introduce a nonparametric smoothing procedure for nonparametric factor analaysis of multivariate time series. The asymptotic properties of the proposed procedures are derived. We present an application based on the residuals from the Fair macromodel.
Directory of Open Access Journals (Sweden)
Xiaochun Sun
Full Text Available Genomic selection (GS procedures have proven useful in estimating breeding value and predicting phenotype with genome-wide molecular marker information. However, issues of high dimensionality, multicollinearity, and the inability to deal effectively with epistasis can jeopardize accuracy and predictive ability. We, therefore, propose a new nonparametric method, pRKHS, which combines the features of supervised principal component analysis (SPCA and reproducing kernel Hilbert spaces (RKHS regression, with versions for traits with no/low epistasis, pRKHS-NE, to high epistasis, pRKHS-E. Instead of assigning a specific relationship to represent the underlying epistasis, the method maps genotype to phenotype in a nonparametric way, thus requiring fewer genetic assumptions. SPCA decreases the number of markers needed for prediction by filtering out low-signal markers with the optimal marker set determined by cross-validation. Principal components are computed from reduced marker matrix (called supervised principal components, SPC and included in the smoothing spline ANOVA model as independent variables to fit the data. The new method was evaluated in comparison with current popular methods for practicing GS, specifically RR-BLUP, BayesA, BayesB, as well as a newer method by Crossa et al., RKHS-M, using both simulated and real data. Results demonstrate that pRKHS generally delivers greater predictive ability, particularly when epistasis impacts trait expression. Beyond prediction, the new method also facilitates inferences about the extent to which epistasis influences trait expression.
Sun, Xiaochun; Ma, Ping; Mumm, Rita H
2012-01-01
Genomic selection (GS) procedures have proven useful in estimating breeding value and predicting phenotype with genome-wide molecular marker information. However, issues of high dimensionality, multicollinearity, and the inability to deal effectively with epistasis can jeopardize accuracy and predictive ability. We, therefore, propose a new nonparametric method, pRKHS, which combines the features of supervised principal component analysis (SPCA) and reproducing kernel Hilbert spaces (RKHS) regression, with versions for traits with no/low epistasis, pRKHS-NE, to high epistasis, pRKHS-E. Instead of assigning a specific relationship to represent the underlying epistasis, the method maps genotype to phenotype in a nonparametric way, thus requiring fewer genetic assumptions. SPCA decreases the number of markers needed for prediction by filtering out low-signal markers with the optimal marker set determined by cross-validation. Principal components are computed from reduced marker matrix (called supervised principal components, SPC) and included in the smoothing spline ANOVA model as independent variables to fit the data. The new method was evaluated in comparison with current popular methods for practicing GS, specifically RR-BLUP, BayesA, BayesB, as well as a newer method by Crossa et al., RKHS-M, using both simulated and real data. Results demonstrate that pRKHS generally delivers greater predictive ability, particularly when epistasis impacts trait expression. Beyond prediction, the new method also facilitates inferences about the extent to which epistasis influences trait expression.
Employment of kernel methods on wind turbine power performance assessment
DEFF Research Database (Denmark)
Skrimpas, Georgios Alexandros; Sweeney, Christian Walsted; Marhadi, Kun S.
2015-01-01
A power performance assessment technique is developed for the detection of power production discrepancies in wind turbines. The method employs a widely used nonparametric pattern recognition technique, the kernel methods. The evaluation is based on the trending of an extracted feature from...... the kernel matrix, called similarity index, which is introduced by the authors for the first time. The operation of the turbine and consequently the computation of the similarity indexes is classified into five power bins offering better resolution and thus more consistent root cause analysis. The accurate...
Estimation and variable selection for generalized additive partial linear models
Wang, Li
2011-08-01
We study generalized additive partial linear models, proposing the use of polynomial spline smoothing for estimation of nonparametric functions, and deriving quasi-likelihood based estimators for the linear parameters. We establish asymptotic normality for the estimators of the parametric components. The procedure avoids solving large systems of equations as in kernel-based procedures and thus results in gains in computational simplicity. We further develop a class of variable selection procedures for the linear parameters by employing a nonconcave penalized quasi-likelihood, which is shown to have an asymptotic oracle property. Monte Carlo simulations and an empirical example are presented for illustration. © Institute of Mathematical Statistics, 2011.
Nonparametric predictive inference in reliability
International Nuclear Information System (INIS)
Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.
2002-01-01
We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function for a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI for replacement decisions. The emphasis is on introduction and illustration of NPI in reliability contexts, detailed mathematical justifications are presented elsewhere
A Cure for Variance Inflation in High Dimensional Kernel Principal Component Analysis
DEFF Research Database (Denmark)
Abrahamsen, Trine Julie; Hansen, Lars Kai
2011-01-01
Small sample high-dimensional principal component analysis (PCA) suffers from variance inflation and lack of generalizability. It has earlier been pointed out that a simple leave-one-out variance renormalization scheme can cure the problem. In this paper we generalize the cure in two directions......: First, we propose a computationally less intensive approximate leave-one-out estimator, secondly, we show that variance inflation is also present in kernel principal component analysis (kPCA) and we provide a non-parametric renormalization scheme which can quite efficiently restore generalizability in kPCA....... As for PCA our analysis also suggests a simplified approximate expression. © 2011 Trine J. Abrahamsen and Lars K. Hansen....
Adaptive nonparametric Bayesian inference using location-scale mixture priors
Jonge, de R.; Zanten, van J.H.
2010-01-01
We study location-scale mixture priors for nonparametric statistical problems, including multivariate regression, density estimation and classification. We show that a rate-adaptive procedure can be obtained if the prior is properly constructed. In particular, we show that adaptation is achieved if
The nonparametric bootstrap for the current status model
Groeneboom, P.; Hendrickx, K.
2017-01-01
It has been proved that direct bootstrapping of the nonparametric maximum likelihood estimator (MLE) of the distribution function in the current status model leads to inconsistent confidence intervals. We show that bootstrapping of functionals of the MLE can however be used to produce valid
Non-Parametric Analysis of Rating Transition and Default Data
DEFF Research Database (Denmark)
Fledelius, Peter; Lando, David; Perch Nielsen, Jens
2004-01-01
We demonstrate the use of non-parametric intensity estimation - including construction of pointwise confidence sets - for analyzing rating transition data. We find that transition intensities away from the class studied here for illustration strongly depend on the direction of the previous move b...
Directory of Open Access Journals (Sweden)
Adrien Rieux
Full Text Available Given its biological significance, determining the dispersal kernel (i.e., the distribution of dispersal distances of spore-producing pathogens is essential. Here, we report two field experiments designed to measure disease gradients caused by sexually- and asexually-produced spores of the wind-dispersed banana plant fungus Mycosphaerella fijiensis. Gradients were measured during a single generation and over 272 traps installed up to 1000 m along eight directions radiating from a traceable source of inoculum composed of fungicide-resistant strains. We adjusted several kernels differing in the shape of their tail and tested for two types of anisotropy. Contrasting dispersal kernels were observed between the two types of spores. For sexual spores (ascospores, we characterized both a steep gradient in the first few metres in all directions and rare long-distance dispersal (LDD events up to 1000 m from the source in two directions. A heavy-tailed kernel best fitted the disease gradient. Although ascospores distributed evenly in all directions, average dispersal distance was greater in two different directions without obvious correlation with wind patterns. For asexual spores (conidia, few dispersal events occurred outside of the source plot. A gradient up to 12.5 m from the source was observed in one direction only. Accordingly, a thin-tailed kernel best fitted the disease gradient, and anisotropy in both density and distance was correlated with averaged daily wind gust. We discuss the validity of our results as well as their implications in terms of disease diffusion and management strategy.
Zhang, Wencan; Leong, Siew Mun; Zhao, Feifei; Zhao, Fangju; Yang, Tiankui; Liu, Shaoquan
2018-05-01
With an interest to enhance the aroma of palm kernel oil (PKO), Viscozyme L, an enzyme complex containing a wide range of carbohydrases, was applied to alter the carbohydrates in palm kernels (PK) to modulate the formation of volatiles upon kernel roasting. After Viscozyme treatment, the content of simple sugars and free amino acids in PK increased by 4.4-fold and 4.5-fold, respectively. After kernel roasting and oil extraction, significantly more 2,5-dimethylfuran, 2-[(methylthio)methyl]-furan, 1-(2-furanyl)-ethanone, 1-(2-furyl)-2-propanone, 5-methyl-2-furancarboxaldehyde and 2-acetyl-5-methylfuran but less 2-furanmethanol and 2-furanmethanol acetate were found in treated PKO; the correlation between their formation and simple sugar profile was estimated by using partial least square regression (PLS1). Obvious differences in pyrroles and Strecker aldehydes were also found between the control and treated PKOs. Principal component analysis (PCA) clearly discriminated the treated PKOs from that of control PKOs on the basis of all volatile compounds. Such changes in volatiles translated into distinct sensory attributes, whereby treated PKO was more caramelic and burnt after aqueous extraction and more nutty, roasty, caramelic and smoky after solvent extraction. Copyright © 2018 Elsevier Ltd. All rights reserved.
Energy Technology Data Exchange (ETDEWEB)
Duff, I.
1994-12-31
This workshop focuses on kernels for iterative software packages. Specifically, the three speakers discuss various aspects of sparse BLAS kernels. Their topics are: `Current status of user lever sparse BLAS`; Current status of the sparse BLAS toolkit`; and `Adding matrix-matrix and matrix-matrix-matrix multiply to the sparse BLAS toolkit`.
Nonparametric identification of copula structures
Li, Bo
2013-06-01
We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric and based on the asymptotic distribution of the empirical copula process.We perform simulation experiments to evaluate our test and conclude that our method is reliable and powerful for assessing common assumptions on the structure of copulas, particularly when the sample size is moderately large. We illustrate our testing approach on two datasets. © 2013 American Statistical Association.
Nonparametric additive regression for repeatedly measured data
Carroll, R. J.
2009-05-20
We develop an easily computed smooth backfitting algorithm for additive model fitting in repeated measures problems. Our methodology easily copes with various settings, such as when some covariates are the same over repeated response measurements. We allow for a working covariance matrix for the regression errors, showing that our method is most efficient when the correct covariance matrix is used. The component functions achieve the known asymptotic variance lower bound for the scalar argument case. Smooth backfitting also leads directly to design-independent biases in the local linear case. Simulations show our estimator has smaller variance than the usual kernel estimator. This is also illustrated by an example from nutritional epidemiology. © 2009 Biometrika Trust.
Classification With Truncated Distance Kernel.
Huang, Xiaolin; Suykens, Johan A K; Wang, Shuning; Hornegger, Joachim; Maier, Andreas
2018-05-01
This brief proposes a truncated distance (TL1) kernel, which results in a classifier that is nonlinear in the global region but is linear in each subregion. With this kernel, the subregion structure can be trained using all the training data and local linear classifiers can be established simultaneously. The TL1 kernel has good adaptiveness to nonlinearity and is suitable for problems which require different nonlinearities in different areas. Though the TL1 kernel is not positive semidefinite, some classical kernel learning methods are still applicable which means that the TL1 kernel can be directly used in standard toolboxes by replacing the kernel evaluation. In numerical experiments, the TL1 kernel with a pregiven parameter achieves similar or better performance than the radial basis function kernel with the parameter tuned by cross validation, implying the TL1 kernel a promising nonlinear kernel for classification tasks.
A contingency table approach to nonparametric testing
Rayner, JCW
2000-01-01
Most texts on nonparametric techniques concentrate on location and linear-linear (correlation) tests, with less emphasis on dispersion effects and linear-quadratic tests. Tests for higher moment effects are virtually ignored. Using a fresh approach, A Contingency Table Approach to Nonparametric Testing unifies and extends the popular, standard tests by linking them to tests based on models for data that can be presented in contingency tables.This approach unifies popular nonparametric statistical inference and makes the traditional, most commonly performed nonparametric analyses much more comp
Nonparametric statistics for social and behavioral sciences
Kraska-MIller, M
2013-01-01
Introduction to Research in Social and Behavioral SciencesBasic Principles of ResearchPlanning for ResearchTypes of Research Designs Sampling ProceduresValidity and Reliability of Measurement InstrumentsSteps of the Research Process Introduction to Nonparametric StatisticsData AnalysisOverview of Nonparametric Statistics and Parametric Statistics Overview of Parametric Statistics Overview of Nonparametric StatisticsImportance of Nonparametric MethodsMeasurement InstrumentsAnalysis of Data to Determine Association and Agreement Pearson Chi-Square Test of Association and IndependenceContingency
Gärtner, Thomas
2009-01-01
This book provides a unique treatment of an important area of machine learning and answers the question of how kernel methods can be applied to structured data. Kernel methods are a class of state-of-the-art learning algorithms that exhibit excellent learning results in several application domains. Originally, kernel methods were developed with data in mind that can easily be embedded in a Euclidean vector space. Much real-world data does not have this property but is inherently structured. An example of such data, often consulted in the book, is the (2D) graph structure of molecules formed by
Locally linear approximation for Kernel methods : the Railway Kernel
Muñoz, Alberto; González, Javier
2008-01-01
In this paper we present a new kernel, the Railway Kernel, that works properly for general (nonlinear) classification problems, with the interesting property that acts locally as a linear kernel. In this way, we avoid potential problems due to the use of a general purpose kernel, like the RBF kernel, as the high dimension of the induced feature space. As a consequence, following our methodology the number of support vectors is much lower and, therefore, the generalization capab...
Using non-parametric methods in econometric production analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
2012-01-01
by investigating the relationship between the elasticity of scale and the farm size. We use a balanced panel data set of 371~specialised crop farms for the years 2004-2007. A non-parametric specification test shows that neither the Cobb-Douglas function nor the Translog function are consistent with the "true......Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify a functional form of the production function of which the Cobb...... parameter estimates, but also in biased measures which are derived from the parameters, such as elasticities. Therefore, we propose to use non-parametric econometric methods. First, these can be applied to verify the functional form used in parametric production analysis. Second, they can be directly used...
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Motai, Yuichi
2015-01-01
Describes and discusses the variants of kernel analysis methods for data types that have been intensely studied in recent years This book covers kernel analysis topics ranging from the fundamental theory of kernel functions to its applications. The book surveys the current status, popular trends, and developments in kernel analysis studies. The author discusses multiple kernel learning algorithms and how to choose the appropriate kernels during the learning phase. Data-Variant Kernel Analysis is a new pattern analysis framework for different types of data configurations. The chapters include
DEFF Research Database (Denmark)
Petersen, Annette
of kernels promoted (10 and 60 kernels/day for the general population and cancer patients, respectively), exposures exceeded the ARfD 17–413 and 3–71 times in toddlers and adults, respectively. The estimated maximum quantity of apricot kernels (or raw apricot material) that can be consumed without exceeding...
Energy Technology Data Exchange (ETDEWEB)
Boehlke, S.; Niegoth, H. [STEAG Energy Services GmbH, Essen (Germany). Nuclear Technologies; Stalder, I. [Kernkraftwerk Leibstadt AG, Leibstadt (Switzerland)
2012-11-01
In the nuclear power plant Leibstadt (KKL) during the next year large components will be dismantled and stored for final disposal within the interim storage facility ZENT at the NPP site. Before construction of ZENT appropriate estimations of the local dose rate inside and outside the building and the collective dose for the normal operation have to be performed. The shielding calculations are based on the properties of the stored components and radiation sources and on the concepts for working place requirements. The installation of control and monitoring areas will depend on these calculations. For the determination of the shielding potential of concrete walls and steel doors with the defined boundary conditions point-kernel codes like MICROSHIELd {sup registered} are used. Complex problems cannot be modeled with this code. Therefore the point-kernel code VISIPLAN {sup registered} was developed for the determination of the local dose distribution functions in 3D models. The possibility of motion sequence inputs allows an optimization of collective dose estimations for the operational phases of a nuclear facility.
Nonlinear Forecasting With Many Predictors Using Kernel Ridge Regression
DEFF Research Database (Denmark)
Exterkate, Peter; Groenen, Patrick J.F.; Heij, Christiaan
This paper puts forward kernel ridge regression as an approach for forecasting with many predictors that are related nonlinearly to the target variable. In kernel ridge regression, the observed predictor variables are mapped nonlinearly into a high-dimensional space, where estimation of the predi...
Nonparametric Inference for Periodic Sequences
Sun, Ying; Hart, Jeffrey D.; Genton, Marc G.
2012-01-01
the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both
Nonparametric tests for equality of psychometric functions.
García-Pérez, Miguel A; Núñez-Antón, Vicente
2017-12-07
Many empirical studies measure psychometric functions (curves describing how observers' performance varies with stimulus magnitude) because these functions capture the effects of experimental conditions. To assess these effects, parametric curves are often fitted to the data and comparisons are carried out by testing for equality of mean parameter estimates across conditions. This approach is parametric and, thus, vulnerable to violations of the implied assumptions. Furthermore, testing for equality of means of parameters may be misleading: Psychometric functions may vary meaningfully across conditions on an observer-by-observer basis with no effect on the mean values of the estimated parameters. Alternative approaches to assess equality of psychometric functions per se are thus needed. This paper compares three nonparametric tests that are applicable in all situations of interest: The existing generalized Mantel-Haenszel test, a generalization of the Berry-Mielke test that was developed here, and a split variant of the generalized Mantel-Haenszel test also developed here. Their statistical properties (accuracy and power) are studied via simulation and the results show that all tests are indistinguishable as to accuracy but they differ non-uniformly as to power. Empirical use of the tests is illustrated via analyses of published data sets and practical recommendations are given. The computer code in MATLAB and R to conduct these tests is available as Electronic Supplemental Material.
Nonparametric Bayesian Modeling of Complex Networks
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Mørup, Morten
2013-01-01
an infinite mixture model as running example, we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model?s fit and predictive performance. We explain how advanced nonparametric models......Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using...
Using non-parametric methods in econometric production analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify the functional form of the production function. Most often, the Cobb...... results—including measures that are of interest of applied economists, such as elasticities. Therefore, we propose to use nonparametric econometric methods. First, they can be applied to verify the functional form used in parametric estimations of production functions. Second, they can be directly used...
Directory of Open Access Journals (Sweden)
Hailun Wang
2017-01-01
Full Text Available Support vector regression algorithm is widely used in fault diagnosis of rolling bearing. A new model parameter selection method for support vector regression based on adaptive fusion of the mixed kernel function is proposed in this paper. We choose the mixed kernel function as the kernel function of support vector regression. The mixed kernel function of the fusion coefficients, kernel function parameters, and regression parameters are combined together as the parameters of the state vector. Thus, the model selection problem is transformed into a nonlinear system state estimation problem. We use a 5th-degree cubature Kalman filter to estimate the parameters. In this way, we realize the adaptive selection of mixed kernel function weighted coefficients and the kernel parameters, the regression parameters. Compared with a single kernel function, unscented Kalman filter (UKF support vector regression algorithms, and genetic algorithms, the decision regression function obtained by the proposed method has better generalization ability and higher prediction accuracy.
DEFF Research Database (Denmark)
Chen, Tianshi; Andersen, Martin Skovgaard; Ljung, Lennart
2014-01-01
Model estimation and structure detection with short data records are two issues that receive increasing interests in System Identification. In this paper, a multiple kernel-based regularization method is proposed to handle those issues. Multiple kernels are conic combinations of fixed kernels...
Essays on nonparametric econometrics of stochastic volatility
Zu, Y.
2012-01-01
Volatility is a concept that describes the variation of financial returns. Measuring and modelling volatility dynamics is an important aspect of financial econometrics. This thesis is concerned with nonparametric approaches to volatility measurement and volatility model validation.
Nonparametric instrumental regression with non-convex constraints
International Nuclear Information System (INIS)
Grasmair, M; Scherzer, O; Vanhems, A
2013-01-01
This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition. (paper)
Nonparametric instrumental regression with non-convex constraints
Grasmair, M.; Scherzer, O.; Vanhems, A.
2013-03-01
This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.
Single versus mixture Weibull distributions for nonparametric satellite reliability
International Nuclear Information System (INIS)
Castet, Jean-Francois; Saleh, Joseph H.
2010-01-01
Long recognized as a critical design attribute for space systems, satellite reliability has not yet received the proper attention as limited on-orbit failure data and statistical analyses can be found in the technical literature. To fill this gap, we recently conducted a nonparametric analysis of satellite reliability for 1584 Earth-orbiting satellites launched between January 1990 and October 2008. In this paper, we provide an advanced parametric fit, based on mixture of Weibull distributions, and compare it with the single Weibull distribution model obtained with the Maximum Likelihood Estimation (MLE) method. We demonstrate that both parametric fits are good approximations of the nonparametric satellite reliability, but that the mixture Weibull distribution provides significant accuracy in capturing all the failure trends in the failure data, as evidenced by the analysis of the residuals and their quasi-normal dispersion.
Froelich, Markus; Puhani, Patrick
2004-01-01
Based on a nonparametrically estimated model of labor market classifications, this paper makes suggestions for immigration policy using data from western Germany in the 1990s. It is demonstrated that nonparametric regression is feasible in higher dimensions with only a few thousand observations. In sum, labor markets able to absorb immigrants are characterized by above average age and by professional occupations. On the other hand, labor markets for young workers in service occupations are id...
On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests
Directory of Open Access Journals (Sweden)
Aaditya Ramdas
2017-01-01
Full Text Available Nonparametric two-sample or homogeneity testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. The literature is old and rich, with a wide variety of statistics having being designed and analyzed, both for the unidimensional and the multivariate setting. Inthisshortsurvey,wefocusonteststatisticsthatinvolvetheWassersteindistance. Usingan entropic smoothing of the Wasserstein distance, we connect these to very different tests including multivariate methods involving energy statistics and kernel based maximum mean discrepancy and univariate methods like the Kolmogorov–Smirnov test, probability or quantile (PP/QQ plots and receiver operating characteristic or ordinal dominance (ROC/ODC curves. Some observations are implicit in the literature, while others seem to have not been noticed thus far. Given nonparametric two-sample testing’s classical and continued importance, we aim to provide useful connections for theorists and practitioners familiar with one subset of methods but not others.
Kernel methods for deep learning
Cho, Youngmin
2012-01-01
We introduce a new family of positive-definite kernels that mimic the computation in large neural networks. We derive the different members of this family by considering neural networks with different activation functions. Using these kernels as building blocks, we also show how to construct other positive-definite kernels by operations such as composition, multiplication, and averaging. We explore the use of these kernels in standard models of supervised learning, such as support vector mach...
DEFF Research Database (Denmark)
Sommer, Stefan Horst; Lauze, Francois Bernard; Nielsen, Mads
2011-01-01
In the LDDMM framework, optimal warps for image registration are found as end-points of critical paths for an energy functional, and the EPDiff equations describe the evolution along such paths. The Large Deformation Diffeomorphic Kernel Bundle Mapping (LDDKBM) extension of LDDMM allows scale space...
Spafford, Eugene H.; Mckendry, Martin S.
1986-01-01
An overview of the internal structure of the Clouds kernel was presented. An indication of how these structures will interact in the prototype Clouds implementation is given. Many specific details have yet to be determined and await experimentation with an actual working system.
Explicit signal to noise ratio in reproducing kernel Hilbert spaces
DEFF Research Database (Denmark)
Gomez-Chova, Luis; Nielsen, Allan Aasbjerg; Camps-Valls, Gustavo
2011-01-01
This paper introduces a nonlinear feature extraction method based on kernels for remote sensing data analysis. The proposed approach is based on the minimum noise fraction (MNF) transform, which maximizes the signal variance while also minimizing the estimated noise variance. We here propose...... an alternative kernel MNF (KMNF) in which the noise is explicitly estimated in the reproducing kernel Hilbert space. This enables KMNF dealing with non-linear relations between the noise and the signal features jointly. Results show that the proposed KMNF provides the most noise-free features when confronted...
Evaluation of Nonparametric Probabilistic Forecasts of Wind Power
DEFF Research Database (Denmark)
Pinson, Pierre; Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg, orlov 31.07.2008
Predictions of wind power production for horizons up to 48-72 hour ahead comprise a highly valuable input to the methods for the daily management or trading of wind generation. Today, users of wind power predictions are not only provided with point predictions, which are estimates of the most...... likely outcome for each look-ahead time, but also with uncertainty estimates given by probabilistic forecasts. In order to avoid assumptions on the shape of predictive distributions, these probabilistic predictions are produced from nonparametric methods, and then take the form of a single or a set...
Viscosity kernel of molecular fluids
DEFF Research Database (Denmark)
Puscasu, Ruslan; Todd, Billy; Daivis, Peter
2010-01-01
, temperature, and chain length dependencies of the reciprocal and real-space viscosity kernels are presented. We find that the density has a major effect on the shape of the kernel. The temperature range and chain lengths considered here have by contrast less impact on the overall normalized shape. Functional...... forms that fit the wave-vector-dependent kernel data over a large density and wave-vector range have also been tested. Finally, a structural normalization of the kernels in physical space is considered. Overall, the real-space viscosity kernel has a width of roughly 3–6 atomic diameters, which means...
2013-01-01
Background Arguably, genotypes and phenotypes may be linked in functional forms that are not well addressed by the linear additive models that are standard in quantitative genetics. Therefore, developing statistical learning models for predicting phenotypic values from all available molecular information that are capable of capturing complex genetic network architectures is of great importance. Bayesian kernel ridge regression is a non-parametric prediction model proposed for this purpose. Its essence is to create a spatial distance-based relationship matrix called a kernel. Although the set of all single nucleotide polymorphism genotype configurations on which a model is built is finite, past research has mainly used a Gaussian kernel. Results We sought to investigate the performance of a diffusion kernel, which was specifically developed to model discrete marker inputs, using Holstein cattle and wheat data. This kernel can be viewed as a discretization of the Gaussian kernel. The predictive ability of the diffusion kernel was similar to that of non-spatial distance-based additive genomic relationship kernels in the Holstein data, but outperformed the latter in the wheat data. However, the difference in performance between the diffusion and Gaussian kernels was negligible. Conclusions It is concluded that the ability of a diffusion kernel to capture the total genetic variance is not better than that of a Gaussian kernel, at least for these data. Although the diffusion kernel as a choice of basis function may have potential for use in whole-genome prediction, our results imply that embedding genetic markers into a non-Euclidean metric space has very small impact on prediction. Our results suggest that use of the black box Gaussian kernel is justified, given its connection to the diffusion kernel and its similar predictive performance. PMID:23763755
Effects of dating errors on nonparametric trend analyses of speleothem time series
Directory of Open Access Journals (Sweden)
M. Mudelsee
2012-10-01
Full Text Available A fundamental problem in paleoclimatology is to take fully into account the various error sources when examining proxy records with quantitative methods of statistical time series analysis. Records from dated climate archives such as speleothems add extra uncertainty from the age determination to the other sources that consist in measurement and proxy errors. This paper examines three stalagmite time series of oxygen isotopic composition (δ^{18}O from two caves in western Germany, the series AH-1 from the Atta Cave and the series Bu1 and Bu4 from the Bunker Cave. These records carry regional information about past changes in winter precipitation and temperature. U/Th and radiocarbon dating reveals that they cover the later part of the Holocene, the past 8.6 thousand years (ka. We analyse centennial- to millennial-scale climate trends by means of nonparametric Gasser–Müller kernel regression. Error bands around fitted trend curves are determined by combining (1 block bootstrap resampling to preserve noise properties (shape, autocorrelation of the δ^{18}O residuals and (2 timescale simulations (models StalAge and iscam. The timescale error influences on centennial- to millennial-scale trend estimation are not excessively large. We find a "mid-Holocene climate double-swing", from warm to cold to warm winter conditions (6.5 ka to 6.0 ka to 5.1 ka, with warm–cold amplitudes of around 0.5‰ δ^{18}O; this finding is documented by all three records with high confidence. We also quantify the Medieval Warm Period (MWP, the Little Ice Age (LIA and the current warmth. Our analyses cannot unequivocally support the conclusion that current regional winter climate is warmer than that during the MWP.
Rizvi, S.Z.; Mohammadpour, J.; Toth, R.; Meskin, N.
2015-01-01
This paper first describes the development of a nonparametric identification method for linear parameter-varying (LPV) state-space models and then applies it to a nonlinear process system. The proposed method uses kernel-based least-squares support vector machines (LS-SVM). While parametric
An Adaptive Genetic Association Test Using Double Kernel Machines.
Zhan, Xiang; Epstein, Michael P; Ghosh, Debashis
2015-10-01
Recently, gene set-based approaches have become very popular in gene expression profiling studies for assessing how genetic variants are related to disease outcomes. Since most genes are not differentially expressed, existing pathway tests considering all genes within a pathway suffer from considerable noise and power loss. Moreover, for a differentially expressed pathway, it is of interest to select important genes that drive the effect of the pathway. In this article, we propose an adaptive association test using double kernel machines (DKM), which can both select important genes within the pathway as well as test for the overall genetic pathway effect. This DKM procedure first uses the garrote kernel machines (GKM) test for the purposes of subset selection and then the least squares kernel machine (LSKM) test for testing the effect of the subset of genes. An appealing feature of the kernel machine framework is that it can provide a flexible and unified method for multi-dimensional modeling of the genetic pathway effect allowing for both parametric and nonparametric components. This DKM approach is illustrated with application to simulated data as well as to data from a neuroimaging genetics study.
Steerability of Hermite Kernel
Czech Academy of Sciences Publication Activity Database
Yang, Bo; Flusser, Jan; Suk, Tomáš
2013-01-01
Roč. 27, č. 4 (2013), 1354006-1-1354006-25 ISSN 0218-0014 R&D Projects: GA ČR GAP103/11/1552 Institutional support: RVO:67985556 Keywords : Hermite polynomials * Hermite kernel * steerability * adaptive filtering Subject RIV: JD - Computer Applications, Robotics Impact factor: 0.558, year: 2013 http://library.utia.cas.cz/separaty/2013/ZOI/yang-0394387. pdf
Kernel-based whole-genome prediction of complex traits: a review.
Morota, Gota; Gianola, Daniel
2014-01-01
Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.
Kernel-based whole-genome prediction of complex traits: a review
Directory of Open Access Journals (Sweden)
Gota eMorota
2014-10-01
Full Text Available Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways, thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.
Support vector machines for nuclear reactor state estimation
Energy Technology Data Exchange (ETDEWEB)
Zavaljevski, N.; Gross, K. C.
2000-02-14
Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMS) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformed into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm.
Support vector machines for nuclear reactor state estimation
International Nuclear Information System (INIS)
Zavaljevski, N.; Gross, K. C.
2000-01-01
Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMS) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformed into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm
Recent Advances and Trends in Nonparametric Statistics
Akritas, MG
2003-01-01
The advent of high-speed, affordable computers in the last two decades has given a new boost to the nonparametric way of thinking. Classical nonparametric procedures, such as function smoothing, suddenly lost their abstract flavour as they became practically implementable. In addition, many previously unthinkable possibilities became mainstream; prime examples include the bootstrap and resampling methods, wavelets and nonlinear smoothers, graphical methods, data mining, bioinformatics, as well as the more recent algorithmic approaches such as bagging and boosting. This volume is a collection o
A Nonparametric Bayesian Approach For Emission Tomography Reconstruction
International Nuclear Information System (INIS)
Barat, Eric; Dautremer, Thomas
2007-01-01
We introduce a PET reconstruction algorithm following a nonparametric Bayesian (NPB) approach. In contrast with Expectation Maximization (EM), the proposed technique does not rely on any space discretization. Namely, the activity distribution--normalized emission intensity of the spatial poisson process--is considered as a spatial probability density and observations are the projections of random emissions whose distribution has to be estimated. This approach is nonparametric in the sense that the quantity of interest belongs to the set of probability measures on R k (for reconstruction in k-dimensions) and it is Bayesian in the sense that we define a prior directly on this spatial measure. In this context, we propose to model the nonparametric probability density as an infinite mixture of multivariate normal distributions. As a prior for this mixture we consider a Dirichlet Process Mixture (DPM) with a Normal-Inverse Wishart (NIW) model as base distribution of the Dirichlet Process. As in EM-family reconstruction, we use a data augmentation scheme where the set of hidden variables are the emission locations for each observed line of response in the continuous object space. Thanks to the data augmentation, we propose a Markov Chain Monte Carlo (MCMC) algorithm (Gibbs sampler) which is able to generate draws from the posterior distribution of the spatial intensity. A difference with EM is that one step of the Gibbs sampler corresponds to the generation of emission locations while only the expected number of emissions per pixel/voxel is used in EM. Another key difference is that the estimated spatial intensity is a continuous function such that there is no need to compute a projection matrix. Finally, draws from the intensity posterior distribution allow the estimation of posterior functionnals like the variance or confidence intervals. Results are presented for simulated data based on a 2D brain phantom and compared to Bayesian MAP-EM
DEFF Research Database (Denmark)
Arenas-Garcia, J.; Petersen, K.; Camps-Valls, G.
2013-01-01
correlation analysis (CCA), and orthonormalized PLS (OPLS), as well as their nonlinear extensions derived by means of the theory of reproducing kernel Hilbert spaces (RKHSs). We also review their connections to other methods for classification and statistical dependence estimation and introduce some recent...
Kernel Machine SNP-set Testing under Multiple Candidate Kernels
Wu, Michael C.; Maity, Arnab; Lee, Seunggeun; Simmons, Elizabeth M.; Harmon, Quaker E.; Lin, Xinyi; Engel, Stephanie M.; Molldrem, Jeffrey J.; Armistead, Paul M.
2013-01-01
Joint testing for the cumulative effect of multiple single nucleotide polymorphisms grouped on the basis of prior biological knowledge has become a popular and powerful strategy for the analysis of large scale genetic association studies. The kernel machine (KM) testing framework is a useful approach that has been proposed for testing associations between multiple genetic variants and many different types of complex traits by comparing pairwise similarity in phenotype between subjects to pairwise similarity in genotype, with similarity in genotype defined via a kernel function. An advantage of the KM framework is its flexibility: choosing different kernel functions allows for different assumptions concerning the underlying model and can allow for improved power. In practice, it is difficult to know which kernel to use a priori since this depends on the unknown underlying trait architecture and selecting the kernel which gives the lowest p-value can lead to inflated type I error. Therefore, we propose practical strategies for KM testing when multiple candidate kernels are present based on constructing composite kernels and based on efficient perturbation procedures. We demonstrate through simulations and real data applications that the procedures protect the type I error rate and can lead to substantially improved power over poor choices of kernels and only modest differences in power versus using the best candidate kernel. PMID:23471868
NONPARAMETRIC IDENTIFICATION FOR NONLINEAR AUTOREGRESSIVE TIMESERIES MODELS： CONVERGENCE RATES
Institute of Scientific and Technical Information of China (English)
LUZUDI; CHENGPING
1999-01-01
In this paper the optimal convergence rates of estimators ba~ed on kernel approach fornonlinear AR model are investigated in the sense of Stone[17'1a]. By combining the mixingproperty of the stationary solution with the characteristics of the model itself, the restrictiveconditions in the literature which are not easy to be satisfied by the nonlinear AR model axeremoved, and the mild conditions are obtained to guarantee the optimal ratea of the estimatorof autoregTession function. In addition: the strongly coasistent estimator of the ~riance ofwhite noise is also constructed.
Coupling individual kernel-filling processes with source-sink interactions into GREENLAB-Maize.
Ma, Yuntao; Chen, Youjia; Zhu, Jinyu; Meng, Lei; Guo, Yan; Li, Baoguo; Hoogenboom, Gerrit
2018-02-13
Failure to account for the variation of kernel growth in a cereal crop simulation model may cause serious deviations in the estimates of crop yield. The goal of this research was to revise the GREENLAB-Maize model to incorporate source- and sink-limited allocation approaches to simulate the dry matter accumulation of individual kernels of an ear (GREENLAB-Maize-Kernel). The model used potential individual kernel growth rates to characterize the individual potential sink demand. The remobilization of non-structural carbohydrates from reserve organs to kernels was also incorporated. Two years of field experiments were conducted to determine the model parameter values and to evaluate the model using two maize hybrids with different plant densities and pollination treatments. Detailed observations were made on the dimensions and dry weights of individual kernels and other above-ground plant organs throughout the seasons. Three basic traits characterizing an individual kernel were compared on simulated and measured individual kernels: (1) final kernel size; (2) kernel growth rate; and (3) duration of kernel filling. Simulations of individual kernel growth closely corresponded to experimental data. The model was able to reproduce the observed dry weight of plant organs well. Then, the source-sink dynamics and the remobilization of carbohydrates for kernel growth were quantified to show that remobilization processes accompanied source-sink dynamics during the kernel-filling process. We conclude that the model may be used to explore options for optimizing plant kernel yield by matching maize management to the environment, taking into account responses at the level of individual kernels. © The Author(s) 2018. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Kernel Temporal Differences for Neural Decoding
Bae, Jihye; Sanchez Giraldo, Luis G.; Pohlmeyer, Eric A.; Francis, Joseph T.; Sanchez, Justin C.; Príncipe, José C.
2015-01-01
We study the feasibility and capability of the kernel temporal difference (KTD)(λ) algorithm for neural decoding. KTD(λ) is an online, kernel-based learning algorithm, which has been introduced to estimate value functions in reinforcement learning. This algorithm combines kernel-based representations with the temporal difference approach to learning. One of our key observations is that by using strictly positive definite kernels, algorithm's convergence can be guaranteed for policy evaluation. The algorithm's nonlinear functional approximation capabilities are shown in both simulations of policy evaluation and neural decoding problems (policy improvement). KTD can handle high-dimensional neural states containing spatial-temporal information at a reasonable computational complexity allowing real-time applications. When the algorithm seeks a proper mapping between a monkey's neural states and desired positions of a computer cursor or a robot arm, in both open-loop and closed-loop experiments, it can effectively learn the neural state to action mapping. Finally, a visualization of the coadaptation process between the decoder and the subject shows the algorithm's capabilities in reinforcement learning brain machine interfaces. PMID:25866504
Digital spectral analysis parametric, non-parametric and advanced methods
Castanié, Francis
2013-01-01
Digital Spectral Analysis provides a single source that offers complete coverage of the spectral analysis domain. This self-contained work includes details on advanced topics that are usually presented in scattered sources throughout the literature.The theoretical principles necessary for the understanding of spectral analysis are discussed in the first four chapters: fundamentals, digital signal processing, estimation in spectral analysis, and time-series models.An entire chapter is devoted to the non-parametric methods most widely used in industry.High resolution methods a
Smolka, Gert
1994-01-01
Oz is a concurrent language providing for functional, object-oriented, and constraint programming. This paper defines Kernel Oz, a semantically complete sublanguage of Oz. It was an important design requirement that Oz be definable by reduction to a lean kernel language. The definition of Kernel Oz introduces three essential abstractions: the Oz universe, the Oz calculus, and the actor model. The Oz universe is a first-order structure defining the values and constraints Oz computes with. The ...
2010-01-01
... 7 Agriculture 8 2010-01-01 2010-01-01 false Edible kernel. 981.7 Section 981.7 Agriculture... Regulating Handling Definitions § 981.7 Edible kernel. Edible kernel means a kernel, piece, or particle of almond kernel that is not inedible. [41 FR 26852, June 30, 1976] ...
7 CFR 981.408 - Inedible kernel.
2010-01-01
... 7 Agriculture 8 2010-01-01 2010-01-01 false Inedible kernel. 981.408 Section 981.408 Agriculture... Administrative Rules and Regulations § 981.408 Inedible kernel. Pursuant to § 981.8, the definition of inedible kernel is modified to mean a kernel, piece, or particle of almond kernel with any defect scored as...
7 CFR 981.8 - Inedible kernel.
2010-01-01
... 7 Agriculture 8 2010-01-01 2010-01-01 false Inedible kernel. 981.8 Section 981.8 Agriculture... Regulating Handling Definitions § 981.8 Inedible kernel. Inedible kernel means a kernel, piece, or particle of almond kernel with any defect scored as serious damage, or damage due to mold, gum, shrivel, or...
Teaching Nonparametric Statistics Using Student Instrumental Values.
Anderson, Jonathan W.; Diddams, Margaret
Nonparametric statistics are often difficult to teach in introduction to statistics courses because of the lack of real-world examples. This study demonstrated how teachers can use differences in the rankings and ratings of undergraduate and graduate values to discuss: (1) ipsative and normative scaling; (2) uses of the Mann-Whitney U-test; and…
Nonparametric predictive inference in statistical process control
Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.
2000-01-01
New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities hased on
Nonparametric predictive inference in statistical process control
Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.
2004-01-01
Statistical process control (SPC) is used to decide when to stop a process as confidence in the quality of the next item(s) is low. Information to specify a parametric model is not always available, and as SPC is of a predictive nature, we present a control chart developed using nonparametric
A Bayesian Nonparametric Approach to Factor Analysis
DEFF Research Database (Denmark)
Piatek, Rémi; Papaspiliopoulos, Omiros
2018-01-01
This paper introduces a new approach for the inference of non-Gaussian factor models based on Bayesian nonparametric methods. It relaxes the usual normality assumption on the latent factors, widely used in practice, which is too restrictive in many settings. Our approach, on the contrary, does no...
Donnelly, Aoife; Misstear, Bruce; Broderick, Brian
2011-02-15
Background concentrations of nitrogen dioxide (NO(2)) are not constant but vary temporally and spatially. The current paper presents a powerful tool for the quantification of the effects of wind direction and wind speed on background NO(2) concentrations, particularly in cases where monitoring data are limited. In contrast to previous studies which applied similar methods to sites directly affected by local pollution sources, the current study focuses on background sites with the aim of improving methods for predicting background concentrations adopted in air quality modelling studies. The relationship between measured NO(2) concentration in air at three such sites in Ireland and locally measured wind direction has been quantified using nonparametric regression methods. The major aim was to analyse a method for quantifying the effects of local wind direction on background levels of NO(2) in Ireland. The method was expanded to include wind speed as an added predictor variable. A Gaussian kernel function is used in the analysis and circular statistics employed for the wind direction variable. Wind direction and wind speed were both found to have a statistically significant effect on background levels of NO(2) at all three sites. Frequently environmental impact assessments are based on short term baseline monitoring producing a limited dataset. The presented non-parametric regression methods, in contrast to the frequently used methods such as binning of the data, allow concentrations for missing data pairs to be estimated and distinction between spurious and true peaks in concentrations to be made. The methods were found to provide a realistic estimation of long term concentration variation with wind direction and speed, even for cases where the data set is limited. Accurate identification of the actual variation at each location and causative factors could be made, thus supporting the improved definition of background concentrations for use in air quality modelling
A Design-Adaptive Local Polynomial Estimator for the Errors-in-Variables Problem
Delaigle, Aurore
2009-03-01
Local polynomial estimators are popular techniques for nonparametric regression estimation and have received great attention in the literature. Their simplest version, the local constant estimator, can be easily extended to the errors-in-variables context by exploiting its similarity with the deconvolution kernel density estimator. The generalization of the higher order versions of the estimator, however, is not straightforward and has remained an open problem for the last 15 years. We propose an innovative local polynomial estimator of any order in the errors-in-variables context, derive its design-adaptive asymptotic properties and study its finite sample performance on simulated examples. We provide not only a solution to a long-standing open problem, but also provide methodological contributions to error-invariable regression, including local polynomial estimation of derivative functions.
Adaptive kernel regression for freehand 3D ultrasound reconstruction
Alshalalfah, Abdel-Latif; Daoud, Mohammad I.; Al-Najar, Mahasen
2017-03-01
Freehand three-dimensional (3D) ultrasound imaging enables low-cost and flexible 3D scanning of arbitrary-shaped organs, where the operator can freely move a two-dimensional (2D) ultrasound probe to acquire a sequence of tracked cross-sectional images of the anatomy. Often, the acquired 2D ultrasound images are irregularly and sparsely distributed in the 3D space. Several 3D reconstruction algorithms have been proposed to synthesize 3D ultrasound volumes based on the acquired 2D images. A challenging task during the reconstruction process is to preserve the texture patterns in the synthesized volume and ensure that all gaps in the volume are correctly filled. This paper presents an adaptive kernel regression algorithm that can effectively reconstruct high-quality freehand 3D ultrasound volumes. The algorithm employs a kernel regression model that enables nonparametric interpolation of the voxel gray-level values. The kernel size of the regression model is adaptively adjusted based on the characteristics of the voxel that is being interpolated. In particular, when the algorithm is employed to interpolate a voxel located in a region with dense ultrasound data samples, the size of the kernel is reduced to preserve the texture patterns. On the other hand, the size of the kernel is increased in areas that include large gaps to enable effective gap filling. The performance of the proposed algorithm was compared with seven previous interpolation approaches by synthesizing freehand 3D ultrasound volumes of a benign breast tumor. The experimental results show that the proposed algorithm outperforms the other interpolation approaches.
Cid, Jaime A; von Davier, Alina A
2015-05-01
Test equating is a method of making the test scores from different test forms of the same assessment comparable. In the equating process, an important step involves continuizing the discrete score distributions. In traditional observed-score equating, this step is achieved using linear interpolation (or an unscaled uniform kernel). In the kernel equating (KE) process, this continuization process involves Gaussian kernel smoothing. It has been suggested that the choice of bandwidth in kernel smoothing controls the trade-off between variance and bias. In the literature on estimating density functions using kernels, it has also been suggested that the weight of the kernel depends on the sample size, and therefore, the resulting continuous distribution exhibits bias at the endpoints, where the samples are usually smaller. The purpose of this article is (a) to explore the potential effects of atypical scores (spikes) at the extreme ends (high and low) on the KE method in distributions with different degrees of asymmetry using the randomly equivalent groups equating design (Study I), and (b) to introduce the Epanechnikov and adaptive kernels as potential alternative approaches to reducing boundary bias in smoothing (Study II). The beta-binomial model is used to simulate observed scores reflecting a range of different skewed shapes.
Input Space Regularization Stabilizes Pre-images for Kernel PCA De-noising
DEFF Research Database (Denmark)
Abrahamsen, Trine Julie; Hansen, Lars Kai
2009-01-01
Solution of the pre-image problem is key to efficient nonlinear de-noising using kernel Principal Component Analysis. Pre-image estimation is inherently ill-posed for typical kernels used in applications and consequently the most widely used estimation schemes lack stability. For de...
Nonparametric Bayes Modeling of Multivariate Categorical Data.
Dunson, David B; Xing, Chuanhua
2012-01-01
Modeling of multivariate unordered categorical (nominal) data is a challenging problem, particularly in high dimensions and cases in which one wishes to avoid strong assumptions about the dependence structure. Commonly used approaches rely on the incorporation of latent Gaussian random variables or parametric latent class models. The goal of this article is to develop a nonparametric Bayes approach, which defines a prior with full support on the space of distributions for multiple unordered categorical variables. This support condition ensures that we are not restricting the dependence structure a priori. We show this can be accomplished through a Dirichlet process mixture of product multinomial distributions, which is also a convenient form for posterior computation. Methods for nonparametric testing of violations of independence are proposed, and the methods are applied to model positional dependence within transcription factor binding motifs.
Quantized kernel least mean square algorithm.
Chen, Badong; Zhao, Songlin; Zhu, Pingping; Príncipe, José C
2012-01-01
In this paper, we propose a quantization approach, as an alternative of sparsification, to curb the growth of the radial basis function structure in kernel adaptive filtering. The basic idea behind this method is to quantize and hence compress the input (or feature) space. Different from sparsification, the new approach uses the "redundant" data to update the coefficient of the closest center. In particular, a quantized kernel least mean square (QKLMS) algorithm is developed, which is based on a simple online vector quantization method. The analytical study of the mean square convergence has been carried out. The energy conservation relation for QKLMS is established, and on this basis we arrive at a sufficient condition for mean square convergence, and a lower and upper bound on the theoretical value of the steady-state excess mean square error. Static function estimation and short-term chaotic time-series prediction examples are presented to demonstrate the excellent performance.
Kernel-based noise filtering of neutron detector signals
International Nuclear Information System (INIS)
Park, Moon Ghu; Shin, Ho Cheol; Lee, Eun Ki
2007-01-01
This paper describes recently developed techniques for effective filtering of neutron detector signal noise. In this paper, three kinds of noise filters are proposed and their performance is demonstrated for the estimation of reactivity. The tested filters are based on the unilateral kernel filter, unilateral kernel filter with adaptive bandwidth and bilateral filter to show their effectiveness in edge preservation. Filtering performance is compared with conventional low-pass and wavelet filters. The bilateral filter shows a remarkable improvement compared with unilateral kernel and wavelet filters. The effectiveness and simplicity of the unilateral kernel filter with adaptive bandwidth is also demonstrated by applying it to the reactivity measurement performed during reactor start-up physics tests
Omnibus risk assessment via accelerated failure time kernel machine modeling.
Sinnott, Jennifer A; Cai, Tianxi
2013-12-01
Integrating genomic information with traditional clinical risk factors to improve the prediction of disease outcomes could profoundly change the practice of medicine. However, the large number of potential markers and possible complexity of the relationship between markers and disease make it difficult to construct accurate risk prediction models. Standard approaches for identifying important markers often rely on marginal associations or linearity assumptions and may not capture non-linear or interactive effects. In recent years, much work has been done to group genes into pathways and networks. Integrating such biological knowledge into statistical learning could potentially improve model interpretability and reliability. One effective approach is to employ a kernel machine (KM) framework, which can capture nonlinear effects if nonlinear kernels are used (Scholkopf and Smola, 2002; Liu et al., 2007, 2008). For survival outcomes, KM regression modeling and testing procedures have been derived under a proportional hazards (PH) assumption (Li and Luan, 2003; Cai, Tonini, and Lin, 2011). In this article, we derive testing and prediction methods for KM regression under the accelerated failure time (AFT) model, a useful alternative to the PH model. We approximate the null distribution of our test statistic using resampling procedures. When multiple kernels are of potential interest, it may be unclear in advance which kernel to use for testing and estimation. We propose a robust Omnibus Test that combines information across kernels, and an approach for selecting the best kernel for estimation. The methods are illustrated with an application in breast cancer. © 2013, The International Biometric Society.
An adaptive distance measure for use with nonparametric models
International Nuclear Information System (INIS)
Garvey, D. R.; Hines, J. W.
2006-01-01
Distance measures perform a critical task in nonparametric, locally weighted regression. Locally weighted regression (LWR) models are a form of 'lazy learning' which construct a local model 'on the fly' by comparing a query vector to historical, exemplar vectors according to a three step process. First, the distance of the query vector to each of the exemplar vectors is calculated. Next, these distances are passed to a kernel function, which converts the distances to similarities or weights. Finally, the model output or response is calculated by performing locally weighted polynomial regression. To date, traditional distance measures, such as the Euclidean, weighted Euclidean, and L1-norm have been used as the first step in the prediction process. Since these measures do not take into consideration sensor failures and drift, they are inherently ill-suited for application to 'real world' systems. This paper describes one such LWR model, namely auto associative kernel regression (AAKR), and describes a new, Adaptive Euclidean distance measure that can be used to dynamically compensate for faulty sensor inputs. In this new distance measure, the query observations that lie outside of the training range (i.e. outside the minimum and maximum input exemplars) are dropped from the distance calculation. This allows for the distance calculation to be robust to sensor drifts and failures, in addition to providing a method for managing inputs that exceed the training range. In this paper, AAKR models using the standard and Adaptive Euclidean distance are developed and compared for the pressure system of an operating nuclear power plant. It is shown that using the standard Euclidean distance for data with failed inputs, significant errors in the AAKR predictions can result. By using the Adaptive Euclidean distance it is shown that high fidelity predictions are possible, in spite of the input failure. In fact, it is shown that with the Adaptive Euclidean distance prediction
Bruemmer, David J [Idaho Falls, ID
2009-11-17
A robot platform includes perceptors, locomotors, and a system controller. The system controller executes a robot intelligence kernel (RIK) that includes a multi-level architecture and a dynamic autonomy structure. The multi-level architecture includes a robot behavior level for defining robot behaviors, that incorporate robot attributes and a cognitive level for defining conduct modules that blend an adaptive interaction between predefined decision functions and the robot behaviors. The dynamic autonomy structure is configured for modifying a transaction capacity between an operator intervention and a robot initiative and may include multiple levels with at least a teleoperation mode configured to maximize the operator intervention and minimize the robot initiative and an autonomous mode configured to minimize the operator intervention and maximize the robot initiative. Within the RIK at least the cognitive level includes the dynamic autonomy structure.
Network structure exploration via Bayesian nonparametric models
International Nuclear Information System (INIS)
Chen, Y; Wang, X L; Xiang, X; Tang, B Z; Bu, J Z
2015-01-01
Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups there are in a complex network and how to group the nodes of the network. Most existing structure exploration methods need to specify either a group number or a certain type of structure when they are applied to a network. In the real world, however, the group number and also the certain type of structure that a network has are usually unknown in advance. To explore structural regularities in complex networks automatically, without any prior knowledge of the group number or the certain type of structure, we extend a probabilistic mixture model that can handle networks with any type of structure but needs to specify a group number using Bayesian nonparametric theory. We also propose a novel Bayesian nonparametric model, called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to explore structural regularities in networks automatically with a stable, state-of-the-art performance. (paper)
Nonparametric Mixture Models for Supervised Image Parcellation.
Sabuncu, Mert R; Yeo, B T Thomas; Van Leemput, Koen; Fischl, Bruce; Golland, Polina
2009-09-01
We present a nonparametric, probabilistic mixture model for the supervised parcellation of images. The proposed model yields segmentation algorithms conceptually similar to the recently developed label fusion methods, which register a new image with each training image separately. Segmentation is achieved via the fusion of transferred manual labels. We show that in our framework various settings of a model parameter yield algorithms that use image intensity information differently in determining the weight of a training subject during fusion. One particular setting computes a single, global weight per training subject, whereas another setting uses locally varying weights when fusing the training data. The proposed nonparametric parcellation approach capitalizes on recently developed fast and robust pairwise image alignment tools. The use of multiple registrations allows the algorithm to be robust to occasional registration failures. We report experiments on 39 volumetric brain MRI scans with expert manual labels for the white matter, cerebral cortex, ventricles and subcortical structures. The results demonstrate that the proposed nonparametric segmentation framework yields significantly better segmentation than state-of-the-art algorithms.
Application of nonparametric statistics to material strength/reliability assessment
International Nuclear Information System (INIS)
Arai, Taketoshi
1992-01-01
An advanced material technology requires data base on a wide variety of material behavior which need to be established experimentally. It may often happen that experiments are practically limited in terms of reproducibility or a range of test parameters. Statistical methods can be applied to understanding uncertainties in such a quantitative manner as required from the reliability point of view. Statistical assessment involves determinations of a most probable value and the maximum and/or minimum value as one-sided or two-sided confidence limit. A scatter of test data can be approximated by a theoretical distribution only if the goodness of fit satisfies a test criterion. Alternatively, nonparametric statistics (NPS) or distribution-free statistics can be applied. Mathematical procedures by NPS are well established for dealing with most reliability problems. They handle only order statistics of a sample. Mathematical formulas and some applications to engineering assessments are described. They include confidence limits of median, population coverage of sample, required minimum number of a sample, and confidence limits of fracture probability. These applications demonstrate that a nonparametric statistical estimation is useful in logical decision making in the case a large uncertainty exists. (author)
Mixture Density Mercer Kernels: A Method to Learn Kernels
National Aeronautics and Space Administration — This paper presents a method of generating Mercer Kernels from an ensemble of probabilistic mixture models, where each mixture model is generated from a Bayesian...
Introduction to nonparametric statistics for the biological sciences using R
MacFarland, Thomas W
2016-01-01
This book contains a rich set of tools for nonparametric analyses, and the purpose of this supplemental text is to provide guidance to students and professional researchers on how R is used for nonparametric data analysis in the biological sciences: To introduce when nonparametric approaches to data analysis are appropriate To introduce the leading nonparametric tests commonly used in biostatistics and how R is used to generate appropriate statistics for each test To introduce common figures typically associated with nonparametric data analysis and how R is used to generate appropriate figures in support of each data set The book focuses on how R is used to distinguish between data that could be classified as nonparametric as opposed to data that could be classified as parametric, with both approaches to data classification covered extensively. Following an introductory lesson on nonparametric statistics for the biological sciences, the book is organized into eight self-contained lessons on various analyses a...
Supremum Norm Posterior Contraction and Credible Sets for Nonparametric Multivariate Regression
Yoo, W.W.; Ghosal, S
2016-01-01
In the setting of nonparametric multivariate regression with unknown error variance, we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a
A structural nonparametric reappraisal of the CO2 emissions-income relationship
Azomahou, T.T.; Goedhuys - Degelin, Micheline; Nguyen-Van, P.
Relying on a structural nonparametric estimation, we show that co2 emissions clearly increase with income at low income levels. For higher income levels, we observe a decreasing relationship, though not significant. We also find thatco2 emissions monotonically increases with energy use at a
Assessing pupil and school performance by non-parametric and parametric techniques
de Witte, K.; Thanassoulis, E.; Simpson, G.; Battisti, G.; Charlesworth-May, A.
2010-01-01
This paper discusses the use of the non-parametric free disposal hull (FDH) and the parametric multi-level model (MLM) as alternative methods for measuring pupil and school attainment where hierarchical structured data are available. Using robust FDH estimates, we show how to decompose the overall
A non-parametric Bayesian approach to decompounding from high frequency data
Gugushvili, Shota; van der Meulen, F.H.; Spreij, Peter
2016-01-01
Given a sample from a discretely observed compound Poisson process, we consider non-parametric estimation of the density f0 of its jump sizes, as well as of its intensity λ0. We take a Bayesian approach to the problem and specify the prior on f0 as the Dirichlet location mixture of normal densities.
Feng, Jinchao; Lansford, Joshua; Mironenko, Alexander; Pourkargar, Davood Babaei; Vlachos, Dionisios G.; Katsoulakis, Markos A.
2018-03-01
We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.
Directory of Open Access Journals (Sweden)
Jinchao Feng
2018-03-01
Full Text Available We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data. The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.
Comparing estimates of genetic variance across different relationship models.
Legarra, Andres
2016-02-01
Use of relationships between individuals to estimate genetic variances and heritabilities via mixed models is standard practice in human, plant and livestock genetics. Different models or information for relationships may give different estimates of genetic variances. However, comparing these estimates across different relationship models is not straightforward as the implied base populations differ between relationship models. In this work, I present a method to compare estimates of variance components across different relationship models. I suggest referring genetic variances obtained using different relationship models to the same reference population, usually a set of individuals in the population. Expected genetic variance of this population is the estimated variance component from the mixed model times a statistic, Dk, which is the average self-relationship minus the average (self- and across-) relationship. For most typical models of relationships, Dk is close to 1. However, this is not true for very deep pedigrees, for identity-by-state relationships, or for non-parametric kernels, which tend to overestimate the genetic variance and the heritability. Using mice data, I show that heritabilities from identity-by-state and kernel-based relationships are overestimated. Weighting these estimates by Dk scales them to a base comparable to genomic or pedigree relationships, avoiding wrong comparisons, for instance, "missing heritabilities". Copyright © 2015 Elsevier Inc. All rights reserved.
2010-01-01
... 7 Agriculture 8 2010-01-01 2010-01-01 false Kernel weight. 981.9 Section 981.9 Agriculture Regulations of the Department of Agriculture (Continued) AGRICULTURAL MARKETING SERVICE (Marketing Agreements... Regulating Handling Definitions § 981.9 Kernel weight. Kernel weight means the weight of kernels, including...
2010-01-01
... 7 Agriculture 2 2010-01-01 2010-01-01 false Half kernel. 51.2295 Section 51.2295 Agriculture... Standards for Shelled English Walnuts (Juglans Regia) Definitions § 51.2295 Half kernel. Half kernel means the separated half of a kernel with not more than one-eighth broken off. ...
A new discrete dipole kernel for quantitative susceptibility mapping.
Milovic, Carlos; Acosta-Cabronero, Julio; Pinto, José Miguel; Mattern, Hendrik; Andia, Marcelo; Uribe, Sergio; Tejos, Cristian
2018-09-01
Most approaches for quantitative susceptibility mapping (QSM) are based on a forward model approximation that employs a continuous Fourier transform operator to solve a differential equation system. Such formulation, however, is prone to high-frequency aliasing. The aim of this study was to reduce such errors using an alternative dipole kernel formulation based on the discrete Fourier transform and discrete operators. The impact of such an approach on forward model calculation and susceptibility inversion was evaluated in contrast to the continuous formulation both with synthetic phantoms and in vivo MRI data. The discrete kernel demonstrated systematically better fits to analytic field solutions, and showed less over-oscillations and aliasing artifacts while preserving low- and medium-frequency responses relative to those obtained with the continuous kernel. In the context of QSM estimation, the use of the proposed discrete kernel resulted in error reduction and increased sharpness. This proof-of-concept study demonstrated that discretizing the dipole kernel is advantageous for QSM. The impact on small or narrow structures such as the venous vasculature might by particularly relevant to high-resolution QSM applications with ultra-high field MRI - a topic for future investigations. The proposed dipole kernel has a straightforward implementation to existing QSM routines. Copyright © 2018 Elsevier Inc. All rights reserved.
A kernel version of spatial factor analysis
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg
2009-01-01
. Schölkopf et al. introduce kernel PCA. Shawe-Taylor and Cristianini is an excellent reference for kernel methods in general. Bishop and Press et al. describe kernel methods among many other subjects. Nielsen and Canty use kernel PCA to detect change in univariate airborne digital camera images. The kernel...... version of PCA handles nonlinearities by implicitly transforming data into high (even infinite) dimensional feature space via the kernel function and then performing a linear analysis in that space. In this paper we shall apply kernel versions of PCA, maximum autocorrelation factor (MAF) analysis...
kernel oil by lipolytic organisms
African Journals Online (AJOL)
USER
2010-08-02
Aug 2, 2010 ... Rancidity of extracted cashew oil was observed with cashew kernel stored at 70, 80 and 90% .... method of American Oil Chemist Society AOCS (1978) using glacial ..... changes occur and volatile products are formed that are.
Bootstrapping Kernel-Based Semiparametric Estimators
DEFF Research Database (Denmark)
Cattaneo, Matias D.; Jansson, Michael
by accommodating a non-negligible bias. A noteworthy feature of the assumptions under which the result is obtained is that reliance on a commonly employed stochastic equicontinuity condition is avoided. The second main result shows that the bootstrap provides an automatic method of correcting for the bias even...... when it is non-negligible....
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole E.
The density function of the gamma distribution is used as shift kernel in Brownian semistationary processes modelling the timewise behaviour of the velocity in turbulent regimes. This report presents exact and asymptotic properties of the second order structure function under such a model......, and relates these to results of von Karmann and Horwath. But first it is shown that the gamma kernel is interpretable as a Green’s function....
Non-parametric smoothing of experimental data
International Nuclear Information System (INIS)
Kuketayev, A.T.; Pen'kov, F.M.
2007-01-01
Full text: Rapid processing of experimental data samples in nuclear physics often requires differentiation in order to find extrema. Therefore, even at the preliminary stage of data analysis, a range of noise reduction methods are used to smooth experimental data. There are many non-parametric smoothing techniques: interval averages, moving averages, exponential smoothing, etc. Nevertheless, it is more common to use a priori information about the behavior of the experimental curve in order to construct smoothing schemes based on the least squares techniques. The latter methodology's advantage is that the area under the curve can be preserved, which is equivalent to conservation of total speed of counting. The disadvantages of this approach include the lack of a priori information. For example, very often the sums of undifferentiated (by a detector) peaks are replaced with one peak during the processing of data, introducing uncontrolled errors in the determination of the physical quantities. The problem is solvable only by having experienced personnel, whose skills are much greater than the challenge. We propose a set of non-parametric techniques, which allows the use of any additional information on the nature of experimental dependence. The method is based on a construction of a functional, which includes both experimental data and a priori information. Minimum of this functional is reached on a non-parametric smoothed curve. Euler (Lagrange) differential equations are constructed for these curves; then their solutions are obtained analytically or numerically. The proposed approach allows for automated processing of nuclear physics data, eliminating the need for highly skilled laboratory personnel. Pursuant to the proposed approach is the possibility to obtain smoothing curves in a given confidence interval, e.g. according to the χ 2 distribution. This approach is applicable when constructing smooth solutions of ill-posed problems, in particular when solving
Kernel and divergence techniques in high energy physics separations
Bouř, Petr; Kůs, Václav; Franc, Jiří
2017-10-01
Binary decision trees under the Bayesian decision technique are used for supervised classification of high-dimensional data. We present a great potential of adaptive kernel density estimation as the nested separation method of the supervised binary divergence decision tree. Also, we provide a proof of alternative computing approach for kernel estimates utilizing Fourier transform. Further, we apply our method to Monte Carlo data set from the particle accelerator Tevatron at DØ experiment in Fermilab and provide final top-antitop signal separation results. We have achieved up to 82 % AUC while using the restricted feature selection entering the signal separation procedure.
Decompounding random sums: A nonparametric approach
DEFF Research Database (Denmark)
Hansen, Martin Bøgsted; Pitts, Susan M.
Observations from sums of random variables with a random number of summands, known as random, compound or stopped sums arise within many areas of engineering and science. Quite often it is desirable to infer properties of the distribution of the terms in the random sum. In the present paper we...... review a number of applications and consider the nonlinear inverse problem of inferring the cumulative distribution function of the components in the random sum. We review the existing literature on non-parametric approaches to the problem. The models amenable to the analysis are generalized considerably...
A Nonparametric Test for Seasonal Unit Roots
Kunst, Robert M.
2009-01-01
Abstract: We consider a nonparametric test for the null of seasonal unit roots in quarterly time series that builds on the RUR (records unit root) test by Aparicio, Escribano, and Sipols. We find that the test concept is more promising than a formalization of visual aids such as plots by quarter. In order to cope with the sensitivity of the original RUR test to autocorrelation under its null of a unit root, we suggest an augmentation step by autoregression. We present some evidence on the siz...
Indoor Positioning Using Nonparametric Belief Propagation Based on Spanning Trees
Directory of Open Access Journals (Sweden)
Savic Vladimir
2010-01-01
Full Text Available Nonparametric belief propagation (NBP is one of the best-known methods for cooperative localization in sensor networks. It is capable of providing information about location estimation with appropriate uncertainty and to accommodate non-Gaussian distance measurement errors. However, the accuracy of NBP is questionable in loopy networks. Therefore, in this paper, we propose a novel approach, NBP based on spanning trees (NBP-ST created by breadth first search (BFS method. In addition, we propose a reliable indoor model based on obtained measurements in our lab. According to our simulation results, NBP-ST performs better than NBP in terms of accuracy and communication cost in the networks with high connectivity (i.e., highly loopy networks. Furthermore, the computational and communication costs are nearly constant with respect to the transmission radius. However, the drawbacks of proposed method are a little bit higher computational cost and poor performance in low-connected networks.
Development of a Modified Kernel Regression Model for a Robust Signal Reconstruction
Energy Technology Data Exchange (ETDEWEB)
Ahmed, Ibrahim; Heo, Gyunyoung [Kyung Hee University, Yongin (Korea, Republic of)
2016-10-15
The demand for robust and resilient performance has led to the use of online-monitoring techniques to monitor the process parameters and signal validation. On-line monitoring and signal validation techniques are the two important terminologies in process and equipment monitoring. These techniques are automated methods of monitoring instrument performance while the plant is operating. To implementing these techniques, several empirical models are used. One of these models is nonparametric regression model, otherwise known as kernel regression (KR). Unlike parametric models, KR is an algorithmic estimation procedure which assumes no significant parameters, and it needs no training process after its development when new observations are prepared; which is good for a system characteristic of changing due to ageing phenomenon. Although KR is used and performed excellently when applied to steady state or normal operating data, it has limitation in time-varying data that has several repetition of the same signal, especially if those signals are used to infer the other signals. The convectional KR has limitation in correctly estimating the dependent variable when time-varying data with repeated values are used to estimate the dependent variable especially in signal validation and monitoring. Therefore, we presented here in this work a modified KR that can resolve this issue which can also be feasible in time domain. Data are first transformed prior to the Euclidian distance evaluation considering their slopes/changes with respect to time. The performance of the developed model is evaluated and compared with that of conventional KR using both the lab experimental data and the real time data from CNS provided by KAERI. The result shows that the proposed developed model, having demonstrated high performance accuracy than that of conventional KR, is capable of resolving the identified limitation with convectional KR. We also discovered that there is still need to further
Bayesian Nonparametric Clustering for Positive Definite Matrices.
Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2016-05-01
Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well-known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to the Euclidean geometry, rather belongs to a curved Riemannian manifold,existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms.
Seasonal adjustment methods and real time trend-cycle estimation
Bee Dagum, Estela
2016-01-01
This book explores widely used seasonal adjustment methods and recent developments in real time trend-cycle estimation. It discusses in detail the properties and limitations of X12ARIMA, TRAMO-SEATS and STAMP - the main seasonal adjustment methods used by statistical agencies. Several real-world cases illustrate each method and real data examples can be followed throughout the text. The trend-cycle estimation is presented using nonparametric techniques based on moving averages, linear filters and reproducing kernel Hilbert spaces, taking recent advances into account. The book provides a systematical treatment of results that to date have been scattered throughout the literature. Seasonal adjustment and real time trend-cycle prediction play an essential part at all levels of activity in modern economies. They are used by governments to counteract cyclical recessions, by central banks to control inflation, by decision makers for better modeling and planning and by hospitals, manufacturers, builders, transportat...
Exact nonparametric confidence bands for the survivor function.
Matthews, David
2013-10-12
A method to produce exact simultaneous confidence bands for the empirical cumulative distribution function that was first described by Owen, and subsequently corrected by Jager and Wellner, is the starting point for deriving exact nonparametric confidence bands for the survivor function of any positive random variable. We invert a nonparametric likelihood test of uniformity, constructed from the Kaplan-Meier estimator of the survivor function, to obtain simultaneous lower and upper bands for the function of interest with specified global confidence level. The method involves calculating a null distribution and associated critical value for each observed sample configuration. However, Noe recursions and the Van Wijngaarden-Decker-Brent root-finding algorithm provide the necessary tools for efficient computation of these exact bounds. Various aspects of the effect of right censoring on these exact bands are investigated, using as illustrations two observational studies of survival experience among non-Hodgkin's lymphoma patients and a much larger group of subjects with advanced lung cancer enrolled in trials within the North Central Cancer Treatment Group. Monte Carlo simulations confirm the merits of the proposed method of deriving simultaneous interval estimates of the survivor function across the entire range of the observed sample. This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada. It was begun while the author was visiting the Department of Statistics, University of Auckland, and completed during a subsequent sojourn at the Medical Research Council Biostatistics Unit in Cambridge. The support of both institutions, in addition to that of NSERC and the University of Waterloo, is greatly appreciated.
Ensemble-based forecasting at Horns Rev: Ensemble conversion and kernel dressing
DEFF Research Database (Denmark)
Pinson, Pierre; Madsen, Henrik
. The obtained ensemble forecasts of wind power are then converted into predictive distributions with an original adaptive kernel dressing method. The shape of the kernels is driven by a mean-variance model, the parameters of which are recursively estimated in order to maximize the overall skill of obtained...
Schaid, Daniel J
2010-01-01
Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.
Kernel versions of some orthogonal transformations
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg
Kernel versions of orthogonal transformations such as principal components are based on a dual formulation also termed Q-mode analysis in which the data enter into the analysis via inner products in the Gram matrix only. In the kernel version the inner products of the original data are replaced...... by inner products between nonlinear mappings into higher dimensional feature space. Via kernel substitution also known as the kernel trick these inner products between the mappings are in turn replaced by a kernel function and all quantities needed in the analysis are expressed in terms of this kernel...... function. This means that we need not know the nonlinear mappings explicitly. Kernel principal component analysis (PCA) and kernel minimum noise fraction (MNF) analyses handle nonlinearities by implicitly transforming data into high (even infinite) dimensional feature space via the kernel function...
An Approximate Approach to Automatic Kernel Selection.
Ding, Lizhong; Liao, Shizhong
2016-02-02
Kernel selection is a fundamental problem of kernel-based learning algorithms. In this paper, we propose an approximate approach to automatic kernel selection for regression from the perspective of kernel matrix approximation. We first introduce multilevel circulant matrices into automatic kernel selection, and develop two approximate kernel selection algorithms by exploiting the computational virtues of multilevel circulant matrices. The complexity of the proposed algorithms is quasi-linear in the number of data points. Then, we prove an approximation error bound to measure the effect of the approximation in kernel matrices by multilevel circulant matrices on the hypothesis and further show that the approximate hypothesis produced with multilevel circulant matrices converges to the accurate hypothesis produced with kernel matrices. Experimental evaluations on benchmark datasets demonstrate the effectiveness of approximate kernel selection.
Model Selection in Kernel Ridge Regression
DEFF Research Database (Denmark)
Exterkate, Peter
Kernel ridge regression is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts. This paper investigates the influence of the choice of kernel and the setting of tuning parameters on forecast accuracy. We review several popular kernels......, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. We interpret the latter two kernels in terms of their smoothing properties, and we relate the tuning parameters associated to all these kernels to smoothness measures of the prediction function and to the signal-to-noise ratio. Based...... on these interpretations, we provide guidelines for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study confirms the practical usefulness of these rules of thumb. Finally, the flexible and smooth functional forms provided by the Gaussian and Sinc kernels makes them widely...
Directory of Open Access Journals (Sweden)
Boris Kauhl
2016-11-01
Full Text Available Abstract Background The provision of general practitioners (GPs in Germany still relies mainly on the ratio of inhabitants to GPs at relatively large scales and barely accounts for an increased prevalence of chronic diseases among the elderly and socially underprivileged populations. Type 2 Diabetes Mellitus (T2DM is one of the major cost-intensive diseases with high rates of potentially preventable complications. Provision of healthcare and access to preventive measures is necessary to reduce the burden of T2DM. However, current studies on the spatial variation of T2DM in Germany are mostly based on survey data, which do not only underestimate the true prevalence of T2DM, but are also only available on large spatial scales. The aim of this study is therefore to analyse the spatial distribution of T2DM at fine geographic scales and to assess location-specific risk factors based on data of the AOK health insurance. Methods To display the spatial heterogeneity of T2DM, a bivariate, adaptive kernel density estimation (KDE was applied. The spatial scan statistic (SaTScan was used to detect areas of high risk. Global and local spatial regression models were then constructed to analyze socio-demographic risk factors of T2DM. Results T2DM is especially concentrated in rural areas surrounding Berlin. The risk factors for T2DM consist of proportions of 65–79 year olds, 80 + year olds, unemployment rate among the 55–65 year olds, proportion of employees covered by mandatory social security insurance, mean income tax, and proportion of non-married couples. However, the strength of the association between T2DM and the examined socio-demographic variables displayed strong regional variations. Conclusion The prevalence of T2DM varies at the very local level. Analyzing point data on T2DM of northeastern Germany’s largest health insurance provider thus allows very detailed, location-specific knowledge about increased medical needs. Risk factors
International Nuclear Information System (INIS)
Janurová, Kateřina; Briš, Radim
2014-01-01
Medical survival right-censored data of about 850 patients are evaluated to analyze the uncertainty related to the risk of mortality on one hand and compare two basic surgery techniques in the context of risk of mortality on the other hand. Colorectal data come from patients who underwent colectomy in the University Hospital of Ostrava. Two basic surgery operating techniques are used for the colectomy: either traditional (open) or minimally invasive (laparoscopic). Basic question arising at the colectomy operation is, which type of operation to choose to guarantee longer overall survival time. Two non-parametric approaches have been used to quantify probability of mortality with uncertainties. In fact, complement of the probability to one, i.e. survival function with corresponding confidence levels is calculated and evaluated. First approach considers standard nonparametric estimators resulting from both the Kaplan–Meier estimator of survival function in connection with Greenwood's formula and the Nelson–Aalen estimator of cumulative hazard function including confidence interval for survival function as well. The second innovative approach, represented by Nonparametric Predictive Inference (NPI), uses lower and upper probabilities for quantifying uncertainty and provides a model of predictive survival function instead of the population survival function. The traditional log-rank test on one hand and the nonparametric predictive comparison of two groups of lifetime data on the other hand have been compared to evaluate risk of mortality in the context of mentioned surgery techniques. The size of the difference between two groups of lifetime data has been considered and analyzed as well. Both nonparametric approaches led to the same conclusion, that the minimally invasive operating technique guarantees the patient significantly longer survival time in comparison with the traditional operating technique
Integral equations with contrasting kernels
Directory of Open Access Journals (Sweden)
Theodore Burton
2008-01-01
Full Text Available In this paper we study integral equations of the form $x(t=a(t-\\int^t_0 C(t,sx(sds$ with sharply contrasting kernels typified by $C^*(t,s=\\ln (e+(t-s$ and $D^*(t,s=[1+(t-s]^{-1}$. The kernel assigns a weight to $x(s$ and these kernels have exactly opposite effects of weighting. Each type is well represented in the literature. Our first project is to show that for $a\\in L^2[0,\\infty$, then solutions are largely indistinguishable regardless of which kernel is used. This is a surprise and it leads us to study the essential differences. In fact, those differences become large as the magnitude of $a(t$ increases. The form of the kernel alone projects necessary conditions concerning the magnitude of $a(t$ which could result in bounded solutions. Thus, the next project is to determine how close we can come to proving that the necessary conditions are also sufficient. The third project is to show that solutions will be bounded for given conditions on $C$ regardless of whether $a$ is chosen large or small; this is important in real-world problems since we would like to have $a(t$ as the sum of a bounded, but badly behaved function, and a large well behaved function.
MAP estimators and their consistency in Bayesian nonparametric inverse problems
Dashti, M.; Law, K. J H; Stuart, A. M.; Voss, J.
2013-01-01
with examples from an inverse problem for the Navier-Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics. © 2013 IOP Publishing Ltd.
Non-parametric system identification from non-linear stochastic response
DEFF Research Database (Denmark)
Rüdinger, Finn; Krenk, Steen
2001-01-01
An estimation method is proposed for identification of non-linear stiffness and damping of single-degree-of-freedom systems under stationary white noise excitation. Non-parametric estimates of the stiffness and damping along with an estimate of the white noise intensity are obtained by suitable...... of the energy at mean-level crossings, which yields the damping relative to white noise intensity. Finally, an estimate of the noise intensity is extracted by estimating the absolute damping from the autocovariance functions of a set of modified phase plane variables at different energy levels. The method...
Kernel learning algorithms for face recognition
Li, Jun-Bao; Pan, Jeng-Shyang
2013-01-01
Kernel Learning Algorithms for Face Recognition covers the framework of kernel based face recognition. This book discusses the advanced kernel learning algorithms and its application on face recognition. This book also focuses on the theoretical deviation, the system framework and experiments involving kernel based face recognition. Included within are algorithms of kernel based face recognition, and also the feasibility of the kernel based face recognition method. This book provides researchers in pattern recognition and machine learning area with advanced face recognition methods and its new
Model selection for Gaussian kernel PCA denoising
DEFF Research Database (Denmark)
Jørgensen, Kasper Winther; Hansen, Lars Kai
2012-01-01
We propose kernel Parallel Analysis (kPA) for automatic kernel scale and model order selection in Gaussian kernel PCA. Parallel Analysis [1] is based on a permutation test for covariance and has previously been applied for model order selection in linear PCA, we here augment the procedure to also...... tune the Gaussian kernel scale of radial basis function based kernel PCA.We evaluate kPA for denoising of simulated data and the US Postal data set of handwritten digits. We find that kPA outperforms other heuristics to choose the model order and kernel scale in terms of signal-to-noise ratio (SNR...
Methodology in robust and nonparametric statistics
Jurecková, Jana; Picek, Jan
2012-01-01
Introduction and SynopsisIntroductionSynopsisPreliminariesIntroductionInference in Linear ModelsRobustness ConceptsRobust and Minimax Estimation of LocationClippings from Probability and Asymptotic TheoryProblemsRobust Estimation of Location and RegressionIntroductionM-EstimatorsL-EstimatorsR-EstimatorsMinimum Distance and Pitman EstimatorsDifferentiable Statistical FunctionsProblemsAsymptotic Representations for L-Estimators
Promotion time cure rate model with nonparametric form of covariate effects.
Chen, Tianlei; Du, Pang
2018-05-10
Survival data with a cured portion are commonly seen in clinical trials. Motivated from a biological interpretation of cancer metastasis, promotion time cure model is a popular alternative to the mixture cure rate model for analyzing such data. The existing promotion cure models all assume a restrictive parametric form of covariate effects, which can be incorrectly specified especially at the exploratory stage. In this paper, we propose a nonparametric approach to modeling the covariate effects under the framework of promotion time cure model. The covariate effect function is estimated by smoothing splines via the optimization of a penalized profile likelihood. Point-wise interval estimates are also derived from the Bayesian interpretation of the penalized profile likelihood. Asymptotic convergence rates are established for the proposed estimates. Simulations show excellent performance of the proposed nonparametric method, which is then applied to a melanoma study. Copyright © 2018 John Wiley & Sons, Ltd.
On Parametric (and Non-Parametric Variation
Directory of Open Access Journals (Sweden)
Neil Smith
2009-11-01
Full Text Available This article raises the issue of the correct characterization of ‘Parametric Variation’ in syntax and phonology. After specifying their theoretical commitments, the authors outline the relevant parts of the Principles–and–Parameters framework, and draw a three-way distinction among Universal Principles, Parameters, and Accidents. The core of the contribution then consists of an attempt to provide identity criteria for parametric, as opposed to non-parametric, variation. Parametric choices must be antecedently known, and it is suggested that they must also satisfy seven individually necessary and jointly sufficient criteria. These are that they be cognitively represented, systematic, dependent on the input, deterministic, discrete, mutually exclusive, and irreversible.
Nonparametric predictive pairwise comparison with competing risks
International Nuclear Information System (INIS)
Coolen-Maturi, Tahani
2014-01-01
In reliability, failure data often correspond to competing risks, where several failure modes can cause a unit to fail. This paper presents nonparametric predictive inference (NPI) for pairwise comparison with competing risks data, assuming that the failure modes are independent. These failure modes could be the same or different among the two groups, and these can be both observed and unobserved failure modes. NPI is a statistical approach based on few assumptions, with inferences strongly based on data and with uncertainty quantified via lower and upper probabilities. The focus is on the lower and upper probabilities for the event that the lifetime of a future unit from one group, say Y, is greater than the lifetime of a future unit from the second group, say X. The paper also shows how the two groups can be compared based on particular failure mode(s), and the comparison of the two groups when some of the competing risks are combined is discussed
Nonparametric inference of network structure and dynamics
Peixoto, Tiago P.
The network structure of complex systems determine their function and serve as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion, that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among
Nonparametric Bayesian inference for mean residual life functions in survival analysis.
Poynor, Valerie; Kottas, Athanasios
2018-01-19
Modeling and inference for survival analysis problems typically revolves around different functions related to the survival distribution. Here, we focus on the mean residual life (MRL) function, which provides the expected remaining lifetime given that a subject has survived (i.e. is event-free) up to a particular time. This function is of direct interest in reliability, medical, and actuarial fields. In addition to its practical interpretation, the MRL function characterizes the survival distribution. We develop general Bayesian nonparametric inference for MRL functions built from a Dirichlet process mixture model for the associated survival distribution. The resulting model for the MRL function admits a representation as a mixture of the kernel MRL functions with time-dependent mixture weights. This model structure allows for a wide range of shapes for the MRL function. Particular emphasis is placed on the selection of the mixture kernel, taken to be a gamma distribution, to obtain desirable properties for the MRL function arising from the mixture model. The inference method is illustrated with a data set of two experimental groups and a data set involving right censoring. The supplementary material available at Biostatistics online provides further results on empirical performance of the model, using simulated data examples. © The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Bayesian Frequency Domain Identification of LTI Systems with OBFs kernels
Darwish, M.A.H.; Lataire, J.P.G.; Tóth, R.
2017-01-01
Regularised Frequency Response Function (FRF) estimation based on Gaussian process regression formulated directly in the frequency-domain has been introduced recently The underlying approach largely depends on the utilised kernel function, which encodes the relevant prior knowledge on the system
DEFF Research Database (Denmark)
Walder, Christian; Henao, Ricardo; Mørup, Morten
We present three generalisations of Kernel Principal Components Analysis (KPCA) which incorporate knowledge of the class labels of a subset of the data points. The first, MV-KPCA, penalises within class variances similar to Fisher discriminant analysis. The second, LSKPCA is a hybrid of least...... squares regression and kernel PCA. The final LR-KPCA is an iteratively reweighted version of the previous which achieves a sigmoid loss function on the labeled points. We provide a theoretical risk bound as well as illustrative experiments on real and toy data sets....
Mixed kernel function support vector regression for global sensitivity analysis
Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng
2017-11-01
Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Amongst the wide sensitivity analyses in literature, the Sobol indices have attracted much attention since they can provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. By the proposed derivation, the estimation of the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF is constituted by the orthogonal polynomials kernel function and Gaussian radial basis kernel function, thus the MKF possesses both the global characteristic advantage of the polynomials kernel function and the local characteristic advantage of the Gaussian radial basis kernel function. The proposed approach is suitable for high-dimensional and non-linear problems. Performance of the proposed approach is validated by various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.
Simple nonparametric checks for model data fit in CAT
Meijer, R.R.
2005-01-01
In this paper, the usefulness of several nonparametric checks is discussed in a computerized adaptive testing (CAT) context. Although there is no tradition of nonparametric scalability in CAT, it can be argued that scalability checks can be useful to investigate, for example, the quality of item
Nonparametric analysis of blocked ordered categories data: some examples revisited
Directory of Open Access Journals (Sweden)
O. Thas
2006-08-01
Full Text Available Nonparametric analysis for general block designs can be given by using the Cochran-Mantel-Haenszel (CMH statistics. We demonstrate this with four examples and note that several well-known nonparametric statistics are special cases of CMH statistics.
Model selection in kernel ridge regression
DEFF Research Database (Denmark)
Exterkate, Peter
2013-01-01
Kernel ridge regression is a technique to perform ridge regression with a potentially infinite number of nonlinear transformations of the independent variables as regressors. This method is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts....... The influence of the choice of kernel and the setting of tuning parameters on forecast accuracy is investigated. Several popular kernels are reviewed, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. The latter two kernels are interpreted in terms of their smoothing properties......, and the tuning parameters associated to all these kernels are related to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, guidelines are provided for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study...
Nonparametric statistics with applications to science and engineering
Kvam, Paul H
2007-01-01
A thorough and definitive book that fully addresses traditional and modern-day topics of nonparametric statistics This book presents a practical approach to nonparametric statistical analysis and provides comprehensive coverage of both established and newly developed methods. With the use of MATLAB, the authors present information on theorems and rank tests in an applied fashion, with an emphasis on modern methods in regression and curve fitting, bootstrap confidence intervals, splines, wavelets, empirical likelihood, and goodness-of-fit testing. Nonparametric Statistics with Applications to Science and Engineering begins with succinct coverage of basic results for order statistics, methods of categorical data analysis, nonparametric regression, and curve fitting methods. The authors then focus on nonparametric procedures that are becoming more relevant to engineering researchers and practitioners. The important fundamental materials needed to effectively learn and apply the discussed methods are also provide...
Multiple Kernel Learning with Data Augmentation
2016-11-22
JMLR: Workshop and Conference Proceedings 63:49–64, 2016 ACML 2016 Multiple Kernel Learning with Data Augmentation Khanh Nguyen nkhanh@deakin.edu.au...University, Australia Editors: Robert J. Durrant and Kee-Eung Kim Abstract The motivations of multiple kernel learning (MKL) approach are to increase... kernel expres- siveness capacity and to avoid the expensive grid search over a wide spectrum of kernels . A large amount of work has been proposed to
A kernel version of multivariate alteration detection
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg; Vestergaard, Jacob Schack
2013-01-01
Based on the established methods kernel canonical correlation analysis and multivariate alteration detection we introduce a kernel version of multivariate alteration detection. A case study with SPOT HRV data shows that the kMAD variates focus on extreme change observations.......Based on the established methods kernel canonical correlation analysis and multivariate alteration detection we introduce a kernel version of multivariate alteration detection. A case study with SPOT HRV data shows that the kMAD variates focus on extreme change observations....
Sun, L.G.; De Visser, C.C.; Chu, Q.P.; Mulder, J.A.
2012-01-01
The optimality of the kernel number and kernel centers plays a significant role in determining the approximation power of nearly all kernel methods. However, the process of choosing optimal kernels is always formulated as a global optimization task, which is hard to accomplish. Recently, an
Complex use of cottonseed kernels
Energy Technology Data Exchange (ETDEWEB)
Glushenkova, A I
1977-01-01
A review with 41 references is made on the manufacture of oil, protein, and other products from cottonseed, the effects of gossypol on protein yield and quality and technology of gossypol removal. A process eliminating thermal treatment of the kernels and permitting the production of oil, proteins, phytin, gossypol, sugar, sterols, phosphatides, tocopherols, and residual shells and baggase is described.
GRIM : Leveraging GPUs for Kernel integrity monitoring
Koromilas, Lazaros; Vasiliadis, Giorgos; Athanasopoulos, Ilias; Ioannidis, Sotiris
2016-01-01
Kernel rootkits can exploit an operating system and enable future accessibility and control, despite all recent advances in software protection. A promising defense mechanism against rootkits is Kernel Integrity Monitor (KIM) systems, which inspect the kernel text and data to discover any malicious
Paramecium: An Extensible Object-Based Kernel
van Doorn, L.; Homburg, P.; Tanenbaum, A.S.
1995-01-01
In this paper we describe the design of an extensible kernel, called Paramecium. This kernel uses an object-based software architecture which together with instance naming, late binding and explicit overrides enables easy reconfiguration. Determining which components reside in the kernel protection
Local Observed-Score Kernel Equating
Wiberg, Marie; van der Linden, Wim J.; von Davier, Alina A.
2014-01-01
Three local observed-score kernel equating methods that integrate methods from the local equating and kernel equating frameworks are proposed. The new methods were compared with their earlier counterparts with respect to such measures as bias--as defined by Lord's criterion of equity--and percent relative error. The local kernel item response…
Veto-Consensus Multiple Kernel Learning
Zhou, Y.; Hu, N.; Spanos, C.J.
2016-01-01
We propose Veto-Consensus Multiple Kernel Learning (VCMKL), a novel way of combining multiple kernels such that one class of samples is described by the logical intersection (consensus) of base kernelized decision rules, whereas the other classes by the union (veto) of their complements. The
Optimal kernel shape and bandwidth for atomistic support of continuum stress
International Nuclear Information System (INIS)
Ulz, Manfred H; Moran, Sean J
2013-01-01
The treatment of atomistic scale interactions via molecular dynamics simulations has recently found favour for multiscale modelling within engineering. The estimation of stress at a continuum point on the atomistic scale requires a pre-defined kernel function. This kernel function derives the stress at a continuum point by averaging the contribution from atoms within a region surrounding the continuum point. This averaging volume, and therefore the associated stress at a continuum point, is highly dependent on the bandwidth and shape of the kernel. In this paper we propose an effective and entirely data-driven strategy for simultaneously computing the optimal shape and bandwidth for the kernel. We thoroughly evaluate our proposed approach on copper using three classical elasticity problems. Our evaluation yields three key findings: firstly, our technique can provide a physically meaningful estimation of kernel bandwidth; secondly, we show that a uniform kernel is preferred, thereby justifying the default selection of this kernel shape in future work; and thirdly, we can reliably estimate both of these attributes in a data-driven manner, obtaining values that lead to an accurate estimation of the stress at a continuum point. (paper)
Riihimäki, Jaakko; Sund, Reijo; Vehtari, Aki
2010-06-01
Effective utilisation of limited resources is a challenge for health care providers. Accurate and relevant information extracted from the length of stay distributions is useful for management purposes. Patient care episodes can be reconstructed from the comprehensive health registers, and in this paper we develop a Bayesian approach to analyse the length of care episode after a fractured hip. We model the large scale data with a flexible nonparametric multilayer perceptron network and with a parametric Weibull mixture model. To assess the performances of the models, we estimate expected utilities using predictive density as a utility measure. Since the model parameters cannot be directly compared, we focus on observables, and estimate the relevances of patient explanatory variables in predicting the length of stay. To demonstrate how the use of the nonparametric flexible model is advantageous for this complex health care data, we also study joint effects of variables in predictions, and visualise nonlinearities and interactions found in the data.
Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks
Gray-Davies, Tristan; Holmes, Chris C.; Caron, François
2018-01-01
We present a novel Bayesian nonparametric regression model for covariates X and continuous response variable Y ∈ ℝ. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F (y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows for the use of standard, existing, software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied. As an illustration, we show an application of our approach to a US Census dataset, with over 1,300,000 data points and more than 100 covariates. PMID:29623150
A nonparametric empirical Bayes framework for large-scale multiple testing.
Martin, Ryan; Tokdar, Surya T
2012-07-01
We propose a flexible and identifiable version of the 2-groups model, motivated by hierarchical Bayes considerations, that features an empirical null and a semiparametric mixture model for the nonnull cases. We use a computationally efficient predictive recursion (PR) marginal likelihood procedure to estimate the model parameters, even the nonparametric mixing distribution. This leads to a nonparametric empirical Bayes testing procedure, which we call PRtest, based on thresholding the estimated local false discovery rates. Simulations and real data examples demonstrate that, compared to existing approaches, PRtest's careful handling of the nonnull density can give a much better fit in the tails of the mixture distribution which, in turn, can lead to more realistic conclusions.
On the expected value and variance for an estimator of the spatio-temporal product density function
DEFF Research Database (Denmark)
Rodríguez-Corté, Francisco J.; Ghorbani, Mohammad; Mateu, Jorge
Second-order characteristics are used to analyse the spatio-temporal structure of the underlying point process, and thus these methods provide a natural starting point for the analysis of spatio-temporal point process data. We restrict our attention to the spatio-temporal product density function......, and develop a non-parametric edge-corrected kernel estimate of the product density under the second-order intensity-reweighted stationary hypothesis. The expectation and variance of the estimator are obtained, and closed form expressions derived under the Poisson case. A detailed simulation study is presented...... to compare our close expression for the variance with estimated ones for Poisson cases. The simulation experiments show that the theoretical form for the variance gives acceptable values, which can be used in practice. Finally, we apply the resulting estimator to data on the spatio-temporal distribution...
Directory of Open Access Journals (Sweden)
L Max Tarjan
Full Text Available Parametric and nonparametric kernel methods dominate studies of animal home ranges and space use. Most existing methods are unable to incorporate information about the underlying physical environment, leading to poor performance in excluding areas that are not used. Using radio-telemetry data from sea otters, we developed and evaluated a new algorithm for estimating home ranges (hereafter Permissible Home Range Estimation, or "PHRE" that reflects habitat suitability. We began by transforming sighting locations into relevant landscape features (for sea otters, coastal position and distance from shore. Then, we generated a bivariate kernel probability density function in landscape space and back-transformed this to geographic space in order to define a permissible home range. Compared to two commonly used home range estimation methods, kernel densities and local convex hulls, PHRE better excluded unused areas and required a smaller sample size. Our PHRE method is applicable to species whose ranges are restricted by complex physical boundaries or environmental gradients and will improve understanding of habitat-use requirements and, ultimately, aid in conservation efforts.
Directory of Open Access Journals (Sweden)
Senyue Zhang
2016-01-01
Full Text Available According to the characteristics that the kernel function of extreme learning machine (ELM and its performance have a strong correlation, a novel extreme learning machine based on a generalized triangle Hermitian kernel function was proposed in this paper. First, the generalized triangle Hermitian kernel function was constructed by using the product of triangular kernel and generalized Hermite Dirichlet kernel, and the proposed kernel function was proved as a valid kernel function of extreme learning machine. Then, the learning methodology of the extreme learning machine based on the proposed kernel function was presented. The biggest advantage of the proposed kernel is its kernel parameter values only chosen in the natural numbers, which thus can greatly shorten the computational time of parameter optimization and retain more of its sample data structure information. Experiments were performed on a number of binary classification, multiclassification, and regression datasets from the UCI benchmark repository. The experiment results demonstrated that the robustness and generalization performance of the proposed method are outperformed compared to other extreme learning machines with different kernels. Furthermore, the learning speed of proposed method is faster than support vector machine (SVM methods.
Nonparametric predictive inference for combining diagnostic tests with parametric copula
Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.
2017-09-01
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we interest in developing strategies for combining test results in order to increase the diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results with considering dependence structure using parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. NPI uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. While copula is a well-known statistical concept for modelling dependence of random variables. A copula is a joint distribution function whose marginals are all uniformly distributed and it can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method which is maximum likelihood estimator (MLE). We investigate the performance of this proposed method via data sets from the literature and discuss results to show how our method performs for different family of copulas. Finally, we briefly outline related challenges and opportunities for future research.
Delimiting areas of endemism through kernel interpolation.
Oliveira, Ubirajara; Brescovit, Antonio D; Santos, Adalberto J
2015-01-01
We propose a new approach for identification of areas of endemism, the Geographical Interpolation of Endemism (GIE), based on kernel spatial interpolation. This method differs from others in being independent of grid cells. This new approach is based on estimating the overlap between the distribution of species through a kernel interpolation of centroids of species distribution and areas of influence defined from the distance between the centroid and the farthest point of occurrence of each species. We used this method to delimit areas of endemism of spiders from Brazil. To assess the effectiveness of GIE, we analyzed the same data using Parsimony Analysis of Endemism and NDM and compared the areas identified through each method. The analyses using GIE identified 101 areas of endemism of spiders in Brazil GIE demonstrated to be effective in identifying areas of endemism in multiple scales, with fuzzy edges and supported by more synendemic species than in the other methods. The areas of endemism identified with GIE were generally congruent with those identified for other taxonomic groups, suggesting that common processes can be responsible for the origin and maintenance of these biogeographic units.
Delimiting areas of endemism through kernel interpolation.
Directory of Open Access Journals (Sweden)
Ubirajara Oliveira
Full Text Available We propose a new approach for identification of areas of endemism, the Geographical Interpolation of Endemism (GIE, based on kernel spatial interpolation. This method differs from others in being independent of grid cells. This new approach is based on estimating the overlap between the distribution of species through a kernel interpolation of centroids of species distribution and areas of influence defined from the distance between the centroid and the farthest point of occurrence of each species. We used this method to delimit areas of endemism of spiders from Brazil. To assess the effectiveness of GIE, we analyzed the same data using Parsimony Analysis of Endemism and NDM and compared the areas identified through each method. The analyses using GIE identified 101 areas of endemism of spiders in Brazil GIE demonstrated to be effective in identifying areas of endemism in multiple scales, with fuzzy edges and supported by more synendemic species than in the other methods. The areas of endemism identified with GIE were generally congruent with those identified for other taxonomic groups, suggesting that common processes can be responsible for the origin and maintenance of these biogeographic units.
Nonparametric methods in actigraphy: An update
Directory of Open Access Journals (Sweden)
Bruno S.B. Gonçalves
2014-09-01
Full Text Available Circadian rhythmicity in humans has been well studied using actigraphy, a method of measuring gross motor movement. As actigraphic technology continues to evolve, it is important for data analysis to keep pace with new variables and features. Our objective is to study the behavior of two variables, interdaily stability and intradaily variability, to describe rest activity rhythm. Simulated data and actigraphy data of humans, rats, and marmosets were used in this study. We modified the method of calculation for IV and IS by modifying the time intervals of analysis. For each variable, we calculated the average value (IVm and ISm results for each time interval. Simulated data showed that (1 synchronization analysis depends on sample size, and (2 fragmentation is independent of the amplitude of the generated noise. We were able to obtain a significant difference in the fragmentation patterns of stroke patients using an IVm variable, while the variable IV60 was not identified. Rhythmic synchronization of activity and rest was significantly higher in young than adults with Parkinson׳s when using the ISM variable; however, this difference was not seen using IS60. We propose an updated format to calculate rhythmic fragmentation, including two additional optional variables. These alternative methods of nonparametric analysis aim to more precisely detect sleep–wake cycle fragmentation and synchronization.
Bayesian nonparametric adaptive control using Gaussian processes.
Chowdhary, Girish; Kingravi, Hassan A; How, Jonathan P; Vela, Patricio A
2015-03-01
Most current model reference adaptive control (MRAC) methods rely on parametric adaptive elements, in which the number of parameters of the adaptive element are fixed a priori, often through expert judgment. An example of such an adaptive element is radial basis function networks (RBFNs), with RBF centers preallocated based on the expected operating domain. If the system operates outside of the expected operating domain, this adaptive element can become noneffective in capturing and canceling the uncertainty, thus rendering the adaptive controller only semiglobal in nature. This paper investigates a Gaussian process-based Bayesian MRAC architecture (GP-MRAC), which leverages the power and flexibility of GP Bayesian nonparametric models of uncertainty. The GP-MRAC does not require the centers to be preallocated, can inherently handle measurement noise, and enables MRAC to handle a broader set of uncertainties, including those that are defined as distributions over functions. We use stochastic stability arguments to show that GP-MRAC guarantees good closed-loop performance with no prior domain knowledge of the uncertainty. Online implementable GP inference methods are compared in numerical simulations against RBFN-MRAC with preallocated centers and are shown to provide better tracking and improved long-term learning.
Wigner functions defined with Laplace transform kernels.
Oh, Se Baek; Petruccelli, Jonathan C; Tian, Lei; Barbastathis, George
2011-10-24
We propose a new Wigner-type phase-space function using Laplace transform kernels--Laplace kernel Wigner function. Whereas momentum variables are real in the traditional Wigner function, the Laplace kernel Wigner function may have complex momentum variables. Due to the property of the Laplace transform, a broader range of signals can be represented in complex phase-space. We show that the Laplace kernel Wigner function exhibits similar properties in the marginals as the traditional Wigner function. As an example, we use the Laplace kernel Wigner function to analyze evanescent waves supported by surface plasmon polariton. © 2011 Optical Society of America
Directory of Open Access Journals (Sweden)
Ibsen Chivatá Cárdenas
2008-05-01
Full Text Available This article presents a rainfall model constructed by applying non-parametric modelling and imprecise probabilities; these tools were used because there was not enough homogeneous information in the study area. The area’s hydro-logical information regarding rainfall was scarce and existing hydrological time series were not uniform. A distributed extended rainfall model was constructed from so-called probability boxes (p-boxes, multinomial probability distribu-tion and confidence intervals (a friendly algorithm was constructed for non-parametric modelling by combining the last two tools. This model confirmed the high level of uncertainty involved in local rainfall modelling. Uncertainty en-compassed the whole range (domain of probability values thereby showing the severe limitations on information, leading to the conclusion that a detailed estimation of probability would lead to significant error. Nevertheless, rele-vant information was extracted; it was estimated that maximum daily rainfall threshold (70 mm would be surpassed at least once every three years and the magnitude of uncertainty affecting hydrological parameter estimation. This paper’s conclusions may be of interest to non-parametric modellers and decisions-makers as such modelling and imprecise probability represents an alternative for hydrological variable assessment and maybe an obligatory proce-dure in the future. Its potential lies in treating scarce information and represents a robust modelling strategy for non-seasonal stochastic modelling conditions
Surface and top-of-atmosphere radiative feedback kernels for CESM-CAM5
Pendergrass, Angeline G.; Conley, Andrew; Vitt, Francis M.
2018-02-01
Radiative kernels at the top of the atmosphere are useful for decomposing changes in atmospheric radiative fluxes due to feedbacks from atmosphere and surface temperature, water vapor, and surface albedo. Here we describe and validate radiative kernels calculated with the large-ensemble version of CAM5, CESM1.1.2, at the top of the atmosphere and the surface. Estimates of the radiative forcing from greenhouse gases and aerosols in RCP8.5 in the CESM large-ensemble simulations are also diagnosed. As an application, feedbacks are calculated for the CESM large ensemble. The kernels are freely available at https://doi.org/10.5065/D6F47MT6" target="_blank">https://doi.org/10.5065/D6F47MT6, and accompanying software can be downloaded from https://github.com/apendergrass/cam5-kernels" target="_blank">https://github.com/apendergrass/cam5-kernels.
Credit scoring analysis using kernel discriminant
Widiharih, T.; Mukid, M. A.; Mustafid
2018-05-01
Credit scoring model is an important tool for reducing the risk of wrong decisions when granting credit facilities to applicants. This paper investigate the performance of kernel discriminant model in assessing customer credit risk. Kernel discriminant analysis is a non- parametric method which means that it does not require any assumptions about the probability distribution of the input. The main ingredient is a kernel that allows an efficient computation of Fisher discriminant. We use several kernel such as normal, epanechnikov, biweight, and triweight. The models accuracy was compared each other using data from a financial institution in Indonesia. The results show that kernel discriminant can be an alternative method that can be used to determine who is eligible for a credit loan. In the data we use, it shows that a normal kernel is relevant to be selected for credit scoring using kernel discriminant model. Sensitivity and specificity reach to 0.5556 and 0.5488 respectively.
Testing Infrastructure for Operating System Kernel Development
DEFF Research Database (Denmark)
Walter, Maxwell; Karlsson, Sven
2014-01-01
Testing is an important part of system development, and to test effectively we require knowledge of the internal state of the system under test. Testing an operating system kernel is a challenge as it is the operating system that typically provides access to this internal state information. Multi......-core kernels pose an even greater challenge due to concurrency and their shared kernel state. In this paper, we present a testing framework that addresses these challenges by running the operating system in a virtual machine, and using virtual machine introspection to both communicate with the kernel...... and obtain information about the system. We have also developed an in-kernel testing API that we can use to develop a suite of unit tests in the kernel. We are using our framework for for the development of our own multi-core research kernel....
Kernel parameter dependence in spatial factor analysis
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg
2010-01-01
kernel PCA. Shawe-Taylor and Cristianini [4] is an excellent reference for kernel methods in general. Bishop [5] and Press et al. [6] describe kernel methods among many other subjects. The kernel version of PCA handles nonlinearities by implicitly transforming data into high (even infinite) dimensional...... feature space via the kernel function and then performing a linear analysis in that space. In this paper we shall apply a kernel version of maximum autocorrelation factor (MAF) [7, 8] analysis to irregularly sampled stream sediment geochemistry data from South Greenland and illustrate the dependence...... of the kernel width. The 2,097 samples each covering on average 5 km2 are analyzed chemically for the content of 41 elements....
Directory of Open Access Journals (Sweden)
Ellen Kenchington
Full Text Available The United Nations General Assembly Resolution 61/105, concerning sustainable fisheries in the marine ecosystem, calls for the protection of vulnerable marine ecosystems (VME from destructive fishing practices. Subsequently, the Food and Agriculture Organization (FAO produced guidelines for identification of VME indicator species/taxa to assist in the implementation of the resolution, but recommended the development of case-specific operational definitions for their application. We applied kernel density estimation (KDE to research vessel trawl survey data from inside the fishing footprint of the Northwest Atlantic Fisheries Organization (NAFO Regulatory Area in the high seas of the northwest Atlantic to create biomass density surfaces for four VME indicator taxa: large-sized sponges, sea pens, small and large gorgonian corals. These VME indicator taxa were identified previously by NAFO using the fragility, life history characteristics and structural complexity criteria presented by FAO, along with an evaluation of their recovery trajectories. KDE, a non-parametric neighbour-based smoothing function, has been used previously in ecology to identify hotspots, that is, areas of relatively high biomass/abundance. We present a novel approach of examining relative changes in area under polygons created from encircling successive biomass categories on the KDE surface to identify "significant concentrations" of biomass, which we equate to VMEs. This allows identification of the VMEs from the broader distribution of the species in the study area. We provide independent assessments of the VMEs so identified using underwater images, benthic sampling with other gear types (dredges, cores, and/or published species distribution models of probability of occurrence, as available. For each VME indicator taxon we provide a brief review of their ecological function which will be important in future assessments of significant adverse impact on these habitats here
Chen, Lili; Zhang, Xi; Wang, Hui
2015-05-01
Obstructive sleep apnea (OSA) is a common sleep disorder that often remains undiagnosed, leading to an increased risk of developing cardiovascular diseases. Polysomnogram (PSG) is currently used as a golden standard for screening OSA. However, because it is time consuming, expensive and causes discomfort, alternative techniques based on a reduced set of physiological signals are proposed to solve this problem. This study proposes a convenient non-parametric kernel density-based approach for detection of OSA using single-lead electrocardiogram (ECG) recordings. Selected physiologically interpretable features are extracted from segmented RR intervals, which are obtained from ECG signals. These features are fed into the kernel density classifier to detect apnea event and bandwidths for density of each class (normal or apnea) are automatically chosen through an iterative bandwidth selection algorithm. To validate the proposed approach, RR intervals are extracted from ECG signals of 35 subjects obtained from a sleep apnea database ( http://physionet.org/cgi-bin/atm/ATM ). The results indicate that the kernel density classifier, with two features for apnea event detection, achieves a mean accuracy of 82.07 %, with mean sensitivity of 83.23 % and mean specificity of 80.24 %. Compared with other existing methods, the proposed kernel density approach achieves a comparably good performance but by using fewer features without significantly losing discriminant power, which indicates that it could be widely used for home-based screening or diagnosis of OSA.
Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration
Directory of Open Access Journals (Sweden)
Bo Liu
2012-02-01
Full Text Available In this paper a new framework, called Compressive Kernelized Reinforcement Learning (CKRL, for computing near-optimal policies in sequential decision making with uncertainty is proposed via incorporating the non-adaptive data-independent Random Projections and nonparametric Kernelized Least-squares Policy Iteration (KLSPI. Random Projections are a fast, non-adaptive dimensionality reduction framework in which high-dimensionality data is projected onto a random lower-dimension subspace via spherically random rotation and coordination sampling. KLSPI introduce kernel trick into the LSPI framework for Reinforcement Learning, often achieving faster convergence and providing automatic feature selection via various kernel sparsification approaches. In this approach, policies are computed in a low-dimensional subspace generated by projecting the high-dimensional features onto a set of random basis. We first show how Random Projections constitute an efficient sparsification technique and how our method often converges faster than regular LSPI, while at lower computational costs. Theoretical foundation underlying this approach is a fast approximation of Singular Value Decomposition (SVD. Finally, simulation results are exhibited on benchmark MDP domains, which confirm gains both in computation time and in performance in large feature spaces.
International Nuclear Information System (INIS)
Veglia, A.
1981-08-01
In cases where sets of data are obviously not normally distributed, the application of a nonparametric method for the estimation of a confidence interval for the mean seems to be more suitable than some other methods because such a method requires few assumptions about the population of data. A two-step statistical method is proposed which can be applied to any set of analytical results: elimination of outliers by a nonparametric method based on Tchebycheff's inequality, and determination of a confidence interval for the mean by a non-parametric method based on binominal distribution. The method is appropriate only for samples of size n>=10
Schiemann, R.; Erdin, R.; Willi, M.; Frei, C.; Berenguer, M.; Sempere-Torres, D.
2011-05-01
Modelling spatial covariance is an essential part of all geostatistical methods. Traditionally, parametric semivariogram models are fit from available data. More recently, it has been suggested to use nonparametric correlograms obtained from spatially complete data fields. Here, both estimation techniques are compared. Nonparametric correlograms are shown to have a substantial negative bias. Nonetheless, when combined with the sample variance of the spatial field under consideration, they yield an estimate of the semivariogram that is unbiased for small lag distances. This justifies the use of this estimation technique in geostatistical applications. Various formulations of geostatistical combination (Kriging) methods are used here for the construction of hourly precipitation grids for Switzerland based on data from a sparse realtime network of raingauges and from a spatially complete radar composite. Two variants of Ordinary Kriging (OK) are used to interpolate the sparse gauge observations. In both OK variants, the radar data are only used to determine the semivariogram model. One variant relies on a traditional parametric semivariogram estimate, whereas the other variant uses the nonparametric correlogram. The variants are tested for three cases and the impact of the semivariogram model on the Kriging prediction is illustrated. For the three test cases, the method using nonparametric correlograms performs equally well or better than the traditional method, and at the same time offers great practical advantages. Furthermore, two variants of Kriging with external drift (KED) are tested, both of which use the radar data to estimate nonparametric correlograms, and as the external drift variable. The first KED variant has been used previously for geostatistical radar-raingauge merging in Catalonia (Spain). The second variant is newly proposed here and is an extension of the first. Both variants are evaluated for the three test cases as well as an extended evaluation
Validation of Born Traveltime Kernels
Baig, A. M.; Dahlen, F. A.; Hung, S.
2001-12-01
Most inversions for Earth structure using seismic traveltimes rely on linear ray theory to translate observed traveltime anomalies into seismic velocity anomalies distributed throughout the mantle. However, ray theory is not an appropriate tool to use when velocity anomalies have scale lengths less than the width of the Fresnel zone. In the presence of these structures, we need to turn to a scattering theory in order to adequately describe all of the features observed in the waveform. By coupling the Born approximation to ray theory, the first order dependence of heterogeneity on the cross-correlated traveltimes (described by the Fréchet derivative or, more colourfully, the banana-doughnut kernel) may be determined. To determine for what range of parameters these banana-doughnut kernels outperform linear ray theory, we generate several random media specified by their statistical properties, namely the RMS slowness perturbation and the scale length of the heterogeneity. Acoustic waves are numerically generated from a point source using a 3-D pseudo-spectral wave propagation code. These waves are then recorded at a variety of propagation distances from the source introducing a third parameter to the problem: the number of wavelengths traversed by the wave. When all of the heterogeneity has scale lengths larger than the width of the Fresnel zone, ray theory does as good a job at predicting the cross-correlated traveltime as the banana-doughnut kernels do. Below this limit, wavefront healing becomes a significant effect and ray theory ceases to be effective even though the kernels remain relatively accurate provided the heterogeneity is weak. The study of wave propagation in random media is of a more general interest and we will also show our measurements of the velocity shift and the variance of traveltime compare to various theoretical predictions in a given regime.
RKRD: Runtime Kernel Rootkit Detection
Grover, Satyajit; Khosravi, Hormuzd; Kolar, Divya; Moffat, Samuel; Kounavis, Michael E.
In this paper we address the problem of protecting computer systems against stealth malware. The problem is important because the number of known types of stealth malware increases exponentially. Existing approaches have some advantages for ensuring system integrity but sophisticated techniques utilized by stealthy malware can thwart them. We propose Runtime Kernel Rootkit Detection (RKRD), a hardware-based, event-driven, secure and inclusionary approach to kernel integrity that addresses some of the limitations of the state of the art. Our solution is based on the principles of using virtualization hardware for isolation, verifying signatures coming from trusted code as opposed to malware for scalability and performing system checks driven by events. Our RKRD implementation is guided by our goals of strong isolation, no modifications to target guest OS kernels, easy deployment, minimal infra-structure impact, and minimal performance overhead. We developed a system prototype and conducted a number of experiments which show that the per-formance impact of our solution is negligible.
Kernel Bayesian ART and ARTMAP.
Masuyama, Naoki; Loo, Chu Kiong; Dawood, Farhan
2018-02-01
Adaptive Resonance Theory (ART) is one of the successful approaches to resolving "the plasticity-stability dilemma" in neural networks, and its supervised learning model called ARTMAP is a powerful tool for classification. Among several improvements, such as Fuzzy or Gaussian based models, the state of art model is Bayesian based one, while solving the drawbacks of others. However, it is known that the Bayesian approach for the high dimensional and a large number of data requires high computational cost, and the covariance matrix in likelihood becomes unstable. This paper introduces Kernel Bayesian ART (KBA) and ARTMAP (KBAM) by integrating Kernel Bayes' Rule (KBR) and Correntropy Induced Metric (CIM) to Bayesian ART (BA) and ARTMAP (BAM), respectively, while maintaining the properties of BA and BAM. The kernel frameworks in KBA and KBAM are able to avoid the curse of dimensionality. In addition, the covariance-free Bayesian computation by KBR provides the efficient and stable computational capability to KBA and KBAM. Furthermore, Correntropy-based similarity measurement allows improving the noise reduction ability even in the high dimensional space. The simulation experiments show that KBA performs an outstanding self-organizing capability than BA, and KBAM provides the superior classification ability than BAM, respectively. Copyright © 2017 Elsevier Ltd. All rights reserved.
Weak Disposability in Nonparametric Production Analysis with Undesirable Outputs
Kuosmanen, T.K.
2005-01-01
Environmental Economics and Natural Resources Group at Wageningen University in The Netherlands Weak disposability of outputs means that firms can abate harmful emissions by decreasing the activity level. Modeling weak disposability in nonparametric production analysis has caused some confusion.
Multi-sample nonparametric treatments comparison in medical ...
African Journals Online (AJOL)
Multi-sample nonparametric treatments comparison in medical follow-up study with unequal observation processes through simulation and bladder tumour case study. P. L. Tan, N.A. Ibrahim, M.B. Adam, J. Arasan ...
Economic decision making and the application of nonparametric prediction models
Attanasi, E.D.; Coburn, T.C.; Freeman, P.A.
2008-01-01
Sustained increases in energy prices have focused attention on gas resources in low-permeability shale or in coals that were previously considered economically marginal. Daily well deliverability is often relatively small, although the estimates of the total volumes of recoverable resources in these settings are often large. Planning and development decisions for extraction of such resources must be areawide because profitable extraction requires optimization of scale economies to minimize costs and reduce risk. For an individual firm, the decision to enter such plays depends on reconnaissance-level estimates of regional recoverable resources and on cost estimates to develop untested areas. This paper shows how simple nonparametric local regression models, used to predict technically recoverable resources at untested sites, can be combined with economic models to compute regional-scale cost functions. The context of the worked example is the Devonian Antrim-shale gas play in the Michigan basin. One finding relates to selection of the resource prediction model to be used with economic models. Models chosen because they can best predict aggregate volume over larger areas (many hundreds of sites) smooth out granularity in the distribution of predicted volumes at individual sites. This loss of detail affects the representation of economic cost functions and may affect economic decisions. Second, because some analysts consider unconventional resources to be ubiquitous, the selection and order of specific drilling sites may, in practice, be determined arbitrarily by extraneous factors. The analysis shows a 15-20% gain in gas volume when these simple models are applied to order drilling prospects strategically rather than to choose drilling locations randomly. Copyright ?? 2008 Society of Petroleum Engineers.
Nonparametric Integrated Agrometeorological Drought Monitoring: Model Development and Application
Zhang, Qiang; Li, Qin; Singh, Vijay P.; Shi, Peijun; Huang, Qingzhong; Sun, Peng
2018-01-01
Drought is a major natural hazard that has massive impacts on the society. How to monitor drought is critical for its mitigation and early warning. This study proposed a modified version of the multivariate standardized drought index (MSDI) based on precipitation, evapotranspiration, and soil moisture, i.e., modified multivariate standardized drought index (MMSDI). This study also used nonparametric joint probability distribution analysis. Comparisons were done between standardized precipitation evapotranspiration index (SPEI), standardized soil moisture index (SSMI), MSDI, and MMSDI, and real-world observed drought regimes. Results indicated that MMSDI detected droughts that SPEI and/or SSMI failed to do. Also, MMSDI detected almost all droughts that were identified by SPEI and SSMI. Further, droughts detected by MMSDI were similar to real-world observed droughts in terms of drought intensity and drought-affected area. When compared to MMSDI, MSDI has the potential to overestimate drought intensity and drought-affected area across China, which should be attributed to exclusion of the evapotranspiration components from estimation of drought intensity. Therefore, MMSDI is proposed for drought monitoring that can detect agrometeorological droughts. Results of this study provide a framework for integrated drought monitoring in other regions of the world and can help to develop drought mitigation.
Bayesian nonparametric meta-analysis using Polya tree mixture models.
Branscum, Adam J; Hanson, Timothy E
2008-09-01
Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.
DPpackage: Bayesian Semi- and Nonparametric Modeling in R
Directory of Open Access Journals (Sweden)
Alejandro Jara
2011-04-01
Full Text Available Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.
A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems
Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J.
2017-06-01
We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.
Bayesian Non-Parametric Mixtures of GARCH(1,1 Models
Directory of Open Access Journals (Sweden)
John W. Lau
2012-01-01
Full Text Available Traditional GARCH models describe volatility levels that evolve smoothly over time, generated by a single GARCH regime. However, nonstationary time series data may exhibit abrupt changes in volatility, suggesting changes in the underlying GARCH regimes. Further, the number and times of regime changes are not always obvious. This article outlines a nonparametric mixture of GARCH models that is able to estimate the number and time of volatility regime changes by mixing over the Poisson-Kingman process. The process is a generalisation of the Dirichlet process typically used in nonparametric models for time-dependent data provides a richer clustering structure, and its application to time series data is novel. Inference is Bayesian, and a Markov chain Monte Carlo algorithm to explore the posterior distribution is described. The methodology is illustrated on the Standard and Poor's 500 financial index.
A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems.
Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J
2017-06-01
We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.
DEFF Research Database (Denmark)
Carrao, Hugo; Sepulcre, Guadalupe; Horion, Stéphanie Marie Anne F
2013-01-01
This study evaluates the relationship between the frequency and duration of meteorological droughts and the subsequent temporal changes on the quantity of actively photosynthesizing biomass (greenness) estimated from satellite imagery on rainfed croplands in Latin America. An innovative non-parametric...... and non-supervised approach, based on the Fisher-Jenks optimal classification algorithm, is used to identify multi-scale meteorological droughts on the basis of empirical cumulative distributions of 1, 3, 6, and 12-monthly precipitation totals. As input data for the classifier, we use the gridded GPCC...... for the period between 1998 and 2010. The time-series analysis of vegetation greenness is performed during the growing season with a non-parametric method, namely the seasonal Relative Greenness (RG) of spatially accumulated fAPAR. The Global Land Cover map of 2000 and the GlobCover maps of 2005/2006 and 2009...
Characterisation and final disposal behaviour of theoria-based fuel kernels in aqueous phases
International Nuclear Information System (INIS)
Titov, M.
2005-08-01
Two high-temperature reactors (AVR and THTR) operated in Germany have produced about 1 million spent fuel elements. The nuclear fuel in these reactors consists mainly of thorium-uranium mixed oxides, but also pure uranium dioxide and carbide fuels were tested. One of the possible solutions of utilising spent HTR fuel is the direct disposal in deep geological formations. Under such circumstances, the properties of fuel kernels, and especially their leaching behaviour in aqueous phases, have to be investigated for safety assessments of the final repository. In the present work, unirradiated ThO 2 , (Th 0.906 ,U 0.094 )O 2 , (Th 0.834 ,U 0.166 )O 2 and UO 2 fuel kernels were investigated. The composition, crystal structure and surface of the kernels were investigated by traditional methods. Furthermore, a new method was developed for testing the mechanical properties of ceramic kernels. The method was successfully used for the examination of mechanical properties of oxide kernels and for monitoring their evolution during contact with aqueous phases. The leaching behaviour of thoria-based oxide kernels and powders was investigated in repository-relevant salt solutions, as well as in artificial leachates. The influence of different experimental parameters on the kernel leaching stability was investigated. It was shown that thoria-based fuel kernels possess high chemical stability and are indifferent to presence of oxidative and radiolytic species in solution. The dissolution rate of thoria-based materials is typically several orders of magnitude lower than of conventional UO 2 fuel kernels. The life time of a single intact (Th,U)O 2 kernel under aggressive conditions of salt repository was estimated as about hundred thousand years. The importance of grain boundary quality on the leaching stability was demonstrated. Numerical Monte Carlo simulations were performed in order to explain the results of leaching experiments. (orig.)
Theory of reproducing kernels and applications
Saitoh, Saburou
2016-01-01
This book provides a large extension of the general theory of reproducing kernels published by N. Aronszajn in 1950, with many concrete applications. In Chapter 1, many concrete reproducing kernels are first introduced with detailed information. Chapter 2 presents a general and global theory of reproducing kernels with basic applications in a self-contained way. Many fundamental operations among reproducing kernel Hilbert spaces are dealt with. Chapter 2 is the heart of this book. Chapter 3 is devoted to the Tikhonov regularization using the theory of reproducing kernels with applications to numerical and practical solutions of bounded linear operator equations. In Chapter 4, the numerical real inversion formulas of the Laplace transform are presented by applying the Tikhonov regularization, where the reproducing kernels play a key role in the results. Chapter 5 deals with ordinary differential equations; Chapter 6 includes many concrete results for various fundamental partial differential equations. In Chapt...
Convergence of barycentric coordinates to barycentric kernels
Kosinka, Jiří
2016-02-12
We investigate the close correspondence between barycentric coordinates and barycentric kernels from the point of view of the limit process when finer and finer polygons converge to a smooth convex domain. We show that any barycentric kernel is the limit of a set of barycentric coordinates and prove that the convergence rate is quadratic. Our convergence analysis extends naturally to barycentric interpolants and mappings induced by barycentric coordinates and kernels. We verify our theoretical convergence results numerically on several examples.
Convergence of barycentric coordinates to barycentric kernels
Kosinka, Jiří
2016-01-01
We investigate the close correspondence between barycentric coordinates and barycentric kernels from the point of view of the limit process when finer and finer polygons converge to a smooth convex domain. We show that any barycentric kernel is the limit of a set of barycentric coordinates and prove that the convergence rate is quadratic. Our convergence analysis extends naturally to barycentric interpolants and mappings induced by barycentric coordinates and kernels. We verify our theoretical convergence results numerically on several examples.
Kernel principal component analysis for change detection
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg; Morton, J.C.
2008-01-01
region acquired at two different time points. If change over time does not dominate the scene, the projection of the original two bands onto the second eigenvector will show change over time. In this paper a kernel version of PCA is used to carry out the analysis. Unlike ordinary PCA, kernel PCA...... with a Gaussian kernel successfully finds the change observations in a case where nonlinearities are introduced artificially....
Impulse response identification with deterministic inputs using non-parametric methods
International Nuclear Information System (INIS)
Bhargava, U.K.; Kashyap, R.L.; Goodman, D.M.
1985-01-01
This paper addresses the problem of impulse response identification using non-parametric methods. Although the techniques developed herein apply to the truncated, untruncated, and the circulant models, we focus on the truncated model which is useful in certain applications. Two methods of impulse response identification will be presented. The first is based on the minimization of the C/sub L/ Statistic, which is an estimate of the mean-square prediction error; the second is a Bayesian approach. For both of these methods, we consider the effects of using both the identity matrix and the Laplacian matrix as weights on the energy in the impulse response. In addition, we present a method for estimating the effective length of the impulse response. Estimating the length is particularly important in the truncated case. Finally, we develop a method for estimating the noise variance at the output. Often, prior information on the noise variance is not available, and a good estimate is crucial to the success of estimating the impulse response with a nonparametric technique
A robust nonparametric method for quantifying undetected extinctions.
Chisholm, Ryan A; Giam, Xingli; Sadanandan, Keren R; Fung, Tak; Rheindt, Frank E
2016-06-01
How many species have gone extinct in modern times before being described by science? To answer this question, and thereby get a full assessment of humanity's impact on biodiversity, statistical methods that quantify undetected extinctions are required. Such methods have been developed recently, but they are limited by their reliance on parametric assumptions; specifically, they assume the pools of extant and undetected species decay exponentially, whereas real detection rates vary temporally with survey effort and real extinction rates vary with the waxing and waning of threatening processes. We devised a new, nonparametric method for estimating undetected extinctions. As inputs, the method requires only the first and last date at which each species in an ensemble was recorded. As outputs, the method provides estimates of the proportion of species that have gone extinct, detected, or undetected and, in the special case where the number of undetected extant species in the present day is assumed close to zero, of the absolute number of undetected extinct species. The main assumption of the method is that the per-species extinction rate is independent of whether a species has been detected or not. We applied the method to the resident native bird fauna of Singapore. Of 195 recorded species, 58 (29.7%) have gone extinct in the last 200 years. Our method projected that an additional 9.6 species (95% CI 3.4, 19.8) have gone extinct without first being recorded, implying a true extinction rate of 33.0% (95% CI 31.0%, 36.2%). We provide R code for implementing our method. Because our method does not depend on strong assumptions, we expect it to be broadly useful for quantifying undetected extinctions. © 2016 Society for Conservation Biology.
Firing rate estimation using infinite mixture models and its application to neural decoding.
Shibue, Ryohei; Komaki, Fumiyasu
2017-11-01
Neural decoding is a framework for reconstructing external stimuli from spike trains recorded by various neural recordings. Kloosterman et al. proposed a new decoding method using marked point processes (Kloosterman F, Layton SP, Chen Z, Wilson MA. J Neurophysiol 111: 217-227, 2014). This method does not require spike sorting and thereby improves decoding accuracy dramatically. In this method, they used kernel density estimation to estimate intensity functions of marked point processes. However, the use of kernel density estimation causes problems such as low decoding accuracy and high computational costs. To overcome these problems, we propose a new decoding method using infinite mixture models to estimate intensity. The proposed method improves decoding performance in terms of accuracy and computational speed. We apply the proposed method to simulation and experimental data to verify its performance. NEW & NOTEWORTHY We propose a new neural decoding method using infinite mixture models and nonparametric Bayesian statistics. The proposed method improves decoding performance in terms of accuracy and computation speed. We have successfully applied the proposed method to position decoding from spike trains recorded in a rat hippocampus. Copyright © 2017 the American Physiological Society.
Process for producing metal oxide kernels and kernels so obtained
International Nuclear Information System (INIS)
Lelievre, Bernard; Feugier, Andre.
1974-01-01
The process desbribed is for producing fissile or fertile metal oxide kernels used in the fabrication of fuels for high temperature nuclear reactors. This process consists in adding to an aqueous solution of at least one metallic salt, particularly actinide nitrates, at least one chemical compound capable of releasing ammonia, in dispersing drop by drop the solution thus obtained into a hot organic phase to gel the drops and transform them into solid particles. These particles are then washed, dried and treated to turn them into oxide kernels. The organic phase used for the gel reaction is formed of a mixture composed of two organic liquids, one acting as solvent and the other being a product capable of extracting the anions from the metallic salt of the drop at the time of gelling. Preferably an amine is used as product capable of extracting the anions. Additionally, an alcohol that causes a part dehydration of the drops can be employed as solvent, thus helping to increase the resistance of the particles [fr
Adaptive Learning in Cartesian Product of Reproducing Kernel Hilbert Spaces
Yukawa, Masahiro
2014-01-01
We propose a novel adaptive learning algorithm based on iterative orthogonal projections in the Cartesian product of multiple reproducing kernel Hilbert spaces (RKHSs). The task is estimating/tracking nonlinear functions which are supposed to contain multiple components such as (i) linear and nonlinear components, (ii) high- and low- frequency components etc. In this case, the use of multiple RKHSs permits a compact representation of multicomponent functions. The proposed algorithm is where t...
Hilbertian kernels and spline functions
Atteia, M
1992-01-01
In this monograph, which is an extensive study of Hilbertian approximation, the emphasis is placed on spline functions theory. The origin of the book was an effort to show that spline theory parallels Hilbertian Kernel theory, not only for splines derived from minimization of a quadratic functional but more generally for splines considered as piecewise functions type. Being as far as possible self-contained, the book may be used as a reference, with information about developments in linear approximation, convex optimization, mechanics and partial differential equations.
Chen, Tianle; Zeng, Donglin
2015-01-01
Summary Predicting disease risk and progression is one of the main goals in many clinical research studies. Cohort studies on the natural history and etiology of chronic diseases span years and data are collected at multiple visits. Although kernel-based statistical learning methods are proven to be powerful for a wide range of disease prediction problems, these methods are only well studied for independent data but not for longitudinal data. It is thus important to develop time-sensitive prediction rules that make use of the longitudinal nature of the data. In this paper, we develop a novel statistical learning method for longitudinal data by introducing subject-specific short-term and long-term latent effects through a designed kernel to account for within-subject correlation of longitudinal measurements. Since the presence of multiple sources of data is increasingly common, we embed our method in a multiple kernel learning framework and propose a regularized multiple kernel statistical learning with random effects to construct effective nonparametric prediction rules. Our method allows easy integration of various heterogeneous data sources and takes advantage of correlation among longitudinal measures to increase prediction power. We use different kernels for each data source taking advantage of the distinctive feature of each data modality, and then optimally combine data across modalities. We apply the developed methods to two large epidemiological studies, one on Huntington's disease and the other on Alzheimer's Disease (Alzheimer's Disease Neuroimaging Initiative, ADNI) where we explore a unique opportunity to combine imaging and genetic data to study prediction of mild cognitive impairment, and show a substantial gain in performance while accounting for the longitudinal aspect of the data. PMID:26177419
Chiu, Chun-Huo; Wang, Yi-Ting; Walther, Bruno A; Chao, Anne
2014-09-01
It is difficult to accurately estimate species richness if there are many almost undetectable species in a hyper-diverse community. Practically, an accurate lower bound for species richness is preferable to an inaccurate point estimator. The traditional nonparametric lower bound developed by Chao (1984, Scandinavian Journal of Statistics 11, 265-270) for individual-based abundance data uses only the information on the rarest species (the numbers of singletons and doubletons) to estimate the number of undetected species in samples. Applying a modified Good-Turing frequency formula, we derive an approximate formula for the first-order bias of this traditional lower bound. The approximate bias is estimated by using additional information (namely, the numbers of tripletons and quadrupletons). This approximate bias can be corrected, and an improved lower bound is thus obtained. The proposed lower bound is nonparametric in the sense that it is universally valid for any species abundance distribution. A similar type of improved lower bound can be derived for incidence data. We test our proposed lower bounds on simulated data sets generated from various species abundance models. Simulation results show that the proposed lower bounds always reduce bias over the traditional lower bounds and improve accuracy (as measured by mean squared error) when the heterogeneity of species abundances is relatively high. We also apply the proposed new lower bounds to real data for illustration and for comparisons with previously developed estimators. © 2014, The International Biometric Society.
Lu, Zhao; Sun, Jing; Butts, Kenneth
2016-02-03
A giant leap has been made in the past couple of decades with the introduction of kernel-based learning as a mainstay for designing effective nonlinear computational learning algorithms. In view of the geometric interpretation of conditional expectation and the ubiquity of multiscale characteristics in highly complex nonlinear dynamic systems [1]-[3], this paper presents a new orthogonal projection operator wavelet kernel, aiming at developing an efficient computational learning approach for nonlinear dynamical system identification. In the framework of multiresolution analysis, the proposed projection operator wavelet kernel can fulfill the multiscale, multidimensional learning to estimate complex dependencies. The special advantage of the projection operator wavelet kernel developed in this paper lies in the fact that it has a closed-form expression, which greatly facilitates its application in kernel learning. To the best of our knowledge, it is the first closed-form orthogonal projection wavelet kernel reported in the literature. It provides a link between grid-based wavelets and mesh-free kernel-based methods. Simulation studies for identifying the parallel models of two benchmark nonlinear dynamical systems confirm its superiority in model accuracy and sparsity.
Dense Medium Machine Processing Method for Palm Kernel/ Shell ...
African Journals Online (AJOL)
ADOWIE PERE
Cracked palm kernel is a mixture of kernels, broken shells, dusts and other impurities. In ... machine processing method using dense medium, a separator, a shell collector and a kernel .... efficiency, ease of maintenance and uniformity of.
Mitigation of artifacts in rtm with migration kernel decomposition
Zhan, Ge; Schuster, Gerard T.
2012-01-01
The migration kernel for reverse-time migration (RTM) can be decomposed into four component kernels using Born scattering and migration theory. Each component kernel has a unique physical interpretation and can be interpreted differently
Ranking Support Vector Machine with Kernel Approximation
Directory of Open Access Journals (Sweden)
Kai Chen
2017-01-01
Full Text Available Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels can give higher accuracy than linear RankSVM (RankSVM with a linear kernel for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms.
Ranking Support Vector Machine with Kernel Approximation.
Chen, Kai; Li, Rongchun; Dou, Yong; Liang, Zhengfa; Lv, Qi
2017-01-01
Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM) is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms.
Sentiment classification with interpolated information diffusion kernels
Raaijmakers, S.
2007-01-01
Information diffusion kernels - similarity metrics in non-Euclidean information spaces - have been found to produce state of the art results for document classification. In this paper, we present a novel approach to global sentiment classification using these kernels. We carry out a large array of
Evolution kernel for the Dirac field
International Nuclear Information System (INIS)
Baaquie, B.E.
1982-06-01
The evolution kernel for the free Dirac field is calculated using the Wilson lattice fermions. We discuss the difficulties due to which this calculation has not been previously performed in the continuum theory. The continuum limit is taken, and the complete energy eigenfunctions as well as the propagator are then evaluated in a new manner using the kernel. (author)
Improving the Bandwidth Selection in Kernel Equating
Andersson, Björn; von Davier, Alina A.
2014-01-01
We investigate the current bandwidth selection methods in kernel equating and propose a method based on Silverman's rule of thumb for selecting the bandwidth parameters. In kernel equating, the bandwidth parameters have previously been obtained by minimizing a penalty function. This minimization process has been criticized by practitioners…
Kernel Korner : The Linux keyboard driver
Brouwer, A.E.
1995-01-01
Our Kernel Korner series continues with an article describing the Linux keyboard driver. This article is not for "Kernel Hackers" only--in fact, it will be most useful to those who wish to use their own keyboard to its fullest potential, and those who want to write programs to take advantage of the
Metabolic network prediction through pairwise rational kernels.
Roche-Lima, Abiel; Domaratzki, Michael; Fristensky, Brian
2014-09-26
Metabolic networks are represented by the set of metabolic pathways. Metabolic pathways are a series of biochemical reactions, in which the product (output) from one reaction serves as the substrate (input) to another reaction. Many pathways remain incompletely characterized. One of the major challenges of computational biology is to obtain better models of metabolic pathways. Existing models are dependent on the annotation of the genes. This propagates error accumulation when the pathways are predicted by incorrectly annotated genes. Pairwise classification methods are supervised learning methods used to classify new pair of entities. Some of these classification methods, e.g., Pairwise Support Vector Machines (SVMs), use pairwise kernels. Pairwise kernels describe similarity measures between two pairs of entities. Using pairwise kernels to handle sequence data requires long processing times and large storage. Rational kernels are kernels based on weighted finite-state transducers that represent similarity measures between sequences or automata. They have been effectively used in problems that handle large amount of sequence information such as protein essentiality, natural language processing and machine translations. We create a new family of pairwise kernels using weighted finite-state transducers (called Pairwise Rational Kernel (PRK)) to predict metabolic pathways from a variety of biological data. PRKs take advantage of the simpler representations and faster algorithms of transducers. Because raw sequence data can be used, the predictor model avoids the errors introduced by incorrect gene annotations. We then developed several experiments with PRKs and Pairwise SVM to validate our methods using the metabolic network of Saccharomyces cerevisiae. As a result, when PRKs are used, our method executes faster in comparison with other pairwise kernels. Also, when we use PRKs combined with other simple kernels that include evolutionary information, the accuracy
Predicting Market Impact Costs Using Nonparametric Machine Learning Models.
Directory of Open Access Journals (Sweden)
Saerom Park
Full Text Available Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide the predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a-state-of-the-art benchmark parametric model such as I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance.
Predicting Market Impact Costs Using Nonparametric Machine Learning Models.
Park, Saerom; Lee, Jaewook; Son, Youngdoo
2016-01-01
Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide the predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a-state-of-the-art benchmark parametric model such as I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance.
Application of nonparametric statistic method for DNBR limit calculation
International Nuclear Information System (INIS)
Dong Bo; Kuang Bo; Zhu Xuenong
2013-01-01
Background: Nonparametric statistical method is a kind of statistical inference method not depending on a certain distribution; it calculates the tolerance limits under certain probability level and confidence through sampling methods. The DNBR margin is one important parameter of NPP design, which presents the safety level of NPP. Purpose and Methods: This paper uses nonparametric statistical method basing on Wilks formula and VIPER-01 subchannel analysis code to calculate the DNBR design limits (DL) of 300 MW NPP (Nuclear Power Plant) during the complete loss of flow accident, simultaneously compared with the DL of DNBR through means of ITDP to get certain DNBR margin. Results: The results indicate that this method can gain 2.96% DNBR margin more than that obtained by ITDP methodology. Conclusions: Because of the reduction of the conservation during analysis process, the nonparametric statistical method can provide greater DNBR margin and the increase of DNBR margin is benefited for the upgrading of core refuel scheme. (authors)
Comparing parametric and nonparametric regression methods for panel data
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb......-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs....... The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test...
A nonparametric spatial scan statistic for continuous data.
Jung, Inkyung; Cho, Ho Jin
2015-10-20
Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compared the performance of the method with parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
Dose calculation methods in photon beam therapy using energy deposition kernels
International Nuclear Information System (INIS)
Ahnesjoe, A.
1991-01-01
The problem of calculating accurate dose distributions in treatment planning of megavoltage photon radiation therapy has been studied. New dose calculation algorithms using energy deposition kernels have been developed. The kernels describe the transfer of energy by secondary particles from a primary photon interaction site to its surroundings. Monte Carlo simulations of particle transport have been used for derivation of kernels for primary photon energies form 0.1 MeV to 50 MeV. The trade off between accuracy and calculational speed has been addressed by the development of two algorithms; one point oriented with low computional overhead for interactive use and one for fast and accurate calculation of dose distributions in a 3-dimensional lattice. The latter algorithm models secondary particle transport in heterogeneous tissue by scaling energy deposition kernels with the electron density of the tissue. The accuracy of the methods has been tested using full Monte Carlo simulations for different geometries, and found to be superior to conventional algorithms based on scaling of broad beam dose distributions. Methods have also been developed for characterization of clinical photon beams in entities appropriate for kernel based calculation models. By approximating the spectrum as laterally invariant, an effective spectrum and dose distribution for contaminating charge particles are derived form depth dose distributions measured in water, using analytical constraints. The spectrum is used to calculate kernels by superposition of monoenergetic kernels. The lateral energy fluence distribution is determined by deconvolving measured lateral dose distributions by a corresponding pencil beam kernel. Dose distributions for contaminating photons are described using two different methods, one for estimation of the dose outside of the collimated beam, and the other for calibration of output factors derived from kernel based dose calculations. (au)
A Bayesian Nonparametric Meta-Analysis Model
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.
2015-01-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
Directory of Open Access Journals (Sweden)
Urbi Garay
2016-03-01
Full Text Available We define a dynamic and self-adjusting mixture of Gaussian Graphical Models to cluster financial returns, and provide a new method for extraction of nonparametric estimates of dynamic alphas (excess return and betas (to a choice set of explanatory factors in a multivariate setting. This approach, as well as the outputs, has a dynamic, nonstationary and nonparametric form, which circumvents the problem of model risk and parametric assumptions that the Kalman filter and other widely used approaches rely on. The by-product of clusters, used for shrinkage and information borrowing, can be of use to determine relationships around specific events. This approach exhibits a smaller Root Mean Squared Error than traditionally used benchmarks in financial settings, which we illustrate through simulation. As an illustration, we use hedge fund index data, and find that our estimated alphas are, on average, 0.13% per month higher (1.6% per year than alphas estimated through Ordinary Least Squares. The approach exhibits fast adaptation to abrupt changes in the parameters, as seen in our estimated alphas and betas, which exhibit high volatility, especially in periods which can be identified as times of stressful market events, a reflection of the dynamic positioning of hedge fund portfolio managers.
Park, Taeyoung; Jeong, Jong-Hyeon; Lee, Jae Won
2012-08-15
There is often an interest in estimating a residual life function as a summary measure of survival data. For ease in presentation of the potential therapeutic effect of a new drug, investigators may summarize survival data in terms of the remaining life years of patients. Under heavy right censoring, however, some reasonably high quantiles (e.g., median) of a residual lifetime distribution cannot be always estimated via a popular nonparametric approach on the basis of the Kaplan-Meier estimator. To overcome the difficulties in dealing with heavily censored survival data, this paper develops a Bayesian nonparametric approach that takes advantage of a fully model-based but highly flexible probabilistic framework. We use a Dirichlet process mixture of Weibull distributions to avoid strong parametric assumptions on the unknown failure time distribution, making it possible to estimate any quantile residual life function under heavy censoring. Posterior computation through Markov chain Monte Carlo is straightforward and efficient because of conjugacy properties and partial collapse. We illustrate the proposed methods by using both simulated data and heavily censored survival data from a recent breast cancer clinical trial conducted by the National Surgical Adjuvant Breast and Bowel Project. Copyright © 2012 John Wiley & Sons, Ltd.
Nonparametric regression using the concept of minimum energy
International Nuclear Information System (INIS)
Williams, Mike
2011-01-01
It has recently been shown that an unbinned distance-based statistic, the energy, can be used to construct an extremely powerful nonparametric multivariate two sample goodness-of-fit test. An extension to this method that makes it possible to perform nonparametric regression using multiple multivariate data sets is presented in this paper. The technique, which is based on the concept of minimizing the energy of the system, permits determination of parameters of interest without the need for parametric expressions of the parent distributions of the data sets. The application and performance of this new method is discussed in the context of some simple example analyses.
Probabilistic wind power forecasting based on logarithmic transformation and boundary kernel
International Nuclear Information System (INIS)
Zhang, Yao; Wang, Jianxue; Luo, Xu
2015-01-01
Highlights: • Quantitative information on the uncertainty of wind power generation. • Kernel density estimator provides non-Gaussian predictive distributions. • Logarithmic transformation reduces the skewness of wind power density. • Boundary kernel method eliminates the density leakage near the boundary. - Abstracts: Probabilistic wind power forecasting not only produces the expectation of wind power output, but also gives quantitative information on the associated uncertainty, which is essential for making better decisions about power system and market operations with the increasing penetration of wind power generation. This paper presents a novel kernel density estimator for probabilistic wind power forecasting, addressing two characteristics of wind power which have adverse impacts on the forecast accuracy, namely, the heavily skewed and double-bounded nature of wind power density. Logarithmic transformation is used to reduce the skewness of wind power density, which improves the effectiveness of the kernel density estimator in a transformed scale. Transformations partially relieve the boundary effect problem of the kernel density estimator caused by the double-bounded nature of wind power density. However, the case study shows that there are still some serious problems of density leakage after the transformation. In order to solve this problem in the transformed scale, a boundary kernel method is employed to eliminate the density leak at the bounds of wind power distribution. The improvement of the proposed method over the standard kernel density estimator is demonstrated by short-term probabilistic forecasting results based on the data from an actual wind farm. Then, a detailed comparison is carried out of the proposed method and some existing probabilistic forecasting methods
DBKGrad: An R Package for Mortality Rates Graduation by Discrete Beta Kernel Techniques
Directory of Open Access Journals (Sweden)
Angelo Mazza
2014-04-01
Full Text Available We introduce the R package DBKGrad, conceived to facilitate the use of kernel smoothing in graduating mortality rates. The package implements univariate and bivariate adaptive discrete beta kernel estimators. Discrete kernels have been preferred because, in this context, variables such as age, calendar year and duration, are pragmatically considered as discrete and the use of beta kernels is motivated since it reduces boundary bias. Furthermore, when data on exposures to the risk of death are available, the use of adaptive bandwidth, that may be selected by cross-validation, can provide additional benefits. To exemplify the use of the package, an application to Italian mortality rates, for different ages and calendar years, is presented.
Azarnavid, Babak; Parand, Kourosh; Abbasbandy, Saeid
2018-06-01
This article discusses an iterative reproducing kernel method with respect to its effectiveness and capability of solving a fourth-order boundary value problem with nonlinear boundary conditions modeling beams on elastic foundations. Since there is no method of obtaining reproducing kernel which satisfies nonlinear boundary conditions, the standard reproducing kernel methods cannot be used directly to solve boundary value problems with nonlinear boundary conditions as there is no knowledge about the existence and uniqueness of the solution. The aim of this paper is, therefore, to construct an iterative method by the use of a combination of reproducing kernel Hilbert space method and a shooting-like technique to solve the mentioned problems. Error estimation for reproducing kernel Hilbert space methods for nonlinear boundary value problems have yet to be discussed in the literature. In this paper, we present error estimation for the reproducing kernel method to solve nonlinear boundary value problems probably for the first time. Some numerical results are given out to demonstrate the applicability of the method.
Nonparametric Bayesian models for a spatial covariance.
Reich, Brian J; Fuentes, Montserrat
2012-01-01
A crucial step in the analysis of spatial data is to estimate the spatial correlation function that determines the relationship between a spatial process at two locations. The standard approach to selecting the appropriate correlation function is to use prior knowledge or exploratory analysis, such as a variogram analysis, to select the correct parametric correlation function. Rather that selecting a particular parametric correlation function, we treat the covariance function as an unknown function to be estimated from the data. We propose a flexible prior for the correlation function to provide robustness to the choice of correlation function. We specify the prior for the correlation function using spectral methods and the Dirichlet process prior, which is a common prior for an unknown distribution function. Our model does not require Gaussian data or spatial locations on a regular grid. The approach is demonstrated using a simulation study as well as an analysis of California air pollution data.
Statistical decisions under nonparametric a priori information
International Nuclear Information System (INIS)
Chilingaryan, A.A.
1985-01-01
The basic module of applied program package for statistical analysis of the ANI experiment data is described. By means of this module tasks of choosing theoretical model most adequately fitting to experimental data, selection of events of definte type, identification of elementary particles are carried out. For mentioned problems solving, the Bayesian rules, one-leave out test and KNN (K Nearest Neighbour) adaptive density estimation are utilized
Nonhomogeneous Poisson process with nonparametric frailty
International Nuclear Information System (INIS)
Slimacek, Vaclav; Lindqvist, Bo Henry
2016-01-01
The failure processes of heterogeneous repairable systems are often modeled by non-homogeneous Poisson processes. The common way to describe an unobserved heterogeneity between systems is to multiply the basic rate of occurrence of failures by a random variable (a so-called frailty) having a specified parametric distribution. Since the frailty is unobservable, the choice of its distribution is a problematic part of using these models, as are often the numerical computations needed in the estimation of these models. The main purpose of this paper is to develop a method for estimation of the parameters of a nonhomogeneous Poisson process with unobserved heterogeneity which does not require parametric assumptions about the heterogeneity and which avoids the frequently encountered numerical problems associated with the standard models for unobserved heterogeneity. The introduced method is illustrated on an example involving the power law process, and is compared to the standard gamma frailty model and to the classical model without unobserved heterogeneity. The derived results are confirmed in a simulation study which also reveals several not commonly known properties of the gamma frailty model and the classical model, and on a real life example. - Highlights: • A new method for estimation of a NHPP with frailty is introduced. • Introduced method does not require parametric assumptions about frailty. • The approach is illustrated on an example with the power law process. • The method is compared to the gamma frailty model and to the model without frailty.
Energy Technology Data Exchange (ETDEWEB)
Lopez Fontan, J.L.; Costa, J.; Ruso, J.M.; Prieto, G. [Dept. of Applied Physics, Univ. of Santiago de Compostela, Santiago de Compostela (Spain); Sarmiento, F. [Dept. of Mathematics, Faculty of Informatics, Univ. of A Coruna, A Coruna (Spain)
2004-02-01
The application of a statistical method, the local polynomial regression method, (LPRM), based on a nonparametric estimation of the regression function to determine the critical micelle concentration (cmc) is presented. The method is extremely flexible because it does not impose any parametric model on the subjacent structure of the data but rather allows the data to speak for themselves. Good concordance of cmc values with those obtained by other methods was found for systems in which the variation of a measured physical property with concentration showed an abrupt change. When this variation was slow, discrepancies between the values obtained by LPRM and others methods were found. (orig.)
Putting Priors in Mixture Density Mercer Kernels
Srivastava, Ashok N.; Schumann, Johann; Fischer, Bernd
2004-01-01
This paper presents a new methodology for automatic knowledge driven data mining based on the theory of Mercer Kernels, which are highly nonlinear symmetric positive definite mappings from the original image space to a very high, possibly infinite dimensional feature space. We describe a new method called Mixture Density Mercer Kernels to learn kernel function directly from data, rather than using predefined kernels. These data adaptive kernels can en- code prior knowledge in the kernel using a Bayesian formulation, thus allowing for physical information to be encoded in the model. We compare the results with existing algorithms on data from the Sloan Digital Sky Survey (SDSS). The code for these experiments has been generated with the AUTOBAYES tool, which automatically generates efficient and documented C/C++ code from abstract statistical model specifications. The core of the system is a schema library which contains template for learning and knowledge discovery algorithms like different versions of EM, or numeric optimization methods like conjugate gradient methods. The template instantiation is supported by symbolic- algebraic computations, which allows AUTOBAYES to find closed-form solutions and, where possible, to integrate them into the code. The results show that the Mixture Density Mercer-Kernel described here outperforms tree-based classification in distinguishing high-redshift galaxies from low- redshift galaxies by approximately 16% on test data, bagged trees by approximately 7%, and bagged trees built on a much larger sample of data by approximately 2%.
Anisotropic hydrodynamics with a scalar collisional kernel
Almaalol, Dekrayat; Strickland, Michael
2018-04-01
Prior studies of nonequilibrium dynamics using anisotropic hydrodynamics have used the relativistic Anderson-Witting scattering kernel or some variant thereof. In this paper, we make the first study of the impact of using a more realistic scattering kernel. For this purpose, we consider a conformal system undergoing transversally homogenous and boost-invariant Bjorken expansion and take the collisional kernel to be given by the leading order 2 ↔2 scattering kernel in scalar λ ϕ4 . We consider both classical and quantum statistics to assess the impact of Bose enhancement on the dynamics. We also determine the anisotropic nonequilibrium attractor of a system subject to this collisional kernel. We find that, when the near-equilibrium relaxation-times in the Anderson-Witting and scalar collisional kernels are matched, the scalar kernel results in a higher degree of momentum-space anisotropy during the system's evolution, given the same initial conditions. Additionally, we find that taking into account Bose enhancement further increases the dynamically generated momentum-space anisotropy.
Bayesian nonparametric system reliability using sets of priors
Walter, G.M.; Aslett, L.J.M.; Coolen, F.P.A.
2016-01-01
An imprecise Bayesian nonparametric approach to system reliability with multiple types of components is developed. This allows modelling partial or imperfect prior knowledge on component failure distributions in a flexible way through bounds on the functioning probability. Given component level test
Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.
Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F
2013-04-01
In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.
Nonparametric modeling of dynamic functional connectivity in fmri data
DEFF Research Database (Denmark)
Nielsen, Søren Føns Vind; Madsen, Kristoffer H.; Røge, Rasmus
2015-01-01
dynamic changes. The existing approaches modeling dynamic connectivity have primarily been based on time-windowing the data and k-means clustering. We propose a nonparametric generative model for dynamic FC in fMRI that does not rely on specifying window lengths and number of dynamic states. Rooted...
Parametric vs. Nonparametric Regression Modelling within Clinical Decision Support
Czech Academy of Sciences Publication Activity Database
Kalina, Jan; Zvárová, Jana
2017-01-01
Roč. 5, č. 1 (2017), s. 21-27 ISSN 1805-8698 R&D Projects: GA ČR GA17-01251S Institutional support: RVO:67985807 Keywords : decision support systems * decision rules * statistical analysis * nonparametric regression Subject RIV: IN - Informatics, Computer Science OBOR OECD: Statistics and probability
A general approach to posterior contraction in nonparametric inverse problems
Knapik, Bartek; Salomond, Jean Bernard
In this paper, we propose a general method to derive an upper bound for the contraction rate of the posterior distribution for nonparametric inverse problems. We present a general theorem that allows us to derive contraction rates for the parameter of interest from contraction rates of the related
Non-parametric analysis of production efficiency of poultry egg ...
African Journals Online (AJOL)
Non-parametric analysis of production efficiency of poultry egg farmers in Delta ... analysis of factors affecting the output of poultry farmers showed that stock ... should be put in place for farmers to learn the best farm practices carried out on the ...
International Nuclear Information System (INIS)
Chen Qiang; Ren Xuemei; Na Jing
2011-01-01
Highlights: Model uncertainty of the system is approximated by multiple-kernel LSSVM. Approximation errors and disturbances are compensated in the controller design. Asymptotical anti-synchronization is achieved with model uncertainty and disturbances. Abstract: In this paper, we propose a robust anti-synchronization scheme based on multiple-kernel least squares support vector machine (MK-LSSVM) modeling for two uncertain chaotic systems. The multiple-kernel regression, which is a linear combination of basic kernels, is designed to approximate system uncertainties by constructing a multiple-kernel Lagrangian function and computing the corresponding regression parameters. Then, a robust feedback control based on MK-LSSVM modeling is presented and an improved update law is employed to estimate the unknown bound of the approximation error. The proposed control scheme can guarantee the asymptotic convergence of the anti-synchronization errors in the presence of system uncertainties and external disturbances. Numerical examples are provided to show the effectiveness of the proposed method.
Ahmed, Qasim Zeeshan
2013-01-01
In this letter, a new detector is proposed for amplifyand- forward (AF) relaying system when communicating with the assistance of relays. The major goal of this detector is to improve the bit error rate (BER) performance of the receiver. The probability density function is estimated with the help of kernel density technique. A generalized Gaussian kernel is proposed. This new kernel provides more flexibility and encompasses Gaussian and uniform kernels as special cases. The optimal window width of the kernel is calculated. Simulations results show that a gain of more than 1 dB can be achieved in terms of BER performance as compared to the minimum mean square error (MMSE) receiver when communicating over Rayleigh fading channels.
Analyzing kernel matrices for the identification of differentially expressed genes.
Directory of Open Access Journals (Sweden)
Xiao-Lei Xia
Full Text Available One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify the differentially expressed genes (DEGs, followed by the employment of the state-of-the-art learning machines including the Support Vector Machines (SVM in particular. The SVM is a typical sample-based classifier whose performance comes down to how discriminant samples are. However, DEGs identified by statistical tests are not guaranteed to result in a training dataset composed of discriminant samples. To tackle this problem, a novel gene ranking method namely the Kernel Matrix Gene Selection (KMGS is proposed. The rationale of the method, which roots in the fundamental ideas of the SVM algorithm, is described. The notion of ''the separability of a sample'' which is estimated by performing [Formula: see text]-like statistics on each column of the kernel matrix, is first introduced. The separability of a classification problem is then measured, from which the significance of a specific gene is deduced. Also described is a method of Kernel Matrix Sequential Forward Selection (KMSFS which shares the KMGS method's essential ideas but proceeds in a greedy manner. On three public microarray datasets, our proposed algorithms achieved noticeably competitive performance in terms of the B.632+ error rate.
A kernel plus method for quantifying wind turbine performance upgrades
Lee, Giwhyun
2014-04-21
Power curves are commonly estimated using the binning method recommended by the International Electrotechnical Commission, which primarily incorporates wind speed information. When such power curves are used to quantify a turbine\\'s upgrade, the results may not be accurate because many other environmental factors in addition to wind speed, such as temperature, air pressure, turbulence intensity, wind shear and humidity, all potentially affect the turbine\\'s power output. Wind industry practitioners are aware of the need to filter out effects from environmental conditions. Toward that objective, we developed a kernel plus method that allows incorporation of multivariate environmental factors in a power curve model, thereby controlling the effects from environmental factors while comparing power outputs. We demonstrate that the kernel plus method can serve as a useful tool for quantifying a turbine\\'s upgrade because it is sensitive to small and moderate changes caused by certain turbine upgrades. Although we demonstrate the utility of the kernel plus method in this specific application, the resulting method is a general, multivariate model that can connect other physical factors, as long as their measurements are available, with a turbine\\'s power output, which may allow us to explore new physical properties associated with wind turbine performance. © 2014 John Wiley & Sons, Ltd.
NLO corrections to the Kernel of the BKP-equations
Energy Technology Data Exchange (ETDEWEB)
Bartels, J. [Hamburg Univ. (Germany). 2. Inst. fuer Theoretische Physik; Fadin, V.S. [Budker Institute of Nuclear Physics, Novosibirsk (Russian Federation); Novosibirskij Gosudarstvennyj Univ., Novosibirsk (Russian Federation); Lipatov, L.N. [Hamburg Univ. (Germany). 2. Inst. fuer Theoretische Physik; Petersburg Nuclear Physics Institute, Gatchina, St. Petersburg (Russian Federation); Vacca, G.P. [INFN, Sezione di Bologna (Italy)
2012-10-02
We present results for the NLO kernel of the BKP equations for composite states of three reggeized gluons in the Odderon channel, both in QCD and in N=4 SYM. The NLO kernel consists of the NLO BFKL kernel in the color octet representation and the connected 3{yields}3 kernel, computed in the tree approximation.
Kernel maximum autocorrelation factor and minimum noise fraction transformations
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg
2010-01-01
in hyperspectral HyMap scanner data covering a small agricultural area, and 3) maize kernel inspection. In the cases shown, the kernel MAF/MNF transformation performs better than its linear counterpart as well as linear and kernel PCA. The leading kernel MAF/MNF variates seem to possess the ability to adapt...
2010-01-01
... 7 Agriculture 2 2010-01-01 2010-01-01 false Half-kernel. 51.1441 Section 51.1441 Agriculture... Standards for Grades of Shelled Pecans Definitions § 51.1441 Half-kernel. Half-kernel means one of the separated halves of an entire pecan kernel with not more than one-eighth of its original volume missing...
7 CFR 51.2296 - Three-fourths half kernel.
2010-01-01
... 7 Agriculture 2 2010-01-01 2010-01-01 false Three-fourths half kernel. 51.2296 Section 51.2296 Agriculture Regulations of the Department of Agriculture AGRICULTURAL MARKETING SERVICE (Standards...-fourths half kernel. Three-fourths half kernel means a portion of a half of a kernel which has more than...
7 CFR 981.401 - Adjusted kernel weight.
2010-01-01
... 7 Agriculture 8 2010-01-01 2010-01-01 false Adjusted kernel weight. 981.401 Section 981.401... Administrative Rules and Regulations § 981.401 Adjusted kernel weight. (a) Definition. Adjusted kernel weight... kernels in excess of five percent; less shells, if applicable; less processing loss of one percent for...
7 CFR 51.1403 - Kernel color classification.
2010-01-01
... 7 Agriculture 2 2010-01-01 2010-01-01 false Kernel color classification. 51.1403 Section 51.1403... STANDARDS) United States Standards for Grades of Pecans in the Shell 1 Kernel Color Classification § 51.1403 Kernel color classification. (a) The skin color of pecan kernels may be described in terms of the color...
The Linux kernel as flexible product-line architecture
M. de Jonge (Merijn)
2002-01-01
textabstractThe Linux kernel source tree is huge ($>$ 125 MB) and inflexible (because it is difficult to add new kernel components). We propose to make this architecture more flexible by assembling kernel source trees dynamically from individual kernel components. Users then, can select what
Detoxification of Jatropha curcas kernel cake by a novel Streptomyces fimicarius strain.
Wang, Xing-Hong; Ou, Lingcheng; Fu, Liang-Liang; Zheng, Shui; Lou, Ji-Dong; Gomes-Laranjo, José; Li, Jiao; Zhang, Changhe
2013-09-15
A huge amount of kernel cake, which contains a variety of toxins including phorbol esters (tumor promoters), is projected to be generated yearly in the near future by the Jatropha biodiesel industry. We showed that the kernel cake strongly inhibited plant seed germination and root growth and was highly toxic to carp fingerlings, even though phorbol esters were undetectable by HPLC. Therefore it must be detoxified before disposal to the environment. A mathematic model was established to estimate the general toxicity of the kernel cake by determining the survival time of carp fingerling. A new strain (Streptomyces fimicarius YUCM 310038) capable of degrading the total toxicity by more than 97% in a 9-day solid state fermentation was screened out from 578 strains including 198 known strains and 380 strains isolated from air and soil. The kernel cake fermented by YUCM 310038 was nontoxic to plants and carp fingerlings and significantly promoted tobacco plant growth, indicating its potential to transform the toxic kernel cake to bio-safe animal feed or organic fertilizer to remove the environmental concern and to reduce the cost of the Jatropha biodiesel industry. Microbial strain profile essential for the kernel cake detoxification was discussed. Copyright © 2013 Elsevier B.V. All rights reserved.
de Oliveira, R L; de Carvalho, G G P; Oliveira, R L; Tosto, M S L; Santos, E M; Ribeiro, R D X; Silva, T M; Correia, B R; de Rufino, L M A
2017-10-01
The objective of this study was to evaluate the effects of the inclusion of palm kernel (Elaeis guineensis) cake in diets for goats on feeding behaviors, rectal temperature, and cardiac and respiratory frequencies. Forty crossbred Boer male, non-castrated goats (ten animals per treatment), with an average age of 90 days and an initial body weight of 15.01 ± 1.76 kg, were used. The goats were fed Tifton 85 (Cynodon spp.) hay and palm kernel supplemented at the rates of 0, 7, 14, and 21% of dry matter (DM). The feeding behaviors (rumination, feeding, and idling times) were observed for three 24-h periods. DM and neutral detergent fiber (NDF) intake values were estimated as the difference between the total DM and NDF contents of the feed offered and the total DM and NDF contents of the orts. There was no effect of palm kernel cake inclusion in goat diets on DM intake (P > 0.05). However, palm kernel cake promoted a linear increase (P kernel cakes had no effects (P > 0.05) on the chewing, feeding, and rumination efficiency (DM and NDF) or on physiological variables. The use up to 21% palm kernel cake in the diet of crossbred Boer goats maintained the feeding behaviors and did not change the physiological parameters of goats; therefore, its use is recommended in the diet of these animals.
Digital signal processing with kernel methods
Rojo-Alvarez, José Luis; Muñoz-Marí, Jordi; Camps-Valls, Gustavo
2018-01-01
A realistic and comprehensive review of joint approaches to machine learning and signal processing algorithms, with application to communications, multimedia, and biomedical engineering systems Digital Signal Processing with Kernel Methods reviews the milestones in the mixing of classical digital signal processing models and advanced kernel machines statistical learning tools. It explains the fundamental concepts from both fields of machine learning and signal processing so that readers can quickly get up to speed in order to begin developing the concepts and application software in their own research. Digital Signal Processing with Kernel Methods provides a comprehensive overview of kernel methods in signal processing, without restriction to any application field. It also offers example applications and detailed benchmarking experiments with real and synthetic datasets throughout. Readers can find further worked examples with Matlab source code on a website developed by the authors. * Presents the necess...
Parsimonious Wavelet Kernel Extreme Learning Machine
Directory of Open Access Journals (Sweden)
Wang Qin
2015-11-01
Full Text Available In this study, a parsimonious scheme for wavelet kernel extreme learning machine (named PWKELM was introduced by combining wavelet theory and a parsimonious algorithm into kernel extreme learning machine (KELM. In the wavelet analysis, bases that were localized in time and frequency to represent various signals effectively were used. Wavelet kernel extreme learning machine (WELM maximized its capability to capture the essential features in “frequency-rich” signals. The proposed parsimonious algorithm also incorporated significant wavelet kernel functions via iteration in virtue of Householder matrix, thus producing a sparse solution that eased the computational burden and improved numerical stability. The experimental results achieved from the synthetic dataset and a gas furnace instance demonstrated that the proposed PWKELM is efficient and feasible in terms of improving generalization accuracy and real time performance.