A nonparametric empirical Bayes framework for large-scale multiple testing.
Martin, Ryan; Tokdar, Surya T
2012-07-01
We propose a flexible and identifiable version of the 2-groups model, motivated by hierarchical Bayes considerations, that features an empirical null and a semiparametric mixture model for the nonnull cases. We use a computationally efficient predictive recursion (PR) marginal likelihood procedure to estimate the model parameters, even the nonparametric mixing distribution. This leads to a nonparametric empirical Bayes testing procedure, which we call PRtest, based on thresholding the estimated local false discovery rates. Simulations and real data examples demonstrate that, compared to existing approaches, PRtest's careful handling of the nonnull density can give a much better fit in the tails of the mixture distribution which, in turn, can lead to more realistic conclusions.
Nonparametric Bayes Modeling of Multivariate Categorical Data.
Dunson, David B; Xing, Chuanhua
2012-01-01
Modeling of multivariate unordered categorical (nominal) data is a challenging problem, particularly in high dimensions and cases in which one wishes to avoid strong assumptions about the dependence structure. Commonly used approaches rely on the incorporation of latent Gaussian random variables or parametric latent class models. The goal of this article is to develop a nonparametric Bayes approach, which defines a prior with full support on the space of distributions for multiple unordered categorical variables. This support condition ensures that we are not restricting the dependence structure a priori. We show this can be accomplished through a Dirichlet process mixture of product multinomial distributions, which is also a convenient form for posterior computation. Methods for nonparametric testing of violations of independence are proposed, and the methods are applied to model positional dependence within transcription factor binding motifs.
Nonparametric Bayes Classification and Hypothesis Testing on Manifolds
Bhattacharya, Abhishek; Dunson, David
2012-01-01
Our first focus is prediction of a categorical response variable using features that lie on a general manifold. For example, the manifold may correspond to the surface of a hypersphere. We propose a general kernel mixture model for the joint distribution of the response and predictors, with the kernel expressed in product form and dependence induced through the unknown mixing measure. We provide simple sufficient conditions for large support and weak and strong posterior consistency in estimating both the joint distribution of the response and predictors and the conditional distribution of the response. Focusing on a Dirichlet process prior for the mixing measure, these conditions hold using von Mises-Fisher kernels when the manifold is the unit hypersphere. In this case, Bayesian methods are developed for efficient posterior computation using slice sampling. Next we develop Bayesian nonparametric methods for testing whether there is a difference in distributions between groups of observations on the manifold having unknown densities. We prove consistency of the Bayes factor and develop efficient computational methods for its calculation. The proposed classification and testing methods are evaluated using simulation examples and applied to spherical data applications. PMID:22754028
Empirical Bayes Approaches to Multivariate Fuzzy Partitions.
Woodbury, Max A.; Manton, Kenneth G.
1991-01-01
An empirical Bayes-maximum likelihood estimation procedure is presented for the application of fuzzy partition models in describing high dimensional discrete response data. The model describes individuals in terms of partial membership in multiple latent categories that represent bounded discrete spaces. (SLD)
The effect of loss functions on empirical Bayes reliability analysis
Directory of Open Access Journals (Sweden)
Camara Vincent A. R.
1998-01-01
Full Text Available The aim of the present study is to investigate the sensitivity of empirical Bayes estimates of the reliability function with respect to changing of the loss function. In addition to applying some of the basic analytical results on empirical Bayes reliability obtained with the use of the “popular” squared error loss function, we shall derive some expressions corresponding to empirical Bayes reliability estimates obtained with the Higgins–Tsokos, the Harris and our proposed logarithmic loss functions. The concept of efficiency, along with the notion of integrated mean square error, will be used as a criterion to numerically compare our results. It is shown that empirical Bayes reliability functions are in general sensitive to the choice of the loss function, and that the squared error loss does not always yield the best empirical Bayes reliability estimate.
The effect of loss functions on empirical Bayes reliability analysis
Directory of Open Access Journals (Sweden)
Vincent A. R. Camara
1999-01-01
Full Text Available The aim of the present study is to investigate the sensitivity of empirical Bayes estimates of the reliability function with respect to changing of the loss function. In addition to applying some of the basic analytical results on empirical Bayes reliability obtained with the use of the “popular” squared error loss function, we shall derive some expressions corresponding to empirical Bayes reliability estimates obtained with the Higgins–Tsokos, the Harris and our proposed logarithmic loss functions. The concept of efficiency, along with the notion of integrated mean square error, will be used as a criterion to numerically compare our results.
International Nuclear Information System (INIS)
Quigley, John; Walls, Lesley
2011-01-01
Mixing Bayes and Empirical Bayes inference provides reliability estimates for variant system designs by using relevant failure data - observed and anticipated - about engineering changes arising due to modification and innovation. A coherent inference framework is proposed to predict the realization of engineering concerns during product development so that informed decisions can be made about the system design and the analysis conducted to prove reliability. The proposed method involves combining subjective prior distributions for the number of engineering concerns with empirical priors for the non-parametric distribution of time to realize these concerns in such a way that we can cross-tabulate classes of concerns to failure events within time partitions at an appropriate level of granularity. To support efficient implementation, a computationally convenient hypergeometric approximation is developed for the counting distributions appropriate to our underlying stochastic model. The accuracy of our approximation over first-order alternatives is examined, and demonstrated, through an evaluation experiment. An industrial application illustrates model implementation and shows how estimates can be updated using information arising during development test and analysis.
An Empirical Bayes Approach to Mantel-Haenszel DIF Analysis.
Zwick, Rebecca; Thayer, Dorothy T.; Lewis, Charles
1999-01-01
Developed an empirical Bayes enhancement to Mantel-Haenszel (MH) analysis of differential item functioning (DIF) in which it is assumed that the MH statistics are normally distributed and that the prior distribution of underlying DIF parameters is also normal. (Author/SLD)
Using Loss Functions for DIF Detection: An Empirical Bayes Approach.
Zwick, Rebecca; Thayer, Dorothy; Lewis, Charles
2000-01-01
Studied a method for flagging differential item functioning (DIF) based on loss functions. Builds on earlier research that led to the development of an empirical Bayes enhancement to the Mantel-Haenszel DIF analysis. Tested the method through simulation and found its performance better than some commonly used DIF classification systems. (SLD)
Noma, Hisashi; Matsui, Shigeyuki
2013-05-20
The main purpose of microarray studies is screening of differentially expressed genes as candidates for further investigation. Because of limited resources in this stage, prioritizing genes are relevant statistical tasks in microarray studies. For effective gene selections, parametric empirical Bayes methods for ranking and selection of genes with largest effect sizes have been proposed (Noma et al., 2010; Biostatistics 11: 281-289). The hierarchical mixture model incorporates the differential and non-differential components and allows information borrowing across differential genes with separation from nuisance, non-differential genes. In this article, we develop empirical Bayes ranking methods via a semiparametric hierarchical mixture model. A nonparametric prior distribution, rather than parametric prior distributions, for effect sizes is specified and estimated using the "smoothing by roughening" approach of Laird and Louis (1991; Computational statistics and data analysis 12: 27-37). We present applications to childhood and infant leukemia clinical studies with microarrays for exploring genes related to prognosis or disease progression. Copyright © 2012 John Wiley & Sons, Ltd.
Bhattacharya, Abhishek; Dunson, David B
2012-08-01
This article considers a broad class of kernel mixture density models on compact metric spaces and manifolds. Following a Bayesian approach with a nonparametric prior on the location mixing distribution, sufficient conditions are obtained on the kernel, prior and the underlying space for strong posterior consistency at any continuous density. The prior is also allowed to depend on the sample size n and sufficient conditions are obtained for weak and strong consistency. These conditions are verified on compact Euclidean spaces using multivariate Gaussian kernels, on the hypersphere using a von Mises-Fisher kernel and on the planar shape space using complex Watson kernels.
International Nuclear Information System (INIS)
Quigley, John; Hardman, Gavin; Bedford, Tim; Walls, Lesley
2011-01-01
Empirical Bayes provides one approach to estimating the frequency of rare events as a weighted average of the frequencies of an event and a pool of events. The pool will draw upon, for example, events with similar precursors. The higher the degree of homogeneity of the pool, then the Empirical Bayes estimator will be more accurate. We propose and evaluate a new method using homogenisation factors under the assumption that events are generated from a Homogeneous Poisson Process. The homogenisation factors are scaling constants, which can be elicited through structured expert judgement and used to align the frequencies of different events, hence homogenising the pool. The estimation error relative to the homogeneity of the pool is examined theoretically indicating that reduced error is associated with larger pool homogeneity. The effects of misspecified expert assessments of the homogenisation factors are examined theoretically and through simulation experiments. Our results show that the proposed Empirical Bayes method using homogenisation factors is robust under different degrees of misspecification.
Bayes Empirical Bayes Inference of Amino Acid Sites Under Positive Selection
DEFF Research Database (Denmark)
Yang, Ziheng; Wong, Wendy Shuk Wan; Nielsen, Rasmus
2005-01-01
, with > 1 indicating positive selection. Statistical distributions are used to model the variation in among sites, allowing a subset of sites to have > 1 while the rest of the sequence may be under purifying selection with ... probabilities that a site comes from the site class with > 1. Current implementations, however, use the naive EB (NEB) approach and fail to account for sampling errors in maximum likelihood estimates of model parameters, such as the proportions and ratios for the site classes. In small data sets lacking...... information, this approach may lead to unreliable posterior probability calculations. In this paper, we develop a Bayes empirical Bayes (BEB) approach to the problem, which assigns a prior to the model parameters and integrates over their uncertainties. We compare the new and old methods on real and simulated...
Empirical Bayes conditional independence graphs for regulatory network recovery
Mahdi, Rami; Madduri, Abishek S.; Wang, Guoqing; Strulovici-Barel, Yael; Salit, Jacqueline; Hackett, Neil R.; Crystal, Ronald G.; Mezey, Jason G.
2012-01-01
Motivation: Computational inference methods that make use of graphical models to extract regulatory networks from gene expression data can have difficulty reconstructing dense regions of a network, a consequence of both computational complexity and unreliable parameter estimation when sample size is small. As a result, identification of hub genes is of special difficulty for these methods. Methods: We present a new algorithm, Empirical Light Mutual Min (ELMM), for large network reconstruction that has properties well suited for recovery of graphs with high-degree nodes. ELMM reconstructs the undirected graph of a regulatory network using empirical Bayes conditional independence testing with a heuristic relaxation of independence constraints in dense areas of the graph. This relaxation allows only one gene of a pair with a putative relation to be aware of the network connection, an approach that is aimed at easing multiple testing problems associated with recovering densely connected structures. Results: Using in silico data, we show that ELMM has better performance than commonly used network inference algorithms including GeneNet, ARACNE, FOCI, GENIE3 and GLASSO. We also apply ELMM to reconstruct a network among 5492 genes expressed in human lung airway epithelium of healthy non-smokers, healthy smokers and individuals with chronic obstructive pulmonary disease assayed using microarrays. The analysis identifies dense sub-networks that are consistent with known regulatory relationships in the lung airway and also suggests novel hub regulatory relationships among a number of genes that play roles in oxidative stress and secretion. Availability and implementation: Software for running ELMM is made available at http://mezeylab.cb.bscb.cornell.edu/Software.aspx. Contact: ramimahdi@yahoo.com or jgm45@cornell.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22685074
Flexible Modeling of Epidemics with an Empirical Bayes Framework
Brooks, Logan C.; Farrow, David C.; Hyun, Sangwon; Tibshirani, Ryan J.; Rosenfeld, Roni
2015-01-01
Seasonal influenza epidemics cause consistent, considerable, widespread loss annually in terms of economic burden, morbidity, and mortality. With access to accurate and reliable forecasts of a current or upcoming influenza epidemic’s behavior, policy makers can design and implement more effective countermeasures. This past year, the Centers for Disease Control and Prevention hosted the “Predict the Influenza Season Challenge”, with the task of predicting key epidemiological measures for the 2013–2014 U.S. influenza season with the help of digital surveillance data. We developed a framework for in-season forecasts of epidemics using a semiparametric Empirical Bayes framework, and applied it to predict the weekly percentage of outpatient doctors visits for influenza-like illness, and the season onset, duration, peak time, and peak height, with and without using Google Flu Trends data. Previous work on epidemic modeling has focused on developing mechanistic models of disease behavior and applying time series tools to explain historical data. However, tailoring these models to certain types of surveillance data can be challenging, and overly complex models with many parameters can compromise forecasting ability. Our approach instead produces possibilities for the epidemic curve of the season of interest using modified versions of data from previous seasons, allowing for reasonable variations in the timing, pace, and intensity of the seasonal epidemics, as well as noise in observations. Since the framework does not make strict domain-specific assumptions, it can easily be applied to some other diseases with seasonal epidemics. This method produces a complete posterior distribution over epidemic curves, rather than, for example, solely point predictions of forecasting targets. We report prospective influenza-like-illness forecasts made for the 2013–2014 U.S. influenza season, and compare the framework’s cross-validated prediction error on historical data to
EbayesThresh: R Programs for Empirical Bayes Thresholding
Directory of Open Access Journals (Sweden)
Iain Johnstone
2005-04-01
Full Text Available Suppose that a sequence of unknown parameters is observed sub ject to independent Gaussian noise. The EbayesThresh package in the S language implements a class of Empirical Bayes thresholding methods that can take advantage of possible sparsity in the sequence, to improve the quality of estimation. The prior for each parameter in the sequence is a mixture of an atom of probability at zero and a heavy-tailed density. Within the package, this can be either a Laplace (double exponential density or else a mixture of normal distributions with tail behavior similar to the Cauchy distribution. The mixing weight, or sparsity parameter, is chosen automatically by marginal maximum likelihood. If estimation is carried out using the posterior median, this is a random thresholding procedure; the estimation can also be carried out using other thresholding rules with the same threshold, and the package provides the posterior mean, and hard and soft thresholding, as additional options. This paper reviews the method, and gives details (far beyond those previously published of the calculations needed for implementing the procedures. It explains and motivates both the general methodology, and the use of the EbayesThresh package, through simulated and real data examples. When estimating the wavelet transform of an unknown function, it is appropriate to apply the method level by level to the transform of the observed data. The package can carry out these calculations for wavelet transforms obtained using various packages in R and S-PLUS. Details, including a motivating example, are presented, and the application of the method to image estimation is also explored. The final topic considered is the estimation of a single sequence that may become progressively sparser along the sequence. An iterated least squares isotone regression method allows for the choice of a threshold that depends monotonically on the order in which the observations are made. An alternative
Maydeu-Olivares, Albert
2005-01-01
Chernyshenko, Stark, Chan, Drasgow, and Williams (2001) investigated the fit of Samejima's logistic graded model and Levine's non-parametric MFS model to the scales of two personality questionnaires and found that the graded model did not fit well. We attribute the poor fit of the graded model to small amounts of multidimensionality present in…
Estimating rate of occurrence of rare events with empirical bayes: A railway application
International Nuclear Information System (INIS)
Quigley, John; Bedford, Tim; Walls, Lesley
2007-01-01
Classical approaches to estimating the rate of occurrence of events perform poorly when data are few. Maximum likelihood estimators result in overly optimistic point estimates of zero for situations where there have been no events. Alternative empirical-based approaches have been proposed based on median estimators or non-informative prior distributions. While these alternatives offer an improvement over point estimates of zero, they can be overly conservative. Empirical Bayes procedures offer an unbiased approach through pooling data across different hazards to support stronger statistical inference. This paper considers the application of Empirical Bayes to high consequence low-frequency events, where estimates are required for risk mitigation decision support such as as low as reasonably possible. A summary of empirical Bayes methods is given and the choices of estimation procedures to obtain interval estimates are discussed. The approaches illustrated within the case study are based on the estimation of the rate of occurrence of train derailments within the UK. The usefulness of empirical Bayes within this context is discussed
Empirical Bayes Credibility Models for Economic Catastrophic Losses by Regions
Directory of Open Access Journals (Sweden)
Jindrová Pavla
2017-01-01
Full Text Available Catastrophic events affect various regions of the world with increasing frequency and intensity. The number of catastrophic events and the amount of economic losses is varying in different world regions. Part of these losses is covered by insurance. Catastrophe events in last years are associated with increases in premiums for some lines of business. The article focus on estimating the amount of net premiums that would be needed to cover the total or insured catastrophic losses in different world regions using Bühlmann and Bühlmann-Straub empirical credibility models based on data from Sigma Swiss Re 2010-2016. The empirical credibility models have been developed to estimate insurance premiums for short term insurance contracts using two ingredients: past data from the risk itself and collateral data from other sources considered to be relevant. In this article we deal with application of these models based on the real data about number of catastrophic events and about the total economic and insured catastrophe losses in seven regions of the world in time period 2009-2015. Estimated credible premiums by world regions provide information how much money in the monitored regions will be need to cover total and insured catastrophic losses in next year.
Ren, Wen-Long; Wen, Yang-Jun; Dunwell, Jim M; Zhang, Yuan-Ming
2018-03-01
Although nonparametric methods in genome-wide association studies (GWAS) are robust in quantitative trait nucleotide (QTN) detection, the absence of polygenic background control in single-marker association in genome-wide scans results in a high false positive rate. To overcome this issue, we proposed an integrated nonparametric method for multi-locus GWAS. First, a new model transformation was used to whiten the covariance matrix of polygenic matrix K and environmental noise. Using the transferred model, Kruskal-Wallis test along with least angle regression was then used to select all the markers that were potentially associated with the trait. Finally, all the selected markers were placed into multi-locus model, these effects were estimated by empirical Bayes, and all the nonzero effects were further identified by a likelihood ratio test for true QTN detection. This method, named pKWmEB, was validated by a series of Monte Carlo simulation studies. As a result, pKWmEB effectively controlled false positive rate, although a less stringent significance criterion was adopted. More importantly, pKWmEB retained the high power of Kruskal-Wallis test, and provided QTN effect estimates. To further validate pKWmEB, we re-analyzed four flowering time related traits in Arabidopsis thaliana, and detected some previously reported genes that were not identified by the other methods.
Nonparametric Transfer Function Models
Liu, Jun M.; Chen, Rong; Yao, Qiwei
2009-01-01
In this paper a class of nonparametric transfer function models is proposed to model nonlinear relationships between ‘input’ and ‘output’ time series. The transfer function is smooth with unknown functional forms, and the noise is assumed to be a stationary autoregressive-moving average (ARMA) process. The nonparametric transfer function is estimated jointly with the ARMA parameters. By modeling the correlation in the noise, the transfer function can be estimated more efficiently. The parsimonious ARMA structure improves the estimation efficiency in finite samples. The asymptotic properties of the estimators are investigated. The finite-sample properties are illustrated through simulations and one empirical example. PMID:20628584
de-Graft Acquah, Henry
2014-01-01
This paper highlights the sensitivity of technical efficiency estimates to estimation approaches using empirical data. Firm specific technical efficiency and mean technical efficiency are estimated using the non parametric Data Envelope Analysis (DEA) and the parametric Corrected Ordinary Least Squares (COLS) and Stochastic Frontier Analysis (SFA) approaches. Mean technical efficiency is found to be sensitive to the choice of estimation technique. Analysis of variance and Tukeyâ€™s test sugge...
Bayesian nonparametric hierarchical modeling.
Dunson, David B
2009-04-01
In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.
Mikulich-Gilbertson, Susan K; Wagner, Brandie D; Grunwald, Gary K; Riggs, Paula D; Zerbe, Gary O
2018-01-01
Medical research is often designed to investigate changes in a collection of response variables that are measured repeatedly on the same subjects. The multivariate generalized linear mixed model (MGLMM) can be used to evaluate random coefficient associations (e.g. simple correlations, partial regression coefficients) among outcomes that may be non-normal and differently distributed by specifying a multivariate normal distribution for their random effects and then evaluating the latent relationship between them. Empirical Bayes predictors are readily available for each subject from any mixed model and are observable and hence, plotable. Here, we evaluate whether second-stage association analyses of empirical Bayes predictors from a MGLMM, provide a good approximation and visual representation of these latent association analyses using medical examples and simulations. Additionally, we compare these results with association analyses of empirical Bayes predictors generated from separate mixed models for each outcome, a procedure that could circumvent computational problems that arise when the dimension of the joint covariance matrix of random effects is large and prohibits estimation of latent associations. As has been shown in other analytic contexts, the p-values for all second-stage coefficients that were determined by naively assuming normality of empirical Bayes predictors provide a good approximation to p-values determined via permutation analysis. Analyzing outcomes that are interrelated with separate models in the first stage and then associating the resulting empirical Bayes predictors in a second stage results in different mean and covariance parameter estimates from the maximum likelihood estimates generated by a MGLMM. The potential for erroneous inference from using results from these separate models increases as the magnitude of the association among the outcomes increases. Thus if computable, scatterplots of the conditionally independent empirical Bayes
DEFF Research Database (Denmark)
Madsen, Henrik; Rosbjerg, Dan
1997-01-01
parameters is inferred from regional data using generalized least squares (GLS) regression. Two different Bayesian T-year event estimators are introduced: a linear estimator that requires only some moments of the prior distributions to be specified and a parametric estimator that is based on specified......A regional estimation procedure that combines the index-flood concept with an empirical Bayes method for inferring regional information is introduced. The model is based on the partial duration series approach with generalized Pareto (GP) distributed exceedances. The prior information of the model...
Coburn, T.C.; Freeman, P.A.; Attanasi, E.D.
2012-01-01
The primary objectives of this research were to (1) investigate empirical methods for establishing regional trends in unconventional gas resources as exhibited by historical production data and (2) determine whether or not incorporating additional knowledge of a regional trend in a suite of previously established local nonparametric resource prediction algorithms influences assessment results. Three different trend detection methods were applied to publicly available production data (well EUR aggregated to 80-acre cells) from the Devonian Antrim Shale gas play in the Michigan Basin. This effort led to the identification of a southeast-northwest trend in cell EUR values across the play that, in a very general sense, conforms to the primary fracture and structural orientations of the province. However, including this trend in the resource prediction algorithms did not lead to improved results. Further analysis indicated the existence of clustering among cell EUR values that likely dampens the contribution of the regional trend. The reason for the clustering, a somewhat unexpected result, is not completely understood, although the geological literature provides some possible explanations. With appropriate data, a better understanding of this clustering phenomenon may lead to important information about the factors and their interactions that control Antrim Shale gas production, which may, in turn, help establish a more general protocol for better estimating resources in this and other shale gas plays. ?? 2011 International Association for Mathematical Geology (outside the USA).
Spencer, Amy V; Cox, Angela; Lin, Wei-Yu; Easton, Douglas F; Michailidou, Kyriaki; Walters, Kevin
2016-04-01
There is a large amount of functional genetic data available, which can be used to inform fine-mapping association studies (in diseases with well-characterised disease pathways). Single nucleotide polymorphism (SNP) prioritization via Bayes factors is attractive because prior information can inform the effect size or the prior probability of causal association. This approach requires the specification of the effect size. If the information needed to estimate a priori the probability density for the effect sizes for causal SNPs in a genomic region isn't consistent or isn't available, then specifying a prior variance for the effect sizes is challenging. We propose both an empirical method to estimate this prior variance, and a coherent approach to using SNP-level functional data, to inform the prior probability of causal association. Through simulation we show that when ranking SNPs by our empirical Bayes factor in a fine-mapping study, the causal SNP rank is generally as high or higher than the rank using Bayes factors with other plausible values of the prior variance. Importantly, we also show that assigning SNP-specific prior probabilities of association based on expert prior functional knowledge of the disease mechanism can lead to improved causal SNPs ranks compared to ranking with identical prior probabilities of association. We demonstrate the use of our methods by applying the methods to the fine mapping of the CASP8 region of chromosome 2 using genotype data from the Collaborative Oncological Gene-Environment Study (COGS) Consortium. The data we analysed included approximately 46,000 breast cancer case and 43,000 healthy control samples. © 2016 The Authors. *Genetic Epidemiology published by Wiley Periodicals, Inc.
Ator, Scott W.; Brakebill, John W.; Blomquist, Joel D.
2011-01-01
Spatially Referenced Regression on Watershed Attributes (SPARROW) was used to provide empirical estimates of the sources, fate, and transport of total nitrogen (TN) and total phosphorus (TP) in the Chesapeake Bay watershed, and the mean annual TN and TP flux to the bay and in each of 80,579 nontidal tributary stream reaches. Restoration efforts in recent decades have been insufficient to meet established standards for water quality and ecological conditions in Chesapeake Bay. The bay watershed includes 166,000 square kilometers of mixed land uses, multiple nutrient sources, and variable hydrogeologic, soil, and weather conditions, and bay restoration is complicated by the multitude of nutrient sources and complex interacting factors affecting the occurrence, fate, and transport of nitrogen and phosphorus from source areas to streams and the estuary. Effective and efficient nutrient management at the regional scale in support of Chesapeake Bay restoration requires a comprehensive understanding of the sources, fate, and transport of nitrogen and phosphorus in the watershed, which is only available through regional models. The current models, Chesapeake Bay nutrient SPARROW models, version 4 (CBTN_v4 and CBTP_v4), were constructed at a finer spatial resolution than previous SPARROW models for the Chesapeake Bay watershed (versions 1, 2, and 3), and include an updated timeframe and modified sources and other explantory terms.
β-empirical Bayes inference and model diagnosis of microarray data
Directory of Open Access Journals (Sweden)
Hossain Mollah Mohammad
2012-06-01
Full Text Available Abstract Background Microarray data enables the high-throughput survey of mRNA expression profiles at the genomic level; however, the data presents a challenging statistical problem because of the large number of transcripts with small sample sizes that are obtained. To reduce the dimensionality, various Bayesian or empirical Bayes hierarchical models have been developed. However, because of the complexity of the microarray data, no model can explain the data fully. It is generally difficult to scrutinize the irregular patterns of expression that are not expected by the usual statistical gene by gene models. Results As an extension of empirical Bayes (EB procedures, we have developed the β-empirical Bayes (β-EB approach based on a β-likelihood measure which can be regarded as an ’evidence-based’ weighted (quasi- likelihood inference. The weight of a transcript t is described as a power function of its likelihood, fβ(yt|θ. Genes with low likelihoods have unexpected expression patterns and low weights. By assigning low weights to outliers, the inference becomes robust. The value of β, which controls the balance between the robustness and efficiency, is selected by maximizing the predictive β0-likelihood by cross-validation. The proposed β-EB approach identified six significant (p−5 contaminated transcripts as differentially expressed (DE in normal/tumor tissues from the head and neck of cancer patients. These six genes were all confirmed to be related to cancer; they were not identified as DE genes by the classical EB approach. When applied to the eQTL analysis of Arabidopsis thaliana, the proposed β-EB approach identified some potential master regulators that were missed by the EB approach. Conclusions The simulation data and real gene expression data showed that the proposed β-EB method was robust against outliers. The distribution of the weights was used to scrutinize the irregular patterns of expression and diagnose the model
International Nuclear Information System (INIS)
Vesely, W.E.; Uryas'ev, S.P.; Samanta, P.K.
1993-01-01
Emergency Diesel Generators (EDGs) provide backup power to nuclear power plants in case of failure of AC buses. The reliability of EDGs is important to assure response to loss-of-offsite power accident scenarios, a dominant contributor to the plant risk. The reliable performance of EDGs has been of concern both for regulators and plant operators. In this paper the authors present an approach and results from the analysis of failure data from a large population of EDGs. They used empirical Bayes approach to obtain both the population distribution and the individual failure probabilities from EDGs failure to start and load-run data over 4 years for 194 EDGs at 63 plant units
Directory of Open Access Journals (Sweden)
Stochl Jan
2012-06-01
Full Text Available Abstract Background Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Methods Scalability of data from 1 a cross-sectional health survey (the Scottish Health Education Population Survey and 2 a general population birth cohort study (the National Child Development Study illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12 item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. Results and conclusions After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items we show that all items from the 12-item General Health Questionnaire (GHQ-12 – when binary scored – were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech’s “well-being” and “distress” clinical scales. An illustration of ordinal item analysis
Stochl, Jan; Jones, Peter B; Croudace, Tim J
2012-06-11
Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Scalability of data from 1) a cross-sectional health survey (the Scottish Health Education Population Survey) and 2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12 item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items) we show that all items from the 12-item General Health Questionnaire (GHQ-12)--when binary scored--were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech's "well-being" and "distress" clinical scales). An illustration of ordinal item analysis confirmed that all 14 positively worded items of the Warwick-Edinburgh Mental
Naznin, Farhana; Currie, Graham; Sarvi, Majid; Logan, David
2016-01-01
Streetcars/tram systems are growing worldwide, and many are given priority to increase speed and reliability performance in mixed traffic conditions. Research related to the road safety impact of tram priority is limited. This study explores the road safety impacts of tram priority measures including lane and intersection/signal priority measures. A before-after crash study was conducted using the empirical Bayes (EB) method to provide more accurate crash impact estimates by accounting for wider crash trends and regression to the mean effects. Before-after crash data for 29 intersections with tram signal priority and 23 arterials with tram lane priority in Melbourne, Australia, were analyzed to evaluate the road safety impact of tram priority. The EB before-after analysis results indicated a statistically significant adjusted crash reduction rate of 16.4% after implementation of tram priority measures. Signal priority measures were found to reduce crashes by 13.9% and lane priority by 19.4%. A disaggregate level simple before-after analysis indicated reductions in total and serious crashes as well as vehicle-, pedestrian-, and motorcycle-involved crashes. In addition, reductions in on-path crashes, pedestrian-involved crashes, and collisions among vehicles moving in the same and opposite directions and all other specific crash types were found after tram priority implementation. Results suggest that streetcar/tram priority measures result in safety benefits for all road users, including vehicles, pedestrians, and cyclists. Policy implications and areas for future research are discussed.
Developing a Clustering-Based Empirical Bayes Analysis Method for Hotspot Identification
Directory of Open Access Journals (Sweden)
Yajie Zou
2017-01-01
Full Text Available Hotspot identification (HSID is a critical part of network-wide safety evaluations. Typical methods for ranking sites are often rooted in using the Empirical Bayes (EB method to estimate safety from both observed crash records and predicted crash frequency based on similar sites. The performance of the EB method is highly related to the selection of a reference group of sites (i.e., roadway segments or intersections similar to the target site from which safety performance functions (SPF used to predict crash frequency will be developed. As crash data often contain underlying heterogeneity that, in essence, can make them appear to be generated from distinct subpopulations, methods are needed to select similar sites in a principled manner. To overcome this possible heterogeneity problem, EB-based HSID methods that use common clustering methodologies (e.g., mixture models, K-means, and hierarchical clustering to select “similar” sites for building SPFs are developed. Performance of the clustering-based EB methods is then compared using real crash data. Here, HSID results, when computed on Texas undivided rural highway cash data, suggest that all three clustering-based EB analysis methods are preferred over the conventional statistical methods. Thus, properly classifying the road segments for heterogeneous crash data can further improve HSID accuracy.
Zhang, Kui; Wiener, Howard; Beasley, Mark; George, Varghese; Amos, Christopher I; Allison, David B
2006-08-01
Individual genome scans for quantitative trait loci (QTL) mapping often suffer from low statistical power and imprecise estimates of QTL location and effect. This lack of precision yields large confidence intervals for QTL location, which are problematic for subsequent fine mapping and positional cloning. In prioritizing areas for follow-up after an initial genome scan and in evaluating the credibility of apparent linkage signals, investigators typically examine the results of other genome scans of the same phenotype and informally update their beliefs about which linkage signals in their scan most merit confidence and follow-up via a subjective-intuitive integration approach. A method that acknowledges the wisdom of this general paradigm but formally borrows information from other scans to increase confidence in objectivity would be a benefit. We developed an empirical Bayes analytic method to integrate information from multiple genome scans. The linkage statistic obtained from a single genome scan study is updated by incorporating statistics from other genome scans as prior information. This technique does not require that all studies have an identical marker map or a common estimated QTL effect. The updated linkage statistic can then be used for the estimation of QTL location and effect. We evaluate the performance of our method by using extensive simulations based on actual marker spacing and allele frequencies from available data. Results indicate that the empirical Bayes method can account for between-study heterogeneity, estimate the QTL location and effect more precisely, and provide narrower confidence intervals than results from any single individual study. We also compared the empirical Bayes method with a method originally developed for meta-analysis (a closely related but distinct purpose). In the face of marked heterogeneity among studies, the empirical Bayes method outperforms the comparator.
An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies
DEFF Research Database (Denmark)
Thompson, Wesley K.; Wang, Yunpeng; Schork, Andrew J.
2015-01-01
-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via...... analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn’s disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While...... minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local...
An Empirical Bayes before-after evaluation of road safety effects of a new motorway in Norway.
Elvik, Rune; Ulstein, Heidi; Wifstad, Kristina; Syrstad, Ragnhild S; Seeberg, Aase R; Gulbrandsen, Magnus U; Welde, Morten
2017-11-01
This paper presents an Empirical Bayes before-after evaluation of the road safety effects of a new motorway (freeway) in Østfold county, Norway. The before-period was 1996-2002. The after-period was 2009-2015. The road was rebuilt from an undivided two-lane road into a divided four-lane road. The number of killed or seriously injured road users was reduced by 75 percent, controlling for (downward) long-term trends and regression-to-the-mean (statistically significant at the 5 percent level; recorded numbers 71 before, 11 after). There were small changes in the number of injury accidents (185 before, 123 after; net effect -3%) and the number of slightly injured road users (403 before 279 after; net effect +5%). Motorways appear to mainly reduce injury severity, not the number of accidents. The paper discusses challenges in implementing the Empirical Bayes design when less than ideal data are available. Copyright © 2017 Elsevier Ltd. All rights reserved.
Nonparametric statistical inference
Gibbons, Jean Dickinson
2010-01-01
Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference.-Eugenia Stoimenova, Journal of Applied Statistics, June 2012… one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students.-Biometrics, 67, September 2011This excellently presente
Anderson, Clare
2011-01-01
This article explores the lives of two Andamanese women, both of whom the British called “Tospy.” The first part of the article takes an indigenous and gendered perspective on early British colonization of the Andamans in the 1860s, and through the experiences of a woman called Topsy stresses the sexual violence that underpinned colonial settlement as well as the British reliance on women as cultural interlocutors. Second, the article discusses colonial naming practices, and the employment of Andamanese women and men as nursemaids and household servants during the 1890s–1910s. Using an extraordinary murder case in which a woman known as Topsy-ayah was a central witness, it argues that both reveal something of the enduring associations and legacies of slavery, as well as the cultural influence of the Atlantic in the Bay of Bengal. In sum, these women's lives present a kaleidoscope view of colonization, gender, networks of Empire, labor, and domesticity in the Bay of Bengal.
An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies.
Directory of Open Access Journals (Sweden)
Wesley K Thompson
2015-12-01
Full Text Available Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome-wide association study (GWAS test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local false discovery rate, and power for discovery of a specified proportion of phenotypic variance explained from additive effects of loci surpassing a given significance threshold. We also examine the crucial issue of the impact of linkage disequilibrium (LD on effect sizes and parameter estimates, both analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn's disease (CD and the other for schizophrenia (SZ. A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While capturing the general behavior of the data, this mixture model underestimates the tails of the CD effect size distribution. We discuss the
An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies.
Thompson, Wesley K; Wang, Yunpeng; Schork, Andrew J; Witoelar, Aree; Zuber, Verena; Xu, Shujing; Werge, Thomas; Holland, Dominic; Andreassen, Ole A; Dale, Anders M
2015-12-01
Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local false discovery rate, and power for discovery of a specified proportion of phenotypic variance explained from additive effects of loci surpassing a given significance threshold. We also examine the crucial issue of the impact of linkage disequilibrium (LD) on effect sizes and parameter estimates, both analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn's disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While capturing the general behavior of the data, this mixture model underestimates the tails of the CD effect size distribution. We discuss the implications of
Nonparametric identification of copula structures
Li, Bo
2013-06-01
We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric and based on the asymptotic distribution of the empirical copula process.We perform simulation experiments to evaluate our test and conclude that our method is reliable and powerful for assessing common assumptions on the structure of copulas, particularly when the sample size is moderately large. We illustrate our testing approach on two datasets. © 2013 American Statistical Association.
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
On Cooper's Nonparametric Test.
Schmeidler, James
1978-01-01
The basic assumption of Cooper's nonparametric test for trend (EJ 125 069) is questioned. It is contended that the proper assumption alters the distribution of the statistic and reduces its usefulness. (JKS)
Mejia, Amanda F; Nebel, Mary Beth; Barber, Anita D; Choe, Ann S; Pekar, James J; Caffo, Brian S; Lindquist, Martin A
2018-05-15
Reliability of subject-level resting-state functional connectivity (FC) is determined in part by the statistical techniques employed in its estimation. Methods that pool information across subjects to inform estimation of subject-level effects (e.g., Bayesian approaches) have been shown to enhance reliability of subject-level FC. However, fully Bayesian approaches are computationally demanding, while empirical Bayesian approaches typically rely on using repeated measures to estimate the variance components in the model. Here, we avoid the need for repeated measures by proposing a novel measurement error model for FC describing the different sources of variance and error, which we use to perform empirical Bayes shrinkage of subject-level FC towards the group average. In addition, since the traditional intra-class correlation coefficient (ICC) is inappropriate for biased estimates, we propose a new reliability measure denoted the mean squared error intra-class correlation coefficient (ICC MSE ) to properly assess the reliability of the resulting (biased) estimates. We apply the proposed techniques to test-retest resting-state fMRI data on 461 subjects from the Human Connectome Project to estimate connectivity between 100 regions identified through independent components analysis (ICA). We consider both correlation and partial correlation as the measure of FC and assess the benefit of shrinkage for each measure, as well as the effects of scan duration. We find that shrinkage estimates of subject-level FC exhibit substantially greater reliability than traditional estimates across various scan durations, even for the most reliable connections and regardless of connectivity measure. Additionally, we find partial correlation reliability to be highly sensitive to the choice of penalty term, and to be generally worse than that of full correlations except for certain connections and a narrow range of penalty values. This suggests that the penalty needs to be chosen carefully
Hatfield, L.A.; Gutreuter, S.; Boogaard, M.A.; Carlin, B.P.
2011-01-01
Estimation of extreme quantal-response statistics, such as the concentration required to kill 99.9% of test subjects (LC99.9), remains a challenge in the presence of multiple covariates and complex study designs. Accurate and precise estimates of the LC99.9 for mixtures of toxicants are critical to ongoing control of a parasitic invasive species, the sea lamprey, in the Laurentian Great Lakes of North America. The toxicity of those chemicals is affected by local and temporal variations in water chemistry, which must be incorporated into the modeling. We develop multilevel empirical Bayes models for data from multiple laboratory studies. Our approach yields more accurate and precise estimation of the LC99.9 compared to alternative models considered. This study demonstrates that properly incorporating hierarchical structure in laboratory data yields better estimates of LC99.9 stream treatment values that are critical to larvae control in the field. In addition, out-of-sample prediction of the results of in situ tests reveals the presence of a latent seasonal effect not manifest in the laboratory studies, suggesting avenues for future study and illustrating the importance of dual consideration of both experimental and observational data. ?? 2011, The International Biometric Society.
Directory of Open Access Journals (Sweden)
Simone Vincenzi
2014-09-01
Full Text Available The differences in demographic and life-history processes between organisms living in the same population have important consequences for ecological and evolutionary dynamics. Modern statistical and computational methods allow the investigation of individual and shared (among homogeneous groups determinants of the observed variation in growth. We use an Empirical Bayes approach to estimate individual and shared variation in somatic growth using a von Bertalanffy growth model with random effects. To illustrate the power and generality of the method, we consider two populations of marble trout Salmo marmoratus living in Slovenian streams, where individually tagged fish have been sampled for more than 15 years. We use year-of-birth cohort, population density during the first year of life, and individual random effects as potential predictors of the von Bertalanffy growth function's parameters k (rate of growth and L∞ (asymptotic size. Our results showed that size ranks were largely maintained throughout marble trout lifetime in both populations. According to the Akaike Information Criterion (AIC, the best models showed different growth patterns for year-of-birth cohorts as well as the existence of substantial individual variation in growth trajectories after accounting for the cohort effect. For both populations, models including density during the first year of life showed that growth tended to decrease with increasing population density early in life. Model validation showed that predictions of individual growth trajectories using the random-effects model were more accurate than predictions based on mean size-at-age of fish.
Vincenzi, Simone; Mangel, Marc; Crivelli, Alain J; Munch, Stephan; Skaug, Hans J
2014-09-01
The differences in demographic and life-history processes between organisms living in the same population have important consequences for ecological and evolutionary dynamics. Modern statistical and computational methods allow the investigation of individual and shared (among homogeneous groups) determinants of the observed variation in growth. We use an Empirical Bayes approach to estimate individual and shared variation in somatic growth using a von Bertalanffy growth model with random effects. To illustrate the power and generality of the method, we consider two populations of marble trout Salmo marmoratus living in Slovenian streams, where individually tagged fish have been sampled for more than 15 years. We use year-of-birth cohort, population density during the first year of life, and individual random effects as potential predictors of the von Bertalanffy growth function's parameters k (rate of growth) and L∞ (asymptotic size). Our results showed that size ranks were largely maintained throughout marble trout lifetime in both populations. According to the Akaike Information Criterion (AIC), the best models showed different growth patterns for year-of-birth cohorts as well as the existence of substantial individual variation in growth trajectories after accounting for the cohort effect. For both populations, models including density during the first year of life showed that growth tended to decrease with increasing population density early in life. Model validation showed that predictions of individual growth trajectories using the random-effects model were more accurate than predictions based on mean size-at-age of fish.
Directory of Open Access Journals (Sweden)
Jo Nishino
2018-04-01
Full Text Available Genome-wide association studies (GWAS suggest that the genetic architecture of complex diseases consists of unexpectedly numerous variants with small effect sizes. However, the polygenic architectures of many diseases have not been well characterized due to lack of simple and fast methods for unbiased estimation of the underlying proportion of disease-associated variants and their effect-size distribution. Applying empirical Bayes estimation of semi-parametric hierarchical mixture models to GWAS summary statistics, we confirmed that schizophrenia was extremely polygenic [~40% of independent genome-wide SNPs are risk variants, most within odds ratio (OR = 1.03], whereas rheumatoid arthritis was less polygenic (~4 to 8% risk variants, significant portion reaching OR = 1.05 to 1.1. For rheumatoid arthritis, stratified estimations revealed that expression quantitative loci in blood explained large genetic variance, and low- and high-frequency derived alleles were prone to be risk and protective, respectively, suggesting a predominance of deleterious-risk and advantageous-protective mutations. Despite genetic correlation, effect-size distributions for schizophrenia and bipolar disorder differed across allele frequency. These analyses distinguished disease polygenic architectures and provided clues for etiological differences in complex diseases.
Dickhaus, Thorsten
2018-01-01
This textbook provides a self-contained presentation of the main concepts and methods of nonparametric statistical testing, with a particular focus on the theoretical foundations of goodness-of-fit tests, rank tests, resampling tests, and projection tests. The substitution principle is employed as a unified approach to the nonparametric test problems discussed. In addition to mathematical theory, it also includes numerous examples and computer implementations. The book is intended for advanced undergraduate, graduate, and postdoc students as well as young researchers. Readers should be familiar with the basic concepts of mathematical statistics typically covered in introductory statistics courses.
Bayesian nonparametric data analysis
Müller, Peter; Jara, Alejandro; Hanson, Tim
2015-01-01
This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.
Nonparametric statistics with applications to science and engineering
Kvam, Paul H
2007-01-01
A thorough and definitive book that fully addresses traditional and modern-day topics of nonparametric statistics This book presents a practical approach to nonparametric statistical analysis and provides comprehensive coverage of both established and newly developed methods. With the use of MATLAB, the authors present information on theorems and rank tests in an applied fashion, with an emphasis on modern methods in regression and curve fitting, bootstrap confidence intervals, splines, wavelets, empirical likelihood, and goodness-of-fit testing. Nonparametric Statistics with Applications to Science and Engineering begins with succinct coverage of basic results for order statistics, methods of categorical data analysis, nonparametric regression, and curve fitting methods. The authors then focus on nonparametric procedures that are becoming more relevant to engineering researchers and practitioners. The important fundamental materials needed to effectively learn and apply the discussed methods are also provide...
Quantal Response: Nonparametric Modeling
2017-01-01
capture the behavior of observed phenomena. Higher-order polynomial and finite-dimensional spline basis models allow for more complicated responses as the...flexibility as these are nonparametric (not constrained to any particular functional form). These should be useful in identifying nonstandard behavior via... deviance ∆ = −2 log(Lreduced/Lfull) is defined in terms of the likelihood function L. For normal error, Lfull = 1, and based on Eq. A-2, we have log
Nonparametric statistical inference
Gibbons, Jean Dickinson
2014-01-01
Thoroughly revised and reorganized, the fourth edition presents in-depth coverage of the theory and methods of the most widely used nonparametric procedures in statistical analysis and offers example applications appropriate for all areas of the social, behavioral, and life sciences. The book presents new material on the quantiles, the calculation of exact and simulated power, multiple comparisons, additional goodness-of-fit tests, methods of analysis of count data, and modern computer applications using MINITAB, SAS, and STATXACT. It includes tabular guides for simplified applications of tests and finding P values and confidence interval estimates.
Pan, Wei
2003-07-22
Recently a class of nonparametric statistical methods, including the empirical Bayes (EB) method, the significance analysis of microarray (SAM) method and the mixture model method (MMM), have been proposed to detect differential gene expression for replicated microarray experiments conducted under two conditions. All the methods depend on constructing a test statistic Z and a so-called null statistic z. The null statistic z is used to provide some reference distribution for Z such that statistical inference can be accomplished. A common way of constructing z is to apply Z to randomly permuted data. Here we point our that the distribution of z may not approximate the null distribution of Z well, leading to possibly too conservative inference. This observation may apply to other permutation-based nonparametric methods. We propose a new method of constructing a null statistic that aims to estimate the null distribution of a test statistic directly. Using simulated data and real data, we assess and compare the performance of the existing method and our new method when applied in EB, SAM and MMM. Some interesting findings on operating characteristics of EB, SAM and MMM are also reported. Finally, by combining the idea of SAM and MMM, we outline a simple nonparametric method based on the direct use of a test statistic and a null statistic.
Nonparametric combinatorial sequence models.
Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa
2011-11-01
This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.
Nonparametric tests for censored data
Bagdonavicus, Vilijandas; Nikulin, Mikhail
2013-01-01
This book concerns testing hypotheses in non-parametric models. Generalizations of many non-parametric tests to the case of censored and truncated data are considered. Most of the test results are proved and real applications are illustrated using examples. Theories and exercises are provided. The incorrect use of many tests applying most statistical software is highlighted and discussed.
Comparing parametric and nonparametric regression methods for panel data
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb......-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs....... The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test...
Nonparametric identification of copula structures
Li, Bo; Genton, Marc G.
2013-01-01
We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric
Multi-locus genome-wide association studies has become the state-of-the-art procedure to identify quantitative trait loci (QTL) associated with traits simultaneously. However, implementation of multi-locus model is still difficult. In this study, we integrated least angle regression with empirical B...
Directory of Open Access Journals (Sweden)
Jens Clausen
2010-06-01
Full Text Available This paper discusses the sustainability impact (contribution to sustainability, reduction of adverse environmental impacts of online second-hand trading. A survey of eBay users shows that a relationship between the trading of used goods and the protection of natural resources is hardly realized. Secondly, the environmental motivation and the willingness to act in a sustainable manner differ widely between groups of consumers. Given these results from a user perspective, the paper tries to find some objective hints of online second-hand trading’s environmental impact. The greenhouse gas emissions resulting from the energy used for the trading transactions seem to be considerably lower than the emissions due to the (avoided production of new goods. The paper concludes with a set of recommendations for second-hand trade and consumer policy. Information about the sustainability benefits of purchasing second-hand goods should be included in general consumer information, and arguments for changes in behavior should be targeted to different groups of consumers.
Boomer, Kathleen B; Weller, Donald E; Jordan, Thomas E
2008-01-01
The Universal Soil Loss Equation (USLE) and its derivatives are widely used for identifying watersheds with a high potential for degrading stream water quality. We compared sediment yields estimated from regional application of the USLE, the automated revised RUSLE2, and five sediment delivery ratio algorithms to measured annual average sediment delivery in 78 catchments of the Chesapeake Bay watershed. We did the same comparisons for another 23 catchments monitored by the USGS. Predictions exceeded observed sediment yields by more than 100% and were highly correlated with USLE erosion predictions (Pearson r range, 0.73-0.92; p USLE estimates (r = 0.87; p USLE model did not change the results. In ranked comparisons between observed and predicted sediment yields, the models failed to identify catchments with higher yields (r range, -0.28-0.00; p > 0.14). In a multiple regression analysis, soil erodibility, log (stream flow), basin shape (topographic relief ratio), the square-root transformed proportion of forest, and occurrence in the Appalachian Plateau province explained 55% of the observed variance in measured suspended sediment loads, but the model performed poorly (r(2) = 0.06) at predicting loads in the 23 USGS watersheds not used in fitting the model. The use of USLE or multiple regression models to predict sediment yields is not advisable despite their present widespread application. Integrated watershed models based on the USLE may also be unsuitable for making management decisions.
Decision support using nonparametric statistics
Beatty, Warren
2018-01-01
This concise volume covers nonparametric statistics topics that most are most likely to be seen and used from a practical decision support perspective. While many degree programs require a course in parametric statistics, these methods are often inadequate for real-world decision making in business environments. Much of the data collected today by business executives (for example, customer satisfaction opinions) requires nonparametric statistics for valid analysis, and this book provides the reader with a set of tools that can be used to validly analyze all data, regardless of type. Through numerous examples and exercises, this book explains why nonparametric statistics will lead to better decisions and how they are used to reach a decision, with a wide array of business applications. Online resources include exercise data, spreadsheets, and solutions.
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Testing discontinuities in nonparametric regression
Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun
2017-01-01
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
International Nuclear Information System (INIS)
Conover, W.J.; Cox, D.D.; Martz, H.F.
1997-12-01
When using parametric empirical Bayes estimation methods for estimating the binomial or Poisson parameter, the validity of the assumed beta or gamma conjugate prior distribution is an important diagnostic consideration. Chi-square goodness-of-fit tests of the beta or gamma prior hypothesis are developed for use when the binomial sample sizes or Poisson exposure times vary. Nine examples illustrate the application of the methods, using real data from such diverse applications as the loss of feedwater flow rates in nuclear power plants, the probability of failure to run on demand and the failure rates of the high pressure coolant injection systems at US commercial boiling water reactors, the probability of failure to run on demand of emergency diesel generators in US commercial nuclear power plants, the rate of failure of aircraft air conditioners, baseball batting averages, the probability of testing positive for toxoplasmosis, and the probability of tumors in rats. The tests are easily applied in practice by means of corresponding Mathematica reg-sign computer programs which are provided
Hatfield, Laura A.; Gutreuter, Steve; Boogaard, Michael A.; Carlin, Bradley P.
2011-01-01
Estimation of extreme quantal-response statistics, such as the concentration required to kill 99.9% of test subjects (LC99.9), remains a challenge in the presence of multiple covariates and complex study designs. Accurate and precise estimates of the LC99.9 for mixtures of toxicants are critical to ongoing control of a parasitic invasive species, the sea lamprey, in the Laurentian Great Lakes of North America. The toxicity of those chemicals is affected by local and temporal variations in water chemistry, which must be incorporated into the modeling. We develop multilevel empirical Bayes models for data from multiple laboratory studies. Our approach yields more accurate and precise estimation of the LC99.9 compared to alternative models considered. This study demonstrates that properly incorporating hierarchical structure in laboratory data yields better estimates of LC99.9 stream treatment values that are critical to larvae control in the field. In addition, out-of-sample prediction of the results of in situ tests reveals the presence of a latent seasonal effect not manifest in the laboratory studies, suggesting avenues for future study and illustrating the importance of dual consideration of both experimental and observational data.
Xu, Xu Steven; Yuan, Min; Yang, Haitao; Feng, Yan; Xu, Jinfeng; Pinheiro, Jose
2017-01-01
Covariate analysis based on population pharmacokinetics (PPK) is used to identify clinically relevant factors. The likelihood ratio test (LRT) based on nonlinear mixed effect model fits is currently recommended for covariate identification, whereas individual empirical Bayesian estimates (EBEs) are considered unreliable due to the presence of shrinkage. The objectives of this research were to investigate the type I error for LRT and EBE approaches, to confirm the similarity of power between the LRT and EBE approaches from a previous report and to explore the influence of shrinkage on LRT and EBE inferences. Using an oral one-compartment PK model with a single covariate impacting on clearance, we conducted a wide range of simulations according to a two-way factorial design. The results revealed that the EBE-based regression not only provided almost identical power for detecting a covariate effect, but also controlled the false positive rate better than the LRT approach. Shrinkage of EBEs is likely not the root cause for decrease in power or inflated false positive rate although the size of the covariate effect tends to be underestimated at high shrinkage. In summary, contrary to the current recommendations, EBEs may be a better choice for statistical tests in PPK covariate analysis compared to LRT. We proposed a three-step covariate modeling approach for population PK analysis to utilize the advantages of EBEs while overcoming their shortcomings, which allows not only markedly reducing the run time for population PK analysis, but also providing more accurate covariate tests.
Monteys, Xavier; Harris, Paul; Caloca, Silvia
2014-05-01
The coastal shallow water zone can be a challenging and expensive environment within which to acquire bathymetry and other oceanographic data using traditional survey methods. Dangers and limited swath coverage make some of these areas unfeasible to survey using ship borne systems, and turbidity can preclude marine LIDAR. As a result, an extensive part of the coastline worldwide remains completely unmapped. Satellite EO multispectral data, after processing, allows timely, cost efficient and quality controlled information to be used for planning, monitoring, and regulating coastal environments. It has the potential to deliver repetitive derivation of medium resolution bathymetry, coastal water properties and seafloor characteristics in shallow waters. Over the last 30 years satellite passive imaging methods for bathymetry extraction, implementing analytical or empirical methods, have had a limited success predicting water depths. Different wavelengths of the solar light penetrate the water column to varying depths. They can provide acceptable results up to 20 m but become less accurate in deeper waters. The study area is located in the inner part of Dublin Bay, on the East coast of Ireland. The region investigated is a C-shaped inlet covering an area of 10 km long and 5 km wide with water depths ranging from 0 to 10 m. The methodology employed on this research uses a ratio of reflectance from SPOT 5 satellite bands, differing to standard linear transform algorithms. High accuracy water depths were derived using multibeam data. The final empirical model uses spatially weighted geographical tools to retrieve predicted depths. The results of this paper confirm that SPOT satellite scenes are suitable to predict depths using empirical models in very shallow embayments. Spatial regression models show better adjustments in the predictions over non-spatial models. The spatial regression equation used provides realistic results down to 6 m below the water surface, with
Nonparametric factor analysis of time series
Rodríguez-Poo, Juan M.; Linton, Oliver Bruce
1998-01-01
We introduce a nonparametric smoothing procedure for nonparametric factor analaysis of multivariate time series. The asymptotic properties of the proposed procedures are derived. We present an application based on the residuals from the Fair macromodel.
Nonparametric Inference for Periodic Sequences
Sun, Ying
2012-02-01
This article proposes a nonparametric method for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV) and complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both theoretically and by simulation.We also propose a nonparametric test of the null hypothesis that the data have constantmean against the alternative that the sequence of means is periodic. Finally, our methodology is demonstrated on three well-known time series: the sunspots and lynx trapping data, and the El Niño series of sea surface temperatures. © 2012 American Statistical Association and the American Society for Quality.
Nonparametric predictive inference in reliability
International Nuclear Information System (INIS)
Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.
2002-01-01
We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function for a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI for replacement decisions. The emphasis is on introduction and illustration of NPI in reliability contexts, detailed mathematical justifications are presented elsewhere
Nonparametric tests for equality of psychometric functions.
García-Pérez, Miguel A; Núñez-Antón, Vicente
2017-12-07
Many empirical studies measure psychometric functions (curves describing how observers' performance varies with stimulus magnitude) because these functions capture the effects of experimental conditions. To assess these effects, parametric curves are often fitted to the data and comparisons are carried out by testing for equality of mean parameter estimates across conditions. This approach is parametric and, thus, vulnerable to violations of the implied assumptions. Furthermore, testing for equality of means of parameters may be misleading: Psychometric functions may vary meaningfully across conditions on an observer-by-observer basis with no effect on the mean values of the estimated parameters. Alternative approaches to assess equality of psychometric functions per se are thus needed. This paper compares three nonparametric tests that are applicable in all situations of interest: The existing generalized Mantel-Haenszel test, a generalization of the Berry-Mielke test that was developed here, and a split variant of the generalized Mantel-Haenszel test also developed here. Their statistical properties (accuracy and power) are studied via simulation and the results show that all tests are indistinguishable as to accuracy but they differ non-uniformly as to power. Empirical use of the tests is illustrated via analyses of published data sets and practical recommendations are given. The computer code in MATLAB and R to conduct these tests is available as Electronic Supplemental Material.
Nonparametric inference of network structure and dynamics
Peixoto, Tiago P.
The network structure of complex systems determine their function and serve as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion, that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among
Nonparametric instrumental regression with non-convex constraints
International Nuclear Information System (INIS)
Grasmair, M; Scherzer, O; Vanhems, A
2013-01-01
This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition. (paper)
Nonparametric instrumental regression with non-convex constraints
Grasmair, M.; Scherzer, O.; Vanhems, A.
2013-03-01
This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.
Nonparametric correlation models for portfolio allocation
DEFF Research Database (Denmark)
Aslanidis, Nektarios; Casas, Isabel
2013-01-01
This article proposes time-varying nonparametric and semiparametric estimators of the conditional cross-correlation matrix in the context of portfolio allocation. Simulations results show that the nonparametric and semiparametric models are best in DGPs with substantial variability or structural ...... currencies. Results show the nonparametric model generally dominates the others when evaluating in-sample. However, the semiparametric model is best for out-of-sample analysis....
A contingency table approach to nonparametric testing
Rayner, JCW
2000-01-01
Most texts on nonparametric techniques concentrate on location and linear-linear (correlation) tests, with less emphasis on dispersion effects and linear-quadratic tests. Tests for higher moment effects are virtually ignored. Using a fresh approach, A Contingency Table Approach to Nonparametric Testing unifies and extends the popular, standard tests by linking them to tests based on models for data that can be presented in contingency tables.This approach unifies popular nonparametric statistical inference and makes the traditional, most commonly performed nonparametric analyses much more comp
Nonparametric statistics for social and behavioral sciences
Kraska-MIller, M
2013-01-01
Introduction to Research in Social and Behavioral SciencesBasic Principles of ResearchPlanning for ResearchTypes of Research Designs Sampling ProceduresValidity and Reliability of Measurement InstrumentsSteps of the Research Process Introduction to Nonparametric StatisticsData AnalysisOverview of Nonparametric Statistics and Parametric Statistics Overview of Parametric Statistics Overview of Nonparametric StatisticsImportance of Nonparametric MethodsMeasurement InstrumentsAnalysis of Data to Determine Association and Agreement Pearson Chi-Square Test of Association and IndependenceContingency
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Nonparametric e-Mixture Estimation.
Takano, Ken; Hino, Hideitsu; Akaho, Shotaro; Murata, Noboru
2016-12-01
This study considers the common situation in data analysis when there are few observations of the distribution of interest or the target distribution, while abundant observations are available from auxiliary distributions. In this situation, it is natural to compensate for the lack of data from the target distribution by using data sets from these auxiliary distributions-in other words, approximating the target distribution in a subspace spanned by a set of auxiliary distributions. Mixture modeling is one of the simplest ways to integrate information from the target and auxiliary distributions in order to express the target distribution as accurately as possible. There are two typical mixtures in the context of information geometry: the [Formula: see text]- and [Formula: see text]-mixtures. The [Formula: see text]-mixture is applied in a variety of research fields because of the presence of the well-known expectation-maximazation algorithm for parameter estimation, whereas the [Formula: see text]-mixture is rarely used because of its difficulty of estimation, particularly for nonparametric models. The [Formula: see text]-mixture, however, is a well-tempered distribution that satisfies the principle of maximum entropy. To model a target distribution with scarce observations accurately, this letter proposes a novel framework for a nonparametric modeling of the [Formula: see text]-mixture and a geometrically inspired estimation algorithm. As numerical examples of the proposed framework, a transfer learning setup is considered. The experimental results show that this framework works well for three types of synthetic data sets, as well as an EEG real-world data set.
Froeschke, John T.; Stunz, Gregory W.; Sterba-Boatwright, Blair; Wildhaber, Mark L.
2010-01-01
Using a long-term fisheries-independent data set, we tested the 'shark nursery area concept' proposed by Heupel et al. (2007) with the suggested working assumptions that a shark nursery habitat would: (1) have an abundance of immature sharks greater than the mean abundance across all habitats where they occur; (2) be used by sharks repeatedly through time (years); and (3) see immature sharks remaining within the habitat for extended periods of time. We tested this concept using young-of-the-year (age 0) and juvenile (age 1+ yr) bull sharks Carcharhinus leucas from gill-net surveys conducted in Texas bays from 1976 to 2006 to estimate the potential nursery function of 9 coastal bays. Of the 9 bay systems considered as potential nursery habitat, only Matagorda Bay satisfied all 3 criteria for young-of-the-year bull sharks. Both Matagorda and San Antonio Bays met the criteria for juvenile bull sharks. Through these analyses we examined the utility of this approach for characterizing nursery areas and we also describe some practical considerations, such as the influence of the temporal or spatial scales considered when applying the nursery role concept to shark populations.
Bayesian Nonparametric Longitudinal Data Analysis.
Quintana, Fernando A; Johnson, Wesley O; Waetjen, Elaine; Gold, Ellen
2016-01-01
Practical Bayesian nonparametric methods have been developed across a wide variety of contexts. Here, we develop a novel statistical model that generalizes standard mixed models for longitudinal data that include flexible mean functions as well as combined compound symmetry (CS) and autoregressive (AR) covariance structures. AR structure is often specified through the use of a Gaussian process (GP) with covariance functions that allow longitudinal data to be more correlated if they are observed closer in time than if they are observed farther apart. We allow for AR structure by considering a broader class of models that incorporates a Dirichlet Process Mixture (DPM) over the covariance parameters of the GP. We are able to take advantage of modern Bayesian statistical methods in making full predictive inferences and about characteristics of longitudinal profiles and their differences across covariate combinations. We also take advantage of the generality of our model, which provides for estimation of a variety of covariance structures. We observe that models that fail to incorporate CS or AR structure can result in very poor estimation of a covariance or correlation matrix. In our illustration using hormone data observed on women through the menopausal transition, biology dictates the use of a generalized family of sigmoid functions as a model for time trends across subpopulation categories.
Nonparametric Bayesian Modeling of Complex Networks
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Mørup, Morten
2013-01-01
an infinite mixture model as running example, we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model?s fit and predictive performance. We explain how advanced nonparametric models......Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using...
Non-parametric production analysis of pesticides use in the Netherlands
Oude Lansink, A.G.J.M.; Silva, E.
2004-01-01
Many previous empirical studies on the productivity of pesticides suggest that pesticides are under-utilized in agriculture despite the general held believe that these inputs are substantially over-utilized. This paper uses data envelopment analysis (DEA) to calculate non-parametric measures of the
Nonparametric estimation of the stationary M/G/1 workload distribution function
DEFF Research Database (Denmark)
Hansen, Martin Bøgsted
2005-01-01
In this paper it is demonstrated how a nonparametric estimator of the stationary workload distribution function of the M/G/1-queue can be obtained by systematic sampling the workload process. Weak convergence results and bootstrap methods for empirical distribution functions for stationary associ...
Nonparametric functional mapping of quantitative trait loci.
Yang, Jie; Wu, Rongling; Casella, George
2009-03-01
Functional mapping is a useful tool for mapping quantitative trait loci (QTL) that control dynamic traits. It incorporates mathematical aspects of biological processes into the mixture model-based likelihood setting for QTL mapping, thus increasing the power of QTL detection and the precision of parameter estimation. However, in many situations there is no obvious functional form and, in such cases, this strategy will not be optimal. Here we propose to use nonparametric function estimation, typically implemented with B-splines, to estimate the underlying functional form of phenotypic trajectories, and then construct a nonparametric test to find evidence of existing QTL. Using the representation of a nonparametric regression as a mixed model, the final test statistic is a likelihood ratio test. We consider two types of genetic maps: dense maps and general maps, and the power of nonparametric functional mapping is investigated through simulation studies and demonstrated by examples.
Essays on nonparametric econometrics of stochastic volatility
Zu, Y.
2012-01-01
Volatility is a concept that describes the variation of financial returns. Measuring and modelling volatility dynamics is an important aspect of financial econometrics. This thesis is concerned with nonparametric approaches to volatility measurement and volatility model validation.
Nonparametric methods for volatility density estimation
Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.
2009-01-01
Stochastic volatility modelling of financial processes has become increasingly popular. The proposed models usually contain a stationary volatility process. We will motivate and review several nonparametric methods for estimation of the density of the volatility process. Both models based on
Recent Advances and Trends in Nonparametric Statistics
Akritas, MG
2003-01-01
The advent of high-speed, affordable computers in the last two decades has given a new boost to the nonparametric way of thinking. Classical nonparametric procedures, such as function smoothing, suddenly lost their abstract flavour as they became practically implementable. In addition, many previously unthinkable possibilities became mainstream; prime examples include the bootstrap and resampling methods, wavelets and nonlinear smoothers, graphical methods, data mining, bioinformatics, as well as the more recent algorithmic approaches such as bagging and boosting. This volume is a collection o
Bayesian nonparametric dictionary learning for compressed sensing MRI.
Huang, Yue; Paisley, John; Lin, Qin; Ding, Xinghao; Fu, Xueyang; Zhang, Xiao-Ping
2014-12-01
We develop a Bayesian nonparametric model for reconstructing magnetic resonance images (MRIs) from highly undersampled k -space data. We perform dictionary learning as part of the image reconstruction process. To this end, we use the beta process as a nonparametric dictionary learning prior for representing an image patch as a sparse combination of dictionary elements. The size of the dictionary and patch-specific sparsity pattern are inferred from the data, in addition to other dictionary learning variables. Dictionary learning is performed directly on the compressed image, and so is tailored to the MRI being considered. In addition, we investigate a total variation penalty term in combination with the dictionary learning model, and show how the denoising property of dictionary learning removes dependence on regularization parameters in the noisy setting. We derive a stochastic optimization algorithm based on Markov chain Monte Carlo for the Bayesian model, and use the alternating direction method of multipliers for efficiently performing total variation minimization. We present empirical results on several MRI, which show that the proposed regularization framework can improve reconstruction accuracy over other methods.
A Bayesian nonparametric estimation of distributions and quantiles
International Nuclear Information System (INIS)
Poern, K.
1988-11-01
The report describes a Bayesian, nonparametric method for the estimation of a distribution function and its quantiles. The method, presupposing random sampling, is nonparametric, so the user has to specify a prior distribution on a space of distributions (and not on a parameter space). In the current application, where the method is used to estimate the uncertainty of a parametric calculational model, the Dirichlet prior distribution is to a large extent determined by the first batch of Monte Carlo-realizations. In this case the results of the estimation technique is very similar to the conventional empirical distribution function. The resulting posterior distribution is also Dirichlet, and thus facilitates the determination of probability (confidence) intervals at any given point in the space of interest. Another advantage is that also the posterior distribution of a specified quantitle can be derived and utilized to determine a probability interval for that quantile. The method was devised for use in the PROPER code package for uncertainty and sensitivity analysis. (orig.)
Multi-Directional Non-Parametric Analysis of Agricultural Efficiency
DEFF Research Database (Denmark)
Balezentis, Tomas
This thesis seeks to develop methodologies for assessment of agricultural efficiency and employ them to Lithuanian family farms. In particular, we focus on three particular objectives throughout the research: (i) to perform a fully non-parametric analysis of efficiency effects, (ii) to extend...... to the Multi-Directional Efficiency Analysis approach when the proposed models were employed to analyse empirical data of Lithuanian family farm performance, we saw substantial differences in efficiencies associated with different inputs. In particular, assets appeared to be the least efficiently used input...... relative to labour, intermediate consumption and land (in some cases land was not treated as a discretionary input). These findings call for further research on relationships among financial structure, investment decisions, and efficiency in Lithuanian family farms. Application of different techniques...
Teaching Nonparametric Statistics Using Student Instrumental Values.
Anderson, Jonathan W.; Diddams, Margaret
Nonparametric statistics are often difficult to teach in introduction to statistics courses because of the lack of real-world examples. This study demonstrated how teachers can use differences in the rankings and ratings of undergraduate and graduate values to discuss: (1) ipsative and normative scaling; (2) uses of the Mann-Whitney U-test; and…
Nonparametric conditional predictive regions for time series
de Gooijer, J.G.; Zerom Godefay, D.
2000-01-01
Several nonparametric predictors based on the Nadaraya-Watson kernel regression estimator have been proposed in the literature. They include the conditional mean, the conditional median, and the conditional mode. In this paper, we consider three types of predictive regions for these predictors — the
Nonparametric predictive inference in statistical process control
Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.
2000-01-01
New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities hased on
Nonparametric predictive inference in statistical process control
Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.
2004-01-01
Statistical process control (SPC) is used to decide when to stop a process as confidence in the quality of the next item(s) is low. Information to specify a parametric model is not always available, and as SPC is of a predictive nature, we present a control chart developed using nonparametric
Non-Parametric Estimation of Correlation Functions
DEFF Research Database (Denmark)
Brincker, Rune; Rytter, Anders; Krenk, Steen
In this paper three methods of non-parametric correlation function estimation are reviewed and evaluated: the direct method, estimation by the Fast Fourier Transform and finally estimation by the Random Decrement technique. The basic ideas of the techniques are reviewed, sources of bias are point...
Nonparametric estimation in models for unobservable heterogeneity
Hohmann, Daniel
2014-01-01
Nonparametric models which allow for data with unobservable heterogeneity are studied. The first publication introduces new estimators and their asymptotic properties for conditional mixture models. The second publication considers estimation of a function from noisy observations of its Radon transform in a Gaussian white noise model.
Nonparametric estimation of location and scale parameters
Potgieter, C.J.; Lombard, F.
2012-01-01
Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal
A Bayesian Nonparametric Approach to Factor Analysis
DEFF Research Database (Denmark)
Piatek, Rémi; Papaspiliopoulos, Omiros
2018-01-01
This paper introduces a new approach for the inference of non-Gaussian factor models based on Bayesian nonparametric methods. It relaxes the usual normality assumption on the latent factors, widely used in practice, which is too restrictive in many settings. Our approach, on the contrary, does no...
Panel data specifications in nonparametric kernel regression
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...
Filippi, Sarah; Holmes, Chris C; Nieto-Barajas, Luis E
2016-11-16
In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criteria is that the procedures should scale to large data sets. In this regard we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not feasible computationally. To address this we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence. In simulation studies, as well as for a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.
Generalized empirical likelihood methods for analyzing longitudinal data
Wang, S.; Qian, L.; Carroll, R. J.
2010-01-01
Efficient estimation of parameters is a major objective in analyzing longitudinal data. We propose two generalized empirical likelihood based methods that take into consideration within-subject correlations. A nonparametric version of the Wilks
A simple non-parametric goodness-of-fit test for elliptical copulas
Directory of Open Access Journals (Sweden)
Jaser Miriam
2017-12-01
Full Text Available In this paper, we propose a simple non-parametric goodness-of-fit test for elliptical copulas of any dimension. It is based on the equality of Kendall’s tau and Blomqvist’s beta for all bivariate margins. Nominal level and power of the proposed test are investigated in a Monte Carlo study. An empirical application illustrates our goodness-of-fit test at work.
Parametric and Non-Parametric System Modelling
DEFF Research Database (Denmark)
Nielsen, Henrik Aalborg
1999-01-01
the focus is on combinations of parametric and non-parametric methods of regression. This combination can be in terms of additive models where e.g. one or more non-parametric term is added to a linear regression model. It can also be in terms of conditional parametric models where the coefficients...... considered. It is shown that adaptive estimation in conditional parametric models can be performed by combining the well known methods of local polynomial regression and recursive least squares with exponential forgetting. The approach used for estimation in conditional parametric models also highlights how...... networks is included. In this paper, neural networks are used for predicting the electricity production of a wind farm. The results are compared with results obtained using an adaptively estimated ARX-model. Finally, two papers on stochastic differential equations are included. In the first paper, among...
DPpackage: Bayesian Semi- and Nonparametric Modeling in R
Directory of Open Access Journals (Sweden)
Alejandro Jara
2011-04-01
Full Text Available Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.
Network structure exploration via Bayesian nonparametric models
International Nuclear Information System (INIS)
Chen, Y; Wang, X L; Xiang, X; Tang, B Z; Bu, J Z
2015-01-01
Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups there are in a complex network and how to group the nodes of the network. Most existing structure exploration methods need to specify either a group number or a certain type of structure when they are applied to a network. In the real world, however, the group number and also the certain type of structure that a network has are usually unknown in advance. To explore structural regularities in complex networks automatically, without any prior knowledge of the group number or the certain type of structure, we extend a probabilistic mixture model that can handle networks with any type of structure but needs to specify a group number using Bayesian nonparametric theory. We also propose a novel Bayesian nonparametric model, called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to explore structural regularities in networks automatically with a stable, state-of-the-art performance. (paper)
portfolio optimization based on nonparametric estimation methods
Directory of Open Access Journals (Sweden)
mahsa ghandehari
2017-03-01
Full Text Available One of the major issues investors are facing with in capital markets is decision making about select an appropriate stock exchange for investing and selecting an optimal portfolio. This process is done through the risk and expected return assessment. On the other hand in portfolio selection problem if the assets expected returns are normally distributed, variance and standard deviation are used as a risk measure. But, the expected returns on assets are not necessarily normal and sometimes have dramatic differences from normal distribution. This paper with the introduction of conditional value at risk ( CVaR, as a measure of risk in a nonparametric framework, for a given expected return, offers the optimal portfolio and this method is compared with the linear programming method. The data used in this study consists of monthly returns of 15 companies selected from the top 50 companies in Tehran Stock Exchange during the winter of 1392 which is considered from April of 1388 to June of 1393. The results of this study show the superiority of nonparametric method over the linear programming method and the nonparametric method is much faster than the linear programming method.
Nonparametric Mixture Models for Supervised Image Parcellation.
Sabuncu, Mert R; Yeo, B T Thomas; Van Leemput, Koen; Fischl, Bruce; Golland, Polina
2009-09-01
We present a nonparametric, probabilistic mixture model for the supervised parcellation of images. The proposed model yields segmentation algorithms conceptually similar to the recently developed label fusion methods, which register a new image with each training image separately. Segmentation is achieved via the fusion of transferred manual labels. We show that in our framework various settings of a model parameter yield algorithms that use image intensity information differently in determining the weight of a training subject during fusion. One particular setting computes a single, global weight per training subject, whereas another setting uses locally varying weights when fusing the training data. The proposed nonparametric parcellation approach capitalizes on recently developed fast and robust pairwise image alignment tools. The use of multiple registrations allows the algorithm to be robust to occasional registration failures. We report experiments on 39 volumetric brain MRI scans with expert manual labels for the white matter, cerebral cortex, ventricles and subcortical structures. The results demonstrate that the proposed nonparametric segmentation framework yields significantly better segmentation than state-of-the-art algorithms.
Robustifying Bayesian nonparametric mixtures for count data.
Canale, Antonio; Prünster, Igor
2017-03-01
Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to popular Dirichlet process mixture of Poisson kernels by increasing the model flexibility at the level both of the kernel and the nonparametric mixing measure. This allows to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study for different scenarios yield also some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data one has to enrich the model simultaneously at both levels, the kernel and the mixing measure. © 2016, The International Biometric Society.
Introduction to nonparametric statistics for the biological sciences using R
MacFarland, Thomas W
2016-01-01
This book contains a rich set of tools for nonparametric analyses, and the purpose of this supplemental text is to provide guidance to students and professional researchers on how R is used for nonparametric data analysis in the biological sciences: To introduce when nonparametric approaches to data analysis are appropriate To introduce the leading nonparametric tests commonly used in biostatistics and how R is used to generate appropriate statistics for each test To introduce common figures typically associated with nonparametric data analysis and how R is used to generate appropriate figures in support of each data set The book focuses on how R is used to distinguish between data that could be classified as nonparametric as opposed to data that could be classified as parametric, with both approaches to data classification covered extensively. Following an introductory lesson on nonparametric statistics for the biological sciences, the book is organized into eight self-contained lessons on various analyses a...
Exact nonparametric confidence bands for the survivor function.
Matthews, David
2013-10-12
A method to produce exact simultaneous confidence bands for the empirical cumulative distribution function that was first described by Owen, and subsequently corrected by Jager and Wellner, is the starting point for deriving exact nonparametric confidence bands for the survivor function of any positive random variable. We invert a nonparametric likelihood test of uniformity, constructed from the Kaplan-Meier estimator of the survivor function, to obtain simultaneous lower and upper bands for the function of interest with specified global confidence level. The method involves calculating a null distribution and associated critical value for each observed sample configuration. However, Noe recursions and the Van Wijngaarden-Decker-Brent root-finding algorithm provide the necessary tools for efficient computation of these exact bounds. Various aspects of the effect of right censoring on these exact bands are investigated, using as illustrations two observational studies of survival experience among non-Hodgkin's lymphoma patients and a much larger group of subjects with advanced lung cancer enrolled in trials within the North Central Cancer Treatment Group. Monte Carlo simulations confirm the merits of the proposed method of deriving simultaneous interval estimates of the survivor function across the entire range of the observed sample. This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada. It was begun while the author was visiting the Department of Statistics, University of Auckland, and completed during a subsequent sojourn at the Medical Research Council Biostatistics Unit in Cambridge. The support of both institutions, in addition to that of NSERC and the University of Waterloo, is greatly appreciated.
Non-parametric smoothing of experimental data
International Nuclear Information System (INIS)
Kuketayev, A.T.; Pen'kov, F.M.
2007-01-01
Full text: Rapid processing of experimental data samples in nuclear physics often requires differentiation in order to find extrema. Therefore, even at the preliminary stage of data analysis, a range of noise reduction methods are used to smooth experimental data. There are many non-parametric smoothing techniques: interval averages, moving averages, exponential smoothing, etc. Nevertheless, it is more common to use a priori information about the behavior of the experimental curve in order to construct smoothing schemes based on the least squares techniques. The latter methodology's advantage is that the area under the curve can be preserved, which is equivalent to conservation of total speed of counting. The disadvantages of this approach include the lack of a priori information. For example, very often the sums of undifferentiated (by a detector) peaks are replaced with one peak during the processing of data, introducing uncontrolled errors in the determination of the physical quantities. The problem is solvable only by having experienced personnel, whose skills are much greater than the challenge. We propose a set of non-parametric techniques, which allows the use of any additional information on the nature of experimental dependence. The method is based on a construction of a functional, which includes both experimental data and a priori information. Minimum of this functional is reached on a non-parametric smoothed curve. Euler (Lagrange) differential equations are constructed for these curves; then their solutions are obtained analytically or numerically. The proposed approach allows for automated processing of nuclear physics data, eliminating the need for highly skilled laboratory personnel. Pursuant to the proposed approach is the possibility to obtain smoothing curves in a given confidence interval, e.g. according to the χ 2 distribution. This approach is applicable when constructing smooth solutions of ill-posed problems, in particular when solving
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
Decompounding random sums: A nonparametric approach
DEFF Research Database (Denmark)
Hansen, Martin Bøgsted; Pitts, Susan M.
Observations from sums of random variables with a random number of summands, known as random, compound or stopped sums arise within many areas of engineering and science. Quite often it is desirable to infer properties of the distribution of the terms in the random sum. In the present paper we...... review a number of applications and consider the nonlinear inverse problem of inferring the cumulative distribution function of the components in the random sum. We review the existing literature on non-parametric approaches to the problem. The models amenable to the analysis are generalized considerably...
A Nonparametric Test for Seasonal Unit Roots
Kunst, Robert M.
2009-01-01
Abstract: We consider a nonparametric test for the null of seasonal unit roots in quarterly time series that builds on the RUR (records unit root) test by Aparicio, Escribano, and Sipols. We find that the test concept is more promising than a formalization of visual aids such as plots by quarter. In order to cope with the sensitivity of the original RUR test to autocorrelation under its null of a unit root, we suggest an augmentation step by autoregression. We present some evidence on the siz...
Bayesian Nonparametric Clustering for Positive Definite Matrices.
Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2016-05-01
Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well-known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to the Euclidean geometry, rather belongs to a curved Riemannian manifold,existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms.
DEFF Research Database (Denmark)
Carrao, Hugo; Sepulcre, Guadalupe; Horion, Stéphanie Marie Anne F
2013-01-01
This study evaluates the relationship between the frequency and duration of meteorological droughts and the subsequent temporal changes on the quantity of actively photosynthesizing biomass (greenness) estimated from satellite imagery on rainfed croplands in Latin America. An innovative non-parametric...... and non-supervised approach, based on the Fisher-Jenks optimal classification algorithm, is used to identify multi-scale meteorological droughts on the basis of empirical cumulative distributions of 1, 3, 6, and 12-monthly precipitation totals. As input data for the classifier, we use the gridded GPCC...... for the period between 1998 and 2010. The time-series analysis of vegetation greenness is performed during the growing season with a non-parametric method, namely the seasonal Relative Greenness (RG) of spatially accumulated fAPAR. The Global Land Cover map of 2000 and the GlobCover maps of 2005/2006 and 2009...
A local non-parametric model for trade sign inference
Blazejewski, Adam; Coggins, Richard
2005-03-01
We investigate a regularity in market order submission strategies for 12 stocks with large market capitalization on the Australian Stock Exchange. The regularity is evidenced by a predictable relationship between the trade sign (trade initiator), size of the trade, and the contents of the limit order book before the trade. We demonstrate this predictability by developing an empirical inference model to classify trades into buyer-initiated and seller-initiated. The model employs a local non-parametric method, k-nearest neighbor, which in the past was used successfully for chaotic time series prediction. The k-nearest neighbor with three predictor variables achieves an average out-of-sample classification accuracy of 71.40%, compared to 63.32% for the linear logistic regression with seven predictor variables. The result suggests that a non-linear approach may produce a more parsimonious trade sign inference model with a higher out-of-sample classification accuracy. Furthermore, for most of our stocks the observed regularity in market order submissions seems to have a memory of at least 30 trading days.
Empirical Bayes methods in road safety research.
Vogelesang, R.A.W.
1997-01-01
Road safety research is a wonderful combination of counting fatal accidents and using a toolkit containing prior, posterior, overdispersed Poisson, negative binomial and Gamma distributions, together with positive and negative regression effects, shrinkage estimators and fiercy debates concerning
On Parametric (and Non-Parametric Variation
Directory of Open Access Journals (Sweden)
Neil Smith
2009-11-01
Full Text Available This article raises the issue of the correct characterization of ‘Parametric Variation’ in syntax and phonology. After specifying their theoretical commitments, the authors outline the relevant parts of the Principles–and–Parameters framework, and draw a three-way distinction among Universal Principles, Parameters, and Accidents. The core of the contribution then consists of an attempt to provide identity criteria for parametric, as opposed to non-parametric, variation. Parametric choices must be antecedently known, and it is suggested that they must also satisfy seven individually necessary and jointly sufficient criteria. These are that they be cognitively represented, systematic, dependent on the input, deterministic, discrete, mutually exclusive, and irreversible.
Nonparametric predictive pairwise comparison with competing risks
International Nuclear Information System (INIS)
Coolen-Maturi, Tahani
2014-01-01
In reliability, failure data often correspond to competing risks, where several failure modes can cause a unit to fail. This paper presents nonparametric predictive inference (NPI) for pairwise comparison with competing risks data, assuming that the failure modes are independent. These failure modes could be the same or different among the two groups, and these can be both observed and unobserved failure modes. NPI is a statistical approach based on few assumptions, with inferences strongly based on data and with uncertainty quantified via lower and upper probabilities. The focus is on the lower and upper probabilities for the event that the lifetime of a future unit from one group, say Y, is greater than the lifetime of a future unit from the second group, say X. The paper also shows how the two groups can be compared based on particular failure mode(s), and the comparison of the two groups when some of the competing risks are combined is discussed
Nonparametric estimation of location and scale parameters
Potgieter, C.J.
2012-12-01
Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal assumptions regarding the form of the distribution functions of X and Y. We discuss an approach to the estimation problem that is based on asymptotic likelihood considerations. Our results enable us to provide a methodology that can be implemented easily and which yields estimators that are often near optimal when compared to fully parametric methods. We evaluate the performance of the estimators in a series of Monte Carlo simulations. © 2012 Elsevier B.V. All rights reserved.
A nonparametric test for industrial specialization
Billings, Stephen B.; Johnson, Erik B.
2010-01-01
Urban economists hypothesize that industrial diversity matters for urban growth and development, but metrics for empirically testing this relationship are limited to simple concentration metrics (e.g. location quotient) or summary diversity indices (e.g. Gini, Herfindahl). As shown by recent advances in how we measure localization and specialization, these measures of industrial diversity may be subject to bias under small samples or the Modifiable Areal Unit Problem. Furthermore, empirically...
A NONPARAMETRIC HYPOTHESIS TEST VIA THE BOOTSTRAP RESAMPLING
Temel, Tugrul T.
2001-01-01
This paper adapts an already existing nonparametric hypothesis test to the bootstrap framework. The test utilizes the nonparametric kernel regression method to estimate a measure of distance between the models stated under the null hypothesis. The bootstraped version of the test allows to approximate errors involved in the asymptotic hypothesis test. The paper also develops a Mathematica Code for the test algorithm.
Simple nonparametric checks for model data fit in CAT
Meijer, R.R.
2005-01-01
In this paper, the usefulness of several nonparametric checks is discussed in a computerized adaptive testing (CAT) context. Although there is no tradition of nonparametric scalability in CAT, it can be argued that scalability checks can be useful to investigate, for example, the quality of item
Nonparametric Bayesian inference for multidimensional compound Poisson processes
Gugushvili, S.; van der Meulen, F.; Spreij, P.
2015-01-01
Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context,
Nonparametric analysis of blocked ordered categories data: some examples revisited
Directory of Open Access Journals (Sweden)
O. Thas
2006-08-01
Full Text Available Nonparametric analysis for general block designs can be given by using the Cochran-Mantel-Haenszel (CMH statistics. We demonstrate this with four examples and note that several well-known nonparametric statistics are special cases of CMH statistics.
A Structural Labor Supply Model with Nonparametric Preferences
van Soest, A.H.O.; Das, J.W.M.; Gong, X.
2000-01-01
Nonparametric techniques are usually seen as a statistic device for data description and exploration, and not as a tool for estimating models with a richer economic structure, which are often required for policy analysis.This paper presents an example where nonparametric flexibility can be attained
2nd Conference of the International Society for Nonparametric Statistics
Manteiga, Wenceslao; Romo, Juan
2016-01-01
This volume collects selected, peer-reviewed contributions from the 2nd Conference of the International Society for Nonparametric Statistics (ISNPS), held in Cádiz (Spain) between June 11–16 2014, and sponsored by the American Statistical Association, the Institute of Mathematical Statistics, the Bernoulli Society for Mathematical Statistics and Probability, the Journal of Nonparametric Statistics and Universidad Carlos III de Madrid. The 15 articles are a representative sample of the 336 contributed papers presented at the conference. They cover topics such as high-dimensional data modelling, inference for stochastic processes and for dependent data, nonparametric and goodness-of-fit testing, nonparametric curve estimation, object-oriented data analysis, and semiparametric inference. The aim of the ISNPS 2014 conference was to bring together recent advances and trends in several areas of nonparametric statistics in order to facilitate the exchange of research ideas, promote collaboration among researchers...
Nonparametric bootstrap analysis with applications to demographic effects in demand functions.
Gozalo, P L
1997-12-01
"A new bootstrap proposal, labeled smooth conditional moment (SCM) bootstrap, is introduced for independent but not necessarily identically distributed data, where the classical bootstrap procedure fails.... A good example of the benefits of using nonparametric and bootstrap methods is the area of empirical demand analysis. In particular, we will be concerned with their application to the study of two important topics: what are the most relevant effects of household demographic variables on demand behavior, and to what extent present parametric specifications capture these effects." excerpt
Kerschbamer, Rudolf
2015-05-01
This paper proposes a geometric delineation of distributional preference types and a non-parametric approach for their identification in a two-person context. It starts with a small set of assumptions on preferences and shows that this set (i) naturally results in a taxonomy of distributional archetypes that nests all empirically relevant types considered in previous work; and (ii) gives rise to a clean experimental identification procedure - the Equality Equivalence Test - that discriminates between archetypes according to core features of preferences rather than properties of specific modeling variants. As a by-product the test yields a two-dimensional index of preference intensity.
African Journals Online (AJOL)
user
2015-02-23
Feb 23, 2015 ... surveys to assess the vulnerability of the most important physical and eutrophication parameters along. El- Mex Bay coast. As a result of increasing population and industrial development, poorly untreated industrial waste, domestic sewage, shipping industry and agricultural runoff are being released to the.
Nonparametric methods in actigraphy: An update
Directory of Open Access Journals (Sweden)
Bruno S.B. Gonçalves
2014-09-01
Full Text Available Circadian rhythmicity in humans has been well studied using actigraphy, a method of measuring gross motor movement. As actigraphic technology continues to evolve, it is important for data analysis to keep pace with new variables and features. Our objective is to study the behavior of two variables, interdaily stability and intradaily variability, to describe rest activity rhythm. Simulated data and actigraphy data of humans, rats, and marmosets were used in this study. We modified the method of calculation for IV and IS by modifying the time intervals of analysis. For each variable, we calculated the average value (IVm and ISm results for each time interval. Simulated data showed that (1 synchronization analysis depends on sample size, and (2 fragmentation is independent of the amplitude of the generated noise. We were able to obtain a significant difference in the fragmentation patterns of stroke patients using an IVm variable, while the variable IV60 was not identified. Rhythmic synchronization of activity and rest was significantly higher in young than adults with Parkinson׳s when using the ISM variable; however, this difference was not seen using IS60. We propose an updated format to calculate rhythmic fragmentation, including two additional optional variables. These alternative methods of nonparametric analysis aim to more precisely detect sleep–wake cycle fragmentation and synchronization.
Bayesian nonparametric adaptive control using Gaussian processes.
Chowdhary, Girish; Kingravi, Hassan A; How, Jonathan P; Vela, Patricio A
2015-03-01
Most current model reference adaptive control (MRAC) methods rely on parametric adaptive elements, in which the number of parameters of the adaptive element are fixed a priori, often through expert judgment. An example of such an adaptive element is radial basis function networks (RBFNs), with RBF centers preallocated based on the expected operating domain. If the system operates outside of the expected operating domain, this adaptive element can become noneffective in capturing and canceling the uncertainty, thus rendering the adaptive controller only semiglobal in nature. This paper investigates a Gaussian process-based Bayesian MRAC architecture (GP-MRAC), which leverages the power and flexibility of GP Bayesian nonparametric models of uncertainty. The GP-MRAC does not require the centers to be preallocated, can inherently handle measurement noise, and enables MRAC to handle a broader set of uncertainties, including those that are defined as distributions over functions. We use stochastic stability arguments to show that GP-MRAC guarantees good closed-loop performance with no prior domain knowledge of the uncertainty. Online implementable GP inference methods are compared in numerical simulations against RBFN-MRAC with preallocated centers and are shown to provide better tracking and improved long-term learning.
Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification.
Fan, Jianqing; Feng, Yang; Jiang, Jiancheng; Tong, Xin
We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing.
Weak Disposability in Nonparametric Production Analysis with Undesirable Outputs
Kuosmanen, T.K.
2005-01-01
Environmental Economics and Natural Resources Group at Wageningen University in The Netherlands Weak disposability of outputs means that firms can abate harmful emissions by decreasing the activity level. Modeling weak disposability in nonparametric production analysis has caused some confusion.
Multi-sample nonparametric treatments comparison in medical ...
African Journals Online (AJOL)
Multi-sample nonparametric treatments comparison in medical follow-up study with unequal observation processes through simulation and bladder tumour case study. P. L. Tan, N.A. Ibrahim, M.B. Adam, J. Arasan ...
A nonparametric mixture model for cure rate estimation.
Peng, Y; Dear, K B
2000-03-01
Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.
Speaker Linking and Applications using Non-Parametric Hashing Methods
2016-09-08
nonparametric estimate of a multivariate density function,” The Annals of Math- ematical Statistics , vol. 36, no. 3, pp. 1049–1051, 1965. [9] E. A. Patrick...Speaker Linking and Applications using Non-Parametric Hashing Methods† Douglas Sturim and William M. Campbell MIT Lincoln Laboratory, Lexington, MA...with many approaches [1, 2]. For this paper, we focus on using i-vectors [2], but the methods apply to any embedding. For the task of speaker QBE and
Directory of Open Access Journals (Sweden)
Xiaochun Sun
Full Text Available Genomic selection (GS procedures have proven useful in estimating breeding value and predicting phenotype with genome-wide molecular marker information. However, issues of high dimensionality, multicollinearity, and the inability to deal effectively with epistasis can jeopardize accuracy and predictive ability. We, therefore, propose a new nonparametric method, pRKHS, which combines the features of supervised principal component analysis (SPCA and reproducing kernel Hilbert spaces (RKHS regression, with versions for traits with no/low epistasis, pRKHS-NE, to high epistasis, pRKHS-E. Instead of assigning a specific relationship to represent the underlying epistasis, the method maps genotype to phenotype in a nonparametric way, thus requiring fewer genetic assumptions. SPCA decreases the number of markers needed for prediction by filtering out low-signal markers with the optimal marker set determined by cross-validation. Principal components are computed from reduced marker matrix (called supervised principal components, SPC and included in the smoothing spline ANOVA model as independent variables to fit the data. The new method was evaluated in comparison with current popular methods for practicing GS, specifically RR-BLUP, BayesA, BayesB, as well as a newer method by Crossa et al., RKHS-M, using both simulated and real data. Results demonstrate that pRKHS generally delivers greater predictive ability, particularly when epistasis impacts trait expression. Beyond prediction, the new method also facilitates inferences about the extent to which epistasis influences trait expression.
Sun, Xiaochun; Ma, Ping; Mumm, Rita H
2012-01-01
Genomic selection (GS) procedures have proven useful in estimating breeding value and predicting phenotype with genome-wide molecular marker information. However, issues of high dimensionality, multicollinearity, and the inability to deal effectively with epistasis can jeopardize accuracy and predictive ability. We, therefore, propose a new nonparametric method, pRKHS, which combines the features of supervised principal component analysis (SPCA) and reproducing kernel Hilbert spaces (RKHS) regression, with versions for traits with no/low epistasis, pRKHS-NE, to high epistasis, pRKHS-E. Instead of assigning a specific relationship to represent the underlying epistasis, the method maps genotype to phenotype in a nonparametric way, thus requiring fewer genetic assumptions. SPCA decreases the number of markers needed for prediction by filtering out low-signal markers with the optimal marker set determined by cross-validation. Principal components are computed from reduced marker matrix (called supervised principal components, SPC) and included in the smoothing spline ANOVA model as independent variables to fit the data. The new method was evaluated in comparison with current popular methods for practicing GS, specifically RR-BLUP, BayesA, BayesB, as well as a newer method by Crossa et al., RKHS-M, using both simulated and real data. Results demonstrate that pRKHS generally delivers greater predictive ability, particularly when epistasis impacts trait expression. Beyond prediction, the new method also facilitates inferences about the extent to which epistasis influences trait expression.
DEFF Research Database (Denmark)
Khair, Tabish
2017-01-01
Review of 'Inglorious Empire: What the British did to India' by Shashi Tharoor, London, Hurst Publishers, 2017, 296 pp., £20.00......Review of 'Inglorious Empire: What the British did to India' by Shashi Tharoor, London, Hurst Publishers, 2017, 296 pp., £20.00...
Predicting Market Impact Costs Using Nonparametric Machine Learning Models.
Directory of Open Access Journals (Sweden)
Saerom Park
Full Text Available Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide the predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a-state-of-the-art benchmark parametric model such as I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance.
Predicting Market Impact Costs Using Nonparametric Machine Learning Models.
Park, Saerom; Lee, Jaewook; Son, Youngdoo
2016-01-01
Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide the predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a-state-of-the-art benchmark parametric model such as I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance.
Application of nonparametric statistic method for DNBR limit calculation
International Nuclear Information System (INIS)
Dong Bo; Kuang Bo; Zhu Xuenong
2013-01-01
Background: Nonparametric statistical method is a kind of statistical inference method not depending on a certain distribution; it calculates the tolerance limits under certain probability level and confidence through sampling methods. The DNBR margin is one important parameter of NPP design, which presents the safety level of NPP. Purpose and Methods: This paper uses nonparametric statistical method basing on Wilks formula and VIPER-01 subchannel analysis code to calculate the DNBR design limits (DL) of 300 MW NPP (Nuclear Power Plant) during the complete loss of flow accident, simultaneously compared with the DL of DNBR through means of ITDP to get certain DNBR margin. Results: The results indicate that this method can gain 2.96% DNBR margin more than that obtained by ITDP methodology. Conclusions: Because of the reduction of the conservation during analysis process, the nonparametric statistical method can provide greater DNBR margin and the increase of DNBR margin is benefited for the upgrading of core refuel scheme. (authors)
A nonparametric spatial scan statistic for continuous data.
Jung, Inkyung; Cho, Ho Jin
2015-10-20
Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compared the performance of the method with parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
High throughput nonparametric probability density estimation.
Farmer, Jenny; Jacobs, Donald
2018-01-01
In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under and over fitting data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference.
Nonparametric regression using the concept of minimum energy
International Nuclear Information System (INIS)
Williams, Mike
2011-01-01
It has recently been shown that an unbinned distance-based statistic, the energy, can be used to construct an extremely powerful nonparametric multivariate two sample goodness-of-fit test. An extension to this method that makes it possible to perform nonparametric regression using multiple multivariate data sets is presented in this paper. The technique, which is based on the concept of minimizing the energy of the system, permits determination of parameters of interest without the need for parametric expressions of the parent distributions of the data sets. The application and performance of this new method is discussed in the context of some simple example analyses.
Nonparametric and group-based person-fit statistics : a validity study and an empirical example
Meijer, R.R.
1994-01-01
In person-fit analysis, the object is to investigate whether an item score pattern is improbable given the item score patterns of the other persons in the group or given what is expected on the basis of a test model. In this study, several existing group-based statistics to detect such improbable
Adaptive nonparametric Bayesian inference using location-scale mixture priors
Jonge, de R.; Zanten, van J.H.
2010-01-01
We study location-scale mixture priors for nonparametric statistical problems, including multivariate regression, density estimation and classification. We show that a rate-adaptive procedure can be obtained if the prior is properly constructed. In particular, we show that adaptation is achieved if
The nonparametric bootstrap for the current status model
Groeneboom, P.; Hendrickx, K.
2017-01-01
It has been proved that direct bootstrapping of the nonparametric maximum likelihood estimator (MLE) of the distribution function in the current status model leads to inconsistent confidence intervals. We show that bootstrapping of functionals of the MLE can however be used to produce valid
Non-Parametric Analysis of Rating Transition and Default Data
DEFF Research Database (Denmark)
Fledelius, Peter; Lando, David; Perch Nielsen, Jens
2004-01-01
We demonstrate the use of non-parametric intensity estimation - including construction of pointwise confidence sets - for analyzing rating transition data. We find that transition intensities away from the class studied here for illustration strongly depend on the direction of the previous move b...
Bayesian nonparametric system reliability using sets of priors
Walter, G.M.; Aslett, L.J.M.; Coolen, F.P.A.
2016-01-01
An imprecise Bayesian nonparametric approach to system reliability with multiple types of components is developed. This allows modelling partial or imperfect prior knowledge on component failure distributions in a flexible way through bounds on the functioning probability. Given component level test
Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.
Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F
2013-04-01
In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.
Nonparametric modeling of dynamic functional connectivity in fmri data
DEFF Research Database (Denmark)
Nielsen, Søren Føns Vind; Madsen, Kristoffer H.; Røge, Rasmus
2015-01-01
dynamic changes. The existing approaches modeling dynamic connectivity have primarily been based on time-windowing the data and k-means clustering. We propose a nonparametric generative model for dynamic FC in fMRI that does not rely on specifying window lengths and number of dynamic states. Rooted...
Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.
Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen
2011-04-01
Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.
Parametric vs. Nonparametric Regression Modelling within Clinical Decision Support
Czech Academy of Sciences Publication Activity Database
Kalina, Jan; Zvárová, Jana
2017-01-01
Roč. 5, č. 1 (2017), s. 21-27 ISSN 1805-8698 R&D Projects: GA ČR GA17-01251S Institutional support: RVO:67985807 Keywords : decision support systems * decision rules * statistical analysis * nonparametric regression Subject RIV: IN - Informatics, Computer Science OBOR OECD: Statistics and probability
On the robust nonparametric regression estimation for a functional regressor
Azzedine , Nadjia; Laksaci , Ali; Ould-Saïd , Elias
2009-01-01
On the robust nonparametric regression estimation for a functional regressor correspondance: Corresponding author. (Ould-Said, Elias) (Azzedine, Nadjia) (Laksaci, Ali) (Ould-Said, Elias) Departement de Mathematiques--> , Univ. Djillali Liabes--> , BP 89--> , 22000 Sidi Bel Abbes--> - ALGERIA (Azzedine, Nadjia) Departement de Mathema...
A general approach to posterior contraction in nonparametric inverse problems
Knapik, Bartek; Salomond, Jean Bernard
In this paper, we propose a general method to derive an upper bound for the contraction rate of the posterior distribution for nonparametric inverse problems. We present a general theorem that allows us to derive contraction rates for the parameter of interest from contraction rates of the related
Non-parametric analysis of production efficiency of poultry egg ...
African Journals Online (AJOL)
Non-parametric analysis of production efficiency of poultry egg farmers in Delta ... analysis of factors affecting the output of poultry farmers showed that stock ... should be put in place for farmers to learn the best farm practices carried out on the ...
Energy Technology Data Exchange (ETDEWEB)
Hampf, Benjamin
2011-08-15
In this paper we present a new approach to evaluate the environmental efficiency of decision making units. We propose a model that describes a two-stage process consisting of a production and an end-of-pipe abatement stage with the environmental efficiency being determined by the efficiency of both stages. Taking the dependencies between the two stages into account, we show how nonparametric methods can be used to measure environmental efficiency and to decompose it into production and abatement efficiency. For an empirical illustration we apply our model to an analysis of U.S. power plants.
Owen, Art B
2001-01-01
Empirical likelihood provides inferences whose validity does not depend on specifying a parametric model for the data. Because it uses a likelihood, the method has certain inherent advantages over resampling methods: it uses the data to determine the shape of the confidence regions, and it makes it easy to combined data from multiple sources. It also facilitates incorporating side information, and it simplifies accounting for censored, truncated, or biased sampling.One of the first books published on the subject, Empirical Likelihood offers an in-depth treatment of this method for constructing confidence regions and testing hypotheses. The author applies empirical likelihood to a range of problems, from those as simple as setting a confidence region for a univariate mean under IID sampling, to problems defined through smooth functions of means, regression models, generalized linear models, estimating equations, or kernel smooths, and to sampling with non-identically distributed data. Abundant figures offer vi...
Wei, Jiawei; Carroll, Raymond J.; Maity, Arnab
2011-01-01
We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work
Information about the San Francisco Bay Water Quality Project (SFBWQP) Urban Greening Bay Area, a large-scale effort to re-envision urban landscapes to include green infrastructure (GI) making communities more livable and reducing stormwater runoff.
Nonparametric Regression Estimation for Multivariate Null Recurrent Processes
Directory of Open Access Journals (Sweden)
Biqing Cai
2015-04-01
Full Text Available This paper discusses nonparametric kernel regression with the regressor being a \\(d\\-dimensional \\(\\beta\\-null recurrent process in presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate \\(\\sqrt{n(Th^{d}}\\, where \\(n(T\\ is the number of regenerations for a \\(\\beta\\-null recurrent process and the limiting distribution (with proper normalization is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity of the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than the linear model.
Comparing nonparametric Bayesian tree priors for clonal reconstruction of tumors.
Deshwar, Amit G; Vembu, Shankar; Morris, Quaid
2015-01-01
Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick breaking prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor…
Single versus mixture Weibull distributions for nonparametric satellite reliability
International Nuclear Information System (INIS)
Castet, Jean-Francois; Saleh, Joseph H.
2010-01-01
Long recognized as a critical design attribute for space systems, satellite reliability has not yet received the proper attention as limited on-orbit failure data and statistical analyses can be found in the technical literature. To fill this gap, we recently conducted a nonparametric analysis of satellite reliability for 1584 Earth-orbiting satellites launched between January 1990 and October 2008. In this paper, we provide an advanced parametric fit, based on mixture of Weibull distributions, and compare it with the single Weibull distribution model obtained with the Maximum Likelihood Estimation (MLE) method. We demonstrate that both parametric fits are good approximations of the nonparametric satellite reliability, but that the mixture Weibull distribution provides significant accuracy in capturing all the failure trends in the failure data, as evidenced by the analysis of the residuals and their quasi-normal dispersion.
International Conference on Robust Rank-Based and Nonparametric Methods
McKean, Joseph
2016-01-01
The contributors to this volume include many of the distinguished researchers in this area. Many of these scholars have collaborated with Joseph McKean to develop underlying theory for these methods, obtain small sample corrections, and develop efficient algorithms for their computation. The papers cover the scope of the area, including robust nonparametric rank-based procedures through Bayesian and big data rank-based analyses. Areas of application include biostatistics and spatial areas. Over the last 30 years, robust rank-based and nonparametric methods have developed considerably. These procedures generalize traditional Wilcoxon-type methods for one- and two-sample location problems. Research into these procedures has culminated in complete analyses for many of the models used in practice including linear, generalized linear, mixed, and nonlinear models. Settings are both multivariate and univariate. With the development of R packages in these areas, computation of these procedures is easily shared with r...
Seismic Signal Compression Using Nonparametric Bayesian Dictionary Learning via Clustering
Directory of Open Access Journals (Sweden)
Xin Tian
2017-06-01
Full Text Available We introduce a seismic signal compression method based on nonparametric Bayesian dictionary learning method via clustering. The seismic data is compressed patch by patch, and the dictionary is learned online. Clustering is introduced for dictionary learning. A set of dictionaries could be generated, and each dictionary is used for one cluster’s sparse coding. In this way, the signals in one cluster could be well represented by their corresponding dictionaries. A nonparametric Bayesian dictionary learning method is used to learn the dictionaries, which naturally infers an appropriate dictionary size for each cluster. A uniform quantizer and an adaptive arithmetic coding algorithm are adopted to code the sparse coefficients. With comparisons to other state-of-the art approaches, the effectiveness of the proposed method could be validated in the experiments.
Using non-parametric methods in econometric production analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
2012-01-01
by investigating the relationship between the elasticity of scale and the farm size. We use a balanced panel data set of 371~specialised crop farms for the years 2004-2007. A non-parametric specification test shows that neither the Cobb-Douglas function nor the Translog function are consistent with the "true......Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify a functional form of the production function of which the Cobb...... parameter estimates, but also in biased measures which are derived from the parameters, such as elasticities. Therefore, we propose to use non-parametric econometric methods. First, these can be applied to verify the functional form used in parametric production analysis. Second, they can be directly used...
Zhou, Haiming; Hanson, Timothy; Knapp, Roland
2015-12-01
The global emergence of Batrachochytrium dendrobatidis (Bd) has caused the extinction of hundreds of amphibian species worldwide. It has become increasingly important to be able to precisely predict time to Bd arrival in a population. The data analyzed herein present a unique challenge in terms of modeling because there is a strong spatial component to Bd arrival time and the traditional proportional hazards assumption is grossly violated. To address these concerns, we develop a novel marginal Bayesian nonparametric survival model for spatially correlated right-censored data. This class of models assumes that the logarithm of survival times marginally follow a mixture of normal densities with a linear-dependent Dirichlet process prior as the random mixing measure, and their joint distribution is induced by a Gaussian copula model with a spatial correlation structure. To invert high-dimensional spatial correlation matrices, we adopt a full-scale approximation that can capture both large- and small-scale spatial dependence. An efficient Markov chain Monte Carlo algorithm with delayed rejection is proposed for posterior computation, and an R package spBayesSurv is provided to fit the model. This approach is first evaluated through simulations, then applied to threatened frog populations in Sequoia-Kings Canyon National Park. © 2015, The International Biometric Society.
Nonparametric Bayesian models through probit stick-breaking processes.
Rodríguez, Abel; Dunson, David B
2011-03-01
We describe a novel class of Bayesian nonparametric priors based on stick-breaking constructions where the weights of the process are constructed as probit transformations of normal random variables. We show that these priors are extremely flexible, allowing us to generate a great variety of models while preserving computational simplicity. Particular emphasis is placed on the construction of rich temporal and spatial processes, which are applied to two problems in finance and ecology.
Exact nonparametric inference for detection of nonlinear determinism
Luo, Xiaodong; Zhang, Jie; Small, Michael; Moroz, Irene
2005-01-01
We propose an exact nonparametric inference scheme for the detection of nonlinear determinism. The essential fact utilized in our scheme is that, for a linear stochastic process with jointly symmetric innovations, its ordinary least square (OLS) linear prediction error is symmetric about zero. Based on this viewpoint, a class of linear signed rank statistics, e.g. the Wilcoxon signed rank statistic, can be derived with the known null distributions from the prediction error. Thus one of the ad...
Non-parametric estimation of the individual's utility map
Noguchi, Takao; Sanborn, Adam N.; Stewart, Neil
2013-01-01
Models of risky choice have attracted much attention in behavioural economics. Previous research has repeatedly demonstrated that individuals' choices are not well explained by expected utility theory, and a number of alternative models have been examined using carefully selected sets of choice alternatives. The model performance however, can depend on which choice alternatives are being tested. Here we develop a non-parametric method for estimating the utility map over the wide range of choi...
Nonparametric Efficiency Testing of Asian Stock Markets Using Weekly Data
CORNELIS A. LOS
2004-01-01
The efficiency of speculative markets, as represented by Fama's 1970 fair game model, is tested on weekly price index data of six Asian stock markets - Hong Kong, Indonesia, Malaysia, Singapore, Taiwan and Thailand - using Sherry's (1992) non-parametric methods. These scientific testing methods were originally developed to analyze the information processing efficiency of nervous systems. In particular, the stationarity and independence of the price innovations are tested over ten years, from ...
Investigation of MLE in nonparametric estimation methods of reliability function
International Nuclear Information System (INIS)
Ahn, Kwang Won; Kim, Yoon Ik; Chung, Chang Hyun; Kim, Kil Yoo
2001-01-01
There have been lots of trials to estimate a reliability function. In the ESReDA 20 th seminar, a new method in nonparametric way was proposed. The major point of that paper is how to use censored data efficiently. Generally there are three kinds of approach to estimate a reliability function in nonparametric way, i.e., Reduced Sample Method, Actuarial Method and Product-Limit (PL) Method. The above three methods have some limits. So we suggest an advanced method that reflects censored information more efficiently. In many instances there will be a unique maximum likelihood estimator (MLE) of an unknown parameter, and often it may be obtained by the process of differentiation. It is well known that the three methods generally used to estimate a reliability function in nonparametric way have maximum likelihood estimators that are uniquely exist. So, MLE of the new method is derived in this study. The procedure to calculate a MLE is similar just like that of PL-estimator. The difference of the two is that in the new method, the mass (or weight) of each has an influence of the others but the mass in PL-estimator not
International Nuclear Information System (INIS)
Peggs, S.; Talman, R.
1987-01-01
As proton accelerators get larger, and include more magnets, the conventional tracking programs which simulate them run slower. The purpose of this paper is to describe a method, still under development, in which element-by-element tracking around one turn is replaced by a single man, which can be processed far faster. It is assumed for this method that a conventional program exists which can perform faithful tracking in the lattice under study for some hundreds of turns, with all lattice parameters held constant. An empirical map is then generated by comparison with the tracking program. A procedure has been outlined for determining an empirical Hamiltonian, which can represent motion through many nonlinear kicks, by taking data from a conventional tracking program. Though derived by an approximate method this Hamiltonian is analytic in form and can be subjected to further analysis of varying degrees of mathematical rigor. Even though the empirical procedure has only been described in one transverse dimension, there is good reason to hope that it can be extended to include two transverse dimensions, so that it can become a more practical tool in realistic cases
Urban Noise Modelling in Boka Kotorska Bay
Directory of Open Access Journals (Sweden)
Aleksandar Nikolić
2014-04-01
Full Text Available Traffic is the most significant noise source in urban areas. The village of Kamenari in Boka Kotorska Bay is a site where, in a relatively small area, road traffic and sea (ferry traffic take place at the same time. Due to the specificity of the location, i.e. very rare synergy of sound effects of road and sea traffic in the urban area, as well as the expressed need for assessment of noise level in a simple and quick way, a research was conducted, using empirical methods and statistical analysis methods, which led to the creation of acoustic model for the assessment of equivalent noise level (Leq. The developed model for noise assessment in the Village of Kamenari in Boka Kotorska Bay quite realistically provides data on possible noise levels at the observed site, with very little deviations in relation to empirically obtained values.
A Nonparametric Bayesian Approach For Emission Tomography Reconstruction
International Nuclear Information System (INIS)
Barat, Eric; Dautremer, Thomas
2007-01-01
We introduce a PET reconstruction algorithm following a nonparametric Bayesian (NPB) approach. In contrast with Expectation Maximization (EM), the proposed technique does not rely on any space discretization. Namely, the activity distribution--normalized emission intensity of the spatial poisson process--is considered as a spatial probability density and observations are the projections of random emissions whose distribution has to be estimated. This approach is nonparametric in the sense that the quantity of interest belongs to the set of probability measures on R k (for reconstruction in k-dimensions) and it is Bayesian in the sense that we define a prior directly on this spatial measure. In this context, we propose to model the nonparametric probability density as an infinite mixture of multivariate normal distributions. As a prior for this mixture we consider a Dirichlet Process Mixture (DPM) with a Normal-Inverse Wishart (NIW) model as base distribution of the Dirichlet Process. As in EM-family reconstruction, we use a data augmentation scheme where the set of hidden variables are the emission locations for each observed line of response in the continuous object space. Thanks to the data augmentation, we propose a Markov Chain Monte Carlo (MCMC) algorithm (Gibbs sampler) which is able to generate draws from the posterior distribution of the spatial intensity. A difference with EM is that one step of the Gibbs sampler corresponds to the generation of emission locations while only the expected number of emissions per pixel/voxel is used in EM. Another key difference is that the estimated spatial intensity is a continuous function such that there is no need to compute a projection matrix. Finally, draws from the intensity posterior distribution allow the estimation of posterior functionnals like the variance or confidence intervals. Results are presented for simulated data based on a 2D brain phantom and compared to Bayesian MAP-EM
Using non-parametric methods in econometric production analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify the functional form of the production function. Most often, the Cobb...... results—including measures that are of interest of applied economists, such as elasticities. Therefore, we propose to use nonparametric econometric methods. First, they can be applied to verify the functional form used in parametric estimations of production functions. Second, they can be directly used...
STATCAT, Statistical Analysis of Parametric and Non-Parametric Data
International Nuclear Information System (INIS)
David, Hugh
1990-01-01
1 - Description of program or function: A suite of 26 programs designed to facilitate the appropriate statistical analysis and data handling of parametric and non-parametric data, using classical and modern univariate and multivariate methods. 2 - Method of solution: Data is read entry by entry, using a choice of input formats, and the resultant data bank is checked for out-of- range, rare, extreme or missing data. The completed STATCAT data bank can be treated by a variety of descriptive and inferential statistical methods, and modified, using other standard programs as required
Panel data nonparametric estimation of production risk and risk preferences
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
approaches for obtaining firm-specific measures of risk attitudes. We found that Polish dairy farmers are risk averse regarding production risk and price uncertainty. According to our results, Polish dairy farmers perceive the production risk as being more significant than the risk related to output price......We apply nonparametric panel data kernel regression to investigate production risk, out-put price uncertainty, and risk attitudes of Polish dairy farms based on a firm-level unbalanced panel data set that covers the period 2004–2010. We compare different model specifications and different...
Digital spectral analysis parametric, non-parametric and advanced methods
Castanié, Francis
2013-01-01
Digital Spectral Analysis provides a single source that offers complete coverage of the spectral analysis domain. This self-contained work includes details on advanced topics that are usually presented in scattered sources throughout the literature.The theoretical principles necessary for the understanding of spectral analysis are discussed in the first four chapters: fundamentals, digital signal processing, estimation in spectral analysis, and time-series models.An entire chapter is devoted to the non-parametric methods most widely used in industry.High resolution methods a
A Bayesian nonparametric approach to causal inference on quantiles.
Xu, Dandan; Daniels, Michael J; Winterstein, Almut G
2018-02-25
We propose a Bayesian nonparametric approach (BNP) for causal inference on quantiles in the presence of many confounders. In particular, we define relevant causal quantities and specify BNP models to avoid bias from restrictive parametric assumptions. We first use Bayesian additive regression trees (BART) to model the propensity score and then construct the distribution of potential outcomes given the propensity score using a Dirichlet process mixture (DPM) of normals model. We thoroughly evaluate the operating characteristics of our approach and compare it to Bayesian and frequentist competitors. We use our approach to answer an important clinical question involving acute kidney injury using electronic health records. © 2018, The International Biometric Society.
Categorical and nonparametric data analysis choosing the best statistical technique
Nussbaum, E Michael
2014-01-01
Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain
Nonparametric statistics a step-by-step approach
Corder, Gregory W
2014-01-01
"…a very useful resource for courses in nonparametric statistics in which the emphasis is on applications rather than on theory. It also deserves a place in libraries of all institutions where introductory statistics courses are taught."" -CHOICE This Second Edition presents a practical and understandable approach that enhances and expands the statistical toolset for readers. This book includes: New coverage of the sign test and the Kolmogorov-Smirnov two-sample test in an effort to offer a logical and natural progression to statistical powerSPSS® (Version 21) software and updated screen ca
Evaluation of Nonparametric Probabilistic Forecasts of Wind Power
DEFF Research Database (Denmark)
Pinson, Pierre; Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg, orlov 31.07.2008
Predictions of wind power production for horizons up to 48-72 hour ahead comprise a highly valuable input to the methods for the daily management or trading of wind generation. Today, users of wind power predictions are not only provided with point predictions, which are estimates of the most...... likely outcome for each look-ahead time, but also with uncertainty estimates given by probabilistic forecasts. In order to avoid assumptions on the shape of predictive distributions, these probabilistic predictions are produced from nonparametric methods, and then take the form of a single or a set...
Estimation of Stochastic Volatility Models by Nonparametric Filtering
DEFF Research Database (Denmark)
Kanaya, Shin; Kristensen, Dennis
2016-01-01
/estimated volatility process replacing the latent process. Our estimation strategy is applicable to both parametric and nonparametric stochastic volatility models, and can handle both jumps and market microstructure noise. The resulting estimators of the stochastic volatility model will carry additional biases...... and variances due to the first-step estimation, but under regularity conditions we show that these vanish asymptotically and our estimators inherit the asymptotic properties of the infeasible estimators based on observations of the volatility process. A simulation study examines the finite-sample properties...
1st Conference of the International Society for Nonparametric Statistics
Lahiri, S; Politis, Dimitris
2014-01-01
This volume is composed of peer-reviewed papers that have developed from the First Conference of the International Society for NonParametric Statistics (ISNPS). This inaugural conference took place in Chalkidiki, Greece, June 15-19, 2012. It was organized with the co-sponsorship of the IMS, the ISI, and other organizations. M.G. Akritas, S.N. Lahiri, and D.N. Politis are the first executive committee members of ISNPS, and the editors of this volume. ISNPS has a distinguished Advisory Committee that includes Professors R.Beran, P.Bickel, R. Carroll, D. Cook, P. Hall, R. Johnson, B. Lindsay, E. Parzen, P. Robinson, M. Rosenblatt, G. Roussas, T. SubbaRao, and G. Wahba. The Charting Committee of ISNPS consists of more than 50 prominent researchers from all over the world. The chapters in this volume bring forth recent advances and trends in several areas of nonparametric statistics. In this way, the volume facilitates the exchange of research ideas, promotes collaboration among researchers from all over the wo...
Nonparametric Analyses of Log-Periodic Precursors to Financial Crashes
Zhou, Wei-Xing; Sornette, Didier
We apply two nonparametric methods to further test the hypothesis that log-periodicity characterizes the detrended price trajectory of large financial indices prior to financial crashes or strong corrections. The term "parametric" refers here to the use of the log-periodic power law formula to fit the data; in contrast, "nonparametric" refers to the use of general tools such as Fourier transform, and in the present case the Hilbert transform and the so-called (H, q)-analysis. The analysis using the (H, q)-derivative is applied to seven time series ending with the October 1987 crash, the October 1997 correction and the April 2000 crash of the Dow Jones Industrial Average (DJIA), the Standard & Poor 500 and Nasdaq indices. The Hilbert transform is applied to two detrended price time series in terms of the ln(tc-t) variable, where tc is the time of the crash. Taking all results together, we find strong evidence for a universal fundamental log-frequency f=1.02±0.05 corresponding to the scaling ratio λ=2.67±0.12. These values are in very good agreement with those obtained in earlier works with different parametric techniques. This note is extracted from a long unpublished report with 58 figures available at , which extensively describes the evidence we have accumulated on these seven time series, in particular by presenting all relevant details so that the reader can judge for himself or herself the validity and robustness of the results.
Genomic breeding value estimation using nonparametric additive regression models
Directory of Open Access Journals (Sweden)
Solberg Trygve
2009-01-01
Full Text Available Abstract Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped was predicted using data from the next last generation (genotyped and phenotyped. The results show moderate to high accuracies of the predicted breeding values. A determination of predictor specific degree of smoothing increased the accuracy.
A non-parametric framework for estimating threshold limit values
Directory of Open Access Journals (Sweden)
Ulm Kurt
2005-11-01
Full Text Available Abstract Background To estimate a threshold limit value for a compound known to have harmful health effects, an 'elbow' threshold model is usually applied. We are interested on non-parametric flexible alternatives. Methods We describe how a step function model fitted by isotonic regression can be used to estimate threshold limit values. This method returns a set of candidate locations, and we discuss two algorithms to select the threshold among them: the reduced isotonic regression and an algorithm considering the closed family of hypotheses. We assess the performance of these two alternative approaches under different scenarios in a simulation study. We illustrate the framework by analysing the data from a study conducted by the German Research Foundation aiming to set a threshold limit value in the exposure to total dust at workplace, as a causal agent for developing chronic bronchitis. Results In the paper we demonstrate the use and the properties of the proposed methodology along with the results from an application. The method appears to detect the threshold with satisfactory success. However, its performance can be compromised by the low power to reject the constant risk assumption when the true dose-response relationship is weak. Conclusion The estimation of thresholds based on isotonic framework is conceptually simple and sufficiently powerful. Given that in threshold value estimation context there is not a gold standard method, the proposed model provides a useful non-parametric alternative to the standard approaches and can corroborate or challenge their findings.
Application of nonparametric statistics to material strength/reliability assessment
International Nuclear Information System (INIS)
Arai, Taketoshi
1992-01-01
An advanced material technology requires data base on a wide variety of material behavior which need to be established experimentally. It may often happen that experiments are practically limited in terms of reproducibility or a range of test parameters. Statistical methods can be applied to understanding uncertainties in such a quantitative manner as required from the reliability point of view. Statistical assessment involves determinations of a most probable value and the maximum and/or minimum value as one-sided or two-sided confidence limit. A scatter of test data can be approximated by a theoretical distribution only if the goodness of fit satisfies a test criterion. Alternatively, nonparametric statistics (NPS) or distribution-free statistics can be applied. Mathematical procedures by NPS are well established for dealing with most reliability problems. They handle only order statistics of a sample. Mathematical formulas and some applications to engineering assessments are described. They include confidence limits of median, population coverage of sample, required minimum number of a sample, and confidence limits of fracture probability. These applications demonstrate that a nonparametric statistical estimation is useful in logical decision making in the case a large uncertainty exists. (author)
DEFF Research Database (Denmark)
Engholm, Ida
2014-01-01
Celebrated as one of the leading and most valuable brands in the world, eBay has acquired iconic status on par with century-old brands such as Coca-Cola and Disney. The eBay logo is now synonymous with the world’s leading online auction website, and its design is associated with the company...
CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions
Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.
International Nuclear Information System (INIS)
Peggs, S.; Talman, R.
1986-08-01
As proton accelerators get larger, and include more magnets, the conventional tracking programs which simulate them run slower. At the same time, in order to more carefully optimize the higher cost of the accelerators, they must return more accurate results, even in the presence of a longer list of realistic effects, such as magnet errors and misalignments. For these reasons conventional tracking programs continue to be computationally bound, despite the continually increasing computing power available. This limitation is especially severe for a class of problems in which some lattice parameter is slowly varying, when a faithful description is only obtained by tracking for an exceedingly large number of turns. Examples are synchrotron oscillations in which the energy varies slowly with a period of, say, hundreds of turns, or magnet ripple or noise on a comparably slow time scale. In these cases one may with to track for hundreds of periods of the slowly varying parameter. The purpose of this paper is to describe a method, still under development, in which element-by-element tracking around one turn is replaced by a single map, which can be processed far faster. Similar programs have already been written in which successive elements are ''concatenated'' with truncation to linear, sextupole, or octupole order, et cetera, using Lie algebraic techniques to preserve symplecticity. The method described here is rather more empirical than this but, in principle, contains information to all orders and is able to handle resonances in a more straightforward fashion
Nonparametric Bayesian inference for mean residual life functions in survival analysis.
Poynor, Valerie; Kottas, Athanasios
2018-01-19
Modeling and inference for survival analysis problems typically revolves around different functions related to the survival distribution. Here, we focus on the mean residual life (MRL) function, which provides the expected remaining lifetime given that a subject has survived (i.e. is event-free) up to a particular time. This function is of direct interest in reliability, medical, and actuarial fields. In addition to its practical interpretation, the MRL function characterizes the survival distribution. We develop general Bayesian nonparametric inference for MRL functions built from a Dirichlet process mixture model for the associated survival distribution. The resulting model for the MRL function admits a representation as a mixture of the kernel MRL functions with time-dependent mixture weights. This model structure allows for a wide range of shapes for the MRL function. Particular emphasis is placed on the selection of the mixture kernel, taken to be a gamma distribution, to obtain desirable properties for the MRL function arising from the mixture model. The inference method is illustrated with a data set of two experimental groups and a data set involving right censoring. The supplementary material available at Biostatistics online provides further results on empirical performance of the model, using simulated data examples. © The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Xu, Yonghong; Gao, Xiaohuan; Wang, Zhengxi
2014-04-01
Missing data represent a general problem in many scientific fields, especially in medical survival analysis. Dealing with censored data, interpolation method is one of important methods. However, most of the interpolation methods replace the censored data with the exact data, which will distort the real distribution of the censored data and reduce the probability of the real data falling into the interpolation data. In order to solve this problem, we in this paper propose a nonparametric method of estimating the survival function of right-censored and interval-censored data and compare its performance to SC (self-consistent) algorithm. Comparing to the average interpolation and the nearest neighbor interpolation method, the proposed method in this paper replaces the right-censored data with the interval-censored data, and greatly improves the probability of the real data falling into imputation interval. Then it bases on the empirical distribution theory to estimate the survival function of right-censored and interval-censored data. The results of numerical examples and a real breast cancer data set demonstrated that the proposed method had higher accuracy and better robustness for the different proportion of the censored data. This paper provides a good method to compare the clinical treatments performance with estimation of the survival data of the patients. This pro vides some help to the medical survival data analysis.
Generative Temporal Modelling of Neuroimaging - Decomposition and Nonparametric Testing
DEFF Research Database (Denmark)
Hald, Ditte Høvenhoff
The goal of this thesis is to explore two improvements for functional magnetic resonance imaging (fMRI) analysis; namely our proposed decomposition method and an extension to the non-parametric testing framework. Analysis of fMRI allows researchers to investigate the functional processes...... of the brain, and provides insight into neuronal coupling during mental processes or tasks. The decomposition method is a Gaussian process-based independent components analysis (GPICA), which incorporates a temporal dependency in the sources. A hierarchical model specification is used, featuring both...... instantaneous and convolutive mixing, and the inferred temporal patterns. Spatial maps are seen to capture smooth and localized stimuli-related components, and often identifiable noise components. The implementation is freely available as a GUI/SPM plugin, and we recommend using GPICA as an additional tool when...
Nonparametric Estimation of Distributions in Random Effects Models
Hart, Jeffrey D.
2011-01-01
We propose using minimum distance to obtain nonparametric estimates of the distributions of components in random effects models. A main setting considered is equivalent to having a large number of small datasets whose locations, and perhaps scales, vary randomly, but which otherwise have a common distribution. Interest focuses on estimating the distribution that is common to all datasets, knowledge of which is crucial in multiple testing problems where a location/scale invariant test is applied to every small dataset. A detailed algorithm for computing minimum distance estimates is proposed, and the usefulness of our methodology is illustrated by a simulation study and an analysis of microarray data. Supplemental materials for the article, including R-code and a dataset, are available online. © 2011 American Statistical Association.
Prior processes and their applications nonparametric Bayesian estimation
Phadia, Eswar G
2016-01-01
This book presents a systematic and comprehensive treatment of various prior processes that have been developed over the past four decades for dealing with Bayesian approach to solving selected nonparametric inference problems. This revised edition has been substantially expanded to reflect the current interest in this area. After an overview of different prior processes, it examines the now pre-eminent Dirichlet process and its variants including hierarchical processes, then addresses new processes such as dependent Dirichlet, local Dirichlet, time-varying and spatial processes, all of which exploit the countable mixture representation of the Dirichlet process. It subsequently discusses various neutral to right type processes, including gamma and extended gamma, beta and beta-Stacy processes, and then describes the Chinese Restaurant, Indian Buffet and infinite gamma-Poisson processes, which prove to be very useful in areas such as machine learning, information retrieval and featural modeling. Tailfree and P...
Spurious Seasonality Detection: A Non-Parametric Test Proposal
Directory of Open Access Journals (Sweden)
Aurelio F. Bariviera
2018-01-01
Full Text Available This paper offers a general and comprehensive definition of the day-of-the-week effect. Using symbolic dynamics, we develop a unique test based on ordinal patterns in order to detect it. This test uncovers the fact that the so-called “day-of-the-week” effect is partly an artifact of the hidden correlation structure of the data. We present simulations based on artificial time series as well. While time series generated with long memory are prone to exhibit daily seasonality, pure white noise signals exhibit no pattern preference. Since ours is a non-parametric test, it requires no assumptions about the distribution of returns, so that it could be a practical alternative to conventional econometric tests. We also made an exhaustive application of the here-proposed technique to 83 stock indexes around the world. Finally, the paper highlights the relevance of symbolic analysis in economic time series studies.
Nonparametric autocovariance estimation from censored time series by Gaussian imputation.
Park, Jung Wook; Genton, Marc G; Ghosh, Sujit K
2009-02-01
One of the most frequently used methods to model the autocovariance function of a second-order stationary time series is to use the parametric framework of autoregressive and moving average models developed by Box and Jenkins. However, such parametric models, though very flexible, may not always be adequate to model autocovariance functions with sharp changes. Furthermore, if the data do not follow the parametric model and are censored at a certain value, the estimation results may not be reliable. We develop a Gaussian imputation method to estimate an autocovariance structure via nonparametric estimation of the autocovariance function in order to address both censoring and incorrect model specification. We demonstrate the effectiveness of the technique in terms of bias and efficiency with simulations under various rates of censoring and underlying models. We describe its application to a time series of silicon concentrations in the Arctic.
Nonparametric estimation of stochastic differential equations with sparse Gaussian processes.
García, Constantino A; Otero, Abraham; Félix, Paulo; Presedo, Jesús; Márquez, David G
2017-08-01
The application of stochastic differential equations (SDEs) to the analysis of temporal data has attracted increasing attention, due to their ability to describe complex dynamics with physically interpretable equations. In this paper, we introduce a nonparametric method for estimating the drift and diffusion terms of SDEs from a densely observed discrete time series. The use of Gaussian processes as priors permits working directly in a function-space view and thus the inference takes place directly in this space. To cope with the computational complexity that requires the use of Gaussian processes, a sparse Gaussian process approximation is provided. This approximation permits the efficient computation of predictions for the drift and diffusion terms by using a distribution over a small subset of pseudosamples. The proposed method has been validated using both simulated data and real data from economy and paleoclimatology. The application of the method to real data demonstrates its ability to capture the behavior of complex systems.
Debt and growth: A non-parametric approach
Brida, Juan Gabriel; Gómez, David Matesanz; Seijas, Maria Nela
2017-11-01
In this study, we explore the dynamic relationship between public debt and economic growth by using a non-parametric approach based on data symbolization and clustering methods. The study uses annual data of general government consolidated gross debt-to-GDP ratio and gross domestic product for sixteen countries between 1977 and 2015. Using symbolic sequences, we introduce a notion of distance between the dynamical paths of different countries. Then, a Minimal Spanning Tree and a Hierarchical Tree are constructed from time series to help detecting the existence of groups of countries sharing similar economic performance. The main finding of the study appears for the period 2008-2016 when several countries surpassed the 90% debt-to-GDP threshold. During this period, three groups (clubs) of countries are obtained: high, mid and low indebted countries, suggesting that the employed debt-to-GDP threshold drives economic dynamics for the selected countries.
Nonparametric estimation of benchmark doses in environmental risk assessment
Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen
2013-01-01
Summary An important statistical objective in environmental risk analysis is estimation of minimum exposure levels, called benchmark doses (BMDs), that induce a pre-specified benchmark response in a dose-response experiment. In such settings, representations of the risk are traditionally based on a parametric dose-response model. It is a well-known concern, however, that if the chosen parametric form is misspecified, inaccurate and possibly unsafe low-dose inferences can result. We apply a nonparametric approach for calculating benchmark doses, based on an isotonic regression method for dose-response estimation with quantal-response data (Bhattacharya and Kong, 2007). We determine the large-sample properties of the estimator, develop bootstrap-based confidence limits on the BMDs, and explore the confidence limits’ small-sample properties via a short simulation study. An example from cancer risk assessment illustrates the calculations. PMID:23914133
Indoor Positioning Using Nonparametric Belief Propagation Based on Spanning Trees
Directory of Open Access Journals (Sweden)
Savic Vladimir
2010-01-01
Full Text Available Nonparametric belief propagation (NBP is one of the best-known methods for cooperative localization in sensor networks. It is capable of providing information about location estimation with appropriate uncertainty and to accommodate non-Gaussian distance measurement errors. However, the accuracy of NBP is questionable in loopy networks. Therefore, in this paper, we propose a novel approach, NBP based on spanning trees (NBP-ST created by breadth first search (BFS method. In addition, we propose a reliable indoor model based on obtained measurements in our lab. According to our simulation results, NBP-ST performs better than NBP in terms of accuracy and communication cost in the networks with high connectivity (i.e., highly loopy networks. Furthermore, the computational and communication costs are nearly constant with respect to the transmission radius. However, the drawbacks of proposed method are a little bit higher computational cost and poor performance in low-connected networks.
Hyperspectral image segmentation using a cooperative nonparametric approach
Taher, Akar; Chehdi, Kacem; Cariou, Claude
2013-10-01
In this paper a new unsupervised nonparametric cooperative and adaptive hyperspectral image segmentation approach is presented. The hyperspectral images are partitioned band by band in parallel and intermediate classification results are evaluated and fused, to get the final segmentation result. Two unsupervised nonparametric segmentation methods are used in parallel cooperation, namely the Fuzzy C-means (FCM) method, and the Linde-Buzo-Gray (LBG) algorithm, to segment each band of the image. The originality of the approach relies firstly on its local adaptation to the type of regions in an image (textured, non-textured), and secondly on the introduction of several levels of evaluation and validation of intermediate segmentation results before obtaining the final partitioning of the image. For the management of similar or conflicting results issued from the two classification methods, we gradually introduced various assessment steps that exploit the information of each spectral band and its adjacent bands, and finally the information of all the spectral bands. In our approach, the detected textured and non-textured regions are treated separately from feature extraction step, up to the final classification results. This approach was first evaluated on a large number of monocomponent images constructed from the Brodatz album. Then it was evaluated on two real applications using a respectively multispectral image for Cedar trees detection in the region of Baabdat (Lebanon) and a hyperspectral image for identification of invasive and non invasive vegetation in the region of Cieza (Spain). A correct classification rate (CCR) for the first application is over 97% and for the second application the average correct classification rate (ACCR) is over 99%.
Wangmo, Tenzin; Hauri, Sirin; Gennet, Eloise; Anane-Sarpong, Evelyn; Provoost, Veerle; Elger, Bernice S
2018-02-07
A review of literature published a decade ago noted a significant increase in empirical papers across nine bioethics journals. This study provides an update on the presence of empirical papers in the same nine journals. It first evaluates whether the empirical trend is continuing as noted in the previous study, and second, how it is changing, that is, what are the characteristics of the empirical works published in these nine bioethics journals. A review of the same nine journals (Bioethics; Journal of Medical Ethics; Journal of Clinical Ethics; Nursing Ethics; Cambridge Quarterly of Healthcare Ethics; Hastings Center Report; Theoretical Medicine and Bioethics; Christian Bioethics; and Kennedy Institute of Ethics Journal) was conducted for a 12-year period from 2004 to 2015. Data obtained was analysed descriptively and using a non-parametric Chi-square test. Of the total number of original papers (N = 5567) published in the nine bioethics journals, 18.1% (n = 1007) collected and analysed empirical data. Journal of Medical Ethics and Nursing Ethics led the empirical publications, accounting for 89.4% of all empirical papers. The former published significantly more quantitative papers than qualitative, whereas the latter published more qualitative papers. Our analysis reveals no significant difference (χ2 = 2.857; p = 0.091) between the proportion of empirical papers published in 2004-2009 and 2010-2015. However, the increasing empirical trend has continued in these journals with the proportion of empirical papers increasing from 14.9% in 2004 to 17.8% in 2015. This study presents the current state of affairs regarding empirical research published nine bioethics journals. In the quarter century of data that is available about the nine bioethics journals studied in two reviews, the proportion of empirical publications continues to increase, signifying a trend towards empirical research in bioethics. The growing volume is mainly attributable to two
A ¤nonparametric dynamic additive regression model for longitudinal data
DEFF Research Database (Denmark)
Martinussen, T.; Scheike, T. H.
2000-01-01
dynamic linear models, estimating equations, least squares, longitudinal data, nonparametric methods, partly conditional mean models, time-varying-coefficient models......dynamic linear models, estimating equations, least squares, longitudinal data, nonparametric methods, partly conditional mean models, time-varying-coefficient models...
DEFF Research Database (Denmark)
Effraimidis, Georgios; Dahl, Christian Møller
In this paper, we develop a fully nonparametric approach for the estimation of the cumulative incidence function with Missing At Random right-censored competing risks data. We obtain results on the pointwise asymptotic normality as well as the uniform convergence rate of the proposed nonparametric...
Non-parametric tests of productive efficiency with errors-in-variables
Kuosmanen, T.K.; Post, T.; Scholtes, S.
2007-01-01
We develop a non-parametric test of productive efficiency that accounts for errors-in-variables, following the approach of Varian. [1985. Nonparametric analysis of optimizing behavior with measurement error. Journal of Econometrics 30(1/2), 445-458]. The test is based on the general Pareto-Koopmans
The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard
and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...... within a nonparametric panel data regression framework. The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric......This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...
Biscayne Bay Alongshore Epifauna
National Oceanic and Atmospheric Administration, Department of Commerce — Field studies to characterize the alongshore epifauna (shrimp, crabs, echinoderms, and small fishes) along the western shore of southern Biscayne Bay were started in...
National Oceanic and Atmospheric Administration, Department of Commerce — This image represents a 4x4 meter resolution bathymetric surface for Jobos Bay, Puerto Rico (in NAD83 UTM 19 North). The depth values are in meters referenced to the...
Hammond Bay Biological Station
Federal Laboratory Consortium — Hammond Bay Biological Station (HBBS), located near Millersburg, Michigan, is a field station of the USGS Great Lakes Science Center (GLSC). HBBS was established by...
National Oceanic and Atmospheric Administration, Department of Commerce — This data set consists of 0.5-meter pixel resolution, four band orthoimages covering the Humboldt Bay area. An orthoimage is remotely sensed image data in which...
Dwivedi, Alok Kumar; Mallawaarachchi, Indika; Alvarado, Luis A
2017-06-30
Experimental studies in biomedical research frequently pose analytical problems related to small sample size. In such studies, there are conflicting findings regarding the choice of parametric and nonparametric analysis, especially with non-normal data. In such instances, some methodologists questioned the validity of parametric tests and suggested nonparametric tests. In contrast, other methodologists found nonparametric tests to be too conservative and less powerful and thus preferred using parametric tests. Some researchers have recommended using a bootstrap test; however, this method also has small sample size limitation. We used a pooled method in nonparametric bootstrap test that may overcome the problem related with small samples in hypothesis testing. The present study compared nonparametric bootstrap test with pooled resampling method corresponding to parametric, nonparametric, and permutation tests through extensive simulations under various conditions and using real data examples. The nonparametric pooled bootstrap t-test provided equal or greater power for comparing two means as compared with unpaired t-test, Welch t-test, Wilcoxon rank sum test, and permutation test while maintaining type I error probability for any conditions except for Cauchy and extreme variable lognormal distributions. In such cases, we suggest using an exact Wilcoxon rank sum test. Nonparametric bootstrap paired t-test also provided better performance than other alternatives. Nonparametric bootstrap test provided benefit over exact Kruskal-Wallis test. We suggest using nonparametric bootstrap test with pooled resampling method for comparing paired or unpaired means and for validating the one way analysis of variance test results for non-normal data in small sample size studies. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Begum, Mehmuna; Sahu, Biraja Kumar; Das, Apurba Kumar; Vinithkumar, Nambali Valsalan; Kirubagaran, Ramalingam
2015-05-01
Blooming of diatom species Chaetoceros curvisetus (Cleve, 1889) was observed in Junglighat Bay and Haddo Harbour of Port Blair Bay of Andaman and Nicobar Islands during June 2010. Physico-chemical parameters, nutrient concentrations and phytoplankton composition data collected from five stations during 2010 were classified as bloom area (BA) and non-bloom area (NBA) and compared. Elevated values of dissolved oxygen were recorded in the BA, and it significantly varied (p NBA. Among the nutrient parameters studied, nitrate concentration indicated significant variation in BA and NBA (p NBA, indicating its utilization. In Junglighat Bay, the C. curvisetus species constituted 93.4 and 69.2% composition of total phytoplankton population during day 1 and day 2, respectively. The bloom forming stations separated out from the non-bloom forming station in non-parametric multidimensional scaling (nMDS) ordinations; cluster analysis powered by SIMPROF test also grouped the stations as BA and NBA.
International Nuclear Information System (INIS)
Kuosmanen, Timo
2012-01-01
Electricity distribution network is a prime example of a natural local monopoly. In many countries, electricity distribution is regulated by the government. Many regulators apply frontier estimation techniques such as data envelopment analysis (DEA) or stochastic frontier analysis (SFA) as an integral part of their regulatory framework. While more advanced methods that combine nonparametric frontier with stochastic error term are known in the literature, in practice, regulators continue to apply simplistic methods. This paper reports the main results of the project commissioned by the Finnish regulator for further development of the cost frontier estimation in their regulatory framework. The key objectives of the project were to integrate a stochastic SFA-style noise term to the nonparametric, axiomatic DEA-style cost frontier, and to take the heterogeneity of firms and their operating environments better into account. To achieve these objectives, a new method called stochastic nonparametric envelopment of data (StoNED) was examined. Based on the insights and experiences gained in the empirical analysis using the real data of the regulated networks, the Finnish regulator adopted the StoNED method in use from 2012 onwards.
Poisson-generalized gamma empirical Bayes model for disease ...
African Journals Online (AJOL)
In spatial disease mapping, the use of Bayesian models of estimation technique is becoming popular for smoothing relative risks estimates for disease mapping. The most common Bayesian conjugate model for disease mapping is the Poisson-Gamma Model (PG). To explore further the activity of smoothing of relative risk ...
Empirical Bayes Estimation of Proportions in Several Groups.
1981-01-01
Street NW Washington, DC 20208 Dr. William Graham Testing Directorate 1 Dr. Lorraine D. Eyde MEPCOM/MEPCT-P Personnel R&D Center Ft. Sheridan, IL...D-53 BONN 1, GERMANY Princeton, NJ 08540 1 Dr. Mark D. Reckase Dr. Gary Marco Educational Psychology Dept. Educational Testing Service University of...GED Testing Service, Suite 20 Columbia, SC 29208 One Dupont Cirle, NW Washington, DC 20036 1 PROF. FUMIKO SAMEJIMA DEPT. OF PSYCHOLOGY UNIVERSITY OF
Wesselink, Christiaan; Heeg, Govert P.; Jansonius, Nomdo M.
Objective: To compare prospectively 2 perimetric progression detection algorithms for glaucoma, the Early Manifest Glaucoma Trial algorithm (glaucoma progression analysis [GPA]) and a nonparametric algorithm applied to the mean deviation (MD) (nonparametric progression analysis [NPA]). Methods:
Efficient nonparametric n -body force fields from machine learning
Glielmo, Aldo; Zeni, Claudio; De Vita, Alessandro
2018-05-01
We provide a definition and explicit expressions for n -body Gaussian process (GP) kernels, which can learn any interatomic interaction occurring in a physical system, up to n -body contributions, for any value of n . The series is complete, as it can be shown that the "universal approximator" squared exponential kernel can be written as a sum of n -body kernels. These recipes enable the choice of optimally efficient force models for each target system, as confirmed by extensive testing on various materials. We furthermore describe how the n -body kernels can be "mapped" on equivalent representations that provide database-size-independent predictions and are thus crucially more efficient. We explicitly carry out this mapping procedure for the first nontrivial (three-body) kernel of the series, and we show that this reproduces the GP-predicted forces with meV /Å accuracy while being orders of magnitude faster. These results pave the way to using novel force models (here named "M-FFs") that are computationally as fast as their corresponding standard parametrized n -body force fields, while retaining the nonparametric character, the ease of training and validation, and the accuracy of the best recently proposed machine-learning potentials.
Non-parametric Bayesian networks: Improving theory and reviewing applications
International Nuclear Information System (INIS)
Hanea, Anca; Morales Napoles, Oswaldo; Ababei, Dan
2015-01-01
Applications in various domains often lead to high dimensional dependence modelling. A Bayesian network (BN) is a probabilistic graphical model that provides an elegant way of expressing the joint distribution of a large number of interrelated variables. BNs have been successfully used to represent uncertain knowledge in a variety of fields. The majority of applications use discrete BNs, i.e. BNs whose nodes represent discrete variables. Integrating continuous variables in BNs is an area fraught with difficulty. Several methods that handle discrete-continuous BNs have been proposed in the literature. This paper concentrates only on one method called non-parametric BNs (NPBNs). NPBNs were introduced in 2004 and they have been or are currently being used in at least twelve professional applications. This paper provides a short introduction to NPBNs, a couple of theoretical advances, and an overview of applications. The aim of the paper is twofold: one is to present the latest improvements of the theory underlying NPBNs, and the other is to complement the existing overviews of BNs applications with the NPNBs applications. The latter opens the opportunity to discuss some difficulties that applications pose to the theoretical framework and in this way offers some NPBN modelling guidance to practitioners. - Highlights: • The paper gives an overview of the current NPBNs methodology. • We extend the NPBN methodology by relaxing the conditions of one of its fundamental theorems. • We propose improvements of the data mining algorithm for the NPBNs. • We review the professional applications of the NPBNs.
Nonparametric predictive inference for combined competing risks data
International Nuclear Information System (INIS)
Coolen-Maturi, Tahani; Coolen, Frank P.A.
2014-01-01
The nonparametric predictive inference (NPI) approach for competing risks data has recently been presented, in particular addressing the question due to which of the competing risks the next unit will fail, and also considering the effects of unobserved, re-defined, unknown or removed competing risks. In this paper, we introduce how the NPI approach can be used to deal with situations where units are not all at risk from all competing risks. This may typically occur if one combines information from multiple samples, which can, e.g. be related to further aspects of units that define the samples or groups to which the units belong or to different applications where the circumstances under which the units operate can vary. We study the effect of combining the additional information from these multiple samples, so effectively borrowing information on specific competing risks from other units, on the inferences. Such combination of information can be relevant to competing risks scenarios in a variety of application areas, including engineering and medical studies
Transition redshift: new constraints from parametric and nonparametric methods
Energy Technology Data Exchange (ETDEWEB)
Rani, Nisha; Mahajan, Shobhit; Mukherjee, Amitabha [Department of Physics and Astrophysics, University of Delhi, New Delhi 110007 (India); Jain, Deepak [Deen Dayal Upadhyaya College, University of Delhi, New Delhi 110015 (India); Pires, Nilza, E-mail: nrani@physics.du.ac.in, E-mail: djain@ddu.du.ac.in, E-mail: shobhit.mahajan@gmail.com, E-mail: amimukh@gmail.com, E-mail: npires@dfte.ufrn.br [Departamento de Física Teórica e Experimental, UFRN, Campus Universitário, Natal, RN 59072-970 (Brazil)
2015-12-01
In this paper, we use the cosmokinematics approach to study the accelerated expansion of the Universe. This is a model independent approach and depends only on the assumption that the Universe is homogeneous and isotropic and is described by the FRW metric. We parametrize the deceleration parameter, q(z), to constrain the transition redshift (z{sub t}) at which the expansion of the Universe goes from a decelerating to an accelerating phase. We use three different parametrizations of q(z) namely, q{sub I}(z)=q{sub 1}+q{sub 2}z, q{sub II} (z) = q{sub 3} + q{sub 4} ln (1 + z) and q{sub III} (z)=½+q{sub 5}/(1+z){sup 2}. A joint analysis of the age of galaxies, strong lensing and supernovae Ia data indicates that the transition redshift is less than unity i.e. z{sub t} < 1. We also use a nonparametric approach (LOESS+SIMEX) to constrain z{sub t}. This too gives z{sub t} < 1 which is consistent with the value obtained by the parametric approach.
Discrete non-parametric kernel estimation for global sensitivity analysis
International Nuclear Information System (INIS)
Senga Kiessé, Tristan; Ventura, Anne
2016-01-01
This work investigates the discrete kernel approach for evaluating the contribution of the variance of discrete input variables to the variance of model output, via analysis of variance (ANOVA) decomposition. Until recently only the continuous kernel approach has been applied as a metamodeling approach within sensitivity analysis framework, for both discrete and continuous input variables. Now the discrete kernel estimation is known to be suitable for smoothing discrete functions. We present a discrete non-parametric kernel estimator of ANOVA decomposition of a given model. An estimator of sensitivity indices is also presented with its asymtotic convergence rate. Some simulations on a test function analysis and a real case study from agricultural have shown that the discrete kernel approach outperforms the continuous kernel one for evaluating the contribution of moderate or most influential discrete parameters to the model output. - Highlights: • We study a discrete kernel estimation for sensitivity analysis of a model. • A discrete kernel estimator of ANOVA decomposition of the model is presented. • Sensitivity indices are calculated for discrete input parameters. • An estimator of sensitivity indices is also presented with its convergence rate. • An application is realized for improving the reliability of environmental models.
Nonparametric predictive inference for combining diagnostic tests with parametric copula
Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.
2017-09-01
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we interest in developing strategies for combining test results in order to increase the diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results with considering dependence structure using parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. NPI uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. While copula is a well-known statistical concept for modelling dependence of random variables. A copula is a joint distribution function whose marginals are all uniformly distributed and it can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method which is maximum likelihood estimator (MLE). We investigate the performance of this proposed method via data sets from the literature and discuss results to show how our method performs for different family of copulas. Finally, we briefly outline related challenges and opportunities for future research.
Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines
Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.
2011-01-01
Summary Background Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications. PMID:21915433
Nonparametric Integrated Agrometeorological Drought Monitoring: Model Development and Application
Zhang, Qiang; Li, Qin; Singh, Vijay P.; Shi, Peijun; Huang, Qingzhong; Sun, Peng
2018-01-01
Drought is a major natural hazard that has massive impacts on the society. How to monitor drought is critical for its mitigation and early warning. This study proposed a modified version of the multivariate standardized drought index (MSDI) based on precipitation, evapotranspiration, and soil moisture, i.e., modified multivariate standardized drought index (MMSDI). This study also used nonparametric joint probability distribution analysis. Comparisons were done between standardized precipitation evapotranspiration index (SPEI), standardized soil moisture index (SSMI), MSDI, and MMSDI, and real-world observed drought regimes. Results indicated that MMSDI detected droughts that SPEI and/or SSMI failed to do. Also, MMSDI detected almost all droughts that were identified by SPEI and SSMI. Further, droughts detected by MMSDI were similar to real-world observed droughts in terms of drought intensity and drought-affected area. When compared to MMSDI, MSDI has the potential to overestimate drought intensity and drought-affected area across China, which should be attributed to exclusion of the evapotranspiration components from estimation of drought intensity. Therefore, MMSDI is proposed for drought monitoring that can detect agrometeorological droughts. Results of this study provide a framework for integrated drought monitoring in other regions of the world and can help to develop drought mitigation.
Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.
Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A
2018-01-30
Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.
Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi
2015-02-01
We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.
Nonparametric adaptive age replacement with a one-cycle criterion
International Nuclear Information System (INIS)
Coolen-Schrijner, P.; Coolen, F.P.A.
2007-01-01
Age replacement of technical units has received much attention in the reliability literature over the last four decades. Mostly, the failure time distribution for the units is assumed to be known, and minimal costs per unit of time is used as optimality criterion, where renewal reward theory simplifies the mathematics involved but requires the assumption that the same process and replacement strategy continues over a very large ('infinite') period of time. Recently, there has been increasing attention to adaptive strategies for age replacement, taking into account the information from the process. Although renewal reward theory can still be used to provide an intuitively and mathematically attractive optimality criterion, it is more logical to use minimal costs per unit of time over a single cycle as optimality criterion for adaptive age replacement. In this paper, we first show that in the classical age replacement setting, with known failure time distribution with increasing hazard rate, the one-cycle criterion leads to earlier replacement than the renewal reward criterion. Thereafter, we present adaptive age replacement with a one-cycle criterion within the nonparametric predictive inferential framework. We study the performance of this approach via simulations, which are also used for comparisons with the use of the renewal reward criterion within the same statistical framework
Bayesian Nonparametric Model for Estimating Multistate Travel Time Distribution
Directory of Open Access Journals (Sweden)
Emmanuel Kidando
2017-01-01
Full Text Available Multistate models, that is, models with more than two distributions, are preferred over single-state probability models in modeling the distribution of travel time. Literature review indicated that the finite multistate modeling of travel time using lognormal distribution is superior to other probability functions. In this study, we extend the finite multistate lognormal model of estimating the travel time distribution to unbounded lognormal distribution. In particular, a nonparametric Dirichlet Process Mixture Model (DPMM with stick-breaking process representation was used. The strength of the DPMM is that it can choose the number of components dynamically as part of the algorithm during parameter estimation. To reduce computational complexity, the modeling process was limited to a maximum of six components. Then, the Markov Chain Monte Carlo (MCMC sampling technique was employed to estimate the parameters’ posterior distribution. Speed data from nine links of a freeway corridor, aggregated on a 5-minute basis, were used to calculate the corridor travel time. The results demonstrated that this model offers significant flexibility in modeling to account for complex mixture distributions of the travel time without specifying the number of components. The DPMM modeling further revealed that freeway travel time is characterized by multistate or single-state models depending on the inclusion of onset and offset of congestion periods.
Bayesian nonparametric meta-analysis using Polya tree mixture models.
Branscum, Adam J; Hanson, Timothy E
2008-09-01
Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.
75 FR 11837 - Chesapeake Bay Watershed Initiative
2010-03-12
... DEPARTMENT OF AGRICULTURE Commodity Credit Corporation Chesapeake Bay Watershed Initiative AGENCY...: Notice of availability of program funds for the Chesapeake Bay Watershed Initiative. SUMMARY: The... through the Chesapeake Bay Watershed Initiative for agricultural producers in the Chesapeake Bay watershed...
33 CFR 100.124 - Maggie Fischer Memorial Great South Bay Cross Bay Swim, Great South Bay, New York.
2010-07-01
... 33 Navigation and Navigable Waters 1 2010-07-01 2010-07-01 false Maggie Fischer Memorial Great South Bay Cross Bay Swim, Great South Bay, New York. 100.124 Section 100.124 Navigation and Navigable... NAVIGABLE WATERS § 100.124 Maggie Fischer Memorial Great South Bay Cross Bay Swim, Great South Bay, New York...
Testing the equality of nonparametric regression curves based on ...
African Journals Online (AJOL)
Abstract. In this work we propose a new methodology for the comparison of two regression functions f1 and f2 in the case of homoscedastic error structure and a fixed design. Our approach is based on the empirical Fourier coefficients of the regression functions f1 and f2 respectively. As our main results we obtain the ...
According to extensive data obtained over its 13,000 km of shoreline, the Chesapeake Bay has been suffering a major, indeed unprecedented, reduction in submerged vegetation. Chesapeake Bay is alone in experiencing decline in submerged vegetation. Other estuary systems on the east coast of the United States are not so affected. These alarming results were obtained by the synthesis of the findings of numerous individual groups in addition to large consortium projects on the Chesapeake done over the past decade. R. J. Orth and R. A. Moore of the Virginia Institute of Marine Science pointed to the problem of the severe decline of submerged grasses on the Bay and along its tributaries. In a recent report, Orth and Moore note: “The decline, which began in the 1960's and accelerated in the 1970's, has affected all species in all areas. Many major river systems are now totally devoid of any rooted vegetation” (Science, 222, 51-53, 1983).
A Bayes linear Bayes method for estimation of correlated event rates.
Quigley, John; Wilson, Kevin J; Walls, Lesley; Bedford, Tim
2013-12-01
Typically, full Bayesian estimation of correlated event rates can be computationally challenging since estimators are intractable. When estimation of event rates represents one activity within a larger modeling process, there is an incentive to develop more efficient inference than provided by a full Bayesian model. We develop a new subjective inference method for correlated event rates based on a Bayes linear Bayes model under the assumption that events are generated from a homogeneous Poisson process. To reduce the elicitation burden we introduce homogenization factors to the model and, as an alternative to a subjective prior, an empirical method using the method of moments is developed. Inference under the new method is compared against estimates obtained under a full Bayesian model, which takes a multivariate gamma prior, where the predictive and posterior distributions are derived in terms of well-known functions. The mathematical properties of both models are presented. A simulation study shows that the Bayes linear Bayes inference method and the full Bayesian model provide equally reliable estimates. An illustrative example, motivated by a problem of estimating correlated event rates across different users in a simple supply chain, shows how ignoring the correlation leads to biased estimation of event rates. © 2013 Society for Risk Analysis.
Akhtar, Naveed; Mian, Ajmal
2017-10-03
We present a principled approach to learn a discriminative dictionary along a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size--the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which, we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.
A robust nonparametric method for quantifying undetected extinctions.
Chisholm, Ryan A; Giam, Xingli; Sadanandan, Keren R; Fung, Tak; Rheindt, Frank E
2016-06-01
How many species have gone extinct in modern times before being described by science? To answer this question, and thereby get a full assessment of humanity's impact on biodiversity, statistical methods that quantify undetected extinctions are required. Such methods have been developed recently, but they are limited by their reliance on parametric assumptions; specifically, they assume the pools of extant and undetected species decay exponentially, whereas real detection rates vary temporally with survey effort and real extinction rates vary with the waxing and waning of threatening processes. We devised a new, nonparametric method for estimating undetected extinctions. As inputs, the method requires only the first and last date at which each species in an ensemble was recorded. As outputs, the method provides estimates of the proportion of species that have gone extinct, detected, or undetected and, in the special case where the number of undetected extant species in the present day is assumed close to zero, of the absolute number of undetected extinct species. The main assumption of the method is that the per-species extinction rate is independent of whether a species has been detected or not. We applied the method to the resident native bird fauna of Singapore. Of 195 recorded species, 58 (29.7%) have gone extinct in the last 200 years. Our method projected that an additional 9.6 species (95% CI 3.4, 19.8) have gone extinct without first being recorded, implying a true extinction rate of 33.0% (95% CI 31.0%, 36.2%). We provide R code for implementing our method. Because our method does not depend on strong assumptions, we expect it to be broadly useful for quantifying undetected extinctions. © 2016 Society for Conservation Biology.
Economic decision making and the application of nonparametric prediction models
Attanasi, E.D.; Coburn, T.C.; Freeman, P.A.
2008-01-01
Sustained increases in energy prices have focused attention on gas resources in low-permeability shale or in coals that were previously considered economically marginal. Daily well deliverability is often relatively small, although the estimates of the total volumes of recoverable resources in these settings are often large. Planning and development decisions for extraction of such resources must be areawide because profitable extraction requires optimization of scale economies to minimize costs and reduce risk. For an individual firm, the decision to enter such plays depends on reconnaissance-level estimates of regional recoverable resources and on cost estimates to develop untested areas. This paper shows how simple nonparametric local regression models, used to predict technically recoverable resources at untested sites, can be combined with economic models to compute regional-scale cost functions. The context of the worked example is the Devonian Antrim-shale gas play in the Michigan basin. One finding relates to selection of the resource prediction model to be used with economic models. Models chosen because they can best predict aggregate volume over larger areas (many hundreds of sites) smooth out granularity in the distribution of predicted volumes at individual sites. This loss of detail affects the representation of economic cost functions and may affect economic decisions. Second, because some analysts consider unconventional resources to be ubiquitous, the selection and order of specific drilling sites may, in practice, be determined arbitrarily by extraneous factors. The analysis shows a 15-20% gain in gas volume when these simple models are applied to order drilling prospects strategically rather than to choose drilling locations randomly. Copyright ?? 2008 Society of Petroleum Engineers.
An empirical examination of restructured electricity prices
International Nuclear Information System (INIS)
Knittel, C.R.; Roberts, M.R.
2005-01-01
We present an empirical analysis of restructured electricity prices. We study the distributional and temporal properties of the price process in a non-parametric framework, after which we parametrically model the price process using several common asset price specifications from the asset-pricing literature, as well as several less conventional models motivated by the peculiarities of electricity prices. The findings reveal several characteristics unique to electricity prices including several deterministic components of the price series at different frequencies. An 'inverse leverage effect' is also found, where positive shocks to the price series result in larger increases in volatility than negative shocks. We find that forecasting performance in dramatically improved when we incorporate features of electricity prices not commonly modelled in other asset prices. Our findings have implications for how empiricists model electricity prices, as well as how theorists specify models of energy pricing. (author)
Crozier, G. F.; Schroeder, W. W.
1978-01-01
The termination of studies carried on for almost three years in the Mobile Bay area and adjacent continental shelf are reported. The initial results concentrating on the shelf and lower bay were presented in the interim report. The continued scope of work was designed to attempt a refinement of the mathematical model, assess the effectiveness of optical measurement of suspended particulate material and disseminate the acquired information. The optical characteristics of particulate solutions are affected by density gradients within the medium, density of the suspended particles, particle size, particle shape, particle quality, albedo, and the angle of refracted light. Several of these are discussed in detail.
Nonparametric Monitoring for Geotechnical Structures Subject to Long-Term Environmental Change
Directory of Open Access Journals (Sweden)
Hae-Bum Yun
2011-01-01
Full Text Available A nonparametric, data-driven methodology of monitoring for geotechnical structures subject to long-term environmental change is discussed. Avoiding physical assumptions or excessive simplification of the monitored structures, the nonparametric monitoring methodology presented in this paper provides reliable performance-related information particularly when the collection of sensor data is limited. For the validation of the nonparametric methodology, a field case study was performed using a full-scale retaining wall, which had been monitored for three years using three tilt gauges. Using the very limited sensor data, it is demonstrated that important performance-related information, such as drainage performance and sensor damage, could be disentangled from significant daily, seasonal and multiyear environmental variations. Extensive literature review on recent developments of parametric and nonparametric data processing techniques for geotechnical applications is also presented.
Kernel bandwidth estimation for non-parametric density estimation: a comparative study
CSIR Research Space (South Africa)
Van der Walt, CM
2013-12-01
Full Text Available We investigate the performance of conventional bandwidth estimators for non-parametric kernel density estimation on a number of representative pattern-recognition tasks, to gain a better understanding of the behaviour of these estimators in high...
Examples of the Application of Nonparametric Information Geometry to Statistical Physics
Directory of Open Access Journals (Sweden)
Giovanni Pistone
2013-09-01
Full Text Available We review a nonparametric version of Amari’s information geometry in which the set of positive probability densities on a given sample space is endowed with an atlas of charts to form a differentiable manifold modeled on Orlicz Banach spaces. This nonparametric setting is used to discuss the setting of typical problems in machine learning and statistical physics, such as black-box optimization, Kullback-Leibler divergence, Boltzmann-Gibbs entropy and the Boltzmann equation.
Screen Wars, Star Wars, and Sequels: Nonparametric Reanalysis of Movie Profitability
W. D. Walls
2012-01-01
In this paper we use nonparametric statistical tools to quantify motion-picture profit. We quantify the unconditional distribution of profit, the distribution of profit conditional on stars and sequels, and we also model the conditional expectation of movie profits using a non- parametric data-driven regression model. The flexibility of the non-parametric approach accommodates the full range of possible relationships among the variables without prior specification of a functional form, thereb...
Froelich, Markus; Puhani, Patrick
2004-01-01
Based on a nonparametrically estimated model of labor market classifications, this paper makes suggestions for immigration policy using data from western Germany in the 1990s. It is demonstrated that nonparametric regression is feasible in higher dimensions with only a few thousand observations. In sum, labor markets able to absorb immigrants are characterized by above average age and by professional occupations. On the other hand, labor markets for young workers in service occupations are id...
Nonparametric Identification and Estimation of Finite Mixture Models of Dynamic Discrete Choices
Hiroyuki Kasahara; Katsumi Shimotsu
2006-01-01
In dynamic discrete choice analysis, controlling for unobserved heterogeneity is an important issue, and finite mixture models provide flexible ways to account for unobserved heterogeneity. This paper studies nonparametric identifiability of type probabilities and type-specific component distributions in finite mixture models of dynamic discrete choices. We derive sufficient conditions for nonparametric identification for various finite mixture models of dynamic discrete choices used in appli...
Generalized empirical likelihood methods for analyzing longitudinal data
Wang, S.
2010-02-16
Efficient estimation of parameters is a major objective in analyzing longitudinal data. We propose two generalized empirical likelihood based methods that take into consideration within-subject correlations. A nonparametric version of the Wilks theorem for the limiting distributions of the empirical likelihood ratios is derived. It is shown that one of the proposed methods is locally efficient among a class of within-subject variance-covariance matrices. A simulation study is conducted to investigate the finite sample properties of the proposed methods and compare them with the block empirical likelihood method by You et al. (2006) and the normal approximation with a correctly estimated variance-covariance. The results suggest that the proposed methods are generally more efficient than existing methods which ignore the correlation structure, and better in coverage compared to the normal approximation with correctly specified within-subject correlation. An application illustrating our methods and supporting the simulation study results is also presented.
Nonparametric Change Point Diagnosis Method of Concrete Dam Crack Behavior Abnormality
Directory of Open Access Journals (Sweden)
Zhanchao Li
2013-01-01
Full Text Available The study on diagnosis method of concrete crack behavior abnormality has always been a hot spot and difficulty in the safety monitoring field of hydraulic structure. Based on the performance of concrete dam crack behavior abnormality in parametric statistical model and nonparametric statistical model, the internal relation between concrete dam crack behavior abnormality and statistical change point theory is deeply analyzed from the model structure instability of parametric statistical model and change of sequence distribution law of nonparametric statistical model. On this basis, through the reduction of change point problem, the establishment of basic nonparametric change point model, and asymptotic analysis on test method of basic change point problem, the nonparametric change point diagnosis method of concrete dam crack behavior abnormality is created in consideration of the situation that in practice concrete dam crack behavior may have more abnormality points. And the nonparametric change point diagnosis method of concrete dam crack behavior abnormality is used in the actual project, demonstrating the effectiveness and scientific reasonableness of the method established. Meanwhile, the nonparametric change point diagnosis method of concrete dam crack behavior abnormality has a complete theoretical basis and strong practicality with a broad application prospect in actual project.
Richards Bay effluent pipeline
CSIR Research Space (South Africa)
Lord, DA
1986-07-01
Full Text Available of major concern identified in the effluent are the large volume of byproduct calcium sulphate (phosphogypsum) which would smother marine life, high concentrations of fluoride highly toxic to marine life, heavy metals, chlorinated organic material... ........................ 9 THE RICHARDS BAY PIPELINE ........................................ 16 Environmental considerations ................................... 16 - Phosphogypsum disposal ................................... 16 - Effects of fluoride on locally occurring...
Gao, F.
2017-01-01
The dissertation consists of research in three subjects in two themes—Bayes and networks: The first studies the posterior contraction rates for the Dirichlet-Laplace mixtures in a deconvolution setting (Chapter 1). The second subject regards the statistical inference in preferential attachment
An adaptive distance measure for use with nonparametric models
International Nuclear Information System (INIS)
Garvey, D. R.; Hines, J. W.
2006-01-01
Distance measures perform a critical task in nonparametric, locally weighted regression. Locally weighted regression (LWR) models are a form of 'lazy learning' which construct a local model 'on the fly' by comparing a query vector to historical, exemplar vectors according to a three step process. First, the distance of the query vector to each of the exemplar vectors is calculated. Next, these distances are passed to a kernel function, which converts the distances to similarities or weights. Finally, the model output or response is calculated by performing locally weighted polynomial regression. To date, traditional distance measures, such as the Euclidean, weighted Euclidean, and L1-norm have been used as the first step in the prediction process. Since these measures do not take into consideration sensor failures and drift, they are inherently ill-suited for application to 'real world' systems. This paper describes one such LWR model, namely auto associative kernel regression (AAKR), and describes a new, Adaptive Euclidean distance measure that can be used to dynamically compensate for faulty sensor inputs. In this new distance measure, the query observations that lie outside of the training range (i.e. outside the minimum and maximum input exemplars) are dropped from the distance calculation. This allows for the distance calculation to be robust to sensor drifts and failures, in addition to providing a method for managing inputs that exceed the training range. In this paper, AAKR models using the standard and Adaptive Euclidean distance are developed and compared for the pressure system of an operating nuclear power plant. It is shown that using the standard Euclidean distance for data with failed inputs, significant errors in the AAKR predictions can result. By using the Adaptive Euclidean distance it is shown that high fidelity predictions are possible, in spite of the input failure. In fact, it is shown that with the Adaptive Euclidean distance prediction
Sustainable development in the Hudson Bay/James Bay bioregion
International Nuclear Information System (INIS)
Anon.
1991-01-01
An overview is presented of projects planned for the James Bay/Hudson Bay region, and the expected environmental impacts of these projects. The watershed of James Bay and Hudson Bay covers well over one third of Canada, from southern Alberta to central Ontario to Baffin Island, as well as parts of north Dakota and Minnesota in the U.S.A. Hydroelectric power developments that change the timing and rate of flow of fresh water may cause changes in the nature and duration of ice cover, habitats of marine mammals, fish and migratory birds, currents into and out of Hudson Bay/James Bay, seasonal and annual loads of sediments and nutrients to marine ecosystems, and anadromous fish populations. Hydroelectric projects are proposed for the region by Quebec, Ontario and Manitoba. In January 1992, the Canadian Arctic Resources Committee (CARC), the Environmental Committee of Sanikuluaq, and the Rawson Academy of Arctic Science will launch the Hudson Bay/James Bay Bioregion Program, an independent initiative to apply an ecosystem approach to the region. Two main objectives are to provide a comprehensive assessment of the cumulative impacts of human activities on the marine and freshwater ecosystems of the Hudson Bay/James Bay bioregion, and to foster sustainable development by examining and proposing cooperative processes for decision making among governments, developers, aboriginal peoples and other stakeholders. 1 fig
Empirical Bayesian inference and model uncertainty
International Nuclear Information System (INIS)
Poern, K.
1994-01-01
This paper presents a hierarchical or multistage empirical Bayesian approach for the estimation of uncertainty concerning the intensity of a homogeneous Poisson process. A class of contaminated gamma distributions is considered to describe the uncertainty concerning the intensity. These distributions in turn are defined through a set of secondary parameters, the knowledge of which is also described and updated via Bayes formula. This two-stage Bayesian approach is an example where the modeling uncertainty is treated in a comprehensive way. Each contaminated gamma distributions, represented by a point in the 3D space of secondary parameters, can be considered as a specific model of the uncertainty about the Poisson intensity. Then, by the empirical Bayesian method each individual model is assigned a posterior probability
Bioprocess iterative batch-to-batch optimization based on hybrid parametric/nonparametric models.
Teixeira, Ana P; Clemente, João J; Cunha, António E; Carrondo, Manuel J T; Oliveira, Rui
2006-01-01
This paper presents a novel method for iterative batch-to-batch dynamic optimization of bioprocesses. The relationship between process performance and control inputs is established by means of hybrid grey-box models combining parametric and nonparametric structures. The bioreactor dynamics are defined by material balance equations, whereas the cell population subsystem is represented by an adjustable mixture of nonparametric and parametric models. Thus optimizations are possible without detailed mechanistic knowledge concerning the biological system. A clustering technique is used to supervise the reliability of the nonparametric subsystem during the optimization. Whenever the nonparametric outputs are unreliable, the objective function is penalized. The technique was evaluated with three simulation case studies. The overall results suggest that the convergence to the optimal process performance may be achieved after a small number of batches. The model unreliability risk constraint along with sampling scheduling are crucial to minimize the experimental effort required to attain a given process performance. In general terms, it may be concluded that the proposed method broadens the application of the hybrid parametric/nonparametric modeling technique to "newer" processes with higher potential for optimization.
THE RESPONSE OF MONTEREY BAY TO THE 2010 CHILEAN EARTHQUAKE
Directory of Open Access Journals (Sweden)
Laurence C. Breaker
2011-01-01
Full Text Available The primary frequencies contained in the arrival sequence produced by the tsunami from the Chilean earthquake of 2010 in Monterey Bay were extracted to determine the seiche modes that were produced. Singular Spectrum Analysis (SSA and Ensemble Empirical Mode Decomposition (EEMD were employed to extract the primary frequencies of interest. The wave train from the Chilean tsunami lasted for at least four days due to multipath arrivals that may not have included reflections from outside the bay but most likely did include secondary undulations, and energy trapping in the form of edge waves, inside the bay. The SSA decomposition resolved oscillations with periods of 52-57, 34-35, 26-27, and 21-22 minutes, all frequencies that have been predicted and/or observed in previous studies. The EEMD decomposition detected oscillations with periods of 50-55 and 21-22 minutes. Periods in the range of 50-57 minutes varied due to measurement uncertainties but almost certainly correspond to the first longitudinal mode of oscillation for Monterey Bay, periods of 34-35 minutes correspond to the first transverse mode of oscillation that assumes a nodal line across the entrance of the bay, a period of 26- 27 minutes, although previously observed, may not represent a fundamental oscillation, and a period of 21-22 minutes has been predicted and observed previously. A period of ~37 minutes, close to the period of 34-35 minutes, was generated by the Great Alaskan Earthquake of 1964 in Monterey Bay and most likely represents the same mode of oscillation. The tsunamis associated with the Great Alaskan Earthquake and the Chilean Earthquake both entered Monterey Bay but initially arrived outside the bay from opposite directions. Unlike the Great Alaskan Earthquake, however, which excited only one resonant mode inside the bay, the Chilean Earthquake excited several modes suggesting that the asymmetric shape of the entrance to Monterey Bay was an important factor and that the
Lee, L.; Helsel, D.
2007-01-01
Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data-perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
California Natural Resource Agency — The Bay Trail provides easily accessible recreational opportunities for outdoor enthusiasts, including hikers, joggers, bicyclists and skaters. It also offers a...
Nonparametric Bayesian density estimation on manifolds with applications to planar shapes.
Bhattacharya, Abhishek; Dunson, David B
2010-12-01
Statistical analysis on landmark-based shape spaces has diverse applications in morphometrics, medical diagnostics, machine vision and other areas. These shape spaces are non-Euclidean quotient manifolds. To conduct nonparametric inferences, one may define notions of centre and spread on this manifold and work with their estimates. However, it is useful to consider full likelihood-based methods, which allow nonparametric estimation of the probability density. This article proposes a broad class of mixture models constructed using suitable kernels on a general compact metric space and then on the planar shape space in particular. Following a Bayesian approach with a nonparametric prior on the mixing distribution, conditions are obtained under which the Kullback-Leibler property holds, implying large support and weak posterior consistency. Gibbs sampling methods are developed for posterior computation, and the methods are applied to problems in density estimation and classification with shape-based predictors. Simulation studies show improved estimation performance relative to existing approaches.
Zhao, Zhibiao
2011-06-01
We address the nonparametric model validation problem for hidden Markov models with partially observable variables and hidden states. We achieve this goal by constructing a nonparametric simultaneous confidence envelope for transition density function of the observable variables and checking whether the parametric density estimate is contained within such an envelope. Our specification test procedure is motivated by a functional connection between the transition density of the observable variables and the Markov transition kernel of the hidden states. Our approach is applicable for continuous time diffusion models, stochastic volatility models, nonlinear time series models, and models with market microstructure noise.
Multivariate nonparametric regression and visualization with R and applications to finance
Klemelä, Jussi
2014-01-01
A modern approach to statistical learning and its applications through visualization methods With a unique and innovative presentation, Multivariate Nonparametric Regression and Visualization provides readers with the core statistical concepts to obtain complete and accurate predictions when given a set of data. Focusing on nonparametric methods to adapt to the multiple types of data generatingmechanisms, the book begins with an overview of classification and regression. The book then introduces and examines various tested and proven visualization techniques for learning samples and functio
Directory of Open Access Journals (Sweden)
Rabia Ece OMAY
2013-06-01
Full Text Available In this study, relationship between gross domestic product (GDP per capita and sulfur dioxide (SO2 and particulate matter (PM10 per capita is modeled for Turkey. Nonparametric fixed effect panel data analysis is used for the modeling. The panel data covers 12 territories, in first level of Nomenclature of Territorial Units for Statistics (NUTS, for period of 1990-2001. Modeling of the relationship between GDP and SO2 and PM10 for Turkey, the non-parametric models have given good results.
Nonparametric method for failures diagnosis in the actuating subsystem of aircraft control system
Terentev, M. N.; Karpenko, S. S.; Zybin, E. Yu; Kosyanchuk, V. V.
2018-02-01
In this paper we design a nonparametric method for failures diagnosis in the aircraft control system that uses the measurements of the control signals and the aircraft states only. It doesn’t require a priori information of the aircraft model parameters, training or statistical calculations, and is based on analytical nonparametric one-step-ahead state prediction approach. This makes it possible to predict the behavior of unidentified and failure dynamic systems, to weaken the requirements to control signals, and to reduce the diagnostic time and problem complexity.
Feng, Jinchao; Lansford, Joshua; Mironenko, Alexander; Pourkargar, Davood Babaei; Vlachos, Dionisios G.; Katsoulakis, Markos A.
2018-03-01
We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.
Directory of Open Access Journals (Sweden)
Jinchao Feng
2018-03-01
Full Text Available We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data. The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.
A Bayesian approach to the analysis of quantal bioassay studies using nonparametric mixture models.
Fronczyk, Kassandra; Kottas, Athanasios
2014-03-01
We develop a Bayesian nonparametric mixture modeling framework for quantal bioassay settings. The approach is built upon modeling dose-dependent response distributions. We adopt a structured nonparametric prior mixture model, which induces a monotonicity restriction for the dose-response curve. Particular emphasis is placed on the key risk assessment goal of calibration for the dose level that corresponds to a specified response. The proposed methodology yields flexible inference for the dose-response relationship as well as for other inferential objectives, as illustrated with two data sets from the literature. © 2013, The International Biometric Society.
Modern nonparametric, robust and multivariate methods festschrift in honour of Hannu Oja
Taskinen, Sara
2015-01-01
Written by leading experts in the field, this edited volume brings together the latest findings in the area of nonparametric, robust and multivariate statistical methods. The individual contributions cover a wide variety of topics ranging from univariate nonparametric methods to robust methods for complex data structures. Some examples from statistical signal processing are also given. The volume is dedicated to Hannu Oja on the occasion of his 65th birthday and is intended for researchers as well as PhD students with a good knowledge of statistics.
Tang, Niansheng; Chow, Sy-Miin; Ibrahim, Joseph G; Zhu, Hongtu
2017-12-01
Many psychological concepts are unobserved and usually represented as latent factors apprehended through multiple observed indicators. When multiple-subject multivariate time series data are available, dynamic factor analysis models with random effects offer one way of modeling patterns of within- and between-person variations by combining factor analysis and time series analysis at the factor level. Using the Dirichlet process (DP) as a nonparametric prior for individual-specific time series parameters further allows the distributional forms of these parameters to deviate from commonly imposed (e.g., normal or other symmetric) functional forms, arising as a result of these parameters' restricted ranges. Given the complexity of such models, a thorough sensitivity analysis is critical but computationally prohibitive. We propose a Bayesian local influence method that allows for simultaneous sensitivity analysis of multiple modeling components within a single fitting of the model of choice. Five illustrations and an empirical example are provided to demonstrate the utility of the proposed approach in facilitating the detection of outlying cases and common sources of misspecification in dynamic factor analysis models, as well as identification of modeling components that are sensitive to changes in the DP prior specification.
Humic Substances from Manila Bay and Bolinao Bay Sediments
Directory of Open Access Journals (Sweden)
Elma Llaguno
1997-12-01
Full Text Available The C,H,N composition of sedimentary humic acids (HA extracted from three sites in Manila Bay and six sites in Bolinao Bay yielded H/C atomic ratios of 1.1-1.4 and N/C atomic ratios of 0.09 - 0.16. The Manila Bay HA's had lower H/C and N/C ratios compared to those from Bolinao Bay. The IR spectra showed prominent aliphatic C-H and amide I and II bands. Manila Bay HA's also had less diverse molecular composition based on the GC-MS analysis of the CuO and alkaline permanganate oxidation products of the humic acids.
Empirical Test Case Specification
DEFF Research Database (Denmark)
Kalyanova, Olena; Heiselberg, Per
This document includes the empirical specification on the IEA task of evaluation building energy simulation computer programs for the Double Skin Facades (DSF) constructions. There are two approaches involved into this procedure, one is the comparative approach and another is the empirical one. I....... In the comparative approach the outcomes of different software tools are compared, while in the empirical approach the modelling results are compared with the results of experimental test cases....
Rights, Jason D; Sterba, Sonya K
2016-11-01
Multilevel data structures are common in the social sciences. Often, such nested data are analysed with multilevel models (MLMs) in which heterogeneity between clusters is modelled by continuously distributed random intercepts and/or slopes. Alternatively, the non-parametric multilevel regression mixture model (NPMM) can accommodate the same nested data structures through discrete latent class variation. The purpose of this article is to delineate analytic relationships between NPMM and MLM parameters that are useful for understanding the indirect interpretation of the NPMM as a non-parametric approximation of the MLM, with relaxed distributional assumptions. We define how seven standard and non-standard MLM specifications can be indirectly approximated by particular NPMM specifications. We provide formulas showing how the NPMM can serve as an approximation of the MLM in terms of intraclass correlation, random coefficient means and (co)variances, heteroscedasticity of residuals at level 1, and heteroscedasticity of residuals at level 2. Further, we discuss how these relationships can be useful in practice. The specific relationships are illustrated with simulated graphical demonstrations, and direct and indirect interpretations of NPMM classes are contrasted. We provide an R function to aid in implementing and visualizing an indirect interpretation of NPMM classes. An empirical example is presented and future directions are discussed. © 2016 The British Psychological Society.
Block Empirical Likelihood for Longitudinal Single-Index Varying-Coefficient Model
Directory of Open Access Journals (Sweden)
Yunquan Song
2013-01-01
Full Text Available In this paper, we consider a single-index varying-coefficient model with application to longitudinal data. In order to accommodate the within-group correlation, we apply the block empirical likelihood procedure to longitudinal single-index varying-coefficient model, and prove a nonparametric version of Wilks’ theorem which can be used to construct the block empirical likelihood confidence region with asymptotically correct coverage probability for the parametric component. In comparison with normal approximations, the proposed method does not require a consistent estimator for the asymptotic covariance matrix, making it easier to conduct inference for the model's parametric component. Simulations demonstrate how the proposed method works.
Supremum Norm Posterior Contraction and Credible Sets for Nonparametric Multivariate Regression
Yoo, W.W.; Ghosal, S
2016-01-01
In the setting of nonparametric multivariate regression with unknown error variance, we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a
Does Private Tutoring Work? The Effectiveness of Private Tutoring: A Nonparametric Bounds Analysis
Hof, Stefanie
2014-01-01
Private tutoring has become popular throughout the world. However, evidence for the effect of private tutoring on students' academic outcome is inconclusive; therefore, this paper presents an alternative framework: a nonparametric bounds method. The present examination uses, for the first time, a large representative data-set in a European setting…
Testing a parametric function against a nonparametric alternative in IV and GMM settings
DEFF Research Database (Denmark)
Gørgens, Tue; Wurtz, Allan
This paper develops a specification test for functional form for models identified by moment restrictions, including IV and GMM settings. The general framework is one where the moment restrictions are specified as functions of data, a finite-dimensional parameter vector, and a nonparametric real ...
A structural nonparametric reappraisal of the CO2 emissions-income relationship
Azomahou, T.T.; Goedhuys - Degelin, Micheline; Nguyen-Van, P.
Relying on a structural nonparametric estimation, we show that co2 emissions clearly increase with income at low income levels. For higher income levels, we observe a decreasing relationship, though not significant. We also find thatco2 emissions monotonically increases with energy use at a
Wei, Jiawei
2011-07-01
We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work was originally motivated by a unique testing problem in genetic epidemiology (Chatterjee, et al., 2006) that involved a typical generalized linear model but with an additional term reminiscent of the Tukey one-degree-of-freedom formulation, and their interest was in testing for main effects of the genetic variables, while gaining statistical power by allowing for a possible interaction between genes and the environment. Later work (Maity, et al., 2009) involved the possibility of modeling the environmental variable nonparametrically, but they focused on whether there was a parametric main effect for the genetic variables. In this paper, we consider the complementary problem, where the interest is in testing for the main effect of the nonparametrically modeled environmental variable. We derive a generalized likelihood ratio test for this hypothesis, show how to implement it, and provide evidence that our method can improve statistical power when compared to standard partially linear models with main effects only. We use the method for the primary purpose of analyzing data from a case-control study of colorectal adenoma.
Assessing pupil and school performance by non-parametric and parametric techniques
de Witte, K.; Thanassoulis, E.; Simpson, G.; Battisti, G.; Charlesworth-May, A.
2010-01-01
This paper discusses the use of the non-parametric free disposal hull (FDH) and the parametric multi-level model (MLM) as alternative methods for measuring pupil and school attainment where hierarchical structured data are available. Using robust FDH estimates, we show how to decompose the overall
Nonparametric Estimation of Interval Reliability for Discrete-Time Semi-Markov Systems
DEFF Research Database (Denmark)
Georgiadis, Stylianos; Limnios, Nikolaos
2016-01-01
In this article, we consider a repairable discrete-time semi-Markov system with finite state space. The measure of the interval reliability is given as the probability of the system being operational over a given finite-length time interval. A nonparametric estimator is proposed for the interval...
Low default credit scoring using two-class non-parametric kernel density estimation
CSIR Research Space (South Africa)
Rademeyer, E
2016-12-01
Full Text Available This paper investigates the performance of two-class classification credit scoring data sets with low default ratios. The standard two-class parametric Gaussian and non-parametric Parzen classifiers are extended, using Bayes’ rule, to include either...
DEFF Research Database (Denmark)
Ramirez, José Rangel; Sørensen, John Dalsgaard
2011-01-01
This work illustrates the updating and incorporation of information in the assessment of fatigue reliability for offshore wind turbine. The new information, coming from external and condition monitoring can be used to direct updating of the stochastic variables through a non-parametric Bayesian u...
Analyzing cost efficient production behavior under economies of scope : A nonparametric methodology
Cherchye, L.J.H.; de Rock, B.; Vermeulen, F.M.P.
2008-01-01
In designing a production model for firms that generate multiple outputs, we take as a starting point that such multioutput production refers to economies of scope, which in turn originate from joint input use and input externalities. We provide a nonparametric characterization of cost-efficient
Analyzing Cost Efficient Production Behavior Under Economies of Scope : A Nonparametric Methodology
Cherchye, L.J.H.; de Rock, B.; Vermeulen, F.M.P.
2006-01-01
In designing a production model for firms that generate multiple outputs, we take as a starting point that such multi-output production refers to economies of scope, which in turn originate from joint input use and input externalities. We provide a nonparametric characterization of cost efficient
The Support Reduction Algorithm for Computing Non-Parametric Function Estimates in Mixture Models
GROENEBOOM, PIET; JONGBLOED, GEURT; WELLNER, JON A.
2008-01-01
In this paper, we study an algorithm (which we call the support reduction algorithm) that can be used to compute non-parametric M-estimators in mixture models. The algorithm is compared with natural competitors in the context of convex regression and the ‘Aspect problem’ in quantum physics.
A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)
Arenson, Ethan A.; Karabatsos, George
2017-01-01
Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…
Non-parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods
DEFF Research Database (Denmark)
Høg, Esben
In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example the Ornstein-Uhlenbeck process (the simplest mean-reverting process) has been used to model non-speculative price processes. We discuss non--parametric estimation of these processes...
Non-Parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods
DEFF Research Database (Denmark)
Høg, Esben
2003-01-01
In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example the Ornstein-Uhlenbeck process (the simplest mean--reverting process) has been used to model non-speculative price processes. We discuss non--parametric estimation of these processes...
A non-parametric Bayesian approach to decompounding from high frequency data
Gugushvili, Shota; van der Meulen, F.H.; Spreij, Peter
2016-01-01
Given a sample from a discretely observed compound Poisson process, we consider non-parametric estimation of the density f0 of its jump sizes, as well as of its intensity λ0. We take a Bayesian approach to the problem and specify the prior on f0 as the Dirichlet location mixture of normal densities.
Mittag, Kathleen Cage
Most researchers using factor analysis extract factors from a matrix of Pearson product-moment correlation coefficients. A method is presented for extracting factors in a non-parametric way, by extracting factors from a matrix of Spearman rho (rank correlation) coefficients. It is possible to factor analyze a matrix of association such that…
Data analysis with small samples and non-normal data nonparametrics and other strategies
Siebert, Carl F
2017-01-01
Written in everyday language for non-statisticians, this book provides all the information needed to successfully conduct nonparametric analyses. This ideal reference book provides step-by-step instructions to lead the reader through each analysis, screenshots of the software and output, and case scenarios to illustrate of all the analytic techniques.
A non-parametric method for correction of global radiation observations
DEFF Research Database (Denmark)
Bacher, Peder; Madsen, Henrik; Perers, Bengt
2013-01-01
in the observations are corrected. These are errors such as: tilt in the leveling of the sensor, shadowing from surrounding objects, clipping and saturation in the signal processing, and errors from dirt and wear. The method is based on a statistical non-parametric clear-sky model which is applied to both...
Nonparametric estimation in an "illness-death" model when all transition times are interval censored
DEFF Research Database (Denmark)
Frydman, Halina; Gerds, Thomas; Grøn, Randi
2013-01-01
We develop nonparametric maximum likelihood estimation for the parameters of an irreversible Markov chain on states {0,1,2} from the observations with interval censored times of 0 → 1, 0 → 2 and 1 → 2 transitions. The distinguishing aspect of the data is that, in addition to all transition times ...
Non-parametric Tuning of PID Controllers A Modified Relay-Feedback-Test Approach
Boiko, Igor
2013-01-01
The relay feedback test (RFT) has become a popular and efficient tool used in process identification and automatic controller tuning. Non-parametric Tuning of PID Controllers couples new modifications of classical RFT with application-specific optimal tuning rules to form a non-parametric method of test-and-tuning. Test and tuning are coordinated through a set of common parameters so that a PID controller can obtain the desired gain or phase margins in a system exactly, even with unknown process dynamics. The concept of process-specific optimal tuning rules in the nonparametric setup, with corresponding tuning rules for flow, level pressure, and temperature control loops is presented in the text. Common problems of tuning accuracy based on parametric and non-parametric approaches are addressed. In addition, the text treats the parametric approach to tuning based on the modified RFT approach and the exact model of oscillations in the system under test using the locus of a perturbedrelay system (LPRS) meth...
A comparative study of non-parametric models for identification of ...
African Journals Online (AJOL)
However, the frequency response method using random binary signals was good for unpredicted white noise characteristics and considered the best method for non-parametric system identifica-tion. The autoregressive external input (ARX) model was very useful for system identification, but on applicati-on, few input ...
A non-parametric hierarchical model to discover behavior dynamics from tracks
Kooij, J.F.P.; Englebienne, G.; Gavrila, D.M.
2012-01-01
We present a novel non-parametric Bayesian model to jointly discover the dynamics of low-level actions and high-level behaviors of tracked people in open environments. Our model represents behaviors as Markov chains of actions which capture high-level temporal dynamics. Actions may be shared by
Verrelst, Jochem; Rivera, Juan Pablo; Veroustraete, Frank; Muñoz-Marí, Jordi; Clevers, J.G.P.W.; Camps-Valls, Gustau; Moreno, José
2015-01-01
Given the forthcoming availability of Sentinel-2 (S2) images, this paper provides a systematic comparison of retrieval accuracy and processing speed of a multitude of parametric, non-parametric and physically-based retrieval methods using simulated S2 data. An experimental field dataset (SPARC),
Empirical Philosophy of Science
DEFF Research Database (Denmark)
Mansnerus, Erika; Wagenknecht, Susann
2015-01-01
knowledge takes place through the integration of the empirical or historical research into the philosophical studies, as Chang, Nersessian, Thagard and Schickore argue in their work. Building upon their contributions we will develop a blueprint for an Empirical Philosophy of Science that draws upon...... qualitative methods from the social sciences in order to advance our philosophical understanding of science in practice. We will regard the relationship between philosophical conceptualization and empirical data as an iterative dialogue between theory and data, which is guided by a particular ‘feeling with......Empirical insights are proven fruitful for the advancement of Philosophy of Science, but the integration of philosophical concepts and empirical data poses considerable methodological challenges. Debates in Integrated History and Philosophy of Science suggest that the advancement of philosophical...
International Nuclear Information System (INIS)
Janurová, Kateřina; Briš, Radim
2014-01-01
Medical survival right-censored data of about 850 patients are evaluated to analyze the uncertainty related to the risk of mortality on one hand and compare two basic surgery techniques in the context of risk of mortality on the other hand. Colorectal data come from patients who underwent colectomy in the University Hospital of Ostrava. Two basic surgery operating techniques are used for the colectomy: either traditional (open) or minimally invasive (laparoscopic). Basic question arising at the colectomy operation is, which type of operation to choose to guarantee longer overall survival time. Two non-parametric approaches have been used to quantify probability of mortality with uncertainties. In fact, complement of the probability to one, i.e. survival function with corresponding confidence levels is calculated and evaluated. First approach considers standard nonparametric estimators resulting from both the Kaplan–Meier estimator of survival function in connection with Greenwood's formula and the Nelson–Aalen estimator of cumulative hazard function including confidence interval for survival function as well. The second innovative approach, represented by Nonparametric Predictive Inference (NPI), uses lower and upper probabilities for quantifying uncertainty and provides a model of predictive survival function instead of the population survival function. The traditional log-rank test on one hand and the nonparametric predictive comparison of two groups of lifetime data on the other hand have been compared to evaluate risk of mortality in the context of mentioned surgery techniques. The size of the difference between two groups of lifetime data has been considered and analyzed as well. Both nonparametric approaches led to the same conclusion, that the minimally invasive operating technique guarantees the patient significantly longer survival time in comparison with the traditional operating technique
2006-01-01
The highest tides on Earth occur in the Minas Basin, the eastern extremity of the Bay of Fundy, Nova Scotia, Canada, where the tide range can reach 16 meters when the various factors affecting the tides are in phase. The primary cause of the immense tides of Fundy is a resonance of the Bay of Fundy-Gulf of Maine system. The system is effectively bounded at this outer end by the edge of the continental shelf with its approximately 40:1 increase in depth. The system has a natural period of approximately 13 hours, which is close to the 12h25m period of the dominant lunar tide of the Atlantic Ocean. Like a father pushing his daughter on a swing, the gentle Atlantic tidal pulse pushes the waters of the Bay of Fundy-Gulf of Maine basin at nearly the optimum frequency to cause a large to-and-fro oscillation. The greatest slosh occurs at the head (northeast end) of the system. The high tide image (top) was acquired April 20, 2001, and the low tide image (bottom) was acquired September 30, 2002. The images cover an area of 16.5 by 21 km, and are centered near 64 degrees west longitude and 45.5 degrees north latitude. With its 14 spectral bands from the visible to the thermal infrared wavelength region, and its high spatial resolution of 15 to 90 meters (about 50 to 300 feet), ASTER images Earth to map and monitor the changing surface of our planet. ASTER is one of five Earth-observing instruments launched December 18, 1999, on NASA's Terra satellite. The instrument was built by Japan's Ministry of Economy, Trade and Industry. A joint U.S./Japan science team is responsible for validation and calibration of the instrument and the data products. The broad spectral coverage and high spectral resolution of ASTER provides scientists in numerous disciplines with critical information for surface mapping, and monitoring of dynamic conditions and temporal change. Example applications are: monitoring glacial advances and retreats; monitoring potentially active volcanoes; identifying
Bio-optical water quality dynamics observed from MERIS in Pensacola Bay, Florida
Observed bio-optical water quality data collected from 2009 to 2011 in Pensacola Bay, Florida were used to develop empirical remote sensing retrieval algorithms for chlorophyll a (Chla), colored dissolved organic matter (CDOM), and suspended particulate matter (SPM). Time-series ...
Bayes procedures for adaptive inference in inverse problems for the white noise model
Knapik, B.T.; Szabó, B.T.; van der Vaart, A.W.; van Zanten, J.H.
2016-01-01
We study empirical and hierarchical Bayes approaches to the problem of estimating an infinite-dimensional parameter in mildly ill-posed inverse problems. We consider a class of prior distributions indexed by a hyperparameter that quantifies regularity. We prove that both methods we consider succeed
DEFF Research Database (Denmark)
A watershed moment of the twentieth century, the end of empire saw upheavals to global power structures and national identities. However, decolonisation profoundly affected individual subjectivities too. Life Writing After Empire examines how people around the globe have made sense of the post...... in order to understand how individual life writing reflects broader societal changes. From far-flung corners of the former British Empire, people have turned to life writing to manage painful or nostalgic memories, as well as to think about the past and future of the nation anew through the personal...
Theological reflections on empire
Directory of Open Access Journals (Sweden)
Allan A. Boesak
2009-11-01
Full Text Available Since the meeting of the World Alliance of Reformed Churches in Accra, Ghana (2004, and the adoption of the Accra Declaration, a debate has been raging in the churches about globalisation, socio-economic justice, ecological responsibility, political and cultural domination and globalised war. Central to this debate is the concept of empire and the way the United States is increasingly becoming its embodiment. Is the United States a global empire? This article argues that the United States has indeed become the expression of a modern empire and that this reality has considerable consequences, not just for global economics and politics but for theological refl ection as well.
Yates, K.K.; Cronin, T. M.; Crane, M.; Hansen, M.; Nayeghandi, A.; Swarzenski, P.; Edgar, T.; Brooks, G.R.; Suthard, B.; Hine, A.; Locker, S.; Willard, D.A.; Hastings, D.; Flower, B.; Hollander, D.; Larson, R.A.; Smith, K.
2007-01-01
Many of the nation's estuaries have been environmentally stressed since the turn of the 20th century and will continue to be impacted in the future. Tampa Bay, one the Gulf of Mexico's largest estuaries, exemplifies the threats that our estuaries face (EPA Report 2001, Tampa Bay Estuary Program-Comprehensive Conservation and Management Plan (TBEP-CCMP)). More than 2 million people live in the Tampa Bay watershed, and the population constitutes to grow. Demand for freshwater resources, conversion of undeveloped areas to resident and industrial uses, increases in storm-water runoff, and increased air pollution from urban and industrial sources are some of the known human activities that impact Tampa Bay. Beginning on 2001, additional anthropogenic modifications began in Tampa Bat including construction of an underwater gas pipeline and a desalinization plant, expansion of existing ports, and increased freshwater withdrawal from three major tributaries to the bay. In January of 2001, the Tampa Bay Estuary Program (TBEP) and its partners identifies a critical need for participation from the U.S. Geological Survey (USGS) in providing multidisciplinary expertise and a regional-scale, integrated science approach to address complex scientific research issue and critical scientific information gaps that are necessary for continued restoration and preservation of Tampa Bay. Tampa Bay stakeholders identified several critical science gaps for which USGS expertise was needed (Yates et al. 2001). These critical science gaps fall under four topical categories (or system components): 1) water and sediment quality, 2) hydrodynamics, 3) geology and geomorphology, and 4) ecosystem structure and function. Scientists and resource managers participating in Tampa Bay studies recognize that it is no longer sufficient to simply examine each of these estuarine system components individually, Rather, the interrelation among system components must be understood to develop conceptual and
Directory of Open Access Journals (Sweden)
Z. Nematollahi
2016-03-01
Full Text Available Introduction: Due to existence of the risk and uncertainty in agriculture, risk management is crucial for management in agriculture. Therefore the present study was designed to determine the risk aversion coefficient for Esfarayens farmers. Materials and Methods: The following approaches have been utilized to assess risk attitudes: (1 direct elicitation of utility functions, (2 experimental procedures in which individuals are presented with hypothetical questionnaires regarding risky alternatives with or without real payments and (3: Inference from observation of economic behavior. In this paper, we focused on approach (3: inference from observation of economic behavior, based on this assumption of existence of the relationship between the actual behavior of a decision maker and the behavior predicted from empirically specified models. A new non-parametric method and the QP method were used to calculate the coefficient of risk aversion. We maximized the decision maker expected utility with the E-V formulation (Freund, 1956. Ideally, in constructing a QP model, the variance-covariance matrix should be formed for each individual farmer. For this purpose, a sample of 100 farmers was selected using random sampling and their data about 14 products of years 2008- 2012 were assembled. The lowlands of Esfarayen were used since within this area, production possibilities are rather homogeneous. Results and Discussion: The results of this study showed that there was low correlation between some of the activities, which implies opportunities for income stabilization through diversification. With respect to transitory income, Ra, vary from 0.000006 to 0.000361 and the absolute coefficient of risk aversion in our sample were 0.00005. The estimated Ra values vary considerably from farm to farm. The results showed that the estimated Ra for the subsample existing of 'non-wealthy' farmers was 0.00010. The subsample with farmers in the 'wealthy' group had an
DEFF Research Database (Denmark)
Linnet, Kristian
2005-01-01
Bootstrap, HPLC, limit of blank, limit of detection, non-parametric statistics, type I and II errors......Bootstrap, HPLC, limit of blank, limit of detection, non-parametric statistics, type I and II errors...
Management case study: Tampa Bay, Florida
Morrison, Gerold; Greening, Holly; Yates, Kimberly K.; Wolanski, Eric; McLusky, Donald S.
2011-01-01
Tampa Bay, Florida, USA, is a shallow, subtropical estuary that experienced severe cultural eutrophication between the 1940s and 1980s, a period when the human population of its watershed quadrupled. In response, citizen action led to the formation of a public- and private-sector partnership (the Tampa Bay Estuary Program), which adopted a number of management objectives to support the restoration and protection of the bay’s living resources. These included numeric chlorophyll a and water-clarity targets, as well as long-term goals addressing the spatial extent of seagrasses and other selected habitat types, to support estuarine-dependent faunal guilds. Over the past three decades, nitrogen controls involving sources such as wastewater treatment plants, stormwater conveyance systems, fertilizer manufacturing and shipping operations, and power plants have been undertaken to meet these and other management objectives. Cumulatively, these controls have resulted in a 60% reduction in annual total nitrogen (TN) loads relative to earlier worse-case (latter 1970s) conditions. As a result, annual water-clarity and chlorophyll a targets are currently met in most years, and seagrass cover measured in 2008 was the highest recorded since 1950. Factors that have contributed to the observed improvements in Tampa Bay over the past several decades include the following: (1) Development of numeric, science-based water-quality targets to meet a long-term goal of restoring seagrass acreage to 1950s levels. Empirical and mechanistic models found that annual average chlorophyll a concentrations were a primary manageable factor affecting light attenuation. The models also quantified relationships between TN loads, chlorophyll a concentrations, light attenuation, and fluctuations in seagrass cover. The availability of long-term monitoring data, and a systematic process for using the data to evaluate the effectiveness of management actions, has allowed managers to track progress and
African Journals Online (AJOL)
FIRST LADY
2011-01-18
Jan 18, 2011 ... Empirical results reveal that consumption of sugar in. Kenya varies ... experiences in trade in different regions of the world. Some studies ... To assess the relationship between domestic sugar retail prices and sugar sales in ...
National Oceanic and Atmospheric Administration, Department of Commerce — Samples were collected from October 15, 1985 through June 12, 1987 in emergent marsh and non-vegetated habitats throughout the Lavaca Bay system to characterize...
National Oceanic and Atmospheric Administration, Department of Commerce — Juvenile spotted seatrout and other sportfish are being monitored annually over a 6-mo period in Florida Bay to assess their abundance over time relative to...
Resilience of coastal wetlands to extreme hydrologicevents in Apalachicola Bay
Medeiros, S. C.; Singh, A.; Tahsin, S.
2017-12-01
Extreme hydrologic events such as hurricanes and droughts continuously threaten wetlands which provide key ecosystem services in coastal areas. The recovery time for vegetation after impact fromthese extreme events can be highly variable depending on the hazard type and intensity. Apalachicola Bay in Florida is home to a rich variety of saltwater and freshwater wetlands and is subject to a wide rangeof hydrologic hazards. Using spatiotemporal changes in Landsat-based empirical vegetation indices, we investigate the impact of hurricane and drought on both freshwater and saltwater wetlands from year 2000to 2015 in Apalachicola Bay. Our results indicate that saltwater wetlands are more resilient than freshwater wetlands and suggest that in response to hurricanes, the coastal wetlands took almost a year to recover,while recovery following a drought period was observed after only a month.
Directory of Open Access Journals (Sweden)
Chua Ming-chung
2016-01-01
Full Text Available Utilizing powerful nuclear reactors as antineutrino sources, high mountains to provide ample shielding from cosmic rays in the vicinity, and functionally identical detectors with large target volume for near-far relative measurement, the Daya Bay Reactor Neutrino Experiment has achieved unprecedented precision in measuring the neutrino mixing angle θ13 and the neutrino mass squared difference |Δm2ee|. I will report the latest Daya Bay results on neutrino oscillations and light sterile neutrino search.
Hadron Energy Reconstruction for ATLAS Barrel Combined Calorimeter Using Non-Parametrical Method
Kulchitskii, Yu A
2000-01-01
Hadron energy reconstruction for the ATLAS barrel prototype combined calorimeter in the framework of the non-parametrical method is discussed. The non-parametrical method utilizes only the known e/h ratios and the electron calibration constants and does not require the determination of any parameters by a minimization technique. Thus, this technique lends itself to fast energy reconstruction in a first level trigger. The reconstructed mean values of the hadron energies are within \\pm1% of the true values and the fractional energy resolution is [(58\\pm 3)%{\\sqrt{GeV}}/\\sqrt{E}+(2.5\\pm0.3)%]\\bigoplus(1.7\\pm0.2) GeV/E. The value of the e/h ratio obtained for the electromagnetic compartment of the combined calorimeter is 1.74\\pm0.04. Results of a study of the longitudinal hadronic shower development are also presented.
On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests
Directory of Open Access Journals (Sweden)
Aaditya Ramdas
2017-01-01
Full Text Available Nonparametric two-sample or homogeneity testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. The literature is old and rich, with a wide variety of statistics having being designed and analyzed, both for the unidimensional and the multivariate setting. Inthisshortsurvey,wefocusonteststatisticsthatinvolvetheWassersteindistance. Usingan entropic smoothing of the Wasserstein distance, we connect these to very different tests including multivariate methods involving energy statistics and kernel based maximum mean discrepancy and univariate methods like the Kolmogorov–Smirnov test, probability or quantile (PP/QQ plots and receiver operating characteristic or ordinal dominance (ROC/ODC curves. Some observations are implicit in the literature, while others seem to have not been noticed thus far. Given nonparametric two-sample testing’s classical and continued importance, we aim to provide useful connections for theorists and practitioners familiar with one subset of methods but not others.
Nonparametric NAR-ARCH Modelling of Stock Prices by the Kernel Methodology
Directory of Open Access Journals (Sweden)
Mohamed Chikhi
2018-02-01
Full Text Available This paper analyses cyclical behaviour of Orange stock price listed in French stock exchange over 01/03/2000 to 02/02/2017 by testing the nonlinearities through a class of conditional heteroscedastic nonparametric models. The linearity and Gaussianity assumptions are rejected for Orange Stock returns and informational shocks have transitory effects on returns and volatility. The forecasting results show that Orange stock prices are short-term predictable and nonparametric NAR-ARCH model has better performance over parametric MA-APARCH model for short horizons. Plus, the estimates of this model are also better comparing to the predictions of the random walk model. This finding provides evidence for weak form of inefficiency in Paris stock market with limited rationality, thus it emerges arbitrage opportunities.
Bayesian Bandwidth Selection for a Nonparametric Regression Model with Mixed Types of Regressors
Directory of Open Access Journals (Sweden)
Xibin Zhang
2016-04-01
Full Text Available This paper develops a sampling algorithm for bandwidth estimation in a nonparametric regression model with continuous and discrete regressors under an unknown error density. The error density is approximated by the kernel density estimator of the unobserved errors, while the regression function is estimated using the Nadaraya-Watson estimator admitting continuous and discrete regressors. We derive an approximate likelihood and posterior for bandwidth parameters, followed by a sampling algorithm. Simulation results show that the proposed approach typically leads to better accuracy of the resulting estimates than cross-validation, particularly for smaller sample sizes. This bandwidth estimation approach is applied to nonparametric regression model of the Australian All Ordinaries returns and the kernel density estimation of gross domestic product (GDP growth rates among the organisation for economic co-operation and development (OECD and non-OECD countries.
Bootstrap Prediction Intervals in Non-Parametric Regression with Applications to Anomaly Detection
Kumar, Sricharan; Srivistava, Ashok N.
2012-01-01
Prediction intervals provide a measure of the probable interval in which the outputs of a regression model can be expected to occur. Subsequently, these prediction intervals can be used to determine if the observed output is anomalous or not, conditioned on the input. In this paper, a procedure for determining prediction intervals for outputs of nonparametric regression models using bootstrap methods is proposed. Bootstrap methods allow for a non-parametric approach to computing prediction intervals with no specific assumptions about the sampling distribution of the noise or the data. The asymptotic fidelity of the proposed prediction intervals is theoretically proved. Subsequently, the validity of the bootstrap based prediction intervals is illustrated via simulations. Finally, the bootstrap prediction intervals are applied to the problem of anomaly detection on aviation data.
A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems
Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J.
2017-06-01
We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.
Bayesian Non-Parametric Mixtures of GARCH(1,1 Models
Directory of Open Access Journals (Sweden)
John W. Lau
2012-01-01
Full Text Available Traditional GARCH models describe volatility levels that evolve smoothly over time, generated by a single GARCH regime. However, nonstationary time series data may exhibit abrupt changes in volatility, suggesting changes in the underlying GARCH regimes. Further, the number and times of regime changes are not always obvious. This article outlines a nonparametric mixture of GARCH models that is able to estimate the number and time of volatility regime changes by mixing over the Poisson-Kingman process. The process is a generalisation of the Dirichlet process typically used in nonparametric models for time-dependent data provides a richer clustering structure, and its application to time series data is novel. Inference is Bayesian, and a Markov chain Monte Carlo algorithm to explore the posterior distribution is described. The methodology is illustrated on the Standard and Poor's 500 financial index.
Bornkamp, Björn; Ickstadt, Katja
2009-03-01
In this article, we consider monotone nonparametric regression in a Bayesian framework. The monotone function is modeled as a mixture of shifted and scaled parametric probability distribution functions, and a general random probability measure is assumed as the prior for the mixing distribution. We investigate the choice of the underlying parametric distribution function and find that the two-sided power distribution function is well suited both from a computational and mathematical point of view. The model is motivated by traditional nonlinear models for dose-response analysis, and provides possibilities to elicitate informative prior distributions on different aspects of the curve. The method is compared with other recent approaches to monotone nonparametric regression in a simulation study and is illustrated on a data set from dose-response analysis.
Promotion time cure rate model with nonparametric form of covariate effects.
Chen, Tianlei; Du, Pang
2018-05-10
Survival data with a cured portion are commonly seen in clinical trials. Motivated from a biological interpretation of cancer metastasis, promotion time cure model is a popular alternative to the mixture cure rate model for analyzing such data. The existing promotion cure models all assume a restrictive parametric form of covariate effects, which can be incorrectly specified especially at the exploratory stage. In this paper, we propose a nonparametric approach to modeling the covariate effects under the framework of promotion time cure model. The covariate effect function is estimated by smoothing splines via the optimization of a penalized profile likelihood. Point-wise interval estimates are also derived from the Bayesian interpretation of the penalized profile likelihood. Asymptotic convergence rates are established for the proposed estimates. Simulations show excellent performance of the proposed nonparametric method, which is then applied to a melanoma study. Copyright © 2018 John Wiley & Sons, Ltd.
Riihimäki, Jaakko; Sund, Reijo; Vehtari, Aki
2010-06-01
Effective utilisation of limited resources is a challenge for health care providers. Accurate and relevant information extracted from the length of stay distributions is useful for management purposes. Patient care episodes can be reconstructed from the comprehensive health registers, and in this paper we develop a Bayesian approach to analyse the length of care episode after a fractured hip. We model the large scale data with a flexible nonparametric multilayer perceptron network and with a parametric Weibull mixture model. To assess the performances of the models, we estimate expected utilities using predictive density as a utility measure. Since the model parameters cannot be directly compared, we focus on observables, and estimate the relevances of patient explanatory variables in predicting the length of stay. To demonstrate how the use of the nonparametric flexible model is advantageous for this complex health care data, we also study joint effects of variables in predictions, and visualise nonlinearities and interactions found in the data.
A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems.
Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J
2017-06-01
We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.
Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks
Gray-Davies, Tristan; Holmes, Chris C.; Caron, François
2018-01-01
We present a novel Bayesian nonparametric regression model for covariates X and continuous response variable Y ∈ ℝ. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F (y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows for the use of standard, existing, software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied. As an illustration, we show an application of our approach to a US Census dataset, with over 1,300,000 data points and more than 100 covariates. PMID:29623150
Yau, Christopher; Holmes, Chris
2011-07-01
We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.
von Hirschhausen, Christian R.; Cullmann, Astrid
2005-01-01
Abstract This paper applies parametric and non-parametric and parametric tests to assess the efficiency of electricity distribution companies in Germany. We address traditional issues in electricity sector benchmarking, such as the role of scale effects and optimal utility size, as well as new evidence specific to the situation in Germany. We use labour, capital, and peak load capacity as inputs, and units sold and the number of customers as output. The data cover 307 (out of 553) ...
Ambrogioni, Luca; Güçlü, Umut; van Gerven, Marcel A. J.; Maris, Eric
2017-01-01
This paper introduces the kernel mixture network, a new method for nonparametric estimation of conditional probability densities using neural networks. We model arbitrarily complex conditional densities as linear combinations of a family of kernel functions centered at a subset of training points. The weights are determined by the outer layer of a deep neural network, trained by minimizing the negative log likelihood. This generalizes the popular quantized softmax approach, which can be seen ...
Driving Style Analysis Using Primitive Driving Patterns With Bayesian Nonparametric Approaches
Wang, Wenshuo; Xi, Junqiang; Zhao, Ding
2017-01-01
Analysis and recognition of driving styles are profoundly important to intelligent transportation and vehicle calibration. This paper presents a novel driving style analysis framework using the primitive driving patterns learned from naturalistic driving data. In order to achieve this, first, a Bayesian nonparametric learning method based on a hidden semi-Markov model (HSMM) is introduced to extract primitive driving patterns from time series driving data without prior knowledge of the number...
Nonparametric Change Point Diagnosis Method of Concrete Dam Crack Behavior Abnormality
Li, Zhanchao; Gu, Chongshi; Wu, Zhongru
2013-01-01
The study on diagnosis method of concrete crack behavior abnormality has always been a hot spot and difficulty in the safety monitoring field of hydraulic structure. Based on the performance of concrete dam crack behavior abnormality in parametric statistical model and nonparametric statistical model, the internal relation between concrete dam crack behavior abnormality and statistical change point theory is deeply analyzed from the model structure instability of parametric statistical model ...
Adaptive nonparametric estimation for L\\'evy processes observed at low frequency
Kappus, Johanna
2013-01-01
This article deals with adaptive nonparametric estimation for L\\'evy processes observed at low frequency. For general linear functionals of the L\\'evy measure, we construct kernel estimators, provide upper risk bounds and derive rates of convergence under regularity assumptions. Our focus lies on the adaptive choice of the bandwidth, using model selection techniques. We face here a non-standard problem of model selection with unknown variance. A new approach towards this problem is proposed, ...
Bootstrapping the economy -- a non-parametric method of generating consistent future scenarios
Müller, Ulrich A; Bürgi, Roland; Dacorogna, Michel M
2004-01-01
The fortune and the risk of a business venture depends on the future course of the economy. There is a strong demand for economic forecasts and scenarios that can be applied to planning and modeling. While there is an ongoing debate on modeling economic scenarios, the bootstrapping (or resampling) approach presented here has several advantages. As a non-parametric method, it directly relies on past market behaviors rather than debatable assumptions on models and parameters. Simultaneous dep...
Generalized Correlation Coefficient for Non-Parametric Analysis of Microarray Time-Course Data
DEFF Research Database (Denmark)
Tan, Qihua; Thomassen, Mads; Burton, Mark
2017-01-01
the heterogeneous time-course gene expression patterns. Application of the method identified nonlinear time-course patterns in high agreement with parametric analysis. We conclude that the non-parametric nature in the generalized correlation analysis could be an useful and efficient tool for analyzing microarray...... time-course data and for exploring the complex relationships in the omics data for studying their association with disease and health....
Remotely Sensing Pollution: Detection and Monitoring of PCBs in the San Francisco Bay
Hilton, A.; Kudela, R. M.; Bausell, J.
2016-12-01
While the EPA banned polychlorinated biphenyls (PCBs) in 1977, they continue to persist in San Francisco Bay (SF Bay), often at dangerously high concentrations due to their long half-life. However, in spite of their associated health and environmental risks, PCB monitoring within SF Bay is extremely limited, due in large part to the high costs, both in terms of labor and capital that are associated with it. In this study, a cost effective alternative to in-situ PCB sampling is presented by demonstrating the feasibility of PCB detection via remote sensing. This was done by first establishing relationships between in-situ measurements of sum of 40 PCB concentrations and total suspended sediment concentration (SSC) collected from 1998-2006 at 37 stations distributed throughout SF Bay. A correlation was discovered for all stations at (R2 =0.32), which improved markedly upon partitioning stations into north bay, (R2 =0.64), central bay (R2 =0.80) and south bay (R2 =0.52) regions. SSC was then compared from three USGS monitoring stations with temporally consistent Landsat 8 imagery. The resulting correlation between Landsat 8 (Rrs 654) and SSC measured at USGS stations (R2 =0.50) was validated using an Airborne Visible/ Infrared Imaging Spectrometer (AVIRIS) image. The end product is a two-step empirical algorithm that can derive PCB from Landsat 8 imagery within SF Bay. This algorithm can generate spatial PCB concentration maps for SF Bay, which can in turn be utilized to increase ability to forecast PCB concentration. The observation that correlation between AVIRIS (Rrs 657) and SSC was stronger than that of Landsat 8 suggests that the accuracy of this algorithm could be enhanced with improved atmospheric correction.
Directory of Open Access Journals (Sweden)
Ibsen Chivatá Cárdenas
2008-05-01
Full Text Available This article presents a rainfall model constructed by applying non-parametric modelling and imprecise probabilities; these tools were used because there was not enough homogeneous information in the study area. The area’s hydro-logical information regarding rainfall was scarce and existing hydrological time series were not uniform. A distributed extended rainfall model was constructed from so-called probability boxes (p-boxes, multinomial probability distribu-tion and confidence intervals (a friendly algorithm was constructed for non-parametric modelling by combining the last two tools. This model confirmed the high level of uncertainty involved in local rainfall modelling. Uncertainty en-compassed the whole range (domain of probability values thereby showing the severe limitations on information, leading to the conclusion that a detailed estimation of probability would lead to significant error. Nevertheless, rele-vant information was extracted; it was estimated that maximum daily rainfall threshold (70 mm would be surpassed at least once every three years and the magnitude of uncertainty affecting hydrological parameter estimation. This paper’s conclusions may be of interest to non-parametric modellers and decisions-makers as such modelling and imprecise probability represents an alternative for hydrological variable assessment and maybe an obligatory proce-dure in the future. Its potential lies in treating scarce information and represents a robust modelling strategy for non-seasonal stochastic modelling conditions
Dai, Wenlin; Tong, Tiejun; Zhu, Lixing
2017-01-01
Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework has greatly enriched the existing literature on variance estimation that includes most existing estimators as special cases. More importantly, the unified framework has also provided a smart way to solve the challenging difference sequence selection problem that remains a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend to use the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.
Smooth semi-nonparametric (SNP) estimation of the cumulative incidence function.
Duc, Anh Nguyen; Wolbers, Marcel
2017-08-15
This paper presents a novel approach to estimation of the cumulative incidence function in the presence of competing risks. The underlying statistical model is specified via a mixture factorization of the joint distribution of the event type and the time to the event. The time to event distributions conditional on the event type are modeled using smooth semi-nonparametric densities. One strength of this approach is that it can handle arbitrary censoring and truncation while relying on mild parametric assumptions. A stepwise forward algorithm for model estimation and adaptive selection of smooth semi-nonparametric polynomial degrees is presented, implemented in the statistical software R, evaluated in a sequence of simulation studies, and applied to data from a clinical trial in cryptococcal meningitis. The simulations demonstrate that the proposed method frequently outperforms both parametric and nonparametric alternatives. They also support the use of 'ad hoc' asymptotic inference to derive confidence intervals. An extension to regression modeling is also presented, and its potential and challenges are discussed. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Dai, Wenlin
2017-09-01
Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework has greatly enriched the existing literature on variance estimation that includes most existing estimators as special cases. More importantly, the unified framework has also provided a smart way to solve the challenging difference sequence selection problem that remains a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend to use the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.
Empirical philosophy of science
DEFF Research Database (Denmark)
Wagenknecht, Susann; Nersessian, Nancy J.; Andersen, Hanne
2015-01-01
A growing number of philosophers of science make use of qualitative empirical data, a development that may reconfigure the relations between philosophy and sociology of science and that is reminiscent of efforts to integrate history and philosophy of science. Therefore, the first part...... of this introduction to the volume Empirical Philosophy of Science outlines the history of relations between philosophy and sociology of science on the one hand, and philosophy and history of science on the other. The second part of this introduction offers an overview of the papers in the volume, each of which...... is giving its own answer to questions such as: Why does the use of qualitative empirical methods benefit philosophical accounts of science? And how should these methods be used by the philosopher?...
DEFF Research Database (Denmark)
Gravier, Magali
2011-01-01
The article discusses the concepts of federation and empire in the context of the European Union (EU). Even if these two concepts are not usually contrasted to one another, the article shows that they refer to related type of polities. Furthermore, they can be used at a time because they shed light...... on different and complementary aspects of the European integration process. The article concludes that the EU is at the crossroads between federation and empire and may remain an ‘imperial federation’ for several decades. This could mean that the EU is on the verge of transforming itself to another type...
Empirical comparison of theories
International Nuclear Information System (INIS)
Opp, K.D.; Wippler, R.
1990-01-01
The book represents the first, comprehensive attempt to take an empirical approach for comparative assessment of theories in sociology. The aims, problems, and advantages of the empirical approach are discussed in detail, and the three theories selected for the purpose of this work are explained. Their comparative assessment is performed within the framework of several research projects, which among other subjects also investigate the social aspects of the protest against nuclear power plants. The theories analysed in this context are the theory of mental incongruities and that of the benefit, and their efficiency in explaining protest behaviour is compared. (orig./HSCH) [de
DEFF Research Database (Denmark)
Grund, Cynthia M.
The toolbox for empirically exploring the ways that artistic endeavors convey and activate meaning on the part of performers and audiences continues to expand. Current work employing methods at the intersection of performance studies, philosophy, motion capture and neuroscience to better understand...... musical performance and reception is inspired by traditional approaches within aesthetics, but it also challenges some of the presuppositions inherent in them. As an example of such work I present a research project in empirical music aesthetics begun last year and of which I am a team member....
Schiemann, R.; Erdin, R.; Willi, M.; Frei, C.; Berenguer, M.; Sempere-Torres, D.
2011-05-01
Modelling spatial covariance is an essential part of all geostatistical methods. Traditionally, parametric semivariogram models are fit from available data. More recently, it has been suggested to use nonparametric correlograms obtained from spatially complete data fields. Here, both estimation techniques are compared. Nonparametric correlograms are shown to have a substantial negative bias. Nonetheless, when combined with the sample variance of the spatial field under consideration, they yield an estimate of the semivariogram that is unbiased for small lag distances. This justifies the use of this estimation technique in geostatistical applications. Various formulations of geostatistical combination (Kriging) methods are used here for the construction of hourly precipitation grids for Switzerland based on data from a sparse realtime network of raingauges and from a spatially complete radar composite. Two variants of Ordinary Kriging (OK) are used to interpolate the sparse gauge observations. In both OK variants, the radar data are only used to determine the semivariogram model. One variant relies on a traditional parametric semivariogram estimate, whereas the other variant uses the nonparametric correlogram. The variants are tested for three cases and the impact of the semivariogram model on the Kriging prediction is illustrated. For the three test cases, the method using nonparametric correlograms performs equally well or better than the traditional method, and at the same time offers great practical advantages. Furthermore, two variants of Kriging with external drift (KED) are tested, both of which use the radar data to estimate nonparametric correlograms, and as the external drift variable. The first KED variant has been used previously for geostatistical radar-raingauge merging in Catalonia (Spain). The second variant is newly proposed here and is an extension of the first. Both variants are evaluated for the three test cases as well as an extended evaluation
Directory of Open Access Journals (Sweden)
Joaquín Texeira Quirós
2013-09-01
Full Text Available Purpose: This empirical study analyzes a questionnaire answered by a sample of ISO 9000 certified companies and a control sample of companies which have not been certified, using a multivariate predictive model. With this approach, we assess which quality practices are associated to the likelihood of the firm being certified. Design/methodology/approach: We implemented nonparametric decision trees, in order to see which variables influence more the fact that the company be certified or not, i.e., the motivations that lead companies to make sure. Findings: The results show that only four questionnaire items are sufficient to predict if a firm is certified or not. It is shown that companies in which the respondent manifests greater concern with respect to customers relations; motivations of the employees and strategic planning have higher likelihood of being certified. Research implications: the reader should note that this study is based on data from a single country and, of course, these results capture many idiosyncrasies if its economic and corporate environment. It would be of interest to understand if this type of analysis reveals some regularities across different countries. Practical implications: companies should look for a set of practices congruent with total quality management and ISO 9000 certified. Originality/value: This study contributes to the literature on the internal motivation of companies to achieve certification under the ISO 9000 standard, by performing a comparative analysis of questionnaires answered by a sample of certified companies and a control sample of companies which have not been certified. In particular, we assess how the manager’s perception on the intensity in which quality practices are deployed in their firms is associated to the likelihood of the firm being certified.Purpose: This empirical study analyzes a questionnaire answered by a sample of ISO 9000 certified companies and a control sample of companies
International Nuclear Information System (INIS)
Bobrowski, Sebastian; Chen, Hong; Döring, Maik; Jensen, Uwe; Schinköthe, Wolfgang
2015-01-01
In practice manufacturers may have lots of failure data of similar products using the same technology basis under different operating conditions. Thus, one can try to derive predictions for the distribution of the lifetime of newly developed components or new application environments through the existing data using regression models based on covariates. Three categories of such regression models are considered: a parametric, a semiparametric and a nonparametric approach. First, we assume that the lifetime is Weibull distributed, where its parameters are modelled as linear functions of the covariate. Second, the Cox proportional hazards model, well-known in Survival Analysis, is applied. Finally, a kernel estimator is used to interpolate between empirical distribution functions. In particular the last case is new in the context of reliability analysis. We propose a goodness of fit measure (GoF), which can be applied to all three types of regression models. Using this GoF measure we discuss a new model selection procedure. To illustrate this method of reliability prediction, the three classes of regression models are applied to real test data of motor experiments. Further the performance of the approaches is investigated by Monte Carlo simulations. - Highlights: • We estimate the lifetime distribution in the presence of a covariate. • Three types of regression models are considered and compared. • A new nonparametric estimator based on our particular data structure is introduced. • We propose a goodness of fit measure and show a new model selection procedure. • A case study with real data and Monte Carlo simulations are performed
Empirical research through design
Keyson, D.V.; Bruns, M.
2009-01-01
This paper describes the empirical research through design method (ERDM), which differs from current approaches to research through design by enforcing the need for the designer, after a series of pilot prototype based studies, to a-priori develop a number of testable interaction design hypothesis
Essays in empirical microeconomics
Péter, A.N.
2016-01-01
The empirical studies in this thesis investigate various factors that could affect individuals' labor market, family formation and educational outcomes. Chapter 2 focuses on scheduling as a potential determinant of individuals' productivity. Chapter 3 looks at the role of a family factor on
Worship, Reflection, Empirical Research
Ding Dong,
2012-01-01
In my youth, I was a worshipper of Mao Zedong. From the latter stage of the Mao Era to the early years of Reform and Opening, I began to reflect on Mao and the Communist Revolution he launched. In recent years I’ve devoted myself to empirical historical research on Mao, seeking the truth about Mao and China’s modern history.
DEFF Research Database (Denmark)
Bang, Peter Fibiger
2007-01-01
This articles seeks to establish a new set of organizing concepts for the analysis of the Roman imperial economy from Republic to late antiquity: tributary empire, port-folio capitalism and protection costs. Together these concepts explain better economic developments in the Roman world than the...
Empirically sampling Universal Dependencies
DEFF Research Database (Denmark)
Schluter, Natalie; Agic, Zeljko
2017-01-01
Universal Dependencies incur a high cost in computation for unbiased system development. We propose a 100% empirically chosen small subset of UD languages for efficient parsing system development. The technique used is based on measurements of model capacity globally. We show that the diversity o...
33 CFR 100.919 - International Bay City River Roar, Bay City, MI.
2010-07-01
... 33 Navigation and Navigable Waters 1 2010-07-01 2010-07-01 false International Bay City River Roar, Bay City, MI. 100.919 Section 100.919 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF... Bay City River Roar, Bay City, MI. (a) Regulated Area. A regulated area is established to include all...
77 FR 2972 - Thunder Bay Power Company, Thunder Bay Power, LLC, et al.
2012-01-20
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission Thunder Bay Power Company, Thunder Bay Power, LLC, et al.; Notice of Application for Transfer of Licenses, and Soliciting Comments and Motions To Intervene Thunder Bay Power Company Project No. 2404-095 Thunder Bay Power, LLC Midwest Hydro, Inc...
33 CFR 162.125 - Sturgeon Bay and the Sturgeon Bay Ship Canal, Wisc.
2010-07-01
... 33 Navigation and Navigable Waters 2 2010-07-01 2010-07-01 false Sturgeon Bay and the Sturgeon Bay Ship Canal, Wisc. 162.125 Section 162.125 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) PORTS AND WATERWAYS SAFETY INLAND WATERWAYS NAVIGATION REGULATIONS § 162.125 Sturgeon Bay and the Sturgeon Bay Ship...
2012-06-28
... 1625-AA00 Safety Zone; Alexandria Bay Chamber of Commerce, St. Lawrence River, Alexandria Bay, NY... restrict vessels from a portion of the St. Lawrence River during the Alexandria Bay Chamber of Commerce... of proposed rulemaking (NPRM) entitled Safety Zone; Alexandria Bay Chamber of Commerce, St. Lawrence...
Humboldt Bay, California Benthic Habitats 2009 Geodatabase
National Oceanic and Atmospheric Administration, Department of Commerce — Humboldt Bay is the largest estuary in California north of San Francisco Bay and represents a significant resource for the north coast region. Beginning in 2007 the...
Humboldt Bay Benthic Habitats 2009 Aquatic Setting
National Oceanic and Atmospheric Administration, Department of Commerce — Humboldt Bay is the largest estuary in California north of San Francisco Bay and represents a significant resource for the north coast region. Beginning in 2007 the...
San Francisco Bay Water Quality Improvement Fund
EPAs grant program to protect and restore San Francisco Bay. The San Francisco Bay Water Quality Improvement Fund (SFBWQIF) has invested in 58 projects along with 70 partners contributing to restore wetlands, water quality, and reduce polluted runoff.,
South Bay Salt Pond Mercury Studies Project
Information about the SFBWQP South Bay Salt Pond Mercury Studies Project, part of an EPA competitive grant program to improve SF Bay water quality focused on restoring impaired waters and enhancing aquatic resources.
Humboldt Bay, California Benthic Habitats 2009 Substrate
National Oceanic and Atmospheric Administration, Department of Commerce — Humboldt Bay is the largest estuary in California north of San Francisco Bay and represents a significant resource for the north coast region. Beginning in 2007 the...
Humboldt Bay, California Benthic Habitats 2009 Geoform
National Oceanic and Atmospheric Administration, Department of Commerce — Humboldt Bay is the largest estuary in California north of San Francisco Bay and represents a significant resource for the north coast region. Beginning in 2007 the...
Contaminant transport in Massachusetts Bay
Butman, Bradford
Construction of a new treatment plant and outfall to clean up Boston Harbor is currently one of the world's largest public works projects, costing about $4 billion. There is concern about the long-term impact of contaminants on Massachusetts Bay and adjacent Gulf of Maine because these areas are used extensively for transportation, recreation, fishing, and tourism, as well as waste disposal. Public concern also focuses on Stellwagen Bank, located on the eastern side of Massachusetts Bay, which is an important habitat for endangered whales. Contaminants reach Massachusetts Bay not only from Boston Harbor, but from other coastal communities on the Gulf of Maine, as well as from the atmosphere. Knowledge of the pathways, mechanisms, and rates at which pollutants are transported throughout these coastal environments is needed to address a wide range of management questions.
Bayes linear statistics, theory & methods
Goldstein, Michael
2007-01-01
Bayesian methods combine information available from data with any prior information available from expert knowledge. The Bayes linear approach follows this path, offering a quantitative structure for expressing beliefs, and systematic methods for adjusting these beliefs, given observational data. The methodology differs from the full Bayesian methodology in that it establishes simpler approaches to belief specification and analysis based around expectation judgements. Bayes Linear Statistics presents an authoritative account of this approach, explaining the foundations, theory, methodology, and practicalities of this important field. The text provides a thorough coverage of Bayes linear analysis, from the development of the basic language to the collection of algebraic results needed for efficient implementation, with detailed practical examples. The book covers:The importance of partial prior specifications for complex problems where it is difficult to supply a meaningful full prior probability specification...
International Nuclear Information System (INIS)
Davis, J.M.; Pollock, J.R.
1992-01-01
Almost every day, it seems, someone is mentioning Prudhoe Bay---its development activities, the direction of its oil production, and more recently its decline rate. Almost as frequently, someone is mentioning the number of companies abandoning exploration in Alaska. The state faces a double-edged dilemma: decline of its most important oil field and a diminished effort to find a replacement for the lost production. ARCO has seen the Prudhoe Bay decline coming for some time and has been planning for it. We have reduced staff, and ARCO and BP Exploration are finding cost-effective ways to work more closely together through such vehicles as shared services. At the same time, ARCO is continuing its high level of Alaskan exploration. This article will assess the future of Prudhoe Bay from a technical perspective, review ARCO's exploration plans for Alaska, and suggest what the state can do to encourage other companies to invest in this crucial producing region and exploratory frontier
International Nuclear Information System (INIS)
Honda, Teruyuki; Kimura, Ken-ichiro
2003-01-01
Fourteen major and trace elements in marine sediment core samples collected from the coasts along eastern Japan, i.e. Tokyo Bay (II) (the recess), Tokyo Bay (IV) (the mouth), Mutsu Bay and Funka Bay and the Northwest Pacific basin as a comparative subject were determined by the instrumental neutron activation analysis (INAA). The sedimentation rates and sedimentary ages were calculated for the coastal sediment cores by the 210 Pb method. The results obtained in this study are summarized as follows: (1) Lanthanoid abundance patterns suggested that the major origin of the sediments was terrigenous material. La*/Lu* and Ce*/La* ratios revealed that the sediments from Tokyo Bay (II) and Mutsu Bay more directly reflected the contribution from river than those of other regions. In addition, the Th/Sc ratio indicated that the coastal sediments mainly originated in the materials from the volcanic island-arcs, Japanese islands, whereas those from the Northwest Pacific mainly from the continent. (2) The correlation between the Ce/U and Th/U ratios with high correlation coefficients of 0.920 to 0.991 indicated that all the sediments from Tokyo Bay (II) and Funka Bay were in reducing conditions while at least the upper sediments from Tokyo Bay (IV) and Mutsu Bay were in oxidizing conditions. (3) It became quite obvious that the sedimentation mechanism and the sedimentation environment at Tokyo Bay (II) was different from those at Tokyo Bay (IV), since the sedimentation rate at Tokyo Bay (II) was approximately twice as large as that at Tokyo Bay (IV). The sedimentary age of the 5th layer (8∼10 cm in depth) from Funka Bay was calculated at approximately 1940∼50, which agreed with the time, 1943∼45 when Showa-shinzan was formed by the eruption of the Usu volcano. (author)
Mobile Bay turbidity plume study
Crozier, G. F.
1976-01-01
Laboratory and field transmissometer studies on the effect of suspended particulate material upon the appearance of water are reported. Quantitative correlations were developed between remotely sensed image density, optical sea truth data, and actual sediment load. Evaluation of satellite image sea truth data for an offshore plume projects contours of transmissivity for two different tidal phases. Data clearly demonstrate the speed of change and movement of the optical plume for water patterns associated with the mouth of Mobile bay in which relatively clear Gulf of Mexico water enters the bay on the eastern side. Data show that wind stress in excess of 15 knots has a marked impact in producing suspended sediment loads.
Automation in tube finishing bay
International Nuclear Information System (INIS)
Bhatnagar, Prateek; Satyadev, B.; Raghuraman, S.; Syama Sundara Rao, B.
1997-01-01
Automation concept in tube finishing bay, introduced after the final pass annealing of PHWR tubes resulted in integration of number of sub-systems in synchronisation with each other to produce final cut fuel tubes of specified length, tube finish etc. The tube finishing bay which was physically segregated into four distinct areas: 1. tube spreader and stacking area, 2. I.D. sand blasting area, 3. end conditioning, wad blowing, end capping and O.D. wet grinding area, 4. tube inspection, tube cutting and stacking area has been studied
Chesapeake Bay plume dynamics from LANDSAT
Munday, J. C., Jr.; Fedosh, M. S.
1981-01-01
LANDSAT images with enhancement and density slicing show that the Chesapeake Bay plume usually frequents the Virginia coast south of the Bay mouth. Southwestern (compared to northern) winds spread the plume easterly over a large area. Ebb tide images (compared to flood tide images) show a more dispersed plume. Flooding waters produce high turbidity levels over the shallow northern portion of the Bay mouth.
Default Bayes factors for ANOVA designs
Rouder, Jeffrey N.; Morey, Richard D.; Speckman, Paul L.; Province, Jordan M.
2012-01-01
Bayes factors have been advocated as superior to p-values for assessing statistical evidence in data. Despite the advantages of Bayes factors and the drawbacks of p-values, inference by p-values is still nearly ubiquitous. One impediment to the adoption of Bayes factors is a lack of practical
Curceac, S.; Ternynck, C.; Ouarda, T.
2015-12-01
Over the past decades, a substantial amount of research has been conducted to model and forecast climatic variables. In this study, Nonparametric Functional Data Analysis (NPFDA) methods are applied to forecast air temperature and wind speed time series in Abu Dhabi, UAE. The dataset consists of hourly measurements recorded for a period of 29 years, 1982-2010. The novelty of the Functional Data Analysis approach is in expressing the data as curves. In the present work, the focus is on daily forecasting and the functional observations (curves) express the daily measurements of the above mentioned variables. We apply a non-linear regression model with a functional non-parametric kernel estimator. The computation of the estimator is performed using an asymmetrical quadratic kernel function for local weighting based on the bandwidth obtained by a cross validation procedure. The proximities between functional objects are calculated by families of semi-metrics based on derivatives and Functional Principal Component Analysis (FPCA). Additionally, functional conditional mode and functional conditional median estimators are applied and the advantages of combining their results are analysed. A different approach employs a SARIMA model selected according to the minimum Akaike (AIC) and Bayessian (BIC) Information Criteria and based on the residuals of the model. The performance of the models is assessed by calculating error indices such as the root mean square error (RMSE), relative RMSE, BIAS and relative BIAS. The results indicate that the NPFDA models provide more accurate forecasts than the SARIMA models. Key words: Nonparametric functional data analysis, SARIMA, time series forecast, air temperature, wind speed
Martinez Manzanera, Octavio; Elting, Jan Willem; van der Hoeven, Johannes H.; Maurits, Natasha M.
2016-01-01
In the clinic, tremor is diagnosed during a time-limited process in which patients are observed and the characteristics of tremor are visually assessed. For some tremor disorders, a more detailed analysis of these characteristics is needed. Accelerometry and electromyography can be used to obtain a better insight into tremor. Typically, routine clinical assessment of accelerometry and electromyography data involves visual inspection by clinicians and occasionally computational analysis to obtain objective characteristics of tremor. However, for some tremor disorders these characteristics may be different during daily activity. This variability in presentation between the clinic and daily life makes a differential diagnosis more difficult. A long-term recording of tremor by accelerometry and/or electromyography in the home environment could help to give a better insight into the tremor disorder. However, an evaluation of such recordings using routine clinical standards would take too much time. We evaluated a range of techniques that automatically detect tremor segments in accelerometer data, as accelerometer data is more easily obtained in the home environment than electromyography data. Time can be saved if clinicians only have to evaluate the tremor characteristics of segments that have been automatically detected in longer daily activity recordings. We tested four non-parametric methods and five parametric methods on clinical accelerometer data from 14 patients with different tremor disorders. The consensus between two clinicians regarding the presence or absence of tremor on 3943 segments of accelerometer data was employed as reference. The nine methods were tested against this reference to identify their optimal parameters. Non-parametric methods generally performed better than parametric methods on our dataset when optimal parameters were used. However, one parametric method, employing the high frequency content of the tremor bandwidth under consideration
Wang, Yuanjia; Garcia, Tanya P; Ma, Yanyuan
2012-01-01
This work presents methods for estimating genotype-specific distributions from genetic epidemiology studies where the event times are subject to right censoring, the genotypes are not directly observed, and the data arise from a mixture of scientifically meaningful subpopulations. Examples of such studies include kin-cohort studies and quantitative trait locus (QTL) studies. Current methods for analyzing censored mixture data include two types of nonparametric maximum likelihood estimators (NPMLEs) which do not make parametric assumptions on the genotype-specific density functions. Although both NPMLEs are commonly used, we show that one is inefficient and the other inconsistent. To overcome these deficiencies, we propose three classes of consistent nonparametric estimators which do not assume parametric density models and are easy to implement. They are based on the inverse probability weighting (IPW), augmented IPW (AIPW), and nonparametric imputation (IMP). The AIPW achieves the efficiency bound without additional modeling assumptions. Extensive simulation experiments demonstrate satisfactory performance of these estimators even when the data are heavily censored. We apply these estimators to the Cooperative Huntington's Observational Research Trial (COHORT), and provide age-specific estimates of the effect of mutation in the Huntington gene on mortality using a sample of family members. The close approximation of the estimated non-carrier survival rates to that of the U.S. population indicates small ascertainment bias in the COHORT family sample. Our analyses underscore an elevated risk of death in Huntington gene mutation carriers compared to non-carriers for a wide age range, and suggest that the mutation equally affects survival rates in both genders. The estimated survival rates are useful in genetic counseling for providing guidelines on interpreting the risk of death associated with a positive genetic testing, and in facilitating future subjects at risk
International Nuclear Information System (INIS)
McIntee, Erin; Viglino, Emilie; Rinke, Caitlin; Kumor, Stephanie; Ni Liqiang; Sigman, Michael E.
2010-01-01
Laser-induced breakdown spectroscopy (LIBS) has been investigated for the discrimination of automobile paint samples. Paint samples from automobiles of different makes, models, and years were collected and separated into sets based on the color, presence or absence of effect pigments and the number of paint layers. Twelve LIBS spectra were obtained for each paint sample, each an average of a five single shot 'drill down' spectra from consecutive laser ablations in the same spot on the sample. Analyses by a nonparametric permutation test and a parametric Wald test were performed to determine the extent of discrimination within each set of paint samples. The discrimination power and Type I error were assessed for each data analysis method. Conversion of the spectral intensity to a log-scale (base 10) resulted in a higher overall discrimination power while observing the same significance level. Working on the log-scale, the nonparametric permutation tests gave an overall 89.83% discrimination power with a size of Type I error being 4.44% at the nominal significance level of 5%. White paint samples, as a group, were the most difficult to differentiate with the power being only 86.56% followed by 95.83% for black paint samples. Parametric analysis of the data set produced lower discrimination (85.17%) with 3.33% Type I errors, which is not recommended for both theoretical and practical considerations. The nonparametric testing method is applicable across many analytical comparisons, with the specific application described here being the pairwise comparison of automotive paint samples.
DEFF Research Database (Denmark)
Rasch, Astrid
of the collective, but insufficient attention has been paid to how individuals respond to such narrative changes. This dissertation examines the relationship between individual and collective memory at the end of empire through analysis of 13 end of empire autobiographies by public intellectuals from Australia......Decolonisation was a major event of the twentieth century, redrawing maps and impacting on identity narratives around the globe. As new nations defined their place in the world, the national and imperial past was retold in new cultural memories. These developments have been studied at the level......, the Anglophone Caribbean and Zimbabwe. I conceive of memory as reconstructive and social, with individual memory striving to make sense of the past in the present in dialogue with surrounding narratives. By examining recurring tropes in the autobiographies, like colonial education, journeys to the imperial...
International Nuclear Information System (INIS)
Guillemoles, A.; Lazareva, A.
2008-01-01
Gazprom is conquering the world. The Russian industrial giant owns the hugest gas reserves and enjoys the privilege of a considerable power. Gazprom edits journals, owns hospitals, airplanes and has even built cities where most of the habitants work for him. With 400000 workers, Gazprom represents 8% of Russia's GDP. This inquiry describes the history and operation of this empire and show how its has become a masterpiece of the government's strategy of russian influence reconquest at the world scale. Is it going to be a winning game? Are the corruption affairs and the expected depletion of resources going to weaken the empire? The authors shade light on the political and diplomatic strategies that are played around the crucial dossier of the energy supply. (J.S.)
Causal Diagrams for Empirical Research
Pearl, Judea
1994-01-01
The primary aim of this paper is to show how graphical models can be used as a mathematical language for integrating statistical and subject-matter information. In particular, the paper develops a principled, nonparametric framework for causal inference, in which diagrams are queried to determine if the assumptions available are sufficient for identifiying causal effects from non-experimental data. If so the diagrams can be queried to produce mathematical expressions for causal effects in ter...
DEFF Research Database (Denmark)
Christensen, Kim; Hounyo, Ulrich; Podolskij, Mark
In this paper, we propose a nonparametric way to test the hypothesis that time-variation in intraday volatility is caused solely by a deterministic and recurrent diurnal pattern. We assume that noisy high-frequency data from a discretely sampled jump-diffusion process are available. The test...... inference, we propose a new bootstrap approach, which leads to almost correctly sized tests of the null hypothesis. We apply the developed framework to a large cross-section of equity high-frequency data and find that the diurnal pattern accounts for a rather significant fraction of intraday variation...
Karpenko, S. S.; Zybin, E. Yu; Kosyanchuk, V. V.
2018-02-01
In this paper we design a nonparametric method for failures detection and localization in the aircraft control system that uses the measurements of the control signals and the aircraft states only. It doesn’t require a priori information of the aircraft model parameters, training or statistical calculations, and is based on algebraic solvability conditions for the aircraft model identification problem. This makes it possible to significantly increase the efficiency of detection and localization problem solution by completely eliminating errors, associated with aircraft model uncertainties.
Comparative Study of Parametric and Non-parametric Approaches in Fault Detection and Isolation
DEFF Research Database (Denmark)
Katebi, S.D.; Blanke, M.; Katebi, M.R.
This report describes a comparative study between two approaches to fault detection and isolation in dynamic systems. The first approach uses a parametric model of the system. The main components of such techniques are residual and signature generation for processing and analyzing. The second...... approach is non-parametric in the sense that the signature analysis is only dependent on the frequency or time domain information extracted directly from the input-output signals. Based on these approaches, two different fault monitoring schemes are developed where the feature extraction and fault decision...
Energy Technology Data Exchange (ETDEWEB)
Lopez Fontan, J.L.; Costa, J.; Ruso, J.M.; Prieto, G. [Dept. of Applied Physics, Univ. of Santiago de Compostela, Santiago de Compostela (Spain); Sarmiento, F. [Dept. of Mathematics, Faculty of Informatics, Univ. of A Coruna, A Coruna (Spain)
2004-02-01
The application of a statistical method, the local polynomial regression method, (LPRM), based on a nonparametric estimation of the regression function to determine the critical micelle concentration (cmc) is presented. The method is extremely flexible because it does not impose any parametric model on the subjacent structure of the data but rather allows the data to speak for themselves. Good concordance of cmc values with those obtained by other methods was found for systems in which the variation of a measured physical property with concentration showed an abrupt change. When this variation was slow, discrepancies between the values obtained by LPRM and others methods were found. (orig.)
International Nuclear Information System (INIS)
Peterson, James R.; Haas, Timothy C.; Lee, Danny C.
2000-01-01
Natural resource professionals are increasingly required to develop rigorous statistical models that relate environmental data to categorical responses data. Recent advances in the statistical and computing sciences have led to the development of sophisticated methods for parametric and nonparametric analysis of data with categorical responses. The statistical software package CATDAT was designed to make some of these relatively new and powerful techniques available to scientists. The CATDAT statistical package includes 4 analytical techniques: generalized logit modeling; binary classification tree; extended K-nearest neighbor classification; and modular neural network
Generalized Correlation Coefficient for Non-Parametric Analysis of Microarray Time-Course Data.
Tan, Qihua; Thomassen, Mads; Burton, Mark; Mose, Kristian Fredløv; Andersen, Klaus Ejner; Hjelmborg, Jacob; Kruse, Torben
2017-06-06
Modeling complex time-course patterns is a challenging issue in microarray study due to complex gene expression patterns in response to the time-course experiment. We introduce the generalized correlation coefficient and propose a combinatory approach for detecting, testing and clustering the heterogeneous time-course gene expression patterns. Application of the method identified nonlinear time-course patterns in high agreement with parametric analysis. We conclude that the non-parametric nature in the generalized correlation analysis could be an useful and efficient tool for analyzing microarray time-course data and for exploring the complex relationships in the omics data for studying their association with disease and health.
Non-parametric system identification from non-linear stochastic response
DEFF Research Database (Denmark)
Rüdinger, Finn; Krenk, Steen
2001-01-01
An estimation method is proposed for identification of non-linear stiffness and damping of single-degree-of-freedom systems under stationary white noise excitation. Non-parametric estimates of the stiffness and damping along with an estimate of the white noise intensity are obtained by suitable...... of the energy at mean-level crossings, which yields the damping relative to white noise intensity. Finally, an estimate of the noise intensity is extracted by estimating the absolute damping from the autocovariance functions of a set of modified phase plane variables at different energy levels. The method...
Classification using Hierarchical Naive Bayes models
DEFF Research Database (Denmark)
Langseth, Helge; Dyhre Nielsen, Thomas
2006-01-01
Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing set of classifiers is the Naïve Bayes models. However, an inherent problem with these classifiers is the assumption that all attributes used to describe......, termed Hierarchical Naïve Bayes models. Hierarchical Naïve Bayes models extend the modeling flexibility of Naïve Bayes models by introducing latent variables to relax some of the independence statements in these models. We propose a simple algorithm for learning Hierarchical Naïve Bayes models...
Vorobel, Vit; Daya Bay Collaboration
2017-07-01
The Daya Bay Reactor Neutrino Experiment was designed to measure θ 13, the smallest mixing angle in the three-neutrino mixing framework, with unprecedented precision. The experiment consists of eight functionally identical detectors placed underground at different baselines from three pairs of nuclear reactors in South China. Since Dec. 2011, the experiment has been running stably for more than 4 years, and has collected the largest reactor anti-neutrino sample to date. Daya Bay is able to greatly improve the precision on θ 13 and to make an independent measurement of the effective mass splitting in the electron antineutrino disappearance channel. Daya Bay can also perform a number of other precise measurements, such as a high-statistics determination of the absolute reactor antineutrino flux and spectrum, as well as a search for sterile neutrino mixing, among others. The most recent results from Daya Bay are discussed in this paper, as well as the current status and future prospects of the experiment.
Daya bay reactor neutrino experiment
International Nuclear Information System (INIS)
Cao Jun
2010-01-01
Daya Bay Reactor Neutrino Experiment is a large international collaboration experiment under construction. The experiment aims to precisely determine the neutrino mixing angle θ 13 by detecting the neutrinos produced by the Daya Bay Nuclear Power Plant. θ 13 is one of two unknown fundamental parameters in neutrino mixing. Its magnitude is a roadmap of the future neutrino physics, and very likely related to the puzzle of missing antimatter in our universe. The precise measurement has very important physics significance. The detectors of Daya Bay is under construction now. The full operation is expected in 2011. Three years' data taking will reach the designed the precision, to determine sin 2 2θ 13 to better than 0.01. Daya Bay neutrino detector is an underground large nuclear detector of low background, low energy, and high precision. In this paper, the layout of the experiment, the design and fabrication progress of the detectors, and some highlighted nuclear detecting techniques developed in the detector R and D are introduced. (author)
Jaber, Abobaker M; Ismail, Mohd Tahir; Altaher, Alsaidi M
2014-01-01
This paper mainly forecasts the daily closing price of stock markets. We propose a two-stage technique that combines the empirical mode decomposition (EMD) with nonparametric methods of local linear quantile (LLQ). We use the proposed technique, EMD-LLQ, to forecast two stock index time series. Detailed experiments are implemented for the proposed method, in which EMD-LPQ, EMD, and Holt-Winter methods are compared. The proposed EMD-LPQ model is determined to be superior to the EMD and Holt-Winter methods in predicting the stock closing prices.
An empirical study to determine the critical success factors of export industry
Directory of Open Access Journals (Sweden)
Masoud Babakhani
2011-01-01
Full Text Available Exporting goods and services play an important role on economy of developing countries. There are many countries in the world whose economy is solely based on exporting raw materials such as oil and gas. Many believe that countries cannot develop their economy as long as they rely on exporting one single group of raw materials. Therefore, there is a need to help other sectors of industries build good infrastructure for exporting diversified products. In this paper, we perform an empirical analysis to determine the critical success factors on exporting different goods. The results are analyzed using some statistical non-parametric methods and some useful guidelines are also suggested.
Measuring energy performance with sectoral heterogeneity: A non-parametric frontier approach
International Nuclear Information System (INIS)
Wang, H.; Ang, B.W.; Wang, Q.W.; Zhou, P.
2017-01-01
Evaluating economy-wide energy performance is an integral part of assessing the effectiveness of a country's energy efficiency policy. Non-parametric frontier approach has been widely used by researchers for such a purpose. This paper proposes an extended non-parametric frontier approach to studying economy-wide energy efficiency and productivity performances by accounting for sectoral heterogeneity. Relevant techniques in index number theory are incorporated to quantify the driving forces behind changes in the economy-wide energy productivity index. The proposed approach facilitates flexible modelling of different sectors' production processes, and helps to examine sectors' impact on the aggregate energy performance. A case study of China's economy-wide energy efficiency and productivity performances in its 11th five-year plan period (2006–2010) is presented. It is found that sectoral heterogeneities in terms of energy performance are significant in China. Meanwhile, China's economy-wide energy productivity increased slightly during the study period, mainly driven by the technical efficiency improvement. A number of other findings have also been reported. - Highlights: • We model economy-wide energy performance by considering sectoral heterogeneity. • The proposed approach can identify sectors' impact on the aggregate energy performance. • Obvious sectoral heterogeneities are identified in evaluating China's energy performance.
Nonparametric Identification of Glucose-Insulin Process in IDDM Patient with Multi-meal Disturbance
Bhattacharjee, A.; Sutradhar, A.
2012-12-01
Modern close loop control for blood glucose level in a diabetic patient necessarily uses an explicit model of the process. A fixed parameter full order or reduced order model does not characterize the inter-patient and intra-patient parameter variability. This paper deals with a frequency domain nonparametric identification of the nonlinear glucose-insulin process in an insulin dependent diabetes mellitus patient that captures the process dynamics in presence of uncertainties and parameter variations. An online frequency domain kernel estimation method has been proposed that uses the input-output data from the 19th order first principle model of the patient in intravenous route. Volterra equations up to second order kernels with extended input vector for a Hammerstein model are solved online by adaptive recursive least square (ARLS) algorithm. The frequency domain kernels are estimated using the harmonic excitation input data sequence from the virtual patient model. A short filter memory length of M = 2 was found sufficient to yield acceptable accuracy with lesser computation time. The nonparametric models are useful for closed loop control, where the frequency domain kernels can be directly used as the transfer function. The validation results show good fit both in frequency and time domain responses with nominal patient as well as with parameter variations.
kruX: matrix-based non-parametric eQTL discovery.
Qi, Jianlong; Asl, Hassan Foroughi; Björkegren, Johan; Michoel, Tom
2014-01-14
The Kruskal-Wallis test is a popular non-parametric statistical test for identifying expression quantitative trait loci (eQTLs) from genome-wide data due to its robustness against variations in the underlying genetic model and expression trait distribution, but testing billions of marker-trait combinations one-by-one can become computationally prohibitive. We developed kruX, an algorithm implemented in Matlab, Python and R that uses matrix multiplications to simultaneously calculate the Kruskal-Wallis test statistic for several millions of marker-trait combinations at once. KruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset. We used kruX and a dataset of more than 500k SNPs and 20k expression traits measured in 102 human blood samples to compare eQTLs detected by the Kruskal-Wallis test to eQTLs detected by the parametric ANOVA and linear model methods. We found that the Kruskal-Wallis test is more robust against data outliers and heterogeneous genotype group sizes and detects a higher proportion of non-linear associations, but is more conservative for calling additive linear associations. kruX enables the use of robust non-parametric methods for massive eQTL mapping without the need for a high-performance computing infrastructure and is freely available from http://krux.googlecode.com.
Chiu, Chun-Huo; Wang, Yi-Ting; Walther, Bruno A; Chao, Anne
2014-09-01
It is difficult to accurately estimate species richness if there are many almost undetectable species in a hyper-diverse community. Practically, an accurate lower bound for species richness is preferable to an inaccurate point estimator. The traditional nonparametric lower bound developed by Chao (1984, Scandinavian Journal of Statistics 11, 265-270) for individual-based abundance data uses only the information on the rarest species (the numbers of singletons and doubletons) to estimate the number of undetected species in samples. Applying a modified Good-Turing frequency formula, we derive an approximate formula for the first-order bias of this traditional lower bound. The approximate bias is estimated by using additional information (namely, the numbers of tripletons and quadrupletons). This approximate bias can be corrected, and an improved lower bound is thus obtained. The proposed lower bound is nonparametric in the sense that it is universally valid for any species abundance distribution. A similar type of improved lower bound can be derived for incidence data. We test our proposed lower bounds on simulated data sets generated from various species abundance models. Simulation results show that the proposed lower bounds always reduce bias over the traditional lower bounds and improve accuracy (as measured by mean squared error) when the heterogeneity of species abundances is relatively high. We also apply the proposed new lower bounds to real data for illustration and for comparisons with previously developed estimators. © 2014, The International Biometric Society.
Ryu, Duchwan
2010-09-28
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.
Parametric and nonparametric Granger causality testing: Linkages between international stock markets
De Gooijer, Jan G.; Sivarajasingham, Selliah
2008-04-01
This study investigates long-term linear and nonlinear causal linkages among eleven stock markets, six industrialized markets and five emerging markets of South-East Asia. We cover the period 1987-2006, taking into account the on-set of the Asian financial crisis of 1997. We first apply a test for the presence of general nonlinearity in vector time series. Substantial differences exist between the pre- and post-crisis period in terms of the total number of significant nonlinear relationships. We then examine both periods, using a new nonparametric test for Granger noncausality and the conventional parametric Granger noncausality test. One major finding is that the Asian stock markets have become more internationally integrated after the Asian financial crisis. An exception is the Sri Lankan market with almost no significant long-term linear and nonlinear causal linkages with other markets. To ensure that any causality is strictly nonlinear in nature, we also examine the nonlinear causal relationships of VAR filtered residuals and VAR filtered squared residuals for the post-crisis sample. We find quite a few remaining significant bi- and uni-directional causal nonlinear relationships in these series. Finally, after filtering the VAR-residuals with GARCH-BEKK models, we show that the nonparametric test statistics are substantially smaller in both magnitude and statistical significance than those before filtering. This indicates that nonlinear causality can, to a large extent, be explained by simple volatility effects.
MEASURING DARK MATTER PROFILES NON-PARAMETRICALLY IN DWARF SPHEROIDALS: AN APPLICATION TO DRACO
International Nuclear Information System (INIS)
Jardel, John R.; Gebhardt, Karl; Fabricius, Maximilian H.; Williams, Michael J.; Drory, Niv
2013-01-01
We introduce a novel implementation of orbit-based (or Schwarzschild) modeling that allows dark matter density profiles to be calculated non-parametrically in nearby galaxies. Our models require no assumptions to be made about velocity anisotropy or the dark matter profile. The technique can be applied to any dispersion-supported stellar system, and we demonstrate its use by studying the Local Group dwarf spheroidal galaxy (dSph) Draco. We use existing kinematic data at larger radii and also present 12 new radial velocities within the central 13 pc obtained with the VIRUS-W integral field spectrograph on the 2.7 m telescope at McDonald Observatory. Our non-parametric Schwarzschild models find strong evidence that the dark matter profile in Draco is cuspy for 20 ≤ r ≤ 700 pc. The profile for r ≥ 20 pc is well fit by a power law with slope α = –1.0 ± 0.2, consistent with predictions from cold dark matter simulations. Our models confirm that, despite its low baryon content relative to other dSphs, Draco lives in a massive halo.
Robust non-parametric one-sample tests for the analysis of recurrent events.
Rebora, Paola; Galimberti, Stefania; Valsecchi, Maria Grazia
2010-12-30
One-sample non-parametric tests are proposed here for inference on recurring events. The focus is on the marginal mean function of events and the basis for inference is the standardized distance between the observed and the expected number of events under a specified reference rate. Different weights are considered in order to account for various types of alternative hypotheses on the mean function of the recurrent events process. A robust version and a stratified version of the test are also proposed. The performance of these tests was investigated through simulation studies under various underlying event generation processes, such as homogeneous and nonhomogeneous Poisson processes, autoregressive and renewal processes, with and without frailty effects. The robust versions of the test have been shown to be suitable in a wide variety of event generating processes. The motivating context is a study on gene therapy in a very rare immunodeficiency in children, where a major end-point is the recurrence of severe infections. Robust non-parametric one-sample tests for recurrent events can be useful to assess efficacy and especially safety in non-randomized studies or in epidemiological studies for comparison with a standard population. Copyright © 2010 John Wiley & Sons, Ltd.
Non-parametric transformation for data correlation and integration: From theory to practice
Energy Technology Data Exchange (ETDEWEB)
Datta-Gupta, A.; Xue, Guoping; Lee, Sang Heon [Texas A& M Univ., College Station, TX (United States)
1997-08-01
The purpose of this paper is two-fold. First, we introduce the use of non-parametric transformations for correlating petrophysical data during reservoir characterization. Such transformations are completely data driven and do not require a priori functional relationship between response and predictor variables which is the case with traditional multiple regression. The transformations are very general, computationally efficient and can easily handle mixed data types for example, continuous variables such as porosity, permeability and categorical variables such as rock type, lithofacies. The power of the non-parametric transformation techniques for data correlation has been illustrated through synthetic and field examples. Second, we utilize these transformations to propose a two-stage approach for data integration during heterogeneity characterization. The principal advantages of our approach over traditional cokriging or cosimulation methods are: (1) it does not require a linear relationship between primary and secondary data, (2) it exploits the secondary information to its fullest potential by maximizing the correlation between the primary and secondary data, (3) it can be easily applied to cases where several types of secondary or soft data are involved, and (4) it significantly reduces variance function calculations and thus, greatly facilitates non-Gaussian cosimulation. We demonstrate the data integration procedure using synthetic and field examples. The field example involves estimation of pore-footage distribution using well data and multiple seismic attributes.
Directory of Open Access Journals (Sweden)
Urbi Garay
2016-03-01
Full Text Available We define a dynamic and self-adjusting mixture of Gaussian Graphical Models to cluster financial returns, and provide a new method for extraction of nonparametric estimates of dynamic alphas (excess return and betas (to a choice set of explanatory factors in a multivariate setting. This approach, as well as the outputs, has a dynamic, nonstationary and nonparametric form, which circumvents the problem of model risk and parametric assumptions that the Kalman filter and other widely used approaches rely on. The by-product of clusters, used for shrinkage and information borrowing, can be of use to determine relationships around specific events. This approach exhibits a smaller Root Mean Squared Error than traditionally used benchmarks in financial settings, which we illustrate through simulation. As an illustration, we use hedge fund index data, and find that our estimated alphas are, on average, 0.13% per month higher (1.6% per year than alphas estimated through Ordinary Least Squares. The approach exhibits fast adaptation to abrupt changes in the parameters, as seen in our estimated alphas and betas, which exhibit high volatility, especially in periods which can be identified as times of stressful market events, a reflection of the dynamic positioning of hedge fund portfolio managers.
Comparison of Parametric and Nonparametric Methods for Analyzing the Bias of a Numerical Model
Directory of Open Access Journals (Sweden)
Isaac Mugume
2016-01-01
Full Text Available Numerical models are presently applied in many fields for simulation and prediction, operation, or research. The output from these models normally has both systematic and random errors. The study compared January 2015 temperature data for Uganda as simulated using the Weather Research and Forecast model with actual observed station temperature data to analyze the bias using parametric (the root mean square error (RMSE, the mean absolute error (MAE, mean error (ME, skewness, and the bias easy estimate (BES and nonparametric (the sign test, STM methods. The RMSE normally overestimates the error compared to MAE. The RMSE and MAE are not sensitive to direction of bias. The ME gives both direction and magnitude of bias but can be distorted by extreme values while the BES is insensitive to extreme values. The STM is robust for giving the direction of bias; it is not sensitive to extreme values but it does not give the magnitude of bias. The graphical tools (such as time series and cumulative curves show the performance of the model with time. It is recommended to integrate parametric and nonparametric methods along with graphical methods for a comprehensive analysis of bias of a numerical model.
Park, Taeyoung; Jeong, Jong-Hyeon; Lee, Jae Won
2012-08-15
There is often an interest in estimating a residual life function as a summary measure of survival data. For ease in presentation of the potential therapeutic effect of a new drug, investigators may summarize survival data in terms of the remaining life years of patients. Under heavy right censoring, however, some reasonably high quantiles (e.g., median) of a residual lifetime distribution cannot be always estimated via a popular nonparametric approach on the basis of the Kaplan-Meier estimator. To overcome the difficulties in dealing with heavily censored survival data, this paper develops a Bayesian nonparametric approach that takes advantage of a fully model-based but highly flexible probabilistic framework. We use a Dirichlet process mixture of Weibull distributions to avoid strong parametric assumptions on the unknown failure time distribution, making it possible to estimate any quantile residual life function under heavy censoring. Posterior computation through Markov chain Monte Carlo is straightforward and efficient because of conjugacy properties and partial collapse. We illustrate the proposed methods by using both simulated data and heavily censored survival data from a recent breast cancer clinical trial conducted by the National Surgical Adjuvant Breast and Bowel Project. Copyright © 2012 John Wiley & Sons, Ltd.
A semi-nonparametric mixture model for selecting functionally consistent proteins.
Yu, Lianbo; Doerge, Rw
2010-09-28
High-throughput technologies have led to a new era of proteomics. Although protein microarray experiments are becoming more common place there are a variety of experimental and statistical issues that have yet to be addressed, and that will carry over to new high-throughput technologies unless they are investigated. One of the largest of these challenges is the selection of functionally consistent proteins. We present a novel semi-nonparametric mixture model for classifying proteins as consistent or inconsistent while controlling the false discovery rate and the false non-discovery rate. The performance of the proposed approach is compared to current methods via simulation under a variety of experimental conditions. We provide a statistical method for selecting functionally consistent proteins in the context of protein microarray experiments, but the proposed semi-nonparametric mixture model method can certainly be generalized to solve other mixture data problems. The main advantage of this approach is that it provides the posterior probability of consistency for each protein.
Transformation-invariant and nonparametric monotone smooth estimation of ROC curves.
Du, Pang; Tang, Liansheng
2009-01-30
When a new diagnostic test is developed, it is of interest to evaluate its accuracy in distinguishing diseased subjects from non-diseased subjects. The accuracy of the test is often evaluated by receiver operating characteristic (ROC) curves. Smooth ROC estimates are often preferable for continuous test results when the underlying ROC curves are in fact continuous. Nonparametric and parametric methods have been proposed by various authors to obtain smooth ROC curve estimates. However, there are certain drawbacks with the existing methods. Parametric methods need specific model assumptions. Nonparametric methods do not always satisfy the inherent properties of the ROC curves, such as monotonicity and transformation invariance. In this paper we propose a monotone spline approach to obtain smooth monotone ROC curves. Our method ensures important inherent properties of the underlying ROC curves, which include monotonicity, transformation invariance, and boundary constraints. We compare the finite sample performance of the newly proposed ROC method with other ROC smoothing methods in large-scale simulation studies. We illustrate our method through a real life example. Copyright (c) 2008 John Wiley & Sons, Ltd.
Akhmadaliev, S Z; Ambrosini, G; Amorim, A; Anderson, K; Andrieux, M L; Aubert, Bernard; Augé, E; Badaud, F; Baisin, L; Barreiro, F; Battistoni, G; Bazan, A; Bazizi, K; Belymam, A; Benchekroun, D; Berglund, S R; Berset, J C; Blanchot, G; Bogush, A A; Bohm, C; Boldea, V; Bonivento, W; Bosman, M; Bouhemaid, N; Breton, D; Brette, P; Bromberg, C; Budagov, Yu A; Burdin, S V; Calôba, L P; Camarena, F; Camin, D V; Canton, B; Caprini, M; Carvalho, J; Casado, M P; Castillo, M V; Cavalli, D; Cavalli-Sforza, M; Cavasinni, V; Chadelas, R; Chalifour, M; Chekhtman, A; Chevalley, J L; Chirikov-Zorin, I E; Chlachidze, G; Citterio, M; Cleland, W E; Clément, C; Cobal, M; Cogswell, F; Colas, Jacques; Collot, J; Cologna, S; Constantinescu, S; Costa, G; Costanzo, D; Crouau, M; Daudon, F; David, J; David, M; Davidek, T; Dawson, J; De, K; de La Taille, C; Del Peso, J; Del Prete, T; de Saintignon, P; Di Girolamo, B; Dinkespiler, B; Dita, S; Dodd, J; Dolejsi, J; Dolezal, Z; Downing, R; Dugne, J J; Dzahini, D; Efthymiopoulos, I; Errede, D; Errede, S; Evans, H; Eynard, G; Fassi, F; Fassnacht, P; Ferrari, A; Ferrer, A; Flaminio, Vincenzo; Fournier, D; Fumagalli, G; Gallas, E; Gaspar, M; Giakoumopoulou, V; Gianotti, F; Gildemeister, O; Giokaris, N; Glagolev, V; Glebov, V Yu; Gomes, A; González, V; González de la Hoz, S; Grabskii, V; Graugès-Pous, E; Grenier, P; Hakopian, H H; Haney, M; Hébrard, C; Henriques, A; Hervás, L; Higón, E; Holmgren, Sven Olof; Hostachy, J Y; Hoummada, A; Huston, J; Imbault, D; Ivanyushenkov, Yu M; Jézéquel, S; Johansson, E K; Jon-And, K; Jones, R; Juste, A; Kakurin, S; Karyukhin, A N; Khokhlov, Yu A; Khubua, J I; Klioukhine, V I; Kolachev, G M; Kopikov, S V; Kostrikov, M E; Kozlov, V; Krivkova, P; Kukhtin, V V; Kulagin, M; Kulchitskii, Yu A; Kuzmin, M V; Labarga, L; Laborie, G; Lacour, D; Laforge, B; Lami, S; Lapin, V; Le Dortz, O; Lefebvre, M; Le Flour, T; Leitner, R; Leltchouk, M; Li, J; Liablin, M V; Linossier, O; Lissauer, D; Lobkowicz, F; Lokajícek, M; Lomakin, Yu F; López-Amengual, J M; Lund-Jensen, B; Maio, A; Makowiecki, D S; Malyukov, S N; Mandelli, L; Mansoulié, B; Mapelli, Livio P; Marin, C P; Marrocchesi, P S; Marroquim, F; Martin, P; Maslennikov, A L; Massol, N; Mataix, L; Mazzanti, M; Mazzoni, E; Merritt, F S; Michel, B; Miller, R; Minashvili, I A; Miralles, L; Mnatzakanian, E A; Monnier, E; Montarou, G; Mornacchi, Giuseppe; Moynot, M; Muanza, G S; Nayman, P; Némécek, S; Nessi, Marzio; Nicoleau, S; Niculescu, M; Noppe, J M; Onofre, A; Pallin, D; Pantea, D; Paoletti, R; Park, I C; Parrour, G; Parsons, J; Pereira, A; Perini, L; Perlas, J A; Perrodo, P; Pilcher, J E; Pinhão, J; Plothow-Besch, Hartmute; Poggioli, Luc; Poirot, S; Price, L; Protopopov, Yu; Proudfoot, J; Puzo, P; Radeka, V; Rahm, David Charles; Reinmuth, G; Renzoni, G; Rescia, S; Resconi, S; Richards, R; Richer, J P; Roda, C; Rodier, S; Roldán, J; Romance, J B; Romanov, V; Romero, P; Rossel, F; Rusakovitch, N A; Sala, P; Sanchis, E; Sanders, H; Santoni, C; Santos, J; Sauvage, D; Sauvage, G; Sawyer, L; Says, L P; Schaffer, A C; Schwemling, P; Schwindling, J; Seguin-Moreau, N; Seidl, W; Seixas, J M; Selldén, B; Seman, M; Semenov, A; Serin, L; Shaldaev, E; Shochet, M J; Sidorov, V; Silva, J; Simaitis, V J; Simion, S; Sissakian, A N; Snopkov, R; Söderqvist, J; Solodkov, A A; Soloviev, A; Soloviev, I V; Sonderegger, P; Soustruznik, K; Spanó, F; Spiwoks, R; Stanek, R; Starchenko, E A; Stavina, P; Stephens, R; Suk, M; Surkov, A; Sykora, I; Takai, H; Tang, F; Tardell, S; Tartarelli, F; Tas, P; Teiger, J; Thaler, J; Thion, J; Tikhonov, Yu A; Tisserant, S; Tokar, S; Topilin, N D; Trka, Z; Turcotte, M; Valkár, S; Varanda, M J; Vartapetian, A H; Vazeille, F; Vichou, I; Vinogradov, V; Vorozhtsov, S B; Vuillemin, V; White, A; Wielers, M; Wingerter-Seez, I; Wolters, H; Yamdagni, N; Yosef, C; Zaitsev, A; Zitoun, R; Zolnierowski, Y
2002-01-01
This paper discusses hadron energy reconstruction for the ATLAS barrel prototype combined calorimeter (consisting of a lead-liquid argon electromagnetic part and an iron-scintillator hadronic part) in the framework of the nonparametrical method. The nonparametrical method utilizes only the known e/h ratios and the electron calibration constants and does not require the determination of any parameters by a minimization technique. Thus, this technique lends itself to an easy use in a first level trigger. The reconstructed mean values of the hadron energies are within +or-1% of the true values and the fractional energy resolution is [(58+or-3)%/ square root E+(2.5+or-0.3)%](+)(1.7+or-0.2)/E. The value of the e/h ratio obtained for the electromagnetic compartment of the combined calorimeter is 1.74+or-0.04 and agrees with the prediction that e/h >1.66 for this electromagnetic calorimeter. Results of a study of the longitudinal hadronic shower development are also presented. The data have been taken in the H8 beam...
Impulse response identification with deterministic inputs using non-parametric methods
International Nuclear Information System (INIS)
Bhargava, U.K.; Kashyap, R.L.; Goodman, D.M.
1985-01-01
This paper addresses the problem of impulse response identification using non-parametric methods. Although the techniques developed herein apply to the truncated, untruncated, and the circulant models, we focus on the truncated model which is useful in certain applications. Two methods of impulse response identification will be presented. The first is based on the minimization of the C/sub L/ Statistic, which is an estimate of the mean-square prediction error; the second is a Bayesian approach. For both of these methods, we consider the effects of using both the identity matrix and the Laplacian matrix as weights on the energy in the impulse response. In addition, we present a method for estimating the effective length of the impulse response. Estimating the length is particularly important in the truncated case. Finally, we develop a method for estimating the noise variance at the output. Often, prior information on the noise variance is not available, and a good estimate is crucial to the success of estimating the impulse response with a nonparametric technique
Macmillan, N A; Creelman, C D
1996-06-01
Can accuracy and response bias in two-stimulus, two-response recognition or detection experiments be measured nonparametrically? Pollack and Norman (1964) answered this question affirmatively for sensitivity, Hodos (1970) for bias: Both proposed measures based on triangular areas in receiver-operating characteristic space. Their papers, and especially a paper by Grier (1971) that provided computing formulas for the measures, continue to be heavily cited in a wide range of content areas. In our sample of articles, most authors described triangle-based measures as making fewer assumptions than measures associated with detection theory. However, we show that statistics based on products or ratios of right triangle areas, including a recently proposed bias index and a not-yetproposed but apparently plausible sensitivity index, are consistent with a decision process based on logistic distributions. Even the Pollack and Norman measure, which is based on non-right triangles, is approximately logistic for low values of sensitivity. Simple geometric models for sensitivity and bias are not nonparametric, even if their implications are not acknowledged in the defining publications.
Epistemology and Empirical Investigation
DEFF Research Database (Denmark)
Ahlström, Kristoffer
2008-01-01
Recently, Hilary Kornblith has argued that epistemological investigation is substantially empirical. In the present paper, I will ¿rst show that his claim is not contingent upon the further and, admittedly, controversial assumption that all objects of epistemological investigation are natural kinds....... Then, I will argue that, contrary to what Kornblith seems to assume, this methodological contention does not imply that there is no need for attending to our epistemic concepts in epistemology. Understanding the make-up of our concepts and, in particular, the purposes they ¿ll, is necessary...
Constructive Verification, Empirical Induction, and Falibilist Deduction: A Threefold Contrast
Directory of Open Access Journals (Sweden)
Julio Michael Stern
2011-10-01
Full Text Available This article explores some open questions related to the problem of verification of theories in the context of empirical sciences by contrasting three epistemological frameworks. Each of these epistemological frameworks is based on a corresponding central metaphor, namely: (a Neo-empiricism and the gambling metaphor; (b Popperian falsificationism and the scientific tribunal metaphor; (c Cognitive constructivism and the object as eigen-solution metaphor. Each of one of these epistemological frameworks has also historically co-evolved with a certain statistical theory and method for testing scientific hypotheses, respectively: (a Decision theoretic Bayesian statistics and Bayes factors; (b Frequentist statistics and p-values; (c Constructive Bayesian statistics and e-values. This article examines with special care the Zero Probability Paradox (ZPP, related to the verification of sharp or precise hypotheses. Finally, this article makes some remarks on Lakatos’ view of mathematics as a quasi-empirical science.
Empirical microeconomics action functionals
Baaquie, Belal E.; Du, Xin; Tanputraman, Winson
2015-06-01
A statistical generalization of microeconomics has been made in Baaquie (2013), where the market price of every traded commodity, at each instant of time, is considered to be an independent random variable. The dynamics of commodity market prices is modeled by an action functional-and the focus of this paper is to empirically determine the action functionals for different commodities. The correlation functions of the model are defined using a Feynman path integral. The model is calibrated using the unequal time correlation of the market commodity prices as well as their cubic and quartic moments using a perturbation expansion. The consistency of the perturbation expansion is verified by a numerical evaluation of the path integral. Nine commodities drawn from the energy, metal and grain sectors are studied and their market behavior is described by the model to an accuracy of over 90% using only six parameters. The paper empirically establishes the existence of the action functional for commodity prices that was postulated to exist in Baaquie (2013).
What 'empirical turn in bioethics'?
Hurst, Samia
2010-10-01
Uncertainty as to how we should articulate empirical data and normative reasoning seems to underlie most difficulties regarding the 'empirical turn' in bioethics. This article examines three different ways in which we could understand 'empirical turn'. Using real facts in normative reasoning is trivial and would not represent a 'turn'. Becoming an empirical discipline through a shift to the social and neurosciences would be a turn away from normative thinking, which we should not take. Conducting empirical research to inform normative reasoning is the usual meaning given to the term 'empirical turn'. In this sense, however, the turn is incomplete. Bioethics has imported methodological tools from empirical disciplines, but too often it has not imported the standards to which researchers in these disciplines are held. Integrating empirical and normative approaches also represents true added difficulties. Addressing these issues from the standpoint of debates on the fact-value distinction can cloud very real methodological concerns by displacing the debate to a level of abstraction where they need not be apparent. Ideally, empirical research in bioethics should meet standards for empirical and normative validity similar to those used in the source disciplines for these methods, and articulate these aspects clearly and appropriately. More modestly, criteria to ensure that none of these standards are completely left aside would improve the quality of empirical bioethics research and partly clear the air of critiques addressing its theoretical justification, when its rigour in the particularly difficult context of interdisciplinarity is what should be at stake.
2002-01-01
Rivers that empty into large bodies of water can have a significant impact on the thawing of nearshore winter ice. This true-color Moderate Resolution Imaging Spectroradiometer (MODIS) image from May 18, 2001, shows the Nelson River emptying spring runoff from the Manitoba province to the south into the southwestern corner of Canada's Hudson Bay. The warmer waters from more southern latitudes hasten melting of ice near the shore, though some still remained, perhaps because in shallow coastal waters, the ice could have been anchored to the bottom. High volumes of sediment in the runoff turned the inflow brown, and the rim of the retreating ice has taken on a dirty appearance even far to the east of the river's entrance into the Bay. The sediment would have further hastened the melting of the ice because its darker color would have absorbed more solar radiation than cleaner, whiter ice. Image courtesy Jacques Descloitres, MODIS Land Rapid Response Team at NASA GSFC
Human Power Empirically Explored
Energy Technology Data Exchange (ETDEWEB)
Jansen, A.J.
2011-01-18
Harvesting energy from the users' muscular power to convert this into electricity is a relatively unknown way to power consumer products. It nevertheless offers surprising opportunities for product designers; human-powered products function independently from regular power infrastructure, are convenient and can be environmentally and economically beneficial. This work provides insight into the knowledge required to design human-powered energy systems in consumer products from a scientific perspective. It shows the developments of human-powered products from the first introduction of the BayGen Freeplay radio in 1995 till current products and provides an overview and analysis of 211 human-powered products currently on the market. Although human power is generally perceived as beneficial for the environment, this thesis shows that achieving environmental benefit is only feasible when the environmental impact of additional materials in the energy conversion system is well balanced with the energy demands of the products functionality. User testing with existing products showed a preference for speeds in the range of 70 to 190 rpm for crank lengths from 32 to 95 mm. The muscular input power varied from 5 to 21 W. The analysis of twenty graduation projects from the Faculty of Industrial Design Engineering in the field of human-powered products, offers an interesting set of additional practice based design recommendations. The knowledge based approach of human power is very powerful to support the design of human-powered products. There is substantial potential for improvements in the domains energy conversion, ergonomics and environment. This makes that human power, when applied properly, is environmentally and economically competitive over a wider range of applications than thought previously.
Böhning, Dankmar; Karasek, Sarah; Terschüren, Claudia; Annuß, Rolf; Fehr, Rainer
2013-03-09
Life expectancy is of increasing prime interest for a variety of reasons. In many countries, life expectancy is growing linearly, without any indication of reaching a limit. The state of North Rhine-Westphalia (NRW) in Germany with its 54 districts is considered here where the above mentioned growth in life expectancy is occurring as well. However, there is also empirical evidence that life expectancy is not growing linearly at the same level for different regions. To explore this situation further a likelihood-based cluster analysis is suggested and performed. The modelling uses a nonparametric mixture approach for the latent random effect. Maximum likelihood estimates are determined by means of the EM algorithm and the number of components in the mixture model are found on the basis of the Bayesian Information Criterion. Regions are classified into the mixture components (clusters) using the maximum posterior allocation rule. For the data analyzed here, 7 components are found with a spatial concentration of lower life expectancy levels in a centre of NRW, formerly an enormous conglomerate of heavy industry, still the most densely populated area with Gelsenkirchen having the lowest level of life expectancy growth for both genders. The paper offers some explanations for this fact including demographic and socio-economic sources. This case study shows that life expectancy growth is widely linear, but it might occur on different levels.
Spill management strategy for the Chesapeake Bay
International Nuclear Information System (INIS)
Butler, H.L.; Chapman, R.S.; Johnson, B.H.
1990-01-01
The Chesapeake Bay Program is a unique cooperative effort between state and Federal agencies to restore the health and productivity of America's largest estuary. To assist in addressing specific management issues, a comprehensive three-dimensional, time-varying hydrodynamic and water quality model has ben developed. The Bay modeling strategy will serve as an excellent framework for including submodules to predict the movement, dispersion, and weathering of accidental spills, such as for petroleum products or other chemicals. This paper presents sample results from the Bay application to illustrate the success of the model system in simulating Bay processes. Also, a review of model requirements for successful spill modeling in Chesapeake Bay is presented. Recommendations are given for implementing appropriate spill modules with the Bay model framework and establishing a strategy for model use in addressing management issues
Transitioning a Chesapeake Bay Ecological Prediction System to Operations
Brown, C.; Green, D. S.; Eco Forecasters
2011-12-01
Ecological prediction of the impacts of physical, chemical, biological, and human-induced change on ecosystems and their components, encompass a wide range of space and time scales, and subject matter. They vary from predicting the occurrence and/or transport of certain species, such harmful algal blooms, or biogeochemical constituents, such as dissolved oxygen concentrations, to large-scale ecosystem responses and higher trophic levels. The timescales of ecological prediction, including guidance and forecasts, range from nowcasts and short-term forecasts (days), to intraseasonal and interannual outlooks (weeks to months), to decadal and century projections in climate change scenarios. The spatial scales range from small coastal inlets to basin and global scale biogeochemical and ecological forecasts. The types of models that have been used include conceptual, empirical, mechanistic, and hybrid approaches. This presentation will identify the challenges and progress toward transitioning experimental model-based ecological prediction into operational guidance and forecasting. Recent efforts are targeting integration of regional ocean, hydrodynamic and hydrological models and leveraging weather and water service infrastructure to enable the prototyping of an operational ecological forecast capability for the Chesapeake Bay and its tidal tributaries. A path finder demonstration predicts the probability of encountering sea nettles (Chrysaora quinquecirrha), a stinging jellyfish. These jellyfish can negatively impact safety and economic activities in the bay and an impact-based forecast that predicts where and when this biotic nuisance occurs may help management effects. The issuance of bay-wide nowcasts and three-day forecasts of sea nettle probability are generated daily by forcing an empirical habitat model (that predicts the probability of sea nettles) with real-time and 3-day forecasts of sea-surface temperature (SST) and salinity (SSS). In the first demonstration
EGG: Empirical Galaxy Generator
Schreiber, C.; Elbaz, D.; Pannella, M.; Merlin, E.; Castellano, M.; Fontana, A.; Bourne, N.; Boutsia, K.; Cullen, F.; Dunlop, J.; Ferguson, H. C.; MichaÅowski, M. J.; Okumura, K.; Santini, P.; Shu, X. W.; Wang, T.; White, C.
2018-04-01
The Empirical Galaxy Generator (EGG) generates fake galaxy catalogs and images with realistic positions, morphologies and fluxes from the far-ultraviolet to the far-infrared. The catalogs are generated by egg-gencat and stored in binary FITS tables (column oriented). Another program, egg-2skymaker, is used to convert the generated catalog into ASCII tables suitable for ingestion by SkyMaker (ascl:1010.066) to produce realistic high resolution images (e.g., Hubble-like), while egg-gennoise and egg-genmap can be used to generate the low resolution images (e.g., Herschel-like). These tools can be used to test source extraction codes, or to evaluate the reliability of any map-based science (stacking, dropout identification, etc.).
Directory of Open Access Journals (Sweden)
Yin Wang
2014-01-01
Full Text Available We present a nonparametric shape constrained algorithm for segmentation of coronary arteries in computed tomography images within the framework of active contours. An adaptive scale selection scheme, based on the global histogram information of the image data, is employed to determine the appropriate window size for each point on the active contour, which improves the performance of the active contour model in the low contrast local image regions. The possible leakage, which cannot be identified by using intensity features alone, is reduced through the application of the proposed shape constraint, where the shape of circular sampled intensity profile is used to evaluate the likelihood of current segmentation being considered vascular structures. Experiments on both synthetic and clinical datasets have demonstrated the efficiency and robustness of the proposed method. The results on clinical datasets have shown that the proposed approach is capable of extracting more detailed coronary vessels with subvoxel accuracy.
[Do we always correctly interpret the results of statistical nonparametric tests].
Moczko, Jerzy A
2014-01-01
Mann-Whitney, Wilcoxon, Kruskal-Wallis and Friedman tests create a group of commonly used tests to analyze the results of clinical and laboratory data. These tests are considered to be extremely flexible and their asymptotic relative efficiency exceeds 95 percent. Compared with the corresponding parametric tests they do not require checking the fulfillment of the conditions such as the normality of data distribution, homogeneity of variance, the lack of correlation means and standard deviations, etc. They can be used both in the interval and or-dinal scales. The article presents an example Mann-Whitney test, that does not in any case the choice of these four nonparametric tests treated as a kind of gold standard leads to correct inference.
Xu, Zhiqiang
2017-02-16
Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.
Nonparametric Inference of Doubly Stochastic Poisson Process Data via the Kernel Method.
Zhang, Tingting; Kou, S C
2010-01-01
Doubly stochastic Poisson processes, also known as the Cox processes, frequently occur in various scientific fields. In this article, motivated primarily by analyzing Cox process data in biophysics, we propose a nonparametric kernel-based inference method. We conduct a detailed study, including an asymptotic analysis, of the proposed method, and provide guidelines for its practical use, introducing a fast and stable regression method for bandwidth selection. We apply our method to real photon arrival data from recent single-molecule biophysical experiments, investigating proteins' conformational dynamics. Our result shows that conformational fluctuation is widely present in protein systems, and that the fluctuation covers a broad range of time scales, highlighting the dynamic and complex nature of proteins' structure.
A Nonparametric Operational Risk Modeling Approach Based on Cornish-Fisher Expansion
Directory of Open Access Journals (Sweden)
Xiaoqian Zhu
2014-01-01
Full Text Available It is generally accepted that the choice of severity distribution in loss distribution approach has a significant effect on the operational risk capital estimation. However, the usually used parametric approaches with predefined distribution assumption might be not able to fit the severity distribution accurately. The objective of this paper is to propose a nonparametric operational risk modeling approach based on Cornish-Fisher expansion. In this approach, the samples of severity are generated by Cornish-Fisher expansion and then used in the Monte Carlo simulation to sketch the annual operational loss distribution. In the experiment, the proposed approach is employed to calculate the operational risk capital charge for the overall Chinese banking. The experiment dataset is the most comprehensive operational risk dataset in China as far as we know. The results show that the proposed approach is able to use the information of high order moments and might be more effective and stable than the usually used parametric approach.
Xu, Zhiqiang; Cheng, James; Xiao, Xiaokui; Fujimaki, Ryohei; Muraoka, Yusuke
2017-01-01
Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.
Non-parametric Bayesian graph models reveal community structure in resting state fMRI
DEFF Research Database (Denmark)
Andersen, Kasper Winther; Madsen, Kristoffer H.; Siebner, Hartwig Roman
2014-01-01
Modeling of resting state functional magnetic resonance imaging (rs-fMRI) data using network models is of increasing interest. It is often desirable to group nodes into clusters to interpret the communication patterns between nodes. In this study we consider three different nonparametric Bayesian...... models for node clustering in complex networks. In particular, we test their ability to predict unseen data and their ability to reproduce clustering across datasets. The three generative models considered are the Infinite Relational Model (IRM), Bayesian Community Detection (BCD), and the Infinite...... between clusters. BCD restricts the between-cluster link probabilities to be strictly lower than within-cluster link probabilities to conform to the community structure typically seen in social networks. IDM only models a single between-cluster link probability, which can be interpreted as a background...
Parametric, nonparametric and parametric modelling of a chaotic circuit time series
Timmer, J.; Rust, H.; Horbelt, W.; Voss, H. U.
2000-09-01
The determination of a differential equation underlying a measured time series is a frequently arising task in nonlinear time series analysis. In the validation of a proposed model one often faces the dilemma that it is hard to decide whether possible discrepancies between the time series and model output are caused by an inappropriate model or by bad estimates of parameters in a correct type of model, or both. We propose a combination of parametric modelling based on Bock's multiple shooting algorithm and nonparametric modelling based on optimal transformations as a strategy to test proposed models and if rejected suggest and test new ones. We exemplify this strategy on an experimental time series from a chaotic circuit where we obtain an extremely accurate reconstruction of the observed attractor.
A non-parametric consistency test of the ΛCDM model with Planck CMB data
Energy Technology Data Exchange (ETDEWEB)
Aghamousa, Amir; Shafieloo, Arman [Korea Astronomy and Space Science Institute, Daejeon 305-348 (Korea, Republic of); Hamann, Jan, E-mail: amir@aghamousa.com, E-mail: jan.hamann@unsw.edu.au, E-mail: shafieloo@kasi.re.kr [School of Physics, The University of New South Wales, Sydney NSW 2052 (Australia)
2017-09-01
Non-parametric reconstruction methods, such as Gaussian process (GP) regression, provide a model-independent way of estimating an underlying function and its uncertainty from noisy data. We demonstrate how GP-reconstruction can be used as a consistency test between a given data set and a specific model by looking for structures in the residuals of the data with respect to the model's best-fit. Applying this formalism to the Planck temperature and polarisation power spectrum measurements, we test their global consistency with the predictions of the base ΛCDM model. Our results do not show any serious inconsistencies, lending further support to the interpretation of the base ΛCDM model as cosmology's gold standard.
Semi-nonparametric estimates of interfuel substitution in US energy demand
Energy Technology Data Exchange (ETDEWEB)
Serletis, A.; Shahmoradi, A. [University of Calgary, Calgary, AB (Canada). Dept. of Economics
2008-09-15
This paper focuses on the demand for crude oil, natural gas, and coal in the United States in the context of two globally flexible functional forms - the Fourier and the Asymptotically Ideal Model (AIM) - estimated subject to full regularity, using methods suggested over 20 years ago by Gallant and Golub (Gallant, A. Ronald and Golub, Gene H. Imposing Curvature Restrictions on Flexible Functional Forms. Journal of Econometrics 26 (1984), 295-321) and recently used by Serletis and Shahmoradi (Serletis, A., Shahmoradi, A., 2005. Semi-nonparametric estimates of the demand for money in the United States. Macroeconomic Dynamics 9, 542-559) in the monetary demand systems literature. We provide a comparison in terms of a full set of elasticities and also a policy perspective, using (for the first time) parameter estimates that are consistent with global regularity.
Semi-nonparametric estimates of interfuel substitution in U.S. energy demand
Energy Technology Data Exchange (ETDEWEB)
Serletis, Apostolos [Department of Economics, University of Calgary, Calgary, Alberta (Canada); Shahmoradi, Asghar [Faculty of Economics, University of Tehran, Tehran (Iran)
2008-09-15
This paper focuses on the demand for crude oil, natural gas, and coal in the United States in the context of two globally flexible functional forms - the Fourier and the Asymptotically Ideal Model (AIM) - estimated subject to full regularity, using methods suggested over 20 years ago by Gallant and Golub [Gallant, A. Ronald and Golub, Gene H. Imposing Curvature Restrictions on Flexible Functional Forms. Journal of Econometrics 26 (1984), 295-321] and recently used by Serletis and Shahmoradi [Serletis, A., Shahmoradi, A., 2005. Semi-nonparametric estimates of the demand for money in the United States. Macroeconomic Dynamics 9, 542-559] in the monetary demand systems literature. We provide a comparison in terms of a full set of elasticities and also a policy perspective, using (for the first time) parameter estimates that are consistent with global regularity. (author)
A Nonparametric, Multiple Imputation-Based Method for the Retrospective Integration of Data Sets
Carrig, Madeline M.; Manrique-Vallier, Daniel; Ranby, Krista W.; Reiter, Jerome P.; Hoyle, Rick H.
2015-01-01
Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches. PMID:26257437
International Nuclear Information System (INIS)
Morio, Jerome
2011-01-01
Importance sampling (IS) is a useful simulation technique to estimate critical probability with a better accuracy than Monte Carlo methods. It consists in generating random weighted samples from an auxiliary distribution rather than the distribution of interest. The crucial part of this algorithm is the choice of an efficient auxiliary PDF that has to be able to simulate more rare random events. The optimisation of this auxiliary distribution is often in practice very difficult. In this article, we propose to approach the IS optimal auxiliary density with non-parametric adaptive importance sampling (NAIS). We apply this technique for the probability estimation of spatial launcher impact position since it has currently become a more and more important issue in the field of aeronautics.
Faraway, Julian J
2005-01-01
Linear models are central to the practice of statistics and form the foundation of a vast range of statistical methodologies. Julian J. Faraway''s critically acclaimed Linear Models with R examined regression and analysis of variance, demonstrated the different methods available, and showed in which situations each one applies. Following in those footsteps, Extending the Linear Model with R surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author''s treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses. All of the ...
Bayesian nonparametric modeling for comparison of single-neuron firing intensities.
Kottas, Athanasios; Behseta, Sam
2010-03-01
We propose a fully inferential model-based approach to the problem of comparing the firing patterns of a neuron recorded under two distinct experimental conditions. The methodology is based on nonhomogeneous Poisson process models for the firing times of each condition with flexible nonparametric mixture prior models for the corresponding intensity functions. We demonstrate posterior inferences from a global analysis, which may be used to compare the two conditions over the entire experimental time window, as well as from a pointwise analysis at selected time points to detect local deviations of firing patterns from one condition to another. We apply our method on two neurons recorded from the primary motor cortex area of a monkey's brain while performing a sequence of reaching tasks.
A BAYESIAN NONPARAMETRIC MIXTURE MODEL FOR SELECTING GENES AND GENE SUBNETWORKS.
Zhao, Yize; Kang, Jian; Yu, Tianwei
2014-06-01
It is very challenging to select informative features from tens of thousands of measured features in high-throughput data analysis. Recently, several parametric/regression models have been developed utilizing the gene network information to select genes or pathways strongly associated with a clinical/biological outcome. Alternatively, in this paper, we propose a nonparametric Bayesian model for gene selection incorporating network information. In addition to identifying genes that have a strong association with a clinical outcome, our model can select genes with particular expressional behavior, in which case the regression models are not directly applicable. We show that our proposed model is equivalent to an infinity mixture model for which we develop a posterior computation algorithm based on Markov chain Monte Carlo (MCMC) methods. We also propose two fast computing algorithms that approximate the posterior simulation with good accuracy but relatively low computational cost. We illustrate our methods on simulation studies and the analysis of Spellman yeast cell cycle microarray data.
Genomic outlier profile analysis: mixture models, null hypotheses, and nonparametric estimation.
Ghosh, Debashis; Chinnaiyan, Arul M
2009-01-01
In most analyses of large-scale genomic data sets, differential expression analysis is typically assessed by testing for differences in the mean of the distributions between 2 groups. A recent finding by Tomlins and others (2005) is of a different type of pattern of differential expression in which a fraction of samples in one group have overexpression relative to samples in the other group. In this work, we describe a general mixture model framework for the assessment of this type of expression, called outlier profile analysis. We start by considering the single-gene situation and establishing results on identifiability. We propose 2 nonparametric estimation procedures that have natural links to familiar multiple testing procedures. We then develop multivariate extensions of this methodology to handle genome-wide measurements. The proposed methodologies are compared using simulation studies as well as data from a prostate cancer gene expression study.
Assessing T cell clonal size distribution: a non-parametric approach.
Directory of Open Access Journals (Sweden)
Olesya V Bolkhovskaya
Full Text Available Clonal structure of the human peripheral T-cell repertoire is shaped by a number of homeostatic mechanisms, including antigen presentation, cytokine and cell regulation. Its accurate tuning leads to a remarkable ability to combat pathogens in all their variety, while systemic failures may lead to severe consequences like autoimmune diseases. Here we develop and make use of a non-parametric statistical approach to assess T cell clonal size distributions from recent next generation sequencing data. For 41 healthy individuals and a patient with ankylosing spondylitis, who undergone treatment, we invariably find power law scaling over several decades and for the first time calculate quantitatively meaningful values of decay exponent. It has proved to be much the same among healthy donors, significantly different for an autoimmune patient before the therapy, and converging towards a typical value afterwards. We discuss implications of the findings for theoretical understanding and mathematical modeling of adaptive immunity.
Assessing T cell clonal size distribution: a non-parametric approach.
Bolkhovskaya, Olesya V; Zorin, Daniil Yu; Ivanchenko, Mikhail V
2014-01-01
Clonal structure of the human peripheral T-cell repertoire is shaped by a number of homeostatic mechanisms, including antigen presentation, cytokine and cell regulation. Its accurate tuning leads to a remarkable ability to combat pathogens in all their variety, while systemic failures may lead to severe consequences like autoimmune diseases. Here we develop and make use of a non-parametric statistical approach to assess T cell clonal size distributions from recent next generation sequencing data. For 41 healthy individuals and a patient with ankylosing spondylitis, who undergone treatment, we invariably find power law scaling over several decades and for the first time calculate quantitatively meaningful values of decay exponent. It has proved to be much the same among healthy donors, significantly different for an autoimmune patient before the therapy, and converging towards a typical value afterwards. We discuss implications of the findings for theoretical understanding and mathematical modeling of adaptive immunity.
Nonparametric estimation of age-specific reference percentile curves with radial smoothing.
Wan, Xiaohai; Qu, Yongming; Huang, Yao; Zhang, Xiao; Song, Hanping; Jiang, Honghua
2012-01-01
Reference percentile curves represent the covariate-dependent distribution of a quantitative measurement and are often used to summarize and monitor dynamic processes such as human growth. We propose a new nonparametric method based on a radial smoothing (RS) technique to estimate age-specific reference percentile curves assuming the underlying distribution is relatively close to normal. We compared the RS method with both the LMS and the generalized additive models for location, scale and shape (GAMLSS) methods using simulated data and found that our method has smaller estimation error than the two existing methods. We also applied the new method to analyze height growth data from children being followed in a clinical observational study of growth hormone treatment, and compared the growth curves between those with growth disorders and the general population. Copyright © 2011 Elsevier Inc. All rights reserved.
Rock, N. M. S.; Duffy, T. R.
REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits, Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x)—including: (1) major axis ( λ = 1); (2) reduced major axis ( λ = variance of y/variance of x); (3) Y on Xλ = infinity; or (4) X on Y ( λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
Nonparametric Methods in Astronomy: Think, Regress, Observe—Pick Any Three
Steinhardt, Charles L.; Jermyn, Adam S.
2018-02-01
Telescopes are much more expensive than astronomers, so it is essential to minimize required sample sizes by using the most data-efficient statistical methods possible. However, the most commonly used model-independent techniques for finding the relationship between two variables in astronomy are flawed. In the worst case they can lead without warning to subtly yet catastrophically wrong results, and even in the best case they require more data than necessary. Unfortunately, there is no single best technique for nonparametric regression. Instead, we provide a guide for how astronomers can choose the best method for their specific problem and provide a python library with both wrappers for the most useful existing algorithms and implementations of two new algorithms developed here.
Energy Technology Data Exchange (ETDEWEB)
Takamizawa, Hisashi, E-mail: takamizawa.hisashi@jaea.go.jp; Itoh, Hiroto, E-mail: ito.hiroto@jaea.go.jp; Nishiyama, Yutaka, E-mail: nishiyama.yutaka93@jaea.go.jp
2016-10-15
In order to understand neutron irradiation embrittlement in high fluence regions, statistical analysis using the Bayesian nonparametric (BNP) method was performed for the Japanese surveillance and material test reactor irradiation database. The BNP method is essentially expressed as an infinite summation of normal distributions, with input data being subdivided into clusters with identical statistical parameters, such as mean and standard deviation, for each cluster to estimate shifts in ductile-to-brittle transition temperature (DBTT). The clusters typically depend on chemical compositions, irradiation conditions, and the irradiation embrittlement. Specific variables contributing to the irradiation embrittlement include the content of Cu, Ni, P, Si, and Mn in the pressure vessel steels, neutron flux, neutron fluence, and irradiation temperatures. It was found that the measured shifts of DBTT correlated well with the calculated ones. Data associated with the same materials were subdivided into the same clusters even if neutron fluences were increased.
Using nonparametrics to specify a model to measure the value of travel time
DEFF Research Database (Denmark)
Fosgerau, Mogens
2007-01-01
Using a range of nonparametric methods, the paper examines the specification of a model to evaluate the willingness-to-pay (WTP) for travel time changes from binomial choice data from a simple time-cost trading experiment. The analysis favours a model with random WTP as the only source...... of randomness over a model with fixed WTP which is linear in time and cost and has an additive random error term. Results further indicate that the distribution of log WTP can be described as a sum of a linear index fixing the location of the log WTP distribution and an independent random variable representing...... unobserved heterogeneity. This formulation is useful for parametric modelling. The index indicates that the WTP varies systematically with income and other individual characteristics. The WTP varies also with the time difference presented in the experiment which is in contradiction of standard utility theory....
The application of non-parametric statistical method for an ALARA implementation
International Nuclear Information System (INIS)
Cho, Young Ho; Herr, Young Hoi
2003-01-01
The cost-effective reduction of Occupational Radiation Dose (ORD) at a nuclear power plant could not be achieved without going through an extensive analysis of accumulated ORD data of existing plants. Through the data analysis, it is required to identify what are the jobs of repetitive high ORD at the nuclear power plant. In this study, Percentile Rank Sum Method (PRSM) is proposed to identify repetitive high ORD jobs, which is based on non-parametric statistical theory. As a case study, the method is applied to ORD data of maintenance and repair jobs at Kori units 3 and 4 that are pressurized water reactors with 950 MWe capacity and have been operated since 1986 and 1987, respectively in Korea. The results was verified and validated, and PRSM has been demonstrated to be an efficient method of analyzing the data
Rogers, Jeffrey N.; Parrish, Christopher E.; Ward, Larry G.; Burdick, David M.
2018-03-01
Salt marsh vegetation tends to increase vertical uncertainty in light detection and ranging (lidar) derived elevation data, often causing the data to become ineffective for analysis of topographic features governing tidal inundation or vegetation zonation. Previous attempts at improving lidar data collected in salt marsh environments range from simply computing and subtracting the global elevation bias to more complex methods such as computing vegetation-specific, constant correction factors. The vegetation specific corrections can be used along with an existing habitat map to apply separate corrections to different areas within a study site. It is hypothesized here that correcting salt marsh lidar data by applying location-specific, point-by-point corrections, which are computed from lidar waveform-derived features, tidal-datum based elevation, distance from shoreline and other lidar digital elevation model based variables, using nonparametric regression will produce better results. The methods were developed and tested using full-waveform lidar and ground truth for three marshes in Cape Cod, Massachusetts, U.S.A. Five different model algorithms for nonparametric regression were evaluated, with TreeNet's stochastic gradient boosting algorithm consistently producing better regression and classification results. Additionally, models were constructed to predict the vegetative zone (high marsh and low marsh). The predictive modeling methods used in this study estimated ground elevation with a mean bias of 0.00 m and a standard deviation of 0.07 m (0.07 m root mean square error). These methods appear very promising for correction of salt marsh lidar data and, importantly, do not require an existing habitat map, biomass measurements, or image based remote sensing data such as multi/hyperspectral imagery.
Bayesian Nonparametric Mixture Estimation for Time-Indexed Functional Data in R
Directory of Open Access Journals (Sweden)
Terrance D. Savitsky
2016-08-01
Full Text Available We present growfunctions for R that offers Bayesian nonparametric estimation models for analysis of dependent, noisy time series data indexed by a collection of domains. This data structure arises from combining periodically published government survey statistics, such as are reported in the Current Population Study (CPS. The CPS publishes monthly, by-state estimates of employment levels, where each state expresses a noisy time series. Published state-level estimates from the CPS are composed from household survey responses in a model-free manner and express high levels of volatility due to insufficient sample sizes. Existing software solutions borrow information over a modeled time-based dependence to extract a de-noised time series for each domain. These solutions, however, ignore the dependence among the domains that may be additionally leveraged to improve estimation efficiency. The growfunctions package offers two fully nonparametric mixture models that simultaneously estimate both a time and domain-indexed dependence structure for a collection of time series: (1 A Gaussian process (GP construction, which is parameterized through the covariance matrix, estimates a latent function for each domain. The covariance parameters of the latent functions are indexed by domain under a Dirichlet process prior that permits estimation of the dependence among functions across the domains: (2 An intrinsic Gaussian Markov random field prior construction provides an alternative to the GP that expresses different computation and estimation properties. In addition to performing denoised estimation of latent functions from published domain estimates, growfunctions allows estimation of collections of functions for observation units (e.g., households, rather than aggregated domains, by accounting for an informative sampling design under which the probabilities for inclusion of observation units are related to the response variable. growfunctions includes plot
Performance of non-parametric algorithms for spatial mapping of tropical forest structure
Directory of Open Access Journals (Sweden)
Liang Xu
2016-08-01
Full Text Available Abstract Background Mapping tropical forest structure is a critical requirement for accurate estimation of emissions and removals from land use activities. With the availability of a wide range of remote sensing imagery of vegetation characteristics from space, development of finer resolution and more accurate maps has advanced in recent years. However, the mapping accuracy relies heavily on the quality of input layers, the algorithm chosen, and the size and quality of inventory samples for calibration and validation. Results By using airborne lidar data as the “truth” and focusing on the mean canopy height (MCH as a key structural parameter, we test two commonly-used non-parametric techniques of maximum entropy (ME and random forest (RF for developing maps over a study site in Central Gabon. Results of mapping show that both approaches have improved accuracy with more input layers in mapping canopy height at 100 m (1-ha pixels. The bias-corrected spatial models further improve estimates for small and large trees across the tails of height distributions with a trade-off in increasing overall mean squared error that can be readily compensated by increasing the sample size. Conclusions A significant improvement in tropical forest mapping can be achieved by weighting the number of inventory samples against the choice of image layers and the non-parametric algorithms. Without future satellite observations with better sensitivity to forest biomass, the maps based on existing data will remain slightly biased towards the mean of the distribution and under and over estimating the upper and lower tails of the distribution.
Palacios, Julia A; Minin, Vladimir N
2013-03-01
Changes in population size influence genetic diversity of the population and, as a result, leave a signature of these changes in individual genomes in the population. We are interested in the inverse problem of reconstructing past population dynamics from genomic data. We start with a standard framework based on the coalescent, a stochastic process that generates genealogies connecting randomly sampled individuals from the population of interest. These genealogies serve as a glue between the population demographic history and genomic sequences. It turns out that only the times of genealogical lineage coalescences contain information about population size dynamics. Viewing these coalescent times as a point process, estimating population size trajectories is equivalent to estimating a conditional intensity of this point process. Therefore, our inverse problem is similar to estimating an inhomogeneous Poisson process intensity function. We demonstrate how recent advances in Gaussian process-based nonparametric inference for Poisson processes can be extended to Bayesian nonparametric estimation of population size dynamics under the coalescent. We compare our Gaussian process (GP) approach to one of the state-of-the-art Gaussian Markov random field (GMRF) methods for estimating population trajectories. Using simulated data, we demonstrate that our method has better accuracy and precision. Next, we analyze two genealogies reconstructed from real sequences of hepatitis C and human Influenza A viruses. In both cases, we recover more believed aspects of the viral demographic histories than the GMRF approach. We also find that our GP method produces more reasonable uncertainty estimates than the GMRF method. Copyright © 2013, The International Biometric Society.
Nonparametric test of consistency between cosmological models and multiband CMB measurements
Energy Technology Data Exchange (ETDEWEB)
Aghamousa, Amir [Asia Pacific Center for Theoretical Physics, Pohang, Gyeongbuk 790-784 (Korea, Republic of); Shafieloo, Arman, E-mail: amir@apctp.org, E-mail: shafieloo@kasi.re.kr [Korea Astronomy and Space Science Institute, Daejeon 305-348 (Korea, Republic of)
2015-06-01
We present a novel approach to test the consistency of the cosmological models with multiband CMB data using a nonparametric approach. In our analysis we calibrate the REACT (Risk Estimation and Adaptation after Coordinate Transformation) confidence levels associated with distances in function space (confidence distances) based on the Monte Carlo simulations in order to test the consistency of an assumed cosmological model with observation. To show the applicability of our algorithm, we confront Planck 2013 temperature data with concordance model of cosmology considering two different Planck spectra combination. In order to have an accurate quantitative statistical measure to compare between the data and the theoretical expectations, we calibrate REACT confidence distances and perform a bias control using many realizations of the data. Our results in this work using Planck 2013 temperature data put the best fit ΛCDM model at 95% (∼ 2σ) confidence distance from the center of the nonparametric confidence set while repeating the analysis excluding the Planck 217 × 217 GHz spectrum data, the best fit ΛCDM model shifts to 70% (∼ 1σ) confidence distance. The most prominent features in the data deviating from the best fit ΛCDM model seems to be at low multipoles 18 < ℓ < 26 at greater than 2σ, ℓ ∼ 750 at ∼1 to 2σ and ℓ ∼ 1800 at greater than 2σ level. Excluding the 217×217 GHz spectrum the feature at ℓ ∼ 1800 becomes substantially less significance at ∼1 to 2σ confidence level. Results of our analysis based on the new approach we propose in this work are in agreement with other analysis done using alternative methods.
Nonparametric identification of nonlinear dynamic systems using a synchronisation-based method
Kenderi, Gábor; Fidlin, Alexander
2014-12-01
The present study proposes an identification method for highly nonlinear mechanical systems that does not require a priori knowledge of the underlying nonlinearities to reconstruct arbitrary restoring force surfaces between degrees of freedom. This approach is based on the master-slave synchronisation between a dynamic model of the system as the slave and the real system as the master using measurements of the latter. As the model synchronises to the measurements, it becomes an observer of the real system. The optimal observer algorithm in a least-squares sense is given by the Kalman filter. Using the well-known state augmentation technique, the Kalman filter can be turned into a dual state and parameter estimator to identify parameters of a priori characterised nonlinearities. The paper proposes an extension of this technique towards nonparametric identification. A general system model is introduced by describing the restoring forces as bilateral spring-dampers with time-variant coefficients, which are estimated as augmented states. The estimation procedure is followed by an a posteriori statistical analysis to reconstruct noise-free restoring force characteristics using the estimated states and their estimated variances. Observability is provided using only one measured mechanical quantity per degree of freedom, which makes this approach less demanding in the number of necessary measurement signals compared with truly nonparametric solutions, which typically require displacement, velocity and acceleration signals. Additionally, due to the statistical rigour of the procedure, it successfully addresses signals corrupted by significant measurement noise. In the present paper, the method is described in detail, which is followed by numerical examples of one degree of freedom (1DoF) and 2DoF mechanical systems with strong nonlinearities of vibro-impact type to demonstrate the effectiveness of the proposed technique.
Non-parametric PSF estimation from celestial transit solar images using blind deconvolution
Directory of Open Access Journals (Sweden)
González Adriana
2016-01-01
Full Text Available Context: Characterization of instrumental effects in astronomical imaging is important in order to extract accurate physical information from the observations. The measured image in a real optical instrument is usually represented by the convolution of an ideal image with a Point Spread Function (PSF. Additionally, the image acquisition process is also contaminated by other sources of noise (read-out, photon-counting. The problem of estimating both the PSF and a denoised image is called blind deconvolution and is ill-posed. Aims: We propose a blind deconvolution scheme that relies on image regularization. Contrarily to most methods presented in the literature, our method does not assume a parametric model of the PSF and can thus be applied to any telescope. Methods: Our scheme uses a wavelet analysis prior model on the image and weak assumptions on the PSF. We use observations from a celestial transit, where the occulting body can be assumed to be a black disk. These constraints allow us to retain meaningful solutions for the filter and the image, eliminating trivial, translated, and interchanged solutions. Under an additive Gaussian noise assumption, they also enforce noise canceling and avoid reconstruction artifacts by promoting the whiteness of the residual between the blurred observations and the cleaned data. Results: Our method is applied to synthetic and experimental data. The PSF is estimated for the SECCHI/EUVI instrument using the 2007 Lunar transit, and for SDO/AIA using the 2012 Venus transit. Results show that the proposed non-parametric blind deconvolution method is able to estimate the core of the PSF with a similar quality to parametric methods proposed in the literature. We also show that, if these parametric estimations are incorporated in the acquisition model, the resulting PSF outperforms both the parametric and non-parametric methods.
Carroll, Raymond J.
2011-03-01
In many applications we can expect that, or are interested to know if, a density function or a regression curve satisfies some specific shape constraints. For example, when the explanatory variable, X, represents the value taken by a treatment or dosage, the conditional mean of the response, Y , is often anticipated to be a monotone function of X. Indeed, if this regression mean is not monotone (in the appropriate direction) then the medical or commercial value of the treatment is likely to be significantly curtailed, at least for values of X that lie beyond the point at which monotonicity fails. In the case of a density, common shape constraints include log-concavity and unimodality. If we can correctly guess the shape of a curve, then nonparametric estimators can be improved by taking this information into account. Addressing such problems requires a method for testing the hypothesis that the curve of interest satisfies a shape constraint, and, if the conclusion of the test is positive, a technique for estimating the curve subject to the constraint. Nonparametric methodology for solving these problems already exists, but only in cases where the covariates are observed precisely. However in many problems, data can only be observed with measurement errors, and the methods employed in the error-free case typically do not carry over to this error context. In this paper we develop a novel approach to hypothesis testing and function estimation under shape constraints, which is valid in the context of measurement errors. Our method is based on tilting an estimator of the density or the regression mean until it satisfies the shape constraint, and we take as our test statistic the distance through which it is tilted. Bootstrap methods are used to calibrate the test. The constrained curve estimators that we develop are also based on tilting, and in that context our work has points of contact with methodology in the error-free case.
Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Xu, Richard Yi Da; Luo, Xiangfeng
2018-05-01
Sparse nonnegative matrix factorization (SNMF) aims to factorize a data matrix into two optimized nonnegative sparse factor matrices, which could benefit many tasks, such as document-word co-clustering. However, the traditional SNMF typically assumes the number of latent factors (i.e., dimensionality of the factor matrices) to be fixed. This assumption makes it inflexible in practice. In this paper, we propose a doubly sparse nonparametric NMF framework to mitigate this issue by using dependent Indian buffet processes (dIBP). We apply a correlation function for the generation of two stick weights associated with each column pair of factor matrices while still maintaining their respective marginal distribution specified by IBP. As a consequence, the generation of two factor matrices will be columnwise correlated. Under this framework, two classes of correlation function are proposed: 1) using bivariate Beta distribution and 2) using Copula function. Compared with the single IBP-based NMF, this paper jointly makes two factor matrices nonparametric and sparse, which could be applied to broader scenarios, such as co-clustering. This paper is seen to be much more flexible than Gaussian process-based and hierarchial Beta process-based dIBPs in terms of allowing the two corresponding binary matrix columns to have greater variations in their nonzero entries. Our experiments on synthetic data show the merits of this paper compared with the state-of-the-art models in respect of factorization efficiency, sparsity, and flexibility. Experiments on real-world data sets demonstrate the efficiency of this paper in document-word co-clustering tasks.
75 FR 8297 - Tongass National Forest, Thorne Bay Ranger District, Thorne Bay, AK
2010-02-24
..., Thorne Bay, AK AGENCY: Forest Service, USDA. ACTION: Cancellation of Notice of intent to prepare an... Roberts, Zone Planner, Thorne Bay Ranger District, Tongass National Forest, P.O. Box 19001, Thorne Bay, AK 99919, telephone: 907-828-3250. SUPPLEMENTARY INFORMATION: The 47,007-acre Kosciusko Project Area is...
77 FR 44140 - Drawbridge Operation Regulation; Sturgeon Bay Ship Canal, Sturgeon Bay, WI
2012-07-27
... Maple-Oregon Bridges so vehicular traffic congestion would not develop on downtown Sturgeon Bay streets... movement of vehicular traffic in Sturgeon Bay. The Sturgeon Bay Ship Canal is approximately 8.6 miles long... significant increase in vehicular and vessel traffic during the peak tourist and navigation season between...
The onset of deglaciation of Cumberland Bay and Stromness Bay, South Georgia
Van Der Putten, N.; Verbruggen, C.
Carbon dating of basal peat deposits in Cumberland Bay and Stromness Bay and sediments from a lake in Stromness Bay, South Georgia indicates deglaciation at the very beginning of the Holocene before c. 9500 14C yr BP. This post-dates the deglaciation of one local lake which has been ice-free since
78 FR 46813 - Safety Zone; Evening on the Bay Fireworks; Sturgeon Bay, WI
2013-08-02
...-AA00 Safety Zone; Evening on the Bay Fireworks; Sturgeon Bay, WI AGENCY: Coast Guard, DHS. ACTION.... This temporary safety zone will restrict vessels from a portion of Sturgeon Bay due to a fireworks... hazards associated with the fireworks display. DATES: This rule is effective from 8 p.m. until 10 p.m. on...
Empirical validation of directed functional connectivity.
Mill, Ravi D; Bagic, Anto; Bostan, Andreea; Schneider, Walter; Cole, Michael W
2017-02-01
Mapping directions of influence in the human brain connectome represents the next phase in understanding its functional architecture. However, a host of methodological uncertainties have impeded the application of directed connectivity methods, which have primarily been validated via "ground truth" connectivity patterns embedded in simulated functional MRI (fMRI) and magneto-/electro-encephalography (MEG/EEG) datasets. Such simulations rely on many generative assumptions, and we hence utilized a different strategy involving empirical data in which a ground truth directed connectivity pattern could be anticipated with confidence. Specifically, we exploited the established "sensory reactivation" effect in episodic memory, in which retrieval of sensory information reactivates regions involved in perceiving that sensory modality. Subjects performed a paired associate task in separate fMRI and MEG sessions, in which a ground truth reversal in directed connectivity between auditory and visual sensory regions was instantiated across task conditions. This directed connectivity reversal was successfully recovered across different algorithms, including Granger causality and Bayes network (IMAGES) approaches, and across fMRI ("raw" and deconvolved) and source-modeled MEG. These results extend simulation studies of directed connectivity, and offer practical guidelines for the use of such methods in clarifying causal mechanisms of neural processing. Copyright © 2016 Elsevier Inc. All rights reserved.
Eugene N. Anderson
2016-01-01
The Mongol Empire, the largest contiguous empire the world has ever known, had, among other things, a goodly number of falconers, poultry raisers, birdcatchers, cooks, and other experts on various aspects of birding. We have records of this, largely in the Yinshan Zhengyao, the court nutrition manual of the Mongol empire in China (the Yuan Dynasty). It discusses in some detail 22 bird taxa, from swans to chickens. The Huihui Yaofang, a medical encyclopedia, lists ten taxa used medicinally. Ma...
Le Bihan, Nicolas; Margerin, Ludovic
2009-07-01
In this paper, we present a nonparametric method to estimate the heterogeneity of a random medium from the angular distribution of intensity of waves transmitted through a slab of random material. Our approach is based on the modeling of forward multiple scattering using compound Poisson processes on compact Lie groups. The estimation technique is validated through numerical simulations based on radiative transfer theory.
Lange, C; Lyon, H; DeMeo, D; Raby, B; Silverman, EK; Weiss, ST
2003-01-01
We introduce a new powerful nonparametric testing strategy for family-based association studies in which multiple quantitative traits are recorded and the phenotype with the strongest genetic component is not known prior to the analysis. In the first stage, using a population-based test based on the
Jiang, GJ
1998-01-01
This paper develops a nonparametric model of interest rate term structure dynamics based an a spot rate process that permits only positive interest rates and a market price of interest rate risk that precludes arbitrage opportunities. Both the spot rate process and the market price of interest rate
Sueiro, Manuel J.; Abad, Francisco J.
2011-01-01
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Kator, H. I.; Zubkoff, P. L.
1981-01-01
Seasonal baseline data on bacterial biomass and heterotrophic uptake in the Chesapeake Bay plume and contiguous Atlantic Ocean shelf waters are discussed. Viable count bacterial numbers in surface water samples collected during June 1980 ranged from a maximum of 190,000 MPN (most probable number)/ml at the Bay mouth to a minimum of 7900 MPN/ml offshore. Similarly, direct count densities ranged from 1,800,000 BU (bacterial units)/ml to 24,000 BU/ml. Heterotrophic potential (V max) was largest at the Bay mouth and lowest offshore. Biomass and V max values usually decreased with depth although subsurface maxima were occasionally observed at inshore stations. Correlation of biomass and heterotrophic potential data with selected hydrographic variables was determind with a nonparametric statistic. Results indicate viable counts are positively and significantly correlated with total chlorophyll, temperature, direct count and V max during June 1980; significant negative correlations are obtained with salinity and depth. Calculations of bacterial standing crop are discussed.
Empirical techniques in finance
Bhar, Ramaprasad
2005-01-01
This book offers the opportunity to study and experience advanced empi- cal techniques in finance and in general financial economics. It is not only suitable for students with an interest in the field, it is also highly rec- mended for academic researchers as well as the researchers in the industry. The book focuses on the contemporary empirical techniques used in the analysis of financial markets and how these are implemented using actual market data. With an emphasis on Implementation, this book helps foc- ing on strategies for rigorously combing finance theory and modeling technology to extend extant considerations in the literature. The main aim of this book is to equip the readers with an array of tools and techniques that will allow them to explore financial market problems with a fresh perspective. In this sense it is not another volume in eco- metrics. Of course, the traditional econometric methods are still valid and important; the contents of this book will bring in other related modeling topics tha...
Gibbs, James F.; Borcherdt, Roger D.
1974-01-01
Measurements of ground motion generated by nuclear explosions in Nevada have been completed for 99 locations in the San Francisco Bay region, California. The seismograms, Fourier amplitude spectra, spectral amplification curves for the signal, and the Fourier amplitude spectra of the seismic noise are presented for 60 locations. Analog amplifications, based on the maximum signal amplitude, are computed for an additional 39 locations. The recordings of the nuclear explosions show marked amplitude variations which are consistently related to the local geologic conditions of the recording site. The average spectral amplifications observed for vertical and horizontal ground motions are, respectively: (1, 1) for granite, (1.5, 1.6) for the Franciscan Formation, (2.3, 2.3), for other pre-Tertiary and Tertiary rocks, (3.0, 2.7) for the Santa Clara Formation, (3.3, 4.4) for older bay sediments, and (3.7, 11.3) for younger bay mud. Spectral amplification curves define predominant ground frequencies for younger bay mud sites and for some older bay sediment sites. The predominant frequencies for most sites were not clearly defined by the amplitude spectra computed from the seismic background noise. The intensities ascribed to various sites in the San Francisco Bay region for the California earthquake of April 18, 1906, are strongly dependent on distance from the zone of surface faulting and the geological character of the ground. Considering only those sites (approximately one square city block in size) for which there is good evidence for the degree of ascribed intensity, the intensities for 917 sites on Franciscan rocks generally decrease with the logarithm of distance as Intensity = 2.69 - 1.90 log (Distance Km). For sites on other geologic units, intensity increments, derived from this empirical rela.tion, correlate strongly with the Average Horizontal Spectral Amplifications (MISA) according to the empirical relation Intensity Increment= 0.27 + 2.70 log(AHSA). Average
Unique thermal record in False Bay
CSIR Research Space (South Africa)
Grundlingh, ML
1993-10-01
Full Text Available Over the past decade False Bay has assumed a prime position in terms of research in to large South African bays. This is manifested by investigations that cover flow conditions modelling, thermal structure, management, biology and nutrients, geology...
Hierarchical mixtures of naive Bayes classifiers
Wiering, M.A.
2002-01-01
Naive Bayes classifiers tend to perform very well on a large number of problem domains, although their representation power is quite limited compared to more sophisticated machine learning algorithms. In this pa- per we study combining multiple naive Bayes classifiers by using the hierar- chical
Safety culture development at Daya Bay NPP
International Nuclear Information System (INIS)
Zhang Shanming
2001-01-01
From view on Organization Behavior theory, the concept, development and affecting factors of safety culture are introduced. The focuses are on the establishment, development and management practice for safety culture at Daya Bay NPP. A strong safety culture, also demonstrated, has contributed greatly to improving performance at Daya Bay
The Holocene History of Placentia Bay, Newfoundland
DEFF Research Database (Denmark)
Sheldon, Christina; Seidenkrantz, Marit-Solveig; Reynisson, Njall
2013-01-01
Marine sediments analyzed from cores taken in Placentia Bay, Newfoundland, located in the Labrador Sea, captured oceanographic and climatic changes from the end of the Younger Dryas through the Holocene. Placentia Bay is an ideal site to capture changes in both the south-flowing Labrador Current ...
Towards a sustainable future in Hudson Bay
International Nuclear Information System (INIS)
Okrainetz, G.
1991-01-01
To date, ca $40-50 billion has been invested in or committed to hydroelectric development on the rivers feeding Hudson Bay. In addition, billions more have been invested in land uses such as forestry and mining within the Hudson Bay drainage basin. However, there has never been a study of the possible impacts on Hudson Bay resulting from this activity. Neither has there been any federal environmental assessment on any of the economic developments that affect Hudson Bay. To fill this gap in knowledge, the Hudson Bay Program was established. The program will not conduct scientific field research but will rather scan the published literature and consult with leading experts in an effort to identify biophysical factors that are likely to be significantly affected by the cumulative influence of hydroelectric and other developments within and outside the region. An annotated bibliography on Hudson Bay has been completed and used to prepare a science overview paper, which will be circulated for comment, revised, and used as the basis for a workshop on cumulative effects in Hudson Bay. Papers will then be commissioned for a second workshop to be held in fall 1993. A unique feature of the program is its integration of traditional ecological knowledge among the Inuit and Cree communities around Hudson Bay with the scientific approach to cumulative impact assessment. One goal of the program is to help these communities bring forward their knowledge in such a way that it can be integrated into the cumulative effects assessment
Final Empirical Test Case Specification
DEFF Research Database (Denmark)
Kalyanova, Olena; Heiselberg, Per
This document includes the empirical specification on the IEA task of evaluation building energy simulation computer programs for the Double Skin Facades (DSF) constructions. There are two approaches involved into this procedure, one is the comparative approach and another is the empirical one....
Directory of Open Access Journals (Sweden)
Robert Aldrich
2010-03-01
Full Text Available This paper argues that the colonial legacy is ever present in contemporary Europe. For a generation, most Europeans largely tried, publicly, to forget the colonial past, or remembered it only through the rose-coloured lenses of nostalgia; now the pendulum has swung to memory of that past – even perhaps, in the views of some, to a surfeit of memory, where each group agitates for its own version of history, its own recognition in laws and ceremonies, its own commemoration in museums and monuments, the valorization or repatriation of its own art and artefacts. Word such as ‘invasion,’ ‘racism’ and ‘genocide’ are emotional terms that provoke emotional reactions. Whether leaders should apologize for wrongs of the past – and which wrongs – remains a highly sensitive issue. The ‘return of the colonial’ thus has to do with ethics and politics as well as with history, and can link to statements of apology or recognition, legislation about certain views of history, monetary compensation, repatriation of objects, and—perhaps most importantly—redefinition of national identity and policy. The colonial flags may have been lowered, but many barricades seem to have been raised. Private memories—of loss of land, of unacknowledged service, of political, economic, social and cultural disenfranchisement, but also on the other side of defeat, national castigation and self-flagellation—have been increasingly public. Monuments and museums act not only as sites of history but as venues for political agitation and forums for academic debate – differences of opinion that have spread to the streets. Empire has a long after-life.
Empirical Support for Perceptual Conceptualism
Directory of Open Access Journals (Sweden)
Nicolás Alejandro Serrano
2018-03-01
Full Text Available The main objective of this paper is to show that perceptual conceptualism can be understood as an empirically meaningful position and, furthermore, that there is some degree of empirical support for its main theses. In order to do this, I will start by offering an empirical reading of the conceptualist position, and making three predictions from it. Then, I will consider recent experimental results from cognitive sciences that seem to point towards those predictions. I will conclude that, while the evidence offered by those experiments is far from decisive, it is enough not only to show that conceptualism is an empirically meaningful position but also that there is empirical support for it.
Empire as a Geopolitical Figure
DEFF Research Database (Denmark)
Parker, Noel
2010-01-01
This article analyses the ingredients of empire as a pattern of order with geopolitical effects. Noting the imperial form's proclivity for expansion from a critical reading of historical sociology, the article argues that the principal manifestation of earlier geopolitics lay not in the nation...... but in empire. That in turn has been driven by a view of the world as disorderly and open to the ordering will of empires (emanating, at the time of geopolitics' inception, from Europe). One implication is that empires are likely to figure in the geopolitical ordering of the globe at all times, in particular...... after all that has happened in the late twentieth century to undermine nationalism and the national state. Empire is indeed a probable, even for some an attractive form of regime for extending order over the disorder produced by globalisation. Geopolitics articulated in imperial expansion is likely...
Prediction of maximum earthquake intensities for the San Francisco Bay region
Borcherdt, Roger D.; Gibbs, James F.
1975-01-01
The intensity data for the California earthquake of April 18, 1906, are strongly dependent on distance from the zone of surface faulting and the geological character of the ground. Considering only those sites (approximately one square city block in size) for which there is good evidence for the degree of ascribed intensity, the empirical relation derived between 1906 intensities and distance perpendicular to the fault for 917 sites underlain by rocks of the Franciscan Formation is: Intensity = 2.69 - 1.90 log (Distance) (km). For sites on other geologic units intensity increments, derived with respect to this empirical relation, correlate strongly with the Average Horizontal Spectral Amplifications (AHSA) determined from 99 three-component recordings of ground motion generated by nuclear explosions in Nevada. The resulting empirical relation is: Intensity Increment = 0.27 +2.70 log (AHSA), and average intensity increments for the various geologic units are -0.29 for granite, 0.19 for Franciscan Formation, 0.64 for the Great Valley Sequence, 0.82 for Santa Clara Formation, 1.34 for alluvium, 2.43 for bay mud. The maximum intensity map predicted from these empirical relations delineates areas in the San Francisco Bay region of potentially high intensity from future earthquakes on either the San Andreas fault or the Hazard fault.
Prediction of maximum earthquake intensities for the San Francisco Bay region
Energy Technology Data Exchange (ETDEWEB)
Borcherdt, R.D.; Gibbs, J.F.
1975-01-01
The intensity data for the California earthquake of Apr 18, 1906, are strongly dependent on distance from the zone of surface faulting and the geological character of the ground. Considering only those sites (approximately one square city block in size) for which there is good evidence for the degree of ascribed intensity, the empirical relation derived between 1906 intensities and distance perpendicular to the fault for 917 sites underlain by rocks of the Franciscan formation is intensity = 2.69 - 1.90 log (distance) (km). For sites on other geologic units, intensity increments, derived with respect to this empirical relation, correlate strongly with the average horizontal spectral amplifications (AHSA) determined from 99 three-component recordings of ground motion generated by nuclear explosions in Nevada. The resulting empirical relation is intensity increment = 0.27 + 2.70 log (AHSA), and average intensity increments for the various geologic units are -0.29 for granite, 0.19 for Franciscan formation, 0.64 for the Great Valley sequence, 0.82 for Santa Clara formation, 1.34 for alluvium, and 2.43 for bay mud. The maximum intensity map predicted from these empirical relations delineates areas in the San Francisco Bay region of potentially high intensity from future earthquakes on either the San Andreas fault or the Hayward fault.
Xu, Maoqi; Chen, Liang
2018-01-01
The individual sample heterogeneity is one of the biggest obstacles in biomarker identification for complex diseases such as cancers. Current statistical models to identify differentially expressed genes between disease and control groups often overlook the substantial human sample heterogeneity. Meanwhile, traditional nonparametric tests lose detailed data information and sacrifice the analysis power, although they are distribution free and robust to heterogeneity. Here, we propose an empirical likelihood ratio test with a mean-variance relationship constraint (ELTSeq) for the differential expression analysis of RNA sequencing (RNA-seq). As a distribution-free nonparametric model, ELTSeq handles individual heterogeneity by estimating an empirical probability for each observation without making any assumption about read-count distribution. It also incorporates a constraint for the read-count overdispersion, which is widely observed in RNA-seq data. ELTSeq demonstrates a significant improvement over existing methods such as edgeR, DESeq, t-tests, Wilcoxon tests and the classic empirical likelihood-ratio test when handling heterogeneous groups. It will significantly advance the transcriptomics studies of cancers and other complex disease. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Bird surveys at McKinley Bay and Hutchison Bay, Northwest Territories, in 1991
Energy Technology Data Exchange (ETDEWEB)
Cornish, B J; Dickson, D L; Dickson, H L
1992-03-01
McKinley Bay is a shallow protected bay along the eastern Beaufort Sea coast which provides an important habitat for diving ducks. Since 1979, the bay has been the site of a winter harbor and support base for oil and gas exploraton in the Beaufort Sea. Aerial surveys for bird abundance and distribution were conducted in August 1991 as a continuation of long-term monitoring of birds in McKinley Bay and Hutchison Bay, a nearby area used as a control. The main objectives of the 1991 surveys were to expand the set of baseline data on natural annual fluctuations in diving duck numbers, and to determine if numbers of diving ducks had changed since the initial 1981-85 surveys. On the day with the best survey conditions, the population of diving ducks at McKinley bay was estimated at ca 32,000, significantly more than 1981-85. At Hutchison Bay, there were an estimated 11,000 ducks. As in previous years, large numbers of diving ducks were observed off Atkinson Point at the northwest corner of McKinley Bay, at the south end of the bay, and in the northeast corner near a long spit. Most divers in Hutchison Bay were at the west side. Diving ducks, primarily Oldsquaw and scoter, were the most abundant bird group in the study area. Observed distribution patterns of birds are discussed with reference to habitat preferences. 16 refs., 7 figs., 30 tabs.
Kliem, Sören; Weusthoff, Sarah; Hahlweg, Kurt; Baucom, Katherine J W; Baucom, Brian R
2015-12-01
Identifying risk factors for divorce or separation is an important step in the prevention of negative individual outcomes and societal costs associated with relationship dissolution. Programs that aim to prevent relationship distress and dissolution typically focus on changing processes that occur during couple conflict, although the predictive ability of conflict-specific variables has not been examined in the context of other factors related to relationship dissolution. The authors examine whether emotional responding and communication during couple conflict predict relationship dissolution after controlling for overall relationship quality and individual well-being. Using nonparametric conditional survival trees, the study at hand simultaneously examined the predictive abilities of physiological (systolic and diastolic blood pressure, heart rate, cortisol) and behavioral (fundamental frequency; f0) indices of emotional responding, as well as observationally coded positive and negative communication behavior, on long-term relationship stability after controlling for relationship satisfaction and symptoms of depression. One hundred thirty-six spouses were assessed after participating in a randomized clinical trial of a relationship distress prevention program as well as 11 years thereafter; 32.5% of the couples' relationships had dissolved by follow up. For men, the only significant predictor of relationship dissolution was cortisol change score (p = .012). For women, only f0 range was a significant predictor of relationship dissolution (p = .034). These findings highlight the importance of emotional responding during couple conflict for long-term relationship stability. (c) 2015 APA, all rights reserved).
Yap, John Stephen; Fan, Jianqing; Wu, Rongling
2009-12-01
Estimation of the covariance structure of longitudinal processes is a fundamental prerequisite for the practical deployment of functional mapping designed to study the genetic regulation and network of quantitative variation in dynamic complex traits. We present a nonparametric approach for estimating the covariance structure of a quantitative trait measured repeatedly at a series of time points. Specifically, we adopt Huang et al.'s (2006, Biometrika 93, 85-98) approach of invoking the modified Cholesky decomposition and converting the problem into modeling a sequence of regressions of responses. A regularized covariance estimator is obtained using a normal penalized likelihood with an L(2) penalty. This approach, embedded within a mixture likelihood framework, leads to enhanced accuracy, precision, and flexibility of functional mapping while preserving its biological relevance. Simulation studies are performed to reveal the statistical properties and advantages of the proposed method. A real example from a mouse genome project is analyzed to illustrate the utilization of the methodology. The new method will provide a useful tool for genome-wide scanning for the existence and distribution of quantitative trait loci underlying a dynamic trait important to agriculture, biology, and health sciences.
Rank-based permutation approaches for non-parametric factorial designs.
Umlauft, Maria; Konietschke, Frank; Pauly, Markus
2017-11-01
Inference methods for null hypotheses formulated in terms of distribution functions in general non-parametric factorial designs are studied. The methods can be applied to continuous, ordinal or even ordered categorical data in a unified way, and are based only on ranks. In this set-up Wald-type statistics and ANOVA-type statistics are the current state of the art. The first method is asymptotically exact but a rather liberal statistical testing procedure for small to moderate sample size, while the latter is only an approximation which does not possess the correct asymptotic α level under the null. To bridge these gaps, a novel permutation approach is proposed which can be seen as a flexible generalization of the Kruskal-Wallis test to all kinds of factorial designs with independent observations. It is proven that the permutation principle is asymptotically correct while keeping its finite exactness property when data are exchangeable. The results of extensive simulation studies foster these theoretical findings. A real data set exemplifies its applicability. © 2017 The British Psychological Society.
Nonparametric estimates of drift and diffusion profiles via Fokker-Planck algebra.
Lund, Steven P; Hubbard, Joseph B; Halter, Michael
2014-11-06
Diffusion processes superimposed upon deterministic motion play a key role in understanding and controlling the transport of matter, energy, momentum, and even information in physics, chemistry, material science, biology, and communications technology. Given functions defining these random and deterministic components, the Fokker-Planck (FP) equation is often used to model these diffusive systems. Many methods exist for estimating the drift and diffusion profiles from one or more identifiable diffusive trajectories; however, when many identical entities diffuse simultaneously, it may not be possible to identify individual trajectories. Here we present a method capable of simultaneously providing nonparametric estimates for both drift and diffusion profiles from evolving density profiles, requiring only the validity of Langevin/FP dynamics. This algebraic FP manipulation provides a flexible and robust framework for estimating stationary drift and diffusion coefficient profiles, is not based on fluctuation theory or solved diffusion equations, and may facilitate predictions for many experimental systems. We illustrate this approach on experimental data obtained from a model lipid bilayer system exhibiting free diffusion and electric field induced drift. The wide range over which this approach provides accurate estimates for drift and diffusion profiles is demonstrated through simulation.
Single molecule force spectroscopy at high data acquisition: A Bayesian nonparametric analysis
Sgouralis, Ioannis; Whitmore, Miles; Lapidus, Lisa; Comstock, Matthew J.; Pressé, Steve
2018-03-01
Bayesian nonparametrics (BNPs) are poised to have a deep impact in the analysis of single molecule data as they provide posterior probabilities over entire models consistent with the supplied data, not just model parameters of one preferred model. Thus they provide an elegant and rigorous solution to the difficult problem encountered when selecting an appropriate candidate model. Nevertheless, BNPs' flexibility to learn models and their associated parameters from experimental data is a double-edged sword. Most importantly, BNPs are prone to increasing the complexity of the estimated models due to artifactual features present in time traces. Thus, because of experimental challenges unique to single molecule methods, naive application of available BNP tools is not possible. Here we consider traces with time correlations and, as a specific example, we deal with force spectroscopy traces collected at high acquisition rates. While high acquisition rates are required in order to capture dwells in short-lived molecular states, in this setup, a slow response of the optical trap instrumentation (i.e., trapped beads, ambient fluid, and tethering handles) distorts the molecular signals introducing time correlations into the data that may be misinterpreted as true states by naive BNPs. Our adaptation of BNP tools explicitly takes into consideration these response dynamics, in addition to drift and noise, and makes unsupervised time series analysis of correlated single molecule force spectroscopy measurements possible, even at acquisition rates similar to or below the trap's response times.
Trend Analysis of Pahang River Using Non-Parametric Analysis: Mann Kendalls Trend Test
International Nuclear Information System (INIS)
Nur Hishaam Sulaiman; Mohd Khairul Amri Kamarudin; Mohd Khairul Amri Kamarudin; Ahmad Dasuki Mustafa; Muhammad Azizi Amran; Fazureen Azaman; Ismail Zainal Abidin; Norsyuhada Hairoma
2015-01-01
Flood is common in Pahang especially during northeast monsoon season from November to February. Three river cross station: Lubuk Paku, Sg. Yap and Temerloh were selected as area of this study. The stream flow and water level data were gathered from DID record. Data set for this study were analysed by using non-parametric analysis, Mann-Kendall Trend Test. The results that obtained from stream flow and water level analysis indicate that there are positively significant trend for Lubuk Paku (0.001) and Sg. Yap (<0.0001) from 1972-2011 with the p-value < 0.05. Temerloh (0.178) data from 1963-2011 recorded no trend for stream flow parameter but negative trend for water level parameter. Hydrological pattern and trend are extremely affected by outside factors such as north east monsoon season that occurred in South China Sea and affected Pahang during November to March. There are other factors such as development and management of the areas which can be considered as factors affected the data and results. Hydrological Pattern is important to indicate the river trend such as stream flow and water level. It can be used as flood mitigation by local authorities. (author)
A NON-PARAMETRIC APPROACH TO CONSTRAIN THE TRANSFER FUNCTION IN REVERBERATION MAPPING
International Nuclear Information System (INIS)
Li, Yan-Rong; Wang, Jian-Min; Bai, Jin-Ming
2016-01-01
Broad emission lines of active galactic nuclei stem from a spatially extended region (broad-line region, BLR) that is composed of discrete clouds and photoionized by the central ionizing continuum. The temporal behaviors of these emission lines are blurred echoes of continuum variations (i.e., reverberation mapping, RM) and directly reflect the structures and kinematic information of BLRs through the so-called transfer function (also known as the velocity-delay map). Based on the previous works of Rybicki and Press and Zu et al., we develop an extended, non-parametric approach to determine the transfer function for RM data, in which the transfer function is expressed as a sum of a family of relatively displaced Gaussian response functions. Therefore, arbitrary shapes of transfer functions associated with complicated BLR geometry can be seamlessly included, enabling us to relax the presumption of a specified transfer function frequently adopted in previous studies and to let it be determined by observation data. We formulate our approach in a previously well-established framework that incorporates the statistical modeling of continuum variations as a damped random walk process and takes into account long-term secular variations which are irrelevant to RM signals. The application to RM data shows the fidelity of our approach.
A NON-PARAMETRIC APPROACH TO CONSTRAIN THE TRANSFER FUNCTION IN REVERBERATION MAPPING
Energy Technology Data Exchange (ETDEWEB)
Li, Yan-Rong; Wang, Jian-Min [Key Laboratory for Particle Astrophysics, Institute of High Energy Physics, Chinese Academy of Sciences, 19B Yuquan Road, Beijing 100049 (China); Bai, Jin-Ming, E-mail: liyanrong@mail.ihep.ac.cn [Yunnan Observatories, Chinese Academy of Sciences, Kunming 650011 (China)
2016-11-10
Broad emission lines of active galactic nuclei stem from a spatially extended region (broad-line region, BLR) that is composed of discrete clouds and photoionized by the central ionizing continuum. The temporal behaviors of these emission lines are blurred echoes of continuum variations (i.e., reverberation mapping, RM) and directly reflect the structures and kinematic information of BLRs through the so-called transfer function (also known as the velocity-delay map). Based on the previous works of Rybicki and Press and Zu et al., we develop an extended, non-parametric approach to determine the transfer function for RM data, in which the transfer function is expressed as a sum of a family of relatively displaced Gaussian response functions. Therefore, arbitrary shapes of transfer functions associated with complicated BLR geometry can be seamlessly included, enabling us to relax the presumption of a specified transfer function frequently adopted in previous studies and to let it be determined by observation data. We formulate our approach in a previously well-established framework that incorporates the statistical modeling of continuum variations as a damped random walk process and takes into account long-term secular variations which are irrelevant to RM signals. The application to RM data shows the fidelity of our approach.
Nonparametric Tree-Based Predictive Modeling of Storm Outages on an Electric Distribution Network.
He, Jichao; Wanik, David W; Hartman, Brian M; Anagnostou, Emmanouil N; Astitha, Marina; Frediani, Maria E B
2017-03-01
This article compares two nonparametric tree-based models, quantile regression forests (QRF) and Bayesian additive regression trees (BART), for predicting storm outages on an electric distribution network in Connecticut, USA. We evaluated point estimates and prediction intervals of outage predictions for both models using high-resolution weather, infrastructure, and land use data for 89 storm events (including hurricanes, blizzards, and thunderstorms). We found that spatially BART predicted more accurate point estimates than QRF. However, QRF produced better prediction intervals for high spatial resolutions (2-km grid cells and towns), while BART predictions aggregated to coarser resolutions (divisions and service territory) more effectively. We also found that the predictive accuracy was dependent on the season (e.g., tree-leaf condition, storm characteristics), and that the predictions were most accurate for winter storms. Given the merits of each individual model, we suggest that BART and QRF be implemented together to show the complete picture of a storm's potential impact on the electric distribution network, which would allow for a utility to make better decisions about allocating prestorm resources. © 2016 Society for Risk Analysis.
Non-parametric early seizure detection in an animal model of temporal lobe epilepsy
Talathi, Sachin S.; Hwang, Dong-Uk; Spano, Mark L.; Simonotto, Jennifer; Furman, Michael D.; Myers, Stephen M.; Winters, Jason T.; Ditto, William L.; Carney, Paul R.
2008-03-01
The performance of five non-parametric, univariate seizure detection schemes (embedding delay, Hurst scale, wavelet scale, nonlinear autocorrelation and variance energy) were evaluated as a function of the sampling rate of EEG recordings, the electrode types used for EEG acquisition, and the spatial location of the EEG electrodes in order to determine the applicability of the measures in real-time closed-loop seizure intervention. The criteria chosen for evaluating the performance were high statistical robustness (as determined through the sensitivity and the specificity of a given measure in detecting a seizure) and the lag in seizure detection with respect to the seizure onset time (as determined by visual inspection of the EEG signal by a trained epileptologist). An optimality index was designed to evaluate the overall performance of each measure. For the EEG data recorded with microwire electrode array at a sampling rate of 12 kHz, the wavelet scale measure exhibited better overall performance in terms of its ability to detect a seizure with high optimality index value and high statistics in terms of sensitivity and specificity.
International Nuclear Information System (INIS)
Zhong, X.; Ichchou, M.; Saidi, A.
2010-01-01
Various parametric skewed distributions are widely used to model the time-to-failure (TTF) in the reliability analysis of mechatronic systems, where many items are unobservable due to the high cost of testing. Estimating the parameters of those distributions becomes a challenge. Previous research has failed to consider this problem due to the difficulty of dependency modeling. Recently the methodology of Bayesian networks (BNs) has greatly contributed to the reliability analysis of complex systems. In this paper, the problem of system reliability assessment (SRA) is formulated as a BN considering the parameter uncertainty. As the quantitative specification of BN, a normal distribution representing the stochastic nature of TTF distribution is learned to capture the interactions between the basic items and their output items. The approximation inference of our continuous BN model is performed by a modified version of nonparametric belief propagation (NBP) which can avoid using a junction tree that is inefficient for the mechatronic case because of the large treewidth. After reasoning, we obtain the marginal posterior density of each TTF model parameter. Other information from diverse sources and expert priors can be easily incorporated in this SRA model to achieve more accurate results. Simulation in simple and complex cases of mechatronic systems demonstrates that the posterior of the parameter network fits the data well and the uncertainty passes effectively through our BN based SRA model by using the modified NBP.
Design Automation Using Script Languages. High-Level CAD Templates in Non-Parametric Programs
Moreno, R.; Bazán, A. M.
2017-10-01
The main purpose of this work is to study the advantages offered by the application of traditional techniques of technical drawing in processes for automation of the design, with non-parametric CAD programs, provided with scripting languages. Given that an example drawing can be solved with traditional step-by-step detailed procedures, is possible to do the same with CAD applications and to generalize it later, incorporating references. In today’s modern CAD applications, there are striking absences of solutions for building engineering: oblique projections (military and cavalier), 3D modelling of complex stairs, roofs, furniture, and so on. The use of geometric references (using variables in script languages) and their incorporation into high-level CAD templates allows the automation of processes. Instead of repeatedly creating similar designs or modifying their data, users should be able to use these templates to generate future variations of the same design. This paper presents the automation process of several complex drawing examples based on CAD script files aided with parametric geometry calculation tools. The proposed method allows us to solve complex geometry designs not currently incorporated in the current CAD applications and to subsequently create other new derivatives without user intervention. Automation in the generation of complex designs not only saves time but also increases the quality of the presentations and reduces the possibility of human errors.
Lee, Kit-Hang; Fu, Denny K.C.; Leong, Martin C.W.; Chow, Marco; Fu, Hing-Choi; Althoefer, Kaspar; Sze, Kam Yim; Yeung, Chung-Kwong
2017-01-01
Abstract Bioinspired robotic structures comprising soft actuation units have attracted increasing research interest. Taking advantage of its inherent compliance, soft robots can assure safe interaction with external environments, provided that precise and effective manipulation could be achieved. Endoscopy is a typical application. However, previous model-based control approaches often require simplified geometric assumptions on the soft manipulator, but which could be very inaccurate in the presence of unmodeled external interaction forces. In this study, we propose a generic control framework based on nonparametric and online, as well as local, training to learn the inverse model directly, without prior knowledge of the robot's structural parameters. Detailed experimental evaluation was conducted on a soft robot prototype with control redundancy, performing trajectory tracking in dynamically constrained environments. Advanced element formulation of finite element analysis is employed to initialize the control policy, hence eliminating the need for random exploration in the robot's workspace. The proposed control framework enabled a soft fluid-driven continuum robot to follow a 3D trajectory precisely, even under dynamic external disturbance. Such enhanced control accuracy and adaptability would facilitate effective endoscopic navigation in complex and changing environments. PMID:29251567
Nonparametric Second-Order Theory of Error Propagation on Motion Groups.
Wang, Yunfeng; Chirikjian, Gregory S
2008-01-01
Error propagation on the Euclidean motion group arises in a number of areas such as in dead reckoning errors in mobile robot navigation and joint errors that accumulate from the base to the distal end of kinematic chains such as manipulators and biological macromolecules. We address error propagation in rigid-body poses in a coordinate-free way. In this paper we show how errors propagated by convolution on the Euclidean motion group, SE(3), can be approximated to second order using the theory of Lie algebras and Lie groups. We then show how errors that are small (but not so small that linearization is valid) can be propagated by a recursive formula derived here. This formula takes into account errors to second-order, whereas prior efforts only considered the first-order case. Our formulation is nonparametric in the sense that it will work for probability density functions of any form (not only Gaussians). Numerical tests demonstrate the accuracy of this second-order theory in the context of a manipulator arm and a flexible needle with bevel tip.
A Non-Parametric Delphi Approach to Foster Innovation Policy Debate in Spain
Directory of Open Access Journals (Sweden)
Juan Carlos Salazar-Elena
2016-05-01
Full Text Available The aim of this paper is to identify some changes needed in Spain’s innovation policy to fill the gap between its innovation results and those of other European countries in lieu of sustainable leadership. To do this we apply the Delphi methodology to experts from academia, business, and government. To overcome the shortcomings of traditional descriptive methods, we develop an inferential analysis by following a non-parametric bootstrap method which enables us to identify important changes that should be implemented. Particularly interesting is the support found for improving the interconnections among the relevant agents of the innovation system (instead of focusing exclusively in the provision of knowledge and technological inputs through R and D activities, or the support found for “soft” policy instruments aimed at providing a homogeneous framework to assess the innovation capabilities of firms (e.g., for funding purposes. Attention to potential innovators among small and medium enterprises (SMEs and traditional industries is particularly encouraged by experts.
Nonparametric Forecasting for Biochar Utilization in Poyang Lake Eco-Economic Zone in China
Directory of Open Access Journals (Sweden)
Meng-Shiuh Chang
2014-01-01
Full Text Available Agriculture is the least profitable industry in China. However, even with large financial subsidies from the government, farmers’ living standards have had no significant impact so far due to the historical, geographical, climatic factors. The study examines and quantifies the net economic and environmental benefits by utilizing biochar as a soil amendment in eleven counties in the Poyang Lake Eco-Economic Zone. A nonparametric kernel regression model is employed to estimate the relation between the scaled environmental and economic factors, which are determined as regression variables. In addition, the partial linear and single index regression models are used for comparison. In terms of evaluations of mean squared errors, the kernel estimator, exceeding the other estimators, is employed to forecast benefits of using biochar under various scenarios. The results indicate that biochar utilization can potentially increase farmers’ income if rice is planted and the net economic benefits can be achieved up to ¥114,900. The net economic benefits are higher when the pyrolysis plant is built in the south of Poyang Lake Eco-Economic Zone than when it is built in the north as the southern land is relatively barren, and biochar can save more costs on irrigation and fertilizer use.
International Nuclear Information System (INIS)
Dimas, George; Iakovidis, Dimitris K; Karargyris, Alexandros; Ciuti, Gastone; Koulaouzidis, Anastasios
2017-01-01
Wireless capsule endoscopy is a non-invasive screening procedure of the gastrointestinal (GI) tract performed with an ingestible capsule endoscope (CE) of the size of a large vitamin pill. Such endoscopes are equipped with a usually low-frame-rate color camera which enables the visualization of the GI lumen and the detection of pathologies. The localization of the commercially available CEs is performed in the 3D abdominal space using radio-frequency (RF) triangulation from external sensor arrays, in combination with transit time estimation. State-of-the-art approaches, such as magnetic localization, which have been experimentally proved more accurate than the RF approach, are still at an early stage. Recently, we have demonstrated that CE localization is feasible using solely visual cues and geometric models. However, such approaches depend on camera parameters, many of which are unknown. In this paper the authors propose a novel non-parametric visual odometry (VO) approach to CE localization based on a feed-forward neural network architecture. The effectiveness of this approach in comparison to state-of-the-art geometric VO approaches is validated using a robotic-assisted in vitro experimental setup. (paper)
Dimas, George; Iakovidis, Dimitris K.; Karargyris, Alexandros; Ciuti, Gastone; Koulaouzidis, Anastasios
2017-09-01
Wireless capsule endoscopy is a non-invasive screening procedure of the gastrointestinal (GI) tract performed with an ingestible capsule endoscope (CE) of the size of a large vitamin pill. Such endoscopes are equipped with a usually low-frame-rate color camera which enables the visualization of the GI lumen and the detection of pathologies. The localization of the commercially available CEs is performed in the 3D abdominal space using radio-frequency (RF) triangulation from external sensor arrays, in combination with transit time estimation. State-of-the-art approaches, such as magnetic localization, which have been experimentally proved more accurate than the RF approach, are still at an early stage. Recently, we have demonstrated that CE localization is feasible using solely visual cues and geometric models. However, such approaches depend on camera parameters, many of which are unknown. In this paper the authors propose a novel non-parametric visual odometry (VO) approach to CE localization based on a feed-forward neural network architecture. The effectiveness of this approach in comparison to state-of-the-art geometric VO approaches is validated using a robotic-assisted in vitro experimental setup.
Graziani, Rebecca; Guindani, Michele; Thall, Peter F.
2015-01-01
Summary The effect of a targeted agent on a cancer patient's clinical outcome putatively is mediated through the agent's effect on one or more early biological events. This is motivated by pre-clinical experiments with cells or animals that identify such events, represented by binary or quantitative biomarkers. When evaluating targeted agents in humans, central questions are whether the distribution of a targeted biomarker changes following treatment, the nature and magnitude of this change, and whether it is associated with clinical outcome. Major difficulties in estimating these effects are that a biomarker's distribution may be complex, vary substantially between patients, and have complicated relationships with clinical outcomes. We present a probabilistically coherent framework for modeling and estimation in this setting, including a hierarchical Bayesian nonparametric mixture model for biomarkers that we use to define a functional profile of pre-versus-post treatment biomarker distribution change. The functional is similar to the receiver operating characteristic used in diagnostic testing. The hierarchical model yields clusters of individual patient biomarker profile functionals, and we use the profile as a covariate in a regression model for clinical outcome. The methodology is illustrated by analysis of a dataset from a clinical trial in prostate cancer using imatinib to target platelet-derived growth factor, with the clinical aim to improve progression-free survival time. PMID:25319212
Lee, Soojeong; Rajan, Sreeraman; Jeon, Gwanggil; Chang, Joon-Hyuk; Dajani, Hilmi R; Groza, Voicu Z
2017-06-01
Blood pressure (BP) is one of the most important vital indicators and plays a key role in determining the cardiovascular activity of patients. This paper proposes a hybrid approach consisting of nonparametric bootstrap (NPB) and machine learning techniques to obtain the characteristic ratios (CR) used in the blood pressure estimation algorithm to improve the accuracy of systolic blood pressure (SBP) and diastolic blood pressure (DBP) estimates and obtain confidence intervals (CI). The NPB technique is used to circumvent the requirement for large sample set for obtaining the CI. A mixture of Gaussian densities is assumed for the CRs and Gaussian mixture model (GMM) is chosen to estimate the SBP and DBP ratios. The K-means clustering technique is used to obtain the mixture order of the Gaussian densities. The proposed approach achieves grade "A" under British Society of Hypertension testing protocol and is superior to the conventional approach based on maximum amplitude algorithm (MAA) that uses fixed CR ratios. The proposed approach also yields a lower mean error (ME) and the standard deviation of the error (SDE) in the estimates when compared to the conventional MAA method. In addition, CIs obtained through the proposed hybrid approach are also narrower with a lower SDE. The proposed approach combining the NPB technique with the GMM provides a methodology to derive individualized characteristic ratio. The results exhibit that the proposed approach enhances the accuracy of SBP and DBP estimation and provides narrower confidence intervals for the estimates. Copyright © 2015 Elsevier Ltd. All rights reserved.