Energy Technology Data Exchange (ETDEWEB)
Suh, M. Y.; Jee, K. Y.; Park, K. K. [Korea Atomic Energy Research Institute, Taejon (Korea)
1999-08-01
This report is intended to describe the statistical methods necessary to design and conduct radiation counting experiments and evaluate the data from the experiments. The methods are described for the evaluation of the stability of a counting system and the estimation of the precision of counting data by application of probability distribution models. The methods for the determination of the uncertainty of the results calculated from the number of counts, as well as various statistical methods for the reduction of counting error are also described. 11 refs., 6 figs., 8 tabs. (Author)
International Nuclear Information System (INIS)
Suh, M. Y.; Jee, K. Y.; Park, K. K.; Park, Y. J.; Kim, W. H.
1999-08-01
This report is intended to describe the statistical methods necessary to design and conduct radiation counting experiments and evaluate the data from the experiment. The methods are described for the evaluation of the stability of a counting system and the estimation of the precision of counting data by application of probability distribution models. The methods for the determination of the uncertainty of the results calculated from the number of counts, as well as various statistical methods for the reduction of counting error are also described. (Author). 11 refs., 8 tabs., 8 figs
Experimental investigation of statistical models describing distribution of counts
International Nuclear Information System (INIS)
Salma, I.; Zemplen-Papp, E.
1992-01-01
The binomial, Poisson and modified Poisson models which are used for describing the statistical nature of the distribution of counts are compared theoretically, and conclusions for application are considered. The validity of the Poisson and the modified Poisson statistical distribution for observing k events in a short time interval is investigated experimentally for various measuring times. The experiments to measure the influence of the significant radioactive decay were performed with 89 Y m (T 1/2 =16.06 s), using a multichannel analyser (4096 channels) in the multiscaling mode. According to the results, Poisson statistics describe the counting experiment for short measuring times (up to T=0.5T 1/2 ) and its application is recommended. However, analysis of the data demonstrated, with confidence, that for long measurements (T≥T 1/2 ) Poisson distribution is not valid and the modified Poisson function is preferable. The practical implications in calculating uncertainties and in optimizing the measuring time are discussed. Differences between the standard deviations evaluated on the basis of the Poisson and binomial models are especially significant for experiments with long measuring time (T/T 1/2 ≥2) and/or large detection efficiency (ε>0.30). Optimization of the measuring time for paired observations yields the same solution for either the binomial or the Poisson distribution. (orig.)
Directory of Open Access Journals (Sweden)
Yinglin Xia
2012-01-01
Full Text Available Modeling count data from sexual behavioral outcomes involves many challenges, especially when the data exhibit a preponderance of zeros and overdispersion. In particular, the popular Poisson log-linear model is not appropriate for modeling such outcomes. Although alternatives exist for addressing both issues, they are not widely and effectively used in sex health research, especially in HIV prevention intervention and related studies. In this paper, we discuss how to analyze count outcomes distributed with excess of zeros and overdispersion and introduce appropriate model-fit indices for comparing the performance of competing models, using data from a real study on HIV prevention intervention. The in-depth look at these common issues arising from studies involving behavioral outcomes will promote sound statistical analyses and facilitate research in this and other related areas.
Non-Poisson counting statistics of a hybrid G-M counter dead time model
International Nuclear Information System (INIS)
Lee, Sang Hoon; Jae, Moosung; Gardner, Robin P.
2007-01-01
The counting statistics of a G-M counter with a considerable dead time event rate deviates from Poisson statistics. Important characteristics such as observed counting rates as a function true counting rates, variances and interval distributions were analyzed for three dead time models, non-paralyzable, paralyzable and hybrid, with the help of GMSIM, a Monte Carlo dead time effect simulator. The simulation results showed good agreements with the models in observed counting rates and variances. It was found through GMSIM simulations that the interval distribution for the hybrid model showed three distinctive regions, a complete cutoff region for the duration of the total dead time, a degraded exponential and an enhanced exponential regions. By measuring the cutoff and the duration of degraded exponential from the pulse interval distribution, it is possible to evaluate the two dead times in the hybrid model
Statistical models for the estimation of the origin-destination matrix from traffic counts
Directory of Open Access Journals (Sweden)
Anselmo Ramalho Pitombeira Neto
2017-12-01
Full Text Available In transportation planning, one of the first steps is to estimate the travel demand. The final product of the estimation process is an origin-destination (OD matrix, whose entries correspond to the number of trips between pairs of origin-destination zones in a study region. In this paper, we review the main statistical models proposed in the literature for the estimation of the OD matrix based on traffic counts. Unlike reconstruction models, statistical models do not aim at estimating the exact OD matrix corresponding to observed traffic volumes, but they rather aim at estimating the parameters of a statistical model of the population of OD matrices. Initially we define the estimation problem, emphasizing its underspecified nature, which has lead to the development of several models based on different approaches. We describe static models whose parameters are estimated by means of maximum likelihood, the method of moments, and Bayesian inference. We also describe some recent dynamic models. Following that, we discuss research questions related to the underspecification problem, model assumptions and the estimation of the route choice matrix, and indicate promising research directions.
DEFF Research Database (Denmark)
Nielsen, Martin Krarup; Vidyashankar, Anand N.; Hanlon, Bret
statistical model was therefore developed for analysis of FECRT data from multiple farms. Horse age, gender, zip code and pre-treatment egg count were incorporated into the model. Horses and farms were kept as random effects. Resistance classifications were based on model-based 95% lower confidence limit (LCL...
Statistical Methods for Unusual Count Data
DEFF Research Database (Denmark)
Guthrie, Katherine A.; Gammill, Hilary S.; Kamper-Jørgensen, Mads
2016-01-01
microchimerism data present challenges for statistical analysis, including a skewed distribution, excess zero values, and occasional large values. Methods for comparing microchimerism levels across groups while controlling for covariates are not well established. We compared statistical models for quantitative...... microchimerism values, applied to simulated data sets and 2 observed data sets, to make recommendations for analytic practice. Modeling the level of quantitative microchimerism as a rate via Poisson or negative binomial model with the rate of detection defined as a count of microchimerism genome equivalents per...... total cell equivalents tested utilizes all available data and facilitates a comparison of rates between groups. We found that both the marginalized zero-inflated Poisson model and the negative binomial model can provide unbiased and consistent estimates of the overall association of exposure or study...
Contribution to statistics in fission track counting
International Nuclear Information System (INIS)
Bigazzi, G.; Bonadonna, F.; Neto, J.C.H.
1986-01-01
In order to test the new statistical model proposed in two papers by McGee, Johnson and Naeser for calculating the standard error in fission track dating, spontaneous and induced track counts from external detector method-EDM-or similar were simulated by random numbers, assuming that, for a given uranium content, fission tracks were spatially distributed according to Poisson distributions. By the results of such a simulation it can be concluded that: (1) In EDM, (1/nsub(s)+1/nsub(I)sup(1/2) represents a reliable evaluation of the relative standard error of rhosub(s)/rhosub(I)-ratio of spontaneous to induced track densities in a sample in which nsub(s) spontaneous and nsub(I) induced tracks were counted. (2) The new model confirms the validity of the above conclusion, by applying it to the spontaneous and induced track counts whose relative standard deviations of the means were evaluated by normal sampling statistics. Population method-PM-data were also simulated; (σ-bar' 2 sub(s)+σ-bar' 2 sub(I))sup(1/2), where σ-bar'sub(s) and σ-bar'sub(I) are the above relative standard deviations of the mean, offers a reliable evaluations of the uncertainty of rhosub(s)/rhosub(I) ratio for the simple cases analyzed in the present work. (author)
De Backer, A; Martinez, G T; Rosenauer, A; Van Aert, S
2013-11-01
In the present paper, a statistical model-based method to count the number of atoms of monotype crystalline nanostructures from high resolution high-angle annular dark-field (HAADF) scanning transmission electron microscopy (STEM) images is discussed in detail together with a thorough study on the possibilities and inherent limitations. In order to count the number of atoms, it is assumed that the total scattered intensity scales with the number of atoms per atom column. These intensities are quantitatively determined using model-based statistical parameter estimation theory. The distribution describing the probability that intensity values are generated by atomic columns containing a specific number of atoms is inferred on the basis of the experimental scattered intensities. Finally, the number of atoms per atom column is quantified using this estimated probability distribution. The number of atom columns available in the observed STEM image, the number of components in the estimated probability distribution, the width of the components of the probability distribution, and the typical shape of a criterion to assess the number of components in the probability distribution directly affect the accuracy and precision with which the number of atoms in a particular atom column can be estimated. It is shown that single atom sensitivity is feasible taking the latter aspects into consideration. © 2013 Elsevier B.V. All rights reserved.
Statistical data filtration in neutron coincidence counting
International Nuclear Information System (INIS)
Beddingfield, D.H.; Menlove, H.O.
1992-11-01
We assessed the effectiveness of statistical data filtration to minimize the contribution of matrix materials in 200-ell drums to the nondestructive assay of plutonium. Those matrices were examined: polyethylene, concrete, aluminum, iron, cadmium, and lead. Statistical filtration of neutron coincidence data improved the low-end sensitivity of coincidence counters. Spurious data arising from electrical noise, matrix spallation, and geometric effects were smoothed in a predictable fashion by the statistical filter. The filter effectively lowers the minimum detectable mass limit that can be achieved for plutonium assay using passive neutron coincidence counting
Photon Counts Statistics in Leukocyte Cell Dynamics
van Wijk, Eduard; van der Greef, Jan; van Wijk, Roeland
2011-12-01
In the present experiment ultra-weak photon emission/ chemiluminescence from isolated neutrophils was recorded. It is associated with the production of reactive oxygen species (ROS) in the "respiratory burst" process which can be activated by PMA (Phorbol 12-Myristate 13-Acetate). Commonly, the reaction is demonstrated utilizing the enhancer luminol. However, with the use of highly sensitive photomultiplier equipment it is also recorded without enhancer. In that case, it can be hypothesized that photon count statistics may assist in understanding the underlying metabolic activity and cooperation of these cells. To study this hypothesis leukocytes were stimulated with PMA and increased photon signals were recorded in the quasi stable period utilizing Fano factor analysis at different window sizes. The Fano factor is defined by the variance over the mean of the number of photon within the observation time. The analysis demonstrated that the Fano factor of true signal and not of the surrogate signals obtained by random shuffling increases when the window size increased. It is concluded that photon count statistics, in particular Fano factor analysis, provides information regarding leukocyte interactions. It opens the perspective to utilize this analytical procedure in (in vivo) inflammation research. However, this needs further validation.
Energy Technology Data Exchange (ETDEWEB)
Jones, K.W. [Los Alamos National Lab., NM (United States); Browman, A. [Amparo Corp., Sante Fe, NM (United States)
1997-01-01
The BPI Model 2080 Pulsed Neutron Detector has been used for over seven years as an area radiation monitor and dose limiter at the LANSCE accelerator complex. Operating experience and changing environments over this time have revealed several vulnerabilities (susceptibility to electrical noise, paralysis in high dose rate fields, etc.). Identified vulnerabilities have been connected; these modifications include component replacement and circuit design changes. The data and experiments leading to these modifications will be presented and discussed. Calibration of the instrument is performed in mixed static gamma and neutron source fields. The statistical characteristics of the Geiger-Muller tubes coupled with significantly different sensitivity to gamma and neutron doses require that careful attention be paid to acceptable fluctuations in dose rate over time during calibration. The performance of the instrument has been modeled using simple Poisson statistics and the operating characteristics of the Geiger-Muller tubes. The results are in excellent agreement with measurements. The analysis and comparison with experimental data will be presented.
Nathaniel E. Seavy; Suhel Quader; John D. Alexander; C. John Ralph
2005-01-01
The success of avian monitoring programs to effectively guide management decisions requires that studies be efficiently designed and data be properly analyzed. A complicating factor is that point count surveys often generate data with non-normal distributional properties. In this paper we review methods of dealing with deviations from normal assumptions, and we focus...
SUBMILLIMETER NUMBER COUNTS FROM STATISTICAL ANALYSIS OF BLAST MAPS
International Nuclear Information System (INIS)
Patanchon, Guillaume; Ade, Peter A. R.; Griffin, Matthew; Hargrave, Peter C.; Mauskopf, Philip; Moncelsi, Lorenzo; Pascale, Enzo; Bock, James J.; Chapin, Edward L.; Halpern, Mark; Marsden, Gaelen; Scott, Douglas; Devlin, Mark J.; Dicker, Simon R.; Klein, Jeff; Rex, Marie; Gundersen, Joshua O.; Hughes, David H.; Netterfield, Calvin B.; Olmi, Luca
2009-01-01
We describe the application of a statistical method to estimate submillimeter galaxy number counts from confusion-limited observations by the Balloon-borne Large Aperture Submillimeter Telescope (BLAST). Our method is based on a maximum likelihood fit to the pixel histogram, sometimes called 'P(D)', an approach which has been used before to probe faint counts, the difference being that here we advocate its use even for sources with relatively high signal-to-noise ratios. This method has an advantage over standard techniques of source extraction in providing an unbiased estimate of the counts from the bright end down to flux densities well below the confusion limit. We specifically analyze BLAST observations of a roughly 10 deg 2 map centered on the Great Observatories Origins Deep Survey South field. We provide estimates of number counts at the three BLAST wavelengths 250, 350, and 500 μm; instead of counting sources in flux bins we estimate the counts at several flux density nodes connected with power laws. We observe a generally very steep slope for the counts of about -3.7 at 250 μm, and -4.5 at 350 and 500 μm, over the range ∼0.02-0.5 Jy, breaking to a shallower slope below about 0.015 Jy at all three wavelengths. We also describe how to estimate the uncertainties and correlations in this method so that the results can be used for model-fitting. This method should be well suited for analysis of data from the Herschel satellite.
Counting statistics: is anything really there
Energy Technology Data Exchange (ETDEWEB)
Glynn, J.E.; Grasty, R.L. (Geological Survey of Canada, Ottawa, ON (Canada))
1990-01-01
A nonparametric technique is developed for estimating the confidence interval on concentration inhomogeneities in samples where the observations are Poisson distributed. The technique is illustrated by applying it to a series of {gamma}-ray measurements on a subset of 49 samples drawn from a set of 1000 laboratory {gamma}-ray reference standards. Analysis of the data indicated that at the 50% confidence level there was a small but significance variation in the sample concentrations. The results were confirmed by application of the Bootstrap method to the same data set. The technique has several advantages over standard analysis of variance techniques and can be applied to a range of similar nuclear counting problems. (author).
Theory of overdispersion in counting statistics caused by fluctuating probabilities
International Nuclear Information System (INIS)
Semkow, Thomas M.
1999-01-01
It is shown that the random Lexis fluctuations of probabilities such as probability of decay or detection cause the counting statistics to be overdispersed with respect to the classical binomial, Poisson, or Gaussian distributions. The generating and the distribution functions for the overdispersed counting statistics are derived. Applications to radioactive decay with detection and more complex experiments are given, as well as distinguishing between the source and background, in the presence of overdispersion. Monte-Carlo verifications are provided
Correlations and Counting Statistics of an Atom Laser
International Nuclear Information System (INIS)
Oettl, Anton; Ritter, Stephan; Koehl, Michael; Esslinger, Tilman
2005-01-01
We demonstrate time-resolved counting of single atoms extracted from a weakly interacting Bose-Einstein condensate of 87 Rb atoms. The atoms are detected with a high-finesse optical cavity and single atom transits are identified. An atom laser beam is formed by continuously output coupling atoms from the Bose-Einstein condensate. We investigate the full counting statistics of this beam and measure its second order correlation function g (2) (τ) in a Hanbury Brown-Twiss type experiment. For the monoenergetic atom laser we observe a constant correlation function g (2) (τ)=1.00±0.01 and an atom number distribution close to a Poissonian statistics. A pseudothermal atomic beam shows a bunching behavior and a Bose distributed counting statistics
Reducing bias in the analysis of counting statistics data
International Nuclear Information System (INIS)
Hammersley, A.P.; Antoniadis, A.
1997-01-01
In the analysis of counting statistics data it is common practice to estimate the variance of the measured data points as the data points themselves. This practice introduces a bias into the results of further analysis which may be significant, and under certain circumstances lead to false conclusions. In the case of normal weighted least squares fitting this bias is quantified and methods to avoid it are proposed. (orig.)
Hybrid statistics-simulations based method for atom-counting from ADF STEM images
Energy Technology Data Exchange (ETDEWEB)
De wael, Annelies, E-mail: annelies.dewael@uantwerpen.be [Electron Microscopy for Materials Science (EMAT), University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp (Belgium); De Backer, Annick [Electron Microscopy for Materials Science (EMAT), University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp (Belgium); Jones, Lewys; Nellist, Peter D. [Department of Materials, University of Oxford, Parks Road, OX1 3PH Oxford (United Kingdom); Van Aert, Sandra, E-mail: sandra.vanaert@uantwerpen.be [Electron Microscopy for Materials Science (EMAT), University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp (Belgium)
2017-06-15
A hybrid statistics-simulations based method for atom-counting from annular dark field scanning transmission electron microscopy (ADF STEM) images of monotype crystalline nanostructures is presented. Different atom-counting methods already exist for model-like systems. However, the increasing relevance of radiation damage in the study of nanostructures demands a method that allows atom-counting from low dose images with a low signal-to-noise ratio. Therefore, the hybrid method directly includes prior knowledge from image simulations into the existing statistics-based method for atom-counting, and accounts in this manner for possible discrepancies between actual and simulated experimental conditions. It is shown by means of simulations and experiments that this hybrid method outperforms the statistics-based method, especially for low electron doses and small nanoparticles. The analysis of a simulated low dose image of a small nanoparticle suggests that this method allows for far more reliable quantitative analysis of beam-sensitive materials. - Highlights: • A hybrid method for atom-counting from ADF STEM images is introduced. • Image simulations are incorporated into a statistical framework in a reliable manner. • Limits of the existing methods for atom-counting are far exceeded. • Reliable counting results from an experimental low dose image are obtained. • Progress towards reliable quantitative analysis of beam-sensitive materials is made.
Predictive Model Assessment for Count Data
National Research Council Canada - National Science Library
Czado, Claudia; Gneiting, Tilmann; Held, Leonhard
2007-01-01
.... In case studies, we critique count regression models for patent data, and assess the predictive performance of Bayesian age-period-cohort models for larynx cancer counts in Germany. Key words: Calibration...
Spatial Statistics for Tumor Cell Counting and Classification
Wirjadi, Oliver; Kim, Yoo-Jin; Breuel, Thomas
To count and classify cells in histological sections is a standard task in histology. One example is the grading of meningiomas, benign tumors of the meninges, which requires to assess the fraction of proliferating cells in an image. As this process is very time consuming when performed manually, automation is required. To address such problems, we propose a novel application of Markov point process methods in computer vision, leading to algorithms for computing the locations of circular objects in images. In contrast to previous algorithms using such spatial statistics methods in image analysis, the present one is fully trainable. This is achieved by combining point process methods with statistical classifiers. Using simulated data, the method proposed in this paper will be shown to be more accurate and more robust to noise than standard image processing methods. On the publicly available SIMCEP benchmark for cell image analysis algorithms, the cell count performance of the present paper is significantly more accurate than results published elsewhere, especially when cells form dense clusters. Furthermore, the proposed system performs as well as a state-of-the-art algorithm for the computer-aided histological grading of meningiomas when combined with a simple k-nearest neighbor classifier for identifying proliferating cells.
Statistical method for resolving the photon-photoelectron-counting inversion problem
International Nuclear Information System (INIS)
Wu Jinlong; Li Tiejun; Peng, Xiang; Guo Hong
2011-01-01
A statistical inversion method is proposed for the photon-photoelectron-counting statistics in quantum key distribution experiment. With the statistical viewpoint, this problem is equivalent to the parameter estimation for an infinite binomial mixture model. The coarse-graining idea and Bayesian methods are applied to deal with this ill-posed problem, which is a good simple example to show the successful application of the statistical methods to the inverse problem. Numerical results show the applicability of the proposed strategy. The coarse-graining idea for the infinite mixture models should be general to be used in the future.
Full counting statistics of a single-molecule quantum dot
Dong, Bing; Ding, G. H.; Lei, X. L.
2013-08-01
We investigate the full counting statistics of a single quantum dot strongly coupled to a local phonon and weakly tunnel connected to two metallic electrodes. By employing the generalized nonequilibrium Green-function method and the Lang-Firsov transformation, we derive an explicit analytical formula for the cumulant generating function, which makes one able to identify distinctly the elastic and inelastic contributions to the current and zero-frequency shot noise. We find that at zero temperature, the inelastic effect causes upward steps in the current and downward jumps in the noise at the bias voltages corresponding to the opening of the inelastic channels, which are ascribed to the vibration-induced complex dependencies of electronic self-energies on the energy and bias voltage. More interestingly, the Fano factor exhibits oscillatory behavior with increasing bias voltage and its minimum value is observed to be smaller than one-half.
Rectifying full-counting statistics in a spin Seebeck engine
Tang, Gaomin; Chen, Xiaobin; Ren, Jie; Wang, Jian
2018-02-01
In terms of the nonequilibrium Green's function framework, we formulate the full-counting statistics of conjugate thermal spin transport in a spin Seebeck engine, which is made by a metal-ferromagnet insulator interface driven by a temperature bias. We obtain general expressions of scaled cumulant generating functions of both heat and spin currents that hold special fluctuation symmetry relations, and demonstrate intriguing properties, such as rectification and negative differential effects of high-order fluctuations of thermal excited spin current, maximum output spin power, and efficiency. The transport and noise depend on the strongly fluctuating electron density of states at the interface. The results are relevant for designing an efficient spin Seebeck engine and can broaden our view in nonequilibrium thermodynamics and the nonlinear phenomenon in quantum transport systems.
2012-01-01
Background A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). Results The instruments under study provide excellent
Directory of Open Access Journals (Sweden)
Adrion Christine
2012-09-01
Full Text Available Abstract Background A statistical analysis plan (SAP is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs. The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC or probability integral transform (PIT, and by using proper scoring rules (e.g. the logarithmic score. Results The instruments under study
Sakhr, Jamal; Nieminen, John M.
2018-03-01
Two decades ago, Wang and Ong, [Phys. Rev. A 55, 1522 (1997)], 10.1103/PhysRevA.55.1522 hypothesized that the local box-counting dimension of a discrete quantum spectrum should depend exclusively on the nearest-neighbor spacing distribution (NNSD) of the spectrum. In this Rapid Communication, we validate their hypothesis by deriving an explicit formula for the local box-counting dimension of a countably-infinite discrete quantum spectrum. This formula expresses the local box-counting dimension of a spectrum in terms of single and double integrals of the NNSD of the spectrum. As applications, we derive an analytical formula for Poisson spectra and closed-form approximations to the local box-counting dimension for spectra having Gaussian orthogonal ensemble (GOE), Gaussian unitary ensemble (GUE), and Gaussian symplectic ensemble (GSE) spacing statistics. In the Poisson and GOE cases, we compare our theoretical formulas with the published numerical data of Wang and Ong and observe excellent agreement between their data and our theory. We also study numerically the local box-counting dimensions of the Riemann zeta function zeros and the alternate levels of GOE spectra, which are often used as numerical models of spectra possessing GUE and GSE spacing statistics, respectively. In each case, the corresponding theoretical formula is found to accurately describe the numerically computed local box-counting dimension.
Modelling the Covariance Structure in Marginal Multivariate Count Models
DEFF Research Database (Denmark)
Bonat, W. H.; Olivero, J.; Grande-Vega, M.
2017-01-01
The main goal of this article is to present a flexible statistical modelling framework to deal with multivariate count data along with longitudinal and repeated measures structures. The covariance structure for each response variable is defined in terms of a covariance link function combined...... with a matrix linear predictor involving known matrices. In order to specify the joint covariance matrix for the multivariate response vector, the generalized Kronecker product is employed. We take into account the count nature of the data by means of the power dispersion function associated with the Poisson...
Goldstein, Harvey
2011-01-01
This book provides a clear introduction to this important area of statistics. The author provides a wide of coverage of different kinds of multilevel models, and how to interpret different statistical methodologies and algorithms applied to such models. This 4th edition reflects the growth and interest in this area and is updated to include new chapters on multilevel models with mixed response types, smoothing and multilevel data, models with correlated random effects and modeling with variance.
Sampling, Probability Models and Statistical Reasoning Statistical ...
Indian Academy of Sciences (India)
Home; Journals; Resonance – Journal of Science Education; Volume 1; Issue 5. Sampling, Probability Models and Statistical Reasoning Statistical Inference. Mohan Delampady V R Padmawar. General Article Volume 1 Issue 5 May 1996 pp 49-58 ...
Sampling, Probability Models and Statistical Reasoning Statistical
Indian Academy of Sciences (India)
Home; Journals; Resonance – Journal of Science Education; Volume 1; Issue 5. Sampling, Probability Models and Statistical Reasoning Statistical Inference. Mohan Delampady V R Padmawar. General Article Volume 1 Issue 5 May 1996 pp 49-58 ...
Hybrid statistics-simulations based method for atom-counting from ADF STEM images.
De Wael, Annelies; De Backer, Annick; Jones, Lewys; Nellist, Peter D; Van Aert, Sandra
2017-06-01
A hybrid statistics-simulations based method for atom-counting from annular dark field scanning transmission electron microscopy (ADF STEM) images of monotype crystalline nanostructures is presented. Different atom-counting methods already exist for model-like systems. However, the increasing relevance of radiation damage in the study of nanostructures demands a method that allows atom-counting from low dose images with a low signal-to-noise ratio. Therefore, the hybrid method directly includes prior knowledge from image simulations into the existing statistics-based method for atom-counting, and accounts in this manner for possible discrepancies between actual and simulated experimental conditions. It is shown by means of simulations and experiments that this hybrid method outperforms the statistics-based method, especially for low electron doses and small nanoparticles. The analysis of a simulated low dose image of a small nanoparticle suggests that this method allows for far more reliable quantitative analysis of beam-sensitive materials. Copyright © 2017 Elsevier B.V. All rights reserved.
Statistical analysis of data from dilution assays with censored correlated counts.
Quiroz, Jorge; Wilson, Jeffrey R; Roychoudhury, Satrajit
2012-01-01
Frequently, count data obtained from dilution assays are subject to an upper detection limit, and as such, data obtained from these assays are usually censored. Also, counts from the same subject at different dilution levels are correlated. Ignoring the censoring and the correlation may provide unreliable and misleading results. Therefore, any meaningful data modeling requires that the censoring and the correlation be simultaneously addressed. Such comprehensive approaches of modeling censoring and correlation are not widely used in the analysis of dilution assays data. Traditionally, these data are analyzed using a general linear model on a logarithmic-transformed average count per subject. However, this traditional approach ignores the between-subject variability and risks, providing inconsistent results and unreliable conclusions. In this paper, we propose the use of a censored negative binomial model with normal random effects to analyze such data. This model addresses, in addition to the censoring and the correlation, any overdispersion that may be present in count data. The model is shown to be widely accessible through the use of several modern statistical software. Copyright © 2012 John Wiley & Sons, Ltd.
Diffeomorphic Statistical Deformation Models
DEFF Research Database (Denmark)
Hansen, Michael Sass; Hansen, Mads/Fogtman; Larsen, Rasmus
2007-01-01
In this paper we present a new method for constructing diffeomorphic statistical deformation models in arbitrary dimensional images with a nonlinear generative model and a linear parameter space. Our deformation model is a modified version of the diffeomorphic model introduced by Cootes et al....... The modifications ensure that no boundary restriction has to be enforced on the parameter space to prevent folds or tears in the deformation field. For straightforward statistical analysis, principal component analysis and sparse methods, we assume that the parameters for a class of deformations lie on a linear...
Counting Processes for Retail Default Modeling
DEFF Research Database (Denmark)
Kiefer, Nicholas Maximilian; Larson, C. Erik
in a discrete state space. In a simple case, the states could be default/non-default; in other models relevant for credit modeling the states could be credit scores or payment status (30 dpd, 60 dpd, etc.). Here we focus on the use of stochastic counting processes for mortgage default modeling, using data...
Statistical analysis of the direct count method for enumerating bacteria.
Kirchman, D; Sigda, J; Kapuscinski, R; Mitchell, R
1982-08-01
The direct count method for enumerating bacteria in natural environments is widely used. This paper analyzes the sources of variation contributed by the various levels of the method: subsamples, filters, and microscope fields. Based on a nested analysis of variance, we show that most of the variance (less than 80%) is caused by the fields and that the filters contributed nearly all of the remaining variance. The replication at each of the levels determines the total cost and error of a measurement. We compared several sampling schemes, including an optimal strategy which gives the lowest possible variance for a given cost. We recommend that preparing one filter from one subsample is adequate only if the samples are closely spaced in time or distance; otherwise, one filter should be prepared from two or preferably three subsamples. This sampling scheme emphasizes the importance of the highest level of replication. Our analysis shows that the accuracy of the direct count method can be substantially improved (by 20 to 50%) without a large increase in cost when the proper degree of replication at each level is performed.
Farnsworth, G.L.; Nichols, J.D.; Sauer, J.R.; Fancy, S.G.; Pollock, K.H.; Shriner, S.A.; Simons, T.R.; Ralph, C. John; Rich, Terrell D.
2005-01-01
Point counts are a standard sampling procedure for many bird species, but lingering concerns still exist about the quality of information produced from the method. It is well known that variation in observer ability and environmental conditions can influence the detection probability of birds in point counts, but many biologists have been reluctant to abandon point counts in favor of more intensive approaches to counting. However, over the past few years a variety of statistical and methodological developments have begun to provide practical ways of overcoming some of the problems with point counts. We describe some of these approaches, and show how they can be integrated into standard point count protocols to greatly enhance the quality of the information. Several tools now exist for estimation of detection probability of birds during counts, including distance sampling, double observer methods, time-depletion (removal) methods, and hybrid methods that combine these approaches. Many counts are conducted in habitats that make auditory detection of birds much more likely than visual detection. As a framework for understanding detection probability during such counts, we propose separating two components of the probability a bird is detected during a count into (1) the probability a bird vocalizes during the count and (2) the probability this vocalization is detected by an observer. In addition, we propose that some measure of the area sampled during a count is necessary for valid inferences about bird populations. This can be done by employing fixed-radius counts or more sophisticated distance-sampling models. We recommend any studies employing point counts be designed to estimate detection probability and to include a measure of the area sampled.
Optimization of counting time using count statistics on a diffraction beamline
Energy Technology Data Exchange (ETDEWEB)
Marais, D., E-mail: Deon.Marais@necsa.co.za [Research and Development Division, South African Nuclear Energy Corporation (Necsa) SOC Limited, PO Box 582, Pretoria 0001 (South Africa); School of Mechanical and Nuclear Engineering, North-West University, Potchefstroom 2520 (South Africa); Venter, A.M., E-mail: Andrew.Venter@necsa.co.za [Research and Development Division, South African Nuclear Energy Corporation (Necsa) SOC Limited, PO Box 582, Pretoria 0001 (South Africa); Faculty of Agriculture Science and Technology, North-West University, Mahikeng 2790 (South Africa); Markgraaff, J., E-mail: Johan.Markgraaff@nwu.ac.za [School of Mechanical and Nuclear Engineering, North-West University, Potchefstroom 2520 (South Africa)
2016-05-11
The feasibility of an alternative data acquisition strategy to improve the efficiency of beam time usage with neutron strain scanner instruments is demonstrated. By performing strain measurements against set statistical criteria, rather than time, not only leads to substantially reduced sample investigation time but also renders data of similar quality throughout.
Optimization of counting time using count statistics on a diffraction beamline
Marais, D.; Venter, A. M.; Markgraaff, J.
2016-05-01
The feasibility of an alternative data acquisition strategy to improve the efficiency of beam time usage with neutron strain scanner instruments is demonstrated. By performing strain measurements against set statistical criteria, rather than time, not only leads to substantially reduced sample investigation time but also renders data of similar quality throughout.
Szöcs, Eduard; Schäfer, Ralf B
2015-09-01
Ecotoxicologists often encounter count and proportion data that are rarely normally distributed. To meet the assumptions of the linear model, such data are usually transformed or non-parametric methods are used if the transformed data still violate the assumptions. Generalized linear models (GLMs) allow to directly model such data, without the need for transformation. Here, we compare the performance of two parametric methods, i.e., (1) the linear model (assuming normality of transformed data), (2) GLMs (assuming a Poisson, negative binomial, or binomially distributed response), and (3) non-parametric methods. We simulated typical data mimicking low replicated ecotoxicological experiments of two common data types (counts and proportions from counts). We compared the performance of the different methods in terms of statistical power and Type I error for detecting a general treatment effect and determining the lowest observed effect concentration (LOEC). In addition, we outlined differences on a real-world mesocosm data set. For count data, we found that the quasi-Poisson model yielded the highest power. The negative binomial GLM resulted in increased Type I errors, which could be fixed using the parametric bootstrap. For proportions, binomial GLMs performed better than the linear model, except to determine LOEC at extremely low sample sizes. The compared non-parametric methods had generally lower power. We recommend that counts in one-factorial experiments should be analyzed using quasi-Poisson models and proportions from counts by binomial GLMs. These methods should become standard in ecotoxicology.
Statistical Model for Content Extraction
DEFF Research Database (Denmark)
Qureshi, Pir Abdul Rasool; Memon, Nasrullah
2011-01-01
We present a statistical model for content extraction from HTML documents. The model operates on Document Object Model (DOM) tree of the corresponding HTML document. It evaluates each tree node and associated statistical features to predict significance of the node towards overall content of the ...... also describe the significance of the model in the domain of counterterrorism and open source intelligence....
Methods of statistical model estimation
Hilbe, Joseph
2013-01-01
Methods of Statistical Model Estimation examines the most important and popular methods used to estimate parameters for statistical models and provide informative model summary statistics. Designed for R users, the book is also ideal for anyone wanting to better understand the algorithms used for statistical model fitting. The text presents algorithms for the estimation of a variety of regression procedures using maximum likelihood estimation, iteratively reweighted least squares regression, the EM algorithm, and MCMC sampling. Fully developed, working R code is constructed for each method. Th
Counting statistics of chaotic resonances at optical frequencies: Theory and experiments
Lippolis, Domenico; Wang, Li; Xiao, Yun-Feng
2017-07-01
A deformed dielectric microcavity is used as an experimental platform for the analysis of the statistics of chaotic resonances, in the perspective of testing fractal Weyl laws at optical frequencies. In order to surmount the difficulties that arise from reading strongly overlapping spectra, we exploit the mixed nature of the phase space at hand, and only count the high-Q whispering-gallery modes (WGMs) directly. That enables us to draw statistical information on the more lossy chaotic resonances, coupled to the high-Q regular modes via dynamical tunneling. Three different models [classical, Random-Matrix-Theory (RMT) based, semiclassical] to interpret the experimental data are discussed. On the basis of least-squares analysis, theoretical estimates of Ehrenfest time, and independent measurements, we find that a semiclassically modified RMT-based expression best describes the experiment in all its realizations, particularly when the resonator is coupled to visible light, while RMT alone still works quite well in the infrared. In this work we reexamine and substantially extend the results of a short paper published earlier [L. Wang et al., Phys. Rev. E 93, 040201(R) (2016), 10.1103/PhysRevE.93.040201].
Exclusion statistics and integrable models
International Nuclear Information System (INIS)
Mashkevich, S.
1998-01-01
The definition of exclusion statistics that was given by Haldane admits a 'statistical interaction' between distinguishable particles (multispecies statistics). For such statistics, thermodynamic quantities can be evaluated exactly; explicit expressions are presented here for cluster coefficients. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models of the Calogero-Sutherland type. The interesting questions of generalizing this correspondence to the higher-dimensional and the multispecies cases remain essentially open; however, our results provide some hints as to searches for the models in question
Sensometrics: Thurstonian and Statistical Models
DEFF Research Database (Denmark)
Christensen, Rune Haubo Bojesen
This thesis is concerned with the development and bridging of Thurstonian and statistical models for sensory discrimination testing as applied in the scientific discipline of sensometrics. In sensory discrimination testing sensory differences between products are detected and quantified by the us...... of generalized linear mixed models, cumulative link models and cumulative link mixed models. The relation between the Wald, likelihood and score statistics is expanded upon using the shape of the (profile) likelihood function as common reference....
Mixed models for repeated count data.
Duijn, Maria Aukje Julianne van
1993-01-01
Discrete data resulting from repeated counts are often collected in various fields of scientific research. They may come from experiments where, in various conditions or tasks, the number of certain happenings (e.g. a right or wrong answer) are counted. When the data are balanced (i.e., every
Statistical modeling for degradation data
Lio, Yuhlong; Ng, Hon; Tsai, Tzong-Ru
2017-01-01
This book focuses on the statistical aspects of the analysis of degradation data. In recent years, degradation data analysis has come to play an increasingly important role in different disciplines such as reliability, public health sciences, and finance. For example, information on products’ reliability can be obtained by analyzing degradation data. In addition, statistical modeling and inference techniques have been developed on the basis of different degradation measures. The book brings together experts engaged in statistical modeling and inference, presenting and discussing important recent advances in degradation data analysis and related applications. The topics covered are timely and have considerable potential to impact both statistics and reliability engineering.
Statistical modelling with quantile functions
Gilchrist, Warren
2000-01-01
Galton used quantiles more than a hundred years ago in describing data. Tukey and Parzen used them in the 60s and 70s in describing populations. Since then, the authors of many papers, both theoretical and practical, have used various aspects of quantiles in their work. Until now, however, no one put all the ideas together to form what turns out to be a general approach to statistics.Statistical Modelling with Quantile Functions does just that. It systematically examines the entire process of statistical modelling, starting with using the quantile function to define continuous distributions. The author shows that by using this approach, it becomes possible to develop complex distributional models from simple components. A modelling kit can be developed that applies to the whole model - deterministic and stochastic components - and this kit operates by adding, multiplying, and transforming distributions rather than data.Statistical Modelling with Quantile Functions adds a new dimension to the practice of stati...
Statistical validation of stochastic models
Energy Technology Data Exchange (ETDEWEB)
Hunter, N.F. [Los Alamos National Lab., NM (United States). Engineering Science and Analysis Div.; Barney, P.; Paez, T.L. [Sandia National Labs., Albuquerque, NM (United States). Experimental Structural Dynamics Dept.; Ferregut, C.; Perez, L. [Univ. of Texas, El Paso, TX (United States). Dept. of Civil Engineering
1996-12-31
It is common practice in structural dynamics to develop mathematical models for system behavior, and the authors are now capable of developing stochastic models, i.e., models whose parameters are random variables. Such models have random characteristics that are meant to simulate the randomness in characteristics of experimentally observed systems. This paper suggests a formal statistical procedure for the validation of mathematical models of stochastic systems when data taken during operation of the stochastic system are available. The statistical characteristics of the experimental system are obtained using the bootstrap, a technique for the statistical analysis of non-Gaussian data. The authors propose a procedure to determine whether or not a mathematical model is an acceptable model of a stochastic system with regard to user-specified measures of system behavior. A numerical example is presented to demonstrate the application of the technique.
Modeling patterns in count data using loglinear and related models
International Nuclear Information System (INIS)
Atwood, C.L.
1995-12-01
This report explains the use of loglinear and logit models, for analyzing Poisson and binomial counts in the presence of explanatory variables. The explanatory variables may be unordered categorical variables or numerical variables, or both. The report shows how to construct models to fit data, and how to test whether a model is too simple or too complex. The appropriateness of the methods with small data sets is discussed. Several example analyses, using the SAS computer package, illustrate the methods
Statistical modelling of citation exchange between statistics journals.
Varin, Cristiano; Cattelan, Manuela; Firth, David
2016-01-01
Rankings of scholarly journals based on citation data are often met with scepticism by the scientific community. Part of the scepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of researchers. The paper focuses on analysis of the table of cross-citations among a selection of statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care to avoid potential overinterpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's research assessment exercise shows strong correlation at aggregate level between assessed research quality and journal citation 'export scores' within the discipline of statistics.
Statistical Models for Social Networks
Snijders, Tom A. B.; Cook, KS; Massey, DS
2011-01-01
Statistical models for social networks as dependent variables must represent the typical network dependencies between tie variables such as reciprocity, homophily, transitivity, etc. This review first treats models for single (cross-sectionally observed) networks and then for network dynamics. For
On-line statistical processing of radiation detector pulse trains with time-varying count rates
International Nuclear Information System (INIS)
Apostolopoulos, G.
2008-01-01
Statistical analysis is of primary importance for the correct interpretation of nuclear measurements, due to the inherent random nature of radioactive decay processes. This paper discusses the application of statistical signal processing techniques to the random pulse trains generated by radiation detectors. The aims of the presented algorithms are: (i) continuous, on-line estimation of the underlying time-varying count rate θ(t) and its first-order derivative dθ/dt; (ii) detection of abrupt changes in both of these quantities and estimation of their new value after the change point. Maximum-likelihood techniques, based on the Poisson probability distribution, are employed for the on-line estimation of θ and dθ/dt. Detection of abrupt changes is achieved on the basis of the generalized likelihood ratio statistical test. The properties of the proposed algorithms are evaluated by extensive simulations and possible applications for on-line radiation monitoring are discussed
Tutorial on Using Regression Models with Count Outcomes Using R
Directory of Open Access Journals (Sweden)
A. Alexander Beaujean
2016-02-01
Full Text Available Education researchers often study count variables, such as times a student reached a goal, discipline referrals, and absences. Most researchers that study these variables use typical regression methods (i.e., ordinary least-squares either with or without transforming the count variables. In either case, using typical regression for count data can produce parameter estimates that are biased, thus diminishing any inferences made from such data. As count-variable regression models are seldom taught in training programs, we present a tutorial to help educational researchers use such methods in their own research. We demonstrate analyzing and interpreting count data using Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regression models. The count regression methods are introduced through an example using the number of times students skipped class. The data for this example are freely available and the R syntax used run the example analyses are included in the Appendix.
Modeling Zero-Inflated and Overdispersed Count Data: An Empirical Study of School Suspensions
Desjardins, Christopher David
2016-01-01
The purpose of this article is to develop a statistical model that best explains variability in the number of school days suspended. Number of school days suspended is a count variable that may be zero-inflated and overdispersed relative to a Poisson model. Four models were examined: Poisson, negative binomial, Poisson hurdle, and negative…
Statistical Model of Extreme Shear
DEFF Research Database (Denmark)
Hansen, Kurt Schaldemose; Larsen, Gunner Chr.
2005-01-01
In order to continue cost-optimisation of modern large wind turbines, it is important to continuously increase the knowledge of wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... (PDF) of turbulence driven short-term extreme wind shear events, conditioned on the mean wind speed, for an arbitrary recurrence period. The model is based on an asymptotic expansion, and only a few and easily accessible parameters are needed as input. The model of the extreme PDF is supplemented...... by a model that, on a statistically consistent basis, describes the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of full-scale measurements recorded with a high sampling rate...
Statistical Model of Extreme Shear
DEFF Research Database (Denmark)
Larsen, Gunner Chr.; Hansen, Kurt Schaldemose
2004-01-01
In order to continue cost-optimisation of modern large wind turbines, it is important to continously increase the knowledge on wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... (PDF) of turbulence driven short-term extreme wind shear events, conditioned on the mean wind speed, for an arbitrary recurrence period. The model is based on an asymptotic expansion, and only a few and easily accessible parameters are needed as input. The model of the extreme PDF is supplemented...... by a model that, on a statistically consistent basis, describe the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of high-sampled full-scale time series measurements...
A Statistical Programme Assignment Model
DEFF Research Database (Denmark)
Rosholm, Michael; Staghøj, Jonas; Svarer, Michael
When treatment effects of active labour market programmes are heterogeneous in an observable way across the population, the allocation of the unemployed into different programmes becomes a particularly important issue. In this paper, we present a statistical model designed to improve the present...
Textual information access statistical models
Gaussier, Eric
2013-01-01
This book presents statistical models that have recently been developed within several research communities to access information contained in text collections. The problems considered are linked to applications aiming at facilitating information access:- information extraction and retrieval;- text classification and clustering;- opinion mining;- comprehension aids (automatic summarization, machine translation, visualization).In order to give the reader as complete a description as possible, the focus is placed on the probability models used in the applications
Optimization of statistical methods for HpGe gamma-ray spectrometer used in wide count rate ranges
Energy Technology Data Exchange (ETDEWEB)
Gervino, G., E-mail: gervino@to.infn.it [UNITO - Università di Torino, Dipartimento di Fisica, Turin (Italy); INFN - Istituto Nazionale di Fisica Nucleare, Sez. Torino, Turin (Italy); Mana, G. [INRIM - Istituto Nazionale di Ricerca Metrologica, Turin (Italy); Palmisano, C. [UNITO - Università di Torino, Dipartimento di Fisica, Turin (Italy); INRIM - Istituto Nazionale di Ricerca Metrologica, Turin (Italy)
2016-07-11
The need to perform γ-ray measurements with HpGe detectors is a common technique in many fields such as nuclear physics, radiochemistry, nuclear medicine and neutron activation analysis. The use of HpGe detectors is chosen in situations where isotope identification is needed because of their excellent resolution. Our challenge is to obtain the “best” spectroscopy data possible in every measurement situation. “Best” is a combination of statistical (number of counts) and spectral quality (peak, width and position) over a wide range of counting rates. In this framework, we applied Bayesian methods and the Ellipsoidal Nested Sampling (a multidimensional integration technique) to study the most likely distribution for the shape of HpGe spectra. In treating these experiments, the prior information suggests to model the likelihood function with a product of Poisson distributions. We present the efforts that have been done in order to optimize the statistical methods to HpGe detector outputs with the aim to evaluate to a better order of precision the detector efficiency, the absolute measured activity and the spectra background. Reaching a more precise knowledge of statistical and systematic uncertainties for the measured physical observables is the final goal of this research project.
Optimization of statistical methods for HpGe gamma-ray spectrometer used in wide count rate ranges
Gervino, G.; Mana, G.; Palmisano, C.
2016-07-01
The need to perform γ-ray measurements with HpGe detectors is a common technique in many fields such as nuclear physics, radiochemistry, nuclear medicine and neutron activation analysis. The use of HpGe detectors is chosen in situations where isotope identification is needed because of their excellent resolution. Our challenge is to obtain the "best" spectroscopy data possible in every measurement situation. "Best" is a combination of statistical (number of counts) and spectral quality (peak, width and position) over a wide range of counting rates. In this framework, we applied Bayesian methods and the Ellipsoidal Nested Sampling (a multidimensional integration technique) to study the most likely distribution for the shape of HpGe spectra. In treating these experiments, the prior information suggests to model the likelihood function with a product of Poisson distributions. We present the efforts that have been done in order to optimize the statistical methods to HpGe detector outputs with the aim to evaluate to a better order of precision the detector efficiency, the absolute measured activity and the spectra background. Reaching a more precise knowledge of statistical and systematic uncertainties for the measured physical observables is the final goal of this research project.
Directory of Open Access Journals (Sweden)
Thomas Weidinger
2016-01-01
Full Text Available This work proposes a dedicated statistical algorithm to perform a direct reconstruction of material-decomposed images from data acquired with photon-counting detectors (PCDs in computed tomography. It is based on local approximations (surrogates of the negative logarithmic Poisson probability function. Exploiting the convexity of this function allows for parallel updates of all image pixels. Parallel updates can compensate for the rather slow convergence that is intrinsic to statistical algorithms. We investigate the accuracy of the algorithm for ideal photon-counting detectors. Complementarily, we apply the algorithm to simulation data of a realistic PCD with its spectral resolution limited by K-escape, charge sharing, and pulse-pileup. For data from both an ideal and realistic PCD, the proposed algorithm is able to correct beam-hardening artifacts and quantitatively determine the material fractions of the chosen basis materials. Via regularization we were able to achieve a reduction of image noise for the realistic PCD that is up to 90% lower compared to material images form a linear, image-based material decomposition using FBP images. Additionally, we find a dependence of the algorithms convergence speed on the threshold selection within the PCD.
Improved model for statistical alignment
Energy Technology Data Exchange (ETDEWEB)
Miklos, I.; Toroczkai, Z. (Zoltan)
2001-01-01
The statistical approach to molecular sequence evolution involves the stochastic modeling of the substitution, insertion and deletion processes. Substitution has been modeled in a reliable way for more than three decades by using finite Markov-processes. Insertion and deletion, however, seem to be more difficult to model, and thc recent approaches cannot acceptably deal with multiple insertions and deletions. A new method based on a generating function approach is introduced to describe the multiple insertion process. The presented algorithm computes the approximate joint probability of two sequences in 0(13) running time where 1 is the geometric mean of the sequence lengths.
RCT: Module 2.03, Counting Errors and Statistics, Course 8768
Energy Technology Data Exchange (ETDEWEB)
Hillmer, Kurt T. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-04-01
Radiological sample analysis involves the observation of a random process that may or may not occur and an estimation of the amount of radioactive material present based on that observation. Across the country, radiological control personnel are using the activity measurements to make decisions that may affect the health and safety of workers at those facilities and their surrounding environments. This course will present an overview of measurement processes, a statistical evaluation of both measurements and equipment performance, and some actions to take to minimize the sources of error in count room operations. This course will prepare the student with the skills necessary for radiological control technician (RCT) qualification by passing quizzes, tests, and the RCT Comprehensive Phase 1, Unit 2 Examination (TEST 27566) and by providing in the field skills.
Physics colloquium: Single-electron counting in quantum metrology and in statistical mechanics
Geneva University
2011-01-01
GENEVA UNIVERSITY Ecole de physique Département de physique nucléaire et corspusculaire 24, quai Ernest-Ansermet 1211 Genève 4 Tél.: (022) 379 62 73 Fax: (022) 379 69 92olé Lundi 17 octobre 2011 17h00 - Ecole de Physique, Auditoire Stueckelberg PHYSICS COLLOQUIUM « Single-electron counting in quantum metrology and in statistical mechanics » Prof. Jukka Pekola Low Temperature Laboratory, Aalto University Helsinki, Finland First I discuss the basics of single-electron tunneling and its potential applications in metrology. My main focus is in developing an accurate source of single-electron current for the realization of the unit ampere. I discuss the principle and the present status of the so-called single- electron turnstile. Investigation of errors in transporting electrons one by one has revealed a wealth of observations on fundamental phenomena in mesoscopic superconductivity, including individual Andreev...
Applications of some discrete regression models for count data
Directory of Open Access Journals (Sweden)
B. M. Golam Kibria
2006-01-01
Full Text Available In this paper we have considered several regression models to fit the count data that encounter in the field of Biometrical, Environmental, Social Sciences and Transportation Engineering. We have fitted Poisson (PO, Negative Binomial (NB, Zero-Inflated Poisson (ZIP and Zero-Inflated Negative Binomial (ZINB regression models to run-off-road (ROR crash data which collected on arterial roads in south region (rural of Florida State. To compare the performance of these models, we analyzed data with moderate to high percentage of zero counts. Because the variances were almost three times greater than the means, it appeared that both NB and ZINB models performed better than PO and ZIP models for the zero inflated and over dispersed count data.
Statistically Reconstructed Multiplexing for Very Dense, High-Channel-Count Acquisition Systems.
Tsai, David; Yuste, Rafael; Shepard, Kenneth L
2018-02-01
Multiplexing is an important strategy in multichannel acquisition systems. The per-channel antialiasing filters needed in the traditional multiplexing architecture limit its scalability for applications requiring high channel density, high channel count, and low noise. A particularly challenging example is multielectrode arrays for recording from neural systems. We show that conventional approaches must tradeoff recording density and noise performance, at a scale far from the ideal goal of one-to-one mapping between neurons and sensors. We present a multiplexing architecture without per-channel antialiasing filters. The sparsely sampled data are recovered through a compressed sensing strategy, involving statistical reconstruction and removal of the undersampled thermal noise. In doing so, we replace large analog components with digital signal processing blocks, which are much more amenable to scaled CMOS implementation. The resulting statistically reconstructed multiplexing architecture recovers input signals at significantly improved signal-to-noise ratios when compared to conventional multiplexing with antialiasing filters at the same per-channel area. We implement the new architecture in a 65 536-channel neural recording system and show that it is able to recover signals with performance comparable to conventional high-performance, single-channel systems, despite a more than four-orders-of-magnitude increase in channel density.
Modeling Repeated Count Data : Some Extensions of the Rasch Poisson Counts Model
van Duijn, M.A.J.; Jansen, Margo
1995-01-01
We consider data that can be summarized as an N X K table of counts-for example, test data obtained by administering K tests to N subjects. The cell entries y(ij) are assumed to be conditionally independent Poisson-distributed random variables, given the NK Poisson intensity parameters mu(ij). The
Poisson statistics-mediated particle/cell counting in microwell arrays.
Ahrberg, Christian D; Lee, Jong Min; Chung, Bong Geun
2018-02-05
Precise determination of particle or cell numbers is of importance for a wide array of applications in environmental studies, medical and biological applications, or manufacturing and monitoring applications in industrial production processes. A number of techniques ranging from manual counting to sophisticated equipment (e.g., flow cytometry) are available for this task. However, these methods are either labour intensive, prone to error, or require expensive equipment. Here, we present a fast, simple method for determining the number density of cells or microparticles using a microwell array. We analyze the light transmission of the microwells and categorize the microwells into two groups. As particles/cells contained in a microwell locally reduce the light transmission, these wells displayed a lower average transmission compared to unoccupied microwells. The number density of particles/cells can be calculated by Poisson statistics from the ratio of occupied to unoccupied microwells. Following this approach, the number densities of two different types of microparticles, as well as HeLa and E. Coli cells, ranging over four orders of magnitude were determined. Through the microwell array defined by microfabrication, a simple image recognition algorithm can be used with the formation of aggregates or irregular shaped samples providing no additional difficulty to the microwell recognition. Additionally, this method can be carried out using only simple equipment and data analysis automated by a computer program.
Full counting statistics in a serially coupled double quantum dot system with spin-orbit coupling
Wang, Qiang; Xue, Hai-Bin; Xie, Hai-Qing
2018-04-01
We study the full counting statistics of electron transport through a serially coupled double quantum dot (QD) system with spin-orbit coupling (SOC) weakly coupled to two electrodes. We demonstrate that the spin polarizations of the source and drain electrodes determine whether the shot noise maintains super-Poissonian distribution, and whether the sign transitions of the skewness from positive to negative values and of the kurtosis from negative to positive values take place. In particular, the interplay between the spin polarizations of the source and drain electrodes and the magnitude of the external magnetic field, can give rise to a gate-voltage-tunable strong negative differential conductance (NDC) and the shot noise in this NDC region is significantly enhanced. Importantly, for a given SOC parameter, the obvious variation of the high-order current cumulants as a function of the energy-level detuning in a certain range, especially the dip position of the Fano factor of the skewness can be used to qualitatively extract the information about the magnitude of the SOC.
Statistical Analysis by Statistical Physics Model for the STOCK Markets
Wang, Tiansong; Wang, Jun; Fan, Bingli
A new stochastic stock price model of stock markets based on the contact process of the statistical physics systems is presented in this paper, where the contact model is a continuous time Markov process, one interpretation of this model is as a model for the spread of an infection. Through this model, the statistical properties of Shanghai Stock Exchange (SSE) and Shenzhen Stock Exchange (SZSE) are studied. In the present paper, the data of SSE Composite Index and the data of SZSE Component Index are analyzed, and the corresponding simulation is made by the computer computation. Further, we investigate the statistical properties, fat-tail phenomena, the power-law distributions, and the long memory of returns for these indices. The techniques of skewness-kurtosis test, Kolmogorov-Smirnov test, and R/S analysis are applied to study the fluctuation characters of the stock price returns.
Bilinear modulation models for seasonal tables of counts
B.D. Marx (Brian); P.H.C. Eilers (Paul); J. Gampe (Jutta); R. Rau (Roland)
2010-01-01
textabstractWe propose generalized linear models for time or age-time tables of seasonal counts, with the goal of better understanding seasonal patterns in the data. The linear predictor contains a smooth component for the trend and the product of a smooth component (the modulation) and a periodic
Comparing distribution models for small samples of overdispersed counts of freshwater fish
Vaudor, Lise; Lamouroux, Nicolas; Olivier, Jean-Michel
2011-05-01
The study of species abundance often relies on repeated abundance counts whose number is limited by logistic or financial constraints. The distribution of abundance counts is generally right-skewed (i.e. with many zeros and few high values) and needs to be modelled for statistical inference. We used an extensive dataset involving about 100,000 fish individuals of 12 freshwater fish species collected in electrofishing points (7 m 2) during 350 field surveys made in 25 stream sites, in order to compare the performance and the generality of four distribution models of counts (Poisson, negative binomial and their zero-inflated counterparts). The negative binomial distribution was the best model (Bayesian Information Criterion) for 58% of the samples (species-survey combinations) and was suitable for a variety of life histories, habitat, and sample characteristics. The performance of the models was closely related to samples' statistics such as total abundance and variance. Finally, we illustrated the consequences of a distribution assumption by calculating confidence intervals around the mean abundance, either based on the most suitable distribution assumption or on an asymptotical, distribution-free (Student's) method. Student's method generally corresponded to narrower confidence intervals, especially when there were few (≤3) non-null counts in the samples.
International Nuclear Information System (INIS)
Valor, Alma; Alfonso, Lester; Caleyo, Francisco; Vidal, Julio; Perez-Baruch, Eloy; Hallen, José M.
2015-01-01
Highlights: • Observed external-corrosion defects in underground pipelines revealed a tendency to cluster. • The Poisson distribution is unable to fit extensive count data for these type of defects. • In contrast, the negative binomial distribution provides a suitable count model for them. • Two spatial stochastic processes lead to the negative binomial distribution for defect counts. • They are the Gamma-Poisson mixed process and the compound Poisson process. • A Rogeŕs process also arises as a plausible temporal stochastic process leading to corrosion defect clustering and to negative binomially distributed defect counts. - Abstract: The spatial distribution of external corrosion defects in buried pipelines is usually described as a Poisson process, which leads to corrosion defects being randomly distributed along the pipeline. However, in real operating conditions, the spatial distribution of defects considerably departs from Poisson statistics due to the aggregation of defects in groups or clusters. In this work, the statistical analysis of real corrosion data from underground pipelines operating in southern Mexico leads to conclude that the negative binomial distribution provides a better description for defect counts. The origin of this distribution from several processes is discussed. The analysed processes are: mixed Gamma-Poisson, compound Poisson and Roger’s processes. The physical reasons behind them are discussed for the specific case of soil corrosion.
Hidden Markov models for zero-inflated Poisson counts with an application to substance use.
DeSantis, Stacia M; Bandyopadhyay, Dipankar
2011-06-30
Paradigms for substance abuse cue-reactivity research involve pharmacological or stressful stimulation designed to elicit stress and craving responses in cocaine-dependent subjects. It is unclear as to whether stress induced from participation in such studies increases drug-seeking behavior. We propose a 2-state Hidden Markov model to model the number of cocaine abuses per week before and after participation in a stress-and cue-reactivity study. The hypothesized latent state corresponds to 'high' or 'low' use. To account for a preponderance of zeros, we assume a zero-inflated Poisson model for the count data. Transition probabilities depend on the prior week's state, fixed demographic variables, and time-varying covariates. We adopt a Bayesian approach to model fitting, and use the conditional predictive ordinate statistic to demonstrate that the zero-inflated Poisson hidden Markov model outperforms other models for longitudinal count data. Copyright © 2011 John Wiley & Sons, Ltd.
Models for Predicting and Explaining Citation Count of Biomedical Articles
Fu, Lawrence D.; Aliferis, Constantin
2008-01-01
The single most important bibliometric criterion for judging the impact of biomedical papers and their authors’ work is the number of citations received which is commonly referred to as “citation count”. This metric however is unavailable until several years after publication time. In the present work, we build computer models that accurately predict citation counts of biomedical publications within a deep horizon of ten years using only predictive information available at publication time. O...
Peak-counts blood flow model-errors and limitations
International Nuclear Information System (INIS)
Mullani, N.A.; Marani, S.K.; Ekas, R.D.; Gould, K.L.
1984-01-01
The peak-counts model has several advantages, but its use may be limited due to the condition that the venous egress may not be negligible at the time of peak-counts. Consequently, blood flow measurements by the peak-counts model will depend on the bolus size, bolus duration, and the minimum transit time of the bolus through the region of interest. The effect of bolus size on the measurement of extraction fraction and blood flow was evaluated by injecting 1 to 30ml of rubidium chloride in the femoral vein of a dog and measuring the myocardial activity with a beta probe over the heart. Regional blood flow measurements were not found to vary with bolus sizes up to 30ml. The effect of bolus duration was studied by injecting a 10cc bolus of tracer at different speeds in the femoral vein of a dog. All intravenous injections undergo a broadening of the bolus duration due to the transit time of the tracer through the lungs and the heart. This transit time was found to range from 4-6 second FWHM and dominates the duration of the bolus to the myocardium for up to 3 second injections. A computer simulation has been carried out in which the different parameters of delay time, extraction fraction, and bolus duration can be changed to assess the errors in the peak-counts model. The results of the simulations show that the error will be greatest for short transit time delays and for low extraction fractions
Statistical modelling of fish stocks
DEFF Research Database (Denmark)
Kvist, Trine
1999-01-01
for modelling the dynamics of a fish population is suggested. A new approach is introduced to analyse the sources of variation in age composition data, which is one of the most important sources of information in the cohort based models for estimation of stock abundancies and mortalities. The approach combines...... and it is argued that an approach utilising stochastic differential equations might be advantagous in fish stoch assessments....
Statistical lung model for microdosimetry
International Nuclear Information System (INIS)
Fisher, D.R.; Hadley, R.T.
1984-03-01
To calculate the microdosimetry of plutonium in the lung, a mathematical description is needed of lung tissue microstructure that defines source-site parameters. Beagle lungs were expanded using a glutaraldehyde fixative at 30 cm water pressure. Tissue specimens, five microns thick, were stained with hematoxylin and eosin then studied using an image analyzer. Measurements were made along horizontal lines through the magnified tissue image. The distribution of air space and tissue chord lengths and locations of epithelial cell nuclei were recorded from about 10,000 line scans. The distribution parameters constituted a model of lung microstructure for predicting the paths of random alpha particle tracks in the lung and the probability of traversing biologically sensitive sites. This lung model may be used in conjunction with established deposition and retention models for determining the microdosimetry in the pulmonary lung for a wide variety of inhaled radioactive materials
Hannequin, Pascal Paul
2015-06-07
Noise reduction in photon-counting images remains challenging, especially at low count levels. We have developed an original procedure which associates two complementary filters using a Wiener-derived approach. This approach combines two statistically adaptive filters into a dual-weighted (DW) filter. The first one, a statistically weighted adaptive (SWA) filter, replaces the central pixel of a sliding window with a statistically weighted sum of its neighbors. The second one, a statistical and heuristic noise extraction (extended) (SHINE-Ext) filter, performs a discrete cosine transformation (DCT) using sliding blocks. Each block is reconstructed using its significant components which are selected using tests derived from multiple linear regression (MLR). The two filters are weighted according to Wiener theory. This approach has been validated using a numerical phantom and a real planar Jaszczak phantom. It has also been illustrated using planar bone scintigraphy and myocardial single-photon emission computed tomography (SPECT) data. Performances of filters have been tested using mean normalized absolute error (MNAE) between the filtered images and the reference noiseless or high-count images.Results show that the proposed filters quantitatively decrease the MNAE in the images and then increase the signal-to-noise Ratio (SNR). This allows one to work with lower count images. The SHINE-Ext filter is well suited to high-size images and low-variance areas. DW filtering is efficient for low-size images and in high-variance areas. The relative proportion of eliminated noise generally decreases when count level increases. In practice, SHINE filtering alone is recommended when pixel spacing is less than one-quarter of the effective resolution of the system and/or the size of the objects of interest. It can also be used when the practical interest of high frequencies is low. In any case, DW filtering will be preferable.The proposed filters have been applied to nuclear
Negative binomial mixed models for analyzing microbiome count data.
Zhang, Xinyan; Mallick, Himel; Tang, Zaixiang; Zhang, Lei; Cui, Xiangqin; Benson, Andrew K; Yi, Nengjun
2017-01-03
Recent advances in next-generation sequencing (NGS) technology enable researchers to collect a large volume of metagenomic sequencing data. These data provide valuable resources for investigating interactions between the microbiome and host environmental/clinical factors. In addition to the well-known properties of microbiome count measurements, for example, varied total sequence reads across samples, over-dispersion and zero-inflation, microbiome studies usually collect samples with hierarchical structures, which introduce correlation among the samples and thus further complicate the analysis and interpretation of microbiome count data. In this article, we propose negative binomial mixed models (NBMMs) for detecting the association between the microbiome and host environmental/clinical factors for correlated microbiome count data. Although having not dealt with zero-inflation, the proposed mixed-effects models account for correlation among the samples by incorporating random effects into the commonly used fixed-effects negative binomial model, and can efficiently handle over-dispersion and varying total reads. We have developed a flexible and efficient IWLS (Iterative Weighted Least Squares) algorithm to fit the proposed NBMMs by taking advantage of the standard procedure for fitting the linear mixed models. We evaluate and demonstrate the proposed method via extensive simulation studies and the application to mouse gut microbiome data. The results show that the proposed method has desirable properties and outperform the previously used methods in terms of both empirical power and Type I error. The method has been incorporated into the freely available R package BhGLM ( http://www.ssg.uab.edu/bhglm/ and http://github.com/abbyyan3/BhGLM ), providing a useful tool for analyzing microbiome data.
Actuarial statistics with generalized linear mixed models
Antonio, K.; Beirlant, J.
2007-01-01
Over the last decade the use of generalized linear models (GLMs) in actuarial statistics has received a lot of attention, starting from the actuarial illustrations in the standard text by McCullagh and Nelder [McCullagh, P., Nelder, J.A., 1989. Generalized linear models. In: Monographs on Statistics
Nino, Daniel; Rafiei, Nafiseh; Wang, Yong; Zilman, Anton; Milstein, Joshua N
2017-05-09
Superresolved localization microscopy has the potential to serve as an accurate, single-cell technique for counting the abundance of intracellular molecules. However, the stochastic blinking of single fluorophores can introduce large uncertainties into the final count. Here we provide a theoretical foundation for applying superresolved localization microscopy to the problem of molecular counting based on the distribution of blinking events from a single fluorophore. We also show that by redundantly tagging single molecules with multiple, blinking fluorophores, the accuracy of the technique can be enhanced by harnessing the central limit theorem. The coefficient of variation then, for the number of molecules M estimated from a given number of blinks B, scales like ∼1/N l , where N l is the mean number of labels on a target. As an example, we apply our theory to the challenging problem of quantifying the cell-to-cell variability of plasmid copy number in bacteria. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Statistical Modeling of Bivariate Data.
1982-08-01
to one. Following Crain (1974), one may consider order m approximators m log f111(X) - k k (x) - c(e), asx ;b. (4.4.5) k,-r A m and attempt to find...literature. Consider the approximate model m log fn (x) = 7 ekk(x) + a G(x), aSx ;b, (44.8) " k=-Mn ’ where G(x) is a Gaussian process and n is a
Statistical Models and Methods for Lifetime Data
Lawless, Jerald F
2011-01-01
Praise for the First Edition"An indispensable addition to any serious collection on lifetime data analysis and . . . a valuable contribution to the statistical literature. Highly recommended . . ."-Choice"This is an important book, which will appeal to statisticians working on survival analysis problems."-Biometrics"A thorough, unified treatment of statistical models and methods used in the analysis of lifetime data . . . this is a highly competent and agreeable statistical textbook."-Statistics in MedicineThe statistical analysis of lifetime or response time data is a key tool in engineering,
Accelerated life models modeling and statistical analysis
Bagdonavicius, Vilijandas
2001-01-01
Failure Time DistributionsIntroductionParametric Classes of Failure Time DistributionsAccelerated Life ModelsIntroductionGeneralized Sedyakin's ModelAccelerated Failure Time ModelProportional Hazards ModelGeneralized Proportional Hazards ModelsGeneralized Additive and Additive-Multiplicative Hazards ModelsChanging Shape and Scale ModelsGeneralizationsModels Including Switch-Up and Cycling EffectsHeredity HypothesisSummaryAccelerated Degradation ModelsIntroductionDegradation ModelsModeling the Influence of Explanatory Varia
Multimode model for projective photon-counting measurements
International Nuclear Information System (INIS)
Tualle-Brouri, Rosa; Ourjoumtsev, Alexei; Dantan, Aurelien; Grangier, Philippe; Wubs, Martijn; Soerensen, Anders S.
2009-01-01
We present a general model to account for the multimode nature of the quantum electromagnetic field in projective photon-counting measurements. We focus on photon-subtraction experiments, where non-Gaussian states are produced conditionally. These are useful states for continuous-variable quantum-information processing. We present a general method called mode reduction that reduces the multimode model to an effective two-mode problem. We apply this method to a multimode model describing broadband parametric down-conversion, thereby improving the analysis of existing experimental results. The main improvement is that spatial and frequency filters before the photon detector are taken into account explicitly. We find excellent agreement with previously published experimental results, using fewer free parameters than before, and discuss the implications of our analysis for the optimized production of states with negative Wigner functions.
Bayesian Correction for Misclassification in Multilevel Count Data Models
Directory of Open Access Journals (Sweden)
Tyler Nelson
2018-01-01
Full Text Available Covariate misclassification is well known to yield biased estimates in single level regression models. The impact on hierarchical count models has been less studied. A fully Bayesian approach to modeling both the misclassified covariate and the hierarchical response is proposed. Models with a single diagnostic test and with multiple diagnostic tests are considered. Simulation studies show the ability of the proposed model to appropriately account for the misclassification by reducing bias and improving performance of interval estimators. A real data example further demonstrated the consequences of ignoring the misclassification. Ignoring misclassification yielded a model that indicated there was a significant, positive impact on the number of children of females who observed spousal abuse between their parents. When the misclassification was accounted for, the relationship switched to negative, but not significant. Ignoring misclassification in standard linear and generalized linear models is well known to lead to biased results. We provide an approach to extend misclassification modeling to the important area of hierarchical generalized linear models.
Assessment of noise in a digital image using the join-count statistic and the Moran test
International Nuclear Information System (INIS)
Kehshih Chuang; Huang, H.K.
1992-01-01
It is assumed that data bits of a pixel in digital images can be divided into signal and noise bits. The signal bits occupy the most significant part of the pixel. The signal parts of each pixel are correlated while the noise parts are uncorrelated. Two statistical methods, the Moran test and the join-count statistic, are used to examine the noise parts. Images from computerized tomography, magnetic resonance and computed radiography are used for the evaluation of the noise bits. A residual image is formed by subtracting the original image from its smoothed version. The noise level in the residual image is then identical to that in the original image. Both statistical tests are then performed on the bit planes of the residual image. Results show that most digital images contain only 8-9 bits of correlated information. Both methods are easy to implement and fast to perform. (author)
Bayesian models: A statistical primer for ecologists
Hobbs, N. Thompson; Hooten, Mevin B.
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models
Uncertainty the soul of modeling, probability & statistics
Briggs, William
2016-01-01
This book presents a philosophical approach to probability and probabilistic thinking, considering the underpinnings of probabilistic reasoning and modeling, which effectively underlie everything in data science. The ultimate goal is to call into question many standard tenets and lay the philosophical and probabilistic groundwork and infrastructure for statistical modeling. It is the first book devoted to the philosophy of data aimed at working scientists and calls for a new consideration in the practice of probability and statistics to eliminate what has been referred to as the "Cult of Statistical Significance". The book explains the philosophy of these ideas and not the mathematics, though there are a handful of mathematical examples. The topics are logically laid out, starting with basic philosophy as related to probability, statistics, and science, and stepping through the key probabilistic ideas and concepts, and ending with statistical models. Its jargon-free approach asserts that standard methods, suc...
Automated statistical modeling of analytical measurement systems
International Nuclear Information System (INIS)
Jacobson, J.J.
1992-01-01
The statistical modeling of analytical measurement systems at the Idaho Chemical Processing Plant (ICPP) has been completely automated through computer software. The statistical modeling of analytical measurement systems is one part of a complete quality control program used by the Remote Analytical Laboratory (RAL) at the ICPP. The quality control program is an integration of automated data input, measurement system calibration, database management, and statistical process control. The quality control program and statistical modeling program meet the guidelines set forth by the American Society for Testing Materials and American National Standards Institute. A statistical model is a set of mathematical equations describing any systematic bias inherent in a measurement system and the precision of a measurement system. A statistical model is developed from data generated from the analysis of control standards. Control standards are samples which are made up at precise known levels by an independent laboratory and submitted to the RAL. The RAL analysts who process control standards do not know the values of those control standards. The object behind statistical modeling is to describe real process samples in terms of their bias and precision and, to verify that a measurement system is operating satisfactorily. The processing of control standards gives us this ability
Topology for statistical modeling of petascale data.
Energy Technology Data Exchange (ETDEWEB)
Pascucci, Valerio (University of Utah, Salt Lake City, UT); Mascarenhas, Ajith Arthur; Rusek, Korben (Texas A& M University, College Station, TX); Bennett, Janine Camille; Levine, Joshua (University of Utah, Salt Lake City, UT); Pebay, Philippe Pierre; Gyulassy, Attila (University of Utah, Salt Lake City, UT); Thompson, David C.; Rojas, Joseph Maurice (Texas A& M University, College Station, TX)
2011-07-01
This document presents current technical progress and dissemination of results for the Mathematics for Analysis of Petascale Data (MAPD) project titled 'Topology for Statistical Modeling of Petascale Data', funded by the Office of Science Advanced Scientific Computing Research (ASCR) Applied Math program. Many commonly used algorithms for mathematical analysis do not scale well enough to accommodate the size or complexity of petascale data produced by computational simulations. The primary goal of this project is thus to develop new mathematical tools that address both the petascale size and uncertain nature of current data. At a high level, our approach is based on the complementary techniques of combinatorial topology and statistical modeling. In particular, we use combinatorial topology to filter out spurious data that would otherwise skew statistical modeling techniques, and we employ advanced algorithms from algebraic statistics to efficiently find globally optimal fits to statistical models. This document summarizes the technical advances we have made to date that were made possible in whole or in part by MAPD funding. These technical contributions can be divided loosely into three categories: (1) advances in the field of combinatorial topology, (2) advances in statistical modeling, and (3) new integrated topological and statistical methods.
Muller, Benjamin J.; Cade, Brian S.; Schwarzkoph, Lin
2018-01-01
Many different factors influence animal activity. Often, the value of an environmental variable may influence significantly the upper or lower tails of the activity distribution. For describing relationships with heterogeneous boundaries, quantile regressions predict a quantile of the conditional distribution of the dependent variable. A quantile count model extends linear quantile regression methods to discrete response variables, and is useful if activity is quantified by trapping, where there may be many tied (equal) values in the activity distribution, over a small range of discrete values. Additionally, different environmental variables in combination may have synergistic or antagonistic effects on activity, so examining their effects together, in a modeling framework, is a useful approach. Thus, model selection on quantile counts can be used to determine the relative importance of different variables in determining activity, across the entire distribution of capture results. We conducted model selection on quantile count models to describe the factors affecting activity (numbers of captures) of cane toads (Rhinella marina) in response to several environmental variables (humidity, temperature, rainfall, wind speed, and moon luminosity) over eleven months of trapping. Environmental effects on activity are understudied in this pest animal. In the dry season, model selection on quantile count models suggested that rainfall positively affected activity, especially near the lower tails of the activity distribution. In the wet season, wind speed limited activity near the maximum of the distribution, while minimum activity increased with minimum temperature. This statistical methodology allowed us to explore, in depth, how environmental factors influenced activity across the entire distribution, and is applicable to any survey or trapping regime, in which environmental variables affect activity.
Infinite Random Graphs as Statistical Mechanical Models
DEFF Research Database (Denmark)
Durhuus, Bergfinnur Jøgvan; Napolitano, George Maria
2011-01-01
We discuss two examples of infinite random graphs obtained as limits of finite statistical mechanical systems: a model of two-dimensional dis-cretized quantum gravity defined in terms of causal triangulated surfaces, and the Ising model on generic random trees. For the former model we describe...
van den Ende, Jan; van Oost, Elizabeth C.J.
2001-01-01
This article is a longitudinal analysis of the relation between gendered labour divisions and new data processing technologies at the Dutch Central Bureau of Statistics (CBS). Following social-constructivist and evolutionary economic approaches, the authors hold that the relation between technology
Review of statistical models for nuclear reactions
International Nuclear Information System (INIS)
Igarasi, Sin-iti
1991-01-01
Statistical model calculations have been widely performed for nuclear data evaluations. These were based on the models of Hauser-Feshbach, Weisskopf-Ewing and their modifications. Since the 1940s, non-compound nuclear phenomena have been observed, and stimulated many nuclear physicists to study compound and non-compound nuclear reaction mechanisms. Concerning compound nuclear reactions, they investigated problems on the basis of fundamental properties of S-matrix, statistical distributions of resonance pole parameters, random matrix elements of the nuclear Hamiltonian, and so forth. They have presented many sophisticated results. But old statistical models have been still useful, because these models were simple and easily utilizable. In this report, these old and new models will be briefly reviewed with a purpose of application to nuclear data evaluation, and examine applicability of the new models. (author)
Matrix Tricks for Linear Statistical Models
Puntanen, Simo; Styan, George PH
2011-01-01
In teaching linear statistical models to first-year graduate students or to final-year undergraduate students there is no way to proceed smoothly without matrices and related concepts of linear algebra; their use is really essential. Our experience is that making some particular matrix tricks very familiar to students can substantially increase their insight into linear statistical models (and also multivariate statistical analysis). In matrix algebra, there are handy, sometimes even very simple "tricks" which simplify and clarify the treatment of a problem - both for the student and
Daily precipitation statistics in regional climate models
DEFF Research Database (Denmark)
Frei, Christoph; Christensen, Jens Hesselbjerg; Déqué, Michel
2003-01-01
. The 15-year integrations were forced from reanalyses and observed sea surface temperature and sea ice (global model from sea surface only). The observational reference is based on 6400 rain gauge records (10-50 stations per grid box). Evaluation statistics encompass mean precipitation, wet-day frequency...... for other statistics. In summer, all models underestimate precipitation intensity (by 16-42%) and there is a too low frequency of heavy events. This bias reflects too dry summer mean conditions in three of the models, while it is partly compensated by too many low-intensity events in the other two models...
Distributions with given marginals and statistical modelling
Fortiana, Josep; Rodriguez-Lallena, José
2002-01-01
This book contains a selection of the papers presented at the meeting `Distributions with given marginals and statistical modelling', held in Barcelona (Spain), July 17-20, 2000. In 24 chapters, this book covers topics such as the theory of copulas and quasi-copulas, the theory and compatibility of distributions, models for survival distributions and other well-known distributions, time series, categorical models, definition and estimation of measures of dependence, monotonicity and stochastic ordering, shape and separability of distributions, hidden truncation models, diagonal families, orthogonal expansions, tests of independence, and goodness of fit assessment. These topics share the use and properties of distributions with given marginals, this being the fourth specialised text on this theme. The innovative aspect of the book is the inclusion of statistical aspects such as modelling, Bayesian statistics, estimation, and tests.
Statistical Modeling for Radiation Hardness Assurance
Ladbury, Raymond L.
2014-01-01
We cover the models and statistics associated with single event effects (and total ionizing dose), why we need them, and how to use them: What models are used, what errors exist in real test data, and what the model allows us to say about the DUT will be discussed. In addition, how to use other sources of data such as historical, heritage, and similar part and how to apply experience, physics, and expert opinion to the analysis will be covered. Also included will be concepts of Bayesian statistics, data fitting, and bounding rates.
Performance modeling, loss networks, and statistical multiplexing
Mazumdar, Ravi
2009-01-01
This monograph presents a concise mathematical approach for modeling and analyzing the performance of communication networks with the aim of understanding the phenomenon of statistical multiplexing. The novelty of the monograph is the fresh approach and insights provided by a sample-path methodology for queueing models that highlights the important ideas of Palm distributions associated with traffic models and their role in performance measures. Also presented are recent ideas of large buffer, and many sources asymptotics that play an important role in understanding statistical multiplexing. I
Simple statistical model for branched aggregates
DEFF Research Database (Denmark)
Lemarchand, Claire; Hansen, Jesper Schmidt
2015-01-01
We propose a statistical model that can reproduce the size distribution of any branched aggregate, including amylopectin, dendrimers, molecular clusters of monoalcohols, and asphaltene nanoaggregates. It is based on the conditional probability for one molecule to form a new bond with a molecule......, given that it already has bonds with others. The model is applied here to asphaltene nanoaggregates observed in molecular dynamics simulations of Cooee bitumen. The variation with temperature of the probabilities deduced from this model is discussed in terms of statistical mechanics arguments....... The relevance of the statistical model in the case of asphaltene nanoaggregates is checked by comparing the predicted value of the probability for one molecule to have exactly i bonds with the same probability directly measured in the molecular dynamics simulations. The agreement is satisfactory...
Statistical Model Checking for Stochastic Hybrid Systems
DEFF Research Database (Denmark)
David, Alexandre; Du, Dehui; Larsen, Kim Guldstrand
2012-01-01
This paper presents novel extensions and applications of the UPPAAL-SMC model checker. The extensions allow for statistical model checking of stochastic hybrid systems. We show how our race-based stochastic semantics extends to networks of hybrid systems, and indicate the integration technique ap...
AbouZahr, Carla; de Savigny, Don; Mikkelsen, Lene; Setel, Philip W; Lozano, Rafael; Nichols, Erin; Notzon, Francis; Lopez, Alan D
2015-10-03
New momentum for civil registration and vital statistics (CRVS) is building, driven by the confluence of growing demands for accountability and results in health, improved equity, and rights-based approaches to development challenges, and by the immense potential of innovation and new technologies to accelerate CRVS improvement. Examples of country successes in strengthening of hitherto weak systems are emerging. The key to success has been to build collaborative partnerships involving local ownership by several sectors that span registration, justice, health, statistics, and civil society. Regional partners can be important to raise awareness, set regional goals and targets, foster country-to-country exchange and mutual learning, and build high-level political commitment. These regional partners continue to provide a platform through which country stakeholders, development partners, and technical experts can share experiences, develop and document good practices, and propose innovative approaches to tackle CRVS challenges. This country and regional momentum would benefit from global leadership, commitment, and support. Copyright © 2015 Elsevier Ltd. All rights reserved.
Advances in statistical models for data analysis
Minerva, Tommaso; Vichi, Maurizio
2015-01-01
This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.
Zheng, Han; Kimber, Alan; Goodwin, Victoria A; Pickering, Ruth M
2018-01-01
A common design for a falls prevention trial is to assess falling at baseline, randomize participants into an intervention or control group, and ask them to record the number of falls they experience during a follow-up period of time. This paper addresses how best to include the baseline count in the analysis of the follow-up count of falls in negative binomial (NB) regression. We examine the performance of various approaches in simulated datasets where both counts are generated from a mixed Poisson distribution with shared random subject effect. Including the baseline count after log-transformation as a regressor in NB regression (NB-logged) or as an offset (NB-offset) resulted in greater power than including the untransformed baseline count (NB-unlogged). Cook and Wei's conditional negative binomial (CNB) model replicates the underlying process generating the data. In our motivating dataset, a statistically significant intervention effect resulted from the NB-logged, NB-offset, and CNB models, but not from NB-unlogged, and large, outlying baseline counts were overly influential in NB-unlogged but not in NB-logged. We conclude that there is little to lose by including the log-transformed baseline count in standard NB regression compared to CNB for moderate to larger sized datasets. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Statistical physics of pairwise probability models
DEFF Research Database (Denmark)
Roudi, Yasser; Aurell, Erik; Hertz, John
2009-01-01
(dansk abstrakt findes ikke) Statistical models for describing the probability distribution over the states of biological systems are commonly used for dimensional reduction. Among these models, pairwise models are very attractive in part because they can be fit using a reasonable amount of data......: knowledge of the means and correlations between pairs of elements in the system is sufficient. Not surprisingly, then, using pairwise models for studying neural data has been the focus of many studies in recent years. In this paper, we describe how tools from statistical physics can be employed for studying...... and using pairwise models. We build on our previous work on the subject and study the relation between different methods for fitting these models and evaluating their quality. In particular, using data from simulated cortical networks we study how the quality of various approximate methods for inferring...
Growth curve models and statistical diagnostics
Pan, Jian-Xin
2002-01-01
Growth-curve models are generalized multivariate analysis-of-variance models. These models are especially useful for investigating growth problems on short times in economics, biology, medical research, and epidemiology. This book systematically introduces the theory of the GCM with particular emphasis on their multivariate statistical diagnostics, which are based mainly on recent developments made by the authors and their collaborators. The authors provide complete proofs of theorems as well as practical data sets and MATLAB code.
Topology for Statistical Modeling of Petascale Data
Energy Technology Data Exchange (ETDEWEB)
Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Levine, Joshua [Univ. of Utah, Salt Lake City, UT (United States); Gyulassy, Attila [Univ. of Utah, Salt Lake City, UT (United States); Bremer, P. -T. [Univ. of Utah, Salt Lake City, UT (United States)
2013-10-31
Many commonly used algorithms for mathematical analysis do not scale well enough to accommodate the size or complexity of petascale data produced by computational simulations. The primary goal of this project is to develop new mathematical tools that address both the petascale size and uncertain nature of current data. At a high level, the approach of the entire team involving all three institutions is based on the complementary techniques of combinatorial topology and statistical modelling. In particular, we use combinatorial topology to filter out spurious data that would otherwise skew statistical modelling techniques, and we employ advanced algorithms from algebraic statistics to efficiently find globally optimal fits to statistical models. The overall technical contributions can be divided loosely into three categories: (1) advances in the field of combinatorial topology, (2) advances in statistical modelling, and (3) new integrated topological and statistical methods. Roughly speaking, the division of labor between our 3 groups (Sandia Labs in Livermore, Texas A&M in College Station, and U Utah in Salt Lake City) is as follows: the Sandia group focuses on statistical methods and their formulation in algebraic terms, and finds the application problems (and data sets) most relevant to this project, the Texas A&M Group develops new algebraic geometry algorithms, in particular with fewnomial theory, and the Utah group develops new algorithms in computational topology via Discrete Morse Theory. However, we hasten to point out that our three groups stay in tight contact via videconference every 2 weeks, so there is much synergy of ideas between the groups. The following of this document is focused on the contributions that had grater direct involvement from the team at the University of Utah in Salt Lake City.
An R companion to linear statistical models
Hay-Jahans, Christopher
2011-01-01
Focusing on user-developed programming, An R Companion to Linear Statistical Models serves two audiences: those who are familiar with the theory and applications of linear statistical models and wish to learn or enhance their skills in R; and those who are enrolled in an R-based course on regression and analysis of variance. For those who have never used R, the book begins with a self-contained introduction to R that lays the foundation for later chapters.This book includes extensive and carefully explained examples of how to write programs using the R programming language. These examples cove
Bayesian models a statistical primer for ecologists
Hobbs, N Thompson
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods-in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probabili
Central Limit Theorem for Exponentially Quasi-local Statistics of Spin Models on Cayley Graphs
Reddy, Tulasi Ram; Vadlamani, Sreekar; Yogeshwaran, D.
2018-04-01
Central limit theorems for linear statistics of lattice random fields (including spin models) are usually proven under suitable mixing conditions or quasi-associativity. Many interesting examples of spin models do not satisfy mixing conditions, and on the other hand, it does not seem easy to show central limit theorem for local statistics via quasi-associativity. In this work, we prove general central limit theorems for local statistics and exponentially quasi-local statistics of spin models on discrete Cayley graphs with polynomial growth. Further, we supplement these results by proving similar central limit theorems for random fields on discrete Cayley graphs taking values in a countable space, but under the stronger assumptions of α -mixing (for local statistics) and exponential α -mixing (for exponentially quasi-local statistics). All our central limit theorems assume a suitable variance lower bound like many others in the literature. We illustrate our general central limit theorem with specific examples of lattice spin models and statistics arising in computational topology, statistical physics and random networks. Examples of clustering spin models include quasi-associated spin models with fast decaying covariances like the off-critical Ising model, level sets of Gaussian random fields with fast decaying covariances like the massive Gaussian free field and determinantal point processes with fast decaying kernels. Examples of local statistics include intrinsic volumes, face counts, component counts of random cubical complexes while exponentially quasi-local statistics include nearest neighbour distances in spin models and Betti numbers of sub-critical random cubical complexes.
STATISTICAL MODELS OF REPRESENTING INTELLECTUAL CAPITAL
Directory of Open Access Journals (Sweden)
Andreea Feraru
2016-06-01
Full Text Available This article entitled Statistical Models of Representing Intellectual Capital approaches and analyses the concept of intellectual capital, as well as the main models which can support enterprisers/managers in evaluating and quantifying the advantages of intellectual capital. Most authors examine intellectual capital from a static perspective and focus on the development of its various evaluation models. In this chapter we surveyed the classical static models: Sveiby, Edvisson, Balanced Scorecard, as well as the canonical model of intellectual capital. Among the group of static models for evaluating organisational intellectual capital the canonical model stands out. This model enables the structuring of organisational intellectual capital in: human capital, structural capital and relational capital. Although the model is widely spread, it is a static one and can thus create a series of errors in the process of evaluation, because all the three entities mentioned above are not independent from the viewpoint of their contents, as any logic of structuring complex entities requires.
Directory of Open Access Journals (Sweden)
H Christina Fan
Full Text Available We recently demonstrated noninvasive detection of fetal aneuploidy by shotgun sequencing cell-free DNA in maternal plasma using next-generation high throughput sequencer. However, GC bias introduced by the sequencer placed a practical limit on the sensitivity of aneuploidy detection. In this study, we describe a method to computationally remove GC bias in short read sequencing data by applying weight to each sequenced read based on local genomic GC content. We show that sensitivity is limited only by counting statistics and that sensitivity can be increased to arbitrary precision in sample containing arbitrarily small fraction of fetal DNA simply by sequencing more DNA molecules. High throughput shotgun sequencing of maternal plasma DNA should therefore enable noninvasive diagnosis of any type of fetal aneuploidy.
Statistical Model Checking for Product Lines
DEFF Research Database (Denmark)
ter Beek, Maurice H.; Legay, Axel; Lluch Lafuente, Alberto
2016-01-01
We report on the suitability of statistical model checking for the analysis of quantitative properties of product line models by an extended treatment of earlier work by the authors. The type of analysis that can be performed includes the likelihood of specific product behaviour, the expected...... average cost of products (in terms of the attributes of the products’ features) and the probability of features to be (un)installed at runtime. The product lines must be modelled in QFLan, which extends the probabilistic feature-oriented language PFLan with novel quantitative constraints among features...... behaviour converge in a discrete-time Markov chain semantics, enabling the analysis of quantitative properties. Technically, a Maude implementation of QFLan, integrated with Microsoft’s SMT constraint solver Z3, is combined with the distributed statistical model checker MultiVeStA, developed by one...
(ajst) statistical mechanics model for orientational
African Journals Online (AJOL)
2: December, 2005. African Journal of Science and Technology (AJST). Science and Engineering Series Vol. 6, No. 2, pp. 94 - 101. STATISTICAL MECHANICS MODEL FOR ORIENTATIONAL. MOTION OF TWO-DIMENSIONAL RIGID ROTATOR. Malo, J.O.. Department of Physics, University of Nairobi, P.O. Box 30197 ...
Probing NWP model deficiencies by statistical postprocessing
DEFF Research Database (Denmark)
Rosgaard, Martin Haubjerg; Nielsen, Henrik Aalborg; Nielsen, Torben S.
2016-01-01
The objective in this article is twofold. On one hand, a Model Output Statistics (MOS) framework for improved wind speed forecast accuracy is described and evaluated. On the other hand, the approach explored identifies unintuitive explanatory value from a diagnostic variable in an operational num...
Topology for Statistical Modeling of Petascale Data
Energy Technology Data Exchange (ETDEWEB)
Bennett, Janine Camille [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Pebay, Philippe Pierre [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Levine, Joshua [Univ. of Utah, Salt Lake City, UT (United States); Gyulassy, Attila [Univ. of Utah, Salt Lake City, UT (United States); Rojas, Maurice [Texas A & M Univ., College Station, TX (United States)
2014-07-01
This document presents current technical progress and dissemination of results for the Mathematics for Analysis of Petascale Data (MAPD) project titled "Topology for Statistical Modeling of Petascale Data", funded by the Office of Science Advanced Scientific Computing Research (ASCR) Applied Math program.
Statistical models for competing risk analysis
International Nuclear Information System (INIS)
Sather, H.N.
1976-08-01
Research results on three new models for potential applications in competing risks problems. One section covers the basic statistical relationships underlying the subsequent competing risks model development. Another discusses the problem of comparing cause-specific risk structure by competing risks theory in two homogeneous populations, P1 and P2. Weibull models which allow more generality than the Berkson and Elveback models are studied for the effect of time on the hazard function. The use of concomitant information for modeling single-risk survival is extended to the multiple failure mode domain of competing risks. The model used to illustrate the use of this methodology is a life table model which has constant hazards within pre-designated intervals of the time scale. Two parametric models for bivariate dependent competing risks, which provide interesting alternatives, are proposed and examined
Performance modeling, stochastic networks, and statistical multiplexing
Mazumdar, Ravi R
2013-01-01
This monograph presents a concise mathematical approach for modeling and analyzing the performance of communication networks with the aim of introducing an appropriate mathematical framework for modeling and analysis as well as understanding the phenomenon of statistical multiplexing. The models, techniques, and results presented form the core of traffic engineering methods used to design, control and allocate resources in communication networks.The novelty of the monograph is the fresh approach and insights provided by a sample-path methodology for queueing models that highlights the importan
Bayesian dynamic modeling of time series of dengue disease case counts.
Martínez-Bello, Daniel Adyro; López-Quílez, Antonio; Torres-Prieto, Alexander
2017-07-01
The aim of this study is to model the association between weekly time series of dengue case counts and meteorological variables, in a high-incidence city of Colombia, applying Bayesian hierarchical dynamic generalized linear models over the period January 2008 to August 2015. Additionally, we evaluate the model's short-term performance for predicting dengue cases. The methodology shows dynamic Poisson log link models including constant or time-varying coefficients for the meteorological variables. Calendar effects were modeled using constant or first- or second-order random walk time-varying coefficients. The meteorological variables were modeled using constant coefficients and first-order random walk time-varying coefficients. We applied Markov Chain Monte Carlo simulations for parameter estimation, and deviance information criterion statistic (DIC) for model selection. We assessed the short-term predictive performance of the selected final model, at several time points within the study period using the mean absolute percentage error. The results showed the best model including first-order random walk time-varying coefficients for calendar trend and first-order random walk time-varying coefficients for the meteorological variables. Besides the computational challenges, interpreting the results implies a complete analysis of the time series of dengue with respect to the parameter estimates of the meteorological effects. We found small values of the mean absolute percentage errors at one or two weeks out-of-sample predictions for most prediction points, associated with low volatility periods in the dengue counts. We discuss the advantages and limitations of the dynamic Poisson models for studying the association between time series of dengue disease and meteorological variables. The key conclusion of the study is that dynamic Poisson models account for the dynamic nature of the variables involved in the modeling of time series of dengue disease, producing useful
Bayesian dynamic modeling of time series of dengue disease case counts.
Directory of Open Access Journals (Sweden)
Daniel Adyro Martínez-Bello
2017-07-01
Full Text Available The aim of this study is to model the association between weekly time series of dengue case counts and meteorological variables, in a high-incidence city of Colombia, applying Bayesian hierarchical dynamic generalized linear models over the period January 2008 to August 2015. Additionally, we evaluate the model's short-term performance for predicting dengue cases. The methodology shows dynamic Poisson log link models including constant or time-varying coefficients for the meteorological variables. Calendar effects were modeled using constant or first- or second-order random walk time-varying coefficients. The meteorological variables were modeled using constant coefficients and first-order random walk time-varying coefficients. We applied Markov Chain Monte Carlo simulations for parameter estimation, and deviance information criterion statistic (DIC for model selection. We assessed the short-term predictive performance of the selected final model, at several time points within the study period using the mean absolute percentage error. The results showed the best model including first-order random walk time-varying coefficients for calendar trend and first-order random walk time-varying coefficients for the meteorological variables. Besides the computational challenges, interpreting the results implies a complete analysis of the time series of dengue with respect to the parameter estimates of the meteorological effects. We found small values of the mean absolute percentage errors at one or two weeks out-of-sample predictions for most prediction points, associated with low volatility periods in the dengue counts. We discuss the advantages and limitations of the dynamic Poisson models for studying the association between time series of dengue disease and meteorological variables. The key conclusion of the study is that dynamic Poisson models account for the dynamic nature of the variables involved in the modeling of time series of dengue disease
Statistical physics of pairwise probability models
Directory of Open Access Journals (Sweden)
Yasser Roudi
2009-11-01
Full Text Available Statistical models for describing the probability distribution over the states of biological systems are commonly used for dimensional reduction. Among these models, pairwise models are very attractive in part because they can be fit using a reasonable amount of data: knowledge of the means and correlations between pairs of elements in the system is sufficient. Not surprisingly, then, using pairwise models for studying neural data has been the focus of many studies in recent years. In this paper, we describe how tools from statistical physics can be employed for studying and using pairwise models. We build on our previous work on the subject and study the relation between different methods for fitting these models and evaluating their quality. In particular, using data from simulated cortical networks we study how the quality of various approximate methods for inferring the parameters in a pairwise model depends on the time bin chosen for binning the data. We also study the effect of the size of the time bin on the model quality itself, again using simulated data. We show that using finer time bins increases the quality of the pairwise model. We offer new ways of deriving the expressions reported in our previous work for assessing the quality of pairwise models.
D. Todd Jones-Farrand; Todd M. Fearer; Wayne E. Thogmartin; Frank R. Thompson; Mark D. Nelson; John M. Tirpak
2011-01-01
Selection of a modeling approach is an important step in the conservation planning process, but little guidance is available. We compared two statistical and three theoretical habitat modeling approaches representing those currently being used for avian conservation planning at landscape and regional scales: hierarchical spatial count (HSC), classification and...
Statistical models of petrol engines vehicles dynamics
Ilie, C. O.; Marinescu, M.; Alexa, O.; Vilău, R.; Grosu, D.
2017-10-01
This paper focuses on studying statistical models of vehicles dynamics. It was design and perform a one year testing program. There were used many same type cars with gasoline engines and different mileage. Experimental data were collected of onboard sensors and those on the engine test stand. A database containing data of 64th tests was created. Several mathematical modelling were developed using database and the system identification method. Each modelling is a SISO or a MISO linear predictive ARMAX (AutoRegressive-Moving-Average with eXogenous inputs) model. It represents a differential equation with constant coefficients. It were made 64th equations for each dependency like engine torque as output and engine’s load and intake manifold pressure, as inputs. There were obtained strings with 64 values for each type of model. The final models were obtained using average values of the coefficients. The accuracy of models was assessed.
Equilibrium statistical mechanics of lattice models
Lavis, David A
2015-01-01
Most interesting and difficult problems in equilibrium statistical mechanics concern models which exhibit phase transitions. For graduate students and more experienced researchers this book provides an invaluable reference source of approximate and exact solutions for a comprehensive range of such models. Part I contains background material on classical thermodynamics and statistical mechanics, together with a classification and survey of lattice models. The geometry of phase transitions is described and scaling theory is used to introduce critical exponents and scaling laws. An introduction is given to finite-size scaling, conformal invariance and Schramm—Loewner evolution. Part II contains accounts of classical mean-field methods. The parallels between Landau expansions and catastrophe theory are discussed and Ginzburg—Landau theory is introduced. The extension of mean-field theory to higher-orders is explored using the Kikuchi—Hijmans—De Boer hierarchy of approximations. In Part III the use of alge...
Statistical Models of Adaptive Immune populations
Sethna, Zachary; Callan, Curtis; Walczak, Aleksandra; Mora, Thierry
The availability of large (104-106 sequences) datasets of B or T cell populations from a single individual allows reliable fitting of complex statistical models for naïve generation, somatic selection, and hypermutation. It is crucial to utilize a probabilistic/informational approach when modeling these populations. The inferred probability distributions allow for population characterization, calculation of probability distributions of various hidden variables (e.g. number of insertions), as well as statistical properties of the distribution itself (e.g. entropy). In particular, the differences between the T cell populations of embryonic and mature mice will be examined as a case study. Comparing these populations, as well as proposed mixed populations, provides a concrete exercise in model creation, comparison, choice, and validation.
Statistical shape and appearance models of bones.
Sarkalkan, Nazli; Weinans, Harrie; Zadpoor, Amir A
2014-03-01
When applied to bones, statistical shape models (SSM) and statistical appearance models (SAM) respectively describe the mean shape and mean density distribution of bones within a certain population as well as the main modes of variations of shape and density distribution from their mean values. The availability of this quantitative information regarding the detailed anatomy of bones provides new opportunities for diagnosis, evaluation, and treatment of skeletal diseases. The potential of SSM and SAM has been recently recognized within the bone research community. For example, these models have been applied for studying the effects of bone shape on the etiology of osteoarthritis, improving the accuracy of clinical osteoporotic fracture prediction techniques, design of orthopedic implants, and surgery planning. This paper reviews the main concepts, methods, and applications of SSM and SAM as applied to bone. Copyright © 2013 Elsevier Inc. All rights reserved.
Cellular automata and statistical mechanical models
International Nuclear Information System (INIS)
Rujan, P.
1987-01-01
The authors elaborate on the analogy between the transfer matrix of usual lattice models and the master equation describing the time development of cellular automata. Transient and stationary properties of probabilistic automata are linked to surface and bulk properties, respectively, of restricted statistical mechanical systems. It is demonstrated that methods of statistical physics can be successfully used to describe the dynamic and the stationary behavior of such automata. Some exact results are derived, including duality transformations, exact mappings, disorder, and linear solutions. Many examples are worked out in detail to demonstrate how to use statistical physics in order to construct cellular automata with desired properties. This approach is considered to be a first step toward the design of fully parallel, probabilistic systems whose computational abilities rely on the cooperative behavior of their components
Link, W.A.; Sauer, J.R.; Helbig, Andreas J.; Flade, Martin
1999-01-01
Count survey data are commonly used for estimating temporal and spatial patterns of population change. Since count surveys are not censuses, counts can be influenced by 'nuisance factors' related to the probability of detecting animals but unrelated to the actual population size. The effects of systematic changes in these factors can be confounded with patterns of population change. Thus, valid analysis of count survey data requires the identification of nuisance factors and flexible models for their effects. We illustrate using data from the Christmas Bird Count (CBC), a midwinter survey of bird populations in North America. CBC survey effort has substantially increased in recent years, suggesting that unadjusted counts may overstate population growth (or understate declines). We describe a flexible family of models for the effect of effort, that includes models in which increasing effort leads to diminishing returns in terms of the number of birds counted.
The Rasch Poisson Counts Model for Incomplete Data: An Application of the EM Algorithm.
Jansen, Margo G. H.
1995-01-01
The Rasch Poisson counts model is a latent trait model for the situation in which "K" tests are administered to "N" examinees and the test score is a count (repeated number of some event). A mixed model is presented that applies the EM algorithm and that can allow for missing data. (SLD)
Simulation on Poisson and negative binomial models of count road accident modeling
Sapuan, M. S.; Razali, A. M.; Zamzuri, Z. H.; Ibrahim, K.
2016-11-01
Accident count data have often been shown to have overdispersion. On the other hand, the data might contain zero count (excess zeros). The simulation study was conducted to create a scenarios which an accident happen in T-junction with the assumption the dependent variables of generated data follows certain distribution namely Poisson and negative binomial distribution with different sample size of n=30 to n=500. The study objective was accomplished by fitting Poisson regression, negative binomial regression and Hurdle negative binomial model to the simulated data. The model validation was compared and the simulation result shows for each different sample size, not all model fit the data nicely even though the data generated from its own distribution especially when the sample size is larger. Furthermore, the larger sample size indicates that more zeros accident count in the dataset.
Statistical modeling of geopressured geothermal reservoirs
Ansari, Esmail; Hughes, Richard; White, Christopher D.
2017-06-01
Identifying attractive candidate reservoirs for producing geothermal energy requires predictive models. In this work, inspectional analysis and statistical modeling are used to create simple predictive models for a line drive design. Inspectional analysis on the partial differential equations governing this design yields a minimum number of fifteen dimensionless groups required to describe the physics of the system. These dimensionless groups are explained and confirmed using models with similar dimensionless groups but different dimensional parameters. This study models dimensionless production temperature and thermal recovery factor as the responses of a numerical model. These responses are obtained by a Box-Behnken experimental design. An uncertainty plot is used to segment the dimensionless time and develop a model for each segment. The important dimensionless numbers for each segment of the dimensionless time are identified using the Boosting method. These selected numbers are used in the regression models. The developed models are reduced to have a minimum number of predictors and interactions. The reduced final models are then presented and assessed using testing runs. Finally, applications of these models are offered. The presented workflow is generic and can be used to translate the output of a numerical simulator into simple predictive models in other research areas involving numerical simulation.
Statistical Modelling of Wind Proles - Data Analysis and Modelling
DEFF Research Database (Denmark)
Jónsson, Tryggvi; Pinson, Pierre
The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.......The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles....
Logarithmic transformed statistical models in calibration
International Nuclear Information System (INIS)
Zeis, C.D.
1975-01-01
A general type of statistical model used for calibration of instruments having the property that the standard deviations of the observed values increase as a function of the mean value is described. The application to the Helix Counter at the Rocky Flats Plant is primarily from a theoretical point of view. The Helix Counter measures the amount of plutonium in certain types of chemicals. The method described can be used also for other calibrations. (U.S.)
Statistical model for high energy inclusive processes
International Nuclear Information System (INIS)
Pomorisac, B.
1980-01-01
We propose a statistical model of inclusive processes. The model is an extension of the model proposed by Salapino and Sugar for the inclusive distributions in rapidity. The model is defined in terms of a random variable on the full phase space of the produced particles and in terms of a Lorentz-invariant probability distribution. We suggest that the Lorentz invariance is broken spontaneously, this may describe the observed anisotropy of the inclusive distributions. Based on this model we calculate the distribution in transverse momentum. An explicit calculation is given of the one-particle inclusive cross sections and the two-particle correlation. The results give a fair representation of the shape of one-particle inclusive cross sections, and positive correlation for the particles emitted. The relevance of our results to experiments is discussed
International Nuclear Information System (INIS)
Béthermin, Matthieu; Daddi, Emanuele; Sargent, Mark T.; Elbaz, David; Mullaney, James; Pannella, Maurilio; Magdis, Georgios; Hezaveh, Yashar; Le Borgne, Damien; Buat, Véronique; Charmandaris, Vassilis; Lagache, Guilaine; Scott, Douglas
2012-01-01
We reproduce the mid-infrared to radio galaxy counts with a new empirical model based on our current understanding of the evolution of main-sequence (MS) and starburst (SB) galaxies. We rely on a simple spectral energy distribution (SED) library based on Herschel observations: a single SED for the MS and another one for SB, getting warmer with redshift. Our model is able to reproduce recent measurements of galaxy counts performed with Herschel, including counts per redshift slice. This agreement demonstrates the power of our 2-Star-Formation Modes (2SFM) decomposition in describing the statistical properties of infrared sources and their evolution with cosmic time. We discuss the relative contribution of MS and SB galaxies to the number counts at various wavelengths and flux densities. We also show that MS galaxies are responsible for a bump in the 1.4 GHz radio counts around 50 μJy. Material of the model (predictions, SED library, mock catalogs, etc.) is available online.
The Rasch Poisson counts model for incomplete data : An application of the EM algorithm
Jansen, G.G.H.
Rasch's Poisson counts model is a latent trait model for the situation in which K tests are administered to N examinees and the test score is a count [e.g., the repeated occurrence of some event, such as the number of items completed or the number of items answered (in)correctly]. The Rasch Poisson
Modelling T4 cell count as a marker of HIV progression in the ...
African Journals Online (AJOL)
Modelling T4 cell count as a marker of HIV progression in the absence of any defense mechanism. VSM Yadavalli, MMO Labeodan, S Udayabaskaran, N Forche. Abstract. The T4 cell count, which is considered one of the markers of disease progression in an HIV infected individual, is modelled in this paper. The World ...
Statistical Modelling of the Soil Dielectric Constant
Usowicz, Boguslaw; Marczewski, Wojciech; Bogdan Usowicz, Jerzy; Lipiec, Jerzy
2010-05-01
The dielectric constant of soil is the physical property being very sensitive on water content. It funds several electrical measurement techniques for determining the water content by means of direct (TDR, FDR, and others related to effects of electrical conductance and/or capacitance) and indirect RS (Remote Sensing) methods. The work is devoted to a particular statistical manner of modelling the dielectric constant as the property accounting a wide range of specific soil composition, porosity, and mass density, within the unsaturated water content. Usually, similar models are determined for few particular soil types, and changing the soil type one needs switching the model on another type or to adjust it by parametrization of soil compounds. Therefore, it is difficult comparing and referring results between models. The presented model was developed for a generic representation of soil being a hypothetical mixture of spheres, each representing a soil fraction, in its proper phase state. The model generates a serial-parallel mesh of conductive and capacitive paths, which is analysed for a total conductive or capacitive property. The model was firstly developed to determine the thermal conductivity property, and now it is extended on the dielectric constant by analysing the capacitive mesh. The analysis is provided by statistical means obeying physical laws related to the serial-parallel branching of the representative electrical mesh. Physical relevance of the analysis is established electrically, but the definition of the electrical mesh is controlled statistically by parametrization of compound fractions, by determining the number of representative spheres per unitary volume per fraction, and by determining the number of fractions. That way the model is capable covering properties of nearly all possible soil types, all phase states within recognition of the Lorenz and Knudsen conditions. In effect the model allows on generating a hypothetical representative of
Encoding Dissimilarity Data for Statistical Model Building.
Wahba, Grace
2010-12-01
We summarize, review and comment upon three papers which discuss the use of discrete, noisy, incomplete, scattered pairwise dissimilarity data in statistical model building. Convex cone optimization codes are used to embed the objects into a Euclidean space which respects the dissimilarity information while controlling the dimension of the space. A "newbie" algorithm is provided for embedding new objects into this space. This allows the dissimilarity information to be incorporated into a Smoothing Spline ANOVA penalized likelihood model, a Support Vector Machine, or any model that will admit Reproducing Kernel Hilbert Space components, for nonparametric regression, supervised learning, or semi-supervised learning. Future work and open questions are discussed. The papers are: F. Lu, S. Keles, S. Wright and G. Wahba 2005. A framework for kernel regularization with application to protein clustering. Proceedings of the National Academy of Sciences 102, 12332-1233.G. Corrada Bravo, G. Wahba, K. Lee, B. Klein, R. Klein and S. Iyengar 2009. Examining the relative influence of familial, genetic and environmental covariate information in flexible risk models. Proceedings of the National Academy of Sciences 106, 8128-8133F. Lu, Y. Lin and G. Wahba. Robust manifold unfolding with kernel regularization. TR 1008, Department of Statistics, University of Wisconsin-Madison.
Statistical Model Checking for Biological Systems
DEFF Research Database (Denmark)
David, Alexandre; Larsen, Kim Guldstrand; Legay, Axel
2014-01-01
Statistical Model Checking (SMC) is a highly scalable simulation-based verification approach for testing and estimating the probability that a stochastic system satisfies a given linear temporal property. The technique has been applied to (discrete and continuous time) Markov chains, stochastic...... timed automata and most recently hybrid systems using the tool Uppaal SMC. In this paper we enable the application of SMC to complex biological systems, by combining Uppaal SMC with ANIMO, a plugin of the tool Cytoscape used by biologists, as well as with SimBiology®, a plugin of Matlab to simulate...
Average Nuclear properties based on statistical model
International Nuclear Information System (INIS)
El-Jaick, L.J.
1974-01-01
The rough properties of nuclei were investigated by statistical model, in systems with the same and different number of protons and neutrons, separately, considering the Coulomb energy in the last system. Some average nuclear properties were calculated based on the energy density of nuclear matter, from Weizsscker-Beth mass semiempiric formulae, generalized for compressible nuclei. In the study of a s surface energy coefficient, the great influence exercised by Coulomb energy and nuclear compressibility was verified. For a good adjust of beta stability lines and mass excess, the surface symmetry energy were established. (M.C.K.) [pt
Amalia, Junita; Purhadi, Otok, Bambang Widjanarko
2017-11-01
Poisson distribution is a discrete distribution with count data as the random variables and it has one parameter defines both mean and variance. Poisson regression assumes mean and variance should be same (equidispersion). Nonetheless, some case of the count data unsatisfied this assumption because variance exceeds mean (over-dispersion). The ignorance of over-dispersion causes underestimates in standard error. Furthermore, it causes incorrect decision in the statistical test. Previously, paired count data has a correlation and it has bivariate Poisson distribution. If there is over-dispersion, modeling paired count data is not sufficient with simple bivariate Poisson regression. Bivariate Poisson Inverse Gaussian Regression (BPIGR) model is mix Poisson regression for modeling paired count data within over-dispersion. BPIGR model produces a global model for all locations. In another hand, each location has different geographic conditions, social, cultural and economic so that Geographically Weighted Regression (GWR) is needed. The weighting function of each location in GWR generates a different local model. Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model is used to solve over-dispersion and to generate local models. Parameter estimation of GWBPIGR model obtained by Maximum Likelihood Estimation (MLE) method. Meanwhile, hypothesis testing of GWBPIGR model acquired by Maximum Likelihood Ratio Test (MLRT) method.
Statistical tests of simple earthquake cycle models
Devries, Phoebe M. R.; Evans, Eileen
2016-01-01
A central goal of observing and modeling the earthquake cycle is to forecast when a particular fault may generate an earthquake: a fault late in its earthquake cycle may be more likely to generate an earthquake than a fault early in its earthquake cycle. Models that can explain geodetic observations throughout the entire earthquake cycle may be required to gain a more complete understanding of relevant physics and phenomenology. Previous efforts to develop unified earthquake models for strike-slip faults have largely focused on explaining both preseismic and postseismic geodetic observations available across a few faults in California, Turkey, and Tibet. An alternative approach leverages the global distribution of geodetic and geologic slip rate estimates on strike-slip faults worldwide. Here we use the Kolmogorov-Smirnov test for similarity of distributions to infer, in a statistically rigorous manner, viscoelastic earthquake cycle models that are inconsistent with 15 sets of observations across major strike-slip faults. We reject a large subset of two-layer models incorporating Burgers rheologies at a significance level of α = 0.05 (those with long-term Maxwell viscosities ηM ~ 4.6 × 1020 Pa s) but cannot reject models on the basis of transient Kelvin viscosity ηK. Finally, we examine the implications of these results for the predicted earthquake cycle timing of the 15 faults considered and compare these predictions to the geologic and historical record.
Statistical modeling to support power system planning
Staid, Andrea
This dissertation focuses on data-analytic approaches that improve our understanding of power system applications to promote better decision-making. It tackles issues of risk analysis, uncertainty management, resource estimation, and the impacts of climate change. Tools of data mining and statistical modeling are used to bring new insight to a variety of complex problems facing today's power system. The overarching goal of this research is to improve the understanding of the power system risk environment for improved operation, investment, and planning decisions. The first chapter introduces some challenges faced in planning for a sustainable power system. Chapter 2 analyzes the driving factors behind the disparity in wind energy investments among states with a goal of determining the impact that state-level policies have on incentivizing wind energy. Findings show that policy differences do not explain the disparities; physical and geographical factors are more important. Chapter 3 extends conventional wind forecasting to a risk-based focus of predicting maximum wind speeds, which are dangerous for offshore operations. Statistical models are presented that issue probabilistic predictions for the highest wind speed expected in a three-hour interval. These models achieve a high degree of accuracy and their use can improve safety and reliability in practice. Chapter 4 examines the challenges of wind power estimation for onshore wind farms. Several methods for wind power resource assessment are compared, and the weaknesses of the Jensen model are demonstrated. For two onshore farms, statistical models outperform other methods, even when very little information is known about the wind farm. Lastly, chapter 5 focuses on the power system more broadly in the context of the risks expected from tropical cyclones in a changing climate. Risks to U.S. power system infrastructure are simulated under different scenarios of tropical cyclone behavior that may result from climate
Statistical mechanics of helical wormlike chain model
Liu, Ya; Pérez, Toni; Li, Wei; Gunton, J. D.; Green, Amanda
2011-02-01
We investigate the statistical mechanics of polymers with bending and torsional elasticity described by the helical wormlike model. Noticing that the energy function is factorizable, we provide a numerical method to solve the model using a transfer matrix formulation. The tangent-tangent and binormal-binormal correlation functions have been calculated and displayed rich profiles which are sensitive to the combination of the temperature and the equilibrium torsion. Their behaviors indicate that there is no finite temperature Lifshitz point between the disordered and helical phases. The asymptotic behavior at low temperature has been investigated theoretically and the predictions fit the numerical results very well. Our analysis could be used to understand the statics of dsDNA and other chiral polymers.
Statistical Mechanics of Helical Wormlike Model
Liu, Ya; Perez, Toni; Li, Wei; Gunton, James; Green, Amanda
2011-03-01
The bending and torsional elasticities are crucial in determining the static and dynamic properties of ~biopolymers such as dsDNA and sickle hemoglobin. We investigate the statistical mechanics of stiff polymers ~described by the helical wormlike model. We provide a numerical method to solve the model using a transfer matrix formulation. The correlation functions have been calculated and display rich profiles which are sensitive to the combination of the temperature and the equilibrium torsion. The asymptotic behavior at low temperature has been investigated theoretically and the predictions fit the numerical results very well. Our analysis could be used to understand the statics of dsDNA and other chiral polymers. This work is supported by grants from the NSF and Mathers Foundation.
Atmospheric corrosion: statistical validation of models
International Nuclear Information System (INIS)
Diaz, V.; Martinez-Luaces, V.; Guineo-Cobs, G.
2003-01-01
In this paper we discuss two different methods for validation of regression models, applied to corrosion data. One of them is based on the correlation coefficient and the other one is the statistical test of lack of fit. Both methods are used here to analyse fitting of bi logarithmic model in order to predict corrosion for very low carbon steel substrates in rural and urban-industrial atmospheres in Uruguay. Results for parameters A and n of the bi logarithmic model are reported here. For this purpose, all repeated values were used instead of using average values as usual. Modelling is carried out using experimental data corresponding to steel substrates under the same initial meteorological conditions ( in fact, they are put in the rack at the same time). Results of correlation coefficient are compared with the lack of it tested at two different signification levels (α=0.01 and α=0.05). Unexpected differences between them are explained and finally, it is possible to conclude, at least in the studied atmospheres, that the bi logarithmic model does not fit properly the experimental data. (Author) 18 refs
George L. Farnsworth; James D. Nichols; John R. Sauer; Steven G. Fancy; Kenneth H. Pollock; Susan A. Shriner; Theodore R. Simons
2005-01-01
Point counts are a standard sampling procedure for many bird species, but lingering concerns still exist about the quality of information produced from the method. It is well known that variation in observer ability and environmental conditions can influence the detection probability of birds in point counts, but many biologists have been reluctant to abandon point...
MSMBuilder: Statistical Models for Biomolecular Dynamics.
Harrigan, Matthew P; Sultan, Mohammad M; Hernández, Carlos X; Husic, Brooke E; Eastman, Peter; Schwantes, Christian R; Beauchamp, Kyle A; McGibbon, Robert T; Pande, Vijay S
2017-01-10
MSMBuilder is a software package for building statistical models of high-dimensional time-series data. It is designed with a particular focus on the analysis of atomistic simulations of biomolecular dynamics such as protein folding and conformational change. MSMBuilder is named for its ability to construct Markov state models (MSMs), a class of models that has gained favor among computational biophysicists. In addition to both well-established and newer MSM methods, the package includes complementary algorithms for understanding time-series data such as hidden Markov models and time-structure based independent component analysis. MSMBuilder boasts an easy to use command-line interface, as well as clear and consistent abstractions through its Python application programming interface. MSMBuilder was developed with careful consideration for compatibility with the broader machine learning community by following the design of scikit-learn. The package is used primarily by practitioners of molecular dynamics, but is just as applicable to other computational or experimental time-series measurements. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Spherical Process Models for Global Spatial Statistics
Jeong, Jaehong
2017-11-28
Statistical models used in geophysical, environmental, and climate science applications must reflect the curvature of the spatial domain in global data. Over the past few decades, statisticians have developed covariance models that capture the spatial and temporal behavior of these global data sets. Though the geodesic distance is the most natural metric for measuring distance on the surface of a sphere, mathematical limitations have compelled statisticians to use the chordal distance to compute the covariance matrix in many applications instead, which may cause physically unrealistic distortions. Therefore, covariance functions directly defined on a sphere using the geodesic distance are needed. We discuss the issues that arise when dealing with spherical data sets on a global scale and provide references to recent literature. We review the current approaches to building process models on spheres, including the differential operator, the stochastic partial differential equation, the kernel convolution, and the deformation approaches. We illustrate realizations obtained from Gaussian processes with different covariance structures and the use of isotropic and nonstationary covariance models through deformations and geographical indicators for global surface temperature data. To assess the suitability of each method, we compare their log-likelihood values and prediction scores, and we end with a discussion of related research problems.
Statistical Shape Modeling of Cam Femoroacetabular Impingement
Energy Technology Data Exchange (ETDEWEB)
Harris, Michael D.; Dater, Manasi; Whitaker, Ross; Jurrus, Elizabeth R.; Peters, Christopher L.; Anderson, Andrew E.
2013-10-01
In this study, statistical shape modeling (SSM) was used to quantify three-dimensional (3D) variation and morphologic differences between femurs with and without cam femoroacetabular impingement (FAI). 3D surfaces were generated from CT scans of femurs from 41 controls and 30 cam FAI patients. SSM correspondence particles were optimally positioned on each surface using a gradient descent energy function. Mean shapes for control and patient groups were defined from the resulting particle configurations. Morphological differences between group mean shapes and between the control mean and individual patients were calculated. Principal component analysis was used to describe anatomical variation present in both groups. The first 6 modes (or principal components) captured statistically significant shape variations, which comprised 84% of cumulative variation among the femurs. Shape variation was greatest in femoral offset, greater trochanter height, and the head-neck junction. The mean cam femur shape protruded above the control mean by a maximum of 3.3 mm with sustained protrusions of 2.5-3.0 mm along the anterolateral head-neck junction and distally along the anterior neck, corresponding well with reported cam lesion locations and soft-tissue damage. This study provides initial evidence that SSM can describe variations in femoral morphology in both controls and cam FAI patients and may be useful for developing new measurements of pathological anatomy. SSM may also be applied to characterize cam FAI severity and provide templates to guide patient-specific surgical resection of bone.
Statistical model for OCT image denoising
Li, Muxingzi
2017-08-01
Optical coherence tomography (OCT) is a non-invasive technique with a large array of applications in clinical imaging and biological tissue visualization. However, the presence of speckle noise affects the analysis of OCT images and their diagnostic utility. In this article, we introduce a new OCT denoising algorithm. The proposed method is founded on a numerical optimization framework based on maximum-a-posteriori estimate of the noise-free OCT image. It combines a novel speckle noise model, derived from local statistics of empirical spectral domain OCT (SD-OCT) data, with a Huber variant of total variation regularization for edge preservation. The proposed approach exhibits satisfying results in terms of speckle noise reduction as well as edge preservation, at reduced computational cost.
Current algebra, statistical mechanics and quantum models
Vilela Mendes, R.
2017-11-01
Results obtained in the past for free boson systems at zero and nonzero temperatures are revisited to clarify the physical meaning of current algebra reducible functionals which are associated to systems with density fluctuations, leading to observable effects on phase transitions. To use current algebra as a tool for the formulation of quantum statistical mechanics amounts to the construction of unitary representations of diffeomorphism groups. Two mathematical equivalent procedures exist for this purpose. One searches for quasi-invariant measures on configuration spaces, the other for a cyclic vector in Hilbert space. Here, one argues that the second approach is closer to the physical intuition when modelling complex systems. An example of application of the current algebra methodology to the pairing phenomenon in two-dimensional fermion systems is discussed.
New advances in statistical modeling and applications
Santos, Rui; Oliveira, Maria; Paulino, Carlos
2014-01-01
This volume presents selected papers from the XIXth Congress of the Portuguese Statistical Society, held in the town of Nazaré, Portugal, from September 28 to October 1, 2011. All contributions were selected after a thorough peer-review process. It covers a broad range of papers in the areas of statistical science, probability and stochastic processes, extremes and statistical applications.
A statistical model for predicting muscle performance
Byerly, Diane Leslie De Caix
The objective of these studies was to develop a capability for predicting muscle performance and fatigue to be utilized for both space- and ground-based applications. To develop this predictive model, healthy test subjects performed a defined, repetitive dynamic exercise to failure using a Lordex spinal machine. Throughout the exercise, surface electromyography (SEMG) data were collected from the erector spinae using a Mega Electronics ME3000 muscle tester and surface electrodes placed on both sides of the back muscle. These data were analyzed using a 5th order Autoregressive (AR) model and statistical regression analysis. It was determined that an AR derived parameter, the mean average magnitude of AR poles, significantly correlated with the maximum number of repetitions (designated Rmax) that a test subject was able to perform. Using the mean average magnitude of AR poles, a test subject's performance to failure could be predicted as early as the sixth repetition of the exercise. This predictive model has the potential to provide a basis for improving post-space flight recovery, monitoring muscle atrophy in astronauts and assessing the effectiveness of countermeasures, monitoring astronaut performance and fatigue during Extravehicular Activity (EVA) operations, providing pre-flight assessment of the ability of an EVA crewmember to perform a given task, improving the design of training protocols and simulations for strenuous International Space Station assembly EVA, and enabling EVA work task sequences to be planned enhancing astronaut performance and safety. Potential ground-based, medical applications of the predictive model include monitoring muscle deterioration and performance resulting from illness, establishing safety guidelines in the industry for repetitive tasks, monitoring the stages of rehabilitation for muscle-related injuries sustained in sports and accidents, and enhancing athletic performance through improved training protocols while reducing
A Statistical Model for Regional Tornado Climate Studies.
Directory of Open Access Journals (Sweden)
Thomas H Jagger
Full Text Available Tornado reports are locally rare, often clustered, and of variable quality making it difficult to use them directly to describe regional tornado climatology. Here a statistical model is demonstrated that overcomes some of these difficulties and produces a smoothed regional-scale climatology of tornado occurrences. The model is applied to data aggregated at the level of counties. These data include annual population, annual tornado counts and an index of terrain roughness. The model has a term to capture the smoothed frequency relative to the state average. The model is used to examine whether terrain roughness is related to tornado frequency and whether there are differences in tornado activity by County Warning Area (CWA. A key finding is that tornado reports increase by 13% for a two-fold increase in population across Kansas after accounting for improvements in rating procedures. Independent of this relationship, tornadoes have been increasing at an annual rate of 1.9%. Another finding is the pattern of correlated residuals showing more Kansas tornadoes in a corridor of counties running roughly north to south across the west central part of the state consistent with the dryline climatology. The model is significantly improved by adding terrain roughness. The effect amounts to an 18% reduction in the number of tornadoes for every ten meter increase in elevation standard deviation. The model indicates that tornadoes are 51% more likely to occur in counties served by the CWAs of DDC and GID than elsewhere in the state. Flexibility of the model is illustrated by fitting it to data from Illinois, Mississippi, South Dakota, and Ohio.
Bias in iterative reconstruction of low-statistics PET data: benefits of a resolution model
Energy Technology Data Exchange (ETDEWEB)
Walker, M D; Asselin, M-C; Julyan, P J; Feldmann, M; Matthews, J C [School of Cancer and Enabling Sciences, Wolfson Molecular Imaging Centre, MAHSC, University of Manchester, Manchester M20 3LJ (United Kingdom); Talbot, P S [Mental Health and Neurodegeneration Research Group, Wolfson Molecular Imaging Centre, MAHSC, University of Manchester, Manchester M20 3LJ (United Kingdom); Jones, T, E-mail: matthew.walker@manchester.ac.uk [Academic Department of Radiation Oncology, Christie Hospital, University of Manchester, Manchester M20 4BX (United Kingdom)
2011-02-21
Iterative image reconstruction methods such as ordered-subset expectation maximization (OSEM) are widely used in PET. Reconstructions via OSEM are however reported to be biased for low-count data. We investigated this and considered the impact for dynamic PET. Patient listmode data were acquired in [{sup 11}C]DASB and [{sup 15}O]H{sub 2}O scans on the HRRT brain PET scanner. These data were subsampled to create many independent, low-count replicates. The data were reconstructed and the images from low-count data were compared to the high-count originals (from the same reconstruction method). This comparison enabled low-statistics bias to be calculated for the given reconstruction, as a function of the noise-equivalent counts (NEC). Two iterative reconstruction methods were tested, one with and one without an image-based resolution model (RM). Significant bias was observed when reconstructing data of low statistical quality, for both subsampled human and simulated data. For human data, this bias was substantially reduced by including a RM. For [{sup 11}C]DASB the low-statistics bias in the caudate head at 1.7 M NEC (approx. 30 s) was -5.5% and -13% with and without RM, respectively. We predicted biases in the binding potential of -4% and -10%. For quantification of cerebral blood flow for the whole-brain grey- or white-matter, using [{sup 15}O]H{sub 2}O and the PET autoradiographic method, a low-statistics bias of <2.5% and <4% was predicted for reconstruction with and without the RM. The use of a resolution model reduces low-statistics bias and can hence be beneficial for quantitative dynamic PET.
Spatial Multiplication Model as an alternative to the Point Model in Neutron Multiplicity Counting
Energy Technology Data Exchange (ETDEWEB)
Hauck, Danielle K. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Henzl, Vladimir [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2014-03-26
The point model is commonly used in neutron multiplicity counting to relate the correlated neutron detection rates (singles, doubles, triples) to item properties (mass, (α,n) reaction rate and neutron multiplication). The point model assumes that the probability that a neutron will induce fission is a constant across the physical extent of the item. However, in reality, neutrons near the center of an item have a greater probability of inducing fission then items near the edges. As a result, the neutron multiplication has a spatial distribution.
Examining secular trend and seasonality in count data using dynamic generalized linear modelling
DEFF Research Database (Denmark)
Lundbye-Christensen, Søren; Dethlefsen, Claus; Gorst-Rasmussen, Anders
Aims Time series of incidence counts often show secular trends and seasonal patterns. We present a model for incidence counts capable of handling a possible gradual change in growth rates and seasonal patterns, serial correlation and overdispersion. Methods The model resembles an ordinary time...... series regression model for Poisson counts. It differs in allowing the regression coefficients to vary gradually over time in a random fashion. Data In the period January 1980 to 1999, 17,989 incidents of acute myocardial infarction were recorded in the county of Northern Jutland, Denmark. Records were...... updated daily. Results The model with a seasonal pattern and an approximately linear trend was fitted to the data, and diagnostic plots indicate a good model fit. The analysis with the dynamic model revealed peaks coinciding with influenza epidemics. On average the peak-to-trough ratio is estimated...
Statistical Challenges in Modeling Big Brain Signals
Yu, Zhaoxia
2017-11-01
Brain signal data are inherently big: massive in amount, complex in structure, and high in dimensions. These characteristics impose great challenges for statistical inference and learning. Here we review several key challenges, discuss possible solutions, and highlight future research directions.
Statistical Learning Theory: Models, Concepts, and Results
von Luxburg, Ulrike; Schoelkopf, Bernhard
2008-01-01
Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms. In this article we attempt to give a gentle, non-technical overview over the key ideas and insights of statistical learning theory. We target at a broad audience, not necessarily machine learning researchers. This paper can serve as a starting point for people who want to get an overview on the field before diving into technical details.
What every radiochemist should know about statistics
International Nuclear Information System (INIS)
Nicholson, W.L.
1994-04-01
Radionuclide decay and measurement with appropriate counting instruments is one of the few physical processes for which exact mathematical/probabilistic models are available. This paper discusses statistical procedures associated with display and analysis of radionuclide counting data that derive from these exact models. For low count situations the attractiveness of fixed-count-random-time procedures is discussed
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical analmodelling) are among statistical methods which are frequently needed in analyzing quantitative data, especially to model relationship between response and explanatory variables. Nowadays, statistical models have been developed into various directions to model various type and complex relationship of data. Rich varieties of advanced and recent statistical modelling are mostly available on open source software (one of them is R). However, these advanced statistical modelling, are not very friendly to novice R users, since they are based on programming script or command line interface. Our research aims to developed web interface (based on R and shiny), so that most recent and advanced statistical modelling are readily available, accessible and applicable on web. We have previously made interface in the form of e-tutorial for several modern and advanced statistical modelling on R especially for independent responses (including linear models/LM, generalized linier models/GLM, generalized additive model/GAM and generalized additive model for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including model using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/ MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web (interface) make the statistical modeling becomes easier to apply and easier to compare them in order to find the most appropriate model for the data.
Use of a mixture statistical model in studying malaria vectors density.
Boussari, Olayidé; Moiroux, Nicolas; Iwaz, Jean; Djènontin, Armel; Bio-Bangana, Sahabi; Corbel, Vincent; Fonton, Noël; Ecochard, René
2012-01-01
Vector control is a major step in the process of malaria control and elimination. This requires vector counts and appropriate statistical analyses of these counts. However, vector counts are often overdispersed. A non-parametric mixture of Poisson model (NPMP) is proposed to allow for overdispersion and better describe vector distribution. Mosquito collections using the Human Landing Catches as well as collection of environmental and climatic data were carried out from January to December 2009 in 28 villages in Southern Benin. A NPMP regression model with "village" as random effect is used to test statistical correlations between malaria vectors density and environmental and climatic factors. Furthermore, the villages were ranked using the latent classes derived from the NPMP model. Based on this classification of the villages, the impacts of four vector control strategies implemented in the villages were compared. Vector counts were highly variable and overdispersed with important proportion of zeros (75%). The NPMP model had a good aptitude to predict the observed values and showed that: i) proximity to freshwater body, market gardening, and high levels of rain were associated with high vector density; ii) water conveyance, cattle breeding, vegetation index were associated with low vector density. The 28 villages could then be ranked according to the mean vector number as estimated by the random part of the model after adjustment on all covariates. The NPMP model made it possible to describe the distribution of the vector across the study area. The villages were ranked according to the mean vector density after taking into account the most important covariates. This study demonstrates the necessity and possibility of adapting methods of vector counting and sampling to each setting.
Use of a mixture statistical model in studying malaria vectors density.
Directory of Open Access Journals (Sweden)
Olayidé Boussari
Full Text Available Vector control is a major step in the process of malaria control and elimination. This requires vector counts and appropriate statistical analyses of these counts. However, vector counts are often overdispersed. A non-parametric mixture of Poisson model (NPMP is proposed to allow for overdispersion and better describe vector distribution. Mosquito collections using the Human Landing Catches as well as collection of environmental and climatic data were carried out from January to December 2009 in 28 villages in Southern Benin. A NPMP regression model with "village" as random effect is used to test statistical correlations between malaria vectors density and environmental and climatic factors. Furthermore, the villages were ranked using the latent classes derived from the NPMP model. Based on this classification of the villages, the impacts of four vector control strategies implemented in the villages were compared. Vector counts were highly variable and overdispersed with important proportion of zeros (75%. The NPMP model had a good aptitude to predict the observed values and showed that: i proximity to freshwater body, market gardening, and high levels of rain were associated with high vector density; ii water conveyance, cattle breeding, vegetation index were associated with low vector density. The 28 villages could then be ranked according to the mean vector number as estimated by the random part of the model after adjustment on all covariates. The NPMP model made it possible to describe the distribution of the vector across the study area. The villages were ranked according to the mean vector density after taking into account the most important covariates. This study demonstrates the necessity and possibility of adapting methods of vector counting and sampling to each setting.
Detectability adjusted count models of songbird abundance: Chapter 6
2011-01-01
Sagebrush (Artemisia spp.) steppe ecosystems have experienced recent changes resulting not only in the loss of habitat but also fragmentation and degradation of remaining habitats. As a result, sagebrush-obligate and sagebrush associated songbird populations have experienced population declines over the past several decades. We examined landscape-scale responses in occupancy and abundance for six focal songbird species at 318 survey sites across the Wyoming Basins Ecoregional Assessment (WBEA) area. Occupancy and abundance models were fit for each species using datasets developed at multiple moving window extents to assess landscape-scale relationships between abiotic, habitat, and anthropogenic factors. Anthropogenic factors had less influence on species occupancy or abundance than abiotic and habitat factors. Sagebrush measures were strong predictors of occurrence for sagebrush-obligate species, such as Brewer’s sparrows (Spizella breweri), sage sparrows (Amphispiza belli) and sage thrashers (Oreoscoptes montanus), as well as green-tailed towhees (Pipilo chlorurus), a species associated with mountain shrub communities. Occurrence for lark sparrows (Chondestes grammacus) and vesper sparrows (Pooecetes gramineus), considered shrub steppe-associated species, was also related to big sagebrush communities, but at large spatial extents. Although relationships between anthropogenic variables and occurrence were weak for most species, the consistent relationship with sagebrush habitat variables suggests direct habitat loss and not edge or additional fragmentation effects are causing declines in the avifauna examined in the WBEA area. Thus, natural and anthropogenic disturbances that result in loss of critical habitats are the biggest threats to these species. We applied our models spatially across the WBEA area to identify and prioritize key areas for conservation.
Linear Mixed Models in Statistical Genetics
R. de Vlaming (Ronald)
2017-01-01
markdownabstractOne of the goals of statistical genetics is to elucidate the genetic architecture of phenotypes (i.e., observable individual characteristics) that are affected by many genetic variants (e.g., single-nucleotide polymorphisms; SNPs). A particular aim is to identify specific SNPs that
Perry, Mike; Kader, Gary
1998-01-01
Presents an activity on the simplification of penguin counting by employing the basic ideas and principles of sampling to teach students to understand and recognize its role in statistical claims. Emphasizes estimation, data analysis and interpretation, and central limit theorem. Includes a list of items for classroom discussion. (ASK)
Integer Representations towards Efficient Counting in the Bit Probe Model
DEFF Research Database (Denmark)
Brodal, Gerth Stølting; Greve, Mark; Pandey, Vineet
2011-01-01
Abstract We consider the problem of representing numbers in close to optimal space and supporting increment, decrement, addition and subtraction operations efficiently. We study the problem in the bit probe model and analyse the number of bits read and written to perform the operations, both...... in the worst-case and in the average-case. A counter is space-optimal if it represents any number in the range [0,...,2 n − 1] using exactly n bits. We provide a space-optimal counter which supports increment and decrement operations by reading at most n − 1 bits and writing at most 3 bits in the worst......-case. To the best of our knowledge, this is the first such representation which supports these operations by always reading strictly less than n bits. For redundant counters where we only need to represent numbers in the range [0,...,L] for some integer L bits, we define the efficiency...
Assessing the value of museums with a combined discrete choice/ count data model
Rouwendal, J.; Boter, J.
2009-01-01
This article assesses the value of Dutch museums using information about destination choice as well as about the number of trips undertaken by an actor. Destination choice is analysed by means of a mixed logit model, and a count data model is used to explain trip generation. We use a
A LATENT CLASS POISSON REGRESSION-MODEL FOR HETEROGENEOUS COUNT DATA
WEDEL, M; DESARBO, WS; BULT, [No Value; RAMASWAMY, [No Value
1993-01-01
In this paper an approach is developed that accommodates heterogeneity in Poisson regression models for count data. The model developed assumes that heterogeneity arises from a distribution of both the intercept and the coefficients of the explanatory variables. We assume that the mixing
Bayesian prediction of spatial count data using generalized linear mixed models
DEFF Research Database (Denmark)
Christensen, Ole Fredslund; Waagepetersen, Rasmus Plenge
2002-01-01
Spatial weed count data are modeled and predicted using a generalized linear mixed model combined with a Bayesian approach and Markov chain Monte Carlo. Informative priors for a data set with sparse sampling are elicited using a previously collected data set with extensive sampling. Furthermore, ...
Bayesian prediction of spatial count data using generalized linear mixed models
DEFF Research Database (Denmark)
Christensen, Ole Fredslund; Waagepetersen, Rasmus Plenge
2002-01-01
Spatial weed count data are modeled and predicted using a generalized linear mixed model combined with a Bayesian approach and Markov chain Monte Carlo. Informative priors for a data set with sparse sampling are elicited using a previously collected data set with extensive sampling. Furthermore, we...
Statistical models and methods for reliability and survival analysis
Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Huber -Carol, Catherine; Limnios, Nikolaos; Gerville-Reache, Leo
2013-01-01
Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical
Generalized partially linear single-index model for zero-inflated count data.
Wang, Xiaoguang; Zhang, Jun; Yu, Liang; Yin, Guosheng
2015-02-28
Count data often arise in biomedical studies, while there could be a special feature with excessive zeros in the observed counts. The zero-inflated Poisson model provides a natural approach to accounting for the excessive zero counts. In the semiparametric framework, we propose a generalized partially linear single-index model for the mean of the Poisson component, the probability of zero, or both. We develop the estimation and inference procedure via a profile maximum likelihood method. Under some mild conditions, we establish the asymptotic properties of the profile likelihood estimators. The finite sample performance of the proposed method is demonstrated by simulation studies, and the new model is illustrated with a medical care dataset. Copyright © 2014 John Wiley & Sons, Ltd.
Geometric modeling in probability and statistics
Calin, Ovidiu
2014-01-01
This book covers topics of Informational Geometry, a field which deals with the differential geometric study of the manifold probability density functions. This is a field that is increasingly attracting the interest of researchers from many different areas of science, including mathematics, statistics, geometry, computer science, signal processing, physics and neuroscience. It is the authors’ hope that the present book will be a valuable reference for researchers and graduate students in one of the aforementioned fields. This textbook is a unified presentation of differential geometry and probability theory, and constitutes a text for a course directed at graduate or advanced undergraduate students interested in applications of differential geometry in probability and statistics. The book contains over 100 proposed exercises meant to help students deepen their understanding, and it is accompanied by software that is able to provide numerical computations of several information geometric objects. The reader...
Statistical Model Checking of Rich Models and Properties
DEFF Research Database (Denmark)
Poulsen, Danny Bøgsted
Software is in increasing fashion embedded within safety- and business critical processes of society. Errors in these embedded systems can lead to human casualties or severe monetary loss. Model checking technology has proven formal methods capable of finding and correcting errors in software...... motivates why existing model checking technology should be supplemented by new techniques. It also contains a brief introduction to probability theory and concepts covered by the six papers making up the second part. The first two papers are concerned with developing online monitoring techniques...... systems. The fifth paper shows how stochastic hybrid automata are useful for modelling biological systems and the final paper is concerned with showing how statistical model checking is efficiently distributed. In parallel with developing the theory contained in the papers, a substantial part of this work...
A statistical model of future human actions
International Nuclear Information System (INIS)
Woo, G.
1992-02-01
A critical review has been carried out of models of future human actions during the long term post-closure period of a radioactive waste repository. Various Markov models have been considered as alternatives to the standard Poisson model, and the problems of parameterisation have been addressed. Where the simplistic Poisson model unduly exaggerates the intrusion risk, some form of Markov model may have to be introduced. This situation may well arise for shallow repositories, but it is less likely for deep repositories. Recommendations are made for a practical implementation of a computer based model and its associated database. (Author)
Vandergoes, Marcus J.; Howarth, Jamie D.; Dunbar, Gavin B.; Turnbull, Jocelyn C.; Roop, Heidi A.; Levy, Richard H.; Li, Xun; Prior, Christine; Norris, Margaret; Keller, Liz D.; Baisden, W. Troy; Ditchburn, Robert; Fitzsimons, Sean J.; Bronk Ramsey, Christopher
2018-05-01
Annually resolved (varved) lake sequences are important palaeoenvironmental archives as they offer a direct incremental dating technique for high-frequency reconstruction of environmental and climate change. Despite the importance of these records, establishing a robust chronology and quantifying its precision and accuracy (estimations of error) remains an essential but challenging component of their development. We outline an approach for building reliable independent chronologies, testing the accuracy of layer counts and integrating all chronological uncertainties to provide quantitative age and error estimates for varved lake sequences. The approach incorporates (1) layer counts and estimates of counting precision; (2) radiometric and biostratigrapic dating techniques to derive independent chronology; and (3) the application of Bayesian age modelling to produce an integrated age model. This approach is applied to a case study of an annually resolved sediment record from Lake Ohau, New Zealand. The most robust age model provides an average error of 72 years across the whole depth range. This represents a fractional uncertainty of ∼5%, higher than the <3% quoted for most published varve records. However, the age model and reported uncertainty represent the best fit between layer counts and independent chronology and the uncertainties account for both layer counting precision and the chronological accuracy of the layer counts. This integrated approach provides a more representative estimate of age uncertainty and therefore represents a statistically more robust chronology.
Statistical models of shape optimisation and evaluation
Davies, Rhodri; Taylor, Chris
2014-01-01
Deformable shape models have wide application in computer vision and biomedical image analysis. This book addresses a key issue in shape modelling: establishment of a meaningful correspondence between a set of shapes. Full implementation details are provided.
Enhanced surrogate models for statistical design exploiting space mapping technology
DEFF Research Database (Denmark)
Koziel, Slawek; Bandler, John W.; Mohamed, Achmed S.
2005-01-01
We present advances in microwave and RF device modeling exploiting Space Mapping (SM) technology. We propose new SM modeling formulations utilizing input mappings, output mappings, frequency scaling and quadratic approximations. Our aim is to enhance circuit models for statistical analysis...
Statistical Tests for Mixed Linear Models
Khuri, André I; Sinha, Bimal K
2011-01-01
An advanced discussion of linear models with mixed or random effects. In recent years a breakthrough has occurred in our ability to draw inferences from exact and optimum tests of variance component models, generating much research activity that relies on linear models with mixed and random effects. This volume covers the most important research of the past decade as well as the latest developments in hypothesis testing. It compiles all currently available results in the area of exact and optimum tests for variance component models and offers the only comprehensive treatment for these models a
Hornbrook, Mark C; Goshen, Ran; Choman, Eran; O'Keeffe-Rosetti, Maureen; Kinar, Yaron; Liles, Elizabeth G; Rust, Kristal C
2017-10-01
Machine learning tools identify patients with blood counts indicating greater likelihood of colorectal cancer and warranting colonoscopy referral. To validate a machine learning colorectal cancer detection model on a US community-based insured adult population. Eligible colorectal cancer cases (439 females, 461 males) with complete blood counts before diagnosis were identified from Kaiser Permanente Northwest Region's Tumor Registry. Control patients (n = 9108) were randomly selected from KPNW's population who had no cancers, received at ≥1 blood count, had continuous enrollment from 180 days prior to the blood count through 24 months after the count, and were aged 40-89. For each control, one blood count was randomly selected as the pseudo-colorectal cancer diagnosis date for matching to cases, and assigned a "calendar year" based on the count date. For each calendar year, 18 controls were randomly selected to match the general enrollment's 10-year age groups and lengths of continuous enrollment. Prediction performance was evaluated by area under the curve, specificity, and odds ratios. Area under the receiver operating characteristics curve for detecting colorectal cancer was 0.80 ± 0.01. At 99% specificity, the odds ratio for association of a high-risk detection score with colorectal cancer was 34.7 (95% CI 28.9-40.4). The detection model had the highest accuracy in identifying right-sided colorectal cancers. ColonFlag ® identifies individuals with tenfold higher risk of undiagnosed colorectal cancer at curable stages (0/I/II), flags colorectal tumors 180-360 days prior to usual clinical diagnosis, and is more accurate at identifying right-sided (compared to left-sided) colorectal cancers.
Statistical image processing and multidimensional modeling
Fieguth, Paul
2010-01-01
Images are all around us! The proliferation of low-cost, high-quality imaging devices has led to an explosion in acquired images. When these images are acquired from a microscope, telescope, satellite, or medical imaging device, there is a statistical image processing task: the inference of something - an artery, a road, a DNA marker, an oil spill - from imagery, possibly noisy, blurry, or incomplete. A great many textbooks have been written on image processing. However this book does not so much focus on images, per se, but rather on spatial data sets, with one or more measurements taken over
Energy Technology Data Exchange (ETDEWEB)
Geist, William H. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2015-12-01
This set of slides begins by giving background and a review of neutron counting; three attributes of a verification item are discussed: ^{240}Pu_{eff} mass; α, the ratio of (α,n) neutrons to spontaneous fission neutrons; and leakage multiplication. It then takes up neutron detector systems – theory & concepts (coincidence counting, moderation, die-away time); detector systems – some important details (deadtime, corrections); introduction to multiplicity counting; multiplicity electronics and example distributions; singles, doubles, and triples from measured multiplicity distributions; and the point model: multiplicity mathematics.
Poisson regression for modeling count and frequency outcomes in trauma research.
Gagnon, David R; Doron-LaMarca, Susan; Bell, Margret; O'Farrell, Timothy J; Taft, Casey T
2008-10-01
The authors describe how the Poisson regression method for analyzing count or frequency outcome variables can be applied in trauma studies. The outcome of interest in trauma research may represent a count of the number of incidents of behavior occurring in a given time interval, such as acts of physical aggression or substance abuse. Traditional linear regression approaches assume a normally distributed outcome variable with equal variances over the range of predictor variables, and may not be optimal for modeling count outcomes. An application of Poisson regression is presented using data from a study of intimate partner aggression among male patients in an alcohol treatment program and their female partners. Results of Poisson regression and linear regression models are compared.
A Realism-Based View on Counts in OMOP's Common Data Model.
Ceusters, Werner; Blaisure, Jonathan
2017-01-01
Correctly counting entities is a requirement for analytics tools to function appropriately. The Observational Medical Outcomes Partnership's (OMOP) Common Data Model (CDM) specifications were examined to assess the extent to which counting in OMOP CDM compatible data repositories would work as expected. To that end, constructs (tables, fields and attributes) defined in the OMOP CDM as well as cardinality constraints and other business rules found in its documentation and related literature were compared to the types of entities and axioms proposed in realism-based ontologies. It was found that not only the model itself, but also a proposed standard algorithm for computing condition eras may lead to erroneous counting of several sorts of entities.
Statistical modeling and extrapolation of carcinogenesis data
International Nuclear Information System (INIS)
Krewski, D.; Murdoch, D.; Dewanji, A.
1986-01-01
Mathematical models of carcinogenesis are reviewed, including pharmacokinetic models for metabolic activation of carcinogenic substances. Maximum likelihood procedures for fitting these models to epidemiological data are discussed, including situations where the time to tumor occurrence is unobservable. The plausibility of different possible shapes of the dose response curve at low doses is examined, and a robust method for linear extrapolation to low doses is proposed and applied to epidemiological data on radiation carcinogenesis
Statistical Model Selection for TID Hardness Assurance
Ladbury, R.; Gorelick, J. L.; McClure, S.
2010-01-01
Radiation Hardness Assurance (RHA) methodologies against Total Ionizing Dose (TID) degradation impose rigorous statistical treatments for data from a part's Radiation Lot Acceptance Test (RLAT) and/or its historical performance. However, no similar methods exist for using "similarity" data - that is, data for similar parts fabricated in the same process as the part under qualification. This is despite the greater difficulty and potential risk in interpreting of similarity data. In this work, we develop methods to disentangle part-to-part, lot-to-lot and part-type-to-part-type variation. The methods we develop apply not just for qualification decisions, but also for quality control and detection of process changes and other "out-of-family" behavior. We begin by discussing the data used in ·the study and the challenges of developing a statistic providing a meaningful measure of degradation across multiple part types, each with its own performance specifications. We then develop analysis techniques and apply them to the different data sets.
Multivariate statistical modelling based on generalized linear models
Fahrmeir, Ludwig
1994-01-01
This book is concerned with the use of generalized linear models for univariate and multivariate regression analysis. Its emphasis is to provide a detailed introductory survey of the subject based on the analysis of real data drawn from a variety of subjects including the biological sciences, economics, and the social sciences. Where possible, technical details and proofs are deferred to an appendix in order to provide an accessible account for non-experts. Topics covered include: models for multi-categorical responses, model checking, time series and longitudinal data, random effects models, and state-space models. Throughout, the authors have taken great pains to discuss the underlying theoretical ideas in ways that relate well to the data at hand. As a result, numerous researchers whose work relies on the use of these models will find this an invaluable account to have on their desks. "The basic aim of the authors is to bring together and review a large part of recent advances in statistical modelling of m...
Directory of Open Access Journals (Sweden)
Simone Fiori
2007-07-01
Full Text Available Bivariate statistical modeling from incomplete data is a useful statistical tool that allows to discover the model underlying two data sets when the data in the two sets do not correspond in size nor in ordering. Such situation may occur when the sizes of the two data sets do not match (i.e., there are Ã‚Â“holesÃ‚Â” in the data or when the data sets have been acquired independently. Also, statistical modeling is useful when the amount of available data is enough to show relevant statistical features of the phenomenon underlying the data. We propose to tackle the problem of statistical modeling via a neural (nonlinear system that is able to match its input-output statistic to the statistic of the available data sets. A key point of the new implementation proposed here is that it is based on look-up-table (LUT neural systems, which guarantee a computationally advantageous way of implementing neural systems. A number of numerical experiments, performed on both synthetic and real-world data sets, illustrate the features of the proposed modeling procedure.
Statistical Modelling of Extreme Rainfall in Taiwan
L-F. Chu (Lan-Fen); M.J. McAleer (Michael); C-C. Chang (Ching-Chung)
2012-01-01
textabstractIn this paper, the annual maximum daily rainfall data from 1961 to 2010 are modelled for 18 stations in Taiwan. We fit the rainfall data with stationary and non-stationary generalized extreme value distributions (GEV), and estimate their future behaviour based on the best fitting model.
Statistical Modelling of Extreme Rainfall in Taiwan
L. Chu (LanFen); M.J. McAleer (Michael); C-H. Chang (Chu-Hsiang)
2013-01-01
textabstractIn this paper, the annual maximum daily rainfall data from 1961 to 2010 are modelled for 18 stations in Taiwan. We fit the rainfall data with stationary and non-stationary generalized extreme value distributions (GEV), and estimate their future behaviour based on the best fitting model.
BAYESIAN SPATIAL-TEMPORAL MODELING OF ECOLOGICAL ZERO-INFLATED COUNT DATA.
Wang, Xia; Chen, Ming-Hui; Kuo, Rita C; Dey, Dipak K
2015-01-01
A Bayesian hierarchical model is developed for count data with spatial and temporal correlations as well as excessive zeros, uneven sampling intensities, and inference on missing spots. Our contribution is to develop a model on zero-inflated count data that provides flexibility in modeling spatial patterns in a dynamic manner and also improves the computational efficiency via dimension reduction. The proposed methodology is of particular importance for studying species presence and abundance in the field of ecological sciences. The proposed model is employed in the analysis of the survey data by the Northeast Fisheries Sciences Center (NEFSC) for estimation and prediction of the Atlantic cod in the Gulf of Maine - Georges Bank region. Model comparisons based on the deviance information criterion and the log predictive score show the improvement by the proposed spatial-temporal model.
Directory of Open Access Journals (Sweden)
Cheol-Eung Lee
2017-02-01
Full Text Available Several natural disasters occur because of torrential rainfalls. The change in global climate most likely increases the occurrences of such downpours. Hence, it is necessary to investigate the characteristics of the torrential rainfall events in order to introduce effective measures for mitigating disasters such as urban floods and landslides. However, one of the major problems is evaluating the number of torrential rainfall events from a statistical viewpoint. If the number of torrential rainfall occurrences during a month is considered as count data, their frequency distribution could be identified using a probability distribution. Generally, the number of torrential rainfall occurrences has been analyzed using the Poisson distribution (POI or the Generalized Poisson Distribution (GPD. However, it was reported that POI and GPD often overestimated or underestimated the observed count data when additional or fewer zeros were included. Hence, in this study, a zero-inflated model concept was applied to solve this problem existing in the conventional models. Zero-Inflated Poisson (ZIP model, Zero-Inflated Generalized Poisson (ZIGP model, and the Bayesian ZIGP model have often been applied to fit the count data having additional or fewer zeros. However, the applications of these models in water resource management have been very limited despite their efficiency and accuracy. The five models, namely, POI, GPD, ZIP, ZIGP, and Bayesian ZIGP, were applied to the torrential rainfall data having additional zeros obtained from two rain gauges in South Korea, and their applicability was examined in this study. In particular, the informative prior distributions evaluated via the empirical Bayes method using ten rain gauges were developed in the Bayesian ZIGP model. Finally, it was suggested to avoid using the POI and GPD models to fit the frequency of torrential rainfall data. In addition, it was concluded that the Bayesian ZIGP model used in this study
Statistical modelling of traffic safety development
DEFF Research Database (Denmark)
Christens, Peter
2004-01-01
Road safety is a major concern for society and individuals. Although road safety has improved in recent years, the number of road fatalities is still unacceptably high. In 2000, road accidents killed over 40,000 people in the European Union and injured more than 1.7 million. In 2001 in Denmark...... there were 6861 injury trafficc accidents reported by the police, resulting in 4519 minor injuries, 3946 serious injuries, and 431 fatalities. The general purpose of the research was to improve the insight into aggregated road safety methodology in Denmark. The aim was to analyse advanced statistical methods......, that were designed to study developments over time, including effects of interventions. This aim has been achieved by investigating variations in aggregated Danish traffic accident series and by applying state of the art methodologies to specific case studies. The thesis comprises an introduction...
A Noise Robust Statistical Texture Model
DEFF Research Database (Denmark)
Hilger, Klaus Baggesen; Stegmann, Mikkel Bille; Larsen, Rasmus
2002-01-01
This paper presents a novel approach to the problem of obtaining a low dimensional representation of texture (pixel intensity) variation present in a training set after alignment using a Generalised Procrustes analysis.We extend the conventional analysis of training textures in the Active...... Appearance Models segmentation framework. This is accomplished by augmenting the model with an estimate of the covariance of the noise present in the training data. This results in a more compact model maximising the signal-to-noise ratio, thus favouring subspaces rich on signal, but low on noise....... Differences in the methods are illustrated on a set of left cardiac ventricles obtained using magnetic resonance imaging....
Statistical models for nuclear decay from evaporation to vaporization
Cole, A J
2000-01-01
Elements of equilibrium statistical mechanics: Introduction. Microstates and macrostates. Sub-systems and convolution. The Boltzmann distribution. Statistical mechanics and thermodynamics. The grand canonical ensemble. Equations of state for ideal and real gases. Pseudo-equilibrium. Statistical models of nuclear decay. Nuclear physics background: Introduction. Elements of the theory of nuclear reactions. Quantum mechanical description of scattering from a potential. Decay rates and widths. Level and state densities in atomic nuclei. Angular momentum in quantum mechanics. History of statistical
The demand of car rentals: a microeconometric approach with count models and survey data
Czech Academy of Sciences Publication Activity Database
Menezes, A. G.; Uzagalieva, Ainura
2013-01-01
Roč. 5, č. 1 (2013), s. 25-41 ISSN 1973-3909 Institutional support: RVO:67985998 Keywords : count data models * tourism * tax rates Subject RIV: AH - Economic s http://www.rofea.org/index.php?journal=journal&page=article&op=view&path%5B%5D=106
Hayslett, H T
1991-01-01
Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
Liu, Sijia; Sa, Ruhan; Maguire, Orla; Minderman, Hans; Chaudhary, Vipin
2015-03-01
Cytogenetic abnormalities are important diagnostic and prognostic criteria for acute myeloid leukemia (AML). A flow cytometry-based imaging approach for FISH in suspension (FISH-IS) was established that enables the automated analysis of several log-magnitude higher number of cells compared to the microscopy-based approaches. The rotational positioning can occur leading to discordance between spot count. As a solution of counting error from overlapping spots, in this study, a Gaussian Mixture Model based classification method is proposed. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) of GMM are used as global image features of this classification method. Via Random Forest classifier, the result shows that the proposed method is able to detect closely overlapping spots which cannot be separated by existing image segmentation based spot detection methods. The experiment results show that by the proposed method we can obtain a significant improvement in spot counting accuracy.
Introduction to statistical modelling: linear regression.
Lunt, Mark
2015-07-01
In many studies we wish to assess how a range of variables are associated with a particular outcome and also determine the strength of such relationships so that we can begin to understand how these factors relate to each other at a population level. Ultimately, we may also be interested in predicting the outcome from a series of predictive factors available at, say, a routine clinic visit. In a recent article in Rheumatology, Desai et al. did precisely that when they studied the prediction of hip and spine BMD from hand BMD and various demographic, lifestyle, disease and therapy variables in patients with RA. This article aims to introduce the statistical methodology that can be used in such a situation and explain the meaning of some of the terms employed. It will also outline some common pitfalls encountered when performing such analyses. © The Author 2013. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
DEFF Research Database (Denmark)
Denwood, M.J.; McKendrick, I.J.; Matthews, L.
that the notional type 1 error rate of the new statistical test is accurate. Power calculations demonstrate a power of only 65% with a sample size of 20 treatment and control animals, which increases to 69% with 40 control animals or 79% with 40 treatment animals. Discussion. The method proposed is simple...
A scan statistic for continuous data based on the normal probability model
Directory of Open Access Journals (Sweden)
Huang Lan
2009-10-01
Full Text Available Abstract Temporal, spatial and space-time scan statistics are commonly used to detect and evaluate the statistical significance of temporal and/or geographical disease clusters, without any prior assumptions on the location, time period or size of those clusters. Scan statistics are mostly used for count data, such as disease incidence or mortality. Sometimes there is an interest in looking for clusters with respect to a continuous variable, such as lead levels in children or low birth weight. For such continuous data, we present a scan statistic where the likelihood is calculated using the the normal probability model. It may also be used for other distributions, while still maintaining the correct alpha level. In an application of the new method, we look for geographical clusters of low birth weight in New York City.
A Statistical Model for Energy Intensity
Directory of Open Access Journals (Sweden)
Marjaneh Issapour
2012-12-01
Full Text Available A promising approach to improve scientific literacy in regards to global warming and climate change is using a simulation as part of a science education course. The simulation needs to employ scientific analysis of actual data from internationally accepted and reputable databases to demonstrate the reality of the current climate change situation. One of the most important criteria for using a simulation in a science education course is the fidelity of the model. The realism of the events and consequences modeled in the simulation is significant as well. Therefore, all underlying equations and algorithms used in the simulation must have real-world scientific basis. The "Energy Choices" simulation is one such simulation. The focus of this paper is the development of a mathematical model for "Energy Intensity" as a part of the overall system dynamics in "Energy Choices" simulation. This model will define the "Energy Intensity" as a function of other independent variables that can be manipulated by users of the simulation. The relationship discovered by this research will be applied to an algorithm in the "Energy Choices" simulation.
Latent domain models for statistical machine translation
Hoàng, C.
2017-01-01
A data-driven approach to model translation suffers from the data mismatch problem and demands domain adaptation techniques. Given parallel training data originating from a specific domain, training an MT system on the data would result in a rather suboptimal translation for other domains. But does
Statistical modelling of fine red wine production
Directory of Open Access Journals (Sweden)
María Rosa Castro
2010-01-01
Full Text Available Producing wine is a very important economic activity in the province of San Juan in Argentina; it is therefore most important to predict production regarding the quantity of raw material needed. This work was aimed at obtaining a model relating kilograms of crushed grape to the litres of wine so produced. Such model will be used for predicting precise future values and confidence intervals for determined quantities of crushed grapes. Data from a vineyard in the province of San Juan was thus used in this work. The sampling coefficient of correlation was calculated and a dispersion diagram was then constructed; this indicated a li- neal relationship between the litres of wine obtained and the kilograms of crushed grape. Two lineal models were then adopted and variance analysis was carried out because the data came from normal populations having the same variance. The most appropriate model was obtained from this analysis; it was validated with experimental values, a good approach being obtained.
Behavioral and statistical models of educational inequality
DEFF Research Database (Denmark)
Holm, Anders; Breen, Richard
2016-01-01
This paper addresses the question of how students and their families make educational decisions. We describe three types of behavioral model that might underlie decision-making and we show that they have consequences for what decisions are made. Our study thus has policy implications if we wish...
Statistical model semiquantitatively approximates arabinoxylooligosaccharides' structural diversity
DEFF Research Database (Denmark)
Dotsenko, Gleb; Nielsen, Michael Krogsgaard; Lange, Lene
2016-01-01
(wheat flour arabinoxylan (arabinose/xylose, A/X = 0.47); grass arabinoxylan (A/X = 0.24); wheat straw arabinoxylan (A/X = 0.15); and hydrothermally pretreated wheat straw arabinoxylan (A/X = 0.05)), is semiquantitatively approximated using the proposed model. The suggested approach can be applied...
A STATISTICAL MODEL FOR STOCK ASSESSMENT OF ...
African Journals Online (AJOL)
Assessment of the status of southern bluefin tuna (SBT) by Australia and Japan has used a method (ADAPT) that imposes a number of structural restrictions, and is ... over time within the bounds of specific structure, and (3) autocorrelation in recruitment processes is considered within the likelihood framework of the model.
Modeling statistical properties of written text.
Directory of Open Access Journals (Sweden)
M Angeles Serrano
Full Text Available Written text is one of the fundamental manifestations of human language, and the study of its universal regularities can give clues about how our brains process information and how we, as a society, organize and share it. Among these regularities, only Zipf's law has been explored in depth. Other basic properties, such as the existence of bursts of rare words in specific documents, have only been studied independently of each other and mainly by descriptive models. As a consequence, there is a lack of understanding of linguistic processes as complex emergent phenomena. Beyond Zipf's law for word frequencies, here we focus on burstiness, Heaps' law describing the sublinear growth of vocabulary size with the length of a document, and the topicality of document collections, which encode correlations within and across documents absent in random null models. We introduce and validate a generative model that explains the simultaneous emergence of all these patterns from simple rules. As a result, we find a connection between the bursty nature of rare words and the topical organization of texts and identify dynamic word ranking and memory across documents as key mechanisms explaining the non trivial organization of written text. Our research can have broad implications and practical applications in computer science, cognitive science and linguistics.
Advanced data analysis in neuroscience integrating statistical and computational models
Durstewitz, Daniel
2017-01-01
This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanat ory frameworks, but become powerfu...
Four shells atomic model to computer the counting efficiency of electron-capture nuclides
International Nuclear Information System (INIS)
Grau Malonda, A.; Fernandez Martinez, A.
1985-01-01
The present paper develops a four-shells atomic model in order to obtain the efficiency of detection in liquid scintillation courting, Mathematical expressions are given to calculate the probabilities of the 229 different atomic rearrangements so as the corresponding effective energies. This new model will permit the study of the influence of the different parameters upon the counting efficiency for nuclides of high atomic number. (Author) 7 refs
Domain analysis and modeling to improve comparability of health statistics.
Okada, M; Hashimoto, H; Ohida, T
2001-01-01
Health statistics is an essential element to improve the ability of managers of health institutions, healthcare researchers, policy makers, and health professionals to formulate appropriate course of reactions and to make decisions based on evidence. To ensure adequate health statistics, standards are of critical importance. A study on healthcare statistics domain analysis is underway in an effort to improve usability and comparability of health statistics. The ongoing study focuses on structuring the domain knowledge and making the knowledge explicit with a data element dictionary being the core. Supplemental to the dictionary are a domain term list, a terminology dictionary, and a data model to help organize the concepts constituting the health statistics domain.
Tibaldi, C; Vasile, E; Bernardini, I; Orlandini, C; Andreuccetti, M; Falcone, A
2008-10-01
We aimed to investigate the prognostic significance of several baseline variables in stage IIIB-IV non-small cell lung cancer to create a model based on independent prognostic factors. A total of 320 patients were treated with last generation chemotherapy regimens. The majority of patients received treatment with cisplatin + gemcitabine or gemcitabine alone if older than 70 years or with an ECOG performance status (PS) = 2. Performance status of 2, squamous histology, number of metastatic sites >2, presence of bone, brain, liver and contralateral lung metastases and elevated leukocyte count in peripheral blood were all statistically significant prognostic factors in univariate analyses whereas the other tested variables (sex, stage, age, presence of adrenal gland and skin metastases) were not. Subsequently, a multivariate Cox's regression analysis identified PS 2 (P leukocyte count (P Leukocyte count resulted the stronger factor after performance status. If prospectly validated, the proposed prognostic model could be useful to stratify performance status 2 patients in specific future trials.
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Using Count Data and Ordered Models in National Forest Recreation Demand Analysis
Simões, Paula; Barata, Eduardo; Cruz, Luis
2013-11-01
This research addresses the need to improve our knowledge on the demand for national forests for recreation and offers an in-depth data analysis supported by the complementary use of count data and ordered models. From a policy-making perspective, while count data models enable the estimation of monetary welfare measures, ordered models allow for the wider use of the database and provide a more flexible analysis of data. The main purpose of this article is to analyse the individual forest recreation demand and to derive a measure of its current use value. To allow a more complete analysis of the forest recreation demand structure the econometric approach supplements the use of count data models with ordered category models using data obtained by means of an on-site survey in the Bussaco National Forest (Portugal). Overall, both models reveal that travel cost and substitute prices are important explanatory variables, visits are a normal good and demographic variables seem to have no influence on demand. In particular, estimated price and income elasticities of demand are quite low. Accordingly, it is possible to argue that travel cost (price) in isolation may be expected to have a low impact on visitation levels.
Latent segmentation based count models: Analysis of bicycle safety in Montreal and Toronto.
Yasmin, Shamsunnahar; Eluru, Naveen
2016-10-01
The study contributes to literature on bicycle safety by building on the traditional count regression models to investigate factors affecting bicycle crashes at the Traffic Analysis Zone (TAZ) level. TAZ is a traffic related geographic entity which is most frequently used as spatial unit for macroscopic crash risk analysis. In conventional count models, the impact of exogenous factors is restricted to be the same across the entire region. However, it is possible that the influence of exogenous factors might vary across different TAZs. To accommodate for the potential variation in the impact of exogenous factors we formulate latent segmentation based count models. Specifically, we formulate and estimate latent segmentation based Poisson (LP) and latent segmentation based Negative Binomial (LNB) models to study bicycle crash counts. In our latent segmentation approach, we allow for more than two segments and also consider a large set of variables in segmentation and segment specific models. The formulated models are estimated using bicycle-motor vehicle crash data from the Island of Montreal and City of Toronto for the years 2006 through 2010. The TAZ level variables considered in our analysis include accessibility measures, exposure measures, sociodemographic characteristics, socioeconomic characteristics, road network characteristics and built environment. A policy analysis is also conducted to illustrate the applicability of the proposed model for planning purposes. This macro-level research would assist decision makers, transportation officials and community planners to make informed decisions to proactively improve bicycle safety - a prerequisite to promoting a culture of active transportation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Spectral statistics in particles-rotor model and cranking model
Zhou Xian Rong; Zhao En Guang; Guo Lu
2002-01-01
Spectral statistics for six particles in single-j and two-j model coupled with a deformed core are studied in the frames of particles-rotor model and cranking shell model. The nearest-neighbor-distribution of energy levels and spectral rigidity are studied as a function of the spin or cranking frequency, respectively. The results of single-j shell are compared with those in two-j case. The system becomes more regular when single-j space (i sub 1 sub 3 sub / sub 2) is replaced by two-j shell (g sub 7 sub / sub 2 + d sub 5 sub / sub 2), although the basis size of the configuration space is unchanged. However, the degree of chaoticity of the system changes slightly when configuration space is enlarged by extending single-j shell (i sub 1 sub 3 sub / sub 2) to two-j shell (i sub 1 sub 3 sub / sub 2 + g sub 9 sub / sub 2). Nuclear chaotic behavior is studied when authors take a two-body interaction as delta force and pairing interaction, respectively
Directory of Open Access Journals (Sweden)
Aarón Salinas-Rodríguez
2009-10-01
Full Text Available OBJETIVO: Describir algunos de los modelos estadísticos para el estudio de variables expresadas como un conteo en el contexto del uso de los servicios de salud. MATERIAL Y MÉTODOS: Con base en la Encuesta de Evaluación del Seguro Popular (2005-2006 se calculó el efecto del Seguro Popular sobre el número de consultas externas mediante el uso de los modelos de regresión Poisson, binomial negativo, binomial negativo cero-inflado y Hurdle binomial negativo. Se utilizó el criterio de información de Akaike (AIC para definir el mejor modelo. RESULTADOS: La mejor opción estadística para el análisis del uso de los servicios de salud resultó ser el modelo Hurdle, de acuerdo con sus presuposiciones y el valor del AIC. DISCUSIÓN: La modelación de variables de conteo requiere el empleo de modelos que incluyan una medición de la dispersión. Ante la presencia de exceso de ceros, el modelo Hurdle es una opción apropiada.OBJECTIVE: To describe some of the statistical models for the study of count variables in the context of the use of health services. MATERIAL AND METHODS: We used the Seguro Popular Evaluation Survey to estimate the effect of Seguro Popular on the frequency of use of outpatient health services, using Poisson regression models and negative binomial, zero-inflated negative binomial and the hurdle negative binomial models. We used the Akaike Information Criterion (AIC to define the best model. RESULTS: Results show that the best statistical approach to model the use of health services is the hurdle model, taking into account both the main theoretical assumptions and the statistical results of the AIC. DISCUSSION: The modelling of count data requires the application of statistical models to model data dispersion; in the presence of an excess of zeros, the hurdle model is an appropriate statistical option.
Statistical modelling in biostatistics and bioinformatics selected papers
Peng, Defen
2014-01-01
This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...
Functional summary statistics for the Johnson-Mehl model
DEFF Research Database (Denmark)
Møller, Jesper; Ghorbani, Mohammad
The Johnson-Mehl germination-growth model is a spatio-temporal point process model which among other things have been used for the description of neurotransmitters datasets. However, for such datasets parametric Johnson-Mehl models fitted by maximum likelihood have yet not been evaluated by means...... of functional summary statistics. This paper therefore invents four functional summary statistics adapted to the Johnson-Mehl model, with two of them based on the second-order properties and the other two on the nuclei-boundary distances for the associated Johnson-Mehl tessellation. The functional summary...... statistics theoretical properties are investigated, non-parametric estimators are suggested, and their usefulness for model checking is examined in a simulation study. The functional summary statistics are also used for checking fitted parametric Johnson-Mehl models for a neurotransmitters dataset....
Fitting statistical models in bivariate allometry.
Packard, Gary C; Birchard, Geoffrey F; Boardman, Thomas J
2011-08-01
Several attempts have been made in recent years to formulate a general explanation for what appear to be recurring patterns of allometric variation in morphology, physiology, and ecology of both plants and animals (e.g. the Metabolic Theory of Ecology, the Allometric Cascade, the Metabolic-Level Boundaries hypothesis). However, published estimates for parameters in allometric equations often are inaccurate, owing to undetected bias introduced by the traditional method for fitting lines to empirical data. The traditional method entails fitting a straight line to logarithmic transformations of the original data and then back-transforming the resulting equation to the arithmetic scale. Because of fundamental changes in distributions attending transformation of predictor and response variables, the traditional practice may cause influential outliers to go undetected, and it may result in an underparameterized model being fitted to the data. Also, substantial bias may be introduced by the insidious rotational distortion that accompanies regression analyses performed on logarithms. Consequently, the aforementioned patterns of allometric variation may be illusions, and the theoretical explanations may be wide of the mark. Problems attending the traditional procedure can be largely avoided in future research simply by performing preliminary analyses on arithmetic values and by validating fitted equations in the arithmetic domain. The goal of most allometric research is to characterize relationships between biological variables and body size, and this is done most effectively with data expressed in the units of measurement. Back-transforming from a straight line fitted to logarithms is not a generally reliable way to estimate an allometric equation in the original scale. © 2010 The Authors. Biological Reviews © 2010 Cambridge Philosophical Society.
Probabilistic statistical modeling of air pollution from vehicles
Adikanova, Saltanat; Malgazhdarov, Yerzhan A.; Madiyarov, Muratkan N.; Temirbekov, Nurlan M.
2017-09-01
The aim of the work is to create a probabilistic-statistical mathematical model for the distribution of emissions from vehicles. In this article, it is proposed to use the probabilistic and statistical approach for modeling the distribution of harmful impurities in the atmosphere from vehicles using the example of the Ust-Kamenogorsk city. Using a simplified methodology of stochastic modeling, it is possible to construct effective numerical computational algorithms that significantly reduce the amount of computation without losing their accuracy.
van den Broek, J.
2015-01-01
In this article a non-homogeneous martingale model is proposed to model volatility in a stochastic time series of count data with constant mean. The approach is derived from a general non- homogeneous birth-and-death process, in which the mean and the variance of population size can vary as a
Casati, Michele
2014-05-01
The assertion that solar activity may play a significant role in the trigger of large volcanic eruptions is, and has been discussed by many geophysicists. Numerous scientific papers have established a possible correlation between these events and the electromagnetic coupling between the Earth and the Sun, but none of them has been able to highlight a possible statistically significant relationship between large volcanic eruptions and any of the series, such as geomagnetic activity, solar wind, sunspots number. In our research, we compare the 148 volcanic eruptions with index VEI4, the major 37 historical volcanic eruptions equal to or greater than index VEI5, recorded from 1610 to 2012 , with its sunspots number. Staring, as the threshold value, a monthly sunspot number of 46 (recorded during the great eruption of Krakatoa VEI6 historical index, August 1883), we note some possible relationships and conduct a statistical test. • Of the historical 31 large volcanic eruptions with index VEI5+, recorded between 1610 and 1955, 29 of these were recorded when the SSN<46. The remaining 2 eruptions were not recorded when the SSN<46, but rather during solar maxima of the solar cycle of the year 1739 and in the solar cycle No. 14 (Shikotsu eruption of 1739 and Ksudach 1907). • Of the historical 8 large volcanic eruptions with index VEI6+, recorded from 1610 to the present, 7 of these were recorded with SSN<46 and more specifically, within the three large solar minima known : Maunder (1645-1710), Dalton (1790-1830) and during the solar minimums occurred between 1880 and 1920. As the only exception, we note the eruption of Pinatubo of June 1991, recorded in the solar maximum of cycle 22. • Of the historical 6 major volcanic eruptions with index VEI5+, recorded after 1955, 5 of these were not recorded during periods of low solar activity, but rather during solar maxima, of the cycles 19,21 and 22. The significant tests, conducted with the chi-square χ ² = 7,782, detect a
International Nuclear Information System (INIS)
2005-01-01
For the years 2004 and 2005 the figures shown in the tables of Energy Review are partly preliminary. The annual statistics published in Energy Review are presented in more detail in a publication called Energy Statistics that comes out yearly. Energy Statistics also includes historical time-series over a longer period of time (see e.g. Energy Statistics, Statistics Finland, Helsinki 2004.) The applied energy units and conversion coefficients are shown in the back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes, precautionary stock fees and oil pollution fees
Mixed deterministic statistical modelling of regional ozone air pollution
Kalenderski, Stoitchko
2011-03-17
We develop a physically motivated statistical model for regional ozone air pollution by separating the ground-level pollutant concentration field into three components, namely: transport, local production and large-scale mean trend mostly dominated by emission rates. The model is novel in the field of environmental spatial statistics in that it is a combined deterministic-statistical model, which gives a new perspective to the modelling of air pollution. The model is presented in a Bayesian hierarchical formalism, and explicitly accounts for advection of pollutants, using the advection equation. We apply the model to a specific case of regional ozone pollution-the Lower Fraser valley of British Columbia, Canada. As a predictive tool, we demonstrate that the model vastly outperforms existing, simpler modelling approaches. Our study highlights the importance of simultaneously considering different aspects of an air pollution problem as well as taking into account the physical bases that govern the processes of interest. © 2011 John Wiley & Sons, Ltd..
Bouwman, Aniek C; Hayes, Ben J; Calus, Mario P L
2017-10-30
Genomic evaluation is used to predict direct genomic values (DGV) for selection candidates in breeding programs, but also to estimate allele substitution effects (ASE) of single nucleotide polymorphisms (SNPs). Scaling of allele counts influences the estimated ASE, because scaling of allele counts results in less shrinkage towards the mean for low minor allele frequency (MAF) variants. Scaling may become relevant for estimating ASE as more low MAF variants will be used in genomic evaluations. We show the impact of scaling on estimates of ASE using real data and a theoretical framework, and in terms of power, model fit and predictive performance. In a dairy cattle dataset with 630 K SNP genotypes, the correlation between DGV for stature from a random regression model using centered allele counts (RRc) and centered and scaled allele counts (RRcs) was 0.9988, whereas the overall correlation between ASE using RRc and RRcs was 0.27. The main difference in ASE between both methods was found for SNPs with a MAF lower than 0.01. Both the ratio (ASE from RRcs/ASE from RRc) and the regression coefficient (regression of ASE from RRcs on ASE from RRc) were much higher than 1 for low MAF SNPs. Derived equations showed that scenarios with a high heritability, a large number of individuals and a small number of variants have lower ratios between ASE from RRc and RRcs. We also investigated the optimal scaling parameter [from - 1 (RRcs) to 0 (RRc) in steps of 0.1] in the bovine stature dataset. We found that the log-likelihood was maximized with a scaling parameter of - 0.8, while the mean squared error of prediction was minimized with a scaling parameter of - 1, i.e., RRcs. Large differences in estimated ASE were observed for low MAF SNPs when allele counts were scaled or not scaled because there is less shrinkage towards the mean for scaled allele counts. We derived a theoretical framework that shows that the difference in ASE due to shrinkage is heavily influenced by the
A count rate model for PET and its application to an LSO HR PLUS scanner
International Nuclear Information System (INIS)
Moisan, C.; Rogers, J.G.; Douglas, J.L.
1996-10-01
We present a count rate model for PET. Considering a standard 20 x 20 cm phantom in the field-of-view of a cylindrical septaless tomograph, the model computes the acceptance to prompt and random events from simple geometric considerations. Dead time factors at all stages of a typical event acquisition architecture are calculated from specified processing clock cycles. Validations of the model's predictions against the measured performances of the ECAT-953B and the EXACT HR PLUS are presented. The model is then used to investigate the benefit of using detectors made of LSO in the EXACT HR PLUS scanner geometry. The results indicate that in replacing BGO by the faster LSO, one can count on an increase of the peak noise-equivalent-count rate by a factor 2.2. This gain will be achieved by using a 5 nsec coincidence window, buckets operating on 128 nsec clock cycle, and front-end data acquisition that can sustain a total rate of 2.9 MHz. (authors)
International Nuclear Information System (INIS)
2000-01-01
For the year 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g., Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-March 2000, Energy exports by recipient country in January-March 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
1999-01-01
For the year 1998 and the year 1999, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-June 1999, Energy exports by recipient country in January-June 1999, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
2001-01-01
For the year 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions from the use of fossil fuels, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in 2000, Energy exports by recipient country in 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
Kolmogorov complexity, pseudorandom generators and statistical models testing
Czech Academy of Sciences Publication Activity Database
Šindelář, Jan; Boček, Pavel
2002-01-01
Roč. 38, č. 6 (2002), s. 747-759 ISSN 0023-5954 R&D Projects: GA ČR GA102/99/1564 Institutional research plan: CEZ:AV0Z1075907 Keywords : Kolmogorov complexity * pseudorandom generators * statistical models testing Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.341, year: 2002
Role of scaling in the statistical modelling of finance
Indian Academy of Sciences (India)
Economics and mathematical finance are multidisciplinary fields in which the ten- dency of statistical physicists to focus on universal laws has been criticized some- ..... is coherent and catches the essential statistical features of a long index history. A very important test for the proposed model concerns the scaling of the ...
Assessing risk factors for dental caries: a statistical modeling approach.
Trottini, Mario; Bossù, Maurizio; Corridore, Denise; Ierardo, Gaetano; Luzzi, Valeria; Saccucci, Matteo; Polimeni, Antonella
2015-01-01
The problem of identifying potential determinants and predictors of dental caries is of key importance in caries research and it has received considerable attention in the scientific literature. From the methodological side, a broad range of statistical models is currently available to analyze dental caries indices (DMFT, dmfs, etc.). These models have been applied in several studies to investigate the impact of different risk factors on the cumulative severity of dental caries experience. However, in most of the cases (i) these studies focus on a very specific subset of risk factors; and (ii) in the statistical modeling only few candidate models are considered and model selection is at best only marginally addressed. As a result, our understanding of the robustness of the statistical inferences with respect to the choice of the model is very limited; the richness of the set of statistical models available for analysis in only marginally exploited; and inferences could be biased due the omission of potentially important confounding variables in the model's specification. In this paper we argue that these limitations can be overcome considering a general class of candidate models and carefully exploring the model space using standard model selection criteria and measures of global fit and predictive performance of the candidate models. Strengths and limitations of the proposed approach are illustrated with a real data set. In our illustration the model space contains more than 2.6 million models, which require inferences to be adjusted for 'optimism'.
Directory of Open Access Journals (Sweden)
Xavier A. Harrison
2014-10-01
Full Text Available Overdispersion is common in models of count data in ecology and evolutionary biology, and can occur due to missing covariates, non-independent (aggregated data, or an excess frequency of zeroes (zero-inflation. Accounting for overdispersion in such models is vital, as failing to do so can lead to biased parameter estimates, and false conclusions regarding hypotheses of interest. Observation-level random effects (OLRE, where each data point receives a unique level of a random effect that models the extra-Poisson variation present in the data, are commonly employed to cope with overdispersion in count data. However studies investigating the efficacy of observation-level random effects as a means to deal with overdispersion are scarce. Here I use simulations to show that in cases where overdispersion is caused by random extra-Poisson noise, or aggregation in the count data, observation-level random effects yield more accurate parameter estimates compared to when overdispersion is simply ignored. Conversely, OLRE fail to reduce bias in zero-inflated data, and in some cases increase bias at high levels of overdispersion. There was a positive relationship between the magnitude of overdispersion and the degree of bias in parameter estimates. Critically, the simulations reveal that failing to account for overdispersion in mixed models can erroneously inflate measures of explained variance (r2, which may lead to researchers overestimating the predictive power of variables of interest. This work suggests use of observation-level random effects provides a simple and robust means to account for overdispersion in count data, but also that their ability to minimise bias is not uniform across all types of overdispersion and must be applied judiciously.
The use of plant models in deep learning: an application to leaf counting in rosette plants
Ubbens, Jordan; Cieslak, Mikolaj; Prusinkiewicz, Przemyslaw; Stavness, Ian
2018-01-01
Deep learning presents many opportunities for image-based plant phenotyping. Here we consider the capability of deep convolutional neural networks to perform the leaf counting task. Deep learning techniques typically require large and diverse datasets to learn generalizable models without providing a priori an engineered algorithm for performing the task. This requirement is challenging, however, for applications in the plant phenotyping field, where available datasets are often small and the...
International Nuclear Information System (INIS)
Casas Galiano, G.; Grau Malonda, A.
1994-01-01
An intelligent computer program has been developed to obtain the mathematical formulae to compute the probabilities and reduced energies of the different atomic rearrangement pathways following electron-capture decay. Creation and annihilation operators for Auger and X processes have been introduced. Taking into account the symmetries associated with each process, 262 different pathways were obtained. This model allows us to obtain the influence of the M-electron-capture in the counting efficiency when the atomic number of the nuclide is high
Accounting for Zero Inflation of Mussel Parasite Counts Using Discrete Regression Models
Directory of Open Access Journals (Sweden)
Emel Çankaya
2017-06-01
Full Text Available In many ecological applications, the absences of species are inevitable due to either detection faults in samples or uninhabitable conditions for their existence, resulting in high number of zero counts or abundance. Usual practice for modelling such data is regression modelling of log(abundance+1 and it is well know that resulting model is inadequate for prediction purposes. New discrete models accounting for zero abundances, namely zero-inflated regression (ZIP and ZINB, Hurdle-Poisson (HP and Hurdle-Negative Binomial (HNB amongst others are widely preferred to the classical regression models. Due to the fact that mussels are one of the economically most important aquatic products of Turkey, the purpose of this study is therefore to examine the performances of these four models in determination of the significant biotic and abiotic factors on the occurrences of Nematopsis legeri parasite harming the existence of Mediterranean mussels (Mytilus galloprovincialis L.. The data collected from the three coastal regions of Sinop city in Turkey showed more than 50% of parasite counts on the average are zero-valued and model comparisons were based on information criterion. The results showed that the probability of the occurrence of this parasite is here best formulated by ZINB or HNB models and influential factors of models were found to be correspondent with ecological differences of the regions.
Improving statistical reasoning theoretical models and practical implications
Sedlmeier, Peter
1999-01-01
This book focuses on how statistical reasoning works and on training programs that can exploit people''s natural cognitive capabilities to improve their statistical reasoning. Training programs that take into account findings from evolutionary psychology and instructional theory are shown to have substantially larger effects that are more stable over time than previous training regimens. The theoretical implications are traced in a neural network model of human performance on statistical reasoning problems. This book apppeals to judgment and decision making researchers and other cognitive scientists, as well as to teachers of statistics and probabilistic reasoning.
Schedulability of Herschel revisited using statistical model checking
DEFF Research Database (Denmark)
David, Alexandre; Larsen, Kim Guldstrand; Legay, Axel
2015-01-01
to obtain some guarantee on the (un)schedulability of the model even in the presence of undecidability. Two methods are considered: symbolic model checking and statistical model checking. Since the model uses stop-watches, the reachability problem becomes undecidable so we are using an over......-approximation technique. We can safely conclude that the system is schedulable for varying values of BCET. For the cases where deadlines are violated, we use polyhedra to try to confirm the witnesses. Our alternative method to confirm non-schedulability uses statistical model-checking (SMC) to generate counter...
International Nuclear Information System (INIS)
Smiriga, N.G.
1976-01-01
This report compares two models for converting beta backscatter count readings into thickness measurements. The necessary formulas to be used in an unweighted and weighted regression analysis are listed. The question of whether one should perform a regression analysis using the five available standard thicknesses or whether one should, in addition to these standard thicknesses, use zero as a standard thickness is decided. A weighted regression analysis is compared with an unweighted one for each model. The ''best'' model is selected, and the conclusions of the analysis are presented
Some remarks on the statistical model of heavy ion collisions
International Nuclear Information System (INIS)
Koch, V.
2003-01-01
This contribution is an attempt to assess what can be learned from the remarkable success of this statistical model in describing ratios of particle abundances in ultra-relativistic heavy ion collisions
Directory of Open Access Journals (Sweden)
Anke Hüls
2017-05-01
Full Text Available Antimicrobial resistance in livestock is a matter of general concern. To develop hygiene measures and methods for resistance prevention and control, epidemiological studies on a population level are needed to detect factors associated with antimicrobial resistance in livestock holdings. In general, regression models are used to describe these relationships between environmental factors and resistance outcome. Besides the study design, the correlation structures of the different outcomes of antibiotic resistance and structural zero measurements on the resistance outcome as well as on the exposure side are challenges for the epidemiological model building process. The use of appropriate regression models that acknowledge these complexities is essential to assure valid epidemiological interpretations. The aims of this paper are (i to explain the model building process comparing several competing models for count data (negative binomial model, quasi-Poisson model, zero-inflated model, and hurdle model and (ii to compare these models using data from a cross-sectional study on antibiotic resistance in animal husbandry. These goals are essential to evaluate which model is most suitable to identify potential prevention measures. The dataset used as an example in our analyses was generated initially to study the prevalence and associated factors for the appearance of cefotaxime-resistant Escherichia coli in 48 German fattening pig farms. For each farm, the outcome was the count of samples with resistant bacteria. There was almost no overdispersion and only moderate evidence of excess zeros in the data. Our analyses show that it is essential to evaluate regression models in studies analyzing the relationship between environmental factors and antibiotic resistances in livestock. After model comparison based on evaluation of model predictions, Akaike information criterion, and Pearson residuals, here the hurdle model was judged to be the most appropriate
International Nuclear Information System (INIS)
2003-01-01
For the year 2002, part of the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also includes historical time-series over a longer period (see e.g. Energiatilastot 2001, Statistics Finland, Helsinki 2002). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supply and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees on oil pollution fees on energy products
International Nuclear Information System (INIS)
2000-01-01
For the year 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy also includes historical time series over a longer period (see e.g., Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-June 2000, Energy exports by recipient country in January-June 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
2004-01-01
For the year 2003 and 2004, the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also includes historical time-series over a longer period (see e.g. Energiatilastot, Statistics Finland, Helsinki 2003, ISSN 0785-3165). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-March 2004, Energy exports by recipient country in January-March 2004, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees on oil pollution fees
Applications of spatial statistical network models to stream data
Isaak, Daniel J.; Peterson, Erin E.; Ver Hoef, Jay M.; Wenger, Seth J.; Falke, Jeffrey A.; Torgersen, Christian E.; Sowder, Colin; Steel, E. Ashley; Fortin, Marie-Josée; Jordan, Chris E.; Ruesch, Aaron S.; Som, Nicholas; Monestiez, Pascal
2014-01-01
Streams and rivers host a significant portion of Earth's biodiversity and provide important ecosystem services for human populations. Accurate information regarding the status and trends of stream resources is vital for their effective conservation and management. Most statistical techniques applied to data measured on stream networks were developed for terrestrial applications and are not optimized for streams. A new class of spatial statistical model, based on valid covariance structures for stream networks, can be used with many common types of stream data (e.g., water quality attributes, habitat conditions, biological surveys) through application of appropriate distributions (e.g., Gaussian, binomial, Poisson). The spatial statistical network models account for spatial autocorrelation (i.e., nonindependence) among measurements, which allows their application to databases with clustered measurement locations. Large amounts of stream data exist in many areas where spatial statistical analyses could be used to develop novel insights, improve predictions at unsampled sites, and aid in the design of efficient monitoring strategies at relatively low cost. We review the topic of spatial autocorrelation and its effects on statistical inference, demonstrate the use of spatial statistics with stream datasets relevant to common research and management questions, and discuss additional applications and development potential for spatial statistics on stream networks. Free software for implementing the spatial statistical network models has been developed that enables custom applications with many stream databases.
Possibilities of the Statistical Scoring Models' Application at Lithuanian Banks
Dzidzevičiūtė, Laima
2013-01-01
The goal of this dissertation is to develop the rating system of Lithuanian companies based on the statistical scoring model and assess the possibilities of this system‘s application at Lithuanian banks. The dissertation consists of three Chapters. Development and application peculiarities of rating systems based on statistical scoring models are described in the first Chapter. In the second Chapter the results of the survey of commercial banks and foreign bank branches, operating in the coun...
A no extensive statistical model for the nucleon structure function
Energy Technology Data Exchange (ETDEWEB)
Trevisan, Luis A. [Departamento de Matematica e Estatistica, Universidade Estadual de Ponta Grossa, 84010-790, Ponta Grossa, PR (Brazil); Mirez, Carlos [Instituto de Ciencia, Engenharia e Tecnologia - ICET, Universidade Federal dos Vales do Jequitinhonha e Mucuri - UFVJM, Campus do Mucuri, Rua do Cruzeiro 01, Jardim Sao Paulo, 39803-371, Teofilo Otoni, Minas Gerais (Brazil)
2013-03-25
We studied an application of nonextensive thermodynamics to describe the structure function of nucleon, in a model where the usual Fermi-Dirac and Bose-Einstein energy distribution were replaced by the equivalent functions of the q-statistical. The parameters of the model are given by an effective temperature T, the q parameter (from Tsallis statistics), and two chemical potentials given by the corresponding up (u) and down (d) quark normalization in the nucleon.
Improved analyses using function datasets and statistical modeling
John S. Hogland; Nathaniel M. Anderson
2014-01-01
Raster modeling is an integral component of spatial analysis. However, conventional raster modeling techniques can require a substantial amount of processing time and storage space and have limited statistical functionality and machine learning algorithms. To address this issue, we developed a new modeling framework using C# and ArcObjects and integrated that framework...
Thiessen, Erik D
2017-01-05
Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274: , 1926-1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105: , 2745-2750; Thiessen & Yee 2010 Child Development 81: , 1287-1303; Saffran 2002 Journal of Memory and Language 47: , 172-196; Misyak & Christiansen 2012 Language Learning 62: , 302-331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39: , 246-263; Thiessen et al. 2013 Psychological Bulletin 139: , 792-814). From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik
Integrating count and detection–nondetection data to model population dynamics
Zipkin, Elise F.; Rossman, Sam; Yackulic, Charles B.; Wiens, David; Thorson, James T.; Davis, Raymond J.; Grant, Evan H. Campbell
2017-01-01
There is increasing need for methods that integrate multiple data types into a single analytical framework as the spatial and temporal scale of ecological research expands. Current work on this topic primarily focuses on combining capture–recapture data from marked individuals with other data types into integrated population models. Yet, studies of species distributions and trends often rely on data from unmarked individuals across broad scales where local abundance and environmental variables may vary. We present a modeling framework for integrating detection–nondetection and count data into a single analysis to estimate population dynamics, abundance, and individual detection probabilities during sampling. Our dynamic population model assumes that site-specific abundance can change over time according to survival of individuals and gains through reproduction and immigration. The observation process for each data type is modeled by assuming that every individual present at a site has an equal probability of being detected during sampling processes. We examine our modeling approach through a series of simulations illustrating the relative value of count vs. detection–nondetection data under a variety of parameter values and survey configurations. We also provide an empirical example of the model by combining long-term detection–nondetection data (1995–2014) with newly collected count data (2015–2016) from a growing population of Barred Owl (Strix varia) in the Pacific Northwest to examine the factors influencing population abundance over time. Our model provides a foundation for incorporating unmarked data within a single framework, even in cases where sampling processes yield different detection probabilities. This approach will be useful for survey design and to researchers interested in incorporating historical or citizen science data into analyses focused on understanding how demographic rates drive population abundance.
Liao, Yi; Ma, Xiao-Dong
2018-03-01
We study two aspects of higher dimensional operators in standard model effective field theory. We first introduce a perturbative power counting rule for the entries in the anomalous dimension matrix of operators with equal mass dimension. The power counting is determined by the number of loops and the difference of the indices of the two operators involved, which in turn is defined by assuming that all terms in the standard model Lagrangian have an equal perturbative power. Then we show that the operators with the lowest index are unique at each mass dimension d, i.e., (H † H) d/2 for even d ≥ 4, and (LT∈ H)C(LT∈ H) T (H † H)(d-5)/2 for odd d ≥ 5. Here H, L are the Higgs and lepton doublet, and ∈, C the antisymmetric matrix of rank two and the charge conjugation matrix, respectively. The renormalization group running of these operators can be studied separately from other operators of equal mass dimension at the leading order in power counting. We compute their anomalous dimensions at one loop for general d and find that they are enhanced quadratically in d due to combinatorics. We also make connections with classification of operators in terms of their holomorphic and anti-holomorphic weights. Supported by the National Natural Science Foundation of China under Grant Nos. 11025525, 11575089, and by the CAS Center for Excellence in Particle Physics (CCEPP)
The use of plant models in deep learning: an application to leaf counting in rosette plants.
Ubbens, Jordan; Cieslak, Mikolaj; Prusinkiewicz, Przemyslaw; Stavness, Ian
2018-01-01
Deep learning presents many opportunities for image-based plant phenotyping. Here we consider the capability of deep convolutional neural networks to perform the leaf counting task. Deep learning techniques typically require large and diverse datasets to learn generalizable models without providing a priori an engineered algorithm for performing the task. This requirement is challenging, however, for applications in the plant phenotyping field, where available datasets are often small and the costs associated with generating new data are high. In this work we propose a new method for augmenting plant phenotyping datasets using rendered images of synthetic plants. We demonstrate that the use of high-quality 3D synthetic plants to augment a dataset can improve performance on the leaf counting task. We also show that the ability of the model to generate an arbitrary distribution of phenotypes mitigates the problem of dataset shift when training and testing on different datasets. Finally, we show that real and synthetic plants are significantly interchangeable when training a neural network on the leaf counting task.
International Nuclear Information System (INIS)
Quadri, Andrea
2006-01-01
We elucidate the geometry of the polynomial formulation of the non-Abelian Stueckelberg mechanism. We show that a natural off-shell nilpotent Becchi-Rouet-Stora-Tyutin (BRST) differential exists allowing to implement the constraint on the σ field by means of BRST techniques. This is achieved by extending the ghost sector by an additional U(1) factor (Abelian embedding). An important consequence is that a further BRST-invariant but not gauge-invariant mass term can be written for the non-Abelian gauge fields. As all versions of the Stueckelberg theory, also the Abelian embedding formulation yields a nonpower-counting renormalizable theory in D=4. We then derive its natural power-counting renormalizable extension and show that the physical spectrum contains a physical massive scalar particle. Physical unitarity is also established. This model implements the spontaneous symmetry breaking in the Abelian embedding formalism
Differential measurement errors in zero-truncated regression models for count data.
Huang, Yih-Huei; Hwang, Wen-Han; Chen, Fei-Yin
2011-12-01
Measurement errors in covariates may result in biased estimates in regression analysis. Most methods to correct this bias assume nondifferential measurement errors-i.e., that measurement errors are independent of the response variable. However, in regression models for zero-truncated count data, the number of error-prone covariate measurements for a given observational unit can equal its response count, implying a situation of differential measurement errors. To address this challenge, we develop a modified conditional score approach to achieve consistent estimation. The proposed method represents a novel technique, with efficiency gains achieved by augmenting random errors, and performs well in a simulation study. The method is demonstrated in an ecology application. © 2011, The International Biometric Society.
Models for probability and statistical inference theory and applications
Stapleton, James H
2007-01-01
This concise, yet thorough, book is enhanced with simulations and graphs to build the intuition of readersModels for Probability and Statistical Inference was written over a five-year period and serves as a comprehensive treatment of the fundamentals of probability and statistical inference. With detailed theoretical coverage found throughout the book, readers acquire the fundamentals needed to advance to more specialized topics, such as sampling, linear models, design of experiments, statistical computing, survival analysis, and bootstrapping.Ideal as a textbook for a two-semester sequence on probability and statistical inference, early chapters provide coverage on probability and include discussions of: discrete models and random variables; discrete distributions including binomial, hypergeometric, geometric, and Poisson; continuous, normal, gamma, and conditional distributions; and limit theory. Since limit theory is usually the most difficult topic for readers to master, the author thoroughly discusses mo...
An accurate behavioral model for single-photon avalanche diode statistical performance simulation
Xu, Yue; Zhao, Tingchen; Li, Ding
2018-01-01
An accurate behavioral model is presented to simulate important statistical performance of single-photon avalanche diodes (SPADs), such as dark count and after-pulsing noise. The derived simulation model takes into account all important generation mechanisms of the two kinds of noise. For the first time, thermal agitation, trap-assisted tunneling and band-to-band tunneling mechanisms are simultaneously incorporated in the simulation model to evaluate dark count behavior of SPADs fabricated in deep sub-micron CMOS technology. Meanwhile, a complete carrier trapping and de-trapping process is considered in afterpulsing model and a simple analytical expression is derived to estimate after-pulsing probability. In particular, the key model parameters of avalanche triggering probability and electric field dependence of excess bias voltage are extracted from Geiger-mode TCAD simulation and this behavioral simulation model doesn't include any empirical parameters. The developed SPAD model is implemented in Verilog-A behavioral hardware description language and successfully operated on commercial Cadence Spectre simulator, showing good universality and compatibility. The model simulation results are in a good accordance with the test data, validating high simulation accuracy.
Statistical detection model for eddy-current systems
International Nuclear Information System (INIS)
Martinez, J.R.; Bahr, A.J.
1984-01-01
This chapter presents a detailed analysis of some measured noise data and the results of using those data with a probe-flaw interaction model to compute the surface-crack detection characteristics of two different air-core coil probes. The objective is to develop a statistical model for determining the probability of detecting a given flaw using an eddy-current system. The basis for developing a statistical detection model is a measurement model that relates the output voltage of the system to its various signal and noise components. Topics considered include statistics of the measured background voltage, calibration of the probe-flaw interaction model and signal-to-noise ratio (SNR) definition, the operating characteristic, and a comparison of air-core probes
Irwin, Brian J.; Wagner, Tyler; Bence, James R.; Kepler, Megan V.; Liu, Weihai; Hayes, Daniel B.
2013-01-01
Partitioning total variability into its component temporal and spatial sources is a powerful way to better understand time series and elucidate trends. The data available for such analyses of fish and other populations are usually nonnegative integer counts of the number of organisms, often dominated by many low values with few observations of relatively high abundance. These characteristics are not well approximated by the Gaussian distribution. We present a detailed description of a negative binomial mixed-model framework that can be used to model count data and quantify temporal and spatial variability. We applied these models to data from four fishery-independent surveys of Walleyes Sander vitreus across the Great Lakes basin. Specifically, we fitted models to gill-net catches from Wisconsin waters of Lake Superior; Oneida Lake, New York; Saginaw Bay in Lake Huron, Michigan; and Ohio waters of Lake Erie. These long-term monitoring surveys varied in overall sampling intensity, the total catch of Walleyes, and the proportion of zero catches. Parameter estimation included the negative binomial scaling parameter, and we quantified the random effects as the variations among gill-net sampling sites, the variations among sampled years, and site × year interactions. This framework (i.e., the application of a mixed model appropriate for count data in a variance-partitioning context) represents a flexible approach that has implications for monitoring programs (e.g., trend detection) and for examining the potential of individual variance components to serve as response metrics to large-scale anthropogenic perturbations or ecological changes.
Use of Poisson spatiotemporal regression models for the Brazilian Amazon Forest: malaria count data.
Achcar, Jorge Alberto; Martinez, Edson Zangiacomi; Souza, Aparecida Doniseti Pires de; Tachibana, Vilma Mayumi; Flores, Edilson Ferreira
2011-01-01
Malaria is a serious problem in the Brazilian Amazon region, and the detection of possible risk factors could be of great interest for public health authorities. The objective of this article was to investigate the association between environmental variables and the yearly registers of malaria in the Amazon region using bayesian spatiotemporal methods. We used Poisson spatiotemporal regression models to analyze the Brazilian Amazon forest malaria count for the period from 1999 to 2008. In this study, we included some covariates that could be important in the yearly prediction of malaria, such as deforestation rate. We obtained the inferences using a bayesian approach and Markov Chain Monte Carlo (MCMC) methods to simulate samples for the joint posterior distribution of interest. The discrimination of different models was also discussed. The model proposed here suggests that deforestation rate, the number of inhabitants per km², and the human development index (HDI) are important in the prediction of malaria cases. It is possible to conclude that human development, population growth, deforestation, and their associated ecological alterations are conducive to increasing malaria risk. We conclude that the use of Poisson regression models that capture the spatial and temporal effects under the bayesian paradigm is a good strategy for modeling malaria counts.
Range walk error correction and modeling on Pseudo-random photon counting system
Shen, Shanshan; Chen, Qian; He, Weiji
2017-08-01
Signal to noise ratio and depth accuracy are modeled for the pseudo-random ranging system with two random processes. The theoretical results, developed herein, capture the effects of code length and signal energy fluctuation are shown to agree with Monte Carlo simulation measurements. First, the SNR is developed as a function of the code length. Using Geiger-mode avalanche photodiodes (GMAPDs), longer code length is proven to reduce the noise effect and improve SNR. Second, the Cramer-Rao lower bound on range accuracy is derived to justify that longer code length can bring better range accuracy. Combined with the SNR model and CRLB model, it is manifested that the range accuracy can be improved by increasing the code length to reduce the noise-induced error. Third, the Cramer-Rao lower bound on range accuracy is shown to converge to the previously published theories and introduce the Gauss range walk model to range accuracy. Experimental tests also converge to the presented boundary model in this paper. It has been proven that depth error caused by the fluctuation of the number of detected photon counts in the laser echo pulse leads to the depth drift of Time Point Spread Function (TPSF). Finally, numerical fitting function is used to determine the relationship between the depth error and the photon counting ratio. Depth error due to different echo energy is calibrated so that the corrected depth accuracy is improved to 1cm.
Linear mixed models a practical guide using statistical software
West, Brady T; Galecki, Andrzej T
2006-01-01
Simplifying the often confusing array of software programs for fitting linear mixed models (LMMs), Linear Mixed Models: A Practical Guide Using Statistical Software provides a basic introduction to primary concepts, notation, software implementation, model interpretation, and visualization of clustered and longitudinal data. This easy-to-navigate reference details the use of procedures for fitting LMMs in five popular statistical software packages: SAS, SPSS, Stata, R/S-plus, and HLM. The authors introduce basic theoretical concepts, present a heuristic approach to fitting LMMs based on bo
Statistical Model and the mesonic-baryonic transition region
Oeschler, H.; Redlich, K.; Wheaton, S.
2009-01-01
The statistical model assuming chemical equilibriumand local strangeness conservation describes most of the observed features of strange particle production from SIS up to RHIC. Deviations are found as the maximum in the measured K+/pi+ ratio is much sharper than in the model calculations. At the incident energy of the maximum, the statistical model shows that freeze out changes regime from one being dominated by baryons at the lower energies toward one being dominated by mesons. It will be shown how deviations from the usual freeze-out curve influence the various particle ratios. Furthermore, other observables exhibit also changes just in this energy regime.
Multiple commodities in statistical microeconomics: Model and market
Baaquie, Belal E.; Yu, Miao; Du, Xin
2016-11-01
A statistical generalization of microeconomics has been made in Baaquie (2013). In Baaquie et al. (2015), the market behavior of single commodities was analyzed and it was shown that market data provides strong support for the statistical microeconomic description of commodity prices. The case of multiple commodities is studied and a parsimonious generalization of the single commodity model is made for the multiple commodities case. Market data shows that the generalization can accurately model the simultaneous correlation functions of up to four commodities. To accurately model five or more commodities, further terms have to be included in the model. This study shows that the statistical microeconomics approach is a comprehensive and complete formulation of microeconomics, and which is independent to the mainstream formulation of microeconomics.
Multi-region Statistical Shape Model for Cochlear Implantation
DEFF Research Database (Denmark)
Romera, Jordi; Kjer, H. Martin; Piella, Gemma
2016-01-01
Statistical shape models are commonly used to analyze the variability between similar anatomical structures and their use is established as a tool for analysis and segmentation of medical images. However, using a global model to capture the variability of complex structures is not enough to achie...
Evaluation of Statistical Models for Analysis of Insect, Disease and ...
African Journals Online (AJOL)
It is concluded that LMMs and GLMs simultaneously consider the effect of treatments and heterogeneity of variance and hence are more appropriate for analysis of abundance and incidence data than ordinary ANOVA. Keywords: Mixed Models; Generalized Linear Models; Statistical Power East African Journal of Sciences ...
Complex Data Modeling and Computationally Intensive Statistical Methods
Mantovan, Pietro
2010-01-01
The last years have seen the advent and development of many devices able to record and store an always increasing amount of complex and high dimensional data; 3D images generated by medical scanners or satellite remote sensing, DNA microarrays, real time financial data, system control datasets. The analysis of this data poses new challenging problems and requires the development of novel statistical models and computational methods, fueling many fascinating and fast growing research areas of modern statistics. The book offers a wide variety of statistical methods and is addressed to statistici
Validation of statistical models for creep rupture by parametric analysis
Energy Technology Data Exchange (ETDEWEB)
Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)
2012-01-15
Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. - Highlights: Black-Right-Pointing-Pointer The paper discusses the validation of creep rupture models derived from statistical analysis. Black-Right-Pointing-Pointer It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. Black-Right-Pointing-Pointer The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. Black-Right-Pointing-Pointer The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).
2015-09-30
information on fish school distributions by monitoring the direction of birds returning to the colony or the behavior of other birds at sea through...active sonar. Toward this goal, fundamental advances in the understanding of fish behavior , especially in aggregations, will be made under conditions...relevant to the echo statistics problem. OBJECTIVES To develop new models of behavior of fish aggregations, including the fission/fusion process
The Fractal Characteristics of the Landslides by Box-Counting and P-A Model
Wang, Zhiwang; Zhou, Fangfang; Cao, Hao
2018-01-01
The landslide is a kind of complicated phenomenon with nonlinear inter-reaction. The traditional theories and methods are difficult to study the uncertainty characteristics of dynamic evolution of the landslides. This paper applies box-counting and P-A model to study the fractal characteristics of geometric shape and spatial distribution of the landslide hazards in the study area from Badong county to Zigui county in TGP reservoir region. The data obtained from the study area shows power-law distributions of geometric shape and spatial distribution of the landslides, and thus reveals some fractal or self-similarity properties. The fractral dimensions DAP of the spatial distribution of landslides by P-A model shows that DAP of the western landslides in the study area are smaller than those of the east, which shows that the geometry of the eastern landslide is more irregular and complicated than the western ones. The results show box-counting model and P-A model can be used to characterize the fractal characteristics of geometric shape and spatial distribution of the landslides.
Understanding and forecasting polar stratospheric variability with statistical models
Directory of Open Access Journals (Sweden)
C. Blume
2012-07-01
Full Text Available The variability of the north-polar stratospheric vortex is a prominent aspect of the middle atmosphere. This work investigates a wide class of statistical models with respect to their ability to model geopotential and temperature anomalies, representing variability in the polar stratosphere. Four partly nonstationary, nonlinear models are assessed: linear discriminant analysis (LDA; a cluster method based on finite elements (FEM-VARX; a neural network, namely the multi-layer perceptron (MLP; and support vector regression (SVR. These methods model time series by incorporating all significant external factors simultaneously, including ENSO, QBO, the solar cycle, volcanoes, to then quantify their statistical importance. We show that variability in reanalysis data from 1980 to 2005 is successfully modeled. The period from 2005 to 2011 can be hindcasted to a certain extent, where MLP performs significantly better than the remaining models. However, variability remains that cannot be statistically hindcasted within the current framework, such as the unexpected major warming in January 2009. Finally, the statistical model with the best generalization performance is used to predict a winter 2011/12 with warm and weak vortex conditions. A vortex breakdown is predicted for late January, early February 2012.
Liu, Xuejin; Persson, Mats; Bornefalk, Hans; Karlsson, Staffan; Xu, Cheng; Danielsson, Mats; Huber, Ben
2015-07-01
Variations among detector channels in computed tomography can lead to ring artifacts in the reconstructed images and biased estimates in projection-based material decomposition. Typically, the ring artifacts are corrected by compensation methods based on flat fielding, where transmission measurements are required for a number of material-thickness combinations. Phantoms used in these methods can be rather complex and require an extensive number of transmission measurements. Moreover, material decomposition needs knowledge of the individual response of each detector channel to account for the detector inhomogeneities. For this purpose, we have developed a spectral response model that binwise predicts the response of a multibin photon-counting detector individually for each detector channel. The spectral response model is performed in two steps. The first step employs a forward model to predict the expected numbers of photon counts, taking into account parameters such as the incident x-ray spectrum, absorption efficiency, and energy response of the detector. The second step utilizes a limited number of transmission measurements with a set of flat slabs of two absorber materials to fine-tune the model predictions, resulting in a good correspondence with the physical measurements. To verify the response model, we apply the model in two cases. First, the model is used in combination with a compensation method which requires an extensive number of transmission measurements to determine the necessary parameters. Our spectral response model successfully replaces these measurements by simulations, saving a significant amount of measurement time. Second, the spectral response model is used as the basis of the maximum likelihood approach for projection-based material decomposition. The reconstructed basis images show a good separation between the calcium-like material and the contrast agents, iodine and gadolinium. The contrast agent concentrations are reconstructed with more
Statistical Models for Tornado Climatology: Long and Short-Term Views.
Elsner, James B; Jagger, Thomas H; Fricker, Tyler
2016-01-01
This paper estimates regional tornado risk from records of past events using statistical models. First, a spatial model is fit to the tornado counts aggregated in counties with terms that control for changes in observational practices over time. Results provide a long-term view of risk that delineates the main tornado corridors in the United States where the expected annual rate exceeds two tornadoes per 10,000 square km. A few counties in the Texas Panhandle and central Kansas have annual rates that exceed four tornadoes per 10,000 square km. Refitting the model after removing the least damaging tornadoes from the data (EF0) produces a similar map but with the greatest tornado risk shifted south and eastward. Second, a space-time model is fit to the counts aggregated in raster cells with terms that control for changes in climate factors. Results provide a short-term view of risk. The short-term view identifies a shift of tornado activity away from the Ohio Valley under El Niño conditions and away from the Southeast under positive North Atlantic oscillation conditions. The combined predictor effects on the local rates is quantified by fitting the model after leaving out the year to be predicted from the data. The models provide state-of-the-art views of tornado risk that can be used by government agencies, the insurance industry, and the general public.
Statistical Validation of Engineering and Scientific Models: Background
International Nuclear Information System (INIS)
Hills, Richard G.; Trucano, Timothy G.
1999-01-01
A tutorial is presented discussing the basic issues associated with propagation of uncertainty analysis and statistical validation of engineering and scientific models. The propagation of uncertainty tutorial illustrates the use of the sensitivity method and the Monte Carlo method to evaluate the uncertainty in predictions for linear and nonlinear models. Four example applications are presented; a linear model, a model for the behavior of a damped spring-mass system, a transient thermal conduction model, and a nonlinear transient convective-diffusive model based on Burger's equation. Correlated and uncorrelated model input parameters are considered. The model validation tutorial builds on the material presented in the propagation of uncertainty tutoriaI and uses the damp spring-mass system as the example application. The validation tutorial illustrates several concepts associated with the application of statistical inference to test model predictions against experimental observations. Several validation methods are presented including error band based, multivariate, sum of squares of residuals, and optimization methods. After completion of the tutorial, a survey of statistical model validation literature is presented and recommendations for future work are made
Statistical Validation of Normal Tissue Complication Probability Models
Energy Technology Data Exchange (ETDEWEB)
Xu Chengjian, E-mail: c.j.xu@umcg.nl [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schaaf, Arjen van der; Veld, Aart A. van' t; Langendijk, Johannes A. [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schilstra, Cornelis [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Radiotherapy Institute Friesland, Leeuwarden (Netherlands)
2012-09-01
Purpose: To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. Methods and Materials: A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Results: Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Conclusion: Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use.
Modern statistical models for forensic fingerprint examinations: a critical review.
Abraham, Joshua; Champod, Christophe; Lennard, Chris; Roux, Claude
2013-10-10
Over the last decade, the development of statistical models in support of forensic fingerprint identification has been the subject of increasing research attention, spurned on recently by commentators who claim that the scientific basis for fingerprint identification has not been adequately demonstrated. Such models are increasingly seen as useful tools in support of the fingerprint identification process within or in addition to the ACE-V framework. This paper provides a critical review of recent statistical models from both a practical and theoretical perspective. This includes analysis of models of two different methodologies: Probability of Random Correspondence (PRC) models that focus on calculating probabilities of the occurrence of fingerprint configurations for a given population, and Likelihood Ratio (LR) models which use analysis of corresponding features of fingerprints to derive a likelihood value representing the evidential weighting for a potential source. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Growth Curve Models and Applications : Indian Statistical Institute
2017-01-01
Growth curve models in longitudinal studies are widely used to model population size, body height, biomass, fungal growth, and other variables in the biological sciences, but these statistical methods for modeling growth curves and analyzing longitudinal data also extend to general statistics, economics, public health, demographics, epidemiology, SQC, sociology, nano-biotechnology, fluid mechanics, and other applied areas. There is no one-size-fits-all approach to growth measurement. The selected papers in this volume build on presentations from the GCM workshop held at the Indian Statistical Institute, Giridih, on March 28-29, 2016. They represent recent trends in GCM research on different subject areas, both theoretical and applied. This book includes tools and possibilities for further work through new techniques and modification of existing ones. The volume includes original studies, theoretical findings and case studies from a wide range of app lied work, and these contributions have been externally r...
International Nuclear Information System (INIS)
Galiano, G.; Grau, A.
1994-01-01
An intelligent computer program has been developed to obtain the mathematical formulae to compute the probabilities and reduced energies of the different atomic rearrangement pathways following electron-capture decay. Creation and annihilation operators for Auger and X processes have been introduced. Taking into account the symmetries associated with each process, 262 different pathways were obtained. This model allows us to obtain the influence of the M-electro capture in the counting efficiency when the atomic number of the nuclide is high. (Author)
Statistical modelling for recurrent events: an application to sports injuries.
Ullah, Shahid; Gabbett, Tim J; Finch, Caroline F
2014-09-01
Injuries are often recurrent, with subsequent injuries influenced by previous occurrences and hence correlation between events needs to be taken into account when analysing such data. This paper compares five different survival models (Cox proportional hazards (CoxPH) model and the following generalisations to recurrent event data: Andersen-Gill (A-G), frailty, Wei-Lin-Weissfeld total time (WLW-TT) marginal, Prentice-Williams-Peterson gap time (PWP-GT) conditional models) for the analysis of recurrent injury data. Empirical evaluation and comparison of different models were performed using model selection criteria and goodness-of-fit statistics. Simulation studies assessed the size and power of each model fit. The modelling approach is demonstrated through direct application to Australian National Rugby League recurrent injury data collected over the 2008 playing season. Of the 35 players analysed, 14 (40%) players had more than 1 injury and 47 contact injuries were sustained over 29 matches. The CoxPH model provided the poorest fit to the recurrent sports injury data. The fit was improved with the A-G and frailty models, compared to WLW-TT and PWP-GT models. Despite little difference in model fit between the A-G and frailty models, in the interest of fewer statistical assumptions it is recommended that, where relevant, future studies involving modelling of recurrent sports injury data use the frailty model in preference to the CoxPH model or its other generalisations. The paper provides a rationale for future statistical modelling approaches for recurrent sports injury. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
The Statistical Modeling of the Trends Concerning the Romanian Population
Directory of Open Access Journals (Sweden)
Gabriela OPAIT
2014-11-01
Full Text Available This paper reflects the statistical modeling concerning the resident population in Romania, respectively the total of the romanian population, through by means of the „Least Squares Method”. Any country it develops by increasing of the population, respectively of the workforce, which is a factor of influence for the growth of the Gross Domestic Product (G.D.P.. The „Least Squares Method” represents a statistical technique for to determine the trend line of the best fit concerning a model.
Statistical Model of the 2001 Czech Census for Interactive Presentation
Czech Academy of Sciences Publication Activity Database
Grim, Jiří; Hora, Jan; Boček, Pavel; Somol, Petr; Pudil, Pavel
Vol. 26, č. 4 (2010), s. 1-23 ISSN 0282-423X R&D Projects: GA ČR GA102/07/1594; GA MŠk 1M0572 Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : Interactive statistical model * census data presentation * distribution mixtures * data modeling * EM algorithm * incomplete data * data reproduction accuracy * data mining Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.492, year: 2010 http://library.utia.cas.cz/separaty/2010/RO/grim-0350513.pdf
Nagy, P; Faye, B; Marko, O; Thomas, S; Wernery, U; Juhasz, J
2013-09-01
The objectives of the present study were to monitor the microbiological quality and somatic cell count (SCC) of bulk tank milk at the world's first large-scale camel dairy farm for a 2-yr period, to compare the results of 2 methods for the enumeration of SCC, to evaluate correlation among milk quality indicators, and to determine the effect of specific factors (year, season, stage of lactation, and level of production) on milk quality indicators. The study was conducted from January 2008 to January 2010. Total viable count (TVC), coliform count (CC), California Mastitis Test (CMT) score, and SCC were determined from daily bulk milk samples. Somatic cell count was measured by using a direct microscopic method and with an automatic cell counter. In addition, production parameters [total daily milk production (TDM, kg), number of milking camels (NMC), average milk per camel (AMC, kg)] and stage of lactation (average postpartum days, PPD) were recorded for each test day. A strong correlation (r=0.33) was found between the 2 methods for SCC enumeration; however, values derived using the microscopic method were higher. The geometric means of SCC and TVC were 394×10(3) cells/mL and 5,157 cfu/mL during the observation period, respectively. Somatic cell count was >500×10(3) cells/mL on 14.6% (106/725) and TVC was >10×10(3) cfu/mL on 4.0% (30/742) of the test days. Both milk quality indicators had a distinct seasonal pattern. For log SCC, the mean was lowest in summer and highest in autumn. The seasonal pattern of log TVC was slightly different, with the lowest values being recorded during the spring. The monthly mean TVC pattern showed a clear difference between years. Coliform count was <10 cfu/mL in most of the samples (709/742, 95.6%). A positive correlation was found between log SCC and log TVC (r=0.32), between log SCC and CMT score (r=0.26), and between log TVC and CC in yr 1 (r=0.30). All production parameters and stage of lactation showed strong seasonal
Applied systems ecology: models, data, and statistical methods
Energy Technology Data Exchange (ETDEWEB)
Eberhardt, L L
1976-01-01
In this report, systems ecology is largely equated to mathematical or computer simulation modelling. The need for models in ecology stems from the necessity to have an integrative device for the diversity of ecological data, much of which is observational, rather than experimental, as well as from the present lack of a theoretical structure for ecology. Different objectives in applied studies require specialized methods. The best predictive devices may be regression equations, often non-linear in form, extracted from much more detailed models. A variety of statistical aspects of modelling, including sampling, are discussed. Several aspects of population dynamics and food-chain kinetics are described, and it is suggested that the two presently separated approaches should be combined into a single theoretical framework. It is concluded that future efforts in systems ecology should emphasize actual data and statistical methods, as well as modelling.
Analyzing sickness absence with statistical models for survival data
DEFF Research Database (Denmark)
Christensen, Karl Bang; Andersen, Per Kragh; Smith-Hansen, Lars
2007-01-01
OBJECTIVES: Sickness absence is the outcome in many epidemiologic studies and is often based on summary measures such as the number of sickness absences per year. In this study the use of modern statistical methods was examined by making better use of the available information. Since sickness...... absence data deal with events occurring over time, the use of statistical models for survival data has been reviewed, and the use of frailty models has been proposed for the analysis of such data. METHODS: Three methods for analyzing data on sickness absences were compared using a simulation study...... involving the following: (i) Poisson regression using a single outcome variable (number of sickness absences), (ii) analysis of time to first event using the Cox proportional hazards model, and (iii) frailty models, which are random effects proportional hazards models. Data from a study of the relation...
A Review of Modeling Bioelectrochemical Systems: Engineering and Statistical Aspects
Directory of Open Access Journals (Sweden)
Shuai Luo
2016-02-01
Full Text Available Bioelectrochemical systems (BES are promising technologies to convert organic compounds in wastewater to electrical energy through a series of complex physical-chemical, biological and electrochemical processes. Representative BES such as microbial fuel cells (MFCs have been studied and advanced for energy recovery. Substantial experimental and modeling efforts have been made for investigating the processes involved in electricity generation toward the improvement of the BES performance for practical applications. However, there are many parameters that will potentially affect these processes, thereby making the optimization of system performance hard to be achieved. Mathematical models, including engineering models and statistical models, are powerful tools to help understand the interactions among the parameters in BES and perform optimization of BES configuration/operation. This review paper aims to introduce and discuss the recent developments of BES modeling from engineering and statistical aspects, including analysis on the model structure, description of application cases and sensitivity analysis of various parameters. It is expected to serves as a compass for integrating the engineering and statistical modeling strategies to improve model accuracy for BES development.
Linking statistical bias description to multiobjective model calibration
Reichert, P.; Schuwirth, N.
2012-09-01
In the absence of model deficiencies, simulation results at the correct parameter values lead to an unbiased description of observed data with remaining deviations due to observation errors only. However, this ideal cannot be reached in the practice of environmental modeling, because the required simplified representation of the complex reality by the model and errors in model input lead to errors that are reflected in biased model output. This leads to two related problems: First, ignoring bias of output in the statistical model description leads to bias in parameter estimates, model predictions and, in particular, in the quantification of their uncertainty. Second, as there is no objective choice of how much bias to accept in which output variable, it is not possible to design an "objective" model calibration procedure. The first of these problems has been addressed by introducing a statistical (Bayesian) description of bias, the second by suggesting the use of multiobjective calibration techniques that cannot easily be used for uncertainty analysis. We merge the ideas of these two approaches by using the prior of the statistical bias description to quantify the importance of multiple calibration objectives. This leads to probabilistic inference and prediction while still taking multiple calibration objectives into account. The ideas and technical details of the suggested approach are outlined and a didactical example as well as an application to environmental data are provided to demonstrate its practical feasibility and computational efficiency.
Statistical learning modeling method for space debris photometric measurement
Sun, Wenjing; Sun, Jinqiu; Zhang, Yanning; Li, Haisen
2016-03-01
Photometric measurement is an important way to identify the space debris, but the present methods of photometric measurement have many constraints on star image and need complex image processing. Aiming at the problems, a statistical learning modeling method for space debris photometric measurement is proposed based on the global consistency of the star image, and the statistical information of star images is used to eliminate the measurement noises. First, the known stars on the star image are divided into training stars and testing stars. Then, the training stars are selected as the least squares fitting parameters to construct the photometric measurement model, and the testing stars are used to calculate the measurement accuracy of the photometric measurement model. Experimental results show that, the accuracy of the proposed photometric measurement model is about 0.1 magnitudes.
Statistical, Morphometric, Anatomical Shape Model (Atlas) of Calcaneus
Melinska, Aleksandra U.; Romaszkiewicz, Patryk; Wagel, Justyna; Sasiadek, Marek; Iskander, D. Robert
2015-01-01
The aim was to develop a morphometric and anatomically accurate atlas (statistical shape model) of calcaneus. The model is based on 18 left foot and 18 right foot computed tomography studies of 28 male individuals aged from 17 to 62 years, with no known foot pathology. A procedure for automatic atlas included extraction and identification of common features, averaging feature position, obtaining mean geometry, mathematical shape description and variability analysis. Expert manual assistance was included for the model to fulfil the accuracy sought by medical professionals. The proposed for the first time statistical shape model of the calcaneus could be of value in many orthopaedic applications including providing support in diagnosing pathological lesions, pre-operative planning, classification and treatment of calcaneus fractures as well as for the development of future implant procedures. PMID:26270812
Workshop on Model Uncertainty and its Statistical Implications
1988-01-01
In this book problems related to the choice of models in such diverse fields as regression, covariance structure, time series analysis and multinomial experiments are discussed. The emphasis is on the statistical implications for model assessment when the assessment is done with the same data that generated the model. This is a problem of long standing, notorious for its difficulty. Some contributors discuss this problem in an illuminating way. Others, and this is a truly novel feature, investigate systematically whether sample re-use methods like the bootstrap can be used to assess the quality of estimators or predictors in a reliable way given the initial model uncertainty. The book should prove to be valuable for advanced practitioners and statistical methodologists alike.
Statistical Modeling for Radiation Hardness Assurance: Toward Bigger Data
Ladbury, R.; Campola, M. J.
2015-01-01
New approaches to statistical modeling in radiation hardness assurance are discussed. These approaches yield quantitative bounds on flight-part radiation performance even in the absence of conventional data sources. This allows the analyst to bound radiation risk at all stages and for all decisions in the RHA process. It also allows optimization of RHA procedures for the project's risk tolerance.
Interactive comparison of hypothesis tests for statistical model checking
de Boer, Pieter-Tjerk; Reijsbergen, D.P.; Scheinhardt, Willem R.W.
2015-01-01
We present a web-based interactive comparison of hypothesis tests as are used in statistical model checking, providing users and tool developers with more insight into their characteristics. Parameters can be modified easily and their influence is visualized in real time; an integrated simulation
Syntactic discriminative language model rerankers for statistical machine translation
Carter, S.; Monz, C.
2011-01-01
This article describes a method that successfully exploits syntactic features for n-best translation candidate reranking using perceptrons. We motivate the utility of syntax by demonstrating the superior performance of parsers over n-gram language models in differentiating between Statistical
Hierarchical modelling for the environmental sciences statistical methods and applications
Clark, James S
2006-01-01
New statistical tools are changing the way in which scientists analyze and interpret data and models. Hierarchical Bayes and Markov Chain Monte Carlo methods for analysis provide a consistent framework for inference and prediction where information is heterogeneous and uncertain, processes are complicated, and responses depend on scale. Nowhere are these methods more promising than in the environmental sciences.
Using statistical compatibility to derive advanced probabilistic fatigue models
Czech Academy of Sciences Publication Activity Database
Fernández-Canteli, A.; Castillo, E.; López-Aenlle, M.; Seitl, Stanislav
2010-01-01
Roč. 2, č. 1 (2010), s. 1131-1140 E-ISSN 1877-7058. [Fatigue 2010. Praha, 06.06.2010-11.06.2010] Institutional research plan: CEZ:AV0Z20410507 Keywords : Fatigue models * Statistical compatibility * Functional equations Subject RIV: JL - Materials Fatigue, Friction Mechanics
Modelling geographical graduate job search using circular statistics
Faggian, Alessandra; Corcoran, Jonathan; McCann, Philip
Theory suggests that the spatial patterns of migration flows are contingent both on individual human capital and underlying geographical structures. Here we demonstrate these features by using circular statistics in an econometric modelling framework applied to the flows of UK university graduates.
Statistical Modeling of Energy Production by Photovoltaic Farms
Czech Academy of Sciences Publication Activity Database
Brabec, Marek; Pelikán, Emil; Krč, Pavel; Eben, Kryštof; Musílek, P.
2011-01-01
Roč. 5, č. 9 (2011), s. 785-793 ISSN 1934-8975 Grant - others:GA AV ČR(CZ) M100300904 Institutional research plan: CEZ:AV0Z10300504 Keywords : electrical energy * solar energy * numerical weather prediction model * nonparametric regression * beta regression Subject RIV: BB - Applied Statistics, Operational Research
Two-dimensional models in statistical mechanics and field theory
International Nuclear Information System (INIS)
Koberle, R.
1980-01-01
Several features of two-dimensional models in statistical mechanics and Field theory, such as, lattice quantum chromodynamics, Z(N), Gross-Neveu and CP N-1 are discussed. The problems of confinement and dynamical mass generation are also analyzed. (L.C.) [pt
Statistical properties of the nuclear shell-model Hamiltonian
International Nuclear Information System (INIS)
Dias, H.; Hussein, M.S.; Oliveira, N.A. de
1986-01-01
The statistical properties of realistic nuclear shell-model Hamiltonian are investigated in sd-shell nuclei. The probability distribution of the basic-vector amplitude is calculated and compared with the Porter-Thomas distribution. Relevance of the results to the calculation of the giant resonance mixing parameter is pointed out. (Author) [pt
Eigenfunction statistics for Anderson model with Hölder continuous ...
Indian Academy of Sciences (India)
continuous (0 < α ≤ 1) single site distribution. In localized regime, we study the distri- bution of eigenfunctions in space and energy simultaneously. In a certain scaling limit, we prove limit points are Poisson. Keywords. Anderson model; Hölder continuous measure; Poisson statistics. 2010 Mathematics Subject Classification ...
Integration of Advanced Statistical Analysis Tools and Geophysical Modeling
2012-08-01
1.56 0.48 Beale: MetalMapper Cued: Beale_MMstat Target: 477 Cell 202 of 1547 (SOI, 2OI) Model 1 of 3 (Inv #1 / 2 = SOI: 1 / 1) Tag...Statistical classification of buried unexploded ordnance using nonparametric prior models. IEEE Trans. Geosci. Remote Sensing, 45: 2794–2806, 2007. T...Bell and B. Barrow. Subsurface discrimination using electromagnetic induction sensors. IEEE Trans. Geosci. Remote Sensing, 39:1286–1293, 2001. S. D
A Statistical Model for Synthesis of Detailed Facial Geometry
Golovinskiy, Aleksey; Matusik, Wojciech; Pfister, Hanspeter; Rusinkiewicz, Szymon; Funkhouser, Thomas
2006-01-01
Detailed surface geometry contributes greatly to the visual realism of 3D face models. However, acquiring high-resolution face geometry is often tedious and expensive. Consequently, most face models used in games, virtual reality, or computer vision look unrealistically smooth. In this paper, we introduce a new statistical technique for the analysis and synthesis of small three-dimensional facial features, such as wrinkles and pores. We acquire high-resolution face geometry for people across ...
Statistical and RBF NN models : providing forecasts and risk assessment
Marček, Milan
2009-01-01
Forecast accuracy of economic and financial processes is a popular measure for quantifying the risk in decision making. In this paper, we develop forecasting models based on statistical (stochastic) methods, sometimes called hard computing, and on a soft method using granular computing. We consider the accuracy of forecasting models as a measure for risk evaluation. It is found that the risk estimation process based on soft methods is simplified and less critical to the question w...
Advances on statistical/thermodynamical models for unpolarized structure functions
Energy Technology Data Exchange (ETDEWEB)
Trevisan, Luis A. [Departamento de Matematica e Estatistica, Universidade Estadual de Ponta Grossa, 84010-790, Ponta Grossa, PR (Brazil); Mirez, Carlos [Universidade Federal dos Vales do Jequitinhonha e Mucuri, Campus do Mucuri, 39803-371, Teofilo Otoni, Minas Gerais (Brazil); Tomio, Lauro [Instituto de Fisica Teorica, Universidade Estadual Paulista, R. Dr. Bento Teobaldo Ferraz 271, Bl II Barra Funda, 01140070, Sao Paulo, SP (Brazil)
2013-03-25
During the eights and nineties many statistical/thermodynamical models were proposed to describe the nucleons' structure functions and distribution of the quarks in the hadrons. Most of these models describe the compound quarks and gluons inside the nucleon as a Fermi / Bose gas respectively, confined in a MIT bag with continuous energy levels. Another models considers discrete spectrum. Some interesting features of the nucleons are obtained by these models, like the sea asymmetries {sup -}d/{sup -}u and {sup -}d-{sup -}u.
Statistical modelling of transcript profiles of differentially regulated genes
Directory of Open Access Journals (Sweden)
Sergeant Martin J
2008-07-01
Full Text Available Abstract Background The vast quantities of gene expression profiling data produced in microarray studies, and the more precise quantitative PCR, are often not statistically analysed to their full potential. Previous studies have summarised gene expression profiles using simple descriptive statistics, basic analysis of variance (ANOVA and the clustering of genes based on simple models fitted to their expression profiles over time. We report the novel application of statistical non-linear regression modelling techniques to describe the shapes of expression profiles for the fungus Agaricus bisporus, quantified by PCR, and for E. coli and Rattus norvegicus, using microarray technology. The use of parametric non-linear regression models provides a more precise description of expression profiles, reducing the "noise" of the raw data to produce a clear "signal" given by the fitted curve, and describing each profile with a small number of biologically interpretable parameters. This approach then allows the direct comparison and clustering of the shapes of response patterns between genes and potentially enables a greater exploration and interpretation of the biological processes driving gene expression. Results Quantitative reverse transcriptase PCR-derived time-course data of genes were modelled. "Split-line" or "broken-stick" regression identified the initial time of gene up-regulation, enabling the classification of genes into those with primary and secondary responses. Five-day profiles were modelled using the biologically-oriented, critical exponential curve, y(t = A + (B + CtRt + ε. This non-linear regression approach allowed the expression patterns for different genes to be compared in terms of curve shape, time of maximal transcript level and the decline and asymptotic response levels. Three distinct regulatory patterns were identified for the five genes studied. Applying the regression modelling approach to microarray-derived time course data
Bilingual Cluster Based Models for Statistical Machine Translation
Yamamoto, Hirofumi; Sumita, Eiichiro
We propose a domain specific model for statistical machine translation. It is well-known that domain specific language models perform well in automatic speech recognition. We show that domain specific language and translation models also benefit statistical machine translation. However, there are two problems with using domain specific models. The first is the data sparseness problem. We employ an adaptation technique to overcome this problem. The second issue is domain prediction. In order to perform adaptation, the domain must be provided, however in many cases, the domain is not known or changes dynamically. For these cases, not only the translation target sentence but also the domain must be predicted. This paper focuses on the domain prediction problem for statistical machine translation. In the proposed method, a bilingual training corpus, is automatically clustered into sub-corpora. Each sub-corpus is deemed to be a domain. The domain of a source sentence is predicted by using its similarity to the sub-corpora. The predicted domain (sub-corpus) specific language and translation models are then used for the translation decoding. This approach gave an improvement of 2.7 in BLEU score on the IWSLT05 Japanese to English evaluation corpus (improving the score from 52.4 to 55.1). This is a substantial gain and indicates the validity of the proposed bilingual cluster based models.
International Nuclear Information System (INIS)
Weathers, J.B.; Luck, R.; Weathers, J.W.
2009-01-01
The complexity of mathematical models used by practicing engineers is increasing due to the growing availability of sophisticated mathematical modeling tools and ever-improving computational power. For this reason, the need to define a well-structured process for validating these models against experimental results has become a pressing issue in the engineering community. This validation process is partially characterized by the uncertainties associated with the modeling effort as well as the experimental results. The net impact of the uncertainties on the validation effort is assessed through the 'noise level of the validation procedure', which can be defined as an estimate of the 95% confidence uncertainty bounds for the comparison error between actual experimental results and model-based predictions of the same quantities of interest. Although general descriptions associated with the construction of the noise level using multivariate statistics exists in the literature, a detailed procedure outlining how to account for the systematic and random uncertainties is not available. In this paper, the methodology used to derive the covariance matrix associated with the multivariate normal pdf based on random and systematic uncertainties is examined, and a procedure used to estimate this covariance matrix using Monte Carlo analysis is presented. The covariance matrices are then used to construct approximate 95% confidence constant probability contours associated with comparison error results for a practical example. In addition, the example is used to show the drawbacks of using a first-order sensitivity analysis when nonlinear local sensitivity coefficients exist. Finally, the example is used to show the connection between the noise level of the validation exercise calculated using multivariate and univariate statistics.
WE-A-201-02: Modern Statistical Modeling
International Nuclear Information System (INIS)
Niemierko, A.
2016-01-01
Chris Marshall: Memorial Introduction Donald Edmonds Herbert Jr., or Don to his colleagues and friends, exemplified the “big tent” vision of medical physics, specializing in Applied Statistics and Dynamical Systems theory. He saw, more clearly than most, that “Making models is the difference between doing science and just fooling around [ref Woodworth, 2004]”. Don developed an interest in chemistry at school by “reading a book” - a recurring theme in his story. He was awarded a Westinghouse Science scholarship and attended the Carnegie Institute of Technology (later Carnegie Mellon University) where his interest turned to physics and led to a BS in Physics after transfer to Northwestern University. After (voluntary) service in the Navy he earned his MS in Physics from the University of Oklahoma, which led him to Johns Hopkins University in Baltimore to pursue a PhD. The early death of his wife led him to take a salaried position in the Physics Department of Colorado College in Colorado Springs so as to better care for their young daughter. There, a chance invitation from Dr. Juan del Regato to teach physics to residents at the Penrose Cancer Hospital introduced him to Medical Physics, and he decided to enter the field. He received his PhD from the University of London (UK) under Prof. Joseph Rotblat, where I first met him, and where he taught himself statistics. He returned to Penrose as a clinical medical physicist, also largely self-taught. In 1975 he formalized an evolving interest in statistical analysis as Professor of Radiology and Head of the Division of Physics and Statistics at the College of Medicine of the University of South Alabama in Mobile, AL where he remained for the rest of his career. He also served as the first Director of their Bio-Statistics and Epidemiology Core Unit working in part on a sickle-cell disease. After retirement he remained active as Professor Emeritus. Don served for several years as a consultant to the Nuclear
WE-A-201-02: Modern Statistical Modeling
Energy Technology Data Exchange (ETDEWEB)
Niemierko, A.
2016-06-15
Chris Marshall: Memorial Introduction Donald Edmonds Herbert Jr., or Don to his colleagues and friends, exemplified the “big tent” vision of medical physics, specializing in Applied Statistics and Dynamical Systems theory. He saw, more clearly than most, that “Making models is the difference between doing science and just fooling around [ref Woodworth, 2004]”. Don developed an interest in chemistry at school by “reading a book” - a recurring theme in his story. He was awarded a Westinghouse Science scholarship and attended the Carnegie Institute of Technology (later Carnegie Mellon University) where his interest turned to physics and led to a BS in Physics after transfer to Northwestern University. After (voluntary) service in the Navy he earned his MS in Physics from the University of Oklahoma, which led him to Johns Hopkins University in Baltimore to pursue a PhD. The early death of his wife led him to take a salaried position in the Physics Department of Colorado College in Colorado Springs so as to better care for their young daughter. There, a chance invitation from Dr. Juan del Regato to teach physics to residents at the Penrose Cancer Hospital introduced him to Medical Physics, and he decided to enter the field. He received his PhD from the University of London (UK) under Prof. Joseph Rotblat, where I first met him, and where he taught himself statistics. He returned to Penrose as a clinical medical physicist, also largely self-taught. In 1975 he formalized an evolving interest in statistical analysis as Professor of Radiology and Head of the Division of Physics and Statistics at the College of Medicine of the University of South Alabama in Mobile, AL where he remained for the rest of his career. He also served as the first Director of their Bio-Statistics and Epidemiology Core Unit working in part on a sickle-cell disease. After retirement he remained active as Professor Emeritus. Don served for several years as a consultant to the Nuclear
Risk prediction model: Statistical and artificial neural network approach
Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim
2017-04-01
Prediction models are increasingly gaining popularity and had been used in numerous areas of studies to complement and fulfilled clinical reasoning and decision making nowadays. The adoption of such models assist physician's decision making, individual's behavior, and consequently improve individual outcomes and the cost-effectiveness of care. The objective of this paper is to reviewed articles related to risk prediction model in order to understand the suitable approach, development and the validation process of risk prediction model. A qualitative review of the aims, methods and significant main outcomes of the nineteen published articles that developed risk prediction models from numerous fields were done. This paper also reviewed on how researchers develop and validate the risk prediction models based on statistical and artificial neural network approach. From the review done, some methodological recommendation in developing and validating the prediction model were highlighted. According to studies that had been done, artificial neural network approached in developing the prediction model were more accurate compared to statistical approach. However currently, only limited published literature discussed on which approach is more accurate for risk prediction model development.
Organism-level models: When mechanisms and statistics fail us
Phillips, M. H.; Meyer, J.; Smith, W. P.; Rockhill, J. K.
2014-03-01
Purpose: To describe the unique characteristics of models that represent the entire course of radiation therapy at the organism level and to highlight the uses to which such models can be put. Methods: At the level of an organism, traditional model-building runs into severe difficulties. We do not have sufficient knowledge to devise a complete biochemistry-based model. Statistical model-building fails due to the vast number of variables and the inability to control many of them in any meaningful way. Finally, building surrogate models, such as animal-based models, can result in excluding some of the most critical variables. Bayesian probabilistic models (Bayesian networks) provide a useful alternative that have the advantages of being mathematically rigorous, incorporating the knowledge that we do have, and being practical. Results: Bayesian networks representing radiation therapy pathways for prostate cancer and head & neck cancer were used to highlight the important aspects of such models and some techniques of model-building. A more specific model representing the treatment of occult lymph nodes in head & neck cancer were provided as an example of how such a model can inform clinical decisions. A model of the possible role of PET imaging in brain cancer was used to illustrate the means by which clinical trials can be modelled in order to come up with a trial design that will have meaningful outcomes. Conclusions: Probabilistic models are currently the most useful approach to representing the entire therapy outcome process.
Experimental, statistical, and biological models of radon carcinogenesis
International Nuclear Information System (INIS)
Cross, F.T.
1991-09-01
Risk models developed for underground miners have not been consistently validated in studies of populations exposed to indoor radon. Imprecision in risk estimates results principally from differences between exposures in mines as compared to domestic environments and from uncertainties about the interaction between cigarette-smoking and exposure to radon decay products. Uncertainties in extrapolating miner data to domestic exposures can be reduced by means of a broad-based health effects research program that addresses the interrelated issues of exposure, respiratory tract dose, carcinogenesis (molecular/cellular and animal studies, plus developing biological and statistical models), and the relationship of radon to smoking and other copollutant exposures. This article reviews experimental animal data on radon carcinogenesis observed primarily in rats at Pacific Northwest Laboratory. Recent experimental and mechanistic carcinogenesis models of exposures to radon, uranium ore dust, and cigarette smoke are presented with statistical analyses of animal data. 20 refs., 1 fig
Statistical model selection with “Big Data”
Directory of Open Access Journals (Sweden)
Jurgen A. Doornik
2015-12-01
Full Text Available Big Data offer potential benefits for statistical modelling, but confront problems including an excess of false positives, mistaking correlations for causes, ignoring sampling biases and selecting by inappropriate methods. We consider the many important requirements when searching for a data-based relationship using Big Data, and the possible role of Autometrics in that context. Paramount considerations include embedding relationships in general initial models, possibly restricting the number of variables to be selected over by non-statistical criteria (the formulation problem, using good quality data on all variables, analyzed with tight significance levels by a powerful selection procedure, retaining available theory insights (the selection problem while testing for relationships being well specified and invariant to shifts in explanatory variables (the evaluation problem, using a viable approach that resolves the computational problem of immense numbers of possible models.
Experimental, statistical and biological models of radon carcinogenesis
International Nuclear Information System (INIS)
Cross, F.T.
1992-01-01
Risk models developed for underground miners have not been consistently validated in studies of populations exposed to indoor radon. Imprecision in risk estimates results principally from differences between exposures in mines as compared with domestic environments and from uncertainties about the interaction between cigarette smoking and exposure to radon decay products. Uncertainties in extrapolating miner data to domestic exposures can be reduced by means of a broad-based health effects research programme that addresses the interrelated issues of exposure, respiratory tract dose, carcinogenesis (molecular/cellular and animal studies, plus developing biological and statistical models) and the relationship of radon to smoking and other co-pollutant exposures. This article reviews experimental animal data on radon carcinogenesis observed primarily in rats at Pacific Northwest Laboratory. Recent experimental and mechanistic carcinogenesis models of exposures to radon, uranium ore dust, and cigarette smoke are presented with statistical analyses of animal data. (author)
Statistical 3D damage accumulation model for ion implant simulators
International Nuclear Information System (INIS)
Hernandez-Mangas, J.M.; Lazaro, J.; Enriquez, L.; Bailon, L.; Barbolla, J.; Jaraiz, M.
2003-01-01
A statistical 3D damage accumulation model, based on the modified Kinchin-Pease formula, for ion implant simulation has been included in our physically based ion implantation code. It has only one fitting parameter for electronic stopping and uses 3D electron density distributions for different types of targets including compound semiconductors. Also, a statistical noise reduction mechanism based on the dose division is used. The model has been adapted to be run under parallel execution in order to speed up the calculation in 3D structures. Sequential ion implantation has been modelled including previous damage profiles. It can also simulate the implantation of molecular and cluster projectiles. Comparisons of simulated doping profiles with experimental SIMS profiles are presented. Also comparisons between simulated amorphization and experimental RBS profiles are shown. An analysis of sequential versus parallel processing is provided
Statistical 3D damage accumulation model for ion implant simulators
Hernandez-Mangas, J M; Enriquez, L E; Bailon, L; Barbolla, J; Jaraiz, M
2003-01-01
A statistical 3D damage accumulation model, based on the modified Kinchin-Pease formula, for ion implant simulation has been included in our physically based ion implantation code. It has only one fitting parameter for electronic stopping and uses 3D electron density distributions for different types of targets including compound semiconductors. Also, a statistical noise reduction mechanism based on the dose division is used. The model has been adapted to be run under parallel execution in order to speed up the calculation in 3D structures. Sequential ion implantation has been modelled including previous damage profiles. It can also simulate the implantation of molecular and cluster projectiles. Comparisons of simulated doping profiles with experimental SIMS profiles are presented. Also comparisons between simulated amorphization and experimental RBS profiles are shown. An analysis of sequential versus parallel processing is provided.
SoS contract verification using statistical model checking
Directory of Open Access Journals (Sweden)
Alessandro Mignogna
2013-11-01
Full Text Available Exhaustive formal verification for systems of systems (SoS is impractical and cannot be applied on a large scale. In this paper we propose to use statistical model checking for efficient verification of SoS. We address three relevant aspects for systems of systems: 1 the model of the SoS, which includes stochastic aspects; 2 the formalization of the SoS requirements in the form of contracts; 3 the tool-chain to support statistical model checking for SoS. We adapt the SMC technique for application to heterogeneous SoS. We extend the UPDM/SysML specification language to express the SoS requirements that the implemented strategies over the SoS must satisfy. The requirements are specified with a new contract language specifically designed for SoS, targeting a high-level English- pattern language, but relying on an accurate semantics given by the standard temporal logics. The contracts are verified against the UPDM/SysML specification using the Statistical Model Checker (SMC PLASMA combined with the simulation engine DESYRE, which integrates heterogeneous behavioral models through the functional mock-up interface (FMI standard. The tool-chain allows computing an estimation of the satisfiability of the contracts by the SoS. The results help the system architect to trade-off different solutions to guide the evolution of the SoS.
Conceptualizations of Personality Disorders with the Five Factor Model-Count and Empathy Traits
Kajonius, Petri J.; Dåderman, Anna M.
2017-01-01
Previous research has long advocated that emotional and behavioral disorders are related to general personality traits, such as the Five Factor Model (FFM). The addition of section III in the latest "Diagnostic and Statistical Manual of Mental Disorders" (DSM) recommends that extremity in personality traits together with maladaptive…
Mathematical Modeling of a Lit-End Cigarette: Puffing Cycle and Effects of Puff Counts
Directory of Open Access Journals (Sweden)
Saidi MS
2014-12-01
Full Text Available The burning cycles of a lit-end cigarette were numerically simulated using a 3-D model that includes both the cigarette and its surrounding ambient air and the effects of buoyancy forces. The solid and gas phases were treated separately in a thermally non-equilibrium environment. The tobacco pyrolysis and char oxidation were modeled using multi-precursor models. The changes in tobacco column porosity and its subsequent effects on permeability and gas diffusivity were included. The mass, momentum, energy, and species transport equations were solved in a discretized computational domain using a commercially available computational fluid dynamics (CFD code. The model was applied to puff a cigarette under different puffing intensities and the effects of puff volume, puff profile, and puff duration were studied. The results show that the model is capable of reproducing the major features of a burning cigarette during both smoldering and puffing. For the puffing and puff-by-puff cases, the solid and gas temperatures as well as those mainstream smoke constituents predicted by the model are in a good agreement with experimental results. A parametric study shows the significant effect of puff volume, puff profile, ventilation rate, and puff counts on solid and gas phase temperatures as well as gaseous species concentrations and mainstream smoke delivery. The buoyancy forces have shown to be very important in both smoldering and puffing.
Nuclear EMC effect in non-extensive statistical model
Energy Technology Data Exchange (ETDEWEB)
Trevisan, Luis A. [Departamento de Matematica e Estatistica, Universidade Estadual de Ponta Grossa, 84010-790, Ponta Grossa, PR (Brazil); Mirez, Carlos [ICET, Universidade Federal dos Vales do Jequitinhonha e Mucuri - UFVJM, Campus do Mucuri, Rua do Cruzeiro 01, Jardim Sao Paulo, 39803-371, Teofilo Otoni, MG (Brazil)
2013-05-06
In the present work, we attempt to describe the nuclear EMC effect by using the proton structure functions obtained from the non-extensive statistical quark model. We record that such model has three fundamental variables, the temperature T, the radius, and the Tsallis parameter q. By combining different small changes, a good agreement with the experimental data may be obtained. Another interesting point of the model is to allow phenomenological interpretation, for instance, with q constant and changing the radius and the temperature or changing the radius and q and keeping the temperature.
Martin, Justin D.
2017-01-01
This essay presents data from a census of statistics requirements and offerings at all 4-year journalism programs in the United States (N = 369) and proposes a model of a potential course in statistics for journalism majors. The author proposes that three philosophies underlie a statistics course for journalism students. Such a course should (a)…
Statistical modelling of a new global potential vegetation distribution
Levavasseur, G.; Vrac, M.; Roche, D. M.; Paillard, D.
2012-12-01
The potential natural vegetation (PNV) distribution is required for several studies in environmental sciences. Most of the available databases are quite subjective or depend on vegetation models. We have built a new high-resolution world-wide PNV map using a objective statistical methodology based on multinomial logistic models. Our method appears as a fast and robust alternative in vegetation modelling, independent of any vegetation model. In comparison with other databases, our method provides a realistic PNV distribution in agreement with respect to BIOME 6000 data. Among several advantages, the use of probabilities allows us to estimate the uncertainty, bringing some confidence in the modelled PNV, or to highlight the regions needing some data to improve the PNV modelling. Despite our PNV map being highly dependent on the distribution of data points, it is easily updatable as soon as additional data are available and provides very useful additional information for further applications.
Bayesian statistic methods and theri application in probabilistic simulation models
Directory of Open Access Journals (Sweden)
Sergio Iannazzo
2007-03-01
Full Text Available Bayesian statistic methods are facing a rapidly growing level of interest and acceptance in the field of health economics. The reasons of this success are probably to be found on the theoretical fundaments of the discipline that make these techniques more appealing to decision analysis. To this point should be added the modern IT progress that has developed different flexible and powerful statistical software framework. Among them probably one of the most noticeably is the BUGS language project and its standalone application for MS Windows WinBUGS. Scope of this paper is to introduce the subject and to show some interesting applications of WinBUGS in developing complex economical models based on Markov chains. The advantages of this approach reside on the elegance of the code produced and in its capability to easily develop probabilistic simulations. Moreover an example of the integration of bayesian inference models in a Markov model is shown. This last feature let the analyst conduce statistical analyses on the available sources of evidence and exploit them directly as inputs in the economic model.
Spatio-temporal statistical models with applications to atmospheric processes
International Nuclear Information System (INIS)
Wikle, C.K.
1996-01-01
This doctoral dissertation is presented as three self-contained papers. An introductory chapter considers traditional spatio-temporal statistical methods used in the atmospheric sciences from a statistical perspective. Although this section is primarily a review, many of the statistical issues considered have not been considered in the context of these methods and several open questions are posed. The first paper attempts to determine a means of characterizing the semiannual oscillation (SAO) spatial variation in the northern hemisphere extratropical height field. It was discovered that the midlatitude SAO in 500hPa geopotential height could be explained almost entirely as a result of spatial and temporal asymmetries in the annual variation of stationary eddies. It was concluded that the mechanism for the SAO in the northern hemisphere is a result of land-sea contrasts. The second paper examines the seasonal variability of mixed Rossby-gravity waves (MRGW) in lower stratospheric over the equatorial Pacific. Advanced cyclostationary time series techniques were used for analysis. It was found that there are significant twice-yearly peaks in MRGW activity. Analyses also suggested a convergence of horizontal momentum flux associated with these waves. In the third paper, a new spatio-temporal statistical model is proposed that attempts to consider the influence of both temporal and spatial variability. This method is mainly concerned with prediction in space and time, and provides a spatially descriptive and temporally dynamic model
Spatio-temporal statistical models with applications to atmospheric processes
Energy Technology Data Exchange (ETDEWEB)
Wikle, Christopher K. [Iowa State Univ., Ames, IA (United States)
1996-01-01
This doctoral dissertation is presented as three self-contained papers. An introductory chapter considers traditional spatio-temporal statistical methods used in the atmospheric sciences from a statistical perspective. Although this section is primarily a review, many of the statistical issues considered have not been considered in the context of these methods and several open questions are posed. The first paper attempts to determine a means of characterizing the semiannual oscillation (SAO) spatial variation in the northern hemisphere extratropical height field. It was discovered that the midlatitude SAO in 500hPa geopotential height could be explained almost entirely as a result of spatial and temporal asymmetries in the annual variation of stationary eddies. It was concluded that the mechanism for the SAO in the northern hemisphere is a result of land-sea contrasts. The second paper examines the seasonal variability of mixed Rossby-gravity waves (MRGW) in lower stratospheric over the equatorial Pacific. Advanced cyclostationary time series techniques were used for analysis. It was found that there are significant twice-yearly peaks in MRGW activity. Analyses also suggested a convergence of horizontal momentum flux associated with these waves. In the third paper, a new spatio-temporal statistical model is proposed that attempts to consider the influence of both temporal and spatial variability. This method is mainly concerned with prediction in space and time, and provides a spatially descriptive and temporally dynamic model.
1981-10-01
Two statistical procedures have been developed to estimate hourly or daily aircraft counts. These counts can then be transformed into estimates of instantaneous air counts. The first procedure estimates the stable (deterministic) mean level of hourly...
Can spatial statistical river temperature models be transferred between catchments?
Jackson, Faye L.; Fryer, Robert J.; Hannah, David M.; Malcolm, Iain A.
2017-09-01
There has been increasing use of spatial statistical models to understand and predict river temperature (Tw) from landscape covariates. However, it is not financially or logistically feasible to monitor all rivers and the transferability of such models has not been explored. This paper uses Tw data from four river catchments collected in August 2015 to assess how well spatial regression models predict the maximum 7-day rolling mean of daily maximum Tw (Twmax) within and between catchments. Models were fitted for each catchment separately using (1) landscape covariates only (LS models) and (2) landscape covariates and an air temperature (Ta) metric (LS_Ta models). All the LS models included upstream catchment area and three included a river network smoother (RNS) that accounted for unexplained spatial structure. The LS models transferred reasonably to other catchments, at least when predicting relative levels of Twmax. However, the predictions were biased when mean Twmax differed between catchments. The RNS was needed to characterise and predict finer-scale spatially correlated variation. Because the RNS was unique to each catchment and thus non-transferable, predictions were better within catchments than between catchments. A single model fitted to all catchments found no interactions between the landscape covariates and catchment, suggesting that the landscape relationships were transferable. The LS_Ta models transferred less well, with particularly poor performance when the relationship with the Ta metric was physically implausible or required extrapolation outside the range of the data. A single model fitted to all catchments found catchment-specific relationships between Twmax and the Ta metric, indicating that the Ta metric was not transferable. These findings improve our understanding of the transferability of spatial statistical river temperature models and provide a foundation for developing new approaches for predicting Tw at unmonitored locations across
Directory of Open Access Journals (Sweden)
Michelle Cronin
2010-09-01
Full Text Available The time of year and day, the state of the tide and prevailing environmental conditions significantly influence seal haulout behaviour. Understanding these effects is fundamentally important in deriving accurate estimates of harbour seal abundance from haulout data. We present a modelling approach to assess the influence of these variables on seals’ haulout behaviour and, by identifying the combination of covariates during which seal abundance is highest, predict the optimal time and conditions for future surveys. Count data of harbour seals at haulouts in southwest Ireland collected during 2003-2005 were included in mixed additive models together with environmental covariates, including season, time of day and weather conditions. The models show maximum abundance at haulout sites occurred during midday periods during August and in late afternoon/early evening during September. Accurate national and local population estimates are essential for the effective monitoring of the conservation status of the species and for the identification, management and monitoring of Special Areas of Conservation (SAC in accordance with the EU Habitats Directive. Our model based approach provides a useful tool for optimising the timing of harbourseal surveys in Ireland and the modelling framework is useful for predicting optimal survey periods for other protected, endangered or significant species worldwide.
Specific count model for investing the related factors of cost of GERD and functional dyspepsia
Abadi, Alireza; Chaibakhsh, Samira; Safaee, Azadeh; Moghimi-Dehkordi, Bijan
2013-01-01
Aim The purpose of this study is to analyze the cost of GERD and functional dyspepsia for investing its related factors. Background Gastro-oesophageal reflux disease GERD and dyspepsia are the most common symptoms of gastrointestinal disorders. Recent studies showed high prevalence and variety of clinical presentation of these two symptoms imposed enormous economic burden to the society. Cost data that related to economics burden have specific characteristics. So this kind of data needs to specific models. Poisson regression (PR) and negative binomial regression (NB) are the models that were used for analyzing cost data in this paper. Patients and methods This study designed as a cross-sectional household survey from May 2006 to December 2007 on a random sample of individual in the Tehran province, Iran to find the prevalence of gastrointestinal symptoms and disorders and its related factors. The Cost in each item was counted. PR and NB were carried out to the data respectively. Likelihood ratio test was performed for comparison between models. Also Log likelihood, Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were used to compare performance of the models. Results According to Likelihood ratio test and all three criterions that we used to compare performance of the models, NB was the best model for analyzing this cost data. Sex, age and insurance statues were being significant. Conclusion PR and NB models were carried out for this data and according the results improved fit of the NB model over PR, it clearly indicates that over-dispersion is involved due to unobserved heterogeneity and/or clustering. NB model in cost data more appropriate fit than PR. PMID:24834282
Statistical volumetric model for characterization and visualization of prostate cancer
Lu, Jianping; Srikanchana, Rujirutana; McClain, Maxine A.; Wang, Yue J.; Xuan, Jian Hua; Sesterhenn, Isabell A.; Freedman, Matthew T.; Mun, Seong K.
2000-04-01
To reveal the spatial pattern of localized prostate cancer distribution, a 3D statistical volumetric model, showing the probability map of prostate cancer distribution, together with the anatomical structure of the prostate, has been developed from 90 digitally-imaged surgical specimens. Through an enhanced virtual environment with various visualization modes, this master model permits for the first time an accurate characterization and understanding of prostate cancer distribution patterns. The construction of the statistical volumetric model is characterized by mapping all of the individual models onto a generic prostate site model, in which a self-organizing scheme is used to decompose a group of contours representing multifold tumors into localized tumor elements. Next crucial step of creating the master model is the development of an accurate multi- object and non-rigid registration/warping scheme incorporating various variations among these individual moles in true 3D. This is achieved with a multi-object based principle-axis alignment followed by an affine transform, and further fine-tuned by a thin-plate spline interpolation driven by the surface based deformable warping dynamics. Based on the accurately mapped tumor distribution, a standard finite normal mixture is used to model the cancer volumetric distribution statistics, whose parameters are estimated using both the K-means and expectation- maximization algorithms under the information theoretic criteria. Given the desired number of tissue samplings, the prostate needle biopsy site selection is optimized through a probabilistic self-organizing map thus achieving a maximum likelihood of cancer detection. We describe the details of our theory and methodology, and report our pilot results and evaluation of the effectiveness of the algorithm in characterizing prostate cancer distributions and optimizing needle biopsy techniques.
Spin studies of nucleons in a statistical model
International Nuclear Information System (INIS)
Singh, J P; Upadhyay, Alka
2004-01-01
We decompose various quark-gluon Fock states of a nucleon in a set of states in which each of the three-quark core and the rest of the stuff, termed as sea, appears with definite spin and colour quantum number, their weights being determined, statistically, from their multiplicities. The expansion coefficients in the quark-gluon Fock state expansion have been taken from a recently proposed statistical model. We have also considered two modifications of this model with a view to reducing the contributions of the sea components with higher multiplicities. With certain approximations, we have calculated the quark contributions to the spin of the nucleon, the ratio of the magnetic moments of nucleons, their weak decay constant and the ratio of SU(3) reduced matrix elements for the axial current
Statistical inference to advance network models in epidemiology.
Welch, David; Bansal, Shweta; Hunter, David R
2011-03-01
Contact networks are playing an increasingly important role in the study of epidemiology. Most of the existing work in this area has focused on considering the effect of underlying network structure on epidemic dynamics by using tools from probability theory and computer simulation. This work has provided much insight on the role that heterogeneity in host contact patterns plays on infectious disease dynamics. Despite the important understanding afforded by the probability and simulation paradigm, this approach does not directly address important questions about the structure of contact networks such as what is the best network model for a particular mode of disease transmission, how parameter values of a given model should be estimated, or how precisely the data allow us to estimate these parameter values. We argue that these questions are best answered within a statistical framework and discuss the role of statistical inference in estimating contact networks from epidemiological data. Copyright © 2011 Elsevier B.V. All rights reserved.
A statistical method for descriminating between alternative radiobiological models
International Nuclear Information System (INIS)
Kinsella, I.A.; Malone, J.F.
1977-01-01
Radiobiological models assist understanding of the development of radiation damage, and may provide a basis for extrapolating dose-effect curves from high to low dose regions. Many models have been proposed such as multitarget and its modifications, enzymatic models, and those with a quadratic dose response relationship (i.e. αD + βD 2 forms). It is difficult to distinguish between these because the statistical techniques used are almost always limited, in that one method can rarely be applied to the whole range of models. A general statistical procedure for parameter estimation (Maximum Liklihood Method) has been found applicable to a wide range of radiobiological models. The curve parameters are estimated using a computerised search that continues until the most likely set of values to fit the data is obtained. When the search is complete two procedures are carried out. First a goodness of fit test is applied which examines the applicability of an individual model to the data. Secondly an index is derived which provides an indication of the adequacy of any model compared with alternative models. Thus the models may be ranked according to how well they fit the data. For example, with one set of data, multitarget types were found to be more suitable than quadratic types (αD + βD 2 ). This method should be of assitance is evaluating various models. It may also be profitably applied to selection of the most appropriate model to use, when it is necessary to extrapolate from high to low doses
A statistical model of structure functions and quantum chromodynamics
International Nuclear Information System (INIS)
Mac, E.; Ugaz, E.; Universidad Nacional de Ingenieria, Lima
1989-01-01
We consider a model for the x-dependence of the quark distributions in the proton. Within the context of simple statistical assumptions, we obtain the parton densities in the infinite momentum frame. In a second step lowest order QCD corrections are incorporated to these distributions. Crude, but reasonable, agreement with experiment is found for the F 2 , valence and q, anti q distributions for x> or approx.0.2. (orig.)
A Statistical Model for Soliton Particle Interaction in Plasmas
DEFF Research Database (Denmark)
Dysthe, K. B.; Pécseli, Hans; Truelsen, J.
1986-01-01
A statistical model for soliton-particle interaction is presented. A master equation is derived for the time evolution of the particle velocity distribution as induced by resonant interaction with Korteweg-de Vries solitons. The detailed energy balance during the interaction subsequently determines...... the evolution of the soliton amplitude distribution. The analysis applies equally well for weakly nonlinear plasma waves in a strongly magnetized waveguide, or for ion acoustic waves propagating in one-dimensional systems....
Physical-Statistical Model of Thermal Conductivity of Nanofluids
Directory of Open Access Journals (Sweden)
B. Usowicz
2014-01-01
Full Text Available A physical-statistical model for predicting the effective thermal conductivity of nanofluids is proposed. The volumetric unit of nanofluids in the model consists of solid, liquid, and gas particles and is treated as a system made up of regular geometric figures, spheres, filling the volumetric unit by layers. The model assumes that connections between layers of the spheres and between neighbouring spheres in the layer are represented by serial and parallel connections of thermal resistors, respectively. This model is expressed in terms of thermal resistance of nanoparticles and fluids and the multinomial distribution of particles in the nanofluids. The results for predicted and measured effective thermal conductivity of several nanofluids (Al2O3/ethylene glycol-based and Al2O3/water-based; CuO/ethylene glycol-based and CuO/water-based; and TiO2/ethylene glycol-based are presented. The physical-statistical model shows a reasonably good agreement with the experimental results and gives more accurate predictions for the effective thermal conductivity of nanofluids compared to existing classical models.
Statistical modeling of global geogenic fluoride contamination in groundwaters.
Amini, Manouchehr; Mueller, Kim; Abbaspour, Karim C; Rosenberg, Thomas; Afyuni, Majid; Møller, Klaus N; Sarr, Mamadou; Johnson, C Annette
2008-05-15
The use of groundwater with high fluoride concentrations poses a health threat to millions of people around the world. This study aims at providing a global overview of potentially fluoride-rich groundwaters by modeling fluoride concentration. A large database of worldwide fluoride concentrations as well as available information on related environmental factors such as soil properties, geological settings, and climatic and topographical information on a global scale have all been used in the model. The modeling approach combines geochemical knowledge with statistical methods to devise a rule-based statistical procedure, which divides the world into 8 different "process regions". For each region a separate predictive model was constructed. The end result is a global probability map of fluoride concentration in the groundwater. Comparisons of the modeled and measured data indicate that 60-70% of the fluoride variation could be explained by the models in six process regions, while in two process regions only 30% of the variation in the measured data was explained. Furthermore, the global probability map corresponded well with fluorotic areas described in the international literature. Although the probability map should not replace fluoride testing, it can give a first indication of possible contamination and thus may support the planning process of new drinking water projects.
Testa, Francesco; Marano, Giuseppe; Ambrogi, Federico; Boracchi, Patrizia; Casula, Antonio; Biganzoli, Elia; Moroni, Paolo
2017-10-01
Elevated bulk tank milk somatic cell count (BMSCC) has a negative impact on milk production, milk quality, and animal health. Seasonal increases in herd level somatic cell count (SCC) are commonly associated with elevated environmental temperature and humidity. The Temperature Humidity Index (THI) has been developed to measure general environmental stress in dairy cattle; however, additional work is needed to determine a specific effect of the heat stress index on herd-level SCC. Generalized Additive Model methods were used for a flexible exploration of the relationships between daily temperature, relative humidity, and bulk milk somatic cell count. The data consist of BMSCC and meteorological recordings collected between March 2009 and October 2011 of 10 dairy farms. The results indicate that, an average increase of 0.16% of BMSCC is expected for an increase of 1°C degree of temperature. A complex relationship was found for relative humidity. For example, increase of 0.099%, 0.037% and 0.020% are expected in correspondence to an increase of relative humidity from 50% to 51%, 80% to 81%; and 90% to 91%, respectively. Using this model, it will be possible to provide evidence-based advice to dairy farmers for the use of THI control charts created on the basis of our statistical model. Copyright © 2017 Elsevier Ltd. All rights reserved.
UPPAAL-SMC: Statistical Model Checking for Priced Timed Automata
DEFF Research Database (Denmark)
Bulychev, Petr; David, Alexandre; Larsen, Kim Guldstrand
2012-01-01
in the form of probability distributions and compare probabilities to analyze performance aspects of systems. The focus of the survey is on the evolution of the tool – including modeling and specification formalisms as well as techniques applied – together with applications of the tool to case studies....... on a series of extensions of the statistical model checking approach generalized to handle real-time systems and estimate undecidable problems. U PPAAL - SMC comes together with a friendly user interface that allows a user to specify complex problems in an efficient manner as well as to get feedback...
Statistical mechanics of attractor neural network models with synaptic depression
International Nuclear Information System (INIS)
Igarashi, Yasuhiko; Oizumi, Masafumi; Otsubo, Yosuke; Nagata, Kenji; Okada, Masato
2009-01-01
Synaptic depression is known to control gain for presynaptic inputs. Since cortical neurons receive thousands of presynaptic inputs, and their outputs are fed into thousands of other neurons, the synaptic depression should influence macroscopic properties of neural networks. We employ simple neural network models to explore the macroscopic effects of synaptic depression. Systems with the synaptic depression cannot be analyzed due to asymmetry of connections with the conventional equilibrium statistical-mechanical approach. Thus, we first propose a microscopic dynamical mean field theory. Next, we derive macroscopic steady state equations and discuss the stabilities of steady states for various types of neural network models.
Linguistically motivated statistical machine translation models and algorithms
Xiong, Deyi
2015-01-01
This book provides a wide variety of algorithms and models to integrate linguistic knowledge into Statistical Machine Translation (SMT). It helps advance conventional SMT to linguistically motivated SMT by enhancing the following three essential components: translation, reordering and bracketing models. It also serves the purpose of promoting the in-depth study of the impacts of linguistic knowledge on machine translation. Finally it provides a systematic introduction of Bracketing Transduction Grammar (BTG) based SMT, one of the state-of-the-art SMT formalisms, as well as a case study of linguistically motivated SMT on a BTG-based platform.
Efficient Parallel Statistical Model Checking of Biochemical Networks
Directory of Open Access Journals (Sweden)
Paolo Ballarini
2009-12-01
Full Text Available We consider the problem of verifying stochastic models of biochemical networks against behavioral properties expressed in temporal logic terms. Exact probabilistic verification approaches such as, for example, CSL/PCTL model checking, are undermined by a huge computational demand which rule them out for most real case studies. Less demanding approaches, such as statistical model checking, estimate the likelihood that a property is satisfied by sampling executions out of the stochastic model. We propose a methodology for efficiently estimating the likelihood that a LTL property P holds of a stochastic model of a biochemical network. As with other statistical verification techniques, the methodology we propose uses a stochastic simulation algorithm for generating execution samples, however there are three key aspects that improve the efficiency: first, the sample generation is driven by on-the-fly verification of P which results in optimal overall simulation time. Second, the confidence interval estimation for the probability of P to hold is based on an efficient variant of the Wilson method which ensures a faster convergence. Third, the whole methodology is designed according to a parallel fashion and a prototype software tool has been implemented that performs the sampling/verification process in parallel over an HPC architecture.
Statistical models for expert judgement and wear prediction
International Nuclear Information System (INIS)
Pulkkinen, U.
1994-01-01
This thesis studies the statistical analysis of expert judgements and prediction of wear. The point of view adopted is the one of information theory and Bayesian statistics. A general Bayesian framework for analyzing both the expert judgements and wear prediction is presented. Information theoretic interpretations are given for some averaging techniques used in the determination of consensus distributions. Further, information theoretic models are compared with a Bayesian model. The general Bayesian framework is then applied in analyzing expert judgements based on ordinal comparisons. In this context, the value of information lost in the ordinal comparison process is analyzed by applying decision theoretic concepts. As a generalization of the Bayesian framework, stochastic filtering models for wear prediction are formulated. These models utilize the information from condition monitoring measurements in updating the residual life distribution of mechanical components. Finally, the application of stochastic control models in optimizing operational strategies for inspected components are studied. Monte-Carlo simulation methods, such as the Gibbs sampler and the stochastic quasi-gradient method, are applied in the determination of posterior distributions and in the solution of stochastic optimization problems. (orig.) (57 refs., 7 figs., 1 tab.)
Improving EWMA Plans for Detecting Unusual Increases in Poisson Counts
Directory of Open Access Journals (Sweden)
R. S. Sparks
2009-01-01
adaptive exponentially weighted moving average (EWMA plan is developed for signalling unusually high incidence when monitoring a time series of nonhomogeneous daily disease counts. A Poisson transitional regression model is used to fit background/expected trend in counts and provides “one-day-ahead” forecasts of the next day's count. Departures of counts from their forecasts are monitored. The paper outlines an approach for improving early outbreak data signals by dynamically adjusting the exponential weights to be efficient at signalling local persistent high side changes. We emphasise outbreak signals in steady-state situations; that is, changes that occur after the EWMA statistic had run through several in-control counts.
Model-generated air quality statistics for application in vegetation response models in Alberta
International Nuclear Information System (INIS)
McVehil, G.E.; Nosal, M.
1990-01-01
To test and apply vegetation response models in Alberta, air pollution statistics representative of various parts of the Province are required. At this time, air quality monitoring data of the requisite accuracy and time resolution are not available for most parts of Alberta. Therefore, there exists a need to develop appropriate air quality statistics. The objectives of the work reported here were to determine the applicability of model generated air quality statistics and to develop by modelling, realistic and representative time series of hourly SO 2 concentrations that could be used to generate the statistics demanded by vegetation response models
The Impact of Statistical Leakage Models on Design Yield Estimation
Directory of Open Access Journals (Sweden)
Rouwaida Kanj
2011-01-01
Full Text Available Device mismatch and process variation models play a key role in determining the functionality and yield of sub-100 nm design. Average characteristics are often of interest, such as the average leakage current or the average read delay. However, detecting rare functional fails is critical for memory design and designers often seek techniques that enable accurately modeling such events. Extremely leaky devices can inflict functionality fails. The plurality of leaky devices on a bitline increase the dimensionality of the yield estimation problem. Simplified models are possible by adopting approximations to the underlying sum of lognormals. The implications of such approximations on tail probabilities may in turn bias the yield estimate. We review different closed form approximations and compare against the CDF matching method, which is shown to be most effective method for accurate statistical leakage modeling.
The GNASH preequilibrium-statistical nuclear model code
International Nuclear Information System (INIS)
Arthur, E. D.
1988-01-01
The following report is based on materials presented in a series of lectures at the International Center for Theoretical Physics, Trieste, which were designed to describe the GNASH preequilibrium statistical model code and its use. An overview is provided of the code with emphasis upon code's calculational capabilities and the theoretical models that have been implemented in it. Two sample problems are discussed, the first dealing with neutron reactions on 58 Ni. the second illustrates the fission model capabilities implemented in the code and involves n + 235 U reactions. Finally a description is provided of current theoretical model and code development underway. Examples of calculated results using these new capabilities are also given. 19 refs., 17 figs., 3 tabs
Estimating Predictive Variance for Statistical Gas Distribution Modelling
International Nuclear Information System (INIS)
Lilienthal, Achim J.; Asadi, Sahar; Reggente, Matteo
2009-01-01
Recent publications in statistical gas distribution modelling have proposed algorithms that model mean and variance of a distribution. This paper argues that estimating the predictive concentration variance entails not only a gradual improvement but is rather a significant step to advance the field. This is, first, since the models much better fit the particular structure of gas distributions, which exhibit strong fluctuations with considerable spatial variations as a result of the intermittent character of gas dispersal. Second, because estimating the predictive variance allows to evaluate the model quality in terms of the data likelihood. This offers a solution to the problem of ground truth evaluation, which has always been a critical issue for gas distribution modelling. It also enables solid comparisons of different modelling approaches, and provides the means to learn meta parameters of the model, to determine when the model should be updated or re-initialised, or to suggest new measurement locations based on the current model. We also point out directions of related ongoing or potential future research work.
Directory of Open Access Journals (Sweden)
Prabhakaran T. Raghu
2014-07-01
Full Text Available Sustainable agricultural practices require, among other factors, adoption of improved nutrient management techniques, pest mitigation technology and soil conservation measures. Such improved management practices can be tools for enhancing crop productivity. Data on micro-level farm management practices from developing countries is either scarce or unavailable, despite the importance of their policy implications with regard to resource allocation. The present study investigates adoption of some farm management practices and factors influencing the adoption behavior of farm households in three agrobiodiversity hotspots in India: Kundra block in the Koraput district of Odisha, Meenangadi panchayat in the Wayanad district of Kerala and Kolli Hills in the Namakkal district of Tamil Nadu. Information on farm management practices was collected from November 2011 to February 2012 from 3845 households, of which the data from 2726 farm households was used for analysis. The three most popular farm management practices adopted by farmers include: application of chemical fertilizers, farm yard manure and green manure for managing nutrients; application of chemical pesticides, inter-cropping and mixed cropping for mitigating pests; and contour bunds, grass bunds and trenches for soil conservation. A Negative Binomial count data regression model was used to estimate factors influencing decision-making by farmers on farm management practices. The regression results indicate that farmers who received information from agricultural extension are statistically significant and positively related to the adoption of farm management practices. Another key finding shows the negative relationship between cultivation of local varieties and adoption of farm management practices.
Editorial to: Six papers on Dynamic Statistical Models
DEFF Research Database (Denmark)
2014-01-01
statistical methodology and theory for large and complex data sets that included biostatisticians and mathematical statisticians from three faculties at the University of Copenhagen. The satellite meeting took place August 17–19, 2011. Its purpose was to bring together researchers in statistics and related...... Group-Sequential Covariate-Adjusted Randomized Clinical Trials Antoine Chambaz and Mark J. van der Laan Estimation of Causal Odds of Concordance using the Aalen Additive Model Torben Martinussen and Christian Bressen Pipper We would like to acknowledge the financial support from the University...... of Copenhagen Program of Excellence and Elsevier. We would also like to thank the authors for contributing interesting papers, the referees for their helpful reports, and the present and previous editors of SJS for their support of the publication of the papers from the satellite meeting....
The Statistical Multifragmentation Model with Skyrme Effective Interactions
Carlson, B V; Donangelo, R; Lynch, W G; Steiner, A W; Tsang, M B
2010-01-01
The Statistical Multifragmentation Model is modified to incorporate Helmholtz free energies calculated in the finite temperature Thomas-Fermi approximation using Skyrme effective interactions. In this formulation, the density of the fragments at the freeze-out configuration corresponds to the equilibrium value obtained in the Thomas-Fermi approximation at the given temperature. The behavior of the nuclear caloric curve, at constant volume, is investigated in the micro-canonical ensemble and a plateau is observed for excitation energies between 8 and 10 MeV per nucleon. A small kink in the caloric curve is found at the onset of this gas transition, indicating the existence of negative heat capacity, even in this case in which the system is constrained to a fixed volume, in contrast to former statistical calculations.
Bayesian Sensitivity Analysis of Statistical Models with Missing Data.
Zhu, Hongtu; Ibrahim, Joseph G; Tang, Niansheng
2014-04-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.
Model output statistics applied to wind power prediction
Energy Technology Data Exchange (ETDEWEB)
Joensen, A.; Giebel, G.; Landberg, L. [Risoe National Lab., Roskilde (Denmark); Madsen, H.; Nielsen, H.A. [The Technical Univ. of Denmark, Dept. of Mathematical Modelling, Lyngby (Denmark)
1999-03-01
Being able to predict the output of a wind farm online for a day or two in advance has significant advantages for utilities, such as better possibility to schedule fossil fuelled power plants and a better position on electricity spot markets. In this paper prediction methods based on Numerical Weather Prediction (NWP) models are considered. The spatial resolution used in NWP models implies that these predictions are not valid locally at a specific wind farm. Furthermore, due to the non-stationary nature and complexity of the processes in the atmosphere, and occasional changes of NWP models, the deviation between the predicted and the measured wind will be time dependent. If observational data is available, and if the deviation between the predictions and the observations exhibits systematic behavior, this should be corrected for; if statistical methods are used, this approaches is usually referred to as MOS (Model Output Statistics). The influence of atmospheric turbulence intensity, topography, prediction horizon length and auto-correlation of wind speed and power is considered, and to take the time-variations into account, adaptive estimation methods are applied. Three estimation techniques are considered and compared, Extended Kalman Filtering, recursive least squares and a new modified recursive least squares algorithm. (au) EU-JOULE-3. 11 refs.
A statistical model for interpreting computerized dynamic posturography data
Feiveson, Alan H.; Metter, E. Jeffrey; Paloski, William H.
2002-01-01
Computerized dynamic posturography (CDP) is widely used for assessment of altered balance control. CDP trials are quantified using the equilibrium score (ES), which ranges from zero to 100, as a decreasing function of peak sway angle. The problem of how best to model and analyze ESs from a controlled study is considered. The ES often exhibits a skewed distribution in repeated trials, which can lead to incorrect inference when applying standard regression or analysis of variance models. Furthermore, CDP trials are terminated when a patient loses balance. In these situations, the ES is not observable, but is assigned the lowest possible score--zero. As a result, the response variable has a mixed discrete-continuous distribution, further compromising inference obtained by standard statistical methods. Here, we develop alternative methodology for analyzing ESs under a stochastic model extending the ES to a continuous latent random variable that always exists, but is unobserved in the event of a fall. Loss of balance occurs conditionally, with probability depending on the realized latent ES. After fitting the model by a form of quasi-maximum-likelihood, one may perform statistical inference to assess the effects of explanatory variables. An example is provided, using data from the NIH/NIA Baltimore Longitudinal Study on Aging.
Moghimbeigi, Abbas
2015-05-07
Poisson regression models provide a standard framework for quantitative trait locus (QTL) mapping of count traits. In practice, however, count traits are often over-dispersed relative to the Poisson distribution. In these situations, the zero-inflated Poisson (ZIP), zero-inflated generalized Poisson (ZIGP) and zero-inflated negative binomial (ZINB) regression may be useful for QTL mapping of count traits. Added genetic variables to the negative binomial part equation, may also affect extra zero data. In this study, to overcome these challenges, I apply two-part ZINB model. The EM algorithm with Newton-Raphson method in the M-step uses for estimating parameters. An application of the two-part ZINB model for QTL mapping is considered to detect associations between the formation of gallstone and the genotype of markers. Copyright © 2015 Elsevier Ltd. All rights reserved.
Carb counting; Carbohydrate-controlled diet; Diabetic diet; Diabetes-counting carbohydrates ... Many foods contain carbohydrates (carbs), including: Fruit and fruit juice Cereal, bread, pasta, and rice Milk and milk products, soy milk Beans, legumes, ...
National Oceanic and Atmospheric Administration, Department of Commerce — Database of seal counts from aerial photography. Counts by image, site, species, and date are stored in the database along with information on entanglements and...
Deng, Chenhui; Plan, Elodie L; Karlsson, Mats O
2016-06-01
Parameter variation in pharmacometric analysis studies can be characterized as within subject parameter variability (WSV) in pharmacometric models. WSV has previously been successfully modeled using inter-occasion variability (IOV), but also stochastic differential equations (SDEs). In this study, two approaches, dynamic inter-occasion variability (dIOV) and adapted stochastic differential equations, were proposed to investigate WSV in pharmacometric count data analysis. These approaches were applied to published count models for seizure counts and Likert pain scores. Both approaches improved the model fits significantly. In addition, stochastic simulation and estimation were used to explore further the capability of the two approaches to diagnose and improve models where existing WSV is not recognized. The results of simulations confirmed the gain in introducing WSV as dIOV and SDEs when parameters vary randomly over time. Further, the approaches were also informative as diagnostics of model misspecification, when parameters changed systematically over time but this was not recognized in the structural model. The proposed approaches in this study offer strategies to characterize WSV and are not restricted to count data.
Heiss, Jonathan A; Breitling, Lutz P; Lehne, Benjamin; Kooner, Jaspal S; Chambers, John C; Brenner, Hermann
2017-01-01
Whole-blood DNA methylation depends on the underlying leukocyte composition and confounding hereby is a major concern in epigenome-wide association studies. Cell counts are often missing or may not be feasible. Computational approaches estimate leukocyte composition from DNA methylation based on reference datasets of purified leukocytes. We explored the possibility to train such a model on whole-blood DNA methylation and cell counts without the need for purification. Using whole-blood DNA methylation and corresponding five-part cell counts from 2445 participants from the London Life Sciences Prospective Population Study, a model was trained on a subset of 175 subjects and evaluated on the remaining. Correlations between cell counts and estimated cell proportions were high (neutrophils 0.85, eosinophils 0.88, basophils 0.02, lymphocytes 0.84, monocytes 0.55) and estimated proportions explained more variance in whole-blood DNA methylation levels than counts. Our model provided precise estimates for the common cell types.
Dynamic statistical models of biological cognition: insights from communications theory
Wallace, Rodrick
2014-10-01
Maturana's cognitive perspective on the living state, Dretske's insight on how information theory constrains cognition, the Atlan/Cohen cognitive paradigm, and models of intelligence without representation, permit construction of a spectrum of dynamic necessary conditions statistical models of signal transduction, regulation, and metabolism at and across the many scales and levels of organisation of an organism and its context. Nonequilibrium critical phenomena analogous to physical phase transitions, driven by crosstalk, will be ubiquitous, representing not only signal switching, but the recruitment of underlying cognitive modules into tunable dynamic coalitions that address changing patterns of need and opportunity at all scales and levels of organisation. The models proposed here, while certainly providing much conceptual insight, should be most useful in the analysis of empirical data, much as are fitted regression equations.
Ziarkash, Abdul Waris; Joshi, Siddarth Koduru; Stipčević, Mario; Ursin, Rupert
2018-03-22
Single-photon avalanche diode (SPAD) detectors, have a great importance in fields like quantum key distribution, laser ranging, florescence microscopy, etc. Afterpulsing is a non-ideal behavior of SPADs that adversely affects any application that measures the number or timing of detection events. Several studies based on a few individual detectors, derived distinct mathematical models from semiconductor physics perspectives. With a consistent testing procedure and statistically large data sets, we show that different individual detectors - even if identical in type, make, brand, etc. - behave according to fundamentally different mathematical models. Thus, every detector must be characterized individually and it is wrong to draw universal conclusions about the physical meaning behind these models. We also report the presence of high-order afterpulses that are not accounted for in any of the standard models.
BOX-COX transformation and random regression models for fecal egg count data
Directory of Open Access Journals (Sweden)
Marcos Vinicius Silva
2012-01-01
Full Text Available Accurate genetic evaluation of livestock is based on appropriate modeling of phenotypic measurements. In ruminants fecal egg count (FEC is commonly used to measure resistance to nematodes. FEC values are not normally distributed and logarithmic transformations have been used to achieve normality before analysis. However, the transformed data are often not normally distributed, especially when data are extremely skewed. A series of repeated FEC measurements may provide information about the population dynamics of a group or individual. A total of 6,375 FEC measures were obtained for 410 animals between 1992 and 2003 from the Beltsville Agricultural Research Center Angus herd. Original data were transformed using an extension of the Box-Cox transformation to approach normality and to estimate (covariance components. We also proposed using random regression models (RRM for genetic and non-genetic studies of FEC. Phenotypes were analyzed using RRM and restricted maximum likelihood. Within the different orders of Legendre polynomials used, those with more parameters (order 4 adjusted FEC data best. Results indicated that the transformation of FEC data utilizing the Box-Cox transformation family was effective in reducing the skewness and kurtosis, and dramatically increased estimates of heritability, and measurements of FEC obtained in the period between 12 and 26 weeks in a 26-week experimental challenge period are genetically correlated.
A Statistical Graphical Model of the California Reservoir System
Taeb, A.; Reager, J. T.; Turmon, M.; Chandrasekaran, V.
2017-11-01
The recent California drought has highlighted the potential vulnerability of the state's water management infrastructure to multiyear dry intervals. Due to the high complexity of the network, dynamic storage changes in California reservoirs on a state-wide scale have previously been difficult to model using either traditional statistical or physical approaches. Indeed, although there is a significant line of research on exploring models for single (or a small number of) reservoirs, these approaches are not amenable to a system-wide modeling of the California reservoir network due to the spatial and hydrological heterogeneities of the system. In this work, we develop a state-wide statistical graphical model to characterize the dependencies among a collection of 55 major California reservoirs across the state; this model is defined with respect to a graph in which the nodes index reservoirs and the edges specify the relationships or dependencies between reservoirs. We obtain and validate this model in a data-driven manner based on reservoir volumes over the period 2003-2016. A key feature of our framework is a quantification of the effects of external phenomena that influence the entire reservoir network. We further characterize the degree to which physical factors (e.g., state-wide Palmer Drought Severity Index (PDSI), average temperature, snow pack) and economic factors (e.g., consumer price index, number of agricultural workers) explain these external influences. As a consequence of this analysis, we obtain a system-wide health diagnosis of the reservoir network as a function of PDSI.
MASKED AREAS IN SHEAR PEAK STATISTICS: A FORWARD MODELING APPROACH
Energy Technology Data Exchange (ETDEWEB)
Bard, D. [KIPAC, SLAC National Accelerator Laboratory, 2575 Sand Hill Rd, Menlo Park, CA 94025 (United States); Kratochvil, J. M. [Astrophysics and Cosmology Research Unit, University of KwaZulu-Natal, Westville, Durban 4000 (South Africa); Dawson, W., E-mail: djbard@slac.stanford.edu [Lawrence Livermore National Laboratory, 7000 East Ave, Livermore, CA 94550 (United States)
2016-03-10
The statistics of shear peaks have been shown to provide valuable cosmological information beyond the power spectrum, and will be an important constraint of models of cosmology in forthcoming astronomical surveys. Surveys include masked areas due to bright stars, bad pixels etc., which must be accounted for in producing constraints on cosmology from shear maps. We advocate a forward-modeling approach, where the impacts of masking and other survey artifacts are accounted for in the theoretical prediction of cosmological parameters, rather than correcting survey data to remove them. We use masks based on the Deep Lens Survey, and explore the impact of up to 37% of the survey area being masked on LSST and DES-scale surveys. By reconstructing maps of aperture mass the masking effect is smoothed out, resulting in up to 14% smaller statistical uncertainties compared to simply reducing the survey area by the masked area. We show that, even in the presence of large survey masks, the bias in cosmological parameter estimation produced in the forward-modeling process is ≈1%, dominated by bias caused by limited simulation volume. We also explore how this potential bias scales with survey area and evaluate how much small survey areas are impacted by the differences in cosmological structure in the data and simulated volumes, due to cosmic variance.
Development of modelling algorithm of technological systems by statistical tests
Shemshura, E. A.; Otrokov, A. V.; Chernyh, V. G.
2018-03-01
The paper tackles the problem of economic assessment of design efficiency regarding various technological systems at the stage of their operation. The modelling algorithm of a technological system was performed using statistical tests and with account of the reliability index allows estimating the level of machinery technical excellence and defining the efficiency of design reliability against its performance. Economic feasibility of its application shall be determined on the basis of service quality of a technological system with further forecasting of volumes and the range of spare parts supply.
A Tensor Statistical Model for Quantifying Dynamic Functional Connectivity.
Zhu, Yingying; Zhu, Xiaofeng; Kim, Minjeong; Yan, Jin; Wu, Guorong
2017-06-01
Functional connectivity (FC) has been widely investigated in many imaging-based neuroscience and clinical studies. Since functional Magnetic Resonance Image (MRI) signal is just an indirect reflection of brain activity, it is difficult to accurately quantify the FC strength only based on signal correlation. To address this limitation, we propose a learning-based tensor model to derive high sensitivity and specificity connectome biomarkers at the individual level from resting-state fMRI images. First, we propose a learning-based approach to estimate the intrinsic functional connectivity. In addition to the low level region-to-region signal correlation, latent module-to-module connection is also estimated and used to provide high level heuristics for measuring connectivity strength. Furthermore, sparsity constraint is employed to automatically remove the spurious connections, thus alleviating the issue of searching for optimal threshold. Second, we integrate our learning-based approach with the sliding-window technique to further reveal the dynamics of functional connectivity. Specifically, we stack the functional connectivity matrix within each sliding window and form a 3D tensor where the third dimension denotes for time. Then we obtain dynamic functional connectivity (dFC) for each individual subject by simultaneously estimating the within-sliding-window functional connectivity and characterizing the across-sliding-window temporal dynamics. Third, in order to enhance the robustness of the connectome patterns extracted from dFC, we extend the individual-based 3D tensors to a population-based 4D tensor (with the fourth dimension stands for the training subjects) and learn the statistics of connectome patterns via 4D tensor analysis. Since our 4D tensor model jointly (1) optimizes dFC for each training subject and (2) captures the principle connectome patterns, our statistical model gains more statistical power of representing new subject than current state
Statistical Models for Inferring Vegetation Composition from Fossil Pollen
Paciorek, C.; McLachlan, J. S.; Shang, Z.
2011-12-01
Fossil pollen provide information about vegetation composition that can be used to help understand how vegetation has changed over the past. However, these data have not traditionally been analyzed in a way that allows for statistical inference about spatio-temporal patterns and trends. We build a Bayesian hierarchical model called STEPPS (Spatio-Temporal Empirical Prediction from Pollen in Sediments) that predicts forest composition in southern New England, USA, over the last two millenia based on fossil pollen. The critical relationships between abundances of tree taxa in the pollen record and abundances in actual vegetation are estimated using modern (Forest Inventory Analysis) data and (witness tree) data from colonial records. This gives us two time points at which both pollen and direct vegetation data are available. Based on these relationships, and incorporating our uncertainty about them, we predict forest composition using fossil pollen. We estimate the spatial distribution and relative abundances of tree species and draw inference about how these patterns have changed over time. Finally, we describe ongoing work to extend the modeling to the upper Midwest of the U.S., including an approach to infer tree density and thereby estimate the prairie-forest boundary in Minnesota and Wisconsin. This work is part of the PalEON project, which brings together a team of ecosystem modelers, paleoecologists, and statisticians with the goal of reconstructing vegetation responses to climate during the last two millenia in the northeastern and midwestern United States. The estimates from the statistical modeling will be used to assess and calibrate ecosystem models that are used to project ecological changes in response to global change.
Steinberg, P. D.; Brener, G.; Duffy, D.; Nearing, G. S.; Pelissier, C.
2017-12-01
Hyperparameterization, of statistical models, i.e. automated model scoring and selection, such as evolutionary algorithms, grid searches, and randomized searches, can improve forecast model skill by reducing errors associated with model parameterization, model structure, and statistical properties of training data. Ensemble Learning Models (Elm), and the related Earthio package, provide a flexible interface for automating the selection of parameters and model structure for machine learning models common in climate science and land cover classification, offering convenient tools for loading NetCDF, HDF, Grib, or GeoTiff files, decomposition methods like PCA and manifold learning, and parallel training and prediction with unsupervised and supervised classification, clustering, and regression estimators. Continuum Analytics is using Elm to experiment with statistical soil moisture forecasting based on meteorological forcing data from NASA's North American Land Data Assimilation System (NLDAS). There Elm is using the NSGA-2 multiobjective optimization algorithm for optimizing statistical preprocessing of forcing data to improve goodness-of-fit for statistical models (i.e. feature engineering). This presentation will discuss Elm and its components, including dask (distributed task scheduling), xarray (data structures for n-dimensional arrays), and scikit-learn (statistical preprocessing, clustering, classification, regression), and it will show how NSGA-2 is being used for automate selection of soil moisture forecast statistical models for North America.
Duarte, Adam; Adams, Michael J.; Peterson, James T.
2018-01-01
Monitoring animal populations is central to wildlife and fisheries management, and the use of N-mixture models toward these efforts has markedly increased in recent years. Nevertheless, relatively little work has evaluated estimator performance when basic assumptions are violated. Moreover, diagnostics to identify when bias in parameter estimates from N-mixture models is likely is largely unexplored. We simulated count data sets using 837 combinations of detection probability, number of sample units, number of survey occasions, and type and extent of heterogeneity in abundance or detectability. We fit Poisson N-mixture models to these data, quantified the bias associated with each combination, and evaluated if the parametric bootstrap goodness-of-fit (GOF) test can be used to indicate bias in parameter estimates. We also explored if assumption violations can be diagnosed prior to fitting N-mixture models. In doing so, we propose a new model diagnostic, which we term the quasi-coefficient of variation (QCV). N-mixture models performed well when assumptions were met and detection probabilities were moderate (i.e., ≥0.3), and the performance of the estimator improved with increasing survey occasions and sample units. However, the magnitude of bias in estimated mean abundance with even slight amounts of unmodeled heterogeneity was substantial. The parametric bootstrap GOF test did not perform well as a diagnostic for bias in parameter estimates when detectability and sample sizes were low. The results indicate the QCV is useful to diagnose potential bias and that potential bias associated with unidirectional trends in abundance or detectability can be diagnosed using Poisson regression. This study represents the most thorough assessment to date of assumption violations and diagnostics when fitting N-mixture models using the most commonly implemented error distribution. Unbiased estimates of population state variables are needed to properly inform management decision
Statistical Agent Based Modelization of the Phenomenon of Drug Abuse
di Clemente, Riccardo; Pietronero, Luciano
2012-07-01
We introduce a statistical agent based model to describe the phenomenon of drug abuse and its dynamical evolution at the individual and global level. The agents are heterogeneous with respect to their intrinsic inclination to drugs, to their budget attitude and social environment. The various levels of drug use were inspired by the professional description of the phenomenon and this permits a direct comparison with all available data. We show that certain elements have a great importance to start the use of drugs, for example the rare events in the personal experiences which permit to overcame the barrier of drug use occasionally. The analysis of how the system reacts to perturbations is very important to understand its key elements and it provides strategies for effective policy making. The present model represents the first step of a realistic description of this phenomenon and can be easily generalized in various directions.
A statistical model of Rift Valley fever activity in Egypt.
Drake, John M; Hassan, Ali N; Beier, John C
2013-12-01
Rift Valley fever (RVF) is a viral disease of animals and humans and a global public health concern due to its ecological plasticity, adaptivity, and potential for spread to countries with a temperate climate. In many places, outbreaks are episodic and linked to climatic, hydrologic, and socioeconomic factors. Although outbreaks of RVF have occurred in Egypt since 1977, attempts to identify risk factors have been limited. Using a statistical learning approach (lasso-regularized generalized linear model), we tested the hypotheses that outbreaks in Egypt are linked to (1) River Nile conditions that create a mosquito vector habitat, (2) entomologic conditions favorable to transmission, (3) socio-economic factors (Islamic festival of Greater Bairam), and (4) recent history of transmission activity. Evidence was found for effects of rainfall and river discharge and recent history of transmission activity. There was no evidence for an effect of Greater Bairam. The model predicted RVF activity correctly in 351 of 358 months (98.0%). This is the first study to statistically identify risk factors for RVF outbreaks in a region of unstable transmission. © 2013 The Society for Vector Ecology.
Statistical modeling of global geogenic arsenic contamination in groundwater.
Amini, Manouchehr; Abbaspour, Karim C; Berg, Michael; Winkel, Lenny; Hug, Stephan J; Hoehn, Eduard; Yang, Hong; Johnson, C Annette
2008-05-15
Contamination of groundwaters with geogenic arsenic poses a major health risk to millions of people. Although the main geochemical mechanisms of arsenic mobilization are well understood, the worldwide scale of affected regions is still unknown. In this study we used a large database of measured arsenic concentration in groundwaters (around 20,000 data points) from around the world as well as digital maps of physical characteristics such as soil, geology, climate, and elevation to model probability maps of global arsenic contamination. A novel rule-based statistical procedure was used to combine the physical data and expert knowledge to delineate two process regions for arsenic mobilization: "reducing" and "high-pH/ oxidizing". Arsenic concentrations were modeled in each region using regression analysis and adaptive neuro-fuzzy inferencing followed by Latin hypercube sampling for uncertainty propagation to produce probability maps. The derived global arsenic models could benefit from more accurate geologic information and aquifer chemical/physical information. Using some proxy surface information, however, the models explained 77% of arsenic variation in reducing regions and 68% of arsenic variation in high-pH/oxidizing regions. The probability maps based on the above models correspond well with the known contaminated regions around the world and delineate new untested areas that have a high probability of arsenic contamination. Notable among these regions are South East and North West of China in Asia, Central Australia, New Zealand, Northern Afghanistan, and Northern Mali and Zambia in Africa.
Joint modeling of longitudinal zero-inflated count and time-to-event data: A Bayesian perspective.
Zhu, Huirong; DeSantis, Stacia M; Luo, Sheng
2018-04-01
Longitudinal zero-inflated count data are encountered frequently in substance-use research when assessing the effects of covariates and risk factors on outcomes. Often, both the time to a terminal event such as death or dropout and repeated measure count responses are collected for each subject. In this setting, the longitudinal counts are censored by the terminal event, and the time to the terminal event may depend on the longitudinal outcomes. In the study described herein, we expand the class of joint models for longitudinal and survival data to accommodate zero-inflated counts and time-to-event data by using a Cox proportional hazards model with piecewise constant baseline hazard. We use a Bayesian framework via Markov chain Monte Carlo simulations implemented in the BUGS programming language. Via an extensive simulation study, we apply the joint model and obtain estimates that are more accurate than those of the corresponding independence model. We apply the proposed method to an alpha-tocopherol, beta-carotene lung cancer prevention study.
Flashover of a vacuum-insulator interface: A statistical model
Directory of Open Access Journals (Sweden)
W. A. Stygar
2004-07-01
Full Text Available We have developed a statistical model for the flashover of a 45° vacuum-insulator interface (such as would be found in an accelerator subject to a pulsed electric field. The model assumes that the initiation of a flashover plasma is a stochastic process, that the characteristic statistical component of the flashover delay time is much greater than the plasma formative time, and that the average rate at which flashovers occur is a power-law function of the instantaneous value of the electric field. Under these conditions, we find that the flashover probability is given by 1-exp(-E_{p}^{β}t_{eff}C/k^{β}, where E_{p} is the peak value in time of the spatially averaged electric field E(t, t_{eff}≡∫[E(t/E_{p}]^{β}dt is the effective pulse width, C is the insulator circumference, k∝exp(λ/d, and β and λ are constants. We define E(t as V(t/d, where V(t is the voltage across the insulator and d is the insulator thickness. Since the model assumes that flashovers occur at random azimuthal locations along the insulator, it does not apply to systems that have a significant defect, i.e., a location contaminated with debris or compromised by an imperfection at which flashovers repeatedly take place, and which prevents a random spatial distribution. The model is consistent with flashover measurements to within 7% for pulse widths between 0.5 ns and 10 μs, and to within a factor of 2 between 0.5 ns and 90 s (a span of over 11 orders of magnitude. For these measurements, E_{p} ranges from 64 to 651 kV/cm, d from 0.50 to 4.32 cm, and C from 4.96 to 95.74 cm. The model is significantly more accurate, and is valid over a wider range of parameters, than the J. C. Martin flashover relation that has been in use since 1971 [J. C. Martin on Pulsed Power, edited by T. H. Martin, A. H. Guenther, and M. Kristiansen (Plenum, New York, 1996]. We have generalized the statistical model to estimate the total-flashover probability of an
Statistics Based Models for the Dynamics of Chernivtsi Children Disease
Directory of Open Access Journals (Sweden)
Igor G. Nesteruk
2017-10-01
Full Text Available Background. Simple mathematical models of contamination and SIR-model of spreading an infection were used to simulate the time dynamics of the unknown before children disease, which occurred in Chernivtsi (Ukraine. The cause of many cases of alopecia, which began in this city in August 1988 is still not fully clarified. According to the official report of the governmental commission, the last new cases occurred in the middle of November 1988, and the reason of the illness was reported as chemical exogenous intoxication. Later this illness became the name “Chernivtsi chemical disease”. Nevertheless, the significantly increased number of new cases of the local alopecia was registered almost three years and is still not clarified. Objective. The comparison of two different versions of the disease: chemical exogenous intoxication and infection. Identification of the parameters of mathematical models and prediction of the disease development. Methods. Analytical solutions of the contamination models and SIR-model for an epidemic are obtained. The optimal values of parameters with the use of linear regression were found. Results. The optimal values of the models parameters with the use of statistical approach were identified. The calculations showed that the infectious version of the disease is more reliable in comparison with the popular contamination one. The possible date of the epidemic beginning was estimated. Conclusions. The optimal parameters of SIR-model allow calculating the realistic number of victims and other characteristics of possible epidemic. They also show that increased number of cases of local alopecia could be a part of the same epidemic as “Chernivtsi chemical disease”.
Linear mixed models a practical guide using statistical software
West, Brady T; Galecki, Andrzej T
2014-01-01
Highly recommended by JASA, Technometrics, and other journals, the first edition of this bestseller showed how to easily perform complex linear mixed model (LMM) analyses via a variety of software programs. Linear Mixed Models: A Practical Guide Using Statistical Software, Second Edition continues to lead readers step by step through the process of fitting LMMs. This second edition covers additional topics on the application of LMMs that are valuable for data analysts in all fields. It also updates the case studies using the latest versions of the software procedures and provides up-to-date information on the options and features of the software procedures available for fitting LMMs in SAS, SPSS, Stata, R/S-plus, and HLM.New to the Second Edition A new chapter on models with crossed random effects that uses a case study to illustrate software procedures capable of fitting these models Power analysis methods for longitudinal and clustered study designs, including software options for power analyses and suggest...
Stochastic Spatial Models in Ecology: A Statistical Physics Approach
Pigolotti, Simone; Cencini, Massimo; Molina, Daniel; Muñoz, Miguel A.
2017-11-01
Ecosystems display a complex spatial organization. Ecologists have long tried to characterize them by looking at how different measures of biodiversity change across spatial scales. Ecological neutral theory has provided simple predictions accounting for general empirical patterns in communities of competing species. However, while neutral theory in well-mixed ecosystems is mathematically well understood, spatial models still present several open problems, limiting the quantitative understanding of spatial biodiversity. In this review, we discuss the state of the art in spatial neutral theory. We emphasize the connection between spatial ecological models and the physics of non-equilibrium phase transitions and how concepts developed in statistical physics translate in population dynamics, and vice versa. We focus on non-trivial scaling laws arising at the critical dimension D = 2 of spatial neutral models, and their relevance for biological populations inhabiting two-dimensional environments. We conclude by discussing models incorporating non-neutral effects in the form of spatial and temporal disorder, and analyze how their predictions deviate from those of purely neutral theories.
A Statistical Toolbox For Mining And Modeling Spatial Data
Directory of Open Access Journals (Sweden)
D’Aubigny Gérard
2016-12-01
Full Text Available Most data mining projects in spatial economics start with an evaluation of a set of attribute variables on a sample of spatial entities, looking for the existence and strength of spatial autocorrelation, based on the Moran’s and the Geary’s coefficients, the adequacy of which is rarely challenged, despite the fact that when reporting on their properties, many users seem likely to make mistakes and to foster confusion. My paper begins by a critical appraisal of the classical definition and rational of these indices. I argue that while intuitively founded, they are plagued by an inconsistency in their conception. Then, I propose a principled small change leading to corrected spatial autocorrelation coefficients, which strongly simplifies their relationship, and opens the way to an augmented toolbox of statistical methods of dimension reduction and data visualization, also useful for modeling purposes. A second section presents a formal framework, adapted from recent work in statistical learning, which gives theoretical support to our definition of corrected spatial autocorrelation coefficients. More specifically, the multivariate data mining methods presented here, are easily implementable on the existing (free software, yield methods useful to exploit the proposed corrections in spatial data analysis practice, and, from a mathematical point of view, whose asymptotic behavior, already studied in a series of papers by Belkin & Niyogi, suggests that they own qualities of robustness and a limited sensitivity to the Modifiable Areal Unit Problem (MAUP, valuable in exploratory spatial data analysis.
Statistical Models and Methods for Network Meta-Analysis.
Madden, L V; Piepho, H-P; Paul, P A
2016-08-01
Meta-analysis, the methodology for analyzing the results from multiple independent studies, has grown tremendously in popularity over the last four decades. Although most meta-analyses involve a single effect size (summary result, such as a treatment difference) from each study, there are often multiple treatments of interest across the network of studies in the analysis. Multi-treatment (or network) meta-analysis can be used for simultaneously analyzing the results from all the treatments. However, the methodology is considerably more complicated than for the analysis of a single effect size, and there have not been adequate explanations of the approach for agricultural investigations. We review the methods and models for conducting a network meta-analysis based on frequentist statistical principles, and demonstrate the procedures using a published multi-treatment plant pathology data set. A major advantage of network meta-analysis is that correlations of estimated treatment effects are automatically taken into account when an appropriate model is used. Moreover, treatment comparisons may be possible in a network meta-analysis that are not possible in a single study because all treatments of interest may not be included in any given study. We review several models that consider the study effect as either fixed or random, and show how to interpret model-fitting output. We further show how to model the effect of moderator variables (study-level characteristics) on treatment effects, and present one approach to test for the consistency of treatment effects across the network. Online supplemental files give explanations on fitting the network meta-analytical models using SAS.
A statistical analysis based recommender model for heart disease patients.
Mustaqeem, Anam; Anwar, Syed Muhammad; Khan, Abdul Rashid; Majid, Muhammad
2017-12-01
An intelligent information technology based system could have a positive impact on the life-style of patients suffering from chronic diseases by providing useful health recommendations. In this paper, we have proposed a hybrid model that provides disease prediction and medical recommendations to cardiac patients. The first part aims at implementing a prediction model, that can identify the disease of a patient and classify it into one of the four output classes i.e., non-cardiac chest pain, silent ischemia, angina, and myocardial infarction. Following the disease prediction, the second part of the model provides general medical recommendations to patients. The recommendations are generated by assessing the severity of clinical features of patients, estimating the risk associated with clinical features and disease, and calculating the probability of occurrence of disease. The purpose of this model is to build an intelligent and adaptive recommender system for heart disease patients. The experiments for the proposed recommender system are conducted on a clinical data set collected and labelled in consultation with medical experts from a known hospital. The performance of the proposed prediction model is evaluated using accuracy and kappa statistics as evaluation measures. The medical recommendations are generated based on information collected from a knowledge base created with the help of physicians. The results of the recommendation model are evaluated using confusion matrix and gives an accuracy of 97.8%. The proposed system exhibits good prediction and recommendation accuracies and promises to be a useful contribution in the field of e-health and medical informatics. Copyright © 2017 Elsevier B.V. All rights reserved.
A statistical downscaling model for summer rainfall over Pakistan
Kazmi, Dildar Hussain; Li, Jianping; Ruan, Chengqing; Zhao, Sen; Li, Yanjie
2016-10-01
A statistical approach is utilized to construct an interannual model for summer (July-August) rainfall over the western parts of South Asian Monsoon. Observed monthly rainfall data for selected stations of Pakistan for the last 55 years (1960-2014) is taken as predictand. Recommended climate indices along with the oceanic and atmospheric data on global scales, for the period April-June are employed as predictors. First 40 years data has been taken as training period and the rest as validation period. Cross-validation stepwise regression approach adopted to select the robust predictors. Upper tropospheric zonal wind at 200 hPa over the northeastern Atlantic is finally selected as the best predictor for interannual model. Besides, the next possible candidate `geopotential height at upper troposphere' is taken as the indirect predictor for being a source of energy transportation from core region (northeast Atlantic/western Europe) to the study area. The model performed well for both the training as well as validation period with correlation coefficient of 0.71 and tolerable root mean square errors. Cross-validation of the model has been processed by incorporating JRA-55 data for potential predictors in addition to NCEP and fragmentation of study period to five non-overlapping test samples. Subsequently, to verify the outcome of the model on physical grounds, observational analyses as well as the model simulations are incorporated. It is revealed that originating from the jet exit region through large vorticity gradients, zonally dominating waves may transport energy and momentum to the downstream areas of west-central Asia, that ultimately affect interannual variability of the specific rainfall. It has been detected that both the circumglobal teleconnection and Rossby wave propagation play vital roles in modulating the proposed mechanism.
STATISTICAL MECHANICS MODELING OF MESOSCALE DEFORMATION IN METALS
Energy Technology Data Exchange (ETDEWEB)
Anter El-Azab
2013-04-08
The research under this project focused on a theoretical and computational modeling of dislocation dynamics of mesoscale deformation of metal single crystals. Specifically, the work aimed to implement a continuum statistical theory of dislocations to understand strain hardening and cell structure formation under monotonic loading. These aspects of crystal deformation are manifestations of the evolution of the underlying dislocation system under mechanical loading. The project had three research tasks: 1) Investigating the statistical characteristics of dislocation systems in deformed crystals. 2) Formulating kinetic equations of dislocations and coupling these kinetics equations and crystal mechanics. 3) Computational solution of coupled crystal mechanics and dislocation kinetics. Comparison of dislocation dynamics predictions with experimental results in the area of statistical properties of dislocations and their field was also a part of the proposed effort. In the first research task, the dislocation dynamics simulation method was used to investigate the spatial, orientation, velocity, and temporal statistics of dynamical dislocation systems, and on the use of the results from this investigation to complete the kinetic description of dislocations. The second task focused on completing the formulation of a kinetic theory of dislocations that respects the discrete nature of crystallographic slip and the physics of dislocation motion and dislocation interaction in the crystal. Part of this effort also targeted the theoretical basis for establishing the connection between discrete and continuum representation of dislocations and the analysis of discrete dislocation simulation results within the continuum framework. This part of the research enables the enrichment of the kinetic description with information representing the discrete dislocation systems behavior. The third task focused on the development of physics-inspired numerical methods of solution of the coupled
Smooth extrapolation of unknown anatomy via statistical shape models
Grupp, R. B.; Chiang, H.; Otake, Y.; Murphy, R. J.; Gordon, C. R.; Armand, M.; Taylor, R. H.
2015-03-01
Several methods to perform extrapolation of unknown anatomy were evaluated. The primary application is to enhance surgical procedures that may use partial medical images or medical images of incomplete anatomy. Le Fort-based, face-jaw-teeth transplant is one such procedure. From CT data of 36 skulls and 21 mandibles separate Statistical Shape Models of the anatomical surfaces were created. Using the Statistical Shape Models, incomplete surfaces were projected to obtain complete surface estimates. The surface estimates exhibit non-zero error in regions where the true surface is known; it is desirable to keep the true surface and seamlessly merge the estimated unknown surface. Existing extrapolation techniques produce non-smooth transitions from the true surface to the estimated surface, resulting in additional error and a less aesthetically pleasing result. The three extrapolation techniques evaluated were: copying and pasting of the surface estimate (non-smooth baseline), a feathering between the patient surface and surface estimate, and an estimate generated via a Thin Plate Spline trained from displacements between the surface estimate and corresponding vertices of the known patient surface. Feathering and Thin Plate Spline approaches both yielded smooth transitions. However, feathering corrupted known vertex values. Leave-one-out analyses were conducted, with 5% to 50% of known anatomy removed from the left-out patient and estimated via the proposed approaches. The Thin Plate Spline approach yielded smaller errors than the other two approaches, with an average vertex error improvement of 1.46 mm and 1.38 mm for the skull and mandible respectively, over the baseline approach.
An Improved Statistical Point-source Foreground Model for the Epoch of Reionization
Energy Technology Data Exchange (ETDEWEB)
Murray, S. G.; Trott, C. M.; Jordan, C. H. [ARC Centre of Excellence for All-sky Astrophysics (CAASTRO) (Australia)
2017-08-10
We present a sophisticated statistical point-source foreground model for low-frequency radio Epoch of Reionization (EoR) experiments using the 21 cm neutral hydrogen emission line. Motivated by our understanding of the low-frequency radio sky, we enhance the realism of two model components compared with existing models: the source count distributions as a function of flux density and spatial position (source clustering), extending current formalisms for the foreground covariance of 2D power-spectral modes in 21 cm EoR experiments. The former we generalize to an arbitrarily broken power law, and the latter to an arbitrary isotropically correlated field. This paper presents expressions for the modified covariance under these extensions, and shows that for a more realistic source spatial distribution, extra covariance arises in the EoR window that was previously unaccounted for. Failure to include this contribution can yield bias in the final power-spectrum and under-estimate uncertainties, potentially leading to a false detection of signal. The extent of this effect is uncertain, owing to ignorance of physical model parameters, but we show that it is dependent on the relative abundance of faint sources, to the effect that our extension will become more important for future deep surveys. Finally, we show that under some parameter choices, ignoring source clustering can lead to false detections on large scales, due to both the induced bias and an artificial reduction in the estimated measurement uncertainty.
Optimizing DNA assembly based on statistical language modelling.
Fang, Gang; Zhang, Shemin; Dong, Yafei
2017-12-15
By successively assembling genetic parts such as BioBrick according to grammatical models, complex genetic constructs composed of dozens of functional blocks can be built. However, usually every category of genetic parts includes a few or many parts. With increasing quantity of genetic parts, the process of assembling more than a few sets of these parts can be expensive, time consuming and error prone. At the last step of assembling it is somewhat difficult to decide which part should be selected. Based on statistical language model, which is a probability distribution P(s) over strings S that attempts to reflect how frequently a string S occurs as a sentence, the most commonly used parts will be selected. Then, a dynamic programming algorithm was designed to figure out the solution of maximum probability. The algorithm optimizes the results of a genetic design based on a grammatical model and finds an optimal solution. In this way, redundant operations can be reduced and the time and cost required for conducting biological experiments can be minimized. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Critical, statistical, and thermodynamical properties of lattice models
Energy Technology Data Exchange (ETDEWEB)
Varma, Vipin Kerala
2013-10-15
In this thesis we investigate zero temperature and low temperature properties - critical, statistical and thermodynamical - of lattice models in the contexts of bosonic cold atom systems, magnetic materials, and non-interacting particles on various lattice geometries. We study quantum phase transitions in the Bose-Hubbard model with higher body interactions, as relevant for optical lattice experiments of strongly interacting bosons, in one and two dimensions; the universality of the Mott insulator to superfluid transition is found to remain unchanged for even large three body interaction strengths. A systematic renormalization procedure is formulated to fully re-sum these higher (three and four) body interactions into the two body terms. In the strongly repulsive limit, we analyse the zero and low temperature physics of interacting hard-core bosons on the kagome lattice at various fillings. Evidence for a disordered phase in the Ising limit of the model is presented; in the strong coupling limit, the transition between the valence bond solid and the superfluid is argued to be first order at the tip of the solid lobe.
Critical, statistical, and thermodynamical properties of lattice models
International Nuclear Information System (INIS)
Varma, Vipin Kerala
2013-10-01
In this thesis we investigate zero temperature and low temperature properties - critical, statistical and thermodynamical - of lattice models in the contexts of bosonic cold atom systems, magnetic materials, and non-interacting particles on various lattice geometries. We study quantum phase transitions in the Bose-Hubbard model with higher body interactions, as relevant for optical lattice experiments of strongly interacting bosons, in one and two dimensions; the universality of the Mott insulator to superfluid transition is found to remain unchanged for even large three body interaction strengths. A systematic renormalization procedure is formulated to fully re-sum these higher (three and four) body interactions into the two body terms. In the strongly repulsive limit, we analyse the zero and low temperature physics of interacting hard-core bosons on the kagome lattice at various fillings. Evidence for a disordered phase in the Ising limit of the model is presented; in the strong coupling limit, the transition between the valence bond solid and the superfluid is argued to be first order at the tip of the solid lobe.
Terminal-Dependent Statistical Inference for the FBSDEs Models
Directory of Open Access Journals (Sweden)
Yunquan Song
2014-01-01
Full Text Available The original stochastic differential equations (OSDEs and forward-backward stochastic differential equations (FBSDEs are often used to model complex dynamic process that arise in financial, ecological, and many other areas. The main difference between OSDEs and FBSDEs is that the latter is designed to depend on a terminal condition, which is a key factor in some financial and ecological circumstances. It is interesting but challenging to estimate FBSDE parameters from noisy data and the terminal condition. However, to the best of our knowledge, the terminal-dependent statistical inference for such a model has not been explored in the existing literature. We proposed a nonparametric terminal control variables estimation method to address this problem. The reason why we use the terminal control variables is that the newly proposed inference procedures inherit the terminal-dependent characteristic. Through this new proposed method, the estimators of the functional coefficients of the FBSDEs model are obtained. The asymptotic properties of the estimators are also discussed. Simulation studies show that the proposed method gives satisfying estimates for the FBSDE parameters from noisy data and the terminal condition. A simulation is performed to test the feasibility of our method.
The statistical multifragmentation model: Origins and recent advances
Energy Technology Data Exchange (ETDEWEB)
Donangelo, R., E-mail: donangel@fing.edu.uy [Instituto de Física, Facultad de Ingeniería, Universidad de la República, Julio Herrera y Reissig 565, 11300, Montevideo (Uruguay); Instituto de Física, Universidade Federal do Rio de Janeiro, C.P. 68528, 21941-972 Rio de Janeiro - RJ (Brazil); Souza, S. R., E-mail: srsouza@if.ufrj.br [Instituto de Física, Universidade Federal do Rio de Janeiro, C.P. 68528, 21941-972 Rio de Janeiro - RJ (Brazil); Instituto de Física, Universidade Federal do Rio Grande do Sul, C.P. 15051, 91501-970 Porto Alegre - RS (Brazil)
2016-07-07
We review the Statistical Multifragmentation Model (SMM) which considers a generalization of the liquid-drop model for hot nuclei and allows one to calculate thermodynamic quantities characterizing the nuclear ensemble at the disassembly stage. We show how to determine probabilities of definite partitions of finite nuclei and how to determine, through Monte Carlo calculations, observables such as the caloric curve, multiplicity distributions, heat capacity, among others. Some experimental measurements of the caloric curve confirmed the SMM predictions of over 10 years before, leading to a surge in the interest in the model. However, the experimental determination of the fragmentation temperatures relies on the yields of different isotopic species, which were not correctly calculated in the schematic, liquid-drop picture, employed in the SMM. This led to a series of improvements in the SMM, in particular to the more careful choice of nuclear masses and energy densities, specially for the lighter nuclei. With these improvements the SMM is able to make quantitative determinations of isotope production. We show the application of SMM to the production of exotic nuclei through multifragmentation. These preliminary calculations demonstrate the need for a careful choice of the system size and excitation energy to attain maximum yields.
Multivariate Statistical Modelling of Drought and Heat Wave Events
Manning, Colin; Widmann, Martin; Vrac, Mathieu; Maraun, Douglas; Bevaqua, Emanuele
2016-04-01
Multivariate Statistical Modelling of Drought and Heat Wave Events C. Manning1,2, M. Widmann1, M. Vrac2, D. Maraun3, E. Bevaqua2,3 1. School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, Birmingham, UK 2. Laboratoire des Sciences du Climat et de l'Environnement, (LSCE-IPSL), Centre d'Etudes de Saclay, Gif-sur-Yvette, France 3. Wegener Center for Climate and Global Change, University of Graz, Brandhofgasse 5, 8010 Graz, Austria Compound extreme events are a combination of two or more contributing events which in themselves may not be extreme but through their joint occurrence produce an extreme impact. Compound events are noted in the latest IPCC report as an important type of extreme event that have been given little attention so far. As part of the CE:LLO project (Compound Events: muLtivariate statisticaL mOdelling) we are developing a multivariate statistical model to gain an understanding of the dependence structure of certain compound events. One focus of this project is on the interaction between drought and heat wave events. Soil moisture has both a local and non-local effect on the occurrence of heat waves where it strongly controls the latent heat flux affecting the transfer of sensible heat to the atmosphere. These processes can create a feedback whereby a heat wave maybe amplified or suppressed by the soil moisture preconditioning, and vice versa, the heat wave may in turn have an effect on soil conditions. An aim of this project is to capture this dependence in order to correctly describe the joint probabilities of these conditions and the resulting probability of their compound impact. We will show an application of Pair Copula Constructions (PCCs) to study the aforementioned compound event. PCCs allow in theory for the formulation of multivariate dependence structures in any dimension where the PCC is a decomposition of a multivariate distribution into a product of bivariate components modelled using copulas. A
Towards a Statistical Model of Tropical Cyclone Genesis
Fernandez, A.; Kashinath, K.; McAuliffe, J.; Prabhat, M.; Stark, P. B.; Wehner, M. F.
2017-12-01
Tropical Cyclones (TCs) are important extreme weather phenomena that have a strong impact on humans. TC forecasts are largely based on global numerical models that produce TC-like features. Aspects of Tropical Cyclones such as their formation/genesis, evolution, intensification and dissipation over land are important and challenging problems in climate science. This study investigates the environmental conditions associated with Tropical Cyclone Genesis (TCG) by testing how accurately a statistical model can predict TCG in the CAM5.1 climate model. TCG events are defined using TECA software @inproceedings{Prabhat2015teca, title={TECA: Petascale Pattern Recognition for Climate Science}, author={Prabhat and Byna, Surendra and Vishwanath, Venkatram and Dart, Eli and Wehner, Michael and Collins, William D}, booktitle={Computer Analysis of Images and Patterns}, pages={426-436}, year={2015}, organization={Springer}} to extract TC trajectories from CAM5.1. L1-regularized logistic regression (L1LR) is applied to the CAM5.1 output. The predictions have nearly perfect accuracy for data not associated with TC tracks and high accuracy differentiating between high vorticity and low vorticity systems. The model's active variables largely correspond to current hypotheses about important factors for TCG, such as wind field patterns and local pressure minima, and suggests new routes for investigation. Furthermore, our model's predictions of TC activity are competitive with the output of an instantaneous version of Emanuel and Nolan's Genesis Potential Index (GPI) @inproceedings{eman04, title = "Tropical cyclone activity and the global climate system", author = "Kerry Emanuel and Nolan, {David S.}", year = "2004", pages = "240-241", booktitle = "26th Conference on Hurricanes and Tropical Meteorology"}.
International Nuclear Information System (INIS)
Gamage, Kelum A.A.; Joyce, Malcolm J.; Cave, Frank D.
2013-06-01
Neutron coincidence counting is an established, nondestructive method for the qualitative and quantitative analysis of nuclear materials. Several even-numbered nuclei of the actinide isotopes, and especially even-numbered plutonium isotopes, undergo spontaneous fission, resulting in the emission of neutrons which are correlated in time. The characteristics of this i.e. the multiplicity can be used to identify each isotope in question. Similarly, the corresponding characteristics of isotopes that are susceptible to stimulated fission are somewhat isotope-related, and also dependent on the energy of the incident neutron that stimulates the fission event, and this can hence be used to identify and quantify isotopes also. Most of the neutron coincidence counters currently used are based on 3 He gas tubes. In the 3 He-filled gas proportional-counter, the (n, p) reaction is largely responsible for the detection of slow neutrons and hence neutrons have to be slowed down to thermal energies. As a result, moderator and shielding materials are essential components of many systems designed to assess quantities of fissile materials. The use of a moderator, however, extends the die-away time of the detector necessitating a larger coincidence window and, further, 3 He is now in short supply and expensive. In this paper, a simulation based on the Monte Carlo method is described which has been performed using MCNPX 2.6.0, to model the geometry of a sector-shaped liquid scintillation detector in response to coincident neutron events. The detection of neutrons from a mixed-oxide (MOX) fuel pellet using an organic liquid scintillator has been simulated for different thicknesses of scintillators. In this new neutron detector, a layer of lead has been used to reduce the gamma-ray fluence reaching the scintillator. The effect of lead for neutron detection has also been estimated by considering different thicknesses of lead layers. (authors)
Feature and Statistical Model Development in Structural Health Monitoring
Kim, Inho
All structures suffer wear and tear because of impact, excessive load, fatigue, corrosion, etc. in addition to inherent defects during their manufacturing processes and their exposure to various environmental effects. These structural degradations are often imperceptible, but they can severely affect the structural performance of a component, thereby severely decreasing its service life. Although previous studies of Structural Health Monitoring (SHM) have revealed extensive prior knowledge on the parts of SHM processes, such as the operational evaluation, data processing, and feature extraction, few studies have been conducted from a systematical perspective, the statistical model development. The first part of this dissertation, the characteristics of inverse scattering problems, such as ill-posedness and nonlinearity, reviews ultrasonic guided wave-based structural health monitoring problems. The distinctive features and the selection of the domain analysis are investigated by analytically searching the conditions of the uniqueness solutions for ill-posedness and are validated experimentally. Based on the distinctive features, a novel wave packet tracing (WPT) method for damage localization and size quantification is presented. This method involves creating time-space representations of the guided Lamb waves (GLWs), collected at a series of locations, with a spatially dense distribution along paths at pre-selected angles with respect to the direction, normal to the direction of wave propagation. The fringe patterns due to wave dispersion, which depends on the phase velocity, are selected as the primary features that carry information, regarding the wave propagation and scattering. The following part of this dissertation presents a novel damage-localization framework, using a fully automated process. In order to construct the statistical model for autonomous damage localization deep-learning techniques, such as restricted Boltzmann machine and deep belief network
The issue of statistical power for overall model fit in evaluating structural equation models
Directory of Open Access Journals (Sweden)
Richard HERMIDA
2015-06-01
Full Text Available Statistical power is an important concept for psychological research. However, examining the power of a structural equation model (SEM is rare in practice. This article provides an accessible review of the concept of statistical power for the Root Mean Square Error of Approximation (RMSEA index of overall model fit in structural equation modeling. By way of example, we examine the current state of power in the literature by reviewing studies in top Industrial-Organizational (I/O Psychology journals using SEMs. Results indicate that in many studies, power is very low, which implies acceptance of invalid models. Additionally, we examined methodological situations which may have an influence on statistical power of SEMs. Results showed that power varies significantly as a function of model type and whether or not the model is the main model for the study. Finally, results indicated that power is significantly related to model fit statistics used in evaluating SEMs. The results from this quantitative review imply that researchers should be more vigilant with respect to power in structural equation modeling. We therefore conclude by offering methodological best practices to increase confidence in the interpretation of structural equation modeling results with respect to statistical power issues.
Stamm, John W.; Long, D. Leann; Kincade, Megan E.
2012-01-01
Over the past five to ten years, zero-inflated count regression models have been increasingly applied to the analysis of dental caries indices (e.g., DMFT, dfms, etc). The main reason for that is linked to the broad decline in children’s caries experience, such that dmf and DMF indices more frequently generate low or even zero counts. This article specifically reviews the application of zero-inflated Poisson and zero-inflated negative binomial regression models to dental caries, with emphasis on the description of the models and the interpretation of fitted model results given the study goals. The review finds that interpretations provided in the published caries research are often imprecise or inadvertently misleading, particularly with respect to failing to discriminate between inference for the class of susceptible persons defined by such models and inference for the sampled population in terms of overall exposure effects. Recommendations are provided to enhance the use as well as the interpretation and reporting of results of count regression models when applied to epidemiological studies of dental caries. PMID:22710271
International Nuclear Information System (INIS)
Bessis, J.
1986-09-01
Methods are described for calculating the probabilities, p(m), of detection of m neutrons, inside a split millisecond counting gate, m varying from zero to some units. At the present stage, these methods suppose the source to be very small. Using the generating function concept, they concern both possible modes of the counting system, for opening gates, i.e.: 1) Trigger pulses randomly with regard to the emitted neutrons, 2) Trigger pulses from the detected neutrons themselves. Computed values are finally compared to the measured ones. This comparison seems to be very favourable, since the respective deviations are often lower than 1 % [fr
Goedhart, Paul W; van der Voet, Hilko; Baldacchino, Ferdinando; Arpaia, Salvatore
2014-04-01
Genetic modification of plants may result in unintended effects causing potentially adverse effects on the environment. A comparative safety assessment is therefore required by authorities, such as the European Food Safety Authority, in which the genetically modified plant is compared with its conventional counterpart. Part of the environmental risk assessment is a comparative field experiment in which the effect on non-target organisms is compared. Statistical analysis of such trials come in two flavors: difference testing and equivalence testing. It is important to know the statistical properties of these, for example, the power to detect environmental change of a given magnitude, before the start of an experiment. Such prospective power analysis can best be studied by means of a statistical simulation model. This paper describes a general framework for simulating data typically encountered in environmental risk assessment of genetically modified plants. The simulation model, available as Supplementary Material, can be used to generate count data having different statistical distributions possibly with excess-zeros. In addition the model employs completely randomized or randomized block experiments, can be used to simulate single or multiple trials across environments, enables genotype by environment interaction by adding random variety effects, and finally includes repeated measures in time following a constant, linear or quadratic pattern in time possibly with some form of autocorrelation. The model also allows to add a set of reference varieties to the GM plants and its comparator to assess the natural variation which can then be used to set limits of concern for equivalence testing. The different count distributions are described in some detail and some examples of how to use the simulation model to study various aspects, including a prospective power analysis, are provided.
Goedhart, Paul W; van der Voet, Hilko; Baldacchino, Ferdinando; Arpaia, Salvatore
2014-01-01
Genetic modification of plants may result in unintended effects causing potentially adverse effects on the environment. A comparative safety assessment is therefore required by authorities, such as the European Food Safety Authority, in which the genetically modified plant is compared with its conventional counterpart. Part of the environmental risk assessment is a comparative field experiment in which the effect on non-target organisms is compared. Statistical analysis of such trials come in two flavors: difference testing and equivalence testing. It is important to know the statistical properties of these, for example, the power to detect environmental change of a given magnitude, before the start of an experiment. Such prospective power analysis can best be studied by means of a statistical simulation model. This paper describes a general framework for simulating data typically encountered in environmental risk assessment of genetically modified plants. The simulation model, available as Supplementary Material, can be used to generate count data having different statistical distributions possibly with excess-zeros. In addition the model employs completely randomized or randomized block experiments, can be used to simulate single or multiple trials across environments, enables genotype by environment interaction by adding random variety effects, and finally includes repeated measures in time following a constant, linear or quadratic pattern in time possibly with some form of autocorrelation. The model also allows to add a set of reference varieties to the GM plants and its comparator to assess the natural variation which can then be used to set limits of concern for equivalence testing. The different count distributions are described in some detail and some examples of how to use the simulation model to study various aspects, including a prospective power analysis, are provided. PMID:24834325
Statistical osteoporosis models using composite finite elements: a parameter study.
Wolfram, Uwe; Schwen, Lars Ole; Simon, Ulrich; Rumpf, Martin; Wilke, Hans-Joachim
2009-09-18
Osteoporosis is a widely spread disease with severe consequences for patients and high costs for health care systems. The disease is characterised by a loss of bone mass which induces a loss of mechanical performance and structural integrity. It was found that transverse trabeculae are thinned and perforated while vertical trabeculae stay intact. For understanding these phenomena and the mechanisms leading to fractures of trabecular bone due to osteoporosis, numerous researchers employ micro-finite element models. To avoid disadvantages in setting up classical finite element models, composite finite elements (CFE) can be used. The aim of the study is to test the potential of CFE. For that, a parameter study on numerical lattice samples with statistically simulated, simplified osteoporosis is performed. These samples are subjected to compression and shear loading. Results show that the biggest drop of compressive stiffness is reached for transverse isotropic structures losing 32% of the trabeculae (minus 89.8% stiffness). The biggest drop in shear stiffness is found for an isotropic structure also losing 32% of the trabeculae (minus 67.3% stiffness). The study indicates that losing trabeculae leads to a worse drop of macroscopic stiffness than thinning of trabeculae. The results further demonstrate the advantages of CFEs for simulating micro-structured samples.
Automated robust generation of compact 3D statistical shape models
Vrtovec, Tomaz; Likar, Bostjan; Tomazevic, Dejan; Pernus, Franjo
2004-05-01
Ascertaining the detailed shape and spatial arrangement of anatomical structures is important not only within diagnostic settings but also in the areas of planning, simulation, intraoperative navigation, and tracking of pathology. Robust, accurate and efficient automated segmentation of anatomical structures is difficult because of their complexity and inter-patient variability. Furthermore, the position of the patient during image acquisition, the imaging device and protocol, image resolution, and other factors induce additional variations in shape and appearance. Statistical shape models (SSMs) have proven quite successful in capturing structural variability. A possible approach to obtain a 3D SSM is to extract reference voxels by precisely segmenting the structure in one, reference image. The corresponding voxels in other images are determined by registering the reference image to each other image. The SSM obtained in this way describes statistically plausible shape variations over the given population as well as variations due to imperfect registration. In this paper, we present a completely automated method that significantly reduces shape variations induced by imperfect registration, thus allowing a more accurate description of variations. At each iteration, the derived SSM is used for coarse registration, which is further improved by describing finer variations of the structure. The method was tested on 64 lumbar spinal column CT scans, from which 23, 38, 45, 46 and 42 volumes of interest containing vertebra L1, L2, L3, L4 and L5, respectively, were extracted. Separate SSMs were generated for each vertebra. The results show that the method is capable of reducing the variations induced by registration errors.
Halyo, Nesim; Choi, Sang H.
1987-01-01
Two count conversion algorithms and the associated dynamic sensor model for the M/WFOV nonscanner radiometers are defined. The sensor model provides and updates the constants necessary for the conversion algorithms, though the frequency with which these updates were needed was uncertain. This analysis therefore develops mathematical models for the conversion of irradiance at the sensor field of view (FOV) limiter into data counts, derives from this model two algorithms for the conversion of data counts to irradiance at the sensor FOV aperture and develops measurement models which account for a specific target source together with a sensor. The resulting algorithms are of the gain/offset and Kalman filter types. The gain/offset algorithm was chosen since it provided sufficient accuracy using simpler computations.
Patch-based generative shape model and MDL model selection for statistical analysis of archipelagos
DEFF Research Database (Denmark)
Ganz, Melanie; Nielsen, Mads; Brandt, Sami
2010-01-01
as a probability distribution of a binary image where the model is intended to facilitate sequential simulation. Our results show that a relatively simple model is able to generate structures visually similar to calcifications. Furthermore, we used the shape model as a shape prior in the statistical segmentation......We propose a statistical generative shape model for archipelago-like structures. These kind of structures occur, for instance, in medical images, where our intention is to model the appearance and shapes of calcifications in x-ray radio graphs. The generative model is constructed by (1) learning...... a patch-based dictionary for possible shapes, (2) building up a time-homogeneous Markov model to model the neighbourhood correlations between the patches, and (3) automatic selection of the model complexity by the minimum description length principle. The generative shape model is proposed...
Earthquake statistics in a Block Slider Model and a fully dynamic Fault Model
Directory of Open Access Journals (Sweden)
D. Weatherley
2004-01-01
Full Text Available We examine the event statistics obtained from two differing simplified models for earthquake faults. The first model is a reproduction of the Block-Slider model of Carlson et al. (1991, a model often employed in seismicity studies. The second model is an elastodynamic fault model based upon the Lattice Solid Model (LSM of Mora and Place (1994. We performed simulations in which the fault length was varied in each model and generated synthetic catalogs of event sizes and times. From these catalogs, we constructed interval event size distributions and inter-event time distributions. The larger, localised events in the Block-Slider model displayed the same scaling behaviour as events in the LSM however the distribution of inter-event times was markedly different. The analysis of both event size and inter-event time statistics is an effective method for comparative studies of differing simplified models for earthquake faults.
DEFF Research Database (Denmark)
Scheike, Thomas Harder
2002-01-01
for the rate function, i.e., the instantaneous probability of an event conditional on only a selected set of covariates. When the rate function for the counting process is of Aalen form we show that the usual Aalen estimator can be used and gives almost unbiased estimates. The usual martingale based variance...
Local yield stress statistics in model amorphous solids
Barbot, Armand; Lerbinger, Matthias; Hernandez-Garcia, Anier; García-García, Reinaldo; Falk, Michael L.; Vandembroucq, Damien; Patinet, Sylvain
2018-03-01
We develop and extend a method presented by Patinet, Vandembroucq, and Falk [Phys. Rev. Lett. 117, 045501 (2016), 10.1103/PhysRevLett.117.045501] to compute the local yield stresses at the atomic scale in model two-dimensional Lennard-Jones glasses produced via differing quench protocols. This technique allows us to sample the plastic rearrangements in a nonperturbative manner for different loading directions on a well-controlled length scale. Plastic activity upon shearing correlates strongly with the locations of low yield stresses in the quenched states. This correlation is higher in more structurally relaxed systems. The distribution of local yield stresses is also shown to strongly depend on the quench protocol: the more relaxed the glass, the higher the local plastic thresholds. Analysis of the magnitude of local plastic relaxations reveals that stress drops follow exponential distributions, justifying the hypothesis of an average characteristic amplitude often conjectured in mesoscopic or continuum models. The amplitude of the local plastic rearrangements increases on average with the yield stress, regardless of the system preparation. The local yield stress varies with the shear orientation tested and strongly correlates with the plastic rearrangement locations when the system is sheared correspondingly. It is thus argued that plastic rearrangements are the consequence of shear transformation zones encoded in the glass structure that possess weak slip planes along different orientations. Finally, we justify the length scale employed in this work and extract the yield threshold statistics as a function of the size of the probing zones. This method makes it possible to derive physically grounded models of plasticity for amorphous materials by directly revealing the relevant details of the shear transformation zones that mediate this process.
Extension of the Wald statistic to models with dependent observations
Czech Academy of Sciences Publication Activity Database
Morales, D.; Pardo, L.; Pardo, M. C.; Vajda, Igor
2000-01-01
Roč. 52, č. 2 (2000), s. 97-113 ISSN 0026-1335 R&D Projects: GA ČR GA102/99/1137 Grant - others:DGES(ES) PB-960635; GV(ES) 99/159/01 Institutional research plan: AV0Z1075907 Keywords : composite parametric hypotheses * generalized likelihood ratio statistic * generalized Wald statistic Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.212, year: 2000
DEFF Research Database (Denmark)
A methodology is presented that combines modelling based on first principles and data based modelling into a modelling cycle that facilitates fast decision-making based on statistical methods. A strong feature of this methodology is that given a first principles model along with process data......, the corresponding modelling cycle model of the given system for a given purpose. A computer-aided tool, which integrates the elements of the modelling cycle, is also presented, and an example is given of modelling a fed-batch bioreactor....
DEFF Research Database (Denmark)
Græsbøll, Kaare; Kirkeby, Carsten Thure; Nielsen, Søren Saxmose
2016-01-01
using a herd level curve allows for estimating the cow production level from first the recording in the parity, while a two-parameter model requires more recordings for a credible estimate, but may more precisely predict persistence, and given the independence of parameters, these can be easily drawn....... Furthermore, we investigated how the parameters of lactation models correlate between parities and from dam to offspring. The aim of the study was to provide simple and robust models for cow level milk yield and somatic cell count for fitting to sparse data to parameterize herd- and cow-specific simulation...... than somatic cells per milliliter. A positive correlation was found between relative levels of the total somatic cell count and the milk yield. The variation of lactation and somatic cell count curves between farms highlights the importance of a herd level approach. The one-parameter per cow model...
Two statistical approaches, weighted regression on time, discharge, and season and generalized additive models, have recently been used to evaluate water quality trends in estuaries. Both models have been used in similar contexts despite differences in statistical foundations and...
Dataset of coded handwriting features for use in statistical modelling
Directory of Open Access Journals (Sweden)
Anna Agius
2018-02-01
Full Text Available The data presented here is related to the article titled, “Using handwriting to infer a writer's country of origin for forensic intelligence purposes” (Agius et al., 2017 [1]. This article reports original writer, spatial and construction characteristic data for thirty-seven English Australian writers and thirty-seven Vietnamese writers. All of these characteristics were coded and recorded in Microsoft Excel 2013 (version 15.31. The construction characteristics coded were only extracted from seven characters, which were: ‘g’, ‘h’, ‘th’, ‘M’, ‘0’, ‘7’ and ‘9’. The coded format of the writer, spatial and construction characteristics is made available in this Data in Brief in order to allow others to perform statistical analyses and modelling to investigate whether there is a relationship between the handwriting features and the nationality of the writer, and whether the two nationalities can be differentiated. Furthermore, to employ mathematical techniques that are capable of characterising the extracted features from each participant.
Increased Statistical Efficiency in a Lognormal Mean Model
Directory of Open Access Journals (Sweden)
Grant H. Skrepnek
2014-01-01
Full Text Available Within the context of clinical and other scientific research, a substantial need exists for an accurate determination of the point estimate in a lognormal mean model, given that highly skewed data are often present. As such, logarithmic transformations are often advocated to achieve the assumptions of parametric statistical inference. Despite this, existing approaches that utilize only a sample’s mean and variance may not necessarily yield the most efficient estimator. The current investigation developed and tested an improved efficient point estimator for a lognormal mean by capturing more complete information via the sample’s coefficient of variation. Results of an empirical simulation study across varying sample sizes and population standard deviations indicated relative improvements in efficiency of up to 129.47 percent compared to the usual maximum likelihood estimator and up to 21.33 absolute percentage points above the efficient estimator presented by Shen and colleagues (2006. The relative efficiency of the proposed estimator increased particularly as a function of decreasing sample size and increasing population standard deviation.
Statistical physics of medical diagnostics: Study of a probabilistic model
Mashaghi, Alireza; Ramezanpour, Abolfazl
2018-03-01
We study a diagnostic strategy which is based on the anticipation of the diagnostic process by simulation of the dynamical process starting from the initial findings. We show that such a strategy could result in more accurate diagnoses compared to a strategy that is solely based on the direct implications of the initial observations. We demonstrate this by employing the mean-field approximation of statistical physics to compute the posterior disease probabilities for a given subset of observed signs (symptoms) in a probabilistic model of signs and diseases. A Monte Carlo optimization algorithm is then used to maximize an objective function of the sequence of observations, which favors the more decisive observations resulting in more polarized disease probabilities. We see how the observed signs change the nature of the macroscopic (Gibbs) states of the sign and disease probability distributions. The structure of these macroscopic states in the configuration space of the variables affects the quality of any approximate inference algorithm (so the diagnostic performance) which tries to estimate the sign-disease marginal probabilities. In particular, we find that the simulation (or extrapolation) of the diagnostic process is helpful when the disease landscape is not trivial and the system undergoes a phase transition to an ordered phase.
Olive mill wastewater characteristics: modelling and statistical analysis
Directory of Open Access Journals (Sweden)
Martins-Dias, Susete
2004-09-01
Full Text Available A synthesis of the work carried out on Olive Mill Wastewater (OMW characterisation is given, covering articles published over the last 50 years. Data on OMW characterisation found in the literature are summarised and correlations between them and with phenolic compounds content are sought. This permits the characteristics of an OMW to be estimated from one simple measurement: the phenolic compounds concentration. A model based on OMW characterisations accounting 6 countries was developed along with a model for Portuguese OMW. The statistical analysis of the correlations obtained indicates that Chemical Oxygen Demand of a given OMW is a second-degree polynomial function of its phenolic compounds concentration. Tests to evaluate the regressions significance were carried out, based on multivariable ANOVA analysis, on visual standardised residuals distribution and their means for confidence levels of 95 and 99 %, validating clearly these models. This modelling work will help in the future planning, operation and monitoring of an OMW treatment plant.Presentamos una síntesis de los trabajos realizados en los últimos 50 años relacionados con la caracterización del alpechín. Realizamos una recopilación de los datos publicados, buscando correlaciones entre los datos relativos al alpechín y los compuestos fenólicos. Esto permite la determinación de las características del alpechín a partir de una sola medida: La concentración de compuestos fenólicos. Proponemos dos modelos, uno basado en datos relativos a seis países y un segundo aplicado únicamente a Portugal. El análisis estadístico de las correlaciones obtenidas indica que la demanda química de oxígeno de un determinado alpechín es una función polinómica de segundo grado de su concentración de compuestos fenólicos. Se comprobó la significancia de esta correlación mediante la aplicación del análisis multivariable ANOVA, y además se evaluó la distribución de residuos y sus
Statistical Damage Detection of Civil Engineering Structures using ARMAV Models
DEFF Research Database (Denmark)
Andersen, P.; Kirkegaard, Poul Henning
In this paper a statistically based damage detection of a lattice steel mast is performed. By estimation of the modal parameters and their uncertainties it is possible to detect whether some of the modal parameters have changed with a statistical significance. The estimation of the uncertainties ...
Definitions and Models of Statistical Literacy: A Literature Review
Sharma, Sashi
2017-01-01
Despite statistical literacy being relatively new in statistics education research, it needs special attention as attempts are being made to enhance the teaching, learning and assessing of this sub-strand. It is important that teachers and researchers are aware of the challenges of teaching this literacy. In this article, the growing importance of…
Statistical model of stress corrosion cracking based on extended ...
Indian Academy of Sciences (India)
In the previous paper ({\\it Pramana – J. Phys.} 81(6), 1009 (2013)), the mechanism of stress corrosion cracking (SCC) based on non-quadratic form of Dirichlet energy was proposed and its statistical features were discussed. Following those results, we discuss here how SCC propagates on pipe wall statistically. It reveals ...
Maximum entropy principle and hydrodynamic models in statistical mechanics
International Nuclear Information System (INIS)
Trovato, M.; Reggiani, L.
2012-01-01
This review presents the state of the art of the maximum entropy principle (MEP) in its classical and quantum (QMEP) formulation. Within the classical MEP we overview a general theory able to provide, in a dynamical context, the macroscopic relevant variables for carrier transport in the presence of electric fields of arbitrary strength. For the macroscopic variables the linearized maximum entropy approach is developed including full-band effects within a total energy scheme. Under spatially homogeneous conditions, we construct a closed set of hydrodynamic equations for the small-signal (dynamic) response of the macroscopic variables. The coupling between the driving field and the energy dissipation is analyzed quantitatively by using an arbitrary number of moments of the distribution function. Analogously, the theoretical approach is applied to many one-dimensional n + nn + submicron Si structures by using different band structure models, different doping profiles, different applied biases and is validated by comparing numerical calculations with ensemble Monte Carlo simulations and with available experimental data. Within the quantum MEP we introduce a quantum entropy functional of the reduced density matrix, the principle of quantum maximum entropy is then asserted as fundamental principle of quantum statistical mechanics. Accordingly, we have developed a comprehensive theoretical formalism to construct rigorously a closed quantum hydrodynamic transport within a Wigner function approach. The theory is formulated both in thermodynamic equilibrium and nonequilibrium conditions, and the quantum contributions are obtained by only assuming that the Lagrange multipliers can be expanded in powers of ħ 2 , being ħ the reduced Planck constant. In particular, by using an arbitrary number of moments, we prove that: i) on a macroscopic scale all nonlocal effects, compatible with the uncertainty principle, are imputable to high-order spatial derivatives both of the
Modelling T4 cell count as a marker of HIV progression in the absence of any defence mechanism
Directory of Open Access Journals (Sweden)
VSS Yadavalli
2010-12-01
Full Text Available The T4 cell count, which is considered one of the markers of disease progression in an HIV infected individual, is modelled in this paper. The World Health Organisation has recently advocated that countries encourage HIV infected individuals to commence antiretroviral treatments once their T4 cell count drops below 350 cells per ml of blood (this threshold was formerly 200 cells per ml of blood. This recommendation is made because when the T4 cell count is low, the T4 cells are unable to mount an effective immune response against antigens and any such foreign matters in the body, and consequently the individual becomes susceptible to opportunistic infections and lymphomas. A stochastic catastrophe model is developed in this paper to obtain the mean, variance and covariance of the uninfected, infected and lysed T4 cells. The amount of toxin produced in an HIV infected person from the time of infection to a later time may also be obtained from the model. Numerical illustrations of the correlation structures between uninfected and infected T4 cells, and between the infected and lysed T4 cells are also presented.
Cosmic Statistics of Statistics
Szapudi, I.; Colombi, S.; Bernardeau, F.
1999-01-01
The errors on statistics measured in finite galaxy catalogs are exhaustively investigated. The theory of errors on factorial moments by Szapudi & Colombi (1996) is applied to cumulants via a series expansion method. All results are subsequently extended to the weakly non-linear regime. Together with previous investigations this yields an analytic theory of the errors for moments and connected moments of counts in cells from highly nonlinear to weakly nonlinear scales. The final analytic formu...
Statistical behaviour of adaptive multilevel splitting algorithms in simple models
International Nuclear Information System (INIS)
Rolland, Joran; Simonnet, Eric
2015-01-01
Adaptive multilevel splitting algorithms have been introduced rather recently for estimating tail distributions in a fast and efficient way. In particular, they can be used for computing the so-called reactive trajectories corresponding to direct transitions from one metastable state to another. The algorithm is based on successive selection–mutation steps performed on the system in a controlled way. It has two intrinsic parameters, the number of particles/trajectories and the reaction coordinate used for discriminating good or bad trajectories. We investigate first the convergence in law of the algorithm as a function of the timestep for several simple stochastic models. Second, we consider the average duration of reactive trajectories for which no theoretical predictions exist. The most important aspect of this work concerns some systems with two degrees of freedom. They are studied in detail as a function of the reaction coordinate in the asymptotic regime where the number of trajectories goes to infinity. We show that during phase transitions, the statistics of the algorithm deviate significatively from known theoretical results when using non-optimal reaction coordinates. In this case, the variance of the algorithm is peaking at the transition and the convergence of the algorithm can be much slower than the usual expected central limit behaviour. The duration of trajectories is affected as well. Moreover, reactive trajectories do not correspond to the most probable ones. Such behaviour disappears when using the optimal reaction coordinate called committor as predicted by the theory. We finally investigate a three-state Markov chain which reproduces this phenomenon and show logarithmic convergence of the trajectory durations
Modelling malaria treatment practices in Bangladesh using spatial statistics
Directory of Open Access Journals (Sweden)
Haque Ubydul
2012-03-01
Full Text Available Abstract Background Malaria treatment-seeking practices vary worldwide and Bangladesh is no exception. Individuals from 88 villages in Rajasthali were asked about their treatment-seeking practices. A portion of these households preferred malaria treatment from the National Control Programme, but still a large number of households continued to use drug vendors and approximately one fourth of the individuals surveyed relied exclusively on non-control programme treatments. The risks of low-control programme usage include incomplete malaria treatment, possible misuse of anti-malarial drugs, and an increased potential for drug resistance. Methods The spatial patterns of treatment-seeking practices were first examined using hot-spot analysis (Local Getis-Ord Gi statistic and then modelled using regression. Ordinary least squares (OLS regression identified key factors explaining more than 80% of the variation in control programme and vendor treatment preferences. Geographically weighted regression (GWR was then used to assess where each factor was a strong predictor of treatment-seeking preferences. Results Several factors including tribal affiliation, housing materials, household densities, education levels, and proximity to the regional urban centre, were found to be effective predictors of malaria treatment-seeking preferences. The predictive strength of each of these factors, however, varied across the study area. While education, for example, was a strong predictor in some villages, it was less important for predicting treatment-seeking outcomes in other villages. Conclusion Understanding where each factor is a strong predictor of treatment-seeking outcomes may help in planning targeted interventions aimed at increasing control programme usage. Suggested strategies include providing additional training for the Building Resources across Communities (BRAC health workers, implementing educational programmes, and addressing economic factors.
International Nuclear Information System (INIS)
He Xin; Links, Jonathan M; Frey, Eric C
2010-01-01
Quantum noise as well as anatomic and uptake variability in patient populations limits observer performance on a defect detection task in myocardial perfusion SPECT (MPS). The goal of this study was to investigate the relative importance of these two effects by varying acquisition time, which determines the count level, and assessing the change in performance on a myocardial perfusion (MP) defect detection task using both mathematical and human observers. We generated ten sets of projections of a simulated patient population with count levels ranging from 1/128 to around 15 times a typical clinical count level to simulate different levels of quantum noise. For the simulated population we modeled variations in patient, heart and defect size, heart orientation and shape, defect location, organ uptake ratio, etc. The projection data were reconstructed using the OS-EM algorithm with no compensation or with attenuation, detector response and scatter compensation (ADS). The images were then post-filtered and reoriented to generate short-axis slices. A channelized Hotelling observer (CHO) was applied to the short-axis images, and the area under the receiver operating characteristics (ROC) curve (AUC) was computed. For each noise level and reconstruction method, we optimized the number of iterations and cutoff frequencies of the Butterworth filter to maximize the AUC. Using the images obtained with the optimal iteration and cutoff frequency and ADS compensation, we performed human observer studies for four count levels to validate the CHO results. Both CHO and human observer studies demonstrated that observer performance was dependent on the relative magnitude of the quantum noise and the patient variation. When the count level was high, the patient variation dominated, and the AUC increased very slowly with changes in the count level for the same level of anatomic variability. When the count level was low, however, quantum noise dominated, and changes in the count level
Monthly to seasonal low flow prediction: statistical versus dynamical models
Ionita-Scholz, Monica; Klein, Bastian; Meissner, Dennis; Rademacher, Silke
2016-04-01
the Alfred Wegener Institute a purely statistical scheme to generate streamflow forecasts for several months ahead. Instead of directly using teleconnection indices (e.g. NAO, AO) the idea is to identify regions with stable teleconnections between different global climate information (e.g. sea surface temperature, geopotential height etc.) and streamflow at different gauges relevant for inland waterway transport. So-called stability (correlation) maps are generated showing regions where streamflow and climate variable from previous months are significantly correlated in a 21 (31) years moving window. Finally, the optimal forecast model is established based on a multiple regression analysis of the stable predictors. We will present current results of the aforementioned approaches with focus on the River Rhine (being one of the world's most frequented waterways and the backbone of the European inland waterway network) and the Elbe River. Overall, our analysis reveals the existence of a valuable predictability of the low flows at monthly and seasonal time scales, a result that may be useful to water resources management. Given that all predictors used in the models are available at the end of each month, the forecast scheme can be used operationally to predict extreme events and to provide early warnings for upcoming low flows.
A counting process model of survival of parallel load-sharing system
Czech Academy of Sciences Publication Activity Database
Volf, Petr; Linka, A.
2001-01-01
Roč. 37, č. 1 (2001), s. 47-60 ISSN 0023-5954 R&D Projects: GA ČR GA402/98/0472; GA MŠk VS97084 Institutional research plan: AV0Z1075907 Keywords : reliability * mathematical statistics * parallel system Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.316, year: 2001
Mohebbi, Mohammadreza; Wolfe, Rory; Forbes, Andrew
2014-01-01
This paper applies the generalised linear model for modelling geographical variation to esophageal cancer incidence data in the Caspian region of Iran. The data have a complex and hierarchical structure that makes them suitable for hierarchical analysis using Bayesian techniques, but with care required to deal with problems arising from counts of events observed in small geographical areas when overdispersion and residual spatial autocorrelation are present. These considerations lead to nine regression models derived from using three probability distributions for count data: Poisson, generalised Poisson and negative binomial, and three different autocorrelation structures. We employ the framework of Bayesian variable selection and a Gibbs sampling based technique to identify significant cancer risk factors. The framework deals with situations where the number of possible models based on different combinations of candidate explanatory variables is large enough such that calculation of posterior probabilities for all models is difficult or infeasible. The evidence from applying the modelling methodology suggests that modelling strategies based on the use of generalised Poisson and negative binomial with spatial autocorrelation work well and provide a robust basis for inference. PMID:24413702
Improving statistical reasoning: theoretical models and practical implications
National Research Council Canada - National Science Library
Sedlmeier, Peter
1999-01-01
... in Psychology? 206 References 216 Author Index 230 Subject Index 235 v PrefacePreface Statistical literacy, the art of drawing reasonable inferences from an abundance of numbers provided daily by...
A New Statistic for Evaluating Item Response Theory Models for Ordinal Data. CRESST Report 839
Cai, Li; Monroe, Scott
2014-01-01
We propose a new limited-information goodness of fit test statistic C[subscript 2] for ordinal IRT models. The construction of the new statistic lies formally between the M[subscript 2] statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M*[subscript 2] statistic of Cai and Hansen…
Liu, Lian; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia
2017-08-31
As a newly emerged research area, RNA epigenetics has drawn increasing attention recently for the participation of RNA methylation and other modifications in a number of crucial biological processes. Thanks to high throughput sequencing techniques, such as, MeRIP-Seq, transcriptome-wide RNA methylation profile is now available in the form of count-based data, with which it is often of interests to study the dynamics at epitranscriptomic layer. However, the sample size of RNA methylation experiment is usually very small due to its costs; and additionally, there usually exist a large number of genes whose methylation level cannot be accurately estimated due to their low expression level, making differential RNA methylation analysis a difficult task. We present QNB, a statistical approach for differential RNA methylation analysis with count-based small-sample sequencing data. Compared with previous approaches such as DRME model based on a statistical test covering the IP samples only with 2 negative binomial distributions, QNB is based on 4 independent negative binomial distributions with their variances and means linked by local regressions, and in the way, the input control samples are also properly taken care of. In addition, different from DRME approach, which relies only the input control sample only for estimating the background, QNB uses a more robust estimator for gene expression by combining information from both input and IP samples, which could largely improve the testing performance for very lowly expressed genes. QNB showed improved performance on both simulated and real MeRIP-Seq datasets when compared with competing algorithms. And the QNB model is also applicable to other datasets related RNA modifications, including but not limited to RNA bisulfite sequencing, m 1 A-Seq, Par-CLIP, RIP-Seq, etc.
Computational and Statistical Models: A Comparison for Policy Modeling of Childhood Obesity
Mabry, Patricia L.; Hammond, Ross; Ip, Edward Hak-Sing; Huang, Terry T.-K.
As systems science methodologies have begun to emerge as a set of innovative approaches to address complex problems in behavioral, social science, and public health research, some apparent conflicts with traditional statistical methodologies for public health have arisen. Computational modeling is an approach set in context that integrates diverse sources of data to test the plausibility of working hypotheses and to elicit novel ones. Statistical models are reductionist approaches geared towards proving the null hypothesis. While these two approaches may seem contrary to each other, we propose that they are in fact complementary and can be used jointly to advance solutions to complex problems. Outputs from statistical models can be fed into computational models, and outputs from computational models can lead to further empirical data collection and statistical models. Together, this presents an iterative process that refines the models and contributes to a greater understanding of the problem and its potential solutions. The purpose of this panel is to foster communication and understanding between statistical and computational modelers. Our goal is to shed light on the differences between the approaches and convey what kinds of research inquiries each one is best for addressing and how they can serve complementary (and synergistic) roles in the research process, to mutual benefit. For each approach the panel will cover the relevant "assumptions" and how the differences in what is assumed can foster misunderstandings. The interpretations of the results from each approach will be compared and contrasted and the limitations for each approach will be delineated. We will use illustrative examples from CompMod, the Comparative Modeling Network for Childhood Obesity Policy. The panel will also incorporate interactive discussions with the audience on the issues raised here.
Evaluation of Deterministic and Stochastic Components of Traffic Counts
Directory of Open Access Journals (Sweden)
Ivan Bošnjak
2012-10-01
Full Text Available Traffic counts or statistical evidence of the traffic processare often a characteristic of time-series data. In this paper fundamentalproblem of estimating deterministic and stochasticcomponents of a traffic process are considered, in the context of"generalised traffic modelling". Different methods for identificationand/or elimination of the trend and seasonal componentsare applied for concrete traffic counts. Further investigationsand applications of ARIMA models, Hilbert space formulationsand state-space representations are suggested.
A two-component rain model for the prediction of attenuation statistics
Crane, R. K.
1982-01-01
A two-component rain model has been developed for calculating attenuation statistics. In contrast to most other attenuation prediction models, the two-component model calculates the occurrence probability for volume cells or debris attenuation events. The model performed significantly better than the International Radio Consultative Committee model when used for predictions on earth-satellite paths. It is expected that the model will have applications in modeling the joint statistics required for space diversity system design, the statistics of interference due to rain scatter at attenuating frequencies, and the duration statistics for attenuation events.
Dharmarathna, Sinhalagoda Lekamlage Chandi Asoka; Wickramasinghe, Susiji; Waduge, Roshitha Nilmini; Rajapakse, Rajapakse Peramune Veddikkarage Jayanthe; Kularatne, Senanayake Abeysinghe Mudiyanselage
2013-09-01
To investigate the potential role of fresh Carica papaya (C. papaya) leaf extract on haematological and biochemical parameters and toxicological changes in a murine model. In total 36 mice were used for the trial. Fresh C. papaya leaf extract [0.2 mL (2 g)/mouse] was given only to the test group (18 mice). General behavior, clinical signs and feeding patterns were recorded. Blood and tissue samples were collected at intervals. Haematological parameters including platelet, red blood cell (RBC), white blood cell (WBC), packed cell volume (PCV), serum biochemistry including serum creatinine, serum glutamic-oxaloacetic transaminase (SGOT) and serum glutamic-pyruvic transaminase (SGPT) were determined. Organs for possible histopathological changes were examined. Neither group exhibited alteration of behavior or reduction in food and water intake. Similarly, no significant changes in SGOT, SGPT and serum creatinine levels were detected in the test group. Histopathological organ changes were not observed in either group of mice except in three liver samples of the test group which had a mild focal necrosis. The platelet count (11.33±0.35)×10⁵/µL (P=0.00004) and the RBC count (7.97±0.61)×10⁶/µL (P=0.00003) were significantly increased in the test group compared to that of the controls. However, WBC count and PCV (%) values were not changed significantly in the test group. The platelet count in the test group started to increase significantly from Day 3 (3.4±0.18×10⁵/µL), reaching almost a fourfold higher at Day 21 (11.3×10⁵/µL), while it was 3.8×10⁵/µL and 5.5×10⁵/µL at Day 3 and Day 21 respectively in the control. Likewise, the RBC count in the test group increased from 6×10⁶/µL to 9×10⁶/ µL at Day 21 while it remained near constant in the control group (6×10⁶/µL). Fresh C. papaya leaf extract significantly increased the platelet and RBC counts in the test group as compared to controls. Therefore, it is very important to identify
Teater, Barbra; Roy, Jessica; Carpenter, John; Forrester, Donald; Devaney, John; Scourfield, Jonathan
2017-01-01
Students in the United Kingdom (UK) are found to lack knowledge and skills in quantitative research methods. To address this gap, a quantitative research method and statistical analysis curriculum comprising 10 individual lessons was developed, piloted, and evaluated at two universities The evaluation found that BSW students' (N = 81)…
von Ruette, Jonas; Papritz, Andreas; Lehmann, Peter; Rickli, Christian; Or, Dani
2011-10-01
Statistical models that exploit the correlation between landslide occurrence and geomorphic properties are often used to map the spatial occurrence of shallow landslides triggered by heavy rainfalls. In many landslide susceptibility studies, the true predictive power of the statistical model remains unknown because the predictions are not validated with independent data from other events or areas. This study validates statistical susceptibility predictions with independent test data. The spatial incidence of landslides, triggered by an extreme rainfall in a study area, was modeled by logistic regression. The fitted model was then used to generate susceptibility maps for another three study areas, for which event-based landslide inventories were also available. All the study areas lie in the northern foothills of the Swiss Alps. The landslides had been triggered by heavy rainfall either in 2002 or 2005. The validation was designed such that the first validation study area shared the geomorphology and the second the triggering rainfall event with the calibration study area. For the third validation study area, both geomorphology and rainfall were different. All explanatory variables were extracted for the logistic regression analysis from high-resolution digital elevation and surface models (2.5 m grid). The model fitted to the calibration data comprised four explanatory variables: (i) slope angle (effect of gravitational driving forces), (ii) vegetation type (grassland and forest; root reinforcement), (iii) planform curvature (convergent water flow paths), and (iv) contributing area (potential supply of water). The area under the Receiver Operating Characteristic (ROC) curve ( AUC) was used to quantify the predictive performance of the logistic regression model. The AUC values were computed for the susceptibility maps of the three validation study areas (validation AUC), the fitted susceptibility map of the calibration study area (apparent AUC: 0.80) and another
Information Geometric Complexity of a Trivariate Gaussian Statistical Model
Directory of Open Access Journals (Sweden)
Domenico Felice
2014-05-01
Full Text Available We evaluate the information geometric complexity of entropic motion on low-dimensional Gaussian statistical manifolds in order to quantify how difficult it is to make macroscopic predictions about systems in the presence of limited information. Specifically, we observe that the complexity of such entropic inferences not only depends on the amount of available pieces of information but also on the manner in which such pieces are correlated. Finally, we uncover that, for certain correlational structures, the impossibility of reaching the most favorable configuration from an entropic inference viewpoint seems to lead to an information geometric analog of the well-known frustration effect that occurs in statistical physics.
Validation of the measure automobile emissions model : a statistical analysis
2000-09-01
The Mobile Emissions Assessment System for Urban and Regional Evaluation (MEASURE) model provides an external validation capability for hot stabilized option; the model is one of several new modal emissions models designed to predict hot stabilized e...
Parameterizing Phrase Based Statistical Machine Translation Models: An Analytic Study
Cer, Daniel
2011-01-01
The goal of this dissertation is to determine the best way to train a statistical machine translation system. I first develop a state-of-the-art machine translation system called Phrasal and then use it to examine a wide variety of potential learning algorithms and optimization criteria and arrive at two very surprising results. First, despite the…
Applications of spatial statistical network models to stream data
Daniel J. Isaak; Erin E. Peterson; Jay M. Ver Hoef; Seth J. Wenger; Jeffrey A. Falke; Christian E. Torgersen; Colin Sowder; E. Ashley Steel; Marie-Josee Fortin; Chris E. Jordan; Aaron S. Ruesch; Nicholas Som; Pascal. Monestiez
2014-01-01
Streams and rivers host a significant portion of Earth's biodiversity and provide important ecosystem services for human populations. Accurate information regarding the status and trends of stream resources is vital for their effective conservation and management. Most statistical techniques applied to data measured on stream networks were developed for...
Monte Carlo simulation of quantum statistical lattice models
Raedt, Hans De; Lagendijk, Ad
1985-01-01
In this article we review recent developments in computational methods for quantum statistical lattice problems. We begin by giving the necessary mathematical basis, the generalized Trotter formula, and discuss the computational tools, exact summations and Monte Carlo simulation, that will be used
On cumulative process model and its statistical analysis
Czech Academy of Sciences Publication Activity Database
Volf, Petr
2000-01-01
Roč. 36, č. 2 (2000), s. 165-176 ISSN 0023-5954 R&D Projects: GA ČR GA201/97/0354; GA ČR GA402/98/0742 Institutional research plan: AV0Z1075907 Subject RIV: BB - Applied Statistics, Operational Research
Statistical model of stress corrosion cracking based on extended ...
Indian Academy of Sciences (India)
2016-09-07
Sep 7, 2016 ... Abstract. In the previous paper (Pramana – J. Phys. 81(6), 1009 (2013)), the mechanism of stress corrosion cracking (SCC) based on non-quadratic form of Dirichlet energy was proposed and its statistical features were discussed. Following those results, we discuss here how SCC propagates on pipe wall ...
Learning Statistical Patterns in Relational Data Using Probabilistic Relational Models
National Research Council Canada - National Science Library
Koller, Daphne
2005-01-01
.... This effort focused on developing undirected probabilistic models for representing and learning graph patterns, learning patterns involving links between objects, learning discriminative models...
Statistical description of tropospheric delay for InSAR : Overview and a new model
DEFF Research Database (Denmark)
Merryman Boncori, John Peter; Mohr, Johan Jacob
2007-01-01
This paper focuses on statistical modeling of water vapor fluctuations for InSAR. The structure function and power spectral density approaches are reviewed, summarizing their assumptions and results. The linking equations between these modeling techniques are reported. A structure function model ...... of these, to atmospheric statistics. The latter approach is used to compare the derived model with previously published results....
DEFF Research Database (Denmark)
ter Beek, Maurice H.; Legay, Axel; Lluch Lafuente, Alberto
2015-01-01
We investigate the suitability of statistical model checking techniques for analysing quantitative properties of software product line models with probabilistic aspects. For this purpose, we enrich the feature-oriented language FLAN with action rates, which specify the likelihood of exhibiting...... particular behaviour or of installing features at a specific moment or in a specific order. The enriched language (called PFLAN) allows us to specify models of software product lines with probabilistic configurations and behaviour, e.g. by considering a PFLAN semantics based on discrete-time Markov chains....... The Maude implementation of PFLAN is combined with the distributed statistical model checker MultiVeStA to perform quantitative analyses of a simple product line case study. The presented analyses include the likelihood of certain behaviour of interest (e.g. product malfunctioning) and the expected average...
Directory of Open Access Journals (Sweden)
Maurice H. ter Beek
2015-04-01
Full Text Available We investigate the suitability of statistical model checking techniques for analysing quantitative properties of software product line models with probabilistic aspects. For this purpose, we enrich the feature-oriented language FLan with action rates, which specify the likelihood of exhibiting particular behaviour or of installing features at a specific moment or in a specific order. The enriched language (called PFLan allows us to specify models of software product lines with probabilistic configurations and behaviour, e.g. by considering a PFLan semantics based on discrete-time Markov chains. The Maude implementation of PFLan is combined with the distributed statistical model checker MultiVeStA to perform quantitative analyses of a simple product line case study. The presented analyses include the likelihood of certain behaviour of interest (e.g. product malfunctioning and the expected average cost of products.
Gallagher, H. Colin; Robins, Garry
2015-01-01
As part of the shift within second language acquisition (SLA) research toward complex systems thinking, researchers have called for investigations of social network structure. One strand of social network analysis yet to receive attention in SLA is network statistical models, whereby networks are explained in terms of smaller substructures of…
Study on Semi-Parametric Statistical Model of Safety Monitoring of Cracks in Concrete Dams
Gu, Chongshi; Qin, Dong; Li, Zhanchao; Zheng, Xueqin
2013-01-01
Cracks are one of the hidden dangers in concrete dams. The study on safety monitoring models of concrete dam cracks has always been difficult. Using the parametric statistical model of safety monitoring of cracks in concrete dams, with the help of the semi-parametric statistical theory, and considering the abnormal behaviors of these cracks, the semi-parametric statistical model of safety monitoring of concrete dam cracks is established to overcome the limitation of the parametric model in ex...
Creel, Michael; Loomis, John
1992-10-01
The recreational benefits from providing increased quantities of water to wildlife and fisheries habitats is estimated using linked multinomial logit site selection models and count data trip frequency models. The study encompasses waterfowl hunting, fishing and wildlife viewing at 14 recreational resources in the San Joaquin Valley, including the National Wildlife Refuges, the State Wildlife Management Areas, and six river destinations. The economic benefits of increasing water supplies to wildlife refuges were also examined by using the estimated models to predict changing patterns of site selection and overall participation due to increases in water allocations. Estimates of the dollar value per acre foot of water are calculated for increases in water to refuges. The resulting model is a flexible and useful tool for estimating the economic benefits of alternative water allocation policies for wildlife habitat and rivers.
Stirrup, Oliver T; Babiker, Abdel G; Carpenter, James R; Copas, Andrew J
2016-04-30
Longitudinal data are widely analysed using linear mixed models, with 'random slopes' models particularly common. However, when modelling, for example, longitudinal pre-treatment CD4 cell counts in HIV-positive patients, the incorporation of non-stationary stochastic processes such as Brownian motion has been shown to lead to a more biologically plausible model and a substantial improvement in model fit. In this article, we propose two further extensions. Firstly, we propose the addition of a fractional Brownian motion component, and secondly, we generalise the model to follow a multivariate-t distribution. These extensions are biologically plausible, and each demonstrated substantially improved fit on application to example data from the Concerted Action on SeroConversion to AIDS and Death in Europe study. We also propose novel procedures for residual diagnostic plots that allow such models to be assessed. Cohorts of patients were simulated from the previously reported and newly developed models in order to evaluate differences in predictions made for the timing of treatment initiation under different clinical management strategies. A further simulation study was performed to demonstrate the substantial biases in parameter estimates of the mean slope of CD4 decline with time that can occur when random slopes models are applied in the presence of censoring because of treatment initiation, with the degree of bias found to depend strongly on the treatment initiation rule applied. Our findings indicate that researchers should consider more complex and flexible models for the analysis of longitudinal biomarker data, particularly when there are substantial missing data, and that the parameter estimates from random slopes models must be interpreted with caution. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Linear System Models for Ultrasonic Imaging: Intensity Signal Statistics.
Abbey, Craig K; Zhu, Yang; Bahramian, Sara; Insana, Michael F
2017-04-01
Despite a great deal of work characterizing the statistical properties of radio frequency backscattered ultrasound signals, less is known about the statistical properties of demodulated intensity signals. Analysis of intensity is made more difficult by a strong nonlinearity that arises in the process of demodulation. This limits our ability to characterize the spatial resolution and noise properties of B-mode ultrasound images. In this paper, we generalize earlier results on two-point intensity covariance using a multivariate systems approach. We derive the mean and autocovariance function of the intensity signal under Gaussian assumptions on both the object scattering function and acquisition noise, and with the assumption of a locally shift-invariant pulse-echo system function. We investigate the limiting cases of point statistics and a uniform scattering field with a stationary distribution. Results from validation studies using simulation and data from a real system applied to a uniform scattering phantom are presented. In the simulation studies, we find errors less than 10% between the theoretical mean and variance, and sample estimates of these quantities. Prediction of the intensity power spectrum (PS) in the real system exhibits good qualitative agreement (errors less than 3.5 dB for frequencies between 0.1 and 10 cyc/mm, but with somewhat higher error outside this range that may be due to the use of a window in the PS estimation procedure). We also replicate the common finding that the intensity mean is equal to its standard deviation (i.e., signal-to-noise ratio = 1) for fully developed speckle. We show how the derived statistical properties can be used to characterize the quality of an ultrasound linear array for low-contrast patterns using generalized noise-equivalent quanta directly on the intensity signal.
Statistical model of the powder flow regulation by nanomaterials
Kurfess, D.; Hinrichsen, H.; Zimmermann, I.
2005-01-01
Fine powders often tend to agglomerate due to van der Waals forces between the particles. These forces can be reduced significantly by covering the particles with nanoscaled adsorbates, as shown by recent experiments. In the present work a quantitative statistical analysis of the effect of powder flow regulating nanomaterials on the adhesive forces in powders is given. Covering two spherical powder particles randomly with nanoadsorbates we compute the decrease of the mutual van der Waals forc...
International Nuclear Information System (INIS)
Lim, Gyeong Hui
2008-03-01
This book consists of 15 chapters, which are basic conception and meaning of statistical thermodynamics, Maxwell-Boltzmann's statistics, ensemble, thermodynamics function and fluctuation, statistical dynamics with independent particle system, ideal molecular system, chemical equilibrium and chemical reaction rate in ideal gas mixture, classical statistical thermodynamics, ideal lattice model, lattice statistics and nonideal lattice model, imperfect gas theory on liquid, theory on solution, statistical thermodynamics of interface, statistical thermodynamics of a high molecule system and quantum statistics
Hyppolite, Judex; Trivedi, Pravin
2012-06-01
Cross-sectional latent class regression models, also known as switching regressions or hidden Markov models, cannot identify transitions between classes that may occur over time. This limitation can potentially be overcome when panel data are available. For such data, we develop a sequence of models that combine features of the static cross-sectional latent class (finite mixture) models with those of hidden Markov models. We model the probability of movement between categories in terms of a Markovian structure, which links the current state with a previous state, where state may refer to the category of an individual. This article presents a suite of mixture models of varying degree of complexity and flexibility for use in a panel count data setting, beginning with a baseline model which is a two-component mixture of Poisson distribution in which latent classes are fixed and permanent. Sequentially, we extend this framework (i) to allow the mixing proportions to be smoothly varying continuous functions of time-varying covariates, (ii) to add time dependence to the benchmark model by modeling the class-indicator variable as a first-order Markov chain and (iii) to extend item (i) by making it dynamic and introducing covariate dependence in the transition probabilities. We develop and implement estimation algorithms for these models and provide an empirical illustration using 1995-1999 panel data on the number of doctor visits derived from the German Socio-Economic Panel. Copyright © 2012 John Wiley & Sons, Ltd.
Statistical analysis and model validation of automobile emissions
2000-09-01
The article discusses the development of a comprehensive modal emissions model that is currently being integrated with a variety of transportation models as part of National Cooperative Highway Research Program project 25-11. Described is the second-...
Cross-Lingual Lexical Triggers in Statistical Language Modeling
National Research Council Canada - National Science Library
Kim, Woosung; Khudanpur, Sanjeev
2003-01-01
.... We achieve this through an extension of the method of lexical triggers to the cross-language problem, and by developing a likelihoodbased adaptation scheme for combining a trigger model with an N-gram model...
Energy Technology Data Exchange (ETDEWEB)
Lovejoy, S., E-mail: lovejoy@physics.mcgill.ca [Physics Department, McGill University, Montreal, Quebec H3A 2T8 (Canada); Lima, M. I. P. de [Institute of Marine Research (IMAR) and Marine and Environmental Sciences Centre (MARE), Coimbra (Portugal); Department of Civil Engineering, University of Coimbra, 3030-788 Coimbra (Portugal)
2015-07-15
Over the range of time scales from about 10 days to 30–100 years, in addition to the familiar weather and climate regimes, there is an intermediate “macroweather” regime characterized by negative temporal fluctuation exponents: implying that fluctuations tend to cancel each other out so that averages tend to converge. We show theoretically and numerically that macroweather precipitation can be modeled by a stochastic weather-climate model (the Climate Extended Fractionally Integrated Flux, model, CEFIF) first proposed for macroweather temperatures and we show numerically that a four parameter space-time CEFIF model can approximately reproduce eight or so empirical space-time exponents. In spite of this success, CEFIF is theoretically and numerically difficult to manage. We therefore propose a simplified stochastic model in which the temporal behavior is modeled as a fractional Gaussian noise but the spatial behaviour as a multifractal (climate) cascade: a spatial extension of the recently introduced ScaLIng Macroweather Model, SLIMM. Both the CEFIF and this spatial SLIMM model have a property often implicitly assumed by climatologists that climate statistics can be “homogenized” by normalizing them with the standard deviation of the anomalies. Physically, it means that the spatial macroweather variability corresponds to different climate zones that multiplicatively modulate the local, temporal statistics. This simplified macroweather model provides a framework for macroweather forecasting that exploits the system's long range memory and spatial correlations; for it, the forecasting problem has been solved. We test this factorization property and the model with the help of three centennial, global scale precipitation products that we analyze jointly in space and in time.
Lovejoy, S; de Lima, M I P
2015-07-01
Over the range of time scales from about 10 days to 30-100 years, in addition to the familiar weather and climate regimes, there is an intermediate "macroweather" regime characterized by negative temporal fluctuation exponents: implying that fluctuations tend to cancel each other out so that averages tend to converge. We show theoretically and numerically that macroweather precipitation can be modeled by a stochastic weather-climate model (the Climate Extended Fractionally Integrated Flux, model, CEFIF) first proposed for macroweather temperatures and we show numerically that a four parameter space-time CEFIF model can approximately reproduce eight or so empirical space-time exponents. In spite of this success, CEFIF is theoretically and numerically difficult to manage. We therefore propose a simplified stochastic model in which the temporal behavior is modeled as a fractional Gaussian noise but the spatial behaviour as a multifractal (climate) cascade: a spatial extension of the recently introduced ScaLIng Macroweather Model, SLIMM. Both the CEFIF and this spatial SLIMM model have a property often implicitly assumed by climatologists that climate statistics can be "homogenized" by normalizing them with the standard deviation of the anomalies. Physically, it means that the spatial macroweather variability corresponds to different climate zones that multiplicatively modulate the local, temporal statistics. This simplified macroweather model provides a framework for macroweather forecasting that exploits the system's long range memory and spatial correlations; for it, the forecasting problem has been solved. We test this factorization property and the model with the help of three centennial, global scale precipitation products that we analyze jointly in space and in time.
Regional temperature models are needed for characterizing and mapping stream thermal regimes, establishing reference conditions, predicting future impacts and identifying critical thermal refugia. Spatial statistical models have been developed to improve regression modeling techn...
Medicaid Drug Claims Statistics
U.S. Department of Health & Human Services — The Medicaid Drug Claims Statistics CD is a useful tool that conveniently breaks up Medicaid claim counts and separates them by quarter and includes an annual count.
International Nuclear Information System (INIS)
Povoski, Stephen P; Chapman, Gregg J; Murrey, Douglas A; Lee, Robert; Martin, Edward W; Hall, Nathan C
2013-01-01
Intraoperative detection of 18 F-FDG-avid tissue sites during 18 F-FDG-directed surgery can be very challenging when utilizing gamma detection probes that rely on a fixed target-to-background (T/B) ratio (ratiometric threshold) for determination of probe positivity. The purpose of our study was to evaluate the counting efficiency and the success rate of in situ intraoperative detection of 18 F-FDG-avid tissue sites (using the three-sigma statistical threshold criteria method and the ratiometric threshold criteria method) for three different gamma detection probe systems. Of 58 patients undergoing 18 F-FDG-directed surgery for known or suspected malignancy using gamma detection probes, we identified nine 18 F-FDG-avid tissue sites (from amongst seven patients) that were seen on same-day preoperative diagnostic PET/CT imaging, and for which each 18 F-FDG-avid tissue site underwent attempted in situ intraoperative detection concurrently using three gamma detection probe systems (K-alpha probe, and two commercially-available PET-probe systems), and then were subsequently surgical excised. The mean relative probe counting efficiency ratio was 6.9 (± 4.4, range 2.2–15.4) for the K-alpha probe, as compared to 1.5 (± 0.3, range 1.0–2.1) and 1.0 (± 0, range 1.0–1.0), respectively, for two commercially-available PET-probe systems (P < 0.001). Successful in situ intraoperative detection of 18 F-FDG-avid tissue sites was more frequently accomplished with each of the three gamma detection probes tested by using the three-sigma statistical threshold criteria method than by using the ratiometric threshold criteria method, specifically with the three-sigma statistical threshold criteria method being significantly better than the ratiometric threshold criteria method for determining probe positivity for the K-alpha probe (P = 0.05). Our results suggest that the improved probe counting efficiency of the K-alpha probe design used in conjunction with the three
Povoski, Stephen P; Chapman, Gregg J; Murrey, Douglas A; Lee, Robert; Martin, Edward W; Hall, Nathan C
2013-03-04
Intraoperative detection of (18)F-FDG-avid tissue sites during 18F-FDG-directed surgery can be very challenging when utilizing gamma detection probes that rely on a fixed target-to-background (T/B) ratio (ratiometric threshold) for determination of probe positivity. The purpose of our study was to evaluate the counting efficiency and the success rate of in situ intraoperative detection of (18)F-FDG-avid tissue sites (using the three-sigma statistical threshold criteria method and the ratiometric threshold criteria method) for three different gamma detection probe systems. Of 58 patients undergoing (18)F-FDG-directed surgery for known or suspected malignancy using gamma detection probes, we identified nine (18)F-FDG-avid tissue sites (from amongst seven patients) that were seen on same-day preoperative diagnostic PET/CT imaging, and for which each (18)F-FDG-avid tissue site underwent attempted in situ intraoperative detection concurrently using three gamma detection probe systems (K-alpha probe, and two commercially-available PET-probe systems), and then were subsequently surgical excised. The mean relative probe counting efficiency ratio was 6.9 (± 4.4, range 2.2-15.4) for the K-alpha probe, as compared to 1.5 (± 0.3, range 1.0-2.1) and 1.0 (± 0, range 1.0-1.0), respectively, for two commercially-available PET-probe systems (P < 0.001). Successful in situ intraoperative detection of 18F-FDG-avid tissue sites was more frequently accomplished with each of the three gamma detection probes tested by using the three-sigma statistical threshold criteria method than by using the ratiometric threshold criteria method, specifically with the three-sigma statistical threshold criteria method being significantly better than the ratiometric threshold criteria method for determining probe positivity for the K-alpha probe (P = 0.05). Our results suggest that the improved probe counting efficiency of the K-alpha probe design used in conjunction with the three-sigma statistical
Oseloka Ezepue, Patrick; Ojo, Adegbola
2012-12-01
A challenging problem in some developing countries such as Nigeria is inadequate training of students in effective problem solving using the core concepts of their disciplines. Related to this is a disconnection between their learning and socio-economic development agenda of a country. These problems are more vivid in statistical education which is dominated by textbook examples and unbalanced assessment 'for' and 'of' learning within traditional curricula. The problems impede the achievement of socio-economic development objectives such as those stated in the Nigerian Vision 2020 blueprint and United Nations Millennium Development Goals. They also impoverish the ability of (statistics) graduates to creatively use their knowledge in relevant business and industry sectors, thereby exacerbating mass graduate unemployment in Nigeria and similar developing countries. This article uses a case study in statistical modelling to discuss the nature of innovations in statistics education vital to producing new kinds of graduates who can link their learning to national economic development goals, create wealth and alleviate poverty through (self) employment. Wider implications of the innovations for repositioning mathematical sciences education globally are explored in this article.
International Nuclear Information System (INIS)
Matsumoto, Haruya; Kaya, Nobuyuki; Yuasa, Kazuhiro; Hayashi, Tomoaki
1976-01-01
Electron counting method has been devised and experimented for the purpose of measuring electron temperature and density, the most fundamental quantities to represent plasma conditions. Electron counting is a method to count the electrons in plasma directly by equipping a probe with the secondary electron multiplier. It has three advantages of adjustable sensitivity, high sensitivity of the secondary electron multiplier, and directional property. Sensitivity adjustment is performed by changing the size of collecting hole (pin hole) on the incident front of the multiplier. The probe is usable as a direct reading thermometer of electron temperature because it requires to collect very small amount of electrons, thus it doesn't disturb the surrounding plasma, and the narrow sweep width of the probe voltage is enough. Therefore it can measure anisotropy more sensitively than a Langmuir probe, and it can be used for very low density plasma. Though many problems remain on anisotropy, computer simulation has been carried out. Also it is planned to provide a Helmholtz coil in the vacuum chamber to eliminate the effect of earth magnetic field. In practical experiments, the measurement with a Langmuir probe and an emission probe mounted to the movable structure, the comparison with the results obtained in reverse magnetic field by using a Helmholtz coil, and the measurement of ionic sound wave are scheduled. (Wakatsuki, Y.)
DEFF Research Database (Denmark)
Rizzi, Silvia
Vital statistics are often available to health researchers on a low resolution. In mortality analysis the distribution of deaths by age is often aggregated in groups of 5 years of age with a wide open-ended interval that sums a total for persons above age 85. The data that the researcher observes...... are therefore only an aggregate of true latent values. Grouping vital statistics in relatively wide bins before making them available is due to several reasons: Protect the privacy of patients; enable a compact presentation of the data; assemble scares observations; make them comparable with other databases...... a non-parametric method is developed to efficiently estimate age-at-death distributions and mortality rates from coarsely grouped data. The approach is based on a yet unexplored statistical model, the penalized composite link model, which extends generalized linear models. Observations are treated...
A Statistical Model for Natural Gas Standardized Load Profiles
Czech Academy of Sciences Publication Activity Database
Brabec, Marek; Konár, Ondřej; Malý, Marek; Pelikán, Emil; Vondráček, Jiří
2009-01-01
Roč. 58, č. 1 (2009), s. 123-139 ISSN 0035-9254 R&D Projects: GA AV ČR 1ET400300513 Institutional research plan: CEZ:AV0Z10300504 Keywords : disaggregation * generalized additive models * multiplicative model * non-linear effects * segmentation * semiparametric regression model Subject RIV: JE - Non-nuclear Energetics, Energy Consumption ; Use Impact factor: 1.060, year: 2009
Carrier Statistics and Quantum Capacitance Models of Graphene Nanoscroll
Directory of Open Access Journals (Sweden)
M. Khaledian
2014-01-01
schematic perfect scroll-like Archimedes spiral. The DOS model was derived at first, while it was later applied to compute the carrier concentration and quantum capacitance model. Furthermore, the carrier concentration and quantum capacitance were modeled for both degenerate and nondegenerate regimes, along with examining the effect of structural parameters and chirality number on the density of state and carrier concentration. Latterly, the temperature effect on the quantum capacitance was studied too.
A Count Model to Study the Correlates of 60 Min of Daily Physical Activity in Portuguese Children
Directory of Open Access Journals (Sweden)
Alessandra Borges
2015-02-01
Full Text Available This study aimed to present data on Portuguese children (aged 9–11 years complying with moderate-to-vigorous physical activity (MVPA guidelines, and to identify the importance of correlates from multiple domains associated with meeting the guidelines. Physical activity (PA was objectively assessed by accelerometry throughout seven days on 777 children. A count model using Poisson regression was used to identify the best set of correlates that predicts the variability in meeting the guidelines. Only 3.1% of children met the recommended daily 60 min of MVPA for all seven days of the week. Further, the Cochrane–Armitage chi-square test indicated a linear and negative trend (p < 0.001 from none to all seven days of children complying with the guidelines. The count model explained 22% of the variance in meeting MVPA guidelines daily. Being a girl, having a higher BMI, belonging to families with higher income, sleeping more and taking greater time walking from home to a sporting venue significantly reduced the probability of meeting daily recommended MVPA across the seven days. Furthermore, compared to girls, increasing sleep time in boys increased their chances of compliance with the MVPA recommendations. These results reinforce the relevance of considering different covariates’ roles on PA compliance when designing efficient intervention strategies to promote healthy and active lifestyles in children.
Role of scaling in the statistical modelling of finance
Indian Academy of Sciences (India)
Abstract. Modelling the evolution of a financial index as a stochastic process is a prob- lem awaiting a full, satisfactory solution since it was first formulated by Bachelier in 1900. Here it is shown that the scaling with time of the return probability density function sampled from the historical series suggests a successful model.
Statistical shape model with random walks for inner ear segmentation
DEFF Research Database (Denmark)
Pujadas, Esmeralda Ruiz; Kjer, Hans Martin; Piella, Gemma
2016-01-01
Cochlear implants can restore hearing to completely or partially deaf patients. The intervention planning can be aided by providing a patient-specific model of the inner ear. Such a model has to be built from high resolution images with accurate segmentations. Thus, a precise segmentation is requ...
Recent advances in importance sampling for statistical model checking
Reijsbergen, D.P.; de Boer, Pieter-Tjerk; Scheinhardt, Willem R.W.; Haverkort, Boudewijn R.H.M.
2013-01-01
In the following work we present an overview of recent advances in rare event simulation for model checking made at the University of Twente. The overview is divided into the several model classes for which we propose algorithms, namely multicomponent systems, Markov chains and stochastic Petri
Role of scaling in the statistical modelling of finance
Indian Academy of Sciences (India)
Modelling the evolution of a financial index as a stochastic process is a problem awaiting a full, satisfactory solution since it was first formulated by Bachelier in 1900. Here it is shown that the scaling with time of the return probability density function sampled from the historical series suggests a successful model.
Statistical model of stress corrosion cracking based on extended
Indian Academy of Sciences (India)
The mechanism of stress corrosion cracking (SCC) has been discussed for decades. Here I propose a model of SCC reflecting the feature of fracture in brittle manner based on the variational principle under approximately supposed thermal equilibrium. In that model the functionals are expressed with extended forms of ...
Statistical model of stress corrosion cracking based on extended ...
Indian Academy of Sciences (India)
The mechanism of stress corrosion cracking (SCC) has been discussed for decades. Here I propose a model of SCC reflecting the feature of fracture in brittle manner based on the variational principle under approximately supposed thermal equilibrium. In that model the functionals are expressed with extended forms of ...
Statistical model of stress corrosion cracking based on extended ...
Indian Academy of Sciences (India)
2013-12-01
Dec 1, 2013 ... Abstract. The mechanism of stress corrosion cracking (SCC) has been discussed for decades. Here I propose a model of SCC reflecting the feature of fracture in brittle manner based on the vari- ational principle under approximately supposed thermal equilibrium. In that model the functionals are expressed ...
Study on Semi-Parametric Statistical Model of Safety Monitoring of Cracks in Concrete Dams
Directory of Open Access Journals (Sweden)
Chongshi Gu
2013-01-01
Full Text Available Cracks are one of the hidden dangers in concrete dams. The study on safety monitoring models of concrete dam cracks has always been difficult. Using the parametric statistical model of safety monitoring of cracks in concrete dams, with the help of the semi-parametric statistical theory, and considering the abnormal behaviors of these cracks, the semi-parametric statistical model of safety monitoring of concrete dam cracks is established to overcome the limitation of the parametric model in expressing the objective model. Previous projects show that the semi-parametric statistical model has a stronger fitting effect and has a better explanation for cracks in concrete dams than the parametric statistical model. However, when used for forecast, the forecast capability of the semi-parametric statistical model is equivalent to that of the parametric statistical model. The modeling of the semi-parametric statistical model is simple, has a reasonable principle, and has a strong practicality, with a good application prospect in the actual project.
Applied categorical and count data analysis
Tang, Wan; Tu, Xin M
2012-01-01
Introduction Discrete Outcomes Data Source Outline of the BookReview of Key Statistical ResultsSoftwareContingency Tables Inference for One-Way Frequency TableInference for 2 x 2 TableInference for 2 x r TablesInference for s x r TableMeasures of AssociationSets of Contingency Tables Confounding Effects Sets of 2 x 2 TablesSets of s x r TablesRegression Models for Categorical Response Logistic Regression for Binary ResponseInference about Model ParametersGoodness of FitGeneralized Linear ModelsRegression Models for Polytomous ResponseRegression Models for Count Response Poisson Regression Mode
Reflections on the Baron and Kenny model of statistical mediation
Directory of Open Access Journals (Sweden)
Antonio Pardo
2013-05-01
Full Text Available In the 25 years since Baron and Kenny (1986 published their ideas on how to analyze and interpret statistical mediation, few works have been more cited, and perhaps, so decisively influenced the way applied researchers understand and analyze mediation in social and health sciences. However, the widespread use of a procedure does not necessarily make it a safe or reliable strategy. In fact, during these years, many researchers have pointed out the limitations of the procedure Baron and Kenny proposed for demonstrating mediation. The twofold aim of this paper is to (1 carry out a review of the limitations of the method by Baron and Kenny, with particular attention to the weakness in the confirmatory logic of the procedure, and (2 provide an empirical example that, in applying the method, data obtained from the same theoretical scenario (i.e., with or without the presence of mediation can be compatible with both the mediation and no-mediation hypotheses.
Statistical modelling of Poisson/log-normal data
International Nuclear Information System (INIS)
Miller, G.
2007-01-01
In statistical data fitting, self consistency is checked by examining the closeness of the quantity Χ 2 /NDF to 1, where Χ 2 is the sum of squares of data minus fit divided by standard deviation, and NDF is the number of data minus the number of fit parameters. In order to calculate Χ 2 one needs an expression for the standard deviation. In this note several alternative expressions for the standard deviation of data distributed according to a Poisson/log-normal distribution are proposed and evaluated by Monte Carlo simulation. Two preferred alternatives are identified. The use of replicate data to obtain uncertainty is problematic for a small number of replicates. A method to correct this problem is proposed. The log-normal approximation is good for sufficiently positive data. A modification of the log-normal approximation is proposed, which allows it to be used to test the hypothesis that the true value is zero. (authors)
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Anghelache
2006-01-01
Full Text Available Using the short-run statistic indicators is a compulsory requirement implied in the current analysis. Therefore, there is a system of EUROSTAT indicators on short run which has been set up in this respect, being recommended for utilization by the member-countries. On the basis of these indicators, there are regular, usually monthly, analysis being achieved in respect of: the production dynamic determination; the evaluation of the short-run investment volume; the development of the turnover; the wage evolution: the employment; the price indexes and the consumer price index (inflation; the volume of exports and imports and the extent to which the imports are covered by the exports and the sold of trade balance. The EUROSTAT system of indicators of conjuncture is conceived as an open system, so that it can be, at any moment extended or restricted, allowing indicators to be amended or even removed, depending on the domestic users requirements as well as on the specific requirements of the harmonization and integration. For the short-run analysis, there is also the World Bank system of indicators of conjuncture, which is utilized, relying on the data sources offered by the World Bank, The World Institute for Resources or other international organizations statistics. The system comprises indicators of the social and economic development and focuses on the indicators for the following three fields: human resources, environment and economic performances. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.
Statistical Texture Model for mass Detection in Mammography
Directory of Open Access Journals (Sweden)
Nicolás Gallego-Ortiz
2013-12-01
Full Text Available In the context of image processing algorithms for mass detection in mammography, texture is a key feature to be used to distinguish abnormal tissue from normal tissue. Recently, a texture model based on a multivariate Gaussian mixture was proposed, of which the parameters are learned in an unsupervised way from the pixel intensities of images. The model produces images that are probabilistic maps of texture normality and it was proposed as a visualization aid for diagnostic by clinical experts. In this paper, the usability of the model is studied for automatic mass detection. A segmentation strategy is proposed and evaluated using 79 mammography cases.
Directory of Open Access Journals (Sweden)
Jerry Määttä
2015-10-01
Full Text Available Over the past decade, apocalyptic and post-apocalyptic disaster narratives seem to have become more popular than ever before. Since its inception in secular form in the first decades of the nineteenth century, however, the genre has experienced a number of fluctuations in popularity, especially in the twentieth century. Inspired by Franco Moretti's influential Graphs, Maps, Trees (2005, the aim of this study is to analyse the historiography, canonisation, and historical fluctuations of Anglo-phone apocalyptic and post-apocalyptic disaster narratives in literature and film through an elementary statistical analysis of previous surveys of the field. While the small database on which the study is based essentially consists of a meta-study of historiography and canonisation within the genre, disclosing which works have been considered to be the most important, the data is also used to assess the periods in which the most influential, innovative, and/or popular works were published or released. As an attempt is also made to explain some of the fluctuations in the pop-ularity of the genre - with an eye to historical, cultural, medial, social, and political contexts - perhaps the study might help us understand why it is that we as a society seem to need these stories ever so often.
Computational modeling of neural activities for statistical inference
Kolossa, Antonio
2016-01-01
This authored monograph supplies empirical evidence for the Bayesian brain hypothesis by modeling event-related potentials (ERP) of the human electroencephalogram (EEG) during successive trials in cognitive tasks. The employed observer models are useful to compute probability distributions over observable events and hidden states, depending on which are present in the respective tasks. Bayesian model selection is then used to choose the model which best explains the ERP amplitude fluctuations. Thus, this book constitutes a decisive step towards a better understanding of the neural coding and computing of probabilities following Bayesian rules. The target audience primarily comprises research experts in the field of computational neurosciences, but the book may also be beneficial for graduate students who want to specialize in this field. .
A Statistical Model of Current Loops and Magnetic Monopoles
International Nuclear Information System (INIS)
Ayyer, Arvind
2015-01-01
We formulate a natural model of loops and isolated vertices for arbitrary planar graphs, which we call the monopole-dimer model. We show that the partition function of this model can be expressed as a determinant. We then extend the method of Kasteleyn and Temperley-Fisher to calculate the partition function exactly in the case of rectangular grids. This partition function turns out to be a square of a polynomial with positive integer coefficients when the grid lengths are even. Finally, we analyse this formula in the infinite volume limit and show that the local monopole density, free energy and entropy can be expressed in terms of well-known elliptic functions. Our technique is a novel determinantal formula for the partition function of a model of isolated vertices and loops for arbitrary graphs
Statistical model based gender prediction for targeted NGS clinical panels
Directory of Open Access Journals (Sweden)
Palani Kannan Kandavel
2017-12-01
The reference test dataset are being used to test the model. The sensitivity on predicting the gender has been increased from the current “genotype composition in ChrX” based approach. In addition, the prediction score given by the model can be used to evaluate the quality of clinical dataset. The higher prediction score towards its respective gender indicates the higher quality of sequenced data.
Illness-death model: statistical perspective and differential equations.
Brinks, Ralph; Hoyer, Annika
2018-01-27
The aim of this work is to relate the theory of stochastic processes with the differential equations associated with multistate (compartment) models. We show that the Kolmogorov Forward Differential Equations can be used to derive a relation between the prevalence and the transition rates in the illness-death model. Then, we prove mathematical well-definedness and epidemiological meaningfulness of the prevalence of the disease. As an application, we derive the incidence of diabetes from a series of cross-sections.
Statistical geological discrete fracture network model. Forsmark modelling stage 2.2
International Nuclear Information System (INIS)
Fox, Aaron; La Pointe, Paul; Simeonov, Assen; Hermanson, Jan; Oehman, Johan
2007-11-01
The Swedish Nuclear Fuel and Waste Management Company (SKB) is performing site characterization at two different locations, Forsmark and Laxemar, in order to locate a site for a final geologic repository for spent nuclear fuel. The program is built upon the development of Site Descriptive Models (SDMs) at specific timed data freezes. Each SDM is formed from discipline-specific reports from across the scientific spectrum. This report describes the methods, analyses, and conclusions of the geological modeling team with respect to a geological and statistical model of fractures and minor deformation zones (henceforth referred to as the geological DFN), version 2.2, at the Forsmark site. The geological DFN builds upon the work of other geological modelers, including the deformation zone (DZ), rock domain (RD), and fracture domain (FD) models. The geological DFN is a statistical model for stochastically simulating rock fractures and minor deformation zones as a scale of less than 1,000 m (the lower cut-off of the DZ models). The geological DFN is valid within four specific fracture domains inside the local model region, and encompassing the candidate volume at Forsmark: FFM01, FFM02, FFM03, and FFM06. The models are build using data from detailed surface outcrop maps and the cored borehole record at Forsmark. The conceptual model for the Forsmark 2.2 geological revolves around the concept of orientation sets; for each fracture domain, other model parameters such as size and intensity are tied to the orientation sets. Two classes of orientation sets were described; Global sets, which are encountered everywhere in the model region, and Local sets, which represent highly localized stress environments. Orientation sets were described in terms of their general cardinal direction (NE, NW, etc). Two alternatives are presented for fracture size modeling: - the tectonic continuum approach (TCM, TCMF) described by coupled size-intensity scaling following power law distributions
Statistical geological discrete fracture network model. Forsmark modelling stage 2.2
Energy Technology Data Exchange (ETDEWEB)
Fox, Aaron; La Pointe, Paul [Golder Associates Inc (United States); Simeonov, Assen [Swedish Nuclear Fuel and Waste Management Co., Stockholm (Sweden); Hermanson, Jan; Oehman, Johan [Golder Associates AB, Stockholm (Sweden)
2007-11-15
The Swedish Nuclear Fuel and Waste Management Company (SKB) is performing site characterization at two different locations, Forsmark and Laxemar, in order to locate a site for a final geologic repository for spent nuclear fuel. The program is built upon the development of Site Descriptive Models (SDMs) at specific timed data freezes. Each SDM is formed from discipline-specific reports from across the scientific spectrum. This report describes the methods, analyses, and conclusions of the geological modeling team with respect to a geological and statistical model of fractures and minor deformation zones (henceforth referred to as the geological DFN), version 2.2, at the Forsmark site. The geological DFN builds upon the work of other geological modelers, including the deformation zone (DZ), rock domain (RD), and fracture domain (FD) models. The geological DFN is a statistical model for stochastically simulating rock fractures and minor deformation zones as a scale of less than 1,000 m (the lower cut-off of the DZ models). The geological DFN is valid within four specific fracture domains inside the local model region, and encompassing the candidate volume at Forsmark: FFM01, FFM02, FFM03, and FFM06. The models are build using data from detailed surface outcrop maps and the cored borehole record at Forsmark. The conceptual model for the Forsmark 2.2 geological revolves around the concept of orientation sets; for each fracture domain, other model parameters such as size and intensity are tied to the orientation sets. Two classes of orientation sets were described; Global sets, which are encountered everywhere in the model region, and Local sets, which represent highly localized stress environments. Orientation sets were described in terms of their general cardinal direction (NE, NW, etc). Two alternatives are presented for fracture size modeling: - the tectonic continuum approach (TCM, TCMF) described by coupled size-intensity scaling following power law distributions
Modelling West African Total Precipitation Depth: A Statistical Approach
Directory of Open Access Journals (Sweden)
S. Sovoe
2015-09-01
Full Text Available Even though several reports over the past few decades indicate an increasing aridity over West Africa, attempts to establish the controlling factor(s have not been successful. The traditional belief of the position of the Inter-tropical Convergence Zone (ITCZ as the predominant factor over the region has been refuted by recent findings. Changes in major atmospheric circulations such as African Easterly Jet (AEJ and Tropical Easterly Jet (TEJ are being cited as major precipitation driving forces over the region. Thus, any attempt to predict long term precipitation events over the region using Global Circulation or Local Circulation Models could be flawed as the controlling factors are not fully elucidated yet. Successful prediction effort may require models which depend on past events as their inputs as in the case of time series models such as Autoregressive Integrated Moving Average (ARIMA model. In this study, historical precipitation data was imported as time series data structure into an R programming language and was used to build appropriate Seasonal Multiplicative Autoregressive Integrated Moving Average model, ARIMA (p, d, q*(P, D, Q. The model was then used to predict long term precipitation events over the Ghanaian segment of the Volta Basin which could be used in planning and implementation of development policies.
Peters, J. M.; Kravtsov, S.
2011-12-01
This study quantifies the dependence of nonlinear regimes (manifested in non-gaussian probability distributions) and spreads of ensemble trajectories in a reduced phase space of a realistic three-layer quasi-geostrophic (QG3) atmospheric model on this model's climate state.To elucidate probabilistic properties of the QG3 trajectories, we compute, in phase planes of leading EOFs of the model, the coefficients of the corresponding Fokker-Planck (FP) equations. These coefficients represent drift vectors (computed from one-day phase space tendencies) and diffusion tensors (computed from one-day lagged covariance matrices of model trajectory displacements), and are based on a long QG3 simulation. We also fit two statistical trajectory models to the reduced phase-space time series spanned by the full QG3 model states. One reduced model is a standard Linear Inverse Model (LIM) fitted to a long QG3 time series. The LIM model is forced by state-independent (additive) noise and has a deterministic operator which represents non-divergent velocity field in the reduced phase space considered. The other, more advanced model (NSM), is nonlinear, divergent, and is driven by state-dependent noise. The NSM model mimics well the full QG3 model trajectory behavior in the reduced phase space; its corresponding FP model is nearly identical to that based on the full QG3 simulations. By systematic analysis of the differences between the drift vectors and diffusion tensors of the QG3-based, NSM-based, and LIM-based FP models, as well as the PDF evolution simulated by these FP models, we disentangle the contributions of the multiplicative noise and deterministic dynamics into nonlinear behavior and predictability of the atmospheric states produced by the dynamical QG3 model.
Wang, Yiyi; Kockelman, Kara M
2013-11-01
This work examines the relationship between 3-year pedestrian crash counts across Census tracts in Austin, Texas, and various land use, network, and demographic attributes, such as land use balance, residents' access to commercial land uses, sidewalk density, lane-mile densities (by roadway class), and population and employment densities (by type). The model specification allows for region-specific heterogeneity, correlation across response types, and spatial autocorrelation via a Poisson-based multivariate conditional auto-regressive (CAR) framework and is estimated using Bayesian Markov chain Monte Carlo methods. Least-squares regression estimates of walk-miles traveled per zone serve as the exposure measure. Here, the Poisson-lognormal multivariate CAR model outperforms an aspatial Poisson-lognormal multivariate model and a spatial model (without cross-severity correlation), both in terms of fit and inference. Positive spatial autocorrelation emerges across neighborhoods, as expected (due to latent heterogeneity or missing variables that trend in space, resulting in spatial clustering of crash counts). In comparison, the positive aspatial, bivariate cross correlation of severe (fatal or incapacitating) and non-severe crash rates reflects latent covariates that have impacts across severity levels but are more local in nature (such as lighting conditions and local sight obstructions), along with spatially lagged cross correlation. Results also suggest greater mixing of residences and commercial land uses is associated with higher pedestrian crash risk across different severity levels, ceteris paribus, presumably since such access produces more potential conflicts between pedestrian and vehicle movements. Interestingly, network densities show variable effects, and sidewalk provision is associated with lower severe-crash rates. Copyright © 2013 Elsevier Ltd. All rights reserved.
Benchmark validation of statistical models: Application to mediation analysis of imagery and memory.
MacKinnon, David P; Valente, Matthew J; Wurpts, Ingrid C
2018-03-29
This article describes benchmark validation, an approach to validating a statistical model. According to benchmark validation, a valid model generates estimates and research conclusions consistent with a known substantive effect. Three types of benchmark validation-(a) benchmark value, (b) benchmark estimate, and (c) benchmark effect-are described and illustrated with examples. Benchmark validation methods are especially useful for statistical models with assumptions that are untestable or very difficult to test. Benchmark effect validation methods were applied to evaluate statistical mediation analysis in eight studies using the established effect that increasing mental imagery improves recall of words. Statistical mediation analysis led to conclusions about mediation that were consistent with established theory that increased imagery leads to increased word recall. Benchmark validation based on established substantive theory is discussed as a general way to investigate characteristics of statistical models and a complement to mathematical proof and statistical simulation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Physics-based statistical model and simulation method of RF propagation in urban environments
Pao, Hsueh-Yuan; Dvorak, Steven L.
2010-09-14
A physics-based statistical model and simulation/modeling method and system of electromagnetic wave propagation (wireless communication) in urban environments. In particular, the model is a computationally efficient close-formed parametric model of RF propagation in an urban environment which is extracted from a physics-based statistical wireless channel simulation method and system. The simulation divides the complex urban environment into a network of interconnected urban canyon waveguides which can be analyzed individually; calculates spectral coefficients of modal fields in the waveguides excited by the propagation using a database of statistical impedance boundary conditions which incorporates the complexity of building walls in the propagation model; determines statistical parameters of the calculated modal fields; and determines a parametric propagation model based on the statistical parameters of the calculated modal fields from which predictions of communications capability may be made.
Sharing brain mapping statistical results with the neuroimaging data model
Maumet, Camille; Auer, Tibor; Bowring, Alexander; Chen, Gang; Das, Samir; Flandin, Guillaume; Ghosh, Satrajit; Glatard, Tristan; Gorgolewski, Krzysztof J.; Helmer, Karl G.; Jenkinson, Mark; Keator, David B.; Nichols, B. Nolan; Poline, Jean-Baptiste; Reynolds, Richard; Sochat, Vanessa; Turner, Jessica; Nichols, Thomas E.
2016-01-01
Only a tiny fraction of the data and metadata produced by an fMRI study is finally conveyed to the community. This lack of transparency not only hinders the reproducibility of neuroimaging results but also impairs future meta-analyses. In this work we introduce NIDM-Results, a format specification providing a machine-readable description of neuroimaging statistical results along with key image data summarising the experiment. NIDM-Results provides a unified representation of mass univariate analyses including a level of detail consistent with available best practices. This standardized representation allows authors to relay methods and results in a platform-independent regularized format that is not tied to a particular neuroimaging software package. Tools are available to export NIDM-Result graphs and associated files from the widely used SPM and FSL software packages, and the NeuroVault repository can import NIDM-Results archives. The specification is publically available at: http://nidm.nidash.org/specs/nidm-results.html. PMID:27922621
From intuition to statistics in building subsurface structural models
Brandenburg, J.P.; Alpak, F.O.; Naruk, S.; Solum, J.
2011-01-01
Experts associated with the oil and gas exploration industry suggest that combining forward trishear models with stochastic global optimization algorithms allows a quantitative assessment of the uncertainty associated with a given structural model. The methodology is applied to incompletely imaged structures related to deepwater hydrocarbon reservoirs and results are compared to prior manual palinspastic restorations and borehole data. This methodology is also useful for extending structural interpretations into other areas of limited resolution, such as subsalt in addition to extrapolating existing data into seismic data gaps. This technique can be used for rapid reservoir appraisal and potentially have other applications for seismic processing, well planning, and borehole stability analysis.