Kolmogorov complexity, pseudorandom generators and statistical models testing
Czech Academy of Sciences Publication Activity Database
Šindelář, Jan; Boček, Pavel
2002-01-01
Roč. 38, č. 6 (2002), s. 747-759. ISSN 0023-5954. R&D Projects: GA ČR GA102/99/1564. Institutional research plan: CEZ:AV0Z1075907. Keywords: Kolmogorov complexity; pseudorandom generators; statistical models testing. Subject RIV: BB - Applied Statistics, Operational Research. Impact factor: 0.341, year: 2002
Automated robust generation of compact 3D statistical shape models
Vrtovec, Tomaz; Likar, Bostjan; Tomazevic, Dejan; Pernus, Franjo
2004-05-01
Ascertaining the detailed shape and spatial arrangement of anatomical structures is important not only within diagnostic settings but also in the areas of planning, simulation, intraoperative navigation, and tracking of pathology. Robust, accurate and efficient automated segmentation of anatomical structures is difficult because of their complexity and inter-patient variability. Furthermore, the position of the patient during image acquisition, the imaging device and protocol, image resolution, and other factors induce additional variations in shape and appearance. Statistical shape models (SSMs) have proven quite successful in capturing structural variability. A possible approach to obtain a 3D SSM is to extract reference voxels by precisely segmenting the structure in a single reference image. The corresponding voxels in other images are determined by registering the reference image to each other image. The SSM obtained in this way describes statistically plausible shape variations over the given population as well as variations due to imperfect registration. In this paper, we present a completely automated method that significantly reduces shape variations induced by imperfect registration, thus allowing a more accurate description of variations. At each iteration, the derived SSM is used for coarse registration, which is further improved by describing finer variations of the structure. The method was tested on 64 lumbar spinal column CT scans, from which 23, 38, 45, 46 and 42 volumes of interest containing vertebra L1, L2, L3, L4 and L5, respectively, were extracted. Separate SSMs were generated for each vertebra. The results show that the method is capable of reducing the variations induced by registration errors.
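The linear SSM construction sketched in the record above (a mean shape plus principal modes of variation learned from corresponded shapes) can be illustrated in a few lines of NumPy. This is a minimal sketch on synthetic landmark vectors, not the authors' implementation; the function names and the toy population are invented for the example.

```python
import numpy as np

def build_ssm(shapes):
    """Build a linear statistical shape model from corresponded shapes.
    `shapes` is (n_samples, n_points * dim); returns mean, eigenmodes
    (rows of vt) and eigenvalues of the shape covariance."""
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # PCA via SVD of the centered data matrix; singular values are descending
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = s ** 2 / (shapes.shape[0] - 1)
    return mean, vt, eigenvalues

def sample_shape(mean, modes, eigenvalues, b):
    """Generate a statistically plausible shape: mean plus eigenmodes
    weighted by `b`, given in standard deviations of each mode."""
    return mean + (b * np.sqrt(eigenvalues)) @ modes

rng = np.random.default_rng(0)
# toy population: 20 "shapes", each 5 two-dimensional points, flattened
base = rng.normal(size=10)
shapes = base + 0.1 * rng.normal(size=(20, 10))
mean, modes, ev = build_ssm(shapes)
new_shape = sample_shape(mean, modes, ev, rng.normal(size=10))
```

In a real pipeline the rows of `shapes` would come from the registration step described above, so the eigenmodes would mix anatomical variation with registration error, which is exactly what the paper's iterative scheme tries to reduce.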
Model-generated air quality statistics for application in vegetation response models in Alberta
International Nuclear Information System (INIS)
McVehil, G.E.; Nosal, M.
1990-01-01
To test and apply vegetation response models in Alberta, air pollution statistics representative of various parts of the Province are required. At this time, air quality monitoring data of the requisite accuracy and time resolution are not available for most parts of Alberta. Therefore, there exists a need to develop appropriate air quality statistics. The objectives of the work reported here were to determine the applicability of model-generated air quality statistics and to develop, by modelling, realistic and representative time series of hourly SO2 concentrations that could be used to generate the statistics demanded by vegetation response models.
Patch-based generative shape model and MDL model selection for statistical analysis of archipelagos
DEFF Research Database (Denmark)
Ganz, Melanie; Nielsen, Mads; Brandt, Sami
2010-01-01
We propose a statistical generative shape model for archipelago-like structures. These kinds of structures occur, for instance, in medical images, where our intention is to model the appearance and shapes of calcifications in x-ray radiographs. The generative model is constructed by (1) learning a patch-based dictionary for possible shapes, (2) building up a time-homogeneous Markov model to model the neighbourhood correlations between the patches, and (3) automatic selection of the model complexity by the minimum description length principle. The generative shape model is proposed as a probability distribution of a binary image, where the model is intended to facilitate sequential simulation. Our results show that a relatively simple model is able to generate structures visually similar to calcifications. Furthermore, we used the shape model as a shape prior in the statistical segmentation …
Physical and statistical models for steam generator clogging diagnosis
Girard, Sylvain
2014-01-01
Clogging of steam generators in nuclear power plants is a highly sensitive issue in terms of performance and safety and this book proposes a completely novel methodology for diagnosing this phenomenon. It demonstrates real-life industrial applications of this approach to French steam generators and applies the approach to operational data gathered from French nuclear power plants. The book presents a detailed review of in situ diagnosis techniques and assesses existing methodologies for clogging diagnosis, whilst examining their limitations. It also addresses numerical modelling of the dynamic
A testing procedure for wind turbine generators based on the power grid statistical model
DEFF Research Database (Denmark)
Farajzadehbibalan, Saber; Ramezani, Mohammad Hossein; Nielsen, Peter
2017-01-01
In this study, a comprehensive test procedure is developed to test wind turbine generators with a hardware-in-the-loop setup. The procedure employs the statistical model of the power grid, considering the restrictions of the test facility and system dynamics. Given the model in the latent space, the j…
Parisi Kern, Andrea; Ferreira Dias, Michele; Piva Kulakowski, Marlova; Paulo Gomes, Luciana
2015-05-01
Reducing construction waste is becoming a key environmental issue in the construction industry. The quantification of waste generation rates in the construction sector is an invaluable management tool in supporting mitigation actions. However, the quantification of waste can be a difficult process because of the specific characteristics and the wide range of materials used in different construction projects. Large variations are observed in the methods used to predict the amount of waste generated because of the range of variables involved in construction processes and the different contexts in which these methods are employed. This paper proposes a statistical model to determine the amount of waste generated in the construction of high-rise buildings by assessing the influence of the design process and production system, often mentioned as the major culprits behind the generation of waste in construction. Multiple regression was used to conduct a case study based on multiple sources of data from eighteen residential buildings. The resulting statistical model relates the dependent variable (the amount of waste generated) to independent variables associated with the design and the production system used. The best regression model obtained from the sample data had an adjusted R² value of 0.694, meaning that it explains approximately 69% of the variation in waste generation for similar constructions. Most independent variables showed a low determination coefficient when assessed in isolation, which emphasizes the importance of assessing their joint influence on the response (dependent) variable.
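The adjusted R² figure reported above penalises the ordinary R² for the number of predictors, which matters with only eighteen observations. A minimal NumPy sketch, with simulated data standing in for the buildings (the variable names and toy numbers are invented for the example, not taken from the paper):

```python
import numpy as np

def adjusted_r2(y, y_pred, n_predictors):
    """Adjusted R^2: ordinary R^2 shrunk by the predictor count."""
    n = len(y)
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

# toy case: 18 "buildings", 3 design/production variables, OLS fit
rng = np.random.default_rng(1)
X = rng.normal(size=(18, 3))
beta = np.array([2.0, -1.0, 0.5])
y = X @ beta + rng.normal(scale=0.5, size=18)
A = np.column_stack([np.ones(18), X])          # intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef
adj = adjusted_r2(y, y_hat, 3)
print(round(adj, 3))
```

With more predictors than the data support, the adjusted value drops below the ordinary R², which is the guard against overfitting that motivates reporting it for a small multiple-regression sample.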
Energy Technology Data Exchange (ETDEWEB)
Nelson, K; Sokkappa, P
2008-10-29
This report describes an approach for generating a simulated population of plausible nuclear threat radiation signatures spanning a range of variability that could be encountered by radiation detection systems. In this approach, we develop a statistical model for generating random instances of smuggled nuclear material. The model is based on physics principles and bounding cases rather than on intelligence information or actual threat device designs. For this initial stage of work, we focus on random models using fissile material and do not address scenarios using non-fissile materials. The model has several uses. It may be used as a component in a radiation detection system performance simulation to generate threat samples for injection studies. It may also be used to generate a threat population to be used for training classification algorithms. In addition, we intend to use this model to generate an unclassified 'benchmark' threat population that can be openly shared with other organizations, including vendors, for use in radiation detection systems performance studies and algorithm development and evaluation activities. We assume that a quantity of fissile material is being smuggled into the country for final assembly and that shielding may have been placed around the fissile material. In terms of radiation signature, a nuclear weapon is basically a quantity of fissile material surrounded by various layers of shielding. Thus, our model of smuggled material is expected to span the space of potential nuclear weapon signatures as well. For computational efficiency, we use a generic 1-dimensional spherical model consisting of a fissile material core surrounded by various layers of shielding. The shielding layers and their configuration are defined such that the model can represent the potential range of attenuation and scattering that might occur. The materials in each layer and the associated parameters are selected from probability distributions that
Directory of Open Access Journals (Sweden)
F. F. Asal
2012-07-01
Digital elevation data obtained from different engineering surveying techniques is utilized in generating a Digital Elevation Model (DEM), which is employed in many engineering and environmental applications. These data are usually in discrete point format, making it necessary to utilize an interpolation approach for the creation of the DEM. Quality assessment of the DEM is a vital issue controlling its use in different applications; however, this assessment relies heavily on statistical methods and neglects visual methods. This research applies visual analysis to DEMs generated using the IDW interpolator with varying powers in order to examine its potential for assessing the effects of the variation of the IDW power on the quality of the DEMs. Real elevation data were collected in the field using a total station instrument in corrugated terrain. DEMs were generated from the data at a unified cell size using the IDW interpolator with power values ranging from one to ten. Visual analysis was undertaken using 2D and 3D views of the DEM; in addition, statistical analysis was performed to assess the validity of the visual techniques. Visual analysis showed that smoothing of the DEM decreases as the power value increases up to a power of four; however, increasing the power beyond four does not leave noticeable changes in the 2D and 3D views of the DEM. The statistical analysis supported these results, with the standard deviation (SD) of the DEM increasing with increasing power. More specifically, changing the power from one to two produced 36% of the total increase in SD (the increase due to changing the power from one to ten), and changing to powers of three and four gave 60% and 75%, respectively. This indicates a decrease in DEM smoothing as the IDW power increases. The study also showed that applying visual methods supported
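The power parameter studied above controls how strongly IDW interpolation localises around nearby samples: higher powers weight the nearest points more heavily and smooth less. A minimal sketch of the interpolator (illustrative only; the point coordinates and values are invented, not the paper's survey data):

```python
import numpy as np

def idw(points, values, query, power=2.0, eps=1e-12):
    """Inverse-distance-weighted interpolation: each sample votes with
    weight 1/d**power, so larger powers localise the estimate."""
    d = np.linalg.norm(points - query, axis=1)
    if np.any(d < eps):                  # query coincides with a sample
        return values[np.argmin(d)]
    w = 1.0 / d ** power
    return np.sum(w * values) / np.sum(w)

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
z = np.array([10.0, 20.0, 30.0, 40.0])
centre = np.array([0.5, 0.5])
# at the symmetric centre all distances are equal, so every power
# returns the plain average, 25.0
print(idw(pts, z, centre, power=1), idw(pts, z, centre, power=4))
```

Away from symmetric points, raising the power pulls the estimate toward the nearest sample, which is the reduced smoothing the visual analysis above detects up to a power of about four.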
Substorm associated radar auroral surges: a statistical study and possible generation model
Directory of Open Access Journals (Sweden)
B. A. Shand
Substorm-associated radar auroral surges (SARAS) are short-lived (15–90 minutes) and spatially localised (~5° of latitude) perturbations of the plasma convection pattern observed within the auroral E-region. The understanding of such phenomena has important ramifications for the investigation of the larger-scale plasma convection and ultimately the coupling of the solar wind, magnetosphere and ionosphere system. A statistical investigation of SARAS observed by the Sweden And Britain Radar Experiment (SABRE) is undertaken in order to provide a more extensive examination of the local time occurrence and propagation characteristics of the events. The statistical analysis determined a local time occurrence of observations between 1420 MLT and 2200 MLT, with a maximum occurrence centred around 1700 MLT. The propagation velocity of the SARAS feature through the SABRE field of view was found to be predominantly L-shell aligned, with a velocity centred around 1750 m s^{–1} and within the range 500 m s^{–1} to 3500 m s^{–1}. This comprehensive examination of SARAS provides the opportunity to discuss, qualitatively, a possible generation mechanism based on a proposed model for the production of a similar phenomenon referred to as sub-auroral ion drifts (SAIDs). The results of the comparison suggest that SARAS may result from a geophysical mechanism similar to that which produces SAID events, but probably occurring at a different time in the evolution of the event.
Key words. Substorms · Auroral surges · Plasma convection · Sub-auroral ion drifts
Diffeomorphic Statistical Deformation Models
DEFF Research Database (Denmark)
Hansen, Michael Sass; Hansen, Mads Fogtmann; Larsen, Rasmus
2007-01-01
In this paper we present a new method for constructing diffeomorphic statistical deformation models in arbitrary-dimensional images with a nonlinear generative model and a linear parameter space. Our deformation model is a modified version of the diffeomorphic model introduced by Cootes et al. The modifications ensure that no boundary restriction has to be enforced on the parameter space to prevent folds or tears in the deformation field. For straightforward statistical analysis, principal component analysis and sparse methods, we assume that the parameters for a class of deformations lie on a linear …
A Statistical Model for Hourly Large-Scale Wind and Photovoltaic Generation in New Locations
DEFF Research Database (Denmark)
Ekstrom, Jussi; Koivisto, Matti Juhani; Mellin, Ilkka
2017-01-01
The analysis of large-scale wind and photovoltaic (PV) energy generation is of vital importance in power systems where their penetration is high. This paper presents a modular methodology to assess the power generation and volatility of a system consisting of both PV plants (PVPs) and wind power … through distances between the locations, which allows the methodology to be used to assess scenarios with PVPs and WPPs in multiple locations without actual measurement data. The methodology can be applied by transmission and distribution system operators when analysing the effects and feasibility … of new PVPs and WPPs in system planning. The model is verified against hourly measured wind speed and solar irradiance data from Finland. A case study assessing the impact of the geographical distribution of the PVPs and WPPs on aggregate power generation and its variability is presented.
Statistics Instruction: The Next Generation.
Prybutok, Victor R.; And Others
1991-01-01
Described are examples of classroom exercises that use interactive graphics software for personal computers to enhance the teaching of statistical concepts by allowing students to generate multiple examples, make conjectures, and verify their findings about the concept. The transfer of this interactive tool to other subject areas is suggested.…
Awédikian , Roy; Yannou , Bernard
2012-01-01
With the growing complexity of industrial software applications, industry practitioners are looking for efficient and practical methods to validate the software. This paper develops a model-based statistical testing approach that automatically generates online and offline test cases for embedded software. It discusses an integrated framework that combines solutions for three major software testing research questions: (i) how to select test inputs; (ii) how to predict the expected…
Cabanes, S.; Spiga, A.; Guerlet, S.; Aurnou, J. M.; Favier, B.; Le Bars, M.
2017-12-01
The strong zonal (i.e. east-west) jet flows on the gas giants Jupiter and Saturn have persisted for hundreds of years. Zonal jets are large-scale features ubiquitous in planetary atmospheres and result from multi-scale interactions in rapidly rotating turbulent flows. Here we use a new Saturn Global Climate Model (GCM) coupling a seasonal radiative model tailored for Saturn with a new hydrodynamical solver, developed at the Laboratoire de Météorologie Dynamique, which uses an original icosahedral mapping of the planetary sphere to ensure excellent conservation and scalability properties on massively parallel computing resources. Strong and quasi-steady Saturn jets are reproduced in our GCM simulations with unprecedented horizontal resolutions (reference at 1/2° latitude/longitude, and tests at 1/4° and 1/8°), integrated time (up to ten simulated Saturn years), and vertical extent (from the troposphere to the stratosphere). We perform statistical analysis on the resulting flows to explore scale interactions and the kinetic energy distribution at all scales. It appears that horizontal resolution, as well as subgrid-scale (unresolved) dissipation included as an additional hyperdiffusion term, strongly affects the jets' intensity and statistical properties. In parallel, we set up the first laboratory device capable of achieving the relevant regime for forming planetary-like zonal jets. We report that in a rapidly rotating cylindrical container, turbulent laboratory flows naturally generate multiple, alternating jets that share basic properties with those observed on gas planets. By performing similar statistical analyses we directly compare the flow properties of laboratory versus GCM-generated jets and point out the effect of limited numerical resolution and subgrid-scale assumptions on atmospheric dynamics at the large/jet scale.
International Nuclear Information System (INIS)
Hufnagel, Heike; Pennec, Xavier; Ayache, Nicholas; Ehrhardt, Jan; Handels, Heinz
2008-01-01
… The novel algorithm for building a generative statistical shape model (gSSM) does not need one-to-one point correspondences but relies solely on point correspondence probabilities for the computation of the mean shape and eigenmodes. It is well suited for shape analysis on unstructured point sets. (orig.)
Goldstein, Harvey
2011-01-01
This book provides a clear introduction to this important area of statistics. The author provides wide coverage of different kinds of multilevel models, and of how to interpret different statistical methodologies and algorithms applied to such models. This 4th edition reflects the growth of interest in this area and is updated to include new chapters on multilevel models with mixed response types, smoothing and multilevel data, models with correlated random effects and modeling with variance.
Sampling, Probability Models and Statistical Reasoning - Statistical Inference
Indian Academy of Sciences (India)
Home; Journals; Resonance – Journal of Science Education; Volume 1; Issue 5. Sampling, Probability Models and Statistical Reasoning Statistical Inference. Mohan Delampady V R Padmawar. General Article Volume 1 Issue 5 May 1996 pp 49-58 ...
International Nuclear Information System (INIS)
O'Carroll, M.
1993-01-01
The author considers models of statistical mechanics and quantum field theory (in the Euclidean formulation) which are treated using renormalization group (RG) methods and where the action is a small perturbation of a quadratic action. The author obtains multiscale formulas for the generating and correlation functions after n renormalization group transformations which bring out the relation with the nth effective action. The author derives and compares the formulas for different RGs. The formulas for correlation functions involve (1) two propagators which are determined by a sequence of approximate wave function renormalization constants and renormalization group operators associated with the decomposition into scales of the quadratic form, and (2) field derivatives of the nth effective action. For the case of the block field "δ-function" RG the formulas are especially simple, and for asymptotically free theories only the derivatives at zero field are needed; the formulas have been previously used directly to obtain bounds on correlation functions using information obtained from the analysis of effective actions. The simplicity can be traced to an "orthogonality of scales" property which follows from an implicit wavelet structure. Other commonly used RGs do not have the "orthogonality of scales" property. 19 refs.
Using Microsoft Excel to Generate Usage Statistics
Spellman, Rosemary
2011-01-01
At the Libraries Service Center, statistics are generated on a monthly, quarterly, and yearly basis by using four Microsoft Excel workbooks. These statistics provide information about what materials are being requested and by whom. They also give details about why certain requests may not have been filled. Utilizing Excel allows for a shallower…
Improved model for statistical alignment
Energy Technology Data Exchange (ETDEWEB)
Miklos, I.; Toroczkai, Z. (Zoltan)
2001-01-01
The statistical approach to molecular sequence evolution involves the stochastic modeling of the substitution, insertion and deletion processes. Substitution has been modeled in a reliable way for more than three decades by using finite Markov processes. Insertion and deletion, however, seem to be more difficult to model, and the recent approaches cannot acceptably deal with multiple insertions and deletions. A new method based on a generating function approach is introduced to describe the multiple insertion process. The presented algorithm computes the approximate joint probability of two sequences in O(l³) running time, where l is the geometric mean of the sequence lengths.
First-Generation Transgenic Plants and Statistics
Nap, Jan-Peter; Keizer, Paul; Jansen, Ritsert
1993-01-01
The statistical analyses of populations of first-generation transgenic plants are commonly based on mean and variance and generally require a test of normality. Since in many cases the assumptions of normality are not met, analyses can result in erroneous conclusions. Transformation of data to
Statistical analysis of next generation sequencing data
Nettleton, Dan
2014-01-01
Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...
Software Used to Generate Cancer Statistics - SEER Cancer Statistics
Videos that highlight topics and trends in cancer statistics and definitions of statistical terms. Also software tools for analyzing and reporting cancer statistics, which are used to compile SEER's annual reports.
Statistical Model for Content Extraction
DEFF Research Database (Denmark)
Qureshi, Pir Abdul Rasool; Memon, Nasrullah
2011-01-01
We present a statistical model for content extraction from HTML documents. The model operates on the Document Object Model (DOM) tree of the corresponding HTML document. It evaluates each tree node and associated statistical features to predict the significance of the node towards the overall content of the … We also describe the significance of the model in the domain of counterterrorism and open source intelligence.
Ghose, Soumya; Greer, Peter B.; Sun, Jidi; Pichler, Peter; Rivest-Henault, David; Mitra, Jhimli; Richardson, Haylea; Wratten, Chris; Martin, Jarad; Arm, Jameen; Best, Leah; Dowling, Jason A.
2017-11-01
In MR-only radiation therapy planning, generation of the tissue-specific HU map directly from the MRI would eliminate the need for CT image acquisition and may improve radiation therapy planning. The aim of this work is to generate and validate substitute CT (sCT) scans generated from standard T2-weighted MR pelvic scans in prostate radiation therapy dose planning. A Siemens Skyra 3T MRI scanner with laser bridge, flat couch and pelvic coil mounts was used to scan 39 patients scheduled for external beam radiation therapy for localized prostate cancer. For sCT generation a whole-pelvis MRI (1.6 mm 3D isotropic T2w SPACE sequence) was acquired. Patients received a routine planning CT scan. Co-registered whole-pelvis CT and T2w MRI pairs were used as training images. Advanced tissue-specific non-linear regression models to predict HU for fat, muscle, bladder and air were created from co-registered CT-MRI image pairs. On a test case T2w MRI, the bones and bladder were automatically segmented using a novel statistical shape and appearance model, while other soft tissues were separated using an Expectation-Maximization based clustering model. The CT bone in the training database that was most 'similar' to the segmented bone was then transformed with deformable registration to create the sCT component of the test case T2w MRI bone tissue. Predictions for the bone, air and soft tissue from the separate regression models were successively combined to generate a whole-pelvis sCT. The change in monitor units between the sCT-based plans and the gold standard CT plan for the same IMRT dose plan was found to be 0.3% ± 0.9% (mean ± standard deviation) for 39 patients. The 3D gamma pass rate was 99.8 ± 0.0% (2 mm/2%). The novel hybrid model is computationally efficient, generating an sCT in 20 min from standard T2w images for prostate cancer radiation therapy dose planning and DRR generation.
Methods of statistical model estimation
Hilbe, Joseph
2013-01-01
Methods of Statistical Model Estimation examines the most important and popular methods used to estimate parameters for statistical models and provide informative model summary statistics. Designed for R users, the book is also ideal for anyone wanting to better understand the algorithms used for statistical model fitting. The text presents algorithms for the estimation of a variety of regression procedures using maximum likelihood estimation, iteratively reweighted least squares regression, the EM algorithm, and MCMC sampling. Fully developed, working R code is constructed for each method. …
Exclusion statistics and integrable models
International Nuclear Information System (INIS)
Mashkevich, S.
1998-01-01
The definition of exclusion statistics that was given by Haldane admits a 'statistical interaction' between distinguishable particles (multispecies statistics). For such statistics, thermodynamic quantities can be evaluated exactly; explicit expressions are presented here for cluster coefficients. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models of the Calogero-Sutherland type. The interesting questions of generalizing this correspondence to the higher-dimensional and the multispecies cases remain essentially open; however, our results provide some hints as to the search for the models in question.
Sensometrics: Thurstonian and Statistical Models
DEFF Research Database (Denmark)
Christensen, Rune Haubo Bojesen
This thesis is concerned with the development and bridging of Thurstonian and statistical models for sensory discrimination testing as applied in the scientific discipline of sensometrics. In sensory discrimination testing, sensory differences between products are detected and quantified by the use of generalized linear mixed models, cumulative link models and cumulative link mixed models. The relation between the Wald, likelihood and score statistics is expanded upon using the shape of the (profile) likelihood function as a common reference.
Statistical modeling for degradation data
Lio, Yuhlong; Ng, Hon; Tsai, Tzong-Ru
2017-01-01
This book focuses on the statistical aspects of the analysis of degradation data. In recent years, degradation data analysis has come to play an increasingly important role in different disciplines such as reliability, public health sciences, and finance. For example, information on products’ reliability can be obtained by analyzing degradation data. In addition, statistical modeling and inference techniques have been developed on the basis of different degradation measures. The book brings together experts engaged in statistical modeling and inference, presenting and discussing important recent advances in degradation data analysis and related applications. The topics covered are timely and have considerable potential to impact both statistics and reliability engineering.
Statistical modelling with quantile functions
Gilchrist, Warren
2000-01-01
Galton used quantiles more than a hundred years ago in describing data. Tukey and Parzen used them in the 60s and 70s in describing populations. Since then, the authors of many papers, both theoretical and practical, have used various aspects of quantiles in their work. Until now, however, no one had put all the ideas together to form what turns out to be a general approach to statistics. Statistical Modelling with Quantile Functions does just that. It systematically examines the entire process of statistical modelling, starting with using the quantile function to define continuous distributions. The author shows that by using this approach, it becomes possible to develop complex distributional models from simple components. A modelling kit can be developed that applies to the whole model - deterministic and stochastic components - and this kit operates by adding, multiplying, and transforming distributions rather than data. Statistical Modelling with Quantile Functions adds a new dimension to the practice of stati…
Statistical validation of stochastic models
Energy Technology Data Exchange (ETDEWEB)
Hunter, N.F. [Los Alamos National Lab., NM (United States). Engineering Science and Analysis Div.; Barney, P.; Paez, T.L. [Sandia National Labs., Albuquerque, NM (United States). Experimental Structural Dynamics Dept.; Ferregut, C.; Perez, L. [Univ. of Texas, El Paso, TX (United States). Dept. of Civil Engineering
1996-12-31
It is common practice in structural dynamics to develop mathematical models for system behavior, and the authors are now capable of developing stochastic models, i.e., models whose parameters are random variables. Such models have random characteristics that are meant to simulate the randomness in characteristics of experimentally observed systems. This paper suggests a formal statistical procedure for the validation of mathematical models of stochastic systems when data taken during operation of the stochastic system are available. The statistical characteristics of the experimental system are obtained using the bootstrap, a technique for the statistical analysis of non-Gaussian data. The authors propose a procedure to determine whether or not a mathematical model is an acceptable model of a stochastic system with regard to user-specified measures of system behavior. A numerical example is presented to demonstrate the application of the technique.
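The bootstrap step described above, resampling experimental data to characterise a statistic without assuming Gaussianity, can be sketched briefly. This is an illustrative sketch of a percentile bootstrap confidence interval used as a simple accept/reject check on a model prediction; the function name, the exponential toy data, and the acceptance rule are assumptions for the example, not the paper's procedure.

```python
import numpy as np

def bootstrap_ci(data, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary
    statistic of non-Gaussian data."""
    rng = np.random.default_rng(seed)
    n = len(data)
    # resample with replacement and recompute the statistic each time
    reps = np.array([statistic(data[rng.integers(0, n, n)])
                     for _ in range(n_boot)])
    lo, hi = np.quantile(reps, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# toy validation: accept the model if its predicted mean lies inside
# the bootstrap CI of the experimentally observed mean
experimental = np.random.default_rng(42).exponential(scale=2.0, size=200)
lo, hi = bootstrap_ci(experimental, np.mean)
model_prediction = 2.0
accepted = lo <= model_prediction <= hi
```

A full validation procedure would compare several user-specified measures of system behaviour this way, not just the mean, but the resampling core is the same.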
Statistical Models for Social Networks
Snijders, Tom A. B.; Cook, KS; Massey, DS
2011-01-01
Statistical models for social networks as dependent variables must represent the typical network dependencies between tie variables such as reciprocity, homophily, transitivity, etc. This review first treats models for single (cross-sectionally observed) networks and then for network dynamics. For
Automated statistical modeling of analytical measurement systems
International Nuclear Information System (INIS)
Jacobson, J.J.
1992-01-01
The statistical modeling of analytical measurement systems at the Idaho Chemical Processing Plant (ICPP) has been completely automated through computer software. The statistical modeling of analytical measurement systems is one part of a complete quality control program used by the Remote Analytical Laboratory (RAL) at the ICPP. The quality control program is an integration of automated data input, measurement system calibration, database management, and statistical process control. The quality control program and statistical modeling program meet the guidelines set forth by the American Society for Testing and Materials and the American National Standards Institute. A statistical model is a set of mathematical equations describing any systematic bias inherent in a measurement system and the precision of a measurement system. A statistical model is developed from data generated from the analysis of control standards. Control standards are samples which are made up at precise known levels by an independent laboratory and submitted to the RAL. The RAL analysts who process control standards do not know the values of those control standards. The object behind statistical modeling is to describe real process samples in terms of their bias and precision and to verify that a measurement system is operating satisfactorily. The processing of control standards gives us this ability.
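A minimal sketch of fitting such a bias-and-precision model to control-standard results might look like this; the linear bias form and the numbers are our assumptions for illustration (the RAL software and its actual model form are not described in the abstract):

```python
def fit_bias_model(known, measured):
    """Least-squares fit of measured = a + b * known.

    Returns (a, b, s): intercept, slope, and residual standard deviation.
    Systematic bias at level x is (a + b * x) - x; s estimates precision.
    """
    n = len(known)
    mx = sum(known) / n
    my = sum(measured) / n
    sxx = sum((x - mx) ** 2 for x in known)
    sxy = sum((x - mx) * (y - my) for x, y in zip(known, measured))
    b = sxy / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(known, measured)]
    s = (sum(r * r for r in resid) / (n - 2)) ** 0.5
    return a, b, s

# Hypothetical control-standard results with a small constant offset
known = [1.0, 2.0, 5.0, 10.0, 20.0]
measured = [1.1, 2.1, 5.1, 10.1, 20.1]
a, b, s = fit_bias_model(known, measured)
```

Here the fit recovers a slope of 1 and an intercept of 0.1, i.e. a constant additive bias with essentially zero residual scatter.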
Statistical Model of Extreme Shear
DEFF Research Database (Denmark)
Hansen, Kurt Schaldemose; Larsen, Gunner Chr.
2005-01-01
In order to continue cost-optimisation of modern large wind turbines, it is important to continuously increase the knowledge of wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... (PDF) of turbulence driven short-term extreme wind shear events, conditioned on the mean wind speed, for an arbitrary recurrence period. The model is based on an asymptotic expansion, and only a few and easily accessible parameters are needed as input. The model of the extreme PDF is supplemented...... by a model that, on a statistically consistent basis, describes the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of full-scale measurements recorded with a high sampling rate...
Statistical Model of Extreme Shear
DEFF Research Database (Denmark)
Larsen, Gunner Chr.; Hansen, Kurt Schaldemose
2004-01-01
In order to continue cost-optimisation of modern large wind turbines, it is important to continuously increase the knowledge of wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... (PDF) of turbulence driven short-term extreme wind shear events, conditioned on the mean wind speed, for an arbitrary recurrence period. The model is based on an asymptotic expansion, and only a few and easily accessible parameters are needed as input. The model of the extreme PDF is supplemented...... by a model that, on a statistically consistent basis, describes the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of high-sampling-rate full-scale time series measurements...
Generative models for chemical structures.
White, David; Wilson, Richard C
2010-07-26
We apply recently developed techniques for pattern recognition to construct a generative model for chemical structure. This approach can be viewed as ligand-based de novo design. We construct a statistical model describing the structural variations present in a set of molecules which may be sampled to generate new structurally similar examples. We prevent the possibility of generating chemically invalid molecules, according to our implicit hydrogen model, by projecting samples onto the nearest chemically valid molecule. By populating the input set with molecules that are active against a target, we show how new molecules may be generated that will likely also be active against the target.
A Statistical Programme Assignment Model
DEFF Research Database (Denmark)
Rosholm, Michael; Staghøj, Jonas; Svarer, Michael
When treatment effects of active labour market programmes are heterogeneous in an observable way across the population, the allocation of the unemployed into different programmes becomes a particularly important issue. In this paper, we present a statistical model designed to improve the present...
Textual information access statistical models
Gaussier, Eric
2013-01-01
This book presents statistical models that have recently been developed within several research communities to access information contained in text collections. The problems considered are linked to applications aiming at facilitating information access: information extraction and retrieval; text classification and clustering; opinion mining; and comprehension aids (automatic summarization, machine translation, visualization). In order to give the reader as complete a description as possible, the focus is placed on the probability models used in the applications
Quantum Statistical Testing of a Quantum Random Number Generator
Energy Technology Data Exchange (ETDEWEB)
Humble, Travis S [ORNL
2014-01-01
The unobservable elements in a quantum technology, e.g., the quantum state, complicate system verification against promised behavior. Using model-based systems engineering, we present methods for verifying the operation of a prototypical quantum random number generator (QRNG). We begin with the algorithmic design of the QRNG followed by the synthesis of its physical design requirements. We next discuss how quantum statistical testing can be used to verify device behavior as well as detect device bias. We conclude by highlighting how system design and verification methods must influence efforts to certify future quantum technologies.
Statistical Models of Adaptive Immune populations
Sethna, Zachary; Callan, Curtis; Walczak, Aleksandra; Mora, Thierry
The availability of large (10^4-10^6 sequences) datasets of B or T cell populations from a single individual allows reliable fitting of complex statistical models for naïve generation, somatic selection, and hypermutation. It is crucial to utilize a probabilistic/informational approach when modeling these populations. The inferred probability distributions allow for population characterization, calculation of probability distributions of various hidden variables (e.g. number of insertions), as well as statistical properties of the distribution itself (e.g. entropy). In particular, the differences between the T cell populations of embryonic and mature mice will be examined as a case study. Comparing these populations, as well as proposed mixed populations, provides a concrete exercise in model creation, comparison, choice, and validation.
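One of the distributional summaries mentioned, the entropy of an inferred distribution, is straightforward to sketch; the insertion-length probabilities below are invented illustrative values, not inferred from real repertoire data:

```python
import math

def entropy_bits(probs):
    """Shannon entropy (in bits) of a discrete probability distribution,
    e.g. an inferred distribution over the number of inserted nucleotides."""
    assert abs(sum(probs) - 1.0) < 1e-9  # must be a proper distribution
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical insertion-length distribution (illustrative values only)
p_insertions = [0.4, 0.3, 0.2, 0.1]
h = entropy_bits(p_insertions)
```

Comparing such entropies between embryonic and mature repertoires quantifies how much more diverse one population is than the other.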
The Utility of "Tree-Generating" Statistics in Applied Social Work Research.
Wambach, Kathryn G.
1999-01-01
Illustrates the usefulness of "tree-generating" statistics for public sector service planning, focusing on the use of Automatic Interaction Detection, an early version of tree-generating statistics, to provide a model for the data analysis needed to plan substance abuse programs on a local level. (SLD)
Auxiliary Deep Generative Models
DEFF Research Database (Denmark)
Maaløe, Lars; Sønderby, Casper Kaae; Sønderby, Søren Kaae
2016-01-01
Deep generative models parameterized by neural networks have recently achieved state-of-the-art performance in unsupervised and semi-supervised learning. We extend deep generative models with auxiliary variables which improves the variational approximation. The auxiliary variables leave...... faster with better results. We show state-of-the-art performance within semi-supervised learning on MNIST (0.96%), SVHN (16.61%) and NORB (9.40%) datasets....
Statistical Analysis by Statistical Physics Model for the STOCK Markets
Wang, Tiansong; Wang, Jun; Fan, Bingli
A new stochastic stock price model for stock markets, based on the contact process of statistical physics, is presented in this paper. The contact model is a continuous-time Markov process; one interpretation of this model is as a model for the spread of an infection. Through this model, the statistical properties of the Shanghai Stock Exchange (SSE) and the Shenzhen Stock Exchange (SZSE) are studied. In the present paper, the data of the SSE Composite Index and the data of the SZSE Component Index are analyzed, and the corresponding simulations are carried out by computer. Further, we investigate the statistical properties, fat-tail phenomena, power-law distributions, and long memory of returns for these indices. The techniques of the skewness-kurtosis test, the Kolmogorov-Smirnov test, and R/S analysis are applied to study the fluctuation characteristics of the stock price returns.
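The R/S (rescaled range) analysis mentioned above can be sketched for a single window; extending it across window sizes n and regressing log(R/S) on log(n) yields the Hurst exponent used to diagnose long memory. The synthetic return series here is an assumption, not SSE/SZSE data:

```python
import random

def rescaled_range(series):
    """Classical R/S statistic for one window of a time series.

    R is the range of cumulative deviations from the window mean and S
    the standard deviation; across window sizes n, E[R/S] ~ c * n^H
    defines the Hurst exponent H.
    """
    n = len(series)
    mean = sum(series) / n
    dev, cum = [], 0.0
    for x in series:
        cum += x - mean
        dev.append(cum)  # running sum of deviations
    r = max(dev) - min(dev)
    s = (sum((x - mean) ** 2 for x in series) / n) ** 0.5
    return r / s

rng = random.Random(1)
returns = [rng.gauss(0, 1) for _ in range(512)]  # toy i.i.d. returns
rs = rescaled_range(returns)
```

For independent returns H is near 0.5; persistent long-memory returns push H above 0.5.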
Statistical tests of simple earthquake cycle models
Devries, Phoebe M. R.; Evans, Eileen
2016-01-01
A central goal of observing and modeling the earthquake cycle is to forecast when a particular fault may generate an earthquake: a fault late in its earthquake cycle may be more likely to generate an earthquake than a fault early in its earthquake cycle. Models that can explain geodetic observations throughout the entire earthquake cycle may be required to gain a more complete understanding of relevant physics and phenomenology. Previous efforts to develop unified earthquake models for strike-slip faults have largely focused on explaining both preseismic and postseismic geodetic observations available across a few faults in California, Turkey, and Tibet. An alternative approach leverages the global distribution of geodetic and geologic slip rate estimates on strike-slip faults worldwide. Here we use the Kolmogorov-Smirnov test for similarity of distributions to infer, in a statistically rigorous manner, viscoelastic earthquake cycle models that are inconsistent with 15 sets of observations across major strike-slip faults. We reject a large subset of two-layer models incorporating Burgers rheologies at a significance level of α = 0.05 (those with long-term Maxwell viscosities ηM ~ 4.6 × 1020 Pa s) but cannot reject models on the basis of transient Kelvin viscosity ηK. Finally, we examine the implications of these results for the predicted earthquake cycle timing of the 15 faults considered and compare these predictions to the geologic and historical record.
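The two-sample Kolmogorov-Smirnov statistic used here measures the maximum vertical distance between two empirical CDFs; a minimal sketch follows (the sample values are illustrative placeholders, not actual slip-rate observations):

```python
import bisect

def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic D: the maximum vertical
    distance between the empirical CDFs of the two samples."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for v in sorted(set(a + b)):
        fa = bisect.bisect_right(a, v) / len(a)  # ECDF of sample a at v
        fb = bisect.bisect_right(b, v) / len(b)  # ECDF of sample b at v
        d = max(d, abs(fa - fb))
    return d

# Identical samples give D = 0; fully separated samples give D = 1
slip_model = [2.0, 4.0, 6.0, 8.0]
slip_obs = [12.0, 14.0, 16.0, 18.0]
d = ks_two_sample(slip_model, slip_obs)
```

Comparing D against a critical value at α = 0.05 is then the basis for rejecting a candidate earthquake cycle model.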
Statistical Modelling of the Soil Dielectric Constant
Usowicz, Boguslaw; Marczewski, Wojciech; Bogdan Usowicz, Jerzy; Lipiec, Jerzy
2010-05-01
The dielectric constant of soil is a physical property that is very sensitive to water content. It underlies several electrical techniques for determining water content, both direct (TDR, FDR, and others based on electrical conductance and/or capacitance) and indirect RS (Remote Sensing) methods. This work is devoted to a particular statistical way of modelling the dielectric constant as a property that accounts for a wide range of soil compositions, porosities, and mass densities over the unsaturated water content range. Usually, such models are determined for a few particular soil types; when the soil type changes, one must either switch to a model for another type or adjust the model by parametrizing the soil compounds. This makes it difficult to compare and relate results between models. The presented model was developed for a generic representation of soil as a hypothetical mixture of spheres, each representing a soil fraction in its proper phase state. The model generates a serial-parallel mesh of conductive and capacitive paths, which is analysed for its total conductive or capacitive property. The model was first developed to determine the thermal conductivity and is now extended to the dielectric constant by analysing the capacitive mesh. The analysis is carried out by statistical means obeying the physical laws of serial-parallel branching of the representative electrical mesh. The physical relevance of the analysis is established electrically, but the definition of the electrical mesh is controlled statistically by the parametrization of compound fractions, by the number of representative spheres per unit volume per fraction, and by the number of fractions. In this way the model can cover the properties of nearly all possible soil types and all phase states, within recognition of the Lorenz and Knudsen conditions. In effect the model allows generating a hypothetical representative of
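The serial-parallel combination rules underlying such a mesh are standard circuit identities; the toy sketch below shows only those rules (the mesh topology and capacitance values are illustrative assumptions, whereas the real model builds the mesh statistically from soil fractions):

```python
def series(caps):
    # Capacitances in series combine by reciprocal sum
    return 1.0 / sum(1.0 / c for c in caps)

def parallel(caps):
    # Capacitances in parallel add directly
    return sum(caps)

# Toy mesh: two parallel branches, each a series chain of "sphere"
# elements representing soil fractions (values are not calibrated)
branch1 = series([3.0, 6.0])          # ≈ 2.0
branch2 = series([4.0, 4.0])          # ≈ 2.0
total = parallel([branch1, branch2])  # ≈ 4.0
```

In the statistical model, many such meshes are generated from the fraction parametrization and the total property is obtained by averaging over them.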
Energy Technology Data Exchange (ETDEWEB)
A. Alsaed
2004-11-18
"The Disposal Criticality Analysis Methodology Topical Report" prescribes an approach to the methodology for performing postclosure criticality analyses within the monitored geologic repository at Yucca Mountain, Nevada. An essential component of the methodology is the "Configuration Generator Model for In-Package Criticality" that provides a tool to evaluate the probabilities of degraded configurations achieving a critical state. The configuration generator model is a risk-informed, performance-based process for evaluating the criticality potential of degraded configurations in the monitored geologic repository. The method uses event tree methods to define configuration classes derived from criticality scenarios and to identify configuration class characteristics (parameters, ranges, etc.). The probabilities of achieving the various configuration classes are derived in part from probability density functions for degradation parameters. The NRC has issued "Safety Evaluation Report for Disposal Criticality Analysis Methodology Topical Report, Revision 0". That report contained 28 open items that required resolution through additional documentation. Of the 28 open items, numbers 5, 6, 9, 10, 18, and 19 were concerned with a previously proposed software approach to the configuration generator methodology and, in particular, the k_eff regression analysis associated with the methodology. However, the use of a k_eff regression analysis is not part of the current configuration generator methodology and, thus, the referenced open items are no longer considered applicable and will not be further addressed.
Statistical modelling of fish stocks
DEFF Research Database (Denmark)
Kvist, Trine
1999-01-01
for modelling the dynamics of a fish population is suggested. A new approach is introduced to analyse the sources of variation in age composition data, which is one of the most important sources of information in the cohort based models for estimation of stock abundances and mortalities. The approach combines...... and it is argued that an approach utilising stochastic differential equations might be advantageous in fish stock assessments....
Statistical lung model for microdosimetry
International Nuclear Information System (INIS)
Fisher, D.R.; Hadley, R.T.
1984-03-01
To calculate the microdosimetry of plutonium in the lung, a mathematical description is needed of lung tissue microstructure that defines source-site parameters. Beagle lungs were expanded using a glutaraldehyde fixative at 30 cm water pressure. Tissue specimens, five microns thick, were stained with hematoxylin and eosin, then studied using an image analyzer. Measurements were made along horizontal lines through the magnified tissue image. The distributions of air space and tissue chord lengths and the locations of epithelial cell nuclei were recorded from about 10,000 line scans. The distribution parameters constituted a model of lung microstructure for predicting the paths of random alpha particle tracks in the lung and the probability of traversing biologically sensitive sites. This lung model may be used in conjunction with established deposition and retention models for determining the microdosimetry in the pulmonary lung for a wide variety of inhaled radioactive materials.
Actuarial statistics with generalized linear mixed models
Antonio, K.; Beirlant, J.
2007-01-01
Over the last decade the use of generalized linear models (GLMs) in actuarial statistics has received a lot of attention, starting from the actuarial illustrations in the standard text by McCullagh and Nelder [McCullagh, P., Nelder, J.A., 1989. Generalized linear models. In: Monographs on Statistics
Statistical Modeling of Bivariate Data.
1982-08-01
to one. Following Crain (1974), one may consider order m approximators

    log f_m(x) = Σ_{k=-m}^{m} θ_k φ_k(x) − c(θ),  a ≤ x ≤ b,  (4.4.5)

and attempt to find...literature. Consider the approximate model

    log f_n(x) = Σ_{k=-m}^{m} θ_k φ_k(x) + σ G(x),  a ≤ x ≤ b,  (4.4.8)

where G(x) is a Gaussian process and σ is a
Statistical Models and Methods for Lifetime Data
Lawless, Jerald F
2011-01-01
Praise for the First Edition"An indispensable addition to any serious collection on lifetime data analysis and . . . a valuable contribution to the statistical literature. Highly recommended . . ."-Choice"This is an important book, which will appeal to statisticians working on survival analysis problems."-Biometrics"A thorough, unified treatment of statistical models and methods used in the analysis of lifetime data . . . this is a highly competent and agreeable statistical textbook."-Statistics in MedicineThe statistical analysis of lifetime or response time data is a key tool in engineering,
Statistical Shape Modeling of Cam Femoroacetabular Impingement
Energy Technology Data Exchange (ETDEWEB)
Harris, Michael D.; Dater, Manasi; Whitaker, Ross; Jurrus, Elizabeth R.; Peters, Christopher L.; Anderson, Andrew E.
2013-10-01
In this study, statistical shape modeling (SSM) was used to quantify three-dimensional (3D) variation and morphologic differences between femurs with and without cam femoroacetabular impingement (FAI). 3D surfaces were generated from CT scans of femurs from 41 controls and 30 cam FAI patients. SSM correspondence particles were optimally positioned on each surface using a gradient descent energy function. Mean shapes for control and patient groups were defined from the resulting particle configurations. Morphological differences between group mean shapes and between the control mean and individual patients were calculated. Principal component analysis was used to describe anatomical variation present in both groups. The first 6 modes (or principal components) captured statistically significant shape variations, which comprised 84% of cumulative variation among the femurs. Shape variation was greatest in femoral offset, greater trochanter height, and the head-neck junction. The mean cam femur shape protruded above the control mean by a maximum of 3.3 mm with sustained protrusions of 2.5-3.0 mm along the anterolateral head-neck junction and distally along the anterior neck, corresponding well with reported cam lesion locations and soft-tissue damage. This study provides initial evidence that SSM can describe variations in femoral morphology in both controls and cam FAI patients and may be useful for developing new measurements of pathological anatomy. SSM may also be applied to characterize cam FAI severity and provide templates to guide patient-specific surgical resection of bone.
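The core of such a statistical shape model, a mean shape plus principal modes of variation computed from corresponding landmarks, can be sketched with a toy point-distribution model. The landmark data and the power-iteration step below are our simplifications for illustration, not the study's gradient-descent correspondence method:

```python
def shape_model(shapes, iters=200):
    """Minimal point-distribution model: mean shape plus the dominant
    principal mode, found by power iteration on the sample covariance.
    Each shape is a flat list of corresponding landmark coordinates."""
    n, d = len(shapes), len(shapes[0])
    mean = [sum(s[j] for s in shapes) / n for j in range(d)]
    devs = [[s[j] - mean[j] for j in range(d)] for s in shapes]
    v = [1.0] * d
    for _ in range(iters):
        # One power-iteration step: w = X^T (X v) up to scaling
        w = [sum(dev[j] * sum(di * vi for di, vi in zip(dev, v))
                 for dev in devs) for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v

# Toy "femur outlines": 4 landmarks varying only in the first coordinate
shapes = [[0.0, 1.0, 2.0, 3.0],
          [2.0, 1.0, 2.0, 3.0],
          [4.0, 1.0, 2.0, 3.0]]
mean, mode = shape_model(shapes)
```

In the study, principal component analysis of such deviations yields the modes whose first six components captured 84% of the femoral shape variation.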
Accelerated life models modeling and statistical analysis
Bagdonavicius, Vilijandas
2001-01-01
Failure Time Distributions: Introduction; Parametric Classes of Failure Time Distributions. Accelerated Life Models: Introduction; Generalized Sedyakin's Model; Accelerated Failure Time Model; Proportional Hazards Model; Generalized Proportional Hazards Models; Generalized Additive and Additive-Multiplicative Hazards Models; Changing Shape and Scale Models; Generalizations; Models Including Switch-Up and Cycling Effects; Heredity Hypothesis; Summary. Accelerated Degradation Models: Introduction; Degradation Models; Modeling the Influence of Explanatory Varia
Advanced data analysis in neuroscience integrating statistical and computational models
Durstewitz, Daniel
2017-01-01
This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanatory frameworks, but become powerfu...
Bayesian models: A statistical primer for ecologists
Hobbs, N. Thompson; Hooten, Mevin B.
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods, in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals. This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management. Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticians. Covers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and more. Deemphasizes computer coding in favor of basic principles. Explains how to write out properly factored statistical expressions representing Bayesian models.
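The basic distribution theory covered in such a primer can be illustrated with the simplest conjugate Bayesian model: a Beta prior on a binomial probability. The occupancy numbers below are invented for illustration, not taken from the book:

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Conjugate Bayesian update: a Beta(alpha, beta) prior on a
    probability combined with binomial data yields a Beta posterior."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    # Mean of a Beta(alpha, beta) distribution
    return alpha / (alpha + beta)

# Ecological toy example: site-occupancy probability with a flat
# Beta(1, 1) prior, after detecting a species at 7 of 10 surveyed sites
a, b = beta_binomial_update(1.0, 1.0, 7, 3)
posterior_mean = beta_mean(a, b)  # (1 + 7) / (1 + 7 + 1 + 3)
```

Conjugacy makes the update exact and analytic; hierarchical models and MCMC take over when no such closed form exists.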
Uncertainty the soul of modeling, probability & statistics
Briggs, William
2016-01-01
This book presents a philosophical approach to probability and probabilistic thinking, considering the underpinnings of probabilistic reasoning and modeling, which effectively underlie everything in data science. The ultimate goal is to call into question many standard tenets and lay the philosophical and probabilistic groundwork and infrastructure for statistical modeling. It is the first book devoted to the philosophy of data aimed at working scientists and calls for a new consideration in the practice of probability and statistics to eliminate what has been referred to as the "Cult of Statistical Significance". The book explains the philosophy of these ideas and not the mathematics, though there are a handful of mathematical examples. The topics are logically laid out, starting with basic philosophy as related to probability, statistics, and science, and stepping through the key probabilistic ideas and concepts, and ending with statistical models. Its jargon-free approach asserts that standard methods, suc...
Statistical modelling of citation exchange between statistics journals.
Varin, Cristiano; Cattelan, Manuela; Firth, David
2016-01-01
Rankings of scholarly journals based on citation data are often met with scepticism by the scientific community. Part of the scepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of researchers. The paper focuses on analysis of the table of cross-citations among a selection of statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care to avoid potential overinterpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's research assessment exercise shows strong correlation at aggregate level between assessed research quality and journal citation 'export scores' within the discipline of statistics.
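Citation exchange between journals is commonly modelled with a Bradley-Terry-type model, which this paper builds on; a sketch under that assumption, fitted with the classical MM algorithm (the cross-citation counts below are invented, not Web of Science data):

```python
def bradley_terry(wins, iters=500):
    """Bradley-Terry fit via the MM algorithm (Hunter, 2004).

    wins[i][j] = number of times item i 'beats' item j; for journals,
    citations received by i from j can play that role, so the fitted
    strengths act like 'export scores'.
    """
    n = len(wins)
    p = [1.0] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            w_i = sum(wins[i])  # total wins of item i
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new.append(w_i / denom if denom > 0 else p[i])
        s = sum(new)
        p = [x / s for x in new]  # normalize: strengths are scale-free
    return p

# Toy 3-journal cross-citation table (illustrative counts only)
wins = [[0, 30, 40],
        [10, 0, 25],
        [5, 15, 0]]
scores = bradley_terry(wins)
```

The caution in the abstract applies directly here: nearly equal fitted scores should not be over-interpreted as a meaningful ranking difference.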
Topology for statistical modeling of petascale data.
Energy Technology Data Exchange (ETDEWEB)
Pascucci, Valerio (University of Utah, Salt Lake City, UT); Mascarenhas, Ajith Arthur; Rusek, Korben (Texas A&M University, College Station, TX); Bennett, Janine Camille; Levine, Joshua (University of Utah, Salt Lake City, UT); Pebay, Philippe Pierre; Gyulassy, Attila (University of Utah, Salt Lake City, UT); Thompson, David C.; Rojas, Joseph Maurice (Texas A&M University, College Station, TX)
2011-07-01
This document presents current technical progress and dissemination of results for the Mathematics for Analysis of Petascale Data (MAPD) project titled 'Topology for Statistical Modeling of Petascale Data', funded by the Office of Science Advanced Scientific Computing Research (ASCR) Applied Math program. Many commonly used algorithms for mathematical analysis do not scale well enough to accommodate the size or complexity of petascale data produced by computational simulations. The primary goal of this project is thus to develop new mathematical tools that address both the petascale size and uncertain nature of current data. At a high level, our approach is based on the complementary techniques of combinatorial topology and statistical modeling. In particular, we use combinatorial topology to filter out spurious data that would otherwise skew statistical modeling techniques, and we employ advanced algorithms from algebraic statistics to efficiently find globally optimal fits to statistical models. This document summarizes the technical advances we have made to date that were made possible in whole or in part by MAPD funding. These technical contributions can be divided loosely into three categories: (1) advances in the field of combinatorial topology, (2) advances in statistical modeling, and (3) new integrated topological and statistical methods.
Schedulability of Herschel revisited using statistical model checking
DEFF Research Database (Denmark)
David, Alexandre; Larsen, Kim Guldstrand; Legay, Axel
2015-01-01
to obtain some guarantee on the (un)schedulability of the model even in the presence of undecidability. Two methods are considered: symbolic model checking and statistical model checking. Since the model uses stop-watches, the reachability problem becomes undecidable, so we are using an over-approximation technique. We can safely conclude that the system is schedulable for varying values of BCET. For the cases where deadlines are violated, we use polyhedra to try to confirm the witnesses. Our alternative method to confirm non-schedulability uses statistical model-checking (SMC) to generate counter...
Infinite Random Graphs as Statistical Mechanical Models
DEFF Research Database (Denmark)
Durhuus, Bergfinnur Jøgvan; Napolitano, George Maria
2011-01-01
We discuss two examples of infinite random graphs obtained as limits of finite statistical mechanical systems: a model of two-dimensional discretized quantum gravity defined in terms of causal triangulated surfaces, and the Ising model on generic random trees. For the former model we describe...
Review of statistical models for nuclear reactions
International Nuclear Information System (INIS)
Igarasi, Sin-iti
1991-01-01
Statistical model calculations have been widely performed for nuclear data evaluations. These were based on the models of Hauser-Feshbach, Weisskopf-Ewing and their modifications. Since the 1940s, non-compound nuclear phenomena have been observed, which stimulated many nuclear physicists to study compound and non-compound nuclear reaction mechanisms. Concerning compound nuclear reactions, they investigated problems on the basis of fundamental properties of the S-matrix, statistical distributions of resonance pole parameters, random matrix elements of the nuclear Hamiltonian, and so forth, and have presented many sophisticated results. The old statistical models have nevertheless remained useful, because they are simple and easy to use. In this report, these old and new models are briefly reviewed with a view to their application in nuclear data evaluation, and the applicability of the new models is examined. (author)
Matrix Tricks for Linear Statistical Models
Puntanen, Simo; Styan, George PH
2011-01-01
In teaching linear statistical models to first-year graduate students or to final-year undergraduate students there is no way to proceed smoothly without matrices and related concepts of linear algebra; their use is really essential. Our experience is that making some particular matrix tricks very familiar to students can substantially increase their insight into linear statistical models (and also multivariate statistical analysis). In matrix algebra, there are handy, sometimes even very simple "tricks" which simplify and clarify the treatment of a problem - both for the student and
Daily precipitation statistics in regional climate models
DEFF Research Database (Denmark)
Frei, Christoph; Christensen, Jens Hesselbjerg; Déqué, Michel
2003-01-01
. The 15-year integrations were forced from reanalyses and observed sea surface temperature and sea ice (global model from sea surface only). The observational reference is based on 6400 rain gauge records (10-50 stations per grid box). Evaluation statistics encompass mean precipitation, wet-day frequency...... for other statistics. In summer, all models underestimate precipitation intensity (by 16-42%) and there is a too low frequency of heavy events. This bias reflects too dry summer mean conditions in three of the models, while it is partly compensated by too many low-intensity events in the other two models...
Learning generative models for protein fold families.
Balakrishnan, Sivaraman; Kamisetty, Hetunandan; Carbonell, Jaime G; Lee, Su-In; Langmead, Christopher James
2011-04-01
We introduce a new approach to learning statistical models from multiple sequence alignments (MSA) of proteins. Our method, called GREMLIN (Generative REgularized ModeLs of proteINs), learns an undirected probabilistic graphical model of the amino acid composition within the MSA. The resulting model encodes both the position-specific conservation statistics and the correlated mutation statistics between sequential and long-range pairs of residues. Existing techniques for learning graphical models from MSA either make strong, and often inappropriate assumptions about the conditional independencies within the MSA (e.g., Hidden Markov Models), or else use suboptimal algorithms to learn the parameters of the model. In contrast, GREMLIN makes no a priori assumptions about the conditional independencies within the MSA. We formulate and solve a convex optimization problem, thus guaranteeing that we find a globally optimal model at convergence. The resulting model is also generative, allowing for the design of new protein sequences that have the same statistical properties as those in the MSA. We perform a detailed analysis of covariation statistics on the extensively studied WW and PDZ domains and show that our method out-performs an existing algorithm for learning undirected probabilistic graphical models from MSA. We then apply our approach to 71 additional families from the PFAM database and demonstrate that the resulting models significantly out-perform Hidden Markov Models in terms of predictive accuracy. Copyright © 2011 Wiley-Liss, Inc.
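The position-specific conservation statistics mentioned above can be sketched as per-column entropies of the MSA; the toy alignment is ours, and GREMLIN additionally models the pairwise (correlated-mutation) couplings, which this sketch omits:

```python
import math

def column_conservation(msa):
    """Position-specific conservation from a multiple sequence alignment:
    per-column Shannon entropy in bits (0 = perfectly conserved)."""
    ncols = len(msa[0])
    out = []
    for j in range(ncols):
        counts = {}
        for seq in msa:
            counts[seq[j]] = counts.get(seq[j], 0) + 1
        total = sum(counts.values())
        h = -sum((c / total) * math.log2(c / total)
                 for c in counts.values())
        out.append(h)
    return out

# Toy alignment: column 0 fully conserved, column 2 maximally variable
msa = ["GAV", "GAC", "GWD", "GWE"]
h = column_conservation(msa)
```

A generative model fitted to these statistics can then emit new sequences whose columns reproduce the same conservation profile.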
Distributions with given marginals and statistical modelling
Fortiana, Josep; Rodriguez-Lallena, José
2002-01-01
This book contains a selection of the papers presented at the meeting `Distributions with given marginals and statistical modelling', held in Barcelona (Spain), July 17-20, 2000. In 24 chapters, this book covers topics such as the theory of copulas and quasi-copulas, the theory and compatibility of distributions, models for survival distributions and other well-known distributions, time series, categorical models, definition and estimation of measures of dependence, monotonicity and stochastic ordering, shape and separability of distributions, hidden truncation models, diagonal families, orthogonal expansions, tests of independence, and goodness of fit assessment. These topics share the use and properties of distributions with given marginals, this being the fourth specialised text on this theme. The innovative aspect of the book is the inclusion of statistical aspects such as modelling, Bayesian statistics, estimation, and tests.
Fluctuations of offshore wind generation: Statistical modelling
DEFF Research Database (Denmark)
Pinson, Pierre; Christensen, Lasse E.A.; Madsen, Henrik
2007-01-01
The magnitude of power fluctuations at large offshore wind farms has a significant impact on the control and management strategies of their power output. If focusing on the minute scale, one observes successive periods with smaller and larger power fluctuations. It seems that different regimes yi...
Statistical Modeling for Radiation Hardness Assurance
Ladbury, Raymond L.
2014-01-01
We cover the models and statistics associated with single-event effects (and total ionizing dose), why we need them, and how to use them: which models are used, what errors exist in real test data, and what the models allow us to say about the device under test (DUT). In addition, we cover how to use other sources of data, such as historical, heritage, and similar-part data, and how to apply experience, physics, and expert opinion to the analysis. Also included are concepts of Bayesian statistics, data fitting, and bounding rates.
Performance modeling, loss networks, and statistical multiplexing
Mazumdar, Ravi
2009-01-01
This monograph presents a concise mathematical approach for modeling and analyzing the performance of communication networks with the aim of understanding the phenomenon of statistical multiplexing. The novelty of the monograph is the fresh approach and insights provided by a sample-path methodology for queueing models that highlights the important ideas of Palm distributions associated with traffic models and their role in performance measures. Also presented are recent ideas of large-buffer and many-sources asymptotics that play an important role in understanding statistical multiplexing.
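A classical building block of the loss-network analysis this monograph treats is the Erlang B formula for the blocking probability of an M/M/c/c system. The sketch below uses the standard numerically stable recursion; it is an illustrative aside, not code from the monograph.

```python
def erlang_b(traffic, servers):
    """Blocking probability B(a, c) of an M/M/c/c loss system, where
    `traffic` is the offered load a (Erlangs) and `servers` is c.
    Uses the stable recursion B(a, n) = a*B(a, n-1) / (n + a*B(a, n-1))."""
    b = 1.0  # B(a, 0) = 1: with no servers, every call is blocked
    for n in range(1, servers + 1):
        b = traffic * b / (n + traffic * b)
    return b
```

For example, with one server and one Erlang of offered load, `erlang_b(1.0, 1)` gives 0.5, and adding servers at fixed load strictly reduces blocking — the trunking-efficiency effect behind statistical multiplexing.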
Simple statistical model for branched aggregates
DEFF Research Database (Denmark)
Lemarchand, Claire; Hansen, Jesper Schmidt
2015-01-01
We propose a statistical model that can reproduce the size distribution of any branched aggregate, including amylopectin, dendrimers, molecular clusters of monoalcohols, and asphaltene nanoaggregates. It is based on the conditional probability for one molecule to form a new bond with a molecule......, given that it already has bonds with others. The model is applied here to asphaltene nanoaggregates observed in molecular dynamics simulations of Cooee bitumen. The variation with temperature of the probabilities deduced from this model is discussed in terms of statistical mechanics arguments....... The relevance of the statistical model in the case of asphaltene nanoaggregates is checked by comparing the predicted value of the probability for one molecule to have exactly i bonds with the same probability directly measured in the molecular dynamics simulations. The agreement is satisfactory...
Statistical Model Checking for Stochastic Hybrid Systems
DEFF Research Database (Denmark)
David, Alexandre; Du, Dehui; Larsen, Kim Guldstrand
2012-01-01
This paper presents novel extensions and applications of the UPPAAL-SMC model checker. The extensions allow for statistical model checking of stochastic hybrid systems. We show how our race-based stochastic semantics extends to networks of hybrid systems, and indicate the integration technique ap...
Advances in statistical models for data analysis
Minerva, Tommaso; Vichi, Maurizio
2015-01-01
This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.
Statistical physics of pairwise probability models
DEFF Research Database (Denmark)
Roudi, Yasser; Aurell, Erik; Hertz, John
2009-01-01
Statistical models for describing the probability distribution over the states of biological systems are commonly used for dimensional reduction. Among these models, pairwise models are very attractive in part because they can be fit using a reasonable amount of data......: knowledge of the means and correlations between pairs of elements in the system is sufficient. Not surprisingly, then, using pairwise models for studying neural data has been the focus of many studies in recent years. In this paper, we describe how tools from statistical physics can be employed for studying...... and using pairwise models. We build on our previous work on the subject and study the relation between different methods for fitting these models and evaluating their quality. In particular, using data from simulated cortical networks we study how the quality of various approximate methods for inferring...
Growth curve models and statistical diagnostics
Pan, Jian-Xin
2002-01-01
Growth-curve models are generalized multivariate analysis-of-variance models. These models are especially useful for investigating growth problems over short time periods in economics, biology, medical research, and epidemiology. This book systematically introduces the theory of growth-curve models (GCMs), with particular emphasis on their multivariate statistical diagnostics, which are based mainly on recent developments made by the authors and their collaborators. The authors provide complete proofs of theorems as well as practical data sets and MATLAB code.
Topology for Statistical Modeling of Petascale Data
Energy Technology Data Exchange (ETDEWEB)
Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Levine, Joshua [Univ. of Utah, Salt Lake City, UT (United States); Gyulassy, Attila [Univ. of Utah, Salt Lake City, UT (United States); Bremer, P. -T. [Univ. of Utah, Salt Lake City, UT (United States)
2013-10-31
Many commonly used algorithms for mathematical analysis do not scale well enough to accommodate the size or complexity of petascale data produced by computational simulations. The primary goal of this project is to develop new mathematical tools that address both the petascale size and uncertain nature of current data. At a high level, the approach of the entire team involving all three institutions is based on the complementary techniques of combinatorial topology and statistical modelling. In particular, we use combinatorial topology to filter out spurious data that would otherwise skew statistical modelling techniques, and we employ advanced algorithms from algebraic statistics to efficiently find globally optimal fits to statistical models. The overall technical contributions can be divided loosely into three categories: (1) advances in the field of combinatorial topology, (2) advances in statistical modelling, and (3) new integrated topological and statistical methods. Roughly speaking, the division of labor between our three groups (Sandia Labs in Livermore, Texas A&M in College Station, and U Utah in Salt Lake City) is as follows: the Sandia group focuses on statistical methods and their formulation in algebraic terms, and finds the application problems (and data sets) most relevant to this project; the Texas A&M group develops new algebraic geometry algorithms, in particular with fewnomial theory; and the Utah group develops new algorithms in computational topology via Discrete Morse Theory. However, we hasten to point out that our three groups stay in tight contact via videoconference every two weeks, so there is much synergy of ideas between the groups. The remainder of this document focuses on the contributions that had greater direct involvement from the team at the University of Utah in Salt Lake City.
An R companion to linear statistical models
Hay-Jahans, Christopher
2011-01-01
Focusing on user-developed programming, An R Companion to Linear Statistical Models serves two audiences: those who are familiar with the theory and applications of linear statistical models and wish to learn or enhance their skills in R; and those who are enrolled in an R-based course on regression and analysis of variance. For those who have never used R, the book begins with a self-contained introduction to R that lays the foundation for later chapters.This book includes extensive and carefully explained examples of how to write programs using the R programming language. These examples cove
Bayesian models a statistical primer for ecologists
Hobbs, N Thompson
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods-in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probabili
Two-dimensional models in statistical mechanics and field theory
International Nuclear Information System (INIS)
Koberle, R.
1980-01-01
Several features of two-dimensional models in statistical mechanics and field theory, such as lattice quantum chromodynamics, Z(N), Gross-Neveu and CP^(N-1), are discussed. The problems of confinement and dynamical mass generation are also analyzed. (L.C.)
Complex Data Modeling and Computationally Intensive Statistical Methods
Mantovan, Pietro
2010-01-01
The last years have seen the advent and development of many devices able to record and store an ever increasing amount of complex and high-dimensional data: 3D images generated by medical scanners or satellite remote sensing, DNA microarrays, real-time financial data, and system control datasets. The analysis of these data poses new challenging problems and requires the development of novel statistical models and computational methods, fueling many fascinating and fast-growing research areas of modern statistics. The book offers a wide variety of statistical methods and is addressed to statistici
STATISTICAL MODELS OF REPRESENTING INTELLECTUAL CAPITAL
Directory of Open Access Journals (Sweden)
Andreea Feraru
2016-06-01
This article, entitled Statistical Models of Representing Intellectual Capital, approaches and analyses the concept of intellectual capital, as well as the main models which can support entrepreneurs/managers in evaluating and quantifying the advantages of intellectual capital. Most authors examine intellectual capital from a static perspective and focus on the development of its various evaluation models. In this chapter we surveyed the classical static models: Sveiby, Edvinsson, Balanced Scorecard, as well as the canonical model of intellectual capital. Among the group of static models for evaluating organisational intellectual capital, the canonical model stands out. This model enables the structuring of organisational intellectual capital into human capital, structural capital and relational capital. Although the model is widely used, it is a static one and can thus introduce a series of errors in the process of evaluation, because the three entities mentioned above are not independent in their contents, as any logic of structuring complex entities requires.
Automated Simulation Model Generation
Huang, Y.
2013-01-01
One of today's challenges in the field of modeling and simulation is to model increasingly larger and more complex systems. Complex models take long to develop and incur high costs. With the advances in data collection technologies and more popular use of computer-aided systems, more data has become
Statistical Model Checking for Product Lines
DEFF Research Database (Denmark)
ter Beek, Maurice H.; Legay, Axel; Lluch Lafuente, Alberto
2016-01-01
We report on the suitability of statistical model checking for the analysis of quantitative properties of product line models by an extended treatment of earlier work by the authors. The type of analysis that can be performed includes the likelihood of specific product behaviour, the expected...... average cost of products (in terms of the attributes of the products’ features) and the probability of features to be (un)installed at runtime. The product lines must be modelled in QFLan, which extends the probabilistic feature-oriented language PFLan with novel quantitative constraints among features...... behaviour converge in a discrete-time Markov chain semantics, enabling the analysis of quantitative properties. Technically, a Maude implementation of QFLan, integrated with Microsoft’s SMT constraint solver Z3, is combined with the distributed statistical model checker MultiVeStA, developed by one...
Statistical mechanics model for orientational motion of two-dimensional rigid rotator
African Journals Online (AJOL)
African Journal of Science and Technology (AJST), Science and Engineering Series, Vol. 6, No. 2, pp. 94-101, December 2005. Malo, J.O., Department of Physics, University of Nairobi, P.O. Box 30197 ...
Probing NWP model deficiencies by statistical postprocessing
DEFF Research Database (Denmark)
Rosgaard, Martin Haubjerg; Nielsen, Henrik Aalborg; Nielsen, Torben S.
2016-01-01
The objective in this article is twofold. On one hand, a Model Output Statistics (MOS) framework for improved wind speed forecast accuracy is described and evaluated. On the other hand, the approach explored identifies unintuitive explanatory value from a diagnostic variable in an operational num...
Topology for Statistical Modeling of Petascale Data
Energy Technology Data Exchange (ETDEWEB)
Bennett, Janine Camille [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Pebay, Philippe Pierre [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Levine, Joshua [Univ. of Utah, Salt Lake City, UT (United States); Gyulassy, Attila [Univ. of Utah, Salt Lake City, UT (United States); Rojas, Maurice [Texas A & M Univ., College Station, TX (United States)
2014-07-01
This document presents current technical progress and dissemination of results for the Mathematics for Analysis of Petascale Data (MAPD) project titled "Topology for Statistical Modeling of Petascale Data", funded by the Office of Science Advanced Scientific Computing Research (ASCR) Applied Math program.
Statistical models for competing risk analysis
International Nuclear Information System (INIS)
Sather, H.N.
1976-08-01
Research results are presented on three new models with potential applications to competing-risks problems. One section covers the basic statistical relationships underlying the subsequent competing-risks model development. Another discusses the problem of comparing cause-specific risk structures, by competing-risks theory, in two homogeneous populations, P1 and P2. Weibull models, which allow more generality than the Berkson and Elveback models, are studied for the effect of time on the hazard function. The use of concomitant information for modeling single-risk survival is extended to the multiple-failure-mode domain of competing risks. The model used to illustrate this methodology is a life-table model with constant hazards within pre-designated intervals of the time scale. Two parametric models for bivariate dependent competing risks, which provide interesting alternatives, are proposed and examined.
Performance modeling, stochastic networks, and statistical multiplexing
Mazumdar, Ravi R
2013-01-01
This monograph presents a concise mathematical approach for modeling and analyzing the performance of communication networks with the aim of introducing an appropriate mathematical framework for modeling and analysis as well as understanding the phenomenon of statistical multiplexing. The models, techniques, and results presented form the core of traffic engineering methods used to design, control and allocate resources in communication networks.The novelty of the monograph is the fresh approach and insights provided by a sample-path methodology for queueing models that highlights the importan
Statistical physics of pairwise probability models
Directory of Open Access Journals (Sweden)
Yasser Roudi
2009-11-01
Statistical models for describing the probability distribution over the states of biological systems are commonly used for dimensional reduction. Among these models, pairwise models are very attractive in part because they can be fit using a reasonable amount of data: knowledge of the means and correlations between pairs of elements in the system is sufficient. Not surprisingly, then, using pairwise models for studying neural data has been the focus of many studies in recent years. In this paper, we describe how tools from statistical physics can be employed for studying and using pairwise models. We build on our previous work on the subject and study the relation between different methods for fitting these models and evaluating their quality. In particular, using data from simulated cortical networks we study how the quality of various approximate methods for inferring the parameters in a pairwise model depends on the time bin chosen for binning the data. We also study the effect of the size of the time bin on the model quality itself, again using simulated data. We show that using finer time bins increases the quality of the pairwise model. We offer new ways of deriving the expressions reported in our previous work for assessing the quality of pairwise models.
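One widely used approximate inference method of the kind discussed in this abstract is naive mean-field inversion, which estimates pairwise couplings from the inverse of the connected correlation matrix. The sketch below applies it to synthetic independent spins, so the true couplings are zero; the data and sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "neural" data: 5 units with states in {-1, +1}, sampled independently,
# so the true pairwise couplings are all zero.
spins = rng.choice([-1.0, 1.0], size=(20000, 5))

C = np.cov(spins, rowvar=False)   # connected correlation matrix
J_nmf = -np.linalg.inv(C)         # naive mean-field coupling estimate
np.fill_diagonal(J_nmf, 0.0)      # self-couplings are not part of the model
```

With independent data the estimated off-diagonal couplings are small, shrinking toward zero as the sample count grows; the paper's point is precisely that the quality of such approximations depends on how the data are binned and sampled.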
Statistical models of petrol engines vehicles dynamics
Ilie, C. O.; Marinescu, M.; Alexa, O.; Vilău, R.; Grosu, D.
2017-10-01
This paper focuses on statistical models of vehicle dynamics. A one-year testing program was designed and performed, using many cars of the same type with gasoline engines and different mileages. Experimental data were collected from onboard sensors and from the engine test stand, and a database containing the data of 64 tests was created. Several mathematical models were developed from the database using the system identification method. Each model is a SISO or MISO linear predictive ARMAX (AutoRegressive Moving-Average with eXogenous inputs) model, i.e. a difference equation with constant coefficients. For each dependency, such as engine torque as output with engine load and intake manifold pressure as inputs, 64 equations were fitted, giving series of 64 values for each model type. The final models were obtained using the average values of the coefficients, and their accuracy was assessed.
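As a hedged sketch of the system-identification step described above, the following fits an ARX model (a simplification of the ARMAX structure, without the moving-average noise term) by ordinary least squares on noiseless synthetic data; the coefficients and signals are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
a1, b1 = 0.7, 2.0                 # "true" model: y[t] = a1*y[t-1] + b1*u[t-1]
u = rng.standard_normal(200)      # exogenous input, e.g. engine load
y = np.zeros(200)
for t in range(1, 200):
    y[t] = a1 * y[t - 1] + b1 * u[t - 1]

# One regression row per time step, with columns [y[t-1], u[t-1]].
X = np.column_stack([y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
```

On noiseless data the least-squares estimate `theta` recovers the true coefficients (0.7, 2.0) essentially exactly; with measurement noise, the moving-average part of a full ARMAX model becomes important.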
Equilibrium statistical mechanics of lattice models
Lavis, David A
2015-01-01
Most interesting and difficult problems in equilibrium statistical mechanics concern models which exhibit phase transitions. For graduate students and more experienced researchers this book provides an invaluable reference source of approximate and exact solutions for a comprehensive range of such models. Part I contains background material on classical thermodynamics and statistical mechanics, together with a classification and survey of lattice models. The geometry of phase transitions is described and scaling theory is used to introduce critical exponents and scaling laws. An introduction is given to finite-size scaling, conformal invariance and Schramm—Loewner evolution. Part II contains accounts of classical mean-field methods. The parallels between Landau expansions and catastrophe theory are discussed and Ginzburg—Landau theory is introduced. The extension of mean-field theory to higher-orders is explored using the Kikuchi—Hijmans—De Boer hierarchy of approximations. In Part III the use of alge...
Statistical shape and appearance models of bones.
Sarkalkan, Nazli; Weinans, Harrie; Zadpoor, Amir A
2014-03-01
When applied to bones, statistical shape models (SSM) and statistical appearance models (SAM) respectively describe the mean shape and mean density distribution of bones within a certain population as well as the main modes of variations of shape and density distribution from their mean values. The availability of this quantitative information regarding the detailed anatomy of bones provides new opportunities for diagnosis, evaluation, and treatment of skeletal diseases. The potential of SSM and SAM has been recently recognized within the bone research community. For example, these models have been applied for studying the effects of bone shape on the etiology of osteoarthritis, improving the accuracy of clinical osteoporotic fracture prediction techniques, design of orthopedic implants, and surgery planning. This paper reviews the main concepts, methods, and applications of SSM and SAM as applied to bone. Copyright © 2013 Elsevier Inc. All rights reserved.
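The mean shape and modes of variation that an SSM encodes are typically obtained by principal component analysis of aligned landmark vectors. The sketch below shows this on synthetic, already-registered landmark data; all names and sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy training set: 20 shapes, each 10 2-D landmarks flattened to a 20-vector,
# generated as a base shape plus small random variation (registration assumed done).
base = rng.standard_normal(20)
shapes = base + 0.1 * rng.standard_normal((20, 20))

mean_shape = shapes.mean(axis=0)
# SVD of the centered data gives the principal modes of shape variation.
U, s, Vt = np.linalg.svd(shapes - mean_shape, full_matrices=False)
modes = Vt                                   # rows are orthonormal modes
weights = (shapes - mean_shape) @ modes.T    # per-shape mode coefficients

# With all modes retained, any training shape is recovered exactly.
recon = mean_shape + weights[0] @ modes
```

In practice only the first few modes (those with the largest singular values) are kept, so that `mean_shape + b @ modes[:k]` spans the statistically plausible shapes of the population.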
Statistical Tests for Mixed Linear Models
Khuri, André I; Sinha, Bimal K
2011-01-01
An advanced discussion of linear models with mixed or random effects. In recent years a breakthrough has occurred in our ability to draw inferences from exact and optimum tests of variance component models, generating much research activity that relies on linear models with mixed and random effects. This volume covers the most important research of the past decade as well as the latest developments in hypothesis testing. It compiles all currently available results in the area of exact and optimum tests for variance component models and offers the only comprehensive treatment for these models a
reporttools: R Functions to Generate LaTeX Tables of Descriptive Statistics
Rufibach, Kaspar
2009-01-01
In statistical analysis reports, tables with descriptive statistics are routinely presented. We introduce the R package reporttools containing functions to efficiently generate such tables when compiling statistical analyses reports by combining LaTeX and R via Sweave.
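reporttools itself is an R package; as a language-agnostic sketch of the same idea, the following Python snippet assembles a small LaTeX table of descriptive statistics. The variable name and data are invented for illustration.

```python
import statistics

def latex_descriptives(name, values):
    """Render one LaTeX table row of descriptive statistics for a variable."""
    row = (name, len(values), min(values), statistics.mean(values),
           statistics.median(values), max(values), statistics.stdev(values))
    return "%s & %d & %.2f & %.2f & %.2f & %.2f & %.2f \\\\" % row

header = ("\\begin{tabular}{lrrrrrr}\n"
          "Variable & n & Min & Mean & Median & Max & SD \\\\\n\\hline")
body = latex_descriptives("age", [23, 35, 31, 40, 28])
table = "\n".join([header, body, "\\end{tabular}"])
```

The resulting `table` string can be written into a Sweave/knitr document, which is exactly the workflow the package automates for whole data frames.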
Cellular automata and statistical mechanical models
International Nuclear Information System (INIS)
Rujan, P.
1987-01-01
The authors elaborate on the analogy between the transfer matrix of usual lattice models and the master equation describing the time development of cellular automata. Transient and stationary properties of probabilistic automata are linked to surface and bulk properties, respectively, of restricted statistical mechanical systems. It is demonstrated that methods of statistical physics can be successfully used to describe the dynamic and the stationary behavior of such automata. Some exact results are derived, including duality transformations, exact mappings, disorder, and linear solutions. Many examples are worked out in detail to demonstrate how to use statistical physics in order to construct cellular automata with desired properties. This approach is considered to be a first step toward the design of fully parallel, probabilistic systems whose computational abilities rely on the cooperative behavior of their components
Statistical modeling of geopressured geothermal reservoirs
Ansari, Esmail; Hughes, Richard; White, Christopher D.
2017-06-01
Identifying attractive candidate reservoirs for producing geothermal energy requires predictive models. In this work, inspectional analysis and statistical modeling are used to create simple predictive models for a line drive design. Inspectional analysis on the partial differential equations governing this design yields a minimum number of fifteen dimensionless groups required to describe the physics of the system. These dimensionless groups are explained and confirmed using models with similar dimensionless groups but different dimensional parameters. This study models dimensionless production temperature and thermal recovery factor as the responses of a numerical model. These responses are obtained by a Box-Behnken experimental design. An uncertainty plot is used to segment the dimensionless time and develop a model for each segment. The important dimensionless numbers for each segment of the dimensionless time are identified using the Boosting method. These selected numbers are used in the regression models. The developed models are reduced to have a minimum number of predictors and interactions. The reduced final models are then presented and assessed using testing runs. Finally, applications of these models are offered. The presented workflow is generic and can be used to translate the output of a numerical simulator into simple predictive models in other research areas involving numerical simulation.
Statistical Modelling of Wind Profiles - Data Analysis and Modelling
DEFF Research Database (Denmark)
Jónsson, Tryggvi; Pinson, Pierre
The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.
Logarithmic transformed statistical models in calibration
International Nuclear Information System (INIS)
Zeis, C.D.
1975-01-01
A general type of statistical model used for calibration of instruments having the property that the standard deviations of the observed values increase as a function of the mean value is described. The application to the Helix Counter at the Rocky Flats Plant is primarily from a theoretical point of view. The Helix Counter measures the amount of plutonium in certain types of chemicals. The method described can be used also for other calibrations. (U.S.)
Statistical model for high energy inclusive processes
International Nuclear Information System (INIS)
Pomorisac, B.
1980-01-01
We propose a statistical model of inclusive processes. The model is an extension of the model proposed by Scalapino and Sugar for inclusive distributions in rapidity. The model is defined in terms of a random variable on the full phase space of the produced particles and in terms of a Lorentz-invariant probability distribution. We suggest that the Lorentz invariance is broken spontaneously; this may describe the observed anisotropy of the inclusive distributions. Based on this model we calculate the distribution in transverse momentum. An explicit calculation is given of the one-particle inclusive cross sections and the two-particle correlation. The results give a fair representation of the shape of one-particle inclusive cross sections, and positive correlation for the particles emitted. The relevance of our results to experiments is discussed.
Statistical analysis of wind speed for electrical power generation
African Journals Online (AJOL)
Keywords: wind speed, probability density function, wind energy conversion system, statistical analyses. 1. INTRODUCTION. In order ..... "Statistical analysis of wind speed distribution based on six Weibull methods for wind power evaluation in Garoua, Cameroon," Revue des Energies Renouvelables, vol. 18, no. 1, pp.
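One of the six Weibull methods cited above, the graphical (least-squares) method, exploits the fact that ln(-ln(1-F(v))) is linear in ln(v) when wind speed v follows a Weibull distribution. The sketch below recovers shape and scale from synthetic wind speeds; the parameter values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
k_true, c_true = 2.0, 8.0                      # shape and scale (m/s)
v = c_true * rng.weibull(k_true, size=5000)    # synthetic wind-speed sample

# Graphical method: for Weibull, ln(-ln(1-F)) = k*ln(v) - k*ln(c),
# so a straight-line fit gives slope k and intercept -k*ln(c).
v_sorted = np.sort(v)
F = (np.arange(1, v.size + 1) - 0.5) / v.size  # median-rank empirical CDF
x = np.log(v_sorted)
y = np.log(-np.log(1.0 - F))
k_hat, intercept = np.polyfit(x, y, 1)
c_hat = np.exp(-intercept / k_hat)
```

With the fitted (k, c) in hand, quantities such as mean wind power density follow directly from Weibull moments, which is the basis of the wind-energy assessments these papers perform.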
Encoding Dissimilarity Data for Statistical Model Building.
Wahba, Grace
2010-12-01
We summarize, review and comment upon three papers which discuss the use of discrete, noisy, incomplete, scattered pairwise dissimilarity data in statistical model building. Convex cone optimization codes are used to embed the objects into a Euclidean space which respects the dissimilarity information while controlling the dimension of the space. A "newbie" algorithm is provided for embedding new objects into this space. This allows the dissimilarity information to be incorporated into a Smoothing Spline ANOVA penalized likelihood model, a Support Vector Machine, or any model that will admit Reproducing Kernel Hilbert Space components, for nonparametric regression, supervised learning, or semi-supervised learning. Future work and open questions are discussed. The papers are: F. Lu, S. Keles, S. Wright and G. Wahba (2005), A framework for kernel regularization with application to protein clustering, Proceedings of the National Academy of Sciences 102, 12332-1233; G. Corrada Bravo, G. Wahba, K. Lee, B. Klein, R. Klein and S. Iyengar (2009), Examining the relative influence of familial, genetic and environmental covariate information in flexible risk models, Proceedings of the National Academy of Sciences 106, 8128-8133; and F. Lu, Y. Lin and G. Wahba, Robust manifold unfolding with kernel regularization, TR 1008, Department of Statistics, University of Wisconsin-Madison.
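A minimal example of embedding objects from pairwise dissimilarities, in the spirit of (though much simpler than) the convex-cone methods reviewed here, is classical multidimensional scaling via double centering and an eigendecomposition. All data below are synthetic, for illustration only.

```python
import numpy as np

def classical_mds(D, dim):
    """Embed n objects in R^dim from an n x n matrix D of pairwise dissimilarities."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    B = -0.5 * J @ (D ** 2) @ J             # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]         # keep the largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

rng = np.random.default_rng(4)
pts = rng.standard_normal((6, 2))
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
emb = classical_mds(D, 2)
D_emb = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
```

When D is exactly Euclidean, the embedded distances reproduce D; for noisy, incomplete dissimilarities the papers above replace this eigendecomposition with convex cone optimization.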
Statistical analysis of wind speed for electrical power generation
African Journals Online (AJOL)
1, 4, 5: Department of Electrical and Electronics Engineering, University of Ilorin, Kwara State, Nigeria. 2: Department of ... Keywords: wind speed, probability density function, wind energy conversion system, statistical analyses. ..... weather data for energy assessments of hybrid.
Wave Generation in Physical Models
DEFF Research Database (Denmark)
Andersen, Thomas Lykke; Frigaard, Peter
The present book describes the most important aspects of wave generation techniques in physical models. Moreover, the book serves as technical documentation for the wave generation software AwaSys 6, cf. Aalborg University (2012). In addition to the two main authors also Tue Hald and Michael...
Statistical Model Checking for Biological Systems
DEFF Research Database (Denmark)
David, Alexandre; Larsen, Kim Guldstrand; Legay, Axel
2014-01-01
Statistical Model Checking (SMC) is a highly scalable simulation-based verification approach for testing and estimating the probability that a stochastic system satisfies a given linear temporal property. The technique has been applied to (discrete and continuous time) Markov chains, stochastic...... timed automata and most recently hybrid systems using the tool Uppaal SMC. In this paper we enable the application of SMC to complex biological systems, by combining Uppaal SMC with ANIMO, a plugin of the tool Cytoscape used by biologists, as well as with SimBiology®, a plugin of Matlab to simulate...
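At its core, SMC estimates the probability that a stochastic system satisfies a property by simulating independent runs, with the number of runs chosen from the Chernoff-Hoeffding bound for a desired precision and confidence. The sketch below is a generic illustration of that idea, not the Uppaal SMC implementation; the stochastic "system" is an invented stand-in.

```python
import math
import random

def smc_estimate(run_once, eps, delta, seed=0):
    """Estimate P(property holds) to within +/- eps with confidence 1 - delta,
    using the Chernoff-Hoeffding sample bound n >= ln(2/delta) / (2 eps^2)."""
    n = math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))
    rng = random.Random(seed)
    return sum(run_once(rng) for _ in range(n)) / n

# Hypothetical system: each simulated run satisfies the property w.p. 0.3.
def run_once(rng):
    return rng.random() < 0.3

p_hat = smc_estimate(run_once, eps=0.05, delta=0.01)
```

Because each run is an independent simulation, this scheme scales to models (hybrid, biological) far beyond the reach of exhaustive model checking, at the cost of statistical rather than exact guarantees.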
Average Nuclear properties based on statistical model
International Nuclear Information System (INIS)
El-Jaick, L.J.
1974-01-01
The gross properties of nuclei were investigated with a statistical model, in systems with equal and with different numbers of protons and neutrons, treated separately, considering the Coulomb energy in the latter case. Some average nuclear properties were calculated based on the energy density of nuclear matter, from the Weizsäcker-Bethe semiempirical mass formula, generalized for compressible nuclei. In the study of the surface energy coefficient, the great influence exercised by the Coulomb energy and the nuclear compressibility was verified. For a good fit of the beta-stability lines and mass excesses, the surface symmetry energy was established. (M.C.K.)
Workshop on Model Uncertainty and its Statistical Implications
1988-01-01
In this book problems related to the choice of models in such diverse fields as regression, covariance structure, time series analysis and multinomial experiments are discussed. The emphasis is on the statistical implications for model assessment when the assessment is done with the same data that generated the model. This is a problem of long standing, notorious for its difficulty. Some contributors discuss this problem in an illuminating way. Others, and this is a truly novel feature, investigate systematically whether sample re-use methods like the bootstrap can be used to assess the quality of estimators or predictors in a reliable way given the initial model uncertainty. The book should prove to be valuable for advanced practitioners and statistical methodologists alike.
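The sample re-use question the workshop investigates, whether the bootstrap can reliably assess the quality of an estimator from the same data that produced it, can be sketched minimally. The data values below are made up for illustration; for the sample mean the bootstrap standard error can be checked against the analytic one.

```python
import random
import statistics

def bootstrap_se(data, stat=statistics.mean, n_boot=1000, seed=0):
    """Standard error of `stat` estimated by resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    reps = [stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)]
    return statistics.stdev(reps)

data = [2.1, 2.5, 1.9, 3.0, 2.7, 2.2, 2.8, 2.4, 2.6, 2.3]
se_boot = bootstrap_se(data)
se_theory = statistics.stdev(data) / len(data) ** 0.5  # analytic SE of the mean
```

For statistics without a closed-form standard error (the interesting case in the book), only the resampling estimate is available, which is precisely why its reliability matters.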
Modelling gas generation for landfill.
Chakma, Sumedha; Mathur, Shashi
2017-06-01
A methodology was developed to predict the optimum long-term spatial and temporal generation of landfill gases such as methane, carbon dioxide, ammonia, and hydrogen sulphide at a post-closure landfill. The model incorporates the chemical and biochemical processes responsible for the degradation of municipal solid waste, and takes into account the effects of heterogeneity across the different layers observed in the landfill morphology at the site. The important parameters for gas generation due to biodegradation, such as temperature, pH, and moisture content, were incorporated. The maximum and minimum generations of methane and hydrogen sulphide were observed. The rate of gas generation was found to be almost the same throughout the depth 30 years after landfill closure. The proposed model would be very useful in landfill engineering for mining landfill gas and for the proper design of landfill gas management systems.
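The authors' model is more detailed than this, but the standard starting point for landfill gas prediction is a first-order decay law (LandGEM-style), which conveys the long-term temporal behaviour the abstract discusses. The rate constant and generation potential below are assumed illustrative values, not the paper's parameters.

```python
import math

def methane_generation(t_years, waste_mass_mg, k=0.05, L0=100.0):
    """First-order decay estimate of annual methane generation (m^3/yr)
    from one batch of waste, t_years after placement.
    k: decay rate (1/yr); L0: methane generation potential (m^3/Mg)."""
    return k * L0 * waste_mass_mg * math.exp(-k * t_years)

# Generation profile for 1000 Mg of waste over 50 post-closure years.
rates = [methane_generation(t, waste_mass_mg=1000.0) for t in range(0, 51, 10)]
```

A full site model sums such curves over every yearly waste batch, which is where spatial layering and heterogeneity enter.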
Association testing for next-generation sequencing data using score statistics
DEFF Research Database (Denmark)
Skotte, Line; Korneliussen, Thorfinn Sand; Albrechtsen, Anders
2012-01-01
Association mapping methods that take the uncertainty of genotype calls into account have been proposed; most require numerical optimization, which for large-scale data is not always computationally feasible. We show that using a score statistic for the joint likelihood of observed phenotypes and observed sequencing data provides an attractive approach to association testing for next-generation sequencing data, one that remains computationally feasible due to the use of score statistics. As part of the joint likelihood, we model the distribution of the phenotypes using a generalized linear model framework, which works for both quantitative and discrete phenotypes; thus, the method presented here is applicable to case-control studies...... The joint model accounts for the genotype classification uncertainty via the posterior probabilities of the genotypes given the observed sequencing data, which gives the approach higher power than methods based on called genotypes.
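A stripped-down version of the idea, for a quantitative trait under a linear model, is to replace called genotypes with expected dosages computed from the genotype posterior probabilities and form a score statistic, which needs no per-marker optimization. This is an illustrative sketch only; the authors' joint-likelihood GLM framework is more general.

```python
def score_test_dosage(phenos, post_probs):
    """Score statistic for association between a quantitative phenotype and a
    variant, using expected genotype dosages E[G | sequencing data].
    post_probs[i] = (P(G=0), P(G=1), P(G=2)) for individual i.
    Under H0 the statistic is approximately chi-square with 1 df."""
    n = len(phenos)
    dosages = [p1 + 2 * p2 for (_, p1, p2) in post_probs]
    y_bar = sum(phenos) / n
    d_bar = sum(dosages) / n
    u = sum((y - y_bar) * (d - d_bar) for y, d in zip(phenos, dosages))
    var_y = sum((y - y_bar) ** 2 for y in phenos) / n
    v = var_y * sum((d - d_bar) ** 2 for d in dosages)
    return u * u / v if v > 0 else 0.0

# Toy data: five individuals confidently homozygous reference, five homozygous
# alternate, with perfectly associated phenotypes (hypothetical values).
phenos = [0.0] * 5 + [2.0] * 5
post_probs = [(1.0, 0.0, 0.0)] * 5 + [(0.0, 0.0, 1.0)] * 5
stat = score_test_dosage(phenos, post_probs)
```

With uncertain posteriors the dosages shrink toward their mean, which is how genotype uncertainty propagates into the test.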
Investigating the statistical properties of user-generated documents
Inches, Giacomo; Carman, Mark J.; Crestani, Fabio
2011-01-01
The importance of the Internet as a communication medium is reflected in the large amount of documents being generated every day by users of the different services that take place online. In this work we aim at analyzing the properties of these online user-generated documents for some of the established services over the Internet (Kongregate, Twitter, Myspace and Slashdot) and comparing them with a consolidated collection of standard information retrieval documents (from the Wall Street Journal)...
Statistical modeling to support power system planning
Staid, Andrea
This dissertation focuses on data-analytic approaches that improve our understanding of power system applications to promote better decision-making. It tackles issues of risk analysis, uncertainty management, resource estimation, and the impacts of climate change. Tools of data mining and statistical modeling are used to bring new insight to a variety of complex problems facing today's power system. The overarching goal of this research is to improve the understanding of the power system risk environment for improved operation, investment, and planning decisions. The first chapter introduces some challenges faced in planning for a sustainable power system. Chapter 2 analyzes the driving factors behind the disparity in wind energy investments among states with a goal of determining the impact that state-level policies have on incentivizing wind energy. Findings show that policy differences do not explain the disparities; physical and geographical factors are more important. Chapter 3 extends conventional wind forecasting to a risk-based focus of predicting maximum wind speeds, which are dangerous for offshore operations. Statistical models are presented that issue probabilistic predictions for the highest wind speed expected in a three-hour interval. These models achieve a high degree of accuracy and their use can improve safety and reliability in practice. Chapter 4 examines the challenges of wind power estimation for onshore wind farms. Several methods for wind power resource assessment are compared, and the weaknesses of the Jensen model are demonstrated. For two onshore farms, statistical models outperform other methods, even when very little information is known about the wind farm. Lastly, chapter 5 focuses on the power system more broadly in the context of the risks expected from tropical cyclones in a changing climate. Risks to U.S. power system infrastructure are simulated under different scenarios of tropical cyclone behavior that may result from climate
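Chapter 3's risk-based forecasting of maximum wind speeds is an extreme-value problem. As a hedged illustration (not necessarily the models used in the dissertation), a Gumbel distribution can be fitted to block maxima by the method of moments and used to issue exceedance probabilities; the wind-speed values below are synthetic.

```python
import math
import statistics

def fit_gumbel(maxima):
    """Method-of-moments fit of a Gumbel distribution to block maxima.
    Uses the Euler-Mascheroni constant 0.5772 for the location shift."""
    beta = statistics.stdev(maxima) * math.sqrt(6) / math.pi  # scale
    mu = statistics.mean(maxima) - 0.5772 * beta              # location
    return mu, beta

def prob_exceed(x, mu, beta):
    """P(block maximum > x) under the fitted Gumbel distribution."""
    return 1.0 - math.exp(-math.exp(-(x - mu) / beta))

# Synthetic three-hour maximum wind speeds (m/s), for illustration only.
maxima = [18.2, 21.5, 17.9, 25.1, 19.4, 22.8, 20.3, 23.6, 18.8, 21.0]
mu, beta = fit_gumbel(maxima)
```

An operator could then compare, say, `prob_exceed(25, mu, beta)` against a safety threshold before scheduling offshore work.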
Statistical mechanics of helical wormlike chain model
Liu, Ya; Pérez, Toni; Li, Wei; Gunton, J. D.; Green, Amanda
2011-02-01
We investigate the statistical mechanics of polymers with bending and torsional elasticity described by the helical wormlike model. Noticing that the energy function is factorizable, we provide a numerical method to solve the model using a transfer matrix formulation. The tangent-tangent and binormal-binormal correlation functions have been calculated and display rich profiles that are sensitive to the combination of the temperature and the equilibrium torsion. Their behavior indicates that there is no finite-temperature Lifshitz point between the disordered and helical phases. The asymptotic behavior at low temperature has been investigated theoretically and the predictions fit the numerical results very well. Our analysis could be used to understand the statics of dsDNA and other chiral polymers.
Statistical Mechanics of Helical Wormlike Model
Liu, Ya; Perez, Toni; Li, Wei; Gunton, James; Green, Amanda
2011-03-01
The bending and torsional elasticities are crucial in determining the static and dynamic properties of biopolymers such as dsDNA and sickle hemoglobin. We investigate the statistical mechanics of stiff polymers described by the helical wormlike model. We provide a numerical method to solve the model using a transfer matrix formulation. The correlation functions have been calculated and display rich profiles which are sensitive to the combination of the temperature and the equilibrium torsion. The asymptotic behavior at low temperature has been investigated theoretically and the predictions fit the numerical results very well. Our analysis could be used to understand the statics of dsDNA and other chiral polymers. This work is supported by grants from the NSF and Mathers Foundation.
A Review of Modeling Bioelectrochemical Systems: Engineering and Statistical Aspects
Directory of Open Access Journals (Sweden)
Shuai Luo
2016-02-01
Full Text Available Bioelectrochemical systems (BES) are promising technologies to convert organic compounds in wastewater to electrical energy through a series of complex physical-chemical, biological and electrochemical processes. Representative BES such as microbial fuel cells (MFCs) have been studied and advanced for energy recovery. Substantial experimental and modeling efforts have been made to investigate the processes involved in electricity generation toward improving BES performance for practical applications. However, many parameters can potentially affect these processes, making the optimization of system performance hard to achieve. Mathematical models, including engineering models and statistical models, are powerful tools to help understand the interactions among the parameters in BES and to optimize BES configuration and operation. This review paper aims to introduce and discuss the recent developments of BES modeling from engineering and statistical aspects, including analysis of the model structure, description of application cases and sensitivity analysis of various parameters. It is expected to serve as a compass for integrating the engineering and statistical modeling strategies to improve model accuracy for BES development.
Atmospheric corrosion: statistical validation of models
International Nuclear Information System (INIS)
Diaz, V.; Martinez-Luaces, V.; Guineo-Cobs, G.
2003-01-01
In this paper we discuss two different methods for validating regression models, applied to corrosion data. One is based on the correlation coefficient and the other is the statistical test of lack of fit. Both methods are used here to analyse the fit of the bilogarithmic model for predicting corrosion of very low carbon steel substrates in rural and urban-industrial atmospheres in Uruguay. Results for parameters A and n of the bilogarithmic model are reported. For this purpose, all repeated values were used instead of the usual average values. Modelling is carried out using experimental data corresponding to steel substrates under the same initial meteorological conditions (in fact, they are put in the rack at the same time). Results of the correlation coefficient are compared with the lack-of-fit test at two different significance levels (α=0.01 and α=0.05). Unexpected differences between them are explained and, finally, it is possible to conclude, at least for the studied atmospheres, that the bilogarithmic model does not properly fit the experimental data. (Author) 18 refs
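The lack-of-fit test used here relies on the repeated values the abstract mentions: replicates at the same x level give a pure-error sum of squares against which the model's residual can be compared. The sketch below does this for a simple linear fit with synthetic data, not the Uruguayan corrosion measurements; the resulting F statistic would be compared to an F(df_lof, df_pe) critical value.

```python
def lack_of_fit_F(xs, ys):
    """Pure-error lack-of-fit F statistic for a straight-line fit
    to data with replicated x values."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
    a = ybar - b * xbar
    groups = {}                              # replicates grouped by x level
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    m = len(groups)                          # number of distinct x levels
    ss_pe = sum(sum((y - sum(g) / len(g)) ** 2 for y in g)
                for g in groups.values())    # pure error
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_lof = ss_res - ss_pe                  # lack of fit
    df_lof, df_pe = m - 2, n - m
    return (ss_lof / df_lof) / (ss_pe / df_pe)

xs = [1, 1, 2, 2, 3, 3, 4, 4]
ys_line = [2.0, 2.2, 4.0, 4.2, 6.0, 6.2, 8.0, 8.2]    # truly linear data
ys_curve = [1.0, 1.1, 4.0, 4.1, 9.0, 9.1, 16.0, 16.1]  # quadratic data
F_line = lack_of_fit_F(xs, ys_line)
F_curve = lack_of_fit_F(xs, ys_curve)
```

A large F says the scatter about the fitted model exceeds what replicate-to-replicate noise can explain, which is the conclusion the paper reaches for the bilogarithmic model.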
MSMBuilder: Statistical Models for Biomolecular Dynamics.
Harrigan, Matthew P; Sultan, Mohammad M; Hernández, Carlos X; Husic, Brooke E; Eastman, Peter; Schwantes, Christian R; Beauchamp, Kyle A; McGibbon, Robert T; Pande, Vijay S
2017-01-10
MSMBuilder is a software package for building statistical models of high-dimensional time-series data. It is designed with a particular focus on the analysis of atomistic simulations of biomolecular dynamics such as protein folding and conformational change. MSMBuilder is named for its ability to construct Markov state models (MSMs), a class of models that has gained favor among computational biophysicists. In addition to both well-established and newer MSM methods, the package includes complementary algorithms for understanding time-series data such as hidden Markov models and time-structure based independent component analysis. MSMBuilder boasts an easy-to-use command-line interface, as well as clear and consistent abstractions through its Python application programming interface. MSMBuilder was developed with careful consideration for compatibility with the broader machine learning community by following the design of scikit-learn. The package is used primarily by practitioners of molecular dynamics, but is just as applicable to other computational or experimental time-series measurements. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.
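The core object MSMBuilder constructs, a Markov state model, can be sketched without the library: count transitions at a fixed lag time in a discretized trajectory and row-normalize to get a transition matrix. This is an illustration of the concept only, not MSMBuilder code; its actual estimators add ergodic trimming, reversibility constraints, and more.

```python
def estimate_msm(traj, n_states, lag=1):
    """Estimate an MSM transition matrix from a discrete state trajectory
    by counting transitions observed at the given lag time."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(traj[:-lag], traj[lag:]):
        counts[i][j] += 1
    T = []
    for row in counts:
        total = sum(row)
        # Unvisited states get a uniform row so T stays stochastic.
        T.append([c / total if total else 1.0 / n_states for c in row])
    return T

# Hypothetical clustered trajectory over three conformational states.
traj = [0, 0, 1, 1, 0, 0, 0, 1, 2, 2, 1, 0, 0, 1, 1, 2, 2, 2, 1, 0]
T = estimate_msm(traj, n_states=3)
```

Eigenanalysis of T then yields the slow relaxation timescales that make MSMs useful for folding studies.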
Spherical Process Models for Global Spatial Statistics
Jeong, Jaehong
2017-11-28
Statistical models used in geophysical, environmental, and climate science applications must reflect the curvature of the spatial domain in global data. Over the past few decades, statisticians have developed covariance models that capture the spatial and temporal behavior of these global data sets. Though the geodesic distance is the most natural metric for measuring distance on the surface of a sphere, mathematical limitations have compelled statisticians to use the chordal distance to compute the covariance matrix in many applications instead, which may cause physically unrealistic distortions. Therefore, covariance functions directly defined on a sphere using the geodesic distance are needed. We discuss the issues that arise when dealing with spherical data sets on a global scale and provide references to recent literature. We review the current approaches to building process models on spheres, including the differential operator, the stochastic partial differential equation, the kernel convolution, and the deformation approaches. We illustrate realizations obtained from Gaussian processes with different covariance structures and the use of isotropic and nonstationary covariance models through deformations and geographical indicators for global surface temperature data. To assess the suitability of each method, we compare their log-likelihood values and prediction scores, and we end with a discussion of related research problems.
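The geodesic-versus-chordal contrast drawn above can be made concrete: the chordal distance cuts through the sphere and always underestimates the great-circle distance, increasingly so at large separations. A minimal sketch, assuming a spherical Earth of radius 6371 km:

```python
import math

def geodesic(lat1, lon1, lat2, lon2, radius=6371.0):
    """Great-circle (geodesic) distance in km, via the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlon = p2 - p1, math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))

def chordal(lat1, lon1, lat2, lon2, radius=6371.0):
    """Straight-line (chordal) distance through the sphere, in km."""
    theta = geodesic(lat1, lon1, lat2, lon2, radius) / radius  # central angle
    return 2 * radius * math.sin(theta / 2)

d_g = geodesic(0, 0, 0, 90)   # a quarter of the equator
d_c = chordal(0, 0, 0, 90)
```

Plugging chordal distances into a covariance function valid only for geodesic distance (or vice versa) is exactly the source of the "physically unrealistic distortions" the abstract warns about.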
Probabilistic Forecasting of Photovoltaic Generation: An Efficient Statistical Approach
DEFF Research Database (Denmark)
Wan, Can; Lin, Jin; Song, Yonghua
2017-01-01
This letter proposes a novel efficient probabilistic forecasting approach to accurately quantify the variability and uncertainty of the power production from photovoltaic (PV) systems. Distinguished from most existing models, a linear programming based prediction interval construction model for PV...
Kim, M.
2016-12-01
A GCM (General Circulation Model) is a basic and fundamental tool for predicting future climate conditions. Unfortunately, its output is too coarse in space and time to be directly applied in application fields. Currently, a large amount of research is focused on closing the gap between the resolutions of GCM output and application data. This process is called downscaling, for which many methods have been proposed in dynamical and statistical contexts. Statistical downscaling methods are frequently employed in hydrological and agricultural studies since they can rapidly downscale GCM output without expensive computational costs. Among the many statistical downscaling methods, the weather generator is an attractive one, producing climate data at a daily time scale for a local region. However, most weather generators are originally designed to simulate local weather based on the climatology of the observation period; in particular, the inter-annual variability of climate is not taken into account. In this study, we develop a new weather generator linked with the large-scale climate in order to reflect seasonal predictions in the weather simulation. The basic idea is to parametrize local climate characteristics in the underlying weather generator model and then to link them with large-scale climatic variables, so that the parameter values, representing the condition of the local climate system, vary according to the state of the large-scale climate. We illustrate this with an application to a Korean basin. The local climate characteristics under consideration are the monthly means of daily maximum/minimum temperatures adjusted for the precipitation effect, the mean dry-spell length, and the precipitation intensity. The link between local and large-scale climate is quantified by a regression model.
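The parametrize-then-link idea can be sketched with the simplest daily weather generator: a two-state (dry/wet) Markov chain for precipitation occurrence with exponential amounts, where one local parameter is tied to a large-scale climate index. Everything here, the parameter values, the linear link, and the index itself, is a toy assumption, not the paper's model.

```python
import random

def simulate_month(p_wd, p_ww, mean_amount, n_days=30, rng=None):
    """Two-state Markov-chain daily precipitation generator.
    p_wd: P(wet | yesterday dry); p_ww: P(wet | yesterday wet);
    wet-day amounts drawn from an exponential with the given mean (mm)."""
    rng = rng or random.Random(0)
    wet, series = False, []
    for _ in range(n_days):
        wet = rng.random() < (p_ww if wet else p_wd)
        series.append(rng.expovariate(1.0 / mean_amount) if wet else 0.0)
    return series

def linked_params(climate_index, base_p_wd=0.2, sensitivity=0.1):
    """Toy link: a wetter large-scale state raises the dry-to-wet
    transition probability (clipped to a valid range)."""
    return min(0.95, max(0.05, base_p_wd + sensitivity * climate_index))

# A hypothetical seasonal forecast index of +1.5 shifts the local parameter.
series = simulate_month(linked_params(1.5), p_ww=0.6, mean_amount=8.0)
```

In the paper this link is a fitted regression on large-scale variables rather than a fixed sensitivity, but the mechanism, large-scale state modulating generator parameters, is the same.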
Generative Anatomy Modeling Language (GAML).
Demirel, Doga; Yu, Alexander; Baer-Cooper, Seth; Halic, Tansel; Bayrak, Coskun
2017-12-01
This paper presents the Generative Anatomy Modeling Language (GAML) for generating variations of 3D virtual human anatomy in real time. The framework provides a set of operators for modifying a reference base 3D anatomy. The perturbation of the 3D models is subject to nonlinear geometric constraints so as to produce an authentic human anatomy. GAML was used to create difficult 3D anatomical scenarios for virtual simulation of airway management techniques such as Endotracheal Intubation (ETI) and Cricothyroidotomy (CCT). Difficult scenarios for each technique were defined and the model variations procedurally created with GAML. This study presents details of the GAML design, its set of operators, and its types of constraints. Cases of CCT and ETI difficulty were generated and confirmed by expert surgeons. Execution remained real-time even as constraint complexity increased under nonlinear programming. Copyright © 2017 John Wiley & Sons, Ltd.
Statistical model for OCT image denoising
Li, Muxingzi
2017-08-01
Optical coherence tomography (OCT) is a non-invasive technique with a large array of applications in clinical imaging and biological tissue visualization. However, the presence of speckle noise affects the analysis of OCT images and their diagnostic utility. In this article, we introduce a new OCT denoising algorithm. The proposed method is founded on a numerical optimization framework based on maximum-a-posteriori estimate of the noise-free OCT image. It combines a novel speckle noise model, derived from local statistics of empirical spectral domain OCT (SD-OCT) data, with a Huber variant of total variation regularization for edge preservation. The proposed approach exhibits satisfying results in terms of speckle noise reduction as well as edge preservation, at reduced computational cost.
Current algebra, statistical mechanics and quantum models
Vilela Mendes, R.
2017-11-01
Results obtained in the past for free boson systems at zero and nonzero temperatures are revisited to clarify the physical meaning of current algebra reducible functionals which are associated to systems with density fluctuations, leading to observable effects on phase transitions. To use current algebra as a tool for the formulation of quantum statistical mechanics amounts to the construction of unitary representations of diffeomorphism groups. Two mathematically equivalent procedures exist for this purpose. One searches for quasi-invariant measures on configuration spaces, the other for a cyclic vector in Hilbert space. Here, one argues that the second approach is closer to the physical intuition when modelling complex systems. An example of application of the current algebra methodology to the pairing phenomenon in two-dimensional fermion systems is discussed.
Statistically generated events and the fluid-dynamical expectation in high energy nucleon collisions
International Nuclear Information System (INIS)
Csernai, L.P.; Randrup, J.; Fai, G.
1984-01-01
Experimental developments point in the direction of measuring exclusive quantities in high-energy nuclear collisions. On the theory side, a computer simulation model has recently been put forward to generate complete (exclusive) events statistically. In the present work this model, together with fluid-dynamical results, is used to see how the formation of composite fragments, the finiteness of the multiplicity, and the statistical fluctuations in the final states affect the event analysis. From a series of detailed three-dimensional fluid-dynamical calculations, certain gross features are extracted and used to give an approximate characterization of the final fluid-dynamical state of the collision in terms of a few subsystems (sources): a participant source and up to two spectator sources.
Phenomenological Model of Vortex Generators
DEFF Research Database (Denmark)
Hansen, Martin Otto Laver; Westergaard, C.
1995-01-01
For some time attempts have been made to improve the power curve of stall-regulated wind turbines by using devices like vortex generators (VGs) and Gurney flaps. The vortex produces an additional mixing of the boundary layer and the free stream, thereby increasing the momentum close to the wall, which delays separation in adverse pressure gradient regions. A model is needed to include the effect of vortex generators in numerical computations of the viscous flow past rotors. In this paper a simple model is proposed.
Instance-Based Generative Biological Shape Modeling.
Peng, Tao; Wang, Wei; Rohde, Gustavo K; Murphy, Robert F
2009-01-01
Biological shape modeling is an essential task that is required for systems biology efforts to simulate complex cell behaviors. Statistical learning methods have been used to build generative shape models based on reconstructive shape parameters extracted from microscope image collections. However, such parametric modeling approaches are usually limited to simple shapes and easily-modeled parameter distributions. Moreover, to maximize the reconstruction accuracy, significant effort is required to design models for specific datasets or patterns. We have therefore developed an instance-based approach to model biological shapes within a shape space built upon diffeomorphic measurement. We also designed a recursive interpolation algorithm to probabilistically synthesize new shape instances using the shape space model and the original instances. The method is quite generalizable and therefore can be applied to most nuclear, cell and protein object shapes, in both 2D and 3D.
New advances in statistical modeling and applications
Santos, Rui; Oliveira, Maria; Paulino, Carlos
2014-01-01
This volume presents selected papers from the XIXth Congress of the Portuguese Statistical Society, held in the town of Nazaré, Portugal, from September 28 to October 1, 2011. All contributions were selected after a thorough peer-review process. It covers a broad range of papers in the areas of statistical science, probability and stochastic processes, extremes and statistical applications.
A statistical model for predicting muscle performance
Byerly, Diane Leslie De Caix
The objective of these studies was to develop a capability for predicting muscle performance and fatigue to be utilized for both space- and ground-based applications. To develop this predictive model, healthy test subjects performed a defined, repetitive dynamic exercise to failure using a Lordex spinal machine. Throughout the exercise, surface electromyography (SEMG) data were collected from the erector spinae using a Mega Electronics ME3000 muscle tester and surface electrodes placed on both sides of the back muscle. These data were analyzed using a 5th order Autoregressive (AR) model and statistical regression analysis. It was determined that an AR derived parameter, the mean average magnitude of AR poles, significantly correlated with the maximum number of repetitions (designated Rmax) that a test subject was able to perform. Using the mean average magnitude of AR poles, a test subject's performance to failure could be predicted as early as the sixth repetition of the exercise. This predictive model has the potential to provide a basis for improving post-space flight recovery, monitoring muscle atrophy in astronauts and assessing the effectiveness of countermeasures, monitoring astronaut performance and fatigue during Extravehicular Activity (EVA) operations, providing pre-flight assessment of the ability of an EVA crewmember to perform a given task, improving the design of training protocols and simulations for strenuous International Space Station assembly EVA, and enabling EVA work task sequences to be planned enhancing astronaut performance and safety. Potential ground-based, medical applications of the predictive model include monitoring muscle deterioration and performance resulting from illness, establishing safety guidelines in the industry for repetitive tasks, monitoring the stages of rehabilitation for muscle-related injuries sustained in sports and accidents, and enhancing athletic performance through improved training protocols while reducing
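The study fits a 5th-order autoregressive model to SEMG data and uses the mean magnitude of the AR poles as the predictor. As a reduced, hedged illustration (AR(2) on a synthetic signal, rather than AR(5) on SEMG), the coefficients can be fitted by least squares and the pole magnitudes recovered from the characteristic polynomial:

```python
import cmath

def fit_ar2(x):
    """Least-squares fit of x[t] = a1*x[t-1] + a2*x[t-2] + e[t],
    solving the 2x2 normal equations directly."""
    s11 = sum(v * v for v in x[1:-1])
    s22 = sum(v * v for v in x[:-2])
    s12 = sum(a * b for a, b in zip(x[1:-1], x[:-2]))
    r1 = sum(a * b for a, b in zip(x[2:], x[1:-1]))
    r2 = sum(a * b for a, b in zip(x[2:], x[:-2]))
    det = s11 * s22 - s12 * s12
    return (r1 * s22 - r2 * s12) / det, (r2 * s11 - r1 * s12) / det

def mean_pole_magnitude(a1, a2):
    """Mean magnitude of the roots of z^2 - a1*z - a2 (the AR poles)."""
    d = cmath.sqrt(a1 * a1 + 4 * a2)
    return (abs((a1 + d) / 2) + abs((a1 - d) / 2)) / 2

# Noise-free damped oscillation generated by a known AR(2) recurrence.
x = [1.0, 0.5]
for _ in range(48):
    x.append(0.9 * x[-1] - 0.5 * x[-2])
a1, a2 = fit_ar2(x)
mag = mean_pole_magnitude(a1, a2)
```

In the study's setting, tracking this magnitude repetition by repetition is what allowed failure (Rmax) to be predicted from the sixth repetition onward.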
Efficient Parallel Statistical Model Checking of Biochemical Networks
Directory of Open Access Journals (Sweden)
Paolo Ballarini
2009-12-01
Full Text Available We consider the problem of verifying stochastic models of biochemical networks against behavioral properties expressed in temporal logic terms. Exact probabilistic verification approaches, such as CSL/PCTL model checking, are undermined by a huge computational demand which rules them out for most real case studies. Less demanding approaches, such as statistical model checking, estimate the likelihood that a property is satisfied by sampling executions from the stochastic model. We propose a methodology for efficiently estimating the likelihood that an LTL property P holds for a stochastic model of a biochemical network. As with other statistical verification techniques, the methodology we propose uses a stochastic simulation algorithm for generating execution samples; however, there are three key aspects that improve the efficiency: first, the sample generation is driven by on-the-fly verification of P, which results in optimal overall simulation time. Second, the confidence interval estimation for the probability that P holds is based on an efficient variant of the Wilson method, which ensures faster convergence. Third, the whole methodology is designed in a parallel fashion, and a prototype software tool has been implemented that performs the sampling/verification process in parallel on an HPC architecture.
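The Wilson interval the authors build on (they use an efficient variant; the standard form is sketched here) gives better-behaved confidence bounds than the normal approximation when the satisfaction probability is near 0 or 1, exactly the regime of rare-event properties:

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion.
    Unlike the normal-approximation interval, it never leaves [0, 1]
    and behaves sensibly for extreme proportions."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half
```

For example, with 0 satisfying runs out of 10 the Wilson interval still gives an informative nonzero upper bound, whereas the normal approximation collapses to a width of zero.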
Chnani, M; Maker, H; Pera, MC; Candusso, D; Hissel, D
2005-01-01
A polymer electrolyte fuel cell is an alternative technology for powering electrical vehicles. As simulation is an essential step in developing an efficient power train, a fuel cell generator model has been developed with this aim in view. The electrical response is considered as a series of quasi-static states, according to a semi-empirical approach. The hydraulic behaviour of the fluid line components is based on an electrical analogy: they are represented by RC circuits. Parameters of the model are id...
Modeling statistical properties of written text.
Directory of Open Access Journals (Sweden)
M Angeles Serrano
Full Text Available Written text is one of the fundamental manifestations of human language, and the study of its universal regularities can give clues about how our brains process information and how we, as a society, organize and share it. Among these regularities, only Zipf's law has been explored in depth. Other basic properties, such as the existence of bursts of rare words in specific documents, have only been studied independently of each other and mainly by descriptive models. As a consequence, there is a lack of understanding of linguistic processes as complex emergent phenomena. Beyond Zipf's law for word frequencies, here we focus on burstiness, Heaps' law describing the sublinear growth of vocabulary size with the length of a document, and the topicality of document collections, which encode correlations within and across documents absent in random null models. We introduce and validate a generative model that explains the simultaneous emergence of all these patterns from simple rules. As a result, we find a connection between the bursty nature of rare words and the topical organization of texts and identify dynamic word ranking and memory across documents as key mechanisms explaining the nontrivial organization of written text. Our research can have broad implications and practical applications in computer science, cognitive science and linguistics.
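Heaps' law, one of the regularities discussed above, is easy to measure: track vocabulary size as a function of tokens read. The sketch below does this on a synthetic Zipf-like stream (a 500-word vocabulary with weight 1/rank, an assumption for illustration), where the sublinear growth is visible directly.

```python
import random

def heaps_curve(tokens, step=500):
    """Vocabulary size V(n) sampled every `step` tokens (Heaps' law curve)."""
    seen, curve = set(), []
    for i, tok in enumerate(tokens, 1):
        seen.add(tok)
        if i % step == 0:
            curve.append((i, len(seen)))
    return curve

# Synthetic Zipf-like stream: the word of rank r is drawn with weight 1/r.
rng = random.Random(42)
ranks = list(range(1, 501))
tokens = rng.choices(ranks, weights=[1.0 / r for r in ranks], k=5000)
curve = heaps_curve(tokens)
```

On real corpora the same curve is typically fit with V(n) ≈ K·n^β, β < 1; the generative model in the paper reproduces this together with burstiness and topicality.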
Directory of Open Access Journals (Sweden)
Rochelle E. Tractenberg
2016-12-01
Full Text Available Statistical literacy is essential to an informed citizenry; and two emerging trends highlight a growing need for training that achieves this literacy. The first trend is towards “big” data: while automated analyses can exploit massive amounts of data, the interpretation—and possibly more importantly, the replication—of results are challenging without adequate statistical literacy. The second trend is that science and scientific publishing are struggling with insufficient/inappropriate statistical reasoning in writing, reviewing, and editing. This paper describes a model for statistical literacy (SL) and its development that can support modern scientific practice. An established curriculum development and evaluation tool—the Mastery Rubric—is integrated with a new, developmental, model of statistical literacy that reflects the complexity of reasoning and habits of mind that scientists need to cultivate in order to recognize, choose, and interpret statistical methods. This developmental model provides actionable evidence, and explicit opportunities for consequential assessment that serves students, instructors, developers/reviewers/accreditors of a curriculum, and institutions. By supporting the enrichment, rather than increasing the amount, of statistical training in the basic and life sciences, this approach supports curriculum development, evaluation, and delivery to promote statistical literacy for students and a collective quantitative proficiency more broadly.
Model of reverse steam generator
International Nuclear Information System (INIS)
Malasek, V.; Manek, O.; Masek, V.; Riman, J.
1987-01-01
The claim of Czechoslovak discovery no. 239272 is a model designed for verifying the properties of a reverse steam generator during the penetration of water, steam-water mixture or steam into liquid metal flowing inside the heat exchange tubes. The design may primarily be used for steam generators with a built-in inter-tube structure. The model is provided with several injection devices placed in different heat exchange tubes, spaced at different distances along the model axis. The design consists in placing transverse partitions between the pressure casing and the circumferential casings, such that each chamber formed by the circumferential casings, the pressure casing and two adjoining partitions contains only one passage of an injection device through the inter-tube space. (Z.M.). 1 fig
Statistical modelling of transcript profiles of differentially regulated genes
Directory of Open Access Journals (Sweden)
Sergeant Martin J
2008-07-01
allowed 11% of the Escherichia coli features to be fitted by an exponential function, and 25% of the Rattus norvegicus features could be described by the critical exponential model, all with statistical significance. Conclusion: The statistical non-linear regression approaches presented in this study provide detailed, biologically oriented descriptions of individual gene expression profiles, using biologically variable data to generate a set of defining parameters. These approaches have application to the modelling and greater interpretation of profiles obtained across a wide range of platforms, such as microarrays. Through careful choice of appropriate model forms, such statistical regression approaches allow an improved comparison of gene expression profiles, and may provide an approach for the greater understanding of common regulatory mechanisms between genes.
Statistical inference of the generation probability of T-cell receptors from sequence repertoires.
Murugan, Anand; Mora, Thierry; Walczak, Aleksandra M; Callan, Curtis G
2012-10-02
Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.
A Knowledge Generation Model via the Hypernetwork
Liu, Jian-Guo; Yang, Guang-Yong; Hu, Zhao-Long
2014-01-01
The influence of the statistical properties of the network on knowledge diffusion has been extensively studied. However, structure evolution and knowledge generation always proceed simultaneously and are integrated with each other. By introducing the Cobb-Douglas production function and treating knowledge growth as a cooperative production of knowledge, in this paper we present two knowledge-generation dynamic evolving models based on different evolving mechanisms. The first model, named the "HDPH model," adopts the hyperedge growth and hyperdegree preferential attachment mechanisms. The second model, named the "KSPH model," adopts the hyperedge growth and knowledge stock preferential attachment mechanisms. We investigate the effect of the parameters on the total knowledge stock of the two models. The hyperdegree distribution of the HDPH model can be analyzed theoretically by mean-field theory. The analytic result indicates that the hyperdegree distribution of the HDPH model obeys the power-law distribution and the exponent is . Furthermore, we present the distributions of the knowledge stock for different parameters . The findings indicate that our proposed models could be helpful for a deeper understanding of cooperation in scientific research. PMID:24626143
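The preferential attachment mechanism underlying the HDPH model's power-law result can be sketched with a simple degree-proportional growth simulation. This is a generic preferential attachment toy, not the paper's hypernetwork model: the seed network, the attachment count m and the step count are invented, and hyperedges are approximated by ordinary degree increments.

```python
import random

random.seed(0)

def preferential_attachment(steps, m=2):
    """Grow a network where each new node attaches to m existing nodes
    chosen with probability proportional to current degree, the mechanism
    that mean-field analysis predicts yields a power-law degree tail."""
    degrees = [m] * (m + 1)  # small seed network (approximate)
    for _ in range(steps):
        targets = random.choices(range(len(degrees)), weights=degrees, k=m)
        for t in targets:
            degrees[t] += 1
        degrees.append(m)  # the newcomer arrives with degree m
    return degrees

deg = preferential_attachment(5000)
# Heavy tail: the best-connected early node far exceeds the typical node.
print(max(deg), sorted(deg)[len(deg) // 2])
```

Running this shows the hallmark of preferential attachment: most nodes keep their initial degree while a few early nodes accumulate hub-scale degrees, the qualitative behaviour behind a power-law hyperdegree distribution.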
International Nuclear Information System (INIS)
Yahya, Noorazrul; Ebert, Martin A.; Bulsara, Max; House, Michael J.; Kennedy, Angel; Joseph, David J.; Denham, James W.
2016-01-01
Purpose: Given the paucity of available data concerning radiotherapy-induced urinary toxicity, it is important to ensure derivation of the most robust models with superior predictive performance. This work explores multiple statistical-learning strategies for prediction of urinary symptoms following external beam radiotherapy of the prostate. Methods: The performance of logistic regression, elastic-net, support-vector machine, random forest, neural network, and multivariate adaptive regression splines (MARS) to predict urinary symptoms was analyzed using data from 754 participants accrued by TROG03.04-RADAR. Predictive features included dose-surface data, comorbidities, and medication intake. Four symptoms were analyzed: dysuria, haematuria, incontinence, and frequency, each with three definitions (grade ≥ 1, grade ≥ 2 and longitudinal) with event rates between 2.3% and 76.1%. Repeated cross-validations producing matched models were implemented. A synthetic minority oversampling technique was utilized for endpoints with rare events. Parameter optimization was performed on the training data. Area under the receiver operating characteristic curve (AUROC) was used to compare performance, with a sample size sufficient to detect differences of ≥0.05 at the 95% confidence level. Results: Logistic regression, elastic-net, random forest, MARS, and support-vector machine were the highest-performing statistical-learning strategies in 3, 3, 3, 2, and 1 endpoints, respectively. Logistic regression, MARS, elastic-net, random forest, neural network, and support-vector machine were the best, or were not significantly worse than the best, in 7, 7, 5, 5, 3, and 1 endpoints. The best-performing statistical model was for dysuria grade ≥ 1, with AUROC ± standard deviation of 0.649 ± 0.074 using MARS. For longitudinal frequency and dysuria grade ≥ 1, all strategies produced AUROC > 0.6, while all haematuria endpoints and longitudinal incontinence models produced AUROC < 0.6.
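The AUROC metric used to compare the strategies has a simple rank interpretation: the probability that a randomly chosen positive case is scored above a randomly chosen negative case. A minimal sketch, with invented labels and invented model scores (not data from the study):

```python
def auroc(labels, scores):
    """AUROC via the rank (Mann-Whitney) formulation: the probability
    that a random positive is scored above a random negative,
    counting ties as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical symptom labels and two models' predicted risks.
y       = [0, 0, 1, 0, 1, 1, 0, 1]
model_a = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7, 0.5, 0.9]
model_b = [0.6, 0.2, 0.3, 0.4, 0.5, 0.1, 0.7, 0.2]

print(auroc(y, model_a), auroc(y, model_b))
```

A score near 0.65, like the paper's best dysuria model, means a positive case outranks a negative one about 65% of the time; 0.5 is chance level, which is why the sub-0.6 haematuria and incontinence endpoints are reported as poorly predictable.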
Energy Technology Data Exchange (ETDEWEB)
Yahya, Noorazrul, E-mail: noorazrul.yahya@research.uwa.edu.au [School of Physics, University of Western Australia, Western Australia 6009, Australia and School of Health Sciences, National University of Malaysia, Bangi 43600 (Malaysia); Ebert, Martin A. [School of Physics, University of Western Australia, Western Australia 6009, Australia and Department of Radiation Oncology, Sir Charles Gairdner Hospital, Western Australia 6008 (Australia); Bulsara, Max [Institute for Health Research, University of Notre Dame, Fremantle, Western Australia 6959 (Australia); House, Michael J. [School of Physics, University of Western Australia, Western Australia 6009 (Australia); Kennedy, Angel [Department of Radiation Oncology, Sir Charles Gairdner Hospital, Western Australia 6008 (Australia); Joseph, David J. [Department of Radiation Oncology, Sir Charles Gairdner Hospital, Western Australia 6008, Australia and School of Surgery, University of Western Australia, Western Australia 6009 (Australia); Denham, James W. [School of Medicine and Public Health, University of Newcastle, New South Wales 2308 (Australia)
2016-05-15
Statistical Challenges in Modeling Big Brain Signals
Yu, Zhaoxia
2017-11-01
Brain signal data are inherently big: massive in amount, complex in structure, and high in dimensions. These characteristics impose great challenges for statistical inference and learning. Here we review several key challenges, discuss possible solutions, and highlight future research directions.
Statistical Learning Theory: Models, Concepts, and Results
von Luxburg, Ulrike; Schoelkopf, Bernhard
2008-01-01
Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms. In this article we attempt to give a gentle, non-technical overview of the key ideas and insights of statistical learning theory. We target a broad audience, not necessarily machine learning researchers. This paper can serve as a starting point for people who want to get an overview of the field before diving into technical details.
Earthquake statistics in a Block Slider Model and a fully dynamic Fault Model
Directory of Open Access Journals (Sweden)
D. Weatherley
2004-01-01
We examine the event statistics obtained from two differing simplified models for earthquake faults. The first model is a reproduction of the Block-Slider model of Carlson et al. (1991), a model often employed in seismicity studies. The second model is an elastodynamic fault model based upon the Lattice Solid Model (LSM) of Mora and Place (1994). We performed simulations in which the fault length was varied in each model and generated synthetic catalogs of event sizes and times. From these catalogs, we constructed interval event size distributions and inter-event time distributions. The larger, localised events in the Block-Slider model displayed the same scaling behaviour as events in the LSM; however, the distribution of inter-event times was markedly different. The analysis of both event size and inter-event time statistics is an effective method for comparative studies of differing simplified models for earthquake faults.
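The inter-event time statistics compared in such studies have a useful baseline: for a memoryless (Poisson) event process, inter-event times are exponential, so their mean and standard deviation coincide. A sketch with a synthetic catalog (the event rate is invented; neither fault model above is actually Poissonian):

```python
import random

random.seed(1)

rate = 0.5  # hypothetical events per unit time
times = []
t = 0.0
for _ in range(10000):
    t += random.expovariate(rate)  # exponential waiting time
    times.append(t)

# Inter-event times from the synthetic catalog.
gaps = [b - a for a, b in zip(times, times[1:])]
mean = sum(gaps) / len(gaps)
var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
print(mean, var ** 0.5)  # both near 1/rate for a memoryless process
```

Departures from this mean-equals-standard-deviation signature (e.g. clustering or quasi-periodic recurrence) are exactly the kind of difference the inter-event time distributions of the two fault models reveal.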
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Nowadays, statistical models have been developed in various directions to model various types of data and complex relationships. A rich variety of advanced and recent statistical models is mostly available in open source software (one of them being R). However, these advanced statistical models are not very friendly to novice R users, since they are based on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and Shiny) so that the most recent and advanced statistical models become readily available, accessible and applicable on the web. We have previously made interfaces in the form of e-tutorials for several modern and advanced statistical models in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unify them in the form of data analysis, including models using computer-intensive statistics (bootstrap and Markov chain Monte Carlo/MCMC). All are readily accessible in our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and easier to compare, in order to find the most appropriate model for the data.
Predictive modelling of noise level generated during sawing of rocks ...
Indian Academy of Sciences (India)
Abstract. This paper presents an experimental and statistical study on the noise level generated during sawing of rocks by circular diamond sawblades. The influence of the operating variables and rock properties on the noise level is investigated and analysed. Statistical analyses are then employed and models are built for the ...
A generative model for predicting terrorist incidents
Verma, Dinesh C.; Verma, Archit; Felmlee, Diane; Pearson, Gavin; Whitaker, Roger
2017-05-01
A major concern in coalition peace-support operations is the incidence of terrorist activity. In this paper, we propose a generative model for the occurrence of terrorist incidents, and illustrate that an increase in diversity, as measured by the number of different social groups to which an individual belongs, is inversely correlated with the likelihood of a terrorist incident in the society. A generative model is one that can predict the likelihood of events in new contexts, as opposed to statistical models, which predict future incidents based on the history of incidents in an existing context. Generative models can be useful in planning for persistent Information Surveillance and Reconnaissance (ISR), since they allow an estimation of regions in the theater of operation where terrorist incidents may arise, and thus can be used to better allocate the assignment and deployment of ISR assets. In this paper, we present a taxonomy of terrorist incidents, identify factors related to the occurrence of terrorist incidents, and provide a mathematical analysis calculating the likelihood of occurrence of terrorist incidents in three common real-life scenarios arising in peace-keeping operations.
Bayesian Sensitivity Analysis of Statistical Models with Missing Data.
Zhu, Hongtu; Ibrahim, Joseph G; Tang, Niansheng
2014-04-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.
Linear Mixed Models in Statistical Genetics
R. de Vlaming (Ronald)
2017-01-01
One of the goals of statistical genetics is to elucidate the genetic architecture of phenotypes (i.e., observable individual characteristics) that are affected by many genetic variants (e.g., single-nucleotide polymorphisms; SNPs). A particular aim is to identify specific SNPs that
Statistical models and methods for reliability and survival analysis
Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Huber -Carol, Catherine; Limnios, Nikolaos; Gerville-Reache, Leo
2013-01-01
Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical
Geometric modeling in probability and statistics
Calin, Ovidiu
2014-01-01
This book covers topics of Informational Geometry, a field which deals with the differential-geometric study of the manifold of probability density functions. This is a field that is increasingly attracting the interest of researchers from many different areas of science, including mathematics, statistics, geometry, computer science, signal processing, physics and neuroscience. It is the authors' hope that the present book will be a valuable reference for researchers and graduate students in one of the aforementioned fields. This textbook is a unified presentation of differential geometry and probability theory, and constitutes a text for a course directed at graduate or advanced undergraduate students interested in applications of differential geometry in probability and statistics. The book contains over 100 proposed exercises meant to help students deepen their understanding, and it is accompanied by software that is able to provide numerical computations of several information geometric objects. The reader...
Higher-Order Moment Characterisation of Rogue Wave Statistics in Supercontinuum Generation
DEFF Research Database (Denmark)
Sørensen, Simon Toft; Bang, Ole; Wetzel, Benjamin
2012-01-01
The noise characteristics of supercontinuum generation are characterized using higher-order statistical moments. Measures of skew and kurtosis, and the coefficient of variation, allow quantitative identification of spectral regions dominated by rogue-wave-like behaviour.
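The three summary statistics named in the abstract are straightforward to compute from a sample. A sketch with invented data: a Gaussian-like intensity record versus a long-tailed (lognormal-like) one standing in for a rogue-wave-dominated spectral region.

```python
import math
import random

random.seed(2)

def moments(xs):
    """Coefficient of variation, skew and excess kurtosis of a sample:
    the higher-order summaries used to flag long-tailed, rogue-wave-like
    intensity fluctuations."""
    n = len(xs)
    m = sum(xs) / n
    c2 = sum((x - m) ** 2 for x in xs) / n
    c3 = sum((x - m) ** 3 for x in xs) / n
    c4 = sum((x - m) ** 4 for x in xs) / n
    s = c2 ** 0.5
    return s / m, c3 / s ** 3, c4 / s ** 4 - 3.0

gauss = [random.gauss(10, 1) for _ in range(20000)]          # benign noise
heavy = [math.exp(random.gauss(2, 0.8)) for _ in range(20000)]  # long tail
print(moments(gauss))
print(moments(heavy))
```

The Gaussian record gives skew and excess kurtosis near zero, while the long-tailed record gives large positive values of both: the quantitative signature the paper uses to locate rogue-wave-like spectral regions.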
Statistical Model Checking of Rich Models and Properties
DEFF Research Database (Denmark)
Poulsen, Danny Bøgsted
Software is increasingly embedded within safety- and business-critical processes of society. Errors in these embedded systems can lead to human casualties or severe monetary loss. Model checking technology has proven formal methods capable of finding and correcting errors in software ... motivates why existing model checking technology should be supplemented by new techniques. It also contains a brief introduction to probability theory and concepts covered by the six papers making up the second part. The first two papers are concerned with developing online monitoring techniques ... systems. The fifth paper shows how stochastic hybrid automata are useful for modelling biological systems, and the final paper is concerned with showing how statistical model checking is efficiently distributed. In parallel with developing the theory contained in the papers, a substantial part of this work ...
A statistical model of future human actions
International Nuclear Information System (INIS)
Woo, G.
1992-02-01
A critical review has been carried out of models of future human actions during the long term post-closure period of a radioactive waste repository. Various Markov models have been considered as alternatives to the standard Poisson model, and the problems of parameterisation have been addressed. Where the simplistic Poisson model unduly exaggerates the intrusion risk, some form of Markov model may have to be introduced. This situation may well arise for shallow repositories, but it is less likely for deep repositories. Recommendations are made for a practical implementation of a computer based model and its associated database. (Author)
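The contrast the review draws, a constant-rate Poisson intrusion model versus a Markov refinement, can be made concrete. A minimal sketch with invented rates: under the Poisson model the probability of at least one intrusion over the period is 1 - exp(-λT), while a crude two-state "societal memory" Markov variant reduces the intrusion rate while knowledge of the repository persists, illustrating why the simplistic Poisson model can exaggerate the intrusion risk.

```python
import math

lam = 1e-4   # hypothetical intrusions per year (invented)
T = 10000    # hypothetical post-closure period in years (invented)

# Simple Poisson model: constant rate over the whole period.
p_poisson = 1 - math.exp(-lam * T)

# Crude Markov refinement: while memory of the repository persists
# (retained each year with probability q), intrusions are suppressed
# by a factor f.  Both q and f are invented for the sketch.
q, f = 0.999, 0.01
p_none, remembered = 1.0, 1.0
for _ in range(T):
    rate = lam * (f * remembered + (1 - remembered))
    p_none *= math.exp(-rate)
    remembered *= q

print(p_poisson, 1 - p_none)
```

Because the effective rate never exceeds λ, the Markov variant always yields a lower intrusion probability than the plain Poisson model; how much lower depends entirely on the memory parameters, which is the parameterisation problem the review highlights.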
Statistical models of shape optimisation and evaluation
Davies, Rhodri; Taylor, Chris
2014-01-01
Deformable shape models have wide application in computer vision and biomedical image analysis. This book addresses a key issue in shape modelling: establishment of a meaningful correspondence between a set of shapes. Full implementation details are provided.
nQuire: a statistical framework for ploidy estimation using next generation sequencing.
Weiß, Clemens L; Pais, Marina; Cano, Liliana M; Kamoun, Sophien; Burbano, Hernán A
2018-04-04
Intraspecific variation in ploidy occurs in a wide range of species including pathogenic and nonpathogenic eukaryotes such as yeasts and oomycetes. Ploidy can be inferred indirectly - without measuring DNA content - from experiments using next-generation sequencing (NGS). We present nQuire, a statistical framework that distinguishes between diploids, triploids and tetraploids using NGS. The command-line tool models the distribution of base frequencies at variable sites using a Gaussian Mixture Model, and uses maximum likelihood to select the most plausible ploidy model. nQuire handles large genomes at high coverage efficiently and uses standard input file formats. We demonstrate the utility of nQuire analyzing individual samples of the pathogenic oomycete Phytophthora infestans and the Baker's yeast Saccharomyces cerevisiae. Using these organisms we show the dependence between reliability of the ploidy assignment and sequencing depth. Additionally, we employ normalized maximized log-likelihoods generated by nQuire to ascertain ploidy level in a population of samples with ploidy heterogeneity. Using these normalized values we cluster samples in three dimensions using multivariate Gaussian mixtures. The cluster assignments retrieved from a S. cerevisiae population recovered the true ploidy level in over 96% of samples. Finally, we show that nQuire can be used regionally to identify chromosomal aneuploidies. nQuire provides a statistical framework to study organisms with intraspecific variation in ploidy. nQuire is likely to be useful in epidemiological studies of pathogens, artificial selection experiments, and for historical or ancient samples where intact nuclei are not preserved. It is implemented as a stand-alone Linux command line tool in the C programming language and is available at https://github.com/clwgg/nQuire under the MIT license.
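The core idea, selecting the ploidy whose expected base-frequency modes best explain the data, can be sketched with fixed-width Gaussian mixtures. This is an illustration only, not nQuire's implementation: nQuire fits the mixture parameters by maximum likelihood, whereas here the mode positions follow from ploidy (1/2 for diploid; 1/3, 2/3 for triploid; 1/4, 1/2, 3/4 for tetraploid) and the standard deviation is an invented stand-in.

```python
import math
import random

def log_lik(freqs, modes, sd=0.05):
    """Log-likelihood of observed base frequencies under an equal-weight
    Gaussian mixture centred on the modes expected for a given ploidy."""
    total = 0.0
    for f in freqs:
        dens = sum(math.exp(-((f - m) ** 2) / (2 * sd ** 2)) /
                   (sd * math.sqrt(2 * math.pi)) for m in modes) / len(modes)
        total += math.log(dens)
    return total

PLOIDY_MODES = {
    "diploid": [1 / 2],
    "triploid": [1 / 3, 2 / 3],
    "tetraploid": [1 / 4, 1 / 2, 3 / 4],
}

# Simulated base frequencies at variable sites of a triploid-like sample.
random.seed(3)
obs = [random.gauss(random.choice([1 / 3, 2 / 3]), 0.04) for _ in range(500)]

best = max(PLOIDY_MODES, key=lambda p: log_lik(obs, PLOIDY_MODES[p]))
print(best)
```

Model selection here is just "highest log-likelihood wins"; nQuire additionally normalizes the maximized log-likelihoods so that samples of different depth can be compared and clustered, as the abstract describes.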
Enhanced surrogate models for statistical design exploiting space mapping technology
DEFF Research Database (Denmark)
Koziel, Slawek; Bandler, John W.; Mohamed, Achmed S.
2005-01-01
We present advances in microwave and RF device modeling exploiting Space Mapping (SM) technology. We propose new SM modeling formulations utilizing input mappings, output mappings, frequency scaling and quadratic approximations. Our aim is to enhance circuit models for statistical analysis...
Statistical image processing and multidimensional modeling
Fieguth, Paul
2010-01-01
Images are all around us! The proliferation of low-cost, high-quality imaging devices has led to an explosion in acquired images. When these images are acquired from a microscope, telescope, satellite, or medical imaging device, there is a statistical image processing task: the inference of something - an artery, a road, a DNA marker, an oil spill - from imagery, possibly noisy, blurry, or incomplete. A great many textbooks have been written on image processing. However this book does not so much focus on images, per se, but rather on spatial data sets, with one or more measurements taken over
Statistical modeling and extrapolation of carcinogenesis data
International Nuclear Information System (INIS)
Krewski, D.; Murdoch, D.; Dewanji, A.
1986-01-01
Mathematical models of carcinogenesis are reviewed, including pharmacokinetic models for metabolic activation of carcinogenic substances. Maximum likelihood procedures for fitting these models to epidemiological data are discussed, including situations where the time to tumor occurrence is unobservable. The plausibility of different possible shapes of the dose response curve at low doses is examined, and a robust method for linear extrapolation to low doses is proposed and applied to epidemiological data on radiation carcinogenesis
Statistical Model Selection for TID Hardness Assurance
Ladbury, R.; Gorelick, J. L.; McClure, S.
2010-01-01
Radiation Hardness Assurance (RHA) methodologies against Total Ionizing Dose (TID) degradation impose rigorous statistical treatments for data from a part's Radiation Lot Acceptance Test (RLAT) and/or its historical performance. However, no similar methods exist for using "similarity" data - that is, data for similar parts fabricated in the same process as the part under qualification. This is despite the greater difficulty and potential risk in interpreting similarity data. In this work, we develop methods to disentangle part-to-part, lot-to-lot and part-type-to-part-type variation. The methods we develop apply not just for qualification decisions, but also for quality control and detection of process changes and other "out-of-family" behavior. We begin by discussing the data used in the study and the challenges of developing a statistic providing a meaningful measure of degradation across multiple part types, each with its own performance specifications. We then develop analysis techniques and apply them to the different data sets.
Statistical analysis of regional capital and operating costs for electric power generation
Energy Technology Data Exchange (ETDEWEB)
Sanchez, L.R.; Myers, M.G.; Herrman, J.A.; Provanizano, A.J.
1977-10-01
This report presents the results of a three and one-half-month study conducted for Brookhaven National Lab. to develop capital and operating cost relationships for seven electric power generating technologies: oil-, coal-, gas-, and nuclear-fired steam-electric plants, hydroelectric plants, and gas-turbine plants. The methodology is based primarily on statistical analysis of Federal Power Commission data for plant construction and annual operating costs. The development of cost-output relationships for electric power generation is emphasized, considering the effects of scale, technology, and location on each of the generating processes investigated. The regional effects on cost are measured at the Census Region level to be consistent with the Brookhaven Multi-Regional Energy and Interindustry Regional Model of the United States. Preliminary cost relationships for system-wide costs - transmission, distribution, and general expenses - were also derived. These preliminary results cover the demand for transmission and distribution capacity and operating and maintenance costs in terms of system-service characteristics. 15 references, 6 figures, 23 tables.
Multivariate statistical modelling based on generalized linear models
Fahrmeir, Ludwig
1994-01-01
This book is concerned with the use of generalized linear models for univariate and multivariate regression analysis. Its emphasis is to provide a detailed introductory survey of the subject based on the analysis of real data drawn from a variety of subjects including the biological sciences, economics, and the social sciences. Where possible, technical details and proofs are deferred to an appendix in order to provide an accessible account for non-experts. Topics covered include: models for multi-categorical responses, model checking, time series and longitudinal data, random effects models, and state-space models. Throughout, the authors have taken great pains to discuss the underlying theoretical ideas in ways that relate well to the data at hand. As a result, numerous researchers whose work relies on the use of these models will find this an invaluable account to have on their desks. "The basic aim of the authors is to bring together and review a large part of recent advances in statistical modelling of m...
Directory of Open Access Journals (Sweden)
Simone Fiori
2007-07-01
Bivariate statistical modeling from incomplete data is a useful statistical tool that allows one to discover the model underlying two data sets when the data in the two sets correspond neither in size nor in ordering. Such a situation may occur when the sizes of the two data sets do not match (i.e., there are "holes" in the data) or when the data sets have been acquired independently. Also, statistical modeling is useful when the amount of available data is enough to show relevant statistical features of the phenomenon underlying the data. We propose to tackle the problem of statistical modeling via a neural (nonlinear) system that is able to match its input-output statistic to the statistic of the available data sets. A key point of the new implementation proposed here is that it is based on look-up-table (LUT) neural systems, which guarantee a computationally advantageous way of implementing neural systems. A number of numerical experiments, performed on both synthetic and real-world data sets, illustrate the features of the proposed modeling procedure.
Statistical Modelling of Extreme Rainfall in Taiwan
L-F. Chu (Lan-Fen); M.J. McAleer (Michael); C-C. Chang (Ching-Chung)
2012-01-01
In this paper, the annual maximum daily rainfall data from 1961 to 2010 are modelled for 18 stations in Taiwan. We fit the rainfall data with stationary and non-stationary generalized extreme value distributions (GEV), and estimate their future behaviour based on the best fitting model.
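The block-maxima approach behind such a GEV analysis can be sketched in its simplest special case. This is not the paper's method: the rainfall series below is synthetic, and the fit is a method-of-moments Gumbel fit (the GEV with shape parameter zero) rather than maximum likelihood over the full stationary/non-stationary GEV family.

```python
import math
import random

random.seed(4)

# Synthetic "annual maximum daily rainfall" series (mm), 50 years:
# each year's maximum over 365 invented daily values.
annual_max = [max(random.expovariate(1 / 50) for _ in range(365))
              for _ in range(50)]

# Method-of-moments Gumbel fit: scale from the sample standard
# deviation, location from the sample mean (0.5772 is Euler's constant).
n = len(annual_max)
mean = sum(annual_max) / n
sd = (sum((x - mean) ** 2 for x in annual_max) / (n - 1)) ** 0.5
scale = sd * math.sqrt(6) / math.pi
loc = mean - 0.5772 * scale

def return_level(T):
    """Rainfall exceeded on average once every T years under the fit."""
    return loc - scale * math.log(-math.log(1 - 1 / T))

print(return_level(10), return_level(100))
```

Return levels such as these (the 10-year and 100-year events) are the typical "future behaviour" estimates derived once a best-fitting extreme value model is chosen.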
Statistical Modelling of Extreme Rainfall in Taiwan
L. Chu (LanFen); M.J. McAleer (Michael); C-H. Chang (Chu-Hsiang)
2013-01-01
In this paper, the annual maximum daily rainfall data from 1961 to 2010 are modelled for 18 stations in Taiwan. We fit the rainfall data with stationary and non-stationary generalized extreme value distributions (GEV), and estimate their future behaviour based on the best fitting model.
Statistical modelling of traffic safety development
DEFF Research Database (Denmark)
Christens, Peter
2004-01-01
Road safety is a major concern for society and individuals. Although road safety has improved in recent years, the number of road fatalities is still unacceptably high. In 2000, road accidents killed over 40,000 people in the European Union and injured more than 1.7 million. In 2001 in Denmark there were 6861 injury traffic accidents reported by the police, resulting in 4519 minor injuries, 3946 serious injuries, and 431 fatalities. The general purpose of the research was to improve the insight into aggregated road safety methodology in Denmark. The aim was to analyse advanced statistical methods that were designed to study developments over time, including effects of interventions. This aim has been achieved by investigating variations in aggregated Danish traffic accident series and by applying state-of-the-art methodologies to specific case studies. The thesis comprises an introduction ...
A Noise Robust Statistical Texture Model
DEFF Research Database (Denmark)
Hilger, Klaus Baggesen; Stegmann, Mikkel Bille; Larsen, Rasmus
2002-01-01
This paper presents a novel approach to the problem of obtaining a low-dimensional representation of texture (pixel intensity) variation present in a training set after alignment using a Generalised Procrustes analysis. We extend the conventional analysis of training textures in the Active Appearance Models segmentation framework. This is accomplished by augmenting the model with an estimate of the covariance of the noise present in the training data. This results in a more compact model maximising the signal-to-noise ratio, thus favouring subspaces rich in signal but low in noise. Differences in the methods are illustrated on a set of left cardiac ventricles obtained using magnetic resonance imaging.
Statistical models for nuclear decay from evaporation to vaporization
Cole, A J
2000-01-01
Elements of equilibrium statistical mechanics: Introduction. Microstates and macrostates. Sub-systems and convolution. The Boltzmann distribution. Statistical mechanics and thermodynamics. The grand canonical ensemble. Equations of state for ideal and real gases. Pseudo-equilibrium. Statistical models of nuclear decay. Nuclear physics background: Introduction. Elements of the theory of nuclear reactions. Quantum mechanical description of scattering from a potential. Decay rates and widths. Level and state densities in atomic nuclei. Angular momentum in quantum mechanics. History of statistical
Energy Technology Data Exchange (ETDEWEB)
Curley, G. Michael [North American Electric Reliability Corporation (United States); Mandula, Jiri [International Atomic Energy Agency (IAEA)
2008-05-15
The WEC Committee on the Performance of Generating Plant (PGP) has been collecting and analysing power plant performance statistics worldwide for more than 30 years and has produced regular reports, which include examples of advanced techniques and methods for improving power plant performance through benchmarking. A series of reports from the various working groups was issued in 2008. This reference presents the results of Working Group 2 (WG2). WG2's main task is to facilitate the annual collection and input of power plant performance data (unit-by-unit and aggregated data) into the WEC PGP database. The statistics will be collected for steam, nuclear, gas turbine and combined cycle, hydro and pumped storage plant. WG2 will also oversee the ongoing development of the availability statistics database, including its contents, the required software, security issues and other important information. The report is divided into two sections: thermal generating, combined cycle/co-generation, combustion turbine, hydro and pumped storage unavailability factors and availability statistics; and nuclear power generating units.
A statistical analysis based recommender model for heart disease patients.
Mustaqeem, Anam; Anwar, Syed Muhammad; Khan, Abdul Rashid; Majid, Muhammad
2017-12-01
An intelligent information technology based system could have a positive impact on the life-style of patients suffering from chronic diseases by providing useful health recommendations. In this paper, we have proposed a hybrid model that provides disease prediction and medical recommendations to cardiac patients. The first part aims at implementing a prediction model that can identify the disease of a patient and classify it into one of four output classes, i.e., non-cardiac chest pain, silent ischemia, angina, and myocardial infarction. Following the disease prediction, the second part of the model provides general medical recommendations to patients. The recommendations are generated by assessing the severity of clinical features of patients, estimating the risk associated with clinical features and disease, and calculating the probability of occurrence of disease. The purpose of this model is to build an intelligent and adaptive recommender system for heart disease patients. The experiments for the proposed recommender system are conducted on a clinical data set collected and labelled in consultation with medical experts from a known hospital. The performance of the proposed prediction model is evaluated using accuracy and kappa statistics as evaluation measures. The medical recommendations are generated based on information collected from a knowledge base created with the help of physicians. The results of the recommendation model are evaluated using a confusion matrix, giving an accuracy of 97.8%. The proposed system exhibits good prediction and recommendation accuracies and promises to be a useful contribution in the field of e-health and medical informatics. Copyright © 2017 Elsevier B.V. All rights reserved.
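The kappa statistic used to evaluate the prediction model can be computed directly from a confusion matrix. A minimal sketch (the counts below are hypothetical, not the paper's data):

```python
def cohens_kappa(matrix):
    """Cohen's kappa from a square confusion matrix (rows = true, cols = predicted)."""
    n = sum(sum(row) for row in matrix)
    po = sum(matrix[i][i] for i in range(len(matrix))) / n      # observed agreement
    row_marg = [sum(row) for row in matrix]
    col_marg = [sum(col) for col in zip(*matrix)]
    pe = sum(r * c for r, c in zip(row_marg, col_marg)) / n**2  # chance agreement
    return (po - pe) / (1 - pe)
```

Kappa corrects raw accuracy for agreement expected by chance, which is why it is reported alongside accuracy in studies like this one.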
Hayslett, H T
1991-01-01
Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
A Model Kelvin Electrostatic Generator.
Hill, M.; Jacobs, D. J.
1997-01-01
Describes how to construct a form of a Kelvin Electrostatic Generator from readily available components and provides an explanation of how it works. The device can generate sparks 10-12 mm long in air. (DDR)
Introduction to statistical modelling: linear regression.
Lunt, Mark
2015-07-01
In many studies we wish to assess how a range of variables are associated with a particular outcome and also determine the strength of such relationships so that we can begin to understand how these factors relate to each other at a population level. Ultimately, we may also be interested in predicting the outcome from a series of predictive factors available at, say, a routine clinic visit. In a recent article in Rheumatology, Desai et al. did precisely that when they studied the prediction of hip and spine BMD from hand BMD and various demographic, lifestyle, disease and therapy variables in patients with RA. This article aims to introduce the statistical methodology that can be used in such a situation and explain the meaning of some of the terms employed. It will also outline some common pitfalls encountered when performing such analyses. © The Author 2013. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
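The core of the methodology the article introduces is ordinary least squares for a single predictor. A minimal self-contained sketch (illustrative only, not the BMD analysis by Desai et al.):

```python
def ols_fit(x, y):
    """Ordinary least squares for y = a + b*x; returns intercept a and slope b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def r_squared(x, y, a, b):
    """Proportion of outcome variance explained by the fitted line."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot
```

The slope b estimates the strength of the association the text describes; R-squared summarizes how much of the outcome's variability the predictor accounts for.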
A Statistical Model for Energy Intensity
Directory of Open Access Journals (Sweden)
Marjaneh Issapour
2012-12-01
A promising approach to improve scientific literacy in regard to global warming and climate change is using a simulation as part of a science education course. The simulation needs to employ scientific analysis of actual data from internationally accepted and reputable databases to demonstrate the reality of the current climate change situation. One of the most important criteria for using a simulation in a science education course is the fidelity of the model. The realism of the events and consequences modeled in the simulation is significant as well. Therefore, all underlying equations and algorithms used in the simulation must have a real-world scientific basis. The "Energy Choices" simulation is one such simulation. The focus of this paper is the development of a mathematical model for "Energy Intensity" as a part of the overall system dynamics in the "Energy Choices" simulation. This model will define the "Energy Intensity" as a function of other independent variables that can be manipulated by users of the simulation. The relationship discovered by this research will be applied to an algorithm in the "Energy Choices" simulation.
Latent domain models for statistical machine translation
Hoàng, C.
2017-01-01
A data-driven approach to model translation suffers from the data mismatch problem and demands domain adaptation techniques. Given parallel training data originating from a specific domain, training an MT system on the data would result in a rather suboptimal translation for other domains. But does
Statistical modelling of fine red wine production
Directory of Open Access Journals (Sweden)
María Rosa Castro
2010-01-01
Producing wine is a very important economic activity in the province of San Juan in Argentina; it is therefore most important to predict production in terms of the quantity of raw material needed. This work was aimed at obtaining a model relating kilograms of crushed grape to the litres of wine produced. Such a model can then be used for predicting precise future values and confidence intervals for given quantities of crushed grapes. Data from a vineyard in the province of San Juan were used in this work. The sample correlation coefficient was calculated and a scatter diagram was then constructed; this indicated a linear relationship between the litres of wine obtained and the kilograms of crushed grape. Two linear models were then adopted and analysis of variance was carried out, because the data came from normal populations having the same variance. The most appropriate model was obtained from this analysis; it was validated with experimental values, and a good approximation was obtained.
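The prediction task described, litres of wine from kilograms of crushed grape with an interval around the point prediction, can be sketched as follows. The data are invented for illustration, and a normal quantile is used in place of Student's t, so the interval is slightly narrow for small samples:

```python
import math
from statistics import NormalDist

def fit_and_predict(kg, litres, kg0, level=0.95):
    """Least-squares line litres = a + b*kg, plus an approximate prediction
    interval for a new observation at kg0 (normal approximation to the t
    quantile; illustrative, not the paper's exact procedure)."""
    n = len(kg)
    mx, my = sum(kg) / n, sum(litres) / n
    sxx = sum((x - mx) ** 2 for x in kg)
    b = sum((x - mx) * (y - my) for x, y in zip(kg, litres)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(kg, litres)]
    s = math.sqrt(sum(r * r for r in resid) / (n - 2))     # residual std. error
    se = s * math.sqrt(1 + 1/n + (kg0 - mx) ** 2 / sxx)    # new-observation s.e.
    z = NormalDist().inv_cdf((1 + level) / 2)
    yhat = a + b * kg0
    return yhat, yhat - z * se, yhat + z * se
```

The interval widens as kg0 moves away from the mean of the observed inputs, which is why extrapolating far beyond the sampled grape quantities is risky.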
Behavioral and statistical models of educational inequality
DEFF Research Database (Denmark)
Holm, Anders; Breen, Richard
2016-01-01
This paper addresses the question of how students and their families make educational decisions. We describe three types of behavioral model that might underlie decision-making and we show that they have consequences for what decisions are made. Our study thus has policy implications if we wish...
Statistical model semiquantitatively approximates arabinoxylooligosaccharides' structural diversity
DEFF Research Database (Denmark)
Dotsenko, Gleb; Nielsen, Michael Krogsgaard; Lange, Lene
2016-01-01
(wheat flour arabinoxylan (arabinose/xylose, A/X = 0.47); grass arabinoxylan (A/X = 0.24); wheat straw arabinoxylan (A/X = 0.15); and hydrothermally pretreated wheat straw arabinoxylan (A/X = 0.05)), is semiquantitatively approximated using the proposed model. The suggested approach can be applied...
A STATISTICAL MODEL FOR STOCK ASSESSMENT OF ...
African Journals Online (AJOL)
Assessment of the status of southern bluefin tuna (SBT) by Australia and Japan has used a method (ADAPT) that imposes a number of structural restrictions, and is ... over time within the bounds of specific structure, and (3) autocorrelation in recruitment processes is considered within the likelihood framework of the model.
Automatic Generation of Algorithms for the Statistical Analysis of Planetary Nebulae Images
Fischer, Bernd
2004-01-01
Analyzing data sets collected in experiments or by observations is a core scientific activity. Typically, experimental and observational data are fraught with uncertainty, and the analysis is based on a statistical model of the conjectured underlying processes. The large data volumes collected by modern instruments make computer support indispensable for this. Consequently, scientists spend significant amounts of their time on the development and refinement of data analysis programs. AutoBayes [GF+02, FS03] is a fully automatic synthesis system for generating statistical data analysis programs. Externally, it looks like a compiler: it takes an abstract problem specification and translates it into executable code. Its input is a concise description of a data analysis problem in the form of a statistical model as shown in Figure 1; its output is optimized and fully documented C/C++ code which can be linked dynamically into the Matlab and Octave environments. Internally, however, it is quite different: AutoBayes derives a customized algorithm implementing the given model using a schema-based process, and then further refines and optimizes the algorithm into code. A schema is a parameterized code template with associated semantic constraints which define and restrict the template's applicability. The schema parameters are instantiated in a problem-specific way during synthesis as AutoBayes checks the constraints against the original model or, recursively, against emerging sub-problems. AutoBayes's schema library contains problem decomposition operators (which are justified by theorems in a formal logic in the domain of Bayesian networks) as well as machine learning algorithms (e.g., EM, k-Means) and numeric optimization methods (e.g., Nelder-Mead simplex, conjugate gradient). AutoBayes augments this schema-based approach by symbolic computation to derive closed-form solutions whenever possible. This is a major advantage over other statistical data analysis systems.
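One of the schema-instantiated algorithms mentioned, k-Means, fits in a few lines. A self-contained 1-D sketch for intuition (this is not AutoBayes's output, which is synthesized C/C++):

```python
import random

def kmeans_1d(points, k, iters=50, seed=0):
    """Lloyd's algorithm on scalar data: alternate assignment and mean update."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # initialize from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                     # assign each point to nearest center
            clusters[min(range(k), key=lambda i: abs(p - centers[i]))].append(p)
        # move each center to its cluster mean (keep old center if cluster empties)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)
```

In AutoBayes, a schema like this carries constraints (e.g., a discrete latent class variable in the model) that determine when it may be instantiated.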
Benchmark validation of statistical models: Application to mediation analysis of imagery and memory.
MacKinnon, David P; Valente, Matthew J; Wurpts, Ingrid C
2018-03-29
This article describes benchmark validation, an approach to validating a statistical model. According to benchmark validation, a valid model generates estimates and research conclusions consistent with a known substantive effect. Three types of benchmark validation are described and illustrated with examples: (a) benchmark value, (b) benchmark estimate, and (c) benchmark effect. Benchmark validation methods are especially useful for statistical models with assumptions that are untestable or very difficult to test. Benchmark effect validation methods were applied to evaluate statistical mediation analysis in eight studies using the established effect that increasing mental imagery improves recall of words. Statistical mediation analysis led to conclusions about mediation that were consistent with the established theory that increased imagery leads to increased word recall. Benchmark validation based on established substantive theory is discussed as a general way to investigate characteristics of statistical models and a complement to mathematical proof and statistical simulation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Domain analysis and modeling to improve comparability of health statistics.
Okada, M; Hashimoto, H; Ohida, T
2001-01-01
Health statistics are an essential element in improving the ability of managers of health institutions, healthcare researchers, policy makers, and health professionals to formulate appropriate courses of action and to make decisions based on evidence. To ensure adequate health statistics, standards are of critical importance. A study on healthcare statistics domain analysis is underway in an effort to improve the usability and comparability of health statistics. The ongoing study focuses on structuring the domain knowledge and making the knowledge explicit, with a data element dictionary being the core. Supplemental to the dictionary are a domain term list, a terminology dictionary, and a data model to help organize the concepts constituting the health statistics domain.
Ding, Xiaoning; Liu, Wei; Shen, Jiajian; Anand, Aman; Stoker, Joshua B; Hu, Yanle; Bues, Martin
2017-11-01
Monte Carlo (MC) simulation has been used to generate commissioning data for the beam modeling of treatment planning systems (TPS). We have developed a method called radial projection (RP) for postprocessing of MC-simulation-generated data. We used the RP method to reduce the statistical uncertainty of the lateral profile of proton pencil beams with axial symmetry. The RP method takes advantage of the axial symmetry of the dose distribution to use the mean value of multiple independent scores as the representative score. Using the mean as the representative value rather than any individual score results in a substantial reduction in statistical uncertainty. Herein, we present the concept and step-by-step implementation of the RP method, and show the advantage of the RP method over conventional methods for generating lateral profiles. Lateral profiles generated by both methods were compared to demonstrate the uncertainty reduction qualitatively, and a standard error comparison was performed to demonstrate the reduction quantitatively. The comparisons showed that statistical uncertainty was reduced substantially by the RP method. Using the RP method to postprocess MC data, the corresponding MC simulation time was reduced by a factor of 10 without reducing the quality of the result generated from the MC data. We concluded that the RP method is an effective technique for increasing MC simulation efficiency when generating lateral profiles for axially symmetric pencil beams. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.
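The core idea of the RP method, averaging many statistically independent scores that axial symmetry makes equivalent, can be sketched as a radial rebinning of scored values. This is a simplified illustration, not the paper's implementation:

```python
import math

def radial_profile(scores, nbins, rmax):
    """Average scored values over annular bins around the beam axis.
    scores: list of (x, y, value) tuples; returns one mean per radial bin."""
    sums = [0.0] * nbins
    counts = [0] * nbins
    for x, y, v in scores:
        r = math.hypot(x, y)                 # distance from the symmetry axis
        if r < rmax:
            b = int(r / rmax * nbins)
            sums[b] += v
            counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

Because each bin's mean pools every equivalent score on the ring, the standard error of the profile shrinks roughly with the square root of the number of pooled scores, which is what allows the reported factor-of-10 reduction in simulation time.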
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Huang, N. E.; Long, S. R.
1980-01-01
Laboratory experiments were performed to measure the surface elevation probability density function and associated statistical properties for a wind-generated wave field. The laboratory data along with some limited field data were compared. The statistical properties of the surface elevation were processed for comparison with the results derived from the Longuet-Higgins (1963) theory. It is found that, even for the highly non-Gaussian cases, the distribution function proposed by Longuet-Higgins still gives good approximations.
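The Longuet-Higgins distribution for weakly non-Gaussian surface elevation is a Gram-Charlier correction of the Gaussian density. A minimal sketch, assuming the standard Gram-Charlier A form with probabilists' Hermite polynomials (notation here is illustrative, not taken from the paper):

```python
import math

def gram_charlier_pdf(x, skew=0.0, exkurt=0.0):
    """Gaussian density for the standardized elevation x, corrected for
    skewness and excess kurtosis (Gram-Charlier A series)."""
    he3 = x**3 - 3*x                  # probabilists' Hermite polynomial He3
    he4 = x**4 - 6*x**2 + 3           # He4
    phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    return phi * (1 + skew / 6 * he3 + exkurt / 24 * he4)
```

With zero skewness and kurtosis the expression reduces to the Gaussian; a positive skewness raises the density in the upper tail, the qualitative behavior the wind-wave measurements exhibit.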
Spectral statistics in particles-rotor model and cranking model
Zhou Xian Rong; Zhao En Guang; Guo Lu
2002-01-01
Spectral statistics for six particles in single-j and two-j models coupled with a deformed core are studied within the particle-rotor model and the cranking shell model. The nearest-neighbor distribution of energy levels and the spectral rigidity are studied as functions of the spin or the cranking frequency, respectively. The results for the single-j shell are compared with those for the two-j case. The system becomes more regular when the single-j space (i13/2) is replaced by the two-j shell (g7/2 + d5/2), although the basis size of the configuration space is unchanged. However, the degree of chaoticity of the system changes only slightly when the configuration space is enlarged by extending the single-j shell (i13/2) to the two-j shell (i13/2 + g9/2). Nuclear chaotic behavior is studied when the authors take the two-body interaction to be a delta force or a pairing interaction, respectively.
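The nearest-neighbor distribution studied here is obtained by sorting the levels, taking successive spacings, and scaling to unit mean, then comparing against the Poisson law (regular dynamics) and the Wigner surmise (chaotic dynamics). A minimal sketch (the unit-mean rescaling below is a crude stand-in for proper spectral unfolding):

```python
import math

def normalized_spacings(levels):
    """Nearest-neighbor spacings of a spectrum, scaled to unit mean."""
    levels = sorted(levels)
    s = [b - a for a, b in zip(levels, levels[1:])]
    mean = sum(s) / len(s)
    return [si / mean for si in s]

def poisson(s):
    """Spacing law for integrable (regular) dynamics: no level repulsion."""
    return math.exp(-s)

def wigner(s):
    """Wigner surmise (GOE): linear level repulsion, P(0) = 0."""
    return math.pi / 2 * s * math.exp(-math.pi * s * s / 4)
```

A histogram of the normalized spacings lying closer to `wigner` than to `poisson` is the signature of chaoticity the abstract refers to.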
Statistical modelling in biostatistics and bioinformatics selected papers
Peng, Defen
2014-01-01
This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...
Functional summary statistics for the Johnson-Mehl model
DEFF Research Database (Denmark)
Møller, Jesper; Ghorbani, Mohammad
The Johnson-Mehl germination-growth model is a spatio-temporal point process model which has, among other things, been used for the description of neurotransmitter datasets. However, for such datasets parametric Johnson-Mehl models fitted by maximum likelihood have not yet been evaluated by means of functional summary statistics. This paper therefore invents four functional summary statistics adapted to the Johnson-Mehl model, with two of them based on the second-order properties and the other two on the nuclei-boundary distances for the associated Johnson-Mehl tessellation. The functional summary statistics' theoretical properties are investigated, non-parametric estimators are suggested, and their usefulness for model checking is examined in a simulation study. The functional summary statistics are also used for checking fitted parametric Johnson-Mehl models for a neurotransmitter dataset.
Fitting statistical models in bivariate allometry.
Packard, Gary C; Birchard, Geoffrey F; Boardman, Thomas J
2011-08-01
Several attempts have been made in recent years to formulate a general explanation for what appear to be recurring patterns of allometric variation in morphology, physiology, and ecology of both plants and animals (e.g. the Metabolic Theory of Ecology, the Allometric Cascade, the Metabolic-Level Boundaries hypothesis). However, published estimates for parameters in allometric equations often are inaccurate, owing to undetected bias introduced by the traditional method for fitting lines to empirical data. The traditional method entails fitting a straight line to logarithmic transformations of the original data and then back-transforming the resulting equation to the arithmetic scale. Because of fundamental changes in distributions attending transformation of predictor and response variables, the traditional practice may cause influential outliers to go undetected, and it may result in an underparameterized model being fitted to the data. Also, substantial bias may be introduced by the insidious rotational distortion that accompanies regression analyses performed on logarithms. Consequently, the aforementioned patterns of allometric variation may be illusions, and the theoretical explanations may be wide of the mark. Problems attending the traditional procedure can be largely avoided in future research simply by performing preliminary analyses on arithmetic values and by validating fitted equations in the arithmetic domain. The goal of most allometric research is to characterize relationships between biological variables and body size, and this is done most effectively with data expressed in the units of measurement. Back-transforming from a straight line fitted to logarithms is not a generally reliable way to estimate an allometric equation in the original scale. © 2010 The Authors. Biological Reviews © 2010 Cambridge Philosophical Society.
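The back-transformation bias the authors warn about follows from Jensen's inequality: exponentiating the mean of log y recovers the geometric mean, which understates the arithmetic mean for right-skewed data. A quick numerical demonstration on synthetic lognormal data (invented for illustration, not the allometric datasets discussed):

```python
import math
import random

rng = random.Random(1)
# Lognormal sample: log y ~ N(0, 1), so the true arithmetic mean is exp(0.5).
y = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(100_000)]

arith_mean = sum(y) / len(y)                                   # near exp(0.5) ≈ 1.65
geo_mean = math.exp(sum(math.log(v) for v in y) / len(y))      # near exp(0) = 1.0
```

The gap between the two means is exactly the bias incurred by fitting a line to logarithms and naively back-transforming, which is why the authors recommend validating fitted equations in the arithmetic domain.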
Probabilistic statistical modeling of air pollution from vehicles
Adikanova, Saltanat; Malgazhdarov, Yerzhan A.; Madiyarov, Muratkan N.; Temirbekov, Nurlan M.
2017-09-01
The aim of the work is to create a probabilistic-statistical mathematical model for the distribution of emissions from vehicles. In this article, it is proposed to use the probabilistic and statistical approach for modeling the distribution of harmful impurities in the atmosphere from vehicles using the example of the Ust-Kamenogorsk city. Using a simplified methodology of stochastic modeling, it is possible to construct effective numerical computational algorithms that significantly reduce the amount of computation without losing their accuracy.
International Nuclear Information System (INIS)
2005-01-01
For the years 2004 and 2005 the figures shown in the tables of Energy Review are partly preliminary. The annual statistics published in Energy Review are presented in more detail in a publication called Energy Statistics that comes out yearly. Energy Statistics also includes historical time series over a longer period of time (see e.g. Energy Statistics, Statistics Finland, Helsinki 2004). The applied energy units and conversion coefficients are shown on the back cover of the Review. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuel use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord Pool power exchange, Total energy consumption by source and CO2 emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes, precautionary stock fees and oil pollution fees
Mixed deterministic statistical modelling of regional ozone air pollution
Kalenderski, Stoitchko
2011-03-17
We develop a physically motivated statistical model for regional ozone air pollution by separating the ground-level pollutant concentration field into three components, namely: transport, local production, and a large-scale mean trend mostly dominated by emission rates. The model is novel in the field of environmental spatial statistics in that it is a combined deterministic-statistical model, which gives a new perspective to the modelling of air pollution. The model is presented in a Bayesian hierarchical formalism, and explicitly accounts for advection of pollutants using the advection equation. We apply the model to a specific case of regional ozone pollution: the Lower Fraser Valley of British Columbia, Canada. As a predictive tool, we demonstrate that the model vastly outperforms existing, simpler modelling approaches. Our study highlights the importance of simultaneously considering different aspects of an air pollution problem as well as taking into account the physical bases that govern the processes of interest. © 2011 John Wiley & Sons, Ltd.
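The advection equation dc/dt + u dc/dx = 0 that the hierarchical model builds on can be stepped numerically with a first-order upwind scheme. A periodic 1-D sketch for intuition (illustrative only; the paper's discretization may differ):

```python
def advect(c, u, dx, dt):
    """One upwind step of dc/dt + u dc/dx = 0 for u > 0 on a periodic grid.
    Python's c[-1] wraps around, giving the periodic boundary at i = 0."""
    lam = u * dt / dx
    assert 0 <= lam <= 1          # CFL stability condition for explicit upwind
    return [c[i] - lam * (c[i] - c[i - 1]) for i in range(len(c))]
```

With lam = 1 the scheme translates the field exactly one cell downwind per step; smaller lam adds numerical diffusion but conserves total mass on the periodic grid.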
Modeling and generating input processes
Energy Technology Data Exchange (ETDEWEB)
Johnson, M.E.
1987-01-01
This tutorial paper provides information relevant to the selection and generation of stochastic inputs to simulation studies. The primary area considered is multivariate inputs, but much of the philosophy is relevant to univariate inputs as well. 14 refs.
Automatic Model Generation Framework for Computational Simulation of Cochlear Implantation
DEFF Research Database (Denmark)
Mangado Lopez, Nerea; Ceresa, Mario; Duchateau, Nicolas
2016-01-01
To address such a challenge, we propose an automatic framework for the generation of patient-specific meshes for finite element modeling of the implanted cochlea. First, a statistical shape model is constructed from high-resolution anatomical μCT images. Then, by fitting the statistical model to a patient's CT image, an accurate model of the patient-specific cochlea anatomy is obtained. An algorithm based on the parallel transport frame is employed to perform the virtual insertion of the cochlear implant. Our automatic framework also incorporates the surrounding bone and nerve fibers and assigns ...
Smooth extrapolation of unknown anatomy via statistical shape models
Grupp, R. B.; Chiang, H.; Otake, Y.; Murphy, R. J.; Gordon, C. R.; Armand, M.; Taylor, R. H.
2015-03-01
Several methods to perform extrapolation of unknown anatomy were evaluated. The primary application is to enhance surgical procedures that may use partial medical images or medical images of incomplete anatomy. Le Fort-based, face-jaw-teeth transplant is one such procedure. From CT data of 36 skulls and 21 mandibles separate Statistical Shape Models of the anatomical surfaces were created. Using the Statistical Shape Models, incomplete surfaces were projected to obtain complete surface estimates. The surface estimates exhibit non-zero error in regions where the true surface is known; it is desirable to keep the true surface and seamlessly merge the estimated unknown surface. Existing extrapolation techniques produce non-smooth transitions from the true surface to the estimated surface, resulting in additional error and a less aesthetically pleasing result. The three extrapolation techniques evaluated were: copying and pasting of the surface estimate (non-smooth baseline), a feathering between the patient surface and surface estimate, and an estimate generated via a Thin Plate Spline trained from displacements between the surface estimate and corresponding vertices of the known patient surface. Feathering and Thin Plate Spline approaches both yielded smooth transitions. However, feathering corrupted known vertex values. Leave-one-out analyses were conducted, with 5% to 50% of known anatomy removed from the left-out patient and estimated via the proposed approaches. The Thin Plate Spline approach yielded smaller errors than the other two approaches, with an average vertex error improvement of 1.46 mm and 1.38 mm for the skull and mandible respectively, over the baseline approach.
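The Thin Plate Spline used for the smooth extrapolation can be illustrated in one dimension with the polyharmonic kernel |r|^3 (a 2-D TPS uses r^2 log r and also includes an affine term, which this simplified sketch omits; all names are illustrative):

```python
def kernel(r):
    """1-D polyharmonic spline kernel |r|^3."""
    return abs(r) ** 3

def solve(A, b):
    """Gaussian elimination with partial pivoting; fine for small systems."""
    n = len(A)
    M = [row[:] + [bv] for row, bv in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_spline(xs, ys):
    """Weights w such that sum_j w[j] * kernel(x - xs[j]) interpolates ys."""
    A = [[kernel(xi - xj) for xj in xs] for xi in xs]
    return solve(A, ys)

def eval_spline(xs, w, x):
    return sum(wj * kernel(x - xj) for wj, xj in zip(w, xs))
```

In the paper's setting, the training displacements are those between the surface estimate and the known patient vertices; evaluating the spline over the estimated region then bends it smoothly onto the true surface, avoiding the non-smooth seam of copy-and-paste.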
Semi-Supervised Generation with Cluster-aware Generative Models
DEFF Research Database (Denmark)
Maaløe, Lars; Fraccaro, Marco; Winther, Ole
2017-01-01
Deep generative models trained with large amounts of unlabelled data have proven to be powerful within the domain of unsupervised learning. Many real-life data sets contain a small amount of labelled data points that are typically disregarded when training generative models. We propose the Cluster-aware Generative Model, which achieves a log-likelihood of −79.38 nats on permutation-invariant MNIST, while also achieving competitive semi-supervised classification accuracies. The model can also be trained fully unsupervised, and still improve the log-likelihood performance with respect to related methods.
International Nuclear Information System (INIS)
2000-01-01
For the years 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the publication Energiatilastot - Energy Statistics, issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-March 2000, Energy exports by recipient country in January-March 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
1999-01-01
For the years 1998 and 1999, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the publication Energiatilastot - Energy Statistics, issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-June 1999, Energy exports by recipient country in January-June 1999, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
2001-01-01
For the year 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the publication Energiatilastot - Energy Statistics, issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions from the use of fossil fuels, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in 2000, Energy exports by recipient country in 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
Role of scaling in the statistical modelling of finance
Indian Academy of Sciences (India)
Economics and mathematical finance are multidisciplinary fields in which the tendency of statistical physicists to focus on universal laws has been criticized some-... is coherent and catches the essential statistical features of a long index history. A very important test for the proposed model concerns the scaling of the ...
A Statistical Evaluation of Atmosphere-Ocean General Circulation Models: Complexity vs. Simplicity
Robert K. Kaufmann; David I. Stern
2004-01-01
The principal tools used to model future climate change are general circulation models (GCMs): deterministic, high-resolution, bottom-up models of the global atmosphere-ocean system that require large amounts of supercomputer time to generate results. But are these models a cost-effective way of predicting future climate change at the global level? In this paper we use modern econometric techniques to evaluate the statistical adequacy of three GCMs by testing thre...
Assessing risk factors for dental caries: a statistical modeling approach.
Trottini, Mario; Bossù, Maurizio; Corridore, Denise; Ierardo, Gaetano; Luzzi, Valeria; Saccucci, Matteo; Polimeni, Antonella
2015-01-01
The problem of identifying potential determinants and predictors of dental caries is of key importance in caries research and it has received considerable attention in the scientific literature. From the methodological side, a broad range of statistical models is currently available to analyze dental caries indices (DMFT, dmfs, etc.). These models have been applied in several studies to investigate the impact of different risk factors on the cumulative severity of dental caries experience. However, in most of the cases (i) these studies focus on a very specific subset of risk factors; and (ii) in the statistical modeling only a few candidate models are considered and model selection is at best only marginally addressed. As a result, our understanding of the robustness of the statistical inferences with respect to the choice of the model is very limited; the richness of the set of statistical models available for analysis is only marginally exploited; and inferences could be biased due to the omission of potentially important confounding variables in the model's specification. In this paper we argue that these limitations can be overcome by considering a general class of candidate models and carefully exploring the model space using standard model selection criteria and measures of global fit and predictive performance of the candidate models. Strengths and limitations of the proposed approach are illustrated with a real data set. In our illustration the model space contains more than 2.6 million models, which require inferences to be adjusted for 'optimism'.
Generation of statistical scenarios of short-term wind power production
DEFF Research Database (Denmark)
Pinson, Pierre; Papaefthymiou, George; Klockl, Bernd
2007-01-01
Short-term (up to 2-3 days ahead) probabilistic forecasts of wind power provide forecast users with paramount information on the uncertainty of expected wind generation. Whatever the type of these probabilistic forecasts, they are produced on a per-horizon basis, and hence do not inform on the development of the forecast uncertainty through forecast series. This issue is addressed here by describing a method that permits the generation of statistical scenarios of wind generation that account for the interdependence structure of prediction errors, in addition to respecting the predictive distributions of wind...
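One standard way to realise such interdependent scenarios is a Gaussian copula: correlated standard normals are drawn across lead times and then mapped through each horizon's predictive marginal. A minimal sketch, with the horizon count, correlation length and Gaussian marginals all illustrative assumptions rather than values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 24 lead times with exponentially decaying error correlation.
H = 24
corr = np.fromfunction(lambda i, j: np.exp(-np.abs(i - j) / 6.0), (H, H))
L = np.linalg.cholesky(corr)

# Hypothetical per-horizon predictive marginals, Gaussian here for simplicity; with
# nonparametric predictive distributions, the correlated normals would instead be
# mapped to uniforms and passed through each marginal's inverse CDF.
mean = 50.0 + 20.0 * np.sin(np.arange(H) / H * np.pi)   # MW
std = 5.0 + 0.1 * np.arange(H)

def draw_scenarios(n):
    z = rng.standard_normal((H, n))
    correlated = L @ z                 # impose interdependence across lead times
    return mean[:, None] + std[:, None] * correlated

scenarios = draw_scenarios(1000)
print(scenarios.shape)
```

Each column is one scenario, i.e. a temporally consistent trajectory of wind generation rather than 24 independent draws.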
Statistical prediction of the numbers of degraded tubes in nuclear power plant steam generators
International Nuclear Information System (INIS)
Gallucci, R.H.V.; Klisiewicz, J.W.; Craig, K.R.
1990-01-01
Corrosion of nuclear power plant steam generator (SG) tubes often necessitates plugging/sleeving, causing decreased SG thermal performance and possible SG replacement. Statistical methods have been developed to predict probabilistically the numbers of tubes degraded due to secondary-side pitting, wastage, and intergranular attack/stress-corrosion cracking. Inspection data from two Combustion Engineering (C-E) plants have been converted into statistics representing defect formation and growth. Computer simulation programs have been generated to predict the numbers of tubes to be plugged/sleeved during future outages. The probabilistic predictions for both plants have successfully bounded subsequent observations. While so far applied only to C-E SGs for the three degradation phenomena, the statistical methodology is adaptable to other SG types and phenomena.
Improving statistical reasoning theoretical models and practical implications
Sedlmeier, Peter
1999-01-01
This book focuses on how statistical reasoning works and on training programs that can exploit people's natural cognitive capabilities to improve their statistical reasoning. Training programs that take into account findings from evolutionary psychology and instructional theory are shown to have substantially larger effects, more stable over time, than previous training regimens. The theoretical implications are traced in a neural network model of human performance on statistical reasoning problems. This book appeals to judgment and decision making researchers and other cognitive scientists, as well as to teachers of statistics and probabilistic reasoning.
International Nuclear Information System (INIS)
Guan, Dong; Wu, Jiu Hui; Jing, Li
2015-01-01
Highlights: • A random internal morphology and structure generation-growth method, termed the quartet structure generation set (QSGS), has been utilized, based on stochastic cluster growth theory, for numerically generating the various microstructures of porous metal materials. • Effects of different parameters such as thickness and porosity on the sound absorption performance of the generated structures are studied by the present method, and the obtained results are validated by an empirical model as well. • This method could be utilized to guide the design and fabrication of sound-absorbing porous metal materials. - Abstract: In this paper, a statistical method for predicting the sound absorption properties of porous metal materials is presented. To reflect the stochastic distribution characteristics of porous metal materials, a random internal morphology and structure generation-growth method, termed the quartet structure generation set (QSGS), has been utilized, based on stochastic cluster growth theory, for numerically generating the various microstructures of porous metal materials. Then, by using the transfer-function approach along with the QSGS tool, we investigate the sound absorbing performance of porous metal materials with complex stochastic geometries. The statistical method has been validated by the good agreement among the numerical results from this method for metal rubber, a previous empirical model, and the corresponding experimental data. Furthermore, the effects of different parameters such as thickness and porosity on the sound absorption performance of the generated structures are studied by the present method, and the obtained results are validated by an empirical model as well. Therefore, the present method is a reliable and robust method for predicting the sound absorption performance of porous metal materials, and could be utilized to guide the design and fabrication of sound-absorbing porous metal materials
Development of a statistical shape model of multi-organ and its performance evaluation
International Nuclear Information System (INIS)
Nakada, Misaki; Shimizu, Akinobu; Kobatake, Hidefumi; Nawano, Shigeru
2010-01-01
Existing statistical shape modeling methods for a single organ cannot take into account the correlation between neighboring organs. This study focuses on a level set distribution model and proposes two modeling methods for multiple organs that take this correlation into account. The first method combines the level set functions of multiple organs into a vector. Subsequently, it analyses the distribution of the vectors of a training dataset by principal component analysis and builds a statistical shape model of the multiple organs. The second method constructs a statistical shape model for each organ independently and assembles the component scores of the different organs in a training dataset into a vector. It analyses the distribution of these vectors to build a statistical shape model of multiple organs. This paper shows the results of applying the proposed methods, trained on 15 abdominal CT volumes, to 8 unknown CT volumes. (author)
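The first method described in the abstract (concatenating shape vectors and applying principal component analysis) can be sketched as follows; the organ sizes and training data are synthetic stand-ins, not the paper's CT data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the training data: 15 subjects, two "organs", each organ's
# level set function flattened to a 100-sample vector (sizes are illustrative).
n_subjects, n_voxels = 15, 100
organ_a = rng.standard_normal((n_subjects, n_voxels))
organ_b = 0.7 * organ_a + 0.3 * rng.standard_normal((n_subjects, n_voxels))

# First method: concatenate the organs into one vector per subject so the principal
# component analysis captures inter-organ correlation.
X = np.hstack([organ_a, organ_b])
mean_shape = X.mean(axis=0)
Xc = X - mean_shape

# PCA via SVD; rows of Vt are the modes of combined shape variation.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)

# A plausible new shape: mean plus a multiple of the first mode.
b = np.zeros(len(S))
b[0] = 2.0 * S[0] / np.sqrt(n_subjects - 1)   # +2 standard deviations along mode 1
new_shape = mean_shape + b @ Vt
print(new_shape.shape, explained[:3])
```

Because the two organs are modeled jointly, deforming the leading mode moves both organs in a correlated way, which a per-organ model cannot do.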
Some remarks on the statistical model of heavy ion collisions
International Nuclear Information System (INIS)
Koch, V.
2003-01-01
This contribution is an attempt to assess what can be learned from the remarkable success of the statistical model in describing ratios of particle abundances in ultra-relativistic heavy ion collisions.
International Nuclear Information System (INIS)
2003-01-01
For the year 2002, part of the figures shown in the tables of the Energy Review are preliminary. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g. Energiatilastot 2001, Statistics Finland, Helsinki 2002). The applied energy units and conversion coefficients are shown on the inside back cover of the Review. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuel use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord Pool power exchange, Total energy consumption by source and CO2 emissions, Supply and total consumption of electricity (GWh), Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources, and Excise taxes, precautionary stock fees and oil pollution fees on energy products.
International Nuclear Information System (INIS)
2000-01-01
For the years 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity consumption, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-June 2000, Energy exports by recipient country in January-June 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources, and Energy taxes and precautionary stock fees on oil products.
International Nuclear Information System (INIS)
2004-01-01
For the years 2003 and 2004, the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g. Energiatilastot, Statistics Finland, Helsinki 2003, ISSN 0785-3165). The applied energy units and conversion coefficients are shown on the inside back cover of the Review. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuel use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord Pool power exchange, Total energy consumption by source and CO2 emissions, Supply and total consumption of electricity (GWh), Energy imports by country of origin in January-March 2004, Energy exports by recipient country in January-March 2004, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources, and Excise taxes, precautionary stock fees and oil pollution fees on energy products.
Applications of spatial statistical network models to stream data
Isaak, Daniel J.; Peterson, Erin E.; Ver Hoef, Jay M.; Wenger, Seth J.; Falke, Jeffrey A.; Torgersen, Christian E.; Sowder, Colin; Steel, E. Ashley; Fortin, Marie-Josée; Jordan, Chris E.; Ruesch, Aaron S.; Som, Nicholas; Monestiez, Pascal
2014-01-01
Streams and rivers host a significant portion of Earth's biodiversity and provide important ecosystem services for human populations. Accurate information regarding the status and trends of stream resources is vital for their effective conservation and management. Most statistical techniques applied to data measured on stream networks were developed for terrestrial applications and are not optimized for streams. A new class of spatial statistical model, based on valid covariance structures for stream networks, can be used with many common types of stream data (e.g., water quality attributes, habitat conditions, biological surveys) through application of appropriate distributions (e.g., Gaussian, binomial, Poisson). The spatial statistical network models account for spatial autocorrelation (i.e., nonindependence) among measurements, which allows their application to databases with clustered measurement locations. Large amounts of stream data exist in many areas where spatial statistical analyses could be used to develop novel insights, improve predictions at unsampled sites, and aid in the design of efficient monitoring strategies at relatively low cost. We review the topic of spatial autocorrelation and its effects on statistical inference, demonstrate the use of spatial statistics with stream datasets relevant to common research and management questions, and discuss additional applications and development potential for spatial statistics on stream networks. Free software for implementing the spatial statistical network models has been developed that enables custom applications with many stream databases.
Possibilities of the Statistical Scoring Models' Application at Lithuanian Banks
Dzidzevičiūtė, Laima
2013-01-01
The goal of this dissertation is to develop a rating system for Lithuanian companies based on a statistical scoring model and to assess the possibilities of this system's application at Lithuanian banks. The dissertation consists of three chapters. The development and application peculiarities of rating systems based on statistical scoring models are described in the first chapter. In the second chapter, the results of the survey of commercial banks and foreign bank branches operating in the coun...
A nonextensive statistical model for the nucleon structure function
Energy Technology Data Exchange (ETDEWEB)
Trevisan, Luis A. [Departamento de Matematica e Estatistica, Universidade Estadual de Ponta Grossa, 84010-790, Ponta Grossa, PR (Brazil); Mirez, Carlos [Instituto de Ciencia, Engenharia e Tecnologia - ICET, Universidade Federal dos Vales do Jequitinhonha e Mucuri - UFVJM, Campus do Mucuri, Rua do Cruzeiro 01, Jardim Sao Paulo, 39803-371, Teofilo Otoni, Minas Gerais (Brazil)
2013-03-25
We studied an application of nonextensive thermodynamics to describe the structure function of the nucleon, in a model where the usual Fermi-Dirac and Bose-Einstein energy distributions are replaced by the equivalent functions of the q-statistics. The parameters of the model are an effective temperature T, the q parameter (from Tsallis statistics), and two chemical potentials given by the corresponding up (u) and down (d) quark normalizations in the nucleon.
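The substitution the abstract describes amounts to replacing the exponential in the Fermi-Dirac occupation with the Tsallis q-exponential. A minimal sketch; the parameter values below are illustrative only, not fitted values from the paper:

```python
import numpy as np

def q_exp(x, q):
    """Tsallis q-exponential; reduces to exp(x) as q -> 1."""
    x = np.asarray(x, dtype=float)
    if abs(q - 1.0) < 1e-12:
        return np.exp(x)
    # [1 + (1-q) x]^(1/(1-q)), with the usual cutoff where the base is negative
    return np.maximum(1.0 + (1.0 - q) * x, 0.0) ** (1.0 / (1.0 - q))

def fermi_dirac_q(E, mu, T, q):
    """q-generalized Fermi-Dirac occupation: exp replaced by the q-exponential."""
    return 1.0 / (q_exp((E - mu) / T, q) + 1.0)

# Illustrative values only (T and mu in the same arbitrary energy units as E).
E = np.linspace(0.0, 1.0, 5)
print(fermi_dirac_q(E, mu=0.3, T=0.1, q=1.05))
```

Setting q = 1 recovers the standard Fermi-Dirac distribution, so the q parameter directly measures the departure from extensive statistics.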
Improved analyses using function datasets and statistical modeling
John S. Hogland; Nathaniel M. Anderson
2014-01-01
Raster modeling is an integral component of spatial analysis. However, conventional raster modeling techniques can require a substantial amount of processing time and storage space and have limited statistical functionality and machine learning algorithms. To address this issue, we developed a new modeling framework using C# and ArcObjects and integrated that framework...
Thiessen, Erik D
2017-01-05
Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274, 1926-1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105, 2745-2750; Thiessen & Yee 2010 Child Development 81, 1287-1303; Saffran 2002 Journal of Memory and Language 47, 172-196; Misyak & Christiansen 2012 Language Learning 62, 302-331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39, 246-263; Thiessen et al. 2013 Psychological Bulletin 139, 792-814). From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik
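The sequential statistic mentioned above, transitional probability, is simple to compute from a syllable stream. A minimal sketch on a hypothetical Saffran-style mini-language (the two three-syllable "words" are invented for illustration):

```python
import random
from collections import Counter

def transitional_probabilities(syllables):
    """TP(b|a) = count(a, b) / count(a), the sequential statistic discussed above."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}

# Hypothetical mini-language: two three-syllable "words" concatenated in random
# order; within-word TPs are 1.0, TPs across word boundaries are about 0.5.
random.seed(0)
words = [["ba", "bi", "bu"], ["go", "la", "tu"]]
stream = [s for _ in range(200) for s in random.choice(words)]
tp = transitional_probabilities(stream)
print(tp[("ba", "bi")], tp.get(("bu", "go"), 0.0))
```

The dips in transitional probability mark the word boundaries, which is the cue infants are argued to exploit in the segmentation studies cited above.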
International Nuclear Information System (INIS)
Procaccia, H.; Cordier, R.; Muller, S.
1994-07-01
Statistical decision theory could be an alternative for the optimization of preventive maintenance periodicity. In effect, this theory concerns situations in which a decision maker has to choose among a set of reasonable decisions, and where the loss associated with a given decision depends on a probabilistic risk, called the state of nature. In the case of maintenance optimization, the decisions to be analyzed are the different periodicities proposed by the experts given the observed feedback experience, the states of nature are the associated failure probabilities, and the losses are the expectations of the induced cost of maintenance and of the consequences of failures. As the failure probabilities concern rare events, at the ultimate state of RCM analysis (failure of a sub-component), and as the expected foreseeable behaviour of equipment has to be evaluated by experts, a Bayesian approach is successfully used to compute the states of nature. In Bayesian decision theory, a prior distribution for the failure probabilities is modeled from expert knowledge and combined with the sparse stochastic information provided by feedback experience, giving a posterior distribution of failure probabilities. The optimized decision is the one that minimizes the expected loss over the posterior distribution. This methodology has been applied to the inspection and maintenance optimization of the cylinders of diesel generator engines of 900 MW nuclear plants. In these plants, auxiliary electric power is supplied by 2 redundant diesel generators, which are tested every 2 weeks for about 1 hour. Until now, during the yearly refueling of each plant, one endoscopic inspection of the diesel cylinders is performed, and every 5 operating years all cylinders are replaced. RCM has shown that cylinder failures could be critical, so Bayesian decision theory has been applied, taking into account expert opinions and the possibility of aging when the maintenance periodicity is extended. (authors). 8 refs., 5 figs., 1 tab
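The Bayesian decision step described above can be sketched numerically: a Beta prior encoding expert judgment is updated with sparse feedback data, and the periodicity minimizing the posterior expected loss is chosen. All counts, costs and the failure-growth model below are hypothetical illustrations, not the paper's values:

```python
import numpy as np

# Expert prior on the per-period failure probability, encoded as a Beta density.
prior_a, prior_b = 1.0, 49.0                  # prior mean 0.02 (hypothetical)

# Sparse feedback experience: 2 failures observed over 120 periods (hypothetical).
failures, periods = 2, 120
post_a, post_b = prior_a + failures, prior_b + (periods - failures)

rng = np.random.default_rng(0)
p = rng.beta(post_a, post_b, size=100_000)    # samples from the posterior

# Candidate periodicities (years) with yearly maintenance cost; the loss is the
# maintenance cost plus the failure-consequence cost averaged over the posterior.
maintenance_cost = {1: 10.0, 2: 5.0, 5: 2.0}
consequence = 500.0

def expected_loss(period):
    fail_prob = 1.0 - (1.0 - p) ** period     # crude growth of risk with interval
    return maintenance_cost[period] + consequence * fail_prob.mean()

best = min(maintenance_cost, key=expected_loss)
print({T: round(expected_loss(T), 1) for T in maintenance_cost}, "->", best)
```

The same structure extends to the aging effect mentioned in the abstract by letting the per-period failure probability grow with the interval rather than staying constant.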
Modeling and analysis with induction generators
Simões, M Godoy
2014-01-01
Foreword; Preface; Acknowledgments; Authors; Principles of Alternative Sources of Energy and Electric Generation: Scope of This Chapter; Legal Definitions; Principles of Electrical Conversion; Basic Definitions of Electrical Power; Characteristics of Primary Sources; Characteristics of Remote Industrial, Commercial, and Residential Sites and Rural Energy; Selection of the Electric Generator; Interfacing Primary Source, Generator, and Load; Example of a Simple Integrated Generating and Energy-Storing System; Solved Problems; Suggested Problems; References. Steady-State Model of Induction Generators: Scope of This Chapter; Interconnection and Disconnection of the Electric Distribution Network; Robustness of Induction Generators; Classical Steady-State Representation of the Asynchronous Machine; Generated Power; Induced Torque; Representation of Induction Generator Losses; Measurement of Induction Generator Parameters; Blocked Rotor Test (s = 1); No-Load Test (s = 0); Features of Induction Machines Working as Generators Interconnected to the Distribution Network; High-...
Statistical mechanics of directed models of polymers in the square lattice
Rensburg, J V
2003-01-01
Directed square lattice models of polymers and vesicles have received considerable attention in the recent mathematical and physical sciences literature. These are idealized geometric directed lattice models introduced to study phase behaviour in polymers, and include Dyck paths, partially directed paths, directed trees and directed vesicle models. Directed models are closely related to models studied in the combinatorics literature (and are often exactly solvable). They are also simplified versions of a number of statistical mechanics models, including the self-avoiding walk, lattice animals and lattice vesicles. The exchange of approaches and ideas between statistical mechanics and combinatorics has considerably advanced the description and understanding of directed lattice models, and this will be explored in this review. The combinatorial nature of directed lattice path models makes a study using generating function approaches most natural. In contrast, the statistical mechanics approach would introduce...
Models for probability and statistical inference theory and applications
Stapleton, James H
2007-01-01
This concise, yet thorough, book is enhanced with simulations and graphs to build the intuition of readers. Models for Probability and Statistical Inference was written over a five-year period and serves as a comprehensive treatment of the fundamentals of probability and statistical inference. With detailed theoretical coverage found throughout the book, readers acquire the fundamentals needed to advance to more specialized topics, such as sampling, linear models, design of experiments, statistical computing, survival analysis, and bootstrapping. Ideal as a textbook for a two-semester sequence on probability and statistical inference, early chapters provide coverage on probability and include discussions of: discrete models and random variables; discrete distributions including binomial, hypergeometric, geometric, and Poisson; continuous, normal, gamma, and conditional distributions; and limit theory. Since limit theory is usually the most difficult topic for readers to master, the author thoroughly discusses mo...
Statistical detection model for eddy-current systems
International Nuclear Information System (INIS)
Martinez, J.R.; Bahr, A.J.
1984-01-01
This chapter presents a detailed analysis of some measured noise data and the results of using those data with a probe-flaw interaction model to compute the surface-crack detection characteristics of two different air-core coil probes. The objective is to develop a statistical model for determining the probability of detecting a given flaw using an eddy-current system. The basis for developing a statistical detection model is a measurement model that relates the output voltage of the system to its various signal and noise components. Topics considered include statistics of the measured background voltage, calibration of the probe-flaw interaction model and signal-to-noise ratio (SNR) definition, the operating characteristic, and a comparison of air-core probes
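The detection characteristic such a measurement model yields can be sketched for the simplest case, a constant flaw signal added to unit-variance Gaussian background noise; the false-alarm rate and SNR values below are illustrative, not the chapter's measured data:

```python
from statistics import NormalDist

def detection_probability(snr_db, p_false_alarm=0.01):
    """P(detection) for a constant flaw signal in unit-variance Gaussian noise."""
    nd = NormalDist()
    threshold = nd.inv_cdf(1.0 - p_false_alarm)   # threshold set from noise statistics
    amplitude = 10 ** (snr_db / 20.0)             # SNR in dB -> linear signal amplitude
    return 1.0 - nd.cdf(threshold - amplitude)

# Operating characteristic: detection probability rises with SNR at a fixed
# false-alarm rate.
for snr in (0.0, 6.0, 12.0):
    print(snr, round(detection_probability(snr), 3))
```

Sweeping `p_false_alarm` at fixed SNR traces out the operating characteristic the chapter discusses, with the probe-flaw interaction model supplying the signal amplitude for a given flaw.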
Energy Technology Data Exchange (ETDEWEB)
Wilson, Kevin R.; Smith, Jared D.; Kessler, Sean; Kroll, Jesse H.
2011-10-03
The heterogeneous reactions of hydroxyl radicals (OH) with squalane and bis(2-ethylhexyl) sebacate (BES) particles are used as model systems to examine how distributions of reaction products evolve during the oxidation of chemically reduced organic aerosol. A kinetic model of multigenerational chemistry, which is compared to previously measured (squalane) and new (BES) experimental data, reveals that it is the statistical mixtures of different generations of oxidation products that control the average particle mass and elemental composition during the reaction. The model suggests that more highly oxidized reaction products, although initially formed with low probability, play a large role in the production of gas-phase reaction products. In general, these results highlight the importance of considering atmospheric oxidation as a statistical process, further suggesting that the underlying distribution of molecules could play important roles in aerosol formation as well as in the evolution of key physicochemical properties such as volatility and hygroscopicity.
Next Generation Transport Phenomenology Model
Strickland, Douglas J.; Knight, Harold; Evans, J. Scott
2004-01-01
This report describes the progress made in Quarter 3 of Contract Year 3 on the development of Aeronomy Phenomenology Modeling Tool (APMT), an open-source, component-based, client-server architecture for distributed modeling, analysis, and simulation activities focused on electron and photon transport for general atmospheres. In the past quarter, column emission rate computations were implemented in Java, preexisting Fortran programs for computing synthetic spectra were embedded into APMT through Java wrappers, and work began on a web-based user interface for setting input parameters and running the photoelectron and auroral electron transport models.
Modelling with stakeholders - Next generation
Voinov, Alexey; Kolagani, Nagesh; McCall, Michael K; Glynn, Pierre D.; Kragt, Marit E; Ostermann, Frank O; Pierce, Suzanne A; Ramu, Palaniappan
2016-01-01
This paper updates and builds on 'Modelling with Stakeholders' (Voinov and Bousquet, 2010), which demonstrated the importance of, and demand for, stakeholder participation in resource and environmental modelling. This position paper returns to the concepts of that publication and reviews the progress made since 2010. A new development is the wide introduction and acceptance of social media and web applications, which dramatically changes the context and scale of stakeholder interactions and participation. Technology advances make it easier to incorporate information in interactive formats via visualization and games to augment participatory experiences. Citizens as stakeholders are increasingly demanding to be engaged in planning decisions that affect them and their communities, at scales from local to global. How people interact with and access models and data is rapidly evolving. In turn, this requires changes in how models are built, packaged, and disseminated: citizens are less in awe of experts and external authorities, and they are increasingly aware of their own capabilities to provide inputs to planning processes, including models. The continued acceleration of environmental degradation and natural resource depletion accompanies these societal changes, even as there is a growing acceptance of the need to transition to alternative, possibly very different, lifestyles. Substantive transitions cannot occur without significant changes in human behaviour and perceptions. The important and diverse roles that models can play in guiding human behaviour, and in disseminating and increasing societal knowledge, are a feature of stakeholder processes today.
Linear mixed models a practical guide using statistical software
West, Brady T; Galecki, Andrzej T
2006-01-01
Simplifying the often confusing array of software programs for fitting linear mixed models (LMMs), Linear Mixed Models: A Practical Guide Using Statistical Software provides a basic introduction to primary concepts, notation, software implementation, model interpretation, and visualization of clustered and longitudinal data. This easy-to-navigate reference details the use of procedures for fitting LMMs in five popular statistical software packages: SAS, SPSS, Stata, R/S-plus, and HLM. The authors introduce basic theoretical concepts, present a heuristic approach to fitting LMMs based on bo
Statistical Model and the mesonic-baryonic transition region
Oeschler, H.; Redlich, K.; Wheaton, S.
2009-01-01
The statistical model assuming chemical equilibrium and local strangeness conservation describes most of the observed features of strange particle production from SIS up to RHIC. Deviations are found as the maximum in the measured K+/pi+ ratio is much sharper than in the model calculations. At the incident energy of the maximum, the statistical model shows that freeze-out changes regime from one dominated by baryons at the lower energies toward one dominated by mesons. It will be shown how deviations from the usual freeze-out curve influence the various particle ratios. Furthermore, other observables also exhibit changes in just this energy regime.
Multiple commodities in statistical microeconomics: Model and market
Baaquie, Belal E.; Yu, Miao; Du, Xin
2016-11-01
A statistical generalization of microeconomics was made in Baaquie (2013). In Baaquie et al. (2015), the market behavior of single commodities was analyzed and it was shown that market data provide strong support for the statistical microeconomic description of commodity prices. Here the case of multiple commodities is studied and a parsimonious generalization of the single-commodity model is made for the multiple-commodity case. Market data show that the generalization can accurately model the simultaneous correlation functions of up to four commodities. To accurately model five or more commodities, further terms have to be included in the model. This study shows that the statistical microeconomics approach is a comprehensive and complete formulation of microeconomics, and one that is independent of the mainstream formulation of microeconomics.
Generation of Java code from Alvis model
Matyasik, Piotr; Szpyrka, Marcin; Wypych, Michał
2015-12-01
Alvis is a formal language that combines graphical modelling of interconnections between system entities (called agents) with a high-level programming language for describing the behaviour of individual agents. An Alvis model can be verified formally with model-checking techniques applied to the model's LTS graph, which represents the model state space. This paper presents the transformation of an Alvis model into executable Java code. Thus, the approach provides a method for automatic generation of a Java application from a formally verified Alvis model.
Multi-region Statistical Shape Model for Cochlear Implantation
DEFF Research Database (Denmark)
Romera, Jordi; Kjer, H. Martin; Piella, Gemma
2016-01-01
Statistical shape models are commonly used to analyze the variability between similar anatomical structures and their use is established as a tool for analysis and segmentation of medical images. However, using a global model to capture the variability of complex structures is not enough to achie...
Evaluation of Statistical Models for Analysis of Insect, Disease and ...
African Journals Online (AJOL)
It is concluded that LMMs and GLMs simultaneously consider the effect of treatments and heterogeneity of variance and hence are more appropriate for analysis of abundance and incidence data than ordinary ANOVA. Keywords: Mixed Models; Generalized Linear Models; Statistical Power East African Journal of Sciences ...
Poppe, L.J.; Eliason, A.H.; Hastings, M.E.
2004-01-01
Measures that describe and summarize sediment grain-size distributions are important to geologists because of the large amount of information contained in textural data sets. Statistical methods are usually employed to simplify the necessary comparisons among samples and quantify the observed differences. The two statistical methods most commonly used by sedimentologists to describe particle distributions are mathematical moments (Krumbein and Pettijohn, 1938) and inclusive graphics (Folk, 1974). The choice of which of these statistical measures to use is typically governed by the amount of data available (Royse, 1970). If the entire distribution is known, the method of moments may be used; if the next-to-last accumulated percent is greater than 95, inclusive graphics statistics can be generated. Unfortunately, earlier programs designed to describe sediment grain-size distributions statistically do not run in a Windows environment, do not allow extrapolation of the distribution's tails, or do not generate both moment and graphic statistics (Kane and Hubert, 1963; Collias et al., 1963; Schlee and Webster, 1967; Poppe et al., 2000). Owing to analytical limitations, electro-resistance multichannel particle-size analyzers, such as Coulter Counters, commonly truncate the tails of the fine-fraction part of grain-size distributions. These devices do not detect fine clay in the 0.6–0.1 μm range (part of the 11-phi and all of the 12-phi and 13-phi fractions). Although size analyses performed down to 0.6 μm are adequate for most freshwater and nearshore marine sediments, samples from many deeper-water marine environments (e.g. rise and abyssal plain) may contain significant material in the fine clay fraction, and these analyses benefit from extrapolation. The program (GSSTAT) described herein generates statistics to characterize sediment grain-size distributions and can extrapolate the fine-grained end of the particle distribution. It is written in Microsoft
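The two statistical approaches the abstract contrasts can be sketched side by side. The distribution below is hypothetical (whole-phi bins, weight percents summing to 100); the moment formulas follow Krumbein and Pettijohn, and the graphic formulas follow Folk's inclusive graphics, which need only selected percentiles of the cumulative curve:

```python
import numpy as np

# Hypothetical grain-size distribution: weight percent per whole-phi bin
phi = np.arange(-1, 12)                       # bin midpoints (phi units)
wt = np.array([1, 2, 5, 10, 18, 22, 18, 10, 7, 4, 2, 0.7, 0.3])
w = wt / wt.sum()

# Method of moments: uses the whole distribution
mean_mom = np.sum(w * phi)
sort_mom = np.sqrt(np.sum(w * (phi - mean_mom) ** 2))

# Inclusive graphics (Folk): uses only percentiles of the cumulative curve
# (interpolating midpoints against cumulative percent is a simplification)
cum = np.cumsum(w) * 100
def pct(p):
    return np.interp(p, cum, phi)
p16, p50, p84 = pct(16), pct(50), pct(84)
mean_folk = (p16 + p50 + p84) / 3
sort_folk = (p84 - p16) / 4 + (pct(95) - pct(5)) / 6.6
```

For a skewed distribution like this one the two means differ by a few tenths of a phi unit, which is exactly why programs such as GSSTAT report both sets of statistics.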
Validation of statistical models for creep rupture by parametric analysis
Energy Technology Data Exchange (ETDEWEB)
Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)
2012-01-15
Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. Highlights: • The paper discusses the validation of creep rupture models derived from statistical analysis. • It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. • The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. • The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).
A balanced team generating model
van de Water, Tara; van de Water, Henny; Bukman, Cock
2007-01-01
This paper introduces a general team balancing model. It first summarizes existing balancing methods. It is shown that for these methods it is difficult to meet all the conditions posed by Belbin on balanced teams. This is mainly caused by the complexity of the balancing problem. A mathematical
2015-09-30
information on fish school distributions by monitoring the direction of birds returning to the colony or the behavior of other birds at sea through... active sonar. Toward this goal, fundamental advances in the understanding of fish behavior, especially in aggregations, will be made under conditions... relevant to the echo statistics problem. OBJECTIVES: To develop new models of behavior of fish aggregations, including the fission/fusion process
Understanding and forecasting polar stratospheric variability with statistical models
Directory of Open Access Journals (Sweden)
C. Blume
2012-07-01
Full Text Available The variability of the north-polar stratospheric vortex is a prominent aspect of the middle atmosphere. This work investigates a wide class of statistical models with respect to their ability to model geopotential and temperature anomalies, representing variability in the polar stratosphere. Four partly nonstationary, nonlinear models are assessed: linear discriminant analysis (LDA); a cluster method based on finite elements (FEM-VARX); a neural network, namely the multi-layer perceptron (MLP); and support vector regression (SVR). These methods model time series by incorporating all significant external factors simultaneously, including ENSO, the QBO, the solar cycle and volcanic eruptions, and then quantify their statistical importance. We show that variability in reanalysis data from 1980 to 2005 is successfully modeled. The period from 2005 to 2011 can be hindcast to a certain extent, where MLP performs significantly better than the remaining models. However, variability remains that cannot be statistically hindcast within the current framework, such as the unexpected major warming in January 2009. Finally, the statistical model with the best generalization performance is used to predict warm and weak vortex conditions for winter 2011/12. A vortex breakdown is predicted for late January or early February 2012.
Monthly to seasonal low flow prediction: statistical versus dynamical models
Ionita-Scholz, Monica; Klein, Bastian; Meissner, Dennis; Rademacher, Silke
2016-04-01
the Alfred Wegener Institute a purely statistical scheme to generate streamflow forecasts for several months ahead. Instead of directly using teleconnection indices (e.g. NAO, AO), the idea is to identify regions with stable teleconnections between different global climate information (e.g. sea surface temperature, geopotential height, etc.) and streamflow at different gauges relevant for inland waterway transport. So-called stability (correlation) maps are generated, showing regions where streamflow and a climate variable from previous months are significantly correlated in a 21-year (or 31-year) moving window. Finally, the optimal forecast model is established based on a multiple regression analysis of the stable predictors. We will present current results of the aforementioned approaches with focus on the River Rhine (being one of the world's most frequented waterways and the backbone of the European inland waterway network) and the Elbe River. Overall, our analysis reveals the existence of valuable predictability of low flows at monthly and seasonal time scales, a result that may be useful to water resources management. Given that all predictors used in the models are available at the end of each month, the forecast scheme can be used operationally to predict extreme events and to provide early warnings for upcoming low flows.
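The stability-map idea, a correlation recomputed in a moving window to check that a teleconnection persists, can be sketched on synthetic data (series lengths, the 21-year window, and the rough 95% critical value for r are taken from the abstract; the predictor and its link to streamflow are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
years = 60
sst = rng.normal(size=years)                    # hypothetical climate predictor
flow = 0.7 * sst + rng.normal(0.0, 0.7, years)  # streamflow linked to it

# Correlation in every 21-year moving window
win = 21
r = np.array([np.corrcoef(sst[i:i + win], flow[i:i + win])[0, 1]
              for i in range(years - win + 1)])

# A grid cell is marked "stable" when the correlation stays significant in
# (nearly) every window; for n = 21 the two-sided 95% critical value of r
# is roughly 0.43
stable_share = np.mean(np.abs(r) > 0.43)
```

In the operational scheme this test is run for every grid cell of the climate field, and only the stable cells feed the multiple regression.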
Digital relief generation from 3D models
Wang, Meili; Sun, Yu; Zhang, Hongming; Qian, Kun; Chang, Jian; He, Dongjian
2016-09-01
It is difficult to extend image-based relief generation to high-relief generation, as the images contain insufficient height information. To generate reliefs from three-dimensional (3D) models, it is necessary to extract the height fields from the model, but this can only generate bas-reliefs. To overcome this problem, an efficient method is proposed to generate bas-reliefs and high-reliefs directly from 3D meshes. To produce relief features that are visually appropriate, the 3D meshes are first scaled. 3D unsharp masking is used to enhance the visual features in the 3D mesh, and average smoothing and Laplacian smoothing are implemented to achieve better smoothing results. A nonlinear variable scaling scheme is then employed to generate the final bas-reliefs and high-reliefs. Using the proposed method, relief models can be generated from arbitrary viewing positions with different gestures and combinations of multiple 3D models. The generated relief models can be printed by 3D printers. The proposed method provides a means of generating both high-reliefs and bas-reliefs in an efficient and effective way under the appropriate scaling factors.
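The pipeline described, smoothing, unsharp masking to boost detail, then nonlinear scaling to compress depth, can be sketched on a height field rather than a full mesh (the grid, the bump-shaped surface, and all gains below are assumptions for illustration, not the paper's parameters):

```python
import numpy as np

# Hypothetical height field sampled from a 3D model on a 64x64 grid
n = 64
yy, xx = np.mgrid[0:n, 0:n] / (n - 1)
h = np.exp(-((xx - 0.5) ** 2 + (yy - 0.5) ** 2) / 0.05)   # a smooth bump

# 3x3 average smoothing gives the low-frequency base layer
k = np.ones((3, 3)) / 9.0
pad = np.pad(h, 1, mode="edge")
base = sum(pad[i:i + n, j:j + n] * k[i, j]
           for i in range(3) for j in range(3))

# Unsharp masking: amplify the detail layer (h - base)
detail = h - base
enhanced = base + 3.0 * detail

# Nonlinear variable scaling: compress large depths more than small ones,
# which is the basic step that turns a height field into a bas-relief
relief = np.log1p(5.0 * np.clip(enhanced, 0.0, None)) / np.log1p(5.0)
```

Swapping the logarithm for a gentler curve (or none) moves the result from bas-relief toward high-relief, which is the degree of freedom the paper exploits.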
Statistical Validation of Engineering and Scientific Models: Background
International Nuclear Information System (INIS)
Hills, Richard G.; Trucano, Timothy G.
1999-01-01
A tutorial is presented discussing the basic issues associated with propagation of uncertainty analysis and statistical validation of engineering and scientific models. The propagation of uncertainty tutorial illustrates the use of the sensitivity method and the Monte Carlo method to evaluate the uncertainty in predictions for linear and nonlinear models. Four example applications are presented: a linear model, a model for the behavior of a damped spring-mass system, a transient thermal conduction model, and a nonlinear transient convective-diffusive model based on Burgers' equation. Correlated and uncorrelated model input parameters are considered. The model validation tutorial builds on the material presented in the propagation of uncertainty tutorial and uses the damped spring-mass system as the example application. The validation tutorial illustrates several concepts associated with the application of statistical inference to test model predictions against experimental observations. Several validation methods are presented, including error-band-based, multivariate, sum-of-squares-of-residuals, and optimization methods. After completion of the tutorial, a survey of statistical model validation literature is presented and recommendations for future work are made.
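The two propagation methods the tutorial compares can be shown on a toy nonlinear model (the model, nominal values, and input standard deviations below are invented for illustration; the tutorial's own examples are the spring-mass and conduction models):

```python
import numpy as np

# Toy nonlinear model y = f(a, b) with independent uncertain inputs a, b
def f(a, b):
    return a * np.exp(-b)

a0, b0 = 2.0, 0.5          # nominal input values
sa, sb = 0.1, 0.05         # input standard deviations

# Sensitivity (first-order) method:
# var(y) ~ (df/da)^2 * sa^2 + (df/db)^2 * sb^2, derivatives at the nominals
dfda = np.exp(-b0)
dfdb = -a0 * np.exp(-b0)
sy_lin = np.sqrt(dfda ** 2 * sa ** 2 + dfdb ** 2 * sb ** 2)

# Monte Carlo method: sample the inputs, push them through the model
rng = np.random.default_rng(2)
a = rng.normal(a0, sa, 100_000)
b = rng.normal(b0, sb, 100_000)
sy_mc = f(a, b).std()
```

For a mildly nonlinear model with small input uncertainties the two answers agree closely; the gap between them grows with nonlinearity, which is why the tutorial walks through both.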
Statistical Validation of Normal Tissue Complication Probability Models
Energy Technology Data Exchange (ETDEWEB)
Xu Chengjian, E-mail: c.j.xu@umcg.nl [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schaaf, Arjen van der; Veld, Aart A. van' t; Langendijk, Johannes A. [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schilstra, Cornelis [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Radiotherapy Institute Friesland, Leeuwarden (Netherlands)
2012-09-01
Purpose: To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. Methods and Materials: A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Results: Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Conclusion: Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use.
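The validation recipe, penalized regression assessed by cross-validation plus a permutation test, can be sketched with scikit-learn on synthetic data (an L1-penalized logistic regression stands in for the paper's LASSO NTCP model; the features, outcome, and all settings below are invented, not clinical data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, permutation_test_score

# Synthetic stand-in for dose/volume features and a binary complication label
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 1.0, 200) > 0).astype(int)

# L1-penalized ("LASSO-like") logistic regression
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)

# Cross-validated discrimination (area under the ROC curve)
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()

# Permutation test: refit on label-shuffled data to get the null score
# distribution and a p-value for the observed performance
score, perm_scores, pvalue = permutation_test_score(
    clf, X, y, cv=5, scoring="roc_auc", n_permutations=100, random_state=0)
```

A model whose cross-validated AUC is not clearly separated from the permutation null is exactly the unstable case the authors warn against deploying clinically.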
Modern statistical models for forensic fingerprint examinations: a critical review.
Abraham, Joshua; Champod, Christophe; Lennard, Chris; Roux, Claude
2013-10-10
Over the last decade, the development of statistical models in support of forensic fingerprint identification has been the subject of increasing research attention, spurred on recently by commentators who claim that the scientific basis for fingerprint identification has not been adequately demonstrated. Such models are increasingly seen as useful tools in support of the fingerprint identification process within, or in addition to, the ACE-V framework. This paper provides a critical review of recent statistical models from both a practical and theoretical perspective. This includes analysis of models of two different methodologies: Probability of Random Correspondence (PRC) models, which focus on calculating probabilities of the occurrence of fingerprint configurations for a given population, and Likelihood Ratio (LR) models, which use analysis of corresponding features of fingerprints to derive a likelihood value representing the evidential weighting for a potential source. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Stochastic Modeling and Analysis of Power System with Renewable Generation
DEFF Research Database (Denmark)
Chen, Peiyuan
. With the increasing number of wind turbines (WTs) connected to distribution systems, network operators are concerned about how such a stochastic generation affects power losses of the network. Furthermore, the operators need to estimate how much and when the stochastic generation can reduce the loading of substation...... be achieved through a probabilistic analysis that takes into account the stochastic behavior of wind power generation (WPG) and load demand. Such a probabilistic analysis may help network operators to cut down the cost associated with system planning. Thus, the objective of this thesis is to develop...... stochastic models of renewable generation and load demand for the optimal operation and planning of modern distribution systems through a probabilistic approach. On the basis of statistical data, stochastic models of WPG, load and combined heat and power (CHP) generation are developed. The stochastic wind...
Growth Curve Models and Applications : Indian Statistical Institute
2017-01-01
Growth curve models in longitudinal studies are widely used to model population size, body height, biomass, fungal growth, and other variables in the biological sciences, but these statistical methods for modeling growth curves and analyzing longitudinal data also extend to general statistics, economics, public health, demographics, epidemiology, SQC, sociology, nano-biotechnology, fluid mechanics, and other applied areas. There is no one-size-fits-all approach to growth measurement. The selected papers in this volume build on presentations from the GCM workshop held at the Indian Statistical Institute, Giridih, on March 28-29, 2016. They represent recent trends in GCM research on different subject areas, both theoretical and applied. This book includes tools and possibilities for further work through new techniques and modification of existing ones. The volume includes original studies, theoretical findings and case studies from a wide range of applied work, and these contributions have been externally r...
Statistical modelling for recurrent events: an application to sports injuries.
Ullah, Shahid; Gabbett, Tim J; Finch, Caroline F
2014-09-01
Injuries are often recurrent, with subsequent injuries influenced by previous occurrences and hence correlation between events needs to be taken into account when analysing such data. This paper compares five different survival models (Cox proportional hazards (CoxPH) model and the following generalisations to recurrent event data: Andersen-Gill (A-G), frailty, Wei-Lin-Weissfeld total time (WLW-TT) marginal, Prentice-Williams-Peterson gap time (PWP-GT) conditional models) for the analysis of recurrent injury data. Empirical evaluation and comparison of different models were performed using model selection criteria and goodness-of-fit statistics. Simulation studies assessed the size and power of each model fit. The modelling approach is demonstrated through direct application to Australian National Rugby League recurrent injury data collected over the 2008 playing season. Of the 35 players analysed, 14 (40%) players had more than 1 injury and 47 contact injuries were sustained over 29 matches. The CoxPH model provided the poorest fit to the recurrent sports injury data. The fit was improved with the A-G and frailty models, compared to WLW-TT and PWP-GT models. Despite little difference in model fit between the A-G and frailty models, in the interest of fewer statistical assumptions it is recommended that, where relevant, future studies involving modelling of recurrent sports injury data use the frailty model in preference to the CoxPH model or its other generalisations. The paper provides a rationale for future statistical modelling approaches for recurrent sports injury. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
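The reason the frailty model wins here can be illustrated with a small simulation (a sketch with assumed rates, not the paper's data; the study had 35 players, but a larger synthetic cohort is used so the variance estimate is stable): a shared per-player random effect makes one player's injuries correlated and inflates the variance of injury counts beyond the Poisson mean, which a plain CoxPH or Poisson fit ignores.

```python
import numpy as np

rng = np.random.default_rng(4)
n_players, exposure = 2000, 29          # synthetic cohort; 29 matches as in the study

# Gamma frailty: each player carries an unobserved injury-proneness
# multiplier with mean 1, inducing within-player correlation of events
frailty = rng.gamma(shape=2.0, scale=0.5, size=n_players)
base_rate = 0.05                        # assumed injuries per match
counts = rng.poisson(base_rate * frailty * exposure)

# A plain Poisson model forces variance == mean; frailty inflates it:
# Var = mu + mu^2 * Var(frailty), so the dispersion ratio exceeds 1
m, v = counts.mean(), counts.var()
dispersion = v / m
```

A dispersion ratio well above 1 is the signature of between-player heterogeneity, and it is exactly what the frailty term absorbs while the CoxPH model cannot.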
Parametric analysis of the statistical model of the stick-slip process
Lima, Roberta; Sampaio, Rubens
2017-06-01
In this paper, a parametric analysis of the statistical model of the response of a dry-friction oscillator is performed. The oscillator is a spring-mass system which moves over a base with a rough surface. Due to this roughness, the mass is subject to a dry-friction force modeled as Coulomb friction. The system is stochastically excited by an imposed bang-bang base motion. The base velocity is modeled by a Poisson process for which a probabilistic model is fully specified. The excitation induces stochastic stick-slip oscillations in the system. The system response is composed of a random sequence of alternating stick- and slip-modes. From realizations of the system, a statistical model is constructed for this sequence. In this statistical model, the variables of interest of the sequence are modeled as random variables: for example, the number of time intervals in which stick or slip occurs, the instants at which they begin, and their durations. Samples of the system response are computed by integration of the dynamic equation of the system using independent samples of the base motion. Statistics and histograms of the random variables which characterize the stick-slip process are estimated from the generated samples. The objective of the paper is to analyze how these estimated statistics and histograms vary with the system parameters, i.e., to perform a parametric analysis of the statistical model of the stick-slip process.
A Statistical and Spectral Model for Representing Noisy Sounds with Short-Time Sinusoids
Directory of Open Access Journals (Sweden)
Myriam Desainte-Catherine
2005-07-01
We propose an original model for noise analysis, transformation, and synthesis: the CNSS model. Noisy sounds are represented with short-time sinusoids whose frequencies and phases are random variables. This spectral and statistical model represents information about the spectral density of frequencies. This perceptually relevant property is modeled by three mathematical parameters that define the distribution of the frequencies. This model also represents the spectral envelope. The mathematical parameters are defined and the analysis algorithms to extract these parameters from sounds are introduced. Then algorithms for generating sounds from the parameters of the model are presented. Applications of this model include tools for composers, psychoacoustic experiments, and pedagogy.
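The core synthesis idea, noise rebuilt from short-time sinusoids whose frequencies and phases are random variables, can be sketched directly (the sample rate, frame length, band, and uniform frequency distribution below are assumptions for illustration; the CNSS model itself parameterizes the frequency distribution more carefully):

```python
import numpy as np

rng = np.random.default_rng(5)
sr = 16000                       # sample rate (Hz), assumed
frame = 512                      # short-time frame length, assumed
n_frames, n_partials = 20, 50

out = np.zeros(n_frames * frame)
t = np.arange(frame) / sr
for i in range(n_frames):
    # Per frame: draw random frequencies from the band to fill and random
    # phases; the frequency distribution controls the perceived spectral
    # density, which is the perceptually relevant property the model encodes
    freqs = rng.uniform(500, 4000, n_partials)
    phases = rng.uniform(0, 2 * np.pi, n_partials)
    sig = np.sin(2 * np.pi * freqs[:, None] * t + phases[:, None]).sum(0)
    out[i * frame:(i + 1) * frame] = sig / n_partials
```

Narrowing or reshaping the distribution that `freqs` is drawn from changes the noise color, which is the transformation handle the model offers composers.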
The Statistical Modeling of the Trends Concerning the Romanian Population
Directory of Open Access Journals (Sweden)
Gabriela OPAIT
2014-11-01
This paper presents the statistical modeling of trends in the resident population of Romania, that is, the total Romanian population, by means of the least squares method. A country develops through growth of its population, and hence of its workforce, which is a factor influencing the growth of the Gross Domestic Product (GDP). The least squares method is a statistical technique for determining the trend line of best fit for a model.
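The least squares trend line the paper uses amounts to one line of NumPy (the yearly series below is hypothetical, invented to mimic a slowly declining population; it is not the paper's data):

```python
import numpy as np

# Hypothetical yearly resident-population series (millions)
years = np.arange(2000, 2014)
pop = np.array([22.4, 22.4, 21.8, 21.7, 21.7, 21.6, 21.6, 21.5,
                21.5, 21.5, 21.4, 20.2, 20.1, 20.0])

# Least squares trend line: minimize the sum of squared residuals
slope, intercept = np.polyfit(years, pop, deg=1)
trend = slope * years + intercept
residual_ss = np.sum((pop - trend) ** 2)
```

The sign of `slope` summarizes the trend (negative here, a declining population), and `residual_ss` is the quantity the least squares criterion minimizes.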
Statistical Model of the 2001 Czech Census for Interactive Presentation
Czech Academy of Sciences Publication Activity Database
Grim, Jiří; Hora, Jan; Boček, Pavel; Somol, Petr; Pudil, Pavel
Vol. 26, č. 4 (2010), s. 1-23 ISSN 0282-423X R&D Projects: GA ČR GA102/07/1594; GA MŠk 1M0572 Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : Interactive statistical model * census data presentation * distribution mixtures * data modeling * EM algorithm * incomplete data * data reproduction accuracy * data mining Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.492, year: 2010 http://library.utia.cas.cz/separaty/2010/RO/grim-0350513.pdf
Applied systems ecology: models, data, and statistical methods
Energy Technology Data Exchange (ETDEWEB)
Eberhardt, L L
1976-01-01
In this report, systems ecology is largely equated to mathematical or computer simulation modelling. The need for models in ecology stems from the necessity to have an integrative device for the diversity of ecological data, much of which is observational, rather than experimental, as well as from the present lack of a theoretical structure for ecology. Different objectives in applied studies require specialized methods. The best predictive devices may be regression equations, often non-linear in form, extracted from much more detailed models. A variety of statistical aspects of modelling, including sampling, are discussed. Several aspects of population dynamics and food-chain kinetics are described, and it is suggested that the two presently separated approaches should be combined into a single theoretical framework. It is concluded that future efforts in systems ecology should emphasize actual data and statistical methods, as well as modelling.
Analyzing sickness absence with statistical models for survival data
DEFF Research Database (Denmark)
Christensen, Karl Bang; Andersen, Per Kragh; Smith-Hansen, Lars
2007-01-01
OBJECTIVES: Sickness absence is the outcome in many epidemiologic studies and is often based on summary measures such as the number of sickness absences per year. In this study the use of modern statistical methods was examined by making better use of the available information. Since sickness absence data deal with events occurring over time, the use of statistical models for survival data has been reviewed, and the use of frailty models has been proposed for the analysis of such data. METHODS: Three methods for analyzing data on sickness absences were compared using a simulation study involving the following: (i) Poisson regression using a single outcome variable (number of sickness absences), (ii) analysis of time to first event using the Cox proportional hazards model, and (iii) frailty models, which are random effects proportional hazards models. Data from a study of the relation...
Linking statistical bias description to multiobjective model calibration
Reichert, P.; Schuwirth, N.
2012-09-01
In the absence of model deficiencies, simulation results at the correct parameter values lead to an unbiased description of observed data with remaining deviations due to observation errors only. However, this ideal cannot be reached in the practice of environmental modeling, because the required simplified representation of the complex reality by the model and errors in model input lead to errors that are reflected in biased model output. This leads to two related problems: First, ignoring bias of output in the statistical model description leads to bias in parameter estimates, model predictions and, in particular, in the quantification of their uncertainty. Second, as there is no objective choice of how much bias to accept in which output variable, it is not possible to design an "objective" model calibration procedure. The first of these problems has been addressed by introducing a statistical (Bayesian) description of bias, the second by suggesting the use of multiobjective calibration techniques that cannot easily be used for uncertainty analysis. We merge the ideas of these two approaches by using the prior of the statistical bias description to quantify the importance of multiple calibration objectives. This leads to probabilistic inference and prediction while still taking multiple calibration objectives into account. The ideas and technical details of the suggested approach are outlined and a didactical example as well as an application to environmental data are provided to demonstrate its practical feasibility and computational efficiency.
Parametric study for horizontal steam generator modelling
Energy Technology Data Exchange (ETDEWEB)
Ovtcharova, I. [Energoproekt, Sofia (Bulgaria)
1995-12-31
In this presentation, some calculated results of modelling the horizontal steam generator PGV-440 with RELAP5/Mod3 are described. Two nodalization schemes with different components in the steam dome have been used. The effect of parameter variation on steam generator operation and on the calculated results is studied for cases with separator and branch components.
Liu, Chen; Wu, Xin-wu
2011-04-01
A relationship between waste production and socio-economic factors is essential in waste management. In the present study, the factors influencing municipal solid waste generation in China were investigated by multivariate statistical analysis. Twelve items were chosen for investigation: GDP, per capita GDP, urban population, the proportion of urban population, the area of urban construction, the area of paved roads, the area of urban gardens and green areas, the number of large cities, annual per capita disposable income of urban households, annual per capita consumption expenditure of urban households, total energy consumption and annual per capita consumption for households. Two methodologies from multivariate statistical analysis were selected: principal components analysis (PCA) and cluster analysis (CA). Three new dimensions were identified by PCA: component 1, economy and urban development; component 2, energy consumption; and component 3, urban scale. The three components together accounted for 99.1% of the initial variance. The results show that economy and urban development are important items influencing MSW generation. The proportion of urban population and urban population had the highest loadings in all factors. The relationship between growth of gross domestic product (GDP) and production of MSW was not as clear-cut in China as often assumed, a situation that is more likely to apply to developed countries. Energy consumption was another factor considered in our study of MSW generation. In addition, the annual variation in MSW quantity was investigated by cluster analysis.
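The PCA step, finding a few components that absorb nearly all the variance of many correlated indicators, can be sketched via the SVD (the data below are synthetic, built from three latent factors to mimic the paper's 12-indicator setting; they are not the Chinese statistics):

```python
import numpy as np

rng = np.random.default_rng(6)
# Synthetic stand-in for 12 socio-economic indicators over 20 years,
# driven by 3 latent factors (think: economy, energy, urban scale)
factors = rng.normal(size=(20, 3))
loadings = rng.normal(size=(3, 12))
X = factors @ loadings + 0.05 * rng.normal(size=(20, 12))

# PCA via SVD of the standardized data matrix
Z = (X - X.mean(0)) / X.std(0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s ** 2 / np.sum(s ** 2)     # variance share per component
top3 = explained[:3].sum()
```

With three true factors, the first three components carry almost all the variance, the same pattern as the 99.1% reported for the real indicators; the rows of `Vt` are the loadings used to interpret each component.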
Statistical learning modeling method for space debris photometric measurement
Sun, Wenjing; Sun, Jinqiu; Zhang, Yanning; Li, Haisen
2016-03-01
Photometric measurement is an important way to identify space debris, but present photometric methods impose many constraints on the star image and require complex image processing. To address these problems, a statistical learning modeling method for space debris photometric measurement is proposed based on the global consistency of the star image, and the statistical information of star images is used to eliminate measurement noise. First, the known stars on the star image are divided into training stars and testing stars. Then, the training stars are used in a least squares fit to construct the photometric measurement model, and the testing stars are used to calculate the measurement accuracy of the model. Experimental results show that the accuracy of the proposed photometric measurement model is about 0.1 magnitudes.
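The train/test scheme the abstract describes can be sketched with a linear photometric model fit by least squares (the star magnitudes, the linear instrument response, and the noise level below are all assumptions for illustration, not the paper's data or its exact model form):

```python
import numpy as np

rng = np.random.default_rng(7)
n_stars = 40
catalog_mag = rng.uniform(6, 12, n_stars)        # known catalog magnitudes
# Hypothetical instrumental magnitudes: linear response plus noise
instr = -2.5 + 1.0 * catalog_mag + rng.normal(0.0, 0.05, n_stars)

# Split the known stars into training and testing sets
train, test = np.arange(0, 30), np.arange(30, 40)

# Least squares fit of the photometric model: catalog = a * instr + b
A = np.vstack([instr[train], np.ones(train.size)]).T
(a, b), *_ = np.linalg.lstsq(A, catalog_mag[train], rcond=None)

# Measurement accuracy evaluated on the held-out testing stars
pred = a * instr[test] + b
rms = np.sqrt(np.mean((pred - catalog_mag[test]) ** 2))
```

The held-out RMS plays the role of the paper's reported accuracy figure; with the assumed 0.05-magnitude noise it lands below the 0.1-magnitude level the authors quote.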
Statistical, Morphometric, Anatomical Shape Model (Atlas) of Calcaneus
Melinska, Aleksandra U.; Romaszkiewicz, Patryk; Wagel, Justyna; Sasiadek, Marek; Iskander, D. Robert
2015-01-01
The aim was to develop a morphometric and anatomically accurate atlas (statistical shape model) of the calcaneus. The model is based on 18 left foot and 18 right foot computed tomography studies of 28 male individuals aged from 17 to 62 years, with no known foot pathology. The procedure for automatic atlas generation included extraction and identification of common features, averaging of feature positions, obtaining the mean geometry, mathematical shape description and variability analysis. Expert manual assistance was included for the model to fulfil the accuracy sought by medical professionals. The proposed statistical shape model of the calcaneus, the first of its kind, could be of value in many orthopaedic applications, including support in diagnosing pathological lesions, pre-operative planning, classification and treatment of calcaneus fractures, as well as the development of future implant procedures. PMID:26270812
Statistical Modeling for Radiation Hardness Assurance: Toward Bigger Data
Ladbury, R.; Campola, M. J.
2015-01-01
New approaches to statistical modeling in radiation hardness assurance are discussed. These approaches yield quantitative bounds on flight-part radiation performance even in the absence of conventional data sources. This allows the analyst to bound radiation risk at all stages and for all decisions in the RHA process. It also allows optimization of RHA procedures for the project's risk tolerance.
Interactive comparison of hypothesis tests for statistical model checking
de Boer, Pieter-Tjerk; Reijsbergen, D.P.; Scheinhardt, Willem R.W.
2015-01-01
We present a web-based interactive comparison of hypothesis tests as are used in statistical model checking, providing users and tool developers with more insight into their characteristics. Parameters can be modified easily and their influence is visualized in real time; an integrated simulation
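One of the classic hypothesis tests compared in statistical model checking is Wald's sequential probability ratio test (SPRT), which samples runs one at a time until the evidence crosses a boundary. A minimal sketch, with an invented indifference region and error bounds:

```python
import numpy as np

# Wald's SPRT: decide whether the probability p that a property holds is
# below p0 or above p1, given error bounds alpha and beta, sampling runs
# one at a time (p0 < p1 defines the indifference region)
def sprt(samples, p0=0.45, p1=0.55, alpha=0.05, beta=0.05):
    low = np.log(beta / (1 - alpha))        # accept-H0 boundary
    high = np.log((1 - beta) / alpha)       # accept-H1 boundary
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        llr += np.log(p1 / p0) if x else np.log((1 - p1) / (1 - p0))
        if llr <= low:
            return "p <= p0", n
        if llr >= high:
            return "p >= p1", n
    return "undecided", len(samples)

# Simulated model-checking runs where the property truly holds with p = 0.7
rng = np.random.default_rng(8)
verdict, n_used = sprt(rng.random(10_000) < 0.7)
```

How `n_used` and the error probabilities trade off against the width of the indifference region is exactly the kind of characteristic the web tool lets users explore interactively.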
Syntactic discriminative language model rerankers for statistical machine translation
Carter, S.; Monz, C.
2011-01-01
This article describes a method that successfully exploits syntactic features for n-best translation candidate reranking using perceptrons. We motivate the utility of syntax by demonstrating the superior performance of parsers over n-gram language models in differentiating between Statistical
Hierarchical modelling for the environmental sciences statistical methods and applications
Clark, James S
2006-01-01
New statistical tools are changing the way in which scientists analyze and interpret data and models. Hierarchical Bayes and Markov Chain Monte Carlo methods for analysis provide a consistent framework for inference and prediction where information is heterogeneous and uncertain, processes are complicated, and responses depend on scale. Nowhere are these methods more promising than in the environmental sciences.
Using statistical compatibility to derive advanced probabilistic fatigue models
Czech Academy of Sciences Publication Activity Database
Fernández-Canteli, A.; Castillo, E.; López-Aenlle, M.; Seitl, Stanislav
2010-01-01
Roč. 2, č. 1 (2010), s. 1131-1140 E-ISSN 1877-7058. [Fatigue 2010. Praha, 06.06.2010-11.06.2010] Institutional research plan: CEZ:AV0Z20410507 Keywords : Fatigue models * Statistical compatibility * Functional equations Subject RIV: JL - Materials Fatigue, Friction Mechanics
Modelling geographical graduate job search using circular statistics
Faggian, Alessandra; Corcoran, Jonathan; McCann, Philip
Theory suggests that the spatial patterns of migration flows are contingent both on individual human capital and underlying geographical structures. Here we demonstrate these features by using circular statistics in an econometric modelling framework applied to the flows of UK university graduates.
Statistical Modeling of Energy Production by Photovoltaic Farms
Czech Academy of Sciences Publication Activity Database
Brabec, Marek; Pelikán, Emil; Krč, Pavel; Eben, Kryštof; Musílek, P.
2011-01-01
Roč. 5, č. 9 (2011), s. 785-793 ISSN 1934-8975 Grant - others:GA AV ČR(CZ) M100300904 Institutional research plan: CEZ:AV0Z10300504 Keywords : electrical energy * solar energy * numerical weather prediction model * nonparametric regression * beta regression Subject RIV: BB - Applied Statistics, Operational Research
Statistical properties of the nuclear shell-model Hamiltonian
International Nuclear Information System (INIS)
Dias, H.; Hussein, M.S.; Oliveira, N.A. de
1986-01-01
The statistical properties of the realistic nuclear shell-model Hamiltonian are investigated in sd-shell nuclei. The probability distribution of the basis-vector amplitudes is calculated and compared with the Porter-Thomas distribution. The relevance of the results to the calculation of the giant resonance mixing parameter is pointed out. (Author) [pt
Eigenfunction statistics for Anderson model with Hölder continuous ...
Indian Academy of Sciences (India)
continuous (0 < α ≤ 1) single site distribution. In the localized regime, we study the distribution of eigenfunctions in space and energy simultaneously. In a certain scaling limit, we prove that the limit points are Poisson. Keywords. Anderson model; Hölder continuous measure; Poisson statistics. 2010 Mathematics Subject Classification ...
Generalized Reduced Order Model Generation, Phase I
National Aeronautics and Space Administration — M4 Engineering proposes to develop a generalized reduced order model generation method. This method will allow for creation of reduced order aeroservoelastic state...
Generalized Reduced Order Model Generation Project
National Aeronautics and Space Administration — M4 Engineering proposes to develop a generalized reduced order model generation method. This method will allow for creation of reduced order aeroservoelastic state...
Safaie, Ammar; Wendzel, Aaron; Ge, Zhongfu; Nevers, Meredith; Whitman, Richard L.; Corsi, Steven R.; Phanikumar, Mantha S.
2016-01-01
Statistical and mechanistic models are popular tools for predicting the levels of indicator bacteria at recreational beaches. Researchers tend to use one class of model or the other, and it is difficult to generalize statements about their relative performance due to differences in how the models are developed, tested, and used. We describe a cooperative modeling approach for freshwater beaches impacted by point sources in which insights derived from mechanistic modeling were used to further improve the statistical models and vice versa. The statistical models provided a basis for assessing the mechanistic models which were further improved using probability distributions to generate high-resolution time series data at the source, long-term “tracer” transport modeling based on observed electrical conductivity, better assimilation of meteorological data, and the use of unstructured-grids to better resolve nearshore features. This approach resulted in improved models of comparable performance for both classes including a parsimonious statistical model suitable for real-time predictions based on an easily measurable environmental variable (turbidity). The modeling approach outlined here can be used at other sites impacted by point sources and has the potential to improve water quality predictions resulting in more accurate estimates of beach closures.
Integration of Advanced Statistical Analysis Tools and Geophysical Modeling
2012-08-01
(Table excerpt: Beale MetalMapper cued survey, file Beale_MMstat, Target 477, Cell 202 of 1547.) Statistical classification of buried unexploded ordnance using nonparametric prior models. IEEE Trans. Geosci. Remote Sensing, 45: 2794-2806, 2007. T. Bell and B. Barrow. Subsurface discrimination using electromagnetic induction sensors. IEEE Trans. Geosci. Remote Sensing, 39: 1286-1293, 2001. S. D
A Statistical Model for Synthesis of Detailed Facial Geometry
Golovinskiy, Aleksey; Matusik, Wojciech; Pfister, Hanspeter; Rusinkiewicz, Szymon; Funkhouser, Thomas
2006-01-01
Detailed surface geometry contributes greatly to the visual realism of 3D face models. However, acquiring high-resolution face geometry is often tedious and expensive. Consequently, most face models used in games, virtual reality, or computer vision look unrealistically smooth. In this paper, we introduce a new statistical technique for the analysis and synthesis of small three-dimensional facial features, such as wrinkles and pores. We acquire high-resolution face geometry for people across ...
Statistical and RBF NN models : providing forecasts and risk assessment
Marček, Milan
2009-01-01
Forecast accuracy of economic and financial processes is a popular measure for quantifying the risk in decision making. In this paper, we develop forecasting models based on statistical (stochastic) methods, sometimes called hard computing, and on a soft method using granular computing. We consider the accuracy of forecasting models as a measure for risk evaluation. It is found that the risk estimation process based on soft methods is simplified and less critical to the question w...
Advances on statistical/thermodynamical models for unpolarized structure functions
Energy Technology Data Exchange (ETDEWEB)
Trevisan, Luis A. [Departamento de Matematica e Estatistica, Universidade Estadual de Ponta Grossa, 84010-790, Ponta Grossa, PR (Brazil); Mirez, Carlos [Universidade Federal dos Vales do Jequitinhonha e Mucuri, Campus do Mucuri, 39803-371, Teofilo Otoni, Minas Gerais (Brazil); Tomio, Lauro [Instituto de Fisica Teorica, Universidade Estadual Paulista, R. Dr. Bento Teobaldo Ferraz 271, Bl II Barra Funda, 01140070, Sao Paulo, SP (Brazil)
2013-03-25
During the eighties and nineties, many statistical/thermodynamical models were proposed to describe the nucleon structure functions and the distribution of the quarks in the hadrons. Most of these models describe the constituent quarks and gluons inside the nucleon as a Fermi and Bose gas, respectively, confined in an MIT bag with continuous energy levels; other models consider a discrete spectrum. Some interesting features of the nucleons are obtained by these models, like the sea asymmetries d̄/ū and d̄-ū.
Bilingual Cluster Based Models for Statistical Machine Translation
Yamamoto, Hirofumi; Sumita, Eiichiro
We propose a domain-specific model for statistical machine translation. It is well known that domain-specific language models perform well in automatic speech recognition. We show that domain-specific language and translation models also benefit statistical machine translation. However, there are two problems with using domain-specific models. The first is the data sparseness problem. We employ an adaptation technique to overcome this problem. The second issue is domain prediction. In order to perform adaptation, the domain must be provided; however, in many cases the domain is not known or changes dynamically. For these cases, not only the translation target sentence but also the domain must be predicted. This paper focuses on the domain prediction problem for statistical machine translation. In the proposed method, a bilingual training corpus is automatically clustered into sub-corpora. Each sub-corpus is deemed to be a domain. The domain of a source sentence is predicted by using its similarity to the sub-corpora. The predicted domain (sub-corpus) specific language and translation models are then used for the translation decoding. This approach gave an improvement of 2.7 in BLEU score on the IWSLT05 Japanese to English evaluation corpus (improving the score from 52.4 to 55.1). This is a substantial gain and indicates the validity of the proposed bilingual cluster based models.
Approximate Inference and Deep Generative Models
CERN. Geneva
2018-01-01
Advances in deep generative models are at the forefront of deep learning research because of the promise they offer for allowing data-efficient learning, and for model-based reinforcement learning. In this talk I'll review a few standard methods for approximate inference and introduce modern approximations which allow for efficient large-scale training of a wide variety of generative models. Finally, I'll demonstrate several important applications of these models to density estimation, missing data imputation, data compression and planning.
International Nuclear Information System (INIS)
Weathers, J.B.; Luck, R.; Weathers, J.W.
2009-01-01
The complexity of mathematical models used by practicing engineers is increasing due to the growing availability of sophisticated mathematical modeling tools and ever-improving computational power. For this reason, the need to define a well-structured process for validating these models against experimental results has become a pressing issue in the engineering community. This validation process is partially characterized by the uncertainties associated with the modeling effort as well as the experimental results. The net impact of the uncertainties on the validation effort is assessed through the 'noise level of the validation procedure', which can be defined as an estimate of the 95% confidence uncertainty bounds for the comparison error between actual experimental results and model-based predictions of the same quantities of interest. Although general descriptions associated with the construction of the noise level using multivariate statistics exist in the literature, a detailed procedure outlining how to account for the systematic and random uncertainties is not available. In this paper, the methodology used to derive the covariance matrix associated with the multivariate normal pdf based on random and systematic uncertainties is examined, and a procedure used to estimate this covariance matrix using Monte Carlo analysis is presented. The covariance matrices are then used to construct approximate 95% confidence constant probability contours associated with comparison error results for a practical example. In addition, the example is used to show the drawbacks of using a first-order sensitivity analysis when nonlinear local sensitivity coefficients exist. Finally, the example is used to show the connection between the noise level of the validation exercise calculated using multivariate and univariate statistics.
Interferometric data modelling: issues in realistic data generation
International Nuclear Information System (INIS)
Mukherjee, Soma
2004-01-01
This study describes algorithms developed for modelling interferometric noise in a realistic manner, i.e. incorporating the non-stationarity that can be seen in the data from the present generation of interferometers. The noise model is based on individual component models (ICMs) with the application of auto regressive moving average (ARMA) models. The data obtained from the model are validated by standard statistical tests, e.g. the KS test and the Akaike minimum criterion. The results indicate a very good fit. The advantage of using ARMA for ICMs is that the model parameters can be controlled, and hence injection and efficiency studies can be conducted in a more controlled environment. This realistic non-stationary noise generator is intended to be integrated within the data monitoring tool framework.
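As a sketch of the ICM idea, an ARMA(p, q) component can be simulated directly from its defining recursion; the coefficients below are hypothetical, not the ones fitted to interferometer data in the study.

```python
import numpy as np

def arma_generate(phi, theta, n, sigma=1.0, seed=0):
    """Simulate n samples of an ARMA(p, q) process:
    x[t] = sum_i phi[i]*x[t-1-i] + e[t] + sum_j theta[j]*e[t-1-j]."""
    rng = np.random.default_rng(seed)
    p, q = len(phi), len(theta)
    burn = max(p, q) + 200                 # burn-in to forget the zero initial state
    e = rng.normal(0.0, sigma, n + burn)   # white-noise innovations
    x = np.zeros(n + burn)
    for t in range(max(p, q), n + burn):
        ar = sum(phi[i] * x[t - 1 - i] for i in range(p))
        ma = sum(theta[j] * e[t - 1 - j] for j in range(q))
        x[t] = ar + e[t] + ma
    return x[burn:]

# e.g. a stationary ARMA(2, 1) noise component
noise = arma_generate(phi=[0.5, -0.3], theta=[0.4], n=4096)
```

Non-stationarity of the kind the abstract describes could then be introduced by letting the coefficients drift between data segments, which is what makes controlled injection studies possible.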
WE-A-201-02: Modern Statistical Modeling
International Nuclear Information System (INIS)
Niemierko, A.
2016-01-01
Chris Marshall: Memorial Introduction Donald Edmonds Herbert Jr., or Don to his colleagues and friends, exemplified the “big tent” vision of medical physics, specializing in Applied Statistics and Dynamical Systems theory. He saw, more clearly than most, that “Making models is the difference between doing science and just fooling around [ref Woodworth, 2004]”. Don developed an interest in chemistry at school by “reading a book” - a recurring theme in his story. He was awarded a Westinghouse Science scholarship and attended the Carnegie Institute of Technology (later Carnegie Mellon University) where his interest turned to physics and led to a BS in Physics after transfer to Northwestern University. After (voluntary) service in the Navy he earned his MS in Physics from the University of Oklahoma, which led him to Johns Hopkins University in Baltimore to pursue a PhD. The early death of his wife led him to take a salaried position in the Physics Department of Colorado College in Colorado Springs so as to better care for their young daughter. There, a chance invitation from Dr. Juan del Regato to teach physics to residents at the Penrose Cancer Hospital introduced him to Medical Physics, and he decided to enter the field. He received his PhD from the University of London (UK) under Prof. Joseph Rotblat, where I first met him, and where he taught himself statistics. He returned to Penrose as a clinical medical physicist, also largely self-taught. In 1975 he formalized an evolving interest in statistical analysis as Professor of Radiology and Head of the Division of Physics and Statistics at the College of Medicine of the University of South Alabama in Mobile, AL where he remained for the rest of his career. He also served as the first Director of their Bio-Statistics and Epidemiology Core Unit working in part on a sickle-cell disease. After retirement he remained active as Professor Emeritus. Don served for several years as a consultant to the Nuclear
WE-A-201-02: Modern Statistical Modeling
Energy Technology Data Exchange (ETDEWEB)
Niemierko, A.
2016-06-15
The Bayesian statistical decision theory applied to the optimization of generating set maintenance
International Nuclear Information System (INIS)
Procaccia, H.; Cordier, R.; Muller, S.
1994-11-01
The difficulty in the RCM methodology is the allocation of a new preventive maintenance periodicity for a piece of equipment once a critical failure has been identified: until now this allocation has been based on engineering judgment, and one must wait for a full cycle of feedback experience before validating it. Statistical decision theory could be a more rational alternative for optimizing the preventive maintenance periodicity. This methodology has been applied to the inspection and maintenance optimization of the cylinders of diesel generator engines in 900 MW nuclear plants, and has shown that the previous preventive maintenance periodicity can be extended. (authors). 8 refs., 5 figs
Pitch Gestures in Generative Modeling of Music
DEFF Research Database (Denmark)
Jensen, Kristoffer
2011-01-01
Generative models of music are in need of performance and gesture additions, i.e. inclusions of subtle temporal and dynamic alterations, and gestures, so as to render the music musical. While much of the research regarding music generation is based on music theory, the work presented here is based on temporal perception, which is divided into three parts: the immediate (subchunk), the short-term memory (chunk), and the superchunk. By review of the relevant temporal perception literature, the necessary performance elements to add in the metrical generative model, related to the chunk memory...
Risk prediction model: Statistical and artificial neural network approach
Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim
2017-04-01
Prediction models are increasingly gaining popularity and have been used in numerous areas of study to complement and support clinical reasoning and decision making. The adoption of such models assists physicians' decision making and individuals' behavior, and consequently improves individual outcomes and the cost-effectiveness of care. The objective of this paper is to review articles related to risk prediction models in order to understand the suitable approach, development and validation process of a risk prediction model. A qualitative review of the aims, methods and significant main outcomes of nineteen published articles that developed risk prediction models in numerous fields was done. This paper also reviews how researchers develop and validate risk prediction models based on statistical and artificial neural network approaches. From the review, some methodological recommendations for developing and validating prediction models are highlighted. According to the studies reviewed, artificial neural network approaches to developing prediction models were more accurate than statistical approaches. However, only limited published literature currently discusses which approach is more accurate for risk prediction model development.
Organism-level models: When mechanisms and statistics fail us
Phillips, M. H.; Meyer, J.; Smith, W. P.; Rockhill, J. K.
2014-03-01
Purpose: To describe the unique characteristics of models that represent the entire course of radiation therapy at the organism level and to highlight the uses to which such models can be put. Methods: At the level of an organism, traditional model-building runs into severe difficulties. We do not have sufficient knowledge to devise a complete biochemistry-based model. Statistical model-building fails due to the vast number of variables and the inability to control many of them in any meaningful way. Finally, building surrogate models, such as animal-based models, can result in excluding some of the most critical variables. Bayesian probabilistic models (Bayesian networks) provide a useful alternative that has the advantages of being mathematically rigorous, incorporating the knowledge that we do have, and being practical. Results: Bayesian networks representing radiation therapy pathways for prostate cancer and head & neck cancer were used to highlight the important aspects of such models and some techniques of model-building. A more specific model representing the treatment of occult lymph nodes in head & neck cancer was provided as an example of how such a model can inform clinical decisions. A model of the possible role of PET imaging in brain cancer was used to illustrate the means by which clinical trials can be modelled in order to come up with a trial design that will have meaningful outcomes. Conclusions: Probabilistic models are currently the most useful approach to representing the entire therapy outcome process.
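Inference in such a Bayesian network reduces to sums of products of conditional probabilities; a toy two-node fragment in the spirit of the occult-node example, with every probability invented purely for illustration:

```python
# Hypothetical conditional probability tables: nodal control depends on the
# treatment decision, survival depends on nodal control. All numbers made up.
p_control = {"elective_rt": 0.90, "observe": 0.60}   # P(control | treatment)
p_survival = {True: 0.80, False: 0.45}               # P(survival | control)

def p_surv(treatment):
    """Marginalize the unobserved 'control' node out of the joint."""
    pc = p_control[treatment]
    return pc * p_survival[True] + (1 - pc) * p_survival[False]

# p_surv("elective_rt") = 0.9*0.80 + 0.1*0.45 = 0.765
# p_surv("observe")     = 0.6*0.80 + 0.4*0.45 = 0.66
```

Comparing the two marginals is exactly the kind of clinically interpretable output that makes these models practical where full mechanistic or statistical models fail.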
Experimental, statistical, and biological models of radon carcinogenesis
International Nuclear Information System (INIS)
Cross, F.T.
1991-09-01
Risk models developed for underground miners have not been consistently validated in studies of populations exposed to indoor radon. Imprecision in risk estimates results principally from differences between exposures in mines as compared to domestic environments and from uncertainties about the interaction between cigarette-smoking and exposure to radon decay products. Uncertainties in extrapolating miner data to domestic exposures can be reduced by means of a broad-based health effects research program that addresses the interrelated issues of exposure, respiratory tract dose, carcinogenesis (molecular/cellular and animal studies, plus developing biological and statistical models), and the relationship of radon to smoking and other copollutant exposures. This article reviews experimental animal data on radon carcinogenesis observed primarily in rats at Pacific Northwest Laboratory. Recent experimental and mechanistic carcinogenesis models of exposures to radon, uranium ore dust, and cigarette smoke are presented with statistical analyses of animal data. 20 refs., 1 fig
Statistical model selection with “Big Data”
Directory of Open Access Journals (Sweden)
Jurgen A. Doornik
2015-12-01
Full Text Available Big Data offer potential benefits for statistical modelling, but confront problems including an excess of false positives, mistaking correlations for causes, ignoring sampling biases and selecting by inappropriate methods. We consider the many important requirements when searching for a data-based relationship using Big Data, and the possible role of Autometrics in that context. Paramount considerations include embedding relationships in general initial models, possibly restricting the number of variables to be selected over by non-statistical criteria (the formulation problem); using good quality data on all variables, analyzed with tight significance levels by a powerful selection procedure, retaining available theory insights (the selection problem) while testing for relationships being well specified and invariant to shifts in explanatory variables (the evaluation problem); using a viable approach that resolves the computational problem of immense numbers of possible models.
Experimental, statistical and biological models of radon carcinogenesis
International Nuclear Information System (INIS)
Cross, F.T.
1992-01-01
Risk models developed for underground miners have not been consistently validated in studies of populations exposed to indoor radon. Imprecision in risk estimates results principally from differences between exposures in mines as compared with domestic environments and from uncertainties about the interaction between cigarette smoking and exposure to radon decay products. Uncertainties in extrapolating miner data to domestic exposures can be reduced by means of a broad-based health effects research programme that addresses the interrelated issues of exposure, respiratory tract dose, carcinogenesis (molecular/cellular and animal studies, plus developing biological and statistical models) and the relationship of radon to smoking and other co-pollutant exposures. This article reviews experimental animal data on radon carcinogenesis observed primarily in rats at Pacific Northwest Laboratory. Recent experimental and mechanistic carcinogenesis models of exposures to radon, uranium ore dust, and cigarette smoke are presented with statistical analyses of animal data. (author)
Statistical 3D damage accumulation model for ion implant simulators
International Nuclear Information System (INIS)
Hernandez-Mangas, J.M.; Lazaro, J.; Enriquez, L.; Bailon, L.; Barbolla, J.; Jaraiz, M.
2003-01-01
A statistical 3D damage accumulation model, based on the modified Kinchin-Pease formula, for ion implant simulation has been included in our physically based ion implantation code. It has only one fitting parameter for electronic stopping and uses 3D electron density distributions for different types of targets including compound semiconductors. Also, a statistical noise reduction mechanism based on the dose division is used. The model has been adapted to be run under parallel execution in order to speed up the calculation in 3D structures. Sequential ion implantation has been modelled including previous damage profiles. It can also simulate the implantation of molecular and cluster projectiles. Comparisons of simulated doping profiles with experimental SIMS profiles are presented. Also comparisons between simulated amorphization and experimental RBS profiles are shown. An analysis of sequential versus parallel processing is provided
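The modified Kinchin-Pease (NRT) relation underlying such damage accumulation models is simple enough to state directly; a minimal sketch, with an assumed displacement threshold energy (the fitted parameters of the paper's code are not reproduced here):

```python
def kinchin_pease_displacements(e_damage_ev, e_d_ev=15.0):
    """Modified Kinchin-Pease (NRT) estimate of the number of displaced
    atoms produced by a recoil depositing damage energy e_damage_ev (eV),
    given a displacement threshold energy e_d_ev (eV)."""
    if e_damage_ev < e_d_ev:
        return 0.0                              # below threshold: no displacement
    if e_damage_ev < 2.5 * e_d_ev:
        return 1.0                              # single Frenkel pair
    return 0.8 * e_damage_ev / (2.0 * e_d_ev)   # linear cascade regime

# e.g. a 1 keV damage-energy recoil in a target with a 15 eV threshold
n_disp = kinchin_pease_displacements(1000.0)
```

A statistical accumulation model then sums such per-cascade estimates over many pseudoprojectiles, which is where the dose-division noise reduction mentioned in the abstract comes in.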
Statistical 3D damage accumulation model for ion implant simulators
Hernandez-Mangas, J M; Enriquez, L E; Bailon, L; Barbolla, J; Jaraiz, M
2003-01-01
A statistical 3D damage accumulation model, based on the modified Kinchin-Pease formula, for ion implant simulation has been included in our physically based ion implantation code. It has only one fitting parameter for electronic stopping and uses 3D electron density distributions for different types of targets including compound semiconductors. Also, a statistical noise reduction mechanism based on the dose division is used. The model has been adapted to be run under parallel execution in order to speed up the calculation in 3D structures. Sequential ion implantation has been modelled including previous damage profiles. It can also simulate the implantation of molecular and cluster projectiles. Comparisons of simulated doping profiles with experimental SIMS profiles are presented. Also comparisons between simulated amorphization and experimental RBS profiles are shown. An analysis of sequential versus parallel processing is provided.
SoS contract verification using statistical model checking
Directory of Open Access Journals (Sweden)
Alessandro Mignogna
2013-11-01
Full Text Available Exhaustive formal verification for systems of systems (SoS is impractical and cannot be applied on a large scale. In this paper we propose to use statistical model checking for efficient verification of SoS. We address three relevant aspects for systems of systems: 1 the model of the SoS, which includes stochastic aspects; 2 the formalization of the SoS requirements in the form of contracts; 3 the tool-chain to support statistical model checking for SoS. We adapt the SMC technique for application to heterogeneous SoS. We extend the UPDM/SysML specification language to express the SoS requirements that the implemented strategies over the SoS must satisfy. The requirements are specified with a new contract language specifically designed for SoS, targeting a high-level English- pattern language, but relying on an accurate semantics given by the standard temporal logics. The contracts are verified against the UPDM/SysML specification using the Statistical Model Checker (SMC PLASMA combined with the simulation engine DESYRE, which integrates heterogeneous behavioral models through the functional mock-up interface (FMI standard. The tool-chain allows computing an estimation of the satisfiability of the contracts by the SoS. The results help the system architect to trade-off different solutions to guide the evolution of the SoS.
PKreport: report generation for checking population pharmacokinetic model assumptions
Directory of Open Access Journals (Sweden)
Li Jun
2011-05-01
Full Text Available Abstract Background Graphics play an important and unique role in population pharmacokinetic (PopPK) model building by exploring hidden structure among data before modeling, evaluating model fit, and validating results after modeling. Results The work described in this paper is about a new R package called PKreport, which is able to generate a collection of plots and statistics for testing model assumptions, visualizing data and diagnosing models. The metric system is utilized as the currency for communicating between data sets and the package to generate special-purpose plots. It provides ways to match output from diverse software such as NONMEM, Monolix, the R nlme package, etc. The package is implemented with an S4 class hierarchy, and offers an efficient way to access the output from NONMEM 7. The final reports take advantage of the web browser as user interface to manage and visualize plots. Conclusions PKreport provides (1) a flexible and efficient R class to store and retrieve NONMEM 7 output; (2) automated plots for users to visualize data and models; (3) automatically generated R scripts that are used to create the plots; (4) an archive-oriented management tool for users to store, retrieve and modify figures; (5) high-quality graphs based on the R packages lattice and ggplot2. The general architecture, running environment and statistical methods can be readily extended with the R class hierarchy. PKreport is free to download at http://cran.r-project.org/web/packages/PKreport/index.html.
Nuclear EMC effect in non-extensive statistical model
Energy Technology Data Exchange (ETDEWEB)
Trevisan, Luis A. [Departamento de Matematica e Estatistica, Universidade Estadual de Ponta Grossa, 84010-790, Ponta Grossa, PR (Brazil); Mirez, Carlos [ICET, Universidade Federal dos Vales do Jequitinhonha e Mucuri - UFVJM, Campus do Mucuri, Rua do Cruzeiro 01, Jardim Sao Paulo, 39803-371, Teofilo Otoni, MG (Brazil)
2013-05-06
In the present work, we attempt to describe the nuclear EMC effect by using the proton structure functions obtained from the non-extensive statistical quark model. We recall that this model has three fundamental variables: the temperature T, the radius, and the Tsallis parameter q. By combining different small changes, good agreement with the experimental data may be obtained. Another interesting point of the model is that it allows phenomenological interpretations, for instance with q constant and changing the radius and the temperature, or changing the radius and q while keeping the temperature fixed.
DEFF Research Database (Denmark)
Korneliussen, Thorfinn Sand; Moltke, Ida; Albrechtsen, Anders
2013-01-01
A number of different statistics are used for detecting natural selection using DNA sequencing data, including statistics that are summaries of the frequency spectrum, such as Tajima's D. These statistics are now often being applied in the analysis of Next Generation Sequencing (NGS) data. However......, estimates of frequency spectra from NGS data are strongly affected by low sequencing coverage; the inherent technology dependent variation in sequencing depth causes systematic differences in the value of the statistic among genomic regions....
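Tajima's D, the summary statistic named above, compares two estimators of the population mutation rate: mean pairwise diversity (theta-pi) and Watterson's estimator (S divided by the harmonic number a1). A minimal sketch of the statistic computed from per-site derived allele counts, using the standard normalising constants from Tajima (1989); the example data are invented:

```python
import math

def tajimas_d(n, derived_counts):
    """Tajima's D for n sampled sequences, given the derived allele count
    at each of the S segregating sites."""
    S = len(derived_counts)
    # Watterson's theta and mean pairwise diversity (theta_pi)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    theta_w = S / a1
    theta_pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in derived_counts)
    # Normalising constants for the variance of (theta_pi - theta_w)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    var = e1 * S + e2 * S * (S - 1)
    return (theta_pi - theta_w) / math.sqrt(var)
```

For instance, `tajimas_d(4, [1, 2])` is about 0.59; an excess of intermediate-frequency variants pushes D positive. Note this sketch assumes the allele counts are known, which is exactly the quantity the abstract describes as unreliable under low-coverage NGS data.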
Martin, Justin D.
2017-01-01
This essay presents data from a census of statistics requirements and offerings at all 4-year journalism programs in the United States (N = 369) and proposes a model of a potential course in statistics for journalism majors. The author proposes that three philosophies underlie a statistics course for journalism students. Such a course should (a)…
Modeling rule-based item generation
Geerlings, Hanneke; Glas, Cornelis A.W.; van der Linden, Willem J.
2011-01-01
An application of a hierarchical IRT model for items in families generated through the application of different combinations of design rules is discussed. Within the families, the items are assumed to differ only in surface features. The parameters of the model are estimated in a Bayesian framework,
Statistical modelling of a new global potential vegetation distribution
Levavasseur, G.; Vrac, M.; Roche, D. M.; Paillard, D.
2012-12-01
The potential natural vegetation (PNV) distribution is required for several studies in environmental sciences. Most of the available databases are quite subjective or depend on vegetation models. We have built a new high-resolution world-wide PNV map using an objective statistical methodology based on multinomial logistic models. Our method appears to be a fast and robust alternative in vegetation modelling, independent of any vegetation model. In comparison with other databases, our method provides a realistic PNV distribution in good agreement with the BIOME 6000 data. Among several advantages, the use of probabilities allows us to estimate the uncertainty, bringing some confidence to the modelled PNV, and to highlight the regions where additional data are needed to improve the PNV modelling. Although our PNV map is highly dependent on the distribution of data points, it is easily updated as soon as additional data are available and provides very useful additional information for further applications.
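The core of such a multinomial logistic approach can be sketched as softmax regression mapping climate covariates to vegetation-class probabilities. Everything below (covariates, class clusters, learning rate) is invented for illustration; a production model would use real gridded covariates and a library fitter:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for real covariates (e.g. temperature, precipitation):
# three hypothetical vegetation classes, each clustered in covariate space.
X = np.vstack([rng.normal(m, 1.0, size=(100, 2)) for m in ([-2, 0], [2, 0], [0, 3])])
y = np.repeat([0, 1, 2], 100)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Fit by plain gradient descent on the multinomial log-likelihood.
Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # add intercept column
W = np.zeros((3, 3))                            # (features + 1) x classes
Y = np.eye(3)[y]                                # one-hot targets
for _ in range(500):
    P = softmax(Xb @ W)
    W -= 0.01 * Xb.T @ (P - Y) / len(y)

probs = softmax(Xb @ W)          # per-location class probabilities
accuracy = (probs.argmax(axis=1) == y).mean()
```

The per-class probabilities are what make the uncertainty estimation described above possible: locations where no class probability dominates are exactly the regions flagged as needing more data.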
Bayesian statistical methods and their application in probabilistic simulation models
Directory of Open Access Journals (Sweden)
Sergio Iannazzo
2007-03-01
Bayesian statistical methods are attracting a rapidly growing level of interest and acceptance in the field of health economics. The reasons for this success probably lie in the theoretical foundations of the discipline, which make these techniques especially appealing for decision analysis. To this should be added modern IT progress, which has produced several flexible and powerful statistical software frameworks. Among them, probably one of the most notable is the BUGS language project and its standalone application for MS Windows, WinBUGS. The scope of this paper is to introduce the subject and to show some interesting applications of WinBUGS in developing complex economic models based on Markov chains. The advantages of this approach lie in the elegance of the code produced and in its ability to easily support probabilistic simulations. Moreover, an example of the integration of Bayesian inference models into a Markov model is shown. This last feature lets the analyst conduct statistical analyses on the available sources of evidence and use them directly as inputs to the economic model.
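A minimal sketch, in Python rather than BUGS, of the kind of probabilistic Markov cohort simulation described: transition probabilities are drawn from Dirichlet distributions each iteration, so parameter uncertainty propagates into the economic output. All states, Dirichlet parameters and costs are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws, n_cycles = 1000, 20
costs = np.array([100.0, 1000.0, 0.0])    # hypothetical per-cycle costs per state

results = []
for _ in range(n_draws):
    # Draw each row of the transition matrix from a Dirichlet: this is the
    # "probabilistic" part, standing in for posterior draws from WinBUGS.
    P = np.array([
        rng.dirichlet([85, 10, 5]),   # from Well
        rng.dirichlet([20, 65, 15]),  # from Sick
        [0.0, 0.0, 1.0],              # Dead is absorbing
    ])
    state = np.array([1.0, 0.0, 0.0])  # whole cohort starts Well
    total_cost = 0.0
    for _ in range(n_cycles):
        total_cost += state @ costs
        state = state @ P
    results.append(total_cost)

mean_cost = float(np.mean(results))   # expected cost with uncertainty included
```

In the Bayesian workflow the abstract describes, the Dirichlet draws would be replaced by posterior samples estimated from the available evidence, so the inference and the economic model share one coherent framework.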
Modelling gas generation in radioactive waste repositories
International Nuclear Information System (INIS)
Agg, P.J.
1992-07-01
In a repository containing low- and intermediate-level waste, gas generation will occur principally by the coupled processes of metal corrosion and microbial degradation of cellulosic waste. This paper describes a mathematical model designed to address gas generation by these mechanisms. The metal corrosion model incorporates a three-stage process encompassing both aerobic and anaerobic corrosion regimes; the microbial degradation model simulates the activities of eight different microbial populations, which are maintained as functions both of pH and of the concentrations of particular chemical species. Gas concentrations have been measured over a period of three years in large-scale drum experiments designed to simulate repository conditions. Model predictions are confirmed against the experimental measurements, and a prediction is then made of gas concentrations and generation rates over an assessment period of one million years in a radioactive waste repository. (Author)
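This is not the report's model, but a two-mechanism sketch of the same idea: hydrogen from rate-limited anaerobic corrosion of steel, plus carbon gas from first-order microbial degradation of cellulose. Inventories and rate constants are illustrative only:

```python
import math

# Hypothetical inventories and rates (illustrative only)
metal = 1000.0        # mol Fe
cellulose = 500.0     # mol C
k_corr = 0.5          # mol Fe / yr, linear anaerobic corrosion rate
k_mic = 0.01          # 1/yr, first-order microbial degradation rate
dt, years = 0.1, 200.0

t = 0.0
h2 = co2 = 0.0
history = []
while t < years:
    # Anaerobic corrosion: 3 Fe + 4 H2O -> Fe3O4 + 4 H2 (4/3 mol H2 per mol Fe)
    d_fe = min(k_corr * dt, metal)
    metal -= d_fe
    h2 += (4.0 / 3.0) * d_fe
    # Microbial degradation of cellulose, first-order in remaining inventory
    d_c = cellulose * (1.0 - math.exp(-k_mic * dt))
    cellulose -= d_c
    co2 += d_c
    t += dt
    history.append((t, h2, co2))
```

The real model couples eight microbial populations to pH and chemistry; this sketch only shows why the two mechanisms give qualitatively different gas-generation profiles (constant-rate versus exponentially decaying).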
Modelling gas generation in radioactive waste repositories
International Nuclear Information System (INIS)
Agg, P.J.
1993-02-01
In a repository containing low- and intermediate-level waste, gas generation will occur principally by the coupled processes of metal corrosion and microbial degradation of cellulosic waste. This paper describes a mathematical model designed to address gas generation by these mechanisms. The metal corrosion model incorporates a three-stage process encompassing both aerobic and anaerobic corrosion regimes; the microbial degradation model simulates the activities of eight different microbial populations, which are maintained as functions both of pH and of the concentrations of particular chemical species. Gas concentrations have been measured over a period of three years in large-scale drum experiments designed to simulate repository conditions. Model predictions are confirmed against the experimental measurements, and a prediction is then made of gas concentrations and generation rates over an assessment period of one million years in a radioactive waste repository. (author)
Workflow Fault Tree Generation Through Model Checking
DEFF Research Database (Denmark)
Herbert, Luke Thomas; Sharp, Robin
2014-01-01
We present a framework for the automated generation of fault trees from models of realworld process workflows, expressed in a formalised subset of the popular Business Process Modelling and Notation (BPMN) language. To capture uncertainty and unreliability in workflows, we extend this formalism...... to calculate the probabilities of reaching each non-error system state. Each generated error state is assigned a variable indicating its individual probability of occurrence. Our method can determine the probability of combined faults occurring, while accounting for the basic probabilistic structure...... of the system being modelled. From these calculations, a comprehensive fault tree is generated. Further, we show that annotating the model with rewards (data) allows the expected mean values of reward structures to be calculated at points of failure....
Long Memory Models to Generate Synthetic Hydrological Series
Directory of Open Access Journals (Sweden)
Guilherme Armando de Almeida Pereira
2014-01-01
In Brazil, much of the energy production comes from hydroelectric plants whose planning is non-trivial due to the strong dependence on rainfall regimes. This planning is accomplished through optimization models that use inputs such as synthetic hydrologic series generated from the statistical model PAR(p) (periodic autoregressive). Recently, Brazil began the search for alternative models able to capture effects that the traditional PAR(p) model does not incorporate, such as long memory. Long memory in a time series can be defined as significant dependence between lags separated by a long period of time. This research therefore studies the effects of long-range dependence in the series of natural streamflow energy in the South subsystem, in order to estimate a long memory model capable of generating synthetic hydrologic series.
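The traditional generator can be sketched in its simplest form, PAR(1): each calendar month follows its own AR(1) rule, which reproduces the seasonal cycle but only short memory (long-memory alternatives such as ARFIMA replace the AR recursion with fractional differencing). The monthly means, coefficients and noise levels below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical monthly PAR(1) parameters: mean, AR coefficient, noise s.d.
mu  = np.array([60, 55, 50, 45, 40, 35, 30, 30, 35, 40, 50, 58], float)
phi = np.full(12, 0.7)
sig = np.full(12, 5.0)

def generate_par1(n_years):
    """Synthetic monthly series: for month m,
    x_t - mu_m = phi_m * (x_{t-1} - mu_{m-1}) + sig_m * eps_t."""
    x = [mu[0]]
    for t in range(1, 12 * n_years):
        m, prev_m = t % 12, (t - 1) % 12
        x.append(mu[m] + phi[m] * (x[-1] - mu[prev_m]) + sig[m] * rng.normal())
    return np.array(x)

series = generate_par1(100)
monthly_means = series.reshape(100, 12).mean(axis=0)   # recovers the seasonal cycle
```

In practice the parameters would be estimated from the historical streamflow energy series, one set per month.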
Spatio-temporal statistical models with applications to atmospheric processes
International Nuclear Information System (INIS)
Wikle, C.K.
1996-01-01
This doctoral dissertation is presented as three self-contained papers. An introductory chapter considers traditional spatio-temporal statistical methods used in the atmospheric sciences from a statistical perspective. Although this section is primarily a review, many of the statistical issues considered have not been considered in the context of these methods and several open questions are posed. The first paper attempts to determine a means of characterizing the semiannual oscillation (SAO) spatial variation in the northern hemisphere extratropical height field. It was discovered that the midlatitude SAO in 500 hPa geopotential height could be explained almost entirely as a result of spatial and temporal asymmetries in the annual variation of stationary eddies. It was concluded that the mechanism for the SAO in the northern hemisphere is a result of land-sea contrasts. The second paper examines the seasonal variability of mixed Rossby-gravity waves (MRGW) in the lower stratosphere over the equatorial Pacific. Advanced cyclostationary time series techniques were used for analysis. It was found that there are significant twice-yearly peaks in MRGW activity. Analyses also suggested a convergence of horizontal momentum flux associated with these waves. In the third paper, a new spatio-temporal statistical model is proposed that attempts to consider the influence of both temporal and spatial variability. This method is mainly concerned with prediction in space and time, and provides a spatially descriptive and temporally dynamic model.
Spatio-temporal statistical models with applications to atmospheric processes
Energy Technology Data Exchange (ETDEWEB)
Wikle, Christopher K. [Iowa State Univ., Ames, IA (United States)
1996-01-01
This doctoral dissertation is presented as three self-contained papers. An introductory chapter considers traditional spatio-temporal statistical methods used in the atmospheric sciences from a statistical perspective. Although this section is primarily a review, many of the statistical issues considered have not been considered in the context of these methods and several open questions are posed. The first paper attempts to determine a means of characterizing the semiannual oscillation (SAO) spatial variation in the northern hemisphere extratropical height field. It was discovered that the midlatitude SAO in 500 hPa geopotential height could be explained almost entirely as a result of spatial and temporal asymmetries in the annual variation of stationary eddies. It was concluded that the mechanism for the SAO in the northern hemisphere is a result of land-sea contrasts. The second paper examines the seasonal variability of mixed Rossby-gravity waves (MRGW) in the lower stratosphere over the equatorial Pacific. Advanced cyclostationary time series techniques were used for analysis. It was found that there are significant twice-yearly peaks in MRGW activity. Analyses also suggested a convergence of horizontal momentum flux associated with these waves. In the third paper, a new spatio-temporal statistical model is proposed that attempts to consider the influence of both temporal and spatial variability. This method is mainly concerned with prediction in space and time, and provides a spatially descriptive and temporally dynamic model.
Can spatial statistical river temperature models be transferred between catchments?
Jackson, Faye L.; Fryer, Robert J.; Hannah, David M.; Malcolm, Iain A.
2017-09-01
There has been increasing use of spatial statistical models to understand and predict river temperature (Tw) from landscape covariates. However, it is not financially or logistically feasible to monitor all rivers and the transferability of such models has not been explored. This paper uses Tw data from four river catchments collected in August 2015 to assess how well spatial regression models predict the maximum 7-day rolling mean of daily maximum Tw (Twmax) within and between catchments. Models were fitted for each catchment separately using (1) landscape covariates only (LS models) and (2) landscape covariates and an air temperature (Ta) metric (LS_Ta models). All the LS models included upstream catchment area and three included a river network smoother (RNS) that accounted for unexplained spatial structure. The LS models transferred reasonably to other catchments, at least when predicting relative levels of Twmax. However, the predictions were biased when mean Twmax differed between catchments. The RNS was needed to characterise and predict finer-scale spatially correlated variation. Because the RNS was unique to each catchment and thus non-transferable, predictions were better within catchments than between catchments. A single model fitted to all catchments found no interactions between the landscape covariates and catchment, suggesting that the landscape relationships were transferable. The LS_Ta models transferred less well, with particularly poor performance when the relationship with the Ta metric was physically implausible or required extrapolation outside the range of the data. A single model fitted to all catchments found catchment-specific relationships between Twmax and the Ta metric, indicating that the Ta metric was not transferable. These findings improve our understanding of the transferability of spatial statistical river temperature models and provide a foundation for developing new approaches for predicting Tw at unmonitored locations across
Pseudo-dynamic source modelling with 1-point and 2-point statistics of earthquake source parameters
Song, S. G.
2013-12-24
Ground motion prediction is an essential element in seismic hazard and risk analysis. Empirical ground motion prediction approaches have been widely used in the community, but efficient simulation-based ground motion prediction methods are needed to complement empirical approaches, especially in the regions with limited data constraints. Recently, dynamic rupture modelling has been successfully adopted in physics-based source and ground motion modelling, but it is still computationally demanding and many input parameters are not well constrained by observational data. Pseudo-dynamic source modelling keeps the form of kinematic modelling with its computational efficiency, but also tries to emulate the physics of source process. In this paper, we develop a statistical framework that governs the finite-fault rupture process with 1-point and 2-point statistics of source parameters in order to quantify the variability of finite source models for future scenario events. We test this method by extracting 1-point and 2-point statistics from dynamically derived source models and simulating a number of rupture scenarios, given target 1-point and 2-point statistics. We propose a new rupture model generator for stochastic source modelling with the covariance matrix constructed from target 2-point statistics, that is, auto- and cross-correlations. Our sensitivity analysis of near-source ground motions to 1-point and 2-point statistics of source parameters provides insights into relations between statistical rupture properties and ground motions. We observe that larger standard deviation and stronger correlation produce stronger peak ground motions in general. The proposed new source modelling approach will contribute to understanding the effect of earthquake source on near-source ground motion characteristics in a more quantitative and systematic way.
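The covariance-matrix construction from target 2-point statistics can be illustrated in one dimension: build a correlation matrix from an assumed exponential auto-correlation, factor it by Cholesky decomposition, and draw slip realisations that also honour the target 1-point statistics (mean and standard deviation). All numbers are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# 1-D fault of 200 subfaults at 0.5 km spacing; the target 2-point statistic
# is an exponential auto-correlation with correlation length a (hypothetical).
n, dx, a = 200, 0.5, 5.0
x = np.arange(n) * dx
C = np.exp(-np.abs(x[:, None] - x[None, :]) / a)   # correlation matrix

L = np.linalg.cholesky(C + 1e-10 * np.eye(n))      # small jitter for stability
mean_slip, sd_slip = 1.0, 0.3                      # target 1-point statistics

# Each column is one stochastic slip realisation honouring both targets
slips = mean_slip + sd_slip * (L @ rng.standard_normal((n, 500)))

emp_sd = float(slips.std())
lag1_corr = float(np.corrcoef(slips[:-1].ravel(), slips[1:].ravel())[0, 1])
```

The full method works with auto- and cross-correlations of several source parameters (slip, rupture velocity, rise time), but the Cholesky-of-covariance mechanism is the same.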
Generation and analysis of large reliability models
Palumbo, Daniel L.; Nicol, David M.
1990-01-01
An effort has been underway for several years at NASA's Langley Research Center to extend the capability of Markov modeling techniques for reliability analysis to the designers of highly reliable avionic systems. This effort has been focused in the areas of increased model abstraction and increased computational capability. The reliability model generator (RMG), a software tool which uses as input a graphical, object-oriented block diagram of the system, is discussed. RMG uses an automated failure modes-effects analysis algorithm to produce the reliability model from the graphical description. Also considered is the ASSURE software tool, a parallel processing program which uses the ASSIST modeling language and SURE semi-Markov solution technique. An executable failure modes-effects analysis is used by ASSURE. The successful combination of the power of graphical representation, automated model generation, and parallel computation leads to the conclusion that large system architectures can now be analyzed.
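The kind of Markov reliability model such tools generate can be sketched for a triplex system that fails once fewer than two of three units work; the failure rate and mission time below are hypothetical, and a tool like RMG would build the generator matrix automatically from the block diagram:

```python
import numpy as np

lam = 1e-4          # per-hour failure rate of one unit (hypothetical)
# States: 3 working, 2 working, system failed (needs >= 2 of 3 units)
Q = np.array([
    [-3 * lam, 3 * lam, 0.0],
    [0.0, -2 * lam, 2 * lam],
    [0.0, 0.0, 0.0],            # failed state is absorbing
])

# Integrate dp/dt = p Q with forward Euler over a 10-hour mission
p = np.array([1.0, 0.0, 0.0])
dt, t_end = 0.1, 10.0
for _ in range(int(t_end / dt)):
    p = p + dt * (p @ Q)

unreliability = float(p[2])      # probability the system has failed by t_end
```

Highly reliable systems make this stiff and tiny-probability regime numerically delicate, which is one motivation for the specialised SURE/ASSIST solution techniques mentioned above.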
Generating Performance Models for Irregular Applications
Energy Technology Data Exchange (ETDEWEB)
Friese, Ryan D.; Tallent, Nathan R.; Vishnu, Abhinav; Kerbyson, Darren J.; Hoisie, Adolfy
2017-05-30
Many applications have irregular behavior --- non-uniform input data, input-dependent solvers, irregular memory accesses, unbiased branches --- that cannot be captured using today's automated performance modeling techniques. We describe new hierarchical critical path analyses for the Palm model generation tool. To create a model's structure, we capture tasks along representative MPI critical paths. We create a histogram of critical tasks with parameterized task arguments and instance counts. To model each task, we identify hot instruction-level sub-paths and model each sub-path based on data flow, instruction scheduling, and data locality. We describe application models that generate accurate predictions for strong scaling when varying CPU speed, cache speed, memory speed, and architecture. We present results for the Sweep3D neutron transport benchmark; Page Rank on multiple graphs; Support Vector Machine with pruning; and PFLOTRAN's reactive flow/transport solver with domain-induced load imbalance.
Statistical volumetric model for characterization and visualization of prostate cancer
Lu, Jianping; Srikanchana, Rujirutana; McClain, Maxine A.; Wang, Yue J.; Xuan, Jian Hua; Sesterhenn, Isabell A.; Freedman, Matthew T.; Mun, Seong K.
2000-04-01
To reveal the spatial pattern of localized prostate cancer distribution, a 3D statistical volumetric model, showing the probability map of prostate cancer distribution together with the anatomical structure of the prostate, has been developed from 90 digitally-imaged surgical specimens. Through an enhanced virtual environment with various visualization modes, this master model permits for the first time an accurate characterization and understanding of prostate cancer distribution patterns. The construction of the statistical volumetric model is characterized by mapping all of the individual models onto a generic prostate site model, in which a self-organizing scheme is used to decompose a group of contours representing multifold tumors into localized tumor elements. The next crucial step in creating the master model is the development of an accurate multi-object and non-rigid registration/warping scheme incorporating the various variations among these individual models in true 3D. This is achieved with a multi-object based principal-axis alignment followed by an affine transform, further fine-tuned by a thin-plate spline interpolation driven by surface-based deformable warping dynamics. Based on the accurately mapped tumor distribution, a standard finite normal mixture is used to model the cancer volumetric distribution statistics, whose parameters are estimated using both the K-means and expectation-maximization algorithms under information theoretic criteria. Given the desired number of tissue samplings, the prostate needle biopsy site selection is optimized through a probabilistic self-organizing map, thus achieving a maximum likelihood of cancer detection. We describe the details of our theory and methodology, and report our pilot results and evaluation of the effectiveness of the algorithm in characterizing prostate cancer distributions and optimizing needle biopsy techniques.
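The finite normal mixture step, a K-means-style initialisation followed by expectation-maximization, can be sketched in one dimension with synthetic data (the real model is volumetric and multivariate, and the component count is chosen by information criteria):

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic 1-D "tumour density" data drawn from two normal components
data = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(5.0, 1.0, 200)])

# Crude initialisation of the two component means (K-means would refine this)
mu = np.array([data.min(), data.max()])
w, sd = np.array([0.5, 0.5]), np.array([1.0, 1.0])

for _ in range(100):                       # EM iterations
    # E-step: responsibilities under the current normal mixture
    dens = w * np.exp(-0.5 * ((data[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means and standard deviations
    nk = r.sum(axis=0)
    w = nk / len(data)
    mu = (r * data[:, None]).sum(axis=0) / nk
    sd = np.sqrt((r * (data[:, None] - mu) ** 2).sum(axis=0) / nk)
```

The fitted mixture recovers the two component means (near 0 and 5) and the mixing weights (near 0.6 and 0.4) from the pooled data.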
Spin studies of nucleons in a statistical model
International Nuclear Information System (INIS)
Singh, J P; Upadhyay, Alka
2004-01-01
We decompose various quark-gluon Fock states of a nucleon into a set of states in which each of the three-quark core and the remaining constituents, termed the sea, appears with definite spin and colour quantum numbers, their weights being determined statistically from their multiplicities. The expansion coefficients in the quark-gluon Fock state expansion have been taken from a recently proposed statistical model. We have also considered two modifications of this model with a view to reducing the contributions of the sea components with higher multiplicities. With certain approximations, we have calculated the quark contributions to the spin of the nucleon, the ratio of the magnetic moments of the nucleons, their weak decay constant and the ratio of SU(3) reduced matrix elements for the axial current
Statistical inference to advance network models in epidemiology.
Welch, David; Bansal, Shweta; Hunter, David R
2011-03-01
Contact networks are playing an increasingly important role in the study of epidemiology. Most of the existing work in this area has focused on considering the effect of underlying network structure on epidemic dynamics by using tools from probability theory and computer simulation. This work has provided much insight on the role that heterogeneity in host contact patterns plays on infectious disease dynamics. Despite the important understanding afforded by the probability and simulation paradigm, this approach does not directly address important questions about the structure of contact networks such as what is the best network model for a particular mode of disease transmission, how parameter values of a given model should be estimated, or how precisely the data allow us to estimate these parameter values. We argue that these questions are best answered within a statistical framework and discuss the role of statistical inference in estimating contact networks from epidemiological data. Copyright © 2011 Elsevier B.V. All rights reserved.
A statistical method for discriminating between alternative radiobiological models
International Nuclear Information System (INIS)
Kinsella, I.A.; Malone, J.F.
1977-01-01
Radiobiological models assist understanding of the development of radiation damage, and may provide a basis for extrapolating dose-effect curves from high to low dose regions. Many models have been proposed, such as multitarget and its modifications, enzymatic models, and those with a quadratic dose response relationship (i.e. αD + βD² forms). It is difficult to distinguish between these because the statistical techniques used are almost always limited, in that one method can rarely be applied to the whole range of models. A general statistical procedure for parameter estimation (the Maximum Likelihood Method) has been found applicable to a wide range of radiobiological models. The curve parameters are estimated using a computerised search that continues until the most likely set of values to fit the data is obtained. When the search is complete, two procedures are carried out. First, a goodness-of-fit test is applied which examines the applicability of an individual model to the data. Secondly, an index is derived which provides an indication of the adequacy of any model compared with alternative models. Thus the models may be ranked according to how well they fit the data. For example, with one set of data, multitarget types were found to be more suitable than quadratic types (αD + βD²). This method should be of assistance in evaluating various models. It may also be profitably applied to the selection of the most appropriate model to use when it is necessary to extrapolate from high to low doses
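The model-ranking idea can be illustrated by fitting a linear-quadratic model (linear least squares in log-survival) and a multitarget model (coarse grid search) to synthetic survival data, then comparing residual sums of squares as a crude stand-in for the likelihood-based index described. All parameters and data are invented:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic survival data generated from a linear-quadratic (LQ) model
alpha_true, beta_true = 0.2, 0.02
D = np.linspace(0.5, 10.0, 15)                 # doses (Gy)
logS = -(alpha_true * D + beta_true * D**2) + rng.normal(0, 0.01, D.size)

# LQ fit: ln S = -alpha*D - beta*D^2 is linear in (alpha, beta)
A = np.column_stack([-D, -D**2])
(alpha_hat, beta_hat), *_ = np.linalg.lstsq(A, logS, rcond=None)
sse_lq = float(np.sum((A @ np.array([alpha_hat, beta_hat]) - logS) ** 2))

# Multitarget fit S = 1 - (1 - exp(-D/D0))^n, via a coarse grid search
sse_mt = np.inf
for D0 in np.linspace(0.5, 10.0, 96):
    for n in np.linspace(1.0, 10.0, 91):
        pred = np.log(1.0 - (1.0 - np.exp(-D / D0)) ** n)
        sse_mt = min(sse_mt, float(np.sum((pred - logS) ** 2)))

# The family with the smaller residual sum ranks higher on this data
```

Since the data were generated from the LQ form, the LQ fit should beat the best multitarget fit, mirroring the paper's point that a common fitting procedure allows models to be ranked on equal footing.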
Wolf Creek Generating Station containment model
International Nuclear Information System (INIS)
Nguyen, D.H.; Neises, G.J.; Howard, M.L.
1995-01-01
This paper presents a CONTEMPT-LT/28 containment model developed by Wolf Creek Nuclear Operating Corporation (WCNOC) to predict containment pressure and temperature behavior during postulated events at Wolf Creek Generating Station (WCGS). The model has been validated using data provided in the WCGS Updated Safety Analysis Report (USAR). The CONTEMPT-LT/28 model has been used extensively at WCGS to support plant operations and, recently, its 4.5% thermal power uprate project
A statistical model of structure functions and quantum chromodynamics
International Nuclear Information System (INIS)
Mac, E.; Ugaz, E.; Universidad Nacional de Ingenieria, Lima
1989-01-01
We consider a model for the x-dependence of the quark distributions in the proton. Within the context of simple statistical assumptions, we obtain the parton densities in the infinite momentum frame. In a second step, lowest order QCD corrections are incorporated into these distributions. Crude, but reasonable, agreement with experiment is found for the F2, valence and q, anti-q distributions for x ≳ 0.2. (orig.)
A Statistical Model for Soliton Particle Interaction in Plasmas
DEFF Research Database (Denmark)
Dysthe, K. B.; Pécseli, Hans; Truelsen, J.
1986-01-01
A statistical model for soliton-particle interaction is presented. A master equation is derived for the time evolution of the particle velocity distribution as induced by resonant interaction with Korteweg-de Vries solitons. The detailed energy balance during the interaction subsequently determines...... the evolution of the soliton amplitude distribution. The analysis applies equally well for weakly nonlinear plasma waves in a strongly magnetized waveguide, or for ion acoustic waves propagating in one-dimensional systems....
Physical-Statistical Model of Thermal Conductivity of Nanofluids
Directory of Open Access Journals (Sweden)
B. Usowicz
2014-01-01
A physical-statistical model for predicting the effective thermal conductivity of nanofluids is proposed. The volumetric unit of nanofluids in the model consists of solid, liquid, and gas particles and is treated as a system made up of regular geometric figures, spheres, filling the volumetric unit by layers. The model assumes that connections between layers of the spheres and between neighbouring spheres in the layer are represented by serial and parallel connections of thermal resistors, respectively. This model is expressed in terms of thermal resistance of nanoparticles and fluids and the multinomial distribution of particles in the nanofluids. The results for predicted and measured effective thermal conductivity of several nanofluids (Al2O3/ethylene glycol-based and Al2O3/water-based; CuO/ethylene glycol-based and CuO/water-based; and TiO2/ethylene glycol-based are presented. The physical-statistical model shows a reasonably good agreement with the experimental results and gives more accurate predictions for the effective thermal conductivity of nanofluids compared to existing classical models.
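The serial and parallel resistor connections the model builds on correspond to the classical series/parallel (Wiener) bounds on effective conductivity; a minimal sketch with hypothetical Al2O3/water values:

```python
def k_series(fractions, ks):
    """Effective conductivity of layers in series (thermal resistances add)."""
    return 1.0 / sum(f / k for f, k in zip(fractions, ks))

def k_parallel(fractions, ks):
    """Effective conductivity of strands in parallel (conductances add)."""
    return sum(f * k for f, k in zip(fractions, ks))

# Hypothetical 4 vol% Al2O3 (k ~ 30 W/m K) dispersed in water (k ~ 0.6 W/m K)
phi = [0.04, 0.96]
k = [30.0, 0.6]
lower, upper = k_series(phi, k), k_parallel(phi, k)
```

Any physically consistent mixture model, including the layered-sphere resistor network described above, must predict an effective conductivity between these two values; the statistical part of the model decides how the two connection types are weighted.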
Statistical modeling of global geogenic fluoride contamination in groundwaters.
Amini, Manouchehr; Mueller, Kim; Abbaspour, Karim C; Rosenberg, Thomas; Afyuni, Majid; Møller, Klaus N; Sarr, Mamadou; Johnson, C Annette
2008-05-15
The use of groundwater with high fluoride concentrations poses a health threat to millions of people around the world. This study aims at providing a global overview of potentially fluoride-rich groundwaters by modeling fluoride concentration. A large database of worldwide fluoride concentrations as well as available information on related environmental factors such as soil properties, geological settings, and climatic and topographical information on a global scale have all been used in the model. The modeling approach combines geochemical knowledge with statistical methods to devise a rule-based statistical procedure, which divides the world into 8 different "process regions". For each region a separate predictive model was constructed. The end result is a global probability map of fluoride concentration in the groundwater. Comparisons of the modeled and measured data indicate that 60-70% of the fluoride variation could be explained by the models in six process regions, while in two process regions only 30% of the variation in the measured data was explained. Furthermore, the global probability map corresponded well with fluorotic areas described in the international literature. Although the probability map should not replace fluoride testing, it can give a first indication of possible contamination and thus may support the planning process of new drinking water projects.
Simulating European wind power generation applying statistical downscaling to reanalysis data
DEFF Research Database (Denmark)
Gonzalez-Aparicio, I.; Monforti, F.; Volker, Patrick
2017-01-01
The growing share of electricity production from solar and, mainly, wind resources constantly increases the stochastic nature of the power system. Modelling the high share of renewable energy sources, and in particular wind power, crucially depends on the adequate representation of the intermittency...... generation time series dataset for the EU-28 and neighbouring countries at hourly intervals and at different geographical aggregation levels (country, bidding zone and administrative territorial unit), for a 30 year period taking into account the wind generating fleet at the end of 2015. (C) 2017 The Authors...
UPPAAL-SMC: Statistical Model Checking for Priced Timed Automata
DEFF Research Database (Denmark)
Bulychev, Petr; David, Alexandre; Larsen, Kim Guldstrand
2012-01-01
in the form of probability distributions and compare probabilities to analyze performance aspects of systems. The focus of the survey is on the evolution of the tool – including modeling and specification formalisms as well as techniques applied – together with applications of the tool to case studies....... on a series of extensions of the statistical model checking approach generalized to handle real-time systems and estimate undecidable problems. U PPAAL - SMC comes together with a friendly user interface that allows a user to specify complex problems in an efficient manner as well as to get feedback...
Statistical mechanics of attractor neural network models with synaptic depression
International Nuclear Information System (INIS)
Igarashi, Yasuhiko; Oizumi, Masafumi; Otsubo, Yosuke; Nagata, Kenji; Okada, Masato
2009-01-01
Synaptic depression is known to control the gain for presynaptic inputs. Since cortical neurons receive thousands of presynaptic inputs, and their outputs are fed into thousands of other neurons, synaptic depression should influence the macroscopic properties of neural networks. We employ simple neural network models to explore the macroscopic effects of synaptic depression. Systems with synaptic depression cannot be analyzed with the conventional equilibrium statistical-mechanical approach because of the asymmetry of connections. Thus, we first propose a microscopic dynamical mean field theory. Next, we derive macroscopic steady state equations and discuss the stability of the steady states for various types of neural network models.
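A toy version of such a network: a Hopfield-style model storing one pattern, with a simplified Tsodyks-Markram-like depression variable scaling each neuron's outgoing synapses. It only illustrates how depression weakens the retrieval field (and why the couplings become effectively asymmetric); all parameters are invented:

```python
import numpy as np

rng = np.random.default_rng(11)
N = 200
xi = rng.choice([-1.0, 1.0], size=N)           # one stored pattern
J = np.outer(xi, xi) / N                       # Hebbian couplings
np.fill_diagonal(J, 0.0)

def run(steps, use_depression, tau=20.0, U=0.3):
    """Synchronous retrieval dynamics; s[j] is the fraction of synaptic
    resources available at synapses *from* neuron j."""
    x, s = xi.copy(), np.ones(N)
    overlap, field = [], []
    for _ in range(steps):
        h = J @ (s * x)                        # depression scales presynaptic output
        overlap.append(float(x @ xi) / N)
        field.append(float(np.mean(h * xi)))   # mean field aligned with the pattern
        x = np.where(h >= 0, 1.0, -1.0)
        if use_depression:
            # resources recover with time constant tau and are consumed
            # by active (x = +1) presynaptic neurons
            s = s + (1.0 - s) / tau - U * s * (x > 0)
    return overlap, field

m0, g0 = run(40, use_depression=False)
m1, g1 = run(40, use_depression=True)
```

With a single pattern, retrieval survives in both cases, but the depressed network settles at a much weaker aligned field; with many stored patterns this gain reduction is what changes the macroscopic phase behaviour the paper analyses.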
Linguistically motivated statistical machine translation models and algorithms
Xiong, Deyi
2015-01-01
This book provides a wide variety of algorithms and models to integrate linguistic knowledge into Statistical Machine Translation (SMT). It helps advance conventional SMT to linguistically motivated SMT by enhancing the following three essential components: translation, reordering and bracketing models. It also serves the purpose of promoting the in-depth study of the impacts of linguistic knowledge on machine translation. Finally it provides a systematic introduction of Bracketing Transduction Grammar (BTG) based SMT, one of the state-of-the-art SMT formalisms, as well as a case study of linguistically motivated SMT on a BTG-based platform.
Statistical models for expert judgement and wear prediction
International Nuclear Information System (INIS)
Pulkkinen, U.
1994-01-01
This thesis studies the statistical analysis of expert judgements and the prediction of wear. The point of view adopted is that of information theory and Bayesian statistics. A general Bayesian framework for analyzing both expert judgements and wear prediction is presented. Information theoretic interpretations are given for some averaging techniques used in the determination of consensus distributions. Further, information theoretic models are compared with a Bayesian model. The general Bayesian framework is then applied to analyzing expert judgements based on ordinal comparisons. In this context, the value of the information lost in the ordinal comparison process is analyzed by applying decision theoretic concepts. As a generalization of the Bayesian framework, stochastic filtering models for wear prediction are formulated. These models utilize the information from condition monitoring measurements in updating the residual life distribution of mechanical components. Finally, the application of stochastic control models in optimizing operational strategies for inspected components is studied. Monte Carlo simulation methods, such as the Gibbs sampler and the stochastic quasi-gradient method, are applied in the determination of posterior distributions and in the solution of stochastic optimization problems. (orig.) (57 refs., 7 figs., 1 tab.)
The ModelCC Model-Driven Parser Generator
Directory of Open Access Journals (Sweden)
Fernando Berzal
2015-01-01
Syntax-directed translation tools require the specification of a language by means of a formal grammar. This grammar must conform to the specific requirements of the parser generator to be used. This grammar is then annotated with semantic actions for the resulting system to perform its desired function. In this paper, we introduce ModelCC, a model-based parser generator that decouples language specification from language processing, avoiding some of the problems caused by grammar-driven parser generators. ModelCC receives a conceptual model as input, along with constraints that annotate it. It is then able to create a parser for the desired textual syntax and the generated parser fully automates the instantiation of the language conceptual model. ModelCC also includes a reference resolution mechanism so that ModelCC is able to instantiate abstract syntax graphs, rather than mere abstract syntax trees.
The Impact of Statistical Leakage Models on Design Yield Estimation
Directory of Open Access Journals (Sweden)
Rouwaida Kanj
2011-01-01
Device mismatch and process variation models play a key role in determining the functionality and yield of sub-100 nm designs. Average characteristics are often of interest, such as the average leakage current or the average read delay. However, detecting rare functional fails is critical for memory design, and designers often seek techniques that enable such events to be modeled accurately. Extremely leaky devices can inflict functionality fails. The plurality of leaky devices on a bitline increases the dimensionality of the yield estimation problem. Simplified models are possible by adopting approximations to the underlying sum of lognormals. The implications of such approximations on tail probabilities may in turn bias the yield estimate. We review different closed-form approximations and compare them against the CDF matching method, which is shown to be the most effective method for accurate statistical leakage modeling.
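One of the closed-form approximations the abstract alludes to is Fenton-Wilkinson moment matching, which replaces a sum of independent lognormal leakage components with a single lognormal having the same mean and variance; a minimal sketch (the component parameters below are illustrative, not from the paper):

```python
import math

def fenton_wilkinson(params):
    """Approximate a sum of independent lognormals by a single lognormal
    via moment matching (Fenton-Wilkinson): equate mean and variance.
    params: list of (mu, sigma) pairs of the component lognormals."""
    # Exact mean and variance of the sum of independent lognormals
    m = sum(math.exp(mu + 0.5 * s**2) for mu, s in params)
    v = sum((math.exp(s**2) - 1.0) * math.exp(2.0 * mu + s**2) for mu, s in params)
    # Solve for the matched lognormal's parameters
    sigma2 = math.log(1.0 + v / m**2)
    mu = math.log(m) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

# Example: three leakage components with different spreads
mu, sigma = fenton_wilkinson([(0.0, 0.5), (0.2, 0.8), (-0.3, 0.6)])
```

Because only the first two moments are matched, the tail probabilities of the approximation can deviate from the true sum, which is exactly the bias in yield estimates the abstract discusses.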
The GNASH preequilibrium-statistical nuclear model code
International Nuclear Information System (INIS)
Arthur, E. D.
1988-01-01
The following report is based on materials presented in a series of lectures at the International Center for Theoretical Physics, Trieste, designed to describe the GNASH preequilibrium statistical model code and its use. An overview of the code is provided, with emphasis on its calculational capabilities and the theoretical models implemented in it. Two sample problems are discussed: the first deals with neutron reactions on 58Ni; the second illustrates the fission model capabilities implemented in the code and involves n + 235U reactions. Finally, a description is provided of current theoretical model and code development underway. Examples of calculated results using these new capabilities are also given. 19 refs., 17 figs., 3 tabs
A nested multisite daily rainfall stochastic generation model
Srikanthan, Ratnasingham; Pegram, Geoffrey G. S.
2009-06-01
This paper describes a nested multisite daily rainfall generation model which preserves the statistics at daily, monthly and annual levels of aggregation. A multisite two-part daily model is nested in multisite monthly, then annual models. A multivariate set of fourth order Markov chains is used to model the daily occurrence of rainfall; the daily spatial correlation in the occurrence process is handled by using suitably correlated uniformly distributed variates via a Normal Scores Transform (NST) obtained from a set of matched multinormal pseudo-random variates, following Wilks [Wilks, D.S., 1998. Multisite generalisation of a daily stochastic precipitation generation model. Journal of Hydrology 210, 178-191]; we call it a hidden covariance model. A spatially correlated two parameter gamma distribution is used to obtain the rainfall depths; these values are also correlated via a specially matched hidden multinormal process. For nesting, the generated daily rainfall sequences at all the sites are aggregated to monthly rainfall values and these values are modified by a set of lag-1 autoregressive multisite monthly rainfall models. The modified monthly rainfall values are aggregated to annual rainfall and these are then modified by a lag-1 autoregressive multisite annual model. This nesting process ensures that the daily, monthly and annual means and covariances are preserved. The model was applied to a region with 30 rainfall sites, one of the five sets reported by Srikanthan [Srikanthan, R., 2005. Stochastic Generation of Daily Rainfall Data at a Number of Sites. Technical Report 05/7, CRC for Catchment Hydrology. Monash University, 66p]. A comparison of the historical and generated statistics shows that the model preserves all the important characteristics of rainfall at the daily, monthly and annual time scales, including the spatial structure. There are some outstanding features that need to be improved: depths of rainfall on isolated wet days and
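As a rough illustration of the two-part daily model, here is a single-site sketch with a first-order occurrence chain (the paper uses fourth-order multisite Markov chains with hidden multinormal correlation; all parameter values below are invented for illustration):

```python
import random

def generate_daily_rain(n_days, p_wd=0.3, p_ww=0.6, shape=0.7, scale=8.0, seed=1):
    """Single-site sketch of a two-part daily rainfall model:
    a first-order Markov chain for wet/dry occurrence (p_wd: P(wet|dry),
    p_ww: P(wet|wet)) and a two-parameter gamma distribution for wet-day depths."""
    rng = random.Random(seed)
    wet, series = False, []
    for _ in range(n_days):
        wet = rng.random() < (p_ww if wet else p_wd)
        series.append(rng.gammavariate(shape, scale) if wet else 0.0)
    return series

rain = generate_daily_rain(365)
# Nesting then aggregates daily totals to monthly (crudely, 30-day blocks here)
monthly = [sum(rain[i:i + 30]) for i in range(0, 360, 30)]
```

In the paper's nesting step, these monthly aggregates would be adjusted by a lag-1 autoregressive multisite monthly model, and likewise at the annual level.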
Estimating Predictive Variance for Statistical Gas Distribution Modelling
International Nuclear Information System (INIS)
Lilienthal, Achim J.; Asadi, Sahar; Reggente, Matteo
2009-01-01
Recent publications in statistical gas distribution modelling have proposed algorithms that model the mean and variance of a distribution. This paper argues that estimating the predictive concentration variance is not merely a gradual improvement but a significant step in advancing the field. This is, first, because such models much better fit the particular structure of gas distributions, which exhibit strong fluctuations with considerable spatial variations as a result of the intermittent character of gas dispersal. Second, because estimating the predictive variance makes it possible to evaluate the model quality in terms of the data likelihood. This offers a solution to the problem of ground truth evaluation, which has always been a critical issue for gas distribution modelling. It also enables solid comparisons of different modelling approaches, and provides the means to learn meta parameters of the model, to determine when the model should be updated or re-initialised, or to suggest new measurement locations based on the current model. We also point out directions of related ongoing or potential future research work.
Statistical emulation of a tsunami model for sensitivity analysis and uncertainty quantification
Directory of Open Access Journals (Sweden)
A. Sarri
2012-06-01
Due to the catastrophic consequences of tsunamis, early warnings need to be issued quickly in order to mitigate the hazard. Additionally, there is a need to represent the uncertainty in the predictions of tsunami characteristics corresponding to the uncertain trigger features (e.g. the position, shape and speed of a landslide, or the sea floor deformation associated with an earthquake). Unfortunately, computer models are expensive to run. This leads to significant delays in predictions and makes the uncertainty quantification impractical. Statistical emulators run almost instantaneously and may represent well the outputs of the computer model. In this paper, we use the outer product emulator to build a fast statistical surrogate of a landslide-generated tsunami computer model. This Bayesian framework enables us to build the emulator by combining prior knowledge of the computer model properties with a few carefully chosen model evaluations. The good performance of the emulator is validated using the leave-one-out method.
Experimental investigation of statistical models describing distribution of counts
International Nuclear Information System (INIS)
Salma, I.; Zemplen-Papp, E.
1992-01-01
The binomial, Poisson and modified Poisson models which are used for describing the statistical nature of the distribution of counts are compared theoretically, and conclusions for application are considered. The validity of the Poisson and the modified Poisson statistical distribution for observing k events in a short time interval is investigated experimentally for various measuring times. The experiments to measure the influence of the significant radioactive decay were performed with 89mY (T1/2 = 16.06 s), using a multichannel analyser (4096 channels) in the multiscaling mode. According to the results, Poisson statistics describe the counting experiment for short measuring times (up to T = 0.5 T1/2) and its application is recommended. However, analysis of the data demonstrated, with confidence, that for long measurements (T ≥ T1/2) the Poisson distribution is not valid and the modified Poisson function is preferable. The practical implications in calculating uncertainties and in optimizing the measuring time are discussed. Differences between the standard deviations evaluated on the basis of the Poisson and binomial models are especially significant for experiments with long measuring times (T/T1/2 ≥ 2) and/or large detection efficiency (ε > 0.30). Optimization of the measuring time for paired observations yields the same solution for either the binomial or the Poisson distribution. (orig.)
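The Poisson/binomial difference the abstract quantifies can be illustrated with a small sketch: treating each source atom as detected with probability p = ε(1 − 2^(−T/T1/2)), the binomial standard deviation √(Np(1−p)) falls below the Poisson value √(Np) as T/T1/2 and ε grow. The numbers below are illustrative, not taken from the paper:

```python
import math

def count_sd(n_atoms, eff, t_over_half):
    """Standard deviation of registered counts for a decaying source.
    Each of n_atoms decays within the measuring time and is detected
    with probability p = eff * (1 - 2**(-t_over_half)).
    Returns (Poisson-model SD, binomial-model SD)."""
    p = eff * (1.0 - 2.0 ** (-t_over_half))
    mean = n_atoms * p
    sd_binomial = math.sqrt(mean * (1.0 - p))
    sd_poisson = math.sqrt(mean)
    return sd_poisson, sd_binomial

# Long measurement, high efficiency: the two models clearly disagree
sd_p, sd_b = count_sd(10000, eff=0.35, t_over_half=2.0)
```

For short runs (T much less than T1/2) p is small, so sd_binomial ≈ sd_poisson, consistent with the abstract's recommendation of Poisson statistics up to T = 0.5 T1/2.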
CDFTBL: A statistical program for generating cumulative distribution functions from data
International Nuclear Information System (INIS)
Eslinger, P.W.
1991-06-01
This document describes the theory underlying the CDFTBL code and gives details for using the code. The CDFTBL code provides an automated tool for generating a statistical cumulative distribution function that describes a set of field data. The cumulative distribution function is written in the form of a table of probabilities, which can be used in a Monte Carlo computer code. As a specific application, CDFTBL can be used to analyze field data collected for parameters required by the PORMC computer code. Section 2.0 discusses the mathematical basis of the code. Section 3.0 discusses the code structure. Section 4.0 describes the free-format input command language, while Section 5.0 describes in detail the commands to run the program. Section 6.0 provides example program runs, and Section 7.0 provides references. The Appendix provides a program source listing. 11 refs., 2 figs., 19 tabs
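The idea of tabulating a cumulative distribution function from field data and drawing from it in a Monte Carlo code can be sketched as follows (this illustrates the general technique, not CDFTBL's actual input language or file format):

```python
import bisect
import random

def cdf_table(data):
    """Build an empirical cumulative-distribution table of
    (value, cumulative probability) pairs from field data."""
    xs = sorted(data)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

def sample(table, u):
    """Inverse-transform sampling from the table, as a Monte Carlo
    code would do with a uniform random number u in (0, 1]."""
    probs = [p for _, p in table]
    return table[bisect.bisect_left(probs, u)][0]

table = cdf_table([3.1, 1.2, 5.0, 2.7, 4.4])
draw = sample(table, random.random())
```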
Editorial to: Six papers on Dynamic Statistical Models
DEFF Research Database (Denmark)
2014-01-01
statistical methodology and theory for large and complex data sets that included biostatisticians and mathematical statisticians from three faculties at the University of Copenhagen. The satellite meeting took place August 17–19, 2011. Its purpose was to bring together researchers in statistics and related...... Group-Sequential Covariate-Adjusted Randomized Clinical Trials Antoine Chambaz and Mark J. van der Laan Estimation of Causal Odds of Concordance using the Aalen Additive Model Torben Martinussen and Christian Bressen Pipper We would like to acknowledge the financial support from the University...... of Copenhagen Program of Excellence and Elsevier. We would also like to thank the authors for contributing interesting papers, the referees for their helpful reports, and the present and previous editors of SJS for their support of the publication of the papers from the satellite meeting....
The Statistical Multifragmentation Model with Skyrme Effective Interactions
Carlson, B V; Donangelo, R; Lynch, W G; Steiner, A W; Tsang, M B
2010-01-01
The Statistical Multifragmentation Model is modified to incorporate Helmholtz free energies calculated in the finite temperature Thomas-Fermi approximation using Skyrme effective interactions. In this formulation, the density of the fragments at the freeze-out configuration corresponds to the equilibrium value obtained in the Thomas-Fermi approximation at the given temperature. The behavior of the nuclear caloric curve, at constant volume, is investigated in the micro-canonical ensemble and a plateau is observed for excitation energies between 8 and 10 MeV per nucleon. A small kink in the caloric curve is found at the onset of this gas transition, indicating the existence of negative heat capacity, even in this case in which the system is constrained to a fixed volume, in contrast to former statistical calculations.
Model output statistics applied to wind power prediction
Energy Technology Data Exchange (ETDEWEB)
Joensen, A.; Giebel, G.; Landberg, L. [Risoe National Lab., Roskilde (Denmark); Madsen, H.; Nielsen, H.A. [The Technical Univ. of Denmark, Dept. of Mathematical Modelling, Lyngby (Denmark)
1999-03-01
Being able to predict the output of a wind farm online for a day or two in advance has significant advantages for utilities, such as the ability to better schedule fossil-fuelled power plants and a better position on electricity spot markets. In this paper prediction methods based on Numerical Weather Prediction (NWP) models are considered. The spatial resolution used in NWP models implies that these predictions are not valid locally at a specific wind farm. Furthermore, due to the non-stationary nature and complexity of the processes in the atmosphere, and occasional changes of NWP models, the deviation between the predicted and the measured wind will be time dependent. If observational data is available, and if the deviation between the predictions and the observations exhibits systematic behavior, this should be corrected for; if statistical methods are used, this approach is usually referred to as MOS (Model Output Statistics). The influence of atmospheric turbulence intensity, topography, prediction horizon length and auto-correlation of wind speed and power is considered, and to take the time variations into account, adaptive estimation methods are applied. Three estimation techniques are considered and compared: extended Kalman filtering, recursive least squares and a new modified recursive least squares algorithm. (au) EU-JOULE-3. 11 refs.
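A minimal sketch of the recursive-least-squares flavour of adaptive MOS correction, with a forgetting factor so the fit can track a time-varying NWP bias (the data points and parameter values are invented for illustration):

```python
def rls_update(theta, P, x, y, lam=0.98):
    """One step of recursive least squares with forgetting factor lam.
    theta: coefficient list, P: covariance matrix (list of lists),
    x: regressor vector, y: observed target."""
    n = len(x)
    # Gain k = P x / (lam + x' P x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    k = [v / denom for v in Px]
    # Update coefficients toward the new observation
    err = y - sum(theta[i] * x[i] for i in range(n))
    theta = [theta[i] + k[i] * err for i in range(n)]
    # Covariance update P = (P - k x'P) / lam
    P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(n)] for i in range(n)]
    return theta, P

# Correct NWP wind speed toward observations: obs ≈ a + b * wind_nwp
theta, P = [0.0, 1.0], [[100.0, 0.0], [0.0, 100.0]]
for wind_nwp, wind_obs in [(5.0, 4.2), (7.0, 6.1), (6.0, 5.2), (8.0, 7.0)]:
    theta, P = rls_update(theta, P, [1.0, wind_nwp], wind_obs)
```

With lam < 1, older prediction errors are down-weighted geometrically, which is what lets the corrector adapt when the NWP model itself changes.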
A statistical model for interpreting computerized dynamic posturography data
Feiveson, Alan H.; Metter, E. Jeffrey; Paloski, William H.
2002-01-01
Computerized dynamic posturography (CDP) is widely used for assessment of altered balance control. CDP trials are quantified using the equilibrium score (ES), which ranges from zero to 100, as a decreasing function of peak sway angle. The problem of how best to model and analyze ESs from a controlled study is considered. The ES often exhibits a skewed distribution in repeated trials, which can lead to incorrect inference when applying standard regression or analysis of variance models. Furthermore, CDP trials are terminated when a patient loses balance. In these situations, the ES is not observable, but is assigned the lowest possible score--zero. As a result, the response variable has a mixed discrete-continuous distribution, further compromising inference obtained by standard statistical methods. Here, we develop alternative methodology for analyzing ESs under a stochastic model extending the ES to a continuous latent random variable that always exists, but is unobserved in the event of a fall. Loss of balance occurs conditionally, with probability depending on the realized latent ES. After fitting the model by a form of quasi-maximum-likelihood, one may perform statistical inference to assess the effects of explanatory variables. An example is provided, using data from the NIH/NIA Baltimore Longitudinal Study on Aging.
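The latent-variable idea can be sketched by simulation: a continuous latent ES always exists, a fall occurs with probability depending on its realized value (a logistic link is assumed here purely for illustration; the paper specifies its own model and fitting procedure), and a fallen trial is recorded as zero, producing the mixed discrete-continuous distribution described:

```python
import math
import random

def simulate_es(n_trials, mu=70.0, sd=12.0, a=6.0, b=0.12, seed=0):
    """Sketch of the latent-ES idea: a latent equilibrium score z always
    exists; a fall occurs with probability depending on z (illustrative
    logistic link), and a fallen trial is recorded as ES = 0."""
    rng = random.Random(seed)
    obs = []
    for _ in range(n_trials):
        z = min(100.0, max(0.0, rng.gauss(mu, sd)))    # latent ES in [0, 100]
        p_fall = 1.0 / (1.0 + math.exp(-(a - b * z)))  # falls likelier at low z
        obs.append(0.0 if rng.random() < p_fall else z)
    return obs

scores = simulate_es(500)
```

The resulting sample mixes a point mass at zero with a continuous component, which is why standard regression or ANOVA on the observed scores gives distorted inference.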
Yonemoto, Naohiro; Tanaka, Shiro; Furukawa, Toshi A; Kato, Tadashi; Mantani, Akio; Ogawa, Yusuke; Tajika, Aran; Takeshima, Nozomi; Hayasaka, Yu; Shinohara, Kiyomi; Miki, Kazuhira; Inagaki, Masatoshi; Shimodera, Shinji; Akechi, Tatsuo; Yamada, Mitsuhiko; Watanabe, Norio; Guyatt, Gordon H
2015-10-14
SUN(^_^)D, the Strategic Use of New generation antidepressants for Depression, is an assessor-blinded, parallel-group, multicenter pragmatic mega-trial to examine the optimum treatment strategy for the first- and second-line treatments for unipolar major depressive episodes. The trial has three steps and two randomizations. Step I randomization compares the minimum and the maximum dosing strategy for the first-line antidepressant. Step II randomization compares the continuation, augmentation or switching strategy for the second-line antidepressant treatment. Step III is a naturalistic continuation phase. The original protocol was published in 2011, and we hereby report its updated protocol including the statistical analysis plan. We implemented two important changes to the original protocol. One is about the required sample size, reflecting the smaller number of dropouts than had been expected. Another is in the organization of the primary and secondary outcomes in order to make the report of the main trial results as pertinent and interpretable as possible for clinical practices. Due to the complexity of the trial, we plan to report the main results in two separate reports, and this updated protocol and the statistical analysis plan have laid out respective primary and secondary outcomes and their analyses. We will convene the blind interpretation committee before the randomization code is broken. This paper presents the updated protocol and the detailed statistical analysis plan for the SUN(^_^)D trial in order to avoid reporting bias and data-driven results. ClinicalTrials.gov: NCT01109693 (registered on 21 April 2010).
Modelling the horizontal steam generator with APROS
Energy Technology Data Exchange (ETDEWEB)
Ylijoki, J. [VTT Energy, Espoo (Finland); Palsinajaervi, C.; Porkholm, K. [IVO International Ltd, Vantaa (Finland)
1995-12-31
In this paper the capability of the five- and six-equation models of the simulation code APROS to simulate the behaviour of the horizontal steam generator is discussed. Different nodalizations are used in the modelling and the results of the stationary state runs are compared. Exactly the same nodalizations have been created for the five- and six-equation models. The main simulation results studied in this paper are void fraction and mass flow distributions in the secondary side of the steam generator. It was found that quite a large number of simulation volumes is required to simulate the distributions with a reasonable accuracy. The simulation results of the different models are presented and their validity is discussed. (orig.). 4 refs.
Alternative methods of modeling wind generation using production costing models
International Nuclear Information System (INIS)
Milligan, M.R.; Pang, C.K.
1996-08-01
This paper examines methods of incorporating wind generation in two production costing models: one a load duration curve (LDC) based model and the other a chronological model. These two models were used to evaluate the impacts of wind generation on two utility systems, using actual collected wind data at two locations with high potential for wind generation. The results are sensitive to the selected wind data, and the level of benefits of wind generation is sensitive to the load forecast. The total production cost over a year obtained by the chronological approach does not differ significantly from that of the LDC approach, though the chronological commitment of units is more realistic and more accurate. Chronological models provide the capability of answering important questions about wind resources which are difficult or impossible to address with LDC models.
Deep Generative Models for Molecular Science
DEFF Research Database (Denmark)
Jørgensen, Peter Bjørn; Schmidt, Mikkel Nørgaard; Winther, Ole
2018-01-01
Generative deep machine learning models now rival traditional quantum-mechanical computations in predicting properties of new structures, and they come with a significantly lower computational cost, opening new avenues in computational molecular science. In the last few years, a variety of deep...
Higher Order Moments Generation by Mellin Transform for Compound Models of Clutter
Bhattacharya, C
2008-01-01
The compound models of clutter statistics are found suitable to describe the nonstationary nature of radar backscattering from high-resolution observations. In this letter, we show that the properties of Mellin transform can be utilized to generate higher order moments of simple and compound models of clutter statistics in a compact manner.
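The key property used is that the Mellin transform of a product of independent random variables factorizes, so the moments of a compound clutter model multiply: E[z^n] = E[texture^n] · E[speckle^n]. For example, for K-distributed intensity (Gamma texture modulating exponential speckle, a standard compound model; the parameter values below are illustrative):

```python
from math import gamma, factorial

def gamma_moment(n, shape, scale):
    """n-th moment of a Gamma(shape, scale) variable, obtainable from its
    Mellin transform: E[x^n] = scale**n * Γ(shape + n) / Γ(shape)."""
    return scale**n * gamma(shape + n) / gamma(shape)

def k_clutter_intensity_moment(n, nu, mean):
    """n-th moment of K-distributed clutter intensity z = texture * speckle,
    with texture ~ Gamma(nu, mean/nu) and speckle ~ Exp(1) (E[u^n] = n!).
    Because Mellin transforms of independent factors multiply, so do moments."""
    return gamma_moment(n, nu, mean / nu) * factorial(n)

m1 = k_clutter_intensity_moment(1, nu=2.5, mean=3.0)  # equals the mean, 3.0
m2 = k_clutter_intensity_moment(2, nu=2.5, mean=3.0)  # 2 * mean**2 * (nu+1)/nu
```

This is the compactness the letter refers to: arbitrarily high moments of the compound model come out as closed-form products of Gamma-function factors, with no integration required.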
MuffinInfo: HTML5-Based Statistics Extractor from Next-Generation Sequencing Data.
Alic, Andy S; Blanquer, Ignacio
2016-09-01
Usually, the information known a priori about a newly sequenced organism is limited. Even resequencing the same organism can generate unpredictable output. We introduce MuffinInfo, a FastQ/Fasta/SAM information extractor implemented in HTML5 capable of offering insights into next-generation sequencing (NGS) data. Our new tool can run on any software or hardware environment, from the command line or graphically, and in the browser or standalone. It presents information such as average length, base distribution, quality score distribution, k-mer histogram, and homopolymer analysis. MuffinInfo improves upon the existing extractors by adding the ability to save and then reload the results obtained after a run as a navigable file (also supporting saving pictures of the charts), by supporting custom statistics implemented by the user, and by offering user-adjustable parameters involved in the processing, all in one piece of software. At the moment, the extractor works with all base-space technologies such as Illumina, Roche, Ion Torrent, Pacific Biosciences, and Oxford Nanopore. Owing to HTML5, our software demonstrates the readiness of web technologies for mildly intensive tasks encountered in bioinformatics.
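The kinds of statistics listed (average length, base distribution, k-mer histogram) are straightforward to compute; a toy sketch over an in-memory list of FastQ lines (this is a generic illustration, not MuffinInfo's actual API):

```python
from collections import Counter

def fastq_stats(lines, k=3):
    """Minimal sketch of the kind of statistics an NGS extractor reports:
    average read length, base composition and a k-mer histogram.
    lines: a flat list of FastQ lines (4 lines per record)."""
    reads = [lines[i + 1] for i in range(0, len(lines), 4)]  # sequence line of each record
    avg_len = sum(len(r) for r in reads) / len(reads)
    bases = Counter(b for r in reads for b in r)
    kmers = Counter(r[i:i + k] for r in reads for i in range(len(r) - k + 1))
    return avg_len, bases, kmers

records = ["@read1", "ACGTAC", "+", "IIIIII",
           "@read2", "GGCA", "+", "IIII"]
avg_len, bases, kmers = fastq_stats(records)
```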
Dynamic statistical models of biological cognition: insights from communications theory
Wallace, Rodrick
2014-10-01
Maturana's cognitive perspective on the living state, Dretske's insight on how information theory constrains cognition, the Atlan/Cohen cognitive paradigm, and models of intelligence without representation, permit construction of a spectrum of dynamic necessary conditions statistical models of signal transduction, regulation, and metabolism at and across the many scales and levels of organisation of an organism and its context. Nonequilibrium critical phenomena analogous to physical phase transitions, driven by crosstalk, will be ubiquitous, representing not only signal switching, but the recruitment of underlying cognitive modules into tunable dynamic coalitions that address changing patterns of need and opportunity at all scales and levels of organisation. The models proposed here, while certainly providing much conceptual insight, should be most useful in the analysis of empirical data, much as are fitted regression equations.
A Statistical Graphical Model of the California Reservoir System
Taeb, A.; Reager, J. T.; Turmon, M.; Chandrasekaran, V.
2017-11-01
The recent California drought has highlighted the potential vulnerability of the state's water management infrastructure to multiyear dry intervals. Due to the high complexity of the network, dynamic storage changes in California reservoirs on a state-wide scale have previously been difficult to model using either traditional statistical or physical approaches. Indeed, although there is a significant line of research on exploring models for single (or a small number of) reservoirs, these approaches are not amenable to a system-wide modeling of the California reservoir network due to the spatial and hydrological heterogeneities of the system. In this work, we develop a state-wide statistical graphical model to characterize the dependencies among a collection of 55 major California reservoirs across the state; this model is defined with respect to a graph in which the nodes index reservoirs and the edges specify the relationships or dependencies between reservoirs. We obtain and validate this model in a data-driven manner based on reservoir volumes over the period 2003-2016. A key feature of our framework is a quantification of the effects of external phenomena that influence the entire reservoir network. We further characterize the degree to which physical factors (e.g., state-wide Palmer Drought Severity Index (PDSI), average temperature, snow pack) and economic factors (e.g., consumer price index, number of agricultural workers) explain these external influences. As a consequence of this analysis, we obtain a system-wide health diagnosis of the reservoir network as a function of PDSI.
MASKED AREAS IN SHEAR PEAK STATISTICS: A FORWARD MODELING APPROACH
Energy Technology Data Exchange (ETDEWEB)
Bard, D. [KIPAC, SLAC National Accelerator Laboratory, 2575 Sand Hill Rd, Menlo Park, CA 94025 (United States); Kratochvil, J. M. [Astrophysics and Cosmology Research Unit, University of KwaZulu-Natal, Westville, Durban 4000 (South Africa); Dawson, W., E-mail: djbard@slac.stanford.edu [Lawrence Livermore National Laboratory, 7000 East Ave, Livermore, CA 94550 (United States)
2016-03-10
The statistics of shear peaks have been shown to provide valuable cosmological information beyond the power spectrum, and will be an important constraint of models of cosmology in forthcoming astronomical surveys. Surveys include masked areas due to bright stars, bad pixels etc., which must be accounted for in producing constraints on cosmology from shear maps. We advocate a forward-modeling approach, where the impacts of masking and other survey artifacts are accounted for in the theoretical prediction of cosmological parameters, rather than correcting survey data to remove them. We use masks based on the Deep Lens Survey, and explore the impact of up to 37% of the survey area being masked on LSST and DES-scale surveys. By reconstructing maps of aperture mass the masking effect is smoothed out, resulting in up to 14% smaller statistical uncertainties compared to simply reducing the survey area by the masked area. We show that, even in the presence of large survey masks, the bias in cosmological parameter estimation produced in the forward-modeling process is ≈1%, dominated by bias caused by limited simulation volume. We also explore how this potential bias scales with survey area and evaluate how much small survey areas are impacted by the differences in cosmological structure in the data and simulated volumes, due to cosmic variance.
Cressie, Noel; Calder, Catherine A; Clark, James S; Ver Hoef, Jay M; Wikle, Christopher K
2009-04-01
Analyses of ecological data should account for the uncertainty in the process(es) that generated the data. However, accounting for these uncertainties is a difficult task, since ecology is known for its complexity. Measurement and/or process errors are often the only sources of uncertainty modeled when addressing complex ecological problems, yet analyses should also account for uncertainty in sampling design, in model specification, in parameters governing the specified model, and in initial and boundary conditions. Only then can we be confident in the scientific inferences and forecasts made from an analysis. Probability and statistics provide a framework that accounts for multiple sources of uncertainty. Given the complexities of ecological studies, the hierarchical statistical model is an invaluable tool. This approach is not new in ecology, and there are many examples (both Bayesian and non-Bayesian) in the literature illustrating the benefits of this approach. In this article, we provide a baseline for concepts, notation, and methods, from which discussion on hierarchical statistical modeling in ecology can proceed. We have also planted some seeds for discussion and tried to show where the practical difficulties lie. Our thesis is that hierarchical statistical modeling is a powerful way of approaching ecological analysis in the presence of inevitable but quantifiable uncertainties, even if practical issues sometimes require pragmatic compromises.
Development of modelling algorithm of technological systems by statistical tests
Shemshura, E. A.; Otrokov, A. V.; Chernyh, V. G.
2018-03-01
The paper tackles the problem of economic assessment of design efficiency for various technological systems at the stage of their operation. The modelling algorithm for a technological system, built using statistical tests and taking the reliability index into account, allows estimating the level of technical excellence of the machinery and determining the efficiency of its design reliability relative to its performance. The economic feasibility of its application is to be determined on the basis of the service quality of the technological system, with further forecasting of the volumes and range of spare parts supply.
A Tensor Statistical Model for Quantifying Dynamic Functional Connectivity.
Zhu, Yingying; Zhu, Xiaofeng; Kim, Minjeong; Yan, Jin; Wu, Guorong
2017-06-01
Functional connectivity (FC) has been widely investigated in many imaging-based neuroscience and clinical studies. Since the functional Magnetic Resonance Imaging (fMRI) signal is just an indirect reflection of brain activity, it is difficult to accurately quantify the FC strength based only on signal correlation. To address this limitation, we propose a learning-based tensor model to derive high-sensitivity and high-specificity connectome biomarkers at the individual level from resting-state fMRI images. First, we propose a learning-based approach to estimate the intrinsic functional connectivity. In addition to the low-level region-to-region signal correlation, latent module-to-module connection is also estimated and used to provide high-level heuristics for measuring connectivity strength. Furthermore, a sparsity constraint is employed to automatically remove spurious connections, thus alleviating the issue of searching for an optimal threshold. Second, we integrate our learning-based approach with the sliding-window technique to further reveal the dynamics of functional connectivity. Specifically, we stack the functional connectivity matrices within the sliding windows to form a 3D tensor where the third dimension denotes time. We then obtain the dynamic functional connectivity (dFC) for each individual subject by simultaneously estimating the within-sliding-window functional connectivity and characterizing the across-sliding-window temporal dynamics. Third, in order to enhance the robustness of the connectome patterns extracted from the dFC, we extend the individual-based 3D tensors to a population-based 4D tensor (with the fourth dimension standing for the training subjects) and learn the statistics of connectome patterns via 4D tensor analysis. Since our 4D tensor model jointly (1) optimizes the dFC for each training subject and (2) captures the principal connectome patterns, our statistical model gains more statistical power in representing new subjects than current state-of-the-art methods.
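The sliding-window tensor construction described in the second step can be sketched with NumPy (synthetic data; the window and step sizes here are arbitrary choices, not the paper's):

```python
import numpy as np

def dfc_tensor(ts, win, step):
    """Stack sliding-window functional-connectivity (correlation) matrices
    into a 3D tensor of shape (regions, regions, windows).
    ts: time series array of shape (timepoints, regions)."""
    mats = []
    for start in range(0, ts.shape[0] - win + 1, step):
        # Region-by-region correlation within this window
        mats.append(np.corrcoef(ts[start:start + win].T))
    return np.stack(mats, axis=-1)

rng = np.random.default_rng(0)
ts = rng.standard_normal((120, 10))        # 120 timepoints, 10 brain regions
tensor = dfc_tensor(ts, win=30, step=10)   # one correlation matrix per window
```

Stacking the per-subject 3D tensors along a fourth axis then gives the population-level 4D tensor the abstract analyzes.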
Multiple Scenario Generation of Subsurface Models
DEFF Research Database (Denmark)
Cordua, Knud Skou
In geosciences, as well as in astrophysics, direct observations of a studied physical system and objects are not always accessible. Instead, indirect observations have to be used in order to obtain information about the unknown system, which leads to an inverse problem. Such geoscientific inverse....... Finally, the probabilistic formulation provides a means of analyzing uncertainties and potential multiple-scenario solutions to be used for risk assessments in relation to, e.g., reservoir characterization and forecasting. Prior models rely on information from old data sets or expert knowledge in form of......, e.g., training images that express structural, lithological, or textural features. Statistics obtained from these types of observations will be referred to as sample models. Geostatistical sampling algorithms use a sample model as input and produce multiple realizations of the model parameters...
Erdem, Riza; Aydiner, Ekrem
2009-03-01
Voltage-gated ion channels are key molecules for the generation and propagation of electrical signals in excitable cell membranes. The voltage-dependent switching of these channels between conducting and nonconducting states is a major factor in controlling the transmembrane voltage. In this study, a statistical mechanics model of these molecules is discussed on the basis of a two-dimensional spin model. A new Hamiltonian and a new Monte Carlo simulation algorithm are introduced to simulate such a model. The results are shown to match well the experimental data obtained from batrachotoxin-modified sodium channels in the squid giant axon using the cut-open axon technique.
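A generic two-state lattice simulation conveys the flavor of such spin-model treatments of channel gating. The sketch below uses a plain Ising Hamiltonian and standard Metropolis updates, not the paper's new Hamiltonian or algorithm, and every parameter value is illustrative.

```python
import math
import random

def metropolis_sweep(grid, J, h, beta, rng):
    """One Metropolis sweep over a 2D lattice of two-state 'channels'.

    Spin +1 = conducting (open), -1 = nonconducting (closed).
    Energy: E = -J * sum_<ij> s_i s_j - h * sum_i s_i, a generic Ising form
    standing in for the paper's Hamiltonian, which is not reproduced here.
    """
    n = len(grid)
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        s = grid[i][j]
        nb = (grid[(i + 1) % n][j] + grid[(i - 1) % n][j]
              + grid[i][(j + 1) % n] + grid[i][(j - 1) % n])
        dE = 2 * s * (J * nb + h)          # energy change if s flips
        if dE <= 0 or rng.random() < math.exp(-beta * dE):
            grid[i][j] = -s

rng = random.Random(0)
n = 16
grid = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(n)]
for _ in range(50):
    metropolis_sweep(grid, J=1.0, h=0.5, beta=0.7, rng=rng)
open_fraction = sum(s == 1 for row in grid for s in row) / (n * n)
print(open_fraction)
```

The "membrane voltage" would enter such a model through the field term h, biasing channels toward the open or closed state.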
Statistical Models for Inferring Vegetation Composition from Fossil Pollen
Paciorek, C.; McLachlan, J. S.; Shang, Z.
2011-12-01
Fossil pollen provide information about vegetation composition that can be used to help understand how vegetation has changed in the past. However, these data have not traditionally been analyzed in a way that allows for statistical inference about spatio-temporal patterns and trends. We build a Bayesian hierarchical model called STEPPS (Spatio-Temporal Empirical Prediction from Pollen in Sediments) that predicts forest composition in southern New England, USA, over the last two millennia based on fossil pollen. The critical relationships between abundances of tree taxa in the pollen record and abundances in actual vegetation are estimated using modern (Forest Inventory and Analysis) data and (witness tree) data from colonial records. This gives us two time points at which both pollen and direct vegetation data are available. Based on these relationships, and incorporating our uncertainty about them, we predict forest composition using fossil pollen. We estimate the spatial distribution and relative abundances of tree species and draw inference about how these patterns have changed over time. Finally, we describe ongoing work to extend the modeling to the upper Midwest of the U.S., including an approach to infer tree density and thereby estimate the prairie-forest boundary in Minnesota and Wisconsin. This work is part of the PalEON project, which brings together a team of ecosystem modelers, paleoecologists, and statisticians with the goal of reconstructing vegetation responses to climate during the last two millennia in the northeastern and midwestern United States. The estimates from the statistical modeling will be used to assess and calibrate ecosystem models that are used to project ecological changes in response to global change.
Steinberg, P. D.; Brener, G.; Duffy, D.; Nearing, G. S.; Pelissier, C.
2017-12-01
Hyperparameterization of statistical models, i.e. automated model scoring and selection via evolutionary algorithms, grid searches, and randomized searches, can improve forecast model skill by reducing errors associated with model parameterization, model structure, and statistical properties of training data. Ensemble Learning Models (Elm), and the related Earthio package, provide a flexible interface for automating the selection of parameters and model structure for machine learning models common in climate science and land cover classification, offering convenient tools for loading NetCDF, HDF, Grib, or GeoTiff files, decomposition methods like PCA and manifold learning, and parallel training and prediction with unsupervised and supervised classification, clustering, and regression estimators. Continuum Analytics is using Elm to experiment with statistical soil moisture forecasting based on meteorological forcing data from NASA's North American Land Data Assimilation System (NLDAS). There, Elm uses the NSGA-2 multiobjective optimization algorithm to optimize statistical preprocessing of the forcing data to improve goodness-of-fit for statistical models (i.e. feature engineering). This presentation will discuss Elm and its components, including dask (distributed task scheduling), xarray (data structures for n-dimensional arrays), and scikit-learn (statistical preprocessing, clustering, classification, regression), and will show how NSGA-2 is being used to automate the selection of soil moisture forecast statistical models for North America.
A new Markov-chain-related statistical approach for modelling synthetic wind power time series
International Nuclear Information System (INIS)
Pesch, T; Hake, J F; Schröders, S; Allelein, H J
2015-01-01
The integration of rising shares of volatile wind power in the generation mix is a major challenge for the future energy system. To address the uncertainties involved in wind power generation, models analysing and simulating the stochastic nature of this energy source are becoming increasingly important. One statistical approach that has been frequently used in the literature is the Markov chain approach. Recently, the method was identified as being of limited use for generating wind time series with time steps shorter than 15–40 min as it is not capable of reproducing the autocorrelation characteristics accurately. This paper presents a new Markov-chain-related statistical approach that is capable of solving this problem by introducing a variable second lag. Furthermore, additional features are presented that allow for the further adjustment of the generated synthetic time series. The influences of the model parameter settings are examined by meaningful parameter variations. The suitability of the approach is demonstrated by an application analysis with the example of the wind feed-in in Germany. It shows that—in contrast to conventional Markov chain approaches—the generated synthetic time series do not systematically underestimate the required storage capacity to balance wind power fluctuation. (paper)
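A minimal second-order (two-lag) Markov chain generator for a discretized power series can be sketched as follows. The fixed second lag, the four-level toy data, and the fallback rule for unseen contexts are assumptions for illustration; the paper's variable second lag and additional adjustment features are not reproduced.

```python
import random

def fit_second_order(states):
    """Count transitions conditioned on the two previous states."""
    counts = {}
    for a, b, c in zip(states, states[1:], states[2:]):
        counts.setdefault((a, b), {}).setdefault(c, 0)
        counts[(a, b)][c] += 1
    return counts

def synthesize(counts, seed_pair, length, rng):
    """Generate a synthetic series by sampling from the fitted transitions."""
    out = list(seed_pair)
    while len(out) < length:
        dist = counts.get((out[-2], out[-1]))
        if not dist:                       # unseen context: restart from seed
            dist = counts[seed_pair]
        choices, weights = zip(*dist.items())
        out.append(rng.choices(choices, weights=weights)[0])
    return out

rng = random.Random(1)
# Toy "wind power" series discretized into 4 levels (illustrative data only).
obs = [0, 0, 1, 2, 2, 1, 0, 1, 2, 3, 3, 2, 1, 1, 0, 0, 1, 2, 2, 1] * 10
counts = fit_second_order(obs)
synth = synthesize(counts, (0, 0), 200, rng)
print(len(synth), min(synth), max(synth))
```

Conditioning on two lags instead of one is what lets such a generator reproduce more of the autocorrelation structure at short time steps.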
Statistical Agent Based Modelization of the Phenomenon of Drug Abuse
di Clemente, Riccardo; Pietronero, Luciano
2012-07-01
We introduce a statistical agent-based model to describe the phenomenon of drug abuse and its dynamical evolution at the individual and global level. The agents are heterogeneous with respect to their intrinsic inclination toward drugs, their budget attitude, and their social environment. The various levels of drug use were inspired by the professional description of the phenomenon, and this permits a direct comparison with all available data. We show that certain elements have great importance in starting drug use, for example rare events in personal experience which permit one to occasionally overcome the barrier to drug use. The analysis of how the system reacts to perturbations is very important for understanding its key elements, and it provides strategies for effective policy making. The present model represents the first step toward a realistic description of this phenomenon and can be easily generalized in various directions.
A statistical model of Rift Valley fever activity in Egypt.
Drake, John M; Hassan, Ali N; Beier, John C
2013-12-01
Rift Valley fever (RVF) is a viral disease of animals and humans and a global public health concern due to its ecological plasticity, adaptivity, and potential for spread to countries with a temperate climate. In many places, outbreaks are episodic and linked to climatic, hydrologic, and socioeconomic factors. Although outbreaks of RVF have occurred in Egypt since 1977, attempts to identify risk factors have been limited. Using a statistical learning approach (lasso-regularized generalized linear model), we tested the hypotheses that outbreaks in Egypt are linked to (1) River Nile conditions that create a mosquito vector habitat, (2) entomologic conditions favorable to transmission, (3) socio-economic factors (Islamic festival of Greater Bairam), and (4) recent history of transmission activity. Evidence was found for effects of rainfall and river discharge and recent history of transmission activity. There was no evidence for an effect of Greater Bairam. The model predicted RVF activity correctly in 351 of 358 months (98.0%). This is the first study to statistically identify risk factors for RVF outbreaks in a region of unstable transmission. © 2013 The Society for Vector Ecology.
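The lasso component of such a statistical learning approach can be illustrated with a small coordinate-descent solver. This is a linear-model stand-in for the lasso-regularized GLM used in the study, fitted to synthetic data; `lasso_cd`, the penalty value, and the test data are all assumptions.

```python
import random

def soft_threshold(z, g):
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def lasso_cd(X, y, lam, iters=200):
    """Coordinate-descent lasso for a linear model: minimizes
    0.5 * ||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        for j in range(p):
            # partial residual excluding feature j
            r = [y[i] - sum(beta[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            zj = sum(X[i][j] * r[i] for i in range(n))
            sj = sum(X[i][j] ** 2 for i in range(n))
            beta[j] = soft_threshold(zj, lam) / sj
    return beta

rng = random.Random(2)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
y = [2.0 * a + rng.gauss(0, 0.1) for a, b in X]   # only feature 0 matters
beta = lasso_cd(X, y, lam=5.0)
print(beta)   # first coefficient near 2, second shrunk toward 0
```

The soft-thresholding step is what zeroes out weak predictors, which is how the lasso selects among candidate risk factors such as rainfall, river discharge, and festival timing.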
Directory of Open Access Journals (Sweden)
Wenzhi Wang
2016-07-01
Modeling the random fiber distribution of a fiber-reinforced composite is of great importance for studying the progressive failure behavior of the material on the micro scale. In this paper, we develop a new algorithm for generating random representative volume elements (RVEs) with a statistically equivalent fiber distribution against the actual material microstructure. Realistic statistical data are utilized as inputs to the new method, which is achieved through implementation of the probability equations. Extensive statistical analysis is conducted to examine the capability of the proposed method and to compare it with existing methods. It is found that the proposed method presents a good match with experimental results in all aspects, including the nearest neighbor distance, nearest neighbor orientation, Ripley’s K function, and the radial distribution function. Finite element analysis is presented to predict the effective elastic properties of a carbon/epoxy composite, to validate the generated random representative volume elements, and to provide insights into the effect of fiber distribution on the elastic properties. The present algorithm is shown to be highly accurate and can be used to generate statistically equivalent RVEs not only for fiber-reinforced composites but also for other materials such as foam materials and particle-reinforced composites.
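A bare-bones hard-core placement routine shows the starting point for such RVE generators. It only enforces non-overlap between fibers; matching the measured statistics (nearest-neighbor distance, Ripley's K, and so on), which is the paper's actual contribution, is not attempted here, and all dimensions are illustrative.

```python
import math
import random

def place_fibers(width, height, radius, n_fibers, rng, max_tries=100000):
    """Random sequential placement of non-overlapping fiber centres in a
    rectangular RVE window (a hard-core sketch, not the paper's algorithm)."""
    centres = []
    tries = 0
    while len(centres) < n_fibers and tries < max_tries:
        tries += 1
        x = rng.uniform(radius, width - radius)
        y = rng.uniform(radius, height - radius)
        if all(math.hypot(x - cx, y - cy) >= 2 * radius for cx, cy in centres):
            centres.append((x, y))
    return centres

def nearest_neighbour_distances(centres):
    """First nearest-neighbour distance for each fibre, one of the
    descriptors used to validate generated RVEs."""
    return [min(math.hypot(x - ox, y - oy)
                for ox, oy in centres if (ox, oy) != (x, y))
            for x, y in centres]

rng = random.Random(4)
centres = place_fibers(100.0, 100.0, radius=3.5, n_fibers=40, rng=rng)
dists = nearest_neighbour_distances(centres)
print(len(centres), min(dists) >= 7.0)
```

A statistically equivalent generator would additionally accept target distributions for these descriptors and bias the placement toward them.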
Statistical modeling of global geogenic arsenic contamination in groundwater.
Amini, Manouchehr; Abbaspour, Karim C; Berg, Michael; Winkel, Lenny; Hug, Stephan J; Hoehn, Eduard; Yang, Hong; Johnson, C Annette
2008-05-15
Contamination of groundwaters with geogenic arsenic poses a major health risk to millions of people. Although the main geochemical mechanisms of arsenic mobilization are well understood, the worldwide scale of affected regions is still unknown. In this study we used a large database of measured arsenic concentration in groundwaters (around 20,000 data points) from around the world as well as digital maps of physical characteristics such as soil, geology, climate, and elevation to model probability maps of global arsenic contamination. A novel rule-based statistical procedure was used to combine the physical data and expert knowledge to delineate two process regions for arsenic mobilization: "reducing" and "high-pH/oxidizing". Arsenic concentrations were modeled in each region using regression analysis and adaptive neuro-fuzzy inferencing followed by Latin hypercube sampling for uncertainty propagation to produce probability maps. The derived global arsenic models could benefit from more accurate geologic information and aquifer chemical/physical information. Using some proxy surface information, however, the models explained 77% of arsenic variation in reducing regions and 68% of arsenic variation in high-pH/oxidizing regions. The probability maps based on the above models correspond well with the known contaminated regions around the world and delineate new untested areas that have a high probability of arsenic contamination. Notable among these regions are the southeast and northwest of China, central Australia, New Zealand, northern Afghanistan, and northern Mali and Zambia in Africa.
Flashover of a vacuum-insulator interface: A statistical model
Directory of Open Access Journals (Sweden)
W. A. Stygar
2004-07-01
We have developed a statistical model for the flashover of a 45° vacuum-insulator interface (such as would be found in an accelerator) subject to a pulsed electric field. The model assumes that the initiation of a flashover plasma is a stochastic process, that the characteristic statistical component of the flashover delay time is much greater than the plasma formative time, and that the average rate at which flashovers occur is a power-law function of the instantaneous value of the electric field. Under these conditions, we find that the flashover probability is given by 1-exp(-E_{p}^{β}t_{eff}C/k^{β}), where E_{p} is the peak value in time of the spatially averaged electric field E(t), t_{eff}≡∫[E(t)/E_{p}]^{β}dt is the effective pulse width, C is the insulator circumference, k∝exp(λ/d), and β and λ are constants. We define E(t) as V(t)/d, where V(t) is the voltage across the insulator and d is the insulator thickness. Since the model assumes that flashovers occur at random azimuthal locations along the insulator, it does not apply to systems that have a significant defect, i.e., a location contaminated with debris or compromised by an imperfection at which flashovers repeatedly take place, and which prevents a random spatial distribution. The model is consistent with flashover measurements to within 7% for pulse widths between 0.5 ns and 10 μs, and to within a factor of 2 between 0.5 ns and 90 s (a span of over 11 orders of magnitude). For these measurements, E_{p} ranges from 64 to 651 kV/cm, d from 0.50 to 4.32 cm, and C from 4.96 to 95.74 cm. The model is significantly more accurate, and is valid over a wider range of parameters, than the J. C. Martin flashover relation that has been in use since 1971 [J. C. Martin on Pulsed Power, edited by T. H. Martin, A. H. Guenther, and M. Kristiansen (Plenum, New York, 1996)]. We have generalized the statistical model to estimate the total-flashover probability of an
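The quoted probability expression is straightforward to evaluate numerically for a sampled field waveform. The constants below (β, λ, k0) and the half-sine test pulse are placeholders, not the fitted values from the paper.

```python
import math

def flashover_probability(E_of_t, dt, d, C, beta, lam, k0):
    """Evaluate P = 1 - exp(-E_p^beta * t_eff * C / k^beta) with
    t_eff = sum (E(t)/E_p)^beta * dt and k = k0 * exp(lam / d).
    All constants here are illustrative, not the paper's fitted values."""
    E_p = max(E_of_t)
    t_eff = sum((e / E_p) ** beta for e in E_of_t) * dt
    k = k0 * math.exp(lam / d)
    return 1.0 - math.exp(-(E_p ** beta) * t_eff * C / (k ** beta))

# Half-sine pulse in kV/cm, 100 samples, 1 ns apart (hypothetical waveform).
pulse = [100.0 * math.sin(math.pi * i / 99) for i in range(100)]
p1 = flashover_probability(pulse, dt=1e-9, d=2.0, C=30.0,
                           beta=10.0, lam=1.0, k0=50.0)
p2 = flashover_probability([2 * e for e in pulse], dt=1e-9, d=2.0, C=30.0,
                           beta=10.0, lam=1.0, k0=50.0)
print(p1, p2)
```

Because the rate is a steep power law in the field, doubling the peak field raises the flashover probability by roughly a factor of 2^β, which the two evaluations above demonstrate.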
Statistics Based Models for the Dynamics of Chernivtsi Children Disease
Directory of Open Access Journals (Sweden)
Igor G. Nesteruk
2017-10-01
Background. Simple mathematical models of contamination and the SIR model of infection spread were used to simulate the time dynamics of a previously unknown children's disease which occurred in Chernivtsi (Ukraine). The cause of many cases of alopecia, which began in this city in August 1988, is still not fully clarified. According to the official report of the governmental commission, the last new cases occurred in the middle of November 1988, and the cause of the illness was reported as chemical exogenous intoxication. The illness later became known as the “Chernivtsi chemical disease”. Nevertheless, a significantly increased number of new cases of local alopecia was registered for almost three years and is still not clarified. Objective. To compare two different explanations of the disease, chemical exogenous intoxication and infection; to identify the parameters of the mathematical models; and to predict the development of the disease. Methods. Analytical solutions of the contamination models and of the SIR model for an epidemic are obtained. The optimal parameter values were found using linear regression. Results. The optimal values of the model parameters were identified using a statistical approach. The calculations showed that the infectious explanation of the disease is more plausible than the popular contamination one. The possible date of the epidemic's beginning was estimated. Conclusions. The optimal parameters of the SIR model allow calculating the realistic number of victims and other characteristics of a possible epidemic. They also show that the increased number of cases of local alopecia could be part of the same epidemic as the “Chernivtsi chemical disease”.
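The SIR side of the comparison can be sketched with a forward-Euler integration of the standard equations. All parameter values below are illustrative, not the ones fitted to the Chernivtsi data.

```python
def sir(S0, I0, R0, beta, gamma, dt, steps):
    """Forward-Euler integration of the classic SIR equations:
    dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I.
    Parameters are illustrative, not values fitted in the paper."""
    N = S0 + I0 + R0
    S, I, R = float(S0), float(I0), float(R0)
    history = [(S, I, R)]
    for _ in range(steps):
        new_inf = beta * S * I / N * dt   # new infections this step
        new_rec = gamma * I * dt          # new recoveries this step
        S -= new_inf
        I += new_inf - new_rec
        R += new_rec
        history.append((S, I, R))
    return history

h = sir(S0=990, I0=10, R0=0, beta=0.4, gamma=0.1, dt=0.1, steps=2000)
S, I, R = h[-1]
print(round(S + I + R, 6), round(R))
```

Fitting beta and gamma to the observed case counts (e.g., by the regression approach the paper describes) is what allows the infectious and contamination explanations to be compared quantitatively.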
A Statistical Model for Regional Tornado Climate Studies.
Directory of Open Access Journals (Sweden)
Thomas H Jagger
Tornado reports are locally rare, often clustered, and of variable quality, making it difficult to use them directly to describe regional tornado climatology. Here a statistical model is demonstrated that overcomes some of these difficulties and produces a smoothed regional-scale climatology of tornado occurrences. The model is applied to data aggregated at the level of counties. These data include annual population, annual tornado counts and an index of terrain roughness. The model has a term to capture the smoothed frequency relative to the state average. The model is used to examine whether terrain roughness is related to tornado frequency and whether there are differences in tornado activity by County Warning Area (CWA). A key finding is that tornado reports increase by 13% for a two-fold increase in population across Kansas after accounting for improvements in rating procedures. Independent of this relationship, tornadoes have been increasing at an annual rate of 1.9%. Another finding is the pattern of correlated residuals showing more Kansas tornadoes in a corridor of counties running roughly north to south across the west-central part of the state, consistent with the dryline climatology. The model is significantly improved by adding terrain roughness. The effect amounts to an 18% reduction in the number of tornadoes for every ten-meter increase in elevation standard deviation. The model indicates that tornadoes are 51% more likely to occur in counties served by the CWAs of DDC and GID than elsewhere in the state. Flexibility of the model is illustrated by fitting it to data from Illinois, Mississippi, South Dakota, and Ohio.
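The reported effect sizes follow directly from the log link of such a count model: a coefficient of ln(1.13) on log2(population) yields 13% more reports per population doubling, and ln(1.019) per year gives the 1.9% annual trend. The baseline rate below is an arbitrary illustrative value, not a fitted intercept.

```python
import math

# In a log-link count model, rate = exp(b0 + b1 * log2(pop) + b2 * year).
# The abstract's "13% per population doubling" and "1.9% per year" translate
# to b1 = ln(1.13) and b2 = ln(1.019); b0 is a hypothetical baseline.
b0, b1, b2 = math.log(0.5), math.log(1.13), math.log(1.019)

def expected_reports(pop, years_since_start):
    return math.exp(b0 + b1 * math.log2(pop) + b2 * years_since_start)

r1 = expected_reports(10_000, 0)
r2 = expected_reports(20_000, 0)    # doubled population
r3 = expected_reports(10_000, 10)   # ten years later
print(round(r2 / r1, 3), round(r3 / r1, 3))
```

The multiplicative structure is why the two effects can be reported independently: each covariate scales the expected count by its own factor.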
Linear mixed models a practical guide using statistical software
West, Brady T; Galecki, Andrzej T
2014-01-01
Highly recommended by JASA, Technometrics, and other journals, the first edition of this bestseller showed how to easily perform complex linear mixed model (LMM) analyses via a variety of software programs. Linear Mixed Models: A Practical Guide Using Statistical Software, Second Edition continues to lead readers step by step through the process of fitting LMMs. This second edition covers additional topics on the application of LMMs that are valuable for data analysts in all fields. It also updates the case studies using the latest versions of the software procedures and provides up-to-date information on the options and features of the software procedures available for fitting LMMs in SAS, SPSS, Stata, R/S-plus, and HLM.New to the Second Edition A new chapter on models with crossed random effects that uses a case study to illustrate software procedures capable of fitting these models Power analysis methods for longitudinal and clustered study designs, including software options for power analyses and suggest...
Stochastic Spatial Models in Ecology: A Statistical Physics Approach
Pigolotti, Simone; Cencini, Massimo; Molina, Daniel; Muñoz, Miguel A.
2017-11-01
Ecosystems display a complex spatial organization. Ecologists have long tried to characterize them by looking at how different measures of biodiversity change across spatial scales. Ecological neutral theory has provided simple predictions accounting for general empirical patterns in communities of competing species. However, while neutral theory in well-mixed ecosystems is mathematically well understood, spatial models still present several open problems, limiting the quantitative understanding of spatial biodiversity. In this review, we discuss the state of the art in spatial neutral theory. We emphasize the connection between spatial ecological models and the physics of non-equilibrium phase transitions and how concepts developed in statistical physics translate in population dynamics, and vice versa. We focus on non-trivial scaling laws arising at the critical dimension D = 2 of spatial neutral models, and their relevance for biological populations inhabiting two-dimensional environments. We conclude by discussing models incorporating non-neutral effects in the form of spatial and temporal disorder, and analyze how their predictions deviate from those of purely neutral theories.
Willems, Sander; Fraiture, Marie-Alice; Deforce, Dieter; De Keersmaecker, Sigrid C J; De Loose, Marc; Ruttink, Tom; Herman, Philippe; Van Nieuwerburgh, Filip; Roosens, Nancy
2016-02-01
Because the number and diversity of genetically modified (GM) crops has significantly increased, their analysis based on real-time PCR (qPCR) methods is becoming increasingly complex and laborious. While several pioneers already investigated Next Generation Sequencing (NGS) as an alternative to qPCR, its practical use has not been assessed for routine analysis. In this study, a statistical framework was developed to predict the number of NGS reads needed to detect transgene sequences, to prove their integration into the host genome and to identify the specific transgene event in a sample with known composition. This framework was validated by applying it to experimental data from food matrices composed of pure GM rice, processed GM rice (noodles) or a 10% GM/non-GM rice mixture, revealing some influential factors. Finally, feasibility of NGS for routine analysis of GM crops was investigated by applying the framework to samples commonly encountered in routine analysis of GM crops. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Quiroz, Jorge; Tsao, Yung-Shyeng
2016-07-08
Assurance of monoclonality of recombinant cell lines is a critical issue in gaining regulatory approval in a biological license application (BLA). Some of the requirements of regulatory agencies are the use of proper documentation and appropriate statistical analysis to demonstrate monoclonality. In some cases, one round may be sufficient to demonstrate monoclonality. In this article, we propose the use of confidence intervals for assessing monoclonality for limiting dilution cloning in the generation of recombinant manufacturing cell lines based on a single round. The use of confidence intervals instead of point estimates allows practitioners to account for the uncertainty present in the data when assessing whether an estimated level of monoclonality is consistent with regulatory requirements. In other cases, one round may not be sufficient and two consecutive rounds are required to assess monoclonality. When two consecutive subclonings are required, we improve the present methodology by reducing the infinite series proposed by Coller and Coller (Hybridoma 1983;2:91-96) to a simpler series. The proposed simpler series provides more accurate and reliable results. It also reduces the level of computation and can be easily implemented in any spreadsheet program like Microsoft Excel. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:1061-1068, 2016.
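The textbook limiting-dilution argument behind such assessments is compact: under Poisson seeding, the chance that a growing well is monoclonal follows in one line. The functions below are generic sketches, a Poisson monoclonality estimate plus a Wilson interval for the observed growth fraction, not the article's proposed interval method or its simplified series; the seeding density and well counts are hypothetical.

```python
import math

def monoclonal_given_growth(lam):
    """P(exactly one founder cell | well shows growth), assuming the number
    of cells seeded per well is Poisson(lam). Textbook limiting-dilution
    argument, not the article's method."""
    return lam * math.exp(-lam) / (1.0 - math.exp(-lam))

def wilson_interval(successes, n, z=1.96):
    """Wilson 95% CI for a binomial proportion (e.g., fraction of wells
    showing growth), a generic stand-in for the article's intervals."""
    phat = successes / n
    denom = 1 + z * z / n
    centre = (phat + z * z / (2 * n)) / denom
    half = z * math.sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Seeding at 0.5 cells/well on average; 38 of 96 wells show growth.
p = monoclonal_given_growth(0.5)
lo, hi = wilson_interval(38, 96)
print(round(p, 4), round(lo, 3), round(hi, 3))
```

Reporting an interval rather than the point estimate is the practical payoff: a regulator can check whether the whole interval, not just the estimate, clears the required monoclonality level.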
Chapter 3 – Phenomenology of Tsunamis: Statistical Properties from Generation to Runup
Geist, Eric L.
2015-01-01
Observations related to tsunami generation, propagation, and runup are reviewed and described in a phenomenological framework. In the three coastal regimes considered (near-field broadside, near-field oblique, and far field), the observed maximum wave amplitude is associated with different parts of the tsunami wavefield. The maximum amplitude in the near-field broadside regime is most often associated with the direct arrival from the source, whereas in the near-field oblique regime, the maximum amplitude is most often associated with the propagation of edge waves. In the far field, the maximum amplitude is most often caused by the interaction of the tsunami coda that develops during basin-wide propagation and the nearshore response, including the excitation of edge waves, shelf modes, and resonance. Statistical distributions that describe tsunami observations are also reviewed, both in terms of spatial distributions, such as coseismic slip on the fault plane and near-field runup, and temporal distributions, such as wave amplitudes in the far field. In each case, fundamental theories of tsunami physics are heuristically used to explain the observations.
Automated mask generation for PIV image analysis based on pixel intensity statistics
Masullo, Alessandro; Theunissen, Raf
2017-06-01
The measurement of displacements near the vicinity of surfaces involves advanced PIV algorithms requiring accurate knowledge of object boundaries. These data typically come in the form of a logical mask, generated manually or through automatic algorithms. The automatic detection of masks usually necessitates special features or reference points such as bright lines, high contrast objects, and sufficiently observable coherence between pixels. These are, however, not always present in experimental images necessitating a more robust and general approach. In this work, the authors propose a novel method for the automatic detection of static image regions which do not contain relevant information for the estimation of particle image displacements and can consequently be excluded or masked out. The method does not require any a priori knowledge of the static objects (i.e., contrast, brightness, or strong features) as it exploits statistical information from multiple PIV images. Based on the observation that the temporal variation in light intensity follows a completely different distribution for flow regions and object regions, the method utilizes a normality test and an automatic thresholding method on the retrieved probability to identify regions to be masked. The method is assessed through a Monte Carlo simulation with synthetic images and its performance under realistic imaging conditions is proven based on three experimental test cases.
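The idea of masking pixels by their temporal intensity statistics can be sketched with a simplified proxy: temporal variance plus Otsu's automatic threshold, instead of the paper's normality test on the intensity distribution. The frame sizes and the synthetic flickering data are illustrative.

```python
import random

def otsu_threshold(values, bins=64):
    """Otsu's between-class variance maximization over a 1-D sample."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    grand = sum((i + 0.5) * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = lo, -1.0, 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += (i + 0.5) * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (grand - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, lo + (i + 1) * width
    return best_t

def static_mask(frames):
    """Mark low-temporal-variance pixels as 'static' (maskable). This is a
    simplified proxy for the paper's normality-test criterion."""
    n_t, h, w = len(frames), len(frames[0]), len(frames[0][0])
    var = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            px = [f[i][j] for f in frames]
            m = sum(px) / n_t
            var[i][j] = sum((v - m) ** 2 for v in px) / n_t
    t = otsu_threshold([v for row in var for v in row])
    return [[var[i][j] < t for j in range(w)] for i in range(h)]

rng = random.Random(3)
# 50 synthetic 8x8 frames: left half is a static object, right half flickers.
frames = [[[10.0 if j < 4 else rng.uniform(0, 255) for j in range(8)]
           for i in range(8)] for _ in range(50)]
mask = static_mask(frames)
print(sum(mask[i][j] for i in range(8) for j in range(4)))
```

As in the paper, the key property exploited is that static object pixels and flow (particle) pixels have very different temporal intensity statistics, so a per-pixel statistic plus automatic thresholding separates them without prior knowledge of the object.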
Security Audit of WLAN Networks Using Statistical Models of Specified Language Group
Directory of Open Access Journals (Sweden)
KREKAN Jan
2013-05-01
In order to build a secure computing environment, persons responsible for data security need tools that allow them to test the security of the data being protected. Research on passwords used in common computing environments showed that easy-to-remember non-dictionary passwords are widely used. It is therefore useful to build a statistical model, which can then be used to create very effective password lists for testing the security of a given protected data object. The problem is that society in a given location also uses foreign words from widely used languages. This article describes a comparison of different language models used for this new statistical candidate-generation method. The generator can be used to test the strength of passwords protecting wireless networks that use WPA-PSK as their data encryption standard, with the password candidates passed to tools that perform the security audit. The method can also be described as sorting brute-force password candidates using knowledge about the languages used by the users. The tests showed that using a combination of language models (MIX) of the specified language group for the password candidate generator could improve the speed of the security procedure by 37% on average (a 60% speedup when finding 50% of passwords, in 0.69% vs. 1.715% of brute-force combinations) compared to the mother-language model (SK), and gave a 20-times average absolute speedup compared to brute force.
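The candidate-sorting idea, scoring brute-force strings with a language model so that linguistically plausible passwords are tried first, can be sketched with a character-bigram model. The tiny corpus, the add-one smoothing, and the '^'/'$' edge markers are illustrative assumptions, not the paper's models.

```python
import math
from collections import defaultdict

def train_bigram(corpus_words):
    """Character-bigram counts with '^'/'$' marking word edges."""
    counts = defaultdict(lambda: defaultdict(int))
    for w in corpus_words:
        chars = ["^"] + list(w) + ["$"]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts

def log_score(word, counts, alphabet_size=28):
    """Add-one-smoothed log probability of a candidate under the model."""
    s = 0.0
    chars = ["^"] + list(word) + ["$"]
    for a, b in zip(chars, chars[1:]):
        total = sum(counts[a].values())
        s += math.log((counts[a][b] + 1) / (total + alphabet_size))
    return s

# Tiny illustrative corpus; a real audit would train on a large word list
# for each language in the MIX.
corpus = ["heslo", "hrad", "hora", "para", "pivo", "ruka", "cesta", "stat"]
model = train_bigram(corpus)
candidates = ["horka", "zzqqx", "parta", "xkcd"]
ranked = sorted(candidates, key=lambda w: -log_score(w, model))
print(ranked)
```

Sorting the full candidate space by such scores front-loads the likely passwords, which is exactly why the audit finds a given fraction of passwords after trying far fewer combinations than plain brute force.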
A Statistical Toolbox For Mining And Modeling Spatial Data
Directory of Open Access Journals (Sweden)
D’Aubigny Gérard
2016-12-01
Most data mining projects in spatial economics start with an evaluation of a set of attribute variables on a sample of spatial entities, looking for the existence and strength of spatial autocorrelation based on Moran's and Geary's coefficients, the adequacy of which is rarely challenged, despite the fact that, when reporting on their properties, many users seem likely to make mistakes and to foster confusion. My paper begins with a critical appraisal of the classical definition and rationale of these indices. I argue that while intuitively founded, they are plagued by an inconsistency in their conception. Then, I propose a principled small change leading to corrected spatial autocorrelation coefficients, which strongly simplifies their relationship and opens the way to an augmented toolbox of statistical methods for dimension reduction and data visualization, also useful for modeling purposes. A second section presents a formal framework, adapted from recent work in statistical learning, which gives theoretical support to our definition of corrected spatial autocorrelation coefficients. More specifically, the multivariate data mining methods presented here are easily implementable in existing (free) software and yield methods useful for exploiting the proposed corrections in spatial data analysis practice; from a mathematical point of view, their asymptotic behavior, already studied in a series of papers by Belkin & Niyogi, suggests that they possess qualities of robustness and a limited sensitivity to the Modifiable Areal Unit Problem (MAUP), valuable in exploratory spatial data analysis.
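For reference, the classical Moran's I that the paper takes as its starting point can be computed in a few lines; the paper's corrected coefficients are not reproduced here, and the four-cell example is illustrative.

```python
def morans_i(x, w):
    """Classical Moran's I:
    I = (n / S0) * sum_ij w_ij (x_i - xbar)(x_j - xbar) / sum_i (x_i - xbar)^2,
    with S0 = sum_ij w_ij. This is the textbook index whose rationale the
    paper critiques."""
    n = len(x)
    xbar = sum(x) / n
    dev = [v - xbar for v in x]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Four cells on a line with rook adjacency; a smoothly increasing attribute
# should show positive spatial autocorrelation.
x = [1.0, 2.0, 3.0, 4.0]
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
print(morans_i(x, w))
```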
Generative modelling of regulated dynamical behavior in cultured neuronal networks
Volman, Vladislav; Baruchi, Itay; Persi, Erez; Ben-Jacob, Eshel
2004-04-01
The spontaneous activity of cultured in vitro neuronal networks exhibits rich dynamical behavior. Despite the artificial manner of their construction, the networks’ activity includes features which seemingly reflect the action of underlying regulating mechanisms rather than arbitrary causes and effects. Here, we study the cultured networks’ dynamical behavior utilizing a generative modelling approach. The idea is to include the minimal generic mechanisms required to capture the non-autonomous features of the behavior, which can be reproduced by computer modelling, and then to identify the additional features of biotic regulation in the observed behavior which are beyond the scope of the model. Our model neurons are composed of a soma described by the two Morris-Lecar dynamical variables (voltage and fraction of open potassium channels), with dynamical synapses described by the Tsodyks-Markram three-variable dynamics. The model neuron satisfies our self-consistency test: when fed with data recorded from a real cultured network, it exhibits dynamical behavior very close to that of the network’s “representative” neuron. Specifically, it shows similar statistical scaling properties (approximated by a similar symmetric Lévy distribution with finite mean). A network of such M-L elements spontaneously generates (when weak “structured noise” is added) synchronized bursting events (SBEs) similar to the observed ones. Both the neuronal statistical scaling properties within the bursts and the properties of the SBE time series show generative (a newly discussed concept) agreement with the recorded data. Yet, the model network exhibits a different structure of temporal variations and does not recover the observed hierarchical temporal ordering, unless fed with recorded special neurons (with much higher rates of activity), thus indicating the existence of self-regulation mechanisms. It also implies that the spontaneous activity is not simply noise-induced. Instead, the
Statistical Models and Methods for Network Meta-Analysis.
Madden, L V; Piepho, H-P; Paul, P A
2016-08-01
Meta-analysis, the methodology for analyzing the results from multiple independent studies, has grown tremendously in popularity over the last four decades. Although most meta-analyses involve a single effect size (summary result, such as a treatment difference) from each study, there are often multiple treatments of interest across the network of studies in the analysis. Multi-treatment (or network) meta-analysis can be used for simultaneously analyzing the results from all the treatments. However, the methodology is considerably more complicated than for the analysis of a single effect size, and there have not been adequate explanations of the approach for agricultural investigations. We review the methods and models for conducting a network meta-analysis based on frequentist statistical principles, and demonstrate the procedures using a published multi-treatment plant pathology data set. A major advantage of network meta-analysis is that correlations of estimated treatment effects are automatically taken into account when an appropriate model is used. Moreover, treatment comparisons may be possible in a network meta-analysis that are not possible in a single study because all treatments of interest may not be included in any given study. We review several models that consider the study effect as either fixed or random, and show how to interpret model-fitting output. We further show how to model the effect of moderator variables (study-level characteristics) on treatment effects, and present one approach to test for the consistency of treatment effects across the network. Online supplemental files give explanations on fitting the network meta-analytical models using SAS.
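The contrast-based, fixed-effect core of such a network meta-analysis can be illustrated compactly: each study contributes a treatment contrast with a variance, and weighted least squares on the basic parameters combines direct and indirect evidence. The three-treatment data below are invented for illustration, not taken from the plant pathology data set, and the two-parameter solve is hard-coded for brevity:

```python
# Hypothetical fixed-effect network meta-analysis for three treatments
# A, B, C with A as reference.  Each row is one study contrast:
# (treatment, comparator, estimate, variance) -- all values illustrative.
data = [("B", "A", 0.50, 0.04),
        ("C", "A", 0.80, 0.05),
        ("C", "B", 0.35, 0.06)]

def network_ma(data, ref="A", treatments=("B", "C")):
    """Weighted least squares on the basic parameters d(ref, t)."""
    idx = {t: k for k, t in enumerate(treatments)}
    n = len(treatments)
    # accumulate the normal equations X'WX d = X'Wy
    XtWX = [[0.0] * n for _ in range(n)]
    XtWy = [0.0] * n
    for treat, base, y, var in data:
        w = 1.0 / var
        x = [0.0] * n
        if treat != ref:
            x[idx[treat]] += 1.0
        if base != ref:
            x[idx[base]] -= 1.0
        for i in range(n):
            XtWy[i] += w * x[i] * y
            for j in range(n):
                XtWX[i][j] += w * x[i] * x[j]
    # two basic parameters, so solve the 2x2 system by Cramer's rule
    (a, b), (c, d2) = XtWX
    det = a * d2 - b * c
    dB = (XtWy[0] * d2 - b * XtWy[1]) / det
    dC = (a * XtWy[1] - XtWy[0] * c) / det
    return {"B": dB, "C": dC}
```

Note how the B-vs-C study pulls both estimates: the indirect and direct evidence are pooled automatically, which is the advantage the abstract describes.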
A statistical downscaling model for summer rainfall over Pakistan
Kazmi, Dildar Hussain; Li, Jianping; Ruan, Chengqing; Zhao, Sen; Li, Yanjie
2016-10-01
A statistical approach is utilized to construct an interannual model for summer (July-August) rainfall over the western parts of the South Asian Monsoon region. Observed monthly rainfall data for selected stations of Pakistan for the last 55 years (1960-2014) are taken as the predictand. Recommended climate indices, along with oceanic and atmospheric data on global scales for the period April-June, are employed as predictors. The first 40 years of data were taken as the training period and the rest as the validation period. A cross-validated stepwise regression approach was adopted to select robust predictors. Upper-tropospheric zonal wind at 200 hPa over the northeastern Atlantic was finally selected as the best predictor for the interannual model. In addition, the next candidate, geopotential height in the upper troposphere, is taken as an indirect predictor, being a source of energy transportation from the core region (northeast Atlantic/western Europe) to the study area. The model performed well for both the training and validation periods, with a correlation coefficient of 0.71 and tolerable root mean square errors. The model was further cross-validated by incorporating JRA-55 data for the potential predictors in addition to NCEP data, and by fragmenting the study period into five non-overlapping test samples. Subsequently, to verify the outcome of the model on physical grounds, observational analyses as well as model simulations are incorporated. It is revealed that, originating from the jet exit region through large vorticity gradients, zonally dominant waves may transport energy and momentum to the downstream areas of west-central Asia, which ultimately affects the interannual variability of the specific rainfall. Both the circumglobal teleconnection and Rossby wave propagation are found to play vital roles in modulating the proposed mechanism.
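The cross-validated predictor screening described above can be reduced to a toy sketch: among candidate predictor series, keep the one whose univariate regression generalizes best under leave-one-out cross-validation. The data and predictor names below are synthetic, not the Pakistan rainfall records:

```python
import random

def loocv_mse(x, y):
    """Leave-one-out CV error of the simple linear regression y ~ x."""
    n, err = len(x), 0.0
    for k in range(n):
        xs = [x[i] for i in range(n) if i != k]
        ys = [y[i] for i in range(n) if i != k]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        sxx = sum((a - mx) ** 2 for a in xs)
        slope = sxy / sxx
        # predict the held-out year and accumulate squared error
        err += (y[k] - (my + slope * (x[k] - mx))) ** 2
    return err / n

def select_predictor(candidates, y):
    """Pick the candidate series with the lowest LOOCV error."""
    return min(candidates, key=lambda name: loocv_mse(candidates[name], y))
```

A full stepwise procedure would repeat this greedily on the residuals; the single-step version shows the selection criterion.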
DEFF Research Database (Denmark)
Guo, Yougui; Zeng, Ping; Blaabjerg, Frede
2010-01-01
A real wind power generation system is presented in this paper. An SVM control strategy and vector control are applied to the generator-side converter and the doubly fed induction generator, respectively. First, the mathematical models of the wind turbine rotor, drive train, and generator-side converter are described....... Then the control strategy of the generator-side converter system is given in detail. Finally, the simulation model of the generator-side converter system is set up. The simulation results have verified that it is feasible to apply it to the generator-side converter of a wind power generation system and the generator side...
Directory of Open Access Journals (Sweden)
Kok-Yong Seng
2008-01-01
Full Text Available Currently, statistical techniques for the analysis of microarray-generated data sets have deficiencies due to limited understanding of the errors inherent in the data. A generalized likelihood ratio (GLR) test based on an error model has recently been proposed to identify differentially expressed genes from microarray experiments. However, the use of different error structures under the GLR test has not been evaluated, nor has this method been compared to commonly used statistical tests such as the parametric t-test. The concomitant effects of varying data signal-to-noise ratio and replication number on the performance of statistical tests also remain largely unexplored. In this study, we compared the effects of different underlying statistical error structures on the GLR test’s power in identifying differentially expressed genes in microarray data. We evaluated such variants of the GLR test as well as the one-sample t-test on simulated data by means of receiver operating characteristic (ROC) curves. Further, we used bootstrapping of ROC curves to assess the statistical significance of differences between the areas under the curves. Our results showed that (i) the GLR tests outperformed the t-test for detecting differential gene expression, (ii) the identity of the underlying error structure was important in determining the GLR tests’ performance, and (iii) signal-to-noise ratio was a more important contributor than sample replication in identifying statistically significant differential gene expression.
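The bootstrap comparison of areas under ROC curves mentioned above can be sketched as follows. The AUC is computed via the Mann-Whitney statistic, and cases are resampled with replacement to get a bootstrap distribution of the AUC difference between two scoring rules; the scores and labels below are synthetic:

```python
import random

def auc(scores, labels):
    """AUC via the Mann-Whitney statistic (ties counted as 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_diff(s1, s2, labels, n_boot=500, seed=1):
    """Bootstrap distribution of AUC(s1) - AUC(s2) over resampled cases."""
    rng = random.Random(seed)
    n, diffs = len(labels), []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        lab = [labels[i] for i in idx]
        if 0 < sum(lab) < n:            # need both classes in the resample
            diffs.append(auc([s1[i] for i in idx], lab)
                         - auc([s2[i] for i in idx], lab))
    return diffs
```

The percentiles of `diffs` give a bootstrap confidence interval for the AUC difference; an interval excluding zero indicates a significant difference between the two tests.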
Turking Statistics: Student-Generated Surveys Increase Student Engagement and Performance
Whitley, Cameron T.; Dietz, Thomas
2018-01-01
Thirty years ago, Hubert M. Blalock Jr. published an article in "Teaching Sociology" about the importance of teaching statistics. We honor Blalock's legacy by assessing how using Amazon Mechanical Turk (MTurk) in statistics classes can enhance student learning and increase statistical literacy among social science graduate students. In…
STATISTICAL MECHANICS MODELING OF MESOSCALE DEFORMATION IN METALS
Energy Technology Data Exchange (ETDEWEB)
Anter El-Azab
2013-04-08
The research under this project focused on theoretical and computational modeling of the dislocation dynamics of mesoscale deformation in metal single crystals. Specifically, the work aimed to implement a continuum statistical theory of dislocations to understand strain hardening and cell structure formation under monotonic loading. These aspects of crystal deformation are manifestations of the evolution of the underlying dislocation system under mechanical loading. The project had three research tasks: 1) investigating the statistical characteristics of dislocation systems in deformed crystals; 2) formulating kinetic equations of dislocations and coupling these kinetic equations with crystal mechanics; and 3) computationally solving the coupled crystal mechanics and dislocation kinetics. Comparison of dislocation dynamics predictions with experimental results on the statistical properties of dislocations and their fields was also a part of the proposed effort. In the first research task, the dislocation dynamics simulation method was used to investigate the spatial, orientation, velocity, and temporal statistics of dynamical dislocation systems, with the results of this investigation used to complete the kinetic description of dislocations. The second task focused on completing the formulation of a kinetic theory of dislocations that respects the discrete nature of crystallographic slip and the physics of dislocation motion and dislocation interaction in the crystal. Part of this effort also targeted the theoretical basis for establishing the connection between discrete and continuum representations of dislocations and the analysis of discrete dislocation simulation results within the continuum framework. This part of the research enables the enrichment of the kinetic description with information representing the behavior of discrete dislocation systems. The third task focused on the development of physics-inspired numerical methods for the solution of the coupled
Ontology modeling for generation of clinical pathways
Directory of Open Access Journals (Sweden)
Jasmine Tehrani
2012-12-01
Full Text Available Purpose: Increasing costs of health care, fuelled by demand for high-quality, cost-effective healthcare, have driven hospitals to streamline their patient care delivery systems. One such systematic approach is the adoption of Clinical Pathways (CPs) as a tool to increase the quality of healthcare delivery. However, most organizations still rely on paper-based pathway guidelines or specifications, which have limitations in process management and, as a result, can influence patient safety outcomes. In this paper, we present a method for generating clinical pathways based on organizational semiotics by capturing knowledge from the syntactic, semantic and pragmatic to the social level. Design/methodology/approach: The proposed modeling approach to the generation of CPs adopts organizational semiotics and enables the generation of a semantically rich representation of CP knowledge. The Semantic Analysis Method (SAM) is applied to explicitly represent the semantics of the concepts, their relationships and patterns of behavior in terms of an ontology chart. The Norm Analysis Method (NAM) is adopted to identify and formally specify patterns of behavior and rules that govern the actions identified on the ontology chart. Information collected during semantic and norm analysis is integrated to guide the generation of CPs using best practice represented in BPMN, thus enabling the automation of CPs. Findings: This research confirms the necessity of taking social aspects into consideration when designing information systems and automating CPs. The complexity of healthcare processes can best be tackled by analyzing stakeholders, which we treat as social agents, their goals and patterns of action within the agent network. Originality/value: The current modeling methods describe CPs from a structural aspect comprising activities, properties and interrelationships. However, these methods lack a mechanism to describe possible patterns of human behavior and the conditions under which the
Finding the Root Causes of Statistical Inconsistency in Community Earth System Model Output
Milroy, D.; Hammerling, D.; Baker, A. H.
2017-12-01
Baker et al (2015) developed the Community Earth System Model Ensemble Consistency Test (CESM-ECT) to provide a metric for software quality assurance by determining statistical consistency between an ensemble of CESM outputs and new test runs. The test has proved useful for detecting statistical differences caused by compiler bugs and errors in physical modules. However, detection is only the necessary first step in finding the causes of statistical difference. The CESM is a vastly complex model comprising millions of lines of code, developed and maintained by a large community of software engineers and scientists. Any root cause analysis is correspondingly challenging. We propose a new capability for CESM-ECT: identifying the sections of code that cause statistical distinguishability. The first step is to discover CESM variables that cause CESM-ECT to classify new runs as statistically distinct, which we achieve via Randomized Logistic Regression. Next we use a tool developed to identify CESM components that define or compute the variables found in the first step. Finally, we employ the Kernel GENerator (KGEN) application created in Kim et al (2016) to detect fine-grained floating point differences. We demonstrate an example of the procedure and advance a plan to automate this process in our future work.
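The variable-screening step can be illustrated with a simpler stand-in: a plain L1-penalized logistic regression fitted by proximal gradient descent (randomized logistic regression adds bootstrap resampling on top of this). Variables whose coefficients survive the penalty are flagged as candidate causes of distinguishability; the data below are synthetic:

```python
import math

def soft(z, t):
    """Soft-thresholding operator for the L1 penalty."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def l1_logistic(X, y, lam=0.1, lr=0.1, iters=2000):
    """L1-penalized logistic regression via proximal gradient (ISTA)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(wj * xj
                                           for wj, xj in zip(w, xi))))
            for j in range(d):
                grad[j] += (p - yi) * xi[j] / n
        # gradient step followed by soft-thresholding
        w = [soft(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w
```

Coefficients driven exactly to zero correspond to variables that do not help separate the consistent from the distinguishable runs.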
Model documentation report: Short-Term Hydroelectric Generation Model
International Nuclear Information System (INIS)
1993-08-01
The purpose of this report is to define the objectives of the Short-Term Hydroelectric Generation Model (STHGM), describe its basic approach, and provide details on the model structure. This report is intended as a reference document for model analysts, users, and the general public. Documentation of the model is in accordance with the Energy Information Administration's (EIA) legal obligation to provide adequate documentation in support of its models (Public Law 94-385, Section 57.b.2). The STHGM performs a short-term (18- to 27-month) forecast of hydroelectric generation in the United States using an autoregressive integrated moving average (ARIMA) time series model with precipitation as an explanatory variable. The model results are used as input for the Short-Term Energy Outlook
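The ARIMA-with-precipitation idea can be miniaturized as an AR(1) model with one exogenous regressor fitted by ordinary least squares: gen[t] = c + a·gen[t-1] + b·precip[t]. This toy illustration is not the EIA model (which includes differencing and moving-average terms); the coefficient recovery below uses synthetic data:

```python
def fit_arx(y, x):
    """OLS fit of y[t] = c + a*y[t-1] + b*x[t] via normal equations."""
    rows = [(1.0, y[t - 1], x[t]) for t in range(1, len(y))]
    targ = y[1:]
    # accumulate the 3x3 normal equations A beta = v
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         for i in range(3)]
    v = [sum(r[i] * t for r, t in zip(rows, targ)) for i in range(3)]
    return gauss_solve(A, v)

def gauss_solve(A, v):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    beta = [0.0] * n
    for i in range(n - 1, -1, -1):
        beta[i] = (M[i][n] - sum(M[i][j] * beta[j]
                                 for j in range(i + 1, n))) / M[i][i]
    return beta
```

Forecasting then iterates the fitted recurrence forward, feeding in a precipitation scenario for the exogenous term.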
Optimizing DNA assembly based on statistical language modelling.
Fang, Gang; Zhang, Shemin; Dong, Yafei
2017-12-15
By successively assembling genetic parts such as BioBricks according to grammatical models, complex genetic constructs composed of dozens of functional blocks can be built. However, each category of genetic parts usually includes a few to many parts. With an increasing quantity of genetic parts, the process of assembling more than a few sets of these parts can be expensive, time-consuming and error-prone. At the last assembly step it is somewhat difficult to decide which part should be selected. Based on a statistical language model, which is a probability distribution P(s) over strings s that attempts to reflect how frequently a string s occurs as a sentence, the most commonly used parts will be selected. Then, a dynamic programming algorithm was designed to figure out the solution of maximum probability. The algorithm optimizes the results of a genetic design based on a grammatical model and finds an optimal solution. In this way, redundant operations can be reduced and the time and cost required for conducting biological experiments can be minimized. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
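A dynamic program of this kind is essentially Viterbi decoding: each slot in the construct offers several candidate parts, a bigram score says how often part b follows part a, and we pick the sequence of maximum total probability. The part names and probabilities below are invented, and the exact scoring in the paper may differ:

```python
import math

def best_assembly(slots, bigram, start):
    """slots: list of candidate-part lists; bigram[(a, b)]: P(b | a)."""
    # score[p] = best log-probability of a part sequence ending in p
    score = {p: math.log(bigram.get((start, p), 1e-9)) for p in slots[0]}
    back = [{p: None for p in slots[0]}]
    for cands in slots[1:]:
        new, bk = {}, {}
        for b in cands:
            prev = max(score, key=lambda a: score[a]
                       + math.log(bigram.get((a, b), 1e-9)))
            new[b] = score[prev] + math.log(bigram.get((prev, b), 1e-9))
            bk[b] = prev
        score, back = new, back + [bk]
    # trace the optimal path backwards
    end = max(score, key=score.get)
    path = [end]
    for bk in reversed(back[1:]):
        path.append(bk[path[-1]])
    return list(reversed(path)), score[end]
```

Working in log space keeps the products of many small probabilities numerically stable, and unseen bigrams get a tiny floor probability instead of crashing the search.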
Critical, statistical, and thermodynamical properties of lattice models
Energy Technology Data Exchange (ETDEWEB)
Varma, Vipin Kerala
2013-10-15
In this thesis we investigate zero temperature and low temperature properties - critical, statistical and thermodynamical - of lattice models in the contexts of bosonic cold atom systems, magnetic materials, and non-interacting particles on various lattice geometries. We study quantum phase transitions in the Bose-Hubbard model with higher body interactions, as relevant for optical lattice experiments of strongly interacting bosons, in one and two dimensions; the universality of the Mott insulator to superfluid transition is found to remain unchanged for even large three body interaction strengths. A systematic renormalization procedure is formulated to fully re-sum these higher (three and four) body interactions into the two body terms. In the strongly repulsive limit, we analyse the zero and low temperature physics of interacting hard-core bosons on the kagome lattice at various fillings. Evidence for a disordered phase in the Ising limit of the model is presented; in the strong coupling limit, the transition between the valence bond solid and the superfluid is argued to be first order at the tip of the solid lobe.
Terminal-Dependent Statistical Inference for the FBSDEs Models
Directory of Open Access Journals (Sweden)
Yunquan Song
2014-01-01
Full Text Available The original stochastic differential equations (OSDEs) and forward-backward stochastic differential equations (FBSDEs) are often used to model complex dynamic processes that arise in financial, ecological, and many other areas. The main difference between OSDEs and FBSDEs is that the latter are designed to depend on a terminal condition, which is a key factor in some financial and ecological circumstances. It is interesting but challenging to estimate FBSDE parameters from noisy data and the terminal condition. However, to the best of our knowledge, terminal-dependent statistical inference for such a model has not been explored in the existing literature. We propose a nonparametric terminal control variables estimation method to address this problem. We use terminal control variables because the newly proposed inference procedures then inherit the terminal-dependent characteristic. Through this new method, estimators of the functional coefficients of the FBSDE model are obtained, and their asymptotic properties are discussed. Simulation studies show that the proposed method gives satisfactory estimates of the FBSDE parameters from noisy data and the terminal condition, demonstrating its feasibility.
The statistical multifragmentation model: Origins and recent advances
Energy Technology Data Exchange (ETDEWEB)
Donangelo, R., E-mail: donangel@fing.edu.uy [Instituto de Física, Facultad de Ingeniería, Universidad de la República, Julio Herrera y Reissig 565, 11300, Montevideo (Uruguay); Instituto de Física, Universidade Federal do Rio de Janeiro, C.P. 68528, 21941-972 Rio de Janeiro - RJ (Brazil); Souza, S. R., E-mail: srsouza@if.ufrj.br [Instituto de Física, Universidade Federal do Rio de Janeiro, C.P. 68528, 21941-972 Rio de Janeiro - RJ (Brazil); Instituto de Física, Universidade Federal do Rio Grande do Sul, C.P. 15051, 91501-970 Porto Alegre - RS (Brazil)
2016-07-07
We review the Statistical Multifragmentation Model (SMM) which considers a generalization of the liquid-drop model for hot nuclei and allows one to calculate thermodynamic quantities characterizing the nuclear ensemble at the disassembly stage. We show how to determine probabilities of definite partitions of finite nuclei and how to determine, through Monte Carlo calculations, observables such as the caloric curve, multiplicity distributions, and heat capacity, among others. Some experimental measurements of the caloric curve confirmed the SMM predictions made over 10 years earlier, leading to a surge of interest in the model. However, the experimental determination of the fragmentation temperatures relies on the yields of different isotopic species, which were not correctly calculated in the schematic liquid-drop picture employed in the SMM. This led to a series of improvements in the SMM, in particular a more careful choice of nuclear masses and energy densities, especially for the lighter nuclei. With these improvements the SMM is able to make quantitative determinations of isotope production. We show the application of the SMM to the production of exotic nuclei through multifragmentation. These preliminary calculations demonstrate the need for a careful choice of system size and excitation energy to attain maximum yields.
Olguin, Carlos José Maria; Sampaio, Silvio César; Dos Reis, Ralpho Rinaldo
2017-10-01
The soil sorption coefficient normalized to the organic carbon content (Koc) is a physicochemical parameter used in environmental risk assessments and in determining the final fate of chemicals released into the environment. Several models for predicting this parameter have been proposed based on the relationship between log Koc and log P. The difficulty and cost of obtaining experimental log P values led to the development of algorithms to calculate these values, some of which are free to use. However, quantitative structure-property relationship (QSPR) studies have not detailed how or why a particular algorithm was chosen. In this study, we evaluated several free algorithms for calculating log P in the modeling of log Koc, using a broad and diverse set of compounds (n = 639) that included several chemical classes. In addition, we propose the adoption of a simple test to verify whether there is statistical equivalence between models obtained using different data sets. Our results showed that the ALOGPs, KOWWIN and XLOGP3 algorithms generated the best models for log Koc, and that these models are statistically equivalent. This finding shows that it is possible to use the different algorithms without compromising statistical quality or predictive capacity. Copyright © 2017 Elsevier Ltd. All rights reserved.
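One simple way to compare such models is to fit log Koc = a + b·log P on two data sets and check whether the fitted lines agree, e.g. by comparing slopes with a two-sample z-statistic. The data below are synthetic and the paper's actual equivalence test may differ; this is only an illustrative sketch:

```python
import math

def ols(x, y):
    """Simple regression; returns slope, intercept, slope std. error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sxx
    a0 = my - b * mx
    # residual variance with n - 2 degrees of freedom
    s2 = sum((c - (a0 + b * a)) ** 2 for a, c in zip(x, y)) / (n - 2)
    return b, a0, math.sqrt(s2 / sxx)

def slopes_equivalent(fit1, fit2, z_crit=1.96):
    """Two-sample z-test on the slopes of two fitted regressions."""
    b1, _, se1 = fit1
    b2, _, se2 = fit2
    return abs(b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2) < z_crit
```

A formal equivalence test would use two one-sided tests against a stated margin rather than a non-rejection of difference, but the mechanics are the same pairwise comparison.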
Shahabi, Himan; Hashim, Mazlan
2015-04-22
This research presents GIS-based statistical models for generating landslide susceptibility maps of the Cameron Highlands area in Malaysia, using geographic information system (GIS) and remote-sensing data. Ten factors, including slope, aspect, soil, lithology, NDVI, land cover, distance to drainage, precipitation, distance to fault, and distance to road, were extracted from SAR data, SPOT 5 and WorldView-1 images. The relationships between the detected landslide locations and these ten factors were identified using GIS-based statistical models, namely the analytical hierarchy process (AHP), weighted linear combination (WLC) and spatial multi-criteria evaluation (SMCE) models. The landslide inventory map, which has a total of 92 landslide locations, was created from numerous resources such as digital aerial photographs, AIRSAR data, WorldView-1 images, and field surveys. Then, 80% of the landslide inventory was used for training the statistical models and the remaining 20% for validation. The validation results, using the relative landslide density index (R-index) and receiver operating characteristic (ROC), demonstrated that the SMCE model (96% accuracy) predicts better than the AHP (91% accuracy) and WLC (89% accuracy) models. These landslide susceptibility maps would be useful for hazard mitigation purposes and regional planning.
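Of the three models, weighted linear combination is the simplest to sketch: each grid cell holds normalized factor scores, and the susceptibility index is their weighted sum. The factor subset, weights and class thresholds below are illustrative, not the values used in the study:

```python
# Hypothetical factor weights (must sum to 1); in AHP these would be
# derived from pairwise comparisons, in SMCE from a criteria tree.
WEIGHTS = {"slope": 0.30, "lithology": 0.25,
           "rainfall": 0.25, "road_dist": 0.20}

def wlc_score(cell):
    """cell: dict of factor scores normalized to [0, 1]."""
    return sum(WEIGHTS[f] * cell[f] for f in WEIGHTS)

def classify(score):
    """Bin the continuous index into susceptibility classes."""
    return "high" if score >= 0.6 else \
           "moderate" if score >= 0.3 else "low"
```

Applying `wlc_score` to every raster cell and then `classify` yields the susceptibility map; the R-index and ROC validation then compare the high-susceptibility cells against the held-out inventory.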
Multivariate Statistical Modelling of Drought and Heat Wave Events
Manning, Colin; Widmann, Martin; Vrac, Mathieu; Maraun, Douglas; Bevaqua, Emanuele
2016-04-01
Multivariate Statistical Modelling of Drought and Heat Wave Events C. Manning1,2, M. Widmann1, M. Vrac2, D. Maraun3, E. Bevaqua2,3 1. School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, Birmingham, UK 2. Laboratoire des Sciences du Climat et de l'Environnement, (LSCE-IPSL), Centre d'Etudes de Saclay, Gif-sur-Yvette, France 3. Wegener Center for Climate and Global Change, University of Graz, Brandhofgasse 5, 8010 Graz, Austria Compound extreme events are a combination of two or more contributing events which in themselves may not be extreme but through their joint occurrence produce an extreme impact. Compound events are noted in the latest IPCC report as an important type of extreme event that has been given little attention so far. As part of the CE:LLO project (Compound Events: muLtivariate statisticaL mOdelling) we are developing a multivariate statistical model to gain an understanding of the dependence structure of certain compound events. One focus of this project is on the interaction between drought and heat wave events. Soil moisture has both a local and non-local effect on the occurrence of heat waves, as it strongly controls the latent heat flux affecting the transfer of sensible heat to the atmosphere. These processes can create a feedback whereby a heat wave may be amplified or suppressed by the soil moisture preconditioning and, vice versa, the heat wave may in turn have an effect on soil conditions. An aim of this project is to capture this dependence in order to correctly describe the joint probabilities of these conditions and the resulting probability of their compound impact. We will show an application of Pair Copula Constructions (PCCs) to study the aforementioned compound event. PCCs allow in theory for the formulation of multivariate dependence structures in any dimension, where the PCC is a decomposition of a multivariate distribution into a product of bivariate components modelled using copulas. A
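The bivariate building block of a pair copula construction can be sketched directly: simulate soil-moisture deficit and temperature ranks joined by a Gaussian copula and estimate the probability that both are simultaneously extreme. The correlation value is illustrative, and a PCC would typically use richer bivariate copulas than the Gaussian one shown here:

```python
import math
import random

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gaussian_copula_sample(rho, n, seed=0):
    """Pairs (u, v) with uniform margins and Gaussian-copula dependence."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        out.append((phi(z1), phi(z2)))   # probability-integral transform
    return out

def joint_exceedance(samples, u=0.9):
    """Fraction of samples where both margins exceed the u-quantile."""
    return sum(1 for a, b in samples if a > u and b > u) / len(samples)
```

Under independence the joint exceedance at u = 0.9 would be 0.01; positive dependence inflates it severalfold, which is exactly the compound-event effect the project aims to quantify.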
Towards a Statistical Model of Tropical Cyclone Genesis
Fernandez, A.; Kashinath, K.; McAuliffe, J.; Prabhat, M.; Stark, P. B.; Wehner, M. F.
2017-12-01
Tropical Cyclones (TCs) are important extreme weather phenomena that have a strong impact on humans. TC forecasts are largely based on global numerical models that produce TC-like features. Aspects of Tropical Cyclones such as their formation/genesis, evolution, intensification and dissipation over land are important and challenging problems in climate science. This study investigates the environmental conditions associated with Tropical Cyclone Genesis (TCG) by testing how accurately a statistical model can predict TCG in the CAM5.1 climate model. TCG events are defined using the TECA software (Prabhat et al., "TECA: Petascale Pattern Recognition for Climate Science", Computer Analysis of Images and Patterns, Springer, 2015, pp. 426-436) to extract TC trajectories from CAM5.1. L1-regularized logistic regression (L1LR) is applied to the CAM5.1 output. The predictions have nearly perfect accuracy for data not associated with TC tracks and high accuracy differentiating between high-vorticity and low-vorticity systems. The model's active variables largely correspond to current hypotheses about important factors for TCG, such as wind field patterns and local pressure minima, and suggest new routes for investigation. Furthermore, our model's predictions of TC activity are competitive with the output of an instantaneous version of Emanuel and Nolan's Genesis Potential Index (GPI) (Emanuel and Nolan, "Tropical cyclone activity and the global climate system", 26th Conference on Hurricanes and Tropical Meteorology, 2004, pp. 240-241).
Canary, Jana D; Blizzard, Leigh; Barry, Ronald P; Hosmer, David W; Quinn, Stephen J
2016-05-01
Generalized linear models (GLMs) with a canonical logit link function are the primary modeling technique used to relate a binary outcome to predictor variables. However, noncanonical links can offer more flexibility, producing convenient analytical quantities (e.g., probit GLMs in toxicology) and desired measures of effect (e.g., relative risk from log GLMs). Many summary goodness-of-fit (GOF) statistics exist for the logistic GLM. Their properties make the development of GOF statistics relatively straightforward, but it can be more difficult under noncanonical links. Although GOF tests for logistic GLMs with continuous covariates (GLMCCs) have been applied to GLMCCs with log links, we know of no GOF tests in the literature specifically developed for GLMCCs that can be applied regardless of the link function chosen. We generalize the Tsiatis GOF statistic originally developed for logistic GLMCCs (TG) so that it can be applied under any link function. Further, we show that the algebraically related Hosmer-Lemeshow (HL) and Pigeon-Heyse (J²) statistics can be applied directly. In a simulation study, TG, HL, and J² were used to evaluate the fit of probit, log-log, complementary log-log, and log models, all calculated with a common grouping method. The TG statistic consistently maintained Type I error rates, while those of HL and J² were often lower than expected if terms with little influence were included. Generally, the statistics had similar power to detect an incorrect model. An exception occurred when a log GLMCC was incorrectly fit to data generated from a logistic GLMCC; in this case, TG had more power than HL or J². © 2015 John Wiley & Sons Ltd/London School of Economics.
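The grouping idea shared by these statistics is easiest to see in the classic Hosmer-Lemeshow version for the logistic case (not the generalized TG statistic): sort cases by fitted probability, split them into g groups, and compare observed and expected counts with a chi-square-type sum:

```python
def hosmer_lemeshow(p_hat, y, g=10):
    """Classic HL statistic: grouped observed-vs-expected comparison."""
    pairs = sorted(zip(p_hat, y))          # sort cases by fitted prob.
    k, stat = len(pairs), 0.0
    for i in range(g):
        grp = pairs[i * k // g:(i + 1) * k // g]
        if not grp:
            continue
        obs = sum(yi for _, yi in grp)     # observed events in group
        exp = sum(pi for pi, _ in grp)     # expected events in group
        n = len(grp)
        var = exp * (1 - exp / n)          # n * p_bar * (1 - p_bar)
        if var > 0:
            stat += (obs - exp) ** 2 / var
    return stat   # compare to chi-square with g - 2 degrees of freedom
```

For a well-calibrated model the statistic stays near its chi-square reference; systematic miscalibration inflates it sharply.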
Feature and Statistical Model Development in Structural Health Monitoring
Kim, Inho
All structures suffer wear and tear because of impact, excessive load, fatigue, corrosion, etc., in addition to inherent defects from their manufacturing processes and their exposure to various environmental effects. These structural degradations are often imperceptible, but they can severely affect the structural performance of a component, thereby severely decreasing its service life. Although previous studies of Structural Health Monitoring (SHM) have produced extensive knowledge of the parts of SHM processes, such as operational evaluation, data processing, and feature extraction, few studies have approached the field systematically from the perspective of statistical model development. The first part of this dissertation reviews ultrasonic guided-wave-based structural health monitoring problems in terms of the characteristics of inverse scattering problems, such as ill-posedness and nonlinearity. The distinctive features and the selection of the domain of analysis are investigated by analytically deriving the conditions for uniqueness of solutions to the ill-posed problem, and are validated experimentally. Based on the distinctive features, a novel wave packet tracing (WPT) method for damage localization and size quantification is presented. This method involves creating time-space representations of the guided Lamb waves (GLWs), collected at a series of locations, with a spatially dense distribution along paths at pre-selected angles with respect to the direction normal to the direction of wave propagation. The fringe patterns due to wave dispersion, which depend on the phase velocity, are selected as the primary features that carry information regarding wave propagation and scattering. The following part of this dissertation presents a novel damage-localization framework using a fully automated process. To construct the statistical model for autonomous damage localization, deep-learning techniques such as the restricted Boltzmann machine and deep belief network
Paprotny, Dominik; Morales-Nápoles, Oswaldo; Jonkman, Sebastiaan N.
2017-07-01
Flood hazard is currently being researched on continental and global scales, using models of increasing complexity. In this paper we investigate a different, simplified approach, which combines statistical and physical models in place of conventional rainfall-runoff models to carry out flood mapping for Europe. A Bayesian-network-based model built in a previous study is employed to generate return-period flow rates in European rivers with a catchment area larger than 100 km². The simulations are performed using a one-dimensional steady-state hydraulic model and the results are post-processed using Geographical Information System (GIS) software in order to derive flood zones. This approach is validated by comparison with the Joint Research Centre's (JRC) pan-European map and five local flood studies from different countries. Overall, the two approaches show a similar performance in recreating the flood zones of local maps. The simplified approach achieved a similar level of accuracy while substantially reducing the computational time. The paper also presents aggregated results on flood hazard in Europe, including future projections. We find relatively small changes in flood hazard, i.e. an increase in flood zone area of 2-4 % by the end of the century compared to the historical scenario. However, when current flood protection standards are taken into account, the flood-prone area increases substantially in the future (by 28-38 % for a 100-year return period). This is because in many parts of Europe river discharge with the same return period is projected to increase in the future, making the protection standards insufficient.
The issue of statistical power for overall model fit in evaluating structural equation models
Directory of Open Access Journals (Sweden)
Richard HERMIDA
2015-06-01
Full Text Available Statistical power is an important concept for psychological research. However, examining the power of a structural equation model (SEM) is rare in practice. This article provides an accessible review of the concept of statistical power for the Root Mean Square Error of Approximation (RMSEA) index of overall model fit in structural equation modeling. By way of example, we examine the current state of power in the literature by reviewing studies in top Industrial-Organizational (I/O) Psychology journals using SEMs. Results indicate that in many studies, power is very low, which implies acceptance of invalid models. Additionally, we examined methodological situations which may have an influence on statistical power of SEMs. Results showed that power varies significantly as a function of model type and whether or not the model is the main model for the study. Finally, results indicated that power is significantly related to model fit statistics used in evaluating SEMs. The results from this quantitative review imply that researchers should be more vigilant with respect to power in structural equation modeling. We therefore conclude by offering methodological best practices to increase confidence in the interpretation of structural equation modeling results with respect to statistical power issues.
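The RMSEA power analysis this article reviews is usually computed in the MacCallum-Browne-Sugawara framing: under a hypothesised RMSEA the model chi-square follows a noncentral chi-square with noncentrality λ = (N − 1)·df·RMSEA². A minimal Monte Carlo sketch of that computation (parameter values are illustrative, not taken from the reviewed studies):

```python
import math
import random


def rmsea_power_mc(n, df, rmsea0=0.05, rmsea1=0.08, alpha=0.05, reps=5000, seed=1):
    """Monte Carlo power for the RMSEA test of close fit.

    The model chi-square is treated as noncentral chi-square with
    noncentrality lambda = (n - 1) * df * rmsea**2; we simulate that
    distribution directly as a sum of squared shifted normals.
    """
    rng = random.Random(seed)

    def ncx2_sample(df, lam):
        # noncentral chi-square: one shifted squared normal plus df-1 plain ones
        s = (rng.gauss(0.0, 1.0) + math.sqrt(lam)) ** 2
        for _ in range(df - 1):
            s += rng.gauss(0.0, 1.0) ** 2
        return s

    lam0 = (n - 1) * df * rmsea0 ** 2  # null hypothesis: close fit
    lam1 = (n - 1) * df * rmsea1 ** 2  # alternative: mediocre fit
    # critical value = upper-alpha quantile of the simulated null distribution
    null = sorted(ncx2_sample(df, lam0) for _ in range(reps))
    crit = null[int((1 - alpha) * reps)]
    # power = P(reject | alternative holds)
    hits = sum(ncx2_sample(df, lam1) > crit for _ in range(reps))
    return hits / reps
```

As the article's findings suggest, power climbs steeply with sample size: rerunning the sketch with a larger `n` for the same `df` gives a markedly higher rejection rate.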
Sparse Power-Law Network Model for Reliable Statistical Predictions Based on Sampled Data
Directory of Open Access Journals (Sweden)
Alexander P. Kartun-Giles
2018-04-01
Full Text Available A projective network model is a model that enables predictions to be made based on a subsample of the network data, with the predictions remaining unchanged if a larger sample is taken into consideration. An exchangeable model is a model that does not depend on the order in which nodes are sampled. Despite a large variety of non-equilibrium (growing) and equilibrium (static) sparse complex network models that are widely used in network science, how to reconcile sparseness (constant average degree) with the desired statistical properties of projectivity and exchangeability is currently an outstanding scientific problem. Here we propose a network process with hidden variables which is projective and can generate sparse power-law networks. Despite the model not being exchangeable, it can be closely related to exchangeable uncorrelated networks as indicated by its information theory characterization and its network entropy. The use of the proposed network process as a null model is here tested on real data, indicating that the model offers a promising avenue for statistical network modelling.
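A generic hidden-variable construction of the kind the abstract describes can be sketched as follows. This is a textbook-style sketch, not the authors' exact process: each node draws a power-law hidden variable θ, and the normalization p_ij = min(1, θ_i θ_j / (⟨θ⟩ n)) is one standard choice that keeps the mean degree O(1) as n grows (sparseness) while the degrees inherit the power-law tail of θ.

```python
import random


def sample_theta(rng, gamma=2.5, theta_min=1.0):
    # inverse-transform sample from p(theta) ~ theta^(-gamma), theta >= theta_min
    u = rng.random()
    return theta_min * (1.0 - u) ** (-1.0 / (gamma - 1.0))


def hidden_variable_graph(n, gamma=2.5, seed=7):
    """Sparse power-law network from hidden variables.

    With p_ij = min(1, theta_i * theta_j / (mean_theta * n)) the expected
    degree of node i is approximately theta_i, so the average degree stays
    constant as n grows and the degree distribution is power-law tailed.
    """
    rng = random.Random(seed)
    theta = [sample_theta(rng, gamma) for _ in range(n)]
    mean_theta = sum(theta) / n
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, theta[i] * theta[j] / (mean_theta * n))
            if rng.random() < p:
                edges.append((i, j))
    return theta, edges
```

For γ = 2.5 and θ_min = 1 the mean of θ is (γ−1)/(γ−2) = 3, so the realized mean degree hovers around 3 regardless of n, which is the sparseness property the paper is after.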
Directory of Open Access Journals (Sweden)
Qian Wang
2015-04-01
Full Text Available Results from numerous linkage and association studies have greatly deepened scientists’ understanding of the genetic basis of many human diseases, yet some important questions remain unanswered. For example, although a large number of disease-associated loci have been identified from genome-wide association studies (GWAS) in the past 10 years, it is challenging to interpret these results as most disease-associated markers have no clear functional roles in disease etiology, and all the identified genomic factors only explain a small portion of disease heritability. With the help of next-generation sequencing (NGS), diverse types of genomic and epigenetic variations can be detected with high accuracy. More importantly, instead of using linkage disequilibrium to detect association signals based on a set of pre-set probes, NGS allows researchers to directly study all the variants in each individual, therefore promises opportunities for identifying functional variants and a more comprehensive dissection of disease heritability. Although the current scale of NGS studies is still limited due to the high cost, the success of several recent studies suggests the great potential for applying NGS in genomic epidemiology, especially as the cost of sequencing continues to drop. In this review, we discuss several pioneer applications of NGS, summarize scientific discoveries for rare and complex diseases, and compare various study designs including targeted sequencing and whole-genome sequencing using population-based and family-based cohorts. Finally, we highlight recent advancements in statistical methods proposed for sequencing analysis, including group-based association tests, meta-analysis techniques, and annotation tools for variant prioritization.
Amalia, Junita; Purhadi, Otok, Bambang Widjanarko
2017-11-01
Poisson distribution is a discrete distribution with count data as the random variable and one parameter that defines both mean and variance. Poisson regression assumes that mean and variance are equal (equidispersion). Nonetheless, some count data do not satisfy this assumption because the variance exceeds the mean (over-dispersion). Ignoring over-dispersion causes underestimated standard errors and, consequently, incorrect decisions in statistical tests. Paired count data are correlated and follow a bivariate Poisson distribution. In the presence of over-dispersion, simple bivariate Poisson regression is not sufficient for modeling paired count data. The Bivariate Poisson Inverse Gaussian Regression (BPIGR) model is a mixed Poisson regression for modeling paired count data with over-dispersion. The BPIGR model produces a global model for all locations. On the other hand, each location has different geographic, social, cultural and economic conditions, so Geographically Weighted Regression (GWR) is needed. The weighting function of each location in GWR generates a different local model. The Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model is used to handle over-dispersion and to generate local models. Parameter estimation of the GWBPIGR model is obtained by the Maximum Likelihood Estimation (MLE) method, while hypothesis testing of the GWBPIGR model is carried out with the Maximum Likelihood Ratio Test (MLRT) method.
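The over-dispersion diagnosis that motivates the BPIGR model can be illustrated with the Pearson dispersion statistic for an intercept-only Poisson fit. A minimal sketch on simulated data (illustrative only, not the authors' code; the over-dispersed sample is a gamma-mixed Poisson, which is how Poisson mixtures such as the Poisson inverse Gaussian arise):

```python
import random


def pearson_dispersion(counts):
    """Pearson dispersion for an intercept-only Poisson model.

    Under equidispersion sum((y - mu)^2 / mu) / (n - 1) is close to 1;
    values well above 1 signal over-dispersion, which motivates mixed
    Poisson models such as the Poisson inverse Gaussian.
    """
    n = len(counts)
    mu = sum(counts) / n  # Poisson MLE of the mean
    return sum((y - mu) ** 2 / mu for y in counts) / (n - 1)


def poisson_sample(rng, lam):
    # Knuth's multiplication algorithm, adequate for small lam
    limit, k, p = pow(2.718281828459045, -lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


rng = random.Random(0)
# equidispersed: plain Poisson(4) counts
equi = [poisson_sample(rng, 4.0) for _ in range(2000)]
# over-dispersed: gamma-mixed Poisson (negative binomial), variance > mean
over = [poisson_sample(rng, rng.gammavariate(2.0, 2.0)) for _ in range(2000)]
```

For the mixed sample the marginal mean is 4 but the variance is 4 + 8 = 12, so the dispersion statistic lands near 3, the kind of violation for which a plain bivariate Poisson regression would understate standard errors.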
Araya, Takao; Kubo, Takuya; von Wirén, Nicolaus; Takahashi, Hideki
2016-03-01
Plant root development is strongly affected by nutrient availability. Despite the importance of the structure and function of roots in nutrient acquisition, statistical modeling approaches to evaluate dynamic and temporal modulations of root system architecture in response to nutrient availability have remained widely open and exploratory areas in root biology. In this study, we developed a statistical modeling approach to investigate modulations of root system architecture in response to nitrogen availability. Mathematical models were designed for quantitative assessment of root growth and root branching phenotypes and their dynamic relationships based on the hierarchical configuration of primary and lateral roots formulating the fishbone-shaped root system architecture in Arabidopsis thaliana. Time-series datasets reporting dynamic changes in root developmental traits on different nitrate or ammonium concentrations were generated for statistical analyses. Regression analyses unraveled key parameters associated with: (i) inhibition of primary root growth under nitrogen limitation or on ammonium; (ii) rapid progression of lateral root emergence in response to ammonium; and (iii) inhibition of lateral root elongation in the presence of excess nitrate or ammonium. This study provides a statistical framework for interpreting dynamic modulation of root system architecture, supported by meta-analysis of datasets displaying morphological responses of roots to diverse nitrogen supplies. © 2015 Institute of Botany, Chinese Academy of Sciences.
Statistical osteoporosis models using composite finite elements: a parameter study.
Wolfram, Uwe; Schwen, Lars Ole; Simon, Ulrich; Rumpf, Martin; Wilke, Hans-Joachim
2009-09-18
Osteoporosis is a widespread disease with severe consequences for patients and high costs for health care systems. The disease is characterised by a loss of bone mass which induces a loss of mechanical performance and structural integrity. It was found that transverse trabeculae are thinned and perforated while vertical trabeculae stay intact. For understanding these phenomena and the mechanisms leading to fractures of trabecular bone due to osteoporosis, numerous researchers employ micro-finite element models. To avoid disadvantages in setting up classical finite element models, composite finite elements (CFE) can be used. The aim of the study is to test the potential of CFE. For that, a parameter study on numerical lattice samples with statistically simulated, simplified osteoporosis is performed. These samples are subjected to compression and shear loading. Results show that the biggest drop in compressive stiffness is reached for transverse isotropic structures losing 32% of the trabeculae (minus 89.8% stiffness). The biggest drop in shear stiffness is found for an isotropic structure also losing 32% of the trabeculae (minus 67.3% stiffness). The study indicates that losing trabeculae leads to a greater drop in macroscopic stiffness than thinning of trabeculae. The results further demonstrate the advantages of CFEs for simulating micro-structured samples.
Statistical Shape Modelling and Markov Random Field Restoration (invited tutorial and exercise)
DEFF Research Database (Denmark)
Hilger, Klaus Baggesen
This tutorial focuses on statistical shape analysis using point distribution models (PDM), which are widely used in modelling biological shape variability over a set of annotated training data. Furthermore, Active Shape Models (ASM) and Active Appearance Models (AAM) are based on PDMs and have proven themselves a generic holistic tool in various segmentation and simulation studies. Finding a basis of homologous points is a fundamental issue in PDMs which affects both alignment and decomposition of the training data, and may be aided by Markov Random Field (MRF) restoration of the correspondence deformation field between shapes. The tutorial demonstrates both generative active shape and appearance models, and MRF restoration on 3D polygonized surfaces. ''Exercise: Spectral-Spatial classification of multivariate images'' From annotated training data this exercise applies spatial image restoration...
DEFF Research Database (Denmark)
Hansen, Niels Christian; Loui, Psyche; Vuust, Peter
Statistical learning underlies the generation of expectations with different degrees of uncertainty. In music, uncertainty applies to expectations for pitches in a melody. This uncertainty can be quantified by Shannon entropy from distributions of expectedness ratings for multiple continuations of each melody, as obtained with the probe-tone paradigm. We hypothesised that statistical learning of music can be modelled as a process of entropy reduction. Specifically, implicit learning of statistical regularities allows reduction in the relative entropy (i.e. symmetrised Kullback-Leibler Divergence) … of musical training, and within-participant decreases in entropy after short-term statistical learning of novel music. Thus, whereas inexperienced listeners make high-entropy predictions, following the Principle of Maximum Entropy, statistical learning over varying timescales enables listeners to generate...
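The entropy and symmetrised Kullback-Leibler quantities in this abstract are direct computations on rating distributions. A minimal sketch (the distributions below are toy stand-ins for probe-tone expectedness profiles, not the study's data):

```python
import math


def shannon_entropy(p):
    """Shannon entropy (bits) of a discrete probability distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0.0)


def symmetrised_kl(p, q):
    """Symmetrised Kullback-Leibler divergence (bits): KL(p||q) + KL(q||p)."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0.0)
    return kl(p, q) + kl(q, p)


# maximum-entropy (uniform) expectations over 12 pitch classes, versus a
# peaked "tonal" profile a trained listener might produce (toy numbers)
uniform = [1.0 / 12.0] * 12
tonal = [0.25, 0.02, 0.08, 0.02, 0.12, 0.10, 0.02, 0.20, 0.02, 0.09, 0.02, 0.06]
```

The uniform profile attains the maximum entropy log2(12) ≈ 3.585 bits, and the peaked profile sits below it, which is the sense in which statistical learning reduces entropy in the paper's framing.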
Microstructure Modeling of Third Generation Disk Alloys
Jou, Herng-Jeng
2010-01-01
The objective of this program was to model, validate, and predict the precipitation microstructure evolution, using PrecipiCalc (QuesTek Innovations LLC) software, for 3rd generation Ni-based gas turbine disc superalloys during processing and service, with a set of logical and consistent experiments and characterizations. Furthermore, within this program, the originally research-oriented microstructure simulation tool was to be further improved and implemented to be a useful and user-friendly engineering tool. In this report, the key accomplishments achieved during the third year (2009) of the program are summarized. The activities of this year included: further development of the multistep precipitation simulation framework for gamma prime microstructure evolution during heat treatment; calibration and validation of gamma prime microstructure modeling with supersolvus heat treated LSHR; modeling of the microstructure evolution of the minor phases, particularly carbides, during isothermal aging, representing the long-term microstructure stability during thermal exposure; and the implementation of software tools. During the research and development efforts to extend the precipitation microstructure modeling and prediction capability in this 3-year program, we identified a hurdle, related to a slow gamma prime coarsening rate, with no satisfactory scientific explanation currently available. It is desirable to raise this issue to the Ni-based superalloys research community, in the hope that in the future there will be a mechanistic understanding and physics-based treatment to overcome the hurdle. In the meantime, an empirical correction factor was developed in this modeling effort to capture the experimental observations.
Local yield stress statistics in model amorphous solids
Barbot, Armand; Lerbinger, Matthias; Hernandez-Garcia, Anier; García-García, Reinaldo; Falk, Michael L.; Vandembroucq, Damien; Patinet, Sylvain
2018-03-01
We develop and extend a method presented by Patinet, Vandembroucq, and Falk [Phys. Rev. Lett. 117, 045501 (2016), 10.1103/PhysRevLett.117.045501] to compute the local yield stresses at the atomic scale in model two-dimensional Lennard-Jones glasses produced via differing quench protocols. This technique allows us to sample the plastic rearrangements in a nonperturbative manner for different loading directions on a well-controlled length scale. Plastic activity upon shearing correlates strongly with the locations of low yield stresses in the quenched states. This correlation is higher in more structurally relaxed systems. The distribution of local yield stresses is also shown to strongly depend on the quench protocol: the more relaxed the glass, the higher the local plastic thresholds. Analysis of the magnitude of local plastic relaxations reveals that stress drops follow exponential distributions, justifying the hypothesis of an average characteristic amplitude often conjectured in mesoscopic or continuum models. The amplitude of the local plastic rearrangements increases on average with the yield stress, regardless of the system preparation. The local yield stress varies with the shear orientation tested and strongly correlates with the plastic rearrangement locations when the system is sheared correspondingly. It is thus argued that plastic rearrangements are the consequence of shear transformation zones encoded in the glass structure that possess weak slip planes along different orientations. Finally, we justify the length scale employed in this work and extract the yield threshold statistics as a function of the size of the probing zones. This method makes it possible to derive physically grounded models of plasticity for amorphous materials by directly revealing the relevant details of the shear transformation zones that mediate this process.
Statistical gravitational waveform models: What to simulate next?
Doctor, Zoheyr; Farr, Ben; Holz, Daniel E.; Pürrer, Michael
2017-12-01
Models of gravitational waveforms play a critical role in detecting and characterizing the gravitational waves (GWs) from compact binary coalescences. Waveforms from numerical relativity (NR), while highly accurate, are too computationally expensive to produce to be directly used with Bayesian parameter estimation tools like Markov-chain Monte Carlo and nested sampling. We propose a Gaussian process regression (GPR) method to generate reduced-order-model waveforms based only on existing accurate (e.g. NR) simulations. Using a training set of simulated waveforms, our GPR approach produces interpolated waveforms along with uncertainties across the parameter space. As a proof of concept, we use a training set of IMRPhenomD waveforms to build a GPR model in the 2-d parameter space of mass ratio q and equal-and-aligned spin χ1 = χ2. Using a regular, equally spaced grid of 120 IMRPhenomD training waveforms in q ∈ [1, 3] and χ1 ∈ [-0.5, 0.5], the GPR mean approximates IMRPhenomD in this space to mismatches below 4.3 × 10^-5. Our approach could in principle use training waveforms directly from numerical relativity. Beyond interpolation of waveforms, we also present a greedy algorithm that utilizes the errors provided by our GPR model to optimize the placement of future simulations. In a fiducial test case we find that using the greedy algorithm to iteratively add simulations achieves GPR errors that are ~1 order of magnitude lower than the errors from using Latin-hypercube or square training grids.
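The GPR machinery the abstract relies on reduces to a posterior mean and variance computed from a kernel matrix, and it is the variance that drives the greedy placement of new simulations. A minimal 1-D sketch in plain Python (the paper interpolates waveforms over a 2-d parameter space; here a scalar function stands in for a waveform, and the RBF kernel and its length scale are illustrative choices):

```python
import math


def rbf(x1, x2, length=0.5):
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)


def cholesky(a):
    # lower-triangular Cholesky factor of a symmetric positive-definite matrix
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(a[i][i] - s) if i == j else (a[i][j] - s) / l[j][j]
    return l


def solve_chol(l, b):
    # solve L L^T x = b by forward then back substitution
    n = len(l)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


def gpr_predict(xs, ys, xq, noise=1e-6):
    """GP posterior mean and variance at a query point xq.

    mean = k*^T K^-1 y,  var = k(xq, xq) - k*^T K^-1 k*.
    The variance is the error bar used to decide where a new training
    simulation would be most informative.
    """
    n = len(xs)
    k = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    l = cholesky(k)
    alpha = solve_chol(l, ys)
    kstar = [rbf(x, xq) for x in xs]
    mean = sum(kstar[i] * alpha[i] for i in range(n))
    v = solve_chol(l, kstar)
    var = rbf(xq, xq) - sum(kstar[i] * v[i] for i in range(n))
    return mean, max(var, 0.0)
```

Inside the training region the posterior variance collapses toward zero, while far outside it reverts to the prior variance, which is exactly the signal a greedy algorithm exploits to place the next simulation.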
Extension of the Wald statistic to models with dependent observations
Czech Academy of Sciences Publication Activity Database
Morales, D.; Pardo, L.; Pardo, M. C.; Vajda, Igor
2000-01-01
Roč. 52, č. 2 (2000), s. 97-113 ISSN 0026-1335 R&D Projects: GA ČR GA102/99/1137 Grant - others:DGES(ES) PB-960635; GV(ES) 99/159/01 Institutional research plan: AV0Z1075907 Keywords : composite parametric hypotheses * generalized likelihood ratio statistic * generalized Wald statistic Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.212, year: 2000
Xu, Y.; Jones, A. D.; Rhoades, A.
2017-12-01
Precipitation is a key component in hydrologic cycles, and changing precipitation regimes contribute to more intense and frequent drought and flood events around the world. Numerical climate modeling is a powerful tool to study climatology and to predict future changes. Despite continuous improvement in numerical models, long-term precipitation prediction remains a challenge, especially at regional scales. To improve numerical simulations of precipitation, it is important to find out where the uncertainty in precipitation simulations comes from. There are two types of uncertainty in numerical model predictions. One is related to uncertainty in the input data, such as the model's boundary and initial conditions. These uncertainties would propagate to the final model outcomes even if the numerical model exactly replicated the true world. But a numerical model cannot exactly replicate the true world. Therefore, the other type of model uncertainty is related to errors in the model physics, such as the parameterization of sub-grid scale processes, i.e., given precise input conditions, how much error could be generated by the imprecise model. Here, we build two statistical models based on a neural network algorithm to predict long-term variation of precipitation over California: one uses "true world" information derived from observations, and the other uses "modeled world" information using model inputs and outputs from the North America Coordinated Regional Downscaling Project (NA CORDEX). We derive multiple climate feature metrics as predictors for the statistical model to represent the impact of global climate on local hydrology, and include topography as a predictor to represent the local control. We first compare the predictors between the true world and the modeled world to determine the errors contained in the input data. By perturbing the predictors in the statistical model, we estimate how much uncertainty in the model's final outcomes is accounted for
DEFF Research Database (Denmark)
A methodology is presented that combines modelling based on first principles and data based modelling into a modelling cycle that facilitates fast decision-making based on statistical methods. A strong feature of this methodology is that, given a first principles model along with process data, … the corresponding modelling cycle model of the given system for a given purpose. A computer-aided tool, which integrates the elements of the modelling cycle, is also presented, and an example is given of modelling a fed-batch bioreactor.
Two statistical approaches, weighted regression on time, discharge, and season and generalized additive models, have recently been used to evaluate water quality trends in estuaries. Both models have been used in similar contexts despite differences in statistical foundations and...
Dataset of coded handwriting features for use in statistical modelling
Directory of Open Access Journals (Sweden)
Anna Agius
2018-02-01
Full Text Available The data presented here is related to the article titled “Using handwriting to infer a writer's country of origin for forensic intelligence purposes” (Agius et al., 2017) [1]. This article reports original writer, spatial and construction characteristic data for thirty-seven English Australian writers and thirty-seven Vietnamese writers. All of these characteristics were coded and recorded in Microsoft Excel 2013 (version 15.31). The construction characteristics coded were only extracted from seven characters, which were: ‘g’, ‘h’, ‘th’, ‘M’, ‘0’, ‘7’ and ‘9’. The coded format of the writer, spatial and construction characteristics is made available in this Data in Brief in order to allow others to perform statistical analyses and modelling to investigate whether there is a relationship between the handwriting features and the nationality of the writer, whether the two nationalities can be differentiated, and to employ mathematical techniques that are capable of characterising the extracted features from each participant.
Increased Statistical Efficiency in a Lognormal Mean Model
Directory of Open Access Journals (Sweden)
Grant H. Skrepnek
2014-01-01
Full Text Available Within the context of clinical and other scientific research, a substantial need exists for an accurate determination of the point estimate in a lognormal mean model, given that highly skewed data are often present. As such, logarithmic transformations are often advocated to achieve the assumptions of parametric statistical inference. Despite this, existing approaches that utilize only a sample’s mean and variance may not necessarily yield the most efficient estimator. The current investigation developed and tested an improved efficient point estimator for a lognormal mean by capturing more complete information via the sample’s coefficient of variation. Results of an empirical simulation study across varying sample sizes and population standard deviations indicated relative improvements in efficiency of up to 129.47 percent compared to the usual maximum likelihood estimator and up to 21.33 absolute percentage points above the efficient estimator presented by Shen and colleagues (2006). The relative efficiency of the proposed estimator increased particularly as a function of decreasing sample size and increasing population standard deviation.
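The kind of efficiency comparison the abstract describes can be reproduced in outline by simulation. A sketch comparing the back-transformed MLE exp(m̂ + ŝ²/2) with the plain sample mean under a lognormal model (the paper's improved coefficient-of-variation estimator itself is not reproduced here; all parameter choices are illustrative):

```python
import math
import random


def naive_mle(sample):
    # back-transformed MLE of the lognormal mean: exp(m_hat + s_hat^2 / 2)
    logs = [math.log(x) for x in sample]
    n = len(logs)
    m = sum(logs) / n
    s2 = sum((v - m) ** 2 for v in logs) / n
    return math.exp(m + s2 / 2.0)


def simulate_stats(estimator, mu=0.0, sigma=1.0, n=15, reps=4000, seed=3):
    """Monte Carlo bias and MSE of an estimator of the lognormal mean."""
    rng = random.Random(seed)
    true_mean = math.exp(mu + sigma * sigma / 2.0)
    est = [estimator([rng.lognormvariate(mu, sigma) for _ in range(n)])
           for _ in range(reps)]
    bias = sum(est) / reps - true_mean
    mse = sum((e - true_mean) ** 2 for e in est) / reps
    return bias, mse
```

Running `simulate_stats` for both estimators at small n shows the back-transformed MLE carries a small upward bias while the sample mean is unbiased but noisy, which is the efficiency gap an improved estimator aims to close.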
Statistical physics of medical diagnostics: Study of a probabilistic model
Mashaghi, Alireza; Ramezanpour, Abolfazl
2018-03-01
We study a diagnostic strategy which is based on the anticipation of the diagnostic process by simulation of the dynamical process starting from the initial findings. We show that such a strategy could result in more accurate diagnoses compared to a strategy that is solely based on the direct implications of the initial observations. We demonstrate this by employing the mean-field approximation of statistical physics to compute the posterior disease probabilities for a given subset of observed signs (symptoms) in a probabilistic model of signs and diseases. A Monte Carlo optimization algorithm is then used to maximize an objective function of the sequence of observations, which favors the more decisive observations resulting in more polarized disease probabilities. We see how the observed signs change the nature of the macroscopic (Gibbs) states of the sign and disease probability distributions. The structure of these macroscopic states in the configuration space of the variables affects the quality of any approximate inference algorithm (so the diagnostic performance) which tries to estimate the sign-disease marginal probabilities. In particular, we find that the simulation (or extrapolation) of the diagnostic process is helpful when the disease landscape is not trivial and the system undergoes a phase transition to an ordered phase.
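Stripped of the mean-field machinery, the underlying sign-disease model is a two-layer Bayesian network, and on a toy instance the posterior disease probabilities given observed signs can be computed by exact enumeration. A sketch with noisy-OR sign tables (all diseases, signs and numbers below are hypothetical, chosen only to illustrate the computation):

```python
from itertools import product

# toy sign-disease model (illustrative numbers, not from the paper)
diseases = ["flu", "cold"]
prior = {"flu": 0.1, "cold": 0.2}
# sign -> (leak probability, {disease: activation probability})
p_sign = {
    "fever": (0.01, {"flu": 0.9, "cold": 0.2}),
    "cough": (0.05, {"flu": 0.6, "cold": 0.7}),
}


def p_sign_given(state, sign):
    # noisy-OR: sign is absent only if the leak and every active cause fail
    leak, act = p_sign[sign]
    q = 1.0 - leak
    for d, on in zip(diseases, state):
        if on:
            q *= 1.0 - act[d]
    return 1.0 - q


def posterior(observed):
    """Exact posterior marginals P(disease present | observed signs)."""
    marg = {d: 0.0 for d in diseases}
    z = 0.0
    for state in product([0, 1], repeat=len(diseases)):
        w = 1.0
        for d, on in zip(diseases, state):
            w *= prior[d] if on else 1.0 - prior[d]
        for sign, present in observed.items():
            ps = p_sign_given(state, sign)
            w *= ps if present else 1.0 - ps
        z += w
        for d, on in zip(diseases, state):
            if on:
                marg[d] += w
    return {d: marg[d] / z for d in diseases}
```

Exact enumeration is exponential in the number of diseases, which is precisely why the paper resorts to mean-field approximations and Monte Carlo optimization for realistic model sizes.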
Tornadoes and related damage costs: statistical modelling with a semi-Markov approach
Directory of Open Access Journals (Sweden)
Guglielmo D’Amico
2016-09-01
Full Text Available We propose a statistical approach to modelling for predicting and simulating occurrences of tornadoes and accumulated cost distributions over a time interval. This is achieved by modelling the tornado intensity, measured with the Fujita scale, as a stochastic process. Since the Fujita scale divides tornado intensity into six states, it is possible to model the tornado intensity by using Markov and semi-Markov models. We demonstrate that the semi-Markov approach is able to reproduce the duration effect that is detected in tornado occurrence. The superiority of the semi-Markov model as compared to the Markov chain model is also affirmed by means of a statistical hypothesis test. As an application, we compute the expected value and the variance of the costs generated by the tornadoes over a given time interval in a given area. The paper contributes to the literature by demonstrating that semi-Markov models represent an effective tool for physical analysis of tornadoes as well as for the estimation of the economic damage to property.
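The semi-Markov construction can be sketched directly: a Markov chain over the six Fujita states plus state-dependent holding times, with costs accumulated per event. All transition probabilities, waiting times and costs below are toy numbers for illustration, not the paper's fitted values:

```python
import random

# Fujita-scale states F0..F5 (toy parameter values, not fitted to data)
STATES = range(6)
P = [  # P[i][j] = prob next tornado has intensity Fj given current Fi
    [0.50, 0.30, 0.10, 0.05, 0.03, 0.02],
    [0.40, 0.30, 0.15, 0.08, 0.05, 0.02],
    [0.30, 0.30, 0.20, 0.10, 0.07, 0.03],
    [0.25, 0.30, 0.20, 0.12, 0.08, 0.05],
    [0.20, 0.30, 0.20, 0.15, 0.10, 0.05],
    [0.20, 0.25, 0.20, 0.15, 0.12, 0.08],
]
MEAN_WAIT = [30.0, 25.0, 20.0, 18.0, 15.0, 12.0]  # days to next event, by state
COST = [0.1, 0.5, 2.0, 10.0, 40.0, 150.0]         # damage per event (M$)


def simulate_cost(horizon_days, seed=11):
    """Accumulated tornado damage over a horizon via a semi-Markov chain.

    Unlike a plain Markov chain, the holding time depends on the current
    state, which is how the model captures the duration effect.
    """
    rng = random.Random(seed)
    state, t, total = 0, 0.0, 0.0
    while True:
        t += rng.expovariate(1.0 / MEAN_WAIT[state])  # state-dependent sojourn
        if t > horizon_days:
            return total
        state = rng.choices(list(STATES), weights=P[state])[0]
        total += COST[state]  # damage from the event that just occurred
```

Averaging `simulate_cost` over many seeds estimates the expected accumulated cost and its variance over the interval, the two quantities the paper computes analytically.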
Olive mill wastewater characteristics: modelling and statistical analysis
Directory of Open Access Journals (Sweden)
Martins-Dias, Susete
2004-09-01
Full Text Available A synthesis of the work carried out on Olive Mill Wastewater (OMW) characterisation is given, covering articles published over the last 50 years. Data on OMW characterisation found in the literature are summarised and correlations between them and with phenolic compounds content are sought. This permits the characteristics of an OMW to be estimated from one simple measurement: the phenolic compounds concentration. A model based on OMW characterisations covering six countries was developed, along with a model for Portuguese OMW. The statistical analysis of the correlations obtained indicates that the Chemical Oxygen Demand of a given OMW is a second-degree polynomial function of its phenolic compounds concentration. Tests to evaluate the significance of the regressions were carried out, based on multivariable ANOVA analysis and on the distribution of standardised residuals and their means, for confidence levels of 95 and 99%, clearly validating these models. This modelling work will help in the future planning, operation and monitoring of an OMW treatment plant.
Statistical Damage Detection of Civil Engineering Structures using ARMAV Models
DEFF Research Database (Denmark)
Andersen, P.; Kirkegaard, Poul Henning
In this paper a statistically based damage detection of a lattice steel mast is performed. By estimation of the modal parameters and their uncertainties it is possible to detect whether some of the modal parameters have changed with statistical significance. The estimation of the uncertainties ...
Definitions and Models of Statistical Literacy: A Literature Review
Sharma, Sashi
2017-01-01
Despite statistical literacy being relatively new in statistics education research, it needs special attention as attempts are being made to enhance the teaching, learning and assessing of this sub-strand. It is important that teachers and researchers are aware of the challenges of teaching this literacy. In this article, the growing importance of…
Statistical model of stress corrosion cracking based on extended ...
Indian Academy of Sciences (India)
In the previous paper (Pramana – J. Phys. 81(6), 1009 (2013)), the mechanism of stress corrosion cracking (SCC) based on a non-quadratic form of the Dirichlet energy was proposed and its statistical features were discussed. Following those results, we discuss here how SCC propagates on a pipe wall statistically. It reveals ...
Maximum entropy principle and hydrodynamic models in statistical mechanics
International Nuclear Information System (INIS)
Trovato, M.; Reggiani, L.
2012-01-01
This review presents the state of the art of the maximum entropy principle (MEP) in its classical and quantum (QMEP) formulation. Within the classical MEP we overview a general theory able to provide, in a dynamical context, the macroscopic relevant variables for carrier transport in the presence of electric fields of arbitrary strength. For the macroscopic variables the linearized maximum entropy approach is developed including full-band effects within a total energy scheme. Under spatially homogeneous conditions, we construct a closed set of hydrodynamic equations for the small-signal (dynamic) response of the macroscopic variables. The coupling between the driving field and the energy dissipation is analyzed quantitatively by using an arbitrary number of moments of the distribution function. Analogously, the theoretical approach is applied to many one-dimensional n⁺nn⁺ submicron Si structures by using different band structure models, different doping profiles, different applied biases and is validated by comparing numerical calculations with ensemble Monte Carlo simulations and with available experimental data. Within the quantum MEP we introduce a quantum entropy functional of the reduced density matrix, and the principle of quantum maximum entropy is then asserted as a fundamental principle of quantum statistical mechanics. Accordingly, we have developed a comprehensive theoretical formalism to rigorously construct a closed quantum hydrodynamic transport model within a Wigner function approach. The theory is formulated both in thermodynamic equilibrium and nonequilibrium conditions, and the quantum contributions are obtained by only assuming that the Lagrange multipliers can be expanded in powers of ħ², ħ being the reduced Planck constant. In particular, by using an arbitrary number of moments, we prove that: (i) on a macroscopic scale all nonlocal effects, compatible with the uncertainty principle, are imputable to high-order spatial derivatives both of the
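The classical MEP closure the review builds on has a compact textbook statement, which may help fix notation (this is the standard variational form, consistent with but not copied from the review):

```latex
% Maximize the entropy functional subject to moment constraints:
\max_{f}\; S[f] = -k_B \int f(\mathbf{p}) \ln f(\mathbf{p}) \, d\mathbf{p}
\quad \text{subject to} \quad
\int \psi_a(\mathbf{p})\, f(\mathbf{p}) \, d\mathbf{p} = \langle \psi_a \rangle .
% The constrained maximizer is the exponential (maximum-entropy) distribution
f_{\mathrm{MEP}}(\mathbf{p}) = \exp\!\Big( -\sum_a \lambda_a \psi_a(\mathbf{p}) \Big),
% with the Lagrange multipliers \lambda_a fixed by the constraints. The QMEP
% part of the review expands these multipliers in powers of \hbar^2.
```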
Global Adjoint Tomography: Next-Generation Models
Bozdag, Ebru; Lefebvre, Matthieu; Lei, Wenjie; Orsvuran, Ridvan; Peter, Daniel; Ruan, Youyi; Smith, James; Komatitsch, Dimitri; Tromp, Jeroen
2017-04-01
The first-generation global adjoint tomography model GLAD-M15 (Bozdag et al. 2016) is the result of 15 conjugate-gradient iterations based on GPU-accelerated spectral-element simulations of 3D wave propagation and Fréchet kernels. For simplicity, GLAD-M15 was constructed as an elastic model with transverse isotropy confined to the upper mantle. However, Earth's mantle and crust show significant evidence of anisotropy as a result of their composition and deformation. There may be different sources of seismic anisotropy affecting both body and surface waves. As a first attempt, we initially tackle surface-wave anisotropy and proceed with iterations using the same 253-earthquake data set used in GLAD-M15, with an emphasis on the upper mantle. Furthermore, we explore new misfits, such as double-difference measurements (Yuan et al. 2016), to better deal with possible artifacts of the uneven global distribution of seismic stations and to minimize source uncertainties in structural inversions. We will present our observations with the initial results of azimuthally anisotropic inversions and also discuss next-generation global models with various parametrizations. Meanwhile, our goal is to use all available seismic data in imaging. This, however, requires a solid framework to perform iterative adjoint tomography workflows with big data on supercomputers. We will talk about developments in the adjoint tomography workflow, from defining new seismic and computational data formats (e.g., ASDF by Krischer et al. 2016, ADIOS by Liu et al. 2011) to developing new pre- and post-processing tools, together with experimenting with workflow management tools such as Pegasus (Deelman et al. 2015). All our simulations are performed on Oak Ridge National Laboratory's Cray XK7 "Titan" system. Our ultimate aim is to get ready to harness ORNL's next-generation supercomputer "Summit", an IBM system with Power-9 CPUs and NVIDIA Volta GPU accelerators, expected to be ready by 2018, which will enable us to
Strangeness content and structure function of the nucleon in a statistical quark model
Trevisan, L A; Tomio, L
1999-01-01
The strangeness content of the nucleon is determined from a statistical model using confined quark levels, and is shown to have a good agreement with the corresponding values extracted from experimental data. The quark levels are generated in a Dirac equation that uses a linear confining potential (scalar plus vector). With the requirement that the result for the Gottfried sum rule violation, given by the new muon collaboration (NMC), is well reproduced, we also obtain the difference between the structure functions of the proton and neutron, and the corresponding sea quark contributions. (27 refs).
Global adjoint tomography: first-generation model
Bozdağ, Ebru
2016-09-23
We present the first-generation global tomographic model constructed based on adjoint tomography, an iterative full-waveform inversion technique. Synthetic seismograms were calculated using GPU-accelerated spectral-element simulations of global seismic wave propagation, accommodating effects due to 3-D anelastic crust & mantle structure, topography & bathymetry, the ocean load, ellipticity, rotation, and self-gravitation. Fréchet derivatives were calculated in 3-D anelastic models based on an adjoint-state method. The simulations were performed on the Cray XK7 named 'Titan', a computer with 18,688 GPU accelerators housed at Oak Ridge National Laboratory. The transversely isotropic global model is the result of 15 tomographic iterations, which systematically reduced differences between observed and simulated three-component seismograms. Our starting model combined 3-D mantle model S362ANI with 3-D crustal model Crust2.0. We simultaneously inverted for structure in the crust and mantle, thereby eliminating the need for widely used 'crustal corrections'. We used data from 253 earthquakes in the magnitude range 5.8 ≤ M ≤ 7.0. We started inversions by combining ~30 s body-wave data with ~60 s surface-wave data. The shortest period of the surface waves was gradually decreased, and in the last three iterations we combined ~17 s body waves with ~45 s surface waves. We started using 180 min long seismograms after the 12th iteration and assimilated minor- and major-arc body and surface waves. The 15th iteration model features enhancements of well-known slabs, an enhanced image of the Samoa/Tahiti plume, as well as various other plumes and hotspots, such as Caroline, Galapagos, Yellowstone and Erebus. Furthermore, we see clear improvements in slab resolution along the Hellenic and Japan Arcs, as well as subduction along the east of the Scotia Plate, which does not exist in the starting model. Point-spread function tests demonstrate that we are approaching the
Modeling and Generating Strategy Games Mechanics
DEFF Research Database (Denmark)
Mahlmann, Tobias
Strategy games are a popular genre of games with a long history, originating from games like Chess or Go. The first strategy games were published as "Kriegspiele" (engl. wargames) in the late 18th century, intended for the education of young cadets. Since then strategy games were refined … and transformed over two centuries into a medium of entertainment. Today's computer strategy games have their roots in the board- and roleplaying games of the 20th century and enjoy great popularity. We use strategy games as an application for the procedural generation of game content. Procedural game content … of the game is, how players may manipulate the game world, etc. We present the Strategy Games Description Language (SGDL), a tree-based approach to model the game mechanics of strategy games. SGDL allows game designers to rapidly prototype their game ideas with the help of our customisable game engine. We …
Ushakov, Yuriy V.; Dubkov, Alexander A.; Spagnolo, Bernardo
2010-04-01
The phenomena of dissonance and consonance in a simple auditory sensory model composed of three neurons are considered. Two of them, so-called sensory neurons, are driven by noise and subthreshold periodic signals with different frequency ratios, and their outputs plus noise are applied synaptically to a third neuron, the so-called interneuron. We present a theoretical analysis with a probabilistic approach to investigate the interspike interval statistics of the spike train generated by the interneuron. We find that tones with frequency ratios considered consonant by musicians produce at the third neuron inter-firing interval densities that are clearly distinct from the densities obtained using tones with ratios known to be dissonant. In other words, at the output of the interneuron, inharmonious signals give rise to blurry spike trains, while harmonious signals produce more regular, less noisy spike trains. Theoretical results are compared with numerical simulations.
Statistical behaviour of adaptive multilevel splitting algorithms in simple models
International Nuclear Information System (INIS)
Rolland, Joran; Simonnet, Eric
2015-01-01
Adaptive multilevel splitting algorithms have been introduced rather recently for estimating tail distributions in a fast and efficient way. In particular, they can be used for computing the so-called reactive trajectories corresponding to direct transitions from one metastable state to another. The algorithm is based on successive selection–mutation steps performed on the system in a controlled way. It has two intrinsic parameters: the number of particles/trajectories and the reaction coordinate used for discriminating good from bad trajectories. We first investigate the convergence in law of the algorithm as a function of the timestep for several simple stochastic models. Second, we consider the average duration of reactive trajectories, for which no theoretical predictions exist. The most important aspect of this work concerns some systems with two degrees of freedom. They are studied in detail as a function of the reaction coordinate in the asymptotic regime where the number of trajectories goes to infinity. We show that during phase transitions, the statistics of the algorithm deviate significantly from known theoretical results when non-optimal reaction coordinates are used. In this case, the variance of the algorithm peaks at the transition and convergence can be much slower than the usual expected central-limit behaviour. The duration of trajectories is affected as well. Moreover, reactive trajectories do not correspond to the most probable ones. Such behaviour disappears when using the optimal reaction coordinate, called the committor, as predicted by the theory. We finally investigate a three-state Markov chain which reproduces this phenomenon and show logarithmic convergence of the trajectory durations
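The selection–mutation loop described above can be illustrated on a one-dimensional drift–diffusion toy model. The sketch below is a generic adaptive-multilevel-splitting estimator with the identity reaction coordinate ξ(x) = x; the model, parameters and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(x0, dt=1e-3, a=-1.0, b=1.0, mu=-1.0):
    """Drift-diffusion path dX = mu*dt + sqrt(2*dt)*N(0,1), run until
    it exits (a, b); returns the whole discretized path."""
    path = [x0]
    x = x0
    while a < x < b:
        x = x + mu * dt + np.sqrt(2 * dt) * rng.normal()
        path.append(x)
    return np.array(path)

def ams(n_traj=60, x0=-0.9, b=1.0, max_iter=5000):
    """Adaptive multilevel splitting with reaction coordinate xi(x)=x:
    estimate the small probability of hitting b before a from x0."""
    paths = [simulate(x0) for _ in range(n_traj)]
    p_est = 1.0
    for _ in range(max_iter):
        levels = np.array([p.max() for p in paths])
        if levels.min() >= b:            # all trajectories are reactive
            break
        worst = int(levels.argmin())     # selection: kill the worst path
        z = levels[worst]
        donors = [i for i in range(n_traj) if levels[i] > z]
        if not donors:                   # degenerate tie: stop early
            break
        # mutation: rebranch from a survivor at its first crossing of z
        donor = paths[rng.choice(donors)]
        k = int(np.argmax(donor > z))
        paths[worst] = np.concatenate([donor[:k + 1], simulate(donor[k])[1:]])
        p_est *= (n_traj - 1) / n_traj   # unbiased level-passage factor
    return p_est
```

For this drifted Brownian motion the exact transition probability is about 0.017, so the estimator should land in that neighbourhood for moderate numbers of trajectories; the choice of reaction coordinate is exactly the knob whose effect the paper studies.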
Modelling malaria treatment practices in Bangladesh using spatial statistics
Directory of Open Access Journals (Sweden)
Haque Ubydul
2012-03-01
Full Text Available Abstract Background Malaria treatment-seeking practices vary worldwide and Bangladesh is no exception. Individuals from 88 villages in Rajasthali were asked about their treatment-seeking practices. A portion of these households preferred malaria treatment from the National Control Programme, but a large number of households still used drug vendors, and approximately one fourth of the individuals surveyed relied exclusively on non-control-programme treatments. The risks of low control-programme usage include incomplete malaria treatment, possible misuse of anti-malarial drugs, and an increased potential for drug resistance. Methods The spatial patterns of treatment-seeking practices were first examined using hot-spot analysis (local Getis-Ord Gi statistic) and then modelled using regression. Ordinary least squares (OLS) regression identified key factors explaining more than 80% of the variation in control-programme and vendor treatment preferences. Geographically weighted regression (GWR) was then used to assess where each factor was a strong predictor of treatment-seeking preferences. Results Several factors, including tribal affiliation, housing materials, household densities, education levels, and proximity to the regional urban centre, were found to be effective predictors of malaria treatment-seeking preferences. The predictive strength of each of these factors, however, varied across the study area. While education, for example, was a strong predictor in some villages, it was less important for predicting treatment-seeking outcomes in other villages. Conclusion Understanding where each factor is a strong predictor of treatment-seeking outcomes may help in planning targeted interventions aimed at increasing control-programme usage. Suggested strategies include providing additional training for the Building Resources across Communities (BRAC) health workers, implementing educational programmes, and addressing economic factors.
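The local Getis-Ord statistic used in the hot-spot step has a simple closed form; a minimal dense-matrix sketch (the toy data and binary weights below are invented for illustration, and a real analysis would build spatial weights in a GIS package) might look like:

```python
import numpy as np

def getis_ord_gstar(x, W):
    """Local Getis-Ord Gi* z-scores for values x and a spatial weights
    matrix W whose rows include the focal site itself; large positive
    values flag hot spots, large negative values cold spots."""
    n = len(x)
    xbar = x.mean()
    s = np.sqrt((x ** 2).mean() - xbar ** 2)
    wsum = W.sum(axis=1)
    num = W @ x - xbar * wsum
    den = s * np.sqrt((n * (W ** 2).sum(axis=1) - wsum ** 2) / (n - 1))
    return num / den

# Toy example: 20 villages on a line, high vendor preference at sites 5..7
x = np.zeros(20)
x[5:8] = 10.0
idx = np.arange(20)
W = (np.abs(idx[:, None] - idx[None, :]) <= 1).astype(float)  # self + neighbours
g = getis_ord_gstar(x, W)   # g peaks inside the cluster of high values
```

The z-scores then feed the mapping step: sites with |Gi*| above a significance cutoff are reported as hot or cold spots.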
Improving statistical reasoning: theoretical models and practical implications
National Research Council Canada - National Science Library
Sedlmeier, Peter
1999-01-01
Preface: Statistical literacy, the art of drawing reasonable inferences from an abundance of numbers provided daily by...
A New Statistic for Evaluating Item Response Theory Models for Ordinal Data. CRESST Report 839
Cai, Li; Monroe, Scott
2014-01-01
We propose a new limited-information goodness-of-fit test statistic C₂ for ordinal IRT models. The construction of the new statistic lies formally between the M₂ statistic of Maydeu-Olivares and Joe (2006), which utilizes first- and second-order marginal probabilities, and the M*₂ statistic of Cai and Hansen…
Computational and Statistical Models: A Comparison for Policy Modeling of Childhood Obesity
Mabry, Patricia L.; Hammond, Ross; Ip, Edward Hak-Sing; Huang, Terry T.-K.
As systems science methodologies have begun to emerge as a set of innovative approaches to address complex problems in behavioral, social science, and public health research, some apparent conflicts with traditional statistical methodologies for public health have arisen. Computational modeling is an approach set in context that integrates diverse sources of data to test the plausibility of working hypotheses and to elicit novel ones. Statistical models are reductionist approaches geared towards proving the null hypothesis. While these two approaches may seem contrary to each other, we propose that they are in fact complementary and can be used jointly to advance solutions to complex problems. Outputs from statistical models can be fed into computational models, and outputs from computational models can lead to further empirical data collection and statistical models. Together, this presents an iterative process that refines the models and contributes to a greater understanding of the problem and its potential solutions. The purpose of this panel is to foster communication and understanding between statistical and computational modelers. Our goal is to shed light on the differences between the approaches and convey what kinds of research inquiries each one is best for addressing and how they can serve complementary (and synergistic) roles in the research process, to mutual benefit. For each approach the panel will cover the relevant "assumptions" and how the differences in what is assumed can foster misunderstandings. The interpretations of the results from each approach will be compared and contrasted and the limitations for each approach will be delineated. We will use illustrative examples from CompMod, the Comparative Modeling Network for Childhood Obesity Policy. The panel will also incorporate interactive discussions with the audience on the issues raised here.
Characterizing and Addressing the Need for Statistical Adjustment of Global Climate Model Data
White, K. D.; Baker, B.; Mueller, C.; Villarini, G.; Foley, P.; Friedman, D.
2017-12-01
As part of its mission to research and measure the effects of the changing climate, the U.S. Army Corps of Engineers (USACE) regularly uses the World Climate Research Programme's Coupled Model Intercomparison Project Phase 5 (CMIP5) multi-model dataset. However, these data are generated at a global level and are not fine-tuned for specific watersheds. This often causes CMIP5 output to vary from locally observed patterns in the climate. Several downscaling methods have been developed to increase the resolution of the CMIP5 data and decrease systemic differences in order to support decision-makers as they evaluate results at the watershed scale. Evaluating preliminary comparisons of observed and projected flow frequency curves over the US revealed a simple framework for water resources decision makers to plan and design water resources management measures under changing conditions using standard tools. Using this framework as a basis, USACE has begun to explore the use of statistical adjustment to alter global climate model data to better match locally observed patterns while preserving the general structure and behavior of the model data. When paired with careful measurement and hypothesis testing, statistical adjustment can be particularly effective at navigating the compromise between locally observed patterns and global climate model structures for decision makers.
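One widely used form of statistical adjustment in this setting is empirical quantile mapping, which forces the model's historical distribution onto the observed one and applies the same mapping to projections. The sketch below is a generic illustration on synthetic data, not USACE's actual procedure.

```python
import numpy as np

def quantile_map(model_hist, obs_hist, model_fut):
    """Empirical quantile mapping: locate each future model value in the
    historical model CDF, then read off the same quantile of the
    observed distribution."""
    q = np.searchsorted(np.sort(model_hist), model_fut) / len(model_hist)
    return np.quantile(obs_hist, np.clip(q, 0.0, 1.0))

# Synthetic check: a model that runs 2 units too warm
rng = np.random.default_rng(1)
obs = rng.normal(10.0, 3.0, 5000)     # observed historical climate
mod = rng.normal(12.0, 3.0, 5000)     # biased model output, same period
fut = rng.normal(13.0, 3.0, 5000)     # model projection
adj = quantile_map(mod, obs, fut)     # bias-adjusted projection
```

Because the mapping is built only from the historical overlap, it removes the systemic offset while preserving the projected change signal, which is the compromise the abstract describes.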
GND-PCA-based statistical modeling of diaphragm motion extracted from 4D MRI.
Swastika, Windra; Masuda, Yoshitada; Xu, Rui; Kido, Shoji; Chen, Yen-Wei; Haneishi, Hideaki
2013-01-01
We analyzed a statistical model of diaphragm motion using regular principal component analysis (PCA) and generalized N-dimensional PCA (GND-PCA). First, we generate 4D MRI of respiratory motion from 2D MRI using an intersection-profile method. We then semiautomatically extract the diaphragm boundary from the 4D MRI to obtain subject-specific diaphragm motion. In order to build a general statistical model of diaphragm motion, we normalize the diaphragm motion in the time and spatial domains and evaluate the diaphragm motion model of 10 healthy subjects by applying regular PCA and GND-PCA. We also validate the results using the leave-one-out method. The results show that the first three principal components of regular PCA contain more than 98% of the total variation of diaphragm motion. However, validation using the leave-one-out method gives a mean error of up to 5.0 mm for right diaphragm motion and 3.8 mm for left diaphragm motion. Model analysis using GND-PCA provides a margin of error of about 1 mm and is able to reconstruct the diaphragm model from fewer samples.
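The "more than 98% in three components" figure corresponds to the cumulative explained-variance curve of ordinary PCA, which can be computed directly from an SVD of the centred data matrix. This is a generic sketch, not the paper's code; rows stand for subjects and columns for flattened motion coordinates, and the synthetic data are ours.

```python
import numpy as np

def pca_cumulative_variance(X):
    """Cumulative fraction of total variance explained by the leading
    principal components of the rows-are-samples data matrix X."""
    Xc = X - X.mean(axis=0)                      # centre each column
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values
    var = s ** 2                                 # per-component variances
    return np.cumsum(var) / var.sum()

# Synthetic motion data dominated by three modes of variation
rng = np.random.default_rng(0)
modes = rng.normal(size=(3, 40))                     # three basis shapes
scores = rng.normal(size=(10, 3)) * [5.0, 2.0, 1.0]  # subject weights
X = scores @ modes + 0.01 * rng.normal(size=(10, 40))
cum = pca_cumulative_variance(X)  # cum[2] should be close to 1
```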
A two-component rain model for the prediction of attenuation statistics
Crane, R. K.
1982-01-01
A two-component rain model has been developed for calculating attenuation statistics. In contrast to most other attenuation prediction models, the two-component model calculates the occurrence probability for volume cells or debris attenuation events. The model performed significantly better than the International Radio Consultative Committee model when used for predictions on earth-satellite paths. It is expected that the model will have applications in modeling the joint statistics required for space diversity system design, the statistics of interference due to rain scatter at attenuating frequencies, and the duration statistics for attenuation events.
Statistics-based model for prediction of chemical biosynthesis yield from Saccharomyces cerevisiae
Directory of Open Access Journals (Sweden)
Leonard Effendi
2011-06-01
Full Text Available Abstract Background The robustness of Saccharomyces cerevisiae in facilitating industrial-scale production of ethanol extends its utilization as a platform to synthesize other metabolites. Metabolic engineering strategies, typically via pathway overexpression and deletion, continue to play a key role in optimizing the conversion efficiency of substrates into the desired products. However, chemical production titer or yield remains difficult to predict based on reaction stoichiometry and mass balance. We sampled a large space of data on chemical production from S. cerevisiae, and developed a statistics-based model to calculate production yield using input variables that represent the number of enzymatic steps in the key biosynthetic pathway of interest, metabolic modifications, cultivation modes, nutrition and oxygen availability. Results Based on the production data of about 40 chemicals produced from S. cerevisiae, the metabolic engineering methods, nutrient supplementation, and fermentation conditions described therein, we generated mathematical models with numerical and categorical variables to predict production yield. Statistically, the models showed that: 1. Chemical production from central metabolic precursors decreased exponentially with increasing number of enzymatic steps for biosynthesis (>30% loss of yield per enzymatic step, P-value = 0); 2. Categorical variables of gene overexpression and knockout improved product yield by 2-4 fold (P-value … Saccharomyces cerevisiae has historically evolved for robust alcohol fermentation. Conclusions We generated simple mathematical models for first-order approximation of chemical production yield from S. cerevisiae. These linear models provide empirical insights into the effects of strain engineering and cultivation conditions on biosynthetic efficiency. These models may not only provide guidelines for metabolic engineers to synthesize desired products, but also be useful to compare the
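The exponential per-step loss of yield corresponds to an ordinary log-linear regression, log(yield) = b0 + b1·steps, whose slope gives the per-step retention factor exp(b1). The sketch and its 35%-per-step synthetic data are illustrative, not the paper's fitted model.

```python
import numpy as np

def fit_loglinear(steps, yields):
    """Least-squares fit of log(yield) = b0 + b1*steps; exp(b1) is the
    fraction of yield retained per additional enzymatic step."""
    A = np.column_stack([np.ones(len(steps)), steps])
    (b0, b1), *_ = np.linalg.lstsq(A, np.log(yields), rcond=None)
    return b0, b1

# Synthetic pathway data: 35% of the yield lost per enzymatic step
steps = np.array([1, 2, 3, 5, 8])
yields = 0.8 * 0.65 ** steps
b0, b1 = fit_loglinear(steps, yields)
retention = np.exp(b1)    # ~0.65, i.e. ~35% loss per step
```

Categorical predictors (overexpression, knockout, cultivation mode) would enter such a model as extra 0/1 columns of the design matrix A.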
One of the main uses of biomarker measurements is to compare different populations to each other and to assess risk in comparison to established parameters. This is most often done using summary statistics such as central tendency, variance components, confidence intervals, excee...
Statistical and dynamical modeling of heavy-ion fusion-fission reactions
Eslamizadeh, H.; Razazzadeh, H.
2018-02-01
A modified statistical model and a four-dimensional dynamical model based on Langevin equations have been used to simulate the fission process of the excited compound nuclei 207At and 216Ra produced in the fusion reactions 19F + 188Os and 19F + 197Au. The evaporation residue cross section, the fission cross section, the pre-scission neutron, proton and alpha multiplicities, and the anisotropy of the fission-fragment angular distribution have been calculated for the excited compound nuclei 207At and 216Ra. In the modified statistical model, the effects of the spin K about the symmetry axis and of temperature have been considered in the calculations of the fission widths and the potential energy surfaces. It was shown that the modified statistical model can reproduce the above-mentioned experimental data using appropriate values of the temperature coefficient of the effective potential, λ = 0.0180 ± 0.0055 and 0.0080 ± 0.0030 MeV⁻², and of the scaling factor of the fission barrier height, rs = 1.0015 ± 0.0025 and 1.0040 ± 0.0020, for the compound nuclei 207At and 216Ra, respectively. Three collective shape coordinates plus the projection of the total spin of the compound nucleus on the symmetry axis, K, were considered in the four-dimensional dynamical model. In the dynamical calculations, dissipation was generated through the chaos-weighted wall and window friction formula. Comparison of the theoretical results with the experimental data showed that the two models make it possible to reproduce satisfactorily the above-mentioned experimental data for the excited compound nuclei 207At and 216Ra.
The resolution of climate model outputs is too coarse for them to be used as direct inputs to impact models for assessing climate change impacts on agricultural production, water resources, and ecosystem services at local or site-specific scales. Statistical downscaling approaches are usually used to bridge th...
Steger, Stefan; Brenning, Alexander; Bell, Rainer; Glade, Thomas
2016-12-01
There is unanimous agreement that a precise spatial representation of past landslide occurrences is a prerequisite to produce high quality statistical landslide susceptibility models. Even though perfectly accurate landslide inventories rarely exist, investigations of how landslide inventory-based errors propagate into subsequent statistical landslide susceptibility models are scarce. The main objective of this research was to systematically examine whether and how inventory-based positional inaccuracies of different magnitudes influence modelled relationships, validation results, variable importance and the visual appearance of landslide susceptibility maps. The study was conducted for a landslide-prone site located in the districts of Amstetten and Waidhofen an der Ybbs, eastern Austria, where an earth-slide point inventory was available. The methodological approach comprised an artificial introduction of inventory-based positional errors into the present landslide data set and an in-depth evaluation of subsequent modelling results. Positional errors were introduced by artificially changing the original landslide position by a mean distance of 5, 10, 20, 50 and 120 m. The resulting differently precise response variables were separately used to train logistic regression models. Odds ratios of predictor variables provided insights into modelled relationships. Cross-validation and spatial cross-validation enabled an assessment of predictive performances and permutation-based variable importance. All analyses were additionally carried out with synthetically generated data sets to further verify the findings under rather controlled conditions. The results revealed that an increasing positional inventory-based error was generally related to increasing distortions of modelling and validation results. However, the findings also highlighted that interdependencies between inventory-based spatial inaccuracies and statistical landslide susceptibility models are complex. The
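The artificial error injection can be pictured as displacing each inventory point in a random direction by a random distance with a prescribed mean. The exponential distance law below is one arbitrary choice that realises a target mean positional error; the abstract does not state the authors' exact displacement scheme, and the data here are synthetic.

```python
import numpy as np

def displace_points(points, mean_dist, rng):
    """Shift each 2-D point by an exponentially distributed distance
    (mean = mean_dist) in a uniformly random direction, mimicking a
    positionally inaccurate landslide inventory."""
    n = len(points)
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    r = rng.exponential(mean_dist, n)
    offset = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
    return points + offset

rng = np.random.default_rng(0)
pts = rng.uniform(0, 1000, size=(20000, 2))    # synthetic inventory (m)
moved = displace_points(pts, 20.0, rng)        # ~20 m mean positional error
```

Refitting the susceptibility model on `moved` instead of `pts`, for each mean distance in turn, is the kind of sensitivity experiment the study performs.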
"Method, system and storage medium for generating virtual brick models"
DEFF Research Database (Denmark)
2009-01-01
An exemplary embodiment is a method for generating a virtual brick model. The virtual brick models are generated by users and uploaded to a centralized host system. Users can build virtual models themselves or download and edit another user's virtual brick models while retaining the identity … of the original virtual brick model. Routines are provided for storing user-created building steps, generating automated building instructions for virtual brick models, generating a bill of materials for a virtual brick model, and ordering physical bricks corresponding to a virtual brick model…
Information Geometric Complexity of a Trivariate Gaussian Statistical Model
Directory of Open Access Journals (Sweden)
Domenico Felice
2014-05-01
Full Text Available We evaluate the information geometric complexity of entropic motion on low-dimensional Gaussian statistical manifolds in order to quantify how difficult it is to make macroscopic predictions about systems in the presence of limited information. Specifically, we observe that the complexity of such entropic inferences not only depends on the amount of available pieces of information but also on the manner in which such pieces are correlated. Finally, we uncover that, for certain correlational structures, the impossibility of reaching the most favorable configuration from an entropic inference viewpoint seems to lead to an information geometric analog of the well-known frustration effect that occurs in statistical physics.
Generating flexible proper name references in text : Data, models and evaluation
Castro Ferreira, Thiago; Krahmer, Emiel; Wubben, Sander
2017-01-01
This study introduces a statistical model able to generate variations of a proper name, taking into account the person to be mentioned, the discourse context and individual variation. The model relies on the REGnames corpus, a dataset with 53,102 proper name references to 1,000 people in different
Validation of the measure automobile emissions model : a statistical analysis
2000-09-01
The Mobile Emissions Assessment System for Urban and Regional Evaluation (MEASURE) model provides an external validation capability for the hot-stabilized option; the model is one of several new modal emissions models designed to predict hot-stabilized e...
Parameterizing Phrase Based Statistical Machine Translation Models: An Analytic Study
Cer, Daniel
2011-01-01
The goal of this dissertation is to determine the best way to train a statistical machine translation system. I first develop a state-of-the-art machine translation system called Phrasal and then use it to examine a wide variety of potential learning algorithms and optimization criteria and arrive at two very surprising results. First, despite the…
Applications of spatial statistical network models to stream data
Daniel J. Isaak; Erin E. Peterson; Jay M. Ver Hoef; Seth J. Wenger; Jeffrey A. Falke; Christian E. Torgersen; Colin Sowder; E. Ashley Steel; Marie-Josee Fortin; Chris E. Jordan; Aaron S. Ruesch; Nicholas Som; Pascal. Monestiez
2014-01-01
Streams and rivers host a significant portion of Earth's biodiversity and provide important ecosystem services for human populations. Accurate information regarding the status and trends of stream resources is vital for their effective conservation and management. Most statistical techniques applied to data measured on stream networks were developed for...
Monte Carlo simulation of quantum statistical lattice models
Raedt, Hans De; Lagendijk, Ad
1985-01-01
In this article we review recent developments in computational methods for quantum statistical lattice problems. We begin by giving the necessary mathematical basis, the generalized Trotter formula, and discuss the computational tools, exact summations and Monte Carlo simulation, that will be used
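The generalized Trotter formula underlying these methods can be written, in its simplest form, as

```latex
\operatorname{Tr} e^{-\beta (A + B)}
  \;=\; \lim_{m \to \infty} \operatorname{Tr}
        \left( e^{-\beta A / m} \, e^{-\beta B / m} \right)^{m},
```

so the partition function of a d-dimensional quantum lattice model becomes the limit of partition functions of equivalent (d+1)-dimensional classical models, which the exact summations and Monte Carlo simulations discussed in the review can then handle.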
On cumulative process model and its statistical analysis
Czech Academy of Sciences Publication Activity Database
Volf, Petr
2000-01-01
Roč. 36, č. 2 (2000), s. 165-176 ISSN 0023-5954 R&D Projects: GA ČR GA201/97/0354; GA ČR GA402/98/0742 Institutional research plan: AV0Z1075907 Subject RIV: BB - Applied Statistics, Operational Research
Statistical model of stress corrosion cracking based on extended ...
Indian Academy of Sciences (India)
2016-09-07
In the previous paper (Pramana – J. Phys. 81(6), 1009 (2013)), the mechanism of stress corrosion cracking (SCC) based on a non-quadratic form of the Dirichlet energy was proposed and its statistical features were discussed. Following those results, we discuss here how SCC propagates on pipe wall ...
Horvath, E.A.; Fosnight, E.A.; Klingebiel, A.A.; Moore, D.G.; Stone, J.E.; Reybold, W.U.; Petersen, G.W.
1987-01-01
A methodology has been developed to create a spatial database by referencing digital elevation, Landsat multispectral scanner data, and digitized soil premap delineations of a number of adjacent 7.5-min quadrangle areas to a 30-m Universal Transverse Mercator projection. Slope and aspect transformations are calculated from elevation data and grouped according to field office specifications. An unsupervised classification is performed on a brightness and greenness transformation of the spectral data. The resulting spectral, slope, and aspect maps of each of the 7.5-min quadrangle areas are then plotted and submitted to the field office to be incorporated into the soil premapping stages of a soil survey. A tabular database is created from spatial data by generating descriptive statistics for each data layer within each soil premap delineation. The tabular database is then entered into a database management system to be accessed by the field office personnel during the soil survey and to be used for subsequent resource management decisions. Large amounts of data are collected and archived during resource inventories for public land management. Often these data are stored as stacks of maps or folders in a file system in someone's office, with the maps in a variety of formats and scales, and with various standards of accuracy depending on their purpose. This system of information storage and retrieval is cumbersome at best when several categories of information are needed simultaneously for analysis or as input to resource management models. Computers now provide the resource scientist with the opportunity to design increasingly complex models that require even more categories of resource-related information, thus compounding the problem. Recently there has been much emphasis on the use of geographic information systems (GIS) as an alternative method for map data archives and as a resource management tool. Considerable effort has been devoted to the generation of tabular
Learning Statistical Patterns in Relational Data Using Probabilistic Relational Models
National Research Council Canada - National Science Library
Koller, Daphne
2005-01-01
.... This effort focused on developing undirected probabilistic models for representing and learning graph patterns, learning patterns involving links between objects, learning discriminative models...
Unbiased Statistics of a Constraint Satisfaction Problem - a Controlled-Bias Generator
Berthier, Denis
We show that estimating the complexity (mean and distribution) of the instances of a fixed-size Constraint Satisfaction Problem (CSP) can be very hard. We deal with the two main aspects of the problem: defining a measure of complexity and generating random unbiased instances. For the first problem, we rely on a general framework and a measure of complexity we presented at CISSE08. For the generation problem, we restrict our analysis to the Sudoku example and provide a solution that also explains why it is so difficult.
Statistical description of tropospheric delay for InSAR : Overview and a new model
DEFF Research Database (Denmark)
Merryman Boncori, John Peter; Mohr, Johan Jacob
2007-01-01
This paper focuses on statistical modeling of water vapor fluctuations for InSAR. The structure function and power spectral density approaches are reviewed, summarizing their assumptions and results. The linking equations between these modeling techniques are reported. A structure function model … of these, to atmospheric statistics. The latter approach is used to compare the derived model with previously published results…
An accurate behavioral model for single-photon avalanche diode statistical performance simulation
Xu, Yue; Zhao, Tingchen; Li, Ding
2018-01-01
An accurate behavioral model is presented to simulate important statistical performance of single-photon avalanche diodes (SPADs), such as dark count and after-pulsing noise. The derived simulation model takes into account all important generation mechanisms of the two kinds of noise. For the first time, thermal agitation, trap-assisted tunneling and band-to-band tunneling mechanisms are simultaneously incorporated in the simulation model to evaluate the dark count behavior of SPADs fabricated in deep sub-micron CMOS technology. Meanwhile, a complete carrier trapping and de-trapping process is considered in the after-pulsing model and a simple analytical expression is derived to estimate the after-pulsing probability. In particular, the key model parameters of avalanche triggering probability and the electric field dependence of excess bias voltage are extracted from Geiger-mode TCAD simulation, so the behavioral simulation model does not include any empirical parameters. The developed SPAD model is implemented in the Verilog-A behavioral hardware description language and successfully operated on the commercial Cadence Spectre simulator, showing good universality and compatibility. The model simulation results are in good agreement with the test data, validating the high simulation accuracy.
When civil registration is inadequate: interim methods for generating vital statistics.
AbouZahr, Carla; Rampatige, Rasika; Lopez, Alan; deSavigny, Don
2012-04-01
Comprehensive guidelines and tools to help countries rapidly improve their vital statistics systems, based on international best practice, are now available. For many countries, however, attainment of timely, accurate statistics on births, deaths and causes of death will require years of strategic and prioritized investment, with technical assistance from WHO, the United Nations, and academia. In the meantime, countries will need accurate and unbiased data in order to measure progress with their health programs and broader development goals, such as the MDGs and the growing crisis of non-communicable diseases. This article has introduced some interim strategies that can yield adequate vital statistics and cause-of-death data as countries work to strengthen their civil registration systems. These methods mirror the skills, practices and advantages of complete and functioning civil registration and vital statistics systems, but for a sample of the population. They are based on the principle of rigorous and continuous data collection for a defined and manageable part of the population. Covering "smaller, representative" populations well rather than "larger populations poorly" will reduce the biases that would otherwise arise from missing data, incorrect application of data management procedures, poor data quality checking and lack of medical certification of causes of death. A critical component of this strategy is to routinely apply verbal autopsy (VA) methods to collect essential cause-of-death data. When properly applied, VA can yield population-based cause-of-death data of comparable quality to what is typically collected in hospitals in developing countries. Moreover, with the availability of automated methods to diagnose causes of death, it is now possible to obtain accurate cause-of-death data routinely, cheaply and quickly in resource-poor settings. The long-term goal of strengthening civil registration and vital statistics systems is to ensure that every
Model Based Analysis and Test Generation for Flight Software
Pasareanu, Corina S.; Schumann, Johann M.; Mehlitz, Peter C.; Lowry, Mike R.; Karsai, Gabor; Nine, Harmon; Neema, Sandeep
2009-01-01
We describe a framework for model-based analysis and test case generation in the context of a heterogeneous model-based development paradigm that uses and combines MathWorks and UML 2.0 models and the associated code generation tools. This paradigm poses novel challenges to analysis and test case generation that, to the best of our knowledge, have not been addressed before. The framework is based on a common intermediate representation for different modeling formalisms and leverages and extends model checking and symbolic execution tools for model analysis and test case generation, respectively. We discuss the application of our framework to software models for a NASA flight mission.
New tools for generation IV assemblies modelling
International Nuclear Information System (INIS)
Sylvie Aniel-Buchheit; Edwige Richebois
2005-01-01
Full text of publication follows: In the framework of the development of generation IV concepts, the need for new assembly modelling tools arises. These concepts present more geometrical and spectral heterogeneities (radially and axially). Moreover, thermal-hydraulics and neutronics aspects are so closely related that coupled computations are necessary. That raises the need for more precise and flexible tools presenting 3D features. The 3D coupling of the thermal-hydraulic code FLICA4 with the Monte Carlo neutronics code TRIPOLI4 was developed in that frame. This new tool enables, for the first time, realistic axial and radial power profiles to be obtained with real feedback effects in an assembly where thermal-hydraulics and neutronics effects are closely related. The BWR is the existing concept whose heterogeneous characteristics are closest to the various newly proposed concepts. This assembly design is thus chosen to compare this new tool, presenting real 3D characteristics, to the existing ones. For design studies, the evaluation of the assembly behavior currently necessitates a depletion scheme using a 3D thermal-hydraulics assembly calculation coupled with a 1D axial neutronics deterministic calculation (or an axial power profile chosen as a function of the assembly-averaged burn-up). The 3D neutronics code (CRONOS2) uses neutronic data built by 2D deterministic assembly calculations without feedback. These cross-section libraries make it possible to take feedbacks into account via parameters such as fuel temperature and moderator density and temperature (history parameters such as void and control rod are not useful in design evaluation). Recently, the library build-up has been replaced by on-line multi-2D deterministic assembly calculations performed by a cell code (APOLLO2). That avoids interpolation between pre-determined parameters in the cross-section data used by the 1D axial neutronics calculation and makes it possible to give a radial power map to the 3D thermal
DEFF Research Database (Denmark)
ter Beek, Maurice H.; Legay, Axel; Lluch Lafuente, Alberto
2015-01-01
We investigate the suitability of statistical model checking techniques for analysing quantitative properties of software product line models with probabilistic aspects. For this purpose, we enrich the feature-oriented language FLAN with action rates, which specify the likelihood of exhibiting...... particular behaviour or of installing features at a specific moment or in a specific order. The enriched language (called PFLAN) allows us to specify models of software product lines with probabilistic configurations and behaviour, e.g. by considering a PFLAN semantics based on discrete-time Markov chains....... The Maude implementation of PFLAN is combined with the distributed statistical model checker MultiVeStA to perform quantitative analyses of a simple product line case study. The presented analyses include the likelihood of certain behaviour of interest (e.g. product malfunctioning) and the expected average...
Directory of Open Access Journals (Sweden)
Maurice H. ter Beek
2015-04-01
Full Text Available We investigate the suitability of statistical model checking techniques for analysing quantitative properties of software product line models with probabilistic aspects. For this purpose, we enrich the feature-oriented language FLan with action rates, which specify the likelihood of exhibiting particular behaviour or of installing features at a specific moment or in a specific order. The enriched language (called PFLan) allows us to specify models of software product lines with probabilistic configurations and behaviour, e.g. by considering a PFLan semantics based on discrete-time Markov chains. The Maude implementation of PFLan is combined with the distributed statistical model checker MultiVeStA to perform quantitative analyses of a simple product line case study. The presented analyses include the likelihood of certain behaviour of interest (e.g. product malfunctioning) and the expected average cost of products.
Gallagher, H. Colin; Robins, Garry
2015-01-01
As part of the shift within second language acquisition (SLA) research toward complex systems thinking, researchers have called for investigations of social network structure. One strand of social network analysis yet to receive attention in SLA is network statistical models, whereby networks are explained in terms of smaller substructures of…
Study on Semi-Parametric Statistical Model of Safety Monitoring of Cracks in Concrete Dams
Gu, Chongshi; Qin, Dong; Li, Zhanchao; Zheng, Xueqin
2013-01-01
Cracks are one of the hidden dangers in concrete dams. The study on safety monitoring models of concrete dam cracks has always been difficult. Using the parametric statistical model of safety monitoring of cracks in concrete dams, with the help of the semi-parametric statistical theory, and considering the abnormal behaviors of these cracks, the semi-parametric statistical model of safety monitoring of concrete dam cracks is established to overcome the limitation of the parametric model in ex...
Linear System Models for Ultrasonic Imaging: Intensity Signal Statistics.
Abbey, Craig K; Zhu, Yang; Bahramian, Sara; Insana, Michael F
2017-04-01
Despite a great deal of work characterizing the statistical properties of radio frequency backscattered ultrasound signals, less is known about the statistical properties of demodulated intensity signals. Analysis of intensity is made more difficult by a strong nonlinearity that arises in the process of demodulation. This limits our ability to characterize the spatial resolution and noise properties of B-mode ultrasound images. In this paper, we generalize earlier results on two-point intensity covariance using a multivariate systems approach. We derive the mean and autocovariance function of the intensity signal under Gaussian assumptions on both the object scattering function and acquisition noise, and with the assumption of a locally shift-invariant pulse-echo system function. We investigate the limiting cases of point statistics and a uniform scattering field with a stationary distribution. Results from validation studies using simulation and data from a real system applied to a uniform scattering phantom are presented. In the simulation studies, we find errors less than 10% between the theoretical mean and variance, and sample estimates of these quantities. Prediction of the intensity power spectrum (PS) in the real system exhibits good qualitative agreement (errors less than 3.5 dB for frequencies between 0.1 and 10 cyc/mm, but with somewhat higher error outside this range that may be due to the use of a window in the PS estimation procedure). We also replicate the common finding that the intensity mean is equal to its standard deviation (i.e., signal-to-noise ratio = 1) for fully developed speckle. We show how the derived statistical properties can be used to characterize the quality of an ultrasound linear array for low-contrast patterns using generalized noise-equivalent quanta directly on the intensity signal.
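The closing observation above, that the intensity mean equals its standard deviation (signal-to-noise ratio = 1) for fully developed speckle, follows from the intensity being the squared magnitude of a circular complex Gaussian field, which makes it exponentially distributed. A quick numerical check (illustrative only, not the paper's system model):

```python
import random

random.seed(7)

# Fully developed speckle: the backscattered field is circular complex
# Gaussian, so the intensity I = |z|^2 is exponentially distributed and
# its mean equals its standard deviation (SNR = 1).
n = 50000
intensities = []
for _ in range(n):
    re = random.gauss(0.0, 1.0)
    im = random.gauss(0.0, 1.0)
    intensities.append(re * re + im * im)

m = sum(intensities) / n
var = sum((x - m) ** 2 for x in intensities) / (n - 1)
snr = m / var ** 0.5
print(round(snr, 3))  # close to 1.0
```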
Statistical model of the powder flow regulation by nanomaterials
Kurfess, D.; Hinrichsen, H.; Zimmermann, I.
2005-01-01
Fine powders often tend to agglomerate due to van der Waals forces between the particles. These forces can be reduced significantly by covering the particles with nanoscaled adsorbates, as shown by recent experiments. In the present work a quantitative statistical analysis of the effect of powder flow regulating nanomaterials on the adhesive forces in powders is given. Covering two spherical powder particles randomly with nanoadsorbates we compute the decrease of the mutual van der Waals forc...
A Frequency Matching Method for Generation of a Priori Sample Models from Training Images
DEFF Research Database (Denmark)
Lange, Katrine; Cordua, Knud Skou; Frydendall, Jan
2011-01-01
This paper presents a Frequency Matching Method (FMM) for generation of a priori sample models based on training images and illustrates its use by an example. In geostatistics, training images are used to represent a priori knowledge or expectations of models, and the FMM can be used to generate...... new images that share the same multi-point statistics as a given training image. The FMM proceeds by iteratively updating voxel values of an image until the frequency of patterns in the image matches the frequency of patterns in the training image; making the resulting image statistically...... indistinguishable from the training image....
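The core FMM loop described above, iteratively updating voxel values until the pattern frequencies of the image match those of the training image, can be sketched in one dimension with binary values. This is a simplified greedy toy, not the authors' algorithm; the pattern length, acceptance rule, and training image are invented for illustration.

```python
import random
from collections import Counter

def pattern_counts(seq, k=2):
    # Frequencies of length-k patterns, with wrap-around for simplicity.
    n = len(seq)
    return Counter(tuple(seq[(i + j) % n] for j in range(k)) for i in range(n))

def mismatch(a, b):
    # L1 distance between two pattern-frequency tables.
    return sum(abs(a[p] - b[p]) for p in set(a) | set(b))

def frequency_match(training, length, iterations=2000, seed=1):
    """Flip one cell at a time, keeping flips that do not worsen the
    match to the training image's pattern frequencies."""
    rng = random.Random(seed)
    target = pattern_counts(training)
    image = [rng.randint(0, 1) for _ in range(length)]
    score = start = mismatch(pattern_counts(image), target)
    for _ in range(iterations):
        i = rng.randrange(length)
        image[i] ^= 1                          # propose flipping one voxel
        new = mismatch(pattern_counts(image), target)
        if new <= score:
            score = new                        # keep non-worsening proposals
        else:
            image[i] ^= 1                      # revert worsening proposals
    return image, start, score

training = [0, 0, 1, 1] * 8                    # simple 1-D "training image"
image, start, final = frequency_match(training, len(training))
print(start, final)
```

The mismatch never increases under this acceptance rule, so the sampled image drifts toward the training image's multi-point pattern statistics.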
International Nuclear Information System (INIS)
Lim, Gyeong Hui
2008-03-01
This book consists of 15 chapters covering: the basic concepts and meaning of statistical thermodynamics, Maxwell-Boltzmann statistics, ensembles, thermodynamic functions and fluctuations, statistical dynamics of independent-particle systems, ideal molecular systems, chemical equilibrium and chemical reaction rates in ideal gas mixtures, classical statistical thermodynamics, the ideal lattice model, lattice statistics and non-ideal lattice models, imperfect gas theory of liquids, the theory of solutions, the statistical thermodynamics of interfaces, the statistical thermodynamics of high-molecular systems, and quantum statistics.
Pathak, Lakshmi; Singh, Vineeta; Niwas, Ram; Osama, Khwaja; Khan, Saif; Haque, Shafiul; Tripathi, C K M; Mishra, B N
2015-01-01
Cholesterol oxidase (COD) is a bi-functional FAD-containing oxidoreductase which catalyzes the oxidation of cholesterol into 4-cholesten-3-one. The wider biological functions and clinical applications of COD have urged the screening, isolation and characterization of newer microbes from diverse habitats as a source of COD, and the optimization and over-production of COD for various uses. The practicability of statistical/artificial-intelligence techniques, such as response surface methodology (RSM), artificial neural networks (ANN) and genetic algorithms (GA), has been tested to optimize the medium composition for the production of COD from the novel strain Streptomyces sp. NCIM 5500. All experiments were performed according to a five-factor central composite design (CCD) and the generated data were analysed using RSM and ANN. GA was employed to optimize the models generated by RSM and ANN. Based upon the predicted COD concentration, the model developed with ANN was found to be superior to the model developed with RSM. The RSM-GA approach predicted a maximum of 6.283 U/mL COD production, whereas the ANN-GA approach predicted a maximum of 9.93 U/mL COD concentration. The optimum concentrations of the medium variables predicted through the ANN-GA approach were: 1.431 g/50 mL soybean, 1.389 g/50 mL maltose, 0.029 g/50 mL MgSO4, 0.45 g/50 mL NaCl and 2.235 mL/50 mL glycerol. The experimental COD concentration was concurrent with the GA-predicted yield and led to 9.75 U/mL COD production, which was nearly two times higher than the yield (4.2 U/mL) obtained with the un-optimized medium. To the best of our knowledge, this is the first report of statistical versus artificial-intelligence-based modeling and optimization of COD production by Streptomyces sp. NCIM 5500.
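The GA step of the pipeline above can be illustrated with a minimal genetic search over a smooth two-variable surface. Everything here is a hypothetical stand-in: the response function, bounds, and GA parameters are invented, not the paper's fitted RSM/ANN model.

```python
import random

rng = random.Random(42)

def response(x, y):
    # Hypothetical smooth response surface standing in for a fitted
    # yield model; its true maximum value is 10.0 at (1.4, 2.2).
    return 10.0 - (x - 1.4) ** 2 - (y - 2.2) ** 2

def genetic_maximize(fitness, bounds, pop_size=30, generations=60):
    low, high = bounds
    pop = [(rng.uniform(low, high), rng.uniform(low, high))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(*ind), reverse=True)
        parents = pop[: pop_size // 2]       # truncation selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            # Intermediate crossover plus small Gaussian mutation.
            x = 0.5 * (a[0] + b[0]) + rng.gauss(0.0, 0.05)
            y = 0.5 * (a[1] + b[1]) + rng.gauss(0.0, 0.05)
            children.append((min(max(x, low), high), min(max(y, low), high)))
        pop = parents + children
    return max(pop, key=lambda ind: fitness(*ind))

best = genetic_maximize(response, (0.0, 5.0))
print(best)
```

In the paper's setting the fitness would be the RSM or ANN prediction of COD yield rather than a closed-form quadratic.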
Statistical Evaluation of the Emissions Level Of CO, CO2 and HC Generated by Passenger Cars
Directory of Open Access Journals (Sweden)
Claudiu Ursu
2014-12-01
Full Text Available This paper aims to evaluate the differences in the emission levels of CO, CO2 and HC generated by passenger cars in different running regimes and at different times, in order to identify measures for reducing pollution. A sample of Dacia Logan passenger cars (n = 515), manufactured during the period 2004-2007, equipped with spark-ignition engines and assigned to emission standards EURO 3 (E3) and EURO 4 (E4), was analyzed. These cars were evaluated at the periodical technical inspection (ITP) twice, in the two running regimes (slow idle and accelerated idle). Using the t test for paired samples (Paired Samples T Test), the results showed that there are significant differences between the emission levels (CO, CO2, HC) generated by Dacia Logan passenger cars at the two assessments, and regression analysis showed that these differences are not significantly influenced by turnover differences.
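The paired-samples t test used above compares two measurements on the same cars, so the statistic is built from the per-car differences. A minimal sketch with invented CO readings (not the paper's data):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t statistic of the paired-samples t test:
    t = mean(d) / (sd(d) / sqrt(n)), where d are per-subject differences."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n))

# Hypothetical CO readings (% vol) for five cars at two assessments.
first = [0.30, 0.25, 0.40, 0.35, 0.28]
second = [0.22, 0.20, 0.33, 0.30, 0.21]
t = paired_t_statistic(first, second)
print(round(t, 2))  # → 10.67
```

The statistic is then compared against a t distribution with n - 1 degrees of freedom to decide significance.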
2007-08-01
well (Lapata and Brew, 1999). In this case, the fact that a chair is a physical GOAL makes it more likely that tied means “to physically attach”. These... Ben Gurion University, Beer Sheva, Israel. Katrin Erk (2005). Frame assignment as word sense disambiguation. In Proceedings of the Sixth... Generation (INLG-2002), pp. 17-24. Harriman, NY. Maria Lapata and Chris Brew (1999). Using subcategorization to resolve verb class ambiguity. In
Statistical analysis and model validation of automobile emissions
2000-09-01
The article discusses the development of a comprehensive modal emissions model that is currently being integrated with a variety of transportation models as part of National Cooperative Highway Research Program project 25-11. Described is the second-...
Cross-Lingual Lexical Triggers in Statistical Language Modeling
National Research Council Canada - National Science Library
Kim, Woosung; Khudanpur, Sanjeev
2003-01-01
.... We achieve this through an extension of the method of lexical triggers to the cross-language problem, and by developing a likelihoodbased adaptation scheme for combining a trigger model with an N-gram model...
Energy Technology Data Exchange (ETDEWEB)
Lovejoy, S., E-mail: lovejoy@physics.mcgill.ca [Physics Department, McGill University, Montreal, Quebec H3A 2T8 (Canada); Lima, M. I. P. de [Institute of Marine Research (IMAR) and Marine and Environmental Sciences Centre (MARE), Coimbra (Portugal); Department of Civil Engineering, University of Coimbra, 3030-788 Coimbra (Portugal)
2015-07-15
Over the range of time scales from about 10 days to 30–100 years, in addition to the familiar weather and climate regimes, there is an intermediate “macroweather” regime characterized by negative temporal fluctuation exponents: implying that fluctuations tend to cancel each other out so that averages tend to converge. We show theoretically and numerically that macroweather precipitation can be modeled by a stochastic weather-climate model (the Climate Extended Fractionally Integrated Flux model, CEFIF) first proposed for macroweather temperatures, and we show numerically that a four-parameter space-time CEFIF model can approximately reproduce eight or so empirical space-time exponents. In spite of this success, CEFIF is theoretically and numerically difficult to manage. We therefore propose a simplified stochastic model in which the temporal behavior is modeled as a fractional Gaussian noise but the spatial behaviour as a multifractal (climate) cascade: a spatial extension of the recently introduced ScaLIng Macroweather Model, SLIMM. Both the CEFIF and this spatial SLIMM model have a property often implicitly assumed by climatologists that climate statistics can be “homogenized” by normalizing them with the standard deviation of the anomalies. Physically, it means that the spatial macroweather variability corresponds to different climate zones that multiplicatively modulate the local, temporal statistics. This simplified macroweather model provides a framework for macroweather forecasting that exploits the system's long range memory and spatial correlations; for it, the forecasting problem has been solved. We test this factorization property and the model with the help of three centennial, global scale precipitation products that we analyze jointly in space and in time.
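The multiplicative space-time factorization described above, spatial climate zones modulating a common class of temporal noise, can be illustrated with a toy simulation. This is an invented minimal example (Gaussian noise instead of the CEFIF/SLIMM processes) showing why normalizing by the anomaly standard deviation "homogenizes" the statistics.

```python
import random

rng = random.Random(3)

# Toy factorization: the anomaly at site x is a fixed spatial amplitude
# w(x) multiplying temporal noise g(t) of a common type.
sites, steps = 6, 400
weights = [rng.uniform(0.5, 4.0) for _ in range(sites)]
series = [[w * rng.gauss(0.0, 1.0) for _ in range(steps)] for w in weights]

def sample_std(ts):
    m = sum(ts) / len(ts)
    return (sum((v - m) ** 2 for v in ts) / (len(ts) - 1)) ** 0.5

raw_stds = [sample_std(ts) for ts in series]        # scales with w(x)

def homogenize(ts):
    m, sd = sum(ts) / len(ts), sample_std(ts)
    return [(v - m) / sd for v in ts]

homogenized_stds = [sample_std(homogenize(ts)) for ts in series]
print([round(s, 3) for s in homogenized_stds])      # all 1.0
```

The raw standard deviations track the spatial weights, while after normalization every site has identical second-order statistics, which is the factorization property the paper tests on real precipitation products.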
Vaittinada Ayar, Pradeebane; Vrac, Mathieu; Bastin, Sophie; Carreau, Julie
2014-05-01
Statistical downscaling models (SDMs) now appear complementary to dynamical downscaling. Most state-of-the-art SDMs can be classified into the four following (sometimes overlapping) approaches: Transfer Functions, Weather Typing, Stochastic Weather Generators and Bias Correction. Here, we aim to perform an intercomparison exercise of several SDMs of precipitation at high resolution. These are tested with selected predictors from ERA-Interim reanalysis data over the EURO-CORDEX domain. The SDM intercomparison is performed via a cross-validation over the last 30 years. In this work, we focus on relevant indicators to assess the quality of the simulations compared to observations in terms of spatial, temporal and extreme-value properties. These indicators will allow us to characterize the uncertainties associated with the different simulations and point out their main weaknesses. Hence, this work will further help us to target the needed improvements of the existing models as well as provide statistically simulated time series to be compared to RCM outputs in the MED- and EURO-CORDEX framework. This work stands within the French ANR project ``Statistical Regionalization Models Intercomparisons and hydrological impacts Project'' (StaRMIP, 2013-2016).
A statistical framework for modeling HLA-dependent T cell response data.
Directory of Open Access Journals (Sweden)
Jennifer Listgarten
2007-10-01
Full Text Available The identification of T cell epitopes and their HLA (human leukocyte antigen) restrictions is important for applications such as the design of cellular vaccines for HIV. Traditional methods for such identification are costly and time-consuming. Recently, a more expeditious laboratory technique using ELISpot assays has been developed that allows for rapid screening of specific responses. However, this assay does not directly provide information concerning the HLA restriction of a response, a critical piece of information for vaccine design. Thus, we introduce, apply, and validate a statistical model for identifying HLA-restricted epitopes from ELISpot data. By looking at patterns across a broad range of donors, in conjunction with our statistical model, we can determine (probabilistically) which of the HLA alleles are likely to be responsible for the observed reactivities. Additionally, we can provide a good estimate of the number of false positives generated by our analysis (i.e., the false discovery rate). This model allows us to learn about new HLA-restricted epitopes from ELISpot data in an efficient, cost-effective, and high-throughput manner. We applied our approach to data from donors infected with HIV and identified many potential new HLA restrictions. Among 134 such predictions, six were confirmed in the lab and the remainder could not be ruled invalid. These results shed light on the extent of HLA class I promiscuity, which has significant implications for the understanding of HLA class I antigen presentation and vaccine development.
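The false discovery rate mentioned above is most commonly controlled with the Benjamini-Hochberg step-up procedure; the sketch below shows that standard procedure on invented p-values, as an illustration only, not the paper's bespoke model.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure: returns the indices of the
    hypotheses rejected while controlling the FDR at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= q * rank / m:
            k = rank                 # largest rank passing its threshold
    return sorted(order[:k])         # reject the k smallest p-values

# Invented p-values from eight hypothetical allele/epitope tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.74]
print(benjamini_hochberg(pvals, q=0.05))  # → [0, 1]
```

Note the step-up behavior: at q = 0.10 the hypothesis with p = 0.039 is rejected even though it fails its own rank-3 threshold, because a later rank passes.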
Dabanlı, İsmail; Şen, Zekai
2018-04-01
The statistical climate downscaling model by the Turkish Water Foundation (TWF) is further developed and applied to a set of monthly precipitation records. The model is structured in two phases: spatial (regional) and temporal downscaling of global circulation model (GCM) scenarios. The TWF model takes into consideration the regional dependence function (RDF) for the spatial structure and the Markov whitening process (MWP) for the temporal characteristics of the records to set projections. The impact of climate change on monthly precipitation is studied by downscaling Intergovernmental Panel on Climate Change-Special Report on Emission Scenarios (IPCC-SRES) A2 and B2 emission scenarios from the Max Planck Institute (EH40PYC) and the Hadley Center (HadCM3). The main purposes are to explain the TWF statistical climate downscaling model procedures and to present the validation tests, which are rated "very good" for all stations except one (Suhut) in the Akarcay basin, in the west-central part of Turkey. Even though the validation score is slightly lower at the Suhut station, the results are still "satisfactory." It is therefore possible to say that the TWF model has reasonably acceptable skill for highly accurate estimation with respect to the standard deviation ratio (SDR), Nash-Sutcliffe efficiency (NSE) and percent bias (PBIAS) criteria. Based on the validated model, precipitation predictions are generated from 2011 to 2100 by using the 30-year reference observation period (1981-2010). The precipitation arithmetic average and standard deviation have less than 5% error for the EH40PYC and HadCM3 SRES (A2 and B2) scenarios.
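Two of the validation criteria named above, NSE and PBIAS, have standard definitions that can be computed directly. A minimal sketch with invented monthly values (sign conventions for PBIAS vary between authors; here positive means over-simulation):

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency:
    1 - sum((o - s)^2) / sum((o - mean(o))^2); 1.0 is a perfect fit."""
    m = sum(obs) / len(obs)
    return 1.0 - (sum((o - s) ** 2 for o, s in zip(obs, sim))
                  / sum((o - m) ** 2 for o in obs))

def pbias(obs, sim):
    """Percent bias: 100 * sum(s - o) / sum(o)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)

# Invented monthly precipitation values (mm), not the paper's data.
obs = [40.0, 55.0, 61.0, 38.0, 72.0]
sim = [42.0, 50.0, 64.0, 40.0, 75.0]
print(round(nse(obs, sim), 3), round(pbias(obs, sim), 2))  # → 0.938 1.88
```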
Statistical modelling of spatio-temporal dependencies in NGS data
Ranciati, Saverio
2016-01-01
Next-generation sequencing (NGS) has quickly established itself as the current standard in genetic analysis. This shift from microarrays to NGS requires new statistical strategies to address the research questions. First, NGS data consist of discrete observations, usually
Regional temperature models are needed for characterizing and mapping stream thermal regimes, establishing reference conditions, predicting future impacts and identifying critical thermal refugia. Spatial statistical models have been developed to improve regression modeling techn...
Induction generator models in dynamic simulation tools
DEFF Research Database (Denmark)
Knudsen, Hans; Akhmatov, Vladislav
1999-01-01
For AC networks with a large amount of induction generators (windmills), the paper demonstrates a significant discrepancy in the simulated voltage recovery after fault in weak networks when comparing dynamic and transient stability descriptions, and the reasons for the discrepancies are explained. It is fo...... to a tunny generator through a shaft.
Central Limit Theorem for Exponentially Quasi-local Statistics of Spin Models on Cayley Graphs
Reddy, Tulasi Ram; Vadlamani, Sreekar; Yogeshwaran, D.
2018-04-01
Central limit theorems for linear statistics of lattice random fields (including spin models) are usually proven under suitable mixing conditions or quasi-associativity. Many interesting examples of spin models do not satisfy mixing conditions, and on the other hand, it does not seem easy to show central limit theorem for local statistics via quasi-associativity. In this work, we prove general central limit theorems for local statistics and exponentially quasi-local statistics of spin models on discrete Cayley graphs with polynomial growth. Further, we supplement these results by proving similar central limit theorems for random fields on discrete Cayley graphs taking values in a countable space, but under the stronger assumptions of α-mixing (for local statistics) and exponential α-mixing (for exponentially quasi-local statistics). All our central limit theorems assume a suitable variance lower bound like many others in the literature. We illustrate our general central limit theorem with specific examples of lattice spin models and statistics arising in computational topology, statistical physics and random networks. Examples of clustering spin models include quasi-associated spin models with fast decaying covariances like the off-critical Ising model, level sets of Gaussian random fields with fast decaying covariances like the massive Gaussian free field and determinantal point processes with fast decaying kernels. Examples of local statistics include intrinsic volumes, face counts, component counts of random cubical complexes while exponentially quasi-local statistics include nearest neighbour distances in spin models and Betti numbers of sub-critical random cubical complexes.
Oseloka Ezepue, Patrick; Ojo, Adegbola
2012-12-01
A challenging problem in some developing countries such as Nigeria is inadequate training of students in effective problem solving using the core concepts of their disciplines. Related to this is a disconnection between their learning and socio-economic development agenda of a country. These problems are more vivid in statistical education which is dominated by textbook examples and unbalanced assessment 'for' and 'of' learning within traditional curricula. The problems impede the achievement of socio-economic development objectives such as those stated in the Nigerian Vision 2020 blueprint and United Nations Millennium Development Goals. They also impoverish the ability of (statistics) graduates to creatively use their knowledge in relevant business and industry sectors, thereby exacerbating mass graduate unemployment in Nigeria and similar developing countries. This article uses a case study in statistical modelling to discuss the nature of innovations in statistics education vital to producing new kinds of graduates who can link their learning to national economic development goals, create wealth and alleviate poverty through (self) employment. Wider implications of the innovations for repositioning mathematical sciences education globally are explored in this article.
Statistical Ensembles With Finite Bath: A Description for an Event Generator
Hauer, M.
2009-01-01
A Monte Carlo event generator has been developed assuming thermal production of hadrons. The system under consideration is sampled grand canonically in the Boltzmann approximation. A re-weighting scheme is then introduced to account for conservation of charges (baryon number, strangeness, electric charge) and energy and momentum, effectively allowing for extrapolation of grand canonical results to the microcanonical limit. This method has two strong advantages compared to analytical approaches and standard microcanonical Monte Carlo techniques, in that it is capable of handling resonance decays as well as (very) large system sizes.
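The re-weighting idea above, sampling grand canonically and then weighting events so that only configurations respecting the conserved charges contribute, can be sketched with a single conserved charge. This is an invented toy (independent Poisson multiplicities, one charge), not the generator's actual hadron sampling.

```python
import math
import random

rng = random.Random(5)

def poisson(lam):
    # Knuth's method; adequate for small means.
    l, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= l:
            return k
        k += 1

# Grand-canonical toy: independent multiplicities for positively and
# negatively charged particles; re-weight events so that only the
# charge-conserving subset (Q = n_plus - n_minus = 0) contributes.
events = []
for _ in range(20000):
    n_plus, n_minus = poisson(2.0), poisson(2.0)
    weight = 1.0 if n_plus == n_minus else 0.0
    events.append((n_plus + n_minus, weight))

mean_gc = sum(n for n, _ in events) / len(events)
wsum = sum(w for _, w in events)
mean_conserved = sum(n * w for n, w in events) / wsum
print(round(mean_gc, 2), round(mean_conserved, 2))
```

Exact charge conservation suppresses the mean multiplicity relative to the grand-canonical value, the canonical suppression the re-weighting scheme is designed to capture (smooth weights rather than a hard 0/1 cut would be used in practice).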
A Statistical Model for Natural Gas Standardized Load Profiles
Czech Academy of Sciences Publication Activity Database
Brabec, Marek; Konár, Ondřej; Malý, Marek; Pelikán, Emil; Vondráček, Jiří
2009-01-01
Roč. 58, č. 1 (2009), s. 123-139 ISSN 0035-9254 R&D Projects: GA AV ČR 1ET400300513 Institutional research plan: CEZ:AV0Z10300504 Keywords : disaggregation * generalized additive models * multiplicative model * non-linear effects * segmentation * semiparametric regression model Subject RIV: JE - Non-nuclear Energetics, Energy Consumption ; Use Impact factor: 1.060, year: 2009
Carrier Statistics and Quantum Capacitance Models of Graphene Nanoscroll
Directory of Open Access Journals (Sweden)
M. Khaledian
2014-01-01
schematic perfect scroll-like Archimedes spiral. The DOS model was derived first and then applied to compute the carrier concentration and quantum capacitance models. Furthermore, the carrier concentration and quantum capacitance were modelled for both the degenerate and nondegenerate regimes, along with an examination of the effect of structural parameters and chirality number on the density of states and carrier concentration. Finally, the effect of temperature on the quantum capacitance was also studied.
Role of scaling in the statistical modelling of finance
Indian Academy of Sciences (India)
Modelling the evolution of a financial index as a stochastic process is a problem awaiting a full, satisfactory solution since it was first formulated by Bachelier in 1900. Here it is shown that the scaling with time of the return probability density function sampled from the historical series suggests a successful model.
Statistical shape model with random walks for inner ear segmentation
DEFF Research Database (Denmark)
Pujadas, Esmeralda Ruiz; Kjer, Hans Martin; Piella, Gemma
2016-01-01
Cochlear implants can restore hearing to completely or partially deaf patients. The intervention planning can be aided by providing a patient-specific model of the inner ear. Such a model has to be built from high resolution images with accurate segmentations. Thus, a precise segmentation is requ...
Recent advances in importance sampling for statistical model checking
Reijsbergen, D.P.; de Boer, Pieter-Tjerk; Scheinhardt, Willem R.W.; Haverkort, Boudewijn R.H.M.
2013-01-01
In the following work we present an overview of recent advances in rare event simulation for model checking made at the University of Twente. The overview is divided into the several model classes for which we propose algorithms, namely multicomponent systems, Markov chains and stochastic Petri nets.
Statistical model of stress corrosion cracking based on extended
Indian Academy of Sciences (India)
2013-12-01
The mechanism of stress corrosion cracking (SCC) has been discussed for decades. Here I propose a model of SCC that reflects the brittle character of the fracture, based on the variational principle under an approximately assumed thermal equilibrium. In that model the functionals are expressed with extended forms of ...
Study on Semi-Parametric Statistical Model of Safety Monitoring of Cracks in Concrete Dams
Directory of Open Access Journals (Sweden)
Chongshi Gu
2013-01-01
Cracks are one of the hidden dangers in concrete dams, and building safety monitoring models for them has always been difficult. Starting from the parametric statistical model of safety monitoring of cracks in concrete dams, drawing on semi-parametric statistical theory, and accounting for the abnormal behaviours of these cracks, a semi-parametric statistical model of safety monitoring of concrete dam cracks is established to overcome the limitations of the parametric model in expressing the objective model. Completed projects show that the semi-parametric statistical model fits the data more closely and explains cracks in concrete dams better than the parametric statistical model, although its forecasting capability is equivalent to that of the parametric model. The semi-parametric model is simple to build, rests on a sound principle and is highly practical, with good prospects for application in real projects.
Reflections on the Baron and Kenny model of statistical mediation
Directory of Open Access Journals (Sweden)
Antonio Pardo
2013-05-01
In the 25 years since Baron and Kenny (1986) published their ideas on how to analyze and interpret statistical mediation, few works have been cited more often or have influenced more decisively the way applied researchers understand and analyze mediation in the social and health sciences. However, the widespread use of a procedure does not necessarily make it a safe or reliable strategy. In fact, during these years, many researchers have pointed out the limitations of the procedure Baron and Kenny proposed for demonstrating mediation. The twofold aim of this paper is to (1) review the limitations of the Baron and Kenny method, with particular attention to the weakness in the confirmatory logic of the procedure, and (2) provide an empirical example showing that, in applying the method, data obtained from the same theoretical scenario (i.e., with or without the presence of mediation) can be compatible with both the mediation and no-mediation hypotheses.
Statistical modelling of Poisson/log-normal data
International Nuclear Information System (INIS)
Miller, G.
2007-01-01
In statistical data fitting, self-consistency is checked by examining the closeness of the quantity χ²/NDF to 1, where χ² is the sum of squares of (data minus fit) divided by the standard deviation, and NDF is the number of data points minus the number of fit parameters. In order to calculate χ² one needs an expression for the standard deviation. In this note several alternative expressions for the standard deviation of data distributed according to a Poisson/log-normal distribution are proposed and evaluated by Monte Carlo simulation. Two preferred alternatives are identified. The use of replicate data to obtain the uncertainty is problematic for a small number of replicates; a method to correct this problem is proposed. The log-normal approximation is good for sufficiently positive data. A modification of the log-normal approximation is proposed which allows it to be used to test the hypothesis that the true value is zero. (authors)
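The χ²/NDF criterion can be checked with a minimal Monte Carlo sketch. This uses plain Poisson data rather than Miller's Poisson/log-normal distribution, and the "true" fit values are invented: with the correct standard deviation the mean χ²/NDF sits near 1, while an understated standard deviation inflates it.

```python
import math
import random

random.seed(2)

def poisson(lam):
    # Knuth's method, stdlib only
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

MU = [5.0, 12.0, 30.0, 7.5, 20.0]  # assumed "true" fitted values
NDF = len(MU)                       # no free parameters in this toy
TRIALS = 4000

def mean_chi2_ndf(sd_of):
    # average chi2/NDF over many simulated data sets, for a given sd formula
    total = 0.0
    for _ in range(TRIALS):
        chi2 = sum((poisson(m) - m) ** 2 / sd_of(m) ** 2 for m in MU)
        total += chi2 / NDF
    return total / TRIALS

good = mean_chi2_ndf(lambda m: math.sqrt(m))        # correct Poisson sd
bad = mean_chi2_ndf(lambda m: 0.5 * math.sqrt(m))   # understated sd inflates chi2
print(good, bad)
```

The note's point is exactly this sensitivity: a wrong standard-deviation expression shifts χ²/NDF away from 1 and breaks the self-consistency check.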
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Anghelache
2006-01-01
Using short-run statistical indicators is a compulsory requirement of current analysis. To this end, EUROSTAT has set up a system of short-run indicators recommended for use by the member countries. On the basis of these indicators, regular (usually monthly) analyses are carried out regarding: the dynamics of production; the volume of short-run investment; the development of turnover; wage evolution; employment; price indexes and the consumer price index (inflation); and the volume of exports and imports, the extent to which imports are covered by exports, and the balance of trade. The EUROSTAT system of conjuncture indicators is conceived as an open system, so that it can at any moment be extended or restricted, allowing indicators to be amended or removed depending on the requirements of domestic users as well as the specific requirements of harmonization and integration. For short-run analysis there is also the World Bank system of conjuncture indicators, which relies on data sources offered by the World Bank, the World Resources Institute and the statistics of other international organizations. That system comprises indicators of social and economic development and focuses on three fields: human resources, environment and economic performance. The paper ends with a case study on the situation of Romania, for which all these indicators were used.
Statistical Texture Model for mass Detection in Mammography
Directory of Open Access Journals (Sweden)
Nicolás Gallego-Ortiz
2013-12-01
In the context of image processing algorithms for mass detection in mammography, texture is a key feature for distinguishing abnormal from normal tissue. Recently, a texture model based on a multivariate Gaussian mixture was proposed, whose parameters are learned in an unsupervised way from the pixel intensities of images. The model produces images that are probabilistic maps of texture normality, and it was proposed as a visualization aid for diagnosis by clinical experts. In this paper, the usability of the model for automatic mass detection is studied. A segmentation strategy is proposed and evaluated using 79 mammography cases.
Accelerated testing statistical models, test plans, and data analysis
Nelson, Wayne B
2009-01-01
The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. ". . . a goldmine of knowledge on accelerated life testing principles and practices . . . one of the very few capable of advancing the science of reliability. It definitely belongs in every bookshelf on engineering." -Dev G.
Computational modeling of neural activities for statistical inference
Kolossa, Antonio
2016-01-01
This authored monograph supplies empirical evidence for the Bayesian brain hypothesis by modeling event-related potentials (ERP) of the human electroencephalogram (EEG) during successive trials in cognitive tasks. The employed observer models are useful to compute probability distributions over observable events and hidden states, depending on which are present in the respective tasks. Bayesian model selection is then used to choose the model which best explains the ERP amplitude fluctuations. Thus, this book constitutes a decisive step towards a better understanding of the neural coding and computing of probabilities following Bayesian rules. The target audience primarily comprises research experts in the field of computational neurosciences, but the book may also be beneficial for graduate students who want to specialize in this field.
A Statistical Model of Current Loops and Magnetic Monopoles
International Nuclear Information System (INIS)
Ayyer, Arvind
2015-01-01
We formulate a natural model of loops and isolated vertices for arbitrary planar graphs, which we call the monopole-dimer model. We show that the partition function of this model can be expressed as a determinant. We then extend the method of Kasteleyn and Temperley-Fisher to calculate the partition function exactly in the case of rectangular grids. This partition function turns out to be a square of a polynomial with positive integer coefficients when the grid lengths are even. Finally, we analyse this formula in the infinite volume limit and show that the local monopole density, free energy and entropy can be expressed in terms of well-known elliptic functions. Our technique is a novel determinantal formula for the partition function of a model of isolated vertices and loops for arbitrary graphs.
Directory of Open Access Journals (Sweden)
Dongkyun Kim
2014-01-01
A novel approach to a Poisson cluster stochastic rainfall generator was validated in its ability to reproduce important rainfall and watershed response characteristics at 104 locations in the United States. The suggested novel approach, the Hybrid Model (THM), as compared to traditional Poisson cluster rainfall modeling approaches, has the additional capability of accounting for the interannual variability of rainfall statistics. THM and a traditional Poisson cluster rainfall model (the modified Bartlett-Lewis rectangular pulse model) were compared in their ability to reproduce the characteristics of extreme rainfall and watershed response variables such as runoff and peak flow. The results of the comparison indicate that THM generally outperforms the traditional approach in reproducing the distributions of peak rainfall, peak flow, and runoff volume. In addition, THM significantly outperformed the traditional approach in reproducing extreme rainfall by 2.3% to 66% and extreme flow values by 32% to 71%.
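The basic mechanics of a Poisson cluster rainfall generator can be sketched in a few lines, far simpler than the Bartlett-Lewis model or THM: storms arrive as a Poisson process, each storm carries a random number of cells, and each cell contributes an exponentially distributed depth. All rates below are invented for illustration; THM's extra ingredient, redrawing parameters each year to capture interannual variability, is omitted.

```python
import math
import random

random.seed(3)

# assumed toy parameters (not fitted to any station)
STORM_RATE = 0.02       # storms per hour (Poisson arrivals)
CELLS_PER_STORM = 4     # mean number of cells per storm
CELL_DEPTH = 2.0        # mean rainfall depth per cell (mm)
HOURS = 24 * 365        # one year

def poisson(lam):
    # Knuth's method, stdlib only
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def simulate_annual_depth():
    t, total = 0.0, 0.0
    while True:
        t += random.expovariate(STORM_RATE)      # next storm arrival
        if t > HOURS:
            return total
        n_cells = 1 + poisson(CELLS_PER_STORM - 1)  # at least one cell per storm
        for _ in range(n_cells):
            total += random.expovariate(1.0 / CELL_DEPTH)  # exponential cell depth

depths = [simulate_annual_depth() for _ in range(200)]
mean_depth = sum(depths) / len(depths)
print(mean_depth)  # expectation: STORM_RATE * HOURS * CELLS_PER_STORM * CELL_DEPTH
```

The expected annual total is simply the product of storm rate, duration, mean cells per storm, and mean cell depth; the simulated mean should land close to it.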
Statistical model based gender prediction for targeted NGS clinical panels
Directory of Open Access Journals (Sweden)
Palani Kannan Kandavel
2017-12-01
A reference test dataset is used to test the model. The sensitivity of gender prediction is improved over the current approach based on genotype composition in ChrX. In addition, the prediction score given by the model can be used to evaluate the quality of a clinical dataset: a higher prediction score towards the respective gender indicates higher quality of the sequenced data.
Illness-death model: statistical perspective and differential equations.
Brinks, Ralph; Hoyer, Annika
2018-01-27
The aim of this work is to relate the theory of stochastic processes with the differential equations associated with multistate (compartment) models. We show that the Kolmogorov Forward Differential Equations can be used to derive a relation between the prevalence and the transition rates in the illness-death model. Then, we prove mathematical well-definedness and epidemiological meaningfulness of the prevalence of the disease. As an application, we derive the incidence of diabetes from a series of cross-sections.
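The relation between prevalence and transition rates can be sketched numerically. For the illness-death model with incidence i, mortality m0 (healthy) and m1 (diseased), the prevalence p obeys p' = (1 - p)(i - p(m1 - m0)). The snippet below, a sketch assuming constant rates (invented values), integrates this ODE with Euler steps alongside the underlying two-compartment system and checks that the two prevalences agree.

```python
# assumed constant rates (per year) for the sketch
I, M0, M1 = 0.02, 0.01, 0.03   # incidence, mortality without/with disease
DT, YEARS = 0.01, 50.0

# 1) compartment ODEs: susceptible S, diseased C
S, C = 1.0, 0.0
# 2) prevalence ODE: p' = (1 - p) * (I - p * (M1 - M0))
p = 0.0
for _ in range(int(YEARS / DT)):
    dS = -(I + M0) * S             # leave S by incidence or death
    dC = I * S - M1 * C            # enter C by incidence, leave by death
    S += DT * dS
    C += DT * dC
    p += DT * (1.0 - p) * (I - p * (M1 - M0))

prev_from_compartments = C / (S + C)
print(prev_from_compartments, p)   # the two routes to prevalence coincide
```

This mirrors the paper's application: given cross-sectional prevalence and mortality data, the same ODE can be inverted to recover the incidence.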
Statistical geological discrete fracture network model. Forsmark modelling stage 2.2
International Nuclear Information System (INIS)
Fox, Aaron; La Pointe, Paul; Simeonov, Assen; Hermanson, Jan; Oehman, Johan
2007-11-01
The Swedish Nuclear Fuel and Waste Management Company (SKB) is performing site characterization at two different locations, Forsmark and Laxemar, in order to locate a site for a final geologic repository for spent nuclear fuel. The program is built upon the development of Site Descriptive Models (SDMs) at specific timed data freezes. Each SDM is formed from discipline-specific reports from across the scientific spectrum. This report describes the methods, analyses, and conclusions of the geological modeling team with respect to a geological and statistical model of fractures and minor deformation zones (henceforth referred to as the geological DFN), version 2.2, at the Forsmark site. The geological DFN builds upon the work of other geological modelers, including the deformation zone (DZ), rock domain (RD), and fracture domain (FD) models. The geological DFN is a statistical model for stochastically simulating rock fractures and minor deformation zones at scales of less than 1,000 m (the lower cut-off of the DZ models). The geological DFN is valid within four specific fracture domains inside the local model region encompassing the candidate volume at Forsmark: FFM01, FFM02, FFM03, and FFM06. The models are built using data from detailed surface outcrop maps and the cored borehole record at Forsmark. The conceptual model for the Forsmark 2.2 geological DFN revolves around the concept of orientation sets; for each fracture domain, other model parameters such as size and intensity are tied to the orientation sets. Two classes of orientation sets were described: Global sets, which are encountered everywhere in the model region, and Local sets, which represent highly localized stress environments. Orientation sets were described in terms of their general cardinal direction (NE, NW, etc.). Two alternatives are presented for fracture size modeling: - the tectonic continuum approach (TCM, TCMF) described by coupled size-intensity scaling following power law distributions
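One ingredient of such a DFN, power-law fracture sizes coupled to an intensity target, can be sketched by inverse-CDF sampling of a Pareto distribution and generating fractures until a target P32 (total fracture area per unit volume) is reached. The parameter values below are illustrative, not the Forsmark-fitted ones.

```python
import math
import random

random.seed(6)

# assumed illustrative parameters (not the Forsmark-fitted values)
R0 = 0.5         # minimum fracture radius (m)
KR = 2.6         # power-law (Pareto) shape exponent for radii
P32 = 2.0        # target fracture intensity (m^2 of fracture per m^3 of rock)
VOLUME = 1000.0  # simulation volume (m^3)

def sample_radius():
    # inverse-CDF sampling of a Pareto tail: P(R > r) = (R0 / r) ** KR
    return R0 * random.random() ** (-1.0 / KR)

# generate circular fractures until the target total area is reached
fractures, area = [], 0.0
while area < P32 * VOLUME:
    r = sample_radius()
    area += math.pi * r * r
    fractures.append(r)

mean_r = sum(fractures) / len(fractures)
print(len(fractures), mean_r)  # mean radius near KR * R0 / (KR - 1)
```

A full DFN would additionally draw an orientation for each fracture from its (Global or Local) orientation set and place fracture centres in space, but the size-intensity coupling above is the core of the power-law alternative.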
Modelling West African Total Precipitation Depth: A Statistical Approach
Directory of Open Access Journals (Sweden)
S. Sovoe
2015-09-01
Even though several reports over the past few decades indicate increasing aridity over West Africa, attempts to establish the controlling factor(s) have not been successful. The traditional belief in the position of the Inter-tropical Convergence Zone (ITCZ) as the predominant factor over the region has been refuted by recent findings. Changes in major atmospheric circulations such as the African Easterly Jet (AEJ) and the Tropical Easterly Jet (TEJ) are being cited as major precipitation driving forces over the region. Thus, any attempt to predict long-term precipitation events over the region using global circulation or local circulation models could be flawed, as the controlling factors are not yet fully elucidated. Successful prediction may require models which depend on past events as their inputs, as in the case of time series models such as the Autoregressive Integrated Moving Average (ARIMA) model. In this study, historical precipitation data were imported as a time series data structure into the R programming language and used to build an appropriate seasonal multiplicative Autoregressive Integrated Moving Average model, ARIMA(p, d, q)×(P, D, Q). The model was then used to predict long-term precipitation events over the Ghanaian segment of the Volta Basin, which could be used in the planning and implementation of development policies.
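The study fits a full seasonal ARIMA model in R; the sketch below hand-rolls just two of its core steps, seasonal differencing (the D = 1, period-12 part) and a least-squares AR(1) fit on the differenced series, on synthetic monthly data. It is a minimal stand-in, not the paper's fitted model, and the synthetic series parameters are invented.

```python
import math
import random

random.seed(4)

# synthetic monthly "precipitation": annual cycle plus AR(1) noise
S = 12                  # seasonal period (months)
N = 240                 # 20 years of monthly data
series, z = [], 0.0
for t in range(N):
    z = 0.5 * z + random.gauss(0.0, 1.0)                   # AR(1) noise
    series.append(50.0 + 30.0 * math.sin(2.0 * math.pi * t / S) + z)

# seasonal differencing removes the annual cycle: d_t = x_t - x_{t-12}
d = [series[t] - series[t - S] for t in range(S, N)]

# AR(1) coefficient on the differenced series via least squares
num = sum(d[t] * d[t - 1] for t in range(1, len(d)))
den = sum(d[t - 1] ** 2 for t in range(1, len(d)))
phi_hat = num / den

# one-step forecast: predict the difference, then undo the seasonal differencing
forecast = series[N - S] + phi_hat * d[-1]
print(phi_hat, forecast)
```

In R the whole pipeline collapses to a call such as `arima(x, order = c(1, 0, 0), seasonal = list(order = c(0, 1, 0), period = 12))`, with orders chosen by the usual identification diagnostics.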
Simulink Code Generation: Tutorial for Generating C Code from Simulink Models using Simulink Coder
MolinaFraticelli, Jose Carlos
2012-01-01
This document explains all the necessary steps in order to generate optimized C code from Simulink Models. This document also covers some general information on good programming practices, selection of variable types, how to organize models and subsystems, and finally how to test the generated C code and compare it with data from MATLAB.
Craniofacial statistical deformation models of wild-type mice and Crouzon mice
Ólafsdóttir, Hildur; Darvann, Tron A.; Ersbøll, Bjarne K.; Hermann, Nuno V.; Oubel, Estanislao; Larsen, Rasmus; Frangi, Alejandro F.; Larsen, Per; Perlyn, Chad A.; Morriss-Kay, Gillian M.; Kreiborg, Sven
2007-03-01
Crouzon syndrome is characterised by premature fusion of cranial sutures and synchondroses leading to craniofacial growth disturbances. The gene causing the syndrome was discovered approximately a decade ago and recently the first mouse model of the syndrome was generated. In this study, a set of Micro CT scans of the heads of wild-type (normal) mice and Crouzon mice were investigated. Statistical deformation models were built to assess the anatomical differences between the groups, as well as the within-group anatomical variation. Following the approach by Rueckert et al. we built an atlas using B-spline-based nonrigid registration and subsequently, the atlas was nonrigidly registered to the cases being modelled. The parameters of these registrations were then used as input to a PCA. Using different sets of registration parameters, different models were constructed to describe (i) the difference between the two groups in anatomical variation and (ii) the within-group variation. These models confirmed many known traits in the wild-type and Crouzon mouse craniofacial anatomy. However, they also showed some new traits.
Directory of Open Access Journals (Sweden)
Gary L. Brase
2017-11-01
Cybersecurity research often describes people as understanding internet security in terms of metaphorical mental models (e.g., disease risk, physical security risk, or criminal behavior risk). However, little research has directly evaluated whether this is an accurate or productive framework. To assess this question, two experiments asked participants to respond to a statistical reasoning task framed in one of four different contexts (cybersecurity, plus the three alternative models above). Each context was also presented using either percentages or natural frequencies, and these tasks were followed by a behavioral likelihood rating. As in previous research, consistent use of natural frequencies promoted correct Bayesian reasoning. There was little indication, however, that any of the alternative mental models generated consistently better understanding or reasoning than the actual cybersecurity context. There was some evidence that the different models had some effect on patterns of responses, including the behavioral likelihood ratings, but these effects were small compared to the effect of the numerical format manipulation. This points to a need to improve the content of actual internet security warnings, rather than working to change the models users have of warnings.
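The natural-frequency effect rests on the fact that both formats encode the same Bayesian computation; frequencies simply make the false-positive count visible. A sketch with invented numbers (not the study's actual task):

```python
from fractions import Fraction

# assumed illustrative numbers: 1% of machines are compromised; a scanner
# flags 80% of compromised machines and 9.6% of clean ones.
base = Fraction(1, 100)
hit = Fraction(80, 100)
false_alarm = Fraction(96, 1000)

# percentage (probability) format: Bayes' rule applied directly
posterior = (hit * base) / (hit * base + false_alarm * (1 - base))

# natural-frequency format: imagine 10,000 machines and count cases
n = 10_000
compromised_flagged = n * base * hit            # 80 machines
clean_flagged = n * (1 - base) * false_alarm    # 950.4 machines
posterior_nf = compromised_flagged / (compromised_flagged + clean_flagged)

print(float(posterior))  # most alerts are false positives
assert posterior == posterior_nf   # identical answer, different framing
```

The counting framing makes it obvious that flagged-but-clean machines vastly outnumber flagged-and-compromised ones, which is exactly the step participants tend to miss in the percentage format.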
Directory of Open Access Journals (Sweden)
B. Azzouz
2007-01-01
A textile fibre mixture, as a multicomponent blend of variable fibres, raises the question of the proper method for predicting the characteristics of the final blend. The length diagram and the fibrogram of cotton are generated. Then the length distribution, the length diagram, and the fibrogram of a blend of different categories of cotton are determined. The length distributions by weight of five different categories of cotton (Egyptian, USA (Pima), Brazilian, USA (Upland), and Uzbekistani) are measured by AFIS. From these distributions, the length distribution, the length diagram, and the fibrogram by weight of four binary blends are expressed. The length parameters of these cotton blends are calculated and their variations are plotted against the mass fraction x of one component in the blend. These calculated parameters are compared to those of real blends. Finally, the selection of optimal blends using the linear programming method, based on the hypothesis that the cotton blend parameters vary linearly as a function of the component ratios, is shown to be insufficient.
Peters, J. M.; Kravtsov, S.
2011-12-01
This study quantifies the dependence of nonlinear regimes (manifested in non-Gaussian probability distributions) and spreads of ensemble trajectories in a reduced phase space of a realistic three-layer quasi-geostrophic (QG3) atmospheric model on this model's climate state. To elucidate probabilistic properties of the QG3 trajectories, we compute, in phase planes of leading EOFs of the model, the coefficients of the corresponding Fokker-Planck (FP) equations. These coefficients represent drift vectors (computed from one-day phase space tendencies) and diffusion tensors (computed from one-day lagged covariance matrices of model trajectory displacements), and are based on a long QG3 simulation. We also fit two statistical trajectory models to the reduced phase-space time series spanned by the full QG3 model states. One reduced model is a standard Linear Inverse Model (LIM) fitted to a long QG3 time series. The LIM model is forced by state-independent (additive) noise and has a deterministic operator which represents a non-divergent velocity field in the reduced phase space considered. The other, more advanced model (NSM) is nonlinear, divergent, and driven by state-dependent noise. The NSM model mimics well the full QG3 model trajectory behavior in the reduced phase space; its corresponding FP model is nearly identical to that based on the full QG3 simulations. By systematic analysis of the differences between the drift vectors and diffusion tensors of the QG3-based, NSM-based, and LIM-based FP models, as well as the PDF evolution simulated by these FP models, we disentangle the contributions of the multiplicative noise and deterministic dynamics to the nonlinear behavior and predictability of the atmospheric states produced by the dynamical QG3 model.
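The FP coefficient estimation described here, drift from short-lag conditional mean tendencies and diffusion from conditional variances of displacements, can be illustrated on a one-dimensional Ornstein-Uhlenbeck surrogate (all parameters invented): bin a long trajectory by state and estimate the conditional moments of the increments in each bin.

```python
import math
import random

random.seed(5)

A, SIG, DT = 1.0, 0.8, 0.01   # assumed OU drift rate, noise amplitude, time step
N = 400_000

# simulate a long trajectory (stand-in for the model's reduced phase space)
xs = [0.0]
for _ in range(N):
    x = xs[-1]
    xs.append(x - A * x * DT + SIG * math.sqrt(DT) * random.gauss(0.0, 1.0))

# bin increments by current state to estimate FP drift and diffusion
bins = {}
for t in range(N):
    b = round(xs[t], 1)                      # 0.1-wide state bins
    bins.setdefault(b, []).append(xs[t + 1] - xs[t])

def estimates(b):
    inc = bins[b]
    m = sum(inc) / len(inc)                  # conditional mean increment
    v = sum((d - m) ** 2 for d in inc) / len(inc)  # conditional variance
    return m / DT, v / DT                    # drift b(x), diffusion D(x)

drift_at_half, diff_at_half = estimates(0.5)
print(drift_at_half, diff_at_half)  # expect ~ -A * 0.5 and ~ SIG ** 2
```

In the paper the same construction runs in EOF phase planes, so the drift is a vector field and the diffusion a tensor, and state dependence of the estimated diffusion is what diagnoses multiplicative noise.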
Physics-based statistical model and simulation method of RF propagation in urban environments
Pao, Hsueh-Yuan; Dvorak, Steven L.
2010-09-14
A physics-based statistical model and simulation/modeling method and system of electromagnetic wave propagation (wireless communication) in urban environments. In particular, the model is a computationally efficient closed-form parametric model of RF propagation in an urban environment which is extracted from a physics-based statistical wireless channel simulation method and system. The simulation divides the complex urban environment into a network of interconnected urban canyon waveguides which can be analyzed individually; calculates spectral coefficients of modal fields in the waveguides excited by the propagation using a database of statistical impedance boundary conditions which incorporates the complexity of building walls in the propagation model; determines statistical parameters of the calculated modal fields; and determines a parametric propagation model based on the statistical parameters of the calculated modal fields, from which predictions of communications capability may be made.
Workflow Generation from the Two-Hemisphere Model
Directory of Open Access Journals (Sweden)
Gusarovs Konstantīns
2017-12-01
Model-Driven Software Development (MDSD) is a trend in software development that focuses on code generation from various kinds of models. To perform such a task, it is necessary to develop an algorithm that transforms the source model into the target model, which ideally is actual software code written in some programming language. At present, however, many methods focus on code generation from Unified Modelling Language (UML) diagrams. The present paper describes the results of the authors' research on Two-Hemisphere Model (2HM) processing for easier code generation.
Sharing brain mapping statistical results with the neuroimaging data model
Maumet, Camille; Auer, Tibor; Bowring, Alexander; Chen, Gang; Das, Samir; Flandin, Guillaume; Ghosh, Satrajit; Glatard, Tristan; Gorgolewski, Krzysztof J.; Helmer, Karl G.; Jenkinson, Mark; Keator, David B.; Nichols, B. Nolan; Poline, Jean-Baptiste; Reynolds, Richard; Sochat, Vanessa; Turner, Jessica; Nichols, Thomas E.
2016-01-01
Only a tiny fraction of the data and metadata produced by an fMRI study is finally conveyed to the community. This lack of transparency not only hinders the reproducibility of neuroimaging results but also impairs future meta-analyses. In this work we introduce NIDM-Results, a format specification providing a machine-readable description of neuroimaging statistical results along with key image data summarising the experiment. NIDM-Results provides a unified representation of mass univariate analyses including a level of detail consistent with available best practices. This standardized representation allows authors to relay methods and results in a platform-independent regularized format that is not tied to a particular neuroimaging software package. Tools are available to export NIDM-Results graphs and associated files from the widely used SPM and FSL software packages, and the NeuroVault repository can import NIDM-Results archives. The specification is publicly available at: http://nidm.nidash.org/specs/nidm-results.html. PMID:27922621
From intuition to statistics in building subsurface structural models
Brandenburg, J.P.; Alpak, F.O.; Naruk, S.; Solum, J.
2011-01-01
Experts associated with the oil and gas exploration industry suggest that combining forward trishear models with stochastic global optimization algorithms allows a quantitative assessment of the uncertainty associated with a given structural model. The methodology is applied to incompletely imaged structures related to deepwater hydrocarbon reservoirs, and the results are compared to prior manual palinspastic restorations and borehole data. The methodology is also useful for extending structural interpretations into other areas of limited resolution, such as subsalt regions, and for extrapolating existing data into seismic data gaps. The technique can be used for rapid reservoir appraisal and may also have applications in seismic processing, well planning, and borehole stability analysis.
Monitoring and statistical modelling of sedimentation in gully pots
Post, J.A.B.; Pothof, I.W.M.; Dirksen, J.; Baars, E. J.; Langeveld, J.G.; Clemens, F.H.L.R.
2016-01-01
Gully pots are essential assets designed to relieve the downstream system by trapping solids and attached pollutants suspended in runoff. This study applied a methodology to develop a quantitative gully pot sedimentation and blockage model. To this end, sediment bed level time series from 300
Uncertainty analysis in statistical modeling of extreme hydrological events
Xu, YuePing; Booij, Martijn J.; Tong, Yang-Bin
2010-01-01
With the increase of both magnitude and frequency of hydrological extreme events such as drought and flooding, the significance of adequately modeling hydrological extreme events is fully recognized. Estimation of extreme rainfall/flood for various return periods is of prime importance for
Statistical Modelling of Fishing Activities in the North Atlantic
Fernández, C.; Ley, E.; Steel, M.F.J.
1997-01-01
This paper deals with the issue of modeling daily catches of fishing boats in the Grand Bank fishing grounds. We have data on catches per species for a number of vessels collected by the European Union in the context of the North Atlantic Fisheries Organization. Many variables can be thought to
A simple statistical signal loss model for deep underground garage
DEFF Research Database (Denmark)
Nguyen, Huan Cong; Gimenez, Lucas Chavarria; Kovacs, Istvan
2016-01-01
In this paper we address the channel modeling aspects for a deep-indoor scenario with extreme coverage conditions in terms of signal losses, namely underground garage areas. We provide an in-depth analysis in terms of path loss (gain) and large-scale signal shadowing, and we propose a simple...
Top quark event modelling and generators
Rahmat, Rahmat
2016-01-01
State-of-the-art theoretical predictions, accurate to next-to-leading order in QCD and interfaced with the Pythia8 and Herwig++ event generators, are tested by comparison with the unfolded ttbar differential data collected with the CMS detector at 8 TeV. These predictions are also compared with the underlying event activity distributions in ttbar events, using CMS proton-proton data collected in 2015 at a center-of-mass energy of 13 TeV.