WorldWideScience

Sample records for models statistically significant

  1. Strategies for Testing Statistical and Practical Significance in Detecting DIF with Logistic Regression Models

    Science.gov (United States)

    Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza

    2014-01-01

    This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…

  2. Statistically significant relational data mining

    Energy Technology Data Exchange (ETDEWEB)

    Berry, Jonathan W.; Leung, Vitus Joseph; Phillips, Cynthia Ann; Pinar, Ali; Robinson, David Gerald; Berger-Wolf, Tanya; Bhowmick, Sanjukta; Casleton, Emily; Kaiser, Mark; Nordman, Daniel J.; Wilson, Alyson G.

    2014-02-01

    This report summarizes the work performed under the project "Statistically significant relational data mining." The goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concentrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second is a set of statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.

  3. Statistical Significance for Hierarchical Clustering

    Science.gov (United States)

    Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.

    2017-01-01

    Summary Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990

  4. Statistical significance versus clinical relevance.

    Science.gov (United States)

    van Rijn, Marieke H C; Bech, Anneke; Bouyer, Jean; van den Brand, Jan A J G

    2017-04-01

    In March this year, the American Statistical Association (ASA) posted a statement on the correct use of P-values, in response to a growing concern that the P-value is commonly misused and misinterpreted. We aim to translate these warnings given by the ASA into a language more easily understood by clinicians and researchers without a deep background in statistics. Moreover, we intend to illustrate the limitations of P-values, even when used and interpreted correctly, and bring more attention to the clinical relevance of study findings using two recently reported studies as examples. We argue that P-values are often misinterpreted. A common mistake is saying that P < 0.05 means that the null hypothesis is false, and P ≥0.05 means that the null hypothesis is true. The correct interpretation of a P-value of 0.05 is that if the null hypothesis were indeed true, a similar or more extreme result would occur 5% of the times upon repeating the study in a similar sample. In other words, the P-value informs about the likelihood of the data given the null hypothesis and not the other way around. A possible alternative related to the P-value is the confidence interval (CI). It provides more information on the magnitude of an effect and the imprecision with which that effect was estimated. However, there is no magic bullet to replace P-values and stop erroneous interpretation of scientific results. Scientists and readers alike should make themselves familiar with the correct, nuanced interpretation of statistical tests, P-values and CIs. © The Author 2017. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
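
    The reading of a P-value given in this abstract can be checked with a small simulation. The following is a minimal sketch, not taken from the article: it assumes a two-sided z-test of a zero mean with known standard deviation, and the function names, sample size and number of repetitions are illustrative choices. It repeats a study many times under a true null hypothesis and counts how often P < 0.05 occurs, which should be close to 5%.

```python
import math
import random


def two_sided_p_value(sample, sigma=1.0):
    """Two-sided z-test P-value for H0: population mean is 0 (sigma assumed known)."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    # P(|Z| >= |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))


def fraction_significant(n_studies=20000, n=30, alpha=0.05, seed=1):
    """Share of simulated studies with P < alpha when the null hypothesis is true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Every simulated study draws from a population in which H0 holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p_value(sample) < alpha:
            hits += 1
    return hits / n_studies


rate = fraction_significant()
print(f"share of null studies with P < 0.05: {rate:.3f}")
```

    The simulated share says nothing about how probable the null hypothesis is; it only describes how the data behave when the null hypothesis holds, which is the distinction the abstract draws.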

  5. Statistical significance of cis-regulatory modules

    Directory of Open Access Journals (Sweden)

    Smith Andrew D

    2007-01-01

    Background: It is becoming increasingly important for researchers to be able to scan through large genomic regions for transcription factor binding sites or clusters of binding sites forming cis-regulatory modules. Correspondingly, there has been a push to develop algorithms for the rapid detection and assessment of cis-regulatory modules. While various algorithms for this purpose have been introduced, most are not well suited for rapid, genome scale scanning. Results: We introduce methods designed for the detection and statistical evaluation of cis-regulatory modules, modeled as either clusters of individual binding sites or as combinations of sites with constrained organization. In order to determine the statistical significance of module sites, we first need a method to determine the statistical significance of single transcription factor binding site matches. We introduce a straightforward method of estimating the statistical significance of single site matches using a database of known promoters to produce data structures that can be used to estimate p-values for binding site matches. We next introduce a technique to calculate the statistical significance of the arrangement of binding sites within a module using a max-gap model. If the module scanned for has defined organizational parameters, the probability of the module is corrected to account for organizational constraints. The statistical significance of single site matches and the architecture of sites within the module can be combined to provide an overall estimation of statistical significance of cis-regulatory module sites. Conclusion: The methods introduced in this paper allow for the detection and statistical evaluation of single transcription factor binding sites and cis-regulatory modules. The features described are implemented in the Search Tool for Occurrences of Regulatory Motifs (STORM) and MODSTORM software.

  6. Intelligent system for statistically significant expertise knowledge on the basis of the model of self-organizing nonequilibrium dissipative system

    Directory of Open Access Journals (Sweden)

    E. A. Tatokchin

    2017-01-01

    The development of modern educational technologies, driven by the broad introduction of computer testing and of distance education, makes it necessary to revise the methods used to examine students. The work shows the need to move to mathematical criteria for assessing knowledge that are free of subjectivity. The article reviews the problems that arise in realizing this task and proposes approaches to solving them. The greatest attention is paid to the problem of objectively transforming the expert's rating estimates onto the scale of student grades. The discussion concludes that the solution to this problem lies in the creation of specialized intelligent systems. The intelligent system is built on a mathematical model of a self-organizing nonequilibrium dissipative system, which here is a group of students. The article assumes that the dissipative character of the system is provided by the constant influx of new test items from the expert, and its nonequilibrium by the individual psychological characteristics of the students in the group. As a result, the system must self-organize into stable patterns. Relying on large amounts of data, these patterns allow a statistically significant assessment of a student to be obtained. To justify the proposed approach, the work presents a statistical analysis of the results of testing a large sample of students (> 90). The conclusions from this analysis made it possible to develop an intelligent system for statistically significant examination of student performance. It is based on a data clustering algorithm (k-means) for the three key parameters. It is shown that this approach allows the creation of a dynamic and objective expert evaluation.

  7. The thresholds for statistical and clinical significance

    DEFF Research Database (Denmark)

    Jakobsen, Janus Christian; Gluud, Christian; Winkel, Per

    2014-01-01

    BACKGROUND: Thresholds for statistical significance are insufficiently demonstrated by 95% confidence intervals or P-values when assessing results from randomised clinical trials. First, a P-value only shows the probability of getting a result assuming that the null hypothesis is true and does...... not reflect the probability of getting a result assuming an alternative hypothesis to the null hypothesis is true. Second, a confidence interval or a P-value showing significance may be caused by multiplicity. Third, statistical significance does not necessarily result in clinical significance. Therefore...... of the probability that a given trial result is compatible with a 'null' effect (corresponding to the P-value) divided by the probability that the trial result is compatible with the intervention effect hypothesised in the sample size calculation; (3) adjust the confidence intervals and the statistical significance...

  8. The insignificance of statistical significance testing

    Science.gov (United States)

    Johnson, Douglas H.

    1999-01-01

    Despite their use in scientific journals such as The Journal of Wildlife Management, statistical hypothesis tests add very little value to the products of research. Indeed, they frequently confuse the interpretation of data. This paper describes how statistical hypothesis tests are often viewed, and then contrasts that interpretation with the correct one. I discuss the arbitrariness of P-values, conclusions that the null hypothesis is true, power analysis, and distinctions between statistical and biological significance. Statistical hypothesis testing, in which the null hypothesis about the properties of a population is almost always known a priori to be false, is contrasted with scientific hypothesis testing, which examines a credible null hypothesis about phenomena in nature. More meaningful alternatives are briefly outlined, including estimation and confidence intervals for determining the importance of factors, decision theory for guiding actions in the face of uncertainty, and Bayesian approaches to hypothesis testing and other statistical practices.

  9. Swiss solar power statistics 2007 - Significant expansion

    International Nuclear Information System (INIS)

    Hostettler, T.

    2008-01-01

    This article presents and discusses the 2007 statistics for solar power in Switzerland. A significant number of new installations is noted, as are the high production figures from newer installations. The basics behind the compilation of the Swiss solar power statistics are briefly reviewed and an overview for the period 1989 to 2007 is presented which includes figures on the number of photovoltaic plants in service and installed peak power. Typical production figures in kilowatt-hours (kWh) per installed kilowatt-peak power (kWp) are presented and discussed for installations of various sizes. Increased production after inverter replacement in older installations is noted. Finally, the general political situation in Switzerland as far as solar power is concerned is briefly discussed, as are international developments.

  10. Significant Statistics: Viewed with a Contextual Lens

    Science.gov (United States)

    Tait-McCutcheon, Sandi

    2010-01-01

    This paper examines the pedagogical and organisational changes three lead teachers made to their statistics teaching and learning programs. The lead teachers posed the research question: What would the effect of contextually integrating statistical investigations and literacies into other curriculum areas be on student achievement? By finding the…

  11. Sampling, Probability Models and Statistical Reasoning Statistical

    Indian Academy of Sciences (India)

    Sampling, Probability Models and Statistical Reasoning Statistical Inference. Mohan Delampady; V R Padmawar. Resonance – Journal of Science Education, General Article, Volume 1, Issue 5, May 1996, pp 49-58 ...

  12. Detection by voxel-wise statistical analysis of significant changes in regional cerebral glucose uptake in an APP/PS1 transgenic mouse model of Alzheimer's disease.

    Science.gov (United States)

    Dubois, Albertine; Hérard, Anne-Sophie; Delatour, Benoît; Hantraye, Philippe; Bonvento, Gilles; Dhenain, Marc; Delzescaux, Thierry

    2010-06-01

    Biomarkers and technologies similar to those used in humans are essential for the follow-up of Alzheimer's disease (AD) animal models, particularly for the clarification of mechanisms and the screening and validation of new candidate treatments. In humans, changes in brain metabolism can be detected by 1-deoxy-2-[(18)F] fluoro-D-glucose PET (FDG-PET) and assessed in a user-independent manner with dedicated software, such as Statistical Parametric Mapping (SPM). FDG-PET can be carried out in small animals, but its resolution is low as compared to the size of rodent brain structures. In mouse models of AD, changes in cerebral glucose utilization are usually detected by [(14)C]-2-deoxyglucose (2DG) autoradiography, but this requires prior manual outlining of regions of interest (ROI) on selected sections. Here, we evaluate the feasibility of applying the SPM method to 3D autoradiographic data sets mapping brain metabolic activity in a transgenic mouse model of AD. We report the preliminary results obtained with 4 APP/PS1 (64+/-1 weeks) and 3 PS1 (65+/-2 weeks) mice. We also describe new procedures for the acquisition and use of "blockface" photographs and provide the first demonstration of their value for the 3D reconstruction and spatial normalization of post mortem mouse brain volumes. Despite this limited sample size, our results appear to be meaningful, consistent, and more comprehensive than findings from previously published studies based on conventional ROI-based methods. The establishment of statistical significance at the voxel level, rather than with a user-defined ROI, makes it possible to detect more reliably subtle differences in geometrically complex regions, such as the hippocampus. Our approach is generic and could be easily applied to other biomarkers and extended to other species and applications. Copyright 2010 Elsevier Inc. All rights reserved.

  13. Increasing the statistical significance of entanglement detection in experiments.

    Science.gov (United States)

    Jungnitsch, Bastian; Niekamp, Sönke; Kleinmann, Matthias; Gühne, Otfried; Lu, He; Gao, Wei-Bo; Chen, Yu-Ao; Chen, Zeng-Bing; Pan, Jian-Wei

    2010-05-28

    Entanglement is often verified by a violation of an inequality like a Bell inequality or an entanglement witness. Considerable effort has been devoted to the optimization of such inequalities in order to obtain a high violation. We demonstrate theoretically and experimentally that such an optimization does not necessarily lead to a better entanglement test, if the statistical error is taken into account. Theoretically, we show for different error models that reducing the violation of an inequality can improve the significance. Experimentally, we observe this phenomenon in a four-photon experiment, testing the Mermin and Ardehali inequality for different levels of noise. Furthermore, we provide a way to develop entanglement tests with high statistical significance.

  14. Diffeomorphic Statistical Deformation Models

    DEFF Research Database (Denmark)

    Hansen, Michael Sass; Hansen, Mads/Fogtman; Larsen, Rasmus

    2007-01-01

    In this paper we present a new method for constructing diffeomorphic statistical deformation models in arbitrary dimensional images with a nonlinear generative model and a linear parameter space. Our deformation model is a modified version of the diffeomorphic model introduced by Cootes et al....... The modifications ensure that no boundary restriction has to be enforced on the parameter space to prevent folds or tears in the deformation field. For straightforward statistical analysis, principal component analysis and sparse methods, we assume that the parameters for a class of deformations lie on a linear...... with ground truth in form of manual expert annotations, and compared to Cootes's model. We anticipate applications in unconstrained diffeomorphic synthesis of images, e.g. for tracking, segmentation, registration or classification purposes....

  15. Increasing the statistical significance of entanglement detection in experiments

    Energy Technology Data Exchange (ETDEWEB)

    Jungnitsch, Bastian; Niekamp, Soenke; Kleinmann, Matthias; Guehne, Otfried [Institut fuer Quantenoptik und Quanteninformation, Innsbruck (Austria); Lu, He; Gao, Wei-Bo; Chen, Zeng-Bing [Hefei National Laboratory for Physical Sciences at Microscale and Department of Modern Physics, University of Science and Technology of China, Hefei (China); Chen, Yu-Ao; Pan, Jian-Wei [Hefei National Laboratory for Physical Sciences at Microscale and Department of Modern Physics, University of Science and Technology of China, Hefei (China); Physikalisches Institut, Universitaet Heidelberg (Germany)

    2010-07-01

    Entanglement is often verified by a violation of an inequality like a Bell inequality or an entanglement witness. Considerable effort has been devoted to the optimization of such inequalities in order to obtain a high violation. We demonstrate theoretically and experimentally that such an optimization does not necessarily lead to a better entanglement test, if the statistical error is taken into account. Theoretically, we show for different error models that reducing the violation of an inequality can improve the significance. We show this to be the case for an error model in which the variance of an observable is interpreted as its error and for the standard error model in photonic experiments. Specifically, we demonstrate that the Mermin inequality yields a Bell test which is statistically more significant than the Ardehali inequality in the case of a photonic four-qubit state that is close to a GHZ state. Experimentally, we observe this phenomenon in a four-photon experiment, testing the above inequalities for different levels of noise.

  16. Testing the Difference of Correlated Agreement Coefficients for Statistical Significance

    Science.gov (United States)

    Gwet, Kilem L.

    2016-01-01

    This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…

  17. Sibling Competition & Growth Tradeoffs. Biological vs. Statistical Significance.

    Science.gov (United States)

    Kramer, Karen L; Veile, Amanda; Otárola-Castillo, Erik

    2016-01-01

    Early childhood growth has many downstream effects on future health and reproduction and is an important measure of offspring quality. While a tradeoff between family size and child growth outcomes is theoretically predicted in high-fertility societies, empirical evidence is mixed. This is often attributed to phenotypic variation in parental condition. However, inconsistent study results may also arise because family size confounds the potentially differential effects that older and younger siblings can have on young children's growth. Additionally, inconsistent results might reflect that the biological significance associated with different growth trajectories is poorly understood. This paper addresses these concerns by tracking children's monthly gains in height and weight from weaning to age five in a high fertility Maya community. We predict that: 1) as an aggregate measure family size will not have a major impact on child growth during the post weaning period; 2) competition from young siblings will negatively impact child growth during the post weaning period; 3) however because of their economic value, older siblings will have a negligible effect on young children's growth. Accounting for parental condition, we use linear mixed models to evaluate the effects that family size, younger and older siblings have on children's growth. Congruent with our expectations, it is younger siblings who have the most detrimental effect on children's growth. While we find statistical evidence of a quantity/quality tradeoff effect, the biological significance of these results is negligible in early childhood. Our findings help to resolve why quantity/quality studies have had inconsistent results by showing that sibling competition varies with sibling age composition, not just family size, and that biological significance is distinct from statistical significance.

  18. Sibling Competition & Growth Tradeoffs. Biological vs. Statistical Significance.

    Directory of Open Access Journals (Sweden)

    Karen L Kramer

    Early childhood growth has many downstream effects on future health and reproduction and is an important measure of offspring quality. While a tradeoff between family size and child growth outcomes is theoretically predicted in high-fertility societies, empirical evidence is mixed. This is often attributed to phenotypic variation in parental condition. However, inconsistent study results may also arise because family size confounds the potentially differential effects that older and younger siblings can have on young children's growth. Additionally, inconsistent results might reflect that the biological significance associated with different growth trajectories is poorly understood. This paper addresses these concerns by tracking children's monthly gains in height and weight from weaning to age five in a high fertility Maya community. We predict that: 1) as an aggregate measure family size will not have a major impact on child growth during the post weaning period; 2) competition from young siblings will negatively impact child growth during the post weaning period; 3) however because of their economic value, older siblings will have a negligible effect on young children's growth. Accounting for parental condition, we use linear mixed models to evaluate the effects that family size, younger and older siblings have on children's growth. Congruent with our expectations, it is younger siblings who have the most detrimental effect on children's growth. While we find statistical evidence of a quantity/quality tradeoff effect, the biological significance of these results is negligible in early childhood. Our findings help to resolve why quantity/quality studies have had inconsistent results by showing that sibling competition varies with sibling age composition, not just family size, and that biological significance is distinct from statistical significance.

  19. Significance levels for studies with correlated test statistics.

    Science.gov (United States)

    Shi, Jianxin; Levinson, Douglas F; Whittemore, Alice S

    2008-07-01

    When testing large numbers of null hypotheses, one needs to assess the evidence against the global null hypothesis that none of the hypotheses is false. Such evidence typically is based on the test statistic of the largest magnitude, whose statistical significance is evaluated by permuting the sample units to simulate its null distribution. Efron (2007) has noted that correlation among the test statistics can induce substantial interstudy variation in the shapes of their histograms, which may cause misleading tail counts. Here, we show that permutation-based estimates of the overall significance level also can be misleading when the test statistics are correlated. We propose that such estimates be conditioned on a simple measure of the spread of the observed histogram, and we provide a method for obtaining conditional significance levels. We justify this conditioning using the conditionality principle described by Cox and Hinkley (1974). Application of the method to gene expression data illustrates the circumstances when conditional significance levels are needed.
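
    The permutation procedure that this abstract takes as its starting point can be sketched as follows. This is only the unconditioned baseline (permute the sample units, recompute the largest test statistic), not the conditional significance levels the authors propose. The difference in group means as the per-feature statistic, the function names and the toy data are all assumptions made for illustration.

```python
import random


def max_abs_mean_diff(rows, labels):
    """Largest absolute difference in group means across all features."""
    group1 = [r for r, lab in zip(rows, labels) if lab == 1]
    group0 = [r for r, lab in zip(rows, labels) if lab == 0]
    best = 0.0
    for j in range(len(rows[0])):
        mean1 = sum(r[j] for r in group1) / len(group1)
        mean0 = sum(r[j] for r in group0) / len(group0)
        best = max(best, abs(mean1 - mean0))
    return best


def permutation_p_value(rows, labels, n_perm=2000, seed=0):
    """Permutation P-value for the global null, based on the maximum statistic."""
    rng = random.Random(seed)
    observed = max_abs_mean_diff(rows, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        # Permuting labels over sample units keeps the correlation between features intact.
        rng.shuffle(shuffled)
        if max_abs_mean_diff(rows, shuffled) >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


# Toy data (hypothetical): 20 units, 5 features, feature 0 shifted in group 1.
data_rng = random.Random(42)
labels = [1] * 10 + [0] * 10
rows = [[data_rng.gauss(5.0 if lab == 1 and j == 0 else 0.0, 1.0) for j in range(5)]
        for lab in labels]
p_global = permutation_p_value(rows, labels)
print(f"global permutation P-value: {p_global:.4f}")
```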

  20. Caveats for using statistical significance tests in research assessments

    DEFF Research Database (Denmark)

    Schneider, Jesper Wiborg

    2013-01-01

    This article raises concerns about the advantages of using statistical significance tests in research assessments as has recently been suggested in the debate about proper normalization procedures for citation indicators by Opthof and Leydesdorff (2010). Statistical significance tests are highly controversial and numerous criticisms have been leveled against their use. Based on examples from articles by proponents of the use of statistical significance tests in research assessments, we address some of the numerous problems with such tests. The issues specifically discussed are the ritual practice...... argue that applying statistical significance tests and mechanically adhering to their results are highly problematic and detrimental to critical thinking. We claim that the use of such tests does not provide any advantages in relation to deciding whether differences between citation indicators...

  1. On detection and assessment of statistical significance of Genomic Islands

    Directory of Open Access Journals (Sweden)

    Chaudhuri Probal

    2008-04-01

    Background: Many of the available methods for detecting Genomic Islands (GIs) in prokaryotic genomes use markers such as transposons, proximal tRNAs, flanking repeats etc., or they use other supervised techniques requiring training datasets. Most of these methods are primarily based on the biases in GC content or codon and amino acid usage of the islands. However, these methods either do not use any formal statistical test of significance or use statistical tests for which the critical values and the P-values are not adequately justified. We propose a method, which is unsupervised in nature and uses Monte-Carlo statistical tests based on randomly selected segments of a chromosome. Such tests are supported by precise statistical distribution theory, and consequently, the resulting P-values are quite reliable for making the decision. Results: Our algorithm (named Design-Island, an acronym for Detection of Statistically Significant Genomic Island) runs in two phases. Some 'putative GIs' are identified in the first phase, and those are refined into smaller segments containing horizontally acquired genes in the refinement phase. This method is applied to Salmonella typhi CT18 genome leading to the discovery of several new pathogenicity, antibiotic resistance and metabolic islands that were missed by earlier methods. Many of these islands contain mobile genetic elements like phage-mediated genes, transposons, integrase and IS elements confirming their horizontal acquirement. Conclusion: The proposed method is based on statistical tests supported by precise distribution theory and reliable P-values along with a technique for visualizing statistically significant islands. The performance of our method is better than many other well known methods in terms of their sensitivity and accuracy, and in terms of specificity, it is comparable to other methods.

  2. Your Chi-Square Test Is Statistically Significant: Now What?

    Science.gov (United States)

    Sharpe, Donald

    2015-01-01

    Applied researchers have employed chi-square tests for more than one hundred years. This paper addresses the question of how one should follow a statistically significant chi-square test result in order to determine the source of that result. Four approaches were evaluated: calculating residuals, comparing cells, ransacking, and partitioning. Data…
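
    Of the four follow-up approaches listed in this abstract, calculating residuals is the simplest to illustrate. The sketch below computes adjusted standardized residuals for a contingency table. It is an illustration of that general technique, not the paper's own procedure; the function name and the example counts are hypothetical, and the table is assumed to have no empty row or column.

```python
import math


def adjusted_residuals(table):
    """Adjusted standardized residuals for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            # Variance term that accounts for the row and column proportions.
            spread = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out_row.append((observed - expected) / math.sqrt(spread))
        result.append(out_row)
    return result


# Hypothetical 2 x 2 table of counts.
residuals = adjusted_residuals([[10, 20], [20, 10]])
for row in residuals:
    print([round(value, 3) for value in row])
```

    Cells whose adjusted residual is large in absolute value (a common rule of thumb is beyond about 2) are the ones contributing most to a significant chi-square result.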

  3. Statistical Significance and Effect Size: Two Sides of a Coin.

    Science.gov (United States)

    Fan, Xitao

    This paper suggests that statistical significance testing and effect size are two sides of the same coin; they complement each other, but do not substitute for one another. Good research practice requires that both should be taken into consideration to make sound quantitative decisions. A Monte Carlo simulation experiment was conducted, and a…

  4. Reporting effect sizes as a supplement to statistical significance ...

    African Journals Online (AJOL)

    The purpose of the article is to review the statistical significance reporting practices in reading instruction studies and to provide guidelines for when to calculate and report effect sizes in educational research. A review of six readily accessible (online) and accredited journals publishing research on reading instruction ...

  5. Test for the statistical significance of differences between ROC curves

    International Nuclear Information System (INIS)

    Metz, C.E.; Kronman, H.B.

    1979-01-01

    A test for the statistical significance of observed differences between two measured Receiver Operating Characteristic (ROC) curves has been designed and evaluated. The set of observer response data for each ROC curve is assumed to be independent and to arise from a ROC curve having a form which, in the absence of statistical fluctuations in the response data, graphs as a straight line on double normal-deviate axes. To test the significance of an apparent difference between two measured ROC curves, maximum likelihood estimates of the two parameters of each curve and the associated parameter variances and covariance are calculated from the corresponding set of observer response data. An approximate Chi-square statistic with two degrees of freedom is then constructed from the differences between the parameters estimated for each ROC curve and from the variances and covariances of these estimates. This statistic is known to be truly Chi-square distributed only in the limit of large numbers of trials in the observer performance experiments. Performance of the statistic for data arising from a limited number of experimental trials was evaluated. Independent sets of rating scale data arising from the same underlying ROC curve were paired, and the fraction of differences found (falsely) significant was compared to the significance level, α, used with the test. Although test performance was found to be somewhat dependent on both the number of trials in the data and the position of the underlying ROC curve in the ROC space, the results for various significance levels showed the test to be reliable under practical experimental conditions
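
    The statistic described in this abstract can be sketched as a quadratic form in the parameter differences. The sketch below is one plausible reading, not the authors' published code: it assumes each curve is summarized by two estimated parameters with a 2 x 2 covariance matrix, and that the two data sets are independent so that the covariance of the difference is the sum of the two matrices. The function name and the numbers in the example are hypothetical. For two degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def roc_difference_test(params1, cov1, params2, cov2):
    """Approximate chi-square test (2 df) for a difference between two ROC curves.

    params1, params2: (a, b) parameter estimates for each curve.
    cov1, cov2: 2 x 2 covariance matrices of those estimates.
    """
    d0 = params1[0] - params2[0]
    d1 = params1[1] - params2[1]
    # Independent data sets: the covariance of the difference is the sum.
    s00 = cov1[0][0] + cov2[0][0]
    s01 = cov1[0][1] + cov2[0][1]
    s11 = cov1[1][1] + cov2[1][1]
    det = s00 * s11 - s01 * s01
    if det <= 0:
        raise ValueError("combined covariance matrix must be positive definite")
    # Quadratic form d^T S^{-1} d, written out for the 2 x 2 case.
    stat = (d0 * d0 * s11 - 2 * d0 * d1 * s01 + d1 * d1 * s00) / det
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-stat / 2)
    return stat, p_value


# Hypothetical estimates for two curves.
stat, p_value = roc_difference_test(
    (1.5, 1.0), [[0.02, 0.0], [0.0, 0.02]],
    (1.0, 1.0), [[0.02, 0.0], [0.0, 0.02]],
)
print(f"chi-square = {stat:.2f}, P = {p_value:.4f}")
```

    As the abstract notes, the chi-square approximation holds only in the limit of many trials, so a P-value computed this way from a small experiment is itself approximate.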

  6. Exclusion statistics and integrable models

    International Nuclear Information System (INIS)

    Mashkevich, S.

    1998-01-01

    The definition of exclusion statistics, as given by Haldane, allows for a statistical interaction between distinguishable particles (multi-species statistics). The thermodynamic quantities for such statistics can be evaluated exactly. The explicit expressions for the cluster coefficients are presented. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models. The interesting questions of generalizing this correspondence to the higher-dimensional and the multi-species cases remain essentially open.

  7. Common pitfalls in statistical analysis: "P" values, statistical significance and confidence intervals

    Directory of Open Access Journals (Sweden)

    Priya Ranganathan

    2015-01-01

    In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the 'P' value, explain the importance of 'confidence intervals' and clarify the importance of including both values in a paper.

  8. Common pitfalls in statistical analysis: “P” values, statistical significance and confidence intervals

    Science.gov (United States)

    Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc

    2015-01-01

    In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ‘P’ value, explain the importance of ‘confidence intervals’ and clarify the importance of including both values in a paper. PMID: 25878958

  9. Statistical significance of epidemiological data. Seminar: Evaluation of epidemiological studies

    International Nuclear Information System (INIS)

    Weber, K.H.

    1993-01-01

    In stochastic damages, the numbers of events, e.g. the persons who are affected by or have died of cancer, and thus the relative frequencies (incidence or mortality) are binomially distributed random variables. Their statistical fluctuations can be characterized by confidence intervals. For epidemiologic questions, especially for the analysis of stochastic damages in the low dose range, the following issues are interesting: - Is a sample (a group of persons) with a definite observed damage frequency part of the whole population? - Is an observed frequency difference between two groups of persons random or statistically significant? - Is an observed increase or decrease of the frequencies with increasing dose random or statistically significant and how large is the regression coefficient (= risk coefficient) in this case? These problems can be solved by statistical tests. So-called distribution-free tests and tests which are not bound to the supposition of normal distribution are of particular interest, such as: - χ² independence test (test in contingency tables); - Fisher-Yates test; - trend test according to Cochran; - rank correlation test given by Spearman. These tests are explained in terms of selected epidemiologic data, e.g. of leukaemia clusters, of the cancer mortality of the Japanese A-bomb survivors especially in the low dose range as well as on the sample of the cancer mortality in the high background area in Yangjiang (China). (orig.)
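
    Two of the tests listed above, the χ² independence test and the Fisher-Yates (Fisher exact) test, can be sketched for a 2x2 table of exposed/unexposed versus affected/unaffected counts. This is a minimal standard-library illustration, not the seminar's own computation; the function names are hypothetical, and every row and column total is assumed to be nonzero.

```python
import math

def chi2_independence_2x2(a, b, c, d):
    """Pearson chi-square test of independence for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # The small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
```

    For small counts, such as the cluster data mentioned in the abstract, the exact test is usually preferred because the chi-square approximation relies on reasonably large expected cell counts.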

  10. Systematic reviews of anesthesiologic interventions reported as statistically significant

    DEFF Research Database (Denmark)

    Imberger, Georgina; Gluud, Christian; Boylan, John

    2015-01-01

    statistically significant meta-analyses of anesthesiologic interventions, we used TSA to estimate power and imprecision in the context of sparse data and repeated updates. METHODS: We conducted a search to identify all systematic reviews with meta-analyses that investigated an intervention that may......: From 11,870 titles, we found 682 systematic reviews that investigated anesthesiologic interventions. In the 50 sampled meta-analyses, the median number of trials included was 8 (interquartile range [IQR], 5-14), the median number of participants was 964 (IQR, 523-1736), and the median number...

  11. Exclusion statistics and integrable models

    International Nuclear Information System (INIS)

    Mashkevich, S.

    1998-01-01

    The definition of exclusion statistics that was given by Haldane admits a 'statistical interaction' between distinguishable particles (multispecies statistics). For such statistics, thermodynamic quantities can be evaluated exactly; explicit expressions are presented here for cluster coefficients. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models of the Calogero-Sutherland type. The interesting questions of generalizing this correspondence to the higher-dimensional and the multispecies cases remain essentially open; however, our results provide some hints as to searches for the models in question

  12. Statistical Model of Extreme Shear

    DEFF Research Database (Denmark)

    Larsen, Gunner Chr.; Hansen, Kurt Schaldemose

    2004-01-01

    In order to continue cost-optimisation of modern large wind turbines, it is important to continuously increase the knowledge of wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... by a model that, on a statistically consistent basis, describes the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of high-sampled full-scale time series measurements...... are consistent, given the inevitable uncertainties associated with the model as well as with the extreme value data analysis. Keywords: Statistical model, extreme wind conditions, statistical analysis, turbulence, wind loading, wind shear, wind turbines.

  13. Statistical modeling for degradation data

    CERN Document Server

    Lio, Yuhlong; Ng, Hon; Tsai, Tzong-Ru

    2017-01-01

    This book focuses on the statistical aspects of the analysis of degradation data. In recent years, degradation data analysis has come to play an increasingly important role in different disciplines such as reliability, public health sciences, and finance. For example, information on products’ reliability can be obtained by analyzing degradation data. In addition, statistical modeling and inference techniques have been developed on the basis of different degradation measures. The book brings together experts engaged in statistical modeling and inference, presenting and discussing important recent advances in degradation data analysis and related applications. The topics covered are timely and have considerable potential to impact both statistics and reliability engineering.

  14. Statistical modelling with quantile functions

    CERN Document Server

    Gilchrist, Warren

    2000-01-01

    Galton used quantiles more than a hundred years ago in describing data. Tukey and Parzen used them in the 60s and 70s in describing populations. Since then, the authors of many papers, both theoretical and practical, have used various aspects of quantiles in their work. Until now, however, no one put all the ideas together to form what turns out to be a general approach to statistics. Statistical Modelling with Quantile Functions does just that. It systematically examines the entire process of statistical modelling, starting with using the quantile function to define continuous distributions. The author shows that by using this approach, it becomes possible to develop complex distributional models from simple components. A modelling kit can be developed that applies to the whole model - deterministic and stochastic components - and this kit operates by adding, multiplying, and transforming distributions rather than data. Statistical Modelling with Quantile Functions adds a new dimension to the practice of stati...
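
    The idea of building distributions by adding quantile functions can be illustrated with a small sketch. The helper names below are hypothetical and are not taken from the book; the example relies only on the standard fact that the sum of the unit exponential quantile function and its reflection is the standard logistic quantile function, ln(p / (1 - p)).

```python
import math
import random

def q_exponential(p):
    """Quantile function of the unit exponential distribution."""
    return -math.log(1.0 - p)

def q_reflected_exponential(p):
    """Quantile function of the unit exponential reflected about zero."""
    return math.log(p)

def q_sum(*components):
    """Build a new quantile function by adding component quantile functions."""
    return lambda p: sum(q(p) for q in components)

# -ln(1 - p) + ln(p) = ln(p / (1 - p)): the standard logistic quantile function.
q_logistic = q_sum(q_exponential, q_reflected_exponential)

def sample(q, n, rng):
    """Draw n values by passing uniform random numbers through a quantile function."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if 0.0 < u < 1.0:  # keep p strictly inside (0, 1)
            draws.append(q(u))
    return draws
```

    Because each component is non-decreasing in p, their sum is again a valid quantile function, which is what makes this kind of composition possible.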

  15. A tutorial on hunting statistical significance by chasing N

    Directory of Open Access Journals (Sweden)

    Denes Szucs

    2016-09-01

    There is increasing concern about the replicability of studies in psychology and cognitive neuroscience. Hidden data dredging (also called p-hacking) is a major contributor to this crisis because it substantially increases Type I error, resulting in a much larger proportion of false positive findings than the usually expected 5%. In order to build better intuition to avoid, detect and criticise some typical problems, here I systematically illustrate the large impact of some easy-to-implement, and so perhaps frequent, data dredging techniques on boosting false positive findings. I illustrate several forms of two special cases of data dredging. First, researchers may violate the data collection stopping rules of null hypothesis significance testing by repeatedly checking for statistical significance with various numbers of participants. Second, researchers may group participants post-hoc along potential but unplanned independent grouping variables. The first approach 'hacks' the number of participants in studies, the second approach 'hacks' the number of variables in the analysis. I demonstrate the high amount of false positive findings generated by these techniques with data from true null distributions. I also illustrate that it is extremely easy to introduce strong bias into data by very mild selection and re-testing. Similar, usually undocumented data dredging steps can easily lead to having 20-50%, or more false positives.
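
    The first technique described above, repeatedly testing as participants are added, can be reproduced in a short simulation. This is a sketch under simplifying assumptions and not the tutorial's own code: data are drawn from a normal distribution with mean 0 and known standard deviation 1, a z-test is used instead of a t-test, and the function names, sample sizes and seed are arbitrary choices.

```python
import math
import random

def z_test_p(values):
    """Two-sided p-value for H0: mean = 0, with the standard deviation known to be 1."""
    n = len(values)
    z = sum(values) / math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2.0))

def false_positive_rates(n_sims=2000, n_start=10, n_max=50, alpha=0.05, seed=1):
    """Compare one fixed-N test with testing after every added participant."""
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_sims):
        # Data come from a true null distribution: mean 0, standard deviation 1.
        data = [rng.gauss(0.0, 1.0) for _ in range(n_max)]
        # Honest analysis: a single test at the planned sample size.
        if z_test_p(data) < alpha:
            fixed_hits += 1
        # Optional stopping: count a "finding" if any interim look is significant.
        if any(z_test_p(data[:n]) < alpha for n in range(n_start, n_max + 1)):
            peeking_hits += 1
    return fixed_hits / n_sims, peeking_hits / n_sims
```

    Calling `false_positive_rates()` returns the two rates as a pair. The fixed-N rate should stay near the nominal 5%, while the rate under repeated checking is noticeably higher, which is the inflation the abstract describes.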

  16. A Statistical Programme Assignment Model

    DEFF Research Database (Denmark)

    Rosholm, Michael; Staghøj, Jonas; Svarer, Michael

    When treatment effects of active labour market programmes are heterogeneous in an observable way across the population, the allocation of the unemployed into different programmes becomes a particularly important issue. In this paper, we present a statistical model designed to improve the present...... duration of unemployment spells may result if a statistical programme assignment model is introduced. We discuss several issues regarding the implementation of such a system, especially the interplay between the statistical model and case workers....

  17. Tropical geometry of statistical models.

    Science.gov (United States)

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    This article presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. Here, we address the question of how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. The Newton polytope of a statistical model plays a key role. Our results are applied to the hidden Markov model and the general Markov model on a binary tree.
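
    One commonly cited instance of this viewpoint is that running the sum-product recursion of a hidden Markov model in tropical (min, +) arithmetic on negative log parameters gives the cost of the most likely hidden path, as in the Viterbi algorithm. The sketch below illustrates that correspondence only; the two-state model and all names are hypothetical, and nothing here is taken from the article itself.

```python
import math

def hmm_evaluate(init, trans, emit, obs, add, mul):
    """Run the forward recursion of a hidden Markov model in a chosen semiring.

    init[s], trans[s][t] and emit[s][o] are the model parameters, obs is the
    observed sequence, and add/mul are the two semiring operations.
    """
    states = range(len(init))
    msg = [mul(init[s], emit[s][obs[0]]) for s in states]
    for o in obs[1:]:
        new_msg = []
        for t in states:
            terms = [mul(mul(msg[s], trans[s][t]), emit[t][o]) for s in states]
            acc = terms[0]
            for term in terms[1:]:
                acc = add(acc, term)
            new_msg.append(acc)
        msg = new_msg
    total = msg[0]
    for m in msg[1:]:
        total = add(total, m)
    return total

def neg_log(table):
    """Map probabilities to tropical coordinates, -log(p), elementwise."""
    if isinstance(table, list):
        return [neg_log(x) for x in table]
    return -math.log(table)

# Hypothetical two-state model, used only for illustration.
INIT = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.2, 0.8]]
EMIT = [[0.9, 0.1], [0.3, 0.7]]
OBS = [0, 1, 1]

# Ordinary arithmetic (+, *): probability of the observed sequence.
likelihood = hmm_evaluate(INIT, TRANS, EMIT, OBS,
                          lambda a, b: a + b, lambda a, b: a * b)

# Tropical arithmetic (min, +): negative log-probability of the best hidden path.
best_cost = hmm_evaluate(neg_log(INIT), neg_log(TRANS), neg_log(EMIT), OBS,
                         min, lambda a, b: a + b)
```

    The same recursion serves both purposes because only the pair of operations changes, which is the sense in which the optimisation problem is a tropical version of the marginalisation problem.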

  18. Statistical Model of Extreme Shear

    DEFF Research Database (Denmark)

    Hansen, Kurt Schaldemose; Larsen, Gunner Chr.

    2005-01-01

    In order to continue cost-optimisation of modern large wind turbines, it is important to continuously increase the knowledge of wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... by a model that, on a statistically consistent basis, describes the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of full-scale measurements recorded with a high sampling rate...

  19. Statistical Models for Social Networks

    NARCIS (Netherlands)

    Snijders, Tom A. B.; Cook, KS; Massey, DS

    2011-01-01

    Statistical models for social networks as dependent variables must represent the typical network dependencies between tie variables such as reciprocity, homophily, transitivity, etc. This review first treats models for single (cross-sectionally observed) networks and then for network dynamics. For

  20. Conducting tests for statistically significant differences using forest inventory data

    Science.gov (United States)

    James A. Westfall; Scott A. Pugh; John W. Coulston

    2013-01-01

    Many forest inventory and monitoring programs are based on a sample of ground plots from which estimates of forest resources are derived. In addition to evaluating metrics such as number of trees or amount of cubic wood volume, it is often desirable to make comparisons between resource attributes. To properly conduct statistical tests for differences, it is imperative...

  1. Detecting Statistically Significant Communities of Triangle Motifs in Undirected Networks

    Science.gov (United States)

    2016-04-26

    Systems, Statistics & Management Science, University of Alabama, USA. DISTRIBUTION A: Distribution approved for public release. Contents: 1 Summary...5 Application to Real Networks; 5.1 2012 FBS Football Schedule Network...football schedule network...Stem plot of degree-ordered vertices versus the degree for college football

  2. Sensometrics: Thurstonian and Statistical Models

    DEFF Research Database (Denmark)

    Christensen, Rune Haubo Bojesen

    . sensR is a package for sensory discrimination testing with Thurstonian models and ordinal supports analysis of ordinal data with cumulative link (mixed) models. While sensR is closely connected to the sensometrics field, the ordinal package has developed into a generic statistical package applicable......This thesis is concerned with the development and bridging of Thurstonian and statistical models for sensory discrimination testing as applied in the scientific discipline of sensometrics. In sensory discrimination testing sensory differences between products are detected and quantified by the use...... and sensory discrimination testing in particular in a series of papers by advancing Thurstonian models for a range of sensory discrimination protocols in addition to facilitating their application by providing software for fitting these models. The main focus is on identifying Thurstonian models...

  3. Statistical modeling of Earth's plasmasphere

    Science.gov (United States)

    Veibell, Victoir

    The behavior of plasma near Earth's geosynchronous orbit is of vital importance to both satellite operators and magnetosphere modelers because it also has a significant influence on energy transport, ion composition, and induced currents. The system is highly complex in both time and space, making the forecasting of extreme space weather events difficult. This dissertation examines the behavior and statistical properties of plasma mass density near geosynchronous orbit by using both linear and nonlinear models, as well as epoch analyses, in an attempt to better understand the physical processes that precipitate and drive its variations. It is shown that while equatorial mass density does vary significantly on an hourly timescale when a drop in the disturbance time scale index (Dst) was observed, it does not vary significantly between the day of a Dst event onset and the day immediately following. It is also shown that increases in equatorial mass density were not, on average, preceded or followed by any significant change in the examined solar wind or geomagnetic variables, including Dst, despite prior results that considered a few selected events and found a notable influence. It is verified that equatorial mass density and solar activity, measured via the F10.7 index, have a strong correlation, which is stronger over longer timescales such as 27 days than it is over an hourly timescale. It is then shown that this connection seems to affect the behavior of equatorial mass density most during periods of strong solar activity, leading to large mass density reactions to Dst drops for high values of F10.7. It is also shown that equatorial mass density behaves differently before and after events based on the value of F10.7 at the onset of an equatorial mass density event or a Dst event, and that a southward interplanetary magnetic field at onset leads to slowed mass density growth after event onset. These behavioral differences provide insight into how solar and geomagnetic

  4. Classical model of intermediate statistics

    International Nuclear Information System (INIS)

    Kaniadakis, G.

    1994-01-01

    In this work we present a classical kinetic model of intermediate statistics. In the case of Brownian particles we show that the Fermi-Dirac (FD) and Bose-Einstein (BE) distributions can be obtained, just as the Maxwell-Boltzmann (MB) distribution, as steady states of a classical kinetic equation that intrinsically takes into account an exclusion-inclusion principle. In our model the intermediate statistics are obtained as steady states of a system of coupled nonlinear kinetic equations, where the coupling constants are the transmutational potentials η_κκ'. We show that, besides the FD-BE intermediate statistics extensively studied from the quantum point of view, we can also study the MB-FD and MB-BE ones. Moreover, our model allows us to treat the three-state mixing FD-MB-BE intermediate statistics. For boson and fermion mixing in a D-dimensional space, we obtain a family of FD-BE intermediate statistics by varying the transmutational potential η_BF. This family contains, as a particular case when η_BF = 0, the quantum statistics recently proposed by L. Wu, Z. Wu, and J. Sun [Phys. Lett. A 170, 280 (1992)]. When we consider the two-dimensional FD-BE statistics, we derive an analytic expression of the fraction of fermions. When the temperature T→∞, the system is composed of an equal number of bosons and fermions, regardless of the value of η_BF. On the contrary, when T=0, η_BF becomes important and, according to its value, the system can be completely bosonic or fermionic, or composed of both bosons and fermions.

  5. Statistical vs. Economic Significance in Economics and Econometrics: Further comments on McCloskey & Ziliak

    DEFF Research Database (Denmark)

    Engsted, Tom

    I comment on the controversy between McCloskey & Ziliak and Hoover & Siegler on statistical versus economic significance, in the March 2008 issue of the Journal of Economic Methodology. I argue that while McCloskey & Ziliak are right in emphasizing 'real error', i.e. non-sampling error that cannot...... be eliminated through specification testing, they fail to acknowledge those areas in economics, e.g. rational expectations macroeconomics and asset pricing, where researchers clearly distinguish between statistical and economic significance and where statistical testing plays a relatively minor role in model...

  6. Textual information access statistical models

    CERN Document Server

    Gaussier, Eric

    2013-01-01

    This book presents statistical models that have recently been developed within several research communities to access information contained in text collections. The problems considered are linked to applications aiming at facilitating information access: information extraction and retrieval; text classification and clustering; opinion mining; comprehension aids (automatic summarization, machine translation, visualization). In order to give the reader as complete a description as possible, the focus is placed on the probability models used in the applications.

  7. Uncertainty the soul of modeling, probability & statistics

    CERN Document Server

    Briggs, William

    2016-01-01

    This book presents a philosophical approach to probability and probabilistic thinking, considering the underpinnings of probabilistic reasoning and modeling, which effectively underlie everything in data science. The ultimate goal is to call into question many standard tenets and lay the philosophical and probabilistic groundwork and infrastructure for statistical modeling. It is the first book devoted to the philosophy of data aimed at working scientists and calls for a new consideration in the practice of probability and statistics to eliminate what has been referred to as the "Cult of Statistical Significance". The book explains the philosophy of these ideas and not the mathematics, though there are a handful of mathematical examples. The topics are logically laid out, starting with basic philosophy as related to probability, statistics, and science, and stepping through the key probabilistic ideas and concepts, and ending with statistical models. Its jargon-free approach asserts that standard methods, suc...

  8. Improved model for statistical alignment

    Energy Technology Data Exchange (ETDEWEB)

    Miklos, I.; Toroczkai, Z. (Zoltan)

    2001-01-01

    The statistical approach to molecular sequence evolution involves the stochastic modeling of the substitution, insertion and deletion processes. Substitution has been modeled in a reliable way for more than three decades by using finite Markov processes. Insertion and deletion, however, seem to be more difficult to model, and the recent approaches cannot acceptably deal with multiple insertions and deletions. A new method based on a generating function approach is introduced to describe the multiple insertion process. The presented algorithm computes the approximate joint probability of two sequences in O(l³) running time, where l is the geometric mean of the sequence lengths.

  9. After statistics reform : Should we still teach significance testing?

    NARCIS (Netherlands)

    A. Hak (Tony)

    2014-01-01

    In the longer term null hypothesis significance testing (NHST) will disappear because p-values are not informative and not replicable. Should we continue to teach in the future the procedures of then abolished routines (i.e., NHST)? Three arguments are discussed for not teaching NHST in

  10. Active Learning with Statistical Models.

    Science.gov (United States)

    1995-01-01

    Active Learning with Statistical Models. ASC-9217041, NSF CDA-9309300. Authors: David A. Cohn, Zoubin Ghahramani, and Michael I. Jordan. Terms: AI, MIT, Artificial Intelligence, active learning, queries, locally weighted regression, LOESS, mixtures of gaussians...Computational Learning, Department of Brain and Cognitive Sciences. A.I. Memo No. 1522, January 9, 1995. C.B.C.L. Paper No. 110.

  11. Distinguishing between statistical significance and practical/clinical meaningfulness using statistical inference.

    Science.gov (United States)

    Wilkinson, Michael

    2014-03-01

    Decisions about support for predictions of theories in light of data are made using statistical inference. The dominant approach in sport and exercise science is the Neyman-Pearson (N-P) significance-testing approach. When applied correctly it provides a reliable procedure for making dichotomous decisions for accepting or rejecting zero-effect null hypotheses with known and controlled long-run error rates. Type I and type II error rates must be specified in advance and the latter controlled by conducting an a priori sample size calculation. The N-P approach does not provide the probability of hypotheses or indicate the strength of support for hypotheses in light of data, yet many scientists believe it does. Outcomes of analyses allow conclusions only about the existence of non-zero effects, and provide no information about the likely size of true effects or their practical/clinical value. Bayesian inference can show how much support data provide for different hypotheses, and how personal convictions should be altered in light of data, but the approach is complicated by formulating probability distributions about prior subjective estimates of population effects. A pragmatic solution is magnitude-based inference, which allows scientists to estimate the true magnitude of population effects and how likely they are to exceed an effect magnitude of practical/clinical importance, thereby integrating elements of subjective Bayesian-style thinking. While this approach is gaining acceptance, progress might be hastened if scientists appreciate the shortcomings of traditional N-P null hypothesis significance testing.
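
    The a priori sample size calculation mentioned above can be sketched for the simplest case. This is an illustration under stated assumptions, not the author's procedure: a two-sided comparison of two independent group means, a normal (z) approximation rather than the t distribution, equal group sizes, and a hypothetical function name.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Participants needed per group for a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # sets the Type I error rate
    z_beta = NormalDist().inv_cdf(power)               # power = 1 - Type II error rate
    return math.ceil(2.0 * ((z_alpha + z_beta) / effect_size) ** 2)
```

    For example, `n_per_group(0.5)` gives 63 per group under this approximation. An exact calculation based on the t distribution gives a slightly larger number.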

  12. Statistical significance of trends in monthly heavy precipitation over the US

    KAUST Repository

    Mahajan, Salil

    2011-05-11

    Trends in monthly heavy precipitation, defined by a return period of one year, are assessed for statistical significance in observations and Global Climate Model (GCM) simulations over the contiguous United States using Monte Carlo non-parametric and parametric bootstrapping techniques. The results from the two Monte Carlo approaches are found to be similar to each other, and also to the traditional non-parametric Kendall's τ test, implying the robustness of the approach. Two different observational data-sets are employed to test for trends in monthly heavy precipitation and are found to exhibit consistent results. Both data-sets demonstrate upward trends, one of which is found to be statistically significant at the 95% confidence level. Upward trends similar to observations are observed in some climate model simulations of the twentieth century, but their statistical significance is marginal. For projections of the twenty-first century, a statistically significant upwards trend is observed in most of the climate models analyzed. The change in the simulated precipitation variance appears to be more important in the twenty-first century projections than changes in the mean precipitation. Stochastic fluctuations of the climate-system are found to dominate monthly heavy precipitation as some GCM simulations show a downwards trend even in the twenty-first century projections when the greenhouse gas forcings are strong. © 2011 Springer-Verlag.
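
    A Monte Carlo significance test for a trend can be sketched with Kendall's τ as the test statistic. Note that this sketch uses a permutation test (shuffling the time order), which is a related but different resampling scheme from the bootstrap techniques named in the abstract. It also assumes the values have no serial correlation, and the function names are hypothetical.

```python
import random

def kendall_tau(series):
    """Kendall's tau-a between a series and its time index."""
    n = len(series)
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            score += (diff > 0) - (diff < 0)  # +1 concordant, -1 discordant, 0 tie
    return score / (n * (n - 1) / 2)

def permutation_trend_test(series, n_resamples=2000, seed=0):
    """Monte Carlo p-value for a monotonic trend, obtained by shuffling the time order."""
    rng = random.Random(seed)
    observed = kendall_tau(series)
    shuffled = list(series)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        if abs(kendall_tau(shuffled)) >= abs(observed):
            extreme += 1
    # The add-one correction keeps the estimated p-value above zero.
    return observed, (extreme + 1) / (n_resamples + 1)
```

    Shuffling destroys any trend while keeping the set of values, so the shuffled statistics approximate the distribution of τ when no trend is present.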

  13. GIGMF - A statistical model program

    International Nuclear Information System (INIS)

    Vladuca, G.; Deberth, C.

    1978-01-01

    The program GIGMF computes the differential and integrated statistical model cross sections for the reactions proceeding through a compound nuclear stage. The computational method is based on the Hauser-Feshbach-Wolfenstein theory, modified to include the modern version of Tepel et al. Although the program was written for a PDP-15 computer, with 16K high speed memory, many reaction channels can be taken into account with the following restrictions: the projectile spin must be less than 2, the maximum spin momenta of the compound nucleus cannot be greater than 10. These restrictions are due solely to the storage allotments and may be easily relaxed. The energy of the impinging particle, the target and projectile masses, the spin and parities of the projectile, target, emergent and residual nuclei, the maximum orbital momentum and transmission coefficients for each reaction channel are the input parameters of the program. (author)

  14. Understanding the Sampling Distribution and Its Use in Testing Statistical Significance.

    Science.gov (United States)

    Breunig, Nancy A.

    Despite the increasing criticism of statistical significance testing by researchers, particularly in the publication of the 1994 American Psychological Association's style manual, statistical significance test results are still popular in journal articles. For this reason, it remains important to understand the logic of inferential statistics. A…

  15. Probing NWP model deficiencies by statistical postprocessing

    DEFF Research Database (Denmark)

    Rosgaard, Martin Haubjerg; Nielsen, Henrik Aalborg; Nielsen, Torben S.

    2016-01-01

    The objective in this article is twofold. On one hand, a Model Output Statistics (MOS) framework for improved wind speed forecast accuracy is described and evaluated. On the other hand, the approach explored identifies unintuitive explanatory value from a diagnostic variable in an operational....... Based on the statistical model candidates inferred from the data, the lifted index NWP model diagnostic is consistently found among the NWP model predictors of the best performing statistical models across sites....

  16. Power, effects, confidence, and significance: an investigation of statistical practices in nursing research.

    Science.gov (United States)

    Gaskin, Cadeyrn J; Happell, Brenda

    2014-05-01

    To (a) assess the statistical power of nursing research to detect small, medium, and large effect sizes; (b) estimate the experiment-wise Type I error rate in these studies; and (c) assess the extent to which (i) a priori power analyses, (ii) effect sizes (and interpretations thereof), and (iii) confidence intervals were reported. Statistical review. Papers published in the 2011 volumes of the 10 highest ranked nursing journals, based on their 5-year impact factors. Papers were assessed for statistical power, control of experiment-wise Type I error, reporting of a priori power analyses, reporting and interpretation of effect sizes, and reporting of confidence intervals. The analyses were based on 333 papers, from which 10,337 inferential statistics were identified. The median power to detect small, medium, and large effect sizes was .40 (interquartile range [IQR]=.24-.71), .98 (IQR=.85-1.00), and 1.00 (IQR=1.00-1.00), respectively. The median experiment-wise Type I error rate was .54 (IQR=.26-.80). A priori power analyses were reported in 28% of papers. Effect sizes were routinely reported for Spearman's rank correlations (100% of papers in which this test was used), Poisson regressions (100%), odds ratios (100%), Kendall's tau correlations (100%), Pearson's correlations (99%), logistic regressions (98%), structural equation modelling/confirmatory factor analyses/path analyses (97%), and linear regressions (83%), but were reported less often for two-proportion z tests (50%), analyses of variance/analyses of covariance/multivariate analyses of variance (18%), t tests (8%), Wilcoxon's tests (8%), Chi-squared tests (8%), and Fisher's exact tests (7%), and not reported for sign tests, Friedman's tests, McNemar's tests, multi-level models, and Kruskal-Wallis tests. Effect sizes were infrequently interpreted. Confidence intervals were reported in 28% of papers. The use, reporting, and interpretation of inferential statistics in nursing research need substantial
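
    The experiment-wise Type I error rate reported above grows quickly with the number of tests in a study. A minimal sketch of the usual formula follows, assuming independent tests that each use the same per-test alpha; the record does not state how the authors computed their figure, and the function names are hypothetical.

```python
def experimentwise_error(alpha, n_tests):
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def sidak_alpha(target, n_tests):
    """Per-test alpha that holds the experiment-wise rate at `target`."""
    return 1.0 - (1.0 - target) ** (1.0 / n_tests)
```

    With alpha = .05, fifteen independent tests already give `experimentwise_error(0.05, 15)` of about .54, which is the same order as the median rate reported in the abstract.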

  17. "What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"

    Science.gov (United States)

    Ozturk, Elif

    2012-01-01

    The present paper aims to review two motivations to conduct "what if" analyses using Excel and "R" to understand the statistical significance tests through the sample size context. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…

  18. Publication of statistically significant research findings in prosthodontics & implant dentistry in the context of other dental specialties.

    Science.gov (United States)

    Papageorgiou, Spyridon N; Kloukos, Dimitrios; Petridis, Haralampos; Pandis, Nikolaos

    2015-10-01

    To assess the hypothesis that there is excessive reporting of statistically significant studies published in prosthodontic and implantology journals, which could indicate selective publication. The last 30 issues of 9 journals in prosthodontics and implant dentistry were hand-searched for articles with statistical analyses. The percentages of significant and non-significant results were tabulated by parameter of interest. Univariable/multivariable logistic regression analyses were applied to identify possible predictors of reporting statistically significant findings. The results of this study were compared with similar studies in dentistry with random-effects meta-analyses. Of the 2323 included studies, 71% reported statistically significant results, with the significant results ranging from 47% to 86%. Multivariable modeling identified that geographical area and involvement of a statistician were predictors of statistically significant results. Compared to interventional studies, the odds that in vitro and observational studies would report statistically significant results were increased by 1.20 times (OR: 2.20, 95% CI: 1.66-2.92) and 0.35 times (OR: 1.35, 95% CI: 1.05-1.73), respectively. The probability of statistically significant results from randomized controlled trials was significantly lower compared to various study designs (difference: 30%, 95% CI: 11-49%). Likewise the probability of statistically significant results in prosthodontics and implant dentistry was lower compared to other dental specialties, but this result did not reach statistical significance (P>0.05). The majority of studies identified in the fields of prosthodontics and implant dentistry presented statistically significant results. The same trend existed in publications of other specialties in dentistry. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. Optimizing refiner operation with statistical modelling

    Energy Technology Data Exchange (ETDEWEB)

    Broderick, G [Noranda Research Centre, Pointe Claire, PQ (Canada)]

    1997-02-01

    The impact of refining conditions on the energy efficiency of the process and on the handsheet quality of a chemi-mechanical pulp was studied as part of a series of pilot scale refining trials. Statistical models of refiner performance were constructed from these results and non-linear optimization of process conditions was conducted. Optimization results indicated that increasing the ratio of specific energy applied in the first stage led to a reduction of some 15 per cent in the total energy requirement. The strategy can also be used to obtain significant increases in pulp quality for a given energy input. 20 refs., 6 tabs.

  20. Statistical modelling of fish stocks

    DEFF Research Database (Denmark)

    Kvist, Trine

    1999-01-01

    for modelling the dynamics of a fish population is suggested. A new approach is introduced to analyse the sources of variation in age composition data, which is one of the most important sources of information in the cohort based models for estimation of stock abundances and mortalities. The approach combines...... and it is argued that an approach utilising stochastic differential equations might be advantageous in fish stock assessments....

  1. ARSENIC CONTAMINATION IN GROUNDWATER: A STATISTICAL MODELING

    Directory of Open Access Journals (Sweden)

    Palas Roy

    2013-01-01

    High arsenic in natural groundwater in most of the tubewells of the Purbasthali-Block II area of Burdwan district (W.B., India) has recently drawn attention as a serious environmental concern. This paper illustrates statistical modeling of the arsenic-contaminated groundwater to identify the relationship between arsenic content and other groundwater parameters, so that the arsenic contamination level can be predicted by analyzing only those parameters. Multivariate data analysis of groundwater samples collected from the 132 tubewells of this contaminated region shows that three variable parameters are significantly related to arsenic. Based on these relationships, a multiple linear regression model has been developed that estimates arsenic contamination from measurements of these three predictor parameters in the contaminated aquifer. This model could also serve as a guide when designing an arsenic removal scheme for any affected groundwater.
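    A multiple linear regression on three predictors, as described in the abstract above, can be sketched as follows. This is only an illustration: the abstract does not name the three groundwater parameters, so the predictors, coefficients, and data below are synthetic placeholders, and ordinary least squares via NumPy is an assumed fitting method rather than the author's procedure.

```python
import numpy as np

# Synthetic stand-in data: 132 samples (matching the number of tubewells)
# and three unnamed predictors. All values here are hypothetical.
rng = np.random.default_rng(42)
n_samples = 132
X = rng.normal(size=(n_samples, 3))
true_intercept = 1.0
true_coefs = np.array([0.8, -0.5, 0.3])
y = true_intercept + X @ true_coefs + rng.normal(scale=0.05, size=n_samples)

# Ordinary least squares: prepend a column of ones for the intercept.
design = np.column_stack([np.ones(n_samples), X])
beta, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
intercept, coefs = beta[0], beta[1:]


def predict(new_X):
    """Predict the response from the three predictor values."""
    return intercept + np.asarray(new_X) @ coefs


# Coefficient of determination on the training data.
fitted = predict(X)
r_squared = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
print("intercept:", intercept)
print("coefficients:", coefs)
print("R^2:", r_squared)
```

    With real measurements, the significance of each predictor would also need to be checked before relying on the fitted model for prediction.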

  2. Statistical lung model for microdosimetry

    International Nuclear Information System (INIS)

    Fisher, D.R.; Hadley, R.T.

    1984-03-01

    To calculate the microdosimetry of plutonium in the lung, a mathematical description is needed of lung tissue microstructure that defines source-site parameters. Beagle lungs were expanded using a glutaraldehyde fixative at 30 cm water pressure. Tissue specimens, five microns thick, were stained with hematoxylin and eosin then studied using an image analyzer. Measurements were made along horizontal lines through the magnified tissue image. The distribution of air space and tissue chord lengths and locations of epithelial cell nuclei were recorded from about 10,000 line scans. The distribution parameters constituted a model of lung microstructure for predicting the paths of random alpha particle tracks in the lung and the probability of traversing biologically sensitive sites. This lung model may be used in conjunction with established deposition and retention models for determining the microdosimetry in the pulmonary lung for a wide variety of inhaled radioactive materials.

  3. Statistical modelling for ship propulsion efficiency

    DEFF Research Database (Denmark)

    Petersen, Jóan Petur; Jacobsen, Daniel J.; Winther, Ole

    2012-01-01

    This paper presents a state-of-the-art systems approach to statistical modelling of fuel efficiency in ship propulsion, and also a novel and publicly available data set of high quality sensory data. Two statistical model approaches are investigated and compared: artificial neural networks...

  4. Actuarial statistics with generalized linear mixed models

    NARCIS (Netherlands)

    Antonio, K.; Beirlant, J.

    2007-01-01

    Over the last decade the use of generalized linear models (GLMs) in actuarial statistics has received a lot of attention, starting from the actuarial illustrations in the standard text by McCullagh and Nelder [McCullagh, P., Nelder, J.A., 1989. Generalized linear models. In: Monographs on Statistics

  5. Spherical Process Models for Global Spatial Statistics

    KAUST Repository

    Jeong, Jaehong; Jun, Mikyoung; Genton, Marc G.

    2017-01-01

    Statistical models used in geophysical, environmental, and climate science applications must reflect the curvature of the spatial domain in global data. Over the past few decades, statisticians have developed covariance models that capture

  6. Statistical Models and Methods for Lifetime Data

    CERN Document Server

    Lawless, Jerald F

    2011-01-01

    Praise for the First Edition: "An indispensable addition to any serious collection on lifetime data analysis and . . . a valuable contribution to the statistical literature. Highly recommended . . ." -Choice "This is an important book, which will appeal to statisticians working on survival analysis problems." -Biometrics "A thorough, unified treatment of statistical models and methods used in the analysis of lifetime data . . . this is a highly competent and agreeable statistical textbook." -Statistics in Medicine. The statistical analysis of lifetime or response time data is a key tool in engineering,

  7. Statistics and the shell model

    International Nuclear Information System (INIS)

    Weidenmueller, H.A.

    1985-01-01

    Starting with N. Bohr's paper on compound-nucleus reactions, we confront regular dynamical features and chaotic motion in nuclei. The shell-model and, more generally, mean-field theories describe average nuclear properties which are thus identified as regular features. The fluctuations about the average show chaotic behaviour of the same type as found in classical chaotic systems upon quantisation. These features are therefore generic and quite independent of the specific dynamics of the nucleus. A novel method to calculate fluctuations is discussed, and the results of this method are described. (orig.)

  8. Measuring individual significant change on the Beck Depression Inventory-II through IRT-based statistics.

    NARCIS (Netherlands)

    Brouwer, D.; Meijer, R.R.; Zevalkink, D.J.

    2013-01-01

    Several researchers have emphasized that item response theory (IRT)-based methods should be preferred over classical approaches in measuring change for individual patients. In the present study we discuss and evaluate the use of IRT-based statistics to measure statistical significant individual

  9. Using the Bootstrap Method for a Statistical Significance Test of Differences between Summary Histograms

    Science.gov (United States)

    Xu, Kuan-Man

    2006-01-01

    A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
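    The procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's implementation: the resampling scheme (pooling the individual histograms of both categories and redrawing two groups with replacement under the null hypothesis), the function names, and the synthetic multinomial data are all hypothetical choices. Only the Euclidean distance between normalized summary histograms is taken from the abstract.

```python
import numpy as np


def summary_histogram(hists):
    """Sum individual histograms (one per row) and normalize to unit total."""
    total = hists.sum(axis=0).astype(float)
    return total / total.sum()


def euclidean_distance(h1, h2):
    """Euclidean distance between two normalized histograms."""
    return float(np.sqrt(np.sum((h1 - h2) ** 2)))


def bootstrap_pvalue(group_a, group_b, n_boot=1000, seed=0):
    """Bootstrap significance level for the distance between summary histograms.

    Assumed null hypothesis: both groups of individual histograms come
    from the same population, so groups are redrawn from the pooled set.
    """
    rng = np.random.default_rng(seed)
    observed = euclidean_distance(summary_histogram(group_a),
                                  summary_histogram(group_b))
    pooled = np.vstack([group_a, group_b])
    n_a, n_b = len(group_a), len(group_b)
    exceed = 0
    for _ in range(n_boot):
        resample_a = pooled[rng.integers(0, len(pooled), size=n_a)]
        resample_b = pooled[rng.integers(0, len(pooled), size=n_b)]
        d = euclidean_distance(summary_histogram(resample_a),
                               summary_histogram(resample_b))
        if d >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_boot + 1)


# Synthetic example: 40 individual histograms per category, 50 counts each.
rng = np.random.default_rng(1)
cat_small = rng.multinomial(50, [0.7, 0.2, 0.1], size=40)
cat_large = rng.multinomial(50, [0.1, 0.2, 0.7], size=40)
distance, p_value = bootstrap_pvalue(cat_small, cat_large)
print("distance:", distance, "p-value:", p_value)
```

    Swapping `euclidean_distance` for another distance statistic leaves the rest of the procedure unchanged, which is what makes comparing several distance statistics straightforward.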

  10. Health significance and statistical uncertainty. The value of P-value.

    Science.gov (United States)

    Consonni, Dario; Bertazzi, Pier Alberto

    2017-10-27

    The P-value is widely used as a summary statistic of scientific results. Unfortunately, there is a widespread tendency to dichotomize its value in "P<0.05" ("statistically significant") and "P>0.05" ("statistically not significant"), with the former implying a "positive" result and the latter a "negative" one. To show the unsuitability of such an approach when evaluating the effects of environmental and occupational risk factors. We provide examples of distorted use of P-value and of the negative consequences for science and public health of such a black-and-white vision. The rigid interpretation of P-value as a dichotomy favors the confusion between health relevance and statistical significance, discourages thoughtful thinking, and distorts attention from what really matters, the health significance. A much better way to express and communicate scientific results involves reporting effect estimates (e.g., risks, risk ratios or risk differences) and their confidence intervals (CI), which summarize and convey both health significance and statistical uncertainty. Unfortunately, many researchers do not usually consider the whole interval of CI but only examine if it includes the null-value, therefore degrading this procedure to the same P-value dichotomy (statistical significance or not). In reporting statistical results of scientific research, present effect estimates with their confidence intervals and do not qualify the P-value as "significant" or "not significant".
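    As an illustration of reporting an effect estimate with its confidence interval, the sketch below computes a risk ratio and a 95% CI from a 2x2 table. The counts are hypothetical, and the log-scale (Katz) interval is one common textbook choice assumed here; the abstract does not prescribe a particular formula.

```python
import math


def risk_ratio_ci(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total, z=1.96):
    """Risk ratio with a log-scale (Katz) confidence interval.

    z=1.96 gives an approximate 95% interval.
    """
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(1 / exposed_cases - 1 / exposed_total
                          + 1 / unexposed_cases - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical cohort: 30 cases among 200 exposed, 20 among 200 unexposed.
rr, lower, upper = risk_ratio_ci(30, 200, 20, 200)
print(f"RR = {rr:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

    In this made-up example the interval includes 1, yet it also extends to a more than doubled risk. Reading the whole interval, rather than only checking whether it contains the null value, is the point the abstract makes.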

  11. Bayesian models: A statistical primer for ecologists

    Science.gov (United States)

    Hobbs, N. Thompson; Hooten, Mevin B.

    2015-01-01

    Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods, in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals. This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management. Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticians. Covers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and more. Deemphasizes computer coding in favor of basic principles. Explains how to write out properly factored statistical expressions representing Bayesian models.

  12. Statistical Model-Based Face Pose Estimation

    Institute of Scientific and Technical Information of China (English)

    GE Xinliang; YANG Jie; LI Feng; WANG Huahua

    2007-01-01

    A robust face pose estimation approach is proposed that uses a face shape statistical model and represents pose parameters by trigonometric functions. The face shape statistical model is first built by analyzing the face shapes of different people under varying poses. Shape alignment is vital in the process of building the statistical model. Then, six trigonometric functions are employed to represent the face pose parameters. Lastly, the mapping function between face image and face pose is constructed by linearly relating different parameters. The proposed approach is able to estimate different face poses using a few face training samples. Experimental results are provided to demonstrate its efficiency and accuracy.

  13. Automated statistical modeling of analytical measurement systems

    International Nuclear Information System (INIS)

    Jacobson, J.J.

    1992-01-01

    The statistical modeling of analytical measurement systems at the Idaho Chemical Processing Plant (ICPP) has been completely automated through computer software. The statistical modeling of analytical measurement systems is one part of a complete quality control program used by the Remote Analytical Laboratory (RAL) at the ICPP. The quality control program is an integration of automated data input, measurement system calibration, database management, and statistical process control. The quality control program and statistical modeling program meet the guidelines set forth by the American Society for Testing Materials and American National Standards Institute. A statistical model is a set of mathematical equations describing any systematic bias inherent in a measurement system and the precision of a measurement system. A statistical model is developed from data generated from the analysis of control standards. Control standards are samples which are made up at precise known levels by an independent laboratory and submitted to the RAL. The RAL analysts who process control standards do not know the values of those control standards. The object behind statistical modeling is to describe real process samples in terms of their bias and precision and to verify that a measurement system is operating satisfactorily. The processing of control standards gives us this ability.

  14. Topology for statistical modeling of petascale data.

    Energy Technology Data Exchange (ETDEWEB)

    Pascucci, Valerio (University of Utah, Salt Lake City, UT); Mascarenhas, Ajith Arthur; Rusek, Korben (Texas A&M University, College Station, TX); Bennett, Janine Camille; Levine, Joshua (University of Utah, Salt Lake City, UT); Pebay, Philippe Pierre; Gyulassy, Attila (University of Utah, Salt Lake City, UT); Thompson, David C.; Rojas, Joseph Maurice (Texas A&M University, College Station, TX)

    2011-07-01

    This document presents current technical progress and dissemination of results for the Mathematics for Analysis of Petascale Data (MAPD) project titled 'Topology for Statistical Modeling of Petascale Data', funded by the Office of Science Advanced Scientific Computing Research (ASCR) Applied Math program. Many commonly used algorithms for mathematical analysis do not scale well enough to accommodate the size or complexity of petascale data produced by computational simulations. The primary goal of this project is thus to develop new mathematical tools that address both the petascale size and uncertain nature of current data. At a high level, our approach is based on the complementary techniques of combinatorial topology and statistical modeling. In particular, we use combinatorial topology to filter out spurious data that would otherwise skew statistical modeling techniques, and we employ advanced algorithms from algebraic statistics to efficiently find globally optimal fits to statistical models. This document summarizes the technical advances we have made to date that were made possible in whole or in part by MAPD funding. These technical contributions can be divided loosely into three categories: (1) advances in the field of combinatorial topology, (2) advances in statistical modeling, and (3) new integrated topological and statistical methods.

  15. Statistical modelling of citation exchange between statistics journals.

    Science.gov (United States)

    Varin, Cristiano; Cattelan, Manuela; Firth, David

    2016-01-01

    Rankings of scholarly journals based on citation data are often met with scepticism by the scientific community. Part of the scepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of researchers. The paper focuses on analysis of the table of cross-citations among a selection of statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care to avoid potential overinterpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's research assessment exercise shows strong correlation at aggregate level between assessed research quality and journal citation 'export scores' within the discipline of statistics.

  16. Daily precipitation statistics in regional climate models

    DEFF Research Database (Denmark)

    Frei, Christoph; Christensen, Jens Hesselbjerg; Déqué, Michel

    2003-01-01

    An evaluation is undertaken of the statistics of daily precipitation as simulated by five regional climate models using comprehensive observations in the region of the European Alps. Four limited area models and one variable-resolution global model are considered, all with a grid spacing of 50 km...

  17. Infinite Random Graphs as Statistical Mechanical Models

    DEFF Research Database (Denmark)

    Durhuus, Bergfinnur Jøgvan; Napolitano, George Maria

    2011-01-01

    We discuss two examples of infinite random graphs obtained as limits of finite statistical mechanical systems: a model of two-dimensional discretized quantum gravity defined in terms of causal triangulated surfaces, and the Ising model on generic random trees. For the former model we describe a ...

  18. Matrix Tricks for Linear Statistical Models

    CERN Document Server

    Puntanen, Simo; Styan, George PH

    2011-01-01

    In teaching linear statistical models to first-year graduate students or to final-year undergraduate students there is no way to proceed smoothly without matrices and related concepts of linear algebra; their use is really essential. Our experience is that making some particular matrix tricks very familiar to students can substantially increase their insight into linear statistical models (and also multivariate statistical analysis). In matrix algebra, there are handy, sometimes even very simple "tricks" which simplify and clarify the treatment of a problem - both for the student and

  19. Codon Deviation Coefficient: A novel measure for estimating codon usage bias and its statistical significance

    KAUST Repository

    Zhang, Zhang

    2012-03-22

    Background: Genetic mutation, selective pressure for translational efficiency and accuracy, level of gene expression, and protein function through natural selection are all believed to lead to codon usage bias (CUB). Therefore, informative measurement of CUB is of fundamental importance to making inferences regarding gene function and genome evolution. However, extant measures of CUB have not fully accounted for the quantitative effect of background nucleotide composition and have not statistically evaluated the significance of CUB in sequence analysis. Results: Here we propose a novel measure--Codon Deviation Coefficient (CDC)--that provides an informative measurement of CUB and its statistical significance without requiring any prior knowledge. Unlike previous measures, CDC estimates CUB by accounting for background nucleotide compositions tailored to codon positions and adopts bootstrapping to assess the statistical significance of CUB for any given sequence. We evaluate CDC by examining its effectiveness on simulated sequences and empirical data and show that CDC outperforms extant measures by achieving a more informative estimation of CUB and its statistical significance. Conclusions: As validated by both simulated and empirical data, CDC provides a highly informative quantification of CUB and its statistical significance, useful for determining comparative magnitudes and patterns of biased codon usage for genes or genomes with diverse sequence compositions. 2012 Zhang et al; licensee BioMed Central Ltd.

  20. Codon Deviation Coefficient: a novel measure for estimating codon usage bias and its statistical significance

    Directory of Open Access Journals (Sweden)

    Zhang Zhang

    2012-03-01

    Background: Genetic mutation, selective pressure for translational efficiency and accuracy, level of gene expression, and protein function through natural selection are all believed to lead to codon usage bias (CUB). Therefore, informative measurement of CUB is of fundamental importance to making inferences regarding gene function and genome evolution. However, extant measures of CUB have not fully accounted for the quantitative effect of background nucleotide composition and have not statistically evaluated the significance of CUB in sequence analysis. Results: Here we propose a novel measure--Codon Deviation Coefficient (CDC)--that provides an informative measurement of CUB and its statistical significance without requiring any prior knowledge. Unlike previous measures, CDC estimates CUB by accounting for background nucleotide compositions tailored to codon positions and adopts bootstrapping to assess the statistical significance of CUB for any given sequence. We evaluate CDC by examining its effectiveness on simulated sequences and empirical data and show that CDC outperforms extant measures by achieving a more informative estimation of CUB and its statistical significance. Conclusions: As validated by both simulated and empirical data, CDC provides a highly informative quantification of CUB and its statistical significance, useful for determining comparative magnitudes and patterns of biased codon usage for genes or genomes with diverse sequence compositions.

  1. Statistical physics of pairwise probability models

    DEFF Research Database (Denmark)

    Roudi, Yasser; Aurell, Erik; Hertz, John

    2009-01-01

    Statistical models for describing the probability distribution over the states of biological systems are commonly used for dimensional reduction. Among these models, pairwise models are very attractive in part because they can be fit using a reasonable amount of data......: knowledge of the means and correlations between pairs of elements in the system is sufficient. Not surprisingly, then, using pairwise models for studying neural data has been the focus of many studies in recent years. In this paper, we describe how tools from statistical physics can be employed for studying...

  2. Distributions with given marginals and statistical modelling

    CERN Document Server

    Fortiana, Josep; Rodriguez-Lallena, José

    2002-01-01

    This book contains a selection of the papers presented at the meeting 'Distributions with given marginals and statistical modelling', held in Barcelona (Spain), July 17-20, 2000. In 24 chapters, this book covers topics such as the theory of copulas and quasi-copulas, the theory and compatibility of distributions, models for survival distributions and other well-known distributions, time series, categorical models, definition and estimation of measures of dependence, monotonicity and stochastic ordering, shape and separability of distributions, hidden truncation models, diagonal families, orthogonal expansions, tests of independence, and goodness of fit assessment. These topics share the use and properties of distributions with given marginals, this being the fourth specialised text on this theme. The innovative aspect of the book is the inclusion of statistical aspects such as modelling, Bayesian statistics, estimation, and tests.

  3. Aspects of statistical model for multifragmentation

    International Nuclear Information System (INIS)

    Bhattacharyya, P.; Das Gupta, S.; Mekjian, A. Z.

    1999-01-01

    We deal with two different aspects of an exactly soluble statistical model of fragmentation. First we show, using zero range force and finite temperature Thomas-Fermi theory, that a common link can be found between finite temperature mean field theory and the statistical fragmentation model. We show the latter naturally arises in the spinodal region. Next we show that although the exact statistical model is a canonical model and uses temperature, microcanonical results which use constant energy rather than constant temperature can also be obtained from the canonical model using saddle-point approximation. The methodology is extremely simple to implement and at least in all the examples studied in this work is very accurate. (c) 1999 The American Physical Society

  4. Confidence intervals permit, but don't guarantee, better inference than statistical significance testing

    Directory of Open Access Journals (Sweden)

    Melissa Coulson

    2010-07-01

    A statistically significant result and a non-significant result may differ little, although significance status may tempt an interpretation of difference. Two studies are reported that compared interpretation of such results presented using null hypothesis significance testing (NHST) or confidence intervals (CIs). Authors of articles published in psychology, behavioural neuroscience, and medical journals were asked, via email, to interpret two fictitious studies that found similar results, one statistically significant, and the other non-significant. Responses from 330 authors varied greatly, but interpretation was generally poor, whether results were presented as CIs or using NHST. However, when interpreting CIs, respondents who mentioned NHST were 60% likely to conclude, unjustifiably, that the two results conflicted, whereas those who interpreted CIs without reference to NHST were 95% likely to conclude, justifiably, that the two results were consistent. Findings were generally similar for all three disciplines. An email survey of academic psychologists confirmed that CIs elicit better interpretations if NHST is not invoked. Improved statistical inference can result from encouragement of meta-analytic thinking and use of CIs but, for full benefit, such highly desirable statistical reform requires also that researchers interpret CIs without recourse to NHST.
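    The situation the respondents faced can be reproduced with a short calculation. The two studies below are hypothetical (the estimates and standard errors are invented for illustration, not taken from the article), and a normal approximation is assumed for both the P-values and the 95% CIs.

```python
import math


def two_sided_p(z):
    """Two-sided P-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def summarize(estimate, se, z_crit=1.96):
    """Return the estimate, its approximate 95% CI, and its two-sided P-value."""
    return {
        "estimate": estimate,
        "ci": (estimate - z_crit * se, estimate + z_crit * se),
        "p": two_sided_p(estimate / se),
    }


# Two hypothetical studies of the same effect (arbitrary units).
study_a = summarize(3.6, 1.5)   # "statistically significant"
study_b = summarize(2.2, 1.6)   # "not statistically significant"

# Direct comparison of the two estimates, assuming independent studies.
between = summarize(3.6 - 2.2, math.hypot(1.5, 1.6))

for name, s in [("A", study_a), ("B", study_b), ("A - B", between)]:
    lo, hi = s["ci"]
    print(f"{name}: {s['estimate']:.1f} (95% CI {lo:.2f} to {hi:.2f}), P = {s['p']:.3f}")
```

    One study falls below the 0.05 threshold and the other above it, yet their intervals overlap heavily and the difference between them is well within sampling error, so the two results are consistent rather than conflicting.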

  5. Statistical Compression for Climate Model Output

    Science.gov (United States)

    Hammerling, D.; Guinness, J.; Soh, Y. J.

    2017-12-01

    Numerical climate model simulations run at high spatial and temporal resolutions generate massive quantities of data. As our computing capabilities continue to increase, storing all of the data is not sustainable, and thus it is important to develop methods for representing the full datasets by smaller compressed versions. We propose a statistical compression and decompression algorithm based on storing a set of summary statistics as well as a statistical model describing the conditional distribution of the full dataset given the summary statistics. We decompress the data by computing conditional expectations and conditional simulations from the model given the summary statistics. Conditional expectations represent our best estimate of the original data but are subject to oversmoothing in space and time. Conditional simulations introduce realistic small-scale noise so that the decompressed fields are neither too smooth nor too rough compared with the original data. Considerable attention is paid to accurately modeling the original dataset (one year of daily mean temperature data), particularly with regard to the inherent spatial nonstationarity in global fields, and to determining the statistics to be stored, so that the variation in the original data can be closely captured, while allowing for fast decompression and conditional emulation on modest computers.
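    The two decompression modes can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it assumes a one-dimensional stationary Gaussian field with an exponential covariance, uses block means as the stored summary statistics, and applies the standard Gaussian conditioning formulas. All function names and parameter values are hypothetical.

```python
import numpy as np


def exp_covariance(n, length_scale=5.0, variance=1.0):
    """Exponential covariance matrix on a regular 1-D grid (assumed model)."""
    idx = np.arange(n)
    return variance * np.exp(-np.abs(idx[:, None] - idx[None, :]) / length_scale)


def block_mean_operator(n, block):
    """Linear map A that turns a field of length n into its block means."""
    n_blocks = n // block
    A = np.zeros((n_blocks, n))
    for b in range(n_blocks):
        A[b, b * block:(b + 1) * block] = 1.0 / block
    return A


def conditional_expectation(s, A, mu, Sigma):
    """E[x | A x = s] for x ~ N(mu, Sigma): smooth best estimate."""
    weights = np.linalg.solve(A @ Sigma @ A.T, s - A @ mu)
    return mu + Sigma @ A.T @ weights


def conditional_simulation(s, A, mu, Sigma, rng):
    """Draw from x | A x = s by correcting an unconditional draw z."""
    z = rng.multivariate_normal(mu, Sigma)
    weights = np.linalg.solve(A @ Sigma @ A.T, s - A @ z)
    return z + Sigma @ A.T @ weights


rng = np.random.default_rng(0)
n, block = 60, 6
mu = np.zeros(n)
Sigma = exp_covariance(n)
A = block_mean_operator(n, block)

original = rng.multivariate_normal(mu, Sigma)   # stand-in for the full data
stored = A @ original                           # compressed: 10 numbers, not 60

smooth = conditional_expectation(stored, A, mu, Sigma)
rough = conditional_simulation(stored, A, mu, Sigma, rng)
print("summaries reproduced by expectation:", np.allclose(A @ smooth, stored))
print("summaries reproduced by simulation:", np.allclose(A @ rough, stored))
```

    Both reconstructions reproduce the stored block means exactly. The conditional expectation is the smoother of the two, while the conditional simulation adds back small-scale variability drawn from the assumed model.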

  6. Performance modeling, loss networks, and statistical multiplexing

    CERN Document Server

    Mazumdar, Ravi

    2009-01-01

    This monograph presents a concise mathematical approach for modeling and analyzing the performance of communication networks with the aim of understanding the phenomenon of statistical multiplexing. The novelty of the monograph is the fresh approach and insights provided by a sample-path methodology for queueing models that highlights the important ideas of Palm distributions associated with traffic models and their role in performance measures. Also presented are recent ideas of large buffer, and many sources asymptotics that play an important role in understanding statistical multiplexing. I

  7. Simple statistical model for branched aggregates

    DEFF Research Database (Denmark)

    Lemarchand, Claire; Hansen, Jesper Schmidt

    2015-01-01

    We propose a statistical model that can reproduce the size distribution of any branched aggregate, including amylopectin, dendrimers, molecular clusters of monoalcohols, and asphaltene nanoaggregates. It is based on the conditional probability for one molecule to form a new bond with a molecule, given that it already has bonds with others. The model is applied here to asphaltene nanoaggregates observed in molecular dynamics simulations of Cooee bitumen. The variation with temperature of the probabilities deduced from this model is discussed in terms of statistical mechanics arguments. The relevance of the statistical model in the case of asphaltene nanoaggregates is checked by comparing the predicted value of the probability for one molecule to have exactly i bonds with the same probability directly measured in the molecular dynamics simulations. The agreement is satisfactory.

  8. Advances in statistical models for data analysis

    CERN Document Server

    Minerva, Tommaso; Vichi, Maurizio

    2015-01-01

    This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.

  9. Structured statistical models of inductive reasoning.

    Science.gov (United States)

    Kemp, Charles; Tenenbaum, Joshua B

    2009-01-01

    Everyday inductive inferences are often guided by rich background knowledge. Formal models of induction should aim to incorporate this knowledge and should explain how different kinds of knowledge lead to the distinctive patterns of reasoning found in different inductive contexts. This article presents a Bayesian framework that attempts to meet both goals and describes [corrected] 4 applications of the framework: a taxonomic model, a spatial model, a threshold model, and a causal model. Each model makes probabilistic inferences about the extensions of novel properties, but the priors for the 4 models are defined over different kinds of structures that capture different relationships between the categories in a domain. The framework therefore shows how statistical inference can operate over structured background knowledge, and the authors argue that this interaction between structure and statistics is critical for explaining the power and flexibility of human reasoning.

  10. Model for neural signaling leap statistics

    International Nuclear Information System (INIS)

    Chevrollier, Martine; Oria, Marcos

    2011-01-01

    We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed qualitative change between Normal statistics (with T = 37.5°C, awaken regime) and Lévy statistics (T = 35.5°C, sleeping period), characterized by rare events of long range connections.

  11. Statistical models based on conditional probability distributions

    International Nuclear Information System (INIS)

    Narayanan, R.S.

    1991-10-01

    We present a formulation of statistical mechanics models based on conditional probability distribution rather than a Hamiltonian. We show that it is possible to realize critical phenomena through this procedure. Closely linked with this formulation is a Monte Carlo algorithm, in which a configuration generated is guaranteed to be statistically independent from any other configuration for all values of the parameters, in particular near the critical point. (orig.)

  12. Model for neural signaling leap statistics

    Science.gov (United States)

    Chevrollier, Martine; Oriá, Marcos

    2011-03-01

    We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed qualitative change between Normal statistics (with T = 37.5°C, awaken regime) and Lévy statistics (T = 35.5°C, sleeping period), characterized by rare events of long range connections.

  13. Model for neural signaling leap statistics

    Energy Technology Data Exchange (ETDEWEB)

    Chevrollier, Martine; Oria, Marcos, E-mail: oria@otica.ufpb.br [Laboratorio de Fisica Atomica e Lasers Departamento de Fisica, Universidade Federal da Paraiba Caixa Postal 5086 58051-900 Joao Pessoa, Paraiba (Brazil)]

    2011-03-01

    We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed qualitative change between Normal statistics (with T = 37.5°C, awaken regime) and Lévy statistics (T = 35.5°C, sleeping period), characterized by rare events of long range connections.

  14. Statistics Refresher for Molecular Imaging Technologists, Part 2: Accuracy of Interpretation, Significance, and Variance.

    Science.gov (United States)

    Farrell, Mary Beth

    2018-06-01

    This article is the second part of a continuing education series reviewing basic statistics that nuclear medicine and molecular imaging technologists should understand. In this article, the statistics for evaluating interpretation accuracy, significance, and variance are discussed. Throughout the article, actual statistics are pulled from the published literature. We begin by explaining 2 methods for quantifying interpretive accuracy: interreader and intrareader reliability. Agreement among readers can be expressed simply as a percentage. However, the Cohen κ-statistic is a more robust measure of agreement that accounts for chance. The higher the κ-statistic is, the higher is the agreement between readers. When 3 or more readers are being compared, the Fleiss κ-statistic is used. Significance testing determines whether the difference between 2 conditions or interventions is meaningful. Statistical significance is usually expressed using a number called a probability (P) value. Calculation of the P value is beyond the scope of this review. However, knowing how to interpret P values is important for understanding the scientific literature. Generally, a P value of less than 0.05 is considered significant and indicates that the results of the experiment are due to more than just chance. Variance, standard deviation (SD), confidence interval, and standard error (SE) explain the dispersion of data around a mean of a sample drawn from a population. SD is commonly reported in the literature. A small SD indicates that there is not much variation in the sample data. Many biologic measurements fall into what is referred to as a normal distribution taking the shape of a bell curve. In a normal distribution, 68% of the data will fall within 1 SD, 95% will fall within 2 SDs, and 99.7% will fall within 3 SDs. Confidence interval defines the range of possible values within which the population parameter is likely to lie and gives an idea of the precision of the statistic being
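
    As an illustration of the Cohen κ-statistic mentioned in this abstract, the sketch below computes percent agreement and κ for two readers. It is a minimal example and is not taken from the article: the function name `cohen_kappa` and the ten reads are hypothetical, and κ is computed with the standard definition (observed agreement minus chance agreement, divided by one minus chance agreement).

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers who rated the same cases."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same non-empty set of cases")
    n = len(reader_a)
    # Observed agreement: fraction of cases on which the two readers match.
    p_observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: product of the readers' marginal rates, summed over labels.
    counts_a = Counter(reader_a)
    counts_b = Counter(reader_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical reads of 10 scans by two readers (illustrative data only).
reader_a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "neg", "pos"]
reader_b = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "pos"]

agreement = sum(a == b for a, b in zip(reader_a, reader_b)) / len(reader_a)
kappa = cohen_kappa(reader_a, reader_b)
print(f"percent agreement = {agreement:.0%}, kappa = {kappa:.2f}")
```

    In this toy data set the readers agree on 8 of 10 cases, yet κ is lower than the raw agreement because half of that agreement would be expected by chance alone.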

  15. Interpreting Statistical Significance Test Results: A Proposed New "What If" Method.

    Science.gov (United States)

    Kieffer, Kevin M.; Thompson, Bruce

    As the 1994 publication manual of the American Psychological Association emphasized, "p" values are affected by sample size. As a result, it can be helpful to interpret the results of statistical significance tests in a sample size context by conducting so-called "what if" analyses. However, these methods can be inaccurate…

  16. Recent Literature on Whether Statistical Significance Tests Should or Should Not Be Banned.

    Science.gov (United States)

    Deegear, James

    This paper summarizes the literature regarding statistical significance testing, with an emphasis on recent literature in various disciplines and literature exploring why researchers have demonstrably failed to be influenced by the American Psychological Association publication manual's encouragement to report effect sizes. Also considered are…

  17. Is statistical significance clinically important?--A guide to judge the clinical relevance of study findings

    NARCIS (Netherlands)

    Sierevelt, Inger N.; van Oldenrijk, Jakob; Poolman, Rudolf W.

    2007-01-01

    In this paper we describe several issues that influence the reporting of statistical significance in relation to clinical importance, since misinterpretation of p values is a common issue in orthopaedic literature. Orthopaedic research is tormented by the risks of false-positive (type I error) and

  18. Statistical Significance of the Contribution of Variables to the PCA Solution: An Alternative Permutation Strategy

    Science.gov (United States)

    Linting, Marielle; van Os, Bart Jan; Meulman, Jacqueline J.

    2011-01-01

    In this paper, the statistical significance of the contribution of variables to the principal components in principal components analysis (PCA) is assessed nonparametrically by the use of permutation tests. We compare a new strategy to a strategy used in previous research consisting of permuting the columns (variables) of a data matrix…

  19. Statistical significance versus clinical importance: trials on exercise therapy for chronic low back pain as example.

    NARCIS (Netherlands)

    van Tulder, M.W.; Malmivaara, A.; Hayden, J.; Koes, B.

    2007-01-01

    STUDY DESIGN. Critical appraisal of the literature. OBJECTIVES. The objective of this study was to assess if results of back pain trials are statistically significant and clinically important. SUMMARY OF BACKGROUND DATA. There seems to be a discrepancy between conclusions reported by authors and

  20. P-Value, a true test of statistical significance? a cautionary note ...

    African Journals Online (AJOL)

    While it's not the intention of the founders of significance testing and hypothesis testing to have the two ideas intertwined as if they are complementary, the inconvenient marriage of the two practices into one coherent, convenient, incontrovertible and misinterpreted practice has dotted our standard statistics textbooks and ...

  1. Accelerator driven reactors, - the significance of the energy distribution of spallation neutrons on the neutron statistics

    Energy Technology Data Exchange (ETDEWEB)

    Fhager, V

    2000-01-01

    In order to make correct predictions of the second moment of statistical nuclear variables, such as the number of fissions and the number of thermalized neutrons, the dependence of the energy distribution of the source particles on their number should be considered. It has been pointed out recently that neglecting this number dependence in accelerator driven systems might result in bad estimates of the second moment, and this paper contains qualitative and quantitative estimates of the size of these effects. We walk towards the requested results in two steps. First, models of the number dependent energy distributions of the neutrons that are ejected in the spallation reactions are constructed, both by simple assumptions and by extracting energy distributions of spallation neutrons from a high-energy particle transport code. Then, the second moment of nuclear variables in a sub-critical reactor, into which spallation neutrons are injected, is calculated. The results from second moment calculations using number dependent energy distributions for the source neutrons are compared to those where only the average energy distribution is used. Two physical models are employed to simulate the neutron transport in the reactor. One is analytical, treating only slowing down of neutrons by elastic scattering in the core material. For this model, equations are written down and solved for the second moment of thermalized neutrons that include the distribution of energy of the spallation neutrons. The other model utilizes Monte Carlo methods for tracking the source neutrons as they travel inside the reactor material. Fast and thermal fission reactions are considered, as well as neutron capture and elastic scattering, and the second moment of the number of fissions, the number of neutrons that leaked out of the system, etc. are calculated. Both models use a cylindrical core with a homogeneous mixture of core material. Our results indicate that the number dependence of the energy

  2. Accelerator driven reactors, - the significance of the energy distribution of spallation neutrons on the neutron statistics

    International Nuclear Information System (INIS)

    Fhager, V.

    2000-01-01

    In order to make correct predictions of the second moment of statistical nuclear variables, such as the number of fissions and the number of thermalized neutrons, the dependence of the energy distribution of the source particles on their number should be considered. It has been pointed out recently that neglecting this number dependence in accelerator driven systems might result in bad estimates of the second moment, and this paper contains qualitative and quantitative estimates of the size of these effects. We walk towards the requested results in two steps. First, models of the number dependent energy distributions of the neutrons that are ejected in the spallation reactions are constructed, both by simple assumptions and by extracting energy distributions of spallation neutrons from a high-energy particle transport code. Then, the second moment of nuclear variables in a sub-critical reactor, into which spallation neutrons are injected, is calculated. The results from second moment calculations using number dependent energy distributions for the source neutrons are compared to those where only the average energy distribution is used. Two physical models are employed to simulate the neutron transport in the reactor. One is analytical, treating only slowing down of neutrons by elastic scattering in the core material. For this model, equations are written down and solved for the second moment of thermalized neutrons that include the distribution of energy of the spallation neutrons. The other model utilizes Monte Carlo methods for tracking the source neutrons as they travel inside the reactor material. Fast and thermal fission reactions are considered, as well as neutron capture and elastic scattering, and the second moment of the number of fissions, the number of neutrons that leaked out of the system, etc. are calculated. Both models use a cylindrical core with a homogeneous mixture of core material. Our results indicate that the number dependence of the energy

  3. Growth curve models and statistical diagnostics

    CERN Document Server

    Pan, Jian-Xin

    2002-01-01

    Growth-curve models are generalized multivariate analysis-of-variance models. These models are especially useful for investigating growth problems on short times in economics, biology, medical research, and epidemiology. This book systematically introduces the theory of the GCM with particular emphasis on their multivariate statistical diagnostics, which are based mainly on recent developments made by the authors and their collaborators. The authors provide complete proofs of theorems as well as practical data sets and MATLAB code.

  4. Topology for Statistical Modeling of Petascale Data

    Energy Technology Data Exchange (ETDEWEB)

    Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Levine, Joshua [Univ. of Utah, Salt Lake City, UT (United States); Gyulassy, Attila [Univ. of Utah, Salt Lake City, UT (United States); Bremer, P. -T. [Univ. of Utah, Salt Lake City, UT (United States)

    2013-10-31

    Many commonly used algorithms for mathematical analysis do not scale well enough to accommodate the size or complexity of petascale data produced by computational simulations. The primary goal of this project is to develop new mathematical tools that address both the petascale size and uncertain nature of current data. At a high level, the approach of the entire team involving all three institutions is based on the complementary techniques of combinatorial topology and statistical modelling. In particular, we use combinatorial topology to filter out spurious data that would otherwise skew statistical modelling techniques, and we employ advanced algorithms from algebraic statistics to efficiently find globally optimal fits to statistical models. The overall technical contributions can be divided loosely into three categories: (1) advances in the field of combinatorial topology, (2) advances in statistical modelling, and (3) new integrated topological and statistical methods. Roughly speaking, the division of labor between our 3 groups (Sandia Labs in Livermore, Texas A&M in College Station, and U Utah in Salt Lake City) is as follows: the Sandia group focuses on statistical methods and their formulation in algebraic terms, and finds the application problems (and data sets) most relevant to this project; the Texas A&M group develops new algebraic geometry algorithms, in particular with fewnomial theory; and the Utah group develops new algorithms in computational topology via Discrete Morse Theory. However, we hasten to point out that our three groups stay in tight contact via videoconference every 2 weeks, so there is much synergy of ideas between the groups. The remainder of this document focuses on the contributions that had greater direct involvement from the team at the University of Utah in Salt Lake City.

  5. A statistical model for predicting muscle performance

    Science.gov (United States)

    Byerly, Diane Leslie De Caix

    The objective of these studies was to develop a capability for predicting muscle performance and fatigue to be utilized for both space- and ground-based applications. To develop this predictive model, healthy test subjects performed a defined, repetitive dynamic exercise to failure using a Lordex spinal machine. Throughout the exercise, surface electromyography (SEMG) data were collected from the erector spinae using a Mega Electronics ME3000 muscle tester and surface electrodes placed on both sides of the back muscle. These data were analyzed using a 5th order Autoregressive (AR) model and statistical regression analysis. It was determined that an AR derived parameter, the mean average magnitude of AR poles, significantly correlated with the maximum number of repetitions (designated Rmax) that a test subject was able to perform. Using the mean average magnitude of AR poles, a test subject's performance to failure could be predicted as early as the sixth repetition of the exercise. This predictive model has the potential to provide a basis for improving post-space flight recovery, monitoring muscle atrophy in astronauts and assessing the effectiveness of countermeasures, monitoring astronaut performance and fatigue during Extravehicular Activity (EVA) operations, providing pre-flight assessment of the ability of an EVA crewmember to perform a given task, improving the design of training protocols and simulations for strenuous International Space Station assembly EVA, and enabling EVA work task sequences to be planned enhancing astronaut performance and safety. Potential ground-based, medical applications of the predictive model include monitoring muscle deterioration and performance resulting from illness, establishing safety guidelines in the industry for repetitive tasks, monitoring the stages of rehabilitation for muscle-related injuries sustained in sports and accidents, and enhancing athletic performance through improved training protocols while reducing

  6. Bayesian models a statistical primer for ecologists

    CERN Document Server

    Hobbs, N Thompson

    2015-01-01

    Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods, in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probabili

  7. A critical discussion of null hypothesis significance testing and statistical power analysis within psychological research

    DEFF Research Database (Denmark)

    Jones, Allan; Sommerlund, Bo

    2007-01-01

    The uses of null hypothesis significance testing (NHST) and statistical power analysis within psychological research are critically discussed. The article looks at the problems of relying solely on NHST when dealing with small and large sample sizes. The use of power analysis in estimating the potential error introduced by small and large samples is advocated. Power analysis is not recommended as a replacement to NHST but as an additional source of information about the phenomena under investigation. Moreover, the importance of conceptual analysis in relation to statistical analysis of hypothesis...

  8. Statistical transmutation in doped quantum dimer models.

    Science.gov (United States)

    Lamas, C A; Ralko, A; Cabra, D C; Poilblanc, D; Pujol, P

    2012-07-06

    We prove a "statistical transmutation" symmetry of doped quantum dimer models on the square, triangular, and kagome lattices: the energy spectrum is invariant under a simultaneous change of statistics (i.e., bosonic into fermionic or vice versa) of the holes and of the signs of all the dimer resonance loops. This exact transformation enables us to define the duality equivalence between doped quantum dimer Hamiltonians and provides the analytic framework to analyze dynamical statistical transmutations. We investigate numerically the doping of the triangular quantum dimer model with special focus on the topological Z(2) dimer liquid. Doping leads to four (instead of two for the square lattice) inequivalent families of Hamiltonians. Competition between phase separation, superfluidity, supersolidity, and fermionic phases is investigated in the four families.

  9. STATISTICAL MODELS OF REPRESENTING INTELLECTUAL CAPITAL

    Directory of Open Access Journals (Sweden)

    Andreea Feraru

    2016-06-01

    This article, entitled Statistical Models of Representing Intellectual Capital, approaches and analyses the concept of intellectual capital, as well as the main models which can support entrepreneurs/managers in evaluating and quantifying the advantages of intellectual capital. Most authors examine intellectual capital from a static perspective and focus on the development of its various evaluation models. In this chapter we surveyed the classical static models: Sveiby, Edvinsson, Balanced Scorecard, as well as the canonical model of intellectual capital. Among the group of static models for evaluating organisational intellectual capital the canonical model stands out. This model enables the structuring of organisational intellectual capital into human capital, structural capital and relational capital. Although the model is widespread, it is a static one and can thus create a series of errors in the process of evaluation, because the three entities mentioned above are not independent from the viewpoint of their contents, as any logic of structuring complex entities requires.

  10. (ajst) statistical mechanics model for orientational

    African Journals Online (AJOL)

    Science and Engineering Series Vol. 6, No. 2, pp. 94 - 101. STATISTICAL MECHANICS MODEL FOR ORIENTATIONAL. MOTION OF TWO-DIMENSIONAL RIGID ROTATOR. Malo, J.O. ... there is no translational motion and that they are well separated so .... constant and I is the moment of inertia of a linear rotator. Thus, the ...

  11. Statistical Model Checking for Biological Systems

    DEFF Research Database (Denmark)

    David, Alexandre; Larsen, Kim Guldstrand; Legay, Axel

    2014-01-01

    Statistical Model Checking (SMC) is a highly scalable simulation-based verification approach for testing and estimating the probability that a stochastic system satisfies a given linear temporal property. The technique has been applied to (discrete and continuous time) Markov chains, stochastic...
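
    The simulation-based estimation idea in this abstract can be sketched in a few lines. The example below is an illustration only and does not reproduce the authors' tools: the three-state Markov chain, the function names, and the parameters `epsilon` and `delta` are all hypothetical, and the number of runs is chosen with the Chernoff-Hoeffding bound, a common choice in statistical model checking for bounding the estimation error.

```python
import math
import random


def required_samples(epsilon, delta):
    """Chernoff-Hoeffding bound: number of runs so that the estimate is within
    epsilon of the true probability with confidence at least 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def simulate_reaches_goal(rng, max_steps):
    """One run of a toy discrete-time Markov chain (hypothetical model).
    States: 0 = start, 1 = intermediate, 2 = goal (absorbing).
    Returns True if the goal is reached within max_steps transitions."""
    transitions = {
        0: [(0, 0.5), (1, 0.5)],
        1: [(0, 0.3), (1, 0.3), (2, 0.4)],
    }
    state = 0
    for _ in range(max_steps):
        if state == 2:
            return True
        r = rng.random()
        cumulative = 0.0
        for next_state, prob in transitions[state]:
            cumulative += prob
            if r < cumulative:
                state = next_state
                break
    return state == 2


def estimate_probability(epsilon, delta, max_steps, seed=0):
    """Monte Carlo estimate of P(goal reached within max_steps)."""
    rng = random.Random(seed)
    runs = required_samples(epsilon, delta)
    hits = sum(simulate_reaches_goal(rng, max_steps) for _ in range(runs))
    return hits / runs, runs


estimate, runs = estimate_probability(epsilon=0.02, delta=0.01, max_steps=10)
print(f"{runs} runs, estimated probability = {estimate:.3f}")
```

    For this toy chain the exact answer can be worked out by hand (1 - 0.8^9, about 0.866), which makes it a convenient sanity check for the estimator; real SMC targets are models where no such closed form is available.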

  12. Topology for Statistical Modeling of Petascale Data

    Energy Technology Data Exchange (ETDEWEB)

    Bennett, Janine Camille [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Pebay, Philippe Pierre [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Levine, Joshua [Univ. of Utah, Salt Lake City, UT (United States); Gyulassy, Attila [Univ. of Utah, Salt Lake City, UT (United States); Rojas, Maurice [Texas A & M Univ., College Station, TX (United States)

    2014-07-01

    This document presents current technical progress and dissemination of results for the Mathematics for Analysis of Petascale Data (MAPD) project titled "Topology for Statistical Modeling of Petascale Data", funded by the Office of Science Advanced Scientific Computing Research (ASCR) Applied Math program.

  13. Establishing statistical models of manufacturing parameters

    International Nuclear Information System (INIS)

    Senevat, J.; Pape, J.L.; Deshayes, J.F.

    1991-01-01

    This paper reports on the effect of pilgering and cold-work parameters on contractile strain ratio and mechanical properties that were investigated using a large population of Zircaloy tubes. Statistical models were established between: contractile strain ratio and tooling parameters, mechanical properties (tensile test, creep test) and cold-work parameters, and mechanical properties and stress-relieving temperature

  14. Statistical models for optimizing mineral exploration

    International Nuclear Information System (INIS)

    Wignall, T.K.; DeGeoffroy, J.

    1987-01-01

    The primary purpose of mineral exploration is to discover ore deposits. The emphasis of this volume is on the mathematical and computational aspects of optimizing mineral exploration. The seven chapters that make up the main body of the book are devoted to the description and application of various types of computerized geomathematical models. These chapters include: (1) the optimal selection of ore deposit types and regions of search, as well as prospecting selected areas, (2) designing airborne and ground field programs for the optimal coverage of prospecting areas, and (3) delineating and evaluating exploration targets within prospecting areas by means of statistical modeling. Many of these statistical programs are innovative and are designed to be useful for mineral exploration modeling. Examples of geomathematical models are applied to exploring for six main types of base and precious metal deposits, as well as other mineral resources (such as bauxite and uranium)

  15. A statistical model for mapping morphological shape

    Directory of Open Access Journals (Sweden)

    Li Jiahan

    2010-07-01

    Background: Living things come in all shapes and sizes, from bacteria, plants, and animals to humans. Knowledge about the genetic mechanisms for biological shape has far-reaching implications for a wide spectrum of scientific disciplines including anthropology, agriculture, developmental biology, evolution and biomedicine. Results: We derived a statistical model for mapping specific genes or quantitative trait loci (QTLs) that control morphological shape. The model was formulated within the mixture framework, in which different types of shape are thought to result from genotypic discrepancies at a QTL. The EM algorithm was implemented to estimate QTL genotype-specific shapes based on a shape correspondence analysis. Computer simulation was used to investigate the statistical property of the model. Conclusion: By identifying specific QTLs for morphological shape, the model developed will help to ask, disseminate and address many major integrative biological and genetic questions and challenges in the genetic control of biological shape and function.
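
    To make the mixture-plus-EM idea in this abstract concrete, here is a deliberately simplified sketch. It is not the authors' shape model: a single scalar trait stands in for shape, two latent classes stand in for QTL genotypes, the data are simulated, and all names (`em_two_component`, `normal_pdf`) are hypothetical. It shows only the generic E-step/M-step loop for a two-component Gaussian mixture.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_component(values, iterations=50):
    """EM for a two-component 1-D Gaussian mixture.
    Returns (weights, means, sds), with component 0 initialised at the minimum."""
    means = [min(values), max(values)]
    sds = [statistics.pstdev(values)] * 2
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability that each value belongs to each class.
        resp = []
        for x in values:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and spreads from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)  # guard against a collapsing component
    return weights, means, sds


# Simulated trait values from two latent classes (illustrative data only).
rng = random.Random(1)
values = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(5.0, 1.0) for _ in range(200)]
weights, means, sds = em_two_component(values)
print(f"weights = {weights[0]:.2f}/{weights[1]:.2f}, means = {means[0]:.2f}/{means[1]:.2f}")
```

    In the setting described in the abstract, the class-specific mean would be replaced by a genotype-specific shape and the mixing weights would be tied to marker information; the alternating structure of the algorithm stays the same.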

  16. Performance modeling, stochastic networks, and statistical multiplexing

    CERN Document Server

    Mazumdar, Ravi R

    2013-01-01

    This monograph presents a concise mathematical approach for modeling and analyzing the performance of communication networks with the aim of introducing an appropriate mathematical framework for modeling and analysis as well as understanding the phenomenon of statistical multiplexing. The models, techniques, and results presented form the core of traffic engineering methods used to design, control and allocate resources in communication networks. The novelty of the monograph is the fresh approach and insights provided by a sample-path methodology for queueing models that highlights the importan

  17. Statistical models for competing risk analysis

    International Nuclear Information System (INIS)

    Sather, H.N.

    1976-08-01

    Research results are presented on three new models for potential applications in competing risks problems. One section covers the basic statistical relationships underlying the subsequent competing risks model development. Another discusses the problem of comparing cause-specific risk structure by competing risks theory in two homogeneous populations, P1 and P2. Weibull models which allow more generality than the Berkson and Elveback models are studied for the effect of time on the hazard function. The use of concomitant information for modeling single-risk survival is extended to the multiple failure mode domain of competing risks. The model used to illustrate the use of this methodology is a life table model which has constant hazards within pre-designated intervals of the time scale. Two parametric models for bivariate dependent competing risks, which provide interesting alternatives, are proposed and examined.

  18. Statistical physics of pairwise probability models

    Directory of Open Access Journals (Sweden)

    Yasser Roudi

    2009-11-01

    Statistical models for describing the probability distribution over the states of biological systems are commonly used for dimensional reduction. Among these models, pairwise models are very attractive in part because they can be fit using a reasonable amount of data: knowledge of the means and correlations between pairs of elements in the system is sufficient. Not surprisingly, then, using pairwise models for studying neural data has been the focus of many studies in recent years. In this paper, we describe how tools from statistical physics can be employed for studying and using pairwise models. We build on our previous work on the subject and study the relation between different methods for fitting these models and evaluating their quality. In particular, using data from simulated cortical networks we study how the quality of various approximate methods for inferring the parameters in a pairwise model depends on the time bin chosen for binning the data. We also study the effect of the size of the time bin on the model quality itself, again using simulated data. We show that using finer time bins increases the quality of the pairwise model. We offer new ways of deriving the expressions reported in our previous work for assessing the quality of pairwise models.

  19. Confounding and Statistical Significance of Indirect Effects: Childhood Adversity, Education, Smoking, and Anxious and Depressive Symptomatology

    Directory of Open Access Journals (Sweden)

    Mashhood Ahmed Sheikh

    2017-08-01

    The life course perspective, the risky families model, and stress-and-coping models provide the rationale for assessing the role of smoking as a mediator in the association between childhood adversity and anxious and depressive symptomatology (ADS) in adulthood. However, no previous study has assessed the independent mediating role of smoking in the association between childhood adversity and ADS in adulthood. Moreover, the importance of mediator-response confounding variables has rarely been demonstrated empirically in social and psychiatric epidemiology. The aim of this paper was to (i) assess the mediating role of smoking in adulthood in the association between childhood adversity and ADS in adulthood, and (ii) assess the change in estimates due to different mediator-response confounding factors (education, alcohol intake, and social support). The present analysis used data collected from 1994 to 2008 within the framework of the Tromsø Study (N = 4,530), a representative prospective cohort study of men and women. Seven childhood adversities (low mother's education, low father's education, low financial conditions, exposure to passive smoke, psychological abuse, physical abuse, and substance abuse distress) were used to create a childhood adversity score. Smoking status was measured at a mean age of 54.7 years (Tromsø IV), and ADS in adulthood was measured at a mean age of 61.7 years (Tromsø V). Mediation analysis was used to assess the indirect effect and the proportion of mediated effect (%) of childhood adversity on ADS in adulthood via smoking in adulthood. The test-retest reliability of smoking was good (Kappa: 0.67, 95% CI: 0.63; 0.71) in this sample. Childhood adversity was associated with a 10% increased risk of smoking in adulthood (Relative risk: 1.10, 95% CI: 1.03; 1.18), and both childhood adversity and smoking in adulthood were associated with greater levels of ADS in adulthood (p < 0.001). Smoking in adulthood did not significantly

  20. Thresholds for statistical and clinical significance in systematic reviews with meta-analytic methods

    DEFF Research Database (Denmark)

    Jakobsen, Janus Christian; Wetterslev, Jorn; Winkel, Per

    2014-01-01

    BACKGROUND: Thresholds for statistical significance when assessing meta-analysis results are being insufficiently demonstrated by traditional 95% confidence intervals and P-values. Assessment of intervention effects in systematic reviews with meta-analysis deserves greater rigour. METHODS: Methodologies for assessing statistical and clinical significance of intervention effects in systematic reviews were considered. Balancing simplicity and comprehensiveness, an operational procedure was developed, based mainly on The Cochrane Collaboration methodology and the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) guidelines. RESULTS: We propose an eight-step procedure for better validation of meta-analytic results in systematic reviews: (1) Obtain the 95% confidence intervals and the P-values from both fixed-effect and random-effects meta-analyses and report the most...

  1. Testing statistical significance scores of sequence comparison methods with structure similarity

    Directory of Open Access Journals (Sweden)

    Leunissen Jack AM

    2006-10-01

    Full Text Available Abstract. Background: In the past years the Smith-Waterman sequence comparison algorithm has gained popularity due to improved implementations and rapidly increasing computing power. However, the quality and sensitivity of a database search are determined not only by the algorithm but also by the statistical significance testing for an alignment. The e-value is the most commonly used statistical validation method for sequence database searching. The CluSTr database and the Protein World database have been created using an alternative statistical significance test: a Z-score based on Monte-Carlo statistics. Several papers have described the superiority of the Z-score as compared to the e-value, using simulated data. We were interested in whether this could be validated when applied to existing, evolutionarily related protein sequences. Results: All experiments are performed on the ASTRAL SCOP database. The Smith-Waterman sequence comparison algorithm with both e-value and Z-score statistics is evaluated, using ROC, CVE and AP measures. The BLAST and FASTA algorithms are used as reference. We find that two out of three Smith-Waterman implementations with e-value are better at predicting structural similarities between proteins than the Smith-Waterman implementation with Z-score. SSEARCH especially has very high scores. Conclusion: The compute-intensive Z-score does not have a clear advantage over the e-value. The Smith-Waterman implementations give generally better results than their heuristic counterparts. We recommend using the SSEARCH algorithm combined with e-values for pairwise sequence comparisons.
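
    The idea behind a Monte-Carlo Z-score can be sketched in a few lines: score the real pair, score the same pair many times after shuffling one sequence, and express the real score in standard deviations above the shuffled mean. The sketch below is only illustrative. The scoring scheme (match +2, mismatch -1, linear gap -2), the number of shuffles, and the function names are assumptions made here; the databases named in the abstract use substitution matrices and their own settings.

```python
import random
import statistics


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (toy scoring)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def monte_carlo_z_score(a, b, shuffles=200, seed=0):
    """Z-score of the real score against scores for shuffled copies of b."""
    rng = random.Random(seed)
    observed = smith_waterman_score(a, b)
    letters = list(b)
    null_scores = []
    for _ in range(shuffles):
        rng.shuffle(letters)  # keeps composition, destroys order
        null_scores.append(smith_waterman_score(a, "".join(letters)))
    spread = statistics.stdev(null_scores)
    if spread == 0:
        return float("nan")  # the null distribution is degenerate
    return (observed - statistics.fmean(null_scores)) / spread
```

    Every shuffle requires a full alignment, so the cost grows linearly with the number of shuffles. That is the compute-intensive aspect the abstract refers to.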

  2. Statistical models of petrol engines vehicles dynamics

    Science.gov (United States)

    Ilie, C. O.; Marinescu, M.; Alexa, O.; Vilău, R.; Grosu, D.

    2017-10-01

    This paper focuses on statistical models of vehicle dynamics. A one-year testing program was designed and carried out, using many cars of the same type with gasoline engines and different mileage. Experimental data were collected from onboard sensors and from sensors on the engine test stand. A database containing data from 64 tests was created. Several mathematical models were developed using the database and the system identification method. Each model is a SISO or a MISO linear predictive ARMAX (AutoRegressive-Moving-Average with eXogenous inputs) model. It represents a differential equation with constant coefficients. Sixty-four equations were produced for each dependency, such as engine torque as output with engine load and intake manifold pressure as inputs. Series of 64 values were obtained for each type of model. The final models were obtained using average values of the coefficients. The accuracy of the models was assessed.

  3. Equilibrium statistical mechanics of lattice models

    CERN Document Server

    Lavis, David A

    2015-01-01

    Most interesting and difficult problems in equilibrium statistical mechanics concern models which exhibit phase transitions. For graduate students and more experienced researchers this book provides an invaluable reference source of approximate and exact solutions for a comprehensive range of such models. Part I contains background material on classical thermodynamics and statistical mechanics, together with a classification and survey of lattice models. The geometry of phase transitions is described and scaling theory is used to introduce critical exponents and scaling laws. An introduction is given to finite-size scaling, conformal invariance and Schramm-Loewner evolution. Part II contains accounts of classical mean-field methods. The parallels between Landau expansions and catastrophe theory are discussed and Ginzburg-Landau theory is introduced. The extension of mean-field theory to higher orders is explored using the Kikuchi-Hijmans-De Boer hierarchy of approximations. In Part III the use of alge...

  4. Statistical shape and appearance models of bones.

    Science.gov (United States)

    Sarkalkan, Nazli; Weinans, Harrie; Zadpoor, Amir A

    2014-03-01

    When applied to bones, statistical shape models (SSM) and statistical appearance models (SAM) respectively describe the mean shape and mean density distribution of bones within a certain population as well as the main modes of variations of shape and density distribution from their mean values. The availability of this quantitative information regarding the detailed anatomy of bones provides new opportunities for diagnosis, evaluation, and treatment of skeletal diseases. The potential of SSM and SAM has been recently recognized within the bone research community. For example, these models have been applied for studying the effects of bone shape on the etiology of osteoarthritis, improving the accuracy of clinical osteoporotic fracture prediction techniques, design of orthopedic implants, and surgery planning. This paper reviews the main concepts, methods, and applications of SSM and SAM as applied to bone.

  5. Statistical Models of Adaptive Immune populations

    Science.gov (United States)

    Sethna, Zachary; Callan, Curtis; Walczak, Aleksandra; Mora, Thierry

    The availability of large (10^4-10^6 sequences) datasets of B or T cell populations from a single individual allows reliable fitting of complex statistical models for naïve generation, somatic selection, and hypermutation. It is crucial to utilize a probabilistic/informational approach when modeling these populations. The inferred probability distributions allow for population characterization, calculation of probability distributions of various hidden variables (e.g. number of insertions), as well as statistical properties of the distribution itself (e.g. entropy). In particular, the differences between the T cell populations of embryonic and mature mice will be examined as a case study. Comparing these populations, as well as proposed mixed populations, provides a concrete exercise in model creation, comparison, choice, and validation.

  6. Cellular automata and statistical mechanical models

    International Nuclear Information System (INIS)

    Rujan, P.

    1987-01-01

    The authors elaborate on the analogy between the transfer matrix of usual lattice models and the master equation describing the time development of cellular automata. Transient and stationary properties of probabilistic automata are linked to surface and bulk properties, respectively, of restricted statistical mechanical systems. It is demonstrated that methods of statistical physics can be successfully used to describe the dynamic and the stationary behavior of such automata. Some exact results are derived, including duality transformations, exact mappings, disorder, and linear solutions. Many examples are worked out in detail to demonstrate how to use statistical physics in order to construct cellular automata with desired properties. This approach is considered to be a first step toward the design of fully parallel, probabilistic systems whose computational abilities rely on the cooperative behavior of their components

  7. Intensive inpatient treatment for bulimia nervosa: Statistical and clinical significance of symptom changes.

    Science.gov (United States)

    Diedrich, Alice; Schlegl, Sandra; Greetfeld, Martin; Fumi, Markus; Voderholzer, Ulrich

    2018-03-01

    This study examines the statistical and clinical significance of symptom changes during an intensive inpatient treatment program with a strong psychotherapeutic focus for individuals with severe bulimia nervosa. 295 consecutively admitted bulimic patients were administered the Structured Interview for Anorexic and Bulimic Syndromes-Self-Rating (SIAB-S), the Eating Disorder Inventory-2 (EDI-2), the Brief Symptom Inventory (BSI), and the Beck Depression Inventory-II (BDI-II) at treatment intake and discharge. Results indicated statistically significant symptom reductions with large effect sizes regarding severity of binge eating and compensatory behavior (SIAB-S), overall eating disorder symptom severity (EDI-2), overall psychopathology (BSI), and depressive symptom severity (BDI-II) even when controlling for antidepressant medication. The majority of patients showed either reliable (EDI-2: 33.7%, BSI: 34.8%, BDI-II: 18.1%) or even clinically significant symptom changes (EDI-2: 43.2%, BSI: 33.9%, BDI-II: 56.9%). Patients with clinically significant improvement were less distressed at intake and less likely to suffer from a comorbid borderline personality disorder when compared with those who did not improve to a clinically significant extent. Findings indicate that intensive psychotherapeutic inpatient treatment may be effective in about 75% of severely affected bulimic patients. For the remaining non-responding patients, inpatient treatment might be improved through an even stronger focus on the reduction of comorbid borderline personality traits.

  8. Cloud-based solution to identify statistically significant MS peaks differentiating sample categories.

    Science.gov (United States)

    Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B

    2013-03-23

    Mass spectrometry (MS) has evolved to become the primary high-throughput tool for proteomics-based biomarker discovery. Multiple challenges in protein MS data analysis remain: management of large-scale and complex data sets; MS peak identification and indexing; and high-dimensional peak differential analysis with false discovery rate (FDR) estimation based on concurrent statistical tests. "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets to identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution, which provides experimental biologists easy access to "cloud" computing capabilities to analyze MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The presented web application supports online uploading and analysis of large-scale MS data through a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.

  9. Statistical Modelling of Wind Profiles - Data Analysis and Modelling

    DEFF Research Database (Denmark)

    Jónsson, Tryggvi; Pinson, Pierre

    The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.

  10. Statistical modeling of geopressured geothermal reservoirs

    Science.gov (United States)

    Ansari, Esmail; Hughes, Richard; White, Christopher D.

    2017-06-01

    Identifying attractive candidate reservoirs for producing geothermal energy requires predictive models. In this work, inspectional analysis and statistical modeling are used to create simple predictive models for a line drive design. Inspectional analysis on the partial differential equations governing this design yields a minimum number of fifteen dimensionless groups required to describe the physics of the system. These dimensionless groups are explained and confirmed using models with similar dimensionless groups but different dimensional parameters. This study models dimensionless production temperature and thermal recovery factor as the responses of a numerical model. These responses are obtained by a Box-Behnken experimental design. An uncertainty plot is used to segment the dimensionless time and develop a model for each segment. The important dimensionless numbers for each segment of the dimensionless time are identified using the Boosting method. These selected numbers are used in the regression models. The developed models are reduced to have a minimum number of predictors and interactions. The reduced final models are then presented and assessed using testing runs. Finally, applications of these models are offered. The presented workflow is generic and can be used to translate the output of a numerical simulator into simple predictive models in other research areas involving numerical simulation.

  11. A statistical model for instable thermodynamical systems

    International Nuclear Information System (INIS)

    Sommer, Jens-Uwe

    2003-01-01

    A generic model is presented for statistical systems which display thermodynamic features in contrast to our everyday experience, such as infinite and negative heat capacities. Such systems are instable in terms of classical equilibrium thermodynamics. Using our statistical model, we are able to investigate states of instable systems which are undefined in the framework of equilibrium thermodynamics. We show that a region of negative heat capacity in the adiabatic environment leads to a first-order-like phase transition when the system is coupled to a heat reservoir. This phase transition takes place without a phase coexistence. Nevertheless, all intermediate states are stable due to fluctuations. When two instable systems are brought into thermal contact, the temperature of the composed system is lower than the minimum temperature of the individual systems. Generally, the equilibrium states of instable systems cannot be simply decomposed into equilibrium states of the individual systems. The properties of instable systems depend on the environment; ensemble equivalence is broken.

  12. Logarithmic transformed statistical models in calibration

    International Nuclear Information System (INIS)

    Zeis, C.D.

    1975-01-01

    A general type of statistical model used for calibration of instruments having the property that the standard deviations of the observed values increase as a function of the mean value is described. The application to the Helix Counter at the Rocky Flats Plant is primarily from a theoretical point of view. The Helix Counter measures the amount of plutonium in certain types of chemicals. The method described can be used also for other calibrations. (U.S.)

  13. ARSENIC CONTAMINATION IN GROUNDWATER: A STATISTICAL MODELING

    OpenAIRE

    Palas Roy; Naba Kumar Mondal; Biswajit Das; Kousik Das

    2013-01-01

    High arsenic in natural groundwater in most of the tubewells of the Purbasthali-Block II area of Burdwan district (W.B., India) has recently come into focus as a serious environmental concern. This paper aims to illustrate statistical modeling of the arsenic-contaminated groundwater in order to identify the interrelation of the arsenic content with other participating groundwater parameters, so that the arsenic contamination level can easily be predicted by analyzing only those parameters. Mul...

  14. A simple statistical model for geomagnetic reversals

    Science.gov (United States)

    Constable, Catherine

    1990-01-01

    The diversity of paleomagnetic records of geomagnetic reversals now available indicates that the field configuration during transitions cannot be adequately described by simple zonal or standing field models. A new model described here is based on statistical properties inferred from the present field and is capable of simulating field transitions like those observed. Some insight is obtained into what one can hope to learn from paleomagnetic records. In particular, it is crucial that the effects of smoothing in the remanence acquisition process be separated from true geomagnetic field behavior. This might enable us to determine the time constants associated with the dominant field configuration during a reversal.

  15. Statistical Analysis and Evaluation of the Depth of the Ruts on Lithuanian State Significance Roads

    Directory of Open Access Journals (Sweden)

    Erinijus Getautis

    2011-04-01

    Full Text Available The aim of this work is to gather information about rut depth on the national flexible-pavement roads, to determine its statistical dispersion index and to determine whether it meets the applicable requirements. An analysis of scientific works on the appearance of ruts in asphalt and their influence on driving is presented. Dynamic models of ruts in asphalt are presented as well. Experimental data on rut depth dispersion on the Lithuanian national highway Vilnius – Kaunas are prepared. Conclusions are formulated and presented. Article in Lithuanian

  16. Statistical significance estimation of a signal within the GooFit framework on GPUs

    Directory of Open Access Journals (Sweden)

    Cristella Leonardo

    2017-01-01

    Full Text Available In order to test the computing capabilities of GPUs with respect to traditional CPU cores, a high-statistics toy Monte Carlo technique has been implemented in both the ROOT/RooFit and GooFit frameworks with the purpose of estimating the statistical significance of the structure observed by CMS close to the kinematical boundary of the J/ψϕ invariant mass in the three-body decay B+ → J/ψϕK+. GooFit is an open data analysis tool under development that interfaces ROOT/RooFit to the CUDA platform on NVIDIA GPUs. The optimized GooFit application running on GPUs hosted by servers in the Bari Tier2 provides striking speed-ups with respect to the RooFit application parallelised on multiple CPUs by means of the PROOF-Lite tool. The considerable resulting speed-up, evident when comparing concurrent GooFit processes allowed by the CUDA Multi Process Service and a RooFit/PROOF-Lite process with multiple CPU workers, is presented and discussed in detail. By means of GooFit it has also been possible to explore the behaviour of a likelihood ratio test statistic in different situations in which the Wilks Theorem may or may not apply because its regularity conditions are not satisfied.
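
    The toy Monte Carlo idea can be shown on a much simpler problem than the fits described above. The sketch below assumes a single-bin counting experiment with a known background expectation: it generates background-only pseudo-experiments, takes the fraction that fluctuate at least as high as the observed count as the p-value, and converts that to a one-sided Gaussian significance. It does not reproduce the RooFit/GooFit likelihood-ratio procedure; the function names, the counting setup, and the example numbers are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def sample_poisson(mean, rng):
    """Draw one Poisson-distributed count (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def toy_significance(background, observed, toys=20000, seed=1):
    """p-value and one-sided Z from background-only pseudo-experiments.

    Assumes 0 < p < 1; if no toy reaches the observed count, the
    p-value is floored at 1/toys, so Z is then only a lower bound.
    """
    rng = random.Random(seed)
    exceed = sum(sample_poisson(background, rng) >= observed
                 for _ in range(toys))
    p_value = max(exceed, 1) / toys
    return p_value, NormalDist().inv_cdf(1.0 - p_value)


# Hypothetical example: 10 expected background events, 20 observed.
p_value, z_value = toy_significance(10.0, 20)
```

    The number of toys bounds the smallest p-value that can be resolved, so high significances need very large toy samples. That cost is the motivation for running the pseudo-experiments on GPUs.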

  17. Statistical significance of theoretical predictions: A new dimension in nuclear structure theories (I)

    International Nuclear Information System (INIS)

    DUDEK, J; SZPAK, B; FORNAL, B; PORQUET, M-G

    2011-01-01

    In this and the follow-up article we briefly discuss what we believe represents one of the most serious problems in contemporary nuclear structure: the question of statistical significance of parametrizations of nuclear microscopic Hamiltonians and the implied predictive power of the underlying theories. In the present Part I, we introduce the main lines of reasoning of the so-called Inverse Problem Theory, an important sub-field of contemporary applied mathematics, here illustrated using the example of the Nuclear Mean-Field Approach.

  18. Statistical Modelling of the Soil Dielectric Constant

    Science.gov (United States)

    Usowicz, Boguslaw; Marczewski, Wojciech; Bogdan Usowicz, Jerzy; Lipiec, Jerzy

    2010-05-01

    The dielectric constant of soil is a physical property that is very sensitive to water content. It underlies several electrical measurement techniques for determining water content by direct methods (TDR, FDR, and others related to the effects of electrical conductance and/or capacitance) and by indirect RS (Remote Sensing) methods. This work is devoted to a particular statistical way of modelling the dielectric constant as a property that accounts for a wide range of specific soil composition, porosity, and mass density, within the unsaturated water content range. Usually, similar models are determined for a few particular soil types, and when the soil type changes one needs to switch the model to another type or adjust it by parametrization of soil compounds. It is therefore difficult to compare and cross-reference results between models. The presented model was developed for a generic representation of soil as a hypothetical mixture of spheres, each representing a soil fraction in its proper phase state. The model generates a serial-parallel mesh of conductive and capacitive paths, which is analysed for a total conductive or capacitive property. The model was first developed to determine thermal conductivity, and it is now extended to the dielectric constant by analysing the capacitive mesh. The analysis is carried out by statistical means that obey the physical laws governing the serial-parallel branching of the representative electrical mesh. The physical relevance of the analysis is established electrically, but the definition of the electrical mesh is controlled statistically by parametrization of compound fractions, by determining the number of representative spheres per unit volume per fraction, and by determining the number of fractions. In this way the model is capable of covering the properties of nearly all possible soil types and all phase states within recognition of the Lorenz and Knudsen conditions. In effect the model allows generating a hypothetical representative of

  19. Encoding Dissimilarity Data for Statistical Model Building.

    Science.gov (United States)

    Wahba, Grace

    2010-12-01

    We summarize, review and comment upon three papers which discuss the use of discrete, noisy, incomplete, scattered pairwise dissimilarity data in statistical model building. Convex cone optimization codes are used to embed the objects into a Euclidean space which respects the dissimilarity information while controlling the dimension of the space. A "newbie" algorithm is provided for embedding new objects into this space. This allows the dissimilarity information to be incorporated into a Smoothing Spline ANOVA penalized likelihood model, a Support Vector Machine, or any model that will admit Reproducing Kernel Hilbert Space components, for nonparametric regression, supervised learning, or semi-supervised learning. Future work and open questions are discussed. The papers are: F. Lu, S. Keles, S. Wright and G. Wahba 2005. A framework for kernel regularization with application to protein clustering. Proceedings of the National Academy of Sciences 102, 12332-1233. G. Corrada Bravo, G. Wahba, K. Lee, B. Klein, R. Klein and S. Iyengar 2009. Examining the relative influence of familial, genetic and environmental covariate information in flexible risk models. Proceedings of the National Academy of Sciences 106, 8128-8133. F. Lu, Y. Lin and G. Wahba. Robust manifold unfolding with kernel regularization. TR 1008, Department of Statistics, University of Wisconsin-Madison.

  20. Modelling vocal anatomy's significant effect on speech

    NARCIS (Netherlands)

    de Boer, B.

    2010-01-01

    This paper investigates the effect of larynx position on the articulatory abilities of a humanlike vocal tract. Previous work has investigated models that were built to resemble the anatomy of existing species or fossil ancestors. This has led to conflicting conclusions about the relation between

  1. Average Nuclear properties based on statistical model

    International Nuclear Information System (INIS)

    El-Jaick, L.J.

    1974-01-01

    The rough properties of nuclei were investigated with a statistical model, in systems with the same and with different numbers of protons and neutrons, separately, considering the Coulomb energy in the latter system. Some average nuclear properties were calculated based on the energy density of nuclear matter, from the Weizsäcker-Bethe semi-empirical mass formula, generalized for compressible nuclei. In the study of the surface energy coefficient a_s, the great influence exercised by Coulomb energy and nuclear compressibility was verified. For a good fit of the beta stability lines and mass excess, the surface symmetry energy was established. (M.C.K.) [pt

  2. Plan Recognition using Statistical Relational Models

    Science.gov (United States)

    2014-08-25

    corresponding undirected model can be significantly more complex since there is no closed form solution for the maximum-likelihood set of parameters unlike in...algorithm did not scale to larger training sets, and the overall results are still not competitive with BALPs. 5In directed models, a closed form solution...opinions of ARO, DARPA, NSF or any other government agency. References Albrecht DW, Zukerman I, Nicholson AE. Bayesian models for keyhole plan

  3. Statistical determination of significant curved I-girder bridge seismic response parameters

    Science.gov (United States)

    Seo, Junwon

    2013-06-01

    Curved steel bridges are commonly used at interchanges in transportation networks and more of these structures continue to be designed and built in the United States. Though the use of these bridges continues to increase in locations that experience high seismicity, the effects of curvature and other parameters on their seismic behaviors have been neglected in current risk assessment tools. These tools can evaluate the seismic vulnerability of a transportation network using fragility curves. One critical component of fragility curve development for curved steel bridges is the completion of sensitivity analyses that help identify influential parameters related to their seismic response. In this study, an accessible inventory of existing curved steel girder bridges located primarily in the Mid-Atlantic United States (MAUS) was used to establish statistical characteristics used as inputs for a seismic sensitivity study. Critical seismic response quantities were captured using 3D nonlinear finite element models. Influential parameters from these quantities were identified using statistical tools that incorporate experimental Plackett-Burman Design (PBD), which included Pareto optimal plots and prediction profiler techniques. The findings revealed that the potential variation in the influential parameters included number of spans, radius of curvature, maximum span length, girder spacing, and cross-frame spacing. These parameters showed varying levels of influence on the critical bridge response.

  4. Examining reproducibility in psychology : A hybrid method for combining a statistically significant original study and a replication

    NARCIS (Netherlands)

    Van Aert, R.C.M.; Van Assen, M.A.L.M.

    2018-01-01

    The unrealistically high rate of positive results within psychology has increased the attention to replication research. However, researchers who conduct a replication and want to statistically combine the results of their replication with a statistically significant original study encounter

  5. A Note on Comparing the Power of Test Statistics at Low Significance Levels.

    Science.gov (United States)

    Morris, Nathan; Elston, Robert

    2011-01-01

    It is an obvious fact that the power of a test statistic is dependent upon the significance (alpha) level at which the test is performed. It is perhaps a less obvious fact that the relative performance of two statistics in terms of power is also a function of the alpha level. Through numerous personal discussions, we have noted that even some competent statisticians have the mistaken intuition that relative power comparisons at traditional levels such as α = 0.05 will be roughly similar to relative power comparisons at very low levels, such as the level α = 5 × 10^-8, which is commonly used in genome-wide association studies. In this brief note, we demonstrate that this notion is in fact quite wrong, especially with respect to comparing tests with differing degrees of freedom. In fact, at very low alpha levels the cost of additional degrees of freedom is often comparatively low. Thus we recommend that statisticians exercise caution when interpreting the results of power comparison studies which use alpha levels that will not be used in practice.
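
    A reader who wants to try such a comparison can compute the power of chi-square tests with 1 and 2 degrees of freedom directly. The sketch below is an illustration and is not taken from the note itself: it assumes chi-square test statistics with a given noncentrality parameter, and the function names and the noncentrality values in the loop are hypothetical. For 2 degrees of freedom it uses the standard representation of the noncentral chi-square distribution as a Poisson mixture of central chi-squares.

```python
import math
from statistics import NormalDist


def power_1df(ncp, alpha):
    """Power of a 1-df chi-square test with noncentrality ncp."""
    norm = NormalDist()
    z_crit = -norm.inv_cdf(alpha / 2.0)
    shift = math.sqrt(ncp)
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


def power_2df(ncp, alpha, terms=400):
    """Power of a 2-df chi-square test via the Poisson-mixture series."""
    half_c = -math.log(alpha)      # critical value c solves exp(-c/2) = alpha
    weight = math.exp(-ncp / 2.0)  # Poisson(ncp/2) probability of j = 0
    term = 1.0                     # (c/2)^i / i! for i = 0
    partial = 1.0                  # running sum of those terms up to i = j
    power = 0.0
    for j in range(terms):
        # alpha * partial equals P(central chi-square with 2j+2 df > c)
        power += weight * alpha * partial
        weight *= (ncp / 2.0) / (j + 1)
        term *= half_c / (j + 1)
        partial += term
    return power


# Hypothetical noncentralities: the 2-df test captures slightly more signal.
for alpha in (0.05, 5e-8):
    for ncp_1df, ncp_2df in ((8.0, 9.5), (30.0, 33.0)):
        print(alpha, power_1df(ncp_1df, alpha), power_2df(ncp_2df, alpha))
```

    Running the loop at both alpha levels shows how the gap between the two tests changes with the threshold, which is the kind of comparison the note cautions about.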

  6. Statistically significant faunal differences among Middle Ordovician age, Chickamauga Group bryozoan bioherms, central Alabama

    Energy Technology Data Exchange (ETDEWEB)

    Crow, C.J.

    1985-01-01

    Middle Ordovician age Chickamauga Group carbonates crop out along the Birmingham and Murphrees Valley anticlines in central Alabama. The macrofossil contents on exposed surfaces of seven bioherms have been counted to determine their various paleontologic characteristics. Twelve groups of organisms are present in these bioherms. Dominant organisms include bryozoans, algae, brachiopods, sponges, pelmatozoans, stromatoporoids and corals. Minor accessory fauna include predators, scavengers and grazers such as gastropods, ostracods, trilobites, cephalopods and pelecypods. Vertical and horizontal niche zonation has been detected for some of the bioherm dwelling fauna. No one bioherm of those studied exhibits all 12 groups of organisms; rather, individual bioherms display various subsets of the total diversity. Statistical treatment (G-test) of the diversity data indicates a lack of statistical homogeneity of the bioherms, both within and between localities. Between-locality population heterogeneity can be ascribed to differences in biologic responses to such gross environmental factors as water depth and clarity, and energy levels. At any one locality, gross aspects of the paleoenvironments are assumed to have been more uniform. Significant differences among bioherms at any one locality may have resulted from patchy distribution of species populations, differential preservation and other factors.
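
    The G-test mentioned above compares observed counts with the counts expected if all bioherms shared the same faunal proportions. A minimal sketch of the statistic follows; the function name and the count table are hypothetical and are not data from the study.

```python
import math


def g_statistic(table):
    """G = 2 * sum(O * ln(O / E)) for an r x c table of counts.

    Expected counts E come from the row and column totals; cells with
    an observed count of zero contribute nothing to the sum.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:
                expected = row_totals[i] * col_totals[j] / grand
                total += observed * math.log(observed / expected)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return 2.0 * total, dof


# Hypothetical counts: rows are bioherms, columns are organism groups.
counts = [
    [40, 12, 8],
    [22, 30, 5],
    [35, 10, 20],
]
g_value, dof = g_statistic(counts)
```

    The resulting G is compared with a chi-square distribution with `dof` degrees of freedom; a large value indicates that the bioherms are not statistically homogeneous.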

  7. Adaptive Maneuvering Frequency Method of Current Statistical Model

    Institute of Scientific and Technical Information of China (English)

    Wei Sun; Yongjian Yang

    2017-01-01

    The current statistical model (CSM) performs well in maneuvering target tracking. However, a fixed maneuvering frequency degrades the tracking results, causing a serious dynamic delay, a slow convergence speed and limited precision when the Kalman filter (KF) algorithm is used. In this study, a new current statistical model and a new Kalman filter are proposed to improve the performance of maneuvering target tracking. The new model, which employs an innovation-dominated subjection function to adaptively adjust the maneuvering frequency, performs better in step maneuvering target tracking, although a fluctuation phenomenon appears. To address this problem, a new adaptive fading Kalman filter is also proposed. In the new Kalman filter, the prediction values are amended in time by setting judgment and amendment rules, so that the tracking precision and the fluctuation phenomenon of the new current statistical model are improved. The simulation results indicate the effectiveness of the new algorithm and its practical guiding significance.

  8. Statistical pairwise interaction model of stock market

    Science.gov (United States)

    Bury, Thomas

    2013-03-01

    Financial markets are a classical example of complex systems, as they are composed of many interacting stocks. As such, we can obtain a surprisingly good description of their structure by making the rough simplification of binary daily returns. Spin glass models have been applied and have given some valuable results, but at the price of restrictive assumptions on the market dynamics, or else they are agent-based models with rules designed to recover some empirical behaviors. Here we show that the pairwise model is actually a statistically consistent model with the observed first and second moments of the stocks' orientation, without making such restrictive assumptions. This is done with an approach based only on empirical data of price returns. Our data analysis of six major indices suggests that the actual interaction structure may be thought of as an Ising model on a complex network with interaction strengths scaling as the inverse of the system size. This has potentially important implications, since many properties of such a model are already known and some techniques of spin glass theory can be straightforwardly applied. Typical behaviors, such as multiple equilibria or metastable states, different characteristic time scales, spatial patterns, and order-disorder, could find an explanation in this picture.
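
    The inputs to a pairwise model of this kind are the first and second moments of the binarized returns. The sketch below shows only that preprocessing step, under assumptions made here: returns are mapped to +1 when positive and -1 otherwise, and the stock names and return values are hypothetical. Fitting the interaction strengths themselves is a separate inference problem that is not shown.

```python
from itertools import combinations


def orientations(returns):
    """Map daily returns to +1 (up) or -1 (down or flat)."""
    return [1 if r > 0 else -1 for r in returns]


def first_and_second_moments(spins):
    """Mean orientation per stock and mean product per pair of stocks.

    spins: dict mapping a stock name to a list of +1/-1 values,
    all lists having the same length (one entry per trading day).
    """
    days = len(next(iter(spins.values())))
    means = {name: sum(s) / days for name, s in spins.items()}
    pair_means = {
        (a, b): sum(x * y for x, y in zip(spins[a], spins[b])) / days
        for a, b in combinations(spins, 2)
    }
    return means, pair_means


# Hypothetical daily returns for two stocks over four days.
spins = {
    "A": orientations([0.01, -0.02, 0.03, 0.01]),
    "B": orientations([0.02, -0.01, -0.01, 0.02]),
}
means, pair_means = first_and_second_moments(spins)
```

    A pairwise (Ising-type) model is then the distribution over orientation patterns that reproduces these means and pair averages.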

  9. Understanding and forecasting polar stratospheric variability with statistical models

    Directory of Open Access Journals (Sweden)

    C. Blume

    2012-07-01

    The variability of the north-polar stratospheric vortex is a prominent aspect of the middle atmosphere. This work investigates a wide class of statistical models with respect to their ability to model geopotential and temperature anomalies, representing variability in the polar stratosphere. Four partly nonstationary, nonlinear models are assessed: linear discriminant analysis (LDA); a cluster method based on finite elements (FEM-VARX); a neural network, namely the multi-layer perceptron (MLP); and support vector regression (SVR). These methods model time series by incorporating all significant external factors simultaneously, including ENSO, QBO, the solar cycle, and volcanoes, to then quantify their statistical importance. We show that variability in reanalysis data from 1980 to 2005 is successfully modeled. The period from 2005 to 2011 can be hindcasted to a certain extent, where MLP performs significantly better than the remaining models. However, variability remains that cannot be statistically hindcasted within the current framework, such as the unexpected major warming in January 2009. Finally, the statistical model with the best generalization performance is used to predict a winter 2011/12 with warm and weak vortex conditions. A vortex breakdown is predicted for late January, early February 2012.

  10. Statistical modeling to support power system planning

    Science.gov (United States)

    Staid, Andrea

    This dissertation focuses on data-analytic approaches that improve our understanding of power system applications to promote better decision-making. It tackles issues of risk analysis, uncertainty management, resource estimation, and the impacts of climate change. Tools of data mining and statistical modeling are used to bring new insight to a variety of complex problems facing today's power system. The overarching goal of this research is to improve the understanding of the power system risk environment for improved operation, investment, and planning decisions. The first chapter introduces some challenges faced in planning for a sustainable power system. Chapter 2 analyzes the driving factors behind the disparity in wind energy investments among states with a goal of determining the impact that state-level policies have on incentivizing wind energy. Findings show that policy differences do not explain the disparities; physical and geographical factors are more important. Chapter 3 extends conventional wind forecasting to a risk-based focus of predicting maximum wind speeds, which are dangerous for offshore operations. Statistical models are presented that issue probabilistic predictions for the highest wind speed expected in a three-hour interval. These models achieve a high degree of accuracy and their use can improve safety and reliability in practice. Chapter 4 examines the challenges of wind power estimation for onshore wind farms. Several methods for wind power resource assessment are compared, and the weaknesses of the Jensen model are demonstrated. For two onshore farms, statistical models outperform other methods, even when very little information is known about the wind farm. Lastly, chapter 5 focuses on the power system more broadly in the context of the risks expected from tropical cyclones in a changing climate. Risks to U.S. power system infrastructure are simulated under different scenarios of tropical cyclone behavior that may result from climate

  11. Acceleration transforms and statistical kinetic models

    International Nuclear Information System (INIS)

    LuValle, M.J.; Welsher, T.L.; Svoboda, K.

    1988-01-01

    For a restricted class of problems a mathematical model of microscopic degradation processes, statistical kinetics, is developed and linked through acceleration transforms to the information which can be obtained from a system in which the only observable sign of degradation is sudden and catastrophic failure. The acceleration transforms were developed in accelerated life testing applications as a tool for extrapolating from the observable results of an accelerated life test to the dynamics of the underlying degradation processes. A particular concern of a physicist attempting to interpret the results of an analysis based on acceleration transforms is determining the physical species involved in the degradation process. These species may be (a) relatively abundant or (b) relatively rare. The main results of this paper are a theorem showing that for an important subclass of statistical kinetic models, acceleration transforms cannot be used to distinguish between cases a and b, and an example showing that in some cases falling outside the restrictions of the theorem, cases a and b can be distinguished by their acceleration transforms.

  12. Atmospheric corrosion: statistical validation of models

    International Nuclear Information System (INIS)

    Diaz, V.; Martinez-Luaces, V.; Guineo-Cobs, G.

    2003-01-01

    In this paper we discuss two different methods for the validation of regression models, applied to corrosion data. One is based on the correlation coefficient and the other is the statistical lack-of-fit test. Both methods are used here to analyse the fit of the bilogarithmic model for predicting corrosion of very low carbon steel substrates in rural and urban-industrial atmospheres in Uruguay. Results for parameters A and n of the bilogarithmic model are reported here. For this purpose, all repeated values were used instead of average values, as is usual. Modelling is carried out using experimental data corresponding to steel substrates under the same initial meteorological conditions (in fact, they are put in the rack at the same time). Results for the correlation coefficient are compared with the lack-of-fit test at two different significance levels (α=0.01 and α=0.05). Unexpected differences between them are explained and, finally, it is possible to conclude, at least in the studied atmospheres, that the bilogarithmic model does not fit the experimental data properly. (Author) 18 refs
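    For orientation, the two quantities named in the abstract can be written out as follows. This is a sketch under the assumption that the bilogarithmic model is the usual power law for atmospheric corrosion and that the lack-of-fit test is the classical replicate-based F test; the abstract itself gives neither formula. Here $C$ is corrosion, $t$ exposure time, $m$ the number of distinct exposure times, $n_i$ the number of replicates at time $t_i$, and $N=\sum_i n_i$.

```latex
% Bilogarithmic (power-law) model and its linearized form
C = A\,t^{\,n}
\quad\Longleftrightarrow\quad
\log C = \log A + n \log t

% Lack-of-fit F statistic with y_{ij} = \log C_{ij}, two fitted parameters
F = \frac{\mathrm{SS}_{\mathrm{LOF}}/(m-2)}{\mathrm{SS}_{\mathrm{PE}}/(N-m)},
\qquad
\mathrm{SS}_{\mathrm{LOF}} = \sum_{i=1}^{m} n_i\,(\bar{y}_i - \hat{y}_i)^2,
\qquad
\mathrm{SS}_{\mathrm{PE}} = \sum_{i=1}^{m}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2
```

    The pure-error term requires the individual replicate values, which is consistent with the abstract's remark that repeated values were used rather than averages. A high correlation coefficient can coexist with a significant $F$, since the two measure different things.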

  13. Statistical significant changes in ground thermal conditions of alpine Austria during the last decade

    Science.gov (United States)

    Kellerer-Pirklbauer, Andreas

    2016-04-01

    Longer data series (e.g. >10 a) of ground temperatures in alpine regions are helpful to improve the understanding of the effects of present climate change on the distribution and thermal characteristics of seasonal frost- and permafrost-affected areas. Beginning in 2004, and more intensively since 2006, a permafrost and seasonal frost monitoring network was established in Central and Eastern Austria by the University of Graz. This network consists of c. 60 ground temperature (surface and near-surface) monitoring sites which are located at 1922-3002 m a.s.l., at latitude 46°55'-47°22'N and at longitude 12°44'-14°41'E. These data allow conclusions about general ground thermal conditions, potential permafrost occurrence, trends during the observation period, and regional patterns of change. Calculations and analyses of several different temperature-related parameters were accomplished. At an annual scale a region-wide statistically significant warming during the observation period was revealed by, e.g., an increase in mean annual temperature values (mean, maximum) or the significant lowering of the surface frost number (F+). At a seasonal scale, in most cases no significant trend of any temperature-related parameter was revealed for spring (MAM) and autumn (SON). Winter (DJF) shows only a weak warming. In contrast, the summer (JJA) season reveals in general a significant warming, as confirmed by several different temperature-related parameters such as mean seasonal temperature, number of thawing degree days, number of freezing degree days, or days without night frost. On a monthly basis August shows the statistically most robust and strongest warming of all months, although regional differences occur. Despite the fact that the general ground temperature warming during the last decade is confirmed by the field data in the study region, complications in trend analyses arise from temperature anomalies (e.g. warm winter 2006/07) or substantial variations in the winter

  14. Statistical Validation of Normal Tissue Complication Probability Models

    Energy Technology Data Exchange (ETDEWEB)

    Xu Chengjian, E-mail: c.j.xu@umcg.nl [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schaaf, Arjen van der; Veld, Aart A. van' t; Langendijk, Johannes A. [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schilstra, Cornelis [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Radiotherapy Institute Friesland, Leeuwarden (Netherlands)

    2012-09-01

    Purpose: To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. Methods and Materials: A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Results: Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Conclusion: Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use.
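    To make the permutation-testing idea concrete, here is a minimal sketch that estimates how often a shuffled outcome vector yields an AUC at least as large as the observed one. It is a simplification: the scores are held fixed, whereas a full procedure along the lines of the abstract would refit the LASSO model inside double cross-validation for each permutation. The function names and toy data are assumptions.

```python
import numpy as np

def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 1/2)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def permutation_p_value(y_true, scores, n_perm=2000, seed=0):
    """One-sided permutation p-value for the observed AUC."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    observed = auc(y_true, scores)
    hits = 0
    for _ in range(n_perm):
        if auc(rng.permutation(y_true), scores) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Toy example: ten hypothetical patients whose scores separate the outcomes perfectly.
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
s = np.linspace(0.1, 1.0, 10)
observed_auc, p_value = permutation_p_value(y, s)
```

    Permuting the labels breaks any real association between predictors and outcome, so the permuted AUC values approximate the distribution expected by chance.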

  15. Statistical validation of normal tissue complication probability models.

    Science.gov (United States)

    Xu, Cheng-Jian; van der Schaaf, Arjen; Van't Veld, Aart A; Langendijk, Johannes A; Schilstra, Cornelis

    2012-09-01

    To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use.

  16. Spherical Process Models for Global Spatial Statistics

    KAUST Repository

    Jeong, Jaehong

    2017-11-28

    Statistical models used in geophysical, environmental, and climate science applications must reflect the curvature of the spatial domain in global data. Over the past few decades, statisticians have developed covariance models that capture the spatial and temporal behavior of these global data sets. Though the geodesic distance is the most natural metric for measuring distance on the surface of a sphere, mathematical limitations have compelled statisticians to use the chordal distance to compute the covariance matrix in many applications instead, which may cause physically unrealistic distortions. Therefore, covariance functions directly defined on a sphere using the geodesic distance are needed. We discuss the issues that arise when dealing with spherical data sets on a global scale and provide references to recent literature. We review the current approaches to building process models on spheres, including the differential operator, the stochastic partial differential equation, the kernel convolution, and the deformation approaches. We illustrate realizations obtained from Gaussian processes with different covariance structures and the use of isotropic and nonstationary covariance models through deformations and geographical indicators for global surface temperature data. To assess the suitability of each method, we compare their log-likelihood values and prediction scores, and we end with a discussion of related research problems.
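    The difference between the two metrics mentioned above can be seen numerically. The sketch below computes the geodesic (great-circle) distance with the haversine formula and the chordal (straight-line through the sphere) distance from it; the function names and the unit-radius default are choices made for this illustration.

```python
import numpy as np

def geodesic_distance(lat1, lon1, lat2, lon2, radius=1.0):
    """Great-circle distance between two points given in degrees (haversine formula)."""
    p1, l1, p2, l2 = np.radians([lat1, lon1, lat2, lon2])
    h = np.sin((p2 - p1) / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin((l2 - l1) / 2) ** 2
    return 2 * radius * np.arcsin(np.sqrt(h))

def chordal_distance(lat1, lon1, lat2, lon2, radius=1.0):
    """Euclidean distance through the sphere: 2 r sin(theta / 2) for central angle theta."""
    theta = geodesic_distance(lat1, lon1, lat2, lon2, radius=1.0)
    return 2 * radius * np.sin(theta / 2)

# Two points on the equator of a unit sphere, a quarter turn and a half turn apart.
quarter_geo = geodesic_distance(0, 0, 0, 90)
quarter_chord = chordal_distance(0, 0, 0, 90)
half_geo = geodesic_distance(0, 0, 0, 180)
half_chord = chordal_distance(0, 0, 0, 180)
```

    The two distances nearly agree for nearby points but diverge as the separation grows: for antipodal points on a unit sphere the geodesic distance is π while the chordal distance is only 2.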

  17. Estimates of statistical significance for comparison of individual positions in multiple sequence alignments

    Directory of Open Access Journals (Sweden)

    Sadreyev Ruslan I

    2004-08-01

    Background: Profile-based analysis of multiple sequence alignments (MSA) allows for accurate comparison of protein families. Here, we address the problems of detecting statistically confident dissimilarities between (1) an MSA position and a set of predicted residue frequencies, and (2) two MSA positions. These problems are important for (i) evaluation and optimization of methods predicting residue occurrence at protein positions; (ii) detection of potentially misaligned regions in automatically produced alignments and their further refinement; and (iii) detection of sites that determine functional or structural specificity in two related families. Results: For problems (1) and (2), we propose analytical estimates of P-value and apply them to the detection of significant positional dissimilarities in various experimental situations. (a) We compare structure-based predictions of residue propensities at a protein position to the actual residue frequencies in the MSA of homologs. (b) We evaluate our method by the ability to detect erroneous position matches produced by an automatic sequence aligner. (c) We compare MSA positions that correspond to residues aligned by automatic structure aligners. (d) We compare MSA positions that are aligned by high-quality manual superposition of structures. Detected dissimilarities reveal shortcomings of the automatic methods for residue frequency prediction and alignment construction. For the high-quality structural alignments, the dissimilarities suggest sites of potential functional or structural importance. Conclusion: The proposed computational method is of significant potential value for the analysis of protein families.

  18. Determining coding CpG islands by identifying regions significant for pattern statistics on Markov chains.

    Science.gov (United States)

    Singer, Meromit; Engström, Alexander; Schönhuth, Alexander; Pachter, Lior

    2011-09-23

    Recent experimental and computational work confirms that CpGs can be unmethylated inside coding exons, thereby showing that codons may be subjected to both genomic and epigenomic constraint. It is therefore of interest to identify coding CpG islands (CCGIs) that are regions inside exons enriched for CpGs. The difficulty in identifying such islands is that coding exons exhibit sequence biases determined by codon usage and constraints that must be taken into account. We present a method for finding CCGIs that showcases a novel approach we have developed for identifying regions of interest that are significant (with respect to a Markov chain) for the counts of any pattern. Our method begins with the exact computation of tail probabilities for the number of CpGs in all regions contained in coding exons, and then applies a greedy algorithm for selecting islands from among the regions. We show that the greedy algorithm provably optimizes a biologically motivated criterion for selecting islands while controlling the false discovery rate. We applied this approach to the human genome (hg18) and annotated CpG islands in coding exons. The statistical criterion we apply to evaluating islands reduces the number of false positives in existing annotations, while our approach to defining islands reveals significant numbers of undiscovered CCGIs in coding exons. Many of these appear to be examples of functional epigenetic specialization in coding exons.

  19. A statistical mechanical model of economics

    Science.gov (United States)

    Lubbers, Nicholas Edward Williams

    Statistical mechanics pursues low-dimensional descriptions of systems with a very large number of degrees of freedom. I explore this theme in two contexts. The main body of this dissertation explores and extends the Yard Sale Model (YSM) of economic transactions using a combination of simulations and theory. The YSM is a simple interacting model for wealth distributions which has the potential to explain the empirical observation of Pareto distributions of wealth. I develop the link between wealth condensation and the breakdown of ergodicity due to nonlinear diffusion effects which are analogous to the geometric random walk. Using this, I develop a deterministic effective theory of wealth transfer in the YSM that is useful for explaining many quantitative results. I introduce various forms of growth to the model, paying attention to the effect of growth on wealth condensation, inequality, and ergodicity. Arithmetic growth is found to partially break condensation, and geometric growth is found to completely break condensation. Further generalizations of geometric growth with growth inequality show that the system is divided into two phases by a tipping point in the inequality parameter. The tipping point marks the line between systems which are ergodic and systems which exhibit wealth condensation. I explore generalizations of the YSM transaction scheme to arbitrary betting functions to develop notions of universality in YSM-like models. I find that wealth condensation is universal to a large class of models which can be divided into two phases. The first exhibits slow, power-law condensation dynamics, and the second exhibits fast, finite-time condensation dynamics. I find that the YSM, which exhibits exponential dynamics, is the critical, self-similar model which marks the dividing line between the two phases. The final chapter develops a low-dimensional approach to materials microstructure quantification. Modern materials design harnesses complex
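    A minimal simulation of the Yard Sale Model discussed in the abstract above may help fix ideas. It assumes the commonly used form of the model, in which two randomly chosen agents stake a fixed fraction of the poorer agent's wealth on a fair coin flip; the parameter values, function names and use of the Gini coefficient as an inequality measure are choices made for this sketch and are not taken from the dissertation.

```python
import numpy as np

def yard_sale(n_agents=100, n_steps=5000, fraction=0.1, seed=0):
    """Simulate pairwise fair-coin exchanges staked on the poorer agent's wealth."""
    rng = np.random.default_rng(seed)
    wealth = np.ones(n_agents)
    for _ in range(n_steps):
        i, j = rng.choice(n_agents, size=2, replace=False)
        stake = fraction * min(wealth[i], wealth[j])
        if rng.random() < 0.5:
            i, j = j, i  # the coin flip decides who wins the stake
        wealth[i] += stake
        wealth[j] -= stake
    return wealth

def gini(wealth):
    """Gini coefficient: 0 for perfect equality, approaching 1 under condensation."""
    w = np.sort(wealth)
    n = w.size
    ranks = np.arange(1, n + 1)
    return 2 * np.sum(ranks * w) / (n * w.sum()) - (n + 1) / n

final_wealth = yard_sale()
inequality = gini(final_wealth)
```

    Each exchange conserves total wealth and is fair in expectation, yet inequality grows over time, which is the wealth-condensation behavior the abstract refers to.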

  20. Current algebra, statistical mechanics and quantum models

    Science.gov (United States)

    Vilela Mendes, R.

    2017-11-01

    Results obtained in the past for free boson systems at zero and nonzero temperatures are revisited to clarify the physical meaning of current algebra reducible functionals which are associated with systems with density fluctuations, leading to observable effects on phase transitions. To use current algebra as a tool for the formulation of quantum statistical mechanics amounts to the construction of unitary representations of diffeomorphism groups. Two mathematically equivalent procedures exist for this purpose. One searches for quasi-invariant measures on configuration spaces, the other for a cyclic vector in Hilbert space. Here, one argues that the second approach is closer to physical intuition when modelling complex systems. An example of application of the current algebra methodology to the pairing phenomenon in two-dimensional fermion systems is discussed.

  1. Statistical model for OCT image denoising

    KAUST Repository

    Li, Muxingzi

    2017-08-01

    Optical coherence tomography (OCT) is a non-invasive technique with a large array of applications in clinical imaging and biological tissue visualization. However, the presence of speckle noise affects the analysis of OCT images and their diagnostic utility. In this article, we introduce a new OCT denoising algorithm. The proposed method is founded on a numerical optimization framework based on maximum-a-posteriori estimate of the noise-free OCT image. It combines a novel speckle noise model, derived from local statistics of empirical spectral domain OCT (SD-OCT) data, with a Huber variant of total variation regularization for edge preservation. The proposed approach exhibits satisfying results in terms of speckle noise reduction as well as edge preservation, at reduced computational cost.

  2. New advances in statistical modeling and applications

    CERN Document Server

    Santos, Rui; Oliveira, Maria; Paulino, Carlos

    2014-01-01

    This volume presents selected papers from the XIXth Congress of the Portuguese Statistical Society, held in the town of Nazaré, Portugal, from September 28 to October 1, 2011. All contributions were selected after a thorough peer-review process. It covers a broad range of papers in the areas of statistical science, probability and stochastic processes, extremes and statistical applications.

  3. Statistical Model Checking of Rich Models and Properties

    DEFF Research Database (Denmark)

    Poulsen, Danny Bøgsted

    in undecidability issues for the traditional model checking approaches. Statistical model checking has proven itself a valuable supplement to model checking and this thesis is concerned with extending this software validation technique to stochastic hybrid systems. The thesis consists of two parts: the first part...... motivates why existing model checking technology should be supplemented by new techniques. It also contains a brief introduction to probability theory and concepts covered by the six papers making up the second part. The first two papers are concerned with developing online monitoring techniques...... systems. The fifth paper shows how stochastic hybrid automata are useful for modelling biological systems and the final paper is concerned with showing how statistical model checking is efficiently distributed. In parallel with developing the theory contained in the papers, a substantial part of this work...

  4. Statistical models of global Langmuir mixing

    Science.gov (United States)

    Li, Qing; Fox-Kemper, Baylor; Breivik, Øyvind; Webb, Adrean

    2017-05-01

    The effects of Langmuir mixing on the surface ocean mixing may be parameterized by applying an enhancement factor which depends on wave, wind, and ocean state to the turbulent velocity scale in the K-Profile Parameterization. Diagnosing the appropriate enhancement factor online in global climate simulations is readily achieved by coupling with a prognostic wave model, but with significant computational and code development expenses. In this paper, two alternatives that do not require a prognostic wave model, (i) a monthly mean enhancement factor climatology, and (ii) an approximation to the enhancement factor based on the empirical wave spectra, are explored and tested in a global climate model. Both appear to reproduce the Langmuir mixing effects as estimated using a prognostic wave model, with nearly identical and substantial improvements in the simulated mixed layer depth and intermediate water ventilation over control simulations, but significantly less computational cost. Simpler approaches, such as ignoring Langmuir mixing altogether or setting a globally constant Langmuir number, are found to be deficient. Thus, the consequences of Stokes depth and misaligned wind and waves are important.

  5. Indirectional statistics and the significance of an asymmetry discovered by Birch

    International Nuclear Information System (INIS)

    Kendall, D.G.; Young, G.A.

    1984-01-01

    Birch (1982, Nature, 298, 451) reported an apparent 'statistical asymmetry of the Universe'. The authors here develop 'indirectional analysis' as a technique for investigating statistical effects of this kind and conclude that the reported effect (whatever may be its origin) is strongly supported by the observations. The estimated pole of the asymmetry is at RA 13h 30m, Dec. -37deg. The angular error in its estimation is unlikely to exceed 20-30deg. (author)

  6. Network Data: Statistical Theory and New Models

    Science.gov (United States)

    2016-02-17

    and with environmental scientists at JPL and Emory University to retrieval from NASA MISR remote sensing images aerosol index AOD for air pollution ...Beijing, May, 2013 Beijing Statistics Forum, Beijing, May, 2013 Statistics Seminar, CREST-ENSAE, Paris , March, 2013 Statistics Seminar, University...to retrieval from NASA MISR remote sensing images aerosol index AOD for air pollution monitoring and management. Satellite- retrieved Aerosol Optical

  7. Quantum statistical model for hot dense matter

    International Nuclear Information System (INIS)

    Rukhsana Kouser; Tasneem, G.; Saleem Shahzad, M.; Shafiq-ur-Rehman; Nasim, M.H.; Amjad Ali

    2015-01-01

    In solving numerous applied problems, one needs to know the equation of state, photon absorption coefficient and opacity of the substances employed. We present a code for absorption coefficient and opacity calculation based on a quantum statistical model. A self-consistent method for the calculation of the potential is used. By solving the Schrödinger equation with the self-consistent potential we find the energy spectrum of the quantum mechanical system and the corresponding wave functions. In addition we find the mean occupation numbers of electron states and the average charge state of the substance studied. The main processes of interaction of radiation with matter included in our opacity calculation are photon absorption in spectral lines (bound-bound), photoionization (bound-free), inverse bremsstrahlung (free-free), and Compton and Thomson scattering. The bound-bound line shape function has contributions from natural, Doppler, fine structure, collisional and Stark broadening. To illustrate the main features of the code and its capabilities, calculations of the average charge state, absorption coefficient, Rosseland and Planck mean and group opacities of aluminum and iron are presented. Results compare satisfactorily with the published data. (authors)

  8. Evaluation of significantly modified water bodies in Vojvodina by using multivariate statistical techniques

    Directory of Open Access Journals (Sweden)

    Vujović Svetlana R.

    2013-01-01

    This paper illustrates the utility of multivariate statistical techniques for the analysis and interpretation of water quality data sets and the identification of pollution sources/factors, with a view to obtaining better information about water quality and designing a monitoring network for effective management of water resources. Multivariate statistical techniques, such as factor analysis (FA)/principal component analysis (PCA) and cluster analysis (CA), were applied to evaluate variations in, and to interpret, a water quality data set of natural water bodies obtained during 2010 by monitoring 13 parameters at 33 different sites. FA/PCA attempts to explain the correlations between the observations in terms of underlying factors, which are not directly observable. Factor analysis is applied to physico-chemical parameters of natural water bodies with the aim of classification and data summation as well as segmentation of heterogeneous data sets into smaller homogeneous subsets. Factor loadings were categorized as strong and moderate, corresponding to absolute loading values of >0.75 and 0.75-0.50, respectively. Four principal factors were obtained with eigenvalues >1, summing to more than 78% of the total variance in the water data sets, which is adequate to give good prior information regarding data structure. Each factor that is significantly related to specific variables represents a different dimension of water quality. The first factor, F1, accounts for 28% of the total variance and represents the hydrochemical dimension of water quality. The second factor, F2, accounts for 18% of the total variance and may be taken as a factor of water eutrophication. The third factor, F3, accounts for 17% of the total variance and represents the influence of point sources of pollution on water quality. The fourth factor, F4, accounts for 13% of the total variance and may be taken as an ecological dimension of water quality. Cluster analysis (CA) is an
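    The factor-retention and loading rules quoted in this abstract (eigenvalues above 1; loadings above 0.75 strong, 0.75-0.50 moderate) can be sketched as follows. This uses an unrotated principal component decomposition of the correlation matrix on synthetic data with the same shape as the study (33 sites, 13 parameters); the study's actual data, any factor rotation it applied, and the label "weak" for smaller loadings are not given in the abstract and are assumptions here.

```python
import numpy as np

def retained_components(data):
    """PCA on the correlation matrix, keeping components with eigenvalue > 1."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]           # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep] / eigvals.sum()   # share of total variance
    return eigvals[keep], loadings, explained

def classify_loading(value):
    """Thresholds from the abstract; 'weak' is a label added for this sketch."""
    magnitude = abs(value)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    return "weak"

# Synthetic stand-in for the monitoring data: 33 sites by 13 parameters.
rng = np.random.default_rng(0)
data = rng.normal(size=(33, 13))
eigvals, loadings, explained = retained_components(data)
labels = [[classify_loading(v) for v in row] for row in loadings]
```

    Because the correlation matrix of 13 standardized variables has trace 13, an eigenvalue above 1 means the component explains more variance than any single original variable, which is the rationale for the cutoff.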

  9. A Statistical Model for Energy Intensity

    Directory of Open Access Journals (Sweden)

    Marjaneh Issapour

    2012-12-01

    A promising approach to improving scientific literacy with regard to global warming and climate change is using a simulation as part of a science education course. The simulation needs to employ scientific analysis of actual data from internationally accepted and reputable databases to demonstrate the reality of the current climate change situation. One of the most important criteria for using a simulation in a science education course is the fidelity of the model. The realism of the events and consequences modeled in the simulation is significant as well. Therefore, all underlying equations and algorithms used in the simulation must have a real-world scientific basis. The "Energy Choices" simulation is one such simulation. The focus of this paper is the development of a mathematical model for "Energy Intensity" as a part of the overall system dynamics in the "Energy Choices" simulation. This model will define the "Energy Intensity" as a function of other independent variables that can be manipulated by users of the simulation. The relationship discovered by this research will be applied to an algorithm in the "Energy Choices" simulation.

  10. Statistical model selection with “Big Data”

    Directory of Open Access Journals (Sweden)

    Jurgen A. Doornik

    2015-12-01

    Big Data offer potential benefits for statistical modelling, but confront problems including an excess of false positives, mistaking correlations for causes, ignoring sampling biases and selecting by inappropriate methods. We consider the many important requirements when searching for a data-based relationship using Big Data, and the possible role of Autometrics in that context. Paramount considerations include embedding relationships in general initial models, possibly restricting the number of variables to be selected over by non-statistical criteria (the formulation problem); using good quality data on all variables, analyzed with tight significance levels by a powerful selection procedure, retaining available theory insights (the selection problem); testing for relationships being well specified and invariant to shifts in explanatory variables (the evaluation problem); and using a viable approach that resolves the computational problem of immense numbers of possible models.

  11. Robust statistical methods for significance evaluation and applications in cancer driver detection and biomarker discovery

    DEFF Research Database (Denmark)

    Madsen, Tobias

    2017-01-01

    In the present thesis I develop, implement and apply statistical methods for detecting genomic elements implicated in cancer development and progression. This is done in two separate bodies of work. The first uses the somatic mutation burden to distinguish cancer driver mutations from passenger m...

  12. A BRDF statistical model applying to space target materials modeling

    Science.gov (United States)

    Liu, Chenghao; Li, Zhi; Xu, Can; Tian, Qichen

    2017-10-01

    To address the poor performance of the five-parameter semi-empirical model in fitting densely measured BRDF data, a refined statistical BRDF model suitable for modeling multiple classes of space target materials is proposed. The refined model improves the Torrance-Sparrow model while retaining the modeling advantages of the five-parameter model. Compared with the existing empirical model, the model contains six simple parameters, which can approximate the roughness distribution of the material surface, the intensity of the Fresnel reflectance phenomenon, and the attenuation of the reflected light's brightness as the azimuth angle changes. The model is able to achieve parameter inversion quickly with no extra loss of accuracy. A genetic algorithm was used to invert the parameters of 11 different samples of commonly used space target materials, and the fitting errors of all materials were below 6%, much lower than those of the five-parameter model. The effect of the refined model is verified by comparing the fitting results of three samples at different incident zenith angles at 0° azimuth angle. Finally, three-dimensional visualizations of these samples in the upper hemisphere are given, in which the strength of the optical scattering of different materials is clearly shown. This also demonstrates the refined model's good descriptive ability for material characterization.

  13. Statistical Challenges in Modeling Big Brain Signals

    KAUST Repository

    Yu, Zhaoxia

    2017-11-01

    Brain signal data are inherently big: massive in amount, complex in structure, and high in dimensions. These characteristics impose great challenges for statistical inference and learning. Here we review several key challenges, discuss possible solutions, and highlight future research directions.

  14. Statistical Challenges in Modeling Big Brain Signals

    KAUST Repository

    Yu, Zhaoxia; Pluta, Dustin; Shen, Tong; Chen, Chuansheng; Xue, Gui; Ombao, Hernando

    2017-01-01

    Brain signal data are inherently big: massive in amount, complex in structure, and high in dimensions. These characteristics impose great challenges for statistical inference and learning. Here we review several key challenges, discuss possible

  15. Statistical Learning Theory: Models, Concepts, and Results

    OpenAIRE

    von Luxburg, Ulrike; Schoelkopf, Bernhard

    2008-01-01

    Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms. In this article we attempt to give a gentle, non-technical overview of the key ideas and insights of statistical learning theory. We target a broad audience, not necessarily machine learning researchers. This paper can serve as a starting point for people who want to get an overview of the field before diving into technical details.

  16. Online Statistical Modeling (Regression Analysis) for Independent Responses

    Science.gov (United States)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Nowadays, statistical models have been developed in various directions to model various types of data and complex relationships. A rich variety of advanced and recent statistical modelling methods is available in open source software (one of them is R). However, these advanced methods are not very friendly to novice R users, since they are based on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and shiny), so that the most recent and advanced statistical modelling methods are readily available, accessible and applicable on the web. We have previously made interfaces in the form of e-tutorials for several modern and advanced statistical modelling methods in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including models using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modeling easier to apply and makes it easier to compare models in order to find the most appropriate one for the data.

  17. Risk prediction model: Statistical and artificial neural network approach

    Science.gov (United States)

    Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim

    2017-04-01

    Prediction models are increasingly gaining popularity and have been used in numerous areas of study to complement and support clinical reasoning and decision making. The adoption of such models assists physicians' decision making and individuals' behavior, and consequently improves individual outcomes and the cost-effectiveness of care. The objective of this paper is to review articles related to risk prediction models in order to understand suitable approaches to the development and validation of such models. A qualitative review was carried out of the aims, methods and significant main outcomes of nineteen published articles that developed risk prediction models in numerous fields. This paper also reviews how researchers develop and validate risk prediction models based on statistical and artificial neural network approaches. From the review, some methodological recommendations for developing and validating prediction models are highlighted. According to the studies reviewed, the artificial neural network approach to developing prediction models was more accurate than the statistical approach. However, only limited published literature currently discusses which approach is more accurate for risk prediction model development.

  18. Statistical modelling of transcript profiles of differentially regulated genes

    Directory of Open Access Journals (Sweden)

    Sergeant Martin J

    2008-07-01

    allowed 11% of the Escherichia coli features to be fitted by an exponential function, and 25% of the Rattus norvegicus features could be described by the critical exponential model, all with statistical significance of p ... Conclusion: The statistical non-linear regression approaches presented in this study provide detailed biologically oriented descriptions of individual gene expression profiles, using biologically variable data to generate a set of defining parameters. These approaches have application to the modelling and greater interpretation of profiles obtained across a wide range of platforms, such as microarrays. Through careful choice of appropriate model forms, such statistical regression approaches allow an improved comparison of gene expression profiles, and may provide an approach for the greater understanding of common regulatory mechanisms between genes.

  19. Integer Set Compression and Statistical Modeling

    DEFF Research Database (Denmark)

    Larsson, N. Jesper

    2014-01-01

    Compression of integer sets and sequences has been extensively studied for settings where elements follow a uniform probability distribution. In addition, methods exist that exploit clustering of elements in order to achieve higher compression performance. In this work, we address the case where enumeration of elements may be arbitrary or random, but where statistics is kept in order to estimate probabilities of elements. We present a recursive subset-size encoding method that is able to benefit from statistics, explore the effects of permuting the enumeration order based on element probabilities...

  20. Statistically Based Morphodynamic Modeling of Tracer Slowdown

    Science.gov (United States)

    Borhani, S.; Ghasemi, A.; Hill, K. M.; Viparelli, E.

    2017-12-01

    Tracer particles are used to study bedload transport in gravel-bed rivers. One of the advantages of using tracer particles is that they allow for direct measures of the entrainment rates and their size distributions. The main issue in large scale studies with tracer particles is the difference between the short term and long term behavior of tracer stones. This difference is due to the fact that particles undergo vertical mixing or move to less active locations such as bars or even floodplains. For these reasons the average virtual velocity of tracer particles decreases in time, i.e. the tracer slowdown. In summary, tracer slowdown can have a significant impact on the estimation of bedload transport rate or long term dispersal of contaminated sediment. The vast majority of the morphodynamic models that account for the non-uniformity of the bed material (tracer and not tracer, in this case) are based on a discrete description of the alluvial deposit. The deposit is divided into two different regions: the active layer and the substrate. The active layer is a thin layer in the topmost part of the deposit whose particles can interact with the bed material transport. The substrate is the part of the deposit below the active layer. Due to the discrete representation of the alluvial deposit, active layer models are not able to reproduce tracer slowdown. In this study we try to model the slowdown of tracer particles with the continuous Parker-Paola-Leclair morphodynamic framework. This continuous, i.e. not layer-based, framework is based on a stochastic description of the temporal variation of bed surface elevation, and of the elevation specific particle entrainment and deposition. Particle entrainment rates are computed as a function of the flow and sediment characteristics, while particle deposition is estimated with a step length formulation. Here we present one of the first implementations of the continuum framework at laboratory scale, its validation against

  1. Statistical modelling for social researchers principles and practice

    CERN Document Server

    Tarling, Roger

    2008-01-01

    This book explains the principles and theory of statistical modelling in an intelligible way for the non-mathematical social scientist looking to apply statistical modelling techniques in research. The book also serves as an introduction for those wishing to develop more detailed knowledge and skills in statistical modelling. Rather than present a limited number of statistical models in great depth, the aim is to provide a comprehensive overview of the statistical models currently adopted in social research, in order that the researcher can make appropriate choices and select the most suitable model for the research question to be addressed. To facilitate application, the book also offers practical guidance and instruction in fitting models using SPSS and Stata, the most popular statistical computer software which is available to most social researchers. Instruction in using MLwiN is also given. Models covered in the book include; multiple regression, binary, multinomial and ordered logistic regression, log-l...

  2. Linear Mixed Models in Statistical Genetics

    NARCIS (Netherlands)

    R. de Vlaming (Ronald)

    2017-01-01

    One of the goals of statistical genetics is to elucidate the genetic architecture of phenotypes (i.e., observable individual characteristics) that are affected by many genetic variants (e.g., single-nucleotide polymorphisms; SNPs). A particular aim is to identify specific SNPs that

  3. Confidence Intervals: From tests of statistical significance to confidence intervals, range hypotheses and substantial effects

    Directory of Open Access Journals (Sweden)

    Dominic Beaulieu-Prévost

    2006-03-01

    For the last 50 years of research in quantitative social sciences, the empirical evaluation of scientific hypotheses has been based on the rejection or not of the null hypothesis. However, more than 300 articles demonstrated that this method was problematic. In summary, null hypothesis testing (NHT) is unfalsifiable, its results depend directly on sample size and the null hypothesis is both improbable and not plausible. Consequently, alternatives to NHT such as confidence intervals (CI) and measures of effect size are starting to be used in scientific publications. The purpose of this article is, first, to provide the conceptual tools necessary to implement an approach based on confidence intervals, and second, to briefly demonstrate why such an approach is an interesting alternative to an approach based on NHT. As demonstrated in the article, the proposed CI approach avoids most problems related to a NHT approach and can often improve the scientific and contextual relevance of the statistical interpretations by testing range hypotheses instead of a point hypothesis and by defining the minimal value of a substantial effect. The main advantage of such a CI approach is that it replaces the notion of statistical power by an easily interpretable three-value logic (probable presence of a substantial effect, probable absence of a substantial effect and probabilistic undetermination). The demonstration includes a complete example.

  4. Statistical models and methods for reliability and survival analysis

    CERN Document Server

    Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Limnios, Nikolaos; Gerville-Reache, Leo

    2013-01-01

    Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical

  5. Spatio-temporal statistical models with applications to atmospheric processes

    International Nuclear Information System (INIS)

    Wikle, C.K.

    1996-01-01

    This doctoral dissertation is presented as three self-contained papers. An introductory chapter considers traditional spatio-temporal statistical methods used in the atmospheric sciences from a statistical perspective. Although this section is primarily a review, many of the statistical issues considered have not been considered in the context of these methods and several open questions are posed. The first paper attempts to determine a means of characterizing the semiannual oscillation (SAO) spatial variation in the northern hemisphere extratropical height field. It was discovered that the midlatitude SAO in 500hPa geopotential height could be explained almost entirely as a result of spatial and temporal asymmetries in the annual variation of stationary eddies. It was concluded that the mechanism for the SAO in the northern hemisphere is a result of land-sea contrasts. The second paper examines the seasonal variability of mixed Rossby-gravity waves (MRGW) in the lower stratosphere over the equatorial Pacific. Advanced cyclostationary time series techniques were used for analysis. It was found that there are significant twice-yearly peaks in MRGW activity. Analyses also suggested a convergence of horizontal momentum flux associated with these waves. In the third paper, a new spatio-temporal statistical model is proposed that attempts to consider the influence of both temporal and spatial variability. This method is mainly concerned with prediction in space and time, and provides a spatially descriptive and temporally dynamic model.

  6. Geometric modeling in probability and statistics

    CERN Document Server

    Calin, Ovidiu

    2014-01-01

    This book covers topics of Informational Geometry, a field which deals with the differential geometric study of the manifold probability density functions. This is a field that is increasingly attracting the interest of researchers from many different areas of science, including mathematics, statistics, geometry, computer science, signal processing, physics and neuroscience. It is the authors’ hope that the present book will be a valuable reference for researchers and graduate students in one of the aforementioned fields. This textbook is a unified presentation of differential geometry and probability theory, and constitutes a text for a course directed at graduate or advanced undergraduate students interested in applications of differential geometry in probability and statistics. The book contains over 100 proposed exercises meant to help students deepen their understanding, and it is accompanied by software that is able to provide numerical computations of several information geometric objects. The reader...

  7. Challenges in dental statistics: data and modelling

    OpenAIRE

    Matranga, D.; Castiglia, P.; Solinas, G.

    2013-01-01

    The aim of this work is to present the reflections and proposals derived from the first Workshop of the SISMEC STATDENT working group on statistical methods and applications in dentistry, held in Ancona (Italy) on 28th September 2011. STATDENT began as a forum of comparison and discussion for statisticians working in the field of dental research in order to suggest new and improve existing biostatistical and clinical epidemiological methods. During the meeting, we dealt with very important to...

  8. The issue of statistical power for overall model fit in evaluating structural equation models

    Directory of Open Access Journals (Sweden)

    Richard HERMIDA

    2015-06-01

    Statistical power is an important concept for psychological research. However, examining the power of a structural equation model (SEM) is rare in practice. This article provides an accessible review of the concept of statistical power for the Root Mean Square Error of Approximation (RMSEA) index of overall model fit in structural equation modeling. By way of example, we examine the current state of power in the literature by reviewing studies in top Industrial-Organizational (I/O) Psychology journals using SEMs. Results indicate that in many studies, power is very low, which implies acceptance of invalid models. Additionally, we examined methodological situations which may have an influence on statistical power of SEMs. Results showed that power varies significantly as a function of model type and whether or not the model is the main model for the study. Finally, results indicated that power is significantly related to model fit statistics used in evaluating SEMs. The results from this quantitative review imply that researchers should be more vigilant with respect to power in structural equation modeling. We therefore conclude by offering methodological best practices to increase confidence in the interpretation of structural equation modeling results with respect to statistical power issues.

  9. Experimental investigation of statistical models describing distribution of counts

    International Nuclear Information System (INIS)

    Salma, I.; Zemplen-Papp, E.

    1992-01-01

    The binomial, Poisson and modified Poisson models which are used for describing the statistical nature of the distribution of counts are compared theoretically, and conclusions for application are considered. The validity of the Poisson and the modified Poisson statistical distribution for observing k events in a short time interval is investigated experimentally for various measuring times. The experiments to measure the influence of the significant radioactive decay were performed with $^{89}$Y$^{m}$ ($T_{1/2}$ = 16.06 s), using a multichannel analyser (4096 channels) in the multiscaling mode. According to the results, Poisson statistics describe the counting experiment for short measuring times (up to $T = 0.5\,T_{1/2}$) and its application is recommended. However, analysis of the data demonstrated, with confidence, that for long measurements ($T \ge T_{1/2}$) Poisson distribution is not valid and the modified Poisson function is preferable. The practical implications in calculating uncertainties and in optimizing the measuring time are discussed. Differences between the standard deviations evaluated on the basis of the Poisson and binomial models are especially significant for experiments with long measuring time ($T/T_{1/2} \ge 2$) and/or large detection efficiency ($\varepsilon > 0.30$). Optimization of the measuring time for paired observations yields the same solution for either the binomial or the Poisson distribution. (orig.)
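
    The gap between the Poisson and binomial standard deviations mentioned above can be illustrated with a minimal sketch. It assumes the textbook counting model, which is not necessarily the authors' formulation: each of N nuclei is detected with probability p = ε(1 − 2^(−T/T_half)), so the binomial variance is Np(1 − p) and the Poisson variance is Np. The function name and the sample values are hypothetical.

```python
import math

def count_std_devs(n_atoms, efficiency, t_over_half_life):
    """Return (binomial_sd, poisson_sd) for the number of detected decays."""
    # Probability that one nucleus decays within the measuring time.
    p_decay = 1.0 - 2.0 ** (-t_over_half_life)
    # Probability that one nucleus decays AND is registered by the detector.
    p_detect = efficiency * p_decay
    mean_counts = n_atoms * p_detect
    poisson_sd = math.sqrt(mean_counts)                      # variance = mean
    binomial_sd = math.sqrt(mean_counts * (1.0 - p_detect))  # variance = N p (1 - p)
    return binomial_sd, poisson_sd

# Hypothetical source of one million nuclei, detection efficiency 0.30.
for ratio in (0.1, 0.5, 2.0):
    b_sd, p_sd = count_std_devs(1_000_000, 0.30, ratio)
    print(f"T/T_half = {ratio}: binomial/Poisson sd ratio = {b_sd / p_sd:.3f}")
```

    Under this model the ratio of the two standard deviations is sqrt(1 − p), so it depends only on the efficiency and on T/T_half and moves away from 1 as either grows.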

  10. A statistical model of future human actions

    International Nuclear Information System (INIS)

    Woo, G.

    1992-02-01

    A critical review has been carried out of models of future human actions during the long term post-closure period of a radioactive waste repository. Various Markov models have been considered as alternatives to the standard Poisson model, and the problems of parameterisation have been addressed. Where the simplistic Poisson model unduly exaggerates the intrusion risk, some form of Markov model may have to be introduced. This situation may well arise for shallow repositories, but it is less likely for deep repositories. Recommendations are made for a practical implementation of a computer based model and its associated database. (Author)

  11. Enhanced surrogate models for statistical design exploiting space mapping technology

    DEFF Research Database (Denmark)

    Koziel, Slawek; Bandler, John W.; Mohamed, Achmed S.

    2005-01-01

    We present advances in microwave and RF device modeling exploiting Space Mapping (SM) technology. We propose new SM modeling formulations utilizing input mappings, output mappings, frequency scaling and quadratic approximations. Our aim is to enhance circuit models for statistical analysis...

  12. Statistical models of shape optimisation and evaluation

    CERN Document Server

    Davies, Rhodri; Taylor, Chris

    2014-01-01

    Deformable shape models have wide application in computer vision and biomedical image analysis. This book addresses a key issue in shape modelling: establishment of a meaningful correspondence between a set of shapes. Full implementation details are provided.

  13. How to practise Bayesian statistics outside the Bayesian church: What philosophy for Bayesian statistical modelling?

    NARCIS (Netherlands)

    Borsboom, D.; Haig, B.D.

    2013-01-01

    Unlike most other statistical frameworks, Bayesian statistical inference is wedded to a particular approach in the philosophy of science (see Howson & Urbach, 2006); this approach is called Bayesianism. Rather than being concerned with model fitting, this position in the philosophy of science

  14. Statistical Tests for Mixed Linear Models

    CERN Document Server

    Khuri, André I; Sinha, Bimal K

    2011-01-01

    An advanced discussion of linear models with mixed or random effects. In recent years a breakthrough has occurred in our ability to draw inferences from exact and optimum tests of variance component models, generating much research activity that relies on linear models with mixed and random effects. This volume covers the most important research of the past decade as well as the latest developments in hypothesis testing. It compiles all currently available results in the area of exact and optimum tests for variance component models and offers the only comprehensive treatment for these models a

  15. Statistical modelling of traffic safety development

    DEFF Research Database (Denmark)

    Christens, Peter

    2004-01-01

    there were 6861 injury traffic accidents reported by the police, resulting in 4519 minor injuries, 3946 serious injuries, and 431 fatalities. The general purpose of the research was to improve the insight into aggregated road safety methodology in Denmark. The aim was to analyse advanced statistical methods that were designed to study developments over time, including effects of interventions. This aim has been achieved by investigating variations in aggregated Danish traffic accident series and by applying state of the art methodologies to specific case studies. The thesis comprises an introduction...

  16. A statistical mechanical model for equilibrium ionization

    International Nuclear Information System (INIS)

    Macris, N.; Martin, P.A.; Pule, J.

    1990-01-01

    A quantum electron interacts with a classical gas of hard spheres and is in thermal equilibrium with it. The interaction is attractive and the electron can form a bound state with the classical particles. It is rigorously shown that in a well defined low density and low temperature limit, the ionization probability for the electron tends to the value predicted by the Saha formula for thermal ionization. In this regime, the electron is found to be in a statistical mixture of a bound and a free state. (orig.)

  17. Statistical image processing and multidimensional modeling

    CERN Document Server

    Fieguth, Paul

    2010-01-01

    Images are all around us! The proliferation of low-cost, high-quality imaging devices has led to an explosion in acquired images. When these images are acquired from a microscope, telescope, satellite, or medical imaging device, there is a statistical image processing task: the inference of something - an artery, a road, a DNA marker, an oil spill - from imagery, possibly noisy, blurry, or incomplete. A great many textbooks have been written on image processing. However this book does not so much focus on images, per se, but rather on spatial data sets, with one or more measurements taken over

  18. The SACE Review Panel's Final Report: Significant Flaws in the Analysis of Statistical Data

    Science.gov (United States)

    Gregory, Kelvin

    2006-01-01

    The South Australian Certificate of Education (SACE) is a credential and formal qualification within the Australian Qualifications Framework. A recent review of the SACE outlined a number of recommendations for significant changes to this certificate. These recommendations were the result of a process that began with the review panel…

  19. Fluctuations and correlations in statistical models of hadron production

    International Nuclear Information System (INIS)

    Gorenstein, M. I.

    2012-01-01

    An extension of the standard concept of the statistical ensembles is suggested. Namely, the statistical ensembles with extensive quantities fluctuating according to an externally given distribution are introduced. Applications in the statistical models of multiple hadron production in high energy physics are discussed.

  20. Analysis and Evaluation of Statistical Models for Integrated Circuits Design

    Directory of Open Access Journals (Sweden)

    Sáenz-Noval J.J.

    2011-10-01

    Statistical models for integrated circuits (IC) allow us to estimate the percentage of acceptable devices in the batch before fabrication. Currently, the Pelgrom model is the statistical model most widely accepted in the industry; however, it was derived from a micrometer technology, which does not guarantee reliability in nanometric manufacturing processes. This work considers three of the most relevant statistical models in the industry and evaluates their limitations and advantages in analog design, so that the designer has a better criterion to make a choice. Moreover, it shows how several statistical models can be used for each one of the stages and design purposes.
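
    As a rough illustration of how a mismatch model yields a percentage of acceptable devices, the sketch below uses the commonly quoted area term of the Pelgrom relation, sigma = A / sqrt(W·L), and assumes a zero-mean normal mismatch. The distance term is ignored, and the coefficient, geometry, and tolerance are hypothetical values, not figures from the article.

```python
import math

def pelgrom_sigma(a_coeff, width_um, length_um):
    """Mismatch standard deviation from the area term only: A / sqrt(W * L)."""
    return a_coeff / math.sqrt(width_um * length_um)

def fraction_within(tolerance, sigma):
    """Fraction of devices with |mismatch| <= tolerance for a zero-mean normal."""
    return math.erf(tolerance / (sigma * math.sqrt(2.0)))

# Hypothetical threshold-voltage matching coefficient of 5 mV*um.
sigma_mv = pelgrom_sigma(a_coeff=5.0, width_um=2.0, length_um=0.5)
acceptable = fraction_within(tolerance=10.0, sigma=sigma_mv)
print(f"sigma = {sigma_mv:.2f} mV, acceptable fraction = {acceptable:.4f}")
```

    Increasing the gate area lowers sigma and raises the acceptable fraction, which is the basic trade-off such models expose to the designer.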

  1. Modeling of uncertainties in statistical inverse problems

    International Nuclear Information System (INIS)

    Kaipio, Jari

    2008-01-01

    In all real world problems, the models that tie the measurements to the unknowns of interest, are at best only approximations for reality. While moderate modeling and approximation errors can be tolerated with stable problems, inverse problems are a notorious exception. Typical modeling errors include inaccurate geometry, unknown boundary and initial data, properties of noise and other disturbances, and simply the numerical approximations of the physical models. In principle, the Bayesian approach to inverse problems, in which all uncertainties are modeled as random variables, is capable of handling these uncertainties. Depending on the type of uncertainties, however, different strategies may be adopted. In this paper we give an overview of typical modeling errors and related strategies within the Bayesian framework.

  2. Interpretation of commonly used statistical regression models.

    Science.gov (United States)

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
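
    To make the interpretation of a logistic regression coefficient concrete, here is a minimal sketch that assumes the usual log-odds parameterisation: exponentiating a coefficient gives an odds ratio, and exponentiating the Wald limits gives an approximate 95% interval. The coefficient and standard error are hypothetical and do not come from the article's respiratory health study.

```python
import math

def odds_ratio_with_ci(beta, std_err, z=1.96):
    """Convert a log-odds coefficient to (odds ratio, lower limit, upper limit)."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * std_err)
    upper = math.exp(beta + z * std_err)
    return odds_ratio, lower, upper

# Hypothetical coefficient for a binary exposure variable.
ratio, lower, upper = odds_ratio_with_ci(beta=0.405, std_err=0.15)
print(f"odds ratio = {ratio:.2f}, approx. 95% CI = ({lower:.2f}, {upper:.2f})")
```

    An odds ratio above 1 means the odds of the outcome are higher in the exposed group, holding the other covariates fixed; in linear regression the coefficient is read directly as a change in the mean response.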

  3. Statistical modeling and extrapolation of carcinogenesis data

    International Nuclear Information System (INIS)

    Krewski, D.; Murdoch, D.; Dewanji, A.

    1986-01-01

    Mathematical models of carcinogenesis are reviewed, including pharmacokinetic models for metabolic activation of carcinogenic substances. Maximum likelihood procedures for fitting these models to epidemiological data are discussed, including situations where the time to tumor occurrence is unobservable. The plausibility of different possible shapes of the dose response curve at low doses is examined, and a robust method for linear extrapolation to low doses is proposed and applied to epidemiological data on radiation carcinogenesis

  4. Multivariate statistical modelling based on generalized linear models

    CERN Document Server

    Fahrmeir, Ludwig

    1994-01-01

    This book is concerned with the use of generalized linear models for univariate and multivariate regression analysis. Its emphasis is to provide a detailed introductory survey of the subject based on the analysis of real data drawn from a variety of subjects including the biological sciences, economics, and the social sciences. Where possible, technical details and proofs are deferred to an appendix in order to provide an accessible account for non-experts. Topics covered include: models for multi-categorical responses, model checking, time series and longitudinal data, random effects models, and state-space models. Throughout, the authors have taken great pains to discuss the underlying theoretical ideas in ways that relate well to the data at hand. As a result, numerous researchers whose work relies on the use of these models will find this an invaluable account to have on their desks. "The basic aim of the authors is to bring together and review a large part of recent advances in statistical modelling of m...

  5. Childhood-compared to adolescent-onset bipolar disorder has more statistically significant clinical correlates.

    Science.gov (United States)

    Holtzman, Jessica N; Miller, Shefali; Hooshmand, Farnaz; Wang, Po W; Chang, Kiki D; Hill, Shelley J; Rasgon, Natalie L; Ketter, Terence A

    2015-07-01

    The strengths and limitations of considering childhood- and adolescent-onset bipolar disorder (BD) separately versus together remain to be established. We assessed this issue. BD patients referred to the Stanford Bipolar Disorder Clinic during 2000-2011 were assessed with the Systematic Treatment Enhancement Program for BD Affective Disorders Evaluation. Patients with childhood- and adolescent-onset were compared to those with adult-onset for 7 unfavorable bipolar illness characteristics with replicated associations with early-onset patients. Among 502 BD outpatients, those with childhood- and adolescent- (13-18 years, N=218) onset had significantly higher rates for 4/7 unfavorable illness characteristics, including lifetime comorbid anxiety disorder, at least ten lifetime mood episodes, lifetime alcohol use disorder, and prior suicide attempt, than those with adult-onset (>18 years, N=174). Childhood- but not adolescent-onset BD patients also had significantly higher rates of first-degree relative with mood disorder, lifetime substance use disorder, and rapid cycling in the prior year. Patients with pooled childhood/adolescent- compared to adult-onset had significantly higher rates for 5/7 of these unfavorable illness characteristics, while patients with childhood- compared to adolescent-onset had significantly higher rates for 4/7 of these unfavorable illness characteristics. Caucasian, insured, suburban, low substance abuse, American specialty clinic-referred sample limits generalizability. Onset age is based on retrospective recall. Childhood- compared to adolescent-onset BD was more robustly related to unfavorable bipolar illness characteristics, so pooling these groups attenuated such relationships. Further study is warranted to determine the extent to which adolescent-onset BD represents an intermediate phenotype between childhood- and adult-onset BD. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. Statistical Modelling of Extreme Rainfall in Taiwan

    NARCIS (Netherlands)

    L-F. Chu (Lan-Fen); M.J. McAleer (Michael); C-C. Chang (Ching-Chung)

    2012-01-01

    In this paper, the annual maximum daily rainfall data from 1961 to 2010 are modelled for 18 stations in Taiwan. We fit the rainfall data with stationary and non-stationary generalized extreme value distributions (GEV), and estimate their future behaviour based on the best fitting model.
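
    As a rough illustration of how a fitted stationary GEV distribution is turned into an estimate of future behaviour, the sketch below evaluates the GEV distribution function and the T-year return level. The parameter values (location 200 mm, scale 80 mm, shape 0.1) are hypothetical and are not taken from the paper; the fitting step itself is not shown.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """Distribution function of the generalized extreme value distribution."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit (shape parameter equal to zero)
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # x lies outside the support: below it for xi > 0, above it for xi < 0
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_return_level(period, mu, sigma, xi):
    """Level exceeded on average once every `period` years (annual maxima)."""
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


# Hypothetical parameters for one station, in millimetres of daily rainfall.
MU, SIGMA, XI = 200.0, 80.0, 0.1
level_100 = gev_return_level(100, MU, SIGMA, XI)
print(f"100-year daily rainfall: {level_100:.1f} mm")
```

    In a non-stationary variant, one or more of the parameters would be made a function of time before the same formulas are applied.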

  7. Statistical Modelling of Extreme Rainfall in Taiwan

    NARCIS (Netherlands)

    L. Chu (LanFen); M.J. McAleer (Michael); C-H. Chang (Chu-Hsiang)

    2013-01-01

    In this paper, the annual maximum daily rainfall data from 1961 to 2010 are modelled for 18 stations in Taiwan. We fit the rainfall data with stationary and non-stationary generalized extreme value distributions (GEV), and estimate their future behaviour based on the best fitting model.

  8. The statistical significance of error probability as determined from decoding simulations for long codes

    Science.gov (United States)

    Massey, J. L.

    1976-01-01

    The very low error probability obtained with long error-correcting codes results in a very small number of observed errors in simulation studies of practical size and renders the usual confidence interval techniques inapplicable to the observed error probability. A natural extension of the notion of a 'confidence interval' is made and applied to such determinations of error probability by simulation. An example is included to show the surprisingly great significance of as few as two decoding errors in a very large number of decoding trials.
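
    To give a feel for how few observed errors can still bound the error probability tightly, the sketch below computes a standard exact one-sided upper confidence bound from a binomial model. This is the textbook Clopper-Pearson-style bound, not the extended notion of confidence interval that the abstract describes, and the trial count of one million is a hypothetical figure.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def upper_bound(errors, trials, confidence=0.95):
    """Error probability at which seeing `errors` or fewer has chance 1 - confidence."""
    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(200):  # bisection on the monotone function p -> binom_cdf
        mid = (lo + hi) / 2.0
        if binom_cdf(errors, trials, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical simulation: 2 decoding errors in 1,000,000 trials.
print(upper_bound(2, 1_000_000))
```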

  9. On the Logical Development of Statistical Models.

    Science.gov (United States)

    1983-12-01

    1978). "Modelos con parametros variables en el analisis de series temporales" Questiio, 4, 2, 75-87. [25] Seal, H. L. (1967). "The historical...example, a classical state-space representation of a simple time series model is: $y_t = \mu_t + u_t$, $\mu_t = \phi\,\mu_{t-1} + \epsilon_t$ (2.2) $u_t$ and $\epsilon_t$ are independent normal...on its past values is displayed in the structural equation. This approach has been particularly useful in time series models. For example, model (2.2

  10. A Noise Robust Statistical Texture Model

    DEFF Research Database (Denmark)

    Hilger, Klaus Baggesen; Stegmann, Mikkel Bille; Larsen, Rasmus

    2002-01-01

    This paper presents a novel approach to the problem of obtaining a low dimensional representation of texture (pixel intensity) variation present in a training set after alignment using a Generalised Procrustes analysis. We extend the conventional analysis of training textures in the Active Appearance Models segmentation framework. This is accomplished by augmenting the model with an estimate of the covariance of the noise present in the training data. This results in a more compact model maximising the signal-to-noise ratio, thus favouring subspaces rich on signal, but low on noise.

  11. Estimating Predictive Variance for Statistical Gas Distribution Modelling

    International Nuclear Information System (INIS)

    Lilienthal, Achim J.; Asadi, Sahar; Reggente, Matteo

    2009-01-01

    Recent publications in statistical gas distribution modelling have proposed algorithms that model mean and variance of a distribution. This paper argues that estimating the predictive concentration variance entails not only a gradual improvement but is rather a significant step to advance the field. This is, first, since the models much better fit the particular structure of gas distributions, which exhibit strong fluctuations with considerable spatial variations as a result of the intermittent character of gas dispersal. Second, because estimating the predictive variance allows to evaluate the model quality in terms of the data likelihood. This offers a solution to the problem of ground truth evaluation, which has always been a critical issue for gas distribution modelling. It also enables solid comparisons of different modelling approaches, and provides the means to learn meta parameters of the model, to determine when the model should be updated or re-initialised, or to suggest new measurement locations based on the current model. We also point out directions of related ongoing or potential future research work.

  12. Statistics

    CERN Document Server

    Hayslett, H T

    1991-01-01

    Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the

  13. 12th Workshop on Stochastic Models, Statistics and Their Applications

    CERN Document Server

    Rafajłowicz, Ewaryst; Szajowski, Krzysztof

    2015-01-01

    This volume presents the latest advances and trends in stochastic models and related statistical procedures. Selected peer-reviewed contributions focus on statistical inference, quality control, change-point analysis and detection, empirical processes, time series analysis, survival analysis and reliability, statistics for stochastic processes, big data in technology and the sciences, statistical genetics, experiment design, and stochastic models in engineering. Stochastic models and related statistical procedures play an important part in furthering our understanding of the challenging problems currently arising in areas of application such as the natural sciences, information technology, engineering, image analysis, genetics, energy and finance, to name but a few. This collection arises from the 12th Workshop on Stochastic Models, Statistics and Their Applications, Wroclaw, Poland.

  14. Materials Informatics: Statistical Modeling in Material Science.

    Science.gov (United States)

    Yosipof, Abraham; Shimanovich, Klimentiy; Senderowitz, Hanoch

    2016-12-01

    Material informatics is engaged with the application of informatic principles to materials science in order to assist in the discovery and development of new materials. Central to the field is the application of data mining techniques and in particular machine learning approaches, often referred to as Quantitative Structure Activity Relationship (QSAR) modeling, to derive predictive models for a variety of materials-related "activities". Such models can accelerate the development of new materials with favorable properties and provide insight into the factors governing these properties. Here we provide a comparison between medicinal chemistry/drug design and materials-related QSAR modeling and highlight the importance of developing new, materials-specific descriptors. We survey some of the most recent QSAR models developed in materials science with focus on energetic materials and on solar cells. Finally we present new examples of material-informatic analyses of solar cells libraries produced from metal oxides using combinatorial material synthesis. Different analyses lead to interesting physical insights as well as to the design of new cells with potentially improved photovoltaic parameters. © 2016 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. Statistical Significance of the Maximum Hardness Principle Applied to Some Selected Chemical Reactions.

    Science.gov (United States)

    Saha, Ranajit; Pan, Sudip; Chattaraj, Pratim K

    2016-11-05

    The validity of the maximum hardness principle (MHP) is tested in the cases of 50 chemical reactions, most of which are organic in nature and exhibit anomeric effect. To explore the effect of the level of theory on the validity of MHP in an exothermic reaction, B3LYP/6-311++G(2df,3pd) and LC-BLYP/6-311++G(2df,3pd) (def2-QZVP for iodine and mercury) levels are employed. Different approximations like the geometric mean of hardness and combined hardness are considered in case there are multiple reactants and/or products. It is observed that, based on the geometric mean of hardness, while 82% of the studied reactions obey the MHP at the B3LYP level, 84% of the reactions follow this rule at the LC-BLYP level. Most of the reactions possess the hardest species on the product side. A 50% null hypothesis is rejected at a 1% level of significance.
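
    The significance claim can be checked with a short calculation. The abstract does not say which test was used, so the sketch below assumes an exact one-sided binomial test of the 50% null hypothesis; 82% and 84% of 50 reactions correspond to 41 and 42 reactions.

```python
from math import comb


def binomial_upper_tail(successes, trials, p0=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p0)."""
    return sum(comb(trials, i) * p0 ** i * (1 - p0) ** (trials - i)
               for i in range(successes, trials + 1))


p_b3lyp = binomial_upper_tail(41, 50)    # 82% of 50 reactions
p_lc_blyp = binomial_upper_tail(42, 50)  # 84% of 50 reactions
print(p_b3lyp, p_lc_blyp)
```

    Both tail probabilities fall far below 0.01, which is consistent with rejecting the 50% null hypothesis at the 1% level.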

  16. Introduction to statistical modelling: linear regression.

    Science.gov (United States)

    Lunt, Mark

    2015-07-01

    In many studies we wish to assess how a range of variables are associated with a particular outcome and also determine the strength of such relationships so that we can begin to understand how these factors relate to each other at a population level. Ultimately, we may also be interested in predicting the outcome from a series of predictive factors available at, say, a routine clinic visit. In a recent article in Rheumatology, Desai et al. did precisely that when they studied the prediction of hip and spine BMD from hand BMD and various demographic, lifestyle, disease and therapy variables in patients with RA. This article aims to introduce the statistical methodology that can be used in such a situation and explain the meaning of some of the terms employed. It will also outline some common pitfalls encountered when performing such analyses. © The Author 2013. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  17. Model output statistics applied to wind power prediction

    Energy Technology Data Exchange (ETDEWEB)

    Joensen, A; Giebel, G; Landberg, L [Risoe National Lab., Roskilde (Denmark); Madsen, H; Nielsen, H A [The Technical Univ. of Denmark, Dept. of Mathematical Modelling, Lyngby (Denmark)

    1999-03-01

    Being able to predict the output of a wind farm online for a day or two in advance has significant advantages for utilities, such as a better possibility to schedule fossil fuelled power plants and a better position on electricity spot markets. In this paper prediction methods based on Numerical Weather Prediction (NWP) models are considered. The spatial resolution used in NWP models implies that these predictions are not valid locally at a specific wind farm. Furthermore, due to the non-stationary nature and complexity of the processes in the atmosphere, and occasional changes of NWP models, the deviation between the predicted and the measured wind will be time dependent. If observational data are available, and if the deviation between the predictions and the observations exhibits systematic behavior, this should be corrected for; if statistical methods are used, this approach is usually referred to as MOS (Model Output Statistics). The influence of atmospheric turbulence intensity, topography, prediction horizon length and auto-correlation of wind speed and power is considered, and to take the time-variations into account, adaptive estimation methods are applied. Three estimation techniques are considered and compared: Extended Kalman Filtering, recursive least squares and a new modified recursive least squares algorithm. (au) EU-JOULE-3. 11 refs.
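
    A minimal sketch of adaptive estimation in this spirit is standard recursive least squares with exponential forgetting, shown below. It is not the modified algorithm proposed in the paper, and the correction model (measured = a * predicted + b), the forgetting factor and the sample values are all assumptions made for illustration.

```python
def rls_fit(samples, forgetting=0.99, p0=1e6):
    """Track a and b in y = a * x + b by recursive least squares.

    `samples` is an iterable of (x, y) pairs, for example an NWP wind speed
    and the locally measured wind speed. A forgetting factor below 1
    discounts old data so the estimate can follow slow changes.
    """
    a, b = 0.0, 0.0
    # Symmetric 2x2 matrix P, stored as its three distinct entries.
    p11, p12, p22 = p0, 0.0, p0
    for x, y in samples:
        # Regressor is phi = [x, 1]; g = P @ phi.
        g1 = p11 * x + p12
        g2 = p12 * x + p22
        denom = forgetting + x * g1 + g2
        k1, k2 = g1 / denom, g2 / denom      # gain vector
        err = y - (a * x + b)                # one-step prediction error
        a += k1 * err
        b += k2 * err
        # P <- (P - K * phi^T P) / forgetting, using the symmetry of P.
        p11 = (p11 - k1 * g1) / forgetting
        p12 = (p12 - k1 * g2) / forgetting
        p22 = (p22 - k2 * g2) / forgetting
    return a, b


# Hypothetical noise-free data generated from y = 2x + 1.
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]
print(rls_fit(data))
```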

  18. Latent domain models for statistical machine translation

    NARCIS (Netherlands)

    Hoàng, C.

    2017-01-01

    A data-driven approach to model translation suffers from the data mismatch problem and demands domain adaptation techniques. Given parallel training data originating from a specific domain, training an MT system on the data would result in a rather suboptimal translation for other domains. But does

  19. Behavioral and statistical models of educational inequality

    DEFF Research Database (Denmark)

    Holm, Anders; Breen, Richard

    2016-01-01

    This paper addresses the question of how students and their families make educational decisions. We describe three types of behavioral model that might underlie decision-making and we show that they have consequences for what decisions are made. Our study thus has policy implications if we wish...

  20. Statistical modelling of fine red wine production

    Directory of Open Access Journals (Sweden)

    María Rosa Castro

    2010-01-01

    Producing wine is a very important economic activity in the province of San Juan in Argentina; it is therefore important to predict production from the quantity of raw material needed. This work was aimed at obtaining a model relating kilograms of crushed grape to the litres of wine so produced. Such a model will be used for predicting precise future values and confidence intervals for determined quantities of crushed grapes. Data from a vineyard in the province of San Juan were thus used in this work. The sample correlation coefficient was calculated and a scatter diagram was then constructed; this indicated a linear relationship between the litres of wine obtained and the kilograms of crushed grape. Two linear models were then adopted and variance analysis was carried out because the data came from normal populations having the same variance. The most appropriate model was obtained from this analysis; it was validated with experimental values, a good approximation being obtained.
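
    The first two steps described above, computing the sample correlation coefficient and fitting a straight line, can be sketched as follows. The grape and wine quantities are invented for illustration and are not the vineyard data used in the paper; the variance analysis and the confidence intervals are not reproduced.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient between two equally long sequences."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical observations: kilograms of crushed grape, litres of wine.
grape_kg = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]
wine_l = [705.0, 1040.0, 1410.0, 1745.0, 2110.0]

slope, intercept = fit_line(grape_kg, wine_l)
print(pearson_r(grape_kg, wine_l), slope, intercept)
print("predicted litres for 2200 kg:", slope * 2200.0 + intercept)
```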

  1. Sampling, Probability Models and Statistical Reasoning -RE ...

    Indian Academy of Sciences (India)

    random sampling allows data to be modelled with the help of probability ... g based on different trials to get an estimate of the experimental error. ... research interests lie in the .... if e is indeed the true value of the proportion of defectives in the.

  2. Statistical Model Checking for Product Lines

    DEFF Research Database (Denmark)

    ter Beek, Maurice H.; Legay, Axel; Lluch Lafuente, Alberto

    2016-01-01

    average cost of products (in terms of the attributes of the products’ features) and the probability of features to be (un)installed at runtime. The product lines must be modelled in QFLan, which extends the probabilistic feature-oriented language PFLan with novel quantitative constraints among features...

  3. Structured Statistical Models of Inductive Reasoning

    Science.gov (United States)

    Kemp, Charles; Tenenbaum, Joshua B.

    2009-01-01

    Everyday inductive inferences are often guided by rich background knowledge. Formal models of induction should aim to incorporate this knowledge and should explain how different kinds of knowledge lead to the distinctive patterns of reasoning found in different inductive contexts. This article presents a Bayesian framework that attempts to meet…

  4. Fluctuations of offshore wind generation: Statistical modelling

    DEFF Research Database (Denmark)

    Pinson, Pierre; Christensen, Lasse E.A.; Madsen, Henrik

    2007-01-01

    The magnitude of power fluctuations at large offshore wind farms has a significant impact on the control and management strategies of their power output. If focusing on the minute scale, one observes successive periods with smaller and larger power fluctuations. It seems that different regimes yi...

  5. Statistical Analysis and Modelling of Olkiluoto Structures

    International Nuclear Information System (INIS)

    Hellae, P.; Vaittinen, T.; Saksa, P.; Nummela, J.

    2004-11-01

    Posiva Oy is carrying out investigations for the disposal of the spent nuclear fuel at the Olkiluoto site in SW Finland. The investigations have focused on the central part of the island. The layout design of the entire repository requires characterization of notably larger areas and must rely at least at the current stage on borehole information from a rather sparse network and on the geophysical soundings providing information outside and between the holes. In this work, the structural data according to the current version of the Olkiluoto bedrock model is analyzed. The bedrock model relies much on the borehole data although results of the seismic surveys and, for example, pumping tests are used in determining the orientation and continuation of the structures. Especially in the analysis, questions related to the frequency of structures and size of the structures are discussed. The structures observed in the boreholes are mainly dipping gently to the southeast. About 9 % of the sample length belongs to structures. The proportion is higher in the upper parts of the rock. The number of fracture and crushed zones seems not to depend greatly on the depth, whereas the hydraulic features concentrate on the depth range above -100 m. Below level -300 m, the hydraulic conductivity occurs in connection of fractured zones. Especially the hydraulic features, but also fracture and crushed zones often occur in groups. The frequency of the structure (area of structures per total volume) is estimated to be of the order of 1/100m. The size of the local structures was estimated by calculating the intersection of the zone to the nearest borehole where the zone has not been detected. Stochastic models using the Fracman software by Golder Associates were generated based on the bedrock model data complemented with the magnetic ground survey data. The seismic surveys (from boreholes KR5, KR13, KR14, and KR19) were used as alternative input data. The generated models were tested by

  6. Modeling statistical properties of written text.

    Directory of Open Access Journals (Sweden)

    M Angeles Serrano

    Written text is one of the fundamental manifestations of human language, and the study of its universal regularities can give clues about how our brains process information and how we, as a society, organize and share it. Among these regularities, only Zipf's law has been explored in depth. Other basic properties, such as the existence of bursts of rare words in specific documents, have only been studied independently of each other and mainly by descriptive models. As a consequence, there is a lack of understanding of linguistic processes as complex emergent phenomena. Beyond Zipf's law for word frequencies, here we focus on burstiness, Heaps' law describing the sublinear growth of vocabulary size with the length of a document, and the topicality of document collections, which encode correlations within and across documents absent in random null models. We introduce and validate a generative model that explains the simultaneous emergence of all these patterns from simple rules. As a result, we find a connection between the bursty nature of rare words and the topical organization of texts and identify dynamic word ranking and memory across documents as key mechanisms explaining the nontrivial organization of written text. Our research can have broad implications and practical applications in computer science, cognitive science and linguistics.
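
    Heaps' law can be examined on any tokenized text by counting distinct words as the text grows. The sketch below does only that measurement, plus a crude log-log slope estimate; it is not the generative model introduced in the paper, and the toy sentence is a made-up input.

```python
import math


def vocabulary_growth(tokens):
    """Number of distinct tokens seen after each position in the text."""
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth


def heaps_exponent(growth):
    """Least-squares slope of log(vocabulary size) against log(text length)."""
    xs = [math.log(i + 1) for i in range(len(growth))]
    ys = [math.log(v) for v in growth]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


tokens = "the cat sat on the mat the cat".split()
growth = vocabulary_growth(tokens)
print(growth, heaps_exponent(growth))
```

    An exponent below 1 indicates sublinear vocabulary growth; a meaningful estimate needs a far longer text than this toy input.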

  7. Advanced data analysis in neuroscience integrating statistical and computational models

    CERN Document Server

    Durstewitz, Daniel

    2017-01-01

    This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanatory frameworks, but become powerfu...

  8. Statistics

    Science.gov (United States)

    Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.

  9. Sigsearch: a new term for post hoc unplanned search for statistically significant relationships with the intent to create publishable findings.

    Science.gov (United States)

    Hashim, Muhammad Jawad

    2010-09-01

    Post-hoc secondary data analysis with no prespecified hypotheses has been discouraged by textbook authors and journal editors alike. Unfortunately no single term describes this phenomenon succinctly. I would like to coin the term "sigsearch" to define this practice and bring it within the teaching lexicon of statistics courses. Sigsearch would include any unplanned, post-hoc search for statistical significance using multiple comparisons of subgroups. It would also include data analysis with outcomes other than the prespecified primary outcome measure of a study as well as secondary data analyses of earlier research.
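
    A short calculation shows why an unplanned search across many comparisons is discouraged. The sketch assumes independent tests, each run at the same significance level; the count of 20 subgroup comparisons is a hypothetical example, and the Bonferroni threshold is included only as one common correction.

```python
def familywise_error_rate(comparisons, alpha=0.05):
    """Chance of at least one false positive among independent true-null tests."""
    return 1.0 - (1.0 - alpha) ** comparisons


def bonferroni_threshold(comparisons, alpha=0.05):
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / comparisons


print(familywise_error_rate(20))
print(bonferroni_threshold(20))
```

    With 20 comparisons at the 5% level, the chance of at least one spurious "significant" result is roughly 64%.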

  10. Statistical mechanics of the cluster Ising model

    International Nuclear Information System (INIS)

    Smacchia, Pietro; Amico, Luigi; Facchi, Paolo; Fazio, Rosario; Florio, Giuseppe; Pascazio, Saverio; Vedral, Vlatko

    2011-01-01

    We study a Hamiltonian system describing a three-spin-1/2 clusterlike interaction competing with an Ising-like antiferromagnetic interaction. We compute free energy, spin-correlation functions, and entanglement both in the ground and in thermal states. The model undergoes a quantum phase transition between an Ising phase with a nonvanishing magnetization and a cluster phase characterized by a string order. Any two-spin entanglement is found to vanish in both quantum phases because of a nontrivial correlation pattern. Nevertheless, the residual multipartite entanglement is maximal in the cluster phase and dependent on the magnetization in the Ising phase. We study the block entropy at the critical point and calculate the central charge of the system, showing that the criticality of the system is beyond the Ising universality class.

  11. Statistical mechanics of the cluster Ising model

    Energy Technology Data Exchange (ETDEWEB)

    Smacchia, Pietro [SISSA - via Bonomea 265, I-34136, Trieste (Italy); Amico, Luigi [CNR-MATIS-IMM and Dipartimento di Fisica e Astronomia Universita di Catania, C/O ed. 10, viale Andrea Doria 6, I-95125 Catania (Italy); Facchi, Paolo [Dipartimento di Matematica and MECENAS, Universita di Bari, I-70125 Bari (Italy); INFN, Sezione di Bari, I-70126 Bari (Italy); Fazio, Rosario [NEST, Scuola Normale Superiore and Istituto Nanoscienze - CNR, 56126 Pisa (Italy); Center for Quantum Technology, National University of Singapore, 117542 Singapore (Singapore); Florio, Giuseppe; Pascazio, Saverio [Dipartimento di Fisica and MECENAS, Universita di Bari, I-70126 Bari (Italy); INFN, Sezione di Bari, I-70126 Bari (Italy); Vedral, Vlatko [Center for Quantum Technology, National University of Singapore, 117542 Singapore (Singapore); Department of Physics, National University of Singapore, 2 Science Drive 3, Singapore 117542 (Singapore); Department of Physics, University of Oxford, Clarendon Laboratory, Oxford, OX1 3PU (United Kingdom)

    2011-08-15

    We study a Hamiltonian system describing a three-spin-1/2 clusterlike interaction competing with an Ising-like antiferromagnetic interaction. We compute free energy, spin-correlation functions, and entanglement both in the ground and in thermal states. The model undergoes a quantum phase transition between an Ising phase with a nonvanishing magnetization and a cluster phase characterized by a string order. Any two-spin entanglement is found to vanish in both quantum phases because of a nontrivial correlation pattern. Nevertheless, the residual multipartite entanglement is maximal in the cluster phase and dependent on the magnetization in the Ising phase. We study the block entropy at the critical point and calculate the central charge of the system, showing that the criticality of the system is beyond the Ising universality class.

  12. A Statistical Graphical Model of the California Reservoir System

    Science.gov (United States)

    Taeb, A.; Reager, J. T.; Turmon, M.; Chandrasekaran, V.

    2017-11-01

    The recent California drought has highlighted the potential vulnerability of the state's water management infrastructure to multiyear dry intervals. Due to the high complexity of the network, dynamic storage changes in California reservoirs on a state-wide scale have previously been difficult to model using either traditional statistical or physical approaches. Indeed, although there is a significant line of research on exploring models for single (or a small number of) reservoirs, these approaches are not amenable to a system-wide modeling of the California reservoir network due to the spatial and hydrological heterogeneities of the system. In this work, we develop a state-wide statistical graphical model to characterize the dependencies among a collection of 55 major California reservoirs across the state; this model is defined with respect to a graph in which the nodes index reservoirs and the edges specify the relationships or dependencies between reservoirs. We obtain and validate this model in a data-driven manner based on reservoir volumes over the period 2003-2016. A key feature of our framework is a quantification of the effects of external phenomena that influence the entire reservoir network. We further characterize the degree to which physical factors (e.g., state-wide Palmer Drought Severity Index (PDSI), average temperature, snow pack) and economic factors (e.g., consumer price index, number of agricultural workers) explain these external influences. As a consequence of this analysis, we obtain a system-wide health diagnosis of the reservoir network as a function of PDSI.

  13. Functional summary statistics for the Johnson-Mehl model

    DEFF Research Database (Denmark)

    Møller, Jesper; Ghorbani, Mohammad

    The Johnson-Mehl germination-growth model is a spatio-temporal point process model which, among other things, has been used for the description of neurotransmitter datasets. However, for such datasets parametric Johnson-Mehl models fitted by maximum likelihood have not yet been evaluated by means of functional summary statistics. This paper therefore introduces four functional summary statistics adapted to the Johnson-Mehl model, with two of them based on the second-order properties and the other two on the nuclei-boundary distances for the associated Johnson-Mehl tessellation. The theoretical properties of the functional summary statistics are investigated, non-parametric estimators are suggested, and their usefulness for model checking is examined in a simulation study. The functional summary statistics are also used for checking fitted parametric Johnson-Mehl models for a neurotransmitter dataset.

  14. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  15. Inclusion of temperature dependence of fission barriers in statistical model calculations

    International Nuclear Information System (INIS)

    Newton, J.O.; Popescu, D.G.; Leigh, J.R.

    1990-08-01

    The temperature dependence of fission barriers has been interpolated from the results of recent theoretical calculations and included in the statistical model code PACE2. It is shown that the inclusion of temperature dependence causes significant changes to the values of the statistical model parameters deduced from fits to experimental data. 21 refs., 2 figs

  16. ClusterSignificance: A bioconductor package facilitating statistical analysis of class cluster separations in dimensionality reduced data

    DEFF Research Database (Denmark)

    Serviss, Jason T.; Gådin, Jesper R.; Eriksson, Per

    2017-01-01

    , e.g. genes in a specific pathway, alone can separate samples into these established classes. Despite this, the evaluation of class separations is often subjective and performed via visualization. Here we present the ClusterSignificance package: a set of tools designed to assess the statistical significance of class separations downstream of dimensionality reduction algorithms. In addition, we demonstrate the design and utility of the ClusterSignificance package and utilize it to determine the importance of long non-coding RNA expression in the identity of multiple hematological malignancies.

  17. Statistical limitations in functional neuroimaging. I. Non-inferential methods and statistical models.

    Science.gov (United States)

    Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P

    1999-01-01

    Functional neuroimaging (FNI) provides experimental access to the intact living brain making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. There are several methods available to analyse FNI data indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview over some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149

  18. Statistics

    International Nuclear Information System (INIS)

    2005-01-01

    For the years 2004 and 2005 the figures shown in the tables of Energy Review are partly preliminary. The annual statistics published in Energy Review are presented in more detail in a publication called Energy Statistics that comes out yearly. Energy Statistics also includes historical time-series over a longer period of time (see e.g. Energy Statistics, Statistics Finland, Helsinki 2004.) The applied energy units and conversion coefficients are shown on the back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuel use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO2 emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes, precautionary stock fees and oil pollution fees

  19. Mixed deterministic statistical modelling of regional ozone air pollution

    KAUST Repository

    Kalenderski, Stoitchko

    2011-03-17

    We develop a physically motivated statistical model for regional ozone air pollution by separating the ground-level pollutant concentration field into three components: transport, local production, and a large-scale mean trend mostly dominated by emission rates. The model is novel in the field of environmental spatial statistics in that it is a combined deterministic-statistical model, which gives a new perspective on the modelling of air pollution. The model is presented in a Bayesian hierarchical formalism and explicitly accounts for advection of pollutants, using the advection equation. We apply the model to a specific case of regional ozone pollution: the Lower Fraser Valley of British Columbia, Canada. As a predictive tool, we demonstrate that the model vastly outperforms existing, simpler modelling approaches. Our study highlights the importance of simultaneously considering different aspects of an air pollution problem as well as taking into account the physical bases that govern the processes of interest. © 2011 John Wiley & Sons, Ltd.

  20. Huffman and linear scanning methods with statistical language models.

    Science.gov (United States)

    Roark, Brian; Fried-Oken, Melanie; Gibbons, Chris

    2015-03-01

    Current scanning access methods for text generation in AAC devices are limited to relatively few options, most notably row/column variations within a matrix. We present Huffman scanning, a new method for applying statistical language models to binary-switch, static-grid typing AAC interfaces, and compare it to other scanning options under a variety of conditions. We present results for 16 adults without disabilities and one 36-year-old man with locked-in syndrome who presents with complex communication needs and uses AAC scanning devices for writing. Huffman scanning with a statistical language model yielded significant typing speedups for the 16 participants without disabilities versus any of the other methods tested, including two row/column scanning methods. A similar pattern of results was found with the individual with locked-in syndrome. Interestingly, faster typing speeds were obtained with Huffman scanning using a more leisurely scan rate than relatively fast individually calibrated scan rates. Overall, the results reported here demonstrate great promise for the usability of Huffman scanning as a faster alternative to row/column scanning.
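
    The core idea behind the record above, assigning short binary codes to likely symbols, can be illustrated with a small sketch. It assumes that each code bit corresponds to one yes/no switch decision and that the language model supplies a probability for every symbol; the abstract does not spell out the scanning procedure, so the function names and the toy probabilities below are illustrative only.

```python
import heapq
from itertools import count


def huffman_code(probs):
    """Build a binary Huffman code for a {symbol: probability} mapping."""
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single symbol still needs one decision to confirm it.
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        # Merge the two least probable subtrees into one.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in codes0.items()}
        merged.update({sym: "1" + code for sym, code in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


def expected_code_length(probs, codes):
    """Average number of binary decisions needed per symbol."""
    return sum(p * len(codes[sym]) for sym, p in probs.items())


# Toy symbol probabilities (hypothetical, not taken from the paper).
toy_probs = {"e": 0.5, "t": 0.25, "a": 0.125, "o": 0.125}
toy_codes = huffman_code(toy_probs)
```

    With these toy probabilities the most likely symbol needs a single decision, while a fixed four-symbol row/column layout would need two decisions for every symbol, which is the kind of saving a probability-driven code is meant to provide.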

  1. Statistical Method to Overcome Overfitting Issue in Rational Function Models

    Science.gov (United States)

    Alizadeh Moghaddam, S. H.; Mokhtarzade, M.; Alizadeh Naeini, A.; Alizadeh Moghaddam, S. A.

    2017-09-01

    Rational function models (RFMs) are among the most appealing models and are extensively applied in geometric correction of satellite images and map production. Overfitting is a common issue in terrain-dependent RFMs that degrades the accuracy of RFM-derived geospatial products. This issue, resulting from the high number of RFM parameters, leads to ill-posedness of the RFMs. To tackle this problem, this study proposes a fast and robust statistical approach and compares it to the Tikhonov regularization (TR) method, a frequently used solution to RFM overfitting. In the proposed method, a statistical significance test is applied to search for the RFM parameters that are resistant to overfitting. The performance of the proposed method was evaluated on two real data sets of Cartosat-1 satellite images. The results demonstrate the efficiency of the proposed method in terms of the achievable level of accuracy: the technique shows an improvement of 50-80% over TR.

  2. A Model of Statistics Performance Based on Achievement Goal Theory.

    Science.gov (United States)

    Bandalos, Deborah L.; Finney, Sara J.; Geske, Jenenne A.

    2003-01-01

    Tests a model of statistics performance based on achievement goal theory. Both learning and performance goals affected achievement indirectly through study strategies, self-efficacy, and test anxiety. Implications of these findings for teaching and learning statistics are discussed. (Contains 47 references, 3 tables, 3 figures, and 1 appendix.)…

  3. Kolmogorov complexity, pseudorandom generators and statistical models testing

    Czech Academy of Sciences Publication Activity Database

    Šindelář, Jan; Boček, Pavel

    2002-01-01

    Vol. 38, No. 6 (2002), pp. 747-759 ISSN 0023-5954 R&D Projects: GA ČR GA102/99/1564 Institutional research plan: CEZ:AV0Z1075907 Keywords: Kolmogorov complexity * pseudorandom generators * statistical models testing Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.341, year: 2002

  4. Statistical properties of several models of fractional random point processes

    Science.gov (United States)

    Bendjaballah, C.

    2011-08-01

    Statistical properties of several models of fractional random point processes have been analyzed from the counting and time interval statistics points of view. Based on the criterion of the reduced variance, it is seen that such processes exhibit nonclassical properties. The conditions for these processes to be treated as conditional Poisson processes are examined. Numerical simulations illustrate part of the theoretical calculations.

  5. Statistical Damage Detection of Civil Engineering Structures using ARMAV Models

    DEFF Research Database (Denmark)

    Andersen, P.; Kirkegaard, Poul Henning

    In this paper, statistically based damage detection of a lattice steel mast is performed. By estimating the modal parameters and their uncertainties, it is possible to detect whether some of the modal parameters have changed with statistical significance. The estimation of the uncertainties ...

  6. Statistics

    International Nuclear Information System (INIS)

    2001-01-01

    For the year 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the annual publication Energiatilastot - Energy Statistics, which also includes historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in the volume of GNP and energy consumption; Changes in the volume of GNP and electricity; Coal consumption; Natural gas consumption; Peat consumption; Domestic oil deliveries; Import prices of oil; Consumer prices of principal oil products; Fuel prices for heat production; Fuel prices for electricity production; Carbon dioxide emissions from the use of fossil fuels; Total energy consumption by source and CO2 emissions; Electricity supply; Energy imports by country of origin in 2000; Energy exports by recipient country in 2000; Consumer prices of liquid fuels; Consumer prices of hard coal, natural gas and indigenous fuels; Average electricity price by type of consumer; Price of district heating by type of consumer; Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources; and Energy taxes and precautionary stock fees on oil products.

  7. Statistics

    International Nuclear Information System (INIS)

    2000-01-01

    For the years 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the annual publication Energiatilastot - Energy Statistics, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in the volume of GNP and energy consumption; Changes in the volume of GNP and electricity; Coal consumption; Natural gas consumption; Peat consumption; Domestic oil deliveries; Import prices of oil; Consumer prices of principal oil products; Fuel prices for heat production; Fuel prices for electricity production; Carbon dioxide emissions; Total energy consumption by source and CO2 emissions; Electricity supply; Energy imports by country of origin in January-March 2000; Energy exports by recipient country in January-March 2000; Consumer prices of liquid fuels; Consumer prices of hard coal, natural gas and indigenous fuels; Average electricity price by type of consumer; Price of district heating by type of consumer; Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources; and Energy taxes and precautionary stock fees on oil products.

  8. Statistics

    International Nuclear Information System (INIS)

    1999-01-01

    For the years 1998 and 1999, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the annual publication Energiatilastot - Energy Statistics, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in the volume of GNP and energy consumption; Changes in the volume of GNP and electricity; Coal consumption; Natural gas consumption; Peat consumption; Domestic oil deliveries; Import prices of oil; Consumer prices of principal oil products; Fuel prices for heat production; Fuel prices for electricity production; Carbon dioxide emissions; Total energy consumption by source and CO2 emissions; Electricity supply; Energy imports by country of origin in January-June 1999; Energy exports by recipient country in January-June 1999; Consumer prices of liquid fuels; Consumer prices of hard coal, natural gas and indigenous fuels; Average electricity price by type of consumer; Price of district heating by type of consumer; Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources; and Energy taxes and precautionary stock fees on oil products.

  9. Assessing Statistically Significant Heavy-Metal Concentrations in Abandoned Mine Areas via Hot Spot Analysis of Portable XRF Data.

    Science.gov (United States)

    Kim, Sung-Min; Choi, Yosoon

    2017-06-18

    To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH), high content with a low z-score (HL), low content with a high z-score (LH), and low content with a low z-score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1-4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required.

  10. Assessing Statistically Significant Heavy-Metal Concentrations in Abandoned Mine Areas via Hot Spot Analysis of Portable XRF Data

    Directory of Open Access Journals (Sweden)

    Sung-Min Kim

    2017-06-01

    To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH), high content with a low z-score (HL), low content with a high z-score (LH), and low content with a low z-score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1–4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required.
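
    The hot spot z-score and the four-group labelling described in the two records above can be sketched as follows. This is a minimal illustration, assuming binary distance-band weights (every sample within `radius` of a location, including the location itself, gets weight 1) and the standard textbook form of the Getis-Ord Gi* statistic; the study's actual weighting scheme and cut-off values are not given in the abstract, so `radius`, `content_cut` and `z_cut` are hypothetical parameters.

```python
import math


def getis_ord_gi_star(points, values, radius):
    """Return the Gi* z-score of every sample location.

    points : list of (x, y) coordinates
    values : measured content at each point
    radius : distance band defining the neighbourhood (assumed, binary weights)
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for xi, yi in points:
        # The "star" variant includes the location itself in its neighbourhood.
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= radius else 0.0
             for xj, yj in points]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        numerator = sum(wj * v for wj, v in zip(w, values)) - mean * sum_w
        denominator = std * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # A zero denominator means no spatial contrast (for example, every
        # point is a neighbour of every other point); report 0 in that case.
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


def classify_sample(content, z_score, content_cut, z_cut=1.96):
    """Label a sample HH, HL, LH or LL (first letter: content, second: z-score)."""
    content_label = "H" if content >= content_cut else "L"
    z_label = "H" if z_score >= z_cut else "L"
    return content_label + z_label
```

    Under these assumptions a sample labelled HL (high content, low z-score) is an isolated high reading with no high-valued neighbours, which matches the abstract's remark that such cases may reflect measurement error.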

  11. Improving statistical reasoning: theoretical models and practical implications

    CERN Document Server

    Sedlmeier, Peter

    1999-01-01

    This book focuses on how statistical reasoning works and on training programs that can exploit people's natural cognitive capabilities to improve their statistical reasoning. Training programs that take into account findings from evolutionary psychology and instructional theory are shown to have substantially larger effects that are more stable over time than previous training regimens. The theoretical implications are traced in a neural network model of human performance on statistical reasoning problems. This book appeals to judgment and decision making researchers and other cognitive scientists, as well as to teachers of statistics and probabilistic reasoning.

  12. Statistical validation of normal tissue complication probability models

    NARCIS (Netherlands)

    Xu, Cheng-Jian; van der Schaaf, Arjen; van t Veld, Aart; Langendijk, Johannes A.; Schilstra, Cornelis

    2012-01-01

    PURPOSE: To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. METHODS AND MATERIALS: A penalized regression method, LASSO (least absolute shrinkage

  13. Some remarks on the statistical model of heavy ion collisions

    International Nuclear Information System (INIS)

    Koch, V.

    2003-01-01

    This contribution is an attempt to assess what can be learned from the remarkable success of this statistical model in describing ratios of particle abundances in ultra-relativistic heavy ion collisions

  14. Eigenfunction statistics for Anderson model with Hölder continuous ...

    Indian Academy of Sciences (India)

    The Institute of Mathematical Sciences, Taramani, Chennai 600 113, India ... Anderson model; Hölder continuous measure; Poisson statistics. ...... [4] Combes J-M, Hislop P D and Klopp F, An optimal Wegner estimate and its application to.

  15. A nonextensive statistical model for the nucleon structure function

    International Nuclear Information System (INIS)

    Trevisan, Luis A.; Mirez, Carlos

    2013-01-01

    We studied an application of nonextensive thermodynamics to describe the structure function of the nucleon, in a model where the usual Fermi-Dirac and Bose-Einstein energy distributions were replaced by their q-statistical equivalents. The parameters of the model are an effective temperature T, the q parameter (from Tsallis statistics), and two chemical potentials fixed by the corresponding up (u) and down (d) quark normalization in the nucleon.
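
    For orientation, one common way of writing q-deformed quantum distributions is shown below. The abstract does not give the paper's own expressions, so this is only an assumed convention (sign conventions for the q-exponential differ between authors), with E the parton energy, μ the chemical potential, and the upper and lower signs referring to the Fermi-Dirac-like and Bose-Einstein-like cases.

```latex
% Assumed convention; the paper's exact definitions may differ.
\[
  n_q(E) \;=\; \frac{1}{\Bigl[\,1 + (q-1)\,\dfrac{E-\mu}{T}\Bigr]^{\frac{1}{q-1}} \pm 1},
  \qquad
  \lim_{q \to 1} n_q(E) \;=\; \frac{1}{e^{(E-\mu)/T} \pm 1}.
\]
```

    The limit on the right shows that the usual Fermi-Dirac and Bose-Einstein forms are recovered as q approaches 1, so q measures the departure from ordinary (extensive) statistics.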

  16. Statistical models and NMR analysis of polymer microstructure

    Science.gov (United States)

    Statistical models can be used in conjunction with NMR spectroscopy to study polymer microstructure and polymerization mechanisms. Thus, Bernoullian, Markovian, and enantiomorphic-site models are well known. Many additional models have been formulated over the years for additional situations. Typica...

  17. Statistics

    International Nuclear Information System (INIS)

    2003-01-01

    For the year 2002, the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g. Energiatilastot 2001, Statistics Finland, Helsinki 2002). The applied energy units and conversion coefficients are shown on the inside back cover of the Review. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption; Carbon dioxide emissions from fossil fuel use; Coal consumption; Consumption of natural gas; Peat consumption; Domestic oil deliveries; Import prices of oil; Consumer prices of principal oil products; Fuel prices in heat production; Fuel prices in electricity production; Price of electricity by type of consumer; Average monthly spot prices at the Nord Pool power exchange; Total energy consumption by source and CO2 emissions; Supply and total consumption of electricity (GWh); Energy imports by country of origin in January-June 2003; Energy exports by recipient country in January-June 2003; Consumer prices of liquid fuels; Consumer prices of hard coal, natural gas and indigenous fuels; Price of natural gas by type of consumer; Price of electricity by type of consumer; Price of district heating by type of consumer; Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources; and Excise taxes, precautionary stock fees and oil pollution fees on energy products.

  18. Statistics

    International Nuclear Information System (INIS)

    2004-01-01

    For the years 2003 and 2004, the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g. Energiatilastot, Statistics Finland, Helsinki 2003, ISSN 0785-3165). The applied energy units and conversion coefficients are shown on the inside back cover of the Review. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption; Carbon dioxide emissions from fossil fuel use; Coal consumption; Consumption of natural gas; Peat consumption; Domestic oil deliveries; Import prices of oil; Consumer prices of principal oil products; Fuel prices in heat production; Fuel prices in electricity production; Price of electricity by type of consumer; Average monthly spot prices at the Nord Pool power exchange; Total energy consumption by source and CO2 emissions; Supplies and total consumption of electricity (GWh); Energy imports by country of origin in January-March 2004; Energy exports by recipient country in January-March 2004; Consumer prices of liquid fuels; Consumer prices of hard coal, natural gas and indigenous fuels; Price of natural gas by type of consumer; Price of electricity by type of consumer; Price of district heating by type of consumer; Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources; and Excise taxes, precautionary stock fees and oil pollution fees.

  19. Statistics

    International Nuclear Information System (INIS)

    2000-01-01

    For the years 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after the tables and figures. The figures present: Changes in the volume of GNP and energy consumption; Changes in the volume of GNP and electricity; Coal consumption; Natural gas consumption; Peat consumption; Domestic oil deliveries; Import prices of oil; Consumer prices of principal oil products; Fuel prices for heat production; Fuel prices for electricity production; Carbon dioxide emissions; Total energy consumption by source and CO2 emissions; Electricity supply; Energy imports by country of origin in January-June 2000; Energy exports by recipient country in January-June 2000; Consumer prices of liquid fuels; Consumer prices of hard coal, natural gas and indigenous fuels; Average electricity price by type of consumer; Price of district heating by type of consumer; Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources; and Energy taxes and precautionary stock fees on oil products.

  20. What's statistical about learning? Insights from modelling statistical learning as a set of memory processes.

    Science.gov (United States)

    Thiessen, Erik D

    2017-01-05

    Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274, 1926-1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105, 2745-2750; Thiessen & Yee 2010 Child Development 81, 1287-1303; Saffran 2002 Journal of Memory and Language 47, 172-196; Misyak & Christiansen 2012 Language Learning 62, 302-331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39, 246-263; Thiessen et al. 2013 Psychological Bulletin 139, 792-814). From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik

  1. Evaluating significance in linear mixed-effects models in R.

    Science.gov (United States)

    Luke, Steven G

    2017-08-01

    Mixed-effects models are being used ever more frequently in the analysis of experimental data. However, in the lme4 package in R the standards for evaluating significance of fixed effects in these models (i.e., obtaining p-values) are somewhat vague. There are good reasons for this, but as researchers who are using these models are required in many cases to report p-values, some method for evaluating the significance of the model output is needed. This paper reports the results of simulations showing that the two most common methods for evaluating significance, using likelihood ratio tests and applying the z distribution to the Wald t values from the model output (t-as-z), are somewhat anti-conservative, especially for smaller sample sizes. Other methods for evaluating significance, including parametric bootstrapping and the Kenward-Roger and Satterthwaite approximations for degrees of freedom, were also evaluated. The results of these simulations suggest that Type 1 error rates are closest to .05 when models are fitted using REML and p-values are derived using the Kenward-Roger or Satterthwaite approximations, as these approximations both produced acceptable Type 1 error rates even for smaller samples.
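
    The "t-as-z" shortcut mentioned in the record above amounts to reading a Wald t value as if it were a standard normal deviate. The sketch below illustrates only that arithmetic in Python; it is not the lme4 workflow evaluated in the paper, and the function name is illustrative.

```python
import math


def t_as_z_p_value(t_value):
    """Two-sided p-value obtained by treating a Wald t value as a z-score.

    Uses the identity 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)),
    where Phi is the standard normal distribution function.
    """
    return math.erfc(abs(t_value) / math.sqrt(2.0))
```

    Because the normal distribution has lighter tails than a t distribution with few degrees of freedom, this shortcut yields smaller p-values than a finite-degrees-of-freedom test would, which is consistent with the anti-conservative behaviour the abstract reports for smaller samples.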

  2. Corrected Statistical Energy Analysis Model for Car Interior Noise

    Directory of Open Access Journals (Sweden)

    A. Putra

    2015-01-01

    Statistical energy analysis (SEA) is a well-known method to analyze the flow of acoustic and vibration energy in a complex structure. For an acoustic space where significant absorptive materials are present, the direct field component from the sound source, rather than the reverberant field, dominates the total sound field, whereas the reverberant field is the basis of the conventional SEA model. Such an environment can be found in a car interior, and thus a corrected SEA model is proposed here to address this situation. The model is developed by eliminating the direct field component from the total sound field so that only the power after the first reflection is considered. A test car cabin was divided into two subsystems and, using a loudspeaker as a sound source, the power injection method in SEA was employed to obtain the corrected coupling loss factor and the damping loss factor from the corrected SEA model. These parameters were then used to predict the sound pressure level (SPL) in the interior cabin using the injected input power from the engine. The results show satisfactory agreement with the directly measured SPL.

  3. Flashover of a vacuum-insulator interface: A statistical model

    Directory of Open Access Journals (Sweden)

    W. A. Stygar

    2004-07-01

    We have developed a statistical model for the flashover of a 45° vacuum-insulator interface (such as would be found in an accelerator) subject to a pulsed electric field. The model assumes that the initiation of a flashover plasma is a stochastic process, that the characteristic statistical component of the flashover delay time is much greater than the plasma formative time, and that the average rate at which flashovers occur is a power-law function of the instantaneous value of the electric field. Under these conditions, we find that the flashover probability is given by $1-\exp(-E_{p}^{\beta} t_{\mathrm{eff}} C/k^{\beta})$, where $E_{p}$ is the peak value in time of the spatially averaged electric field $E(t)$, $t_{\mathrm{eff}} \equiv \int [E(t)/E_{p}]^{\beta}\,dt$ is the effective pulse width, $C$ is the insulator circumference, $k \propto \exp(\lambda/d)$, and $\beta$ and $\lambda$ are constants. We define $E(t)$ as $V(t)/d$, where $V(t)$ is the voltage across the insulator and $d$ is the insulator thickness. Since the model assumes that flashovers occur at random azimuthal locations along the insulator, it does not apply to systems that have a significant defect, i.e., a location contaminated with debris or compromised by an imperfection at which flashovers repeatedly take place, and which prevents a random spatial distribution. The model is consistent with flashover measurements to within 7% for pulse widths between 0.5 ns and 10 μs, and to within a factor of 2 between 0.5 ns and 90 s (a span of over 11 orders of magnitude). For these measurements, $E_{p}$ ranges from 64 to 651 kV/cm, $d$ from 0.50 to 4.32 cm, and $C$ from 4.96 to 95.74 cm. The model is significantly more accurate, and is valid over a wider range of parameters, than the J. C. Martin flashover relation that has been in use since 1971 [J. C. Martin on Pulsed Power, edited by T. H. Martin, A. H. Guenther, and M. Kristiansen (Plenum, New York, 1996)]. We have generalized the statistical model to estimate the total-flashover probability of an
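
    The flashover probability quoted in the record above can be evaluated numerically once a field waveform is sampled. The sketch below assumes the waveform is given as matching lists of times and field values and integrates the effective pulse width with the trapezoidal rule; the values of `k` and `beta` must come from the paper's fits, which the abstract does not list, so any numbers passed in are placeholders, and `k` must carry units that make the exponent dimensionless.

```python
import math


def effective_pulse_width(times, fields, beta):
    """Return (E_p, t_eff) with t_eff = integral of [E(t)/E_p]**beta dt.

    times, fields : sampled waveform E(t); trapezoidal integration is assumed.
    """
    e_peak = max(fields)
    t_eff = 0.0
    for i in range(1, len(times)):
        left = (fields[i - 1] / e_peak) ** beta
        right = (fields[i] / e_peak) ** beta
        t_eff += 0.5 * (left + right) * (times[i] - times[i - 1])
    return e_peak, t_eff


def flashover_probability(e_peak, t_eff, circumference, k, beta):
    """Evaluate 1 - exp(-E_p**beta * t_eff * C / k**beta)."""
    exponent = (e_peak / k) ** beta * t_eff * circumference
    return 1.0 - math.exp(-exponent)
```

    For a flat-topped pulse the effective width equals the pulse duration, and because the field enters through a power law the probability rises steeply with the peak field for large `beta`.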

  4. Models for probability and statistical inference: theory and applications

    CERN Document Server

    Stapleton, James H

    2007-01-01

    This concise, yet thorough, book is enhanced with simulations and graphs to build the intuition of readers. Models for Probability and Statistical Inference was written over a five-year period and serves as a comprehensive treatment of the fundamentals of probability and statistical inference. With detailed theoretical coverage found throughout the book, readers acquire the fundamentals needed to advance to more specialized topics, such as sampling, linear models, design of experiments, statistical computing, survival analysis, and bootstrapping. Ideal as a textbook for a two-semester sequence on probability and statistical inference, early chapters provide coverage on probability and include discussions of: discrete models and random variables; discrete distributions including binomial, hypergeometric, geometric, and Poisson; continuous, normal, gamma, and conditional distributions; and limit theory. Since limit theory is usually the most difficult topic for readers to master, the author thoroughly discusses mo...

  5. A Statistical Model for Regional Tornado Climate Studies.

    Directory of Open Access Journals (Sweden)

    Thomas H Jagger

    Tornado reports are locally rare, often clustered, and of variable quality making it difficult to use them directly to describe regional tornado climatology. Here a statistical model is demonstrated that overcomes some of these difficulties and produces a smoothed regional-scale climatology of tornado occurrences. The model is applied to data aggregated at the level of counties. These data include annual population, annual tornado counts and an index of terrain roughness. The model has a term to capture the smoothed frequency relative to the state average. The model is used to examine whether terrain roughness is related to tornado frequency and whether there are differences in tornado activity by County Warning Area (CWA). A key finding is that tornado reports increase by 13% for a two-fold increase in population across Kansas after accounting for improvements in rating procedures. Independent of this relationship, tornadoes have been increasing at an annual rate of 1.9%. Another finding is the pattern of correlated residuals showing more Kansas tornadoes in a corridor of counties running roughly north to south across the west central part of the state consistent with the dryline climatology. The model is significantly improved by adding terrain roughness. The effect amounts to an 18% reduction in the number of tornadoes for every ten meter increase in elevation standard deviation. The model indicates that tornadoes are 51% more likely to occur in counties served by the CWAs of DDC and GID than elsewhere in the state. Flexibility of the model is illustrated by fitting it to data from Illinois, Mississippi, South Dakota, and Ohio.

  6. A Statistical Model for Regional Tornado Climate Studies.

    Science.gov (United States)

    Jagger, Thomas H; Elsner, James B; Widen, Holly M

    2015-01-01

    Tornado reports are locally rare, often clustered, and of variable quality making it difficult to use them directly to describe regional tornado climatology. Here a statistical model is demonstrated that overcomes some of these difficulties and produces a smoothed regional-scale climatology of tornado occurrences. The model is applied to data aggregated at the level of counties. These data include annual population, annual tornado counts and an index of terrain roughness. The model has a term to capture the smoothed frequency relative to the state average. The model is used to examine whether terrain roughness is related to tornado frequency and whether there are differences in tornado activity by County Warning Area (CWA). A key finding is that tornado reports increase by 13% for a two-fold increase in population across Kansas after accounting for improvements in rating procedures. Independent of this relationship, tornadoes have been increasing at an annual rate of 1.9%. Another finding is the pattern of correlated residuals showing more Kansas tornadoes in a corridor of counties running roughly north to south across the west central part of the state consistent with the dryline climatology. The model is significantly improved by adding terrain roughness. The effect amounts to an 18% reduction in the number of tornadoes for every ten meter increase in elevation standard deviation. The model indicates that tornadoes are 51% more likely to occur in counties served by the CWAs of DDC and GID than elsewhere in the state. Flexibility of the model is illustrated by fitting it to data from Illinois, Mississippi, South Dakota, and Ohio.
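
    As a worked illustration of the population figure quoted above: if one assumes a power-law dependence of expected reports on population (an assumption made here for illustration, not a description of the authors' model), a 13% increase per two-fold increase in population corresponds to an exponent of log2(1.13), roughly 0.18. The function names below are hypothetical.

```python
import math


def doubling_exponent(increase_per_doubling: float) -> float:
    """Exponent b in reports ∝ population**b implied by a fractional
    increase per two-fold increase in population (assumed power law)."""
    return math.log2(1.0 + increase_per_doubling)


def report_ratio(population_ratio: float, exponent: float) -> float:
    """Ratio of expected reports for a given ratio of populations."""
    return population_ratio ** exponent


# 13% more reports per doubling, as quoted in the abstract.
b = doubling_exponent(0.13)

# Under the assumed power law, a four-fold population means two doublings.
ratio_4x = report_ratio(4.0, b)
print(f"exponent = {b:.3f}, four-fold population ratio = {ratio_4x:.4f}")
```

    Under this assumption a four-fold population difference compounds two doublings, giving 1.13 squared, or about a 28% increase in reports.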

  7. Statistics Based Models for the Dynamics of Chernivtsi Children Disease

    Directory of Open Access Journals (Sweden)

    Igor G. Nesteruk

    2017-10-01

    Background. Simple mathematical models of contamination and an SIR model of the spread of an infection were used to simulate the time dynamics of a previously unknown children's disease that occurred in Chernivtsi (Ukraine). The cause of many cases of alopecia, which began in this city in August 1988, is still not fully clarified. According to the official report of the governmental commission, the last new cases occurred in the middle of November 1988, and the cause of the illness was reported as chemical exogenous intoxication. Later this illness was given the name “Chernivtsi chemical disease”. Nevertheless, a significantly increased number of new cases of local alopecia was registered for almost three years and is still not explained. Objective. To compare two different versions of the disease, chemical exogenous intoxication and infection; to identify the parameters of the mathematical models; and to predict the development of the disease. Methods. Analytical solutions of the contamination models and the SIR model for an epidemic were obtained. The optimal parameter values were found using linear regression. Results. The optimal values of the model parameters were identified using a statistical approach. The calculations showed that the infectious version of the disease is more reliable than the popular contamination version. The possible starting date of the epidemic was estimated. Conclusions. The optimal parameters of the SIR model allow calculation of a realistic number of victims and other characteristics of a possible epidemic. They also show that the increased number of cases of local alopecia could be part of the same epidemic as “Chernivtsi chemical disease”.
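
    For readers unfamiliar with the SIR model mentioned in this record, the following is a minimal sketch of the standard susceptible-infected-removed equations. It uses simple Euler time stepping rather than the analytical solutions the abstract refers to, and the parameter values are hypothetical, not taken from the paper.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the classic SIR equations with explicit Euler steps.

    dS/dt = -beta*S*I/N,  dI/dt = beta*S*I/N - gamma*I,  dR/dt = gamma*I
    Returns one (S, I, R) tuple per day, including day 0.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r                      # total population stays constant
    steps_per_day = round(1.0 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        history.append((s, i, r))
    return history


# Hypothetical parameters chosen only to produce a visible epidemic curve.
history = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=100)
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print(f"infections peak on day {peak_day}")
```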

  8. Right-sizing statistical models for longitudinal data.

    Science.gov (United States)

    Wood, Phillip K; Steinley, Douglas; Jackson, Kristina M

    2015-12-01

    Arguments are proposed that researchers using longitudinal data should consider more and less complex statistical model alternatives to their initially chosen techniques in an effort to "right-size" the model to the data at hand. Such model comparisons may alert researchers who use poorly fitting, overly parsimonious models to more complex, better-fitting alternatives and, alternatively, may identify more parsimonious alternatives to overly complex (and perhaps empirically underidentified and/or less powerful) statistical models. A general framework is proposed for considering (often nested) relationships between a variety of psychometric and growth curve models. A 3-step approach is proposed in which models are evaluated based on the number and patterning of variance components prior to selection of better-fitting growth models that explain both mean and variation-covariation patterns. The orthogonal free curve slope intercept (FCSI) growth model is considered a general model that includes, as special cases, many models, including the factor mean (FM) model (McArdle & Epstein, 1987), McDonald's (1967) linearly constrained factor model, hierarchical linear models (HLMs), repeated-measures multivariate analysis of variance (MANOVA), and the linear slope intercept (linearSI) growth model. The FCSI model, in turn, is nested within the Tuckerized factor model. The approach is illustrated by comparing alternative models in a longitudinal study of children's vocabulary and by comparing several candidate parametric growth and chronometric models in a Monte Carlo study.

  9. A Stochastic Fractional Dynamics Model of Rainfall Statistics

    Science.gov (United States)

    Kundu, Prasun; Travis, James

    2013-04-01

    Rainfall varies in space and time in a highly irregular manner and is described naturally in terms of a stochastic process. A characteristic feature of rainfall statistics is that they depend strongly on the space-time scales over which rain data are averaged. A spectral model of precipitation has been developed based on a stochastic differential equation of fractional order for the point rain rate, which allows a concise description of the second moment statistics of rain at any prescribed space-time averaging scale. The model is designed to faithfully reflect the scale dependence and is thus capable of providing a unified description of the statistics of both radar and rain gauge data. The underlying dynamical equation can be expressed in terms of space-time derivatives of fractional orders that are adjusted together with other model parameters to fit the data. The form of the resulting spectrum gives the model adequate flexibility to capture the subtle interplay between the spatial and temporal scales of variability of rain but strongly constrains the predicted statistical behavior as a function of the averaging length and time scales. The main restriction is the assumption that the statistics of the precipitation field are spatially homogeneous and isotropic and stationary in time. We test the model with radar and gauge data collected contemporaneously at the NASA TRMM ground validation sites located near Melbourne, Florida and in Kwajalein Atoll, Marshall Islands in the tropical Pacific. We estimate the parameters by tuning them to the second moment statistics of the radar data. The model predictions are then found to fit the second moment statistics of the gauge data reasonably well without any further adjustment. Some data sets containing periods of non-stationary behavior that involve occasional anomalously correlated rain events present a challenge for the model.

  10. Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry.

    Science.gov (United States)

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Sacks, David B; Yu, Yi-Kuo

    2018-06-05

    Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.

  11. Variability aware compact model characterization for statistical circuit design optimization

    Science.gov (United States)

    Qiao, Ying; Qian, Kun; Spanos, Costas J.

    2012-03-01

    Variability modeling at the compact transistor model level can enable statistically optimized designs in view of limitations imposed by the fabrication technology. In this work we propose an efficient variability-aware compact model characterization methodology based on the linear propagation of variance. Hierarchical spatial variability patterns of selected compact model parameters are directly calculated from transistor array test structures. This methodology has been implemented and tested using transistor I-V measurements and the EKV-EPFL compact model. Calculation results compare well to full-wafer direct model parameter extractions. Further studies are done on the proper selection of both compact model parameters and electrical measurement metrics used in the method.

  12. Linear mixed models a practical guide using statistical software

    CERN Document Server

    West, Brady T; Galecki, Andrzej T

    2006-01-01

    Simplifying the often confusing array of software programs for fitting linear mixed models (LMMs), Linear Mixed Models: A Practical Guide Using Statistical Software provides a basic introduction to primary concepts, notation, software implementation, model interpretation, and visualization of clustered and longitudinal data. This easy-to-navigate reference details the use of procedures for fitting LMMs in five popular statistical software packages: SAS, SPSS, Stata, R/S-plus, and HLM. The authors introduce basic theoretical concepts, present a heuristic approach to fitting LMMs based on bo

  13. Speech emotion recognition based on statistical pitch model

    Institute of Scientific and Technical Information of China (English)

    WANG Zhiping; ZHAO Li; ZOU Cairong

    2006-01-01

    A modified Parzen-window method, which keeps high resolution at low frequencies and smoothness at high frequencies, is proposed to obtain a statistical model. Then, a gender classification method utilizing the statistical model is proposed, which achieves 98% gender classification accuracy when long sentences are processed. After separating male and female voices, the mean and standard deviation of speech training samples with different emotions are used to create the corresponding emotion models. The Bhattacharyya distance between the test sample and the statistical models of pitch is then utilized for emotion recognition in speech. The normalization of pitch for male and female voices is also considered, in order to map them into a uniform space. Finally, a speech emotion recognition experiment based on K Nearest Neighbor shows that a correct rate of 81% is achieved, compared with only 73.85% when the traditional parameters are utilized.
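
    To make the distance step concrete, here is a minimal sketch of the Bhattacharyya distance between two univariate Gaussian models, each summarized by a mean and a standard deviation. Treating each emotion model as a single Gaussian is an assumption made for illustration; the emotion labels and pitch values are hypothetical and do not come from the paper.

```python
import math


def bhattacharyya_gaussian(mu1, sigma1, mu2, sigma2):
    """Bhattacharyya distance between N(mu1, sigma1^2) and N(mu2, sigma2^2)."""
    var_sum = sigma1 ** 2 + sigma2 ** 2
    mean_term = 0.25 * (mu1 - mu2) ** 2 / var_sum
    spread_term = 0.5 * math.log(var_sum / (2.0 * sigma1 * sigma2))
    return mean_term + spread_term


# Hypothetical pitch models: label -> (mean pitch in Hz, standard deviation).
emotion_models = {"neutral": (120.0, 15.0), "angry": (180.0, 30.0)}


def closest_emotion(mu, sigma, models):
    """Return the label whose model is nearest to the test-sample statistics."""
    return min(
        models,
        key=lambda label: bhattacharyya_gaussian(mu, sigma, *models[label]),
    )


print(closest_emotion(125.0, 16.0, emotion_models))
```

    The distance is zero for identical models and grows with both the separation of the means and the mismatch of the spreads, which is why it can rank candidate models for a test sample.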

  14. Multiple commodities in statistical microeconomics: Model and market

    Science.gov (United States)

    Baaquie, Belal E.; Yu, Miao; Du, Xin

    2016-11-01

    A statistical generalization of microeconomics has been made in Baaquie (2013). In Baaquie et al. (2015), the market behavior of single commodities was analyzed and it was shown that market data provides strong support for the statistical microeconomic description of commodity prices. The case of multiple commodities is studied and a parsimonious generalization of the single commodity model is made for the multiple commodities case. Market data shows that the generalization can accurately model the simultaneous correlation functions of up to four commodities. To accurately model five or more commodities, further terms have to be included in the model. This study shows that the statistical microeconomics approach is a comprehensive and complete formulation of microeconomics that is independent of the mainstream formulation of microeconomics.

  15. Modelling diversity in building occupant behaviour: a novel statistical approach

    DEFF Research Database (Denmark)

    Haldi, Frédéric; Calì, Davide; Andersen, Rune Korsholm

    2016-01-01

    We propose an advanced modelling framework to predict the scope and effects of behavioural diversity regarding building occupant actions on window openings, shading devices and lighting. We develop a statistical approach based on generalised linear mixed models to account for the longitudinal nat...

  16. A classical statistical model of heavy ion collisions

    International Nuclear Information System (INIS)

    Schmidt, R.; Teichert, J.

    1980-01-01

    The use of the computer code TRAJEC which represents the numerical realization of a classical statistical model for heavy ion collisions is described. The code calculates the results of a classical friction model as well as various multi-differential cross sections for heavy ion collisions. INPUT and OUTPUT information of the code are described. Two examples of data sets are given.

  17. On an uncorrelated jet model with Bose-Einstein statistics

    International Nuclear Information System (INIS)

    Bilic, N.; Dadic, I.; Martinis, M.

    1978-01-01

    Starting from the density of states of an ideal Bose-Einstein gas, an uncorrelated jet model with Bose-Einstein statistics has been formulated. The transition to continuum is based on the Touschek invariant measure. It has been shown that in this model average multiplicity increases logarithmically with total energy, while the inclusive distribution shows ln s violation of scaling. (author)

  18. Complex Data Modeling and Computationally Intensive Statistical Methods

    CERN Document Server

    Mantovan, Pietro

    2010-01-01

    The last years have seen the advent and development of many devices able to record and store an always increasing amount of complex and high dimensional data; 3D images generated by medical scanners or satellite remote sensing, DNA microarrays, real time financial data, system control datasets. The analysis of this data poses new challenging problems and requires the development of novel statistical models and computational methods, fueling many fascinating and fast growing research areas of modern statistics. The book offers a wide variety of statistical methods and is addressed to statistici

  19. Validation of statistical models for creep rupture by parametric analysis

    Energy Technology Data Exchange (ETDEWEB)

    Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)

    2012-01-15

    Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. - Highlights: ► The paper discusses the validation of creep rupture models derived from statistical analysis. ► It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. ► The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. ► The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).

  20. Statistical Validation of Engineering and Scientific Models: Background

    International Nuclear Information System (INIS)

    Hills, Richard G.; Trucano, Timothy G.

    1999-01-01

    A tutorial is presented discussing the basic issues associated with propagation of uncertainty analysis and statistical validation of engineering and scientific models. The propagation of uncertainty tutorial illustrates the use of the sensitivity method and the Monte Carlo method to evaluate the uncertainty in predictions for linear and nonlinear models. Four example applications are presented: a linear model, a model for the behavior of a damped spring-mass system, a transient thermal conduction model, and a nonlinear transient convective-diffusive model based on Burgers' equation. Correlated and uncorrelated model input parameters are considered. The model validation tutorial builds on the material presented in the propagation of uncertainty tutorial and uses the damped spring-mass system as the example application. The validation tutorial illustrates several concepts associated with the application of statistical inference to test model predictions against experimental observations. Several validation methods are presented including error band based, multivariate, sum of squares of residuals, and optimization methods. After completion of the tutorial, a survey of statistical model validation literature is presented and recommendations for future work are made.
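
    The two propagation approaches named in this record can be contrasted in a few lines. The sketch below is illustrative only and is not taken from the tutorial: it assumes a single uncertain, normally distributed input, uses a first-order (sensitivity) estimate based on a finite-difference slope, and compares it with a Monte Carlo estimate. The two example models and all numerical values are hypothetical.

```python
import math
import random
import statistics


def sensitivity_std(model, x_mean, x_std, rel_step=1e-6):
    """First-order estimate: output std ≈ |dy/dx| * input std."""
    h = rel_step * max(abs(x_mean), 1.0)
    slope = (model(x_mean + h) - model(x_mean - h)) / (2.0 * h)
    return abs(slope) * x_std


def monte_carlo_std(model, x_mean, x_std, n_samples=20000, seed=0):
    """Sample the input, evaluate the model, take the sample std of outputs."""
    rng = random.Random(seed)
    outputs = [model(rng.gauss(x_mean, x_std)) for _ in range(n_samples)]
    return statistics.stdev(outputs)


def linear_model(x):
    return 3.0 * x + 1.0


def natural_frequency(stiffness, mass=1.0):
    """Undamped natural frequency sqrt(k/m), a simple nonlinear example."""
    return math.sqrt(stiffness / mass)


for name, model, mean, std in [
    ("linear", linear_model, 2.0, 0.2),
    ("sqrt(k/m)", natural_frequency, 100.0, 5.0),
]:
    first_order = sensitivity_std(model, mean, std)
    sampled = monte_carlo_std(model, mean, std)
    print(f"{name}: sensitivity {first_order:.4f}, Monte Carlo {sampled:.4f}")
```

    For the linear model the first-order estimate is exact and the Monte Carlo value differs only by sampling noise; for nonlinear models the two can diverge as the input uncertainty grows.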

  1. Shell model in large spaces and statistical spectroscopy

    International Nuclear Information System (INIS)

    Kota, V.K.B.

    1996-01-01

    For many nuclear structure problems of current interest it is essential to deal with shell model in large spaces. For this, three different approaches are now in use and two of them are: (i) the conventional shell model diagonalization approach but taking into account new advances in computer technology; (ii) the shell model Monte Carlo method. A brief overview of these two methods is given. Large space shell model studies raise fundamental questions regarding the information content of the shell model spectrum of complex nuclei. This led to the third approach- the statistical spectroscopy methods. The principles of statistical spectroscopy have their basis in nuclear quantum chaos and they are described (which are substantiated by large scale shell model calculations) in some detail. (author)

  2. Computationally efficient statistical differential equation modeling using homogenization

    Science.gov (United States)

    Hooten, Mevin B.; Garlick, Martha J.; Powell, James A.

    2013-01-01

    Statistical models using partial differential equations (PDEs) to describe dynamically evolving natural systems are appearing in the scientific literature with some regularity in recent years. Often such studies seek to characterize the dynamics of temporal or spatio-temporal phenomena such as invasive species, consumer-resource interactions, community evolution, and resource selection. Specifically, in the spatial setting, data are often available at varying spatial and temporal scales. Additionally, the necessary numerical integration of a PDE may be computationally infeasible over the spatial support of interest. We present an approach to impose computationally advantageous changes of support in statistical implementations of PDE models and demonstrate its utility through simulation using a form of PDE known as “ecological diffusion.” We also apply a statistical ecological diffusion model to a data set involving the spread of mountain pine beetle (Dendroctonus ponderosae) in Idaho, USA.

  3. Growth Curve Models and Applications : Indian Statistical Institute

    CERN Document Server

    2017-01-01

    Growth curve models in longitudinal studies are widely used to model population size, body height, biomass, fungal growth, and other variables in the biological sciences, but these statistical methods for modeling growth curves and analyzing longitudinal data also extend to general statistics, economics, public health, demographics, epidemiology, SQC, sociology, nano-biotechnology, fluid mechanics, and other applied areas. There is no one-size-fits-all approach to growth measurement. The selected papers in this volume build on presentations from the GCM workshop held at the Indian Statistical Institute, Giridih, on March 28-29, 2016. They represent recent trends in GCM research on different subject areas, both theoretical and applied. This book includes tools and possibilities for further work through new techniques and modification of existing ones. The volume includes original studies, theoretical findings and case studies from a wide range of applied work, and these contributions have been externally r...

  4. Development of a statistical oil spill model for risk assessment.

    Science.gov (United States)

    Guo, Weijun

    2017-11-01

    To gain a better understanding of the impacts from potential risk sources, we developed an oil spill model using a probabilistic method, which simulates numerous oil spill trajectories under varying environmental conditions. The statistical results were quantified from hypothetical oil spills under multiple scenarios, including area affected probability, mean oil slick thickness, and duration of water surface exposed to floating oil. The three sub-indices together with marine area vulnerability are merged to compute the composite index, characterizing the spatial distribution of risk degree. The integral of the index can be used to identify the overall risk from an emission source. The developed model has been successfully applied to the comparison and selection of an appropriate oil port construction location adjacent to a marine protected area for Phoca largha in China. The results highlight the importance of selecting among candidate sites before project construction, since risk estimates for two adjacent potential sources may turn out to be significantly different given hydrodynamic conditions and eco-environmental sensitivity.

  5. Statistical modelling for recurrent events: an application to sports injuries.

    Science.gov (United States)

    Ullah, Shahid; Gabbett, Tim J; Finch, Caroline F

    2014-09-01

    Injuries are often recurrent, with subsequent injuries influenced by previous occurrences and hence correlation between events needs to be taken into account when analysing such data. This paper compares five different survival models (Cox proportional hazards (CoxPH) model and the following generalisations to recurrent event data: Andersen-Gill (A-G), frailty, Wei-Lin-Weissfeld total time (WLW-TT) marginal, Prentice-Williams-Peterson gap time (PWP-GT) conditional models) for the analysis of recurrent injury data. Empirical evaluation and comparison of different models were performed using model selection criteria and goodness-of-fit statistics. Simulation studies assessed the size and power of each model fit. The modelling approach is demonstrated through direct application to Australian National Rugby League recurrent injury data collected over the 2008 playing season. Of the 35 players analysed, 14 (40%) players had more than 1 injury and 47 contact injuries were sustained over 29 matches. The CoxPH model provided the poorest fit to the recurrent sports injury data. The fit was improved with the A-G and frailty models, compared to WLW-TT and PWP-GT models. Despite little difference in model fit between the A-G and frailty models, in the interest of fewer statistical assumptions it is recommended that, where relevant, future studies involving modelling of recurrent sports injury data use the frailty model in preference to the CoxPH model or its other generalisations. The paper provides a rationale for future statistical modelling approaches for recurrent sports injury.

  6. Statistical Model of the 2001 Czech Census for Interactive Presentation

    Czech Academy of Sciences Publication Activity Database

    Grim, Jiří; Hora, Jan; Boček, Pavel; Somol, Petr; Pudil, Pavel

    Vol. 26, č. 4 (2010), s. 1-23 ISSN 0282-423X R&D Projects: GA ČR GA102/07/1594; GA MŠk 1M0572 Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : Interactive statistical model * census data presentation * distribution mixtures * data modeling * EM algorithm * incomplete data * data reproduction accuracy * data mining Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.492, year: 2010 http://library.utia.cas.cz/separaty/2010/RO/grim-0350513.pdf

  7. The Statistical Modeling of the Trends Concerning the Romanian Population

    Directory of Open Access Journals (Sweden)

    Gabriela OPAIT

    2014-11-01

    This paper presents the statistical modeling of the resident population of Romania, that is, the total Romanian population, by means of the „Least Squares Method”. Any country develops through the growth of its population, and hence of its workforce, which is a factor influencing the growth of the Gross Domestic Product (G.D.P.). The „Least Squares Method” is a statistical technique for determining the best-fit trend line for a model.
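
    As a brief illustration of the technique named in this record, the sketch below fits a straight trend line by ordinary least squares using the closed-form slope and intercept. A linear trend is assumed here for simplicity; the abstract does not say which trend function the author used, and the sample series is hypothetical rather than actual Romanian population data.

```python
def fit_trend_line(t, y):
    """Return (slope, intercept) minimizing the sum of squared residuals
    of y ≈ intercept + slope * t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return slope, intercept


# Hypothetical series: years since a base year, population in millions.
years = [0, 1, 2, 3, 4]
population = [10.0, 9.8, 9.6, 9.4, 9.2]

slope, intercept = fit_trend_line(years, population)
forecast_year_5 = intercept + slope * 5
print(f"trend: y = {intercept:.2f} + ({slope:.2f}) * t")
```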

  8. The distribution of P-values in medical research articles suggested selective reporting associated with statistical significance.

    Science.gov (United States)

    Perneger, Thomas V; Combescure, Christophe

    2017-07-01

    Published P-values provide a window into the global enterprise of medical research. The aim of this study was to use the distribution of published P-values to estimate the relative frequencies of null and alternative hypotheses and to seek irregularities suggestive of publication bias. This cross-sectional study included P-values published in 120 medical research articles in 2016 (30 each from the BMJ, JAMA, Lancet, and New England Journal of Medicine). The observed distribution of P-values was compared with expected distributions under the null hypothesis (i.e., uniform between 0 and 1) and the alternative hypothesis (strictly decreasing from 0 to 1). P-values were categorized according to conventional levels of statistical significance and in one-percent intervals. Among 4,158 recorded P-values, 26.1% were highly significant (P [...] values equal to 1, and (3) about twice as many P-values less than 0.05 compared with those more than 0.05. The latter finding was seen in both randomized trials and observational studies, and in most types of analyses, excepting heterogeneity tests and interaction tests. Under plausible assumptions, we estimate that about half of the tested hypotheses were null and the other half were alternative. This analysis suggests that statistical tests published in medical journals are not a random sample of null and alternative hypotheses but that selective reporting is prevalent. In particular, significant results are about twice as likely to be reported as nonsignificant results.

  9. Statistically significant dependence of the Xaa-Pro peptide bond conformation on secondary structure and amino acid sequence

    Directory of Open Access Journals (Sweden)

    Leitner Dietmar

    2005-04-01

    Background: A reliable prediction of the Xaa-Pro peptide bond conformation would be a useful tool for many protein structure calculation methods. We have analyzed the Protein Data Bank and show that the combined use of sequential and structural information has a predictive value for the assessment of the cis versus trans peptide bond conformation of Xaa-Pro within proteins. For the analysis of the data sets different statistical methods such as the calculation of the Chou-Fasman parameters and occurrence matrices were used. Furthermore we analyzed the relationship between the relative solvent accessibility and the relative occurrence of prolines in the cis and in the trans conformation. Results: One of the main results of the statistical investigations is the ranking of the secondary structure and sequence information with respect to the prediction of the Xaa-Pro peptide bond conformation. We observed a significant impact of secondary structure information on the occurrence of the Xaa-Pro peptide bond conformation, while the sequence information of amino acids neighboring proline is of little predictive value for the conformation of this bond. Conclusion: In this work, we present an extensive analysis of the occurrence of the cis and trans proline conformation in proteins. Based on the data set, we derived patterns and rules for a possible prediction of the proline conformation. Upon adoption of the Chou-Fasman parameters, we are able to derive statistically relevant correlations between the secondary structure of amino acid fragments and the Xaa-Pro peptide bond conformation.

  10. Applied systems ecology: models, data, and statistical methods

    Energy Technology Data Exchange (ETDEWEB)

    Eberhardt, L L

    1976-01-01

    In this report, systems ecology is largely equated to mathematical or computer simulation modelling. The need for models in ecology stems from the necessity to have an integrative device for the diversity of ecological data, much of which is observational, rather than experimental, as well as from the present lack of a theoretical structure for ecology. Different objectives in applied studies require specialized methods. The best predictive devices may be regression equations, often non-linear in form, extracted from much more detailed models. A variety of statistical aspects of modelling, including sampling, are discussed. Several aspects of population dynamics and food-chain kinetics are described, and it is suggested that the two presently separated approaches should be combined into a single theoretical framework. It is concluded that future efforts in systems ecology should emphasize actual data and statistical methods, as well as modelling.

  11. Analyzing sickness absence with statistical models for survival data

    DEFF Research Database (Denmark)

    Christensen, Karl Bang; Andersen, Per Kragh; Smith-Hansen, Lars

    2007-01-01

    OBJECTIVES: Sickness absence is the outcome in many epidemiologic studies and is often based on summary measures such as the number of sickness absences per year. In this study the use of modern statistical methods was examined by making better use of the available information. Since sickness...... absence data deal with events occurring over time, the use of statistical models for survival data has been reviewed, and the use of frailty models has been proposed for the analysis of such data. METHODS: Three methods for analyzing data on sickness absences were compared using a simulation study...... involving the following: (i) Poisson regression using a single outcome variable (number of sickness absences), (ii) analysis of time to first event using the Cox proportional hazards model, and (iii) frailty models, which are random effects proportional hazards models. Data from a study of the relation...

  12. Automated robust generation of compact 3D statistical shape models

    Science.gov (United States)

    Vrtovec, Tomaz; Likar, Bostjan; Tomazevic, Dejan; Pernus, Franjo

    2004-05-01

    Ascertaining the detailed shape and spatial arrangement of anatomical structures is important not only within diagnostic settings but also in the areas of planning, simulation, intraoperative navigation, and tracking of pathology. Robust, accurate and efficient automated segmentation of anatomical structures is difficult because of their complexity and inter-patient variability. Furthermore, the position of the patient during image acquisition, the imaging device and protocol, image resolution, and other factors induce additional variations in shape and appearance. Statistical shape models (SSMs) have proven quite successful in capturing structural variability. A possible approach to obtain a 3D SSM is to extract reference voxels by precisely segmenting the structure in one reference image. The corresponding voxels in the other images are determined by registering the reference image to each of them. The SSM obtained in this way describes statistically plausible shape variations over the given population as well as variations due to imperfect registration. In this paper, we present a completely automated method that significantly reduces shape variations induced by imperfect registration, thus allowing a more accurate description of variations. At each iteration, the derived SSM is used for coarse registration, which is further improved by describing finer variations of the structure. The method was tested on 64 lumbar spinal column CT scans, from which 23, 38, 45, 46 and 42 volumes of interest containing vertebra L1, L2, L3, L4 and L5, respectively, were extracted. Separate SSMs were generated for each vertebra. The results show that the method is capable of reducing the variations induced by registration errors.

  13. A Review of Modeling Bioelectrochemical Systems: Engineering and Statistical Aspects

    Directory of Open Access Journals (Sweden)

    Shuai Luo

    2016-02-01

    Bioelectrochemical systems (BES) are promising technologies to convert organic compounds in wastewater to electrical energy through a series of complex physical-chemical, biological and electrochemical processes. Representative BES such as microbial fuel cells (MFCs) have been studied and advanced for energy recovery. Substantial experimental and modeling efforts have been made for investigating the processes involved in electricity generation toward the improvement of the BES performance for practical applications. However, there are many parameters that will potentially affect these processes, thereby making the optimization of system performance hard to achieve. Mathematical models, including engineering models and statistical models, are powerful tools to help understand the interactions among the parameters in BES and perform optimization of BES configuration/operation. This review paper aims to introduce and discuss the recent developments of BES modeling from engineering and statistical aspects, including analysis of the model structure, description of application cases and sensitivity analysis of various parameters. It is expected to serve as a compass for integrating the engineering and statistical modeling strategies to improve model accuracy for BES development.

  14. New robust statistical procedures for the polytomous logistic regression models.

    Science.gov (United States)

    Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro

    2018-05-17

    This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.

  15. Simple classical model for Fano statistics in radiation detectors

    Energy Technology Data Exchange (ETDEWEB)

    Jordan, David V. [Pacific Northwest National Laboratory, National Security Division - Radiological and Chemical Sciences Group PO Box 999, Richland, WA 99352 (United States)], E-mail: David.Jordan@pnl.gov; Renholds, Andrea S.; Jaffe, John E.; Anderson, Kevin K.; Rene Corrales, L.; Peurrung, Anthony J. [Pacific Northwest National Laboratory, National Security Division - Radiological and Chemical Sciences Group PO Box 999, Richland, WA 99352 (United States)

    2008-02-01

    A simple classical model that captures the essential statistics of energy partitioning processes involved in the creation of information carriers (ICs) in radiation detectors is presented. The model pictures IC formation from a fixed amount of deposited energy in terms of the statistically analogous process of successively sampling water from a large, finite-volume container ('bathtub') with a small dipping implement ('shot or whiskey glass'). The model exhibits sub-Poisson variance in the distribution of the number of ICs generated (the 'Fano effect'). Elementary statistical analysis of the model clarifies the role of energy conservation in producing the Fano effect and yields Fano's prescription for computing the relative variance of the IC number distribution in terms of the mean and variance of the underlying, single-IC energy distribution. The partitioning model is applied to the development of the impact ionization cascade in semiconductor radiation detectors. It is shown that, in tandem with simple assumptions regarding the distribution of energies required to create an (electron, hole) pair, the model yields an energy-independent Fano factor of 0.083, in accord with the lower end of the range of literature values reported for silicon and high-purity germanium. The utility of this simple picture as a diagnostic tool for guiding or constraining more detailed, 'microscopic' physical models of detector material response to ionizing radiation is discussed.
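
    The sequential-sampling picture in this abstract can be illustrated with a small simulation. The sketch below is not the authors' model: the stopping rule (stop when the next draw would exceed the remaining energy), the uniform single-carrier energy distribution, and the names `count_carriers`, `fano_factor` and `uniform_cost` are all assumptions made for illustration. For such a sequential-draw (renewal) process, the ratio of variance to mean of the carrier count tends toward sigma^2/mu^2 of the single-draw distribution when the deposited energy is large, which is well below the Poisson value of 1 here.

```python
import random
import statistics


def count_carriers(total_energy, draw_energy, rng):
    """Count carriers created before the fixed deposited energy runs out."""
    remaining = total_energy
    count = 0
    while True:
        cost = draw_energy(rng)      # energy spent on the next carrier
        if cost > remaining:         # assumed stopping rule: not enough energy left
            return count
        remaining -= cost
        count += 1


def fano_factor(total_energy, draw_energy, trials=500, seed=0):
    """Estimate variance/mean of the carrier count over many repeated deposits."""
    rng = random.Random(seed)
    counts = [count_carriers(total_energy, draw_energy, rng) for _ in range(trials)]
    return statistics.pvariance(counts) / statistics.mean(counts)


def uniform_cost(rng):
    """Hypothetical single-carrier energy: uniform on [2, 4], arbitrary units."""
    return rng.uniform(2.0, 4.0)


if __name__ == "__main__":
    # A Poisson process would give a ratio near 1; energy conservation pushes it lower.
    print(fano_factor(300.0, uniform_cost))
```

    Because every draw is subtracted from the same fixed budget, a run of cheap carriers leaves room for more draws and a run of expensive ones leaves room for fewer, so the count fluctuates less than independent Poisson events would.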

  16. Development of 3D statistical mandible models for cephalometric measurements

    International Nuclear Information System (INIS)

    Kim, Sung Goo; Yi, Won Jin; Hwang, Soon Jung; Choi, Soon Chul; Lee, Sam Sun; Heo, Min Suk; Huh, Kyung Hoe; Kim, Tae Il; Hong, Helen; Yoo, Ji Hyun

    2012-01-01

    The aim of this study was to provide sex-matched three-dimensional (3D) statistical shape models of the mandible, which would provide cephalometric parameters for 3D treatment planning and cephalometric measurements in orthognathic surgery. The subjects used to create the 3D shape models of the mandible included 23 males and 23 females. The mandibles were segmented semi-automatically from 3D facial CT images. Each individual mandible shape was reconstructed as a 3D surface model, which was parameterized to establish correspondence between different individual surfaces. The principal component analysis (PCA) applied to all mandible shapes produced a mean model and characteristic models of variation. The cephalometric parameters were measured directly from the mean models to evaluate the 3D shape models. The means of the measured parameters were compared with those from other conventional studies. The male and female 3D statistical mean models were developed from 23 individual mandibles, respectively. The male and female characteristic shapes of variation produced by PCA showed a large variability included in the individual mandibles. The cephalometric measurements from the developed models were very close to those from some conventional studies. We described the construction of 3D mandibular shape models and presented the application of the 3D mandibular template in cephalometric measurements. Optimal reference models determined from variations produced by PCA could be used for craniofacial patients with various types of skeletal shape.

  17. Development of 3D statistical mandible models for cephalometric measurements

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung Goo; Yi, Won Jin; Hwang, Soon Jung; Choi, Soon Chul; Lee, Sam Sun; Heo, Min Suk; Huh, Kyung Hoe; Kim, Tae Il [School of Dentistry, Seoul National University, Seoul (Korea, Republic of); Hong, Helen; Yoo, Ji Hyun [Division of Multimedia Engineering, Seoul Women's University, Seoul (Korea, Republic of)

    2012-09-15

    The aim of this study was to provide sex-matched three-dimensional (3D) statistical shape models of the mandible, which would provide cephalometric parameters for 3D treatment planning and cephalometric measurements in orthognathic surgery. The subjects used to create the 3D shape models of the mandible included 23 males and 23 females. The mandibles were segmented semi-automatically from 3D facial CT images. Each individual mandible shape was reconstructed as a 3D surface model, which was parameterized to establish correspondence between different individual surfaces. The principal component analysis (PCA) applied to all mandible shapes produced a mean model and characteristic models of variation. The cephalometric parameters were measured directly from the mean models to evaluate the 3D shape models. The means of the measured parameters were compared with those from other conventional studies. The male and female 3D statistical mean models were developed from 23 individual mandibles, respectively. The male and female characteristic shapes of variation produced by PCA showed a large variability included in the individual mandibles. The cephalometric measurements from the developed models were very close to those from some conventional studies. We described the construction of 3D mandibular shape models and presented the application of the 3D mandibular template in cephalometric measurements. Optimal reference models determined from variations produced by PCA could be used for craniofacial patients with various types of skeletal shape.

  18. Statistical sampling and modelling for cork oak and eucalyptus stands

    NARCIS (Netherlands)

    Paulo, M.J.

    2002-01-01

    This thesis focuses on the use of modern statistical methods to solve problems on sampling, optimal cutting time and agricultural modelling in Portuguese cork oak and eucalyptus stands. The results are contained in five chapters that have been submitted for publication

  19. Two-dimensional models in statistical mechanics and field theory

    International Nuclear Information System (INIS)

    Koberle, R.

    1980-01-01

    Several features of two-dimensional models in statistical mechanics and field theory, such as lattice quantum chromodynamics, Z(N), Gross-Neveu and CP^(N-1), are discussed. The problems of confinement and dynamical mass generation are also analyzed. (L.C.) [pt

  20. Statistical Modeling of Energy Production by Photovoltaic Farms

    Czech Academy of Sciences Publication Activity Database

    Brabec, Marek; Pelikán, Emil; Krč, Pavel; Eben, Kryštof; Musílek, P.

    2011-01-01

    Vol. 5, No. 9 (2011), pp. 785-793 ISSN 1934-8975 Grant - others: GA AV ČR(CZ) M100300904 Institutional research plan: CEZ:AV0Z10300504 Keywords: electrical energy * solar energy * numerical weather prediction model * nonparametric regression * beta regression Subject RIV: BB - Applied Statistics, Operational Research

  1. Model selection for contingency tables with algebraic statistics

    NARCIS (Netherlands)

    Krampe, A.; Kuhnt, S.; Gibilisco, P.; Riccimagno, E.; Rogantin, M.P.; Wynn, H.P.

    2009-01-01

    Goodness-of-fit tests based on chi-square approximations are commonly used in the analysis of contingency tables. Results from algebraic statistics combined with MCMC methods provide alternatives to the chi-square approximation. However, within a model selection procedure usually a large number of

  2. Syntactic discriminative language model rerankers for statistical machine translation

    NARCIS (Netherlands)

    Carter, S.; Monz, C.

    2011-01-01

    This article describes a method that successfully exploits syntactic features for n-best translation candidate reranking using perceptrons. We motivate the utility of syntax by demonstrating the superior performance of parsers over n-gram language models in differentiating between Statistical

  3. Using statistical compatibility to derive advanced probabilistic fatigue models

    Czech Academy of Sciences Publication Activity Database

    Fernández-Canteli, A.; Castillo, E.; López-Aenlle, M.; Seitl, Stanislav

    2010-01-01

    Vol. 2, No. 1 (2010), pp. 1131-1140 E-ISSN 1877-7058. [Fatigue 2010. Praha, 06.06.2010-11.06.2010] Institutional research plan: CEZ:AV0Z20410507 Keywords: Fatigue models * Statistical compatibility * Functional equations Subject RIV: JL - Materials Fatigue, Friction Mechanics

  4. Statistical properties of the nuclear shell-model Hamiltonian

    International Nuclear Information System (INIS)

    Dias, H.; Hussein, M.S.; Oliveira, N.A. de

    1986-01-01

    The statistical properties of a realistic nuclear shell-model Hamiltonian are investigated in sd-shell nuclei. The probability distribution of the basis-vector amplitudes is calculated and compared with the Porter-Thomas distribution. Relevance of the results to the calculation of the giant resonance mixing parameter is pointed out. (Author) [pt

  5. Statistical shape model with random walks for inner ear segmentation

    DEFF Research Database (Denmark)

    Pujadas, Esmeralda Ruiz; Kjer, Hans Martin; Piella, Gemma

    2016-01-01

    is required. We propose a new framework for segmentation of micro-CT cochlear images using random walks combined with a statistical shape model (SSM). The SSM allows us to constrain the less contrasted areas and ensures valid inner ear shape outputs. Additionally, a topology preservation method is proposed...

  6. Hierarchical modelling for the environmental sciences statistical methods and applications

    CERN Document Server

    Clark, James S

    2006-01-01

    New statistical tools are changing the way in which scientists analyze and interpret data and models. Hierarchical Bayes and Markov Chain Monte Carlo methods for analysis provide a consistent framework for inference and prediction where information is heterogeneous and uncertain, processes are complicated, and responses depend on scale. Nowhere are these methods more promising than in the environmental sciences.

  7. A Statistical Model for the Estimation of Natural Gas Consumption

    Czech Academy of Sciences Publication Activity Database

    Vondráček, Jiří; Pelikán, Emil; Konár, Ondřej; Čermáková, Jana; Eben, Kryštof; Malý, Marek; Brabec, Marek

    2008-01-01

    Vol. 85, No. 5 (2008), pp. 362-370 ISSN 0306-2619 R&D Projects: GA AV ČR 1ET400300513 Institutional research plan: CEZ:AV0Z10300504 Keywords: nonlinear regression * gas consumption modeling Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 1.371, year: 2008

  8. Statistical learning modeling method for space debris photometric measurement

    Science.gov (United States)

    Sun, Wenjing; Sun, Jinqiu; Zhang, Yanning; Li, Haisen

    2016-03-01

    Photometric measurement is an important way to identify space debris, but present methods of photometric measurement place many constraints on the star image and require complex image processing. To address these problems, a statistical learning modeling method for space debris photometric measurement is proposed based on the global consistency of the star image, and the statistical information of star images is used to eliminate measurement noise. First, the known stars on the star image are divided into training stars and testing stars. Then, the training stars are used in a least-squares fit to construct the photometric measurement model, and the testing stars are used to calculate the measurement accuracy of the model. Experimental results show that the accuracy of the proposed photometric measurement model is about 0.1 magnitudes.
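
    The train/test procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' model: it assumes a straight-line relation between catalog magnitude and instrumental magnitude (-2.5 log10 of measured flux), and the function names `instrumental_mag`, `fit_line` and `rms_error` are hypothetical.

```python
import math


def instrumental_mag(flux):
    """Instrumental magnitude from a measured flux (standard -2.5*log10 definition)."""
    return -2.5 * math.log10(flux)


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rms_error(xs, ys, slope, intercept):
    """Root-mean-square difference between predicted and catalog magnitudes."""
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))
```

    In use, `fit_line` would be called on the training stars only, and `rms_error` on the held-out testing stars, so that the reported accuracy is not computed from the same stars that determined the fit.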

  9. Workshop on Model Uncertainty and its Statistical Implications

    CERN Document Server

    1988-01-01

    In this book problems related to the choice of models in such diverse fields as regression, covariance structure, time series analysis and multinomial experiments are discussed. The emphasis is on the statistical implications for model assessment when the assessment is done with the same data that generated the model. This is a problem of long standing, notorious for its difficulty. Some contributors discuss this problem in an illuminating way. Others, and this is a truly novel feature, investigate systematically whether sample re-use methods like the bootstrap can be used to assess the quality of estimators or predictors in a reliable way given the initial model uncertainty. The book should prove to be valuable for advanced practitioners and statistical methodologists alike.

  10. Statistical models describing the energy signature of buildings

    DEFF Research Database (Denmark)

    Bacher, Peder; Madsen, Henrik; Thavlov, Anders

    2010-01-01

    Approximately one third of the primary energy production in Denmark is used for heating in buildings. Therefore efforts to accurately describe and improve energy performance of the building mass are very important. For this purpose statistical models describing the energy signature of a building, i...... or varying energy prices. The paper will give an overview of statistical methods and applied models based on experiments carried out in FlexHouse, which is an experimental building in SYSLAB, Risø DTU. The models are of different complexity and can provide estimates of physical quantities such as UA......-values, time constants of the building, and other parameters related to the heat dynamics. A method for selecting the most appropriate model for a given building is outlined and finally a perspective of the applications is given. Acknowledgements to the Danish Energy Saving Trust and the Interreg IV ``Vind i...

  11. Improved air ventilation rate estimation based on a statistical model

    International Nuclear Information System (INIS)

    Brabec, M.; Jilek, K.

    2004-01-01

    A new approach to air ventilation rate estimation from CO measurement data is presented. The approach is based on a state-space dynamic statistical model, allowing for quick and efficient estimation. Underlying computations are based on Kalman filtering, whose practical software implementation is rather easy. The key property is the flexibility of the model, allowing various artificial regimens of CO level manipulation to be treated. The model is semi-parametric in nature and can efficiently handle time-varying ventilation rate. This is a major advantage, compared to some of the methods which are currently in practical use. After a formal introduction of the statistical model, its performance is demonstrated on real data from routine measurements. It is shown how the approach can be utilized in a more complex situation of major practical relevance, when time-varying air ventilation rate and radon entry rate are to be estimated simultaneously from concurrent radon and CO measurements
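
    To give a sense of why Kalman filtering is described as easy to implement, here is a minimal scalar filter for a slowly varying quantity observed with noise. It is a generic random-walk sketch, not the state-space model of the paper: the state equation, the noise variances `q` and `r`, and the function name `kalman_random_walk` are assumptions for illustration only.

```python
def kalman_random_walk(observations, q, r, x0=0.0, p0=1.0):
    """Filter noisy observations of a state that follows a random walk.

    Assumed model: x[t] = x[t-1] + w, w ~ N(0, q);  y[t] = x[t] + v, v ~ N(0, r).
    Returns the filtered state estimate after each observation.
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        p = p + q                # predict: uncertainty grows by the process noise
        gain = p / (p + r)       # Kalman gain: weight given to the new observation
        x = x + gain * (y - x)   # update the estimate toward the observation
        p = (1.0 - gain) * p     # updated (reduced) uncertainty
        estimates.append(x)
    return estimates
```

    A larger `q` relative to `r` lets the estimate follow rapid changes at the cost of more noise, which is the trade-off any time-varying rate estimate of this kind has to make.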

  12. Applications of spatial statistical network models to stream data

    Science.gov (United States)

    Daniel J. Isaak; Erin E. Peterson; Jay M. Ver Hoef; Seth J. Wenger; Jeffrey A. Falke; Christian E. Torgersen; Colin Sowder; E. Ashley Steel; Marie-Josee Fortin; Chris E. Jordan; Aaron S. Ruesch; Nicholas Som; Pascal. Monestiez

    2014-01-01

    Streams and rivers host a significant portion of Earth's biodiversity and provide important ecosystem services for human populations. Accurate information regarding the status and trends of stream resources is vital for their effective conservation and management. Most statistical techniques applied to data measured on stream networks were developed for...

  13. Bayesian Nonparametric Statistical Inference for Shock Models and Wear Processes.

    Science.gov (United States)

    1979-12-01

    also note that the results in Section 2 do not depend on the support of F.) This shock model has been studied by Esary, Marshall and Proschan (1973...Barlow and Proschan (1975), among others. The analogy of the shock model in risk and actuarial analysis has been given by Bühlmann (1970, Chapter 2... Mathematical Statistics, Vol. 4, pp. 894-906. Billingsley, P. (1968), CONVERGENCE OF PROBABILITY MEASURES, John Wiley, New York. Bühlmann, H. (1970

  14. Statistical and RBF NN models : providing forecasts and risk assessment

    OpenAIRE

    Marček, Milan

    2009-01-01

    Forecast accuracy of economic and financial processes is a popular measure for quantifying the risk in decision making. In this paper, we develop forecasting models based on statistical (stochastic) methods, sometimes called hard computing, and on a soft method using granular computing. We consider the accuracy of forecasting models as a measure for risk evaluation. It is found that the risk estimation process based on soft methods is simplified and less critical to the question w...

  15. A Statistical Model for Synthesis of Detailed Facial Geometry

    OpenAIRE

    Golovinskiy, Aleksey; Matusik, Wojciech; Pfister, Hanspeter; Rusinkiewicz, Szymon; Funkhouser, Thomas

    2006-01-01

    Detailed surface geometry contributes greatly to the visual realism of 3D face models. However, acquiring high-resolution face geometry is often tedious and expensive. Consequently, most face models used in games, virtual reality, or computer vision look unrealistically smooth. In this paper, we introduce a new statistical technique for the analysis and synthesis of small three-dimensional facial features, such as wrinkles and pores. We acquire high-resolution face geometry for people across ...

  16. Statistical approach for selection of regression model during validation of bioanalytical method

    Directory of Open Access Journals (Sweden)

    Natalija Nakov

    2014-06-01

    The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during bioanalytical method validation. Given the wide concentration range frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using nonparametric statistical tests: the one-sample rank test and the Wilcoxon signed rank test for two independent groups of samples. The results obtained with the one-sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because only slight differences between the errors (presented through the relative residuals, RR) were obtained. Estimation of the significance of the differences in the RR was achieved using the Wilcoxon signed rank test, where linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.
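
    The weighted fits and relative residuals mentioned above can be sketched as follows. The abstract does not state which weights were used or how the relative residuals were defined, so both are assumptions here: weights such as 1/x^2 are supplied by the caller, and the relative residual is taken as the percent deviation of the back-calculated concentration from the nominal one. The names `weighted_linear_fit` and `relative_residuals` are hypothetical.

```python
def weighted_linear_fit(xs, ys, weights):
    """Weighted least-squares fit of ys = slope * xs + intercept."""
    total_w = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total_w
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total_w
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def relative_residuals(xs, ys, slope, intercept):
    """Percent deviation of back-calculated concentration from nominal (assumed definition)."""
    return [100.0 * ((y - intercept) / slope - x) / x for x, y in zip(xs, ys)]
```

    With heteroscedastic calibration data, weights that decrease with concentration keep the highest standards from dominating the fit, which is why the relative rather than absolute residuals are the natural quantity to compare between candidate models.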

  17. WE-A-201-02: Modern Statistical Modeling

    Energy Technology Data Exchange (ETDEWEB)

    Niemierko, A.

    2016-06-15

    Chris Marshall: Memorial Introduction Donald Edmonds Herbert Jr., or Don to his colleagues and friends, exemplified the “big tent” vision of medical physics, specializing in Applied Statistics and Dynamical Systems theory. He saw, more clearly than most, that “Making models is the difference between doing science and just fooling around [ref Woodworth, 2004]”. Don developed an interest in chemistry at school by “reading a book” - a recurring theme in his story. He was awarded a Westinghouse Science scholarship and attended the Carnegie Institute of Technology (later Carnegie Mellon University) where his interest turned to physics and led to a BS in Physics after transfer to Northwestern University. After (voluntary) service in the Navy he earned his MS in Physics from the University of Oklahoma, which led him to Johns Hopkins University in Baltimore to pursue a PhD. The early death of his wife led him to take a salaried position in the Physics Department of Colorado College in Colorado Springs so as to better care for their young daughter. There, a chance invitation from Dr. Juan del Regato to teach physics to residents at the Penrose Cancer Hospital introduced him to Medical Physics, and he decided to enter the field. He received his PhD from the University of London (UK) under Prof. Joseph Rotblat, where I first met him, and where he taught himself statistics. He returned to Penrose as a clinical medical physicist, also largely self-taught. In 1975 he formalized an evolving interest in statistical analysis as Professor of Radiology and Head of the Division of Physics and Statistics at the College of Medicine of the University of South Alabama in Mobile, AL where he remained for the rest of his career. He also served as the first Director of their Bio-Statistics and Epidemiology Core Unit working in part on a sickle-cell disease. After retirement he remained active as Professor Emeritus. Don served for several years as a consultant to the Nuclear

  18. WE-A-201-02: Modern Statistical Modeling

    International Nuclear Information System (INIS)

    Niemierko, A.

    2016-01-01

    Chris Marshall: Memorial Introduction Donald Edmonds Herbert Jr., or Don to his colleagues and friends, exemplified the “big tent” vision of medical physics, specializing in Applied Statistics and Dynamical Systems theory. He saw, more clearly than most, that “Making models is the difference between doing science and just fooling around [ref Woodworth, 2004]”. Don developed an interest in chemistry at school by “reading a book” - a recurring theme in his story. He was awarded a Westinghouse Science scholarship and attended the Carnegie Institute of Technology (later Carnegie Mellon University) where his interest turned to physics and led to a BS in Physics after transfer to Northwestern University. After (voluntary) service in the Navy he earned his MS in Physics from the University of Oklahoma, which led him to Johns Hopkins University in Baltimore to pursue a PhD. The early death of his wife led him to take a salaried position in the Physics Department of Colorado College in Colorado Springs so as to better care for their young daughter. There, a chance invitation from Dr. Juan del Regato to teach physics to residents at the Penrose Cancer Hospital introduced him to Medical Physics, and he decided to enter the field. He received his PhD from the University of London (UK) under Prof. Joseph Rotblat, where I first met him, and where he taught himself statistics. He returned to Penrose as a clinical medical physicist, also largely self-taught. In 1975 he formalized an evolving interest in statistical analysis as Professor of Radiology and Head of the Division of Physics and Statistics at the College of Medicine of the University of South Alabama in Mobile, AL where he remained for the rest of his career. He also served as the first Director of their Bio-Statistics and Epidemiology Core Unit working in part on a sickle-cell disease. After retirement he remained active as Professor Emeritus. Don served for several years as a consultant to the Nuclear

  19. An exercise in model validation: Comparing univariate statistics and Monte Carlo-based multivariate statistics

    International Nuclear Information System (INIS)

    Weathers, J.B.; Luck, R.; Weathers, J.W.

    2009-01-01

    The complexity of mathematical models used by practicing engineers is increasing due to the growing availability of sophisticated mathematical modeling tools and ever-improving computational power. For this reason, the need to define a well-structured process for validating these models against experimental results has become a pressing issue in the engineering community. This validation process is partially characterized by the uncertainties associated with the modeling effort as well as the experimental results. The net impact of the uncertainties on the validation effort is assessed through the 'noise level of the validation procedure', which can be defined as an estimate of the 95% confidence uncertainty bounds for the comparison error between actual experimental results and model-based predictions of the same quantities of interest. Although general descriptions associated with the construction of the noise level using multivariate statistics exist in the literature, a detailed procedure outlining how to account for the systematic and random uncertainties is not available. In this paper, the methodology used to derive the covariance matrix associated with the multivariate normal pdf based on random and systematic uncertainties is examined, and a procedure used to estimate this covariance matrix using Monte Carlo analysis is presented. The covariance matrices are then used to construct approximate 95% confidence constant probability contours associated with comparison error results for a practical example. In addition, the example is used to show the drawbacks of using a first-order sensitivity analysis when nonlinear local sensitivity coefficients exist. Finally, the example is used to show the connection between the noise level of the validation exercise calculated using multivariate and univariate statistics.
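
    The Monte Carlo estimate of a covariance matrix from systematic and random uncertainties can be illustrated with a toy case. This is a sketch under assumptions, not the authors' procedure: it assumes one systematic error source shared by all quantities plus independent Gaussian random errors, and the names `simulate_errors` and `covariance_matrix` are hypothetical.

```python
import random


def simulate_errors(systematic_sd, random_sds, n_samples, seed=0):
    """Draw error vectors: one shared systematic error plus independent random errors."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        shared = rng.gauss(0.0, systematic_sd)   # same draw affects every quantity
        samples.append([shared + rng.gauss(0.0, sd) for sd in random_sds])
    return samples


def covariance_matrix(samples):
    """Sample covariance matrix (n - 1 denominator) of a list of equal-length vectors."""
    n = len(samples)
    dim = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(dim)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in samples) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]
```

    In this toy setup the shared systematic error appears as the off-diagonal covariance, while each diagonal entry is the systematic variance plus that quantity's own random variance; ignoring the off-diagonal terms is what separates a univariate treatment from a multivariate one.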

  20. An exercise in model validation: Comparing univariate statistics and Monte Carlo-based multivariate statistics

    Energy Technology Data Exchange (ETDEWEB)

    Weathers, J.B. [Shock, Noise, and Vibration Group, Northrop Grumman Shipbuilding, P.O. Box 149, Pascagoula, MS 39568 (United States)], E-mail: James.Weathers@ngc.com; Luck, R. [Department of Mechanical Engineering, Mississippi State University, 210 Carpenter Engineering Building, P.O. Box ME, Mississippi State, MS 39762-5925 (United States)], E-mail: Luck@me.msstate.edu; Weathers, J.W. [Structural Analysis Group, Northrop Grumman Shipbuilding, P.O. Box 149, Pascagoula, MS 39568 (United States)], E-mail: Jeffrey.Weathers@ngc.com

    2009-11-15

    The complexity of mathematical models used by practicing engineers is increasing due to the growing availability of sophisticated mathematical modeling tools and ever-improving computational power. For this reason, the need to define a well-structured process for validating these models against experimental results has become a pressing issue in the engineering community. This validation process is partially characterized by the uncertainties associated with the modeling effort as well as the experimental results. The net impact of the uncertainties on the validation effort is assessed through the 'noise level of the validation procedure', which can be defined as an estimate of the 95% confidence uncertainty bounds for the comparison error between actual experimental results and model-based predictions of the same quantities of interest. Although general descriptions associated with the construction of the noise level using multivariate statistics exist in the literature, a detailed procedure outlining how to account for the systematic and random uncertainties is not available. In this paper, the methodology used to derive the covariance matrix associated with the multivariate normal pdf based on random and systematic uncertainties is examined, and a procedure used to estimate this covariance matrix using Monte Carlo analysis is presented. The covariance matrices are then used to construct approximate 95% confidence constant probability contours associated with comparison error results for a practical example. In addition, the example is used to show the drawbacks of using a first-order sensitivity analysis when nonlinear local sensitivity coefficients exist. Finally, the example is used to show the connection between the noise level of the validation exercise calculated using multivariate and univariate statistics.
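
    The Monte Carlo covariance estimate and the 95% constant-probability contour described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes only two comparison errors, one systematic error shared by both, independent Gaussian random errors, and hypothetical standard deviations. The function names are invented for illustration.

```python
import math
import random

# For a bivariate normal, the squared Mahalanobis distance follows a chi-square
# distribution with 2 degrees of freedom, whose CDF is 1 - exp(-x / 2).
CHI2_95_2DOF = -2.0 * math.log(0.05)  # about 5.99


def simulate_comparison_errors(n, sigma_sys, sigma_r1, sigma_r2, seed=0):
    """Draw n pairs of comparison errors that share one systematic error."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        shared = rng.gauss(0.0, sigma_sys)      # systematic: common to both errors
        e1 = shared + rng.gauss(0.0, sigma_r1)  # random: independent per quantity
        e2 = shared + rng.gauss(0.0, sigma_r2)
        samples.append((e1, e2))
    return samples


def mean_and_covariance(samples):
    """Sample mean and unbiased 2x2 sample covariance of the Monte Carlo draws."""
    n = len(samples)
    m1 = sum(s[0] for s in samples) / n
    m2 = sum(s[1] for s in samples) / n
    c11 = sum((s[0] - m1) ** 2 for s in samples) / (n - 1)
    c22 = sum((s[1] - m2) ** 2 for s in samples) / (n - 1)
    c12 = sum((s[0] - m1) * (s[1] - m2) for s in samples) / (n - 1)
    return (m1, m2), ((c11, c12), (c12, c22))


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det


def inside_95_contour(point, mean, cov):
    """True if the point lies inside the approximate 95% probability ellipse."""
    return mahalanobis_sq(point, mean, cov) <= CHI2_95_2DOF
```

    With the hypothetical values sigma_sys = 0.5, sigma_r1 = 0.3 and sigma_r2 = 0.4, the estimated covariance matrix should approach [[0.34, 0.25], [0.25, 0.41]]: the shared systematic variance appears in every entry, while the random variances add only to the diagonal.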

  1. Computer modelling of statistical properties of SASE FEL radiation

    International Nuclear Information System (INIS)

    Saldin, E. L.; Schneidmiller, E. A.; Yurkov, M. V.

    1997-01-01

    The paper describes an approach to computer modelling of statistical properties of the radiation from a self-amplified spontaneous emission free electron laser (SASE FEL). The present approach allows one to calculate the following statistical properties of the SASE FEL radiation: time and spectral field correlation functions, distribution of the fluctuations of the instantaneous radiation power, distribution of the energy in the electron bunch, distribution of the radiation energy after a monochromator installed at the FEL amplifier exit, and the radiation spectrum. All numerical results presented in the paper have been calculated for the 70 nm SASE FEL at the TESLA Test Facility under construction at DESY.

  2. Stochastic geometry, spatial statistics and random fields models and algorithms

    CERN Document Server

    2015-01-01

    Providing a graduate level introduction to various aspects of stochastic geometry, spatial statistics and random fields, this volume places a special emphasis on fundamental classes of models and algorithms as well as on their applications, for example in materials science, biology and genetics. This book has a strong focus on simulations and includes extensive codes in Matlab and R, which are widely used in the mathematical community. It can be regarded as a continuation of the recent volume 2068 of Lecture Notes in Mathematics, where other issues of stochastic geometry, spatial statistics and random fields were considered, with a focus on asymptotic methods.

  3. Testing earthquake prediction algorithms: Statistically significant advance prediction of the largest earthquakes in the Circum-Pacific, 1992-1997

    Science.gov (United States)

    Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.

    1999-01-01

    Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8 and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, correspondingly, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5+, 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events has doubled and all of them become exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. © 1999 Elsevier
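
    The significance levels quoted in this abstract can be approximately reproduced with a simple binomial argument. The sketch below is one plausible reading, not necessarily the authors' exact test: it assumes that under random guessing each target earthquake falls inside an alarm independently, with probability equal to the alarm's share of the space-time volume. The function names are invented for illustration.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def significance(n_events, n_hits, alarm_fraction):
    """One minus the chance of at least n_hits successes by random guessing."""
    return 1.0 - binomial_tail(n_events, n_hits, alarm_fraction)


# (number of target earthquakes, number predicted, alarm volume fraction),
# all taken from the abstract.
cases = {
    "M8, M8.0+": (5, 5, 0.36),
    "M8-MSc, M8.0+": (5, 4, 0.18),
    "M8, M7.5+": (19, 10, 0.40),
    "M8-MSc, M7.5+": (19, 5, 0.13),
}

for label, (n, k, p) in cases.items():
    print(f"{label}: {100.0 * significance(n, k, p):.1f}%")
```

    Under this assumption the two magnitude 8 cases come out above 99%, the M8 case for magnitude 7.5+ at about 81%, and the M8-MSc case at about 91%, close to the reported 92%.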

  4. GIA Model Statistics for GRACE Hydrology, Cryosphere, and Ocean Science

    Science.gov (United States)

    Caron, L.; Ivins, E. R.; Larour, E.; Adhikari, S.; Nilsson, J.; Blewitt, G.

    2018-03-01

    We provide a new analysis of glacial isostatic adjustment (GIA) with the goal of assembling the model uncertainty statistics required for rigorously extracting trends in surface mass from the Gravity Recovery and Climate Experiment (GRACE) mission. Such statistics are essential for deciphering sea level, ocean mass, and hydrological changes because the latter signals can be relatively small (≤2 mm/yr water height equivalent) over very large regions, such as major ocean basins and watersheds. With abundant new >7 year continuous measurements of vertical land motion (VLM) reported by Global Positioning System stations on bedrock and new relative sea level records, our new statistical evaluation of GIA uncertainties incorporates Bayesian methodologies. A unique aspect of the method is that both the ice history and 1-D Earth structure vary through a total of 128,000 forward models. We find that best fit models poorly capture the statistical inferences needed to correctly invert for lower mantle viscosity and that GIA uncertainty exceeds the uncertainty ascribed to trends from 14 years of GRACE data in polar regions.

  5. A Model Fit Statistic for Generalized Partial Credit Model

    Science.gov (United States)

    Liang, Tie; Wells, Craig S.

    2009-01-01

    Investigating the fit of a parametric model is an important part of the measurement process when implementing item response theory (IRT), but research examining it is limited. A general nonparametric approach for detecting model misfit, introduced by J. Douglas and A. S. Cohen (2001), has exhibited promising results for the two-parameter logistic…

  6. Experimental, statistical, and biological models of radon carcinogenesis

    International Nuclear Information System (INIS)

    Cross, F.T.

    1991-09-01

    Risk models developed for underground miners have not been consistently validated in studies of populations exposed to indoor radon. Imprecision in risk estimates results principally from differences between exposures in mines as compared to domestic environments and from uncertainties about the interaction between cigarette-smoking and exposure to radon decay products. Uncertainties in extrapolating miner data to domestic exposures can be reduced by means of a broad-based health effects research program that addresses the interrelated issues of exposure, respiratory tract dose, carcinogenesis (molecular/cellular and animal studies, plus developing biological and statistical models), and the relationship of radon to smoking and other copollutant exposures. This article reviews experimental animal data on radon carcinogenesis observed primarily in rats at Pacific Northwest Laboratory. Recent experimental and mechanistic carcinogenesis models of exposures to radon, uranium ore dust, and cigarette smoke are presented with statistical analyses of animal data. 20 refs., 1 fig

  7. Multimesonic decays of charmonium states in the statistical quark model

    International Nuclear Information System (INIS)

    Montvay, I.; Toth, J.D.

    1978-01-01

    The data known at present of multimesonic decays of chi and psi states are fitted in a statistical quark model, in which the matrix elements are assumed to be constant and resonances as well as both strong and second order electromagnetic processes are taken into account. The experimental data are well reproduced by the model. Unknown branching ratios for the rest of multimesonic channels are predicted. The fit leaves about 40% for baryonic and radiative channels in the case of J/psi(3095). The fitted parameters of the J/psi decays are used to predict the mesonic decays of the pseudoscalar eta c. The statistical quark model seems to allow the calculation of competitive multiparticle processes for the studied decays. (D.P.)

  8. Statistical 3D damage accumulation model for ion implant simulators

    CERN Document Server

    Hernandez-Mangas, J M; Enriquez, L E; Bailon, L; Barbolla, J; Jaraiz, M

    2003-01-01

    A statistical 3D damage accumulation model, based on the modified Kinchin-Pease formula, for ion implant simulation has been included in our physically based ion implantation code. It has only one fitting parameter for electronic stopping and uses 3D electron density distributions for different types of targets including compound semiconductors. Also, a statistical noise reduction mechanism based on the dose division is used. The model has been adapted to be run under parallel execution in order to speed up the calculation in 3D structures. Sequential ion implantation has been modelled including previous damage profiles. It can also simulate the implantation of molecular and cluster projectiles. Comparisons of simulated doping profiles with experimental SIMS profiles are presented. Also comparisons between simulated amorphization and experimental RBS profiles are shown. An analysis of sequential versus parallel processing is provided.

  9. Statistical 3D damage accumulation model for ion implant simulators

    International Nuclear Information System (INIS)

    Hernandez-Mangas, J.M.; Lazaro, J.; Enriquez, L.; Bailon, L.; Barbolla, J.; Jaraiz, M.

    2003-01-01

    A statistical 3D damage accumulation model, based on the modified Kinchin-Pease formula, for ion implant simulation has been included in our physically based ion implantation code. It has only one fitting parameter for electronic stopping and uses 3D electron density distributions for different types of targets including compound semiconductors. Also, a statistical noise reduction mechanism based on the dose division is used. The model has been adapted to be run under parallel execution in order to speed up the calculation in 3D structures. Sequential ion implantation has been modelled including previous damage profiles. It can also simulate the implantation of molecular and cluster projectiles. Comparisons of simulated doping profiles with experimental SIMS profiles are presented. Also comparisons between simulated amorphization and experimental RBS profiles are shown. An analysis of sequential versus parallel processing is provided

  10. SoS contract verification using statistical model checking

    Directory of Open Access Journals (Sweden)

    Alessandro Mignogna

    2013-11-01

    Exhaustive formal verification for systems of systems (SoS) is impractical and cannot be applied on a large scale. In this paper we propose to use statistical model checking for efficient verification of SoS. We address three relevant aspects for systems of systems: (1) the model of the SoS, which includes stochastic aspects; (2) the formalization of the SoS requirements in the form of contracts; (3) the tool-chain to support statistical model checking for SoS. We adapt the SMC technique for application to heterogeneous SoS. We extend the UPDM/SysML specification language to express the SoS requirements that the implemented strategies over the SoS must satisfy. The requirements are specified with a new contract language specifically designed for SoS, targeting a high-level English-pattern language, but relying on an accurate semantics given by the standard temporal logics. The contracts are verified against the UPDM/SysML specification using the Statistical Model Checker (SMC) PLASMA combined with the simulation engine DESYRE, which integrates heterogeneous behavioral models through the functional mock-up interface (FMI) standard. The tool-chain allows computing an estimation of the satisfiability of the contracts by the SoS. The results help the system architect to trade off different solutions to guide the evolution of the SoS.
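
    The core idea of statistical model checking, estimating the probability that a stochastic model satisfies a property from a fixed number of simulated runs, can be sketched as follows. This does not use PLASMA, DESYRE or any contract language. The lossy-link model, the property "delivered within three attempts" and all function names are hypothetical, and the run count comes from the standard Chernoff-Hoeffding bound.

```python
import math
import random


def required_runs(epsilon, delta):
    """Chernoff-Hoeffding sample size so that P(|estimate - p| > epsilon) <= delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon**2))


def delivered_within(max_attempts, p_success, rng):
    """One simulated trace of a hypothetical lossy link; True if the property holds."""
    return any(rng.random() < p_success for _ in range(max_attempts))


def estimate_satisfaction(epsilon, delta, seed=0):
    """Monte Carlo estimate of the probability that a run satisfies the property."""
    rng = random.Random(seed)
    n = required_runs(epsilon, delta)
    hits = sum(1 for _ in range(n) if delivered_within(3, 0.5, rng))
    return hits / n, n


estimate, runs = estimate_satisfaction(epsilon=0.02, delta=0.01)
print(f"{runs} runs, estimated satisfaction probability {estimate:.3f}")
```

    For this toy model the exact answer is 1 - 0.5^3 = 0.875, so the estimate can be checked directly. In a real SoS the exact value is unknown, which is why the bound on the number of runs matters.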

  11. Structural reliability in context of statistical uncertainties and modelling discrepancies

    International Nuclear Information System (INIS)

    Pendola, Maurice

    2000-01-01

    Structural reliability methods have been largely improved during the last years and have shown their ability to deal with uncertainties during the design stage or to optimize the functioning and the maintenance of industrial installations. They are based on a mechanical modeling of the structural behavior according to the considered failure modes and on a probabilistic representation of input parameters of this modeling. In practice, only limited statistical information is available to build the probabilistic representation and different sophistication levels of the mechanical modeling may be introduced. Thus, besides the physical randomness, other uncertainties occur in such analyses. The aim of this work is threefold: 1. First, to propose a methodology able to characterize the statistical uncertainties due to the limited number of data in order to take them into account in the reliability analyses. The obtained reliability index measures the confidence in the structure considering the statistical information available. 2. Then, to show a methodology leading to reliability results evaluated from a particular mechanical modeling but by using a less sophisticated one. The objective is then to decrease the computational efforts required by the reference modeling. 3. Finally, to propose partial safety factors that are evolving as a function of the number of statistical data available and as a function of the sophistication level of the mechanical modeling that is used. The concepts are illustrated in the case of a welded pipe and in the case of a natural draught cooling tower. The results show the interest of the methodologies in an industrial context. [fr

  12. A Census of Statistics Requirements at U.S. Journalism Programs and a Model for a "Statistics for Journalism" Course

    Science.gov (United States)

    Martin, Justin D.

    2017-01-01

    This essay presents data from a census of statistics requirements and offerings at all 4-year journalism programs in the United States (N = 369) and proposes a model of a potential course in statistics for journalism majors. The author proposes that three philosophies underlie a statistics course for journalism students. Such a course should (a)…

  13. A statistical model for radar images of agricultural scenes

    Science.gov (United States)

    Frost, V. S.; Shanmugan, K. S.; Holtzman, J. C.; Stiles, J. A.

    1982-01-01

    The presently derived and validated statistical model for radar images containing many different homogeneous fields predicts the probability density functions of radar images of entire agricultural scenes, thereby allowing histograms of large scenes composed of a variety of crops to be described. Seasat-A SAR images of agricultural scenes are accurately predicted by the model on the basis of three assumptions: each field has the same SNR, all target classes cover approximately the same area, and the true reflectivity characterizing each individual target class is a uniformly distributed random variable. The model is expected to be useful in the design of data processing algorithms and for scene analysis using radar images.

  14. A keyword spotting model using perceptually significant energy features

    Science.gov (United States)

    Umakanthan, Padmalochini

    The task of a keyword recognition system is to detect the presence of certain words in a conversation based on the linguistic information present in human speech. Such keyword spotting systems have applications in homeland security, telephone surveillance and human-computer interfacing. The general procedure of a keyword spotting system involves feature generation and matching. In this work, a new set of features based on the psycho-acoustic masking nature of human speech is proposed. After developing these features, a time-aligned pattern matching process was implemented to locate the words in a set of unknown words. A word boundary detection technique based on frame classification using the nonlinear characteristics of speech is also addressed in this work. Validation of this keyword spotting model was done using widely acclaimed Cepstral features. The experimental results indicate the viability of using these perceptually significant features as an augmented feature set in keyword spotting.

  15. Olive mill wastewater characteristics: modelling and statistical analysis

    Directory of Open Access Journals (Sweden)

    Martins-Dias, Susete

    2004-09-01

    A synthesis of the work carried out on Olive Mill Wastewater (OMW) characterisation is given, covering articles published over the last 50 years. Data on OMW characterisation found in the literature are summarised and correlations between them and with phenolic compounds content are sought. This permits the characteristics of an OMW to be estimated from one simple measurement: the phenolic compounds concentration. A model based on OMW characterisations from 6 countries was developed along with a model for Portuguese OMW. The statistical analysis of the correlations obtained indicates that the Chemical Oxygen Demand of a given OMW is a second-degree polynomial function of its phenolic compounds concentration. Tests to evaluate the significance of the regressions were carried out, based on multivariable ANOVA analysis and on the visual distribution of standardised residuals and their means for confidence levels of 95 and 99 %, clearly validating these models. This modelling work will help in the future planning, operation and monitoring of an OMW treatment plant.

  16. Discrete ellipsoidal statistical BGK model and Burnett equations

    Science.gov (United States)

    Zhang, Yu-Dong; Xu, Ai-Guo; Zhang, Guang-Cai; Chen, Zhi-Hua; Wang, Pei

    2018-06-01

    A new discrete Boltzmann model, the discrete ellipsoidal statistical Bhatnagar-Gross-Krook (ESBGK) model, is proposed to simulate nonequilibrium compressible flows. Compared with the original discrete BGK model, the discrete ES-BGK has a flexible Prandtl number. For the discrete ES-BGK model in the Burnett level, two kinds of discrete velocity model are introduced and the relations between nonequilibrium quantities and the viscous stress and heat flux in the Burnett level are established. The model is verified via four benchmark tests. In addition, a new idea is introduced to recover the actual distribution function through the macroscopic quantities and their space derivatives. The recovery scheme works not only for discrete Boltzmann simulation but also for hydrodynamic ones, for example, those based on the Navier-Stokes or the Burnett equations.

  17. Statistics of a neuron model driven by asymmetric colored noise.

    Science.gov (United States)

    Müller-Hansen, Finn; Droste, Felix; Lindner, Benjamin

    2015-02-01

    Irregular firing of neurons can be modeled as a stochastic process. Here we study the perfect integrate-and-fire neuron driven by dichotomous noise, a Markovian process that jumps between two states (i.e., possesses a non-Gaussian statistics) and exhibits nonvanishing temporal correlations (i.e., represents a colored noise). Specifically, we consider asymmetric dichotomous noise with two different transition rates. Using a first-passage-time formulation, we derive exact expressions for the probability density and the serial correlation coefficient of the interspike interval (time interval between two subsequent neural action potentials) and the power spectrum of the spike train. Furthermore, we extend the model by including additional Gaussian white noise, and we give approximations for the interspike interval (ISI) statistics in this case. Numerical simulations are used to validate the exact analytical results for pure dichotomous noise, and to test the approximations of the ISI statistics when Gaussian white noise is included. The results may help to understand how correlations and asymmetry of noise and signals in nerve cells shape neuronal firing statistics.
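
    The model in this abstract can be explored numerically with a minimal sketch of a perfect integrate-and-fire neuron driven by asymmetric dichotomous noise. This is not the authors' code: the Euler scheme, the parameter values and the function names are assumptions made for illustration. The only exact result it relies on is elementary: when the drift mu + eta(t) is always positive, the mean firing rate equals the mean drift divided by the threshold distance.

```python
import random


def simulate_pif(mu, sigma, k_plus, k_minus, threshold=1.0,
                 t_end=1000.0, dt=0.01, seed=0):
    """Euler simulation of dv/dt = mu + eta(t); v is lowered by `threshold` at a spike.

    eta(t) jumps between +sigma and -sigma. k_plus is the rate of leaving the
    +sigma state and k_minus is the rate of leaving the -sigma state.
    """
    rng = random.Random(seed)
    eta = sigma
    v = 0.0
    spike_times = []
    for i in range(int(t_end / dt)):
        rate = k_plus if eta > 0 else k_minus
        if rng.random() < rate * dt:   # switch noise state with probability rate * dt
            eta = -eta
        v += (mu + eta) * dt
        if v >= threshold:
            v -= threshold             # reset, keeping the small overshoot
            spike_times.append((i + 1) * dt)
    return spike_times


def interspike_intervals(spike_times):
    """Differences between consecutive spike times."""
    return [b - a for a, b in zip(spike_times, spike_times[1:])]


def mean_rate_theory(mu, sigma, k_plus, k_minus, threshold=1.0):
    """Mean firing rate for strictly positive drift (mu > sigma)."""
    mean_eta = sigma * (k_minus - k_plus) / (k_plus + k_minus)
    return (mu + mean_eta) / threshold
```

    With the hypothetical values mu = 1, sigma = 0.5, k_plus = 1 and k_minus = 2, the noise spends two thirds of the time in the +sigma state, so the simulated rate should be close to 7/6 spikes per unit time, and every interspike interval should lie between roughly 2/3 and 2.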

  18. Solar radiation data - statistical analysis and simulation models

    Energy Technology Data Exchange (ETDEWEB)

    Mustacchi, C; Cena, V; Rocchi, M; Haghigat, F

    1984-01-01

    The activities consisted in collecting meteorological data on magnetic tape for ten European locations (with latitudes ranging from 42° to 56° N), analysing the multi-year sequences, developing mathematical models to generate synthetic sequences having the same statistical properties as the original data sets, and producing one or more Short Reference Years (SRYs) for each location. The meteorological parameters examined were (for all the locations) global + diffuse radiation on horizontal surface, dry bulb temperature, sunshine duration. For some of the locations additional parameters were available, namely, global, beam and diffuse radiation on surfaces other than horizontal, wet bulb temperature, wind velocity, cloud type, cloud cover. The statistical properties investigated were mean, variance, autocorrelation, crosscorrelation with selected parameters, probability density function. For all the meteorological parameters, various mathematical models were built: linear regression, stochastic models of the AR and the DAR type. In each case, the model with the best statistical behaviour was selected for the production of a SRY for the relevant parameter/location.
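
    Generating a synthetic sequence that shares the mean, variance and autocorrelation of an observed one can be illustrated with a first-order autoregressive, AR(1), process. The sketch below is not the models of this report: it covers only a Gaussian AR(1), leaves the DAR type aside, and the target statistics used in the usage note are hypothetical.

```python
import math
import random


def generate_ar1(n, mean, std, phi, seed=0):
    """Synthetic AR(1) series with the given mean, std and lag-1 autocorrelation phi."""
    rng = random.Random(seed)
    # Scaling the innovations this way keeps the stationary variance equal to std**2.
    innovation_std = std * math.sqrt(1.0 - phi * phi)
    x = rng.gauss(mean, std)  # start from the stationary distribution
    series = []
    for _ in range(n):
        series.append(x)
        x = mean + phi * (x - mean) + rng.gauss(0.0, innovation_std)
    return series


def sample_stats(series):
    """Sample mean, variance and lag-1 autocorrelation of a series."""
    n = len(series)
    m = sum(series) / n
    var = sum((v - m) ** 2 for v in series) / n
    lag1 = sum((series[i] - m) * (series[i + 1] - m) for i in range(n - 1)) / (n * var)
    return m, var, lag1
```

    For example, generate_ar1(50000, 12.0, 5.0, 0.7) followed by sample_stats should return values close to a mean of 12, a variance of 25 and a lag-1 autocorrelation of 0.7. Matching a non-Gaussian probability density function, as the report also requires, would need an additional transformation step not shown here.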

  19. A statistical model for porous structure of rocks

    Institute of Scientific and Technical Information of China (English)

    JU Yang; YANG YongMing; SONG ZhenDuo; XU WenJing

    2008-01-01

    The geometric features and the distribution properties of pores in rocks were investigated by means of CT scanning tests of sandstones. The centroidal coordinates of pores, the statistical characteristics of pore distance, quantity, size and their probability density functions were formulated in this paper. The Monte Carlo method and the random number generating algorithm were employed to generate two series of random numbers with the desired statistical characteristics and probability density functions upon which the random distribution of pore position, distance and quantity were determined. A three-dimensional porous structural model of sandstone was constructed based on the FLAC3D program and the information of the pore position and distribution that the series of random numbers defined. On the basis of modelling, the Brazil split tests of rock discs were carried out to examine the stress distribution, the pattern of element failure and the inosculation of failed elements. The simulation indicated that the proposed model was consistent with the realistic porous structure of rock in terms of their statistical properties of pores and geometric similarity. The built-up model disclosed the influence of pores on the stress distribution, failure mode of material elements and the inosculation of failed elements.

  20. A statistical model for porous structure of rocks

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The geometric features and the distribution properties of pores in rocks were investigated by means of CT scanning tests of sandstones. The centroidal coordinates of pores, the statistical characteristics of pore distance, quantity, size and their probability density functions were formulated in this paper. The Monte Carlo method and the random number generating algorithm were employed to generate two series of random numbers with the desired statistical characteristics and probability density functions upon which the random distribution of pore position, distance and quantity were determined. A three-dimensional porous structural model of sandstone was constructed based on the FLAC3D program and the information of the pore position and distribution that the series of random numbers defined. On the basis of modelling, the Brazil split tests of rock discs were carried out to examine the stress distribution, the pattern of element failure and the inosculation of failed elements. The simulation indicated that the proposed model was consistent with the realistic porous structure of rock in terms of their statistical properties of pores and geometric similarity. The built-up model disclosed the influence of pores on the stress distribution, failure mode of material elements and the inosculation of failed elements.

  1. Bayesian statistic methods and their application in probabilistic simulation models

    Directory of Open Access Journals (Sweden)

    Sergio Iannazzo

    2007-03-01

    Bayesian statistical methods are facing a rapidly growing level of interest and acceptance in the field of health economics. The reasons for this success are probably to be found in the theoretical foundations of the discipline, which make these techniques more appealing for decision analysis. To this should be added the progress in modern IT, which has produced several flexible and powerful statistical software frameworks. Among them, probably one of the most notable is the BUGS language project and its standalone application for MS Windows, WinBUGS. The scope of this paper is to introduce the subject and to show some interesting applications of WinBUGS in developing complex economic models based on Markov chains. The advantages of this approach reside in the elegance of the code produced and in its capability to easily develop probabilistic simulations. Moreover, an example of the integration of Bayesian inference models in a Markov model is shown. This last feature lets the analyst conduct statistical analyses on the available sources of evidence and exploit them directly as inputs in the economic model.

  2. Can spatial statistical river temperature models be transferred between catchments?

    Science.gov (United States)

    Jackson, Faye L.; Fryer, Robert J.; Hannah, David M.; Malcolm, Iain A.

    2017-09-01

    There has been increasing use of spatial statistical models to understand and predict river temperature (Tw) from landscape covariates. However, it is not financially or logistically feasible to monitor all rivers and the transferability of such models has not been explored. This paper uses Tw data from four river catchments collected in August 2015 to assess how well spatial regression models predict the maximum 7-day rolling mean of daily maximum Tw (Twmax) within and between catchments. Models were fitted for each catchment separately using (1) landscape covariates only (LS models) and (2) landscape covariates and an air temperature (Ta) metric (LS_Ta models). All the LS models included upstream catchment area and three included a river network smoother (RNS) that accounted for unexplained spatial structure. The LS models transferred reasonably to other catchments, at least when predicting relative levels of Twmax. However, the predictions were biased when mean Twmax differed between catchments. The RNS was needed to characterise and predict finer-scale spatially correlated variation. Because the RNS was unique to each catchment and thus non-transferable, predictions were better within catchments than between catchments. A single model fitted to all catchments found no interactions between the landscape covariates and catchment, suggesting that the landscape relationships were transferable. The LS_Ta models transferred less well, with particularly poor performance when the relationship with the Ta metric was physically implausible or required extrapolation outside the range of the data. A single model fitted to all catchments found catchment-specific relationships between Twmax and the Ta metric, indicating that the Ta metric was not transferable. 
    These findings improve our understanding of the transferability of spatial statistical river temperature models and provide a foundation for developing new approaches for predicting Tw at unmonitored locations across

  3. Macro-indicators of citation impacts of six prolific countries: InCites data and the statistical significance of trends.

    Directory of Open Access Journals (Sweden)

    Lutz Bornmann

    Using the InCites tool of Thomson Reuters, this study compares normalized citation impact values calculated for China, Japan, France, Germany, United States, and the UK throughout the time period from 1981 to 2010. InCites offers a unique opportunity to study the normalized citation impacts of countries using (i) a long publication window (1981 to 2010), (ii) a differentiation in (broad or more narrow) subject areas, and (iii) allowing for the use of statistical procedures in order to obtain an insightful investigation of national citation trends across the years. Using four broad categories, our results show significantly increasing trends in citation impact values for France, the UK, and especially Germany across the last thirty years in all areas. The citation impact of papers from China is still at a relatively low level (mostly below the world average), but the country follows an increasing trend line. The USA exhibits a stable pattern of high citation impact values across the years. With small impact differences between the publication years, the US trend is increasing in engineering and technology but decreasing in medical and health sciences as well as in agricultural sciences. Similar to the USA, Japan follows increasing as well as decreasing trends in different subject areas, but the variability across the years is small. In most of the years, papers from Japan perform below or approximately at the world average in each subject area.

  4. Development of free statistical software enabling researchers to calculate confidence levels, clinical significance curves and risk-benefit contours

    International Nuclear Information System (INIS)

    Shakespeare, T.P.; Mukherjee, R.K.; Gebski, V.J.

    2003-01-01

    Confidence levels, clinical significance curves, and risk-benefit contours are tools improving analysis of clinical studies and minimizing misinterpretation of published results; however, no software has been available for their calculation. The objective was to develop software to help clinicians utilize these tools. Excel 2000 spreadsheets were designed using only built-in functions, without macros. The workbook was protected and encrypted so that users can modify only input cells. The workbook has 4 spreadsheets for use in studies comparing two patient groups. Sheet 1 comprises instructions and graphic examples for use. Sheet 2 allows the user to input the main study results (e.g. survival rates) into a 2-by-2 table. Confidence intervals (95%), p-value and the confidence level for Treatment A being better than Treatment B are automatically generated. An additional input cell allows the user to determine the confidence associated with a specified level of benefit. For example, if the user wishes to know the confidence that Treatment A is at least 10% better than B, 10% is entered. Sheet 2 automatically displays clinical significance curves, graphically illustrating confidence levels for all possible benefits of one treatment over the other. Sheet 3 allows input of toxicity data, and calculates the confidence that one treatment is more toxic than the other. It also determines the confidence that the relative toxicity of the most effective arm does not exceed user-defined tolerability. Sheet 4 automatically calculates risk-benefit contours, displaying the confidence associated with a specified scenario of minimum benefit and maximum risk of one treatment arm over the other. The spreadsheet is freely downloadable at www.ontumor.com/professional/statistics.htm. A simple, self-explanatory, freely available spreadsheet calculator was developed using Excel 2000. 
The incorporated decision-making tools can be used for data analysis and improve the reporting of results of any
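
The abstract above does not give the spreadsheet's formulas, so the following is only a minimal sketch of how a "confidence that Treatment A is at least X better than B" figure could be computed from a 2-by-2 table. It assumes a normal approximation to the difference of two proportions; the function name `confidence_a_better` and its parameters are hypothetical and not taken from the workbook.

```python
import math


def confidence_a_better(events_a, n_a, events_b, n_b, min_benefit=0.0):
    """Approximate confidence that arm A's rate exceeds arm B's by at least min_benefit.

    Assumption: normal approximation to the difference of two independent
    proportions. This is an illustration, not the spreadsheet's actual method.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    if se == 0:
        raise ValueError("standard error is zero; the approximation does not apply")
    z = (p_a - p_b - min_benefit) / se
    # Standard normal cumulative distribution evaluated at z.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


if __name__ == "__main__":
    # Hypothetical example: 60/100 survivors on A versus 50/100 on B.
    print(confidence_a_better(60, 100, 50, 100))        # confidence that A is better at all
    print(confidence_a_better(60, 100, 50, 100, 0.10))  # confidence of at least a 10-point benefit
```

Evaluating the function over a range of `min_benefit` values would trace out something like the clinical significance curve the abstract describes, under the same approximation.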

  5. Probing the exchange statistics of one-dimensional anyon models

    Science.gov (United States)

    Greschner, Sebastian; Cardarelli, Lorenzo; Santos, Luis

    2018-05-01

    We propose feasible scenarios for revealing the modified exchange statistics in one-dimensional anyon models in optical lattices based on an extension of the multicolor lattice-depth modulation scheme introduced in [Phys. Rev. A 94, 023615 (2016), 10.1103/PhysRevA.94.023615]. We show that the fast modulation of a two-component fermionic lattice gas in the presence of a magnetic field gradient, in combination with additional resonant microwave fields, allows for the quantum simulation of hardcore anyon models with periodic boundary conditions. Such a semisynthetic ring setup allows for realizing an interferometric arrangement sensitive to the anyonic statistics. Moreover, we show that simple expansion experiments may reveal the formation of anomalously bound pairs resulting from the anyonic exchange.

  6. Statistical inference to advance network models in epidemiology.

    Science.gov (United States)

    Welch, David; Bansal, Shweta; Hunter, David R

    2011-03-01

    Contact networks are playing an increasingly important role in the study of epidemiology. Most of the existing work in this area has focused on considering the effect of underlying network structure on epidemic dynamics by using tools from probability theory and computer simulation. This work has provided much insight on the role that heterogeneity in host contact patterns plays on infectious disease dynamics. Despite the important understanding afforded by the probability and simulation paradigm, this approach does not directly address important questions about the structure of contact networks such as what is the best network model for a particular mode of disease transmission, how parameter values of a given model should be estimated, or how precisely the data allow us to estimate these parameter values. We argue that these questions are best answered within a statistical framework and discuss the role of statistical inference in estimating contact networks from epidemiological data. Copyright © 2011 Elsevier B.V. All rights reserved.

  7. Statistical models of a gas diffusion electrode: II. Current resistent

    Energy Technology Data Exchange (ETDEWEB)

    Proksch, D B; Winsel, O W

    1965-07-01

    The authors describe an apparatus for measuring the flow resistance of gas diffusion electrodes which is a mechanical analog of the Wheatstone bridge for measuring electric resistance. The flow resistance of a circular DSK electrode sheet, consisting of two covering layers and a working layer between them, was measured as a function of the gas pressure. While the pressure first was increased and then decreased, a hysteresis occurred, which is discussed and explained by a statistical model of a porous electrode.

  8. A Statistical Model for Soliton Particle Interaction in Plasmas

    DEFF Research Database (Denmark)

    Dysthe, K. B.; Pécseli, Hans; Truelsen, J.

    1986-01-01

    A statistical model for soliton-particle interaction is presented. A master equation is derived for the time evolution of the particle velocity distribution as induced by resonant interaction with Korteweg-de Vries solitons. The detailed energy balance during the interaction subsequently determines...... the evolution of the soliton amplitude distribution. The analysis applies equally well for weakly nonlinear plasma waves in a strongly magnetized waveguide, or for ion acoustic waves propagating in one-dimensional systems....

  9. Statistical model of a gas diffusion electrode. III. Photomicrograph study

    Energy Technology Data Exchange (ETDEWEB)

    Winsel, A W

    1965-12-01

    A linear section through a gas diffusion electrode produces a certain distribution function of sinews with the pores. From this distribution function some qualities of the pore structure are derived, and an automatic device to determine the distribution function is described. With a statistical model of a gas diffusion electrode the behavior of a DSK electrode is discussed and compared with earlier measurements of the flow resistance of this material.

  10. A statistical model of structure functions and quantum chromodynamics

    International Nuclear Information System (INIS)

    Mac, E.; Ugaz, E.; Universidad Nacional de Ingenieria, Lima

    1989-01-01

    We consider a model for the x-dependence of the quark distributions in the proton. Within the context of simple statistical assumptions, we obtain the parton densities in the infinite momentum frame. In a second step, lowest-order QCD corrections are incorporated into these distributions. Crude, but reasonable, agreement with experiment is found for the F2, valence, and q, anti-q distributions for x ≳ 0.2. (orig.)

  11. Modeling the basic superconductor thermodynamical-statistical characteristics

    International Nuclear Information System (INIS)

    Palenskis, V.; Maknys, K.

    1999-01-01

    In accordance with the Landau second-order phase transition and other thermodynamical-statistical relations for superconductors, and using the energy gap as an order parameter in the electron free energy presentation, the fundamental characteristics of electrons, such as the free energy, the total energy, the energy gap, the entropy, and the heat capacity dependences on temperature, were obtained. The modeling results reflect, in principle, the basic characteristics of low- and high-temperature superconductors well.

  12. Environmental radionuclide concentrations: statistical model to determine uniformity of distribution

    International Nuclear Information System (INIS)

    Cawley, C.N.; Fenyves, E.J.; Spitzberg, D.B.; Wiorkowski, J.; Chehroudi, M.T.

    1980-01-01

    In the evaluation of data from environmental sampling and measurement, a basic question is whether the radionuclide (or pollutant) is distributed uniformly. Since physical measurements have associated errors, it is inappropriate to consider the measurements alone in this determination. Hence, a statistical model has been developed. It consists of a weighted analysis of variance with subsequent t-tests between weighted and independent means. A computer program to perform the calculations is included.

  13. Statistical and non statistical models for delayed neutron emission: applications to nuclei near A = 90

    International Nuclear Information System (INIS)

    De Oliveira, Z.M.

    1980-01-01

    A detailed analysis of the simple statistical model description for delayed neutron emission of 87Br, 137I, 85As and 135Sb has been performed. In agreement with experimental findings, structure in the β-strength function is required to reproduce the envelope of the neutron spectrum from 87Br. For 85As and 135Sb the model is found incapable of simultaneously reproducing envelopes of delayed neutron spectra and neutron branching ratios to excited states in the final nuclei for any choice of β-strength function. The results indicate that partial widths for neutron emission are not compatible with optical-model transmission coefficients. The simple shell model with pairing is shown to qualitatively describe the main features of the β-strength functions for decay of 87Br and 91 93 95 97 Rb. It is found that the locations of apparent resonances in the experimental data are in rough agreement with the locations of centroids of strength calculated with this model. An extension of the shell model picture which includes the Gamow-Teller residual interaction is used to investigate decay properties of 84 86 As, 86 92 Br and 88 102 Rb. For a realistic choice of interaction strength, the half-lives of these isotopes are fairly well reproduced and semiquantitative agreement with experimental β-strength functions is found. Delayed neutron emission probabilities are reproduced for precursors nearer stability, with systematic deviations being observed for the heavier nuclei. Contrary to the assumption of a structureless Gamow-Teller giant resonance as embodied in the gross theory of β-decay, we find that structures in the tail of the Gamow-Teller giant resonances are expected which strongly influence the decay properties of nuclides in this region.

  14. Statistical methods for mechanistic model validation: Salt Repository Project

    International Nuclear Information System (INIS)

    Eggett, D.L.

    1988-07-01

    As part of the Department of Energy's Salt Repository Program, Pacific Northwest Laboratory (PNL) is studying the emplacement of nuclear waste containers in a salt repository. One objective of the SRP program is to develop an overall waste package component model which adequately describes such phenomena as container corrosion, waste form leaching, spent fuel degradation, etc., which are possible in the salt repository environment. The form of this model will be proposed, based on scientific principles and relevant salt repository conditions with supporting data. The model will be used to predict the future characteristics of the near field environment. This involves several different submodels such as the amount of time it takes a brine solution to contact a canister in the repository, how long it takes a canister to corrode and expose its contents to the brine, the leach rate of the contents of the canister, etc. These submodels are often tested in a laboratory and should be statistically validated (in this context, validate means to demonstrate that the model adequately describes the data) before they can be incorporated into the waste package component model. This report describes statistical methods for validating these models. 13 refs., 1 fig., 3 tabs

  15. Estimating preferential flow in karstic aquifers using statistical mixed models.

    Science.gov (United States)

    Anaya, Angel A; Padilla, Ingrid; Macchiavelli, Raul; Vesper, Dorothy J; Meeker, John D; Alshawabkeh, Akram N

    2014-01-01

    Karst aquifers are highly productive groundwater systems often associated with conduit flow. These systems can be highly vulnerable to contamination, resulting in a high potential for contaminant exposure to humans and ecosystems. This work develops statistical models to spatially characterize flow and transport patterns in karstified limestone and determines the effect of aquifer flow rates on these patterns. A laboratory-scale Geo-HydroBed model is used to simulate flow and transport processes in a karstic limestone unit. The model consists of stainless steel tanks containing a karstified limestone block collected from a karst aquifer formation in northern Puerto Rico. Experimental work involves making a series of flow and tracer injections, while monitoring hydraulic and tracer response spatially and temporally. Statistical mixed models (SMMs) are applied to hydraulic data to determine likely pathways of preferential flow in the limestone units. The models indicate a highly heterogeneous system with dominant, flow-dependent preferential flow regions. Results indicate that regions of preferential flow tend to expand at higher groundwater flow rates, suggesting a greater volume of the system being flushed by flowing water at higher rates. Spatial and temporal distribution of tracer concentrations indicates the presence of conduit-like and diffuse flow transport in the system, supporting the notion that both transport mechanisms operate in combination in the limestone unit. The temporal response of tracer concentrations at different locations in the model coincides with and confirms the preferential flow distribution generated with the SMMs used in the study. © 2013, National Ground Water Association.

  16. A generalized statistical model for the size distribution of wealth

    International Nuclear Information System (INIS)

    Clementi, F; Gallegati, M; Kaniadakis, G

    2012-01-01

    In a recent paper in this journal (Clementi et al 2009 J. Stat. Mech. P02037), we proposed a new, physically motivated, distribution function for modeling individual incomes, having its roots in the framework of the κ-generalized statistical mechanics. The performance of the κ-generalized distribution was checked against real data on personal income for the United States in 2003. In this paper we extend our previous model so as to be able to account for the distribution of wealth. Probabilistic functions and inequality measures of this generalized model for wealth distribution are obtained in closed form. In order to check the validity of the proposed model, we analyze the US household wealth distributions from 1984 to 2009 and conclude an excellent agreement with the data that is superior to any other model already known in the literature. (paper)

  17. A generalized statistical model for the size distribution of wealth

    Science.gov (United States)

    Clementi, F.; Gallegati, M.; Kaniadakis, G.

    2012-12-01

    In a recent paper in this journal (Clementi et al 2009 J. Stat. Mech. P02037), we proposed a new, physically motivated, distribution function for modeling individual incomes, having its roots in the framework of the κ-generalized statistical mechanics. The performance of the κ-generalized distribution was checked against real data on personal income for the United States in 2003. In this paper we extend our previous model so as to be able to account for the distribution of wealth. Probabilistic functions and inequality measures of this generalized model for wealth distribution are obtained in closed form. In order to check the validity of the proposed model, we analyze the US household wealth distributions from 1984 to 2009 and conclude an excellent agreement with the data that is superior to any other model already known in the literature.

  18. UPPAAL-SMC: Statistical Model Checking for Priced Timed Automata

    DEFF Research Database (Denmark)

    Bulychev, Petr; David, Alexandre; Larsen, Kim Guldstrand

    2012-01-01

    on a series of extensions of the statistical model checking approach generalized to handle real-time systems and estimate undecidable problems. UPPAAL-SMC comes together with a friendly user interface that allows a user to specify complex problems in an efficient manner as well as to get feedback...... in the form of probability distributions and compare probabilities to analyze performance aspects of systems. The focus of the survey is on the evolution of the tool – including modeling and specification formalisms as well as techniques applied – together with applications of the tool to case studies....

  19. Statistical mechanics of attractor neural network models with synaptic depression

    International Nuclear Information System (INIS)

    Igarashi, Yasuhiko; Oizumi, Masafumi; Otsubo, Yosuke; Nagata, Kenji; Okada, Masato

    2009-01-01

    Synaptic depression is known to control gain for presynaptic inputs. Since cortical neurons receive thousands of presynaptic inputs, and their outputs are fed into thousands of other neurons, synaptic depression should influence macroscopic properties of neural networks. We employ simple neural network models to explore the macroscopic effects of synaptic depression. Because synaptic depression makes the connections asymmetric, systems with synaptic depression cannot be analyzed with the conventional equilibrium statistical-mechanical approach. Thus, we first propose a microscopic dynamical mean field theory. Next, we derive macroscopic steady state equations and discuss the stabilities of steady states for various types of neural network models.

  20. A model independent safeguard against background mismodeling for statistical inference

    Energy Technology Data Exchange (ETDEWEB)

    Priel, Nadav; Landsman, Hagar; Manfredini, Alessandro; Budnik, Ranny [Department of Particle Physics and Astrophysics, Weizmann Institute of Science, Herzl St. 234, Rehovot (Israel)]; Rauch, Ludwig [Teilchen- und Astroteilchenphysik, Max-Planck-Institut für Kernphysik, Saupfercheckweg 1, 69117 Heidelberg (Germany)]

    2017-05-01

    We propose a safeguard procedure for statistical inference that provides universal protection against mismodeling of the background. The method quantifies and incorporates the signal-like residuals of the background model into the likelihood function, using information available in a calibration dataset. This prevents possible false discovery claims that may arise through unknown mismodeling, and corrects the bias in limit setting created by overestimated or underestimated background. We demonstrate how the method removes the bias created by an incomplete background model using three realistic case studies.

  1. Document Categorization with Modified Statistical Language Models for Agglutinative Languages

    Directory of Open Access Journals (Sweden)

    Tantug

    2010-11-01

    In this paper, we investigate the document categorization task with statistical language models. Our study mainly focuses on categorization of documents in agglutinative languages. Due to the productive morphology of agglutinative languages, the number of word forms encountered in naturally occurring text is very large. From the language modeling perspective, a large vocabulary results in serious data sparseness problems. In order to cope with this drawback, previous studies in various application areas suggest modified language models based on different morphological units. It is reported that performance improvements can be achieved with these modified language models. In our document categorization experiments, we use standard word form based language models as well as other modified language models based on root words, root words and part-of-speech information, truncated word forms and character sequences. Additionally, to find an optimum parameter set, multiple tests are carried out with different language model orders and smoothing methods. Similar to previous studies on other tasks, our experimental results on categorization of Turkish documents reveal that applying linguistic preprocessing steps for language modeling provides improvements over standard language models to some extent. However, it is also observed that a similar level of performance improvement can be achieved by simpler character-level or truncated word form models, which are language independent.

  2. A neighborhood statistics model for predicting stream pathogen indicator levels.

    Science.gov (United States)

    Pandey, Pramod K; Pasternack, Gregory B; Majumder, Mahbubul; Soupir, Michelle L; Kaiser, Mark S

    2015-03-01

    Because elevated levels of water-borne Escherichia coli in streams are a leading cause of water quality impairments in the U.S., water-quality managers need tools for predicting aqueous E. coli levels. Presently, E. coli levels may be predicted using complex mechanistic models that have a high degree of unchecked uncertainty or simpler statistical models. To assess spatio-temporal patterns of instream E. coli levels, herein we measured E. coli, a pathogen indicator, at 16 sites (at four different times) within the Squaw Creek watershed, Iowa, and subsequently, the Markov Random Field model was exploited to develop a neighborhood statistics model for predicting instream E. coli levels. Two observed covariates, local water temperature (degrees Celsius) and mean cross-sectional depth (meters), were used as inputs to the model. Predictions of E. coli levels in the water column were compared with independent observational data collected from 16 in-stream locations. The results revealed that spatio-temporal averages of predicted and observed E. coli levels were extremely close. Approximately 66 % of individual predicted E. coli concentrations were within a factor of 2 of the observed values. In only one event, the difference between prediction and observation was beyond one order of magnitude. The mean of all predicted values at 16 locations was approximately 1 % higher than the mean of the observed values. The approach presented here will be useful while assessing instream contaminations such as pathogen/pathogen indicator levels at the watershed scale.

  3. Efficient Parallel Statistical Model Checking of Biochemical Networks

    Directory of Open Access Journals (Sweden)

    Paolo Ballarini

    2009-12-01

    We consider the problem of verifying stochastic models of biochemical networks against behavioral properties expressed in temporal logic terms. Exact probabilistic verification approaches such as, for example, CSL/PCTL model checking, are undermined by a huge computational demand which rules them out for most real case studies. Less demanding approaches, such as statistical model checking, estimate the likelihood that a property is satisfied by sampling executions out of the stochastic model. We propose a methodology for efficiently estimating the likelihood that an LTL property P holds of a stochastic model of a biochemical network. As with other statistical verification techniques, the methodology we propose uses a stochastic simulation algorithm for generating execution samples; however, there are three key aspects that improve the efficiency. First, the sample generation is driven by on-the-fly verification of P, which results in optimal overall simulation time. Second, the confidence interval estimation for the probability of P to hold is based on an efficient variant of the Wilson method which ensures a faster convergence. Third, the whole methodology is designed in a parallel fashion, and a prototype software tool has been implemented that performs the sampling/verification process in parallel over an HPC architecture.
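
As an illustration of the confidence-interval step mentioned above, here is a minimal sketch of the standard Wilson score interval for a binomial proportion. The abstract refers to "an efficient variant" whose details it does not give, so this is the textbook form, not the authors' variant. The function name `wilson_interval` and the default z = 1.96 (roughly 95% confidence) are assumptions made for the example.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Standard Wilson score interval for a binomial proportion.

    successes: number of sampled runs in which the property held
    trials:    total number of sampled runs
    z:         normal quantile (1.96 is roughly a 95% two-sided level)
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials))
    ) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical run: the property held in 50 of 100 simulated traces.
    low, high = wilson_interval(50, 100)
    print(f"estimated probability in [{low:.4f}, {high:.4f}]")
```

In a statistical model checking loop, one would typically keep sampling traces until the interval width falls below a chosen tolerance; unlike the simple normal interval, the Wilson interval stays inside [0, 1] even when the observed proportion is 0 or 1.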

  4. Statistical models for expert judgement and wear prediction

    International Nuclear Information System (INIS)

    Pulkkinen, U.

    1994-01-01

    This thesis studies the statistical analysis of expert judgements and prediction of wear. The point of view adopted is the one of information theory and Bayesian statistics. A general Bayesian framework for analyzing both the expert judgements and wear prediction is presented. Information theoretic interpretations are given for some averaging techniques used in the determination of consensus distributions. Further, information theoretic models are compared with a Bayesian model. The general Bayesian framework is then applied in analyzing expert judgements based on ordinal comparisons. In this context, the value of information lost in the ordinal comparison process is analyzed by applying decision theoretic concepts. As a generalization of the Bayesian framework, stochastic filtering models for wear prediction are formulated. These models utilize the information from condition monitoring measurements in updating the residual life distribution of mechanical components. Finally, the application of stochastic control models in optimizing operational strategies for inspected components is studied. Monte-Carlo simulation methods, such as the Gibbs sampler and the stochastic quasi-gradient method, are applied in the determination of posterior distributions and in the solution of stochastic optimization problems. (orig.) (57 refs., 7 figs., 1 tab.)

  5. New scanning technique using Adaptive Statistical Iterative Reconstruction (ASIR) significantly reduced the radiation dose of cardiac CT

    International Nuclear Information System (INIS)

    Tumur, Odgerel; Soon, Kean; Brown, Fraser; Mykytowycz, Marcus

    2013-01-01

    The aims of our study were to evaluate the effect of application of Adaptive Statistical Iterative Reconstruction (ASIR) algorithm on the radiation dose of coronary computed tomography angiography (CCTA) and its effects on image quality of CCTA and to evaluate the effects of various patient and CT scanning factors on the radiation dose of CCTA. This was a retrospective study that included 347 consecutive patients who underwent CCTA at a tertiary university teaching hospital between 1 July 2009 and 20 September 2011. Analysis was performed comparing patient demographics, scan characteristics, radiation dose and image quality in two groups of patients in whom conventional Filtered Back Projection (FBP) or ASIR was used for image reconstruction. There were 238 patients in the FBP group and 109 patients in the ASIR group. There was no difference between the groups in the use of prospective gating, scan length or tube voltage. In ASIR group, significantly lower tube current was used compared with FBP group, 550mA (450–600) vs. 650mA (500–711.25) (median (interquartile range)), respectively, P<0.001. There was 27% effective radiation dose reduction in the ASIR group compared with FBP group, 4.29mSv (2.84–6.02) vs. 5.84mSv (3.88–8.39) (median (interquartile range)), respectively, P<0.001. Although ASIR was associated with increased image noise compared with FBP (39.93±10.22 vs. 37.63±18.79 (mean ±standard deviation), respectively, P<001), it did not affect the signal intensity, signal-to-noise ratio, contrast-to-noise ratio or the diagnostic quality of CCTA. Application of ASIR reduces the radiation dose of CCTA without affecting the image quality.

  6. New scanning technique using Adaptive Statistical Iterative Reconstruction (ASIR) significantly reduced the radiation dose of cardiac CT.

    Science.gov (United States)

    Tumur, Odgerel; Soon, Kean; Brown, Fraser; Mykytowycz, Marcus

    2013-06-01

    The aims of our study were to evaluate the effect of application of Adaptive Statistical Iterative Reconstruction (ASIR) algorithm on the radiation dose of coronary computed tomography angiography (CCTA) and its effects on image quality of CCTA and to evaluate the effects of various patient and CT scanning factors on the radiation dose of CCTA. This was a retrospective study that included 347 consecutive patients who underwent CCTA at a tertiary university teaching hospital between 1 July 2009 and 20 September 2011. Analysis was performed comparing patient demographics, scan characteristics, radiation dose and image quality in two groups of patients in whom conventional Filtered Back Projection (FBP) or ASIR was used for image reconstruction. There were 238 patients in the FBP group and 109 patients in the ASIR group. There was no difference between the groups in the use of prospective gating, scan length or tube voltage. In ASIR group, significantly lower tube current was used compared with FBP group, 550 mA (450-600) vs. 650 mA (500-711.25) (median (interquartile range)), respectively, P<0.001. There was 27% effective radiation dose reduction in the ASIR group compared with FBP group, 4.29 mSv (2.84-6.02) vs. 5.84 mSv (3.88-8.39) (median (interquartile range)), respectively, P<0.001. Although ASIR was associated with increased image noise compared with FBP (39.93 ± 10.22 vs. 37.63 ± 18.79 (mean ± standard deviation), respectively, P<001), it did not affect the signal intensity, signal-to-noise ratio, contrast-to-noise ratio or the diagnostic quality of CCTA. Application of ASIR reduces the radiation dose of CCTA without affecting the image quality. © 2013 The Authors. Journal of Medical Imaging and Radiation Oncology © 2013 The Royal Australian and New Zealand College of Radiologists.

  7. Model-generated air quality statistics for application in vegetation response models in Alberta

    International Nuclear Information System (INIS)

    McVehil, G.E.; Nosal, M.

    1990-01-01

    To test and apply vegetation response models in Alberta, air pollution statistics representative of various parts of the Province are required. At this time, air quality monitoring data of the requisite accuracy and time resolution are not available for most parts of Alberta. Therefore, there exists a need to develop appropriate air quality statistics. The objectives of the work reported here were to determine the applicability of model-generated air quality statistics and to develop, by modelling, realistic and representative time series of hourly SO2 concentrations that could be used to generate the statistics demanded by vegetation response models.

  8. Statistical power to detect violation of the proportional hazards assumption when using the Cox regression model.

    Science.gov (United States)

    Austin, Peter C

    2018-01-01

    The use of the Cox proportional hazards regression model is widespread. A key assumption of the model is that of proportional hazards. Analysts frequently test the validity of this assumption using statistical significance testing. However, the statistical power of such assessments is frequently unknown. We used Monte Carlo simulations to estimate the statistical power of two different methods for detecting violations of this assumption. When the covariate was binary, we found that a model-based method had greater power than a method based on cumulative sums of martingale residuals. Furthermore, the parametric nature of the distribution of event times had an impact on power when the covariate was binary. Statistical power to detect a strong violation of the proportional hazards assumption was low to moderate even when the number of observed events was high. In many data sets, power to detect a violation of this assumption is likely to be low to modest.

  9. The GNASH preequilibrium-statistical nuclear model code

    International Nuclear Information System (INIS)

    Arthur, E. D.

    1988-01-01

    The following report is based on materials presented in a series of lectures at the International Center for Theoretical Physics, Trieste, which were designed to describe the GNASH preequilibrium statistical model code and its use. An overview is provided of the code with emphasis upon the code's calculational capabilities and the theoretical models that have been implemented in it. Two sample problems are discussed: the first deals with neutron reactions on 58Ni; the second illustrates the fission model capabilities implemented in the code and involves n + 235U reactions. Finally, a description is provided of current theoretical model and code development underway. Examples of calculated results using these new capabilities are also given. 19 refs., 17 figs., 3 tabs.

  10. The Impact of Statistical Leakage Models on Design Yield Estimation

    Directory of Open Access Journals (Sweden)

    Rouwaida Kanj

    2011-01-01

    Device mismatch and process variation models play a key role in determining the functionality and yield of sub-100 nm design. Average characteristics are often of interest, such as the average leakage current or the average read delay. However, detecting rare functional fails is critical for memory design, and designers often seek techniques that enable accurate modeling of such events. Extremely leaky devices can inflict functionality fails. The plurality of leaky devices on a bitline increases the dimensionality of the yield estimation problem. Simplified models are possible by adopting approximations to the underlying sum of lognormals. The implications of such approximations on tail probabilities may in turn bias the yield estimate. We review different closed form approximations and compare against the CDF matching method, which is shown to be the most effective method for accurate statistical leakage modeling.
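
    One widely used closed-form approximation to a sum of lognormals is moment matching (often called the Fenton-Wilkinson approximation). The abstract does not say which approximations were reviewed, so the sketch below is only an illustration of the general idea, assuming independent and identically distributed leakage currents; the function name and parameters are hypothetical.

```python
import math
from typing import Tuple


def fenton_wilkinson(mu: float, sigma: float, n: int) -> Tuple[float, float]:
    """Approximate the sum of n i.i.d. lognormal(mu, sigma) variables
    by a single lognormal with the same mean and variance.

    Returns the (mu_s, sigma_s) parameters of the matched lognormal.
    Assumption: devices are independent and identically distributed.
    """
    # Exact mean and variance of the sum of n i.i.d. lognormals.
    mean_sum = n * math.exp(mu + sigma ** 2 / 2.0)
    var_sum = n * (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
    # Parameters of the lognormal that has the same first two moments.
    sigma_s_sq = math.log(1.0 + var_sum / mean_sum ** 2)
    mu_s = math.log(mean_sum) - sigma_s_sq / 2.0
    return mu_s, math.sqrt(sigma_s_sq)
```

    Matching the first two moments fixes the bulk of the distribution but not its tails, which is the kind of tail-probability bias the abstract warns about.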

  11. Schedulability of Herschel revisited using statistical model checking

    DEFF Research Database (Denmark)

    David, Alexandre; Larsen, Kim Guldstrand; Legay, Axel

    2015-01-01

    -approximation technique. We can safely conclude that the system is schedulable for varying values of BCET. For the cases where deadlines are violated, we use polyhedra to try to confirm the witnesses. Our alternative method to confirm non-schedulability uses statistical model-checking (SMC) to generate counter...... and blocking times of tasks. Consequently, the method may falsely declare deadline violations that will never occur during execution. This paper is a continuation of previous work of the authors in applying extended timed automata model checking (using the tool UPPAAL) to obtain more exact schedulability...... analysis, here in the presence of non-deterministic computation times of tasks given by intervals [BCET,WCET]. Computation intervals with preemptive schedulers make the schedulability analysis of the resulting task model undecidable. Our contribution is to propose a combination of model checking techniques...

  12. Fast optimization of statistical potentials for structurally constrained phylogenetic models

    Directory of Open Access Journals (Sweden)

    Rodrigue Nicolas

    2009-09-01

    Background: Statistical approaches for protein design are relevant in the field of molecular evolutionary studies. In recent years, new, so-called structurally constrained (SC) models of protein-coding sequence evolution have been proposed, which use statistical potentials to assess sequence-structure compatibility. In a previous work, we defined a statistical framework for optimizing knowledge-based potentials especially suited to SC models. Our method used the maximum likelihood principle and provided what we call the joint potentials. However, the method required numerical estimations by the use of computationally heavy Markov Chain Monte Carlo sampling algorithms. Results: Here, we develop an alternative optimization procedure, based on a leave-one-out argument coupled to fast gradient descent algorithms. We find that the leave-one-out potential yields very similar results to the joint approach developed previously, both in terms of the resulting potential parameters, and by Bayes factor evaluation in a phylogenetic context. On the other hand, the leave-one-out approach results in a considerable computational benefit (up to a 1,000-fold decrease in computational time for the optimization procedure). Conclusion: Due to its computational speed, the optimization method we propose offers an attractive alternative for the design and empirical evaluation of alternative forms of potentials, using large data sets and high-dimensional parameterizations.

  13. Statistical Downscaling of Temperature with the Random Forest Model

    Directory of Open Access Journals (Sweden)

    Bo Pang

    2017-01-01

    The issues with downscaling the outputs of a global climate model (GCM) to a regional scale that is appropriate to hydrological impact studies are investigated using the random forest (RF) model, which has been shown to be superior for large dataset analysis and variable importance evaluation. The RF is proposed for downscaling daily mean temperature in the Pearl River basin in southern China. Four downscaling models were developed and validated by using the observed temperature series from 61 national stations and large-scale predictor variables derived from the National Center for Environmental Prediction–National Center for Atmospheric Research reanalysis dataset. The proposed RF downscaling model was compared to multiple linear regression, artificial neural network, and support vector machine models. Principal component analysis (PCA) and partial correlation analysis (PAR) were used in the predictor selection for the other models for a comprehensive study. It was shown that the model efficiency of the RF model was higher than that of the other models according to five selected criteria. By evaluating the predictor importance, the RF could choose the best predictor combination without using PCA and PAR. The results indicate that the RF is a feasible tool for the statistical downscaling of temperature.

  14. Statistics of excitations in the electron glass model

    Science.gov (United States)

    Palassini, Matteo

    2011-03-01

    We study the statistics of elementary excitations in the classical electron glass model of localized electrons interacting via the unscreened Coulomb interaction in the presence of disorder. We reconsider the long-standing puzzle of the exponential suppression of the single-particle density of states near the Fermi level, by measuring accurately the density of states of charged and electron-hole pair excitations via finite temperature Monte Carlo simulation and zero-temperature relaxation. We also investigate the statistics of large charge rearrangements after a perturbation of the system, which may shed some light on the slow relaxation and glassy phenomena recently observed in a variety of Anderson insulators. In collaboration with Martin Goethe.

  15. Hybrid perturbation methods based on statistical time series models

    Science.gov (United States)

    San-Juan, Juan Félix; San-Martín, Montserrat; Pérez, Iván; López, Rosario

    2016-04-01

    In this work we present a new methodology for orbit propagation, the hybrid perturbation theory, based on the combination of an integration method and a prediction technique. The former, which can be a numerical, analytical or semianalytical theory, generates an initial approximation that contains some inaccuracies derived from the fact that, in order to simplify the expressions and subsequent computations, not all the involved forces are taken into account and only low-order terms are considered, not to mention the fact that mathematical models of perturbations do not always reproduce physical phenomena with absolute precision. The prediction technique, which can be based on either statistical time series models or computational intelligence methods, is aimed at modelling and reproducing missing dynamics in the previously integrated approximation. This combination improves the precision of conventional numerical, analytical and semianalytical theories for determining the position and velocity of any artificial satellite or space debris object. In order to validate this methodology, we present a family of three hybrid orbit propagators formed by the combination of three different orders of approximation of an analytical theory and a statistical time series model, and analyse their capability to process the effect produced by the flattening of the Earth. The three considered analytical components are the integration of the Kepler problem and first-order and second-order analytical theories, whereas the prediction technique is the same in the three cases, namely an additive Holt-Winters method.
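
    The additive Holt-Winters method named above can be sketched in a few lines. This is a generic textbook version, not the authors' implementation: the initialization heuristic, the smoothing constants, and the function name are assumptions made for illustration. In a hybrid propagator the input series would presumably be the residual between the analytical theory and a reference orbit, but the sketch works on any seasonal series.

```python
from typing import List, Sequence


def holt_winters_additive(y: Sequence[float], period: int, alpha: float,
                          beta: float, gamma: float, horizon: int) -> List[float]:
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing.

    alpha, beta, gamma smooth the level, trend, and seasonal terms.
    Initialization (an assumed heuristic) uses the first two seasons.
    """
    n = len(y)
    if n < 2 * period:
        raise ValueError("need at least two full seasons of data")
    first_mean = sum(y[:period]) / period
    second_mean = sum(y[period:2 * period]) / period
    trend = (second_mean - first_mean) / period           # per-step slope
    level0 = first_mean - trend * (period - 1) / 2.0      # level at t = 0
    season = [y[i] - (level0 + trend * i) for i in range(period)]
    level = level0 + trend * (period - 1)                 # level at t = period - 1
    for t in range(period, n):
        s = season[t % period]
        prev_level = level
        level = alpha * (y[t] - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % period] = gamma * (y[t] - level) + (1 - gamma) * s
    return [level + h * trend + season[(n + h - 1) % period]
            for h in range(1, horizon + 1)]
```

    On a series that is exactly a straight line plus a repeating pattern, the recursion leaves level, trend, and seasonal terms unchanged, so the forecast continues the series without error; real residual series will not behave this cleanly.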

  16. Bayesian Sensitivity Analysis of Statistical Models with Missing Data.

    Science.gov (United States)

    Zhu, Hongtu; Ibrahim, Joseph G; Tang, Niansheng

    2014-04-01

    Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.

  17. A statistical model for interpreting computerized dynamic posturography data

    Science.gov (United States)

    Feiveson, Alan H.; Metter, E. Jeffrey; Paloski, William H.

    2002-01-01

    Computerized dynamic posturography (CDP) is widely used for assessment of altered balance control. CDP trials are quantified using the equilibrium score (ES), which ranges from zero to 100, as a decreasing function of peak sway angle. The problem of how best to model and analyze ESs from a controlled study is considered. The ES often exhibits a skewed distribution in repeated trials, which can lead to incorrect inference when applying standard regression or analysis of variance models. Furthermore, CDP trials are terminated when a patient loses balance. In these situations, the ES is not observable, but is assigned the lowest possible score--zero. As a result, the response variable has a mixed discrete-continuous distribution, further compromising inference obtained by standard statistical methods. Here, we develop alternative methodology for analyzing ESs under a stochastic model extending the ES to a continuous latent random variable that always exists, but is unobserved in the event of a fall. Loss of balance occurs conditionally, with probability depending on the realized latent ES. After fitting the model by a form of quasi-maximum-likelihood, one may perform statistical inference to assess the effects of explanatory variables. An example is provided, using data from the NIH/NIA Baltimore Longitudinal Study on Aging.

  18. Monthly to seasonal low flow prediction: statistical versus dynamical models

    Science.gov (United States)

    Ionita-Scholz, Monica; Klein, Bastian; Meissner, Dennis; Rademacher, Silke

    2016-04-01

    the Alfred Wegener Institute a purely statistical scheme to generate streamflow forecasts for several months ahead. Instead of directly using teleconnection indices (e.g. NAO, AO) the idea is to identify regions with stable teleconnections between different global climate information (e.g. sea surface temperature, geopotential height etc.) and streamflow at different gauges relevant for inland waterway transport. So-called stability (correlation) maps are generated showing regions where streamflow and climate variable from previous months are significantly correlated in a 21 (31) years moving window. Finally, the optimal forecast model is established based on a multiple regression analysis of the stable predictors. We will present current results of the aforementioned approaches with focus on the River Rhine (being one of the world's most frequented waterways and the backbone of the European inland waterway network) and the Elbe River. Overall, our analysis reveals the existence of a valuable predictability of the low flows at monthly and seasonal time scales, a result that may be useful to water resources management. Given that all predictors used in the models are available at the end of each month, the forecast scheme can be used operationally to predict extreme events and to provide early warnings for upcoming low flows.

  19. Prediction of dimethyl disulfide levels from biosolids using statistical modeling.

    Science.gov (United States)

    Gabriel, Steven A; Vilalai, Sirapong; Arispe, Susanna; Kim, Hyunook; McConnell, Laura L; Torrents, Alba; Peot, Christopher; Ramirez, Mark

    2005-01-01

    Two statistical models were used to predict the concentration of dimethyl disulfide (DMDS) released from biosolids produced by an advanced wastewater treatment plant (WWTP) located in Washington, DC, USA. The plant concentrates sludge from primary sedimentation basins in gravity thickeners (GT) and sludge from secondary sedimentation basins in dissolved air flotation (DAF) thickeners. The thickened sludge is pumped into blending tanks and then fed into centrifuges for dewatering. The dewatered sludge is then conditioned with lime before being trucked out of the plant. DMDS, along with other volatile sulfur and nitrogen-containing chemicals, is known to contribute to biosolids odors. These models identified oxidation/reduction potential (ORP) values of a GT and DAF, the amount of sludge dewatered by centrifuges, and the blend ratio between GT thickened sludge and DAF thickened sludge in blending tanks as control variables. The accuracy of the developed regression models was evaluated by checking the adjusted R2 of the regression as well as the signs of coefficients associated with each variable. In general, both models explained observed DMDS levels in sludge headspace samples. The adjusted R2 values of regression models 1 and 2 were 0.79 and 0.77, respectively. Coefficients for each regression model also had the correct sign. Using the developed models, plant operators can adjust the controllable variables to proactively decrease this odorant. Therefore, these models are a useful tool in biosolids management at WWTPs.
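
    For readers unfamiliar with the adjusted R2 used above, the standard definition penalizes the ordinary R2 for the number of predictors. The sketch below shows that formula only; the sample size and predictor count in the test values are hypothetical and are not taken from the study.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    n_samples (n) and n_predictors (p, excluding the intercept) are
    illustrative inputs; the abstract does not report them.
    """
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)
```

    Because the correction factor exceeds one whenever there is at least one predictor, the adjusted value is never larger than the raw R2.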

  20. Statistical approach for uncertainty quantification of experimental modal model parameters

    DEFF Research Database (Denmark)

    Luczak, M.; Peeters, B.; Kahsin, M.

    2014-01-01

    Composite materials are widely used in manufacture of aerospace and wind energy structural components. These load carrying structures are subjected to dynamic time-varying loading conditions. Robust structural dynamics identification procedure impose tight constraints on the quality of modal models...... represent different complexity levels ranging from coupon, through sub-component up to fully assembled aerospace and wind energy structural components made of composite materials. The proposed method is demonstrated on two application cases of a small and large wind turbine blade........ This paper aims at a systematic approach for uncertainty quantification of the parameters of the modal models estimated from experimentally obtained data. Statistical analysis of modal parameters is implemented to derive an assessment of the entire modal model uncertainty measure. Investigated structures...

  1. Statistical mechanics of sparse generalization and graphical model selection

    International Nuclear Information System (INIS)

    Lage-Castellanos, Alejandro; Pagnani, Andrea; Weigt, Martin

    2009-01-01

    One of the crucial tasks in many inference problems is the extraction of an underlying sparse graphical model from a given number of high-dimensional measurements. In machine learning, this is frequently achieved using, as a penalty term, the L_p norm of the model parameters, with p≤1 for efficient dilution. Here we propose a statistical mechanics analysis of the problem in the setting of perceptron memorization and generalization. Using a replica approach, we are able to evaluate the relative performance of naive dilution (obtained by learning without dilution, followed by applying a threshold to the model parameters), L_1 dilution (which is frequently used in convex optimization) and L_0 dilution (which is optimal but computationally hard to implement). Whereas both L_p diluted approaches clearly outperform the naive approach, we find a small region where L_0 works almost perfectly and strongly outperforms the simpler-to-implement L_1 dilution.
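
    The difference between naive dilution and L_1 dilution can be made concrete at the level of a single weight vector. The sketch below is not the replica analysis of the paper: it only contrasts hard thresholding of already-learned weights with the soft-thresholding operator, which is the proximal step associated with an L_1 penalty in convex optimization. Function names and threshold values are hypothetical.

```python
import math
from typing import List, Sequence


def hard_threshold(weights: Sequence[float], tau: float) -> List[float]:
    """Naive dilution: zero every weight whose magnitude is at most tau."""
    return [w if abs(w) > tau else 0.0 for w in weights]


def soft_threshold(weights: Sequence[float], lam: float) -> List[float]:
    """Proximal operator of lam * ||w||_1: shrink each weight toward zero
    by lam and set it to zero if it would cross zero."""
    out = []
    for w in weights:
        shrunk = max(abs(w) - lam, 0.0)
        out.append(math.copysign(shrunk, w) if shrunk > 0.0 else 0.0)
    return out
```

    Hard thresholding leaves surviving weights untouched, while soft thresholding also shrinks them, which is one reason the two kinds of dilution can perform differently.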

  2. Exploiting linkage disequilibrium in statistical modelling in quantitative genomics

    DEFF Research Database (Denmark)

    Wang, Lei

    Alleles at two loci are said to be in linkage disequilibrium (LD) when they are correlated or statistically dependent. Genomic prediction and gene mapping rely on the existence of LD between genetic markers and causal variants of complex traits. In the first part of the thesis, a novel method...... to quantify and visualize local variation in LD along chromosomes is described, and applied to characterize LD patterns at the local and genome-wide scale in three Danish pig breeds. In the second part, different ways of taking LD into account in genomic prediction models are studied. One approach is to use......

  3. A statistical model for field emission in superconducting cavities

    International Nuclear Information System (INIS)

    Padamsee, H.; Green, K.; Jost, W.; Wright, B.

    1993-01-01

    A statistical model is used to account for several features of performance of an ensemble of superconducting cavities. The input parameters are: the number of emitters/area, a distribution function for emitter β values, a distribution function for emissive areas, and a processing threshold. The power deposited by emitters is calculated from the field emission current and electron impact energy. The model can successfully account for the fraction of tests that reach the maximum field Epk in an ensemble of cavities, e.g., 1-cells at 3 GHz or 5-cells at 1.5 GHz. The model is used to predict the level of power needed to successfully process cavities of various surface areas with high pulsed power processing (HPP).

  4. Atmospheric statistical dynamic models. Model performance: the Lawrence Livermore Laboratory Zonal Atmospheric Model

    International Nuclear Information System (INIS)

    Potter, G.L.; Ellsaesser, H.W.; MacCracken, M.C.; Luther, F.M.

    1978-06-01

    Results from the zonal model indicate quite reasonable agreement with observation in terms of the parameters and processes that influence the radiation and energy balance calculations. The model produces zonal statistics similar to those from general circulation models, and has also been shown to produce similar responses in sensitivity studies. Further studies of model performance are planned, including: comparison with July data; comparison of temperature and moisture transport and wind fields for winter and summer months; and a tabulation of atmospheric energetics. Based on these preliminary performance studies, however, it appears that the zonal model can be used in conjunction with more complex models to help unravel the problems of understanding the processes governing present climate and climate change. As can be seen in the subsequent paper on model sensitivity studies, in addition to reduced cost of computation, the zonal model facilitates analysis of feedback mechanisms and simplifies analysis of the interactions between processes

  5. Modeling CCN effects on tropical convection: A statistical perspective

    Science.gov (United States)

    Carrio, G. G.; Cotton, W. R.; Massie, S. T.

    2012-12-01

    This modeling study examines the response of tropical convection to the enhancement of CCN concentrations from a statistical perspective. The sensitivity runs were performed using RAMS version 6.0, covering almost the entire Amazonian Aerosol Characterization Experiment period (AMAZE, wet season of 2008). The main focus of the analysis was the indirect aerosol effects on the probability density functions (PDFs) of various cloud properties. RAMS was configured to work with four two-way interactive nested grids with 42 vertical levels and horizontal grid spacing of 150, 37.5, 7.5, and 1.5 km. Grids 2 and 3 were used to simulate the synoptic and mesoscale environments, while grid 4 was used to resolve deep convection. Comparisons were made using the finest grid with a domain size of 300 x 300 km, approximately centered on the city of Manaus (3.1S, 60.01W). The vertical grid was stretched, with 75 m spacing at the finest levels to provide better resolution within the first 1.5 km, and the model top extended to approximately 22 km above ground level. RAMS was initialized on February 10 2008 (00:00 UTC), the length of simulations was 32 days, and GSF data were used for initialization and nudging of the coarser-grid boundaries. The control run considered a CCN concentration of 300 cm-3, while several other simulations considered an influx of higher CCN concentrations (up to 1300 cm-3). The latter concentration was observed near the end of the AMAZE project period. Both direct and indirect effects of these CCN particles were considered. Model output data (finest grid) every 15 min were used to compute the PDFs for each model level. When increasing aerosol concentrations, significant impacts were simulated for the PDFs of the water contents of various hydrometeors, vertical motions, area with precipitation, and latent heat releases, among other quantities. In most cases, they exhibited a peculiar non-monotonic response similar to that seen in two previous studies of ours.

  6. MASKED AREAS IN SHEAR PEAK STATISTICS: A FORWARD MODELING APPROACH

    International Nuclear Information System (INIS)

    Bard, D.; Kratochvil, J. M.; Dawson, W.

    2016-01-01

    The statistics of shear peaks have been shown to provide valuable cosmological information beyond the power spectrum, and will be an important constraint of models of cosmology in forthcoming astronomical surveys. Surveys include masked areas due to bright stars, bad pixels etc., which must be accounted for in producing constraints on cosmology from shear maps. We advocate a forward-modeling approach, where the impacts of masking and other survey artifacts are accounted for in the theoretical prediction of cosmological parameters, rather than correcting survey data to remove them. We use masks based on the Deep Lens Survey, and explore the impact of up to 37% of the survey area being masked on LSST and DES-scale surveys. By reconstructing maps of aperture mass the masking effect is smoothed out, resulting in up to 14% smaller statistical uncertainties compared to simply reducing the survey area by the masked area. We show that, even in the presence of large survey masks, the bias in cosmological parameter estimation produced in the forward-modeling process is ≈1%, dominated by bias caused by limited simulation volume. We also explore how this potential bias scales with survey area and evaluate how much small survey areas are impacted by the differences in cosmological structure in the data and simulated volumes, due to cosmic variance

  7. Statistical multistep direct and statistical multistep compound models for calculations of nuclear data for applications

    International Nuclear Information System (INIS)

    Seeliger, D.

    1993-01-01

    This contribution contains a brief presentation and comparison of the different Statistical Multistep Approaches, presently available for practical nuclear data calculations. (author). 46 refs, 5 figs

  8. A Tensor Statistical Model for Quantifying Dynamic Functional Connectivity.

    Science.gov (United States)

    Zhu, Yingying; Zhu, Xiaofeng; Kim, Minjeong; Yan, Jin; Wu, Guorong

    2017-06-01

    Functional connectivity (FC) has been widely investigated in many imaging-based neuroscience and clinical studies. Since the functional Magnetic Resonance Imaging (fMRI) signal is just an indirect reflection of brain activity, it is difficult to accurately quantify the FC strength based only on signal correlation. To address this limitation, we propose a learning-based tensor model to derive high sensitivity and specificity connectome biomarkers at the individual level from resting-state fMRI images. First, we propose a learning-based approach to estimate the intrinsic functional connectivity. In addition to the low level region-to-region signal correlation, latent module-to-module connection is also estimated and used to provide high level heuristics for measuring connectivity strength. Furthermore, a sparsity constraint is employed to automatically remove the spurious connections, thus alleviating the issue of searching for an optimal threshold. Second, we integrate our learning-based approach with the sliding-window technique to further reveal the dynamics of functional connectivity. Specifically, we stack the functional connectivity matrix within each sliding window and form a 3D tensor where the third dimension denotes time. Then we obtain dynamic functional connectivity (dFC) for each individual subject by simultaneously estimating the within-sliding-window functional connectivity and characterizing the across-sliding-window temporal dynamics. Third, in order to enhance the robustness of the connectome patterns extracted from dFC, we extend the individual-based 3D tensors to a population-based 4D tensor (with the fourth dimension standing for the training subjects) and learn the statistics of connectome patterns via 4D tensor analysis. Since our 4D tensor model jointly (1) optimizes dFC for each training subject and (2) captures the principal connectome patterns, our statistical model gains more statistical power of representing new subject than current state
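
    The sliding-window stacking step described above can be sketched as follows. This shows only the conventional baseline (a Pearson correlation matrix per window, stacked over windows), not the learning-based estimator of the paper; the window length, step size, and function names are hypothetical, and the time index comes first in the nested list rather than third.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sliding_window_fc(signals: Sequence[Sequence[float]], window: int,
                      step: int) -> List[List[List[float]]]:
    """Stack one region-by-region correlation matrix per sliding window.

    signals: one time series per brain region, all of the same length.
    Returns tensor[k][i][j] = correlation of regions i and j in window k.
    """
    n_time = len(signals[0])
    tensor = []
    for start in range(0, n_time - window + 1, step):
        segment = [s[start:start + window] for s in signals]
        tensor.append([[pearson(a, b) for b in segment] for a in segment])
    return tensor
```

    Stacking such per-subject tensors along one more axis would give the 4D population tensor the abstract mentions.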

  9. Development of modelling algorithm of technological systems by statistical tests

    Science.gov (United States)

    Shemshura, E. A.; Otrokov, A. V.; Chernyh, V. G.

    2018-03-01

    The paper tackles the problem of economic assessment of the design efficiency of various technological systems at the stage of their operation. The modelling algorithm of a technological system was developed using statistical tests and, taking the reliability index into account, allows estimating the level of machinery technical excellence and defining the efficiency of design reliability against its performance. The economic feasibility of its application shall be determined on the basis of the service quality of a technological system, with further forecasting of the volumes and range of spare parts supply.

  10. New statistical model of inelastic fast neutron scattering

    International Nuclear Information System (INIS)

    Stancicj, V.

    1975-07-01

    A new statistical model for treating the fast neutron inelastic scattering has been proposed by using the general expressions of the double differential cross section in impulse approximation. The use of the Fermi-Dirac distribution of nucleons makes it possible to derive an analytical expression of the fast neutron inelastic scattering kernel including the angular momenta coupling. The obtained values of the inelastic fast neutron cross section calculated from the derived expression of the scattering kernel are in good agreement with the experiments. A main advantage of the derived expressions is in their simplicity for the practical calculations.

  11. Statistical and Machine Learning Models to Predict Programming Performance

    OpenAIRE

    Bergin, Susan

    2006-01-01

    This thesis details a longitudinal study on factors that influence introductory programming success and on the development of machine learning models to predict incoming student performance. Although numerous studies have developed models to predict programming success, the models struggled to achieve high accuracy in predicting the likely performance of incoming students. Our approach overcomes this by providing a machine learning technique, using a set of three significant...

  12. Hyperparameterization of soil moisture statistical models for North America with Ensemble Learning Models (Elm)

    Science.gov (United States)

    Steinberg, P. D.; Brener, G.; Duffy, D.; Nearing, G. S.; Pelissier, C.

    2017-12-01

    Hyperparameterization of statistical models, i.e. automated model scoring and selection using methods such as evolutionary algorithms, grid searches, and randomized searches, can improve forecast model skill by reducing errors associated with model parameterization, model structure, and statistical properties of training data. Ensemble Learning Models (Elm), and the related Earthio package, provide a flexible interface for automating the selection of parameters and model structure for machine learning models common in climate science and land cover classification, offering convenient tools for loading NetCDF, HDF, Grib, or GeoTiff files, decomposition methods like PCA and manifold learning, and parallel training and prediction with unsupervised and supervised classification, clustering, and regression estimators. Continuum Analytics is using Elm to experiment with statistical soil moisture forecasting based on meteorological forcing data from NASA's North American Land Data Assimilation System (NLDAS). There, Elm is using the NSGA-2 multiobjective optimization algorithm for optimizing statistical preprocessing of forcing data to improve goodness-of-fit for statistical models (i.e. feature engineering). This presentation will discuss Elm and its components, including dask (distributed task scheduling), xarray (data structures for n-dimensional arrays), and scikit-learn (statistical preprocessing, clustering, classification, regression), and it will show how NSGA-2 is being used to automate selection of soil moisture forecast statistical models for North America.

  13. Statistical Models for Inferring Vegetation Composition from Fossil Pollen

    Science.gov (United States)

    Paciorek, C.; McLachlan, J. S.; Shang, Z.

    2011-12-01

    Fossil pollen provide information about vegetation composition that can be used to help understand how vegetation has changed over the past. However, these data have not traditionally been analyzed in a way that allows for statistical inference about spatio-temporal patterns and trends. We build a Bayesian hierarchical model called STEPPS (Spatio-Temporal Empirical Prediction from Pollen in Sediments) that predicts forest composition in southern New England, USA, over the last two millennia based on fossil pollen. The critical relationships between abundances of tree taxa in the pollen record and abundances in actual vegetation are estimated using modern (Forest Inventory Analysis) data and (witness tree) data from colonial records. This gives us two time points at which both pollen and direct vegetation data are available. Based on these relationships, and incorporating our uncertainty about them, we predict forest composition using fossil pollen. We estimate the spatial distribution and relative abundances of tree species and draw inference about how these patterns have changed over time. Finally, we describe ongoing work to extend the modeling to the upper Midwest of the U.S., including an approach to infer tree density and thereby estimate the prairie-forest boundary in Minnesota and Wisconsin. This work is part of the PalEON project, which brings together a team of ecosystem modelers, paleoecologists, and statisticians with the goal of reconstructing vegetation responses to climate during the last two millennia in the northeastern and midwestern United States. The estimates from the statistical modeling will be used to assess and calibrate ecosystem models that are used to project ecological changes in response to global change.

  14. Statistical molecular design of balanced compound libraries for QSAR modeling.

    Science.gov (United States)

    Linusson, A; Elofsson, M; Andersson, I E; Dahlgren, M K

    2010-01-01

    A fundamental step in preclinical drug development is the computation of quantitative structure-activity relationship (QSAR) models, i.e. models that link chemical features of compounds with activities towards a target macromolecule associated with the initiation or progression of a disease. QSAR models are computed by combining information on the physicochemical and structural features of a library of congeneric compounds, typically assembled from two or more building blocks, and biological data from one or more in vitro assays. Since the models provide information on features affecting the compounds' biological activity they can be used as guides for further optimization. However, in order for a QSAR model to be relevant to the targeted disease, and drug development in general, the compound library used must contain molecules with balanced variation of the features spanning the chemical space believed to be important for interaction with the biological target. In addition, the assays used must be robust and deliver high quality data that are directly related to the function of the biological target and the associated disease state. In this review, we discuss and exemplify the concept of statistical molecular design (SMD) in the selection of building blocks and final synthetic targets (i.e. compounds to synthesize) to generate information-rich, balanced libraries for biological testing and computation of QSAR models.

  15. Modeling of environmentally significant interfaces: Two case studies

    International Nuclear Information System (INIS)

    Williford, R.E.

    2006-01-01

    When some parameters cannot be easily measured experimentally, mathematical models can often be used to deconvolute or interpret data collected on complex systems, such as those characteristic of many environmental problems. These models can help quantify the contributions of various physical or chemical phenomena that contribute to the overall behavior, thereby enabling the scientist to control and manipulate these phenomena, and thus to optimize the performance of the material or device. In the first case study presented here, a model is used to test the hypothesis that oxygen interactions with hydrogen on the catalyst particles of solid oxide fuel cell anodes can sometimes occur a finite distance away from the triple phase boundary (TPB), so that such reactions are not restricted to the TPB as normally assumed. The model may help explain a discrepancy between the observed structure of SOFCs and their performance. The second case study develops a simple physical model that allows engineers to design and control the sizes and shapes of mesopores in silica thin films. Such pore design can be useful for enhancing the selectivity and reactivity of environmental sensors and catalysts. This paper demonstrates the mutually beneficial interactions between experiment and modeling in the solution of a wide range of problems

  16. Multiresolution wavelet-ANN model for significant wave height forecasting.

    Digital Repository Service at National Institute of Oceanography (India)

    Deka, P.C.; Mandal, S.; Prahlada, R.

    Hybrid wavelet artificial neural network (WLNN) has been applied in the present study to forecast significant wave heights (Hs). Here Discrete Wavelet Transformation is used to preprocess the time series data (Hs) prior to Artificial Neural Network...

  17. Image sequence analysis in nuclear medicine: (1) Parametric imaging using statistical modelling

    International Nuclear Information System (INIS)

    Liehn, J.C.; Hannequin, P.; Valeyre, J.

    1989-01-01

    This is a review of parametric imaging methods in nuclear medicine. A Parametric Image is an image in which each pixel value is a function of the values of the same pixel across an image sequence. The Local Model Method fits each pixel's time-activity curve with a model whose parameter values form the Parametric Images. The Global Model Method models the changes between two images and is applied to image comparison. For both methods, the different models, the identification criterion, the optimization methods and the statistical properties of the images are discussed. The analysis of one or more Parametric Images is performed using 1D or 2D histograms. Statistically significant Parametric Images (images of significant Variances, Amplitudes and Differences) are also proposed.

  18. A combined statistical model for multiple motifs search

    International Nuclear Information System (INIS)

    Gao Lifeng; Liu Xin; Guan Shan

    2008-01-01

    Transcription factor binding sites (TFBS) play key roles in gene expression and regulation. They are short sequence segments with definite structure and can be recognized correctly by the corresponding transcription factors. From the viewpoint of statistics, TFBS candidates should be quite different from segments formed by randomly combining nucleotides. This paper proposes a combined statistical model for finding over-represented short sequence segments in different kinds of data sets. When an over-represented short sequence segment is described by a position weight matrix, the nucleotide distribution at most sites of the segment should be far from the background nucleotide distribution. The central idea of this approach is to search for this kind of signal. The algorithm is tested on 3 data sets: a binding-site data set of the cyclic AMP receptor protein in E. coli; PlantProm DB, which is a non-redundant collection of proximal promoter sequences from different species; and a collection of the intergenic sequences of the whole genome of E. coli. Even though the complexity of these three data sets is quite different, the results show that this model is rather general and sensible. (general)
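    The idea that a motif column should be "far from the background nucleotide distribution" can be made concrete with a small sketch. It is not the combined model proposed in the paper; it only computes a position frequency matrix from aligned sites and the per-position relative entropy against a background. The example sites, the pseudocount, and the function names are assumptions.

```python
import math

BASES = "ACGT"


def position_frequencies(sites, pseudocount=0.5):
    """Column-wise nucleotide frequencies for equal-length aligned sites."""
    length = len(sites[0])
    columns = []
    for pos in range(length):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        columns.append({base: c / total for base, c in counts.items()})
    return columns


def relative_entropy(column, background):
    """Kullback-Leibler divergence (in bits) of a column from the background."""
    return sum(
        p * math.log2(p / background[base])
        for base, p in column.items()
        if p > 0
    )


def column_scores(sites, background):
    """Relative entropy of every motif position; larger means more distinct."""
    return [relative_entropy(col, background) for col in position_frequencies(sites)]
```

    Positions whose score stays near zero look like background, so a segment with high scores at most positions is the kind of over-represented signal the abstract describes.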

  19. Statistical Agent Based Modelization of the Phenomenon of Drug Abuse

    Science.gov (United States)

    di Clemente, Riccardo; Pietronero, Luciano

    2012-07-01

    We introduce a statistical agent based model to describe the phenomenon of drug abuse and its dynamical evolution at the individual and global level. The agents are heterogeneous with respect to their intrinsic inclination to drugs, their budget attitude and their social environment. The various levels of drug use were inspired by the professional description of the phenomenon, and this permits a direct comparison with all available data. We show that certain elements are of great importance in starting the use of drugs, for example the rare events in personal experience that make it possible to overcome the barrier to occasional drug use. The analysis of how the system reacts to perturbations is very important for understanding its key elements, and it provides strategies for effective policy making. The present model represents the first step towards a realistic description of this phenomenon and can be easily generalized in various directions.

  20. Graphene growth process modeling: a physical-statistical approach

    Science.gov (United States)

    Wu, Jian; Huang, Qiang

    2014-09-01

    As a zero-band semiconductor, graphene is an attractive material for a wide variety of applications such as optoelectronics. Among various techniques developed for graphene synthesis, chemical vapor deposition on copper foils shows high potential for producing few-layer and large-area graphene. Since fabrication of high-quality graphene sheets requires the understanding of growth mechanisms, and methods of characterization and control of grain size of graphene flakes, analytical modeling of graphene growth process is therefore essential for controlled fabrication. The graphene growth process starts with randomly nucleated islands that gradually develop into complex shapes, grow in size, and eventually connect together to cover the copper foil. To model this complex process, we develop a physical-statistical approach under the assumption of self-similarity during graphene growth. The growth kinetics is uncovered by separating island shapes from area growth rate. We propose to characterize the area growth velocity using a confined exponential model, which not only has clear physical explanation, but also fits the real data well. For the shape modeling, we develop a parametric shape model which can be well explained by the angular-dependent growth rate. This work can provide useful information for the control and optimization of graphene growth process on Cu foil.

  1. The epistemology of mathematical and statistical modeling: a quiet methodological revolution.

    Science.gov (United States)

    Rodgers, Joseph Lee

    2010-01-01

    A quiet methodological revolution, a modeling revolution, has occurred over the past several decades, almost without discussion. In contrast, the 20th century ended with contentious argument over the utility of null hypothesis significance testing (NHST). The NHST controversy may have been at least partially irrelevant, because in certain ways the modeling revolution obviated the NHST argument. I begin with a history of NHST and modeling and their relation to one another. Next, I define and illustrate principles involved in developing and evaluating mathematical models. Following, I discuss the difference between using statistical procedures within a rule-based framework and building mathematical models from a scientific epistemology. Only the former is treated carefully in most psychology graduate training. The pedagogical implications of this imbalance and the revised pedagogy required to account for the modeling revolution are described. To conclude, I discuss how attention to modeling implies shifting statistical practice in certain progressive ways. The epistemological basis of statistics has moved away from being a set of procedures, applied mechanistically, and moved toward building and evaluating statistical and scientific models. Copyright 2009 APA, all rights reserved.

  2. Linear mixed models a practical guide using statistical software

    CERN Document Server

    West, Brady T; Galecki, Andrzej T

    2014-01-01

    Highly recommended by JASA, Technometrics, and other journals, the first edition of this bestseller showed how to easily perform complex linear mixed model (LMM) analyses via a variety of software programs. Linear Mixed Models: A Practical Guide Using Statistical Software, Second Edition continues to lead readers step by step through the process of fitting LMMs. This second edition covers additional topics on the application of LMMs that are valuable for data analysts in all fields. It also updates the case studies using the latest versions of the software procedures and provides up-to-date information on the options and features of the software procedures available for fitting LMMs in SAS, SPSS, Stata, R/S-plus, and HLM. New to the Second Edition: a new chapter on models with crossed random effects that uses a case study to illustrate software procedures capable of fitting these models; power analysis methods for longitudinal and clustered study designs, including software options for power analyses and suggest...

  3. Stochastic Spatial Models in Ecology: A Statistical Physics Approach

    Science.gov (United States)

    Pigolotti, Simone; Cencini, Massimo; Molina, Daniel; Muñoz, Miguel A.

    2017-11-01

    Ecosystems display a complex spatial organization. Ecologists have long tried to characterize them by looking at how different measures of biodiversity change across spatial scales. Ecological neutral theory has provided simple predictions accounting for general empirical patterns in communities of competing species. However, while neutral theory in well-mixed ecosystems is mathematically well understood, spatial models still present several open problems, limiting the quantitative understanding of spatial biodiversity. In this review, we discuss the state of the art in spatial neutral theory. We emphasize the connection between spatial ecological models and the physics of non-equilibrium phase transitions and how concepts developed in statistical physics translate in population dynamics, and vice versa. We focus on non-trivial scaling laws arising at the critical dimension D = 2 of spatial neutral models, and their relevance for biological populations inhabiting two-dimensional environments. We conclude by discussing models incorporating non-neutral effects in the form of spatial and temporal disorder, and analyze how their predictions deviate from those of purely neutral theories.

  4. Percolation for a model of statistically inhomogeneous random media

    International Nuclear Information System (INIS)

    Quintanilla, J.; Torquato, S.

    1999-01-01

    We study clustering and percolation phenomena for a model of statistically inhomogeneous two-phase random media, including functionally graded materials. This model consists of inhomogeneous fully penetrable (Poisson distributed) disks and can be constructed for any specified variation of volume fraction. We quantify the transition zone in the model, defined by the frontier of the cluster of disks which are connected to the disk-covered portion of the model, by defining the coastline function and correlation functions for the coastline. We find that the behavior of these functions becomes largely independent of the specific choice of grade in volume fraction as the separation of length scales becomes large. We also show that the correlation function behaves in a manner similar to that of fractal Brownian motion. Finally, we study fractal characteristics of the frontier itself and compare to similar properties for two-dimensional percolation on a lattice. In particular, we show that the average location of the frontier appears to be related to the percolation threshold for homogeneous fully penetrable disks. copyright 1999 American Institute of Physics

  5. Glass viscosity calculation based on a global statistical modelling approach

    Energy Technology Data Exchange (ETDEWEB)

    Fluegel, Alex

    2007-02-01

    A global statistical glass viscosity model was developed for predicting the complete viscosity curve, based on more than 2200 composition-property data of silicate glasses from the scientific literature, including soda-lime-silica container and float glasses, TV panel glasses, borosilicate fiber wool and E type glasses, low expansion borosilicate glasses, glasses for nuclear waste vitrification, lead crystal glasses, binary alkali silicates, and various further compositions from over half a century. It is shown that within a measurement series from a specific laboratory the reported viscosity values are often over-estimated at higher temperatures due to alkali and boron oxide evaporation during the measurement and glass preparation, including data by Lakatos et al. (1972) and the recently published High temperature glass melt property database for process modeling by Seward et al. (2005). Similarly, in the glass transition range many experimental data of borosilicate glasses are reported too high due to phase separation effects. The developed global model corrects those errors. The model standard error was 9-17°C, with R^2 = 0.985-0.989. The prediction 95% confidence interval for glass in mass production largely depends on the glass composition of interest, the composition uncertainty, and the viscosity level. New insights in the mixed-alkali effect are provided.

  6. A Statistical Toolbox For Mining And Modeling Spatial Data

    Directory of Open Access Journals (Sweden)

    D’Aubigny Gérard

    2016-12-01

    Most data mining projects in spatial economics start with an evaluation of a set of attribute variables on a sample of spatial entities, looking for the existence and strength of spatial autocorrelation, based on Moran’s and Geary’s coefficients, the adequacy of which is rarely challenged, despite the fact that when reporting on their properties, many users seem likely to make mistakes and to foster confusion. My paper begins with a critical appraisal of the classical definition and rationale of these indices. I argue that while intuitively founded, they are plagued by an inconsistency in their conception. Then, I propose a principled small change leading to corrected spatial autocorrelation coefficients, which strongly simplifies their relationship, and opens the way to an augmented toolbox of statistical methods of dimension reduction and data visualization, also useful for modeling purposes. A second section presents a formal framework, adapted from recent work in statistical learning, which gives theoretical support to our definition of corrected spatial autocorrelation coefficients. More specifically, the multivariate data mining methods presented here are easily implementable with existing (free) software, yield methods useful to exploit the proposed corrections in spatial data analysis practice, and, from a mathematical point of view, have an asymptotic behavior, already studied in a series of papers by Belkin & Niyogi, which suggests that they possess qualities of robustness and a limited sensitivity to the Modifiable Areal Unit Problem (MAUP), valuable in exploratory spatial data analysis.
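    For reference, the classical Moran and Geary coefficients that this abstract critiques can be computed as in the sketch below. It implements only the textbook definitions, not the corrected coefficients the paper proposes; the sparse weight dictionary and the function names are assumptions made for the example.

```python
def morans_i(values, weights):
    """Classical Moran's I; weights maps (i, j) index pairs to w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(weights.values())
    cross = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    return (n / total_weight) * cross / sum(d * d for d in dev)


def gearys_c(values, weights):
    """Classical Geary's C; values below 1 indicate positive autocorrelation."""
    n = len(values)
    mean = sum(values) / n
    total_weight = sum(weights.values())
    sq_diff = sum(
        w * (values[i] - values[j]) ** 2 for (i, j), w in weights.items()
    )
    denom = sum((v - mean) ** 2 for v in values)
    return ((n - 1) / (2 * total_weight)) * sq_diff / denom


def chain_weights(n):
    """Binary contiguity weights for n units arranged on a line."""
    weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights
```

    On four units in a line, a smooth gradient such as 1, 2, 3, 4 gives a positive Moran's I, while an alternating pattern such as 1, 4, 1, 4 gives a negative one.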

  7. Statistical mechanics of learning orthogonal signals for general covariance models

    International Nuclear Information System (INIS)

    Hoyle, David C

    2010-01-01

    Statistical mechanics techniques have proved to be useful tools in quantifying the accuracy with which signal vectors are extracted from experimental data. However, analysis has previously been limited to specific model forms for the population covariance C, which may be inappropriate for real world data sets. In this paper we obtain new statistical mechanical results for a general population covariance matrix C. For data sets consisting of p sample points in R^N we use the replica method to study the accuracy of orthogonal signal vectors estimated from the sample data. In the asymptotic limit of N,p→∞ at fixed α = p/N, we derive analytical results for the signal direction learning curves. In the asymptotic limit the learning curves follow a single universal form, each displaying a retarded learning transition. An explicit formula for the location of the retarded learning transition is obtained and we find marked variation in the location of the retarded learning transition dependent on the distribution of population covariance eigenvalues. The results of the replica analysis are confirmed against simulation.

  8. Significance of matrix diagonalization in modelling inelastic electron scattering

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Z. [University of Ulm, Ulm 89081 (Germany); Hambach, R. [University of Ulm, Ulm 89081 (Germany); University of Jena, Jena 07743 (Germany); Kaiser, U.; Rose, H. [University of Ulm, Ulm 89081 (Germany)

    2017-04-15

    Electron scattering is routinely applied to investigate nanostructures. Nowadays the development of hardware offers more and more prospects for this technique. For example, imaging nanostructures with inelastically scattered electrons may make it possible to produce component-sensitive images with atomic resolution. Modelling inelastic electron scattering is therefore essential for interpreting these images. The main obstacle to studying the inelastic scattering problem is its complexity. During inelastic scattering, incident electrons entangle with objects, and the description of this process involves a multidimensional array. Since the simulation usually involves four-dimensional Fourier transforms, the computation is highly inefficient. In this work we offer one solution for handling the multidimensional problem. By transforming a high-dimensional array into a two-dimensional array, we are able to perform matrix diagonalization and approximate the original multidimensional array with its two-dimensional eigenvectors. Our procedure reduces the complicated multidimensional problem to a two-dimensional problem. In addition, it minimizes the number of two-dimensional problems. This method is very useful for studying multiple inelastic scattering. - Highlights: • 4D problems are involved in modelling inelastic electron scattering. • By means of matrix diagonalization, the 4D problems can be simplified as 2D problems. • The number of 2D problems is minimized by using this approach.

  9. Representation of the contextual statistical model by hyperbolic amplitudes

    International Nuclear Information System (INIS)

    Khrennikov, Andrei

    2005-01-01

    We continue the development of a so-called contextual statistical model (here context has the meaning of a complex of physical conditions). It is shown that, besides contexts producing the conventional trigonometric cos-interference, there exist contexts producing the hyperbolic cos-interference. Starting with the corresponding interference formula of total probability we represent such contexts by hyperbolic probabilistic amplitudes or in the abstract formalism by normalized vectors of a hyperbolic analogue of the Hilbert space. A hyperbolic Born's rule is obtained. Incompatible observables are represented by noncommutative operators. This paper can be considered as the first step towards hyperbolic quantum probability. We also discuss possibilities of experimental verification of hyperbolic quantum mechanics: in physics of elementary particles, string theory as well as in experiments with nonphysical systems, e.g., in psychology, cognitive sciences, and economy.

  10. α-ternary decay of Cf isotopes, statistical model

    International Nuclear Information System (INIS)

    Joseph, Jayesh George; Santhosh, K.P.

    2017-01-01

    The splitting of a heavy nucleus into three simultaneous fragments is termed ternary fission; compared to the usual binary fission, it is a rare process. Depending on the nature of the third particle, it is called light charged particle (LCP) accompanied fission if the third particle is light, or true ternary fission if all three fragments have nearly the same mass. After experimental observations in the early seventies, theoretical studies of ternary fission progressed slowly at first but have become a hot topic in nuclear decay studies, especially in the past decade. Meanwhile, various models have been developed and existing ones modified, in the hope of shedding a little more light on the profound nature of the nuclear interaction. In this study a statistical method, the level density formulation, has been employed.

  11. Smooth extrapolation of unknown anatomy via statistical shape models

    Science.gov (United States)

    Grupp, R. B.; Chiang, H.; Otake, Y.; Murphy, R. J.; Gordon, C. R.; Armand, M.; Taylor, R. H.

    2015-03-01

    Several methods to perform extrapolation of unknown anatomy were evaluated. The primary application is to enhance surgical procedures that may use partial medical images or medical images of incomplete anatomy. Le Fort-based, face-jaw-teeth transplant is one such procedure. From CT data of 36 skulls and 21 mandibles separate Statistical Shape Models of the anatomical surfaces were created. Using the Statistical Shape Models, incomplete surfaces were projected to obtain complete surface estimates. The surface estimates exhibit non-zero error in regions where the true surface is known; it is desirable to keep the true surface and seamlessly merge the estimated unknown surface. Existing extrapolation techniques produce non-smooth transitions from the true surface to the estimated surface, resulting in additional error and a less aesthetically pleasing result. The three extrapolation techniques evaluated were: copying and pasting of the surface estimate (non-smooth baseline), a feathering between the patient surface and surface estimate, and an estimate generated via a Thin Plate Spline trained from displacements between the surface estimate and corresponding vertices of the known patient surface. Feathering and Thin Plate Spline approaches both yielded smooth transitions. However, feathering corrupted known vertex values. Leave-one-out analyses were conducted, with 5% to 50% of known anatomy removed from the left-out patient and estimated via the proposed approaches. The Thin Plate Spline approach yielded smaller errors than the other two approaches, with an average vertex error improvement of 1.46 mm and 1.38 mm for the skull and mandible respectively, over the baseline approach.

  12. Statistical shape modeling based renal volume measurement using tracked ultrasound

    Science.gov (United States)

    Pai Raikar, Vipul; Kwartowitz, David M.

    2017-03-01

    Autosomal dominant polycystic kidney disease (ADPKD) is the fourth most common cause of kidney transplant worldwide, accounting for 7-10% of all cases. Although ADPKD usually progresses over many decades, accurate risk prediction is an important task. Identifying patients with progressive disease is vital to providing new treatments being developed and enabling them to enter clinical trials for new therapy. Among other factors, total kidney volume (TKV) is a major biomarker predicting the progression of ADPKD. The Consortium for Radiologic Imaging Studies in Polycystic Kidney Disease (CRISP) has shown that TKV is an early and accurate measure of cystic burden and likely growth rate. It is strongly associated with loss of renal function. While ultrasound (US) has proven to be an excellent tool for diagnosing the disease, monitoring short-term changes using ultrasound has been shown not to be accurate. This is attributed to high operator variability and poor reproducibility compared to tomographic modalities such as CT and MR (the gold standard). Ultrasound has emerged as one of the standout modalities for intra-procedural imaging, and methods for spatial localization make it possible to track 2D ultrasound in the physical space in which it is being used. In addition to this, the vast amount of recorded tomographic data can be used to generate statistical shape models that allow us to extract clinical value from archived image sets. In this work, we aim at improving the prognostic value of US in managing ADPKD by assessing the accuracy of using statistical shape model augmented US data to predict TKV, with the end goal of monitoring short-term changes.

  13. Field significance of performance measures in the context of regional climate model evaluation. Part 2: precipitation

    Science.gov (United States)

    Ivanov, Martin; Warrach-Sagi, Kirsten; Wulfmeyer, Volker

    2018-04-01

    A new approach for rigorous spatial analysis of the downscaling performance of regional climate model (RCM) simulations is introduced. It is based on a multiple comparison of the local tests at the grid cells and is also known as 'field' or 'global' significance. The block length for the local resampling tests is precisely determined to adequately account for the time series structure. New performance measures for estimating the added value of downscaled data relative to the large-scale forcing fields are developed. The methodology is exemplarily applied to a standard EURO-CORDEX hindcast simulation with the Weather Research and Forecasting (WRF) model coupled with the land surface model NOAH at 0.11° grid resolution. Daily precipitation climatology for the 1990-2009 period is analysed for Germany for winter and summer in comparison with high-resolution gridded observations from the German Weather Service. The field significance test controls the proportion of falsely rejected local tests in a meaningful way and is robust to spatial dependence. Hence, the spatial patterns of the statistically significant local tests are also meaningful. We interpret them from a process-oriented perspective. While the downscaled precipitation distributions are statistically indistinguishable from the observed ones in most regions in summer, the biases of some distribution characteristics are significant over large areas in winter. WRF-NOAH generates appropriate stationary fine-scale climate features in the daily precipitation field over regions of complex topography in both seasons and appropriate transient fine-scale features almost everywhere in summer. As the added value of global climate model (GCM)-driven simulations cannot be smaller than this perfect-boundary estimate, this work demonstrates in a rigorous manner the clear additional value of dynamical downscaling over global climate simulations. The evaluation methodology has a broad spectrum of applicability as it is
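    The abstract does not name the multiple-comparison procedure behind its field significance test. One widely used way to control the proportion of falsely rejected local tests is the Benjamini-Hochberg false discovery rate procedure, sketched below on the assumption that a p-value is already available for every grid cell. The function name and the default level are illustrative, and this may differ from the authors' exact method.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the local test is rejected.

    Sorts the p-values, finds the largest rank k with
    p_(k) <= alpha * k / m, and rejects the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            largest_rank = rank
    rejected = [False] * m
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected
```

    The returned mask has the same order as the input, so it can be reshaped onto the model grid to map where local differences remain significant after the correction.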

  14. Critical, statistical, and thermodynamical properties of lattice models

    Energy Technology Data Exchange (ETDEWEB)

    Varma, Vipin Kerala

    2013-10-15

    In this thesis we investigate zero temperature and low temperature properties - critical, statistical and thermodynamical - of lattice models in the contexts of bosonic cold atom systems, magnetic materials, and non-interacting particles on various lattice geometries. We study quantum phase transitions in the Bose-Hubbard model with higher body interactions, as relevant for optical lattice experiments of strongly interacting bosons, in one and two dimensions; the universality of the Mott insulator to superfluid transition is found to remain unchanged for even large three body interaction strengths. A systematic renormalization procedure is formulated to fully re-sum these higher (three and four) body interactions into the two body terms. In the strongly repulsive limit, we analyse the zero and low temperature physics of interacting hard-core bosons on the kagome lattice at various fillings. Evidence for a disordered phase in the Ising limit of the model is presented; in the strong coupling limit, the transition between the valence bond solid and the superfluid is argued to be first order at the tip of the solid lobe.

  15. The statistical multifragmentation model: Origins and recent advances

    International Nuclear Information System (INIS)

    Donangelo, R.; Souza, S. R.

    2016-01-01

    We review the Statistical Multifragmentation Model (SMM) which considers a generalization of the liquid-drop model for hot nuclei and allows one to calculate thermodynamic quantities characterizing the nuclear ensemble at the disassembly stage. We show how to determine probabilities of definite partitions of finite nuclei and how to determine, through Monte Carlo calculations, observables such as the caloric curve, multiplicity distributions, heat capacity, among others. Some experimental measurements of the caloric curve confirmed the SMM predictions of over 10 years before, leading to a surge in the interest in the model. However, the experimental determination of the fragmentation temperatures relies on the yields of different isotopic species, which were not correctly calculated in the schematic, liquid-drop picture, employed in the SMM. This led to a series of improvements in the SMM, in particular to the more careful choice of nuclear masses and energy densities, especially for the lighter nuclei. With these improvements the SMM is able to make quantitative determinations of isotope production. We show the application of SMM to the production of exotic nuclei through multifragmentation. These preliminary calculations demonstrate the need for a careful choice of the system size and excitation energy to attain maximum yields.

  16. Critical, statistical, and thermodynamical properties of lattice models

    International Nuclear Information System (INIS)

    Varma, Vipin Kerala

    2013-10-01

    In this thesis we investigate zero temperature and low temperature properties - critical, statistical and thermodynamical - of lattice models in the contexts of bosonic cold atom systems, magnetic materials, and non-interacting particles on various lattice geometries. We study quantum phase transitions in the Bose-Hubbard model with higher body interactions, as relevant for optical lattice experiments of strongly interacting bosons, in one and two dimensions; the universality of the Mott insulator to superfluid transition is found to remain unchanged for even large three body interaction strengths. A systematic renormalization procedure is formulated to fully re-sum these higher (three and four) body interactions into the two body terms. In the strongly repulsive limit, we analyse the zero and low temperature physics of interacting hard-core bosons on the kagome lattice at various fillings. Evidence for a disordered phase in the Ising limit of the model is presented; in the strong coupling limit, the transition between the valence bond solid and the superfluid is argued to be first order at the tip of the solid lobe.

  17. Constraining statistical-model parameters using fusion and spallation reactions

    Directory of Open Access Journals (Sweden)

    Charity Robert J.

    2011-10-01

    The de-excitation of compound nuclei has been successfully described for several decades by means of statistical models. However, such models involve a large number of free parameters and ingredients that are often underconstrained by experimental data. We show how the degeneracy of the model ingredients can be partially lifted by studying different entrance channels for de-excitation, which populate different regions of the parameter space of the compound nucleus. Fusion reactions, in particular, play an important role in this strategy because they fix three out of four of the compound-nucleus parameters (mass, charge and total excitation energy). The present work focuses on fission and intermediate-mass-fragment emission cross sections. We prove how equivalent parameter sets for fusion-fission reactions can be resolved using another entrance channel, namely spallation reactions. Intermediate-mass-fragment emission can be constrained in a similar way. An interpretation of the best-fit IMF barriers in terms of the Wigner energies of the nascent fragments is discussed.

  18. Optimizing DNA assembly based on statistical language modelling.

    Science.gov (United States)

    Fang, Gang; Zhang, Shemin; Dong, Yafei

    2017-12-15

    By successively assembling genetic parts such as BioBrick according to grammatical models, complex genetic constructs composed of dozens of functional blocks can be built. However, usually every category of genetic parts includes a few or many parts. With increasing quantity of genetic parts, the process of assembling more than a few sets of these parts can be expensive, time consuming and error prone. At the last step of assembling it is somewhat difficult to decide which part should be selected. Based on statistical language model, which is a probability distribution P(s) over strings S that attempts to reflect how frequently a string S occurs as a sentence, the most commonly used parts will be selected. Then, a dynamic programming algorithm was designed to figure out the solution of maximum probability. The algorithm optimizes the results of a genetic design based on a grammatical model and finds an optimal solution. In this way, redundant operations can be reduced and the time and cost required for conducting biological experiments can be minimized. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
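
    The dynamic-programming step described above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes a bigram language model over parts, and the names `best_assembly`, `start_prob`, `bigram_prob` and the smoothing floor are hypothetical.

```python
import math

def best_assembly(categories, start_prob, bigram_prob):
    """Pick one part per category, in order, maximizing the chain probability.

    categories : list of lists of part names (one list per slot, in order)
    start_prob : dict part -> P(part) for the first slot
    bigram_prob: dict (prev_part, part) -> P(part | prev_part)
    """
    floor = 1e-12  # hypothetical smoothing value for unseen events
    # best[part] = (log-probability of the best path ending in part, that path)
    best = {p: (math.log(start_prob.get(p, floor)), [p]) for p in categories[0]}
    for slot in categories[1:]:
        nxt = {}
        for part in slot:
            # Extend every stored path by `part` and keep the most probable one.
            nxt[part] = max(
                (score + math.log(bigram_prob.get((prev, part), floor)), path + [part])
                for prev, (score, path) in best.items()
            )
        best = nxt
    score, path = max(best.values())
    return path, math.exp(score)
```

    Working in log-probabilities avoids underflow for long constructs, and keeping only the best path per final part makes the cost grow linearly with the number of slots rather than exponentially.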

  19. Terminal-Dependent Statistical Inference for the FBSDEs Models

    Directory of Open Access Journals (Sweden)

    Yunquan Song

    2014-01-01

    The original stochastic differential equations (OSDEs) and forward-backward stochastic differential equations (FBSDEs) are often used to model complex dynamic processes that arise in financial, ecological, and many other areas. The main difference between OSDEs and FBSDEs is that the latter is designed to depend on a terminal condition, which is a key factor in some financial and ecological circumstances. It is interesting but challenging to estimate FBSDE parameters from noisy data and the terminal condition. However, to the best of our knowledge, the terminal-dependent statistical inference for such a model has not been explored in the existing literature. We proposed a nonparametric terminal control variables estimation method to address this problem. The reason why we use the terminal control variables is that the newly proposed inference procedures inherit the terminal-dependent characteristic. Through this new proposed method, the estimators of the functional coefficients of the FBSDEs model are obtained. The asymptotic properties of the estimators are also discussed. Simulation studies show that the proposed method gives satisfying estimates for the FBSDE parameters from noisy data and the terminal condition. A simulation is performed to test the feasibility of our method.

  20. Statistical inference for imperfect maintenance models with missing data

    International Nuclear Information System (INIS)

    Dijoux, Yann; Fouladirad, Mitra; Nguyen, Dinh Tuan

    2016-01-01

    The paper considers complex industrial systems with incomplete maintenance history. A corrective maintenance is performed after the occurrence of a failure and its efficiency is assumed to be imperfect. In maintenance analysis, the databases are not necessarily complete. Specifically, the observations are assumed to be window-censored. This situation arises relatively frequently after the purchase of a second-hand unit or in the absence of maintenance record during the burn-in phase. The joint assessment of the wear-out of the system and the maintenance efficiency is investigated under missing data. A review along with extensions of statistical inference procedures from an observation window are proposed in the case of perfect and minimal repair using the renewal and Poisson theories, respectively. Virtual age models are employed to model imperfect repair. In this framework, new estimation procedures are developed. In particular, maximum likelihood estimation methods are derived for the most classical virtual age models. The benefits of the new estimation procedures are highlighted by numerical simulations and an application to a real data set. - Highlights: • New estimation procedures for window-censored observations and imperfect repair. • Extensions of inference methods for perfect and minimal repair with missing data. • Overview of maximum likelihood method with complete and incomplete observations. • Benefits of the new procedures highlighted by simulation studies and real application.

  1. The statistical multifragmentation model: Origins and recent advances

    Energy Technology Data Exchange (ETDEWEB)

    Donangelo, R., E-mail: donangel@fing.edu.uy [Instituto de Física, Facultad de Ingeniería, Universidad de la República, Julio Herrera y Reissig 565, 11300, Montevideo (Uruguay); Instituto de Física, Universidade Federal do Rio de Janeiro, C.P. 68528, 21941-972 Rio de Janeiro - RJ (Brazil); Souza, S. R., E-mail: srsouza@if.ufrj.br [Instituto de Física, Universidade Federal do Rio de Janeiro, C.P. 68528, 21941-972 Rio de Janeiro - RJ (Brazil); Instituto de Física, Universidade Federal do Rio Grande do Sul, C.P. 15051, 91501-970 Porto Alegre - RS (Brazil)

    2016-07-07

    We review the Statistical Multifragmentation Model (SMM) which considers a generalization of the liquid-drop model for hot nuclei and allows one to calculate thermodynamic quantities characterizing the nuclear ensemble at the disassembly stage. We show how to determine probabilities of definite partitions of finite nuclei and how to determine, through Monte Carlo calculations, observables such as the caloric curve, multiplicity distributions, heat capacity, among others. Some experimental measurements of the caloric curve confirmed the SMM predictions of over 10 years before, leading to a surge in the interest in the model. However, the experimental determination of the fragmentation temperatures relies on the yields of different isotopic species, which were not correctly calculated in the schematic, liquid-drop picture, employed in the SMM. This led to a series of improvements in the SMM, in particular to the more careful choice of nuclear masses and energy densities, specially for the lighter nuclei. With these improvements the SMM is able to make quantitative determinations of isotope production. We show the application of SMM to the production of exotic nuclei through multifragmentation. These preliminary calculations demonstrate the need for a careful choice of the system size and excitation energy to attain maximum yields.

  2. Statistical versus Musical Significance: Commentary on Leigh VanHandel's 'National Metrical Types in Nineteenth Century Art Song'

    Directory of Open Access Journals (Sweden)

    Justin London

    2010-01-01

    In “National Metrical Types in Nineteenth Century Art Song” Leigh Van Handel gives a sympathetic critique of William Rothstein’s claim that in western classical music of the late 18th and 19th centuries there are discernable differences in the phrasing and metrical practice of German versus French and Italian composers. This commentary (a) examines just what Rothstein means in terms of his proposed metrical typology, (b) questions Van Handel on how she has applied it to a purely melodic framework, (c) amplifies Van Handel’s critique of Rothstein, and then (d) concludes with a rumination on the reach of quantitative (i.e., statistically-driven) versus qualitative claims regarding such things as “national metrical types.”

  3. Statistical Clustering and Compositional Modeling of Iapetus VIMS Spectral Data

    Science.gov (United States)

    Pinilla-Alonso, N.; Roush, T. L.; Marzo, G.; Dalle Ore, C. M.; Cruikshank, D. P.

    2009-12-01

    It has long been known that the surfaces of Saturn's major satellites are predominantly icy objects [e.g. 1 and references therein]. Since 2004, these bodies have been the subject of observations by the Cassini-VIMS (Visual and Infrared Mapping Spectrometer) experiment [2]. Iapetus has the unique property that the hemisphere centered on the apex of its locked synchronous orbital motion around Saturn has a very low geometrical albedo of 2-6%, while the opposite hemisphere is about 10 times more reflective. The nature and origin of the dark material of Iapetus has remained a question since its discovery [3 and references therein]. The nature of this material, and how it is distributed on the surface of this body, can shed new light on the Saturnian system. We apply statistical clustering [4] and theoretical modeling [5,6] to address the surface composition of Iapetus. The VIMS data evaluated were obtained during the second flyby of Iapetus, in September 2007. This close approach allowed VIMS to obtain spectra at relatively high spatial resolution, ~1-22 km/pixel. The data we study sampled the trailing hemisphere and part of the dark leading one. The statistical clustering [4] is used to identify statistically distinct spectra on Iapetus. The composition of these distinct spectra is evaluated using theoretical models [5,6]. We thank Allan Meyer for his help. This research was supported by an appointment to the NASA Postdoctoral Program at the Ames Research Center, administered by Oak Ridge Associated Universities through a contract with NASA. [1] Coradini, A. et al. 2009, Earth, Moon & Planets, 105, 289-310. [2] Brown et al. 2004, Space Science Reviews, 115, 111-168. [3] Cruikshank, D. et al. 2008, Icarus, 193, 334-343. [4] Marzo, G. et al. 2008, Journal of Geophysical Research, 113, E12, CiteID E12009. [5] Hapke, B. 1993, Theory of reflectance and emittance spectroscopy, Cambridge University Press. [6] Shkuratov, Y. et al. 1999, Icarus, 137, 235-246.

  4. A scan statistic for continuous data based on the normal probability model

    Directory of Open Access Journals (Sweden)

    Huang Lan

    2009-10-01

    Temporal, spatial and space-time scan statistics are commonly used to detect and evaluate the statistical significance of temporal and/or geographical disease clusters, without any prior assumptions on the location, time period or size of those clusters. Scan statistics are mostly used for count data, such as disease incidence or mortality. Sometimes there is an interest in looking for clusters with respect to a continuous variable, such as lead levels in children or low birth weight. For such continuous data, we present a scan statistic where the likelihood is calculated using the normal probability model. It may also be used for other distributions, while still maintaining the correct alpha level. In an application of the new method, we look for geographical clusters of low birth weight in New York City.
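
    As a rough illustration of the idea, the sketch below scans a one-dimensional series of continuous values. It is not the authors' implementation: it assumes a common-variance normal model in which only the mean may differ inside the window, contiguous windows in place of spatial circles, and permutation-based Monte Carlo significance. The function names and the `max_width`, `n_perm` and `seed` parameters are hypothetical.

```python
import math
import random

def normal_llr(values, start, end):
    """Log-likelihood ratio for window values[start:end] versus the rest,
    under a normal model with one common variance (needs non-constant data)."""
    n = len(values)
    inside, outside = values[start:end], values[:start] + values[end:]
    mean_all = sum(values) / n
    var_null = sum((x - mean_all) ** 2 for x in values) / n
    m_in = sum(inside) / len(inside)
    m_out = sum(outside) / len(outside)
    var_alt = (sum((x - m_in) ** 2 for x in inside)
               + sum((x - m_out) ** 2 for x in outside)) / n
    return 0.5 * n * math.log(var_null / var_alt)

def scan(values, max_width):
    """Return (best LLR, start, end) over all contiguous windows."""
    n = len(values)
    best = (-1.0, 0, 0)
    for start in range(n):
        for end in range(start + 1, min(start + max_width, n) + 1):
            if end - start < n:  # the outside group must not be empty
                best = max(best, (normal_llr(values, start, end), start, end))
    return best

def p_value(values, max_width, n_perm=199, seed=0):
    """Monte Carlo p-value: shuffle the values and re-run the whole scan."""
    rng = random.Random(seed)
    observed = scan(values, max_width)[0]
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if scan(shuffled, max_width)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

    Because every permutation repeats the full scan, the p-value accounts for the multiple windows tested, so the significance level does not depend on the data actually being normal.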

  5. Statistical model of planning technological indicators for oil extraction

    Energy Technology Data Exchange (ETDEWEB)

    Galeyev, R G; Lavushchenko, V P; Sheshnev, A S

    1979-01-01

    The efficiency of the process of oil extraction is determined by the effect of a number of interrelated technological indicators. Analytical expression of the interrelationships of the indicators was represented by an econometric model consisting of a system of linear regression equations. The basic advantage of these models is the possibility of calculating in them different, significantly important interrelationships. This makes it possible to correlate all calculations into a single logically noncontradictory balanced system. The developed model of the technological process of oil extraction makes it possible to significantly facilitate calculation and planning of its basic indicators with regard for system and balance requirements, makes it possible to purposefully generate new variants. In this case because of the optimal distribution of the volumes of geological-technical measures, a decrease in the total outlays for their implementation is achieved. Thus for the Berezovskiy field, this saving was R 150,000.

  6. Towards a Statistical Model of Tropical Cyclone Genesis

    Science.gov (United States)

    Fernandez, A.; Kashinath, K.; McAuliffe, J.; Prabhat, M.; Stark, P. B.; Wehner, M. F.

    2017-12-01

    Tropical Cyclones (TCs) are important extreme weather phenomena that have a strong impact on humans. TC forecasts are largely based on global numerical models that produce TC-like features. Aspects of Tropical Cyclones such as their formation/genesis, evolution, intensification and dissipation over land are important and challenging problems in climate science. This study investigates the environmental conditions associated with Tropical Cyclone Genesis (TCG) by testing how accurately a statistical model can predict TCG in the CAM5.1 climate model. TCG events are defined using TECA software (Prabhat et al., 2015, "TECA: Petascale Pattern Recognition for Climate Science", in Computer Analysis of Images and Patterns, pp. 426-436, Springer) to extract TC trajectories from CAM5.1. L1-regularized logistic regression (L1LR) is applied to the CAM5.1 output. The predictions have nearly perfect accuracy for data not associated with TC tracks and high accuracy differentiating between high vorticity and low vorticity systems. The model's active variables largely correspond to current hypotheses about important factors for TCG, such as wind field patterns and local pressure minima, and suggest new routes for investigation. Furthermore, our model's predictions of TC activity are competitive with the output of an instantaneous version of Emanuel and Nolan's Genesis Potential Index (GPI) (Emanuel and Nolan, 2004, "Tropical cyclone activity and the global climate system", in 26th Conference on Hurricanes and Tropical Meteorology, pp. 240-241).

  7. Editorial to: Six papers on Dynamic Statistical Models

    DEFF Research Database (Denmark)

    2014-01-01

    The following six papers are based on invited lectures at the satellite meeting held at the University of Copenhagen before the 58th World Statistics Congress of the International Statistical Institute in Dublin in 2011. At the invitation of the Bernoulli Society, the satellite meeting was organized around the theme “Dynamic Statistical Models” as a part of the Program of Excellence at the University of Copenhagen on “Statistical methods for complex and high dimensional models” (http://statistics.ku.dk/). The Excellence Program in Statistics was a research project to develop and investigate statistical methodology and theory for large and complex data sets that included biostatisticians and mathematical statisticians from three faculties at the University of Copenhagen. The satellite meeting took place August 17–19, 2011. Its purpose was to bring together researchers in statistics and related...

  8. Statistical emulation of a tsunami model for sensitivity analysis and uncertainty quantification

    Directory of Open Access Journals (Sweden)

    A. Sarri

    2012-06-01

    Due to the catastrophic consequences of tsunamis, early warnings need to be issued quickly in order to mitigate the hazard. Additionally, there is a need to represent the uncertainty in the predictions of tsunami characteristics corresponding to the uncertain trigger features (e.g. either position, shape and speed of a landslide, or sea floor deformation associated with an earthquake). Unfortunately, computer models are expensive to run. This leads to significant delays in predictions and makes the uncertainty quantification impractical. Statistical emulators run almost instantaneously and may represent well the outputs of the computer model. In this paper, we use the outer product emulator to build a fast statistical surrogate of a landslide-generated tsunami computer model. This Bayesian framework enables us to build the emulator by combining prior knowledge of the computer model properties with a few carefully chosen model evaluations. The good performance of the emulator is validated using the leave-one-out method.

  9. Statistical and molecular analyses of evolutionary significance of red-green color vision and color blindness in vertebrates.

    Science.gov (United States)

    Yokoyama, Shozo; Takenaka, Naomi

    2005-04-01

    Red-green color vision is strongly suspected to enhance the survival of its possessors. Despite being red-green color blind, however, many species have successfully competed in nature, which brings into question the evolutionary advantage of achieving red-green color vision. Here, we propose a new method of identifying positive selection at individual amino acid sites with the premise that if positive Darwinian selection has driven the evolution of the protein under consideration, then it should be found mostly at the branches in the phylogenetic tree where its function had changed. The statistical and molecular methods have been applied to 29 visual pigments with the wavelengths of maximal absorption at approximately 510-540 nm (green- or middle wavelength-sensitive [MWS] pigments) and at approximately 560 nm (red- or long wavelength-sensitive [LWS] pigments), which are sampled from a diverse range of vertebrate species. The results show that the MWS pigments are positively selected through amino acid replacements S180A, Y277F, and T285A and that the LWS pigments have been subjected to strong evolutionary conservation. The fact that these positively selected M/LWS pigments are found not only in animals with red-green color vision but also in those with red-green color blindness strongly suggests that both red-green color vision and color blindness have undergone adaptive evolution independently in different species.

  10. Patch-based generative shape model and MDL model selection for statistical analysis of archipelagos

    DEFF Research Database (Denmark)

    Ganz, Melanie; Nielsen, Mads; Brandt, Sami

    2010-01-01

    We propose a statistical generative shape model for archipelago-like structures. These kinds of structures occur, for instance, in medical images, where our intention is to model the appearance and shapes of calcifications in x-ray radiographs. The generative model is constructed by (1) learning ...

  11. Efficient pan-European river flood hazard modelling through a combination of statistical and physical models

    NARCIS (Netherlands)

    Paprotny, D.; Morales Napoles, O.; Jonkman, S.N.

    2017-01-01

    Flood hazard is currently being researched on continental and global scales, using models of increasing complexity. In this paper we investigate a different, simplified approach, which combines statistical and physical models in place of conventional rainfall-run-off models to carry out flood

  12. Detailed modeling of the statistical uncertainty of Thomson scattering measurements

    International Nuclear Information System (INIS)

    Morton, L A; Parke, E; Hartog, D J Den

    2013-01-01

    The uncertainty of electron density and temperature fluctuation measurements is determined by statistical uncertainty introduced by multiple noise sources. In order to quantify these uncertainties precisely, a simple but comprehensive model was made of the noise sources in the MST Thomson scattering system and of the resulting variance in the integrated scattered signals. The model agrees well with experimental and simulated results. The signal uncertainties are then used by our existing Bayesian analysis routine to find the most likely electron temperature and density, with confidence intervals. In the model, photonic noise from scattered light and plasma background light is multiplied by the noise enhancement factor (F) of the avalanche photodiode (APD). Electronic noise from the amplifier and digitizer is added. The amplifier response function shapes the signal and induces correlation in the noise. The data analysis routine fits a characteristic pulse to the digitized signals from the amplifier, giving the integrated scattered signals. A finite digitization rate loses information and can cause numerical integration error. We find a formula for the variance of the scattered signals in terms of the background and pulse amplitudes, and three calibration constants. The constants are measured easily under operating conditions, resulting in accurate estimation of the scattered signals' uncertainty. We measure F ≈ 3 for our APDs, in agreement with other measurements for similar APDs. This value is wavelength-independent, simplifying analysis. The correlated noise we observe is reproduced well using a Gaussian response function. Numerical integration error can be made negligible by using an interpolated characteristic pulse, allowing digitization rates as low as the detector bandwidth. The effect of background noise is also determined

  13. Local yield stress statistics in model amorphous solids

    Science.gov (United States)

    Barbot, Armand; Lerbinger, Matthias; Hernandez-Garcia, Anier; García-García, Reinaldo; Falk, Michael L.; Vandembroucq, Damien; Patinet, Sylvain

    2018-03-01

    We develop and extend a method presented by Patinet, Vandembroucq, and Falk [Phys. Rev. Lett. 117, 045501 (2016), 10.1103/PhysRevLett.117.045501] to compute the local yield stresses at the atomic scale in model two-dimensional Lennard-Jones glasses produced via differing quench protocols. This technique allows us to sample the plastic rearrangements in a nonperturbative manner for different loading directions on a well-controlled length scale. Plastic activity upon shearing correlates strongly with the locations of low yield stresses in the quenched states. This correlation is higher in more structurally relaxed systems. The distribution of local yield stresses is also shown to strongly depend on the quench protocol: the more relaxed the glass, the higher the local plastic thresholds. Analysis of the magnitude of local plastic relaxations reveals that stress drops follow exponential distributions, justifying the hypothesis of an average characteristic amplitude often conjectured in mesoscopic or continuum models. The amplitude of the local plastic rearrangements increases on average with the yield stress, regardless of the system preparation. The local yield stress varies with the shear orientation tested and strongly correlates with the plastic rearrangement locations when the system is sheared correspondingly. It is thus argued that plastic rearrangements are the consequence of shear transformation zones encoded in the glass structure that possess weak slip planes along different orientations. Finally, we justify the length scale employed in this work and extract the yield threshold statistics as a function of the size of the probing zones. This method makes it possible to derive physically grounded models of plasticity for amorphous materials by directly revealing the relevant details of the shear transformation zones that mediate this process.

  14. Multivariate statistical models for disruption prediction at ASDEX Upgrade

    International Nuclear Information System (INIS)

    Aledda, R.; Cannas, B.; Fanni, A.; Sias, G.; Pautasso, G.

    2013-01-01

    In this paper, a disruption prediction system for ASDEX Upgrade has been proposed that does not require disruption terminated experiments to be implemented. The system consists of a data-based model, which is built using only a few input signals coming from successfully terminated pulses. A fault detection and isolation approach has been used, where the prediction is based on the analysis of the residuals of an auto regressive exogenous input model. The prediction performance of the proposed system is encouraging when it is applied to the same set of campaigns used to implement the model. However, the false alarms increased significantly when we tested the system on discharges coming from experimental campaigns temporally far from those used to train the model. This is due to the well-known aging effect inherent in data-based models. The main advantage of the proposed method, with respect to other data-based approaches in the literature, is that it does not need data on experiments terminated with a disruption, as it uses a normal operating conditions model. This is a big advantage from the perspective of a prediction system for ITER, where only a limited number of disruptions can be allowed.

  15. Statistical power analysis a simple and general model for traditional and modern hypothesis tests

    CERN Document Server

    Murphy, Kevin R; Wolach, Allen

    2014-01-01

    Noted for its accessible approach, this text applies the latest approaches of power analysis to both null hypothesis and minimum-effect testing using the same basic unified model. Through the use of a few simple procedures and examples, the authors show readers with little expertise in statistical analysis how to obtain the values needed to carry out the power analysis for their research. Illustrations of how these analyses work and how they can be used to choose the appropriate criterion for defining statistically significant outcomes are sprinkled throughout. The book presents a simple and g

  16. Statistical modeling for visualization evaluation through data fusion.

    Science.gov (United States)

    Chen, Xiaoyu; Jin, Ran

    2017-11-01

    There is a high demand for data visualization providing insights to users in various applications. However, a consistent, online visualization evaluation method to quantify mental workload or user preference is lacking, which leads to an inefficient visualization and user interface design process. Recently, the advancement of interactive and sensing technologies has made electroencephalogram (EEG) signals, eye movements and visualization logs available in user-centered evaluation. This paper proposes a data fusion model and the application procedure for quantitative and online visualization evaluation. Fifteen participants joined the study based on three different visualization designs. The results provide a regularized regression model which can accurately predict the user's evaluation of task complexity, and indicate the significance of all three types of sensing data sets for visualization evaluation. This model can be widely applied to data visualization evaluation, and to other user-centered design evaluation and data analysis in human factors and ergonomics. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Comparison of Artificial Neural Networks and ARIMA statistical models in simulations of target wind time series

    Science.gov (United States)

    Kolokythas, Kostantinos; Vasileios, Salamalikis; Athanassios, Argiriou; Kazantzidis, Andreas

    2015-04-01

    The wind is a result of complex interactions of numerous mechanisms taking place on small or large scales, so better knowledge of its behavior is essential in a variety of applications, especially in the field of power production from wind turbines. In the literature there is a considerable number of models, either physical or statistical, dealing with the problem of simulation and prediction of wind speed. Among others, Artificial Neural Networks (ANNs) are widely used for the purpose of wind forecasting and, in the great majority of cases, outperform other conventional statistical models. In this study, a number of ANNs with different architectures, which have been created and applied to a dataset of wind time series, are compared to Auto Regressive Integrated Moving Average (ARIMA) statistical models. The data consist of mean hourly wind speeds coming from a wind farm in a hilly Greek region and cover a period of one year (2013). The main goal is to evaluate the models' ability to successfully simulate the wind speed at a significant point (target). Goodness-of-fit statistics are computed for the comparison of the different methods. In general, the ANN showed the best performance in the estimation of wind speed, prevailing over the ARIMA models.

  18. Using continuous time stochastic modelling and nonparametric statistics to improve the quality of first principles models

    DEFF Research Database (Denmark)

    A methodology is presented that combines modelling based on first principles and data based modelling into a modelling cycle that facilitates fast decision-making based on statistical methods. A strong feature of this methodology is that given a first principles model along with process data......, the corresponding modelling cycle model of the given system for a given purpose. A computer-aided tool, which integrates the elements of the modelling cycle, is also presented, and an example is given of modelling a fed-batch bioreactor....

  19. Numerical and Qualitative Contrasts of Two Statistical Models for Water Quality Change in Tidal Waters

    Science.gov (United States)

    Two statistical approaches, weighted regression on time, discharge, and season and generalized additive models, have recently been used to evaluate water quality trends in estuaries. Both models have been used in similar contexts despite differences in statistical foundations and...

  20. Statistical modelling of monthly mean sea level at coastal tide gauge stations along the Indian subcontinent

    Digital Repository Service at National Institute of Oceanography (India)

    Srinivas, K.; Das, V.K.; DineshKumar, P.K.

    This study investigates the suitability of statistical models for their predictive potential for the monthly mean sea level at different stations along the west and east coasts of the Indian subcontinent. Statistical modelling of the monthly mean...

  1. Statistical model for prediction of hearing loss in patients receiving cisplatin chemotherapy.

    Science.gov (United States)

    Johnson, Andrew; Tarima, Sergey; Wong, Stuart; Friedland, David R; Runge, Christina L

    2013-03-01

    This statistical model might be used to predict cisplatin-induced hearing loss, particularly in patients undergoing concomitant radiotherapy. To create a statistical model based on pretreatment hearing thresholds to provide an individual probability for hearing loss from cisplatin therapy and, secondarily, to investigate the use of hearing classification schemes as predictive tools for hearing loss. Retrospective case-control study. Tertiary care medical center. A total of 112 subjects receiving chemotherapy and audiometric evaluation were evaluated for the study. Of these subjects, 31 met inclusion criteria for analysis. The primary outcome measurement was a statistical model providing the probability of hearing loss following the use of cisplatin chemotherapy. Fifteen of the 31 subjects had significant hearing loss following cisplatin chemotherapy. American Academy of Otolaryngology-Head and Neck Society and Gardner-Robertson hearing classification schemes revealed little change in hearing grades between pretreatment and posttreatment evaluations for subjects with or without hearing loss. The Chang hearing classification scheme could effectively be used as a predictive tool in determining hearing loss with a sensitivity of 73.33%. Pretreatment hearing thresholds were used to generate a statistical model, based on quadratic approximation, to predict hearing loss (C statistic = 0.842, cross-validated = 0.835). The validity of the model improved when only subjects who received concurrent head and neck irradiation were included in the analysis (C statistic = 0.91). A calculated cutoff of 0.45 for predicted probability has a cross-validated sensitivity and specificity of 80%. Pretreatment hearing thresholds can be used as a predictive tool for cisplatin-induced hearing loss, particularly with concomitant radiotherapy.
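
    The evaluation quantities reported above (sensitivity and specificity at a probability cutoff, and the C statistic) can be computed as in the following sketch. It only illustrates the standard definitions and does not reproduce the study's model or data. The function names are hypothetical, and labels are assumed to be 1 for hearing loss and 0 otherwise.

```python
def sensitivity_specificity(probs, labels, cutoff):
    """Classify as positive when the predicted probability is >= cutoff."""
    pairs = list(zip(probs, labels))
    tp = sum(1 for p, y in pairs if y == 1 and p >= cutoff)
    fn = sum(1 for p, y in pairs if y == 1 and p < cutoff)
    tn = sum(1 for p, y in pairs if y == 0 and p < cutoff)
    fp = sum(1 for p, y in pairs if y == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def c_statistic(probs, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    concordant = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos for q in neg
    )
    return concordant / (len(pos) * len(neg))
```

    The C statistic equals the area under the ROC curve, so it summarizes discrimination over all cutoffs, whereas sensitivity and specificity describe a single operating point such as the 0.45 cutoff mentioned in the abstract.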

  2. Statistically accurate low-order models for uncertainty quantification in turbulent dynamical systems.

    Science.gov (United States)

    Sapsis, Themistoklis P; Majda, Andrew J

    2013-08-20

    A framework for low-order predictive statistical modeling and uncertainty quantification in turbulent dynamical systems is developed here. These reduced-order, modified quasilinear Gaussian (ROMQG) algorithms apply to turbulent dynamical systems in which there is significant linear instability or linear nonnormal dynamics in the unperturbed system and energy-conserving nonlinear interactions that transfer energy from the unstable modes to the stable modes where dissipation occurs, resulting in a statistical steady state; such turbulent dynamical systems are ubiquitous in geophysical and engineering turbulence. The ROMQG method involves constructing a low-order, nonlinear, dynamical system for the mean and covariance statistics in the reduced subspace that has the unperturbed statistics as a stable fixed point and optimally incorporates the indirect effect of non-Gaussian third-order statistics for the unperturbed system in a systematic calibration stage. This calibration procedure is achieved through information involving only the mean and covariance statistics for the unperturbed equilibrium. The performance of the ROMQG algorithm is assessed on two stringent test cases: the 40-mode Lorenz 96 model mimicking midlatitude atmospheric turbulence and two-layer baroclinic models for high-latitude ocean turbulence with over 125,000 degrees of freedom. In the Lorenz 96 model, the ROMQG algorithm with just a single mode captures the transient response to random or deterministic forcing. For the baroclinic ocean turbulence models, the inexpensive ROMQG algorithm with 252 modes, less than 0.2% of the total, captures the nonlinear response of the energy, the heat flux, and even the one-dimensional energy and heat flux spectra.

  3. Statistical physics of medical diagnostics: Study of a probabilistic model.

    Science.gov (United States)

    Mashaghi, Alireza; Ramezanpour, Abolfazl

    2018-03-01

    We study a diagnostic strategy which is based on the anticipation of the diagnostic process by simulation of the dynamical process starting from the initial findings. We show that such a strategy could result in more accurate diagnoses compared to a strategy that is solely based on the direct implications of the initial observations. We demonstrate this by employing the mean-field approximation of statistical physics to compute the posterior disease probabilities for a given subset of observed signs (symptoms) in a probabilistic model of signs and diseases. A Monte Carlo optimization algorithm is then used to maximize an objective function of the sequence of observations, which favors the more decisive observations resulting in more polarized disease probabilities. We see how the observed signs change the nature of the macroscopic (Gibbs) states of the sign and disease probability distributions. The structure of these macroscopic states in the configuration space of the variables affects the quality of any approximate inference algorithm (so the diagnostic performance) which tries to estimate the sign-disease marginal probabilities. In particular, we find that the simulation (or extrapolation) of the diagnostic process is helpful when the disease landscape is not trivial and the system undergoes a phase transition to an ordered phase.

  4. Statistical physics of medical diagnostics: Study of a probabilistic model

    Science.gov (United States)

    Mashaghi, Alireza; Ramezanpour, Abolfazl

    2018-03-01

    We study a diagnostic strategy which is based on the anticipation of the diagnostic process by simulation of the dynamical process starting from the initial findings. We show that such a strategy could result in more accurate diagnoses compared to a strategy that is solely based on the direct implications of the initial observations. We demonstrate this by employing the mean-field approximation of statistical physics to compute the posterior disease probabilities for a given subset of observed signs (symptoms) in a probabilistic model of signs and diseases. A Monte Carlo optimization algorithm is then used to maximize an objective function of the sequence of observations, which favors the more decisive observations resulting in more polarized disease probabilities. We see how the observed signs change the nature of the macroscopic (Gibbs) states of the sign and disease probability distributions. The structure of these macroscopic states in the configuration space of the variables affects the quality of any approximate inference algorithm (so the diagnostic performance) which tries to estimate the sign-disease marginal probabilities. In particular, we find that the simulation (or extrapolation) of the diagnostic process is helpful when the disease landscape is not trivial and the system undergoes a phase transition to an ordered phase.

  5. Increased Statistical Efficiency in a Lognormal Mean Model

    Directory of Open Access Journals (Sweden)

    Grant H. Skrepnek

    2014-01-01

    Within the context of clinical and other scientific research, a substantial need exists for an accurate determination of the point estimate in a lognormal mean model, given that highly skewed data are often present. As such, logarithmic transformations are often advocated to achieve the assumptions of parametric statistical inference. Despite this, existing approaches that utilize only a sample’s mean and variance may not necessarily yield the most efficient estimator. The current investigation developed and tested an improved efficient point estimator for a lognormal mean by capturing more complete information via the sample’s coefficient of variation. Results of an empirical simulation study across varying sample sizes and population standard deviations indicated relative improvements in efficiency of up to 129.47 percent compared to the usual maximum likelihood estimator and up to 21.33 absolute percentage points above the efficient estimator presented by Shen and colleagues (2006). The relative efficiency of the proposed estimator increased particularly as a function of decreasing sample size and increasing population standard deviation.
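
    For orientation, the baseline that the abstract compares against can be sketched briefly. If ln X is normal with mean μ and variance σ², then E[X] = exp(μ + σ²/2), and the usual maximum likelihood estimator plugs in the sample mean and the (divide-by-n) variance of the log-transformed data. The sketch below shows only this standard estimator. The function name is hypothetical, and the improved estimator proposed in the article is not given in the abstract and is not reproduced.

```python
import math


def lognormal_mean_mle(sample):
    """Maximum likelihood estimate of E[X] for lognormal data.

    Uses exp(mu_hat + sigma2_hat / 2), where mu_hat and sigma2_hat are the
    mean and the divide-by-n variance of the log-transformed sample.
    """
    if not sample:
        raise ValueError("sample must not be empty")
    if any(x <= 0 for x in sample):
        raise ValueError("lognormal data must be strictly positive")
    logs = [math.log(x) for x in sample]
    n = len(logs)
    mu_hat = sum(logs) / n
    sigma2_hat = sum((v - mu_hat) ** 2 for v in logs) / n
    return math.exp(mu_hat + sigma2_hat / 2.0)
```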

  6. A statistical model of a metallic inclusion in semiconducting media

    International Nuclear Information System (INIS)

    Shikin, V. B.

    2016-01-01

    The properties of an isolated multicharged atom embedded into a semiconducting medium are discussed. The analysis generalizes the results of the known Thomas–Fermi theory for a multicharged (Z ≫ 1) atom in vacuum when it is immersed into an electron–hole gas of finite temperature. The Thomas–Fermi–Debye (TFD) atom problem is directly related to the properties of donors in low-doped semiconductors and is alternative in its conclusions to the ideal scenario of dissociation of donors. In the existing ideal statistics, an individual donor under infinitely low doping is completely ionized (a charged center does not hold its neutralizing counter-ions). A Thomas–Fermi–Debye atom (briefly, a TFD donor) remains a neutral formation that holds its screening “coat” even for infinitely low doping level, i.e., in the region of n_d λ_0^3 ≪ 1, where n_d is the concentration of the doping impurity and λ_0 is the Debye length with the parameters of intrinsic semiconductor. Various observed consequences in the behavior of a TFD donor are discussed that allow one to judge the reality of the implications of the TFD donor model.

  7. A statistical model of a metallic inclusion in semiconducting media

    Energy Technology Data Exchange (ETDEWEB)

    Shikin, V. B., E-mail: shikin@issp.ac.ru [Russian Academy of Sciences, Institute of Solid State Physics (Russian Federation)]

    2016-11-15

    The properties of an isolated multicharged atom embedded into a semiconducting medium are discussed. The analysis generalizes the results of the known Thomas–Fermi theory for a multicharged (Z ≫ 1) atom in vacuum when it is immersed into an electron–hole gas of finite temperature. The Thomas–Fermi–Debye (TFD) atom problem is directly related to the properties of donors in low-doped semiconductors and is alternative in its conclusions to the ideal scenario of dissociation of donors. In the existing ideal statistics, an individual donor under infinitely low doping is completely ionized (a charged center does not hold its neutralizing counter-ions). A Thomas–Fermi–Debye atom (briefly, a TFD donor) remains a neutral formation that holds its screening “coat” even for infinitely low doping level, i.e., in the region of n_d λ_0^3 ≪ 1, where n_d is the concentration of the doping impurity and λ_0 is the Debye length with the parameters of intrinsic semiconductor. Various observed consequences in the behavior of a TFD donor are discussed that allow one to judge the reality of the implications of the TFD donor model.

  8. Dataset of coded handwriting features for use in statistical modelling

    Directory of Open Access Journals (Sweden)

    Anna Agius

    2018-02-01

    The data presented here are related to the article titled “Using handwriting to infer a writer's country of origin for forensic intelligence purposes” (Agius et al., 2017 [1]). This article reports original writer, spatial and construction characteristic data for thirty-seven English Australian writers and thirty-seven Vietnamese writers. (In this study, English writers were Australians who had learnt to write in New South Wales (NSW).) All of these characteristics were coded and recorded in Microsoft Excel 2013 (version 15.31). The construction characteristics coded were extracted from only seven characters: ‘g’, ‘h’, ‘th’, ‘M’, ‘0’, ‘7’ and ‘9’. The coded format of the writer, spatial and construction characteristics is made available in this Data in Brief in order to allow others to perform statistical analyses and modelling to investigate whether there is a relationship between the handwriting features and the nationality of the writer, and whether the two nationalities can be differentiated, and furthermore to employ mathematical techniques that are capable of characterising the extracted features from each participant.

  9. Hierarchical statistical modeling of xylem vulnerability to cavitation.

    Science.gov (United States)

    Ogle, Kiona; Barber, Jarrett J; Willson, Cynthia; Thompson, Brenda

    2009-01-01

    Cavitation of xylem elements diminishes the water transport capacity of plants, and quantifying xylem vulnerability to cavitation is important to understanding plant function. Current approaches to analyzing hydraulic conductivity (K) data to infer vulnerability to cavitation suffer from problems such as the use of potentially unrealistic vulnerability curves, difficulty interpreting parameters in these curves, a statistical framework that ignores sampling design, and an overly simplistic view of uncertainty. This study illustrates how two common curves (exponential-sigmoid and Weibull) can be reparameterized in terms of meaningful parameters: maximum conductivity (k(sat)), water potential (-P) at which percentage loss of conductivity (PLC) =X% (P(X)), and the slope of the PLC curve at P(X) (S(X)), a 'sensitivity' index. We provide a hierarchical Bayesian method for fitting the reparameterized curves to K(H) data. We illustrate the method using data for roots and stems of two populations of Juniperus scopulorum and test for differences in k(sat), P(X), and S(X) between different groups. Two important results emerge from this study. First, the Weibull model is preferred because it produces biologically realistic estimates of PLC near P = 0 MPa. Second, stochastic embolisms contribute an important source of uncertainty that should be included in such analyses.

  10. Statistical models for thermal ageing of steel materials in nuclear power plants

    International Nuclear Information System (INIS)

    Persoz, M.

    1996-01-01

    Some categories of steel materials in nuclear power plants may be subjected to thermal ageing, whose extent depends on the steel chemical composition and the ageing parameters, i.e. temperature and duration. This ageing affects the 'impact strength' of the materials, which is a mechanical property. In order to assess the residual lifetime of these components, a probabilistic study has been launched, which takes into account the scatter over the input parameters of the mechanical model. Predictive formulae for estimating the impact strength of aged materials are important input data of the model. A database has been created with impact strength results obtained from a laboratory ageing program, and statistical treatments have been undertaken. Two kinds of model have been developed with nonlinear regression methods (PROC NLIN, available in SAS/STAT). The first one, using a hyperbolic tangent function, is partly based on physical considerations, and the second one, of an exponential type, is purely statistically built. The difficulties consist in selecting the significant parameters and attributing initial values to the coefficients, which is a requirement of the NLIN procedure. This global statistical analysis has led to general models that are functions of the chemical variables and the ageing parameters. These models are at least as precise as local models that had been developed earlier for some specific values of ageing temperature and ageing duration. This paper describes the data and the methodology used to build the models and analyses the results given by the SAS system. (author)

  11. Short-Term Solar Irradiance Forecasting Model Based on Artificial Neural Network Using Statistical Feature Parameters

    Directory of Open Access Journals (Sweden)

    Hongshan Zhao

    2012-05-01

    Short-term solar irradiance forecasting (STSIF) is of great significance for the optimal operation and power prediction of grid-connected photovoltaic (PV) plants. However, STSIF is very complex to handle due to the random and nonlinear characteristics of solar irradiance under changeable weather conditions. Artificial Neural Network (ANN) is suitable for STSIF modeling and many research works on this topic are presented, but the conciseness and robustness of the existing models still need to be improved. After discussing the relation between weather variations and irradiance, the characteristics of the statistical feature parameters of irradiance under different weather conditions are figured out. A novel ANN model using statistical feature parameters (ANN-SFP) for STSIF is proposed in this paper. The input vector is reconstructed with several statistical feature parameters of irradiance and ambient temperature. Thus sufficient information can be effectively extracted from relatively few inputs and the model complexity is reduced. The model structure is determined by cross-validation (CV), and the Levenberg-Marquardt algorithm (LMA) is used for the network training. Simulations are carried out to validate and compare the proposed model with the conventional ANN model using historical data series (ANN-HDS), and the results indicate that the forecast accuracy is clearly improved under variable weather conditions.

  12. Poisson statistics application in modelling of neutron detection

    International Nuclear Information System (INIS)

    Avdic, S.; Marinkovic, P.

    1996-01-01

    The main purpose of this study is to take into account the statistical analysis of the experimental data measured by a ³He neutron spectrometer. The unfolding method, based on the principle of maximum likelihood, incorporates the Poisson approximation of the counting statistics applied. (author)

  13. Maximum entropy principle and hydrodynamic models in statistical mechanics

    International Nuclear Information System (INIS)

    Trovato, M.; Reggiani, L.

    2012-01-01

    This review presents the state of the art of the maximum entropy principle (MEP) in its classical and quantum (QMEP) formulation. Within the classical MEP we overview a general theory able to provide, in a dynamical context, the macroscopic relevant variables for carrier transport in the presence of electric fields of arbitrary strength. For the macroscopic variables the linearized maximum entropy approach is developed including full-band effects within a total energy scheme. Under spatially homogeneous conditions, we construct a closed set of hydrodynamic equations for the small-signal (dynamic) response of the macroscopic variables. The coupling between the driving field and the energy dissipation is analyzed quantitatively by using an arbitrary number of moments of the distribution function. Analogously, the theoretical approach is applied to many one-dimensional n⁺nn⁺ submicron Si structures by using different band structure models, different doping profiles, different applied biases and is validated by comparing numerical calculations with ensemble Monte Carlo simulations and with available experimental data. Within the quantum MEP we introduce a quantum entropy functional of the reduced density matrix; the principle of quantum maximum entropy is then asserted as a fundamental principle of quantum statistical mechanics. Accordingly, we have developed a comprehensive theoretical formalism to construct rigorously a closed quantum hydrodynamic transport within a Wigner function approach. The theory is formulated both in thermodynamic equilibrium and nonequilibrium conditions, and the quantum contributions are obtained by only assuming that the Lagrange multipliers can be expanded in powers of ħ², ħ being the reduced Planck constant. In particular, by using an arbitrary number of moments, we prove that: i) on a macroscopic scale all nonlocal effects, compatible with the uncertainty principle, are imputable to high-order spatial derivatives both of the

  14. A Statistical Approach For Modeling Tropical Cyclones. Synthetic Hurricanes Generator Model

    Energy Technology Data Exchange (ETDEWEB)

    Pasqualini, Donatella [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-05-11

    This manuscript briefly describes a statistical approach to generate synthetic tropical cyclone tracks to be used in risk evaluations. The Synthetic Hurricane Generator (SynHurG) model allows modeling hurricane risk in the United States supporting decision makers and implementations of adaptation strategies to extreme weather. In the literature there are mainly two approaches to model hurricane hazard for risk prediction: deterministic-statistical approaches, where the storm key physical parameters are calculated using physical complex climate models and the tracks are usually determined statistically from historical data; and statistical approaches, where both variables and tracks are estimated stochastically using historical records. SynHurG falls in the second category adopting a pure stochastic approach.

  15. Quantitative Analysis of Probabilistic Models of Software Product Lines with Statistical Model Checking

    DEFF Research Database (Denmark)

    ter Beek, Maurice H.; Legay, Axel; Lluch Lafuente, Alberto

    2015-01-01

    We investigate the suitability of statistical model checking techniques for analysing quantitative properties of software product line models with probabilistic aspects. For this purpose, we enrich the feature-oriented language FLAN with action rates, which specify the likelihood of exhibiting pa...

  16. Computational algebraic geometry for statistical modeling FY09Q2 progress.

    Energy Technology Data Exchange (ETDEWEB)

    Thompson, David C.; Rojas, Joseph Maurice; Pebay, Philippe Pierre

    2009-03-01

    This is a progress report on polynomial system solving for statistical modeling. This quarter we have developed our first model of shock response data and an algorithm for identifying the chamber cone containing a polynomial system in n variables with n+k terms within polynomial time - a significant improvement over previous algorithms, all having exponential worst-case complexity. We have implemented and verified the chamber cone algorithm for n+3 and are working to extend the implementation to handle arbitrary k. Later sections of this report explain chamber cones in more detail; the next section provides an overview of the project and how the current progress fits into it.

  17. A statistical model for horizontal mass flux of erodible soil

    International Nuclear Information System (INIS)

    Babiker, A.G.A.G.; Eltayeb, I.A.; Hassan, M.H.A.

    1986-11-01

    It is shown that the mass flux of erodible soil transported horizontally by a statistically distributed wind flow has a statistical distribution. An explicit expression for the probability density function (p.d.f.) of the flux is derived for the case in which the wind speed has a Weibull distribution. The statistical distribution for a mass flux characterized by a generalized Bagnold formula is found to be Weibull for the case of zero threshold speed. Analytic and numerical values for the average horizontal mass flux of soil are obtained for various values of wind parameters, by evaluating the first moment of the flux density function. (author)
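
    The zero-threshold result can be illustrated under a simplifying assumption that is not taken from the report: suppose the flux follows a pure power law q = A·uⁿ of the wind speed u. If u is Weibull with shape k and scale c, then P(q ≤ x) = 1 − exp(−(x / (A·cⁿ))^(k/n)), so q is again Weibull with shape k/n and scale A·cⁿ, and its first moment is A·cⁿ·Γ(1 + n/k). The sketch below evaluates that mean and checks it by simulation. The power-law form, the symbols A and n, and the function names are all hypothetical.

```python
import math
import random


def mean_flux(A, n, shape, scale):
    """Analytic mean of q = A * u**n when u ~ Weibull(shape, scale)."""
    return A * scale ** n * math.gamma(1.0 + n / shape)


def simulated_mean_flux(A, n, shape, scale, draws=200_000, seed=0):
    """Monte Carlo estimate of the same mean, for a sanity check."""
    rng = random.Random(seed)
    # random.weibullvariate takes (scale, shape) in that order.
    total = sum(A * rng.weibullvariate(scale, shape) ** n for _ in range(draws))
    return total / draws
```

    The two functions should agree to within sampling error, which gives a quick check on the closed form before it is used with other wind parameters.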

  18. Statistical shear lag model - unraveling the size effect in hierarchical composites.

    Science.gov (United States)

    Wei, Xiaoding; Filleter, Tobin; Espinosa, Horacio D

    2015-05-01

    Numerous experimental and computational studies have established that the hierarchical structures encountered in natural materials, such as the brick-and-mortar structure observed in sea shells, are essential for achieving defect tolerance. Due to this hierarchy, the mechanical properties of natural materials have a different size dependence compared to that of typical engineered materials. This study aimed to explore size effects on the strength of bio-inspired staggered hierarchical composites and to define the influence of the geometry of constituents in their outstanding defect tolerance capability. A statistical shear lag model is derived by extending the classical shear lag model to account for the statistics of the constituents' strength. A general solution emerges from rigorous mathematical derivations, unifying the various empirical formulations for the fundamental link length used in previous statistical models. The model shows that the staggered arrangement of constituents grants composites a unique size effect on mechanical strength in contrast to homogenous continuous materials. The model is applied to hierarchical yarns consisting of double-walled carbon nanotube bundles to assess its predictive capabilities for novel synthetic materials. Interestingly, the model predicts that yarn gauge length does not significantly influence the yarn strength, in close agreement with experimental observations. Copyright © 2015 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.

  19. Modelling malaria treatment practices in Bangladesh using spatial statistics

    Directory of Open Access Journals (Sweden)

    Haque Ubydul

    2012-03-01

    Background: Malaria treatment-seeking practices vary worldwide and Bangladesh is no exception. Individuals from 88 villages in Rajasthali were asked about their treatment-seeking practices. A portion of these households preferred malaria treatment from the National Control Programme, but still a large number of households continued to use drug vendors and approximately one fourth of the individuals surveyed relied exclusively on non-control programme treatments. The risks of low-control programme usage include incomplete malaria treatment, possible misuse of anti-malarial drugs, and an increased potential for drug resistance. Methods: The spatial patterns of treatment-seeking practices were first examined using hot-spot analysis (Local Getis-Ord Gi statistic) and then modelled using regression. Ordinary least squares (OLS) regression identified key factors explaining more than 80% of the variation in control programme and vendor treatment preferences. Geographically weighted regression (GWR) was then used to assess where each factor was a strong predictor of treatment-seeking preferences. Results: Several factors, including tribal affiliation, housing materials, household densities, education levels, and proximity to the regional urban centre, were found to be effective predictors of malaria treatment-seeking preferences. The predictive strength of each of these factors, however, varied across the study area. While education, for example, was a strong predictor in some villages, it was less important for predicting treatment-seeking outcomes in other villages. Conclusion: Understanding where each factor is a strong predictor of treatment-seeking outcomes may help in planning targeted interventions aimed at increasing control programme usage. Suggested strategies include providing additional training for the Building Resources across Communities (BRAC) health workers, implementing educational programmes, and addressing economic factors.

  20. Statistical behaviour of adaptive multilevel splitting algorithms in simple models

    International Nuclear Information System (INIS)

    Rolland, Joran; Simonnet, Eric

    2015-01-01

    Adaptive multilevel splitting algorithms have been introduced rather recently for estimating tail distributions in a fast and efficient way. In particular, they can be used for computing the so-called reactive trajectories corresponding to direct transitions from one metastable state to another. The algorithm is based on successive selection–mutation steps performed on the system in a controlled way. It has two intrinsic parameters, the number of particles/trajectories and the reaction coordinate used for discriminating good or bad trajectories. We investigate first the convergence in law of the algorithm as a function of the timestep for several simple stochastic models. Second, we consider the average duration of reactive trajectories for which no theoretical predictions exist. The most important aspect of this work concerns some systems with two degrees of freedom. They are studied in detail as a function of the reaction coordinate in the asymptotic regime where the number of trajectories goes to infinity. We show that during phase transitions, the statistics of the algorithm deviate significantly from known theoretical results when using non-optimal reaction coordinates. In this case, the variance of the algorithm peaks at the transition and the convergence of the algorithm can be much slower than the usual expected central limit behaviour. The duration of trajectories is affected as well. Moreover, reactive trajectories do not correspond to the most probable ones. Such behaviour disappears when using the optimal reaction coordinate, called the committor, as predicted by the theory. We finally investigate a three-state Markov chain which reproduces this phenomenon and show logarithmic convergence of the trajectory durations.

  1. Test the Overall Significance of p-values by Using Joint Tail Probability of Ordered p-values as Test Statistic

    NARCIS (Netherlands)

    Fang, Yongxiang; Wit, Ernst

    2008-01-01

    Fisher’s combined probability test is the most commonly used method to test the overall significance of a set of independent p-values. However, it is very obvious that Fisher’s statistic is more sensitive to smaller p-values than to larger p-values and a small p-value may overrule the other p-values
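
    As background for this record, Fisher's method itself is short enough to sketch. For k independent p-values, the statistic X = −2 Σ ln pᵢ follows a chi-square distribution with 2k degrees of freedom under the global null hypothesis. Because the degrees of freedom are even, the upper-tail probability has the closed form exp(−X/2) Σ_{i<k} (X/2)^i / i!, so no external library is needed. The function name below is hypothetical, and the joint-tail-probability statistic proposed by the authors is not described in the truncated abstract and is not implemented here.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i) and
    combined_p is the upper tail of a chi-square with 2k degrees of freedom.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    # Closed-form chi-square survival function for even degrees of freedom.
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total
```

    Calling the function on p-values such as [0.001, 0.9, 0.9] illustrates the sensitivity the abstract describes: the single small p-value pulls the combined result below 0.05 even though the other two are far from significant.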

  2. SpaSM: A MATLAB Toolbox for Sparse Statistical Modeling

    DEFF Research Database (Denmark)

    Sjöstrand, Karl; Clemmensen, Line Harder; Larsen, Rasmus

    2018-01-01

    Applications in biotechnology such as gene expression analysis and image processing have led to a tremendous development of statistical methods with emphasis on reliable solutions to severely underdetermined systems. Furthermore, interpretations of such solutions are of importance, meaning...

  3. Improving statistical reasoning: theoretical models and practical implications

    National Research Council Canada - National Science Library

    Sedlmeier, Peter

    1999-01-01

    ... Statistical literacy, the art of drawing reasonable inferences from an abundance of numbers provided daily by...

  4. Subset Statistics in the linear IV regression model

    NARCIS (Netherlands)

    Kleibergen, F.R.

    2005-01-01

    We show that the limiting distributions of subset generalizations of the weak instrument robust instrumental variable statistics are boundedly similar when the remaining structural parameters are estimated using maximum likelihood. They are bounded from above by the limiting distributions which

  5. The Importance of Integrating Clinical Relevance and Statistical Significance in the Assessment of Quality of Care--Illustrated Using the Swedish Stroke Register.

    Directory of Open Access Journals (Sweden)

    Anita Lindmark

    When profiling hospital performance, quality indicators are commonly evaluated through hospital-specific adjusted means with confidence intervals. When identifying deviations from a norm, large hospitals can have statistically significant results even for clinically irrelevant deviations while important deviations in small hospitals can remain undiscovered. We have used data from the Swedish Stroke Register (Riksstroke) to illustrate the properties of a benchmarking method that integrates considerations of both clinical relevance and level of statistical significance. The performance measure used was case-mix adjusted risk of death or dependency in activities of daily living within 3 months after stroke. A hospital was labeled as having outlying performance if its case-mix adjusted risk exceeded a benchmark value with a specified statistical confidence level. The benchmark was expressed relative to the population risk and should reflect the clinically relevant deviation that is to be detected. A simulation study based on Riksstroke patient data from 2008-2009 was performed to investigate the effect of the choice of the statistical confidence level and benchmark value on the diagnostic properties of the method. Simulations were based on 18,309 patients in 76 hospitals. The widely used setting, comparing 95% confidence intervals to the national average, resulted in low sensitivity (0.252) and high specificity (0.991). There were large variations in sensitivity and specificity for different requirements of statistical confidence. Lowering statistical confidence improved sensitivity with a relatively smaller loss of specificity. Variations due to different benchmark values were smaller, especially for sensitivity. This allows the choice of a clinically relevant benchmark to be driven by clinical factors without major concerns about sufficiently reliable evidence. The study emphasizes the importance of combining clinical relevance and level of statistical

  6. The Importance of Integrating Clinical Relevance and Statistical Significance in the Assessment of Quality of Care--Illustrated Using the Swedish Stroke Register.

    Science.gov (United States)

    Lindmark, Anita; van Rompaye, Bart; Goetghebeur, Els; Glader, Eva-Lotta; Eriksson, Marie

    2016-01-01

    When profiling hospital performance, quality indicators are commonly evaluated through hospital-specific adjusted means with confidence intervals. When identifying deviations from a norm, large hospitals can have statistically significant results even for clinically irrelevant deviations while important deviations in small hospitals can remain undiscovered. We have used data from the Swedish Stroke Register (Riksstroke) to illustrate the properties of a benchmarking method that integrates considerations of both clinical relevance and level of statistical significance. The performance measure used was case-mix adjusted risk of death or dependency in activities of daily living within 3 months after stroke. A hospital was labeled as having outlying performance if its case-mix adjusted risk exceeded a benchmark value with a specified statistical confidence level. The benchmark was expressed relative to the population risk and should reflect the clinically relevant deviation that is to be detected. A simulation study based on Riksstroke patient data from 2008-2009 was performed to investigate the effect of the choice of the statistical confidence level and benchmark value on the diagnostic properties of the method. Simulations were based on 18,309 patients in 76 hospitals. The widely used setting, comparing 95% confidence intervals to the national average, resulted in low sensitivity (0.252) and high specificity (0.991). There were large variations in sensitivity and specificity for different requirements of statistical confidence. Lowering statistical confidence improved sensitivity with a relatively smaller loss of specificity. Variations due to different benchmark values were smaller, especially for sensitivity. This allows the choice of a clinically relevant benchmark to be driven by clinical factors without major concerns about sufficiently reliable evidence. The study emphasizes the importance of combining clinical relevance and level of statistical confidence when

  7. Visualization of the variability of 3D statistical shape models by animation.

    Science.gov (United States)

    Lamecker, Hans; Seebass, Martin; Lange, Thomas; Hege, Hans-Christian; Deuflhard, Peter

    2004-01-01

    Models of the 3D shape of anatomical objects and the knowledge about their statistical variability are of great benefit in many computer-assisted medical applications such as image analysis, therapy or surgery planning. Statistical shape models have successfully been applied to automate the task of image segmentation. The generation of 3D statistical shape models requires the identification of corresponding points on two shapes. This remains a difficult problem, especially for shapes of complicated topology. In order to interpret and validate variations encoded in a statistical shape model, visual inspection is of great importance. This work describes the generation and interpretation of statistical shape models of the liver and the pelvic bone.

  8. Computational and Statistical Models: A Comparison for Policy Modeling of Childhood Obesity

    Science.gov (United States)

    Mabry, Patricia L.; Hammond, Ross; Ip, Edward Hak-Sing; Huang, Terry T.-K.

    As systems science methodologies have begun to emerge as a set of innovative approaches to address complex problems in behavioral, social science, and public health research, some apparent conflicts with traditional statistical methodologies for public health have arisen. Computational modeling is an approach set in context that integrates diverse sources of data to test the plausibility of working hypotheses and to elicit novel ones. Statistical models are reductionist approaches geared towards proving the null hypothesis. While these two approaches may seem contrary to each other, we propose that they are in fact complementary and can be used jointly to advance solutions to complex problems. Outputs from statistical models can be fed into computational models, and outputs from computational models can lead to further empirical data collection and statistical models. Together, this presents an iterative process that refines the models and contributes to a greater understanding of the problem and its potential solutions. The purpose of this panel is to foster communication and understanding between statistical and computational modelers. Our goal is to shed light on the differences between the approaches and convey what kinds of research inquiries each one is best for addressing and how they can serve complementary (and synergistic) roles in the research process, to mutual benefit. For each approach the panel will cover the relevant "assumptions" and how the differences in what is assumed can foster misunderstandings. The interpretations of the results from each approach will be compared and contrasted and the limitations for each approach will be delineated. We will use illustrative examples from CompMod, the Comparative Modeling Network for Childhood Obesity Policy. The panel will also incorporate interactive discussions with the audience on the issues raised here.

  9. Field significance of performance measures in the context of regional climate model evaluation. Part 1: temperature

    Science.gov (United States)

    Ivanov, Martin; Warrach-Sagi, Kirsten; Wulfmeyer, Volker

    2018-04-01

    A new approach for rigorous spatial analysis of the downscaling performance of regional climate model (RCM) simulations is introduced. It is based on a multiple comparison of the local tests at the grid cells and is also known as "field" or "global" significance. New performance measures for estimating the added value of downscaled data relative to the large-scale forcing fields are developed. The methodology is applied, as an example, to a standard EURO-CORDEX hindcast simulation with the Weather Research and Forecasting (WRF) model coupled with the land surface model NOAH at 0.11° grid resolution. Monthly temperature climatology for the 1990-2009 period is analysed for Germany for winter and summer in comparison with high-resolution gridded observations from the German Weather Service. The field significance test controls the proportion of falsely rejected local tests in a meaningful way and is robust to spatial dependence. Hence, the spatial patterns of the statistically significant local tests are also meaningful. We interpret them from a process-oriented perspective. In winter and in most regions in summer, the downscaled distributions are statistically indistinguishable from the observed ones. A systematic cold summer bias occurs in deep river valleys due to overestimated elevations, in coastal areas probably due to enhanced sea breeze circulation, and over large lakes due to the interpolation of water temperatures. Urban areas in concave topography forms have a warm summer bias due to the strong heat islands, not reflected in the observations. WRF-NOAH generates appropriate fine-scale features in the monthly temperature field over regions of complex topography, but over spatially homogeneous areas even small biases can lead to significant deteriorations relative to the driving reanalysis. As the added value of global climate model (GCM)-driven simulations cannot be smaller than this perfect-boundary estimate, this work demonstrates in a rigorous manner the

  10. An ensemble Kalman filter for statistical estimation of physics constrained nonlinear regression models

    International Nuclear Information System (INIS)

    Harlim, John; Mahdi, Adam; Majda, Andrew J.

    2014-01-01

    A central issue in contemporary science is the development of nonlinear data-driven statistical–dynamical models for time series of noisy partial observations from nature or a complex model. It has been established recently that ad-hoc quadratic multi-level regression models can have finite-time blow-up of statistical solutions and/or pathological behavior of their invariant measure. Recently, a new class of physics constrained nonlinear regression models was developed to ameliorate this pathological behavior. Here a new finite ensemble Kalman filtering algorithm is developed for estimating the state, the linear and nonlinear model coefficients, the model and the observation noise covariances from available partial noisy observations of the state. Several stringent tests and applications of the method are developed here. In the most complex application, the perfect model has 57 degrees of freedom involving a zonal (east–west) jet, two topographic Rossby waves, and 54 nonlinearly interacting Rossby waves; the perfect model has significant non-Gaussian statistics in the zonal jet with blocked and unblocked regimes and a non-Gaussian skewed distribution due to interaction with the other 56 modes. We only observe the zonal jet contaminated by noise and apply the ensemble filter algorithm for estimation. Numerically, we find that a three dimensional nonlinear stochastic model with one level of memory mimics the statistical effect of the other 56 modes on the zonal jet in an accurate fashion, including the skew non-Gaussian distribution and autocorrelation decay. On the other hand, a similar stochastic model with zero memory levels fails to capture the crucial non-Gaussian behavior of the zonal jet from the perfect 57-mode model.

  11. Impact of Statistical Learning Methods on the Predictive Power of Multivariate Normal Tissue Complication Probability Models

    Energy Technology Data Exchange (ETDEWEB)

    Xu Chengjian, E-mail: c.j.xu@umcg.nl [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schaaf, Arjen van der; Schilstra, Cornelis; Langendijk, Johannes A.; Veld, Aart A. van' t [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands)

    2012-03-15

    Purpose: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. Methods and Materials: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. Results: It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. Conclusions: The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.

  12. Impact of statistical learning methods on the predictive power of multivariate normal tissue complication probability models.

    Science.gov (United States)

    Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A; van't Veld, Aart A

    2012-03-15

    To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.

  13. Impact of Statistical Learning Methods on the Predictive Power of Multivariate Normal Tissue Complication Probability Models

    International Nuclear Information System (INIS)

    Xu Chengjian; Schaaf, Arjen van der; Schilstra, Cornelis; Langendijk, Johannes A.; Veld, Aart A. van’t

    2012-01-01

    Purpose: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. Methods and Materials: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. Results: It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. Conclusions: The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.

  14. Statistical modeling of optical attenuation measurements in continental fog conditions

    Science.gov (United States)

    Khan, Muhammad Saeed; Amin, Muhammad; Awan, Muhammad Saleem; Minhas, Abid Ali; Saleem, Jawad; Khan, Rahimdad

    2017-03-01

    Free-space optics is an innovative technology that uses atmosphere as a propagation medium to provide higher data rates. These links are heavily affected by atmospheric channel mainly because of fog and clouds that act to scatter and even block the modulated beam of light from reaching the receiver end, hence imposing severe attenuation. A comprehensive statistical study of the fog effects and deep physical understanding of the fog phenomena are very important for suggesting improvements (reliability and efficiency) in such communication systems. In this regard, 6-months real-time measured fog attenuation data are considered and statistically investigated. A detailed statistical analysis related to each fog event for that period is presented; the best probability density functions are selected on the basis of Akaike information criterion, while the estimates of unknown parameters are computed by maximum likelihood estimation technique. The results show that most fog attenuation events follow normal mixture distribution and some follow the Weibull distribution.
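The selection step described in this abstract (fit each candidate density by maximum likelihood, then keep the one with the lowest Akaike information criterion, AIC = 2k − 2 ln L) can be sketched as follows. This is only an illustration under stated assumptions: the candidate families here are a normal and an exponential distribution, chosen because their maximum-likelihood estimates have closed forms, not the normal mixture and Weibull families reported in the abstract, and the function names and sample values are hypothetical.

```python
import math


def normal_loglik(data):
    """Maximised log-likelihood of a normal fit (MLE mean and variance)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # MLE divides by n, not n - 1
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def exponential_loglik(data):
    """Maximised log-likelihood of an exponential fit (MLE rate = 1 / mean)."""
    n = len(data)
    rate = n / sum(data)
    return n * math.log(rate) - rate * sum(data)


def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def select_by_aic(data):
    """Return the name of the lowest-AIC candidate and all AIC scores."""
    scores = {
        "normal": aic(normal_loglik(data), 2),            # mean, variance
        "exponential": aic(exponential_loglik(data), 1),  # rate
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical attenuation samples (illustrative values only).
sample = [9.8, 10.1, 10.3, 9.9, 10.0, 10.2, 9.7, 10.4]
best, scores = select_by_aic(sample)
```

For data clustered tightly around a positive value, as in the hypothetical sample, the normal candidate receives the lower AIC. Families without closed-form estimates, such as mixtures or the Weibull distribution, would need a numerical optimiser in place of the two closed-form fits.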

  15. Information Geometric Complexity of a Trivariate Gaussian Statistical Model

    Directory of Open Access Journals (Sweden)

    Domenico Felice

    2014-05-01

    We evaluate the information geometric complexity of entropic motion on low-dimensional Gaussian statistical manifolds in order to quantify how difficult it is to make macroscopic predictions about systems in the presence of limited information. Specifically, we observe that the complexity of such entropic inferences not only depends on the amount of available pieces of information but also on the manner in which such pieces are correlated. Finally, we uncover that, for certain correlational structures, the impossibility of reaching the most favorable configuration from an entropic inference viewpoint seems to lead to an information geometric analog of the well-known frustration effect that occurs in statistical physics.

  16. An R2 statistic for fixed effects in the linear mixed model.

    Science.gov (United States)

    Edwards, Lloyd J; Muller, Keith E; Wolfinger, Russell D; Qaqish, Bahjat F; Schabenberger, Oliver

    2008-12-20

    Statisticians most often use the linear mixed model to analyze Gaussian longitudinal data. The value and familiarity of the R² statistic in the linear univariate model naturally creates great interest in extending it to the linear mixed model. We define and describe how to compute a model R² statistic for the linear mixed model by using only a single model. The proposed R² statistic measures multivariate association between the repeated outcomes and the fixed effects in the linear mixed model. The R² statistic arises as a 1-1 function of an appropriate F statistic for testing all fixed effects (except typically the intercept) in a full model. The statistic compares the full model with a null model with all fixed effects deleted (except typically the intercept) while retaining exactly the same covariance structure. Furthermore, the R² statistic leads immediately to a natural definition of a partial R² statistic. A mixed model in which ethnicity gives a very small p-value as a longitudinal predictor of blood pressure (BP) compellingly illustrates the value of the statistic. In sharp contrast to the extreme p-value, a very small R², a measure of statistical and scientific importance, indicates that ethnicity has an almost negligible association with the repeated BP outcomes for the study.
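The one-to-one relation between R² and an F statistic mentioned in this abstract can be illustrated with the familiar identity from the linear univariate model, R² = (ν₁F/ν₂) / (1 + ν₁F/ν₂), where ν₁ and ν₂ are the numerator and denominator degrees of freedom. The sketch below assumes the mixed-model statistic takes the same functional form with the mixed-model F test and its denominator degrees of freedom plugged in; that reading, the function name, and the numbers are assumptions made for illustration and are not taken from the record.

```python
def r2_from_f(f_stat, df_num, df_den):
    """Map an F statistic to R^2 via R^2 = (df_num*F/df_den) / (1 + df_num*F/df_den)."""
    if f_stat < 0 or df_num <= 0 or df_den <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    ratio = df_num * f_stat / df_den
    return ratio / (1.0 + ratio)


# Hypothetical values: a large F with many denominator degrees of freedom
# gives a very small p-value, yet the implied R^2 stays close to zero.
r2_large_sample = r2_from_f(20.0, 1, 10000)
```

With the hypothetical values above the implied R² is about 0.002, which mirrors the abstract's point that an extreme p-value can coexist with an almost negligible association.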

  17. Statistical model of stress corrosion cracking based on extended ...

    Indian Academy of Sciences (India)

    2016-09-07

    In the previous paper (Pramana – J. Phys. 81(6), 1009 (2013)), the mechanism of stress corrosion cracking (SCC) based on non-quadratic form of Dirichlet energy was proposed and its statistical features were discussed. Following those results, we discuss here how SCC propagates on pipe wall ...

  18. Bayesian spatial modelling and the significance of agricultural land use to scrub typhus infection in Taiwan.

    Science.gov (United States)

    Wardrop, Nicola A; Kuo, Chi-Chien; Wang, Hsi-Chieh; Clements, Archie C A; Lee, Pei-Fen; Atkinson, Peter M

    2013-11-01

    Scrub typhus is transmitted by the larval stage of trombiculid mites. Environmental factors, including land cover and land use, are known to influence breeding and survival of trombiculid mites and, thus, also the spatial heterogeneity of scrub typhus risk. Here, a spatially autoregressive modelling framework was applied to scrub typhus incidence data from Taiwan, covering the period 2003 to 2011, to provide increased understanding of the spatial pattern of scrub typhus risk and the environmental and socioeconomic factors contributing to this pattern. A clear spatial pattern in scrub typhus incidence was observed within Taiwan, and incidence was found to be significantly correlated with several land cover classes, temperature, elevation, normalized difference vegetation index, rainfall, population density, average income and the proportion of the population that work in agriculture. The final multivariate regression model included statistically significant correlations between scrub typhus incidence and average income (negatively correlated), the proportion of land that contained mosaics of cropland and vegetation (positively correlated) and elevation (positively correlated). These results highlight the importance of land cover on scrub typhus incidence: mosaics of cropland and vegetation represent a transitional land cover type which can provide favourable habitats for rodents and, therefore, trombiculid mites. In Taiwan, these transitional land cover areas tend to occur in less populated and mountainous areas, following the frontier establishment and subsequent partial abandonment of agricultural cultivation, due to demographic and socioeconomic changes. Future land use policy decision-making should ensure that potential public health outcomes, such as modified risk of scrub typhus, are considered.

  19. Ten Years of Cloud Properties from MODIS: Global Statistics and Use in Climate Model Evaluation

    Science.gov (United States)

    Platnick, Steven E.

    2011-01-01

    The NASA Moderate Resolution Imaging Spectroradiometer (MODIS), launched onboard the Terra and Aqua spacecraft, began Earth observations on February 24, 2000 and June 24, 2002, respectively. Among the algorithms developed and applied to this sensor, a suite of cloud products includes cloud masking/detection, cloud-top properties (temperature, pressure), and optical properties (optical thickness, effective particle radius, water path, and thermodynamic phase). All cloud algorithms underwent numerous changes and enhancements for the latest Collection 5 production version; this process continues with the current Collection 6 development. We will show example MODIS Collection 5 cloud climatologies derived from global spatial and temporal aggregations provided in the archived gridded Level-3 MODIS atmosphere team product (product names MOD08 and MYD08 for MODIS Terra and Aqua, respectively). Data sets in this Level-3 product include scalar statistics as well as 1- and 2-D histograms of many cloud properties, allowing for higher order information and correlation studies. In addition to these statistics, we will show trends and statistical significance in annual and seasonal means for a variety of the MODIS cloud properties, as well as the time required for detection given assumed trends. To assist in climate model evaluation, we have developed a MODIS cloud simulator with an accompanying netCDF file containing subsetted monthly Level-3 statistical data sets that correspond to the simulator output. Correlations of cloud properties with ENSO offer the potential to evaluate model cloud sensitivity; initial results will be discussed.

  20. Analytical model of SiPM time resolution and order statistics with crosstalk

    International Nuclear Information System (INIS)

    Vinogradov, S.

    2015-01-01

    Time resolution is the most important parameter of photon detectors in a wide range of time-of-flight and time correlation applications within the areas of high energy physics, medical imaging, and others. Silicon photomultipliers (SiPM) have been initially recognized as perfect photon-number-resolving detectors; now they also provide outstanding results in the scintillator timing resolution. However, crosstalk and afterpulsing introduce false secondary non-Poissonian events, and SiPM time resolution models are experiencing significant difficulties with that. This study presents an attempt to develop an analytical model of the timing resolution of an SiPM taking into account statistics of secondary events resulting from a crosstalk. Two approaches have been utilized to derive an analytical expression for time resolution: the first one based on statistics of independent identically distributed detection event times and the second one based on order statistics of these times. The first approach is found to be more straightforward and “analytical-friendly” to model analog SiPMs. Comparisons of coincidence resolving times predicted by the model with the known experimental results from a LYSO:Ce scintillator and a Hamamatsu MPPC are presented

  1. Analytical model of SiPM time resolution and order statistics with crosstalk

    Energy Technology Data Exchange (ETDEWEB)

    Vinogradov, S., E-mail: Sergey.Vinogradov@liverpool.ac.uk [University of Liverpool and Cockcroft Institute, Sci-Tech Daresbury, Keckwick Lane, Warrington WA4 4AD (United Kingdom); P.N. Lebedev Physical Institute of the Russian Academy of Sciences, 119991 Leninskiy Prospekt 53, Moscow (Russian Federation)

    2015-07-01

    Time resolution is the most important parameter of photon detectors in a wide range of time-of-flight and time correlation applications within the areas of high energy physics, medical imaging, and others. Silicon photomultipliers (SiPM) have been initially recognized as perfect photon-number-resolving detectors; now they also provide outstanding results in the scintillator timing resolution. However, crosstalk and afterpulsing introduce false secondary non-Poissonian events, and SiPM time resolution models are experiencing significant difficulties with that. This study presents an attempt to develop an analytical model of the timing resolution of an SiPM taking into account statistics of secondary events resulting from a crosstalk. Two approaches have been utilized to derive an analytical expression for time resolution: the first one based on statistics of independent identically distributed detection event times and the second one based on order statistics of these times. The first approach is found to be more straightforward and “analytical-friendly” to model analog SiPMs. Comparisons of coincidence resolving times predicted by the model with the known experimental results from a LYSO:Ce scintillator and a Hamamatsu MPPC are presented.

  2. Parameter discovery in stochastic biological models using simulated annealing and statistical model checking.

    Science.gov (United States)

    Hussain, Faraz; Jha, Sumit K; Jha, Susmit; Langmead, Christopher J

    2014-01-01

    Stochastic models are increasingly used to study the behaviour of biochemical systems. While the structure of such models is often readily available from first principles, unknown quantitative features of the model are incorporated into the model as parameters. Algorithmic discovery of parameter values from experimentally observed facts remains a challenge for the computational systems biology community. We present a new parameter discovery algorithm that uses simulated annealing, sequential hypothesis testing, and statistical model checking to learn the parameters in a stochastic model. We apply our technique to a model of glucose and insulin metabolism used for in-silico validation of artificial pancreata and demonstrate its effectiveness by developing parallel CUDA-based implementation for parameter synthesis in this model.

  3. Statistical thermodynamics

    International Nuclear Information System (INIS)

    Lim, Gyeong Hui

    2008-03-01

    This book consists of 15 chapters covering: the basic concepts and meaning of statistical thermodynamics; Maxwell-Boltzmann statistics; ensembles; thermodynamic functions and fluctuations; statistical mechanics of independent-particle systems; ideal molecular systems; chemical equilibrium and chemical reaction rates in ideal gas mixtures; classical statistical thermodynamics; the ideal lattice model; lattice statistics and nonideal lattice models; imperfect gas theory on liquids; the theory of solutions; statistical thermodynamics of interfaces; statistical thermodynamics of macromolecular systems; and quantum statistics.

  4. Statistical modeling of total crash frequency at highway intersections

    Directory of Open Access Journals (Sweden)

    Arash M. Roshandeh

    2016-04-01

    Intersection-related crashes are associated with a high proportion of accidents involving drivers, occupants, pedestrians, and cyclists. In general, the purpose of intersection safety analysis is to determine the impact of safety-related variables on pedestrians, cyclists and vehicles, so as to facilitate the design of effective and efficient countermeasure strategies to improve safety at intersections. This study investigates the effects of traffic, environmental, intersection geometric and pavement-related characteristics on total crash frequencies at intersections. A random-parameter Poisson model was used with crash data from 357 signalized intersections in Chicago from 2004 to 2010. The results indicate that out of the identified factors, evening peak period traffic volume, pavement condition, and unlighted intersections have the greatest effects on crash frequencies. Overall, the results suggest that, to improve highway-related safety countermeasures at intersections, significant attention must be focused on ensuring that pavements are adequately maintained and intersections are well lighted. It should be noted that projects could have been implemented at and around the study intersections during the study period (7 years), which could affect crash frequency over time. This is an important variable that future studies could include to investigate the impacts of safety-related works at intersections and their marginal effects on crash frequency at signalized intersections.

  5. The joint space-time statistics of macroweather precipitation, space-time statistical factorization and macroweather models

    International Nuclear Information System (INIS)

    Lovejoy, S.; Lima, M. I. P. de

    2015-01-01

    Over the range of time scales from about 10 days to 30–100 years, in addition to the familiar weather and climate regimes, there is an intermediate “macroweather” regime characterized by negative temporal fluctuation exponents, implying that fluctuations tend to cancel each other out so that averages tend to converge. We show theoretically and numerically that macroweather precipitation can be modeled by a stochastic weather-climate model (the Climate Extended Fractionally Integrated Flux model, CEFIF) first proposed for macroweather temperatures, and we show numerically that a four parameter space-time CEFIF model can approximately reproduce eight or so empirical space-time exponents. In spite of this success, CEFIF is theoretically and numerically difficult to manage. We therefore propose a simplified stochastic model in which the temporal behavior is modeled as a fractional Gaussian noise but the spatial behaviour as a multifractal (climate) cascade: a spatial extension of the recently introduced ScaLIng Macroweather Model, SLIMM. Both the CEFIF and this spatial SLIMM model have a property often implicitly assumed by climatologists that climate statistics can be “homogenized” by normalizing them with the standard deviation of the anomalies. Physically, it means that the spatial macroweather variability corresponds to different climate zones that multiplicatively modulate the local, temporal statistics. This simplified macroweather model provides a framework for macroweather forecasting that exploits the system's long range memory and spatial correlations; for it, the forecasting problem has been solved. We test this factorization property and the model with the help of three centennial, global scale precipitation products that we analyze jointly in space and in time.

  6. Statistical Uncertainty Quantification of Physical Models during Reflood of LBLOCA

    Energy Technology Data Exchange (ETDEWEB)

    Oh, Deog Yeon; Seul, Kwang Won; Woo, Sweng Woong [Korea Institute of Nuclear Safety, Daejeon (Korea, Republic of)

    2015-05-15

    The use of best-estimate (BE) computer codes in safety analysis for loss-of-coolant accidents (LOCA) is the major trend in many countries to reduce the significant conservatism. A key feature of this BE evaluation requires the licensee to quantify the uncertainty of the calculations. It is therefore very important to determine the uncertainty distribution before conducting the uncertainty evaluation. Uncertainties include those of the physical models and correlations, plant operational parameters, and so forth. The quantification process is often performed mainly by subjective expert judgment or obtained from reference documents of the computer code. In this respect, more mathematical methods are needed to reasonably determine the uncertainty ranges. The first uncertainty quantification is performed with various increments for two influential uncertainty parameters to get the calculated responses and their derivatives. A different data set with two influential uncertainty parameters for FEBA tests is chosen by applying stricter criteria for selecting responses and their derivatives, which may be considered as the user’s effect in the CIRCÉ applications. Finally, three influential uncertainty parameters are considered to study the effect of the number of uncertainty parameters due to the limitation of the CIRCÉ method. With the determined uncertainty ranges, uncertainty evaluations for FEBA tests are performed to check whether the experimental responses such as the cladding temperature or pressure drop are inside the limits of calculated uncertainty bounds. A confirmation step will be performed to evaluate the quality of the information in the case of the different reflooding PERICLES experiments. The uncertainty ranges of the physical models in the MARS-KS thermal-hydraulic code during reflooding were quantified by the CIRCÉ method using FEBA experiment tests, instead of expert judgment. Also, through the uncertainty evaluation for FEBA and PERICLES tests, it was confirmed

  7. Statistical Studies of Mesoscale Forecast Models MM5 and WRF

    National Research Council Canada - National Science Library

    Henmi, Teizi

    2004-01-01

    ... models were carried out and the results were compared with surface observation data. Both models tended to overforecast temperature and dew-point temperature, although the correlation coefficients between forecast and observations were fairly high...

  8. Cross-Lingual Lexical Triggers in Statistical Language Modeling

    National Research Council Canada - National Science Library

    Kim, Woosung; Khudanpur, Sanjeev

    2003-01-01

    .... We achieve this through an extension of the method of lexical triggers to the cross-language problem, and by developing a likelihoodbased adaptation scheme for combining a trigger model with an N-gram model...

  9. Central Limit Theorem for Exponentially Quasi-local Statistics of Spin Models on Cayley Graphs

    Science.gov (United States)

    Reddy, Tulasi Ram; Vadlamani, Sreekar; Yogeshwaran, D.

    2018-04-01

    Central limit theorems for linear statistics of lattice random fields (including spin models) are usually proven under suitable mixing conditions or quasi-associativity. Many interesting examples of spin models do not satisfy mixing conditions, and on the other hand, it does not seem easy to show a central limit theorem for local statistics via quasi-associativity. In this work, we prove general central limit theorems for local statistics and exponentially quasi-local statistics of spin models on discrete Cayley graphs with polynomial growth. Further, we supplement these results by proving similar central limit theorems for random fields on discrete Cayley graphs taking values in a countable space, but under the stronger assumptions of α-mixing (for local statistics) and exponential α-mixing (for exponentially quasi-local statistics). All our central limit theorems assume a suitable variance lower bound, like many others in the literature. We illustrate our general central limit theorem with specific examples of lattice spin models and statistics arising in computational topology, statistical physics and random networks. Examples of clustering spin models include quasi-associated spin models with fast decaying covariances like the off-critical Ising model, level sets of Gaussian random fields with fast decaying covariances like the massive Gaussian free field and determinantal point processes with fast decaying kernels. Examples of local statistics include intrinsic volumes, face counts, component counts of random cubical complexes while exponentially quasi-local statistics include nearest neighbour distances in spin models and Betti numbers of sub-critical random cubical complexes.
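
    Schematically, central limit theorems of this kind take the generic shape below. The notation is an assumption introduced here for illustration and is not taken from the paper: $W_n$ denotes a growing window (for example, a ball of radius $n$ in the Cayley graph) and $H_n$ a local or exponentially quasi-local statistic of the spin model restricted to $W_n$. The precise hypotheses, including the form of the variance lower bound, are those of the paper and are not reproduced here.

```latex
\[
  \frac{H_n - \mathbb{E}[H_n]}{\sqrt{\operatorname{Var}(H_n)}}
  \;\xrightarrow{\;d\;}\; \mathcal{N}(0,1)
  \qquad \text{as } n \to \infty,
  \quad \text{under a suitable lower bound on } \operatorname{Var}(H_n).
\]
```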

  10. A Comparison of Item Fit Statistics for Mixed IRT Models

    Science.gov (United States)

    Chon, Kyong Hee; Lee, Won-Chan; Dunbar, Stephen B.

    2010-01-01

    In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G², Orlando and Thissen's S-X² and S-G², and Stone's χ²* and G²*. To investigate the…

  11. A statistical skull geometry model for children 0-3 years old.

    Directory of Open Access Journals (Sweden)

    Zhigang Li

    Head injury is the leading cause of fatality and long-term disability for children. Pediatric heads change rapidly in both size and shape during growth, especially for children under 3 years old (YO). To accurately assess the head injury risks for children, it is necessary to understand the geometry of the pediatric head and how morphologic features influence injury causation within the 0-3 YO population. In this study, head CT scans from fifty-six 0-3 YO children were used to develop a statistical model of pediatric skull geometry. Geometric features important for injury prediction, including skull size and shape, skull thickness and suture width, along with their variations among the sample population, were quantified through a series of image and statistical analyses. The size and shape of the pediatric skull change significantly with age and head circumference. The skull thickness and suture width vary with age, head circumference and location, which will have important effects on skull stiffness and injury prediction. The statistical geometry model developed in this study can provide a geometrical basis for future development of child anthropomorphic test devices and pediatric head finite element models.

  12. A statistical skull geometry model for children 0-3 years old.

    Science.gov (United States)

    Li, Zhigang; Park, Byoung-Keon; Liu, Weiguo; Zhang, Jinhuan; Reed, Matthew P; Rupp, Jonathan D; Hoff, Carrie N; Hu, Jingwen

    2015-01-01

    Head injury is the leading cause of fatality and long-term disability for children. Pediatric heads change rapidly in both size and shape during growth, especially for children under 3 years old (YO). To accurately assess the head injury risks for children, it is necessary to understand the geometry of the pediatric head and how morphologic features influence injury causation within the 0-3 YO population. In this study, head CT scans from fifty-six 0-3 YO children were used to develop a statistical model of pediatric skull geometry. Geometric features important for injury prediction, including skull size and shape, skull thickness and suture width, along with their variations among the sample population, were quantified through a series of image and statistical analyses. The size and shape of the pediatric skull change significantly with age and head circumference. The skull thickness and suture width vary with age, head circumference and location, which will have important effects on skull stiffness and injury prediction. The statistical geometry model developed in this study can provide a geometrical basis for future development of child anthropomorphic test devices and pediatric head finite element models.

  13. Addressing economic development goals through innovative teaching of university statistics: a case study of statistical modelling in Nigeria

    Science.gov (United States)

    Oseloka Ezepue, Patrick; Ojo, Adegbola

    2012-12-01

    A challenging problem in some developing countries such as Nigeria is the inadequate training of students in effective problem solving using the core concepts of their disciplines. Related to this is a disconnection between their learning and the socio-economic development agenda of a country. These problems are more vivid in statistical education, which is dominated by textbook examples and unbalanced assessment 'for' and 'of' learning within traditional curricula. The problems impede the achievement of socio-economic development objectives such as those stated in the Nigerian Vision 2020 blueprint and the United Nations Millennium Development Goals. They also impoverish the ability of (statistics) graduates to creatively use their knowledge in relevant business and industry sectors, thereby exacerbating mass graduate unemployment in Nigeria and similar developing countries. This article uses a case study in statistical modelling to discuss the nature of innovations in statistics education vital to producing new kinds of graduates who can link their learning to national economic development goals, create wealth and alleviate poverty through (self) employment. Wider implications of the innovations for repositioning mathematical sciences education globally are explored in this article.

  14. Statistical approach to LHCD modeling using the wave kinetic equation

    International Nuclear Information System (INIS)

    Kupfer, K.; Moreau, D.; Litaudon, X.

    1993-04-01

    Recent work has shown that for parameter regimes typical of many present-day current drive experiments, the orbits of the launched LH rays are chaotic (in the Hamiltonian sense), so that wave energy diffuses through the stochastic layer and fills the spectral gap. We have analyzed this problem using a statistical approach, by solving the wave kinetic equation for the coarse-grained spectral energy density. An interesting result is that the LH absorption profile is essentially independent of both the total injected power and the level of wave stochastic diffusion.

  15. AD Model Builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models

    DEFF Research Database (Denmark)

    Fournier, David A.; Skaug, Hans J.; Ancheta, Johnoel

    2011-01-01

    Many criteria for statistical parameter estimation, such as maximum likelihood, are formulated as a nonlinear optimization problem. Automatic Differentiation Model Builder (ADMB) is a programming framework based on automatic differentiation, aimed at highly nonlinear models with a large number...... of such a feature is the generic implementation of Laplace approximation of high-dimensional integrals for use in latent variable models. We also review the literature in which ADMB has been used, and discuss future development of ADMB as an open source project. Overall, the main advantages of ADMB are flexibility...

  16. Statistical modeling of competitive threshold collision-induced dissociation

    Science.gov (United States)

    Rodgers, M. T.; Armentrout, P. B.

    1998-08-01

    Collision-induced dissociation of (R1OH)Li+(R2OH) with xenon is studied using guided ion beam mass spectrometry. R1OH and R2OH include the following molecules: water, methanol, ethanol, 1-propanol, 2-propanol, and 1-butanol. In all cases, the primary products formed correspond to endothermic loss of one of the neutral alcohols, with minor products that include those formed by ligand exchange and loss of both ligands. The cross-section thresholds are interpreted to yield 0 and 298 K bond energies for (R1OH)Li+-R2OH and relative Li+ binding affinities of the R1OH and R2OH ligands after accounting for the effects of multiple ion-molecule collisions, internal energy of the reactant ions, and dissociation lifetimes. We introduce a means to simultaneously analyze the cross sections for these competitive dissociations using statistical theories to predict the energy dependent branching ratio. Thermochemistry in good agreement with previous work is obtained in all cases. In essence, this statistical approach provides a detailed means of correcting for the "competitive shift" inherent in multichannel processes.

  17. Bias in iterative reconstruction of low-statistics PET data: benefits of a resolution model

    Energy Technology Data Exchange (ETDEWEB)

    Walker, M D; Asselin, M-C; Julyan, P J; Feldmann, M; Matthews, J C [School of Cancer and Enabling Sciences, Wolfson Molecular Imaging Centre, MAHSC, University of Manchester, Manchester M20 3LJ (United Kingdom); Talbot, P S [Mental Health and Neurodegeneration Research Group, Wolfson Molecular Imaging Centre, MAHSC, University of Manchester, Manchester M20 3LJ (United Kingdom); Jones, T, E-mail: matthew.walker@manchester.ac.uk [Academic Department of Radiation Oncology, Christie Hospital, University of Manchester, Manchester M20 4BX (United Kingdom)

    2011-02-21

    Iterative image reconstruction methods such as ordered-subset expectation maximization (OSEM) are widely used in PET. Reconstructions via OSEM are, however, reported to be biased for low-count data. We investigated this and considered the impact for dynamic PET. Patient listmode data were acquired in [¹¹C]DASB and [¹⁵O]H₂O scans on the HRRT brain PET scanner. These data were subsampled to create many independent, low-count replicates. The data were reconstructed and the images from low-count data were compared to the high-count originals (from the same reconstruction method). This comparison enabled low-statistics bias to be calculated for the given reconstruction, as a function of the noise-equivalent counts (NEC). Two iterative reconstruction methods were tested, one with and one without an image-based resolution model (RM). Significant bias was observed when reconstructing data of low statistical quality, for both subsampled human and simulated data. For human data, this bias was substantially reduced by including an RM. For [¹¹C]DASB the low-statistics bias in the caudate head at 1.7 M NEC (approx. 30 s) was -5.5% and -13% with and without RM, respectively. We predicted biases in the binding potential of -4% and -10%. For quantification of cerebral blood flow for the whole-brain grey- or white-matter, using [¹⁵O]H₂O and the PET autoradiographic method, a low-statistics bias of <2.5% and <4% was predicted for reconstruction with and without the RM. The use of a resolution model reduces low-statistics bias and can hence be beneficial for quantitative dynamic PET.
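
    The subsample-and-compare idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the authors' actual subsampling scheme and reconstruction code are not given in this record, the function names are hypothetical, and the reconstruction step itself (OSEM) is not implemented here. The sketch randomly assigns each recorded event to one of several disjoint replicates, and computes the percentage bias of low-count regional values relative to a high-count reference.

```python
import random
import statistics


def split_into_replicates(counts, n_replicates, rng):
    """Assign every event in each bin to one randomly chosen replicate.

    The replicates are disjoint subsets of the original events, so their
    bin-wise sum reproduces the original high-count data.
    """
    replicates = [[0] * len(counts) for _ in range(n_replicates)]
    for bin_index, n_events in enumerate(counts):
        for _ in range(n_events):
            replicates[rng.randrange(n_replicates)][bin_index] += 1
    return replicates


def percent_bias(replicate_values, reference_value):
    """Percentage difference between the replicate mean and the reference."""
    if reference_value == 0:
        raise ValueError("reference value must be non-zero")
    return 100.0 * (statistics.mean(replicate_values) - reference_value) / reference_value


rng = random.Random(0)  # fixed seed so the example is repeatable
high_count_bins = [40, 25, 60, 10]  # hypothetical event counts per bin
replicates = split_into_replicates(high_count_bins, 5, rng)

# Hypothetical regional values obtained after reconstructing low-count
# replicates, compared with the value from the high-count reconstruction.
low_count_values = [94.0, 95.0, 94.5]
print(percent_bias(low_count_values, reference_value=100.0))
```

    In a real study each replicate would be reconstructed with the same method as the high-count data before the regional values are compared; the sketch only covers the bookkeeping on either side of that step.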

  18. Non-linear scaling of a musculoskeletal model of the lower limb using statistical shape models.

    Science.gov (United States)

    Nolte, Daniel; Tsang, Chui Kit; Zhang, Kai Yu; Ding, Ziyun; Kedgley, Angela E; Bull, Anthony M J

    2016-10-03

    Accurate muscle geometry for musculoskeletal models is important to enable accurate subject-specific simulations. Commonly, linear scaling is used to obtain individualised muscle geometry. More advanced methods include non-linear scaling using segmented bone surfaces and manual or semi-automatic digitisation of muscle paths from medical images. In this study, a new scaling method combining non-linear scaling with reconstructions of bone surfaces using statistical shape modelling is presented. Statistical Shape Models (SSMs) of femur and tibia/fibula were used to reconstruct bone surfaces of nine subjects. Reference models were created by morphing manually digitised muscle paths to mean shapes of the SSMs using non-linear transformations, and inter-subject variability was calculated. Subject-specific models of muscle attachment and via points were created from three reference models. The accuracy was evaluated by calculating the differences between the scaled and manually digitised models. The points defining the muscle paths showed large inter-subject variability at the thigh and shank (up to 26 mm); this was found to limit the accuracy of all studied scaling methods. Errors for the subject-specific muscle point reconstructions of the thigh could be decreased by 9% to 20% by using the non-linear scaling compared to a typical linear scaling method. We conclude that the proposed non-linear scaling method is more accurate than linear scaling methods. Thus, when combined with the ability to reconstruct bone surfaces from incomplete or scattered geometry data using statistical shape models, our proposed method is an alternative to linear scaling methods. Copyright © 2016 The Author. Published by Elsevier Ltd. All rights reserved.

  19. A Statistical Model for Natural Gas Standardized Load Profiles

    Czech Academy of Sciences Publication Activity Database

    Brabec, Marek; Konár, Ondřej; Malý, Marek; Pelikán, Emil; Vondráček, Jiří

    2009-01-01

    Vol. 58, No. 1 (2009), pp. 123-139. ISSN 0035-9254. R&D Projects: GA AV ČR 1ET400300513. Institutional research plan: CEZ:AV0Z10300504. Keywords: disaggregation; generalized additive models; multiplicative model; non-linear effects; segmentation; semiparametric regression model. Subject RIV: JE - Non-nuclear Energetics, Energy Consumption and Use. Impact factor: 1.060 (2009).

  20. Carrier Statistics and Quantum Capacitance Models of Graphene Nanoscroll

    Directory of Open Access Journals (Sweden)

    M. Khaledian

    2014-01-01

    schematic perfect scroll-like Archimedes spiral. The DOS model was derived first and was then applied to compute the carrier concentration and quantum capacitance models. Furthermore, the carrier concentration and quantum capacitance were modeled for both degenerate and nondegenerate regimes, and the effect of structural parameters and chirality number on the density of states and carrier concentration was examined. Finally, the temperature effect on the quantum capacitance was also studied.