A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data
Directory of Open Access Journals (Sweden)
Maria Vinaixa
2012-10-01
Full Text Available Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can be then unequivocally identified. This paper reports on a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel on all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of considering or violating mathematical assumptions on which univariate statistical test rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumption of normality and homocedasticity, or correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples.
Maric, Marija; de Haan, Else; Hogendoorn, Sanne M; Wolters, Lidewij H; Huizenga, Hilde M
2015-03-01
Single-case experimental designs are useful methods in clinical research practice to investigate individual client progress. Their proliferation might have been hampered by methodological challenges such as the difficulty applying existing statistical procedures. In this article, we describe a data-analytic method to analyze univariate (i.e., one symptom) single-case data using the common package SPSS. This method can help the clinical researcher to investigate whether an intervention works as compared with a baseline period or another intervention type, and to determine whether symptom improvement is clinically significant. First, we describe the statistical method in a conceptual way and show how it can be implemented in SPSS. Simulation studies were performed to determine the number of observation points required per intervention phase. Second, to illustrate this method and its implications, we present a case study of an adolescent with anxiety disorders treated with cognitive-behavioral therapy techniques in an outpatient psychotherapy clinic, whose symptoms were regularly assessed before each session. We provide a description of the data analyses and results of this case study. Finally, we discuss the advantages and shortcomings of the proposed method. Copyright © 2014. Published by Elsevier Ltd.
Hohn, M. Ed; Nuhfer, E.B.; Vinopal, R.J.; Klanderman, D.S.
1980-01-01
Classifying very fine-grained rocks through fabric elements provides information about depositional environments, but is subject to the biases of visual taxonomy. To evaluate the statistical significance of an empirical classification of very fine-grained rocks, samples from Devonian shales in four cored wells in West Virginia and Virginia were measured for 15 variables: quartz, illite, pyrite and expandable clays determined by X-ray diffraction; total sulfur, organic content, inorganic carbon, matrix density, bulk density, porosity, silt, as well as density, sonic travel time, resistivity, and ??-ray response measured from well logs. The four lithologic types comprised: (1) sharply banded shale, (2) thinly laminated shale, (3) lenticularly laminated shale, and (4) nonbanded shale. Univariate and multivariate analyses of variance showed that the lithologic classification reflects significant differences for the variables measured, difference that can be detected independently of stratigraphic effects. Little-known statistical methods found useful in this work included: the multivariate analysis of variance with more than one effect, simultaneous plotting of samples and variables on canonical variates, and the use of parametric ANOVA and MANOVA on ranked data. ?? 1980 Plenum Publishing Corporation.
Comparison of multivariate and univariate statistical process control and monitoring methods
International Nuclear Information System (INIS)
Leger, R.P.; Garland, WM.J.; Macgregor, J.F.
1996-01-01
Work in recent years has lead to the development of multivariate process monitoring schemes which use Principal Component Analysis (PCA). This research compares the performance of a univariate scheme and a multivariate PCA scheme used for monitoring a simple process with 11 measured variables. The multivariate PCA scheme was able to adequately represent the process using two principal components. This resulted in a PCA monitoring scheme which used two charts as opposed to 11 charts for the univariate scheme and therefore had distinct advantages in terms of both data representation, presentation, and fault diagnosis capabilities. (author)
International Nuclear Information System (INIS)
Weathers, J.B.; Luck, R.; Weathers, J.W.
2009-01-01
The complexity of mathematical models used by practicing engineers is increasing due to the growing availability of sophisticated mathematical modeling tools and ever-improving computational power. For this reason, the need to define a well-structured process for validating these models against experimental results has become a pressing issue in the engineering community. This validation process is partially characterized by the uncertainties associated with the modeling effort as well as the experimental results. The net impact of the uncertainties on the validation effort is assessed through the 'noise level of the validation procedure', which can be defined as an estimate of the 95% confidence uncertainty bounds for the comparison error between actual experimental results and model-based predictions of the same quantities of interest. Although general descriptions associated with the construction of the noise level using multivariate statistics exists in the literature, a detailed procedure outlining how to account for the systematic and random uncertainties is not available. In this paper, the methodology used to derive the covariance matrix associated with the multivariate normal pdf based on random and systematic uncertainties is examined, and a procedure used to estimate this covariance matrix using Monte Carlo analysis is presented. The covariance matrices are then used to construct approximate 95% confidence constant probability contours associated with comparison error results for a practical example. In addition, the example is used to show the drawbacks of using a first-order sensitivity analysis when nonlinear local sensitivity coefficients exist. Finally, the example is used to show the connection between the noise level of the validation exercise calculated using multivariate and univariate statistics.
Energy Technology Data Exchange (ETDEWEB)
Weathers, J.B. [Shock, Noise, and Vibration Group, Northrop Grumman Shipbuilding, P.O. Box 149, Pascagoula, MS 39568 (United States)], E-mail: James.Weathers@ngc.com; Luck, R. [Department of Mechanical Engineering, Mississippi State University, 210 Carpenter Engineering Building, P.O. Box ME, Mississippi State, MS 39762-5925 (United States)], E-mail: Luck@me.msstate.edu; Weathers, J.W. [Structural Analysis Group, Northrop Grumman Shipbuilding, P.O. Box 149, Pascagoula, MS 39568 (United States)], E-mail: Jeffrey.Weathers@ngc.com
2009-11-15
The complexity of mathematical models used by practicing engineers is increasing due to the growing availability of sophisticated mathematical modeling tools and ever-improving computational power. For this reason, the need to define a well-structured process for validating these models against experimental results has become a pressing issue in the engineering community. This validation process is partially characterized by the uncertainties associated with the modeling effort as well as the experimental results. The net impact of the uncertainties on the validation effort is assessed through the 'noise level of the validation procedure', which can be defined as an estimate of the 95% confidence uncertainty bounds for the comparison error between actual experimental results and model-based predictions of the same quantities of interest. Although general descriptions associated with the construction of the noise level using multivariate statistics exists in the literature, a detailed procedure outlining how to account for the systematic and random uncertainties is not available. In this paper, the methodology used to derive the covariance matrix associated with the multivariate normal pdf based on random and systematic uncertainties is examined, and a procedure used to estimate this covariance matrix using Monte Carlo analysis is presented. The covariance matrices are then used to construct approximate 95% confidence constant probability contours associated with comparison error results for a practical example. In addition, the example is used to show the drawbacks of using a first-order sensitivity analysis when nonlinear local sensitivity coefficients exist. Finally, the example is used to show the connection between the noise level of the validation exercise calculated using multivariate and univariate statistics.
Univariate normalization of bispectrum using Hölder's inequality.
Shahbazi, Forooz; Ewald, Arne; Nolte, Guido
2014-08-15
Considering that many biological systems including the brain are complex non-linear systems, suitable methods capable of detecting these non-linearities are required to study the dynamical properties of these systems. One of these tools is the third order cummulant or cross-bispectrum, which is a measure of interfrequency interactions between three signals. For convenient interpretation, interaction measures are most commonly normalized to be independent of constant scales of the signals such that its absolute values are bounded by one, with this limit reflecting perfect coupling. Although many different normalization factors for cross-bispectra were suggested in the literature these either do not lead to bounded measures or are themselves dependent on the coupling and not only on the scale of the signals. In this paper we suggest a normalization factor which is univariate, i.e., dependent only on the amplitude of each signal and not on the interactions between signals. Using a generalization of Hölder's inequality it is proven that the absolute value of this univariate bicoherence is bounded by zero and one. We compared three widely used normalizations to the univariate normalization concerning the significance of bicoherence values gained from resampling tests. Bicoherence values are calculated from real EEG data recorded in an eyes closed experiment from 10 subjects. The results show slightly more significant values for the univariate normalization but in general, the differences are very small or even vanishing in some subjects. Therefore, we conclude that the normalization factor does not play an important role in the bicoherence values with regard to statistical power, although a univariate normalization is the only normalization factor which fulfills all the required conditions of a proper normalization. Copyright © 2014 Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
John E. Lavery
2012-10-01
Full Text Available We present evidence that one can calculate generically combinatorially expensive Lp and lp averages, 0 < p < 1, in polynomial time by restricting the data to come from a wide class of statistical distributions. Our approach differs from the approaches in the previous literature, which are based on a priori sparsity requirements or on accepting a local minimum as a replacement for a global minimum. The functionals by which Lp averages are calculated are not convex but are radially monotonic and the functionals by which lp averages are calculated are nearly so, which are the keys to solvability in polynomial time. Analytical results for symmetric, radially monotonic univariate distributions are presented. An algorithm for univariate lp averaging is presented. Computational results for a Gaussian distribution, a class of symmetric heavy-tailed distributions and a class of asymmetric heavy-tailed distributions are presented. Many phenomena in human-based areas are increasingly known to be represented by data that have large numbers of outliers and belong to very heavy-tailed distributions. When tails of distributions are so heavy that even medians (L1 and l1 averages do not exist, one needs to consider using lp minimization principles with 0 < p < 1.
Papageorgiou, Spyridon N; Kloukos, Dimitrios; Petridis, Haralampos; Pandis, Nikolaos
2015-10-01
To assess the hypothesis that there is excessive reporting of statistically significant studies published in prosthodontic and implantology journals, which could indicate selective publication. The last 30 issues of 9 journals in prosthodontics and implant dentistry were hand-searched for articles with statistical analyses. The percentages of significant and non-significant results were tabulated by parameter of interest. Univariable/multivariable logistic regression analyses were applied to identify possible predictors of reporting statistically significance findings. The results of this study were compared with similar studies in dentistry with random-effects meta-analyses. From the 2323 included studies 71% of them reported statistically significant results, with the significant results ranging from 47% to 86%. Multivariable modeling identified that geographical area and involvement of statistician were predictors of statistically significant results. Compared to interventional studies, the odds that in vitro and observational studies would report statistically significant results was increased by 1.20 times (OR: 2.20, 95% CI: 1.66-2.92) and 0.35 times (OR: 1.35, 95% CI: 1.05-1.73), respectively. The probability of statistically significant results from randomized controlled trials was significantly lower compared to various study designs (difference: 30%, 95% CI: 11-49%). Likewise the probability of statistically significant results in prosthodontics and implant dentistry was lower compared to other dental specialties, but this result did not reach statistical significant (P>0.05). The majority of studies identified in the fields of prosthodontics and implant dentistry presented statistically significant results. The same trend existed in publications of other specialties in dentistry. Copyright © 2015 Elsevier Ltd. All rights reserved.
Acceleration techniques in the univariate Lipschitz global optimization
Sergeyev, Yaroslav D.; Kvasov, Dmitri E.; Mukhametzhanov, Marat S.; De Franco, Angela
2016-10-01
Univariate box-constrained Lipschitz global optimization problems are considered in this contribution. Geometric and information statistical approaches are presented. The novel powerful local tuning and local improvement techniques are described in the contribution as well as the traditional ways to estimate the Lipschitz constant. The advantages of the presented local tuning and local improvement techniques are demonstrated using the operational characteristics approach for comparing deterministic global optimization algorithms on the class of 100 widely used test functions.
[A SAS marco program for batch processing of univariate Cox regression analysis for great database].
Yang, Rendong; Xiong, Jie; Peng, Yangqin; Peng, Xiaoning; Zeng, Xiaomin
2015-02-01
To realize batch processing of univariate Cox regression analysis for great database by SAS marco program. We wrote a SAS macro program, which can filter, integrate, and export P values to Excel by SAS9.2. The program was used for screening survival correlated RNA molecules of ovarian cancer. A SAS marco program could finish the batch processing of univariate Cox regression analysis, the selection and export of the results. The SAS macro program has potential applications in reducing the workload of statistical analysis and providing a basis for batch processing of univariate Cox regression analysis.
Handbook of univariate and multivariate data analysis with IBM SPSS
Ho, Robert
2013-01-01
Using the same accessible, hands-on approach as its best-selling predecessor, the Handbook of Univariate and Multivariate Data Analysis with IBM SPSS, Second Edition explains how to apply statistical tests to experimental findings, identify the assumptions underlying the tests, and interpret the findings. This second edition now covers more topics and has been updated with the SPSS statistical package for Windows.New to the Second EditionThree new chapters on multiple discriminant analysis, logistic regression, and canonical correlationNew section on how to deal with missing dataCoverage of te
Riad, Safaa M.; Salem, Hesham; Elbalkiny, Heba T.; Khattab, Fatma I.
2015-04-01
Five, accurate, precise, and sensitive univariate and multivariate spectrophotometric methods were developed for the simultaneous determination of a ternary mixture containing Trimethoprim (TMP), Sulphamethoxazole (SMZ) and Oxytetracycline (OTC) in waste water samples collected from different cites either production wastewater or livestock wastewater after their solid phase extraction using OASIS HLB cartridges. In univariate methods OTC was determined at its λmax 355.7 nm (0D), while (TMP) and (SMZ) were determined by three different univariate methods. Method (A) is based on successive spectrophotometric resolution technique (SSRT). The technique starts with the ratio subtraction method followed by ratio difference method for determination of TMP and SMZ. Method (B) is successive derivative ratio technique (SDR). Method (C) is mean centering of the ratio spectra (MCR). The developed multivariate methods are principle component regression (PCR) and partial least squares (PLS). The specificity of the developed methods is investigated by analyzing laboratory prepared mixtures containing different ratios of the three drugs. The obtained results are statistically compared with those obtained by the official methods, showing no significant difference with respect to accuracy and precision at p = 0.05.
Evaluation of droplet size distributions using univariate and multivariate approaches
DEFF Research Database (Denmark)
Gauno, M.H.; Larsen, C.C.; Vilhelmsen, T.
2013-01-01
of the distribution. The current study was aiming to compare univariate and multivariate approach in evaluating droplet size distributions. As a model system, the atomization of a coating solution from a two-fluid nozzle was investigated. The effect of three process parameters (concentration of ethyl cellulose...... in ethanol, atomizing air pressure, and flow rate of coating solution) on the droplet size and droplet size distribution using a full mixed factorial design was used. The droplet size produced by a two-fluid nozzle was measured by laser diffraction and reported as volume based size distribution....... Investigation of loading and score plots from principal component analysis (PCA) revealed additional information on the droplet size distributions and it was possible to identify univariate statistics (volume median droplet size), which were similar, however, originating from varying droplet size distributions...
Evaluation of droplet size distributions using univariate and multivariate approaches.
Gaunø, Mette Høg; Larsen, Crilles Casper; Vilhelmsen, Thomas; Møller-Sonnergaard, Jørn; Wittendorff, Jørgen; Rantanen, Jukka
2013-01-01
Pharmaceutically relevant material characteristics are often analyzed based on univariate descriptors instead of utilizing the whole information available in the full distribution. One example is droplet size distribution, which is often described by the median droplet size and the width of the distribution. The current study was aiming to compare univariate and multivariate approach in evaluating droplet size distributions. As a model system, the atomization of a coating solution from a two-fluid nozzle was investigated. The effect of three process parameters (concentration of ethyl cellulose in ethanol, atomizing air pressure, and flow rate of coating solution) on the droplet size and droplet size distribution using a full mixed factorial design was used. The droplet size produced by a two-fluid nozzle was measured by laser diffraction and reported as volume based size distribution. Investigation of loading and score plots from principal component analysis (PCA) revealed additional information on the droplet size distributions and it was possible to identify univariate statistics (volume median droplet size), which were similar, however, originating from varying droplet size distributions. The multivariate data analysis was proven to be an efficient tool for evaluating the full information contained in a distribution.
Cain, Meghan K; Zhang, Zhiyong; Yuan, Ke-Hai
2017-10-01
Nonnormality of univariate data has been extensively examined previously (Blanca et al., Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 9(2), 78-84, 2013; Miceeri, Psychological Bulletin, 105(1), 156, 1989). However, less is known of the potential nonnormality of multivariate data although multivariate analysis is commonly used in psychological and educational research. Using univariate and multivariate skewness and kurtosis as measures of nonnormality, this study examined 1,567 univariate distriubtions and 254 multivariate distributions collected from authors of articles published in Psychological Science and the American Education Research Journal. We found that 74 % of univariate distributions and 68 % multivariate distributions deviated from normal distributions. In a simulation study using typical values of skewness and kurtosis that we collected, we found that the resulting type I error rates were 17 % in a t-test and 30 % in a factor analysis under some conditions. Hence, we argue that it is time to routinely report skewness and kurtosis along with other summary statistics such as means and variances. To facilitate future report of skewness and kurtosis, we provide a tutorial on how to compute univariate and multivariate skewness and kurtosis by SAS, SPSS, R and a newly developed Web application.
Statistical significance of cis-regulatory modules
Directory of Open Access Journals (Sweden)
Smith Andrew D
2007-01-01
Full Text Available Abstract Background It is becoming increasingly important for researchers to be able to scan through large genomic regions for transcription factor binding sites or clusters of binding sites forming cis-regulatory modules. Correspondingly, there has been a push to develop algorithms for the rapid detection and assessment of cis-regulatory modules. While various algorithms for this purpose have been introduced, most are not well suited for rapid, genome scale scanning. Results We introduce methods designed for the detection and statistical evaluation of cis-regulatory modules, modeled as either clusters of individual binding sites or as combinations of sites with constrained organization. In order to determine the statistical significance of module sites, we first need a method to determine the statistical significance of single transcription factor binding site matches. We introduce a straightforward method of estimating the statistical significance of single site matches using a database of known promoters to produce data structures that can be used to estimate p-values for binding site matches. We next introduce a technique to calculate the statistical significance of the arrangement of binding sites within a module using a max-gap model. If the module scanned for has defined organizational parameters, the probability of the module is corrected to account for organizational constraints. The statistical significance of single site matches and the architecture of sites within the module can be combined to provide an overall estimation of statistical significance of cis-regulatory module sites. Conclusion The methods introduced in this paper allow for the detection and statistical evaluation of single transcription factor binding sites and cis-regulatory modules. The features described are implemented in the Search Tool for Occurrences of Regulatory Motifs (STORM and MODSTORM software.
A comparison of bivariate and univariate QTL mapping in livestock populations
Directory of Open Access Journals (Sweden)
Sorensen Daniel
2003-11-01
Full Text Available Abstract This study presents a multivariate, variance component-based QTL mapping model implemented via restricted maximum likelihood (REML. The method was applied to investigate bivariate and univariate QTL mapping analyses, using simulated data. Specifically, we report results on the statistical power to detect a QTL and on the precision of parameter estimates using univariate and bivariate approaches. The model and methodology were also applied to study the effectiveness of partitioning the overall genetic correlation between two traits into a component due to many genes of small effect, and one due to the QTL. It is shown that when the QTL has a pleiotropic effect on two traits, a bivariate analysis leads to a higher statistical power of detecting the QTL and to a more precise estimate of the QTL's map position, in particular in the case when the QTL has a small effect on the trait. The increase in power is most marked in cases where the contributions of the QTL and of the polygenic components to the genetic correlation have opposite signs. The bivariate REML analysis can successfully partition the two components contributing to the genetic correlation between traits.
Maric, M.; de Haan, M.; Hogendoorn, S.M.; Wolters, L.H.; Huizenga, H.M.
2015-01-01
Single-case experimental designs are useful methods in clinical research practice to investigate individual client progress. Their proliferation might have been hampered by methodological challenges such as the difficulty applying existing statistical procedures. In this article, we describe a
Maric, Marija; de Haan, Else; Hogendoorn, Sanne M.; Wolters, Lidewij H.; Huizenga, Hilde M.
2015-01-01
Single-case experimental designs are useful methods in clinical research practice to investigate individual client progress. Their proliferation might have been hampered by methodological challenges such as the difficulty applying existing statistical procedures. In this article, we describe a
Vecchiato, G; De Vico Fallani, F; Astolfi, L; Toppi, J; Cincotti, F; Mattia, D; Salinari, S; Babiloni, F
2010-08-30
This paper presents some considerations about the use of adequate statistical techniques in the framework of the neuroelectromagnetic brain mapping. With the use of advanced EEG/MEG recording setup involving hundred of sensors, the issue of the protection against the type I errors that could occur during the execution of hundred of univariate statistical tests, has gained interest. In the present experiment, we investigated the EEG signals from a mannequin acting as an experimental subject. Data have been collected while performing a neuromarketing experiment and analyzed with state of the art computational tools adopted in specialized literature. Results showed that electric data from the mannequin's head presents statistical significant differences in power spectra during the visualization of a commercial advertising when compared to the power spectra gathered during a documentary, when no adjustments were made on the alpha level of the multiple univariate tests performed. The use of the Bonferroni or Bonferroni-Holm adjustments returned correctly no differences between the signals gathered from the mannequin in the two experimental conditions. An partial sample of recently published literature on different neuroscience journals suggested that at least the 30% of the papers do not use statistical protection for the type I errors. While the occurrence of type I errors could be easily managed with appropriate statistical techniques, the use of such techniques is still not so largely adopted in the literature. Copyright (c) 2010 Elsevier B.V. All rights reserved.
VC-dimension of univariate decision trees.
Yildiz, Olcay Taner
2015-02-01
In this paper, we give and prove the lower bounds of the Vapnik-Chervonenkis (VC)-dimension of the univariate decision tree hypothesis class. The VC-dimension of the univariate decision tree depends on the VC-dimension values of its subtrees and the number of inputs. Via a search algorithm that calculates the VC-dimension of univariate decision trees exhaustively, we show that our VC-dimension bounds are tight for simple trees. To verify that the VC-dimension bounds are useful, we also use them to get VC-generalization bounds for complexity control using structural risk minimization in decision trees, i.e., pruning. Our simulation results show that structural risk minimization pruning using the VC-dimension bounds finds trees that are more accurate as those pruned using cross validation.
The thresholds for statistical and clinical significance
DEFF Research Database (Denmark)
Jakobsen, Janus Christian; Gluud, Christian; Winkel, Per
2014-01-01
BACKGROUND: Thresholds for statistical significance are insufficiently demonstrated by 95% confidence intervals or P-values when assessing results from randomised clinical trials. First, a P-value only shows the probability of getting a result assuming that the null hypothesis is true and does...... not reflect the probability of getting a result assuming an alternative hypothesis to the null hypothesis is true. Second, a confidence interval or a P-value showing significance may be caused by multiplicity. Third, statistical significance does not necessarily result in clinical significance. Therefore...... of the probability that a given trial result is compatible with a 'null' effect (corresponding to the P-value) divided by the probability that the trial result is compatible with the intervention effect hypothesised in the sample size calculation; (3) adjust the confidence intervals and the statistical significance...
Takayama, Motoharu; Terui, Keita; Oiwa, Yoshitsugu
2012-10-01
Chronic subdural hematoma is common in elderly individuals and surgical procedures are simple. The recurrence rate of chronic subdural hematoma, however, varies from 9.2 to 26.5% after surgery. The authors studied factors of the recurrence using univariate and multivariate analyses in patients with chronic subdural hematoma We retrospectively reviewed 239 consecutive cases of chronic subdural hematoma who received burr-hole surgery with irrigation and closed-system drainage. We analyzed the relationships between recurrence of chronic subdural hematoma and factors such as sex, age, laterality, bleeding tendency, other complicated diseases, density on CT, volume of the hematoma, residual air in the hematoma cavity, use of artificial cerebrospinal fluid. Twenty-one patients (8.8%) experienced a recurrence of chronic subdural hematoma. Multiple logistic regression found that the recurrence rate was higher in patients with a large volume of the residual air, and was lower in patients using artificial cerebrospinal fluid. No statistical differences were found in bleeding tendency. Techniques to reduce the air in the hematoma cavity are important for good outcome in surgery of chronic subdural hematoma. Also, the use of artificial cerebrospinal fluid reduces recurrence of chronic subdural hematoma. The surgical procedures can be the same for patients with bleeding tendencies.
FUNSTAT and statistical image representations
Parzen, E.
1983-01-01
General ideas of functional statistical inference analysis of one sample and two samples, univariate and bivariate are outlined. ONESAM program is applied to analyze the univariate probability distributions of multi-spectral image data.
The insignificance of statistical significance testing
Johnson, Douglas H.
1999-01-01
Despite their use in scientific journals such as The Journal of Wildlife Management, statistical hypothesis tests add very little value to the products of research. Indeed, they frequently confuse the interpretation of data. This paper describes how statistical hypothesis tests are often viewed, and then contrasts that interpretation with the correct one. I discuss the arbitrariness of P-values, conclusions that the null hypothesis is true, power analysis, and distinctions between statistical and biological significance. Statistical hypothesis testing, in which the null hypothesis about the properties of a population is almost always known a priori to be false, is contrasted with scientific hypothesis testing, which examines a credible null hypothesis about phenomena in nature. More meaningful alternatives are briefly outlined, including estimation and confidence intervals for determining the importance of factors, decision theory for guiding actions in the face of uncertainty, and Bayesian approaches to hypothesis testing and other statistical practices.
New Riemannian Priors on the Univariate Normal Model
Directory of Open Access Journals (Sweden)
Salem Said
2014-07-01
Full Text Available The current paper introduces new prior distributions on the univariate normal model, with the aim of applying them to the classification of univariate normal populations. These new prior distributions are entirely based on the Riemannian geometry of the univariate normal model, so that they can be thought of as “Riemannian priors”. Precisely, if {pθ ; θ ∈ Θ} is any parametrization of the univariate normal model, the paper considers prior distributions G( θ - , γ with hyperparameters θ - ∈ Θ and γ > 0, whose density with respect to Riemannian volume is proportional to exp(−d2(θ, θ - /2γ2, where d2(θ, θ - is the square of Rao’s Riemannian distance. The distributions G( θ - , γ are termed Gaussian distributions on the univariate normal model. The motivation for considering a distribution G( θ - , γ is that this distribution gives a geometric representation of a class or cluster of univariate normal populations. Indeed, G( θ - , γ has a unique mode θ - (precisely, θ - is the unique Riemannian center of mass of G( θ - , γ, as shown in the paper, and its dispersion away from θ - is given by γ. Therefore, one thinks of members of the class represented by G( θ - , γ as being centered around θ - and lying within a typical distance determined by γ. The paper defines rigorously the Gaussian distributions G( θ - , γ and describes an algorithm for computing maximum likelihood estimates of their hyperparameters. Based on this algorithm and on the Laplace approximation, it describes how the distributions G( θ - , γ can be used as prior distributions for Bayesian classification of large univariate normal populations. In a concrete application to texture image classification, it is shown that this leads to an improvement in performance over the use of conjugate priors.
Significance levels for studies with correlated test statistics.
Shi, Jianxin; Levinson, Douglas F; Whittemore, Alice S
2008-07-01
When testing large numbers of null hypotheses, one needs to assess the evidence against the global null hypothesis that none of the hypotheses is false. Such evidence typically is based on the test statistic of the largest magnitude, whose statistical significance is evaluated by permuting the sample units to simulate its null distribution. Efron (2007) has noted that correlation among the test statistics can induce substantial interstudy variation in the shapes of their histograms, which may cause misleading tail counts. Here, we show that permutation-based estimates of the overall significance level also can be misleading when the test statistics are correlated. We propose that such estimates be conditioned on a simple measure of the spread of the observed histogram, and we provide a method for obtaining conditional significance levels. We justify this conditioning using the conditionality principle described by Cox and Hinkley (1974). Application of the method to gene expression data illustrates the circumstances when conditional significance levels are needed.
Zhang, Yong; Zhong, Miner; Geng, Nana; Jiang, Yunjian
2017-01-01
The market demand for electric vehicles (EVs) has increased in recent years. Suitable models are necessary to understand and forecast EV sales. This study presents a singular spectrum analysis (SSA) as a univariate time-series model and vector autoregressive model (VAR) as a multivariate model. Empirical results suggest that SSA satisfactorily indicates the evolving trend and provides reasonable results. The VAR model, which comprised exogenous parameters related to the market on a monthly basis, can significantly improve the prediction accuracy. The EV sales in China, which are categorized into battery and plug-in EVs, are predicted in both short term (up to December 2017) and long term (up to 2020), as statistical proofs of the growth of the Chinese EV industry.
Directory of Open Access Journals (Sweden)
Abdelfattah M. Selim
2018-03-01
Full Text Available Aim: The present cross-sectional study was conducted to determine the seroprevalence and potential risk factors associated with Bovine viral diarrhea virus (BVDV disease in cattle and buffaloes in Egypt, to model the potential risk factors associated with the disease using logistic regression (LR models, and to fit the best predictive model for the current data. Materials and Methods: A total of 740 blood samples were collected within November 2012-March 2013 from animals aged between 6 months and 3 years. The potential risk factors studied were species, age, sex, and herd location. All serum samples were examined with indirect ELIZA test for antibody detection. Data were analyzed with different statistical approaches such as Chi-square test, odds ratios (OR, univariable, and multivariable LR models. Results: Results revealed a non-significant association between being seropositive with BVDV and all risk factors, except for species of animal. Seroprevalence percentages were 40% and 23% for cattle and buffaloes, respectively. OR for all categories were close to one with the highest OR for cattle relative to buffaloes, which was 2.237. Likelihood ratio tests showed a significant drop of the -2LL from univariable LR to multivariable LR models. Conclusion: There was an evidence of high seroprevalence of BVDV among cattle as compared with buffaloes with the possibility of infection in different age groups of animals. In addition, multivariable LR model was proved to provide more information for association and prediction purposes relative to univariable LR models and Chi-square tests if we have more than one predictor.
Caveats for using statistical significance tests in research assessments
DEFF Research Database (Denmark)
Schneider, Jesper Wiborg
2013-01-01
controversial and numerous criticisms have been leveled against their use. Based on examples from articles by proponents of the use statistical significance tests in research assessments, we address some of the numerous problems with such tests. The issues specifically discussed are the ritual practice......This article raises concerns about the advantages of using statistical significance tests in research assessments as has recently been suggested in the debate about proper normalization procedures for citation indicators by Opthof and Leydesdorff (2010). Statistical significance tests are highly...... argue that applying statistical significance tests and mechanically adhering to their results are highly problematic and detrimental to critical thinking. We claim that the use of such tests do not provide any advantages in relation to deciding whether differences between citation indicators...
Statistically significant relational data mining :
Energy Technology Data Exchange (ETDEWEB)
Berry, Jonathan W.; Leung, Vitus Joseph; Phillips, Cynthia Ann; Pinar, Ali; Robinson, David Gerald; Berger-Wolf, Tanya; Bhowmick, Sanjukta; Casleton, Emily; Kaiser, Mark; Nordman, Daniel J.; Wilson, Alyson G.
2014-02-01
This report summarizes the work performed under the project (3z(BStatitically significant relational data mining.(3y (BThe goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concetrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second are statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.
Directory of Open Access Journals (Sweden)
Priya Ranganathan
2015-01-01
Full Text Available In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ′P′ value, explain the importance of ′confidence intervals′ and clarify the importance of including both values in a paper
A consistent framework for Horton regression statistics that leads to a modified Hack's law
Furey, P.R.; Troutman, B.M.
2008-01-01
A statistical framework is introduced that resolves important problems with the interpretation and use of traditional Horton regression statistics. The framework is based on a univariate regression model that leads to an alternative expression for Horton ratio, connects Horton regression statistics to distributional simple scaling, and improves the accuracy in estimating Horton plot parameters. The model is used to examine data for drainage area A and mainstream length L from two groups of basins located in different physiographic settings. Results show that confidence intervals for the Horton plot regression statistics are quite wide. Nonetheless, an analysis of covariance shows that regression intercepts, but not regression slopes, can be used to distinguish between basin groups. The univariate model is generalized to include n > 1 dependent variables. For the case where the dependent variables represent ln A and ln L, the generalized model performs somewhat better at distinguishing between basin groups than two separate univariate models. The generalized model leads to a modification of Hack's law where L depends on both A and Strahler order ??. Data show that ?? plays a statistically significant role in the modified Hack's law expression. ?? 2008 Elsevier B.V.
Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J
2008-01-01
ABSTRACT The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling T-square test, and, finally out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps-sets of genes coordinately involved in key biological processes-with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of
Basic elements of computational statistics
Härdle, Wolfgang Karl; Okhrin, Yarema
2017-01-01
This textbook on computational statistics presents tools and concepts of univariate and multivariate statistical data analysis with a strong focus on applications and implementations in the statistical software R. It covers mathematical, statistical as well as programming problems in computational statistics and contains a wide variety of practical examples. In addition to the numerous R sniplets presented in the text, all computer programs (quantlets) and data sets to the book are available on GitHub and referred to in the book. This enables the reader to fully reproduce as well as modify and adjust all examples to their needs. The book is intended for advanced undergraduate and first-year graduate students as well as for data analysts new to the job who would like a tour of the various statistical tools in a data analysis workshop. The experienced reader with a good knowledge of statistics and programming might skip some sections on univariate models and enjoy the various mathematical roots of multivariate ...
Statistical Analysis Of Reconnaissance Geochemical Data From ...
African Journals Online (AJOL)
, Co, Mo, Hg, Sb, Tl, Sc, Cr, Ni, La, W, V, U, Th, Bi, Sr and Ga in 56 stream sediment samples collected from Orle drainage system were subjected to univariate and multivariate statistical analyses. The univariate methods used include ...
Health significance and statistical uncertainty. The value of P-value.
Consonni, Dario; Bertazzi, Pier Alberto
2017-10-27
The P-value is widely used as a summary statistics of scientific results. Unfortunately, there is a widespread tendency to dichotomize its value in "P0.05" ("statistically not significant"), with the former implying a "positive" result and the latter a "negative" one. To show the unsuitability of such an approach when evaluating the effects of environmental and occupational risk factors. We provide examples of distorted use of P-value and of the negative consequences for science and public health of such a black-and-white vision. The rigid interpretation of P-value as a dichotomy favors the confusion between health relevance and statistical significance, discourages thoughtful thinking, and distorts attention from what really matters, the health significance. A much better way to express and communicate scientific results involves reporting effect estimates (e.g., risks, risks ratios or risk differences) and their confidence intervals (CI), which summarize and convey both health significance and statistical uncertainty. Unfortunately, many researchers do not usually consider the whole interval of CI but only examine if it includes the null-value, therefore degrading this procedure to the same P-value dichotomy (statistical significance or not). In reporting statistical results of scientific research present effects estimates with their confidence intervals and do not qualify the P-value as "significant" or "not significant".
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ‘P’ value, explain the importance of ‘confidence intervals’ and clarify the importance of including both values in a paper PMID:25878958
Directory of Open Access Journals (Sweden)
Lei Zhang
2009-08-01
Full Text Available As genome-wide association studies (GWAS are becoming more popular, two approaches, among others, could be considered in order to improve statistical power for identifying genes contributing subtle to moderate effects to human diseases. The first approach is to increase sample size, which could be achieved by combining both unrelated and familial subjects together. The second approach is to jointly analyze multiple correlated traits. In this study, by extending generalized estimating equations (GEEs, we propose a simple approach for performing univariate or multivariate association tests for the combined data of unrelated subjects and nuclear families. In particular, we correct for population stratification by integrating principal component analysis and transmission disequilibrium test strategies. The proposed method allows for multiple siblings as well as missing parental information. Simulation studies show that the proposed test has improved power compared to two popular methods, EIGENSTRAT and FBAT, by analyzing the combined data, while correcting for population stratification. In addition, joint analysis of bivariate traits has improved power over univariate analysis when pleiotropic effects are present. Application to the Genetic Analysis Workshop 16 (GAW16 data sets attests to the feasibility and applicability of the proposed method.
Comparison of spectrum normalization techniques for univariate ...
Indian Academy of Sciences (India)
Laser-induced breakdown spectroscopy; univariate study; normalization models; stainless steel; standard error of prediction. Abstract. Analytical performance of six different spectrum normalization techniques, namelyinternal normalization, normalization with total light, normalization with background along with their ...
Understanding the Sampling Distribution and Its Use in Testing Statistical Significance.
Breunig, Nancy A.
Despite the increasing criticism of statistical significance testing by researchers, particularly in the publication of the 1994 American Psychological Association's style manual, statistical significance test results are still popular in journal articles. For this reason, it remains important to understand the logic of inferential statistics. A…
Taylor, Sandra L; Ruhaak, L Renee; Weiss, Robert H; Kelly, Karen; Kim, Kyoungmi
2017-01-01
High through-put mass spectrometry (MS) is now being used to profile small molecular compounds across multiple biological sample types from the same subjects with the goal of leveraging information across biospecimens. Multivariate statistical methods that combine information from all biospecimens could be more powerful than the usual univariate analyses. However, missing values are common in MS data and imputation can impact between-biospecimen correlation and multivariate analysis results. We propose two multivariate two-part statistics that accommodate missing values and combine data from all biospecimens to identify differentially regulated compounds. Statistical significance is determined using a multivariate permutation null distribution. Relative to univariate tests, the multivariate procedures detected more significant compounds in three biological datasets. In a simulation study, we showed that multi-biospecimen testing procedures were more powerful than single-biospecimen methods when compounds are differentially regulated in multiple biospecimens but univariate methods can be more powerful if compounds are differentially regulated in only one biospecimen. We provide R functions to implement and illustrate our method as supplementary information CONTACT: sltaylor@ucdavis.eduSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Directory of Open Access Journals (Sweden)
Kеnan Аsani
2013-07-01
Full Text Available The aim is to establish intergroup multivariant and univariant investigated differences in specific motor space between respondents juniors and seniors members of the Macedonian karate team. The sample of 30 male karate respondents covers juniors on 16,17 and seniors over 18 years.In the research were applied 20 specific motor tests. Based on Graph 1 where it is presented multivariant analysis of variance Manova and Anova can be noted that respondents juniors and seniors, although not belonging to the same population are not different in multivariant understudied area.W. lambda of .19, Rao-wool R - Approximation of 1.91 degrees of freedom df 1 = 20 and df 2 = 9 provides the level of significance of p =, 16. Based on univariant analysis for each variable separately can be seen that has been around intergroup statistically significant difference in seven SMAEGERI (kick in the sack with favoritism leg mae geri for 10 sec., SMAVASI (kick in the sack with favoritism foot mavashi geri by 10 sec., SUSIRO (kick in the sack with favoritism leg ushiro geri for 10 sec., SKIZAME (kick in the sack with favoritism hand kizame cuki for 10 sec., STAPNSR (taping with foot in sagital plane for 15 sec. SUDMNR (hitting a moving target with weaker hand and SUDMPN (hitting a moving target with favoritism foot of twenty applied manifest variables. There are no intergroup differences in multivariant investigated specific - motor space among the respondents juniors and seniors members of the Macedonian karate team. Based on univariant analysis for each variable separately can be seen that has been around intergroup statistically significant difference in seven SMAEGERI (kick in the sack with favoritism leg mae geri for 10 sec., SMAVASI (kick in the sack with favoritism foot mavashi geri by 10 sec., SUSIRO (kick in the sack with favoritism leg ushiro geri for 10 sec., SKIZAME (kick in the sack with favoritism hand kizame cuki for 10 sec., STAPNSR (taping with foot in
Swiss solar power statistics 2007 - Significant expansion
International Nuclear Information System (INIS)
Hostettler, T.
2008-01-01
This article presents and discusses the 2007 statistics for solar power in Switzerland. A significant number of new installations is noted as is the high production figures from newer installations. The basics behind the compilation of the Swiss solar power statistics are briefly reviewed and an overview for the period 1989 to 2007 is presented which includes figures on the number of photovoltaic plant in service and installed peak power. Typical production figures in kilowatt-hours (kWh) per installed kilowatt-peak power (kWp) are presented and discussed for installations of various sizes. Increased production after inverter replacement in older installations is noted. Finally, the general political situation in Switzerland as far as solar power is concerned are briefly discussed as are international developments.
Test for the statistical significance of differences between ROC curves
International Nuclear Information System (INIS)
Metz, C.E.; Kronman, H.B.
1979-01-01
A test for the statistical significance of observed differences between two measured Receiver Operating Characteristic (ROC) curves has been designed and evaluated. The set of observer response data for each ROC curve is assumed to be independent and to arise from a ROC curve having a form which, in the absence of statistical fluctuations in the response data, graphs as a straight line on double normal-deviate axes. To test the significance of an apparent difference between two measured ROC curves, maximum likelihood estimates of the two parameters of each curve and the associated parameter variances and covariance are calculated from the corresponding set of observer response data. An approximate Chi-square statistic with two degrees of freedom is then constructed from the differences between the parameters estimated for each ROC curve and from the variances and covariances of these estimates. This statistic is known to be truly Chi-square distributed only in the limit of large numbers of trials in the observer performance experiments. Performance of the statistic for data arising from a limited number of experimental trials was evaluated. Independent sets of rating scale data arising from the same underlying ROC curve were paired, and the fraction of differences found (falsely) significant was compared to the significance level, α, used with the test. Although test performance was found to be somewhat dependent on both the number of trials in the data and the position of the underlying ROC curve in the ROC space, the results for various significance levels showed the test to be reliable under practical experimental conditions
Testing for significance of phase synchronisation dynamics in the EEG.
Daly, Ian; Sweeney-Reed, Catherine M; Nasuto, Slawomir J
2013-06-01
A number of tests exist to check for statistical significance of phase synchronisation within the Electroencephalogram (EEG); however, the majority suffer from a lack of generality and applicability. They may also fail to account for temporal dynamics in the phase synchronisation, regarding synchronisation as a constant state instead of a dynamical process. Therefore, a novel test is developed for identifying the statistical significance of phase synchronisation based upon a combination of work characterising temporal dynamics of multivariate time-series and Markov modelling. We show how this method is better able to assess the significance of phase synchronisation than a range of commonly used significance tests. We also show how the method may be applied to identify and classify significantly different phase synchronisation dynamics in both univariate and multivariate datasets.
Directory of Open Access Journals (Sweden)
J Ghanbari
2017-06-01
Full Text Available Introduction Cumin is one of the most important medicinal plants in Iran and today, it is in the second level of popularity between spices in the world after black pepper. Cumin is an aromatic plant used as flavoring and seasoning agent in foods. Cumin seeds have been found to possess significant biological and have been used for treatment of toothache, dyspepsia, diarrhoea, epilepsy and jaundice. Knowledge of GEI is advantageous to have a cultivar that gives consistently high yield in a broad range of environments and to increase efficiency of breeding program and selection of best genotypes. A genotype that has stable trait expression across environments contributes little to GEI and its performance should be more predictable from the main several statistical methods have been proposed for stability analysis, with the aim of explaining the information contained in the GEI. Regression technique was proposed by Finlay and Wilkinson (1963 and was improved by Eberhart and Russell (1966. Generally, genotype stability was estimated by the slope of and deviation from the regression line for each of the genotypes. This is a popular method in stability analysis and has been applied in many crops. Non-parametric methods (rank mean (R, standard deviation rank (SDR and yield index ratio (YIR, environmental variance (S2i and genotypic variation coefficient (CVi Wricke's ecovalence and Shukla's stability variance (Shukla, 1972 have been used to determine genotype-by-environment interaction in many studies. This study was aimed to evaluate the ecotype × sowing date interaction in cumin and to evaluation of genotypic response of cumin to different sowing dates using univariate stability parameters. Materials and Methods In order to study of ecotype × sowing date interaction, different cumin ecotypes: Semnan, Fars, Yazd, Golestan, Khorasan-Razavi, Khorasan-Shomali, Khorasan-Jonoubi, Isfahan and Kerman in 5 different sowing dates (26th December, 10th January
Automatic Image Segmentation Using Active Contours with Univariate Marginal Distribution
Directory of Open Access Journals (Sweden)
I. Cruz-Aceves
2013-01-01
Full Text Available This paper presents a novel automatic image segmentation method based on the theory of active contour models and estimation of distribution algorithms. The proposed method uses the univariate marginal distribution model to infer statistical dependencies between the control points on different active contours. These contours have been generated through an alignment process of reference shape priors, in order to increase the exploration and exploitation capabilities regarding different interactive segmentation techniques. This proposed method is applied in the segmentation of the hollow core in microscopic images of photonic crystal fibers and it is also used to segment the human heart and ventricular areas from datasets of computed tomography and magnetic resonance images, respectively. Moreover, to evaluate the performance of the medical image segmentations compared to regions outlined by experts, a set of similarity measures has been adopted. The experimental results suggest that the proposed image segmentation method outperforms the traditional active contour model and the interactive Tseng method in terms of segmentation accuracy and stability.
Practical statistics a handbook for business projects
Buglear, John
2013-01-01
Practical Statistics is a hands-on guide to statistics, progressing by complexity of data (univariate, bivariate, multivariate) and analysis (portray, summarise, generalise) in order to give the reader a solid understanding of the fundamentals and how to apply them.
On detection and assessment of statistical significance of Genomic Islands
Directory of Open Access Journals (Sweden)
Chaudhuri Probal
2008-04-01
Full Text Available Abstract Background Many of the available methods for detecting Genomic Islands (GIs in prokaryotic genomes use markers such as transposons, proximal tRNAs, flanking repeats etc., or they use other supervised techniques requiring training datasets. Most of these methods are primarily based on the biases in GC content or codon and amino acid usage of the islands. However, these methods either do not use any formal statistical test of significance or use statistical tests for which the critical values and the P-values are not adequately justified. We propose a method, which is unsupervised in nature and uses Monte-Carlo statistical tests based on randomly selected segments of a chromosome. Such tests are supported by precise statistical distribution theory, and consequently, the resulting P-values are quite reliable for making the decision. Results Our algorithm (named Design-Island, an acronym for Detection of Statistically Significant Genomic Island runs in two phases. Some 'putative GIs' are identified in the first phase, and those are refined into smaller segments containing horizontally acquired genes in the refinement phase. This method is applied to Salmonella typhi CT18 genome leading to the discovery of several new pathogenicity, antibiotic resistance and metabolic islands that were missed by earlier methods. Many of these islands contain mobile genetic elements like phage-mediated genes, transposons, integrase and IS elements confirming their horizontal acquirement. Conclusion The proposed method is based on statistical tests supported by precise distribution theory and reliable P-values along with a technique for visualizing statistically significant islands. The performance of our method is better than many other well known methods in terms of their sensitivity and accuracy, and in terms of specificity, it is comparable to other methods.
Forecasting electricity spot-prices using linear univariate time-series models
International Nuclear Information System (INIS)
Cuaresma, Jesus Crespo; Hlouskova, Jaroslava; Kossmeier, Stephan; Obersteiner, Michael
2004-01-01
This paper studies the forecasting abilities of a battery of univariate models on hourly electricity spot prices, using data from the Leipzig Power Exchange. The specifications studied include autoregressive models, autoregressive-moving average models and unobserved component models. The results show that specifications, where each hour of the day is modelled separately present uniformly better forecasting properties than specifications for the whole time-series, and that the inclusion of simple probabilistic processes for the arrival of extreme price events can lead to improvements in the forecasting abilities of univariate models for electricity spot prices. (Author)
Increasing the statistical significance of entanglement detection in experiments.
Jungnitsch, Bastian; Niekamp, Sönke; Kleinmann, Matthias; Gühne, Otfried; Lu, He; Gao, Wei-Bo; Chen, Yu-Ao; Chen, Zeng-Bing; Pan, Jian-Wei
2010-05-28
Entanglement is often verified by a violation of an inequality like a Bell inequality or an entanglement witness. Considerable effort has been devoted to the optimization of such inequalities in order to obtain a high violation. We demonstrate theoretically and experimentally that such an optimization does not necessarily lead to a better entanglement test, if the statistical error is taken into account. Theoretically, we show for different error models that reducing the violation of an inequality can improve the significance. Experimentally, we observe this phenomenon in a four-photon experiment, testing the Mermin and Ardehali inequality for different levels of noise. Furthermore, we provide a way to develop entanglement tests with high statistical significance.
Testing the Difference of Correlated Agreement Coefficients for Statistical Significance
Gwet, Kilem L.
2016-01-01
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…
Comparison of different Methods for Univariate Time Series Imputation in R
Moritz, Steffen; Sardá, Alexis; Bartz-Beielstein, Thomas; Zaefferer, Martin; Stork, Jörg
2015-01-01
Missing values in datasets are a well-known problem and there are quite a lot of R packages offering imputation functions. But while imputation in general is well covered within R, it is hard to find functions for imputation of univariate time series. The problem is, most standard imputation techniques can not be applied directly. Most algorithms rely on inter-attribute correlations, while univariate time series imputation needs to employ time dependencies. This paper provides an overview of ...
Kryklywy, James H; Macpherson, Ewan A; Mitchell, Derek G V
2018-04-01
Emotion can have diverse effects on behaviour and perception, modulating function in some circumstances, and sometimes having little effect. Recently, it was identified that part of the heterogeneity of emotional effects could be due to a dissociable representation of emotion in dual pathway models of sensory processing. Our previous fMRI experiment using traditional univariate analyses showed that emotion modulated processing in the auditory 'what' but not 'where' processing pathway. The current study aims to further investigate this dissociation using a more recently emerging multi-voxel pattern analysis searchlight approach. While undergoing fMRI, participants localized sounds of varying emotional content. A searchlight multi-voxel pattern analysis was conducted to identify activity patterns predictive of sound location and/or emotion. Relative to the prior univariate analysis, MVPA indicated larger overlapping spatial and emotional representations of sound within early secondary regions associated with auditory localization. However, consistent with the univariate analysis, these two dimensions were increasingly segregated in late secondary and tertiary regions of the auditory processing streams. These results, while complimentary to our original univariate analyses, highlight the utility of multiple analytic approaches for neuroimaging, particularly for neural processes with known representations dependent on population coding.
Statistical Significance for Hierarchical Clustering
Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.
2017-01-01
Summary Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990
Statistical significance of trends in monthly heavy precipitation over the US
Mahajan, Salil
2011-05-11
Trends in monthly heavy precipitation, defined by a return period of one year, are assessed for statistical significance in observations and Global Climate Model (GCM) simulations over the contiguous United States using Monte Carlo non-parametric and parametric bootstrapping techniques. The results from the two Monte Carlo approaches are found to be similar to each other, and also to the traditional non-parametric Kendall\\'s τ test, implying the robustness of the approach. Two different observational data-sets are employed to test for trends in monthly heavy precipitation and are found to exhibit consistent results. Both data-sets demonstrate upward trends, one of which is found to be statistically significant at the 95% confidence level. Upward trends similar to observations are observed in some climate model simulations of the twentieth century, but their statistical significance is marginal. For projections of the twenty-first century, a statistically significant upwards trend is observed in most of the climate models analyzed. The change in the simulated precipitation variance appears to be more important in the twenty-first century projections than changes in the mean precipitation. Stochastic fluctuations of the climate-system are found to be dominate monthly heavy precipitation as some GCM simulations show a downwards trend even in the twenty-first century projections when the greenhouse gas forcings are strong. © 2011 Springer-Verlag.
ASURV: Astronomical SURVival Statistics
Feigelson, E. D.; Nelson, P. I.; Isobe, T.; LaValley, M.
2014-06-01
ASURV (Astronomical SURVival Statistics) provides astronomy survival analysis for right- and left-censored data including the maximum-likelihood Kaplan-Meier estimator and several univariate two-sample tests, bivariate correlation measures, and linear regressions. ASURV is written in FORTRAN 77, and is stand-alone and does not call any specialized libraries.
Univariate characterization of the German business cycle 1955-1994
Weihs, Claus; Garczarek, Ursula
2002-01-01
We present a descriptive analysis of stylized facts for the German business cycle. We demonstrate that simple ad-hoc instructions for identifying univariate rules characterizing the German business cycle 1955-1994 lead to an error rate comparable to standard multivariate methods.
Sibling Competition & Growth Tradeoffs. Biological vs. Statistical Significance.
Kramer, Karen L; Veile, Amanda; Otárola-Castillo, Erik
2016-01-01
Early childhood growth has many downstream effects on future health and reproduction and is an important measure of offspring quality. While a tradeoff between family size and child growth outcomes is theoretically predicted in high-fertility societies, empirical evidence is mixed. This is often attributed to phenotypic variation in parental condition. However, inconsistent study results may also arise because family size confounds the potentially differential effects that older and younger siblings can have on young children's growth. Additionally, inconsistent results might reflect that the biological significance associated with different growth trajectories is poorly understood. This paper addresses these concerns by tracking children's monthly gains in height and weight from weaning to age five in a high fertility Maya community. We predict that: 1) as an aggregate measure family size will not have a major impact on child growth during the post weaning period; 2) competition from young siblings will negatively impact child growth during the post weaning period; 3) however because of their economic value, older siblings will have a negligible effect on young children's growth. Accounting for parental condition, we use linear mixed models to evaluate the effects that family size, younger and older siblings have on children's growth. Congruent with our expectations, it is younger siblings who have the most detrimental effect on children's growth. While we find statistical evidence of a quantity/quality tradeoff effect, the biological significance of these results is negligible in early childhood. Our findings help to resolve why quantity/quality studies have had inconsistent results by showing that sibling competition varies with sibling age composition, not just family size, and that biological significance is distinct from statistical significance.
Sibling Competition & Growth Tradeoffs. Biological vs. Statistical Significance.
Directory of Open Access Journals (Sweden)
Karen L Kramer
Full Text Available Early childhood growth has many downstream effects on future health and reproduction and is an important measure of offspring quality. While a tradeoff between family size and child growth outcomes is theoretically predicted in high-fertility societies, empirical evidence is mixed. This is often attributed to phenotypic variation in parental condition. However, inconsistent study results may also arise because family size confounds the potentially differential effects that older and younger siblings can have on young children's growth. Additionally, inconsistent results might reflect that the biological significance associated with different growth trajectories is poorly understood. This paper addresses these concerns by tracking children's monthly gains in height and weight from weaning to age five in a high fertility Maya community. We predict that: 1 as an aggregate measure family size will not have a major impact on child growth during the post weaning period; 2 competition from young siblings will negatively impact child growth during the post weaning period; 3 however because of their economic value, older siblings will have a negligible effect on young children's growth. Accounting for parental condition, we use linear mixed models to evaluate the effects that family size, younger and older siblings have on children's growth. Congruent with our expectations, it is younger siblings who have the most detrimental effect on children's growth. While we find statistical evidence of a quantity/quality tradeoff effect, the biological significance of these results is negligible in early childhood. Our findings help to resolve why quantity/quality studies have had inconsistent results by showing that sibling competition varies with sibling age composition, not just family size, and that biological significance is distinct from statistical significance.
Increasing the statistical significance of entanglement detection in experiments
Energy Technology Data Exchange (ETDEWEB)
Jungnitsch, Bastian; Niekamp, Soenke; Kleinmann, Matthias; Guehne, Otfried [Institut fuer Quantenoptik und Quanteninformation, Innsbruck (Austria); Lu, He; Gao, Wei-Bo; Chen, Zeng-Bing [Hefei National Laboratory for Physical Sciences at Microscale and Department of Modern Physics, University of Science and Technology of China, Hefei (China); Chen, Yu-Ao; Pan, Jian-Wei [Hefei National Laboratory for Physical Sciences at Microscale and Department of Modern Physics, University of Science and Technology of China, Hefei (China); Physikalisches Institut, Universitaet Heidelberg (Germany)
2010-07-01
Entanglement is often verified by a violation of an inequality like a Bell inequality or an entanglement witness. Considerable effort has been devoted to the optimization of such inequalities in order to obtain a high violation. We demonstrate theoretically and experimentally that such an optimization does not necessarily lead to a better entanglement test, if the statistical error is taken into account. Theoretically, we show for different error models that reducing the violation of an inequality can improve the significance. We show this to be the case for an error model in which the variance of an observable is interpreted as its error and for the standard error model in photonic experiments. Specifically, we demonstrate that the Mermin inequality yields a Bell test which is statistically more significant than the Ardehali inequality in the case of a photonic four-qubit state that is close to a GHZ state. Experimentally, we observe this phenomenon in a four-photon experiment, testing the above inequalities for different levels of noise.
Reporting effect sizes as a supplement to statistical significance ...
African Journals Online (AJOL)
The purpose of the article is to review the statistical significance reporting practices in reading instruction studies and to provide guidelines for when to calculate and report effect sizes in educational research. A review of six readily accessible (online) and accredited journals publishing research on reading instruction ...
Your Chi-Square Test Is Statistically Significant: Now What?
Sharpe, Donald
2015-01-01
Applied researchers have employed chi-square tests for more than one hundred years. This paper addresses the question of how one should follow a statistically significant chi-square test result in order to determine the source of that result. Four approaches were evaluated: calculating residuals, comparing cells, ransacking, and partitioning. Data…
Directory of Open Access Journals (Sweden)
Melissa Coulson
2010-07-01
Full Text Available A statistically significant result, and a non-significant result may differ little, although significance status may tempt an interpretation of difference. Two studies are reported that compared interpretation of such results presented using null hypothesis significance testing (NHST, or confidence intervals (CIs. Authors of articles published in psychology, behavioural neuroscience, and medical journals were asked, via email, to interpret two fictitious studies that found similar results, one statistically significant, and the other non-significant. Responses from 330 authors varied greatly, but interpretation was generally poor, whether results were presented as CIs or using NHST. However, when interpreting CIs respondents who mentioned NHST were 60% likely to conclude, unjustifiably, the two results conflicted, whereas those who interpreted CIs without reference to NHST were 95% likely to conclude, justifiably, the two results were consistent. Findings were generally similar for all three disciplines. An email survey of academic psychologists confirmed that CIs elicit better interpretations if NHST is not invoked. Improved statistical inference can result from encouragement of meta-analytic thinking and use of CIs but, for full benefit, such highly desirable statistical reform requires also that researchers interpret CIs without recourse to NHST.
Computational statistics handbook with Matlab
Martinez, Wendy L
2007-01-01
Prefaces Introduction What Is Computational Statistics? An Overview of the Book Probability Concepts Introduction Probability Conditional Probability and Independence Expectation Common Distributions Sampling Concepts Introduction Sampling Terminology and Concepts Sampling Distributions Parameter Estimation Empirical Distribution Function Generating Random Variables Introduction General Techniques for Generating Random Variables Generating Continuous Random Variables Generating Discrete Random Variables Exploratory Data Analysis Introduction Exploring Univariate Data Exploring Bivariate and Trivariate Data Exploring Multidimensional Data Finding Structure Introduction Projecting Data Principal Component Analysis Projection Pursuit EDA Independent Component Analysis Grand Tour Nonlinear Dimensionality Reduction Monte Carlo Methods for Inferential Statistics Introduction Classical Inferential Statistics Monte Carlo Methods for Inferential Statist...
Combinatorial bounds on the α-divergence of univariate mixture models
Nielsen, Frank; Sun, Ke
2017-01-01
We derive lower- and upper-bounds of α-divergence between univariate mixture models with components in the exponential family. Three pairs of bounds are presented in order with increasing quality and increasing computational cost. They are verified
Davis, Tyler; LaRocque, Karen F; Mumford, Jeanette A; Norman, Kenneth A; Wagner, Anthony D; Poldrack, Russell A
2014-08-15
Multi-voxel pattern analysis (MVPA) has led to major changes in how fMRI data are analyzed and interpreted. Many studies now report both MVPA results and results from standard univariate voxel-wise analysis, often with the goal of drawing different conclusions from each. Because MVPA results can be sensitive to latent multidimensional representations and processes whereas univariate voxel-wise analysis cannot, one conclusion that is often drawn when MVPA and univariate results differ is that the activation patterns underlying MVPA results contain a multidimensional code. In the current study, we conducted simulations to formally test this assumption. Our findings reveal that MVPA tests are sensitive to the magnitude of voxel-level variability in the effect of a condition within subjects, even when the same linear relationship is coded in all voxels. We also find that MVPA is insensitive to subject-level variability in mean activation across an ROI, which is the primary variance component of interest in many standard univariate tests. Together, these results illustrate that differences between MVPA and univariate tests do not afford conclusions about the nature or dimensionality of the neural code. Instead, targeted tests of the informational content and/or dimensionality of activation patterns are critical for drawing strong conclusions about the representational codes that are indicated by significant MVPA results. Copyright © 2014 Elsevier Inc. All rights reserved.
Testing statistical significance scores of sequence comparison methods with structure similarity
Directory of Open Access Journals (Sweden)
Leunissen Jack AM
2006-10-01
Full Text Available Abstract Background In the past years the Smith-Waterman sequence comparison algorithm has gained popularity due to improved implementations and rapidly increasing computing power. However, the quality and sensitivity of a database search is not only determined by the algorithm but also by the statistical significance testing for an alignment. The e-value is the most commonly used statistical validation method for sequence database searching. The CluSTr database and the Protein World database have been created using an alternative statistical significance test: a Z-score based on Monte-Carlo statistics. Several papers have described the superiority of the Z-score as compared to the e-value, using simulated data. We were interested if this could be validated when applied to existing, evolutionary related protein sequences. Results All experiments are performed on the ASTRAL SCOP database. The Smith-Waterman sequence comparison algorithm with both e-value and Z-score statistics is evaluated, using ROC, CVE and AP measures. The BLAST and FASTA algorithms are used as reference. We find that two out of three Smith-Waterman implementations with e-value are better at predicting structural similarities between proteins than the Smith-Waterman implementation with Z-score. SSEARCH especially has very high scores. Conclusion The compute intensive Z-score does not have a clear advantage over the e-value. The Smith-Waterman implementations give generally better results than their heuristic counterparts. We recommend using the SSEARCH algorithm combined with e-values for pairwise sequence comparisons.
Statistical significance versus clinical relevance.
van Rijn, Marieke H C; Bech, Anneke; Bouyer, Jean; van den Brand, Jan A J G
2017-04-01
In March this year, the American Statistical Association (ASA) posted a statement on the correct use of P-values, in response to a growing concern that the P-value is commonly misused and misinterpreted. We aim to translate these warnings given by the ASA into a language more easily understood by clinicians and researchers without a deep background in statistics. Moreover, we intend to illustrate the limitations of P-values, even when used and interpreted correctly, and bring more attention to the clinical relevance of study findings using two recently reported studies as examples. We argue that P-values are often misinterpreted. A common mistake is saying that P < 0.05 means that the null hypothesis is false, and P ≥0.05 means that the null hypothesis is true. The correct interpretation of a P-value of 0.05 is that if the null hypothesis were indeed true, a similar or more extreme result would occur 5% of the times upon repeating the study in a similar sample. In other words, the P-value informs about the likelihood of the data given the null hypothesis and not the other way around. A possible alternative related to the P-value is the confidence interval (CI). It provides more information on the magnitude of an effect and the imprecision with which that effect was estimated. However, there is no magic bullet to replace P-values and stop erroneous interpretation of scientific results. Scientists and readers alike should make themselves familiar with the correct, nuanced interpretation of statistical tests, P-values and CIs. © The Author 2017. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
Statistical significance of epidemiological data. Seminar: Evaluation of epidemiological studies
International Nuclear Information System (INIS)
Weber, K.H.
1993-01-01
In stochastic damages, the numbers of events, e.g. the persons who are affected by or have died of cancer, and thus the relative frequencies (incidence or mortality) are binomially distributed random variables. Their statistical fluctuations can be characterized by confidence intervals. For epidemiologic questions, especially for the analysis of stochastic damages in the low dose range, the following issues are interesting: - Is a sample (a group of persons) with a definite observed damage frequency part of the whole population? - Is an observed frequency difference between two groups of persons random or statistically significant? - Is an observed increase or decrease of the frequencies with increasing dose random or statistically significant and how large is the regression coefficient (= risk coefficient) in this case? These problems can be solved by sttistical tests. So-called distribution-free tests and tests which are not bound to the supposition of normal distribution are of particular interest, such as: - χ 2 -independence test (test in contingency tables); - Fisher-Yates-test; - trend test according to Cochran; - rank correlation test given by Spearman. These tests are explained in terms of selected epidemiologic data, e.g. of leukaemia clusters, of the cancer mortality of the Japanese A-bomb survivors especially in the low dose range as well as on the sample of the cancer mortality in the high background area in Yangjiang (China). (orig.) [de
Multivariate Statistical Process Control Charts: An Overview
Bersimis, Sotiris; Psarakis, Stelios; Panaretos, John
2006-01-01
In this paper we discuss the basic procedures for the implementation of multivariate statistical process control via control charting. Furthermore, we review multivariate extensions for all kinds of univariate control charts, such as multivariate Shewhart-type control charts, multivariate CUSUM control charts and multivariate EWMA control charts. In addition, we review unique procedures for the construction of multivariate control charts, based on multivariate statistical techniques such as p...
Univariate decision tree induction using maximum margin classification
Yıldız, Olcay Taner
2012-01-01
In many pattern recognition applications, first decision trees are used due to their simplicity and easily interpretable nature. In this paper, we propose a new decision tree learning algorithm called univariate margin tree where, for each continuous attribute, the best split is found using convex optimization. Our simulation results on 47 data sets show that the novel margin tree classifier performs at least as good as C4.5 and linear discriminant tree (LDT) with a similar time complexity. F...
Statistical Significance and Effect Size: Two Sides of a Coin.
Fan, Xitao
This paper suggests that statistical significance testing and effect size are two sides of the same coin; they complement each other, but do not substitute for one another. Good research practice requires that both should be taken into consideration to make sound quantitative decisions. A Monte Carlo simulation experiment was conducted, and a…
Significant Statistics: Viewed with a Contextual Lens
Tait-McCutcheon, Sandi
2010-01-01
This paper examines the pedagogical and organisational changes three lead teachers made to their statistics teaching and learning programs. The lead teachers posed the research question: What would the effect of contextually integrating statistical investigations and literacies into other curriculum areas be on student achievement? By finding the…
"What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"
Ozturk, Elif
2012-01-01
The present paper aims to review two motivations to conduct "what if" analyses using Excel and "R" to understand the statistical significance tests through the sample size context. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…
Statistical Analysis of Research Data | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Analysis of Research Data (SARD) course will be held on April 5-6, 2018 from 9 a.m.-5 p.m. at the National Institutes of Health's Natcher Conference Center, Balcony C on the Bethesda Campus. SARD is designed to provide an overview on the general principles of statistical analysis of research data. The first day will feature univariate data analysis, including descriptive statistics, probability distributions, one- and two-sample inferential statistics.
A handbook of statistical graphics using SAS ODS
Der, Geoff
2014-01-01
An Introduction to Graphics: Good Graphics, Bad Graphics, Catastrophic Graphics and Statistical GraphicsThe Challenger DisasterGraphical DisplaysA Little History and Some Early Graphical DisplaysGraphical DeceptionAn Introduction to ODS GraphicsGenerating ODS GraphsODS DestinationsStatistical Graphics ProceduresODS Graphs from Statistical ProceduresControlling ODS GraphicsControlling Labelling in GraphsODS Graphics EditorGraphs for Displaying the Characteristics of Univariate Data: Horse Racing, Mortality Rates, Forearm Lengths, Survival Times and Geyser EruptionsIntroductionPie Chart, Bar Cha
DEFF Research Database (Denmark)
Engsted, Tom
I comment on the controversy between McCloskey & Ziliak and Hoover & Siegler on statistical versus economic significance, in the March 2008 issue of the Journal of Economic Methodology. I argue that while McCloskey & Ziliak are right in emphasizing 'real error', i.e. non-sampling error that cannot...... be eliminated through specification testing, they fail to acknowledge those areas in economics, e.g. rational expectations macroeconomics and asset pricing, where researchers clearly distinguish between statistical and economic significance and where statistical testing plays a relatively minor role in model...
Statistical methods in personality assessment research.
Schinka, J A; LaLone, L; Broeckel, J A
1997-06-01
Emerging models of personality structure and advances in the measurement of personality and psychopathology suggest that research in personality and personality assessment has entered a stage of advanced development, in this article we examine whether researchers in these areas have taken advantage of new and evolving statistical procedures. We conducted a review of articles published in the Journal of Personality, Assessment during the past 5 years. Of the 449 articles that included some form of data analysis, 12.7% used only descriptive statistics, most employed only univariate statistics, and fewer than 10% used multivariate methods of data analysis. We discuss the cost of using limited statistical methods, the possible reasons for the apparent reluctance to employ advanced statistical procedures, and potential solutions to this technical shortcoming.
Effect Sizes for Research Univariate and Multivariate Applications
Grissom, Robert J
2011-01-01
Noted for its comprehensive coverage, this greatly expanded new edition now covers the use of univariate and multivariate effect sizes. Many measures and estimators are reviewed along with their application, interpretation, and limitations. Noted for its practical approach, the book features numerous examples using real data for a variety of variables and designs, to help readers apply the material to their own data. Tips on the use of SPSS, SAS, R, and S-Plus are provided. The book's broad disciplinary appeal results from its inclusion of a variety of examples from psychology, medicine, educa
Combinatorial bounds on the α-divergence of univariate mixture models
Nielsen, Frank
2017-06-20
We derive lower- and upper-bounds of α-divergence between univariate mixture models with components in the exponential family. Three pairs of bounds are presented in order with increasing quality and increasing computational cost. They are verified empirically through simulated Gaussian mixture models. The presented methodology generalizes to other divergence families relying on Hellinger-type integrals.
Statistical study on the self-selection bias in FDG-PET cancer screening by a questionnaire survey
International Nuclear Information System (INIS)
Kita, Tamotsu; Yano, Fuzuki; Watanabe, Sadahiro; Soga, Shigeyoshi; Hama, Yukihiro; Shinmoto, Hiroshi; Kosuda, Shigeru
2008-01-01
A questionnaire survey was performed to investigate the possible presence of self-selection bias in 18 F-fluorodeoxyglucose (FDG) positron emission tomography (PET) cancer screening (PET cancer screening). Responders to the questionnaires survey consisted of 80 healthy persons, who answered whether they undergo PET cancer screening, health consciousness, age, sex, and smoking history. The univariate and multivariate analyses on the four parameters were performed between the responders who were to undergo PET cancer screening and the responders who were not. Statistically significant difference was found in health consciousness between the above-mentioned two groups by both univariate and multivariate analysis with the odds ratio of 2.088. The study indicated that self-selection bias should exist in PET cancer screening. (author)
Wilkinson, Michael
2014-03-01
Decisions about support for predictions of theories in light of data are made using statistical inference. The dominant approach in sport and exercise science is the Neyman-Pearson (N-P) significance-testing approach. When applied correctly it provides a reliable procedure for making dichotomous decisions for accepting or rejecting zero-effect null hypotheses with known and controlled long-run error rates. Type I and type II error rates must be specified in advance and the latter controlled by conducting an a priori sample size calculation. The N-P approach does not provide the probability of hypotheses or indicate the strength of support for hypotheses in light of data, yet many scientists believe it does. Outcomes of analyses allow conclusions only about the existence of non-zero effects, and provide no information about the likely size of true effects or their practical/clinical value. Bayesian inference can show how much support data provide for different hypotheses, and how personal convictions should be altered in light of data, but the approach is complicated by formulating probability distributions about prior subjective estimates of population effects. A pragmatic solution is magnitude-based inference, which allows scientists to estimate the true magnitude of population effects and how likely they are to exceed an effect magnitude of practical/clinical importance, thereby integrating elements of subjective Bayesian-style thinking. While this approach is gaining acceptance, progress might be hastened if scientists appreciate the shortcomings of traditional N-P null hypothesis significance testing.
Directory of Open Access Journals (Sweden)
Zhang Zhang
2012-03-01
Full Text Available Abstract Background Genetic mutation, selective pressure for translational efficiency and accuracy, level of gene expression, and protein function through natural selection are all believed to lead to codon usage bias (CUB. Therefore, informative measurement of CUB is of fundamental importance to making inferences regarding gene function and genome evolution. However, extant measures of CUB have not fully accounted for the quantitative effect of background nucleotide composition and have not statistically evaluated the significance of CUB in sequence analysis. Results Here we propose a novel measure--Codon Deviation Coefficient (CDC--that provides an informative measurement of CUB and its statistical significance without requiring any prior knowledge. Unlike previous measures, CDC estimates CUB by accounting for background nucleotide compositions tailored to codon positions and adopts the bootstrapping to assess the statistical significance of CUB for any given sequence. We evaluate CDC by examining its effectiveness on simulated sequences and empirical data and show that CDC outperforms extant measures by achieving a more informative estimation of CUB and its statistical significance. Conclusions As validated by both simulated and empirical data, CDC provides a highly informative quantification of CUB and its statistical significance, useful for determining comparative magnitudes and patterns of biased codon usage for genes or genomes with diverse sequence compositions.
Farrell, Mary Beth
2018-06-01
This article is the second part of a continuing education series reviewing basic statistics that nuclear medicine and molecular imaging technologists should understand. In this article, the statistics for evaluating interpretation accuracy, significance, and variance are discussed. Throughout the article, actual statistics are pulled from the published literature. We begin by explaining 2 methods for quantifying interpretive accuracy: interreader and intrareader reliability. Agreement among readers can be expressed simply as a percentage. However, the Cohen κ-statistic is a more robust measure of agreement that accounts for chance. The higher the κ-statistic is, the higher is the agreement between readers. When 3 or more readers are being compared, the Fleiss κ-statistic is used. Significance testing determines whether the difference between 2 conditions or interventions is meaningful. Statistical significance is usually expressed using a number called a probability ( P ) value. Calculation of P value is beyond the scope of this review. However, knowing how to interpret P values is important for understanding the scientific literature. Generally, a P value of less than 0.05 is considered significant and indicates that the results of the experiment are due to more than just chance. Variance, standard deviation (SD), confidence interval, and standard error (SE) explain the dispersion of data around a mean of a sample drawn from a population. SD is commonly reported in the literature. A small SD indicates that there is not much variation in the sample data. Many biologic measurements fall into what is referred to as a normal distribution taking the shape of a bell curve. In a normal distribution, 68% of the data will fall within 1 SD, 95% will fall within 2 SDs, and 99.7% will fall within 3 SDs. Confidence interval defines the range of possible values within which the population parameter is likely to lie and gives an idea of the precision of the statistic being
Systematic reviews of anesthesiologic interventions reported as statistically significant
DEFF Research Database (Denmark)
Imberger, Georgina; Gluud, Christian; Boylan, John
2015-01-01
statistically significant meta-analyses of anesthesiologic interventions, we used TSA to estimate power and imprecision in the context of sparse data and repeated updates. METHODS: We conducted a search to identify all systematic reviews with meta-analyses that investigated an intervention that may......: From 11,870 titles, we found 682 systematic reviews that investigated anesthesiologic interventions. In the 50 sampled meta-analyses, the median number of trials included was 8 (interquartile range [IQR], 5-14), the median number of participants was 964 (IQR, 523-1736), and the median number...
Xu, Kuan-Man
2006-01-01
A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
Trottini, Mario; Vigo, Isabel; Belda, Santiago
2015-01-01
Given a time series, running trends analysis (RTA) involves evaluating least squares trends over overlapping time windows of L consecutive time points, with overlap by all but one observation. This produces a new series called the “running trends series,” which is used as summary statistics of the original series for further analysis. In recent years, RTA has been widely used in climate applied research as summary statistics for time series and time series association. There is no doubt that ...
P-Value, a true test of statistical significance? a cautionary note ...
African Journals Online (AJOL)
While it's not the intention of the founders of significance testing and hypothesis testing to have the two ideas intertwined as if they are complementary, the inconvenient marriage of the two practices into one coherent, convenient, incontrovertible and misinterpreted practice has dotted our standard statistics textbooks and ...
Zhang, Zhang
2012-03-22
Background: Genetic mutation, selective pressure for translational efficiency and accuracy, level of gene expression, and protein function through natural selection are all believed to lead to codon usage bias (CUB). Therefore, informative measurement of CUB is of fundamental importance to making inferences regarding gene function and genome evolution. However, extant measures of CUB have not fully accounted for the quantitative effect of background nucleotide composition and have not statistically evaluated the significance of CUB in sequence analysis.Results: Here we propose a novel measure--Codon Deviation Coefficient (CDC)--that provides an informative measurement of CUB and its statistical significance without requiring any prior knowledge. Unlike previous measures, CDC estimates CUB by accounting for background nucleotide compositions tailored to codon positions and adopts the bootstrapping to assess the statistical significance of CUB for any given sequence. We evaluate CDC by examining its effectiveness on simulated sequences and empirical data and show that CDC outperforms extant measures by achieving a more informative estimation of CUB and its statistical significance.Conclusions: As validated by both simulated and empirical data, CDC provides a highly informative quantification of CUB and its statistical significance, useful for determining comparative magnitudes and patterns of biased codon usage for genes or genomes with diverse sequence compositions. 2012 Zhang et al; licensee BioMed Central Ltd.
Interpreting Statistical Significance Test Results: A Proposed New "What If" Method.
Kieffer, Kevin M.; Thompson, Bruce
As the 1994 publication manual of the American Psychological Association emphasized, "p" values are affected by sample size. As a result, it can be helpful to interpret the results of statistical significant tests in a sample size context by conducting so-called "what if" analyses. However, these methods can be inaccurate…
Brouwer, D.; Meijer, R.R.; Zevalkink, D.J.
2013-01-01
Several researchers have emphasized that item response theory (IRT)-based methods should be preferred over classical approaches in measuring change for individual patients. In the present study we discuss and evaluate the use of IRT-based statistics to measure statistical significant individual
New Graphical Methods and Test Statistics for Testing Composite Normality
Directory of Open Access Journals (Sweden)
Marc S. Paolella
2015-07-01
Full Text Available Several graphical methods for testing univariate composite normality from an i.i.d. sample are presented. They are endowed with correct simultaneous error bounds and yield size-correct tests. As all are based on the empirical CDF, they are also consistent for all alternatives. For one test, called the modified stabilized probability test, or MSP, a highly simplified computational method is derived, which delivers the test statistic and also a highly accurate p-value approximation, essentially instantaneously. The MSP test is demonstrated to have higher power against asymmetric alternatives than the well-known and powerful Jarque-Bera test. A further size-correct test, based on combining two test statistics, is shown to have yet higher power. The methodology employed is fully general and can be applied to any i.i.d. univariate continuous distribution setting.
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
DEFF Research Database (Denmark)
Jakobsen, Janus Christian; Wetterslev, Jorn; Winkel, Per
2014-01-01
BACKGROUND: Thresholds for statistical significance when assessing meta-analysis results are being insufficiently demonstrated by traditional 95% confidence intervals and P-values. Assessment of intervention effects in systematic reviews with meta-analysis deserves greater rigour. METHODS......: Methodologies for assessing statistical and clinical significance of intervention effects in systematic reviews were considered. Balancing simplicity and comprehensiveness, an operational procedure was developed, based mainly on The Cochrane Collaboration methodology and the Grading of Recommendations...... Assessment, Development, and Evaluation (GRADE) guidelines. RESULTS: We propose an eight-step procedure for better validation of meta-analytic results in systematic reviews (1) Obtain the 95% confidence intervals and the P-values from both fixed-effect and random-effects meta-analyses and report the most...
Soliman, Essam S; Moawed, Sherif A; Hassan, Rania A
2017-08-01
Birds litter contains unutilized nitrogen in the form of uric acid that is converted into ammonia; a fact that does not only affect poultry performance but also has a negative effect on people's health around the farm and contributes in the environmental degradation. The influence of microclimatic ammonia emissions on Ross and Hubbard broilers reared in different housing systems at two consecutive seasons (fall and winter) was evaluated using a discriminant function analysis to differentiate between Ross and Hubbard breeds. A total number of 400 air samples were collected and analyzed for ammonia levels during the experimental period. Data were analyzed using univariate and multivariate statistical methods. Ammonia levels were significantly higher (p0.05) were found between the two farms in body weight, body weight gain, feed intake, feed conversion ratio, and performance index (PI) of broilers. Body weight; weight gain and PI had increased values (pbroiler breed. Ammonia emissions were positively (although weekly) correlated with the ambient relative humidity (r=0.383; p0.05). Test of significance of discriminant function analysis did not show a classification based on the studied traits suggesting that they cannot been used as predictor variables. The percentage of correct classification was 52% and it was improved after deletion of highly correlated traits to 57%. The study revealed that broiler's growth was negatively affected by increased microclimatic ammonia concentrations and recommended the analysis of broilers' growth performance parameters data using multivariate discriminant function analysis.
Jeffrey P. Prestemon
2009-01-01
Timber product markets are subject to large shocks deriving from natural disturbances and policy shifts. Statistical modeling of shocks is often done to assess their economic importance. In this article, I simulate the statistical power of univariate and bivariate methods of shock detection using time series intervention models. Simulations show that bivariate methods...
Stress assessment based on EEG univariate features and functional connectivity measures.
Alonso, J F; Romero, S; Ballester, M R; Antonijoan, R M; Mañanas, M A
2015-07-01
The biological response to stress originates in the brain but involves different biochemical and physiological effects. Many common clinical methods to assess stress are based on the presence of specific hormones and on features extracted from different signals, including electrocardiogram, blood pressure, skin temperature, or galvanic skin response. The aim of this paper was to assess stress using EEG-based variables obtained from univariate analysis and functional connectivity evaluation. Two different stressors, the Stroop test and sleep deprivation, were applied to 30 volunteers to find common EEG patterns related to stress effects. Results showed a decrease of the high alpha power (11 to 12 Hz), an increase in the high beta band (23 to 36 Hz, considered a busy brain indicator), and a decrease in the approximate entropy. Moreover, connectivity showed that the high beta coherence and the interhemispheric nonlinear couplings, measured by the cross mutual information function, increased significantly for both stressors, suggesting that useful stress indexes may be obtained from EEG-based features.
Which DTW Method Applied to Marine Univariate Time Series Imputation
Phan , Thi-Thu-Hong; Caillault , Émilie; Lefebvre , Alain; Bigand , André
2017-01-01
International audience; Missing data are ubiquitous in any domains of applied sciences. Processing datasets containing missing values can lead to a loss of efficiency and unreliable results, especially for large missing sub-sequence(s). Therefore, the aim of this paper is to build a framework for filling missing values in univariate time series and to perform a comparison of different similarity metrics used for the imputation task. This allows to suggest the most suitable methods for the imp...
The pathways for intelligible speech: multivariate and univariate perspectives.
Evans, S; Kyong, J S; Rosen, S; Golestani, N; Warren, J E; McGettigan, C; Mourão-Miranda, J; Wise, R J S; Scott, S K
2014-09-01
An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400-2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of bilateral posterior when compared with the left anterior STS in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT,Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. 20: 2486-2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local "searchlights" and a whole brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound to speech processing. © The Author 2013. Published by Oxford University Press.
Applied multivariate statistics with R
Zelterman, Daniel
2015-01-01
This book brings the power of multivariate statistics to graduate-level practitioners, making these analytical methods accessible without lengthy mathematical derivations. Using the open source, shareware program R, Professor Zelterman demonstrates the process and outcomes for a wide array of multivariate statistical applications. Chapters cover graphical displays, linear algebra, univariate, bivariate and multivariate normal distributions, factor methods, linear regression, discrimination and classification, clustering, time series models, and additional methods. Zelterman uses practical examples from diverse disciplines to welcome readers from a variety of academic specialties. Those with backgrounds in statistics will learn new methods while they review more familiar topics. Chapters include exercises, real data sets, and R implementations. The data are interesting, real-world topics, particularly from health and biology-related contexts. As an example of the approach, the text examines a sample from the B...
Yuan, Ke-Hai
2008-01-01
In the literature of mean and covariance structure analysis, noncentral chi-square distribution is commonly used to describe the behavior of the likelihood ratio (LR) statistic under alternative hypothesis. Due to the inaccessibility of the rather technical literature for the distribution of the LR statistic, it is widely believed that the…
Log-concave Probability Distributions: Theory and Statistical Testing
DEFF Research Database (Denmark)
An, Mark Yuing
1996-01-01
This paper studies the broad class of log-concave probability distributions that arise in economics of uncertainty and information. For univariate, continuous, and log-concave random variables we prove useful properties without imposing the differentiability of density functions. Discrete...... and multivariate distributions are also discussed. We propose simple non-parametric testing procedures for log-concavity. The test statistics are constructed to test one of the two implicati ons of log-concavity: increasing hazard rates and new-is-better-than-used (NBU) property. The test for increasing hazard...... rates are based on normalized spacing of the sample order statistics. The tests for NBU property fall into the category of Hoeffding's U-statistics...
Rossi, M.; Apuani, T.; Felletti, F.
2009-04-01
The aim of this paper is to compare the results of two statistical methods for landslide susceptibility analysis: 1) univariate probabilistic method based on landslide susceptibility index, 2) multivariate method (logistic regression). The study area is the Febbraro valley, located in the central Italian Alps, where different types of metamorphic rocks croup out. On the eastern part of the studied basin a quaternary cover represented by colluvial and secondarily, by glacial deposits, is dominant. In this study 110 earth flows, mainly located toward NE portion of the catchment, were analyzed. They involve only the colluvial deposits and their extension mainly ranges from 36 to 3173 m2. Both statistical methods require to establish a spatial database, in which each landslide is described by several parameters that can be assigned using a main scarp central point of landslide. The spatial database is constructed using a Geographical Information System (GIS). Each landslide is described by several parameters corresponding to the value of main scarp central point of the landslide. Based on bibliographic review a total of 15 predisposing factors were utilized. The width of the intervals, in which the maps of the predisposing factors have to be reclassified, has been defined assuming constant intervals to: elevation (100 m), slope (5 °), solar radiation (0.1 MJ/cm2/year), profile curvature (1.2 1/m), tangential curvature (2.2 1/m), drainage density (0.5), lineament density (0.00126). For the other parameters have been used the results of the probability-probability plots analysis and the statistical indexes of landslides site. In particular slope length (0 ÷ 2, 2 ÷ 5, 5 ÷ 10, 10 ÷ 20, 20 ÷ 35, 35 ÷ 260), accumulation flow (0 ÷ 1, 1 ÷ 2, 2 ÷ 5, 5 ÷ 12, 12 ÷ 60, 60 ÷27265), Topographic Wetness Index 0 ÷ 0.74, 0.74 ÷ 1.94, 1.94 ÷ 2.62, 2.62 ÷ 3.48, 3.48 ÷ 6,00, 6.00 ÷ 9.44), Stream Power Index (0 ÷ 0.64, 0.64 ÷ 1.28, 1.28 ÷ 1.81, 1.81 ÷ 4.20, 4.20 ÷ 9
Soliman, Essam S.; Moawed, Sherif A.; Hassan, Rania A.
2017-01-01
Background and Aim: Birds litter contains unutilized nitrogen in the form of uric acid that is converted into ammonia; a fact that does not only affect poultry performance but also has a negative effect on people’s health around the farm and contributes in the environmental degradation. The influence of microclimatic ammonia emissions on Ross and Hubbard broilers reared in different housing systems at two consecutive seasons (fall and winter) was evaluated using a discriminant function analysis to differentiate between Ross and Hubbard breeds. Materials and Methods: A total number of 400 air samples were collected and analyzed for ammonia levels during the experimental period. Data were analyzed using univariate and multivariate statistical methods. Results: Ammonia levels were significantly higher (p0.05) were found between the two farms in body weight, body weight gain, feed intake, feed conversion ratio, and performance index (PI) of broilers. Body weight; weight gain and PI had increased values (pbroiler breed. Ammonia emissions were positively (although weekly) correlated with the ambient relative humidity (r=0.383; p0.05). Test of significance of discriminant function analysis did not show a classification based on the studied traits suggesting that they cannot been used as predictor variables. The percentage of correct classification was 52% and it was improved after deletion of highly correlated traits to 57%. Conclusion: The study revealed that broiler’s growth was negatively affected by increased microclimatic ammonia concentrations and recommended the analysis of broilers’ growth performance parameters data using multivariate discriminant function analysis. PMID:28919677
Umesh P. Agarwal; Richard S. Reiner; Sally A. Ralph
2010-01-01
Two new methods based on FTâRaman spectroscopy, one simple, based on band intensity ratio, and the other using a partial least squares (PLS) regression model, are proposed to determine cellulose I crystallinity. In the simple method, crystallinity in cellulose I samples was determined based on univariate regression that was first developed using the Raman band...
Univaried models in the series of temperature of the air
International Nuclear Information System (INIS)
Leon Aristizabal Gloria esperanza
2000-01-01
The theoretical framework for the study of the air's temperature time series is the theory of stochastic processes, particularly those known as ARIMA, that make it possible to carry out a univaried analysis. ARIMA models are built in order to explain the structure of the monthly temperatures corresponding to the mean, the absolute maximum, absolute minimum, maximum mean and minimum mean temperatures, for four stations in Colombia. By means of those models, the possible evolution of the latter variables is estimated with predictive aims in mind. The application and utility of the models is discussed
Kleijnen, J.P.C.
2006-01-01
Classic linear regression models and their concomitant statistical designs assume a univariate response and white noise.By definition, white noise is normally, independently, and identically distributed with zero mean.This survey tries to answer the following questions: (i) How realistic are these
Diedrich, Alice; Schlegl, Sandra; Greetfeld, Martin; Fumi, Markus; Voderholzer, Ulrich
2018-03-01
This study examines the statistical and clinical significance of symptom changes during an intensive inpatient treatment program with a strong psychotherapeutic focus for individuals with severe bulimia nervosa. 295 consecutively admitted bulimic patients were administered the Structured Interview for Anorexic and Bulimic Syndromes-Self-Rating (SIAB-S), the Eating Disorder Inventory-2 (EDI-2), the Brief Symptom Inventory (BSI), and the Beck Depression Inventory-II (BDI-II) at treatment intake and discharge. Results indicated statistically significant symptom reductions with large effect sizes regarding severity of binge eating and compensatory behavior (SIAB-S), overall eating disorder symptom severity (EDI-2), overall psychopathology (BSI), and depressive symptom severity (BDI-II) even when controlling for antidepressant medication. The majority of patients showed either reliable (EDI-2: 33.7%, BSI: 34.8%, BDI-II: 18.1%) or even clinically significant symptom changes (EDI-2: 43.2%, BSI: 33.9%, BDI-II: 56.9%). Patients with clinically significant improvement were less distressed at intake and less likely to suffer from a comorbid borderline personality disorder when compared with those who did not improve to a clinically significant extent. Findings indicate that intensive psychotherapeutic inpatient treatment may be effective in about 75% of severely affected bulimic patients. For the remaining non-responding patients, inpatient treatment might be improved through an even stronger focus on the reduction of comorbid borderline personality traits.
Recent Literature on Whether Statistical Significance Tests Should or Should Not Be Banned.
Deegear, James
This paper summarizes the literature regarding statistical significant testing with an emphasis on recent literature in various discipline and literature exploring why researchers have demonstrably failed to be influenced by the American Psychological Association publication manual's encouragement to report effect sizes. Also considered are…
STATCAT, Statistical Analysis of Parametric and Non-Parametric Data
International Nuclear Information System (INIS)
David, Hugh
1990-01-01
1 - Description of program or function: A suite of 26 programs designed to facilitate the appropriate statistical analysis and data handling of parametric and non-parametric data, using classical and modern univariate and multivariate methods. 2 - Method of solution: Data is read entry by entry, using a choice of input formats, and the resultant data bank is checked for out-of- range, rare, extreme or missing data. The completed STATCAT data bank can be treated by a variety of descriptive and inferential statistical methods, and modified, using other standard programs as required
Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B
2013-03-23
Mass spectrometry (MS) has evolved to become the primary high throughput tool for proteomics based biomarker discovery. Until now, multiple challenges in protein MS data analysis remain: large-scale and complex data set management; MS peak identification, indexing; and high dimensional peak differential analysis with the concurrent statistical tests based false discovery rate (FDR). "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets to identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution, which provides experimental biologists easy access to "cloud" computing capabilities to analyze MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. Presented web application supplies large scale MS data online uploading and analysis with a simple user interface. This bioinformatic tool will facilitate the discovery of the potential protein biomarkers using MS.
The prognostic significance of parapharyngeal tumour involvement in nasopharyngeal carcinoma
International Nuclear Information System (INIS)
Teo, P.Y.; Lee, W.; Yu, P.
1996-01-01
From 1984 to 1989, 903 treatment-naive non-disseminated nasopharyngeal carcinomas (NPCs) were given primary radical radiotherapy. All patients had computed tomographic and endoscopic evaluation of the primary tumour. Potentially significant parameters were analysed by both univariate and multivariate methods for independent significance. In the whole group of patients, the male sex, skull base and cranial nerve(s) involvement, advanced Ho N-level, presence of fixed or partially fixed nodes and nodes contralateral to the side of the bulk of the nasopharyngeal primary, significantly determined survival and distant metastasis rates, whereas skull base and cranial nerve involvement, advanced age and male sex significantly worsened local control. However in the Ho T2No subgroup, parapharyngeal tumour involvement was the most significant prognosticator that determined distant metastasis and survival rates in the absence of the overriding prognosticators of skull base infiltration, cranial nerve(s) palsy, and cervical nodal metastasis. The local tumour control of the Ho T2No was adversely affected by the presence of oropharyngeal tumour extension. The administration of booster radiotherapy (20 Gy) after conventional radiotherapy (60-62.5 Gy) in tumours with parapharyngeal involvement has led to an improvement in local control, short of statistical significance
Gaskin, Cadeyrn J; Happell, Brenda
2014-05-01
improvement. Most importantly, researchers should abandon the misleading practice of interpreting the results from inferential tests based solely on whether they are statistically significant (or not) and, instead, focus on reporting and interpreting effect sizes, confidence intervals, and significance levels. Nursing researchers also need to conduct and report a priori power analyses, and to address the issue of Type I experiment-wise error inflation in their studies. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
International Nuclear Information System (INIS)
Dordevic, N.; Wehrens, R.; Postma, G.J.; Buydens, L.M.C.; Camin, F.
2012-01-01
Highlights: ► The assessment of claims of origin is of enormous economic importance for DOC and DOCG wines. ► The official method is based on univariate statistical tests of H, C and O isotopic ratios. ► We consider 5220 Italian wine samples collected in the period 2000–2010. ► Multivariate statistical analysis leads to much better specificity and easier detection of false claims of origin. ► In the case of multi-modal data, mixture modelling provides additional improvements. - Abstract: Wine derives its economic value to a large extent from geographical origin, which has a significant impact on the quality of the wine. According to the food legislation, wines can be without geographical origin (table wine) and wines with origin. Wines with origin must have characteristics which are essential due to its region of production and must be produced, processed and prepared, exclusively within that region. The development of fast and reliable analytical methods for the assessment of claims of origin is very important. The current official method is based on the measurement of stable isotope ratios of water and alcohol in wine, which are influenced by climatic factors. The results in this paper are based on 5220 Italian wine samples collected in the period 2000–2010. We evaluate the univariate approach underlying the official method to assess claims of origin and propose several new methods to get better geographical discrimination between samples. It is shown that multivariate methods are superior to univariate approaches in that they show increased sensitivity and specificity. In cases where data are non-normally distributed, an approach based on mixture modelling provides additional improvements.
Van Aert, R.C.M.; Van Assen, M.A.L.M.
2018-01-01
The unrealistically high rate of positive results within psychology has increased the attention to replication research. However, researchers who conduct a replication and want to statistically combine the results of their replication with a statistically significant original study encounter
A tutorial on hunting statistical significance by chasing N
Directory of Open Access Journals (Sweden)
Denes Szucs
2016-09-01
Full Text Available There is increasing concern about the replicability of studies in psychology and cognitive neuroscience. Hidden data dredging (also called p-hacking is a major contributor to this crisis because it substantially increases Type I error resulting in a much larger proportion of false positive findings than the usually expected 5%. In order to build better intuition to avoid, detect and criticise some typical problems, here I systematically illustrate the large impact of some easy to implement and so, perhaps frequent data dredging techniques on boosting false positive findings. I illustrate several forms of two special cases of data dredging. First, researchers may violate the data collection stopping rules of null hypothesis significance testing by repeatedly checking for statistical significance with various numbers of participants. Second, researchers may group participants post-hoc along potential but unplanned independent grouping variables. The first approach 'hacks' the number of participants in studies, the second approach ‘hacks’ the number of variables in the analysis. I demonstrate the high amount of false positive findings generated by these techniques with data from true null distributions. I also illustrate that it is extremely easy to introduce strong bias into data by very mild selection and re-testing. Similar, usually undocumented data dredging steps can easily lead to having 20-50%, or more false positives.
Are studies reporting significant results more likely to be published?
Koletsi, Despina; Karagianni, Anthi; Pandis, Nikolaos; Makou, Margarita; Polychronopoulou, Argy; Eliades, Theodore
2009-11-01
Our objective was to assess the hypothesis that there are variations of the proportion of articles reporting a significant effect, with a higher percentage of those articles published in journals with impact factors. The contents of 5 orthodontic journals (American Journal of Orthodontics and Dentofacial Orthopedics, Angle Orthodontist, European Journal of Orthodontics, Journal of Orthodontics, and Orthodontics and Craniofacial Research), published between 2004 and 2008, were hand-searched. Articles with statistical analysis of data were included in the study and classified into 4 categories: behavior and psychology, biomaterials and biomechanics, diagnostic procedures and treatment, and craniofacial growth, morphology, and genetics. In total, 2622 articles were examined, with 1785 included in the analysis. Univariate and multivariate logistic regression analyses were applied with statistical significance as the dependent variable, and whether the journal had an impact factor, the subject, and the year were the independent predictors. A higher percentage of articles showed significant results relative to those without significant associations (on average, 88% vs 12%) for those journals. Overall, these journals published significantly more studies with significant results, ranging from 75% to 90% (P = 0.02). Multivariate modeling showed that journals with impact factors had a 100% increased probability of publishing a statistically significant result compared with journals with no impact factor (odds ratio [OR], 1.99; 95% CI, 1.19-3.31). Compared with articles on biomaterials and biomechanics, all other subject categories showed lower probabilities of significant results. Nonsignificant findings in behavior and psychology and diagnosis and treatment were 1.8 (OR, 1.75; 95% CI, 1.51-2.67) and 3.5 (OR, 3.50; 95% CI, 2.27-5.37) times more likely to be published, respectively. Journals seem to prefer reporting significant results; this might be because of authors
Wind Speed Prediction Using a Univariate ARIMA Model and a Multivariate NARX Model
Directory of Open Access Journals (Sweden)
Erasmo Cadenas
2016-02-01
Full Text Available Two on step ahead wind speed forecasting models were compared. A univariate model was developed using a linear autoregressive integrated moving average (ARIMA. This method’s performance is well studied for a large number of prediction problems. The other is a multivariate model developed using a nonlinear autoregressive exogenous artificial neural network (NARX. This uses the variables: barometric pressure, air temperature, wind direction and solar radiation or relative humidity, as well as delayed wind speed. Both models were developed from two databases from two sites: an hourly average measurements database from La Mata, Oaxaca, Mexico, and a ten minute average measurements database from Metepec, Hidalgo, Mexico. The main objective was to compare the impact of the various meteorological variables on the performance of the multivariate model of wind speed prediction with respect to the high performance univariate linear model. The NARX model gave better results with improvements on the ARIMA model of between 5.5% and 10. 6% for the hourly database and of between 2.3% and 12.8% for the ten minute database for mean absolute error and mean squared error, respectively.
R package imputeTestbench to compare imputations methods for univariate time series
Bokde, Neeraj; Kulat, Kishore; Beck, Marcus W; Asencio-Cortés, Gualberto
2016-01-01
This paper describes the R package imputeTestbench that provides a testbench for comparing imputation methods for missing data in univariate time series. The imputeTestbench package can be used to simulate the amount and type of missing data in a complete dataset and compare filled data using different imputation methods. The user has the option to simulate missing data by removing observations completely at random or in blocks of different sizes. Several default imputation methods are includ...
DEFF Research Database (Denmark)
Jones, Allan; Sommerlund, Bo
2007-01-01
The uses of null hypothesis significance testing (NHST) and statistical power analysis within psychological research are critically discussed. The article looks at the problems of relying solely on NHST when dealing with small and large sample sizes. The use of power-analysis in estimating...... the potential error introduced by small and large samples is advocated. Power analysis is not recommended as a replacement to NHST but as an additional source of information about the phenomena under investigation. Moreover, the importance of conceptual analysis in relation to statistical analysis of hypothesis...
Energy Technology Data Exchange (ETDEWEB)
Dordevic, N.; Wehrens, R. [IASMA Research and Innovation Centre, Fondazione Edmund Mach, via Mach 1, 38010 San Michele all' Adige (Italy); Postma, G.J.; Buydens, L.M.C. [Radboud University Nijmegen, Institute for Molecules and Materials, Analytical Chemistry, P.O. Box 9010, 6500 GL Nijmegen (Netherlands); Camin, F., E-mail: federica.camin@fmach.it [IASMA Research and Innovation Centre, Fondazione Edmund Mach, via Mach 1, 38010 San Michele all' Adige (Italy)
2012-12-13
Highlights: Black-Right-Pointing-Pointer The assessment of claims of origin is of enormous economic importance for DOC and DOCG wines. Black-Right-Pointing-Pointer The official method is based on univariate statistical tests of H, C and O isotopic ratios. Black-Right-Pointing-Pointer We consider 5220 Italian wine samples collected in the period 2000-2010. Black-Right-Pointing-Pointer Multivariate statistical analysis leads to much better specificity and easier detection of false claims of origin. Black-Right-Pointing-Pointer In the case of multi-modal data, mixture modelling provides additional improvements. - Abstract: Wine derives its economic value to a large extent from geographical origin, which has a significant impact on the quality of the wine. According to the food legislation, wines can be without geographical origin (table wine) and wines with origin. Wines with origin must have characteristics which are essential due to its region of production and must be produced, processed and prepared, exclusively within that region. The development of fast and reliable analytical methods for the assessment of claims of origin is very important. The current official method is based on the measurement of stable isotope ratios of water and alcohol in wine, which are influenced by climatic factors. The results in this paper are based on 5220 Italian wine samples collected in the period 2000-2010. We evaluate the univariate approach underlying the official method to assess claims of origin and propose several new methods to get better geographical discrimination between samples. It is shown that multivariate methods are superior to univariate approaches in that they show increased sensitivity and specificity. In cases where data are non-normally distributed, an approach based on mixture modelling provides additional improvements.
Regression Is a Univariate General Linear Model Subsuming Other Parametric Methods as Special Cases.
Vidal, Sherry
Although the concept of the general linear model (GLM) has existed since the 1960s, other univariate analyses such as the t-test and the analysis of variance models have remained popular. The GLM produces an equation that minimizes the mean differences of independent variables as they are related to a dependent variable. From a computer printout…
Expression and prognostic significance of lysozyme in male breast cancer
International Nuclear Information System (INIS)
Serra, Carlos; Baltasar, Aniceto; Medrano, Justo; Vizoso, Francisco; Alonso, Lorena; Rodríguez, Juan C; González, Luis O; Fernández, María; Lamelas, María L; Sánchez, Luis M; García-Muñiz, José L
2002-01-01
Lysozyme, one of the major protein components of human milk that is also synthesized by a significant percentage of breast carcinomas, is associated with lesions that have a favorable outcome in female breast cancer. Here we evaluate the expression and prognostic value of lysozyme in male breast cancer (MBC). Lysozyme expression was examined by immunohistochemical methods in a series of 60 MBC tissue sections and in 15 patients with gynecomastia. Staining was quantified using the HSCORE (histological score) system, which considers both the intensity and the percentage of cells staining at each intensity. Prognostic value of lysozyme was retrospectively evaluated by multivariate analysis taking into account conventional prognostic factors. Lysozyme immunostaining was negative in all cases of gynecomastia. A total of 27 of 60 MBC sections (45%) stained positively for this protein, but there were clear differences among them with regard to the intensity and percentage of stained cells. Statistical analysis showed that lysozyme HSCORE values in relation to age, tumor size, nodal status, histological grade, estrogen receptor status, metastasis and histological type did not increase the statistical significance. Univariate analysis confirmed that both nodal involvement and lysozyme values were significant predictors of short-term relapse-free survival. Multivariate analysis, according to Cox's regression model, also showed that nodal status and lysozyme levels were significant independent indicators of short-term relapse-free survival. Tumor expression of lysozyme is associated with lesions that have an unfavorable outcome in male breast cancer. This milk protein may be a new prognostic factor in patients with breast cancer
QRS complex detection based on continuous density hidden Markov models using univariate observations
Sotelo, S.; Arenas, W.; Altuve, M.
2018-04-01
In the electrocardiogram (ECG), the detection of QRS complexes is a fundamental step in the ECG signal processing chain since it allows the determination of other characteristics waves of the ECG and provides information about heart rate variability. In this work, an automatic QRS complex detector based on continuous density hidden Markov models (HMM) is proposed. HMM were trained using univariate observation sequences taken either from QRS complexes or their derivatives. The detection approach is based on the log-likelihood comparison of the observation sequence with a fixed threshold. A sliding window was used to obtain the observation sequence to be evaluated by the model. The threshold was optimized by receiver operating characteristic curves. Sensitivity (Sen), specificity (Spc) and F1 score were used to evaluate the detection performance. The approach was validated using ECG recordings from the MIT-BIH Arrhythmia database. A 6-fold cross-validation shows that the best detection performance was achieved with 2 states HMM trained with QRS complexes sequences (Sen = 0.668, Spc = 0.360 and F1 = 0.309). We concluded that these univariate sequences provide enough information to characterize the QRS complex dynamics from HMM. Future works are directed to the use of multivariate observations to increase the detection performance.
Statistical significance estimation of a signal within the GooFit framework on GPUs
Directory of Open Access Journals (Sweden)
Cristella Leonardo
2017-01-01
Full Text Available In order to test the computing capabilities of GPUs with respect to traditional CPU cores a high-statistics toy Monte Carlo technique has been implemented both in ROOT/RooFit and GooFit frameworks with the purpose to estimate the statistical significance of the structure observed by CMS close to the kinematical boundary of the J/ψϕ invariant mass in the three-body decay B+ → J/ψϕK+. GooFit is a data analysis open tool under development that interfaces ROOT/RooFit to CUDA platform on nVidia GPU. The optimized GooFit application running on GPUs hosted by servers in the Bari Tier2 provides striking speed-up performances with respect to the RooFit application parallelised on multiple CPUs by means of PROOF-Lite tool. The considerable resulting speed-up, evident when comparing concurrent GooFit processes allowed by CUDA Multi Process Service and a RooFit/PROOF-Lite process with multiple CPU workers, is presented and discussed in detail. By means of GooFit it has also been possible to explore the behaviour of a likelihood ratio test statistic in different situations in which the Wilks Theorem may or may not apply because its regularity conditions are not satisfied.
Sierevelt, Inger N.; van Oldenrijk, Jakob; Poolman, Rudolf W.
2007-01-01
In this paper we describe several issues that influence the reporting of statistical significance in relation to clinical importance, since misinterpretation of p values is a common issue in orthopaedic literature. Orthopaedic research is tormented by the risks of false-positive (type I error) and
Wingate, Peter H; Thornton, George C; McIntyre, Kelly S; Frame, Jennifer H
2003-02-01
The present study examined relationships between reduction-in-force (RIF) personnel practices, presentation of statistical evidence, and litigation outcomes. Policy capturing methods were utilized to analyze the components of 115 federal district court opinions involving age discrimination disparate treatment allegations and organizational downsizing. Univariate analyses revealed meaningful links between RIF personnel practices, use of statistical evidence, and judicial verdict. The defendant organization was awarded summary judgment in 73% of the claims included in the study. Judicial decisions in favor of the defendant organization were found to be significantly related to such variables as formal performance appraisal systems, termination decision review within the organization, methods of employee assessment and selection for termination, and the presence of a concrete layoff policy. The use of statistical evidence in ADEA disparate treatment litigation was investigated and found to be a potentially persuasive type of indirect evidence. Legal, personnel, and evidentiary ramifications are reviewed, and a framework of downsizing mechanics emphasizing legal defensibility is presented.
Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P
1999-01-01
Functional neuroimaging (FNI) provides experimental access to the intact living brain making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. There are several methods available to analyse FNI data indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview over some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149
International Nuclear Information System (INIS)
DUDEK, J; SZPAK, B; FORNAL, B; PORQUET, M-G
2011-01-01
In this and the follow-up article we briefly discuss what we believe represents one of the most serious problems in contemporary nuclear structure: the question of statistical significance of parametrizations of nuclear microscopic Hamiltonians and the implied predictive power of the underlying theories. In the present Part I, we introduce the main lines of reasoning of the so-called Inverse Problem Theory, an important sub-field in the contemporary Applied Mathematics, here illustrated on the example of the Nuclear Mean-Field Approach.
Linting, Marielle; van Os, Bart Jan; Meulman, Jacqueline J.
2011-01-01
In this paper, the statistical significance of the contribution of variables to the principal components in principal components analysis (PCA) is assessed nonparametrically by the use of permutation tests. We compare a new strategy to a strategy used in previous research consisting of permuting the columns (variables) of a data matrix…
International Nuclear Information System (INIS)
Kang, Susan H; Haydu, Lauren E; Goh, Robin Yeong Hong; Fogarty, Gerald B
2012-01-01
Merkel cell carcinoma (MCC) is a rare tumour of skin. This study is a retrospective audit of patients with MCC from St Vincent’s and Mater Hospital, Sydney, Australia. The aim of this study was to investigate the influence of radiotherapy (RT) on the local and regional control of MCC lesions and survival of patients with MCC. The data bases in anatomical pathology, RT and surgery. We searched for patients having a diagnosis of MCC between 1996 and 2007. Patient, tumour and treatment characteristics were collected and analysed. Univariate survival analysis of categorical variables was conducted with the Kaplan-Meier method together with the Log-Rank test for statistical significance. Continuous variables were assessed using the Cox regression method. Multivariate analysis was performed for significant univariate results. Sixty seven patients were found. Sixty two who were stage I-III and were treated with radical intent were analysed. 68% were male. The median age was 74 years. Forty-two cases (68%) were stage I or II, and 20 cases (32%) were stage III. For the subset of 42 stage I and II patients, those that had RT to their primary site had a 2-year local recurrence free survival of 89% compared with 36% for patients not receiving RT (p<0.001). The cumulative 2-year regional recurrence free survival for patients having adjuvant regional RT was 84% compared with 43% for patients not receiving this treatment (p<0.001). Immune status at initial surgery was a significant predictor for OS and MCCSS. In a multivariate analysis combining macroscopic size (mm) and immune status at initial surgery, only immune status remained a significant predictor of overall survival (HR=2.096, 95% CI: 1.002-4.385, p=0.049). RT is associated with significant improvement in local and regional control in Merkel cell carcinoma. Immunosuppression is an important factor in overall survival
A Note on Comparing the Power of Test Statistics at Low Significance Levels.
Morris, Nathan; Elston, Robert
2011-01-01
It is an obvious fact that the power of a test statistic is dependent upon the significance (alpha) level at which the test is performed. It is perhaps a less obvious fact that the relative performance of two statistics in terms of power is also a function of the alpha level. Through numerous personal discussions, we have noted that even some competent statisticians have the mistaken intuition that relative power comparisons at traditional levels such as α = 0.05 will be roughly similar to relative power comparisons at very low levels, such as the level α = 5 × 10 -8 , which is commonly used in genome-wide association studies. In this brief note, we demonstrate that this notion is in fact quite wrong, especially with respect to comparing tests with differing degrees of freedom. In fact, at very low alpha levels the cost of additional degrees of freedom is often comparatively low. Thus we recommend that statisticians exercise caution when interpreting the results of power comparison studies which use alpha levels that will not be used in practice.
DEFF Research Database (Denmark)
Serviss, Jason T.; Gådin, Jesper R.; Eriksson, Per
2017-01-01
, e.g. genes in a specific pathway, alone can separate samples into these established classes. Despite this, the evaluation of class separations is often subjective and performed via visualization. Here we present the ClusterSignificance package; a set of tools designed to assess the statistical...... significance of class separations downstream of dimensionality reduction algorithms. In addition, we demonstrate the design and utility of the ClusterSignificance package and utilize it to determine the importance of long non-coding RNA expression in the identity of multiple hematological malignancies....
Arnrich, B; Albert, A; Walter, J
2006-01-01
Among the coronary bypass patients from our Datamart database, we found a prevalence of 29.6% of diagnosed diabetics. 5.2% of the patients without a diagnosis of diabetes mellitus and a fasting plasma glucose level > 125 mg/dl were defined as undiagnosed diabetics. The objective of this paper was to compare univariate methods and techniques for risk stratification to determine, whether undiagnosed diabetes is per se a risk factor for increased ventilation time and length of ICU stay, and for increased prevalence of resuscitation, reintubation and 30-d mortality for diabetics in heart surgery. Univariate comparisons reveals that undiagnosed diabetics needed resuscitation significantly more often and had an increased ventilation time, while the length of ICU stay was significantly reduced. The significantly different distribution between the diabetics groups of 11 from 32 attributes examined, demands the use of methods for risk stratification. Both risk adjusted methods regression and matching confirm that undiagnosed diabetics had an increased ventilation time and an increased prevalence of resuscitation, while the length of ICU stay was not significantly reduced. A homogeneous distribution of the patient characteristics in the two diabetics groups could be achieved through a statistical matching method using the propensity score. In contrast to the regression analysis, a significantly increased prevalence of reintubation in undiagnosed diabetics was found. Based on an example of undiagnosed diabetics in heart surgery, the presented study reveals the necessity and the possibilities of techniques for risk stratification in retrospective analysis and shows how the potential of data collection from daily clinical practice can be used in an effective way.
Baradez, Marc-Olivier; Biziato, Daniela; Hassan, Enas; Marshall, Damian
2018-01-01
Cell therapies offer unquestionable promises for the treatment, and in some cases even the cure, of complex diseases. As we start to see more of these therapies gaining market authorization, attention is turning to the bioprocesses used for their manufacture, in particular the challenge of gaining higher levels of process control to help regulate cell behavior, manage process variability, and deliver product of a consistent quality. Many processes already incorporate the measurement of key markers such as nutrient consumption, metabolite production, and cell concentration, but these are often performed off-line and only at set time points in the process. Having the ability to monitor these markers in real-time using in-line sensors would offer significant advantages, allowing faster decision-making and a finer level of process control. In this study, we use Raman spectroscopy as an in-line optical sensor for bioprocess monitoring of an autologous T-cell immunotherapy model produced in a stirred tank bioreactor system. Using reference datasets generated on a standard bioanalyzer, we develop chemometric models from the Raman spectra for glucose, glutamine, lactate, and ammonia. These chemometric models can accurately monitor donor-specific increases in nutrient consumption and metabolite production as the primary T-cell transition from a recovery phase and begin proliferating. Using a univariate modeling approach, we then show how changes in peak intensity within the Raman spectra can be correlated with cell concentration and viability. These models, which act as surrogate markers, can be used to monitor cell behavior including cell proliferation rates, proliferative capacity, and transition of the cells to a quiescent phenotype. Finally, using the univariate models, we also demonstrate how Raman spectroscopy can be applied for real-time monitoring. The ability to measure these key parameters using an in-line Raman optical sensor makes it possible to have immediate
Baradez, Marc-Olivier; Biziato, Daniela; Hassan, Enas; Marshall, Damian
2018-01-01
Cell therapies offer unquestionable promises for the treatment, and in some cases even the cure, of complex diseases. As we start to see more of these therapies gaining market authorization, attention is turning to the bioprocesses used for their manufacture, in particular the challenge of gaining higher levels of process control to help regulate cell behavior, manage process variability, and deliver product of a consistent quality. Many processes already incorporate the measurement of key markers such as nutrient consumption, metabolite production, and cell concentration, but these are often performed off-line and only at set time points in the process. Having the ability to monitor these markers in real-time using in-line sensors would offer significant advantages, allowing faster decision-making and a finer level of process control. In this study, we use Raman spectroscopy as an in-line optical sensor for bioprocess monitoring of an autologous T-cell immunotherapy model produced in a stirred tank bioreactor system. Using reference datasets generated on a standard bioanalyzer, we develop chemometric models from the Raman spectra for glucose, glutamine, lactate, and ammonia. These chemometric models can accurately monitor donor-specific increases in nutrient consumption and metabolite production as the primary T-cell transition from a recovery phase and begin proliferating. Using a univariate modeling approach, we then show how changes in peak intensity within the Raman spectra can be correlated with cell concentration and viability. These models, which act as surrogate markers, can be used to monitor cell behavior including cell proliferation rates, proliferative capacity, and transition of the cells to a quiescent phenotype. Finally, using the univariate models, we also demonstrate how Raman spectroscopy can be applied for real-time monitoring. The ability to measure these key parameters using an in-line Raman optical sensor makes it possible to have immediate
Directory of Open Access Journals (Sweden)
Marc-Olivier Baradez
2018-03-01
Full Text Available Cell therapies offer unquestionable promises for the treatment, and in some cases even the cure, of complex diseases. As we start to see more of these therapies gaining market authorization, attention is turning to the bioprocesses used for their manufacture, in particular the challenge of gaining higher levels of process control to help regulate cell behavior, manage process variability, and deliver product of a consistent quality. Many processes already incorporate the measurement of key markers such as nutrient consumption, metabolite production, and cell concentration, but these are often performed off-line and only at set time points in the process. Having the ability to monitor these markers in real-time using in-line sensors would offer significant advantages, allowing faster decision-making and a finer level of process control. In this study, we use Raman spectroscopy as an in-line optical sensor for bioprocess monitoring of an autologous T-cell immunotherapy model produced in a stirred tank bioreactor system. Using reference datasets generated on a standard bioanalyzer, we develop chemometric models from the Raman spectra for glucose, glutamine, lactate, and ammonia. These chemometric models can accurately monitor donor-specific increases in nutrient consumption and metabolite production as the primary T-cell transition from a recovery phase and begin proliferating. Using a univariate modeling approach, we then show how changes in peak intensity within the Raman spectra can be correlated with cell concentration and viability. These models, which act as surrogate markers, can be used to monitor cell behavior including cell proliferation rates, proliferative capacity, and transition of the cells to a quiescent phenotype. Finally, using the univariate models, we also demonstrate how Raman spectroscopy can be applied for real-time monitoring. The ability to measure these key parameters using an in-line Raman optical sensor makes it possible
van Tulder, M.W.; Malmivaara, A.; Hayden, J.; Koes, B.
2007-01-01
STUDY DESIGN. Critical appraisal of the literature. OBJECIVES. The objective of this study was to assess if results of back pain trials are statistically significant and clinically important. SUMMARY OF BACKGROUND DATA. There seems to be a discrepancy between conclusions reported by authors and
Ismail, A.; Hassan, Noor I.
2013-09-01
Cancer is one of the principal causes of death in Malaysia. This study was performed to determine the pattern of rate of cancer deaths at a public hospital in Malaysia over an 11 year period from year 2001 to 2011, to determine the best fitted model of forecasting the rate of cancer deaths using Univariate Modeling and to forecast the rates for the next two years (2012 to 2013). The medical records of the death of patients with cancer admitted at this Hospital over 11 year's period were reviewed, with a total of 663 cases. The cancers were classified according to 10th Revision International Classification of Diseases (ICD-10). Data collected include socio-demographic background of patients such as registration number, age, gender, ethnicity, ward and diagnosis. Data entry and analysis was accomplished using SPSS 19.0 and Minitab 16.0. The five Univariate Models used were Naïve with Trend Model, Average Percent Change Model (ACPM), Single Exponential Smoothing, Double Exponential Smoothing and Holt's Method. The overall 11 years rate of cancer deaths showed that at this hospital, Malay patients have the highest percentage (88.10%) compared to other ethnic groups with males (51.30%) higher than females. Lung and breast cancer have the most number of cancer deaths among gender. About 29.60% of the patients who died due to cancer were aged 61 years old and above. The best Univariate Model used for forecasting the rate of cancer deaths is Single Exponential Smoothing Technique with alpha of 0.10. The forecast for the rate of cancer deaths shows a horizontally or flat value. The forecasted mortality trend remains at 6.84% from January 2012 to December 2013. All the government and private sectors and non-governmental organizations need to highlight issues on cancer especially lung and breast cancers to the public through campaigns using mass media, media electronics, posters and pamphlets in the attempt to decrease the rate of cancer deaths in Malaysia.
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti
2016-07-01
A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness.Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Indirectional statistics and the significance of an asymmetry discovered by Birch
International Nuclear Information System (INIS)
Kendall, D.G.; Young, G.A.
1984-01-01
Birch (1982, Nature, 298, 451) reported an apparent 'statistical asymmetry of the Universe'. The authors here develop 'indirectional analysis' as a technique for investigating statistical effects of this kind and conclude that the reported effect (whatever may be its origin) is strongly supported by the observations. The estimated pole of the asymmetry is at RA 13h 30m, Dec. -37deg. The angular error in its estimation is unlikely to exceed 20-30deg. (author)
Directory of Open Access Journals (Sweden)
Carlos Lago-Peñas
2010-06-01
Full Text Available The aim of the present study was to analyze men's football competitions, trying to identify which game-related statistics allow to discriminate winning, drawing and losing teams. The sample used corresponded to 380 games from the 2008-2009 season of the Spanish Men's Professional League. The game-related statistics gathered were: total shots, shots on goal, effectiveness, assists, crosses, offsides commited and received, corners, ball possession, crosses against, fouls committed and received, corners against, yellow and red cards, and venue. An univariate (t-test and multivariate (discriminant analysis of data was done. The results showed that winning teams had averages that were significantly higher for the following game statistics: total shots (p < 0.001, shots on goal (p < 0.01, effectiveness (p < 0.01, assists (p < 0.01, offsides committed (p < 0.01 and crosses against (p < 0.01. Losing teams had significantly higher averages in the variable crosses (p < 0.01, offsides received (p < 0. 01 and red cards (p < 0.01. Discriminant analysis allowed to conclude the following: the variables that discriminate between winning, drawing and losing teams were the total shots, shots on goal, crosses, crosses against, ball possession and venue. Coaches and players should be aware for these different profiles in order to increase knowledge about game cognitive and motor solicitation and, therefore, to evaluate specificity at the time of practice and game planning
Ytsma, Cai R.; Dyar, M. Darby
2018-01-01
Hydrogen (H) is a critical element to measure on the surface of Mars because its presence in mineral structures is indicative of past hydrous conditions. The Curiosity rover uses the laser-induced breakdown spectrometer (LIBS) on the ChemCam instrument to analyze rocks for their H emission signal at 656.6 nm, from which H can be quantified. Previous LIBS calibrations for H used small data sets measured on standards and/or manufactured mixtures of hydrous minerals and rocks and applied univariate regression to spectra normalized in a variety of ways. However, matrix effects common to LIBS make these calibrations of limited usefulness when applied to the broad range of compositions on the Martian surface. In this study, 198 naturally-occurring hydrous geological samples covering a broad range of bulk compositions with directly-measured H content are used to create more robust prediction models for measuring H in LIBS data acquired under Mars conditions. Both univariate and multivariate prediction models, including partial least square (PLS) and the least absolute shrinkage and selection operator (Lasso), are compared using several different methods for normalization of H peak intensities. Data from the ChemLIBS Mars-analog spectrometer at Mount Holyoke College are compared against spectra from the same samples acquired using a ChemCam-like instrument at Los Alamos National Laboratory and the ChemCam instrument on Mars. Results show that all current normalization and data preprocessing variations for quantifying H result in models with statistically indistinguishable prediction errors (accuracies) ca. ± 1.5 weight percent (wt%) H2O, limiting the applications of LIBS in these implementations for geological studies. This error is too large to allow distinctions among the most common hydrous phases (basalts, amphiboles, micas) to be made, though some clays (e.g., chlorites with ≈ 12 wt% H2O, smectites with 15-20 wt% H2O) and hydrated phases (e.g., gypsum with ≈ 20
Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Sacks, David B; Yu, Yi-Kuo
2018-06-05
Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html . Graphical Abstract ᅟ.
Energy Technology Data Exchange (ETDEWEB)
Sfetsos, A. [7 Pirsou Str., Athens (Greece); Coonick, A.H. [Imperial Coll. of Science Technology and Medicine, Dept. of Electrical and Electronic Engineering, London (United Kingdom)
2000-07-01
This paper introduces a new approach for the forecasting of mean hourly global solar radiation received by a horizontal surface. In addition to the traditional linear methods, several artificial-intelligence-based techniques are studied. These include linear, feed-forward, recurrent Elman and Radial Basis neural networks alongside the adaptive neuro-fuzzy inference scheme. The problem is examined initially for the univariate case, and is extended to include additional meteorological parameters in the process of estimating the optimum model. The results indicate that the developed artificial intelligence models predict the solar radiation time series more effectively compared to the conventional procedures based on the clearness index. The forecasting ability of some models can be further enhanced with the use of additional meteorological parameters. (Author)
Kim, Sung-Min; Choi, Yosoon
2017-06-18
To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z -score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z -scores: high content with a high z -score (HH), high content with a low z -score (HL), low content with a high z -score (LH), and low content with a low z -score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1-4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required.
Directory of Open Access Journals (Sweden)
Sung-Min Kim
2017-06-01
Full Text Available To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH, high content with a low z-score (HL, low content with a high z-score (LH, and low content with a low z-score (LL. The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1–4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required.
Karian, Zaven A
2000-01-01
Throughout the physical and social sciences, researchers face the challenge of fitting statistical distributions to their data. Although the study of statistical modelling has made great strides in recent years, the number and variety of distributions to choose from-all with their own formulas, tables, diagrams, and general properties-continue to create problems. For a specific application, which of the dozens of distributions should one use? What if none of them fit well?Fitting Statistical Distributions helps answer those questions. Focusing on techniques used successfully across many fields, the authors present all of the relevant results related to the Generalized Lambda Distribution (GLD), the Generalized Bootstrap (GB), and Monte Carlo simulation (MC). They provide the tables, algorithms, and computer programs needed for fitting continuous probability distributions to data in a wide variety of circumstances-covering bivariate as well as univariate distributions, and including situations where moments do...
Calkins, D. S.
1998-01-01
When the dependent (or response) variable response variable in an experiment has direction and magnitude, one approach that has been used for statistical analysis involves splitting magnitude and direction and applying univariate statistical techniques to the components. However, such treatment of quantities with direction and magnitude is not justifiable mathematically and can lead to incorrect conclusions about relationships among variables and, as a result, to flawed interpretations. This note discusses a problem with that practice and recommends mathematically correct procedures to be used with dependent variables that have direction and magnitude for 1) computation of mean values, 2) statistical contrasts of and confidence intervals for means, and 3) correlation methods.
Energy Technology Data Exchange (ETDEWEB)
Crow, C.J.
1985-01-01
Middle Ordovician age Chickamauga Group carbonates crop out along the Birmingham and Murphrees Valley anticlines in central Alabama. The macrofossil contents on exposed surfaces of seven bioherms have been counted to determine their various paleontologic characteristics. Twelve groups of organisms are present in these bioherms. Dominant organisms include bryozoans, algae, brachiopods, sponges, pelmatozoans, stromatoporoids and corals. Minor accessory fauna include predators, scavengers and grazers such as gastropods, ostracods, trilobites, cephalopods and pelecypods. Vertical and horizontal niche zonation has been detected for some of the bioherm dwelling fauna. No one bioherm of those studied exhibits all 12 groups of organisms; rather, individual bioherms display various subsets of the total diversity. Statistical treatment (G-test) of the diversity data indicates a lack of statistical homogeneity of the bioherms, both within and between localities. Between-locality population heterogeneity can be ascribed to differences in biologic responses to such gross environmental factors as water depth and clarity, and energy levels. At any one locality, gross aspects of the paleoenvironments are assumed to have been more uniform. Significant differences among bioherms at any one locality may have resulted from patchy distribution of species populations, differential preservation and other factors.
Detecting Statistically Significant Communities of Triangle Motifs in Undirected Networks
2016-04-26
Systems, Statistics & Management Science, University of Alabama, USA. 1 DISTRIBUTION A: Distribution approved for public release. Contents 1 Summary 5...13 5 Application to Real Networks 18 5.1 2012 FBS Football Schedule Network... football schedule network. . . . . . . . . . . . . . . . . . . . . . 21 14 Stem plot of degree-ordered vertices versus the degree for college football
Kellerer-Pirklbauer, Andreas
2016-04-01
Longer data series (e.g. >10 a) of ground temperatures in alpine regions are helpful to improve the understanding regarding the effects of present climate change on distribution and thermal characteristics of seasonal frost- and permafrost-affected areas. Beginning in 2004 - and more intensively since 2006 - a permafrost and seasonal frost monitoring network was established in Central and Eastern Austria by the University of Graz. This network consists of c.60 ground temperature (surface and near-surface) monitoring sites which are located at 1922-3002 m a.s.l., at latitude 46°55'-47°22'N and at longitude 12°44'-14°41'E. These data allow conclusions about general ground thermal conditions, potential permafrost occurrence, trend during the observation period, and regional pattern of changes. Calculations and analyses of several different temperature-related parameters were accomplished. At an annual scale a region-wide statistical significant warming during the observation period was revealed by e.g. an increase in mean annual temperature values (mean, maximum) or the significant lowering of the surface frost number (F+). At a seasonal scale no significant trend of any temperature-related parameter was in most cases revealed for spring (MAM) and autumn (SON). Winter (DJF) shows only a weak warming. In contrast, the summer (JJA) season reveals in general a significant warming as confirmed by several different temperature-related parameters such as e.g. mean seasonal temperature, number of thawing degree days, number of freezing degree days, or days without night frost. On a monthly basis August shows the statistically most robust and strongest warming of all months, although regional differences occur. Despite the fact that the general ground temperature warming during the last decade is confirmed by the field data in the study region, complications in trend analyses arise by temperature anomalies (e.g. warm winter 2006/07) or substantial variations in the winter
Parametric statistical change point analysis
Chen, Jie
2000-01-01
This work is an in-depth study of the change point problem from a general point of view and a further examination of change point analysis of the most commonly used statistical models Change point problems are encountered in such disciplines as economics, finance, medicine, psychology, signal processing, and geology, to mention only several The exposition is clear and systematic, with a great deal of introductory material included Different models are presented in each chapter, including gamma and exponential models, rarely examined thus far in the literature Other models covered in detail are the multivariate normal, univariate normal, regression, and discrete models Extensive examples throughout the text emphasize key concepts and different methodologies are used, namely the likelihood ratio criterion, and the Bayesian and information criterion approaches A comprehensive bibliography and two indices complete the study
Compounding approach for univariate time series with nonstationary variances
Schäfer, Rudi; Barkhofen, Sonja; Guhr, Thomas; Stöckmann, Hans-Jürgen; Kuhl, Ulrich
2015-12-01
A defining feature of nonstationary systems is the time dependence of their statistical parameters. Measured time series may exhibit Gaussian statistics on short time horizons, due to the central limit theorem. The sample statistics for long time horizons, however, averages over the time-dependent variances. To model the long-term statistical behavior, we compound the local distribution with the distribution of its parameters. Here, we consider two concrete, but diverse, examples of such nonstationary systems: the turbulent air flow of a fan and a time series of foreign exchange rates. Our main focus is to empirically determine the appropriate parameter distribution for the compounding approach. To this end, we extract the relevant time scales by decomposing the time signals into windows and determine the distribution function of the thus obtained local variances.
Conducting tests for statistically significant differences using forest inventory data
James A. Westfall; Scott A. Pugh; John W. Coulston
2013-01-01
Many forest inventory and monitoring programs are based on a sample of ground plots from which estimates of forest resources are derived. In addition to evaluating metrics such as number of trees or amount of cubic wood volume, it is often desirable to make comparisons between resource attributes. To properly conduct statistical tests for differences, it is imperative...
International Nuclear Information System (INIS)
Lauss, Martin; Frigyesi, Attila; Ryden, Tobias; Höglund, Mattias
2010-01-01
Genome wide gene expression data is a rich source for the identification of gene signatures suitable for clinical purposes and a number of statistical algorithms have been described for both identification and evaluation of such signatures. Some employed algorithms are fairly complex and hence sensitive to over-fitting whereas others are more simple and straight forward. Here we present a new type of simple algorithm based on ROC analysis and the use of metagenes that we believe will be a good complement to existing algorithms. The basis for the proposed approach is the use of metagenes, instead of collections of individual genes, and a feature selection using AUC values obtained by ROC analysis. Each gene in a data set is assigned an AUC value relative to the tumor class under investigation and the genes are ranked according to these values. Metagenes are then formed by calculating the mean expression level for an increasing number of ranked genes, and the metagene expression value that optimally discriminates tumor classes in the training set is used for classification of new samples. The performance of the metagene is then evaluated using LOOCV and balanced accuracies. We show that the simple uni-variate gene expression average algorithm performs as well as several alternative algorithms such as discriminant analysis and the more complex approaches such as SVM and neural networks. The R package rocc is freely available at http://cran.r-project.org/web/packages/rocc/index.html
Perneger, Thomas V; Combescure, Christophe
2017-07-01
Published P-values provide a window into the global enterprise of medical research. The aim of this study was to use the distribution of published P-values to estimate the relative frequencies of null and alternative hypotheses and to seek irregularities suggestive of publication bias. This cross-sectional study included P-values published in 120 medical research articles in 2016 (30 each from the BMJ, JAMA, Lancet, and New England Journal of Medicine). The observed distribution of P-values was compared with expected distributions under the null hypothesis (i.e., uniform between 0 and 1) and the alternative hypothesis (strictly decreasing from 0 to 1). P-values were categorized according to conventional levels of statistical significance and in one-percent intervals. Among 4,158 recorded P-values, 26.1% were highly significant (P values values equal to 1, and (3) about twice as many P-values less than 0.05 compared with those more than 0.05. The latter finding was seen in both randomized trials and observational studies, and in most types of analyses, excepting heterogeneity tests and interaction tests. Under plausible assumptions, we estimate that about half of the tested hypotheses were null and the other half were alternative. This analysis suggests that statistical tests published in medical journals are not a random sample of null and alternative hypotheses but that selective reporting is prevalent. In particular, significant results are about twice as likely to be reported as nonsignificant results. Copyright © 2017 Elsevier Inc. All rights reserved.
Energy Technology Data Exchange (ETDEWEB)
Schlink, U.
1996-12-31
The work evaluates specifically the nuisance data provided by the measuring station in the centre of Leipig during the period from 1980 to 1993, with the aim to develop an algorithm for making very short-term forecasts of excessive nuisances. Forecasting was to be univariate, i.e., based exclusively on the half-hourly readings of SO{sub 2} concentrations taken in the past. As shown by Fourier analysis, there exist three main and mutually independent spectral regions: the high-frequency sector (period < 12 hours) of unstable irregularities, the seasonal sector with the periods of 24 and 12 hours, and the low-frequency sector (period > 24 hours). After breaking the measuring series up into components, the low-frequency sector is termed trend component, or trend for short. For obtaining the components, a Kalman filter is used. It was found that smog episodes are most adequately described by the trend component. This is therefore more closely investigated. The phase representation then shows characteristic trajectories of the trends. (orig./KW) [Deutsch] In der vorliegende Arbeit wurden speziell die Immissionsdaten der Messstation Leipzig-Mitte des Zeitraumes 1980-1993 mit dem Ziel der Erstellung eines Algorithmus fuer die Kuerzestfristprognose von Ueberschreitungssituationen untersucht. Die Prognosestellung sollte allein anhand der in der Vergangenheit registrierten Halbstundenwerte der SO{sub 2}-Konzentration, also univariat erfolgen. Wie die Fourieranalyse zeigt, gibt es drei wesentliche und voneinander unabhaengige Spektralbereiche: Den hochfrequenten Bereich (Periode <12 Stunden) der instabilen Irregularitaeten, den saisonalen Anteil mit den Perioden von 24 und 12 Stunden und den niedrigfrequenten Bereich (Periode >24 Stunden). Letzterer wird nach einer Zerlegung der Messreihe in Komponenten als Trendkomponente (oder kurz Trend) bezeichnet. Fuer die Komponentenzerlegung wird ein Kalman-Filter verwendet. Es stellt sich heraus, dass Smogepisoden am deutlichsten
An R2 statistic for fixed effects in the linear mixed model.
Edwards, Lloyd J; Muller, Keith E; Wolfinger, Russell D; Qaqish, Bahjat F; Schabenberger, Oliver
2008-12-20
Statisticians most often use the linear mixed model to analyze Gaussian longitudinal data. The value and familiarity of the R(2) statistic in the linear univariate model naturally creates great interest in extending it to the linear mixed model. We define and describe how to compute a model R(2) statistic for the linear mixed model by using only a single model. The proposed R(2) statistic measures multivariate association between the repeated outcomes and the fixed effects in the linear mixed model. The R(2) statistic arises as a 1-1 function of an appropriate F statistic for testing all fixed effects (except typically the intercept) in a full model. The statistic compares the full model with a null model with all fixed effects deleted (except typically the intercept) while retaining exactly the same covariance structure. Furthermore, the R(2) statistic leads immediately to a natural definition of a partial R(2) statistic. A mixed model in which ethnicity gives a very small p-value as a longitudinal predictor of blood pressure (BP) compellingly illustrates the value of the statistic. In sharp contrast to the extreme p-value, a very small R(2) , a measure of statistical and scientific importance, indicates that ethnicity has an almost negligible association with the repeated BP outcomes for the study.
Cafri, Guy; Kromrey, Jeffrey D.; Brannick, Michael T.
2010-01-01
This article uses meta-analyses published in "Psychological Bulletin" from 1995 to 2005 to describe meta-analyses in psychology, including examination of statistical power, Type I errors resulting from multiple comparisons, and model choice. Retrospective power estimates indicated that univariate categorical and continuous moderators, individual…
Morrissey, L. A.; Weinstock, K. J.; Mouat, D. A.; Card, D. H.
1984-01-01
An evaluation of Thematic Mapper Simulator (TMS) data for the geobotanical discrimination of rock types based on vegetative cover characteristics is addressed in this research. A methodology for accomplishing this evaluation utilizing univariate and multivariate techniques is presented. TMS data acquired with a Daedalus DEI-1260 multispectral scanner were integrated with vegetation and geologic information for subsequent statistical analyses, which included a chi-square test, an analysis of variance, stepwise discriminant analysis, and Duncan's multiple range test. Results indicate that ultramafic rock types are spectrally separable from nonultramafics based on vegetative cover through the use of statistical analyses.
Chek, Mohd Zaki Awang; Ahmad, Abu Bakar; Ridzwan, Ahmad Nur Azam Ahmad; Jelas, Imran Md.; Jamal, Nur Faezah; Ismail, Isma Liana; Zulkifli, Faiz; Noor, Syamsul Ikram Mohd
2012-09-01
The main objective of this study is to forecast the future claims amount of Invalidity Pension Scheme (IPS). All data were derived from SOCSO annual reports from year 1972 - 2010. These claims consist of all claims amount from 7 benefits offered by SOCSO such as Invalidity Pension, Invalidity Grant, Survivors Pension, Constant Attendance Allowance, Rehabilitation, Funeral and Education. Prediction of future claims of Invalidity Pension Scheme will be made using Univariate Forecasting Models to predict the future claims among workforce in Malaysia.
Directory of Open Access Journals (Sweden)
Yukinori Sakao
Full Text Available BACKGROUND: We aimed to clarify that the size of the lung adenocarcinoma evaluated using mediastinal window on computed tomography is an important and useful modality for predicting invasiveness, lymph node metastasis and prognosis in small adenocarcinoma. METHODS: We evaluated 176 patients with small lung adenocarcinomas (diameter, 1-3 cm who underwent standard surgical resection. Tumours were examined using computed tomography with thin section conditions (1.25 mm thick on high-resolution computed tomography with tumour dimensions evaluated under two settings: lung window and mediastinal window. We also determined the patient age, gender, preoperative nodal status, tumour size, tumour disappearance ratio, preoperative serum carcinoembryonic antigen levels and pathological status (lymphatic vessel, vascular vessel or pleural invasion. Recurrence-free survival was used for prognosis. RESULTS: Lung window, mediastinal window, tumour disappearance ratio and preoperative nodal status were significant predictive factors for recurrence-free survival in univariate analyses. Areas under the receiver operator curves for recurrence were 0.76, 0.73 and 0.65 for mediastinal window, tumour disappearance ratio and lung window, respectively. Lung window, mediastinal window, tumour disappearance ratio, preoperative serum carcinoembryonic antigen levels and preoperative nodal status were significant predictive factors for lymph node metastasis in univariate analyses; areas under the receiver operator curves were 0.61, 0.76, 0.72 and 0.66, for lung window, mediastinal window, tumour disappearance ratio and preoperative serum carcinoembryonic antigen levels, respectively. Lung window, mediastinal window, tumour disappearance ratio, preoperative serum carcinoembryonic antigen levels and preoperative nodal status were significant factors for lymphatic vessel, vascular vessel or pleural invasion in univariate analyses; areas under the receiver operator curves were 0
Sakao, Yukinori; Kuroda, Hiroaki; Mun, Mingyon; Uehara, Hirofumi; Motoi, Noriko; Ishikawa, Yuichi; Nakagawa, Ken; Okumura, Sakae
2014-01-01
Background We aimed to clarify that the size of the lung adenocarcinoma evaluated using mediastinal window on computed tomography is an important and useful modality for predicting invasiveness, lymph node metastasis and prognosis in small adenocarcinoma. Methods We evaluated 176 patients with small lung adenocarcinomas (diameter, 1–3 cm) who underwent standard surgical resection. Tumours were examined using computed tomography with thin section conditions (1.25 mm thick on high-resolution computed tomography) with tumour dimensions evaluated under two settings: lung window and mediastinal window. We also determined the patient age, gender, preoperative nodal status, tumour size, tumour disappearance ratio, preoperative serum carcinoembryonic antigen levels and pathological status (lymphatic vessel, vascular vessel or pleural invasion). Recurrence-free survival was used for prognosis. Results Lung window, mediastinal window, tumour disappearance ratio and preoperative nodal status were significant predictive factors for recurrence-free survival in univariate analyses. Areas under the receiver operator curves for recurrence were 0.76, 0.73 and 0.65 for mediastinal window, tumour disappearance ratio and lung window, respectively. Lung window, mediastinal window, tumour disappearance ratio, preoperative serum carcinoembryonic antigen levels and preoperative nodal status were significant predictive factors for lymph node metastasis in univariate analyses; areas under the receiver operator curves were 0.61, 0.76, 0.72 and 0.66, for lung window, mediastinal window, tumour disappearance ratio and preoperative serum carcinoembryonic antigen levels, respectively. Lung window, mediastinal window, tumour disappearance ratio, preoperative serum carcinoembryonic antigen levels and preoperative nodal status were significant factors for lymphatic vessel, vascular vessel or pleural invasion in univariate analyses; areas under the receiver operator curves were 0.60, 0.81, 0
Prognostic significance of blood coagulation tests in carcinoma of the lung and colon.
Wojtukiewicz, M Z; Zacharski, L R; Moritz, T E; Hur, K; Edwards, R L; Rickles, F R
1992-08-01
Blood coagulation test results were collected prospectively in patients with previously untreated, advanced lung or colon cancer who entered into a clinical trial. In patients with colon cancer, reduced survival was associated (in univariate analysis) with higher values obtained at entry to the study for fibrinogen, fibrin(ogen) split products, antiplasmin, and fibrinopeptide A and accelerated euglobulin lysis times. In patients with non-small cell lung cancer, reduced survival was associated (in univariate analysis) with higher fibrinogen and fibrin(ogen) split products, platelet counts and activated partial thromboplastin times. In patients with small cell carcinoma of the lung, only higher activated partial thromboplastin times were associated (in univariate analysis) with reduced survival in patients with disseminated disease. In multivariate analysis, higher activated partial thromboplastin times were a significant independent predictor of survival for patients with non-small cell lung cancer limited to one hemithorax and with disseminated small cell carcinoma of the lung. Fibrin(ogen) split product levels were an independent predictor of survival for patients with disseminated non-small cell lung cancer as were both the fibrinogen and fibrinopeptide A levels for patients with disseminated colon cancer. These results suggest that certain tests of blood coagulation may be indicative of prognosis in lung and colon cancer. The heterogeneity of these results suggests that the mechanism(s), intensity, and pathophysiological significance of coagulation activation in cancer may differ between tumour types.
MIDAS: Regionally linear multivariate discriminative statistical mapping.
Varol, Erdem; Sotiras, Aristeidis; Davatzikos, Christos
2018-07-01
Statistical parametric maps formed via voxel-wise mass-univariate tests, such as the general linear model, are commonly used to test hypotheses about regionally specific effects in neuroimaging cross-sectional studies where each subject is represented by a single image. Despite being informative, these techniques remain limited as they ignore multivariate relationships in the data. Most importantly, the commonly employed local Gaussian smoothing, which is important for accounting for registration errors and making the data follow Gaussian distributions, is usually chosen in an ad hoc fashion. Thus, it is often suboptimal for the task of detecting group differences and correlations with non-imaging variables. Information mapping techniques, such as searchlight, which use pattern classifiers to exploit multivariate information and obtain more powerful statistical maps, have become increasingly popular in recent years. However, existing methods may lead to important interpretation errors in practice (i.e., misidentifying a cluster as informative, or failing to detect truly informative voxels), while often being computationally expensive. To address these issues, we introduce a novel efficient multivariate statistical framework for cross-sectional studies, termed MIDAS, seeking highly sensitive and specific voxel-wise brain maps, while leveraging the power of regional discriminant analysis. In MIDAS, locally linear discriminative learning is applied to estimate the pattern that best discriminates between two groups, or predicts a variable of interest. This pattern is equivalent to local filtering by an optimal kernel whose coefficients are the weights of the linear discriminant. By composing information from all neighborhoods that contain a given voxel, MIDAS produces a statistic that collectively reflects the contribution of the voxel to the regional classifiers as well as the discriminative power of the classifiers. Critically, MIDAS efficiently assesses the
Swanson, David M; Blacker, Deborah; Alchawa, Taofik; Ludwig, Kerstin U; Mangold, Elisabeth; Lange, Christoph
2013-11-07
The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective and those methods currently implemented are often permutation-based. One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to "filter" redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate. We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the
Directory of Open Access Journals (Sweden)
Abdel Samee Nagwan M
2012-08-01
Full Text Available Abstract Background Discovering new biomarkers has a great role in improving early diagnosis of Hepatocellular carcinoma (HCC. The experimental determination of biomarkers needs a lot of time and money. This motivates this work to use in-silico prediction of biomarkers to reduce the number of experiments required for detecting new ones. This is achieved by extracting the most representative genes in microarrays of HCC. Results In this work, we provide a method for extracting the differential expressed genes, up regulated ones, that can be considered candidate biomarkers in high throughput microarrays of HCC. We examine the power of several gene selection methods (such as Pearson’s correlation coefficient, Cosine coefficient, Euclidean distance, Mutual information and Entropy with different estimators in selecting informative genes. A biological interpretation of the highly ranked genes is done using KEGG (Kyoto Encyclopedia of Genes and Genomes pathways, ENTREZ and DAVID (Database for Annotation, Visualization, and Integrated Discovery databases. The top ten genes selected using Pearson’s correlation coefficient and Cosine coefficient contained six genes that have been implicated in cancer (often multiple cancers genesis in previous studies. A fewer number of genes were obtained by the other methods (4 genes using Mutual information, 3genes using Euclidean distance and only one gene using Entropy. A better result was obtained by the utilization of a hybrid approach based on intersecting the highly ranked genes in the output of all investigated methods. This hybrid combination yielded seven genes (2 genes for HCC and 5 genes in different types of cancer in the top ten genes of the list of intersected genes. Conclusions To strengthen the effectiveness of the univariate selection methods, we propose a hybrid approach by intersecting several of these methods in a cascaded manner. This approach surpasses all of univariate selection methods when
Methods for meta-analysis of multiple traits using GWAS summary statistics.
Ray, Debashree; Boehnke, Michael
2018-03-01
Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses
Age most significant predictor of requiring enteral feeding in head-and-neck cancer patients
International Nuclear Information System (INIS)
Sachdev, Sean; Refaat, Tamer; Bacchus, Ian D; Sathiaseelan, Vythialinga; Mittal, Bharat B
2015-01-01
A significant number of patients treated for head and neck squamous cell cancer (HNSCC) undergo enteral tube feeding. Data suggest that avoiding enteral feeding can prevent long-term tube dependence and disuse of the swallowing mechanism which has been linked to complications such as prolonged dysphagia and esophageal constriction. We examined detailed dosimetric and clinical parameters to better identify those at risk of requiring enteral feeding. One hundred patients with advanced stage HNSCC were retrospectively analyzed after intensity-modulated radiation therapy (IMRT) to a median dose of 70 Gy (range: 60-75 Gy) with concurrent chemotherapy in nearly all cases (97%). Patients with significant weight loss (>10%) in the setting of severely reduced oral intake were referred for placement of a percutaneous endoscopic gastrostomy (PEG) tube. Detailed DVH parameters were collected for several structures. Univariate and multivariate analyses using logistic regression were used to determine clinical and dosimetric factors associated with needing enteral feeding. Dichotomous outcomes were tested using Fisher’s exact test and continuous variables between groups using the Wilcoxon rank-sum test. Thirty-three percent of patients required placement of an enteral feeding tube. The median time to tube placement was 25 days from start of treatment, after a median dose of 38 Gy. On univariate analysis, age (p = 0.0008), the DFH (Docetaxel/5-FU/Hydroxyurea) chemotherapy regimen (p = .042) and b.i.d treatment (P = 0.040) (used in limited cases on protocol) predicted need for enteral feeding. On multivariate analysis, age remained the single statistically significant factor (p = 0.003) regardless of other clinical features (e.g. BMI) and all radiation planning parameters. For patients 60 or older compared to younger adults, the odds ratio for needing enteral feeding was 4.188 (p = 0.0019). Older age was found to be the most significant risk factor for needing enteral feeding in
Hashim, Muhammad Jawad
2010-09-01
Post-hoc secondary data analysis with no prespecified hypotheses has been discouraged by textbook authors and journal editors alike. Unfortunately no single term describes this phenomenon succinctly. I would like to coin the term "sigsearch" to define this practice and bring it within the teaching lexicon of statistics courses. Sigsearch would include any unplanned, post-hoc search for statistical significance using multiple comparisons of subgroups. It would also include data analysis with outcomes other than the prespecified primary outcome measure of a study as well as secondary data analyses of earlier research.
International Nuclear Information System (INIS)
Batista Braga, Jez Willian; Trevizan, Lilian Cristina; Nunes, Lidiane Cristina; Aparecida Rufini, Iolanda; Santos, Dario; Krug, Francisco Jose
2010-01-01
The application of laser induced breakdown spectrometry (LIBS) aiming the direct analysis of plant materials is a great challenge that still needs efforts for its development and validation. In this way, a series of experimental approaches has been carried out in order to show that LIBS can be used as an alternative method to wet acid digestions based methods for analysis of agricultural and environmental samples. The large amount of information provided by LIBS spectra for these complex samples increases the difficulties for selecting the most appropriated wavelengths for each analyte. Some applications have suggested that improvements in both accuracy and precision can be achieved by the application of multivariate calibration in LIBS data when compared to the univariate regression developed with line emission intensities. In the present work, the performance of univariate and multivariate calibration, based on partial least squares regression (PLSR), was compared for analysis of pellets of plant materials made from an appropriate mixture of cryogenically ground samples with cellulose as the binding agent. The development of a specific PLSR model for each analyte and the selection of spectral regions containing only lines of the analyte of interest were the best conditions for the analysis. In this particular application, these models showed a similar performance, but PLSR seemed to be more robust due to a lower occurrence of outliers in comparison to the univariate method. Data suggests that efforts dealing with sample presentation and fitness of standards for LIBS analysis must be done in order to fulfill the boundary conditions for matrix independent development and validation.
Yepes-Calderon, Fernando; Brun, Caroline; Sant, Nishita; Thompson, Paul; Lepore, Natasha
2015-01-01
Tensor-Based Morphometry (TBM) is an increasingly popular method for group analysis of brain MRI data. The main steps in the analysis consist of a nonlinear registration to align each individual scan to a common space, and a subsequent statistical analysis to determine morphometric differences, or difference in fiber structure between groups. Recently, we implemented the Statistically-Assisted Fluid Registration Algorithm or SAFIRA,1 which is designed for tracking morphometric differences among populations. To this end, SAFIRA allows the inclusion of statistical priors extracted from the populations being studied as regularizers in the registration. This flexibility and degree of sophistication limit the tool to expert use, even more so considering that SAFIRA was initially implemented in command line mode. Here, we introduce a new, intuitive, easy to use, Matlab-based graphical user interface for SAFIRA's multivariate TBM. The interface also generates different choices for the TBM statistics, including both the traditional univariate statistics on the Jacobian matrix, and comparison of the full deformation tensors.2 This software will be freely disseminated to the neuroimaging research community.
Fayez, Yasmin Mohammed; Tawakkol, Shereen Mostafa; Fahmy, Nesma Mahmoud; Lotfy, Hayam Mahmoud; Shehata, Mostafa Abdel-Aty
2018-04-01
Three methods of analysis are conducted that need computational procedures by the Matlab® software. The first is the univariate mean centering method which eliminates the interfering signal of the one component at a selected wave length leaving the amplitude measured to represent the component of interest only. The other two multivariate methods named PLS and PCR depend on a large number of variables that lead to extraction of the maximum amount of information required to determine the component of interest in the presence of the other. Good accurate and precise results are obtained from the three methods for determining clotrimazole in the linearity range 1-12 μg/mL and 75-550 μg/mL with dexamethasone acetate 2-20 μg/mL in synthetic mixtures and pharmaceutical formulation using two different spectral regions 205-240 nm and 233-278 nm. The results obtained are compared statistically to each other and to the official methods.
Directory of Open Access Journals (Sweden)
E. A. Tatokchin
2017-01-01
Full Text Available Development of the modern educational technologies caused by broad introduction of comput-er testing and development of distant forms of education does necessary revision of methods of an examination of pupils. In work it was shown, need transition to mathematical criteria, exami-nations of knowledge which are deprived of subjectivity. In article the review of the problems arising at realization of this task and are offered approaches for its decision. The greatest atten-tion is paid to discussion of a problem of objective transformation of rated estimates of the ex-pert on to the scale estimates of the student. In general, the discussion this question is was con-cluded that the solution to this problem lies in the creation of specialized intellectual systems. The basis for constructing intelligent system laid the mathematical model of self-organizing nonequilibrium dissipative system, which is a group of students. This article assumes that the dissipative system is provided by the constant influx of new test items of the expert and non-equilibrium – individual psychological characteristics of students in the group. As a result, the system must self-organize themselves into stable patterns. This patern will allow for, relying on large amounts of data, get a statistically significant assessment of student. To justify the pro-posed approach in the work presents the data of the statistical analysis of the results of testing a large sample of students (> 90. Conclusions from this statistical analysis allowed to develop intelligent system statistically significant examination of student performance. It is based on data clustering algorithm (k-mean for the three key parameters. It is shown that this approach allows you to create of the dynamics and objective expertise evaluation.
Handbook of univariate and multivariate data analysis and interpretation with SPSS
Ho, Robert
2006-01-01
Many statistics texts tend to focus more on the theory and mathematics underlying statistical tests than on their applications and interpretation. This can leave readers with little understanding of how to apply statistical tests or how to interpret their findings. While the SPSS statistical software has done much to alleviate the frustrations of social science professionals and students who must analyze data, they still face daunting challenges in selecting the proper tests, executing the tests, and interpreting the test results.With emphasis firmly on such practical matters, this handbook se
Directory of Open Access Journals (Sweden)
Anita Lindmark
Full Text Available When profiling hospital performance, quality inicators are commonly evaluated through hospital-specific adjusted means with confidence intervals. When identifying deviations from a norm, large hospitals can have statistically significant results even for clinically irrelevant deviations while important deviations in small hospitals can remain undiscovered. We have used data from the Swedish Stroke Register (Riksstroke to illustrate the properties of a benchmarking method that integrates considerations of both clinical relevance and level of statistical significance.The performance measure used was case-mix adjusted risk of death or dependency in activities of daily living within 3 months after stroke. A hospital was labeled as having outlying performance if its case-mix adjusted risk exceeded a benchmark value with a specified statistical confidence level. The benchmark was expressed relative to the population risk and should reflect the clinically relevant deviation that is to be detected. A simulation study based on Riksstroke patient data from 2008-2009 was performed to investigate the effect of the choice of the statistical confidence level and benchmark value on the diagnostic properties of the method.Simulations were based on 18,309 patients in 76 hospitals. The widely used setting, comparing 95% confidence intervals to the national average, resulted in low sensitivity (0.252 and high specificity (0.991. There were large variations in sensitivity and specificity for different requirements of statistical confidence. Lowering statistical confidence improved sensitivity with a relatively smaller loss of specificity. Variations due to different benchmark values were smaller, especially for sensitivity. This allows the choice of a clinically relevant benchmark to be driven by clinical factors without major concerns about sufficiently reliable evidence.The study emphasizes the importance of combining clinical relevance and level of statistical
Lindmark, Anita; van Rompaye, Bart; Goetghebeur, Els; Glader, Eva-Lotta; Eriksson, Marie
2016-01-01
When profiling hospital performance, quality inicators are commonly evaluated through hospital-specific adjusted means with confidence intervals. When identifying deviations from a norm, large hospitals can have statistically significant results even for clinically irrelevant deviations while important deviations in small hospitals can remain undiscovered. We have used data from the Swedish Stroke Register (Riksstroke) to illustrate the properties of a benchmarking method that integrates considerations of both clinical relevance and level of statistical significance. The performance measure used was case-mix adjusted risk of death or dependency in activities of daily living within 3 months after stroke. A hospital was labeled as having outlying performance if its case-mix adjusted risk exceeded a benchmark value with a specified statistical confidence level. The benchmark was expressed relative to the population risk and should reflect the clinically relevant deviation that is to be detected. A simulation study based on Riksstroke patient data from 2008-2009 was performed to investigate the effect of the choice of the statistical confidence level and benchmark value on the diagnostic properties of the method. Simulations were based on 18,309 patients in 76 hospitals. The widely used setting, comparing 95% confidence intervals to the national average, resulted in low sensitivity (0.252) and high specificity (0.991). There were large variations in sensitivity and specificity for different requirements of statistical confidence. Lowering statistical confidence improved sensitivity with a relatively smaller loss of specificity. Variations due to different benchmark values were smaller, especially for sensitivity. This allows the choice of a clinically relevant benchmark to be driven by clinical factors without major concerns about sufficiently reliable evidence. The study emphasizes the importance of combining clinical relevance and level of statistical confidence when
Aboagye-Sarfo, Patrick; Mai, Qun; Sanfilippo, Frank M; Preen, David B; Stewart, Louise M; Fatovich, Daniel M
2015-10-01
To develop multivariate vector-ARMA (VARMA) forecast models for predicting emergency department (ED) demand in Western Australia (WA) and compare them to the benchmark univariate autoregressive moving average (ARMA) and Winters' models. Seven-year monthly WA state-wide public hospital ED presentation data from 2006/07 to 2012/13 were modelled. Graphical and VARMA modelling methods were used for descriptive analysis and model fitting. The VARMA models were compared to the benchmark univariate ARMA and Winters' models to determine their accuracy to predict ED demand. The best models were evaluated by using error correction methods for accuracy. Descriptive analysis of all the dependent variables showed an increasing pattern of ED use with seasonal trends over time. The VARMA models provided a more precise and accurate forecast with smaller confidence intervals and better measures of accuracy in predicting ED demand in WA than the ARMA and Winters' method. VARMA models are a reliable forecasting method to predict ED demand for strategic planning and resource allocation. While the ARMA models are a closely competing alternative, they under-estimated future ED demand. Copyright © 2015 Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Sadreyev Ruslan I
2004-08-01
Full Text Available Abstract Background Profile-based analysis of multiple sequence alignments (MSA allows for accurate comparison of protein families. Here, we address the problems of detecting statistically confident dissimilarities between (1 MSA position and a set of predicted residue frequencies, and (2 between two MSA positions. These problems are important for (i evaluation and optimization of methods predicting residue occurrence at protein positions; (ii detection of potentially misaligned regions in automatically produced alignments and their further refinement; and (iii detection of sites that determine functional or structural specificity in two related families. Results For problems (1 and (2, we propose analytical estimates of P-value and apply them to the detection of significant positional dissimilarities in various experimental situations. (a We compare structure-based predictions of residue propensities at a protein position to the actual residue frequencies in the MSA of homologs. (b We evaluate our method by the ability to detect erroneous position matches produced by an automatic sequence aligner. (c We compare MSA positions that correspond to residues aligned by automatic structure aligners. (d We compare MSA positions that are aligned by high-quality manual superposition of structures. Detected dissimilarities reveal shortcomings of the automatic methods for residue frequency prediction and alignment construction. For the high-quality structural alignments, the dissimilarities suggest sites of potential functional or structural importance. Conclusion The proposed computational method is of significant potential value for the analysis of protein families.
Fang, Yongxiang; Wit, Ernst
2008-01-01
Fisher’s combined probability test is the most commonly used method to test the overall significance of a set independent p-values. However, it is very obviously that Fisher’s statistic is more sensitive to smaller p-values than to larger p-value and a small p-value may overrule the other p-values
Nielsen, Frank
2016-12-09
Information-theoreticmeasures, such as the entropy, the cross-entropy and the Kullback-Leibler divergence between two mixture models, are core primitives in many signal processing tasks. Since the Kullback-Leibler divergence of mixtures provably does not admit a closed-form formula, it is in practice either estimated using costly Monte Carlo stochastic integration, approximated or bounded using various techniques. We present a fast and generic method that builds algorithmically closed-form lower and upper bounds on the entropy, the cross-entropy, the Kullback-Leibler and the α-divergences of mixtures. We illustrate the versatile method by reporting our experiments for approximating the Kullback-Leibler and the α-divergences between univariate exponential mixtures, Gaussian mixtures, Rayleigh mixtures and Gamma mixtures.
Fan, Liping; Fu, Danhui; Zhang, Jinping; Wang, Qingqing; Ye, Yamei; Xie, Qianling
2017-01-01
The aim of this study was to evaluate whether blood transfusions affect overall survival (OS) and progression-free survival (PFS) in newly diagnosed multiple myeloma (MM) patients without hematopoietic stem cell transplantation. A total of 181 patients were enrolled and divided into two groups: 68 patients in the transfused group and 113 patients in the nontransfused group. Statistical analyses showed that there were significant differences in ECOG scoring, Ig isotype, platelet (Plt) counts, hemoglobin (Hb) level, serum creatinine (Scr) level, and β2-microglobulin (β2-MG) level between the two groups. Univariate analyses showed that higher International Staging System staging, Plt counts blood transfusion was associated with PFS but not OS in MM patients. Multivariate analyses showed that blood transfusion was not an independent factor for PFS in MM patients. Our preliminary results suggested that newly diagnosed MM patients may benefit from a liberal blood transfusion strategy, since blood transfusion is not an independent impact factor for survival. PMID:28567420
Statistical determination of significant curved I-girder bridge seismic response parameters
Seo, Junwon
2013-06-01
Curved steel bridges are commonly used at interchanges in transportation networks and more of these structures continue to be designed and built in the United States. Though the use of these bridges continues to increase in locations that experience high seismicity, the effects of curvature and other parameters on their seismic behaviors have been neglected in current risk assessment tools. These tools can evaluate the seismic vulnerability of a transportation network using fragility curves. One critical component of fragility curve development for curved steel bridges is the completion of sensitivity analyses that help identify influential parameters related to their seismic response. In this study, an accessible inventory of existing curved steel girder bridges located primarily in the Mid-Atlantic United States (MAUS) was used to establish statistical characteristics used as inputs for a seismic sensitivity study. Critical seismic response quantities were captured using 3D nonlinear finite element models. Influential parameters from these quantities were identified using statistical tools that incorporate experimental Plackett-Burman Design (PBD), which included Pareto optimal plots and prediction profiler techniques. The findings revealed that the potential variation in the influential parameters included number of spans, radius of curvature, maximum span length, girder spacing, and cross-frame spacing. These parameters showed varying levels of influence on the critical bridge response.
Oliveira Mendes, Thiago de; Pinto, Liliane Pereira; Santos, Laurita dos; Tippavajhala, Vamshi Krishna; Téllez Soto, Claudio Alberto; Martin, Airton Abrahão
2016-07-01
The analysis of biological systems by spectroscopic techniques involves the evaluation of hundreds to thousands of variables. Hence, different statistical approaches are used to elucidate regions that discriminate classes of samples and to propose new vibrational markers for explaining various phenomena like disease monitoring, mechanisms of action of drugs, food, and so on. However, the technical statistics are not always widely discussed in applied sciences. In this context, this work presents a detailed discussion including the various steps necessary for proper statistical analysis. It includes univariate parametric and nonparametric tests, as well as multivariate unsupervised and supervised approaches. The main objective of this study is to promote proper understanding of the application of various statistical tools in these spectroscopic methods used for the analysis of biological samples. The discussion of these methods is performed on a set of in vivo confocal Raman spectra of human skin analysis that aims to identify skin aging markers. In the Appendix, a complete routine of data analysis is executed in a free software that can be used by the scientific community involved in these studies.
Combination of statistical approaches for analysis of 2-DE data gives complementary results
DEFF Research Database (Denmark)
Grove, Harald; Jørgensen, Bo; Jessen, Flemming
2008-01-01
Five methods for finding significant changes in proteome data have been used to analyze a two-dimensional gel electrophoresis data set. We used both univariate (ANOVA) and multivariate (Partial Least Squares with jackknife, Cross Model Validation, Power-PLS and CovProc) methods. The gels were taken...
Conceptual and statistical problems associated with the use of diversity indices in ecology.
Barrantes, Gilbert; Sandoval, Luis
2009-09-01
Diversity indices, particularly the Shannon-Wiener index, have extensively been used in analyzing patterns of diversity at different geographic and ecological scales. These indices have serious conceptual and statistical problems which make comparisons of species richness or species abundances across communities nearly impossible. There is often no a single statistical method that retains all information needed to answer even a simple question. However, multivariate analyses could be used instead of diversity indices, such as cluster analyses or multiple regressions. More complex multivariate analyses, such as Canonical Correspondence Analysis, provide very valuable information on environmental variables associated to the presence and abundance of the species in a community. In addition, particular hypotheses associated to changes in species richness across localities, or change in abundance of one, or a group of species can be tested using univariate, bivariate, and/or rarefaction statistical tests. The rarefaction method has proved to be robust to standardize all samples to a common size. Even the simplest method as reporting the number of species per taxonomic category possibly provides more information than a diversity index value.
Directory of Open Access Journals (Sweden)
Leitner Dietmar
2005-04-01
Full Text Available Abstract Background A reliable prediction of the Xaa-Pro peptide bond conformation would be a useful tool for many protein structure calculation methods. We have analyzed the Protein Data Bank and show that the combined use of sequential and structural information has a predictive value for the assessment of the cis versus trans peptide bond conformation of Xaa-Pro within proteins. For the analysis of the data sets different statistical methods such as the calculation of the Chou-Fasman parameters and occurrence matrices were used. Furthermore we analyzed the relationship between the relative solvent accessibility and the relative occurrence of prolines in the cis and in the trans conformation. Results One of the main results of the statistical investigations is the ranking of the secondary structure and sequence information with respect to the prediction of the Xaa-Pro peptide bond conformation. We observed a significant impact of secondary structure information on the occurrence of the Xaa-Pro peptide bond conformation, while the sequence information of amino acids neighboring proline is of little predictive value for the conformation of this bond. Conclusion In this work, we present an extensive analysis of the occurrence of the cis and trans proline conformation in proteins. Based on the data set, we derived patterns and rules for a possible prediction of the proline conformation. Upon adoption of the Chou-Fasman parameters, we are able to derive statistically relevant correlations between the secondary structure of amino acid fragments and the Xaa-Pro peptide bond conformation.
Most, Sebastian; Nowak, Wolfgang; Bijeljic, Branko
2015-04-01
Fickian transport in groundwater flow is the exception rather than the rule. Transport in porous media is frequently simulated via particle methods (i.e. particle tracking random walk (PTRW) or continuous time random walk (CTRW)). These methods formulate transport as a stochastic process of particle position increments. At the pore scale, geometry and micro-heterogeneities prohibit the commonly made assumption of independent and normally distributed increments to represent dispersion. Many recent particle methods seek to loosen this assumption. Hence, it is important to get a better understanding of the processes at pore scale. For our analysis we track the positions of 10.000 particles migrating through the pore space over time. The data we use come from micro CT scans of a homogeneous sandstone and encompass about 10 grain sizes. Based on those images we discretize the pore structure and simulate flow at the pore scale based on the Navier-Stokes equation. This flow field realistically describes flow inside the pore space and we do not need to add artificial dispersion during the transport simulation. Next, we use particle tracking random walk and simulate pore-scale transport. Finally, we use the obtained particle trajectories to do a multivariate statistical analysis of the particle motion at the pore scale. Our analysis is based on copulas. Every multivariate joint distribution is a combination of its univariate marginal distributions. The copula represents the dependence structure of those univariate marginals and is therefore useful to observe correlation and non-Gaussian interactions (i.e. non-Fickian transport). The first goal of this analysis is to better understand the validity regions of commonly made assumptions. We are investigating three different transport distances: 1) The distance where the statistical dependence between particle increments can be modelled as an order-one Markov process. This would be the Markovian distance for the process, where
Prognostic significance of radionuclide-assessed diastolic function in hypertrophic cardiomyopathy
International Nuclear Information System (INIS)
Chikamori, T.; Dickie, S.; Poloniecki, J.D.; Myers, M.J.; Lavender, J.P.; McKenna, W.J.
1990-01-01
To evaluate the prognostic significance of diastolic function in hypertrophic cardiomyopathy (HC), technetium-99m gated equilibrium radionuclide angiography, acquired in list mode, was performed in 161 patients. Five diastolic indexes were calculated. During 3.0 +/- 1.9 years, 13 patients had disease-related deaths. With univariate analysis, these patients were younger (29 +/- 20 vs 42 +/- 16 years; p less than 0.05), had a higher incidence of syncope (p less than 0.025), dyspnea (p less than 0.001), reduced peak filling rate (2.9 +/- 0.9 vs 3.4 +/- 1.0 end-diastolic volume/s; p = 0.09) with increased relative filling volume during the rapid filling period (80 +/- 7 vs 75 +/- 12%; p = 0.06) and decreased atrial contribution (17 +/- 7 vs 22 +/- 11%; p = 0.07). Stepwise discriminant analysis revealed that young age at diagnosis, syncope at diagnosis, reduced peak ejection rate, positive family history, reduced peak filling rate, increased relative filling volume by peak filling rate and concentric left ventricular hypertrophy were the most statistically significant (p = 0.0001) predictors of disease-related death (sensitivity 92%, specificity 76%, accuracy 77%, positive predictive value 25%). Discriminant analysis excluding the diastolic indexes, however, showed similar predictability (sensitivity 92%, specificity 76%, accuracy 78%, positive predictive value 26%). To obtain more homogeneous groups for analysis, patients were classified as survivors or electrically unstable, including sudden death, out-of-hospital ventricular fibrillation and nonsustained ventricular tachycardia during 48-hour ambulatory electrocardiography, and heart failure death or cardiac transplant
CONFIDENCE LEVELS AND/VS. STATISTICAL HYPOTHESIS TESTING IN STATISTICAL ANALYSIS. CASE STUDY
Directory of Open Access Journals (Sweden)
ILEANA BRUDIU
2009-05-01
Full Text Available Estimated parameters with confidence intervals and testing statistical assumptions used in statistical analysis to obtain conclusions on research from a sample extracted from the population. Paper to the case study presented aims to highlight the importance of volume of sample taken in the study and how this reflects on the results obtained when using confidence intervals and testing for pregnant. If statistical testing hypotheses not only give an answer "yes" or "no" to some questions of statistical estimation using statistical confidence intervals provides more information than a test statistic, show high degree of uncertainty arising from small samples and findings build in the "marginally significant" or "almost significant (p very close to 0.05.
Prognostic significance of lymphovascular invasion in radical prostatectomy specimens.
Yee, David S; Shariat, Shahrokh F; Lowrance, William T; Maschino, Alexandra C; Savage, Caroline J; Cronin, Angel M; Scardino, Peter T; Eastham, James A
2011-08-01
Study Type - Prognosis (case series). 4. What's known on the subject? and What does the study add? The reported incidence of lymphovascular invasion (LVI) in radical prostatectomy specimens ranges from 5% to 53%. Although LVI has a strong and significant association with adverse clinicopathologic features, it has almost uniformly not been found to be a predictor of biochemical recurrence (BR) on multivariate analysis. This study confirms that LVI is associated with features of aggressive disease and is an independent predictor of BCR. Given that LVI may play a role in the metastatic process, it may be useful in clinical decision-making regarding adjuvant therapy for patients treated with RP. To determine whether lymphovascular invasion (LVI) in radical prostatectomy (RP) specimens has prognostic significance. The study examined whether LVI is associated with clinicopathological characteristics and biochemical recurrence (BCR). LVI was evaluated based on routine pathology reports on 1298 patients treated with RP for clinically localized prostate cancer between 2004 and 2007. LVI was defined as the unequivocal presence of tumour cells within an endothelium-lined space. The association between LVI and clinicopathological features was assessed with univariate logistic regression. Cox regression was used to test the association between LVI and BCR. LVI was identified in 10% (129/1298) of patients. The presence of LVI increased with advancing pathological stage: 2% (20/820) in pT2N0 patients, 16% (58/363) in pT3N0 patients and 17% (2/12) in pT4N0 patients; and was highest in patients with pN1 disease (52%; 49/94). Univariate analysis showed an association between LVI and higher preoperative prostate-specific antigen levels and Gleason scores, and a greater likelihood of extraprostatic extension, seminal vesicle invasion, lymph node metastasis and positive surgical margins (all P < 0.001). With a median follow-up of 27 months, LVI was significantly associated with an
Statistical Network Analysis for Functional MRI: Mean Networks and Group Comparisons.
Directory of Open Access Journals (Sweden)
Cedric E Ginestet
2014-05-01
Full Text Available Comparing networks in neuroscience is hard, because the topological properties of a given network are necessarily dependent on the number of edges of that network. This problem arises in the analysis of both weighted and unweighted networks. The term density is often used in this context, in order to refer to the mean edge weight of a weighted network, or to the number of edges in an unweighted one. Comparing families of networks is therefore statistically difficult because differences in topology are necessarily associated with differences in density. In this review paper, we consider this problem from two different perspectives, which include (i the construction of summary networks, such as how to compute and visualize the mean network from a sample of network-valued data points; and (ii how to test for topological differences, when two families of networks also exhibit significant differences in density. In the first instance, we show that the issue of summarizing a family of networks can be conducted by either adopting a mass-univariate approach, which produces a statistical parametric network (SPN, or by directly computing the mean network, provided that a metric has been specified on the space of all networks with a given number of nodes. In the second part of this review, we then highlight the inherent problems associated with the comparison of topological functions of families of networks that differ in density. In particular, we show that a wide range of topological summaries, such as global efficiency and network modularity are highly sensitive to differences in density. Moreover, these problems are not restricted to unweighted metrics, as we demonstrate that the same issues remain present when considering the weighted versions of these metrics. We conclude by encouraging caution, when reporting such statistical comparisons, and by emphasizing the importance of constructing summary networks.
Statistical analysis of the potassium concentration obtained through
International Nuclear Information System (INIS)
Pereira, Joao Eduardo da Silva; Silva, Jose Luiz Silverio da; Pires, Carlos Alberto da Fonseca; Strieder, Adelir Jose
2007-01-01
The present work was developed in outcrops of Santa Maria region, southern Brazil, Rio Grande do Sul State. Statistic evaluations were applied in different rock types. The possibility to distinguish different geologic units, sedimentary and volcanic (acid and basic types) by means of the statistic analyses from the use of airborne gamma-ray spectrometry integrating potash radiation emissions data with geological and geochemistry data is discussed. This Project was carried out at 1973 by Geological Survey of Brazil/Companhia de Pesquisas de Recursos Minerais. The Camaqua Project evaluated the behavior of potash concentrations generating XYZ Geosof 1997 format, one grid, thematic map and digital thematic map files from this total area. Using these data base, the integration of statistics analyses in sedimentary formations which belong to the Depressao Central do Rio Grande do Sul and/or to volcanic rocks from Planalto da Serra Geral at the border of Parana Basin was tested. Univariate statistics model was used: the media, the standard media error, and the trust limits were estimated. The Tukey's Test was used in order to compare mean values. The results allowed to create criteria to distinguish geological formations based on their potash content. The back-calibration technique was employed to transform K radiation to percentage. Inside this context it was possible to define characteristic values from radioactive potash emissions and their trust ranges in relation to geologic formations. The potash variable when evaluated in relation to geographic Universal Transverse Mercator coordinates system showed a spatial relation following one polynomial model of second order, with one determination coefficient. The statistica 7.1 software Generalist Linear Models produced by Statistics Department of Federal University of Santa Maria/Brazil was used. (author)
Lower bounds on the run time of the univariate marginal distribution algorithm on OneMax
DEFF Research Database (Denmark)
Krejca, Martin S.; Witt, Carsten
2017-01-01
The Univariate Marginal Distribution Algorithm (UMDA), a popular estimation of distribution algorithm, is studied from a run time perspective. On the classical OneMax benchmark function, a lower bound of Ω(μ√n + n log n), where μ is the population size, on its expected run time is proved...... values maintained by the algorithm, including carefully designed potential functions. These techniques may prove useful in advancing the field of run time analysis for estimation of distribution algorithms in general........ This is the first direct lower bound on the run time of the UMDA. It is stronger than the bounds that follow from general black-box complexity theory and is matched by the run time of many evolutionary algorithms. The results are obtained through advanced analyses of the stochastic change of the frequencies of bit...
Clinical significance of pretreatment FDG PET/CT IN MOBG-avid pediatric neuroblastoma
Energy Technology Data Exchange (ETDEWEB)
Kang, Seo Young; Kim, Yong Il; Cheon, Gi Jeong; Kang, Keon Wook; Chung, June Key; Lee, Dong Soo; Kang, Hyoung Jin; Shin, Hee Young [Seoul National University Hospital, Seoul (Korea, Republic of); Kim, E. Edmund [Seoul National University, Seoul (Korea, Republic of); Rahim, Muhammad Kashif [Nishtar Medical College and Hospital, Multan (Pakistan)
2017-06-15
{sup 18}F-fluorodeoxyglucose-positron emission tomography (FDG-PET) imaging is well known to have clinical significance in the initial staging and response evaluation of the many kinds of neoplasms. However, its role in the pediatric neuroblastoma is not clearly defined. In the present study, the clinical significance of FDG-PET/computed tomography (CT) in 123I- or 131I-metaiodobenzylguanidine (MIBG)-avid pediatric neuroblastoma was investigated. Twenty patients with neuroblastoma who undertook pretreatment FDG PET/CT at our institute between 2008 and 2015 and showed MIBG avidity were retrospectively enrolled in the present study. Clinical information—including histopathology, and serum markers—and several PET parameters—including SUVmax of the primary lesion (Psuv), target-to-background ratio (TBR), metabolic tumor volume (MTV), and coefficient of variation (CV)—were analyzed. The prognostic effect of PET parameters was evaluated in terms of progression-free survival (PFS). Total 20 patients (4.5 ± 3.5 years) were divided as two groups by disease progression. Six patients (30.0 %) experienced disease progression and one patient (5.0 %) died during follow-up period. There were not statistically significant in age, stage, MYCN status, primary tumor size, serum lactate dehydrogenase (LDH), neuron-specific enolase (NSE), and ferritin level between two groups with progression or no progression. However, Psuv (p = 0.017), TBR (p = 0.09), MTV (p = 0.02), and CV (p = 0.036) showed significant differences between two groups. In univariate analysis, PFS was significantly associated with Psuv (p = 0.021) and TBR (p = 0.023). FDG-PET parameters were significantly related with progression of neuroblastoma. FDG-PET/CT may have the potential as a valuable modality for evaluating prognosis in the patients with MIBG-avid pediatric neuroblastoma.
Directory of Open Access Journals (Sweden)
Fernando Cervantes-Sanchez
2016-01-01
Full Text Available This paper presents a novel method for improving the training step of the single-scale Gabor filters by using the Boltzmann univariate marginal distribution algorithm (BUMDA in X-ray angiograms. Since the single-scale Gabor filters (SSG are governed by three parameters, the optimal selection of the SSG parameters is highly desirable in order to maximize the detection performance of coronary arteries while reducing the computational time. To obtain the best set of parameters for the SSG, the area (Az under the receiver operating characteristic curve is used as fitness function. Moreover, to classify vessel and nonvessel pixels from the Gabor filter response, the interclass variance thresholding method has been adopted. The experimental results using the proposed method obtained the highest detection rate with Az=0.9502 over a training set of 40 images and Az=0.9583 with a test set of 40 images. In addition, the experimental results of vessel segmentation provided an accuracy of 0.944 with the test set of angiograms.
Directory of Open Access Journals (Sweden)
James R. Moeller
2006-01-01
Full Text Available In brain mapping studies of sensory, cognitive, and motor operations, specific waveforms of dynamic neural activity are predicted based on theoretical models of human information processing. For example in event-related functional MRI (fMRI, the general linear model (GLM is employed in mass-univariate analyses to identify the regions whose dynamic activity closely matches the expected waveforms. By comparison multivariate analyses based on PCA or ICA provide greater flexibility in detecting spatiotemporal properties of experimental data that may strongly support alternative neuroscientific explanations. We investigated conjoint multivariate and mass-univariate analyses that combine the capabilities to (1 verify activation of neural machinery we already understand and (2 discover reliable signatures of new neural machinery. We examined combinations of GLM and PCA that recover latent neural signals (waveforms and footprints with greater accuracy than either method alone. Comparative results are illustrated with analyses of real fMRI data, adding to Monte Carlo simulation support.
Harrou, Fouzi
2017-09-18
This study reports the development of an innovative fault detection and diagnosis scheme to monitor the direct current (DC) side of photovoltaic (PV) systems. Towards this end, we propose a statistical approach that exploits the advantages of one-diode model and those of the univariate and multivariate exponentially weighted moving average (EWMA) charts to better detect faults. Specifically, we generate array\\'s residuals of current, voltage and power using measured temperature and irradiance. These residuals capture the difference between the measurements and the predictions MPP for the current, voltage and power from the one-diode model, and use them as fault indicators. Then, we apply the multivariate EWMA (MEWMA) monitoring chart to the residuals to detect faults. However, a MEWMA scheme cannot identify the type of fault. Once a fault is detected in MEWMA chart, the univariate EWMA chart based on current and voltage indicators is used to identify the type of fault (e.g., short-circuit, open-circuit and shading faults). We applied this strategy to real data from the grid-connected PV system installed at the Renewable Energy Development Center, Algeria. Results show the capacity of the proposed strategy to monitors the DC side of PV systems and detects partial shading.
International Nuclear Information System (INIS)
Fabre, C.; Cousin, A.; Wiens, R.C.; Ollila, A.; Gasnault, O.; Maurice, S.; Sautter, V.; Forni, O.; Lasue, J.; Tokar, R.; Vaniman, D.; Melikechi, N.
2014-01-01
Curiosity rover landed on August 6th, 2012 in Gale Crater, Mars and it possesses unique analytical capabilities to investigate the chemistry and mineralogy of the Martian soil. In particular, the LIBS technique is being used for the first time on another planet with the ChemCam instrument, and more than 75,000 spectra have been returned in the first year on Mars. Curiosity carries body-mounted calibration targets specially designed for the ChemCam instrument, some of which are homgeneous glasses and others that are fine-grained glass-ceramics. We present direct calibrations, using these onboard standards to infer elements and element ratios by ratioing relative peak areas. As the laser spot size is around 300 μm, the LIBS technique provides measurements of the silicate glass compositions representing homogeneous material and measurements of the ceramic targets that are comparable to fine-grained rock or soil. The laser energy and the auto-focus are controlled for all sequences used for calibration. The univariate calibration curves present relatively to very good correlation coefficients with low RSDs for major and ratio calibrations. Trace element calibration curves (Li, Sr, and Mn), down to several ppm, can be used as a rapid tool to draw attention to remarkable rocks and soils along the traverse. First comparisons to alpha-particle X-ray spectroscopy (APXS) data, on selected targets, show good agreement for most elements and for Mg# and Al/Si estimates. SiO 2 estimates using univariate cannot be yet used. Na 2 O and K 2 O estimates are relevant for high alkali contents, but probably under estimated due to the CCCT initial compositions. Very good results for CaO and Al 2 O 3 estimates and satisfactory results for FeO are obtained. - Highlights: • In situ LIBS univariate calibrations are done using the Curiosity onboard standards. • Major and minor element contents can be rapidly obtained. • Trace element contents can be used as a rapid tool along the
Ozer, Erdener; Sarialioglu, Faik; Cetingoz, Riza; Yüceer, Nurullah; Cakmakci, Handan; Ozkal, Sermin; Olgun, Nur; Uysal, Kamer; Corapcioglu, Funda; Canda, Serefettin
2004-01-01
The purpose of this study was to investigate whether quantitative assessment of cytologic anaplasia and angiogenesis may predict the clinical prognosis in medulloblastoma and stratify the patients to avoid both undertreatment and overtreatment. Medulloblastomas from 23 patients belonging to the Pediatric Oncology Group were evaluated with respect to some prognostic variables, including histologic assessment of nodularity and desmoplasia, grading of anaplasia, measurement of nuclear size, mitotic cell count, quantification of angiogenesis, including vascular surface density (VSD) and microvessel number (NVES), and immunohistochemical scoring of vascular endothelial growth factor (VEGF) expression. Univariate and multivariate analyses for prognostic indicators for survival were performed. Univariate analysis revealed that extensive nodularity was a significant favorable prognostic factor, whereas the presence of anaplasia, increased nuclear size, mitotic rate, VSD, and NVES were significant unfavorable prognostic factors. Using multivariate analysis, increased nuclear size was found to be an independent unfavorable prognostic factor for survival. Neither the presence of desmoplasia nor VEGF expression was significantly related to patient survival. Although care must be taken not to overstate the importance of the results of this single-institution preliminary report, pathologic grading of medulloblastomas with respect to grading of anaplasia and quantification of nodularity, nuclear size, and microvessel profiles may be clinically useful for the treatment of medulloblastomas. Further validation of the independent prognostic significance of nuclear size in stratifying patients is required.
Directory of Open Access Journals (Sweden)
Minna M Boström
Full Text Available Inflammation is an important feature of carcinogenesis. Tumor-associated macrophages (TAMs can be associated with either poor or improved prognosis, depending on their properties and polarization. Current knowledge of the prognostic significance of TAMs in bladder cancer is limited and was investigated in this study. We analyzed 184 urothelial bladder cancer patients undergoing transurethral resection of a bladder tumor or radical cystectomy. CD68 (pan-macrophage marker, MAC387 (polarized towards type 1 macrophages, and CLEVER-1/Stabilin-1 (type 2 macrophages and lymphatic/blood vessels were detected immunohistochemically. The median follow-up time was 6.0 years. High macrophage counts associated with a higher pT category and grade. Among patients undergoing transurethral resection, all studied markers apart from CLEVER-1/Stabilin-1 were associated with increased risk of progression and poorer disease-specific and overall survival in univariate analyses. High levels of two macrophage markers (CD68/MAC387+/+ or CD68/CLEVER-1+/+ groups had an independent prognostic role after transurethral resection in multivariate analyses. In the cystectomy cohort, MAC387, alone and in combination with CD68, was associated with poorer survival in univariate analyses, but none of the markers were independent predictors of outcome in multivariate analyses. In conclusion, this study demonstrates that macrophage phenotypes provide significant independent prognostic information, particularly in bladder cancers undergoing transurethral resection.
Singer, Meromit; Engström, Alexander; Schönhuth, Alexander; Pachter, Lior
2011-09-23
Recent experimental and computational work confirms that CpGs can be unmethylated inside coding exons, thereby showing that codons may be subjected to both genomic and epigenomic constraint. It is therefore of interest to identify coding CpG islands (CCGIs) that are regions inside exons enriched for CpGs. The difficulty in identifying such islands is that coding exons exhibit sequence biases determined by codon usage and constraints that must be taken into account. We present a method for finding CCGIs that showcases a novel approach we have developed for identifying regions of interest that are significant (with respect to a Markov chain) for the counts of any pattern. Our method begins with the exact computation of tail probabilities for the number of CpGs in all regions contained in coding exons, and then applies a greedy algorithm for selecting islands from among the regions. We show that the greedy algorithm provably optimizes a biologically motivated criterion for selecting islands while controlling the false discovery rate. We applied this approach to the human genome (hg18) and annotated CpG islands in coding exons. The statistical criterion we apply to evaluating islands reduces the number of false positives in existing annotations, while our approach to defining islands reveals significant numbers of undiscovered CCGIs in coding exons. Many of these appear to be examples of functional epigenetic specialization in coding exons.
International Nuclear Information System (INIS)
Sharma, P.; Khare, M.
2000-01-01
Historical data of the time-series of carbon monoxide (CO) concentration was analysed using Box-Jenkins modelling approach. Univariate Linear Stochastic Models (ULSMs) were developed to examine the degree of prediction possible for situations where only a limited data set, restricted only to the past record of pollutant data are available. The developed models can be used to provide short-term, real-time forecast of extreme CO concentrations for an Air Quality Control Region (AQCR), comprising a major traffic intersection in a Central Business District of Delhi City, India. (author)
Multivariate statistical modelling based on generalized linear models
Fahrmeir, Ludwig
1994-01-01
This book is concerned with the use of generalized linear models for univariate and multivariate regression analysis. Its emphasis is to provide a detailed introductory survey of the subject based on the analysis of real data drawn from a variety of subjects including the biological sciences, economics, and the social sciences. Where possible, technical details and proofs are deferred to an appendix in order to provide an accessible account for non-experts. Topics covered include: models for multi-categorical responses, model checking, time series and longitudinal data, random effects models, and state-space models. Throughout, the authors have taken great pains to discuss the underlying theoretical ideas in ways that relate well to the data at hand. As a result, numerous researchers whose work relies on the use of these models will find this an invaluable account to have on their desks. "The basic aim of the authors is to bring together and review a large part of recent advances in statistical modelling of m...
Univariate and multivariate analysis on processing tomato quality under different mulches
Directory of Open Access Journals (Sweden)
Carmen Moreno
2014-04-01
Full Text Available The use of eco-friendly mulch materials as alternatives to the standard polyethylene (PE has become increasingly prevalent worldwide. Consequently, a comparison of mulch materials from different origins is necessary to evaluate their feasibility. Several researchers have compared the effects of mulch materials on each crop variable through univariate analysis (ANOVA. However, it is important to focus on the effect of these materials on fruit quality, because this factor decisively influences the acceptance of the final product by consumers and the industrial sector. This study aimed to analyze the information supplied by a randomized complete block experiment combined over two seasons, a principal component analysis (PCA and a cluster analysis (CA when studying the effects of mulch materials on the quality of processing tomato (Lycopersicon esculentum Mill.. The study focused on the variability in the quality measurements and on the determination of mulch materials with a similar response to them. A comparison of the results from both types of analysis yielded complementary information. ANOVA showed the similarity of certain materials. However, considering the totality of the variables analyzed, the final interpretation was slightly complicated. PCA indicated that the juice color, the fruit firmness and the soluble solid content were the most influential factors in the total variability of a set of 12 juice and fruit variables, and CA allowed us to establish four categories of treatment: plastics (polyethylene - PE, oxo- and biodegradable materials, papers, manual weeding and barley (Hordeum vulgare L. straw. Oxobiodegradable and PE were most closely related based on CA.
Directory of Open Access Journals (Sweden)
Mostafa Nejadhadad
2017-11-01
Full Text Available A geochemical exploration program was applied to recognize the anomalous geochemical haloes at the Ravanj lead mine, Delijan, Iran. Sampling of unweathered rocks were undertaken across rock exposures on a 10 × 10 meter grid (n = 302 as well as the accessible parts of underground mine A (n = 42. First, the threshold values of all elements were determined using the cut-off values used in the exploratory data analysis (EDA method. Then, for further studies, elements with lognormal distributions (Pb, Zn, Ag, As, Cd, Co, Cu, Sb, S, Sr, Th, Ba, Bi, Fe, Ni and Mn were selected. Robustness against outliers is achieved by application of central log ratio transformation to address the closure problems with compositional data prior to principle components analysis (PCA. Results of these analyses show that, in the Ravanj deposit, Pb mineralization is characterized by a Pb-Ba-Ag-Sb ± Zn ± Cd association. The supra-mineralization haloes are characterized by barite and tetrahedrite in a Ba- Th- Ag- Cu- Sb- As- Sr association and sub-mineralization haloes are comprised of pyrite and tetrahedrite, probably reflecting a Fe-Cu-As-Bi-Ni-Co-Mo-Mn association. Using univariate and multivariate geostatistical analyses (e.g., EDA and robust PCA, four anomalies were detected and mapped in Block A of the Ravanj deposit. Anomalies 1 and 2 are around the ancient orebodies. Anomaly 3 is located in a thin bedded limestone-shale intercalation unit that does not show significant mineralization. Drilling of the fourth anomaly suggested a low grade, non-economic Pb mineralization.
Xia, Li C; Ai, Dongmei; Cram, Jacob A; Liang, Xiaoyi; Fuhrman, Jed A; Sun, Fengzhu
2015-09-21
Local trend (i.e. shape) analysis of time series data reveals co-changing patterns in dynamics of biological systems. However, slow permutation procedures to evaluate the statistical significance of local trend scores have limited its applications to high-throughput time series data analysis, e.g., data from the next generation sequencing technology based studies. By extending the theories for the tail probability of the range of sum of Markovian random variables, we propose formulae for approximating the statistical significance of local trend scores. Using simulations and real data, we show that the approximate p-value is close to that obtained using a large number of permutations (starting at time points >20 with no delay and >30 with delay of at most three time steps) in that the non-zero decimals of the p-values obtained by the approximation and the permutations are mostly the same when the approximate p-value is less than 0.05. In addition, the approximate p-value is slightly larger than that based on permutations making hypothesis testing based on the approximate p-value conservative. The approximation enables efficient calculation of p-values for pairwise local trend analysis, making large scale all-versus-all comparisons possible. We also propose a hybrid approach by integrating the approximation and permutations to obtain accurate p-values for significantly associated pairs. We further demonstrate its use with the analysis of the Polymouth Marine Laboratory (PML) microbial community time series from high-throughput sequencing data and found interesting organism co-occurrence dynamic patterns. The software tool is integrated into the eLSA software package that now provides accelerated local trend and similarity analysis pipelines for time series data. The package is freely available from the eLSA website: http://bitbucket.org/charade/elsa.
Fang, Yongxiang; Wit, Ernst
2008-01-01
Fisher’s combined probability test is the most commonly used method to test the overall significance of a set independent p-values. However, it is very obviously that Fisher’s statistic is more sensitive to smaller p-values than to larger p-value and a small p-value may overrule the other p-values and decide the test result. This is, in some cases, viewed as a flaw. In order to overcome this flaw and improve the power of the test, the joint tail probability of a set p-values is proposed as a ...
Sleep Duration and Breast Cancer Phenotype
International Nuclear Information System (INIS)
Khawaja, A.; Rao, S.
2013-01-01
Emerging evidence suggests that short sleep is associated with an increased risk of cancer; however, little has been done to study the role of sleep on tumor characteristics. In this study, we evaluated the relationship between sleep duration and tumor phenotype in 972 breast cancer patients. Sleep duration was inversely associated with tumor grade (univariate P= 0.032), particularly in postmenopausal women (univariate P= 0.018). This association did not reach statistical significance after adjustments for age, race, body mass index, hormone replacement therapy use, alcohol consumption, smoking, and physical activity in the entire study sample (P= 0.052), but it remained statistically significant (P= 0.049) among post-menopausal patients. We did not observe a statistically significant association between sleep duration and stage at diagnosis, ER, or HER2 receptor status. These results present a modest association between short duration of sleep and higher grade breast cancer in post-menopausal women. Further work needs to be done to validate these findings.
Energy Technology Data Exchange (ETDEWEB)
Fabre, C. [GeoRessources lab, Université de Lorraine, Nancy (France); Cousin, A.; Wiens, R.C. [Los Alamos National Laboratory, Los Alamos, NM (United States); Ollila, A. [University of NM, Albuquerque (United States); Gasnault, O.; Maurice, S. [IRAP, Toulouse (France); Sautter, V. [Museum National d' Histoire Naturelle, Paris (France); Forni, O.; Lasue, J. [IRAP, Toulouse (France); Tokar, R.; Vaniman, D. [Planetary Science Institute, Tucson, AZ (United States); Melikechi, N. [Delaware State University (United States)
2014-09-01
Curiosity rover landed on August 6th, 2012 in Gale Crater, Mars and it possesses unique analytical capabilities to investigate the chemistry and mineralogy of the Martian soil. In particular, the LIBS technique is being used for the first time on another planet with the ChemCam instrument, and more than 75,000 spectra have been returned in the first year on Mars. Curiosity carries body-mounted calibration targets specially designed for the ChemCam instrument, some of which are homgeneous glasses and others that are fine-grained glass-ceramics. We present direct calibrations, using these onboard standards to infer elements and element ratios by ratioing relative peak areas. As the laser spot size is around 300 μm, the LIBS technique provides measurements of the silicate glass compositions representing homogeneous material and measurements of the ceramic targets that are comparable to fine-grained rock or soil. The laser energy and the auto-focus are controlled for all sequences used for calibration. The univariate calibration curves present relatively to very good correlation coefficients with low RSDs for major and ratio calibrations. Trace element calibration curves (Li, Sr, and Mn), down to several ppm, can be used as a rapid tool to draw attention to remarkable rocks and soils along the traverse. First comparisons to alpha-particle X-ray spectroscopy (APXS) data, on selected targets, show good agreement for most elements and for Mg# and Al/Si estimates. SiO{sub 2} estimates using univariate cannot be yet used. Na{sub 2}O and K{sub 2}O estimates are relevant for high alkali contents, but probably under estimated due to the CCCT initial compositions. Very good results for CaO and Al{sub 2}O{sub 3} estimates and satisfactory results for FeO are obtained. - Highlights: • In situ LIBS univariate calibrations are done using the Curiosity onboard standards. • Major and minor element contents can be rapidly obtained. • Trace element contents can be used as a
Lopes Antunes, Ana Carolina; Dórea, Fernanda; Halasa, Tariq; Toft, Nils
2016-05-01
Surveillance systems are critical for accurate, timely monitoring and effective disease control. In this study, we investigated the performance of univariate process monitoring control algorithms in detecting changes in seroprevalence for endemic diseases. We also assessed the effect of sample size (number of sentinel herds tested in the surveillance system) on the performance of the algorithms. Three univariate process monitoring control algorithms were compared: Shewart p Chart(1) (PSHEW), Cumulative Sum(2) (CUSUM) and Exponentially Weighted Moving Average(3) (EWMA). Increases in seroprevalence were simulated from 0.10 to 0.15 and 0.20 over 4, 8, 24, 52 and 104 weeks. Each epidemic scenario was run with 2000 iterations. The cumulative sensitivity(4) (CumSe) and timeliness were used to evaluate the algorithms' performance with a 1% false alarm rate. Using these performance evaluation criteria, it was possible to assess the accuracy and timeliness of the surveillance system working in real-time. The results showed that EWMA and PSHEW had higher CumSe (when compared with the CUSUM) from week 1 until the end of the period for all simulated scenarios. Changes in seroprevalence from 0.10 to 0.20 were more easily detected (higher CumSe) than changes from 0.10 to 0.15 for all three algorithms. Similar results were found with EWMA and PSHEW, based on the median time to detection. Changes in the seroprevalence were detected later with CUSUM, compared to EWMA and PSHEW for the different scenarios. Increasing the sample size 10 fold halved the time to detection (CumSe=1), whereas increasing the sample size 100 fold reduced the time to detection by a factor of 6. This study investigated the performance of three univariate process monitoring control algorithms in monitoring endemic diseases. It was shown that automated systems based on these detection methods identified changes in seroprevalence at different times. Increasing the number of tested herds would lead to faster
Directory of Open Access Journals (Sweden)
Jung Kwon Kim
Full Text Available There have been conflicting reports regarding the association of perioperative blood transfusion (PBT with oncologic outcomes including recurrence rates and survival outcomes in prostate cancer. We aimed to evaluate whether perioperative blood transfusion (PBT affects biochemical recurrence-free survival (BRFS, cancer-specific survival (CSS, and overall survival (OS following radical prostatectomy (RP for patients with prostate cancer.A total of 2,713 patients who underwent RP for clinically localized prostate cancer between 1993 and 2014 were retrospectively analyzed. We performed a comparative analysis based on receipt of transfusion (PBT group vs. no-PBT group and transfusion type (autologous PBT vs. allogeneic PBT. Univariate and multivariate Cox-proportional hazard regression analysis were performed to evaluate variables associated with BRFS, CSS, and OS. The Kaplan-Meier method was used to calculate survival estimates for BRFS, CSS, and OS, and log-rank test was used to conduct comparisons between the groups.The number of patients who received PBT was 440 (16.5%. Among these patients, 350 (79.5% received allogeneic transfusion and the other 90 (20.5% received autologous transfusion. In a multivariate analysis, allogeneic PBT was found to be statistically significant predictors of BRFS, CSS, and OS; conversely, autologous PBT was not. The Kaplan-Meier survival analysis showed significantly decreased 5-year BRFS (79.2% vs. 70.1%, log-rank, p = 0.001, CSS (98.5% vs. 96.7%, log-rank, p = 0.012, and OS (95.5% vs. 90.6%, log-rank, p < 0.001 in the allogeneic PBT group compared to the no-allogeneic PBT group. In the autologous PBT group, however, none of these were statistically significant compared to the no-autologous PBT group.We found that allogeneic PBT was significantly associated with decreased BRFS, CSS, and OS. This provides further support for the immunomodulation hypothesis for allogeneic PBT.
Son, Wonsoo; Park, Jaechan
2017-09-01
Frameless stereotactic aspiration of a hematoma can be the one of the treatment options for spontaneous intracerebral hemorrhage in the basal ganglia. Postoperative hematoma enlargement, however, can be a serious complication of intracranial surgery that frequently results in severe neurological deficit and even death. Therefore, it is important to identify the risk factors of postoperative hematoma growth. During a 13-year period, 101 patients underwent minimally invasive frameless stereotactic aspiration for basal ganglia hematoma. Patients were classified into two groups according to whether or not they had postoperative hematoma enlargement in a computed tomography scan. Baseline demographic data and several risk factors, such as hypertension, preoperative hematoma growth, antiplatelet medication, presence of concomitant intraventricular hemorrhage (IVH), were analysed via a univariate statistical study. Nine of 101 patients (8.9%) showed hematoma enlargement after frameless stereotactic aspiration. Among the various risk factors, concomitant IVH and antiplatelet medication were found to be significantly associated with postoperative enlargement of hematomas. In conclusion, our study revealed that aspirin use and concomitant IVH are factors associated with hematoma enlargement subsequent to frameless stereotactic aspiration for basal ganglia hematoma.
Meng, Wei; Jiang, Yangyang; Ma, Jie
2017-01-01
O6-methylguanine-DNA methyltransferase (MGMT) is an independent predictor of therapeutic response and potential prognosis in patients with glioblastoma multiforme (GBM). However, its significance of clinical prognosis in different continents still needs to be explored. To explore the effects of MGMT promoter methylation on both progression-free survival (PFS) and overall survival (OS) among GBM patients from different continents, a systematic review of published studies was conducted. A total of 5103 patients from 53 studies were involved in the systematic review and the total percentage of MGMT promoter methylation was 45.53%. Of these studies, 16 studies performed univariate analyses and 17 performed multivariate analyses of MGMT promoter methylation on PFS. The pooled hazard ratio (HR) estimated for PFS was 0.55 (95% CI 0.50, 0.60) by univariate analysis and 0.43 (95% CI 0.38, 0.48) by multivariate analysis. The effect of MGMT promoter methylation on OS was explored in 30 studies by univariate analysis and in 30 studies by multivariate analysis. The combined HR was 0.48 (95% CI 0.44, 0.52) and 0.42 (95% CI 0.38, 0.45), respectively. In each subgroup divided by areas, the prognostic significance still remained highly significant. The proportion of methylation in each group was in inverse proportion to the corresponding HR in the univariate and multivariate analyses of PFS. However, from the perspective of OS, compared with data from Europe and the US, higher methylation rates in Asia did not bring better returns.
Ren, Zhiyong; Li, Yufeng; Shen, Tiansheng; Hameed, Omar; Siegal, Gene P; Wei, Shi
2016-01-01
Prognostic factors are well established in early-stage breast cancer (BC), but less well-defined in advanced disease. We analyzed 323 BC patients who had distant relapse during follow-up from 1997 to 2010 to determine the significant clinicopathologic factors predicting survival outcomes. By univariate analysis, race, tumor grade, estrogen and progesterone receptors (ER/PR) and HER2 status were significantly associated with overall survival (OS) and post-metastasis survival (PMS). Applying a Cox regression model revealed that all these factors remained significant for PMS, while race, tumor grade and HER2 were independent factors for OS. Tumor grade was the only significant factor for metastasis-free survival by univariate and multivariate analyses. Our findings demonstrated that being Caucasian, hormonal receptor positive (HR+) and HER2 positive (HER2+) were all associated with a decreased hazard of death and that patients with HR+/HER2+ tumors had superior outcomes to those with HR+/HER2- disease. Further, PR status held a prognostic value over ER, thus reflecting the biologic mechanism of the importance of the functional ER pathway and the heterogeneity in the response to endocrine therapy. These observations indicate that the patients' genetic makeup and the intrinsic nature of the tumor principally govern BC progression and prognosticate the long-term outcomes in advanced disease. Copyright © 2015 Elsevier GmbH. All rights reserved.
After statistics reform : Should we still teach significance testing?
A. Hak (Tony)
2014-01-01
textabstractIn the longer term null hypothesis significance testing (NHST) will disappear because p- values are not informative and not replicable. Should we continue to teach in the future the procedures of then abolished routines (i.e., NHST)? Three arguments are discussed for not teaching NHST in
Directory of Open Access Journals (Sweden)
Sunando Roy
2009-10-01
Full Text Available Feline immunodeficiency virus (FIV and human immunodeficiency virus (HIV are recently identified lentiviruses that cause progressive immune decline and ultimately death in infected cats and humans. It is of great interest to understand how to prevent immune system collapse caused by these lentiviruses. We recently described that disease caused by a virulent FIV strain in cats can be attenuated if animals are first infected with a feline immunodeficiency virus derived from a wild cougar. The detailed temporal tracking of cat immunological parameters in response to two viral infections resulted in high-dimensional datasets containing variables that exhibit strong co-variation. Initial analyses of these complex data using univariate statistical techniques did not account for interactions among immunological response variables and therefore potentially obscured significant effects between infection state and immunological parameters.Here, we apply a suite of multivariate statistical tools, including Principal Component Analysis, MANOVA and Linear Discriminant Analysis, to temporal immunological data resulting from FIV superinfection in domestic cats. We investigated the co-variation among immunological responses, the differences in immune parameters among four groups of five cats each (uninfected, single and dual infected animals, and the "immune profiles" that discriminate among them over the first four weeks following superinfection. Dual infected cats mount an immune response by 24 days post superinfection that is characterized by elevated levels of CD8 and CD25 cells and increased expression of IL4 and IFNgamma, and FAS. This profile discriminates dual infected cats from cats infected with FIV alone, which show high IL-10 and lower numbers of CD8 and CD25 cells.Multivariate statistical analyses demonstrate both the dynamic nature of the immune response to FIV single and dual infection and the development of a unique immunological profile in dual
Worry, Intolerance of Uncertainty, and Statistics Anxiety
Williams, Amanda S.
2013-01-01
Statistics anxiety is a problem for most graduate students. This study investigates the relationship between intolerance of uncertainty, worry, and statistics anxiety. Intolerance of uncertainty was significantly related to worry, and worry was significantly related to three types of statistics anxiety. Six types of statistics anxiety were…
Chiou, Chei-Chang; Wang, Yu-Min; Lee, Li-Tze
2014-08-01
Statistical knowledge is widely used in academia; however, statistics teachers struggle with the issue of how to reduce students' statistics anxiety and enhance students' statistics learning. This study assesses the effectiveness of a "one-minute paper strategy" in reducing students' statistics-related anxiety and in improving students' statistics-related achievement. Participants were 77 undergraduates from two classes enrolled in applied statistics courses. An experiment was implemented according to a pretest/posttest comparison group design. The quasi-experimental design showed that the one-minute paper strategy significantly reduced students' statistics anxiety and improved students' statistics learning achievement. The strategy was a better instructional tool than the textbook exercise for reducing students' statistics anxiety and improving students' statistics achievement.
Directory of Open Access Journals (Sweden)
Vujović Svetlana R.
2013-01-01
Full Text Available This paper illustrates the utility of multivariate statistical techniques for analysis and interpretation of water quality data sets and identification of pollution sources/factors with a view to get better information about the water quality and design of monitoring network for effective management of water resources. Multivariate statistical techniques, such as factor analysis (FA/principal component analysis (PCA and cluster analysis (CA, were applied for the evaluation of variations and for the interpretation of a water quality data set of the natural water bodies obtained during 2010 year of monitoring of 13 parameters at 33 different sites. FA/PCA attempts to explain the correlations between the observations in terms of the underlying factors, which are not directly observable. Factor analysis is applied to physico-chemical parameters of natural water bodies with the aim classification and data summation as well as segmentation of heterogeneous data sets into smaller homogeneous subsets. Factor loadings were categorized as strong and moderate corresponding to the absolute loading values of >0.75, 0.75-0.50, respectively. Four principal factors were obtained with Eigenvalues >1 summing more than 78 % of the total variance in the water data sets, which is adequate to give good prior information regarding data structure. Each factor that is significantly related to specific variables represents a different dimension of water quality. The first factor F1 accounting for 28 % of the total variance and represents the hydrochemical dimension of water quality. The second factor F2 accounting for 18% of the total variance and may be taken factor of water eutrophication. The third factor F3 accounting 17 % of the total variance and represents the influence of point sources of pollution on water quality. The fourth factor F4 accounting 13 % of the total variance and may be taken as an ecological dimension of water quality. Cluster analysis (CA is an
Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.
1999-01-01
Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8 and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, correspondingly, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5 + , 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events has doubled and all of them become exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. ?? 1999 Elsevier
Liu, Wei; Ding, Jinhui
2018-04-01
The application of the principle of the intention-to-treat (ITT) to the analysis of clinical trials is challenged in the presence of missing outcome data. The consequences of stopping an assigned treatment in a withdrawn subject are unknown. It is difficult to make a single assumption about missing mechanisms for all clinical trials because there are complicated reactions in the human body to drugs due to the presence of complex biological networks, leading to data missing randomly or non-randomly. Currently there is no statistical method that can tell whether a difference between two treatments in the ITT population of a randomized clinical trial with missing data is significant at a pre-specified level. Making no assumptions about the missing mechanisms, we propose a generalized complete-case (GCC) analysis based on the data of completers. An evaluation of the impact of missing data on the ITT analysis reveals that a statistically significant GCC result implies a significant treatment effect in the ITT population at a pre-specified significance level unless, relative to the comparator, the test drug is poisonous to the non-completers as documented in their medical records. Applications of the GCC analysis are illustrated using literature data, and its properties and limits are discussed.
de Brito, Aila Riany; Santos Reis, Nadabe Dos; Silva, Tatielle Pereira; Ferreira Bonomo, Renata Cristina; Trovatti Uetanabaro, Ana Paula; de Assis, Sandra Aparecida; da Silva, Erik Galvão Paranhos; Aguiar-Oliveira, Elizama; Oliveira, Julieta Rangel; Franco, Marcelo
2017-11-26
Endoglucanase production by Aspergillus oryzae ATCC 10124 cultivated in rice husks or peanut shells was optimized by experimental design as a function of humidity, time, and temperature. The optimum temperature for the endoglucanase activity was estimated by a univariate analysis (one factor at the time) as 50°C (rice husks) and 60°C (peanut shells), however, by a multivariate analysis (synergism of factors), it was determined a different temperature (56°C) for endoglucanase from peanut shells. For the optimum pH, values determined by univariate and multivariate analysis were 5 and 5.2 (rice husk) and 5 and 7.6 (peanut shells). In addition, the best half-lives were observed at 50°C as 22.8 hr (rice husks) and 7.3 hr (peanut shells), also, 80% of residual activities was obtained between 30 and 50°C for both substrates, and the pH stability was improved at 5-7 (rice hulls) and 6-9 (peanut shells). Both endoglucanases obtained presented different characteristics as a result of the versatility of fungi in different substrates.
Energy Technology Data Exchange (ETDEWEB)
Karolewski, K.; Korzeniowski, S.; Urbanski, K.; Kojs, Z. [Centre of Oncology, Maia Sklodowska-Curie Memorial Inst., Krakow (Poland); Sokolowski, A. [Dept. of Statistics, Cracow Univ. of Economics (Poland)
1999-11-01
The prognostic importance of various pretherapeutic and therapeutic factors was analysed in a group of 413 cervical cancer patients with stage IIB (183 pts) and IIIB (230 pts) treated with radical radiotherapy, which consisted of external irradiation and intracavitary brachytherapy. Univariate analysis of pretherapeutic factors revealed the prognostic significance of patient age, history of abortion, stage, haemoglobin and hematocrit levels. Five-year overall survival rate in stage IIB patients was 51% in stage IIIB 40% and the respective rates for local control at each stage were 61%, and 46%. Univariate analysis of therapeutic factors showed that survival and local control rates increased with the dose, but a significant difference was found only in the case of a paracentral (point A) dose. In a multivariate analysis only patient age, abortions, and clinical stage appeared to have a significant and independent impact on survival. Linear regression analysis results indicated that prolongation of treatment time between 33 and 108 days caused a loss of local control of 0.36% per day. (orig.)
Improving the performance of univariate control charts for abnormal detection and classification
Yiakopoulos, Christos; Koutsoudaki, Maria; Gryllias, Konstantinos; Antoniadis, Ioannis
2017-03-01
Bearing failures in rotating machinery can cause machine breakdown and economical loss, if no effective actions are taken on time. Therefore, it is of prime importance to detect accurately the presence of faults, especially at their early stage, to prevent sequent damage and reduce costly downtime. The machinery fault diagnosis follows a roadmap of data acquisition, feature extraction and diagnostic decision making, in which mechanical vibration fault feature extraction is the foundation and the key to obtain an accurate diagnostic result. A challenge in this area is the selection of the most sensitive features for various types of fault, especially when the characteristics of failures are difficult to be extracted. Thus, a plethora of complex data-driven fault diagnosis methods are fed by prominent features, which are extracted and reduced through traditional or modern algorithms. Since most of the available datasets are captured during normal operating conditions, the last decade a number of novelty detection methods, able to work when only normal data are available, have been developed. In this study, a hybrid method combining univariate control charts and a feature extraction scheme is introduced focusing towards an abnormal change detection and classification, under the assumption that measurements under normal operating conditions of the machinery are available. The feature extraction method integrates the morphological operators and the Morlet wavelets. The effectiveness of the proposed methodology is validated on two different experimental cases with bearing faults, demonstrating that the proposed approach can improve the fault detection and classification performance of conventional control charts.
International Nuclear Information System (INIS)
Fouque, A.L.; Ciuciu, Ph.; Risser, L.; Fouque, A.L.; Ciuciu, Ph.; Risser, L.
2009-01-01
In this paper, a novel statistical parcellation of intra-subject functional MRI (fMRI) data is proposed. The key idea is to identify functionally homogenous regions of interest from their hemodynamic parameters. To this end, a non-parametric voxel-based estimation of hemodynamic response function is performed as a prerequisite. Then, the extracted hemodynamic features are entered as the input data of a Multivariate Spatial Gaussian Mixture Model (MSGMM) to be fitted. The goal of the spatial aspect is to favor the recovery of connected components in the mixture. Our statistical clustering approach is original in the sense that it extends existing works done on univariate spatially regularized Gaussian mixtures. A specific Gibbs sampler is derived to account for different covariance structures in the feature space. On realistic artificial fMRI datasets, it is shown that our algorithm is helpful for identifying a parsimonious functional parcellation required in the context of joint detection estimation of brain activity. This allows us to overcome the classical assumption of spatial stationarity of the BOLD signal model. (authors)
Directory of Open Access Journals (Sweden)
Jibril Oyekunle Bello
2016-01-01
Full Text Available Introduction: Urethral strictures are common in urologic practice of Sub-Saharan Africa including Nigeria. We determine the rate of stricture recurrence following urethroplasty for anterior urethral strictures and evaluate preoperative variables that predict of stricture recurrence in our practice. Subjects and Methods: Thirty-six men who had urethroplasty for proven anterior urethral stricture disease between February 2012 and January 2015 were retrospectively analyzed. Preoperative factors including age, socioeconomic factors, comorbidities, etiology of strictures, stricture location, stricture length, periurethral spongiofibrosis, and prior stricture treatments were assessed for independent predictors of stricture recurrence. Results: The median age was 49.5 years (range 21-90, median stricture length was 4 cm (range 1-18 cm and the overall recurrence rate was 27.8%. Postinfectious strictures, pan urethral strictures or multiple strictures involving the penile and bulbar urethra were more common. Most patients had penile circular fasciocutaneous flap urethroplasty. Following univariate analysis of potential preoperative predictors of stricture recurrence, stricture length, and prior treatments with dilations or urethrotomies were found to be significantly associated with stricture recurrence. On multivariate analysis, they both remained statistically significant. Patients who had prior treatments had greater odds of having a recurrent stricture (odds ratio 18, 95% confidence interval [CI] 1.4-224.3. Stricture length was dichotomized based on receiver operating characteristic (ROC analysis, and strictures of length ≥5 cm had significantly greater recurrence (area under ROC curve of 0.825, 95% CI 0.690-0.960, P = 0.032. Conclusion: Patients who had prior dilatations or urethrotomies and those with long strictures particularly strictures ≥5 cm have significantly greater odds of developing a recurrence following urethroplasty in Nigerian
International Nuclear Information System (INIS)
MacKenzie, Robert; Franssen, Edmee; Balogh, Judith; Birt, Derek; Gilbert, Ralph
1998-01-01
Purpose: To determine retrospectively the prognostic significance of airway compromise necessitating tracheostomy in carcinoma of the larynx managed with radical radiotherapy and surgery for salvage (RRSS). Methods and Materials: The charts of 270 patients managed with RRSS at the Toronto-Sunnybrook Regional Cancer Centre between June 1980 and December 1990 were reviewed. Airway compromise necessitating tracheostomy was documented in 26 patients prior to radiotherapy and 3 patients during radiotherapy. Of 29, 27 had T3T4 primaries. Patients have been followed for a median of 5 years. Results: Patients managed without tracheostomy had a 2-year disease-free survival of 74% compared to 41% for those managed with tracheostomy. The adverse impact of airway compromise was more marked in patients with glottic primaries (78% vs. 32%, p = 0.0001) than those with supraglottic primaries (64% vs. 47%, p = 0.18). Tracheostomy was identified in univariate analysis, but not in multivariate analysis, as having a statistically significant impact on local control and local-regional control. Radiotherapy controlled disease above the clavicles in 185 of 267 (69%) evaluable patients. 83% of isolated local-regional failures underwent salvage surgery. Among those managed without tracheostomy, ultimate local-regional control (LRC) was achieved in 161 (94%) of 172 glottic primaries and 54 (81%) of 67 supraglottic primaries. Among those managed with tracheostomy, ultimate LRC was achieved in 9 (69%) of 13 glottic primaries and 12 (80%) of 15 supraglottic primaries. In a subset analysis of 76 patients with T3T4 primaries, there was no statistically significant difference in larynx preservation, disease-free survival, or cause-specific survival between those managed with and without tracheostomy. Conclusion: Airway compromise necessitating tracheostomy is an adverse prognostic factor in patients with carcinoma of the larynx. However, larynx preservation is possible in over 40% of those
Wang, X; Jiao, Y; Tang, T; Wang, H; Lu, Z
2013-12-19
Intrinsic connectivity networks (ICNs) are composed of spatial components and time courses. The spatial components of ICNs were discovered with moderate-to-high reliability. So far as we know, few studies focused on the reliability of the temporal patterns for ICNs based their individual time courses. The goals of this study were twofold: to investigate the test-retest reliability of temporal patterns for ICNs, and to analyze these informative univariate metrics. Additionally, a correlation analysis was performed to enhance interpretability. Our study included three datasets: (a) short- and long-term scans, (b) multi-band echo-planar imaging (mEPI), and (c) eyes open or closed. Using dual regression, we obtained the time courses of ICNs for each subject. To produce temporal patterns for ICNs, we applied two categories of univariate metrics: network-wise complexity and network-wise low-frequency oscillation. Furthermore, we validated the test-retest reliability for each metric. The network-wise temporal patterns for most ICNs (especially for default mode network, DMN) exhibited moderate-to-high reliability and reproducibility under different scan conditions. Network-wise complexity for DMN exhibited fair reliability (ICC<0.5) based on eyes-closed sessions. Specially, our results supported that mEPI could be a useful method with high reliability and reproducibility. In addition, these temporal patterns were with physiological meanings, and certain temporal patterns were correlated to the node strength of the corresponding ICN. Overall, network-wise temporal patterns of ICNs were reliable and informative and could be complementary to spatial patterns of ICNs for further study. Copyright © 2013 IBRO. Published by Elsevier Ltd. All rights reserved.
Energy Technology Data Exchange (ETDEWEB)
Lee, Eun Sun [Chung-Ang University Hospital, Department of Radiology, Seoul (Korea, Republic of); Chung-Ang University, College of Medicine and Graduate School of Medicine, Seoul (Korea, Republic of); National Cancer Centre, Department of Radiology, Goyang-si, Gyeonggi-do (Korea, Republic of); Kim, Min Ju; Hur, Bo Yun [National Cancer Centre, Department of Radiology, Goyang-si, Gyeonggi-do (Korea, Republic of); Park, Sung Chan; Hyun, Jong Hee; Chang, Hee Jin; Baek, Ji Yeon; Kim, Dae Yong; Oh, Jae Hwan [National Cancer Centre, Centre for Colorectal Cancer, Goyang, Gyeonggi-do (Korea, Republic of); Kim, Sun Young [National Cancer Centre, Centre for Colorectal Cancer, Goyang, Gyeonggi-do (Korea, Republic of); University of Ulsan College of Medicine, Department of Oncology, Asan Medical Centre, Seoul (Korea, Republic of)
2018-02-15
We evaluated the diagnostic performance of magnetic resonance imaging (MRI) in terms of identifying extramural venous invasion (EMVI) in rectal cancer patients with preoperative chemoradiotherapy (CRT) and its prognostic significance. During 2008-2010, 200 patients underwent surgery following preoperative CRT for rectal cancer. Two radiologists independently reviewed all pre- and post-CRT MRI retrospectively. We investigated diagnostic performance of pre-CRT MR-EMVI (MR-EMVI) and post-CRT MR-EMVI (yMR-EMVI), based on pathological EMVI as the standard of reference. We assessed correlation between MRI findings and patients' prognosis, such as disease-free survival (DFS) and overall survival (OS). Additionally, subgroup analysis in MR- or yMR-EMVI-positive patients was performed to confirm the significance of the severity of EMVI in MRI on patient's prognosis. The sensitivity and specificity of yMR-EMVI were 76.19% and 79.75% (area under the curve: 0.830), respectively. In univariate analysis, yMR-EMVI was the only significant MRI factor in DFS (P = 0.027). The mean DFS for yMR-EMVI (+) patients was significantly less than for yMR-EMVI (-) patients: 57.56 months versus 72.46 months. yMR-EMVI demonstrated good diagnostic performance. yMR-EMVI was the only significant EMVI-related MRI factor that correlated with patients' DFS in univariate analysis; however, it was not significant in multivariate analysis. (orig.)
Di Florio, Adriano
2017-10-01
In order to test the computing capabilities of GPUs with respect to traditional CPU cores a high-statistics toy Monte Carlo technique has been implemented both in ROOT/RooFit and GooFit frameworks with the purpose to estimate the statistical significance of the structure observed by CMS close to the kinematical boundary of the J/ψϕ invariant mass in the three-body decay B + → J/ψϕK +. GooFit is a data analysis open tool under development that interfaces ROOT/RooFit to CUDA platform on nVidia GPU. The optimized GooFit application running on GPUs hosted by servers in the Bari Tier2 provides striking speed-up performances with respect to the RooFit application parallelised on multiple CPUs by means of PROOF-Lite tool. The considerable resulting speed-up, evident when comparing concurrent GooFit processes allowed by CUDA Multi Process Service and a RooFit/PROOF-Lite process with multiple CPU workers, is presented and discussed in detail. By means of GooFit it has also been possible to explore the behaviour of a likelihood ratio test statistic in different situations in which the Wilks Theorem may or may not apply because its regularity conditions are not satisfied.
DEFF Research Database (Denmark)
Lopes Antunes, Ana Carolina; Jensen, Dan; Hisham Beshara Halasa, Tariq
2017-01-01
Disease monitoring and surveillance play a crucial role in control and eradication programs, as it is important to track implemented strategies in order to reduce and/or eliminate a specific disease. The objectives of this study were to assess the performance of different statistical monitoring......, decreases and constant sero-prevalence levels (referred as events). Two space-state models were used to model the time series, and different statistical monitoring methods (such as univariate process control algorithms–Shewart Control Chart, Tabular Cumulative Sums, and the V-mask- and monitoring...... of noise in the baseline was greater for the Shewhart Control Chart and Tabular Cumulative Sums than for the V-Mask and trend-based methods. The performance of the different statistical monitoring methods varied when monitoring increases and decreases in disease sero-prevalence. Combining two of more...
DEFF Research Database (Denmark)
Lopes Antunes, Ana Carolina; Jensen, Dan; Hisham Beshara Halasa, Tariq
2017-01-01
, decreases and constant sero-prevalence levels (referred as events). Two space-state models were used to model the time series, and different statistical monitoring methods (such as univariate process control algorithms–Shewart Control Chart, Tabular Cumulative Sums, and the V-mask- and monitoring......Disease monitoring and surveillance play a crucial role in control and eradication programs, as it is important to track implemented strategies in order to reduce and/or eliminate a specific disease. The objectives of this study were to assess the performance of different statistical monitoring...... of noise in the baseline was greater for the Shewhart Control Chart and Tabular Cumulative Sums than for the V-Mask and trend-based methods. The performance of the different statistical monitoring methods varied when monitoring increases and decreases in disease sero-prevalence. Combining two of more...
DEFF Research Database (Denmark)
Madsen, Tobias
2017-01-01
In the present thesis I develop, implement and apply statistical methods for detecting genomic elements implicated in cancer development and progression. This is done in two separate bodies of work. The first uses the somatic mutation burden to distinguish cancer driver mutations from passenger m...
Significance evaluation in factor graphs
DEFF Research Database (Denmark)
Madsen, Tobias; Hobolth, Asger; Jensen, Jens Ledet
2017-01-01
in genomics and the multiple-testing issues accompanying them, accurate significance evaluation is of great importance. We here address the problem of evaluating statistical significance of observations from factor graph models. Results Two novel numerical approximations for evaluation of statistical...... significance are presented. First a method using importance sampling. Second a saddlepoint approximation based method. We develop algorithms to efficiently compute the approximations and compare them to naive sampling and the normal approximation. The individual merits of the methods are analysed both from....... Conclusions The applicability of saddlepoint approximation and importance sampling is demonstrated on known models in the factor graph framework. Using the two methods we can substantially improve computational cost without compromising accuracy. This contribution allows analyses of large datasets...
Wang, Hui; Sui, Weiguo; Xue, Wen; Wu, Junyong; Chen, Jiejing; Dai, Yong
2014-09-01
Immunoglobulin A nephropathy (IgAN) is a complex trait regulated by the interaction among multiple physiologic regulatory systems and probably involving numerous genes, which leads to inconsistent findings in genetic studies. One possibility of failure to replicate some single-locus results is that the underlying genetics of IgAN nephropathy is based on multiple genes with minor effects. To learn the association between 23 single nucleotide polymorphisms (SNPs) in 14 genes predisposing to chronic glomerular diseases and IgAN in Han males, the 23 SNPs genotypes of 21 Han males were detected and analyzed with a BaiO gene chip, and their associations were analyzed with univariate analysis and multiple linear regression analysis. Analysis showed that CTLA4 rs231726 and CR2 rs1048971 revealed a significant association with IgAN. These findings support the multi-gene nature of the etiology of IgAN and propose a potential gene-gene interactive model for future studies.
Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-01
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least square (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). While, the univariate CWT failed to simultaneously determine the quaternary mixture components and was able to determine only PAR and PAP, the ternary mixtures of DRO, CAF, and PAR and CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. While for the development of the CWT-PLS model a calibration set was prepared by means of an orthogonal experimental design and their absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.
Statistical lamb wave localization based on extreme value theory
Harley, Joel B.
2018-04-01
Guided wave localization methods based on delay-and-sum imaging, matched field processing, and other techniques have been designed and researched to create images that locate and describe structural damage. The maximum value of these images typically represent an estimated damage location. Yet, it is often unclear if this maximum value, or any other value in the image, is a statistically significant indicator of damage. Furthermore, there are currently few, if any, approaches to assess the statistical significance of guided wave localization images. As a result, we present statistical delay-and-sum and statistical matched field processing localization methods to create statistically significant images of damage. Our framework uses constant rate of false alarm statistics and extreme value theory to detect damage with little prior information. We demonstrate our methods with in situ guided wave data from an aluminum plate to detect two 0.75 cm diameter holes. Our results show an expected improvement in statistical significance as the number of sensors increase. With seventeen sensors, both methods successfully detect damage with statistical significance.
Zhang, Jing; Liang, Lichen; Anderson, Jon R; Gatewood, Lael; Rottenberg, David A; Strother, Stephen C
2008-01-01
As functional magnetic resonance imaging (fMRI) becomes widely used, the demands for evaluation of fMRI processing pipelines and validation of fMRI analysis results is increasing rapidly. The current NPAIRS package, an IDL-based fMRI processing pipeline evaluation framework, lacks system interoperability and the ability to evaluate general linear model (GLM)-based pipelines using prediction metrics. Thus, it can not fully evaluate fMRI analytical software modules such as FSL.FEAT and NPAIRS.GLM. In order to overcome these limitations, a Java-based fMRI processing pipeline evaluation system was developed. It integrated YALE (a machine learning environment) into Fiswidgets (a fMRI software environment) to obtain system interoperability and applied an algorithm to measure GLM prediction accuracy. The results demonstrated that the system can evaluate fMRI processing pipelines with univariate GLM and multivariate canonical variates analysis (CVA)-based models on real fMRI data based on prediction accuracy (classification accuracy) and statistical parametric image (SPI) reproducibility. In addition, a preliminary study was performed where four fMRI processing pipelines with GLM and CVA modules such as FSL.FEAT and NPAIRS.CVA were evaluated with the system. The results indicated that (1) the system can compare different fMRI processing pipelines with heterogeneous models (NPAIRS.GLM, NPAIRS.CVA and FSL.FEAT) and rank their performance by automatic performance scoring, and (2) the rank of pipeline performance is highly dependent on the preprocessing operations. These results suggest that the system will be of value for the comparison, validation, standardization and optimization of functional neuroimaging software packages and fMRI processing pipelines.
Significant Overexpression of DVL1 in Taiwanese Colorectal Cancer Patients with Liver Metastasis
Directory of Open Access Journals (Sweden)
Shiu-Ru Lin
2013-10-01
Full Text Available Undetected micrometastasis plays a key role in the metastasis of cancer in colorectal cancer (CRC patients. The aim of this study is to identify a biomarker of CRC patients with liver metastasis through the detection of circulating tumor cells (CTCs. Microarray and bioinformatics analysis of 10 CRC cancer tissue specimens compared with normal adjacent tissues revealed that 31 genes were up-regulated (gene expression ratio of cancer tissue to paired normal tissue > 2 in the cancer patients. We used a weighted enzymatic chip array (WEnCA including 31 prognosis-related genes to investigate CTCs in 214 postoperative stage I–III CRC patients and to analyze the correlation between gene expression and clinico-pathological parameters. We employed the immunohistochemistry (IHC method with polyclonal mouse antibody against DVL1 to detect DVL1 expression in 60 CRC patients. CRC liver metastasis occurred in 19.16% (41/214 of the patients. Using univariate analysis and multivariate proportional hazards regression analysis, we found that DVL1 mRNA overexpression had a significant, independent predictive value for liver metastasis in CRC patients (OR: 5.764; 95% CI: 2.588–12.837; p < 0.0001 on univariate analysis; OR: 3.768; 95% CI: 1.469–9.665; p = 0.006 on multivariate analysis. IHC staining of the immunoreactivity of DVL1 showed that DVL1 was localized in the cytoplasm of CRC cells. High expression of DVL1 was observed in 55% (33/60 of CRC tumor specimens and was associated significantly with tumor depth, perineural invasion and liver metastasis status (all p < 0.05. Our experimental results demonstrated that DVL1 is significantly overexpressed in CRC patients with liver metastasis, leading us to conclude that DVL1 could be a potential prognostic and predictive marker for CRC patients.
Mirosław Mrozkowiak; Hanna Żukowska
2015-01-01
Mrozkowiak Mirosław, Żukowska Hanna. Znaczenie Dobrego Krzesła, jako elementu szkolnego i domowego środowiska ucznia, w profilaktyce zaburzeń statyki postawy ciała = The significance of Good Chair as part of children’s school and home environment in the preventive treatment of body statistics distortions. Journal of Education, Health and Sport. 2015;5(7):179-215. ISSN 2391-8306. DOI 10.5281/zenodo.19832 http://ojs.ukw.edu.pl/index.php/johs/article/view/2015%3B5%287%29%3A179-215 https:...
Can a significance test be genuinely Bayesian?
Pereira, Carlos A. de B.; Stern, Julio Michael; Wechsler, Sergio
2008-01-01
The Full Bayesian Significance Test, FBST, is extensively reviewed. Its test statistic, a genuine Bayesian measure of evidence, is discussed in detail. Its behavior in some problems of statistical inference like testing for independence in contingency tables is discussed.
Energy Technology Data Exchange (ETDEWEB)
Arigo, R.; Howe, S.E.; Webb, T. III
1984-06-01
Radiocarbon-dated pollen records are a source of quantitative estimates for climatic variables for the past 9000 years. Multiple regression is the main method for calculation of these estimates and requires a series of steps to gain equations that meet the statistical assumptions of the analysis. This manual describes these steps which include (1) selection of the region for analysis, (2) selection of the pollen types for statiscal analysis, (3) deletion of univariate outliers, (4) transformation to produce linear relationships, (5) selection of the regression equation, and (6) tests of the regression residuals. The input commands and the output from a series of SPSS (Statistical Package for Social Scientists) programs are illustrated and described, and, as an example, modern pollen and climatic data from lower Michigan are used to calculate a regression equation for July mean temperature. 19 references, 1 table.
International Nuclear Information System (INIS)
Martin, Robert P.; Nutt, William T.
2011-01-01
Research highlights: → Historical recitation on application of order-statistics models to nuclear power plant thermal-hydraulics safety analysis. → Interpretation of regulatory language regarding 10 CFR 50.46 reference to a 'high level of probability'. → Derivation and explanation of order-statistics-based evaluation methodologies considering multi-variate acceptance criteria. → Summary of order-statistics models and recommendations to the nuclear power plant thermal-hydraulics safety analysis community. - Abstract: The application of order-statistics in best-estimate plus uncertainty nuclear safety analysis has received a considerable amount of attention from methodology practitioners, regulators, and academia. At the root of the debate are two questions: (1) what is an appropriate quantitative interpretation of 'high level of probability' in regulatory language appearing in the LOCA rule, 10 CFR 50.46 and (2) how best to mathematically characterize the multi-variate case. An original derivation is offered to provide a quantitative basis for 'high level of probability.' At root of the second question is whether one should recognize a probability statement based on the tolerance region method of Wald and Guba, et al., for multi-variate problems, one explicitly based on the regulatory limits, best articulated in the Wallis-Nutt 'Testing Method', or something else entirely. This paper reviews the origins of the different positions, key assumptions, limitations, and relationship to addressing acceptance criteria. It presents a mathematical interpretation of the regulatory language, including a complete derivation of uni-variate order-statistics (as credited in AREVA's Realistic Large Break LOCA methodology) and extension to multi-variate situations. Lastly, it provides recommendations for LOCA applications, endorsing the 'Testing Method' and addressing acceptance methods allowing for limited sample failures.
Lepore, N; Brun, C; Chou, Y Y; Chiang, M C; Dutton, R A; Hayashi, K M; Luders, E; Lopez, O L; Aizenstein, H J; Toga, A W; Becker, J T; Thompson, P M
2008-01-01
This paper investigates the performance of a new multivariate method for tensor-based morphometry (TBM). Statistics on Riemannian manifolds are developed that exploit the full information in deformation tensor fields. In TBM, multiple brain images are warped to a common neuroanatomical template via 3-D nonlinear registration; the resulting deformation fields are analyzed statistically to identify group differences in anatomy. Rather than study the Jacobian determinant (volume expansion factor) of these deformations, as is common, we retain the full deformation tensors and apply a manifold version of Hotelling's $T(2) test to them, in a Log-Euclidean domain. In 2-D and 3-D magnetic resonance imaging (MRI) data from 26 HIV/AIDS patients and 14 matched healthy subjects, we compared multivariate tensor analysis versus univariate tests of simpler tensor-derived indices: the Jacobian determinant, the trace, geodesic anisotropy, and eigenvalues of the deformation tensor, and the angle of rotation of its eigenvectors. We detected consistent, but more extensive patterns of structural abnormalities, with multivariate tests on the full tensor manifold. Their improved power was established by analyzing cumulative p-value plots using false discovery rate (FDR) methods, appropriately controlling for false positives. This increased detection sensitivity may empower drug trials and large-scale studies of disease that use tensor-based morphometry.
Statistics for experimentalists
Cooper, B E
2014-01-01
Statistics for Experimentalists aims to provide experimental scientists with a working knowledge of statistical methods and search approaches to the analysis of data. The book first elaborates on probability and continuous probability distributions. Discussions focus on properties of continuous random variables and normal variables, independence of two random variables, central moments of a continuous distribution, prediction from a normal distribution, binomial probabilities, and multiplication of probabilities and independence. The text then examines estimation and tests of significance. Topics include estimators and estimates, expected values, minimum variance linear unbiased estimators, sufficient estimators, methods of maximum likelihood and least squares, and the test of significance method. The manuscript ponders on distribution-free tests, Poisson process and counting problems, correlation and function fitting, balanced incomplete randomized block designs and the analysis of covariance, and experiment...
Abe, Hideyuki; Takei, Kohei; Uematsu, Toshitaka; Tokura, Yuumi; Suzuki, Issei; Sakamoto, Kazumasa; Nishihara, Daisaku; Yamaguchi, Yoshiyuki; Mizuno, Tomoya; Nukui, Akinori; Kobayashi, Minoru; Kamai, Takao
2018-04-01
Recently, numerous studies have reported an association between sarcopenia and poor outcomes in various kinds of malignancies. We investigated whether sarcopenia predicts the survival of patients with metastatic urothelial carcinoma who underwent systemic chemotherapy. We reviewed 87 metastatic urothelial carcinoma patients who underwent chemotherapy (gemcitabine plus cisplatin or gemcitabine plus carboplatin for cisplatin-unfit patients) between 2007 and 2015. A computed tomography scan prior to chemotherapy was used for evaluating sarcopenia, and we measured three cross-sectional areas of skeletal muscle at the third lumbar vertebra and calculated the skeletal muscle index (SMI), the paraspinal muscle index (PSMI), and the total psoas area (TPA) of each patient. Predictive values of survival were assessed using Cox regression analysis. The median overall survival (OS) was 16 months (95% CI 13.5-18). Although SMI alone was not a significant predictor of shorter OS (P = 0.117) in univariate analysis, SMI stratified by the value of the body mass index (BMI) was a significant predictor of shorter OS in univariate analysis (P = 0.037) and was also an independent predictor of shorter OS in multivariate analysis (P = 0.026). PSMI and TPA were not significant prognostic factors even when stratified by BMI (P = 0.294 and 0.448), respectively. Neither PSMI nor TPA could substitute SMI as a predictor for poor outcomes in metastatic urothelial carcinoma patients treated with systemic chemotherapy in our study. SMI stratified by BMI is a useful predictor of prognosis in these patients.
The distribution of two medically and agriculturally important cryptic ...
African Journals Online (AJOL)
Univariate and multivariate analyses showed statistically significant differences between eco geographic characteristics of collecting localities associated with each of the two species in South Africa. The derived distributions indicate previously reported cases of plague in South Africa, to some extent, coincide with the ...
On two methods of statistical image analysis
Missimer, J; Knorr, U; Maguire, RP; Herzog, H; Seitz, RJ; Tellman, L; Leenders, K.L.
1999-01-01
The computerized brain atlas (CBA) and statistical parametric mapping (SPM) are two procedures for voxel-based statistical evaluation of PET activation studies. Each includes spatial standardization of image volumes, computation of a statistic, and evaluation of its significance. In addition,
Riley, Richard D.
2017-01-01
An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945
Statistics Using Just One Formula
Rosenthal, Jeffrey S.
2018-01-01
This article advocates that introductory statistics be taught by basing all calculations on a single simple margin-of-error formula and deriving all of the standard introductory statistical concepts (confidence intervals, significance tests, comparisons of means and proportions, etc) from that one formula. It is argued that this approach will…
Statistics Anxiety among Postgraduate Students
Koh, Denise; Zawi, Mohd Khairi
2014-01-01
Most postgraduate programmes, that have research components, require students to take at least one course of research statistics. Not all postgraduate programmes are science based, there are a significant number of postgraduate students who are from the social sciences that will be taking statistics courses, as they try to complete their…
Regionalization of Drought across South Korea Using Multivariate Methods
Directory of Open Access Journals (Sweden)
Muhammad Azam
2017-12-01
Full Text Available Topographic and hydro-climatic features of South Korea are highly heterogeneous and able to influence the drought phenomena in the region. The complex topographical and hydro-climatic features of South Korea need a statistically accurate method to find homogeneous regions. Regionalization of drought in a bivariate framework has scarcely been applied in South Korea before. Hierarchical Classification on Principal Components (HCPC algorithm together with Principal Component Analysis (PCA method and cluster validation indices were investigated and used for the regionalization of drought across the South Korean region. Statistical homogeneity and discordancy of the region was tested on univariate and bivariate frameworks. HCPC indicate that South Korea should be divided into four regions which are closer to being homogeneous. Univariate and bivariate homogeneity and discordancy tests showed the significant difference in their results due to the inability of univariate homogeneity and discordancy measures to consider the joint behavior of duration and severity. Regionalization of drought for SPI time scale of 1, 3, 6, 12, and 24 months showed significant variation in discordancy and homogeneity of the region with the change in SPI time scale. The results of this study can be used as basic data required to establish a drought mitigation plan on regional scales.
DIFFERENCES IN GAME STATISTICS BETWEEN WINNING AND LOSING RUGBY TEAMS IN THE SIX NATIONS TOURNAMENT
Directory of Open Access Journals (Sweden)
José M. Palao
2009-12-01
Full Text Available The objective of the present study was to analyze the differences in rugby game statistics between winning and losing teams. The data from 58 games of round robin play from the Six Nations tournament from the 2003-2006 seasons were analyzed. The groups of variables studied were: number of points scored, way in which the points were scored; way teams obtained the ball and how the team used it; and technical and tactical aspects of the game. A univariate (t-test and multivariate (discriminant analysis of data was done. Winning teams had average values that were significantly higher in points scored, conversions, successful drops, mauls won, line breaks, possessions kicked, tackles completed, and turnovers won. Losing teams had significantly higher averages for the variables scrums lost and line-outs lost. The results showed that: a in the phases of obtaining the ball and more specifically in scrummage and line-out, winning teams lose fewer balls than losing teams (winning teams have an efficacy of 90% in both actions; b the winning team tends to play more with their feet when they obtain the ball, to utilize the maul as a way of attacking, and to break the defensive line more often than the losing team does; and c On defence, winning teams recovered more balls and completed more tackles than losing teams, and the percentage of tackles completed by winning teams was 94%. The value presented could be used as a reference for practice and competition in peak performance teams
Statistical Reporting Errors and Collaboration on Statistical Analyses in Psychological Science.
Veldkamp, Coosje L S; Nuijten, Michèle B; Dominguez-Alvarez, Linda; van Assen, Marcel A L M; Wicherts, Jelte M
2014-01-01
Statistical analysis is error prone. A best practice for researchers using statistics would therefore be to share data among co-authors, allowing double-checking of executed tasks just as co-pilots do in aviation. To document the extent to which this 'co-piloting' currently occurs in psychology, we surveyed the authors of 697 articles published in six top psychology journals and asked them whether they had collaborated on four aspects of analyzing data and reporting results, and whether the described data had been shared between the authors. We acquired responses for 49.6% of the articles and found that co-piloting on statistical analysis and reporting results is quite uncommon among psychologists, while data sharing among co-authors seems reasonably but not completely standard. We then used an automated procedure to study the prevalence of statistical reporting errors in the articles in our sample and examined the relationship between reporting errors and co-piloting. Overall, 63% of the articles contained at least one p-value that was inconsistent with the reported test statistic and the accompanying degrees of freedom, and 20% of the articles contained at least one p-value that was inconsistent to such a degree that it may have affected decisions about statistical significance. Overall, the probability that a given p-value was inconsistent was over 10%. Co-piloting was not found to be associated with reporting errors.
Directory of Open Access Journals (Sweden)
Dominic Beaulieu-Prévost
2006-03-01
Full Text Available For the last 50 years of research in quantitative social sciences, the empirical evaluation of scientific hypotheses has been based on the rejection or not of the null hypothesis. However, more than 300 articles demonstrated that this method was problematic. In summary, null hypothesis testing (NHT is unfalsifiable, its results depend directly on sample size and the null hypothesis is both improbable and not plausible. Consequently, alternatives to NHT such as confidence intervals (CI and measures of effect size are starting to be used in scientific publications. The purpose of this article is, first, to provide the conceptual tools necessary to implement an approach based on confidence intervals, and second, to briefly demonstrate why such an approach is an interesting alternative to an approach based on NHT. As demonstrated in the article, the proposed CI approach avoids most problems related to a NHT approach and can often improve the scientific and contextual relevance of the statistical interpretations by testing range hypotheses instead of a point hypothesis and by defining the minimal value of a substantial effect. The main advantage of such a CI approach is that it replaces the notion of statistical power by an easily interpretable three-value logic (probable presence of a substantial effect, probable absence of a substantial effect and probabilistic undetermination. The demonstration includes a complete example.
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.
Preventing statistical errors in scientific journals.
Nuijten, M.B.
2016-01-01
There is evidence for a high prevalence of statistical reporting errors in psychology and other scientific fields. These errors display a systematic preference for statistically significant results, distorting the scientific literature. There are several possible causes for this systematic error
Müller-Kirsten, Harald J W
2013-01-01
Statistics links microscopic and macroscopic phenomena, and requires for this reason a large number of microscopic elements like atoms. The results are values of maximum probability or of averaging. This introduction to statistical physics concentrates on the basic principles, and attempts to explain these in simple terms supplemented by numerous examples. These basic principles include the difference between classical and quantum statistics, a priori probabilities as related to degeneracies, the vital aspect of indistinguishability as compared with distinguishability in classical physics, the differences between conserved and non-conserved elements, the different ways of counting arrangements in the three statistics (Maxwell-Boltzmann, Fermi-Dirac, Bose-Einstein), the difference between maximization of the number of arrangements of elements, and averaging in the Darwin-Fowler method. Significant applications to solids, radiation and electrons in metals are treated in separate chapters, as well as Bose-Eins...
Renyi statistics in equilibrium statistical mechanics
International Nuclear Information System (INIS)
Parvan, A.S.; Biro, T.S.
2010-01-01
The Renyi statistics in the canonical and microcanonical ensembles is examined both in general and in particular for the ideal gas. In the microcanonical ensemble the Renyi statistics is equivalent to the Boltzmann-Gibbs statistics. By the exact analytical results for the ideal gas, it is shown that in the canonical ensemble, taking the thermodynamic limit, the Renyi statistics is also equivalent to the Boltzmann-Gibbs statistics. Furthermore it satisfies the requirements of the equilibrium thermodynamics, i.e. the thermodynamical potential of the statistical ensemble is a homogeneous function of first degree of its extensive variables of state. We conclude that the Renyi statistics arrives at the same thermodynamical relations, as those stemming from the Boltzmann-Gibbs statistics in this limit.
Statistical Software for State Space Methods
Directory of Open Access Journals (Sweden)
Jacques J. F. Commandeur
2011-05-01
Full Text Available In this paper we review the state space approach to time series analysis and establish the notation that is adopted in this special volume of the Journal of Statistical Software. We first provide some background on the history of state space methods for the analysis of time series. This is followed by a concise overview of linear Gaussian state space analysis including the modelling framework and appropriate estimation methods. We discuss the important class of unobserved component models which incorporate a trend, a seasonal, a cycle, and fixed explanatory and intervention variables for the univariate and multivariate analysis of time series. We continue the discussion by presenting methods for the computation of different estimates for the unobserved state vector: filtering, prediction, and smoothing. Estimation approaches for the other parameters in the model are also considered. Next, we discuss how the estimation procedures can be used for constructing confidence intervals, detecting outlier observations and structural breaks, and testing model assumptions of residual independence, homoscedasticity, and normality. We then show how ARIMA and ARIMA components models fit in the state space framework to time series analysis. We also provide a basic introduction for non-Gaussian state space models. Finally, we present an overview of the software tools currently available for the analysis of time series with state space methods as they are discussed in the other contributions to this special volume.
African Journals Online (AJOL)
Nubidga
Data on cocoa yield, flowering, and pod infestation were obtained through semi- ... and analyzed using univariate, bivariate and graphical techniques. .... Univariate and multivariate statistical techniques ..... Journal of Applied Meteorology.
Statistics: a Bayesian perspective
National Research Council Canada - National Science Library
Berry, Donald A
1996-01-01
...: it is the only introductory textbook based on Bayesian ideas, it combines concepts and methods, it presents statistics as a means of integrating data into the significant process, it develops ideas...
Statistical Analysis and Evaluation of the Depth of the Ruts on Lithuanian State Significance Roads
Directory of Open Access Journals (Sweden)
Erinijus Getautis
2011-04-01
Full Text Available The aim of this work is to gather information about the national flexible pavement roads ruts depth, to determine its statistical dispersijon index and to determine their validity for needed requirements. Analysis of scientific works of ruts apearance in the asphalt and their influence for driving is presented in this work. Dynamical models of ruts in asphalt are presented in the work as well. Experimental outcome data of rut depth dispersijon in the national highway of Lithuania Vilnius – Kaunas is prepared. Conclusions are formulated and presented. Article in Lithuanian
Clinicopathological significance of fascin-1 expression in patients with non-small cell lung cancer
Directory of Open Access Journals (Sweden)
Ling XL
2015-06-01
Full Text Available Xiao-Ling Ling,* Tao Zhang,* Xiao-Ming Hou, Da Zhao Department of Oncology, The First Hospital of Lanzhou University (The Branch Hospital of Donggang, Lanzhou, Gansu Province, People’s Republic of China *These authors contributed equally to this work Purpose: Fascin-1 promotes the formation of filopodia, lamellipodia, and microspikes of cell membrane after its cross-linking with F-actin, thereby enhancing the cell movement and metastasis and invasion of tumor cells. This study explored the fascin-1 protein’s expression in non-small cell lung cancer (NSCLC tissues and its relationship with clinical pathology and prognostic indicators.Methods: Immunohistochemical analysis was used to determine the expression of fascin-1 in NSCLC tissues. We used quantitative real-time polymerase chain reaction and western blot analysis to further verify the results. The fascin-1 expression and statistical method for clinical pathological parameters are examined by χ2. Kaplan–Meier method is used for survival analysis. Cox’s Proportional Hazard Model was used to conduct a combined-effect analysis for each covariate.Results: In 73 of the 128 cases, NSCLC cancer tissues (57.0% were found with high expression of fascin-1, which was significantly higher than the adjacent tissues (35/128, 27.3%. The results suggested that the high expression of fascin-1 was significantly correlated with lymph node metastasis (P=0.022 and TNM stage (P=0.042. The high fascin-1 expression patients survived shorter than those NSCLC patients with low fascin-1 expression (P<0.05. Univariate analysis revealed that lymph node metastasis, TNM stage, and fascin-1 expression status were correlated with the overall survival. Similarly, lymph node metastasis, TNM stage, and fascin-1 expression status were significantly associated with the overall survival in multivariate analyses by using the Cox regression model.Conclusion: The fascin-1 protein may be a useful prognostic indicator and
Directory of Open Access Journals (Sweden)
Laura Badenes-Ribera
2018-06-01
Full Text Available Introduction: Publications arguing against the null hypothesis significance testing (NHST procedure and in favor of good statistical practices have increased. The most frequently mentioned alternatives to NHST are effect size statistics (ES, confidence intervals (CIs, and meta-analyses. A recent survey conducted in Spain found that academic psychologists have poor knowledge about effect size statistics, confidence intervals, and graphic displays for meta-analyses, which might lead to a misinterpretation of the results. In addition, it also found that, although the use of ES is becoming generalized, the same thing is not true for CIs. Finally, academics with greater knowledge about ES statistics presented a profile closer to good statistical practice and research design. Our main purpose was to analyze the extension of these results to a different geographical area through a replication study.Methods: For this purpose, we elaborated an on-line survey that included the same items as the original research, and we asked academic psychologists to indicate their level of knowledge about ES, their CIs, and meta-analyses, and how they use them. The sample consisted of 159 Italian academic psychologists (54.09% women, mean age of 47.65 years. The mean number of years in the position of professor was 12.90 (SD = 10.21.Results: As in the original research, the results showed that, although the use of effect size estimates is becoming generalized, an under-reporting of CIs for ES persists. The most frequent ES statistics mentioned were Cohen's d and R2/η2, which can have outliers or show non-normality or violate statistical assumptions. In addition, academics showed poor knowledge about meta-analytic displays (e.g., forest plot and funnel plot and quality checklists for studies. Finally, academics with higher-level knowledge about ES statistics seem to have a profile closer to good statistical practices.Conclusions: Changing statistical practice is not
Understanding Statistics - Cancer Statistics
Annual reports of U.S. cancer statistics including new cases, deaths, trends, survival, prevalence, lifetime risk, and progress toward Healthy People targets, plus statistical summaries for a number of common cancer types.
Directory of Open Access Journals (Sweden)
Sreeram V Ramagopalan
2015-04-01
Full Text Available Background: We and others have shown a significant proportion of interventional trials registered on ClinicalTrials.gov have their primary outcomes altered after the listed study start and completion dates. The objectives of this study were to investigate whether changes made to primary outcomes are associated with the likelihood of reporting a statistically significant primary outcome on ClinicalTrials.gov. Methods: A cross-sectional analysis of all interventional clinical trials registered on ClinicalTrials.gov as of 20 November 2014 was performed. The main outcome was any change made to the initially listed primary outcome and the time of the change in relation to the trial start and end date. Findings: 13,238 completed interventional trials were registered with ClinicalTrials.gov that also had study results posted on the website. 2555 (19.3% had one or more statistically significant primary outcomes. Statistical analysis showed that registration year, funding source and primary outcome change after trial completion were associated with reporting a statistically significant primary outcome. Conclusions: Funding source and primary outcome change after trial completion are associated with a statistically significant primary outcome report on clinicaltrials.gov.
Promotional Strategy Impacts on Organizational Market Share and Profitability
Adesoga Dada Adefulu
2015-01-01
The paper examined promotional strategy impacts on market share and profitability in Coca-Cola and 7up companies in Lagos State, Nigeria. Survey research method was adopted. The study population was the staff in marketing positions in the selected companies. Questionnaire was administered on the samples from Coca-Cola and 7UP companies. The statistical tool employed was the univariate analysis of variance (ANOVA) to determine the statistical significance and the extent to which...
The use of principal components and univariate charts to control multivariate processes
Directory of Open Access Journals (Sweden)
Marcela A. G. Machado
2008-04-01
Full Text Available In this article, we evaluate the performance of the T² chart based on the principal components (PC X chart and the simultaneous univariate control charts based on the original variables (SU charts or based on the principal components (SUPC charts. The main reason to consider the PC chart lies on the dimensionality reduction. However, depending on the disturbance and on the way the original variables are related, the chart is very slow in signaling, except when all variables are negatively correlated and the principal component is wisely selected. Comparing the SU , the SUPC and the T² charts we conclude that the SU X charts (SUPC charts have a better overall performance when the variables are positively (negatively correlated. We also develop the expression to obtain the power of two S² charts designed for monitoring the covariance matrix. These joint S² charts are, in the majority of the cases, more efficient than the generalized variance chart.Neste artigo, avaliamos o desempenho do gráfico de T² baseado em componentes principais (gráfico PC e dos gráficos de controle simultâneos univariados baseados nas variáveis originais (gráfico SU X ou baseados em componentes principais (gráfico SUPC. A principal razão para o uso do gráfico PC é a redução de dimensionalidade. Entretanto, dependendo da perturbação e da correlação entre as variáveis originais, o gráfico é lento em sinalizar, exceto quando todas as variáveis são negativamente correlacionadas e a componente principal é adequadamente escolhida. Comparando os gráficos SU X, SUPC e T² concluímos que o gráfico SU X (gráfico SUPC tem um melhor desempenho global quando as variáveis são positivamente (negativamente correlacionadas. Desenvolvemos também uma expressão para obter o poder de detecção de dois gráficos de S² projetados para controlar a matriz de covariâncias. Os gráficos conjuntos de S² são, na maioria dos casos, mais eficientes que o gr
Statistical analysis and data management
International Nuclear Information System (INIS)
Anon.
1981-01-01
This report provides an overview of the history of the WIPP Biology Program. The recommendations of the American Institute of Biological Sciences (AIBS) for the WIPP biology program are summarized. The data sets available for statistical analyses and problems associated with these data sets are also summarized. Biological studies base maps are presented. A statistical model is presented to evaluate any correlation between climatological data and small mammal captures. No statistically significant relationship between variance in small mammal captures on Dr. Gennaro's 90m x 90m grid and precipitation records from the Duval Potash Mine were found
Directory of Open Access Journals (Sweden)
Mashhood Ahmed Sheikh
2017-08-01
mediate the association between childhood adversity and ADS in adulthood. However, when education was excluded as a mediator-response confounding variable, the indirect effect of childhood adversity on ADS in adulthood was statistically significant (p < 0.05. This study shows that a careful inclusion of potential confounding variables is important when assessing mediation.
International Nuclear Information System (INIS)
Lean, Hooi Hooi; Smyth, Russell
2014-01-01
This paper examines whether initiatives to promote hydroelectricity consumption are likely to be effective by applying univariate and panel Lagrange Multiplier (LM) unit root tests to hydroelectricity consumption in 55 countries over the period 1965–2011. We find that for the panel, as well as about four-fifths of individual countries, that hydroelectricity consumption is stationary. This result implies that shocks to hydroelectricity consumption in most countries will only result in temporary deviations from the long-run growth path. An important consequence of this finding is that initiatives designed to have permanent positive effects on hydroelectricity consumption, such as large-scale dam construction, are unlikely to be effective in increasing the share of hydroelectricity, relative to consumption of fossil fuels. - Highlights: • Applies unit root tests to hydroelectricity consumption. • Hydroelectricity consumption is stationary. • Shocks to hydroelectricity consumption result in temporary deviations from the long-run growth path
Das Bhowmik, R.; Arumugam, S.
2015-12-01
Multivariate downscaling techniques exhibited superiority over univariate regression schemes in terms of preserving cross-correlations between multiple variables- precipitation and temperature - from GCMs. This study focuses on two aspects: (a) develop an analytical solutions on estimating biases in cross-correlations from univariate downscaling approaches and (b) quantify the uncertainty in land-surface states and fluxes due to biases in cross-correlations in downscaled climate forcings. Both these aspects are evaluated using climate forcings available from both historical climate simulations and CMIP5 hindcasts over the entire US. The analytical solution basically relates the univariate regression parameters, co-efficient of determination of regression and the co-variance ratio between GCM and downscaled values. The analytical solutions are compared with the downscaled univariate forcings by choosing the desired p-value (Type-1 error) in preserving the observed cross-correlation. . For quantifying the impacts of biases on cross-correlation on estimating streamflow and groundwater, we corrupt the downscaled climate forcings with different cross-correlation structure.
[Big data in official statistics].
Zwick, Markus
2015-08-01
The concept of "big data" stands to change the face of official statistics over the coming years, having an impact on almost all aspects of data production. The tasks of future statisticians will not necessarily be to produce new data, but rather to identify and make use of existing data to adequately describe social and economic phenomena. Until big data can be used correctly in official statistics, a lot of questions need to be answered and problems solved: the quality of data, data protection, privacy, and the sustainable availability are some of the more pressing issues to be addressed. The essential skills of official statisticians will undoubtedly change, and this implies a number of challenges to be faced by statistical education systems, in universities, and inside the statistical offices. The national statistical offices of the European Union have concluded a concrete strategy for exploring the possibilities of big data for official statistics, by means of the Big Data Roadmap and Action Plan 1.0. This is an important first step and will have a significant influence on implementing the concept of big data inside the statistical offices of Germany.
Univariate and Cross Tabulation Analysis of Construction Accidents in the Aegean Region
BARADAN, Selim; AKBOĞA, Özge; ÇETİNKAYA, Ufuk; USMEN, Mümtaz A.
2016-01-01
It is crucial toinvestigate case studies and analyze accident statistics to establish safetyand health culture in the construction industry, which exhibits high fatalityrates. However, it is difficult to find reliable and accurate constructionaccidents data in Turkeydue to inadequate accident reporting and recordkeeping system, which hindersstatistical safety research. Therefore, an independent database was generatedby using inspection reports in this research study. Data mining was performed...
National Statistical Commission and Indian Official Statistics*
Indian Academy of Sciences (India)
IAS Admin
a good collection of official statistics of that time. With more .... statistical agencies and institutions to provide details of statistical activities .... ing several training programmes. .... ful completion of Indian Statistical Service examinations, the.
Prognostic indices in stereotactic radiotherapy of brain metastases of non-small cell lung cancer.
Kaul, David; Angelidis, Alexander; Budach, Volker; Ghadjar, Pirus; Kufeld, Markus; Badakhshi, Harun
2015-11-26
Our purpose was to analyze the long-term clinical outcome and to identify prognostic factors after Linac-based stereotactic radiosurgery (SRS) or fractionated stereotactic radiotherapy (FSRT) on patients with brain metastases (BM) from non-small cell lung cancer (NSCLC). We performed a retrospective analysis of survival on 90 patients who underwent SRS or FSRT of intracranial NSCLC metastases between 04/2004 and 05/2014 that had not undergone prior surgery or whole brain radiotherapy (WBRT) for BM. Follow-up data was analyzed until May 2015. Potential prognostic factors were examined in univariable and multivariable analyses. The Golden Grading System (GGS), the disease-specific graded prognostic assessment (DS-GPA), the RADES II prognostic index as well as the NSCLC-specific index proposed by Rades et al. in 2013 (NSCLC-RADES) were calculated and their predictive values were tested in univariable analysis. The median follow-up time of the surviving patients was 14 months. The overall survival (OS) rate was 51 % after 6 months and 29.9 % after 12 months. Statistically significant factors of better OS after univariable analysis were lower International Union Against Cancer (UICC) stage at first diagnosis, histology of adenocarcinoma, prior surgery of the primary tumor and lower total BM volume. After multivariable analysis adenocarcinoma histology remained a significant factor; higher Karnofsky Performance Score (KPS) and the presence of extracranial metastases (ECM) were also significant. The RADES II and the NSCLC-RADES indices were significant predictors of OS. However, the NSCLC-RADES failed to differentiate between intermediate- and low-risk patients. The DS-GPA and GGS were not statistically significant predictors of survival in univariable analysis. The ideal prognostic index has not been defined yet. We believe that more specific indices will be developed in the future. Our results indicate that the histologic subtype of NSCLC could add to the prognostic
Official Statistics and Statistics Education: Bridging the Gap
Directory of Open Access Journals (Sweden)
Gal Iddo
2017-03-01
Full Text Available This article aims to challenge official statistics providers and statistics educators to ponder on how to help non-specialist adult users of statistics develop those aspects of statistical literacy that pertain to official statistics. We first document the gap in the literature in terms of the conceptual basis and educational materials needed for such an undertaking. We then review skills and competencies that may help adults to make sense of statistical information in areas of importance to society. Based on this review, we identify six elements related to official statistics about which non-specialist adult users should possess knowledge in order to be considered literate in official statistics: (1 the system of official statistics and its work principles; (2 the nature of statistics about society; (3 indicators; (4 statistical techniques and big ideas; (5 research methods and data sources; and (6 awareness and skills for citizens’ access to statistical reports. Based on this ad hoc typology, we discuss directions that official statistics providers, in cooperation with statistics educators, could take in order to (1 advance the conceptualization of skills needed to understand official statistics, and (2 expand educational activities and services, specifically by developing a collaborative digital textbook and a modular online course, to improve public capacity for understanding of official statistics.
The significance of the Van Nuys prognostic index in the management of ductal carcinoma in situ
Directory of Open Access Journals (Sweden)
Davies Mary
2008-06-01
Full Text Available Abstract Background Debate regarding the benefit of radiotherapy after local excision of ductal carcinoma in situ (DCIS continues. The Van Nuys Prognostic Index (VNPI is thought to be a useful aid in deciding which patients are at increased risk of local recurrence and who may benefit from adjuvant radiotherapy (RT. Recently published interim data from the Sloane project has showed that the VNPI score did significantly affect the chances of getting planned radiotherapy in the UK, suggesting that British clinicians may already be using this scoring system to assist in decision making. This paper independently assesses the prognostic validity of the VNPI in a British population. Patients and methods A retrospective review was conducted of all patients (n = 215 who underwent breast conserving surgery for DCIS at a single institution between 1997 – 2006. No patients included in the study received additional radiotherapy or hormonal treatment. Kaplan Meier survival curves were calculated, to determine disease free survival, for the total sample and a series of univariate analyses were performed to examine the value of various prognostic factors including the VNPI. The log-rank test was used to determine statistical significance of differential survival rates. Multivariate Cox regression analysis was performed to analyze the significance of the individual components of the VNPI. All analyses were conducted using SPSS software, version 14.5. Results The mean follow-up period was 53 months (range 12–97, SD19.9. Ninety five tumours were high grade (44% and 84 tumours exhibited comedo necrosis (39%. The closest mean initial excision margin was 2.4 mm (range 0–22 mm, standard deviation 2.8 and a total of 72 tumours (33% underwent further re-excision. The observed and the actuarial 8 year disease-free survival rates in this study were 91% and 83% respectively. The VNPI score and the presence of comedo necrosis were the only statistically significant
Energy Technology Data Exchange (ETDEWEB)
Pistelli, Mirco, E-mail: mirco.pistelli@alice.it; Caramanti, Miriam [Clinica di Oncologia Medica, AO Ospedali Riuniti-Ancona, Università Politecnica delle Marche, Ancona 60020 (Italy); Biscotti, Tommasina; Santinelli, Alfredo [Anatomia Patologica, AO Ospedali Riuniti-Ancona, Università Politecnica delle Marche, Ancona 60020 (Italy); Pagliacci, Alessandra; De Lisa, Mariagrazia; Ballatore, Zelmira; Ridolfi, Francesca; Maccaroni, Elena; Bracci, Raffaella; Berardi, Rossana; Battelli, Nicola; Cascinu, Stefano [Clinica di Oncologia Medica, AO Ospedali Riuniti-Ancona, Università Politecnica delle Marche, Ancona 60020 (Italy)
2014-06-27
Background: Triple-negative breast cancers (TNBC) are characterized by aggressive tumour biology resulting in a poor prognosis. Androgen receptor (AR) is one of newly emerging biomarker in TNBC. In recent years, ARs have been demonstrated to play an important role in the genesis and in the development of breast cancer, although their prognostic role is still debated. In the present study, we explored the correlation of AR expression with clinical, pathological and molecular features and its impact on prognosis in early TNBC. Patients and Methods: ARs were considered positive in case of tumors with >10% nuclear-stained. Survival distribution was estimated by the Kaplan Meier method. The univariate and multivariate analyses were performed. The difference among variables were calculated by chi-square test. Results: 81 TNBC patients diagnosed between January 2006 and December 2011 were included in the analysis. Slides were stained immunohistochemically for estrogen and progesterone receptors, HER-2, Ki-67, ALDH1, e-cadherin and AR. Of the 81 TNBC samples, 18.8% showed positive immunostaining for AR, 23.5% and 44.4% of patients were negative for e-cadherin and ALDH1, respectively. Positive AR immunostaining was inversely correlated with a higher Ki-67 (p < 0.0001) and a lympho-vascular invasion (p = 0.01), but no other variables. Univariate survival analysis revealed that AR expression was not associated with disease-free survival (p = 0.72) or overall survival (p = 0.93). Conclusions: The expression of AR is associated with some biological features of TNBC, such as Ki-67 and lympho-vascular invasion; nevertheless the prognostic significance of AR was not documented in our analysis. However, since ARs are expressed in a significant number of TNBC, prospective studies in order to determine the biological mechanisms and their potential role as novel treatment target.
Directory of Open Access Journals (Sweden)
Mirco Pistelli
2014-06-01
Full Text Available Background: Triple-negative breast cancers (TNBC are characterized by aggressive tumour biology resulting in a poor prognosis. Androgen receptor (AR is one of newly emerging biomarker in TNBC. In recent years, ARs have been demonstrated to play an important role in the genesis and in the development of breast cancer, although their prognostic role is still debated. In the present study, we explored the correlation of AR expression with clinical, pathological and molecular features and its impact on prognosis in early TNBC. Patients and Methods: ARs were considered positive in case of tumors with >10% nuclear-stained. Survival distribution was estimated by the Kaplan Meier method. The univariate and multivariate analyses were performed. The difference among variables were calculated by chi-square test. Results: 81 TNBC patients diagnosed between January 2006 and December 2011 were included in the analysis. Slides were stained immunohistochemically for estrogen and progesterone receptors, HER-2, Ki-67, ALDH1, e-cadherin and AR. Of the 81 TNBC samples, 18.8% showed positive immunostaining for AR, 23.5% and 44.4% of patients were negative for e-cadherin and ALDH1, respectively. Positive AR immunostaining was inversely correlated with a higher Ki-67 (p < 0.0001 and a lympho-vascular invasion (p = 0.01, but no other variables. Univariate survival analysis revealed that AR expression was not associated with disease-free survival (p = 0.72 or overall survival (p = 0.93. Conclusions: The expression of AR is associated with some biological features of TNBC, such as Ki-67 and lympho-vascular invasion; nevertheless the prognostic significance of AR was not documented in our analysis. However, since ARs are expressed in a significant number of TNBC, prospective studies in order to determine the biological mechanisms and their potential role as novel treatment target.
International Nuclear Information System (INIS)
Pistelli, Mirco; Caramanti, Miriam; Biscotti, Tommasina; Santinelli, Alfredo; Pagliacci, Alessandra; De Lisa, Mariagrazia; Ballatore, Zelmira; Ridolfi, Francesca; Maccaroni, Elena; Bracci, Raffaella; Berardi, Rossana; Battelli, Nicola; Cascinu, Stefano
2014-01-01
Background: Triple-negative breast cancers (TNBC) are characterized by aggressive tumour biology resulting in a poor prognosis. Androgen receptor (AR) is one of newly emerging biomarker in TNBC. In recent years, ARs have been demonstrated to play an important role in the genesis and in the development of breast cancer, although their prognostic role is still debated. In the present study, we explored the correlation of AR expression with clinical, pathological and molecular features and its impact on prognosis in early TNBC. Patients and Methods: ARs were considered positive in case of tumors with >10% nuclear-stained. Survival distribution was estimated by the Kaplan Meier method. The univariate and multivariate analyses were performed. The difference among variables were calculated by chi-square test. Results: 81 TNBC patients diagnosed between January 2006 and December 2011 were included in the analysis. Slides were stained immunohistochemically for estrogen and progesterone receptors, HER-2, Ki-67, ALDH1, e-cadherin and AR. Of the 81 TNBC samples, 18.8% showed positive immunostaining for AR, 23.5% and 44.4% of patients were negative for e-cadherin and ALDH1, respectively. Positive AR immunostaining was inversely correlated with a higher Ki-67 (p < 0.0001) and a lympho-vascular invasion (p = 0.01), but no other variables. Univariate survival analysis revealed that AR expression was not associated with disease-free survival (p = 0.72) or overall survival (p = 0.93). Conclusions: The expression of AR is associated with some biological features of TNBC, such as Ki-67 and lympho-vascular invasion; nevertheless the prognostic significance of AR was not documented in our analysis. However, since ARs are expressed in a significant number of TNBC, prospective studies in order to determine the biological mechanisms and their potential role as novel treatment target
Uncertainty Visualization Using Copula-Based Analysis in Mixed Distribution Models.
Hazarika, Subhashis; Biswas, Ayan; Shen, Han-Wei
2018-01-01
Distributions are often used to model uncertainty in many scientific datasets. To preserve the correlation among the spatially sampled grid locations in the dataset, various standard multivariate distribution models have been proposed in visualization literature. These models treat each grid location as a univariate random variable which models the uncertainty at that location. Standard multivariate distributions (both parametric and nonparametric) assume that all the univariate marginals are of the same type/family of distribution. But in reality, different grid locations show different statistical behavior which may not be modeled best by the same type of distribution. In this paper, we propose a new multivariate uncertainty modeling strategy to address the needs of uncertainty modeling in scientific datasets. Our proposed method is based on a statistically sound multivariate technique called Copula, which makes it possible to separate the process of estimating the univariate marginals and the process of modeling dependency, unlike the standard multivariate distributions. The modeling flexibility offered by our proposed method makes it possible to design distribution fields which can have different types of distribution (Gaussian, Histogram, KDE etc.) at the grid locations, while maintaining the correlation structure at the same time. Depending on the results of various standard statistical tests, we can choose an optimal distribution representation at each location, resulting in a more cost efficient modeling without significantly sacrificing on the analysis quality. To demonstrate the efficacy of our proposed modeling strategy, we extract and visualize uncertain features like isocontours and vortices in various real world datasets. We also study various modeling criterion to help users in the task of univariate model selection.
International Nuclear Information System (INIS)
Blanchard, Pierre; Quero, Laurent; Pacault, Vincent; Schlageter, Marie-Helene; Baruch-Hennequin, Valerie; Hennequin, Christophe
2012-01-01
P53 mutations are an adverse prognostic factor in esophageal cancer. P53 and KRas mutations are involved in chemo-radioresistance. Circulating anti-p53 or anti-KRas antibodies are associated with gene mutations. We studied whether anti-p53 or anti-KRas auto-antibodies were prognostic factors for response to chemoradiotherapy (CRT) or survival in esophageal carcinoma. Serum p53 and KRas antibodies (abs) were measured using an ELISA method in 97 consecutive patients treated at Saint Louis University Hospital between 1999 and 2002 with CRT for esophageal carcinoma (squamous cell carcinoma (SCCE) 57 patients, adenocarcinoma (ACE) 27 patients). Patient and tumor characteristics, response to treatment and the follow-up status of 84 patients were retrospectively collected. The association between antibodies and patient characteristics was studied. Univariate and multivariate survival analyses were conducted. Twenty-four patients (28%) had anti-p53 abs. Abs were found predominantly in SCCE (p = 0.003). Anti-p53 abs were associated with a shorter overall survival in the univariate analysis (HR 1.8 [1.03-2.9], p = 0.04). In the multivariate analysis, independent prognostic factors for overall and progression-free survival were an objective response to CRT, the CRT strategy (alone or combined with surgery [preoperative]) and anti-p53 abs. None of the long-term survivors had p53 abs. KRas abs were found in 19 patients (23%, no difference according to the histological type). There was no significant association between anti-KRas abs and survival neither in the univariate nor in the multivariate analysis. Neither anti-p53 nor anti-KRas abs were associated with response to CRT. Anti-p53 abs are an independent prognostic factor for esophageal cancer patients treated with CRT. Individualized therapeutic approaches should be evaluated in this population
Blanchard, Pierre; Quero, Laurent; Pacault, Vincent; Schlageter, Marie-Helene; Baruch-Hennequin, Valerie; Hennequin, Christophe
2012-03-26
P53 mutations are an adverse prognostic factor in esophageal cancer. P53 and KRas mutations are involved in chemo-radioresistance. Circulating anti-p53 or anti-KRas antibodies are associated with gene mutations. We studied whether anti-p53 or anti-KRas auto-antibodies were prognostic factors for response to chemoradiotherapy (CRT) or survival in esophageal carcinoma. Serum p53 and KRas antibodies (abs) were measured using an ELISA method in 97 consecutive patients treated at Saint Louis University Hospital between 1999 and 2002 with CRT for esophageal carcinoma (squamous cell carcinoma (SCCE) 57 patients, adenocarcinoma (ACE) 27 patients). Patient and tumor characteristics, response to treatment and the follow-up status of 84 patients were retrospectively collected. The association between antibodies and patient characteristics was studied. Univariate and multivariate survival analyses were conducted. Twenty-four patients (28%) had anti-p53 abs. Abs were found predominantly in SCCE (p = 0.003). Anti-p53 abs were associated with a shorter overall survival in the univariate analysis (HR 1.8 [1.03-2.9], p = 0.04). In the multivariate analysis, independent prognostic factors for overall and progression-free survival were an objective response to CRT, the CRT strategy (alone or combined with surgery [preoperative]) and anti-p53 abs. None of the long-term survivors had p53 abs. KRas abs were found in 19 patients (23%, no difference according to the histological type). There was no significant association between anti-KRas abs and survival neither in the univariate nor in the multivariate analysis. Neither anti-p53 nor anti-KRas abs were associated with response to CRT. Anti-p53 abs are an independent prognostic factor for esophageal cancer patients treated with CRT. Individualized therapeutic approaches should be evaluated in this population.
Directory of Open Access Journals (Sweden)
Blanchard Pierre
2012-03-01
Full Text Available Abstract Background P53 mutations are an adverse prognostic factor in esophageal cancer. P53 and KRas mutations are involved in chemo-radioresistance. Circulating anti-p53 or anti-KRas antibodies are associated with gene mutations. We studied whether anti-p53 or anti-KRas auto-antibodies were prognostic factors for response to chemoradiotherapy (CRT or survival in esophageal carcinoma. Methods Serum p53 and KRas antibodies (abs were measured using an ELISA method in 97 consecutive patients treated at Saint Louis University Hospital between 1999 and 2002 with CRT for esophageal carcinoma (squamous cell carcinoma (SCCE 57 patients, adenocarcinoma (ACE 27 patients. Patient and tumor characteristics, response to treatment and the follow-up status of 84 patients were retrospectively collected. The association between antibodies and patient characteristics was studied. Univariate and multivariate survival analyses were conducted. Results Twenty-four patients (28% had anti-p53 abs. Abs were found predominantly in SCCE (p = 0.003. Anti-p53 abs were associated with a shorter overall survival in the univariate analysis (HR 1.8 [1.03-2.9], p = 0.04. In the multivariate analysis, independent prognostic factors for overall and progression-free survival were an objective response to CRT, the CRT strategy (alone or combined with surgery [preoperative] and anti-p53 abs. None of the long-term survivors had p53 abs. KRas abs were found in 19 patients (23%, no difference according to the histological type. There was no significant association between anti-KRas abs and survival neither in the univariate nor in the multivariate analysis. Neither anti-p53 nor anti-KRas abs were associated with response to CRT. Conclusions Anti-p53 abs are an independent prognostic factor for esophageal cancer patients treated with CRT. Individualized therapeutic approaches should be evaluated in this population.
Degree-based statistic and center persistency for brain connectivity analysis.
Yoo, Kwangsun; Lee, Peter; Chung, Moo K; Sohn, William S; Chung, Sun Ju; Na, Duk L; Ju, Daheen; Jeong, Yong
2017-01-01
Brain connectivity analyses have been widely performed to investigate the organization and functioning of the brain, or to observe changes in neurological or psychiatric conditions. However, connectivity analysis inevitably introduces the problem of mass-univariate hypothesis testing. Although, several cluster-wise correction methods have been suggested to address this problem and shown to provide high sensitivity, these approaches fundamentally have two drawbacks: the lack of spatial specificity (localization power) and the arbitrariness of an initial cluster-forming threshold. In this study, we propose a novel method, degree-based statistic (DBS), performing cluster-wise inference. DBS is designed to overcome the above-mentioned two shortcomings. From a network perspective, a few brain regions are of critical importance and considered to play pivotal roles in network integration. Regarding this notion, DBS defines a cluster as a set of edges of which one ending node is shared. This definition enables the efficient detection of clusters and their center nodes. Furthermore, a new measure of a cluster, center persistency (CP) was introduced. The efficiency of DBS with a known "ground truth" simulation was demonstrated. Then they applied DBS to two experimental datasets and showed that DBS successfully detects the persistent clusters. In conclusion, by adopting a graph theoretical concept of degrees and borrowing the concept of persistence from algebraic topology, DBS could sensitively identify clusters with centric nodes that would play pivotal roles in an effect of interest. DBS is potentially widely applicable to variable cognitive or clinical situations and allows us to obtain statistically reliable and easily interpretable results. Hum Brain Mapp 38:165-181, 2017. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
International Nuclear Information System (INIS)
Shakespeare, T.P.; Mukherjee, R.K.; Gebski, V.J.
2003-01-01
Confidence levels, clinical significance curves, and risk-benefit contours are tools improving analysis of clinical studies and minimizing misinterpretation of published results, however no software has been available for their calculation. The objective was to develop software to help clinicians utilize these tools. Excel 2000 spreadsheets were designed using only built-in functions, without macros. The workbook was protected and encrypted so that users can modify only input cells. The workbook has 4 spreadsheets for use in studies comparing two patient groups. Sheet 1 comprises instructions and graphic examples for use. Sheet 2 allows the user to input the main study results (e.g. survival rates) into a 2-by-2 table. Confidence intervals (95%), p-value and the confidence level for Treatment A being better than Treatment B are automatically generated. An additional input cell allows the user to determine the confidence associated with a specified level of benefit. For example if the user wishes to know the confidence that Treatment A is at least 10% better than B, 10% is entered. Sheet 2 automatically displays clinical significance curves, graphically illustrating confidence levels for all possible benefits of one treatment over the other. Sheet 3 allows input of toxicity data, and calculates the confidence that one treatment is more toxic than the other. It also determines the confidence that the relative toxicity of the most effective arm does not exceed user-defined tolerability. Sheet 4 automatically calculates risk-benefit contours, displaying the confidence associated with a specified scenario of minimum benefit and maximum risk of one treatment arm over the other. The spreadsheet is freely downloadable at www.ontumor.com/professional/statistics.htm A simple, self-explanatory, freely available spreadsheet calculator was developed using Excel 2000. The incorporated decision-making tools can be used for data analysis and improve the reporting of results of any
International Nuclear Information System (INIS)
Tariq, Saadia R.; Shah, Munir H.; Shaheen, Nazia
2009-01-01
Two tanning units of Pakistan, namely, Kasur and Mian Channun were investigated with respect to the tanning processes (chrome and vegetable, respectively) and the effects of the tanning agents on the quality of soil in vicinity of tanneries were evaluated. The effluent and soil samples from 16 tanneries each of Kasur and Mian Channun were collected. The levels of selected metals (Na, K, Ca, Mg, Fe, Cr, Mn, Co, Cd, Ni, Pb and Zn) were determined by using flame atomic absorption spectrophotometer under optimum analytical conditions. The data thus obtained were subjected to univariate and multivariate statistical analyses. Most of the metals exhibited considerably higher concentrations in the effluents and soils of Kasur compared with those of Mian Channun. It was observed that the soil of Kasur was highly contaminated by Na, K, Ca and Mg emanating from various processes of leather manufacture. Furthermore, the levels of Cr were also present at much enhanced levels than its background concentration due to the adoption of chrome tanning. The levels of Cr determined in soil samples collected from the vicinity of Mian Channun tanneries were almost comparable to the background levels. The soil of this city was found to have contaminated only by the metals originating from pre-tanning processes. The apportionment of selected metals in the effluent and soil samples was determined by a multivariate cluster analysis, which revealed significant differences in chrome and vegetable tanning processes.
Tariq, Saadia R; Shah, Munir H; Shaheen, Nazia
2009-09-30
Two tanning units of Pakistan, namely, Kasur and Mian Channun were investigated with respect to the tanning processes (chrome and vegetable, respectively) and the effects of the tanning agents on the quality of soil in vicinity of tanneries were evaluated. The effluent and soil samples from 16 tanneries each of Kasur and Mian Channun were collected. The levels of selected metals (Na, K, Ca, Mg, Fe, Cr, Mn, Co, Cd, Ni, Pb and Zn) were determined by using flame atomic absorption spectrophotometer under optimum analytical conditions. The data thus obtained were subjected to univariate and multivariate statistical analyses. Most of the metals exhibited considerably higher concentrations in the effluents and soils of Kasur compared with those of Mian Channun. It was observed that the soil of Kasur was highly contaminated by Na, K, Ca and Mg emanating from various processes of leather manufacture. Furthermore, the levels of Cr were also present at much enhanced levels than its background concentration due to the adoption of chrome tanning. The levels of Cr determined in soil samples collected from the vicinity of Mian Channun tanneries were almost comparable to the background levels. The soil of this city was found to have contaminated only by the metals originating from pre-tanning processes. The apportionment of selected metals in the effluent and soil samples was determined by a multivariate cluster analysis, which revealed significant differences in chrome and vegetable tanning processes.
Energy Technology Data Exchange (ETDEWEB)
Tariq, Saadia R. [Department of Chemistry, Lahore College for Women University, Lahore (Pakistan); Shah, Munir H., E-mail: munir_qau@yahoo.com [Department of Chemistry, Quaid-i-Azam University, Islamabad 45320 (Pakistan); Shaheen, Nazia [Department of Chemistry, Quaid-i-Azam University, Islamabad 45320 (Pakistan)
2009-09-30
Two tanning units of Pakistan, namely, Kasur and Mian Channun were investigated with respect to the tanning processes (chrome and vegetable, respectively) and the effects of the tanning agents on the quality of soil in vicinity of tanneries were evaluated. The effluent and soil samples from 16 tanneries each of Kasur and Mian Channun were collected. The levels of selected metals (Na, K, Ca, Mg, Fe, Cr, Mn, Co, Cd, Ni, Pb and Zn) were determined by using flame atomic absorption spectrophotometer under optimum analytical conditions. The data thus obtained were subjected to univariate and multivariate statistical analyses. Most of the metals exhibited considerably higher concentrations in the effluents and soils of Kasur compared with those of Mian Channun. It was observed that the soil of Kasur was highly contaminated by Na, K, Ca and Mg emanating from various processes of leather manufacture. Furthermore, the levels of Cr were also present at much enhanced levels than its background concentration due to the adoption of chrome tanning. The levels of Cr determined in soil samples collected from the vicinity of Mian Channun tanneries were almost comparable to the background levels. The soil of this city was found to have contaminated only by the metals originating from pre-tanning processes. The apportionment of selected metals in the effluent and soil samples was determined by a multivariate cluster analysis, which revealed significant differences in chrome and vegetable tanning processes.
Does BMI influence hospital stay and morbidity after fast-track hip and knee arthroplasty?
DEFF Research Database (Denmark)
Husted, Henrik; Jørgensen, Christoffer C.; Gromov, Kirill
2016-01-01
-day re-admission rates were around 6% for both THA (6.1%) and TKA (5.9%), without any statistically significant differences between BMI groups in univariate analysis (p > 0.4), but there was a trend of a protective effect of overweight for both THA (p = 0.1) and TKA (p = 0.06). 90-day re...
Review of the Statistical Techniques in Medical Sciences | Okeh ...
African Journals Online (AJOL)
... medical researcher in selecting the appropriate statistical techniques. Of course, all statistical techniques have certain underlying assumptions, which must be checked before the technique is applied. Keywords: Variable, Prospective Studies, Retrospective Studies, Statistical significance. Bio-Research Vol. 6 (1) 2008: pp.
Directory of Open Access Journals (Sweden)
Eun-Joo Jung
2011-01-01
Full Text Available Background: The over expression of fascin, extracellular matrix metalloproteinase inducer (EMMPRIN, and ezrin proteins has been associated with poor prognosis in various carcinomas and sarcomas. However, very few studies have reported the relationship between the expression of fascin, EMMPRIN, and ezrin proteins and the clinico-pathologic parameters of colorectal carcinomas. Aims: The aim was to investigate the relationship between fascin, EMMPRIN, and ezrin proteins in colorectal adenocarcinomas and their correlation with clinico-pathologic parameters. Settings and Design: The expression of fascin, EMMPRIN, and ezrin proteins was studied in 210 colorectal adenocarcinoma patients through immunohistochemical staining. Materials and Methods: Immunohistochemical staining by the avidin-biotin peroxidase method was done. The scoring of each protein expression was done and divided into three groups (negative, low-, and high-expression groups. Statistical Analysis: A chi-square test, and Kendall′s tau-b correlation test were used for comparing. Survival analysis was performed using the Kaplan-Meier method with log-rank tests and the Cox proportional hazard model. Results: The percentages of the high-expression group of fascin, EMMPRIN, and ezrin proteins in colorectal adenocarcinomas were 24%, 73%, and 62%, respectively. Weak positive correlations were observed among these protein expressions. An increased expression of the fascin protein was significantly associated with advanced tumor depth and shorter survival times, and a high expression of fascin protein was an independent prognostic factor in univariate and multivariate survival analyses. EMMPRIN and ezrin protein expressions were not associated with the clinico-pathologic parameters. Conclusions: The high expression of fascin protein may be an unfavorable prognostic marker for individual colorectal cancer patients.
Clinical significance of serum anti-human papillomavirus 16 and 18 antibodies in cervical neoplasia.
Chay, Doo Byung; Cho, Hanbyoul; Kim, Bo Wook; Kang, Eun Suk; Song, Eunseop; Kim, Jae-Hoon
2013-02-01
To estimate the clinical significance of serum anti-human papillomavirus (HPV) antibodies and high-risk cervical HPV DNA in cervical neoplasia. The study population comprised patients who were histopathologically diagnosed with cervical intraepithelial neoplasia (CIN) 1 (n=64), CIN 2 and 3 (n=241), cervical cancer (n=170), and normal control participants (n=975). Cervical HPV DNA tests were performed through nucleic acid hybridization assay tests, and serum anti-HPV 16 and 18 antibodies were measured by competitive immunoassay. The associations of HPV DNA and anti-HPV antibodies were evaluated with demographic characteristics and compared according to the levels of disease severity. Anti-HPV antibodies were also investigated with clinicopathologic parameters, including survival data. Among various demographic characteristics, factors involving sexual behavior had a higher tendency of HPV DNA positivity and HPV seropositivity. Human papillomavirus DNA mean titer and positivity were both increased in patients with cervical neoplasia compared with those with normal control participants, but there was no statistical difference among types of cervical neoplasia. Serum anti-HPV 16 antibodies were also able to differentiate cervical neoplasia from a normal control participant and furthermore distinguished CIN 1 from CIN 2 and 3 (odd ratio 2.87 [1.43-5.78], P=.002). In cervical cancer, HPV 16 seropositivity was associated with prolonged disease-free survival according to the univariable analysis (hazard ratio=0.12 [0.01-0.94], P=.044). Serum anti-HPV 16 antibodies can distinguish cervical neoplasia from a normal control and has the advantage of identifying high-grade CIN. Moreover, in cervical cancer, HPV 16 seropositivity may be associated with a more favorable prognosis. II.
Pardo-Igúzquiza, Eulogio; Rodríguez-Tovar, Francisco J.
2012-12-01
Many spectral analysis techniques have been designed assuming sequences taken with a constant sampling interval. However, there are empirical time series in the geosciences (sediment cores, fossil abundance data, isotope analysis, …) that do not follow regular sampling because of missing data, gapped data, random sampling or incomplete sequences, among other reasons. In general, interpolating an uneven series in order to obtain a succession with a constant sampling interval alters the spectral content of the series. In such cases it is preferable to follow an approach that works with the uneven data directly, avoiding the need for an explicit interpolation step. The Lomb-Scargle periodogram is a popular choice in such circumstances, as there are programs available in the public domain for its computation. One new computer program for spectral analysis improves the standard Lomb-Scargle periodogram approach in two ways: (1) It explicitly adjusts the statistical significance to any bias introduced by variance reduction smoothing, and (2) it uses a permutation test to evaluate confidence levels, which is better suited than parametric methods when neighbouring frequencies are highly correlated. Another novel program for cross-spectral analysis offers the advantage of estimating the Lomb-Scargle cross-periodogram of two uneven time series defined on the same interval, and it evaluates the confidence levels of the estimated cross-spectra by a non-parametric computer intensive permutation test. Thus, the cross-spectrum, the squared coherence spectrum, the phase spectrum, and the Monte Carlo statistical significance of the cross-spectrum and the squared-coherence spectrum can be obtained. Both of the programs are written in ANSI Fortran 77, in view of its simplicity and compatibility. The program code is of public domain, provided on the website of the journal (http://www.iamg.org/index.php/publisher/articleview/frmArticleID/112/). Different examples (with simulated and
Directory of Open Access Journals (Sweden)
Sudeepa Bhattacharyya
2006-01-01
Full Text Available Multiple Myeloma (MM is a severely debilitating neoplastic disease of B cell origin, with the primary source of morbidity and mortality associated with unrestrained bone destruction. Surface enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS was used to screen for potential biomarkers indicative of skeletal involvement in patients with MM. Serum samples from 48 MM patients, 24 with more than three bone lesions and 24 with no evidence of bone lesions were fractionated and analyzed in duplicate using copper ion loaded immobilized metal affinity SELDI chip arrays. The spectra obtained were compiled, normalized, and mass peaks with mass-to-charge ratios (m/z between 2000 and 20,000 Da identified. Peak information from all fractions was combined together and analyzed using univariate statistics, as well as a linear, partial least squares discriminant analysis (PLS-DA, and a non-linear, random forest (RF, classification algorithm. The PLS-DA model resulted in prediction accuracy between 96–100%, while the RF model was able to achieve a specificity and sensitivity of 87.5% each. Both models as well as multiple comparison adjusted univariate analysis identified a set of four peaks that were the most discriminating between the two groups of patients and hold promise as potential biomarkers for future diagnostic and/or therapeutic purposes.
International Nuclear Information System (INIS)
Lim, Gyeong Hui
2008-03-01
This book consists of 15 chapters, which are basic conception and meaning of statistical thermodynamics, Maxwell-Boltzmann's statistics, ensemble, thermodynamics function and fluctuation, statistical dynamics with independent particle system, ideal molecular system, chemical equilibrium and chemical reaction rate in ideal gas mixture, classical statistical thermodynamics, ideal lattice model, lattice statistics and nonideal lattice model, imperfect gas theory on liquid, theory on solution, statistical thermodynamics of interface, statistical thermodynamics of a high molecule system and quantum statistics
Directory of Open Access Journals (Sweden)
Isabel eValli
2016-04-01
Full Text Available The identification of individuals at high risk of developing psychosis is entirely based on clinical assessment, associated with limited predictive potential. There is therefore increasing interest in the development of biological markers that could be used in clinical practice for this purpose. We studied 25 individuals with an At Risk Mental State for psychosis and 25 healthy controls using structural MRI, and functional MRI in conjunction with a verbal memory task. Data were analysed using a standard univariate analysis, and with Support Vector Machine (SVM, a multivariate pattern recognition technique that enables statistical inferences to be made at the level of the individual, yielding results with high translational potential. The application of SVM to structural MRI data permitted the identification of individuals at high risk of psychosis with a sensitivity of 68% and a specificity of 76%, resulting in an accuracy of 72% (p<0.001. Univariate volumetric between-group differences did not reach statistical significance. In contrast, the univariate fMRI analysis identified between-group differences (p<0.05 corrected while the application of SVM to the same data did not. Since SVM is well suited at identifying the pattern of abnormality that distinguishes two groups, whereas univariate methods are more likely to identify regions that individually are most different between two groups, our results suggest the presence of focal functional abnormalities in the context of a diffuse pattern of structural abnormalities in individuals at high clinical risk of psychosis.
[Statistics for statistics?--Thoughts about psychological tools].
Berger, Uwe; Stöbel-Richter, Yve
2007-12-01
Statistical methods take a prominent place among psychologists' educational programs. Being known as difficult to understand and heavy to learn, students fear of these contents. Those, who do not aspire after a research carrier at the university, will forget the drilled contents fast. Furthermore, because it does not apply for the work with patients and other target groups at a first glance, the methodological education as a whole was often questioned. For many psychological practitioners the statistical education makes only sense by enforcing respect against other professions, namely physicians. For the own business, statistics is rarely taken seriously as a professional tool. The reason seems to be clear: Statistics treats numbers, while psychotherapy treats subjects. So, does statistics ends in itself? With this article, we try to answer the question, if and how statistical methods were represented within the psychotherapeutical and psychological research. Therefore, we analyzed 46 Originals of a complete volume of the journal Psychotherapy, Psychosomatics, Psychological Medicine (PPmP). Within the volume, 28 different analyse methods were applied, from which 89 per cent were directly based upon statistics. To be able to write and critically read Originals as a backbone of research, presumes a high degree of statistical education. To ignore statistics means to ignore research and at least to reveal the own professional work to arbitrariness.
On a curvature-statistics theorem
International Nuclear Information System (INIS)
Calixto, M; Aldaya, V
2008-01-01
The spin-statistics theorem in quantum field theory relates the spin of a particle to the statistics obeyed by that particle. Here we investigate an interesting correspondence or connection between curvature (κ = ±1) and quantum statistics (Fermi-Dirac and Bose-Einstein, respectively). The interrelation between both concepts is established through vacuum coherent configurations of zero modes in quantum field theory on the compact O(3) and noncompact O(2; 1) (spatial) isometry subgroups of de Sitter and Anti de Sitter spaces, respectively. The high frequency limit, is retrieved as a (zero curvature) group contraction to the Newton-Hooke (harmonic oscillator) group. We also make some comments on the physical significance of the vacuum energy density and the cosmological constant problem.
On a curvature-statistics theorem
Energy Technology Data Exchange (ETDEWEB)
Calixto, M [Departamento de Matematica Aplicada y Estadistica, Universidad Politecnica de Cartagena, Paseo Alfonso XIII 56, 30203 Cartagena (Spain); Aldaya, V [Instituto de Astrofisica de Andalucia, Apartado Postal 3004, 18080 Granada (Spain)], E-mail: Manuel.Calixto@upct.es
2008-08-15
The spin-statistics theorem in quantum field theory relates the spin of a particle to the statistics obeyed by that particle. Here we investigate an interesting correspondence or connection between curvature ({kappa} = {+-}1) and quantum statistics (Fermi-Dirac and Bose-Einstein, respectively). The interrelation between both concepts is established through vacuum coherent configurations of zero modes in quantum field theory on the compact O(3) and noncompact O(2; 1) (spatial) isometry subgroups of de Sitter and Anti de Sitter spaces, respectively. The high frequency limit, is retrieved as a (zero curvature) group contraction to the Newton-Hooke (harmonic oscillator) group. We also make some comments on the physical significance of the vacuum energy density and the cosmological constant problem.
Testing statistical hypotheses of equivalence
Wellek, Stefan
2010-01-01
Equivalence testing has grown significantly in importance over the last two decades, especially as its relevance to a variety of applications has become understood. Yet published work on the general methodology remains scattered in specialists' journals, and for the most part, it focuses on the relatively narrow topic of bioequivalence assessment.With a far broader perspective, Testing Statistical Hypotheses of Equivalence provides the first comprehensive treatment of statistical equivalence testing. The author addresses a spectrum of specific, two-sided equivalence testing problems, from the
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
Prognostic and survival analysis of 837 Chinese colorectal cancer patients.
Yuan, Ying; Li, Mo-Dan; Hu, Han-Guang; Dong, Cai-Xia; Chen, Jia-Qi; Li, Xiao-Fen; Li, Jing-Jing; Shen, Hong
2013-05-07
To develop a prognostic model to predict survival of patients with colorectal cancer (CRC). Survival data of 837 CRC patients undergoing surgery between 1996 and 2006 were collected and analyzed by univariate analysis and Cox proportional hazard regression model to reveal the prognostic factors for CRC. All data were recorded using a standard data form and analyzed using SPSS version 18.0 (SPSS, Chicago, IL, United States). Survival curves were calculated by the Kaplan-Meier method. The log rank test was used to assess differences in survival. Univariate hazard ratios and significant and independent predictors of disease-specific survival and were identified by Cox proportional hazard analysis. The stepwise procedure was set to a threshold of 0.05. Statistical significance was defined as P analysis suggested age, preoperative obstruction, serum carcinoembryonic antigen level at diagnosis, status of resection, tumor size, histological grade, pathological type, lymphovascular invasion, invasion of adjacent organs, and tumor node metastasis (TNM) staging were positive prognostic factors (P analysis showed a significant statistical difference in 3-year survival among these groups: LNR1, 73%; LNR2, 55%; and LNR3, 42% (P analysis results showed that histological grade, depth of bowel wall invasion, and number of metastatic lymph nodes were the most important prognostic factors for CRC if we did not consider the interaction of the TNM staging system (P < 0.05). When the TNM staging was taken into account, histological grade lost its statistical significance, while the specific TNM staging system showed a statistically significant difference (P < 0.0001). The overall survival of CRC patients has improved between 1996 and 2006. LNR is a powerful factor for estimating the survival of stage III CRC patients.
International Nuclear Information System (INIS)
Dai, Wu-Sheng; Xie, Mi
2013-01-01
In this paper, we give a general discussion on the calculation of the statistical distribution from a given operator relation of creation, annihilation, and number operators. Our result shows that as long as the relation between the number operator and the creation and annihilation operators can be expressed as a † b=Λ(N) or N=Λ −1 (a † b), where N, a † , and b denote the number, creation, and annihilation operators, i.e., N is a function of quadratic product of the creation and annihilation operators, the corresponding statistical distribution is the Gentile distribution, a statistical distribution in which the maximum occupation number is an arbitrary integer. As examples, we discuss the statistical distributions corresponding to various operator relations. In particular, besides the Bose–Einstein and Fermi–Dirac cases, we discuss the statistical distributions for various schemes of intermediate statistics, especially various q-deformation schemes. Our result shows that the statistical distributions corresponding to various q-deformation schemes are various Gentile distributions with different maximum occupation numbers which are determined by the deformation parameter q. This result shows that the results given in much literature on the q-deformation distribution are inaccurate or incomplete. -- Highlights: ► A general discussion on calculating statistical distribution from relations of creation, annihilation, and number operators. ► A systemic study on the statistical distributions corresponding to various q-deformation schemes. ► Arguing that many results of q-deformation distributions in literature are inaccurate or incomplete
Statistical and theoretical research
International Nuclear Information System (INIS)
Anon.
1983-01-01
Significant accomplishments include the creation of field designs to detect population impacts, new census procedures for small mammals, and methods for designing studies to determine where and how much of a contaminant is extent over certain landscapes. A book describing these statistical methods is currently being written and will apply to a variety of environmental contaminants, including radionuclides. PNL scientists also have devised an analytical method for predicting the success of field eexperiments on wild populations. Two highlights of current research are the discoveries that population of free-roaming horse herds can double in four years and that grizzly bear populations may be substantially smaller than once thought. As stray horses become a public nuisance at DOE and other large Federal sites, it is important to determine their number. Similar statistical theory can be readily applied to other situations where wild animals are a problem of concern to other government agencies. Another book, on statistical aspects of radionuclide studies, is written specifically for researchers in radioecology
Significance analysis of lexical bias in microarray data
Directory of Open Access Journals (Sweden)
Falkow Stanley
2003-04-01
Full Text Available Abstract Background Genes that are determined to be significantly differentially regulated in microarray analyses often appear to have functional commonalities, such as being components of the same biochemical pathway. This results in certain words being under- or overrepresented in the list of genes. Distinguishing between biologically meaningful trends and artifacts of annotation and analysis procedures is of the utmost importance, as only true biological trends are of interest for further experimentation. A number of sophisticated methods for identification of significant lexical trends are currently available, but these methods are generally too cumbersome for practical use by most microarray users. Results We have developed a tool, LACK, for calculating the statistical significance of apparent lexical bias in microarray datasets. The frequency of a user-specified list of search terms in a list of genes which are differentially regulated is assessed for statistical significance by comparison to randomly generated datasets. The simplicity of the input files and user interface targets the average microarray user who wishes to have a statistical measure of apparent lexical trends in analyzed datasets without the need for bioinformatics skills. The software is available as Perl source or a Windows executable. Conclusion We have used LACK in our laboratory to generate biological hypotheses based on our microarray data. We demonstrate the program's utility using an example in which we confirm significant upregulation of SPI-2 pathogenicity island of Salmonella enterica serovar Typhimurium by the cation chelator dipyridyl.
The measure and significance of Bateman's principles.
Collet, Julie M; Dean, Rebecca F; Worley, Kirsty; Richardson, David S; Pizzari, Tommaso
2014-05-07
Bateman's principles explain sex roles and sexual dimorphism through sex-specific variance in mating success, reproductive success and their relationships within sexes (Bateman gradients). Empirical tests of these principles, however, have come under intense scrutiny. Here, we experimentally show that in replicate groups of red junglefowl, Gallus gallus, mating and reproductive successes were more variable in males than in females, resulting in a steeper male Bateman gradient, consistent with Bateman's principles. However, we use novel quantitative techniques to reveal that current methods typically overestimate Bateman's principles because they (i) infer mating success indirectly from offspring parentage, and thus miss matings that fail to result in fertilization, and (ii) measure Bateman gradients through the univariate regression of reproductive over mating success, without considering the substantial influence of other components of male reproductive success, namely female fecundity and paternity share. We also find a significant female Bateman gradient but show that this likely emerges as spurious consequences of male preference for fecund females, emphasizing the need for experimental approaches to establish the causal relationship between reproductive and mating success. While providing qualitative support for Bateman's principles, our study demonstrates how current approaches can generate a misleading view of sex differences and roles.
Whither Statistics Education Research?
Watson, Jane
2016-01-01
This year marks the 25th anniversary of the publication of a "National Statement on Mathematics for Australian Schools", which was the first curriculum statement this country had including "Chance and Data" as a significant component. It is hence an opportune time to survey the history of the related statistics education…
Energy Technology Data Exchange (ETDEWEB)
Pitkaenen, M.T.; Manninen, H.I.; Lindgren, K.-A.J.; Sihvonen, T.A.; Airaksinen, O.; Soimakallio, S
2002-07-01
AIM: To identify plain radiographic findings that predict segmental lumbar spine instability as shown by functional flexion-extension radiography. MATERIALS AND METHODS: Plain radiographs and flexion-extension radiographs of 215 patients with clinically suspected lumbar spine instability were analysed. Instability was classified into anterior or posterior sliding instability. The registered plain radiographic findings were traction spur, spondylarthrosis, arthrosis of facet joints, disc degeneration, retrolisthesis, degenerative spondylolisthesis, spondylolytic spondylolisthesis and vacuum phenomena. Factors reaching statistical significance in univariate analyses (P < 0.05) were included in stepwise multiple logistic regression analysis. RESULTS: Degenerative spondylolisthesis (P = 0.004 at L3-4 level and P = 0.017 at L4-5 level in univariate analysis and odds ratio 16.92 at L4-5 level in multiple logistic regression analyses) and spondylolytic spondylolisthesis (P = 0.003 at L5-S1 level in univariate analyses) were the strongest independent determinants of anterior sliding instability. Retrolisthesis (odds ratio 10.97), traction spur (odds ratio 4.45) and spondylarthrosis (odds ratio 3.20) at L3-4 level were statistically significant determinants of posterior sliding instability in multivariate analysis. CONCLUSION: Sliding instability is strongly associated with various plain radiographic findings. In mechanical back pain, functional flexion-extension radiographs should be limited to situations when symptoms are not explained by findings of plain radiographs and/or when they are likely to alter therapy. Pitkaenen, M.T. et al. (2002)
International Nuclear Information System (INIS)
Pitkaenen, M.T.; Manninen, H.I.; Lindgren, K.-A.J.; Sihvonen, T.A.; Airaksinen, O.; Soimakallio, S.
2002-01-01
AIM: To identify plain radiographic findings that predict segmental lumbar spine instability as shown by functional flexion-extension radiography. MATERIALS AND METHODS: Plain radiographs and flexion-extension radiographs of 215 patients with clinically suspected lumbar spine instability were analysed. Instability was classified into anterior or posterior sliding instability. The registered plain radiographic findings were traction spur, spondylarthrosis, arthrosis of facet joints, disc degeneration, retrolisthesis, degenerative spondylolisthesis, spondylolytic spondylolisthesis and vacuum phenomena. Factors reaching statistical significance in univariate analyses (P < 0.05) were included in stepwise multiple logistic regression analysis. RESULTS: Degenerative spondylolisthesis (P = 0.004 at L3-4 level and P = 0.017 at L4-5 level in univariate analysis and odds ratio 16.92 at L4-5 level in multiple logistic regression analyses) and spondylolytic spondylolisthesis (P = 0.003 at L5-S1 level in univariate analyses) were the strongest independent determinants of anterior sliding instability. Retrolisthesis (odds ratio 10.97), traction spur (odds ratio 4.45) and spondylarthrosis (odds ratio 3.20) at L3-4 level were statistically significant determinants of posterior sliding instability in multivariate analysis. CONCLUSION: Sliding instability is strongly associated with various plain radiographic findings. In mechanical back pain, functional flexion-extension radiographs should be limited to situations when symptoms are not explained by findings of plain radiographs and/or when they are likely to alter therapy. Pitkaenen, M.T. et al. (2002)
International Nuclear Information System (INIS)
Choi, Hye Jin; Kang, Chang Moo; Lee, Woo Jung; Jo, Kwanhyeong; Lee, Jong Doo; Lee, Jae-Hoon; Ryu, Young Hoon
2015-01-01
The purpose of this study was to investigate the prognostic value of 18 F-fluorodeoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT) in patients with ampullary adenocarcinoma (AAC) after curative surgical resection. Fifty-two patients with AAC who had undergone 18 F-FDG PET/CT and subsequent curative resections were retrospectively enrolled. The maximum standardized uptake value (SUV max ) and tumor to background ratio (TBR) were measured on 18 F-FDG PET/CT in all patients. The prognostic significances of PET/CT parameters and clinicopathologic factors for recurrence-free survival (RFS) and overall survival (OS) were evaluated by univariate and multivariate analyses. Of the 52 patients, 19 (36.5 %) experienced tumor recurrence during the follow-up period and 18 (35.8 %) died. The 3-year RFS and OS were 62.3 and 61.5 %, respectively. Preoperative CA19-9 level, tumor differentiation, presence of lymph node metastasis, SUV max , and TBR were significant prognostic factors for both RFS and OS (p < 0.05) on univariate analyses, and patient age showed significance only for predicting RFS (p < 0.05). On multivariate analyses, SUV max and TBR were independent prognostic factors for RFS, and tumor differentiation, SUV max , and TBR were independent prognostic factors for OS. SUV max and TBR on preoperative 18 F-FDG PET/CT are independent prognostic factors for predicting RFS and OS in patients with AAC; patients with high SUV max (>4.80) or TBR (>1.75) had poor survival outcomes. The role of and indications for adjuvant therapy after curative resection of AAC are still unclear. 18 F-FDG uptake in the primary tumor could provide additive prognostic information for the decision-making process regarding adjuvant therapy. (orig.)
Hayslett, H T
1991-01-01
Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
Tumur, Odgerel; Soon, Kean; Brown, Fraser; Mykytowycz, Marcus
2013-06-01
The aims of our study were to evaluate the effect of application of Adaptive Statistical Iterative Reconstruction (ASIR) algorithm on the radiation dose of coronary computed tomography angiography (CCTA) and its effects on image quality of CCTA and to evaluate the effects of various patient and CT scanning factors on the radiation dose of CCTA. This was a retrospective study that included 347 consecutive patients who underwent CCTA at a tertiary university teaching hospital between 1 July 2009 and 20 September 2011. Analysis was performed comparing patient demographics, scan characteristics, radiation dose and image quality in two groups of patients in whom conventional Filtered Back Projection (FBP) or ASIR was used for image reconstruction. There were 238 patients in the FBP group and 109 patients in the ASIR group. There was no difference between the groups in the use of prospective gating, scan length or tube voltage. In ASIR group, significantly lower tube current was used compared with FBP group, 550 mA (450-600) vs. 650 mA (500-711.25) (median (interquartile range)), respectively, P ASIR group compared with FBP group, 4.29 mSv (2.84-6.02) vs. 5.84 mSv (3.88-8.39) (median (interquartile range)), respectively, P ASIR was associated with increased image noise compared with FBP (39.93 ± 10.22 vs. 37.63 ± 18.79 (mean ± standard deviation), respectively, P ASIR reduces the radiation dose of CCTA without affecting the image quality. © 2013 The Authors. Journal of Medical Imaging and Radiation Oncology © 2013 The Royal Australian and New Zealand College of Radiologists.
de la Motte Rouge, Thibault; Pautier, Patricia; Genestie, Catherine; Rey, Annie; Gouy, Sébastien; Leary, Alexandra; Haie-Meder, Christine; Kerbrat, Pierre; Culine, Stéphane; Fizazi, Karim; Lhommé, Catherine
2016-09-01
The ovarian yolk sac tumor (OYST) is a very rare malignancy arising in young women. Our objective was to determine whether an early decline in serum alpha-fetoprotein (AFP) during chemotherapy has a prognostic impact. This retrospective study is based on prospectively recorded OYST cases at Gustave Roussy (Cancer Treatment Center). Survival curves were estimated using the Kaplan-Meier method. The serum AFP decline was calculated with the formula previously developed and validated in male patients with poor prognosis non-seminomatous germ cell tumors. Univariate and multivariate analyses were performed using the log-rank test and logistic regression, respectively. Data on AFP were available to calculate an early AFP decline in 57 patients. All patients had undergone surgery followed by chemotherapy. The 5-year overall survival (OS) and event-free survival (EFS) rates were 86% (95% CI: 74%-93%) and 84% (95% CI: 73%-91%), respectively. The disease stage, presence of ascites at presentation, use of the BEP regimen, serum AFP half-life and an early AFP decline were significantly predictive factors for OS and EFS in the univariate analysis. The OS rate was 100% and 49% (95% CI: 26%-72%) in patients with a favorable AFP decline and in those with an unfavorable decline, respectively (p<0.001). In the multivariate analysis, only the presence of ascites at diagnosis (RR=7.3, p=0.03) and an unfavorable early AFP decline (RR=16.9, p<0.01) were significant negative predictive factors for OS. An early AFP decline during chemotherapy is an independent prognostic factor in patients with OYSTs. No conflict of interest. Copyright © 2016. Published by Elsevier Inc.
Industrial commodity statistics yearbook 2001. Production statistics (1992-2001)
International Nuclear Information System (INIS)
2003-01-01
This is the thirty-fifth in a series of annual compilations of statistics on world industry designed to meet both the general demand for information of this kind and the special requirements of the United Nations and related international bodies. Beginning with the 1992 edition, the title of the publication was changed to industrial Commodity Statistics Yearbook as the result of a decision made by the United Nations Statistical Commission at its twenty-seventh session to discontinue, effective 1994, publication of the Industrial Statistics Yearbook, volume I, General Industrial Statistics by the Statistics Division of the United Nations. The United Nations Industrial Development Organization (UNIDO) has become responsible for the collection and dissemination of general industrial statistics while the Statistics Division of the United Nations continues to be responsible for industrial commodity production statistics. The previous title, Industrial Statistics Yearbook, volume II, Commodity Production Statistics, was introduced in the 1982 edition. The first seven editions in this series were published under the title The Growth of World industry and the next eight editions under the title Yearbook of Industrial Statistics. This edition of the Yearbook contains annual quantity data on production of industrial commodities by country, geographical region, economic grouping and for the world. A standard list of about 530 commodities (about 590 statistical series) has been adopted for the publication. The statistics refer to the ten-year period 1992-2001 for about 200 countries and areas
Industrial commodity statistics yearbook 2002. Production statistics (1993-2002)
International Nuclear Information System (INIS)
2004-01-01
This is the thirty-sixth in a series of annual compilations of statistics on world industry designed to meet both the general demand for information of this kind and the special requirements of the United Nations and related international bodies. Beginning with the 1992 edition, the title of the publication was changed to industrial Commodity Statistics Yearbook as the result of a decision made by the United Nations Statistical Commission at its twenty-seventh session to discontinue, effective 1994, publication of the Industrial Statistics Yearbook, volume I, General Industrial Statistics by the Statistics Division of the United Nations. The United Nations Industrial Development Organization (UNIDO) has become responsible for the collection and dissemination of general industrial statistics while the Statistics Division of the United Nations continues to be responsible for industrial commodity production statistics. The previous title, Industrial Statistics Yearbook, volume II, Commodity Production Statistics, was introduced in the 1982 edition. The first seven editions in this series were published under the title 'The Growth of World industry' and the next eight editions under the title 'Yearbook of Industrial Statistics'. This edition of the Yearbook contains annual quantity data on production of industrial commodities by country, geographical region, economic grouping and for the world. A standard list of about 530 commodities (about 590 statistical series) has been adopted for the publication. The statistics refer to the ten-year period 1993-2002 for about 200 countries and areas
Industrial commodity statistics yearbook 2000. Production statistics (1991-2000)
International Nuclear Information System (INIS)
2002-01-01
This is the thirty-third in a series of annual compilations of statistics on world industry designed to meet both the general demand for information of this kind and the special requirements of the United Nations and related international bodies. Beginning with the 1992 edition, the title of the publication was changed to industrial Commodity Statistics Yearbook as the result of a decision made by the United Nations Statistical Commission at its twenty-seventh session to discontinue, effective 1994, publication of the Industrial Statistics Yearbook, volume I, General Industrial Statistics by the Statistics Division of the United Nations. The United Nations Industrial Development Organization (UNIDO) has become responsible for the collection and dissemination of general industrial statistics while the Statistics Division of the United Nations continues to be responsible for industrial commodity production statistics. The previous title, Industrial Statistics Yearbook, volume II, Commodity Production Statistics, was introduced in the 1982 edition. The first seven editions in this series were published under the title The Growth of World industry and the next eight editions under the title Yearbook of Industrial Statistics. This edition of the Yearbook contains annual quantity data on production of industrial commodities by country, geographical region, economic grouping and for the world. A standard list of about 530 commodities (about 590 statistical series) has been adopted for the publication. Most of the statistics refer to the ten-year period 1991-2000 for about 200 countries and areas
Directory of Open Access Journals (Sweden)
Lutz Bornmann
Full Text Available Using the InCites tool of Thomson Reuters, this study compares normalized citation impact values calculated for China, Japan, France, Germany, United States, and the UK throughout the time period from 1981 to 2010. InCites offers a unique opportunity to study the normalized citation impacts of countries using (i a long publication window (1981 to 2010, (ii a differentiation in (broad or more narrow subject areas, and (iii allowing for the use of statistical procedures in order to obtain an insightful investigation of national citation trends across the years. Using four broad categories, our results show significantly increasing trends in citation impact values for France, the UK, and especially Germany across the last thirty years in all areas. The citation impact of papers from China is still at a relatively low level (mostly below the world average, but the country follows an increasing trend line. The USA exhibits a stable pattern of high citation impact values across the years. With small impact differences between the publication years, the US trend is increasing in engineering and technology but decreasing in medical and health sciences as well as in agricultural sciences. Similar to the USA, Japan follows increasing as well as decreasing trends in different subject areas, but the variability across the years is small. In most of the years, papers from Japan perform below or approximately at the world average in each subject area.
Holtzman, Jessica N; Miller, Shefali; Hooshmand, Farnaz; Wang, Po W; Chang, Kiki D; Hill, Shelley J; Rasgon, Natalie L; Ketter, Terence A
2015-07-01
The strengths and limitations of considering childhood-and adolescent-onset bipolar disorder (BD) separately versus together remain to be established. We assessed this issue. BD patients referred to the Stanford Bipolar Disorder Clinic during 2000-2011 were assessed with the Systematic Treatment Enhancement Program for BD Affective Disorders Evaluation. Patients with childhood- and adolescent-onset were compared to those with adult-onset for 7 unfavorable bipolar illness characteristics with replicated associations with early-onset patients. Among 502 BD outpatients, those with childhood- (adolescent- (13-18 years, N=218) onset had significantly higher rates for 4/7 unfavorable illness characteristics, including lifetime comorbid anxiety disorder, at least ten lifetime mood episodes, lifetime alcohol use disorder, and prior suicide attempt, than those with adult-onset (>18 years, N=174). Childhood- but not adolescent-onset BD patients also had significantly higher rates of first-degree relative with mood disorder, lifetime substance use disorder, and rapid cycling in the prior year. Patients with pooled childhood/adolescent - compared to adult-onset had significantly higher rates for 5/7 of these unfavorable illness characteristics, while patients with childhood- compared to adolescent-onset had significantly higher rates for 4/7 of these unfavorable illness characteristics. Caucasian, insured, suburban, low substance abuse, American specialty clinic-referred sample limits generalizability. Onset age is based on retrospective recall. Childhood- compared to adolescent-onset BD was more robustly related to unfavorable bipolar illness characteristics, so pooling these groups attenuated such relationships. Further study is warranted to determine the extent to which adolescent-onset BD represents an intermediate phenotype between childhood- and adult-onset BD. Copyright © 2015 Elsevier B.V. All rights reserved.
Encounter Probability of Significant Wave Height
DEFF Research Database (Denmark)
Liu, Z.; Burcharth, H. F.
The determination of the design wave height (often given as the significant wave height) is usually based on statistical analysis of long-term extreme wave height measurement or hindcast. The result of such extreme wave height analysis is often given as the design wave height corresponding to a c...
International Nuclear Information System (INIS)
Schoenwiese, C.D.
1990-01-01
Based on univariate correction and coherence analyses, including techniques moving in time, and taking account of the physical basis of the relationships, a simple multivariate concept is presented which correlates observational climatic time series simultaneously with solar, volcanic, ENSO (El Nino/Souther Oscillation) and anthropogenic greenhouse-gas forcing. The climatic elements considered are air temperature (near the ground and stratosphere), sea surface temperature, sea level and precipitation, and cover at least the period 1881-1980 (stratospheric temperature only since 1960). The climate signal assessments which may be hypothetically attributed to the observed CO 2 or equivalent CO 2 (implying additional greenhouse gases) increase are compared with those resulting from GCM experiments. In case of the Northern hemisphere air temperature these comparisons are performed not only in respect to hemispheric and global means, but also in respect to the regional and seasonal patterns. Autocorrelations and phase shifts of the climate response to natural and anthropogenic forcing complicate the statistical assessments
Directory of Open Access Journals (Sweden)
Alessandro Chiaudani
2017-11-01
Full Text Available In this research, univariate and bivariate statistical methods were applied to rainfall, river and piezometric level datasets belonging to 24-year time series (1986–2009. These methods, which often are used to understand the effects of precipitation on rivers and karstic springs discharge, have been used to assess piezometric level response to rainfall and river level fluctuations in a porous aquifer. A rain gauge, a river level gauge and three wells, located in Central Italy along the lower Pescara River valley in correspondence of its important alluvial aquifer, provided the data. Statistical analysis has been used within a known hydrogeological framework, which has been refined by mean of a photo-interpretation and a GPS survey. Water–groundwater relationships were identified following the autocorrelation and cross-correlation analyses. Spectral analysis and mono-fractal features of time series were assessed to provide information on multi-year variability, data distributions, their fractal dimension and the distribution return time within the historical time series. The statistical–mathematical results were interpreted through fieldwork that identified distinct groundwater flowpaths within the aquifer and enabled the implementation of a conceptual model, improving the knowledge on water resources management tools.
[Comment on] Statistical discrimination
Chinn, Douglas
In the December 8, 1981, issue of Eos, a news item reported the conclusion of a National Research Council study that sexual discrimination against women with Ph.D.'s exists in the field of geophysics. Basically, the item reported that even when allowances are made for motherhood the percentage of female Ph.D.'s holding high university and corporate positions is significantly lower than the percentage of male Ph.D.'s holding the same types of positions. The sexual discrimination conclusion, based only on these statistics, assumes that there are no basic psychological differences between men and women that might cause different populations in the employment group studied. Therefore, the reasoning goes, after taking into account possible effects from differences related to anatomy, such as women stopping their careers in order to bear and raise children, the statistical distributions of positions held by male and female Ph.D.'s ought to be very similar to one another. Any significant differences between the distributions must be caused primarily by sexual discrimination.
International Nuclear Information System (INIS)
Eliazar, Iddo
2017-01-01
The exponential, the normal, and the Poisson statistical laws are of major importance due to their universality. Harmonic statistics are as universal as the three aforementioned laws, but yet they fall short in their ‘public relations’ for the following reason: the full scope of harmonic statistics cannot be described in terms of a statistical law. In this paper we describe harmonic statistics, in their full scope, via an object termed harmonic Poisson process: a Poisson process, over the positive half-line, with a harmonic intensity. The paper reviews the harmonic Poisson process, investigates its properties, and presents the connections of this object to an assortment of topics: uniform statistics, scale invariance, random multiplicative perturbations, Pareto and inverse-Pareto statistics, exponential growth and exponential decay, power-law renormalization, convergence and domains of attraction, the Langevin equation, diffusions, Benford’s law, and 1/f noise. - Highlights: • Harmonic statistics are described and reviewed in detail. • Connections to various statistical laws are established. • Connections to perturbation, renormalization and dynamics are established.
Energy Technology Data Exchange (ETDEWEB)
Eliazar, Iddo, E-mail: eliazar@post.tau.ac.il
2017-05-15
The exponential, the normal, and the Poisson statistical laws are of major importance due to their universality. Harmonic statistics are as universal as the three aforementioned laws, but yet they fall short in their ‘public relations’ for the following reason: the full scope of harmonic statistics cannot be described in terms of a statistical law. In this paper we describe harmonic statistics, in their full scope, via an object termed harmonic Poisson process: a Poisson process, over the positive half-line, with a harmonic intensity. The paper reviews the harmonic Poisson process, investigates its properties, and presents the connections of this object to an assortment of topics: uniform statistics, scale invariance, random multiplicative perturbations, Pareto and inverse-Pareto statistics, exponential growth and exponential decay, power-law renormalization, convergence and domains of attraction, the Langevin equation, diffusions, Benford’s law, and 1/f noise. - Highlights: • Harmonic statistics are described and reviewed in detail. • Connections to various statistical laws are established. • Connections to perturbation, renormalization and dynamics are established.
Statistical mechanics for a class of quantum statistics
International Nuclear Information System (INIS)
Isakov, S.B.
1994-01-01
Generalized statistical distributions for identical particles are introduced for the case where filling a single-particle quantum state by particles depends on filling states of different momenta. The system of one-dimensional bosons with a two-body potential that can be solved by means of the thermodynamic Bethe ansatz is shown to be equivalent thermodynamically to a system of free particles obeying statistical distributions of the above class. The quantum statistics arising in this way are completely determined by the two-particle scattering phases of the corresponding interacting systems. An equation determining the statistical distributions for these statistics is derived
DEFF Research Database (Denmark)
Gulati, Sakshi; Martinez, Pierre; Joshi, Tejal
2014-01-01
and statistical analysisBiomarker association with CSS was analysed by univariate and multivariate analyses. Results and limitationsA total of 17 of 28 biomarkers (TP53 mutations; amplifications of chromosomes 8q, 12, 20q11.21q13.32, and 20 and deletions of 4p, 9p, 9p21.3p24.1, and 22q; low EDNRB and TSPAN7...... expression and six gene expression signatures) were validated as predictors of poor CSS in univariate analysis. Tumour stage and the ccB expression signature were the only independent predictors in multivariate analysis. ITH of the ccB signature was identified in 8 of 10 tumours. Several genetic alterations...... that were significant in univariate analysis were enriched, and chromosomal instability indices were increased in samples expressing the ccB signature. The study may be underpowered to validate low-prevalence biomarkers. ConclusionsThe ccB signature was the only independent prognostic biomarker. Enrichment...
Data-driven inference for the spatial scan statistic.
Almeida, Alexandre C L; Duarte, Anderson R; Duczmal, Luiz H; Oliveira, Fernando L P; Takahashi, Ricardo H C
2011-08-02
Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is done, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based in this inference. A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
Medical Statistics – Mathematics or Oracle? Farewell Lecture
Directory of Open Access Journals (Sweden)
Gaus, Wilhelm
2005-06-01
Full Text Available Certainty is rare in medicine. This is a direct consequence of the individuality of each and every human being and the reason why we need medical statistics. However, statistics have their pitfalls, too. Fig. 1 shows that the suicide rate peaks in youth, while in Fig. 2 the rate is highest in midlife and Fig. 3 in old age. Which of these contradictory messages is right? After an introduction to the principles of statistical testing, this lecture examines the probability with which statistical test results are correct. For this purpose the level of significance and the power of the test are compared with the sensitivity and specificity of a diagnostic procedure. The probability of obtaining correct statistical test results is the same as that for the positive and negative correctness of a diagnostic procedure and therefore depends on prevalence. The focus then shifts to the problem of multiple statistical testing. The lecture demonstrates that for each data set of reasonable size at least one test result proves to be significant - even if the data set is produced by a random number generator. It is extremely important that a hypothesis is generated independently from the data used for its testing. These considerations enable us to understand the gradation of "lame excuses, lies and statistics" and the difference between pure truth and the full truth. Finally, two historical oracles are cited.
International Nuclear Information System (INIS)
Tumur, Odgerel; Soon, Kean; Brown, Fraser; Mykytowycz, Marcus
2013-01-01
The aims of our study were to evaluate the effect of application of Adaptive Statistical Iterative Reconstruction (ASIR) algorithm on the radiation dose of coronary computed tomography angiography (CCTA) and its effects on image quality of CCTA and to evaluate the effects of various patient and CT scanning factors on the radiation dose of CCTA. This was a retrospective study that included 347 consecutive patients who underwent CCTA at a tertiary university teaching hospital between 1 July 2009 and 20 September 2011. Analysis was performed comparing patient demographics, scan characteristics, radiation dose and image quality in two groups of patients in whom conventional Filtered Back Projection (FBP) or ASIR was used for image reconstruction. There were 238 patients in the FBP group and 109 patients in the ASIR group. There was no difference between the groups in the use of prospective gating, scan length or tube voltage. In ASIR group, significantly lower tube current was used compared with FBP group, 550mA (450–600) vs. 650mA (500–711.25) (median (interquartile range)), respectively, P<0.001. There was 27% effective radiation dose reduction in the ASIR group compared with FBP group, 4.29mSv (2.84–6.02) vs. 5.84mSv (3.88–8.39) (median (interquartile range)), respectively, P<0.001. Although ASIR was associated with increased image noise compared with FBP (39.93±10.22 vs. 37.63±18.79 (mean ±standard deviation), respectively, P<001), it did not affect the signal intensity, signal-to-noise ratio, contrast-to-noise ratio or the diagnostic quality of CCTA. Application of ASIR reduces the radiation dose of CCTA without affecting the image quality.
Conversion factors and oil statistics
International Nuclear Information System (INIS)
Karbuz, Sohbet
2004-01-01
World oil statistics, in scope and accuracy, are often far from perfect. They can easily lead to misguided conclusions regarding the state of market fundamentals. Without proper attention directed at statistic caveats, the ensuing interpretation of oil market data opens the door to unnecessary volatility, and can distort perception of market fundamentals. Among the numerous caveats associated with the compilation of oil statistics, conversion factors, used to produce aggregated data, play a significant role. Interestingly enough, little attention is paid to conversion factors, i.e. to the relation between different units of measurement for oil. Additionally, the underlying information regarding the choice of a specific factor when trying to produce measurements of aggregated data remains scant. The aim of this paper is to shed some light on the impact of conversion factors for two commonly encountered issues, mass to volume equivalencies (barrels to tonnes) and for broad energy measures encountered in world oil statistics. This paper will seek to demonstrate how inappropriate and misused conversion factors can yield wildly varying results and ultimately distort oil statistics. Examples will show that while discrepancies in commonly used conversion factors may seem trivial, their impact on the assessment of a world oil balance is far from negligible. A unified and harmonised convention for conversion factors is necessary to achieve accurate comparisons and aggregate oil statistics for the benefit of both end-users and policy makers
Statistics with JMP graphs, descriptive statistics and probability
Goos, Peter
2015-01-01
Peter Goos, Department of Statistics, University ofLeuven, Faculty of Bio-Science Engineering and University ofAntwerp, Faculty of Applied Economics, BelgiumDavid Meintrup, Department of Mathematics and Statistics,University of Applied Sciences Ingolstadt, Faculty of MechanicalEngineering, GermanyThorough presentation of introductory statistics and probabilitytheory, with numerous examples and applications using JMPDescriptive Statistics and Probability provides anaccessible and thorough overview of the most important descriptivestatistics for nominal, ordinal and quantitative data withpartic
Expression and clinical significance of PIWIL2 in hilar cholangiocarcinoma tissues and cell lines.
Chen, Y J; Xiong, X F; Wen, S Q; Tian, L; Cheng, W L; Qi, Y Q
2015-06-26
The objective of this study was to explore the relationship between PIWI-like protein 2 (PIWIL2) and clinicopathological charac-teristics and prognosis after radical resection. To accomplish this, we analyzed PIWIL2 expression in hilar cholangiocarcinoma tissues and cell lines. PIWIL2 expression was detected by immunohistochemistry in 41 hilar cholangiocarcinoma samples and 10 control tissues. Western blotting and immunocytofluorescence were used to investigate PIWIL2 expression in the cholangiocarcinoma cell line QBC939 and the bile duct epithelial cell line HIBEpic. Univariate and multivariate surviv-al analyses were performed using the Kaplan-Meier method for hilar cholangiocarcinoma patients who underwent radical resection. PIWIL2 expression was significantly higher in the hilar cholangiocarcinoma tissues and QBC939 cells than in control tissues and HIBEpic cells, respectively (P hilar cholangiocarcinoma (P hilar cholangiocarcinoma.
DEFF Research Database (Denmark)
Schneider, Jesper Wiborg
2012-01-01
In this paper we discuss and question the use of statistical significance tests in relation to university rankings as recently suggested. We outline the assumptions behind and interpretations of statistical significance tests and relate this to examples from the recent SCImago Institutions Rankin...
Uncertainty the soul of modeling, probability & statistics
Briggs, William
2016-01-01
This book presents a philosophical approach to probability and probabilistic thinking, considering the underpinnings of probabilistic reasoning and modeling, which effectively underlie everything in data science. The ultimate goal is to call into question many standard tenets and lay the philosophical and probabilistic groundwork and infrastructure for statistical modeling. It is the first book devoted to the philosophy of data aimed at working scientists and calls for a new consideration in the practice of probability and statistics to eliminate what has been referred to as the "Cult of Statistical Significance". The book explains the philosophy of these ideas and not the mathematics, though there are a handful of mathematical examples. The topics are logically laid out, starting with basic philosophy as related to probability, statistics, and science, and stepping through the key probabilistic ideas and concepts, and ending with statistical models. Its jargon-free approach asserts that standard methods, suc...
Statistical process control in nursing research.
Polit, Denise F; Chaboyer, Wendy
2012-02-01
In intervention studies in which randomization to groups is not possible, researchers typically use quasi-experimental designs. Time series designs are strong quasi-experimental designs but are seldom used, perhaps because of technical and analytic hurdles. Statistical process control (SPC) is an alternative analytic approach to testing hypotheses about intervention effects using data collected over time. SPC, like traditional statistical methods, is a tool for understanding variation and involves the construction of control charts that distinguish between normal, random fluctuations (common cause variation), and statistically significant special cause variation that can result from an innovation. The purpose of this article is to provide an overview of SPC and to illustrate its use in a study of a nursing practice improvement intervention. Copyright © 2011 Wiley Periodicals, Inc.
Statistics Anxiety and Business Statistics: The International Student
Bell, James A.
2008-01-01
Does the international student suffer from statistics anxiety? To investigate this, the Statistics Anxiety Rating Scale (STARS) was administered to sixty-six beginning statistics students, including twelve international students and fifty-four domestic students. Due to the small number of international students, nonparametric methods were used to…
Changing world extreme temperature statistics
Finkel, J. M.; Katz, J. I.
2018-04-01
We use the Global Historical Climatology Network--daily database to calculate a nonparametric statistic that describes the rate at which all-time daily high and low temperature records have been set in nine geographic regions (continents or major portions of continents) during periods mostly from the mid-20th Century to the present. This statistic was defined in our earlier work on temperature records in the 48 contiguous United States. In contrast to this earlier work, we find that in every region except North America all-time high records were set at a rate significantly (at least $3\\sigma$) higher than in the null hypothesis of a stationary climate. Except in Antarctica, all-time low records were set at a rate significantly lower than in the null hypothesis. In Europe, North Africa and North Asia the rate of setting new all-time highs increased suddenly in the 1990's, suggesting a change in regional climate regime; in most other regions there was a steadier increase.
Directory of Open Access Journals (Sweden)
Jacob Elebro
Full Text Available Periampullary adenocarcinoma, including pancreatic cancer, is a heterogeneous group of tumours with dismal prognosis, for which there is an urgent need to identify novel treatment strategies. The human epithelial growth factor receptors EGFR, HER2 and HER3 have been studied in several tumour types, and HER-targeting drugs have a beneficial effect on survival in selected types of cancer. However, these effects have not been evident in pancreatic cancer, and remain unexplored in other types of periampullary cancer. The prognostic impact of HER-expression in these cancers also remains unclear. The aim of this study was therefore to examine the expression and prognostic value of EGFR, HER2 and HER3 in periampullary cancer, with particular reference to histological subtype. To this end, protein expression of EGFR, HER2 and HER3, and HER2 gene amplification was assessed by immunohistochemistry and silver in situ hybridization, respectively, on tissue microarrays with tumours from 175 periampullary adenocarcinomas, with follow-up data on recurrence-free survival (RFS and overall survival (OS for up to 5 years. EGFR expression was similar in pancreatobiliary (PB and intestinal (I type tumours, but high HER2 and HER3 expression was significantly more common in I-type tumours. In PB-type cases receiving adjuvant gemcitabine, but not in untreated cases, high EGFR expression was significantly associated with a shorter OS and RFS, with a significant treatment interaction in relation to OS (pinteraction = 0.042. In I-type cases, high EGFR expression was associated with a shorter OS and RFS in univariable, but not in multivariable, analysis. High HER3 expression was associated with a prolonged RFS in univariable, but not in multivariable, analysis. Neither HER2 protein expression nor gene amplification was prognostic. The finding of a potential interaction between the expression of EGFR and response to adjuvant chemotherapy in PB-type tumours needs validation
Applying Statistical Mechanics to pixel detectors
International Nuclear Information System (INIS)
Pindo, Massimiliano
2002-01-01
Pixel detectors, being made of a large number of active cells of the same kind, can be considered as significant sets to which Statistical Mechanics variables and methods can be applied. By properly redefining well known statistical parameters in order to let them match the ones that actually characterize pixel detectors, an analysis of the way they work can be performed in a totally new perspective. A deeper understanding of pixel detectors is attained, helping in the evaluation and comparison of their intrinsic characteristics and performance
Introductory statistics for the behavioral sciences
Welkowitz, Joan; Cohen, Jacob
1971-01-01
Introductory Statistics for the Behavioral Sciences provides an introduction to statistical concepts and principles. This book emphasizes the robustness of parametric procedures wherein such significant tests as t and F yield accurate results even if such assumptions as equal population variances and normal population distributions are not well met.Organized into three parts encompassing 16 chapters, this book begins with an overview of the rationale upon which much of behavioral science research is based, namely, drawing inferences about a population based on data obtained from a samp
International Nuclear Information System (INIS)
Golfieri, R.; Giampalma, E.; D'Arienzo, P.; Maffei, M.; Muzzi, C.; Tancioni, S.; Gavelli, G.; Morselli Labate, A.M.; Sama, C.; Jovine, E.; Grazi, G.L.; Mazziotti, A.; Cavallari, A.
2000-01-01
The aim of this study was to evaluate the incidence, radiographic appearance, time of onset, outcome and risk factors of non-infectious and infectious pulmonary complications following liver transplantation. Chest X-ray features of 300 consecutive patients who had undergone 333 liver transplants over an 11-year period were analysed: the type of pulmonary complication, the infecting pathogens and the mean time of their occurrence are described. The main risk factors for lung infections were quantified through univariate and multivariate statistical analysis. Non-infectious pulmonary abnormalities (atelectasis and/or pleural effusion: 86.7%) and pulmonary oedema (44.7%) appeared during the first postoperative week. Infectious pneumonia was observed in 13.7%, with a mortality of 36.6%. Bacterial and viral pneumonia made up the bulk of infections (63.4 and 29.3%, respectively) followed by fungal infiltrates (24.4%). A fairly good correlation between radiological chest X-ray pattern, time of onset and the cultured microorganisms has been observed in all cases. In multivariate analysis, persistent non-infectious abnormalities and pulmonary oedema were identified as the major independent predictors of posttransplant pneumonia, followed by prolonged assisted mechanical ventilation and traditional caval anastomosis. A ''pneumonia-risk score'' was calculated: low-risk score ( 3.30) population. The ''pneumonia-risk score'' identifies a specific group of patients in whom closer radiographic monitoring is recommended. In addition, a highly significant correlation (p<0.001) was observed between pneumonia-risk score and the expected survival, thus confirming pulmonary infections as a major cause of death in OLT recipients. (orig.)
Spreadsheets as tools for statistical computing and statistics education
Neuwirth, Erich
2000-01-01
Spreadsheets are an ubiquitous program category, and we will discuss their use in statistics and statistics education on various levels, ranging from very basic examples to extremely powerful methods. Since the spreadsheet paradigm is very familiar to many potential users, using it as the interface to statistical methods can make statistics more easily accessible.
Directory of Open Access Journals (Sweden)
Rossi Hassad
2018-01-01
Full Text Available Students� attitude, including perceived usefulness, is generally associated with academic success. The related research in statistics education has focused almost exclusively on the role of attitude in explaining and predicting academic learning outcomes, hence there is a paucity of research evidence on how attitude (particularly perceived usefulness impacts students� intentions to use and stay engaged in statistics beyond the introductory course. This study explored the relationship between college students� perception of the usefulness of an introductory statistics course, their beliefs about where statistics will be most useful, and their intentions to take another statistics course. A cross-sectional study of 106 students was conducted. The mean rating for usefulness was 4.7 (out of 7, with no statistically significant differences based on gender and age. Sixty-four percent reported that they would consider taking another statistics course, and this subgroup rated the course as more useful (p = .01. The majority (67% reported that statistics would be most useful for either graduate school or research, whereas 14% indicated their job, and 19% were undecided. The �undecided� students had the lowest mean rating for usefulness of the course (p = .001. Addressing data, in the context of real-world problem-solving and decision-making, could facilitate students to better appreciate the usefulness and practicality of statistics. Qualitative research methods could help to elucidate these findings.
Data-driven inference for the spatial scan statistic
Directory of Open Access Journals (Sweden)
Duczmal Luiz H
2011-08-01
Full Text Available Abstract Background Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. Results A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is done, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based in this inference. Conclusions A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
Register-based statistics statistical methods for administrative data
Wallgren, Anders
2014-01-01
This book provides a comprehensive and up to date treatment of theory and practical implementation in Register-based statistics. It begins by defining the area, before explaining how to structure such systems, as well as detailing alternative approaches. It explains how to create statistical registers, how to implement quality assurance, and the use of IT systems for register-based statistics. Further to this, clear details are given about the practicalities of implementing such statistical methods, such as protection of privacy and the coordination and coherence of such an undertaking. Thi
... What Is Cancer? Cancer Statistics Cancer Disparities Cancer Statistics Cancer has a major impact on society in ... success of efforts to control and manage cancer. Statistics at a Glance: The Burden of Cancer in ...
Energy Technology Data Exchange (ETDEWEB)
Fhager, V
2000-01-01
In order to make correct predictions of the second moment of statistical nuclear variables, such as the number of fissions and the number of thermalized neutrons, the dependence of the energy distribution of the source particles on their number should be considered. It has been pointed out recently that neglecting this number dependence in accelerator driven systems might result in bad estimates of the second moment, and this paper contains qualitative and quantitative estimates of the size of these efforts. We walk towards the requested results in two steps. First, models of the number dependent energy distributions of the neutrons that are ejected in the spallation reactions are constructed, both by simple assumptions and by extracting energy distributions of spallation neutrons from a high-energy particle transport code. Then, the second moment of nuclear variables in a sub-critical reactor, into which spallation neutrons are injected, is calculated. The results from second moment calculations using number dependent energy distributions for the source neutrons are compared to those where only the average energy distribution is used. Two physical models are employed to simulate the neutron transport in the reactor. One is analytical, treating only slowing down of neutrons by elastic scattering in the core material. For this model, equations are written down and solved for the second moment of thermalized neutrons that include the distribution of energy of the spallation neutrons. The other model utilizes Monte Carlo methods for tracking the source neutrons as they travel inside the reactor material. Fast and thermal fission reactions are considered, as well as neutron capture and elastic scattering, and the second moment of the number of fissions, the number of neutrons that leaked out of the system, etc. are calculated. Both models use a cylindrical core with a homogenous mixture of core material. Our results indicate that the number dependence of the energy
International Nuclear Information System (INIS)
Fhager, V.
2000-01-01
In order to make correct predictions of the second moment of statistical nuclear variables, such as the number of fissions and the number of thermalized neutrons, the dependence of the energy distribution of the source particles on their number should be considered. It has been pointed out recently that neglecting this number dependence in accelerator driven systems might result in bad estimates of the second moment, and this paper contains qualitative and quantitative estimates of the size of these efforts. We walk towards the requested results in two steps. First, models of the number dependent energy distributions of the neutrons that are ejected in the spallation reactions are constructed, both by simple assumptions and by extracting energy distributions of spallation neutrons from a high-energy particle transport code. Then, the second moment of nuclear variables in a sub-critical reactor, into which spallation neutrons are injected, is calculated. The results from second moment calculations using number dependent energy distributions for the source neutrons are compared to those where only the average energy distribution is used. Two physical models are employed to simulate the neutron transport in the reactor. One is analytical, treating only slowing down of neutrons by elastic scattering in the core material. For this model, equations are written down and solved for the second moment of thermalized neutrons that include the distribution of energy of the spallation neutrons. The other model utilizes Monte Carlo methods for tracking the source neutrons as they travel inside the reactor material. Fast and thermal fission reactions are considered, as well as neutron capture and elastic scattering, and the second moment of the number of fissions, the number of neutrons that leaked out of the system, etc. are calculated. Both models use a cylindrical core with a homogenous mixture of core material. Our results indicate that the number dependence of the energy
Analysis of Medication Errors in Simulated Pediatric Resuscitation by Residents
Directory of Open Access Journals (Sweden)
Evelyn Porter
2014-07-01
Full Text Available Introduction: The objective of our study was to estimate the incidence of prescribing medication errors specifically made by a trainee and identify factors associated with these errors during the simulated resuscitation of a critically ill child. Methods: The results of the simulated resuscitation are described. We analyzed data from the simulated resuscitation for the occurrence of a prescribing medication error. We compared univariate analysis of each variable to medication error rate and performed a separate multiple logistic regression analysis on the significant univariate variables to assess the association between the selected variables. Results: We reviewed 49 simulated resuscitations . The final medication error rate for the simulation was 26.5% (95% CI 13.7% - 39.3%. On univariate analysis, statistically significant findings for decreased prescribing medication error rates included senior residents in charge, presence of a pharmacist, sleeping greater than 8 hours prior to the simulation, and a visual analog scale score showing more confidence in caring for critically ill children. Multiple logistic regression analysis using the above significant variables showed only the presence of a pharmacist to remain significantly associated with decreased medication error, odds ratio of 0.09 (95% CI 0.01 - 0.64. Conclusion: Our results indicate that the presence of a clinical pharmacist during the resuscitation of a critically ill child reduces the medication errors made by resident physician trainees.
Directory of Open Access Journals (Sweden)
Ozgur Yazici
2015-08-01
Full Text Available ABSTRACTPurpose:To evaluate the patient and stone related factors which may influence the final outcome of SWL in the management of ureteral stones.Materials and Methods:Between October 2011 and October 2013, a total of 204 adult patients undergoing SWL for single ureteral stone sizing 5 to 15 mm were included into the study program. The impact of both patient (age, sex, BMI, and stone related factors (laterality, location, longest diameter and density as CT HU along with BUN and lastly SSD (skin to stone distance on fragmentation were analysed by univariate and multivariate analyses. Results: Stone free rates for proximal and distal ureteral stones were 68.8% and 72.7%, respectively with no statistically significant difference between two groups (p=0.7. According to univariate and multivariate analyses, while higher BMI (mean: 26.8 and 28.1, p=0.048 and stone density values (mean: 702 HU and 930 HU, p<0.0001 were detected as statistically significant independent predictors of treatment failure for proximal ureteral stones, the only statistically significant predicting parameter for the success rates of SWL in distal ureteral stones was the higher SSD value (median: 114 and 90, p=0.012.Conclusions:Our findings have clearly shown that while higher BMI and increased stone attenuation values detected by NCCT were significant factors influencing the final outcome of SWL treatment in proximal ureteral stones; opposite to the literature, high SSD was the only independent predictor of success for the SWL treatment of distal ureteral stones.
Natural radionuclides in effluents release by a deactivated uranium mine
Energy Technology Data Exchange (ETDEWEB)
Pereira, Wagner S.; Kelecom, Alphonse; Silva, Ademir X.; Lopes, José M.; Pinto, Carlos E.C.; Py Júnior, Delcy A.; Antunes, Marcos M., E-mail: pereiraws@gmail.com, E-mail: caerjbr@gmail.com, E-mail: wspereira@inb.gov.br, E-mail: delcy@inb.gov.br, E-mail: Antunes@inb.gov.br, E-mail: lararapls@hotmail.com, E-mail: Ademir@nuclear.ufrj.br, E-mail: marqueslopes@yahoo.com.br [Universidade Veiga de Almeida (UVA), Rio de Janeiro, RJ (Brazil); Indústrias Nucleares do Brasil (COMAP.N/FCN/INB), Resende RJ (Brazil). Fábrica de Combustível Nuclear. Coordenação de Meio Ambiente e Proteção Radiológica Ambiental; Universidade Federal Fluminense (LARARA-PLS/UFF), Niterói, RJ (Brazil). Laboratório de Radiobiologia e Radiometria; Coordenacao de Pos-Graduacao e Pesquisa de Engenharia (COPPE/UFRJ), Rio de Janeiro, RJ (Brazil). Programa de Engenharia Nuclear
2017-07-01
The Ore Treatment Unit (OTU) is a mine and deactivated uranium plant in the city of Caldas, Minas Gerais, Brazil. This facility possesses three points of release of liquid effluents containing radionuclides: point 014, 025 and 076. At these points, the values of activity concentrations (AC) of the radionuclides U{sub nat}, {sup 226}Ra, {sup 210}Pb, {sup 232}Th and {sup 228}Ra were analyzed in 2012. The evaluation of point 014 by univariate statistics pointed four groups. [U{sub nat} > {sup 228}Ra > ({sup 226}Ra = {sup 210}Pb) >{sup 232}Th]. The multivariate statistics separated the radionuclides into two groups: [(U{sub nat} and {sup 232}Th) and ({sup 226}Ra, {sup 228}Ra and {sup 210}Pb)]. At point 025, the univariate statistics described three groups: [Un{sub at} > ({sup 228}Ra = {sup 210}Pb) > ({sup 226}Ra = {sup 232}Th)] and the multivariate analysis also described three but different groups: [(U{sub nat} and {sup 228}Ra), ({sup 226}Ra and {sup 210}Pb) and {sup 232}Th]. In turn, point 076 showed another behavior. The univariate analysis showed only two groups: [(U{sub nat}) > ({sup 226}Ra, {sup 228}Ra, {sup 210}Pb, {sup 232}Th)]. Differently, the multivariate statistics defined three groups: [(U{sub nat} and {sup 232}Th), ({sup 226}Ra and {sup 228}Ra) and {sup 210}Pb].Thus, statistical analysis showed that each point has releases of effluents with different characteristics. Both the behaviors of releases, based on multivariate statistics, and of the AC magnitudes, based on the univariate statistics, are different between the points. The only common features were the greater magnitude of uranium and the smaller magnitude of thorium. (author)
Software Used to Generate Cancer Statistics - SEER Cancer Statistics
Videos that highlight topics and trends in cancer statistics and definitions of statistical terms. Also software tools for analyzing and reporting cancer statistics, which are used to compile SEER's annual reports.
Understanding Statistics and Statistics Education: A Chinese Perspective
Shi, Ning-Zhong; He, Xuming; Tao, Jian
2009-01-01
In recent years, statistics education in China has made great strides. However, there still exists a fairly large gap with the advanced levels of statistics education in more developed countries. In this paper, we identify some existing problems in statistics education in Chinese schools and make some proposals as to how they may be overcome. We…
Pestman, Wiebe R
2009-01-01
This textbook provides a broad and solid introduction to mathematical statistics, including the classical subjects hypothesis testing, normal regression analysis, and normal analysis of variance. In addition, non-parametric statistics and vectorial statistics are considered, as well as applications of stochastic analysis in modern statistics, e.g., Kolmogorov-Smirnov testing, smoothing techniques, robustness and density estimation. For students with some elementary mathematical background. With many exercises. Prerequisites from measure theory and linear algebra are presented.
Statistical inference and visualization in scale-space for spatially dependent images
Vaughan, Amy
2012-03-01
SiZer (SIgnificant ZERo crossing of the derivatives) is a graphical scale-space visualization tool that allows for statistical inferences. In this paper we develop a spatial SiZer for finding significant features and conducting goodness-of-fit tests for spatially dependent images. The spatial SiZer utilizes a family of kernel estimates of the image and provides not only exploratory data analysis but also statistical inference with spatial correlation taken into account. It is also capable of comparing the observed image with a specific null model being tested by adjusting the statistical inference using an assumed covariance structure. Pixel locations having statistically significant differences between the image and a given null model are highlighted by arrows. The spatial SiZer is compared with the existing independent SiZer via the analysis of simulated data with and without signal on both planar and spherical domains. We apply the spatial SiZer method to the decadal temperature change over some regions of the Earth. © 2011 The Korean Statistical Society.
The Role of Statistics in Business and Industry
Hahn, Gerald J
2011-01-01
An insightful guide to the use of statistics for solving key problems in modern-day business and industry This book has been awarded the Technometrics Ziegel Prize for the best book reviewed by the journal in 2010. Technometrics is a journal of statistics for the physical, chemical and engineering sciences, published jointly by the American Society for Quality and the American Statistical Association. Criteria for the award include that the book brings together in one volume a body of material previously only available in scattered research articles and having the potential to significantly im
Sonpavde, Guru; Pond, Gregory R; Fougeray, Ronan; Choueiri, Toni K; Qu, Angela Q; Vaughn, David J; Niegisch, Guenter; Albers, Peter; James, Nicholas D; Wong, Yu-Ning; Ko, Yoo-Joung; Sridhar, Srikala S; Galsky, Matthew D; Petrylak, Daniel P; Vaishampayan, Ulka N; Khan, Awais; Vogelzang, Nicholas J; Beer, Tomasz M; Stadler, Walter M; O'Donnell, Peter H; Sternberg, Cora N; Rosenberg, Jonathan E; Bellmunt, Joaquim
2013-04-01
Outcomes for patients in the second-line setting of advanced urothelial carcinoma (UC) are dismal. The recognized prognostic factors in this context are Eastern Cooperative Oncology Group (ECOG) performance status (PS) >0, hemoglobin level (Hb) 0, LM, Hb statistic=0.638). Setting of prior chemotherapy (metastatic disease vs perioperative) and prior platinum agent (cisplatin or carboplatin) were not prognostic factors. External validation demonstrated a significant association of TFPC with PFS on univariable and most multivariable analyses, and with OS on univariable analyses. Limitations of retrospective analyses are applicable. Shorter TFPC enhances prognostic classification independent of ECOG-PS >0, Hb advanced UC. These data may facilitate drug development and interpretation of trials. Copyright © 2012 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Sampling, Probability Models and Statistical Reasoning Statistical
Indian Academy of Sciences (India)
Home; Journals; Resonance – Journal of Science Education; Volume 1; Issue 5. Sampling, Probability Models and Statistical Reasoning Statistical Inference. Mohan Delampady V R Padmawar. General Article Volume 1 Issue 5 May 1996 pp 49-58 ...
Statistical Symbolic Execution with Informed Sampling
Filieri, Antonio; Pasareanu, Corina S.; Visser, Willem; Geldenhuys, Jaco
2014-01-01
Symbolic execution techniques have been proposed recently for the probabilistic analysis of programs. These techniques seek to quantify the likelihood of reaching program events of interest, e.g., assert violations. They have many promising applications but have scalability issues due to high computational demand. To address this challenge, we propose a statistical symbolic execution technique that performs Monte Carlo sampling of the symbolic program paths and uses the obtained information for Bayesian estimation and hypothesis testing with respect to the probability of reaching the target events. To speed up the convergence of the statistical analysis, we propose Informed Sampling, an iterative symbolic execution that first explores the paths that have high statistical significance, prunes them from the state space and guides the execution towards less likely paths. The technique combines Bayesian estimation with a partial exact analysis for the pruned paths leading to provably improved convergence of the statistical analysis. We have implemented statistical symbolic execution with in- formed sampling in the Symbolic PathFinder tool. We show experimentally that the informed sampling obtains more precise results and converges faster than a purely statistical analysis and may also be more efficient than an exact symbolic analysis. When the latter does not terminate symbolic execution with informed sampling can give meaningful results under the same time and memory limits.
Statistical shape analysis with applications in R
Dryden, Ian L
2016-01-01
A thoroughly revised and updated edition of this introduction to modern statistical methods for shape analysis Shape analysis is an important tool in the many disciplines where objects are compared using geometrical features. Examples include comparing brain shape in schizophrenia; investigating protein molecules in bioinformatics; and describing growth of organisms in biology. This book is a significant update of the highly-regarded `Statistical Shape Analysis’ by the same authors. The new edition lays the foundations of landmark shape analysis, including geometrical concepts and statistical techniques, and extends to include analysis of curves, surfaces, images and other types of object data. Key definitions and concepts are discussed throughout, and the relative merits of different approaches are presented. The authors have included substantial new material on recent statistical developments and offer numerous examples throughout the text. Concepts are introduced in an accessible manner, while reta...
International Nuclear Information System (INIS)
2005-01-01
For the years 2004 and 2005 the figures shown in the tables of Energy Review are partly preliminary. The annual statistics published in Energy Review are presented in more detail in a publication called Energy Statistics that comes out yearly. Energy Statistics also includes historical time-series over a longer period of time (see e.g. Energy Statistics, Statistics Finland, Helsinki 2004.) The applied energy units and conversion coefficients are shown in the back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes, precautionary stock fees and oil pollution fees
Whole Frog Project and Virtual Frog Dissection Statistics wwwstats output for January 1 through duplicate or extraneous accesses. For example, in these statistics, while a POST requesting an image is as well. Note that this under-represents the bytes requested. Starting date for following statistics
Directory of Open Access Journals (Sweden)
Mabaso Musawenkosi LH
2007-09-01
Full Text Available Abstract Background Several malaria risk maps have been developed in recent years, many from the prevalence of infection data collated by the MARA (Mapping Malaria Risk in Africa project, and using various environmental data sets as predictors. Variable selection is a major obstacle due to analytical problems caused by over-fitting, confounding and non-independence in the data. Testing and comparing every combination of explanatory variables in a Bayesian spatial framework remains unfeasible for most researchers. The aim of this study was to develop a malaria risk map using a systematic and practicable variable selection process for spatial analysis and mapping of historical malaria risk in Botswana. Results Of 50 potential explanatory variables from eight environmental data themes, 42 were significantly associated with malaria prevalence in univariate logistic regression and were ranked by the Akaike Information Criterion. Those correlated with higher-ranking relatives of the same environmental theme, were temporarily excluded. The remaining 14 candidates were ranked by selection frequency after running automated step-wise selection procedures on 1000 bootstrap samples drawn from the data. A non-spatial multiple-variable model was developed through step-wise inclusion in order of selection frequency. Previously excluded variables were then re-evaluated for inclusion, using further step-wise bootstrap procedures, resulting in the exclusion of another variable. Finally a Bayesian geo-statistical model using Markov Chain Monte Carlo simulation was fitted to the data, resulting in a final model of three predictor variables, namely summer rainfall, mean annual temperature and altitude. Each was independently and significantly associated with malaria prevalence after allowing for spatial correlation. This model was used to predict malaria prevalence at unobserved locations, producing a smooth risk map for the whole country. Conclusion We have
Colon-Berlingeri, Migdalisel; Burrowes, Patricia A
2011-01-01
Incorporation of mathematics into biology curricula is critical to underscore for undergraduate students the relevance of mathematics to most fields of biology and the usefulness of developing quantitative process skills demanded in modern biology. At our institution, we have made significant changes to better integrate mathematics into the undergraduate biology curriculum. The curricular revision included changes in the suggested course sequence, addition of statistics and precalculus as prerequisites to core science courses, and incorporating interdisciplinary (math-biology) learning activities in genetics and zoology courses. In this article, we describe the activities developed for these two courses and the assessment tools used to measure the learning that took place with respect to biology and statistics. We distinguished the effectiveness of these learning opportunities in helping students improve their understanding of the math and statistical concepts addressed and, more importantly, their ability to apply them to solve a biological problem. We also identified areas that need emphasis in both biology and mathematics courses. In light of our observations, we recommend best practices that biology and mathematics academic departments can implement to train undergraduates for the demands of modern biology.
Effects of quantum coherence on work statistics
Xu, Bao-Ming; Zou, Jian; Guo, Li-Sha; Kong, Xiang-Mu
2018-05-01
In the conventional two-point measurement scheme of quantum thermodynamics, quantum coherence is destroyed by the first measurement. But as we know the coherence really plays an important role in the quantum thermodynamics process, and how to describe the work statistics for a quantum coherent process is still an open question. In this paper, we use the full counting statistics method to investigate the effects of quantum coherence on work statistics. First, we give a general discussion and show that for a quantum coherent process, work statistics is very different from that of the two-point measurement scheme, specifically the average work is increased or decreased and the work fluctuation can be decreased by quantum coherence, which strongly depends on the relative phase, the energy level structure, and the external protocol. Then, we concretely consider a quenched one-dimensional transverse Ising model and show that quantum coherence has a more significant influence on work statistics in the ferromagnetism regime compared with that in the paramagnetism regime, so that due to the presence of quantum coherence the work statistics can exhibit the critical phenomenon even at high temperature.
Statistical learning and selective inference.
Taylor, Jonathan; Tibshirani, Robert J
2015-06-23
We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
Statistical reporting inconsistencies in experimental philosophy.
Colombo, Matteo; Duev, Georgi; Nuijten, Michèle B; Sprenger, Jan
2018-01-01
Experimental philosophy (x-phi) is a young field of research in the intersection of philosophy and psychology. It aims to make progress on philosophical questions by using experimental methods traditionally associated with the psychological and behavioral sciences, such as null hypothesis significance testing (NHST). Motivated by recent discussions about a methodological crisis in the behavioral sciences, questions have been raised about the methodological standards of x-phi. Here, we focus on one aspect of this question, namely the rate of inconsistencies in statistical reporting. Previous research has examined the extent to which published articles in psychology and other behavioral sciences present statistical inconsistencies in reporting the results of NHST. In this study, we used the R package statcheck to detect statistical inconsistencies in x-phi, and compared rates of inconsistencies in psychology and philosophy. We found that rates of inconsistencies in x-phi are lower than in the psychological and behavioral sciences. From the point of view of statistical reporting consistency, x-phi seems to do no worse, and perhaps even better, than psychological science.
Statistical reporting inconsistencies in experimental philosophy
Colombo, Matteo; Duev, Georgi; Nuijten, Michèle B.; Sprenger, Jan
2018-01-01
Experimental philosophy (x-phi) is a young field of research in the intersection of philosophy and psychology. It aims to make progress on philosophical questions by using experimental methods traditionally associated with the psychological and behavioral sciences, such as null hypothesis significance testing (NHST). Motivated by recent discussions about a methodological crisis in the behavioral sciences, questions have been raised about the methodological standards of x-phi. Here, we focus on one aspect of this question, namely the rate of inconsistencies in statistical reporting. Previous research has examined the extent to which published articles in psychology and other behavioral sciences present statistical inconsistencies in reporting the results of NHST. In this study, we used the R package statcheck to detect statistical inconsistencies in x-phi, and compared rates of inconsistencies in psychology and philosophy. We found that rates of inconsistencies in x-phi are lower than in the psychological and behavioral sciences. From the point of view of statistical reporting consistency, x-phi seems to do no worse, and perhaps even better, than psychological science. PMID:29649220
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
Energy Technology Data Exchange (ETDEWEB)
Alwan, Aravind; Aluru, N.R.
2013-12-15
This paper presents a data-driven framework for performing uncertainty quantification (UQ) by choosing a stochastic model that accurately describes the sources of uncertainty in a system. This model is propagated through an appropriate response surface function that approximates the behavior of this system using stochastic collocation. Given a sample of data describing the uncertainty in the inputs, our goal is to estimate a probability density function (PDF) using the kernel moment matching (KMM) method so that this PDF can be used to accurately reproduce statistics like mean and variance of the response surface function. Instead of constraining the PDF to be optimal for a particular response function, we show that we can use the properties of stochastic collocation to make the estimated PDF optimal for a wide variety of response functions. We contrast this method with other traditional procedures that rely on the Maximum Likelihood approach, like kernel density estimation (KDE) and its adaptive modification (AKDE). We argue that this modified KMM method tries to preserve what is known from the given data and is the better approach when the available data is limited in quantity. We test the performance of these methods for both univariate and multivariate density estimation by sampling random datasets from known PDFs and then measuring the accuracy of the estimated PDFs, using the known PDF as a reference. Comparing the output mean and variance estimated with the empirical moments using the raw data sample as well as the actual moments using the known PDF, we show that the KMM method performs better than KDE and AKDE in predicting these moments with greater accuracy. This improvement in accuracy is also demonstrated for the case of UQ in electrostatic and electrothermomechanical microactuators. We show how our framework results in the accurate computation of statistics in micromechanical systems.
International Nuclear Information System (INIS)
Alwan, Aravind; Aluru, N.R.
2013-01-01
This paper presents a data-driven framework for performing uncertainty quantification (UQ) by choosing a stochastic model that accurately describes the sources of uncertainty in a system. This model is propagated through an appropriate response surface function that approximates the behavior of this system using stochastic collocation. Given a sample of data describing the uncertainty in the inputs, our goal is to estimate a probability density function (PDF) using the kernel moment matching (KMM) method so that this PDF can be used to accurately reproduce statistics like mean and variance of the response surface function. Instead of constraining the PDF to be optimal for a particular response function, we show that we can use the properties of stochastic collocation to make the estimated PDF optimal for a wide variety of response functions. We contrast this method with other traditional procedures that rely on the Maximum Likelihood approach, like kernel density estimation (KDE) and its adaptive modification (AKDE). We argue that this modified KMM method tries to preserve what is known from the given data and is the better approach when the available data is limited in quantity. We test the performance of these methods for both univariate and multivariate density estimation by sampling random datasets from known PDFs and then measuring the accuracy of the estimated PDFs, using the known PDF as a reference. Comparing the output mean and variance estimated with the empirical moments using the raw data sample as well as the actual moments using the known PDF, we show that the KMM method performs better than KDE and AKDE in predicting these moments with greater accuracy. This improvement in accuracy is also demonstrated for the case of UQ in electrostatic and electrothermomechanical microactuators. We show how our framework results in the accurate computation of statistics in micromechanical systems
Analysis of statistical misconception in terms of statistical reasoning
Maryati, I.; Priatna, N.
2018-05-01
Reasoning skill is needed for everyone to face globalization era, because every person have to be able to manage and use information from all over the world which can be obtained easily. Statistical reasoning skill is the ability to collect, group, process, interpret, and draw conclusion of information. Developing this skill can be done through various levels of education. However, the skill is low because many people assume that statistics is just the ability to count and using formulas and so do students. Students still have negative attitude toward course which is related to research. The purpose of this research is analyzing students’ misconception in descriptive statistic course toward the statistical reasoning skill. The observation was done by analyzing the misconception test result and statistical reasoning skill test; observing the students’ misconception effect toward statistical reasoning skill. The sample of this research was 32 students of math education department who had taken descriptive statistic course. The mean value of misconception test was 49,7 and standard deviation was 10,6 whereas the mean value of statistical reasoning skill test was 51,8 and standard deviation was 8,5. If the minimal value is 65 to state the standard achievement of a course competence, students’ mean value is lower than the standard competence. The result of students’ misconception study emphasized on which sub discussion that should be considered. Based on the assessment result, it was found that students’ misconception happen on this: 1) writing mathematical sentence and symbol well, 2) understanding basic definitions, 3) determining concept that will be used in solving problem. In statistical reasoning skill, the assessment was done to measure reasoning from: 1) data, 2) representation, 3) statistic format, 4) probability, 5) sample, and 6) association.
Statistical Inference at Work: Statistical Process Control as an Example
Bakker, Arthur; Kent, Phillip; Derry, Jan; Noss, Richard; Hoyles, Celia
2008-01-01
To characterise statistical inference in the workplace this paper compares a prototypical type of statistical inference at work, statistical process control (SPC), with a type of statistical inference that is better known in educational settings, hypothesis testing. Although there are some similarities between the reasoning structure involved in…
What can we learn from noise? - Mesoscopic nonequilibrium statistical physics.
Kobayashi, Kensuke
2016-01-01
Mesoscopic systems - small electric circuits working in quantum regime - offer us a unique experimental stage to explorer quantum transport in a tunable and precise way. The purpose of this Review is to show how they can contribute to statistical physics. We introduce the significance of fluctuation, or equivalently noise, as noise measurement enables us to address the fundamental aspects of a physical system. The significance of the fluctuation theorem (FT) in statistical physics is noted. We explain what information can be deduced from the current noise measurement in mesoscopic systems. As an important application of the noise measurement to statistical physics, we describe our experimental work on the current and current noise in an electron interferometer, which is the first experimental test of FT in quantum regime. Our attempt will shed new light in the research field of mesoscopic quantum statistical physics.
A Statistical Primer: Understanding Descriptive and Inferential Statistics
Gillian Byrne
2007-01-01
As libraries and librarians move more towards evidence‐based decision making, the data being generated in libraries is growing. Understanding the basics of statistical analysis is crucial for evidence‐based practice (EBP), in order to correctly design and analyze researchas well as to evaluate the research of others. This article covers the fundamentals of descriptive and inferential statistics, from hypothesis construction to sampling to common statistical techniques including chi‐square, co...
Solution of the statistical bootstrap with Bose statistics
International Nuclear Information System (INIS)
Engels, J.; Fabricius, K.; Schilling, K.
1977-01-01
A brief and transparent way to introduce Bose statistics into the statistical bootstrap of Hagedorn and Frautschi is presented. The resulting bootstrap equation is solved by a cluster expansion for the grand canonical partition function. The shift of the ultimate temperature due to Bose statistics is determined through an iteration process. We discuss two-particle spectra of the decaying fireball (with given mass) as obtained from its grand microcanonical level density
Mainigi, Sumeet K; Chebrolu, Lakshmi Hima Bindu; Romero-Corral, Abel; Mehta, Vinay; Machado, Rodolfo Rozindo; Konecny, Tomas; Pressman, Gregg S
2012-10-01
Cardiac calcification is associated with coronary artery disease, arrhythmias, conduction disease, and adverse cardiac events. Recently, we have described an echocardiographic-based global cardiac calcification scoring system. The objective of this study was to evaluate the severity of cardiac calcification in patients with permanent pacemakers as based on this scoring system. Patients with a pacemaker implanted within the 2-year study period with a previous echocardiogram were identified and underwent blinded global cardiac calcium scoring. These patients were compared to matched control patients without a pacemaker who also underwent calcium scoring. The study group consisted of 49 patients with pacemaker implantation who were compared to 100 matched control patients. The mean calcium score in the pacemaker group was 3.3 ± 2.9 versus 1.8 ± 2.0 (P = 0.006) in the control group. Univariate and multivariate analysis revealed glomerular filtration rate and calcium scoring to be significant predictors of the presence of a pacemaker. Echocardiographic-based calcium scoring correlates with the presence of severe conduction disease requiring a pacemaker. © 2012, Wiley Periodicals, Inc.
Alternating event processes during lifetimes: population dynamics and statistical inference.
Shinohara, Russell T; Sun, Yifei; Wang, Mei-Cheng
2018-01-01
In the literature studying recurrent event data, a large amount of work has been focused on univariate recurrent event processes where the occurrence of each event is treated as a single point in time. There are many applications, however, in which univariate recurrent events are insufficient to characterize the feature of the process because patients experience nontrivial durations associated with each event. This results in an alternating event process where the disease status of a patient alternates between exacerbations and remissions. In this paper, we consider the dynamics of a chronic disease and its associated exacerbation-remission process over two time scales: calendar time and time-since-onset. In particular, over calendar time, we explore population dynamics and the relationship between incidence, prevalence and duration for such alternating event processes. We provide nonparametric estimation techniques for characteristic quantities of the process. In some settings, exacerbation processes are observed from an onset time until death; to account for the relationship between the survival and alternating event processes, nonparametric approaches are developed for estimating exacerbation process over lifetime. By understanding the population dynamics and within-process structure, the paper provide a new and general way to study alternating event processes.
Descriptive and inferential statistical methods used in burns research.
Al-Benna, Sammy; Al-Ajam, Yazan; Way, Benjamin; Steinstraesser, Lars
2010-05-01
Burns research articles utilise a variety of descriptive and inferential methods to present and analyse data. The aim of this study was to determine the descriptive methods (e.g. mean, median, SD, range, etc.) and survey the use of inferential methods (statistical tests) used in articles in the journal Burns. This study defined its population as all original articles published in the journal Burns in 2007. Letters to the editor, brief reports, reviews, and case reports were excluded. Study characteristics, use of descriptive statistics and the number and types of statistical methods employed were evaluated. Of the 51 articles analysed, 11(22%) were randomised controlled trials, 18(35%) were cohort studies, 11(22%) were case control studies and 11(22%) were case series. The study design and objectives were defined in all articles. All articles made use of continuous and descriptive data. Inferential statistics were used in 49(96%) articles. Data dispersion was calculated by standard deviation in 30(59%). Standard error of the mean was quoted in 19(37%). The statistical software product was named in 33(65%). Of the 49 articles that used inferential statistics, the tests were named in 47(96%). The 6 most common tests used (Student's t-test (53%), analysis of variance/co-variance (33%), chi(2) test (27%), Wilcoxon & Mann-Whitney tests (22%), Fisher's exact test (12%)) accounted for the majority (72%) of statistical methods employed. A specified significance level was named in 43(88%) and the exact significance levels were reported in 28(57%). Descriptive analysis and basic statistical techniques account for most of the statistical tests reported. This information should prove useful in deciding which tests should be emphasised in educating burn care professionals. These results highlight the need for burn care professionals to have a sound understanding of basic statistics, which is crucial in interpreting and reporting data. Advice should be sought from professionals
Szulc, Stefan
1965-01-01
Statistical Methods provides a discussion of the principles of the organization and technique of research, with emphasis on its application to the problems in social statistics. This book discusses branch statistics, which aims to develop practical ways of collecting and processing numerical data and to adapt general statistical methods to the objectives in a given field.Organized into five parts encompassing 22 chapters, this book begins with an overview of how to organize the collection of such information on individual units, primarily as accomplished by government agencies. This text then
Goodman, Joseph W
2015-01-01
This book discusses statistical methods that are useful for treating problems in modern optics, and the application of these methods to solving a variety of such problems This book covers a variety of statistical problems in optics, including both theory and applications. The text covers the necessary background in statistics, statistical properties of light waves of various types, the theory of partial coherence and its applications, imaging with partially coherent light, atmospheric degradations of images, and noise limitations in the detection of light. New topics have been introduced i
All of statistics a concise course in statistical inference
Wasserman, Larry
2004-01-01
This book is for people who want to learn probability and statistics quickly It brings together many of the main ideas in modern statistics in one place The book is suitable for students and researchers in statistics, computer science, data mining and machine learning This book covers a much wider range of topics than a typical introductory text on mathematical statistics It includes modern topics like nonparametric curve estimation, bootstrapping and classification, topics that are usually relegated to follow-up courses The reader is assumed to know calculus and a little linear algebra No previous knowledge of probability and statistics is required The text can be used at the advanced undergraduate and graduate level Larry Wasserman is Professor of Statistics at Carnegie Mellon University He is also a member of the Center for Automated Learning and Discovery in the School of Computer Science His research areas include nonparametric inference, asymptotic theory, causality, and applications to astrophysics, bi...
Southard, Rodney E.
2013-01-01
located in Region 1, 120 were located in Region 2, and 10 were located in Region 3. Streamgages located outside of Missouri were selected to extend the range of data used for the independent variables in the regression analyses. Streamgages included in the regression analyses had 10 or more years of record and were considered to be affected minimally by anthropogenic activities or trends. Regional regression analyses identified three characteristics as statistically significant for the development of regional equations. For Region 1, drainage area, longest flow path, and streamflow-variability index were statistically significant. The range in the standard error of estimate for Region 1 is 79.6 to 94.2 percent. For Region 2, drainage area and streamflow variability index were statistically significant, and the range in the standard error of estimate is 48.2 to 72.1 percent. For Region 3, drainage area and streamflow-variability index also were statistically significant with a range in the standard error of estimate of 48.1 to 96.2 percent. Limitations on the use of estimating low-flow frequency statistics at ungaged locations are dependent on the method used. The first method outlined for use in Missouri, power curve equations, were developed to estimate the selected statistics for ungaged locations on 28 selected streams with multiple streamgages located on the same stream. A second method uses a drainage-area ratio to compute statistics at an ungaged location using data from a single streamgage on the same stream with 10 or more years of record. Ungaged locations on these streams may use the ratio of the drainage area at an ungaged location to the drainage area at a streamgage location to scale the selected statistic value from the streamgage location to the ungaged location. This method can be used if the drainage area of the ungaged location is within 40 to 150 percent of the streamgage drainage area. The third method is the use of the regional regression equations
... Standards Act and Program MQSA Insights MQSA National Statistics Share Tweet Linkedin Pin it More sharing options ... but should level off with time. Archived Scorecard Statistics 2018 Scorecard Statistics 2017 Scorecard Statistics 2016 Scorecard ...
Inverse Statistics in the Foreign Exchange Market
Jensen, M. H.; Johansen, A.; Petroni, F.; Simonsen, I.
2004-01-01
We investigate intra-day foreign exchange (FX) time series using the inverse statistic analysis developed in [1,2]. Specifically, we study the time-averaged distributions of waiting times needed to obtain a certain increase (decrease) $\\rho$ in the price of an investment. The analysis is performed for the Deutsch mark (DM) against the $US for the full year of 1998, but similar results are obtained for the Japanese Yen against the $US. With high statistical significance, the presence of "reson...
Generalized quantum statistics
International Nuclear Information System (INIS)
Chou, C.
1992-01-01
In the paper, a non-anyonic generalization of quantum statistics is presented, in which Fermi-Dirac statistics (FDS) and Bose-Einstein statistics (BES) appear as two special cases. The new quantum statistics, which is characterized by the dimension of its single particle Fock space, contains three consistent parts, namely the generalized bilinear quantization, the generalized quantum mechanical description and the corresponding statistical mechanics
Rumsey, Deborah
2011-01-01
The fun and easy way to get down to business with statistics Stymied by statistics? No fear ? this friendly guide offers clear, practical explanations of statistical ideas, techniques, formulas, and calculations, with lots of examples that show you how these concepts apply to your everyday life. Statistics For Dummies shows you how to interpret and critique graphs and charts, determine the odds with probability, guesstimate with confidence using confidence intervals, set up and carry out a hypothesis test, compute statistical formulas, and more.Tracks to a typical first semester statistics cou
A novel statistic for genome-wide interaction analysis.
Directory of Open Access Journals (Sweden)
Xuesen Wu
2010-09-01
Full Text Available Although great progress in genome-wide association studies (GWAS has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked. The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001
Nick, Todd G
2007-01-01
Statistics is defined by the Medical Subject Headings (MeSH) thesaurus as the science and art of collecting, summarizing, and analyzing data that are subject to random variation. The two broad categories of summarizing and analyzing data are referred to as descriptive and inferential statistics. This chapter considers the science and art of summarizing data where descriptive statistics and graphics are used to display data. In this chapter, we discuss the fundamentals of descriptive statistics, including describing qualitative and quantitative variables. For describing quantitative variables, measures of location and spread, for example the standard deviation, are presented along with graphical presentations. We also discuss distributions of statistics, for example the variance, as well as the use of transformations. The concepts in this chapter are useful for uncovering patterns within the data and for effectively presenting the results of a project.
Statistical Power in Plant Pathology Research.
Gent, David H; Esker, Paul D; Kriss, Alissa B
2018-01-01
In null hypothesis testing, failure to reject a null hypothesis may have two potential interpretations. One interpretation is that the treatments being evaluated do not have a significant effect, and a correct conclusion was reached in the analysis. Alternatively, a treatment effect may have existed but the conclusion of the study was that there was none. This is termed a Type II error, which is most likely to occur when studies lack sufficient statistical power to detect a treatment effect. In basic terms, the power of a study is the ability to identify a true effect through a statistical test. The power of a statistical test is 1 - (the probability of Type II errors), and depends on the size of treatment effect (termed the effect size), variance, sample size, and significance criterion (the probability of a Type I error, α). Low statistical power is prevalent in scientific literature in general, including plant pathology. However, power is rarely reported, creating uncertainty in the interpretation of nonsignificant results and potentially underestimating small, yet biologically significant relationships. The appropriate level of power for a study depends on the impact of Type I versus Type II errors and no single level of power is acceptable for all purposes. Nonetheless, by convention 0.8 is often considered an acceptable threshold and studies with power less than 0.5 generally should not be conducted if the results are to be conclusive. The emphasis on power analysis should be in the planning stages of an experiment. Commonly employed strategies to increase power include increasing sample sizes, selecting a less stringent threshold probability for Type I errors, increasing the hypothesized or detectable effect size, including as few treatment groups as possible, reducing measurement variability, and including relevant covariates in analyses. Power analysis will lead to more efficient use of resources and more precisely structured hypotheses, and may even
Computing Confidence Bounds for Power and Sample Size of the General Linear Univariate Model
Taylor, Douglas J.; Muller, Keith E.
1995-01-01
The power of a test, the probability of rejecting the null hypothesis in favor of an alternative, may be computed using estimates of one or more distributional parameters. Statisticians frequently fix mean values and calculate power or sample size using a variance estimate from an existing study. Hence computed power becomes a random variable for a fixed sample size. Likewise, the sample size necessary to achieve a fixed power varies randomly. Standard statistical practice requires reporting ...
Statistical inference for template aging
Schuckers, Michael E.
2006-04-01
A change in classification error rates for a biometric device is often referred to as template aging. Here we offer two methods for determining whether the effect of time is statistically significant. The first of these is the use of a generalized linear model to determine if these error rates change linearly over time. This approach generalizes previous work assessing the impact of covariates using generalized linear models. The second approach uses of likelihood ratio tests methodology. The focus here is on statistical methods for estimation not the underlying cause of the change in error rates over time. These methodologies are applied to data from the National Institutes of Standards and Technology Biometric Score Set Release 1. The results of these applications are discussed.
Testing statistical self-similarity in the topology of river networks
Troutman, Brent M.; Mantilla, Ricardo; Gupta, Vijay K.
2010-01-01
Recent work has demonstrated that the topological properties of real river networks deviate significantly from predictions of Shreve's random model. At the same time the property of mean self-similarity postulated by Tokunaga's model is well supported by data. Recently, a new class of network model called random self-similar networks (RSN) that combines self-similarity and randomness has been introduced to replicate important topological features observed in real river networks. We investigate if the hypothesis of statistical self-similarity in the RSN model is supported by data on a set of 30 basins located across the continental United States that encompass a wide range of hydroclimatic variability. We demonstrate that the generators of the RSN model obey a geometric distribution, and self-similarity holds in a statistical sense in 26 of these 30 basins. The parameters describing the distribution of interior and exterior generators are tested to be statistically different and the difference is shown to produce the well-known Hack's law. The inter-basin variability of RSN parameters is found to be statistically significant. We also test generator dependence on two climatic indices, mean annual precipitation and radiative index of dryness. Some indication of climatic influence on the generators is detected, but this influence is not statistically significant with the sample size available. Finally, two key applications of the RSN model to hydrology and geomorphology are briefly discussed.
Gene cluster statistics with gene families.
Raghupathy, Narayanan; Durand, Dannie
2009-05-01
Identifying genomic regions that descended from a common ancestor is important for understanding the function and evolution of genomes. In distantly related genomes, clusters of homologous gene pairs are evidence of candidate homologous regions. Demonstrating the statistical significance of such "gene clusters" is an essential component of comparative genomic analyses. However, currently there are no practical statistical tests for gene clusters that model the influence of the number of homologs in each gene family on cluster significance. In this work, we demonstrate empirically that failure to incorporate gene family size in gene cluster statistics results in overestimation of significance, leading to incorrect conclusions. We further present novel analytical methods for estimating gene cluster significance that take gene family size into account. Our methods do not require complete genome data and are suitable for testing individual clusters found in local regions, such as contigs in an unfinished assembly. We consider pairs of regions drawn from the same genome (paralogous clusters), as well as regions drawn from two different genomes (orthologous clusters). Determining cluster significance under general models of gene family size is computationally intractable. By assuming that all gene families are of equal size, we obtain analytical expressions that allow fast approximation of cluster probabilities. We evaluate the accuracy of this approximation by comparing the resulting gene cluster probabilities with cluster probabilities obtained by simulating a realistic, power-law distributed model of gene family size, with parameters inferred from genomic data. Surprisingly, despite the simplicity of the underlying assumption, our method accurately approximates the true cluster probabilities. It slightly overestimates these probabilities, yielding a conservative test. We present additional simulation results indicating the best choice of parameter values for data
Directory of Open Access Journals (Sweden)
Justin London
2010-01-01
Full Text Available In “National Metrical Types in Nineteenth Century Art Song” Leigh Van Handel gives a sympathetic critique of William Rothstein’s claim that in western classical music of the late 18th and 19th centuries there are discernable differences in the phrasing and metrical practice of German versus French and Italian composers. This commentary (a examines just what Rothstein means in terms of his proposed metrical typology, (b questions Van Handel on how she has applied it to a purely melodic framework, (c amplifies Van Handel’s critique of Rothstein, and then (d concludes with a rumination on the reach of quantitative (i.e., statistically-driven versus qualitative claims regarding such things as “national metrical types.”
Statistical characterization report for Single-Shell Tank 241-T-107
International Nuclear Information System (INIS)
Cromar, R.D.; Wilmarth, S.R.; Jensen, L.
1994-01-01
This report contains the results of the statistical analysis of data from three core samples obtained from single-shell tank 241-T-107 (T-107). Four specific topics are addressed. They are summarized below. Section 3.0 contains mean concentration estimates of analytes found in T-107. The estimates of open-quotes errorclose quotes associated with the concentration estimates are given as 95% confidence intervals (CI) on the mean. The results given are based on three types of samples: core composite samples, core segment samples, and drainable liquid samples. Section 4.0 contains estimates of the spatial variability (variability between cores and between segments) and the analytical variability (variability between the primary and the duplicate analysis). Statistical tests were performed to test the hypothesis that the between cores and the between segments spatial variability is zero. The results of the tests are as follows. Based on the core composite data, the between cores variance is significantly different from zero for 35 out of 74 analytes; i.e., for 53% of the analytes there is no statistically significant difference between the concentration means for two cores. Based on core segment data, the between segments variance is significantly different from zero for 22 out of 24 analytes and the between cores variance is significantly different from zero for 4 out of 24 analytes; i.e., for 8% of the analytes there is no statistically significant difference between segment means and for 83% of the analytes there is no difference between the means from the three cores. Section 5.0 contains the results of the application of multiple comparison methods to the core composite data, the core segment data, and the drainable liquid data. Section 6.0 contains the results of a statistical test conducted to determine the 222-S Analytical Laboratory's ability to homogenize solid core segments
Information Statistics in Schools Educate your students about the value and everyday use of statistics. The Statistics in Schools program provides resources for teaching and learning with real life data. Explore the site for standards-aligned, classroom-ready activities. Statistics in Schools Math Activities History
Testing Genetic Pleiotropy with GWAS Summary Statistics for Marginal and Conditional Analyses.
Deng, Yangqing; Pan, Wei
2017-12-01
There is growing interest in testing genetic pleiotropy, which is when a single genetic variant influences multiple traits. Several methods have been proposed; however, these methods have some limitations. First, all the proposed methods are based on the use of individual-level genotype and phenotype data; in contrast, for logistical, and other, reasons, summary statistics of univariate SNP-trait associations are typically only available based on meta- or mega-analyzed large genome-wide association study (GWAS) data. Second, existing tests are based on marginal pleiotropy, which cannot distinguish between direct and indirect associations of a single genetic variant with multiple traits due to correlations among the traits. Hence, it is useful to consider conditional analysis, in which a subset of traits is adjusted for another subset of traits. For example, in spite of substantial lowering of low-density lipoprotein cholesterol (LDL) with statin therapy, some patients still maintain high residual cardiovascular risk, and, for these patients, it might be helpful to reduce their triglyceride (TG) level. For this purpose, in order to identify new therapeutic targets, it would be useful to identify genetic variants with pleiotropic effects on LDL and TG after adjusting the latter for LDL; otherwise, a pleiotropic effect of a genetic variant detected by a marginal model could simply be due to its association with LDL only, given the well-known correlation between the two types of lipids. Here, we develop a new pleiotropy testing procedure based only on GWAS summary statistics that can be applied for both marginal analysis and conditional analysis. Although the main technical development is based on published union-intersection testing methods, care is needed in specifying conditional models to avoid invalid statistical estimation and inference. In addition to the previously used likelihood ratio test, we also propose using generalized estimating equations under the
Gulf War Illness as a Brain Autoimmune Disorder
2017-10-01
the data using the IBM- SPSS statistical package (version 23). More specifically, we carried out a univariate and a multivariate analysis of...statistical methods were used to analyze the data, including analysis of covariance (ANCOVA). The follow- ing packages were employed: IBM- SPSS statistical
Ing, Alex; Schwarzbauer, Christian
2014-01-01
Functional connectivity has become an increasingly important area of research in recent years. At a typical spatial resolution, approximately 300 million connections link each voxel in the brain with every other. This pattern of connectivity is known as the functional connectome. Connectivity is often compared between experimental groups and conditions. Standard methods used to control the type 1 error rate are likely to be insensitive when comparisons are carried out across the whole connectome, due to the huge number of statistical tests involved. To address this problem, two new cluster based methods--the cluster size statistic (CSS) and cluster mass statistic (CMS)--are introduced to control the family wise error rate across all connectivity values. These methods operate within a statistical framework similar to the cluster based methods used in conventional task based fMRI. Both methods are data driven, permutation based and require minimal statistical assumptions. Here, the performance of each procedure is evaluated in a receiver operator characteristic (ROC) analysis, utilising a simulated dataset. The relative sensitivity of each method is also tested on real data: BOLD (blood oxygen level dependent) fMRI scans were carried out on twelve subjects under normal conditions and during the hypercapnic state (induced through the inhalation of 6% CO2 in 21% O2 and 73%N2). Both CSS and CMS detected significant changes in connectivity between normal and hypercapnic states. A family wise error correction carried out at the individual connection level exhibited no significant changes in connectivity.
Statistical Damage Detection of Civil Engineering Structures using ARMAV Models
DEFF Research Database (Denmark)
Andersen, P.; Kirkegaard, Poul Henning
In this paper a statistically based damage detection of a lattice steel mast is performed. By estimation of the modal parameters and their uncertainties it is possible to detect whether some of the modal parameters have changed with a statistical significance. The estimation of the uncertainties ...
Clinicopathological significance of c-MYC in esophageal squamous cell carcinoma.
Lian, Yu; Niu, Xiangdong; Cai, Hui; Yang, Xiaojun; Ma, Haizhong; Ma, Shixun; Zhang, Yupeng; Chen, Yifeng
2017-07-01
Esophageal squamous cell carcinoma is one of the most common malignant tumors. The oncogene c-MYC is thought to be important in the initiation, promotion, and therapy resistance of cancer. In this study, we aim to investigate the clinicopathologic roles of c-MYC in esophageal squamous cell carcinoma tissue. This study is aimed at discovering and analyzing c-MYC expression in a series of human esophageal tissues. A total of 95 esophageal squamous cell carcinoma samples were analyzed by the western blotting and immunohistochemistry techniques. Then, correlation of c-MYC expression with clinicopathological features of esophageal squamous cell carcinoma patients was statistically analyzed. In most esophageal squamous cell carcinoma cases, the c-MYC expression was positive in tumor tissues. The positive rate of c-MYC expression in tumor tissues was 61.05%, obviously higher than the adjacent normal tissues (8.42%, 8/92) and atypical hyperplasia tissues (19.75%, 16/95). There was a statistical difference among adjacent normal tissues, atypical hyperplasia tissues, and tumor tissues. Overexpression of the c-MYC was detected in 61.05% (58/95) esophageal squamous cell carcinomas, which was significantly correlated with the degree of differentiation (p = 0.004). The positive rate of c-MYC expression was 40.0% in well-differentiated esophageal tissues, with a significantly statistical difference (p = 0.004). The positive rate of c-MYC was 41.5% in T1 + T2 esophageal tissues and 74.1% in T3 + T4 esophageal tissues, with a significantly statistical difference (p = 0.001). The positive rate of c-MYC was 45.0% in I + II esophageal tissues and 72.2% in III + IV esophageal tissues, with a significantly statistical difference (p = 0.011). The c-MYC expression strongly correlated with clinical staging (p = 0.011), differentiation degree (p = 0.004), lymph node metastasis (p = 0.003), and invasion depth (p = 0.001) of patients with esophageal squamous cell carcinoma. The c-MYC was
Energy Technology Data Exchange (ETDEWEB)
Paul, D. [SSBB and Senior Member-ASQ, Kolkata (India); Mandal, S.N. [Kalyani Govt Engg College, Kalyani (India); Mukherjee, D.; Bhadra Chaudhuri, S.R. [Dept of E. and T. C. Engg, B.E.S.U., Shibpur (India)
2010-10-15
System efficiency and payback time are yet to attain a commercially viable level for solar photovoltaic energy projects. Despite huge development in prediction of solar radiation data, there is a gap in extraction of pertinent information from such data. Hence the available data cannot be effectively utilized for engineering application. This is acting as a barrier for the emerging technology. For making accurate engineering and financial calculations regarding any solar energy project, it is crucial to identify and optimize the most significant statistic(s) representing insolation availability by the Photovoltaic setup at the installation site. Quality Function Deployment (QFD) technique has been applied for identifying the statistic(s), which are of high significance from a project designer's point of view. A MATLAB trademark program has been used to build the annual frequency distribution of hourly insolation over any module plane at a given location. Descriptive statistical analysis of such distributions is done through MINITAB trademark. For Building Integrated Photo Voltaic (BIPV) installation, similar statistical analysis has been carried out for the composite frequency distribution, which is formed by weighted summation of insolation distributions for different module planes used in the installation. Vital most influential statistic(s) of the composite distribution have been optimized through Artificial Neural Network computation. This approach is expected to open up a new horizon in BIPV system design. (author)
PRIS-STATISTICS: Power Reactor Information System Statistical Reports. User's Manual
International Nuclear Information System (INIS)
2013-01-01
The IAEA developed the Power Reactor Information System (PRIS)-Statistics application to assist PRIS end users with generating statistical reports from PRIS data. Statistical reports provide an overview of the status, specification and performance results of every nuclear power reactor in the world. This user's manual was prepared to facilitate the use of the PRIS-Statistics application and to provide guidelines and detailed information for each report in the application. Statistical reports support analyses of nuclear power development and strategies, and the evaluation of nuclear power plant performance. The PRIS database can be used for comprehensive trend analyses and benchmarking against best performers and industrial standards.
Weller, Daniel; Wiedmann, Martin; Strawn, Laura K
2015-06-01
Environmental (i.e., meteorological and landscape) factors and management practices can affect the prevalence of foodborne pathogens in produce production environments. This study was conducted to determine the prevalence of Listeria monocytogenes, Listeria species (including L. monocytogenes), Salmonella, and Shiga toxin-producing Escherichia coli (STEC) in produce production environments and to identify environmental factors and management practices associated with their isolation. Ten produce farms in New York State were sampled during a 6-week period in 2010, and 124 georeferenced samples (80 terrestrial, 33 water, and 11 fecal) were collected. L. monocytogenes, Listeria spp., Salmonella, and STEC were detected in 16, 44, 4, and 5% of terrestrial samples, 30, 58, 12, and 3% of water samples, and 45, 45, 27, and 9% of fecal samples, respectively. Environmental factors and management practices were evaluated for their association with terrestrial samples positive for L. monocytogenes or other Listeria species by univariate logistic regression; analysis was not conducted for Salmonella or STEC because the number of samples positive for these pathogens was low. Although univariate analysis identified associations between isolation of L. monocytogenes or Listeria spp. from terrestrial samples and various water-related factors (e.g., proximity to wetlands and precipitation), multivariate analysis revealed that only irrigation within 3 days of sample collection was significantly associated with isolation of L. monocytogenes (odds ratio = 39) and Listeria spp. (odds ratio = 5) from terrestrial samples. These findings suggest that intervention at the irrigation level may reduce the risk of produce contamination.
Sadovskii, Michael V
2012-01-01
This volume provides a compact presentation of modern statistical physics at an advanced level. Beginning with questions on the foundations of statistical mechanics all important aspects of statistical physics are included, such as applications to ideal gases, the theory of quantum liquids and superconductivity and the modern theory of critical phenomena. Beyond that attention is given to new approaches, such as quantum field theory methods and non-equilibrium problems.
Koh, Young Wha; Park, Seong Yong; Hyun, Seung Hyup; Lee, Su Jin
2018-02-01
We evaluated the association between positron emission tomography (PET) textural features and glucose transporter 1 (GLUT1) expression level and further investigated the prognostic significance of textural features in lung adenocarcinoma. We evaluated 105 adenocarcinoma patients. We extracted texture-based PET parameters of primary tumors. Conventional PET parameters were also measured. The relationships between PET parameters and GLUT1 expression levels were evaluated. The association between PET parameters and overall survival (OS) was assessed using Cox's proportional hazard regression models. In terms of PET textural features, tumors expressing high levels of GLUT1 exhibited significantly lower coarseness, contrast, complexity, and strength, but significantly higher busyness. On univariate analysis, the metabolic tumor volume, total lesion glycolysis, contrast, busyness, complexity, and strength were significant predictors of OS. Multivariate analysis showed that lower complexity (HR=2.017, 95%CI=1.032-3.942, p=0.040) was independently associated with poorer survival. PET textural features may aid risk stratification in lung adenocarcinoma patients. Copyright© 2018, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
Acute exacerbation of idiopathic pulmonary fibrosis: high-resolution CT scores predict mortality
International Nuclear Information System (INIS)
Fujimoto, Kiminori; Taniguchi, Hiroyuki; Kondoh, Yasuhiro; Kataoka, Kensuke; Johkoh, Takeshi; Ichikado, Kazuya; Sumikawa, Hiromitsu; Ogura, Takashi; Endo, Takahiro; Kawaguchi, Atsushi; Mueller, Nestor L.
2012-01-01
To determine high-resolution computed tomography (HRCT) findings helpful in predicting mortality in patients with acute exacerbation of idiopathic pulmonary fibrosis (AEx-IPF). Sixty patients with diagnosis of AEx-IPF were reviewed retrospectively. Two groups (two observers each) independently evaluated pattern, distribution, and extent of HRCT findings at presentation and calculated an HRCT score at AEx based on normal attenuation areas and extent of abnormalities, such as areas of ground-glass attenuation and/or consolidation with or without traction bronchiectasis or bronchiolectasis and areas of honeycombing. The correlation between the clinical data including the HRCT score and mortality (cause-specific survival) was evaluated using the univariate and multivariate Cox-regression analyses. Serum KL-6 level, PaCO 2 , and the HRCT score were statistically significant predictors on univariate analysis. Multivariate analysis revealed that the HRCT score was an independently significant predictor of outcome (hazard ratio, 1.13; 95% confidence interval, 1.06-1.19, P = 0.0002). The area under receiver operating characteristics curve for the HRCT score was statistically significant in the classification of survivors or nonsurvivors (0.944; P < 0.0001). Survival in patients with HRCT score ≥245 was worse than those with lower score (log-rank test, P < 0.0001). The HRCT score at AEx is independently related to prognosis in patients with AEx-IPF. (orig.)
Al-Shayyab, Mohammad H; Baqain, Zaid H
2018-04-01
The aim of this study was to assess the influence of patients' and surgical variables on the onset and duration of action of local anesthesia (LA) in mandibular third-molar (M3) surgery. Patients scheduled for mandibular M3 surgery were considered for inclusion in this prospective cohort study. Patients' and surgical variables were recorded. Two per cent (2%) lidocaine with 1:100,000 epinephrine was used to block the nerves for extraction of mandibular M3. Then, the onset of action and duration of LA were monitored. Univariate analysis and multivariate regression analysis were used to analyze the data. The final cohort included 88 subjects (32 men and 56 women; mean age ± SD = 29.3 ± 12.3 yr). With univariate analysis, age, gender, body mass index (BMI), smoking quantity and duration, operation time, and 'volume of local anesthetic needed' significantly influenced the onset of action and duration of LA. Multivariate regression revealed that age and smoking quantity were the only statistically significant predictors of the onset of action of LA, whereas age, smoking quantity, and 'volume of local anesthetic needed' were the only statistically significant predictors of duration of LA. Further studies are recommended to uncover other predictors of the onset of action and duration of LA. © 2018 Eur J Oral Sci.
Acute exacerbation of idiopathic pulmonary fibrosis: high-resolution CT scores predict mortality
Energy Technology Data Exchange (ETDEWEB)
Fujimoto, Kiminori [Kurume University School of Medicine, and Center for Diagnostic Imaging, Kurume University Hospital, Department of Radiology, Kurume, Fukuoka (Japan); Taniguchi, Hiroyuki; Kondoh, Yasuhiro; Kataoka, Kensuke [Tosei General Hospital, Department of Respiratory Medicine and Allergy, Seto, Aichi (Japan); Johkoh, Takeshi [Kinki Central Hospital of Mutual Aid Association of Public School Teachers, Department of Radiology, Itami (Japan); Ichikado, Kazuya [Saiseikai Kumamoto Hospital, Division of Respiratory Medicine, Kumamoto (Japan); Sumikawa, Hiromitsu [Osaka University Graduate School of Medicine, Department of Radiology, Suita, Osaka (Japan); Ogura, Takashi; Endo, Takahiro [Kanagawa Cardiovascular and Respiratory Center, Department of Respiratory Medicine, Yokohama, Kanagawa (Japan); Kawaguchi, Atsushi [Kurume University School of Medicine, Biostatistics Center, Kurume (Japan); Mueller, Nestor L. [University of British Columbia and Vancouver General Hospital, Department of Radiology, Vancouver, B.C. (Canada)
2012-01-15
To determine high-resolution computed tomography (HRCT) findings helpful in predicting mortality in patients with acute exacerbation of idiopathic pulmonary fibrosis (AEx-IPF). Sixty patients with diagnosis of AEx-IPF were reviewed retrospectively. Two groups (two observers each) independently evaluated pattern, distribution, and extent of HRCT findings at presentation and calculated an HRCT score at AEx based on normal attenuation areas and extent of abnormalities, such as areas of ground-glass attenuation and/or consolidation with or without traction bronchiectasis or bronchiolectasis and areas of honeycombing. The correlation between the clinical data including the HRCT score and mortality (cause-specific survival) was evaluated using the univariate and multivariate Cox-regression analyses. Serum KL-6 level, PaCO{sub 2}, and the HRCT score were statistically significant predictors on univariate analysis. Multivariate analysis revealed that the HRCT score was an independently significant predictor of outcome (hazard ratio, 1.13; 95% confidence interval, 1.06-1.19, P = 0.0002). The area under receiver operating characteristics curve for the HRCT score was statistically significant in the classification of survivors or nonsurvivors (0.944; P < 0.0001). Survival in patients with HRCT score {>=}245 was worse than those with lower score (log-rank test, P < 0.0001). The HRCT score at AEx is independently related to prognosis in patients with AEx-IPF. (orig.)
Behavioral investment strategy matters: a statistical arbitrage approach
Sun, David; Tsai, Shih-Chuan; Wang, Wei
2011-01-01
In this study, we employ a statistical arbitrage approach to demonstrate that momentum investment strategy tend to work better in periods longer than six months, a result different from findings in past literature. Compared with standard parametric tests, the statistical arbitrage method produces more clearly that momentum strategies work only in longer formation and holding periods. Also they yield positive significant returns in an up market, but negative yet insignificant returns in a down...
Energy Technology Data Exchange (ETDEWEB)
Kyle, Jennifer E. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Casey, Cameron P. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Stratton, Kelly G. [National Security Directorate, Pacific Northwest National Laboratory, Richland WA USA; Zink, Erika M. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Kim, Young-Mo [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Zheng, Xueyun [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Monroe, Matthew E. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Weitz, Karl K. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Bloodsworth, Kent J. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Orton, Daniel J. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Ibrahim, Yehia M. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Moore, Ronald J. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Lee, Christine G. [Department of Medicine, Bone and Mineral Unit, Oregon Health and Science University, Portland OR USA; Research Service, Portland Veterans Affairs Medical Center, Portland OR USA; Pedersen, Catherine [Department of Medicine, Bone and Mineral Unit, Oregon Health and Science University, Portland OR USA; Orwoll, Eric [Department of Medicine, Bone and Mineral Unit, Oregon Health and Science University, Portland OR USA; Smith, Richard D. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Burnum-Johnson, Kristin E. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA; Baker, Erin S. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA
2017-02-05
The use of dried blood spots (DBS) has many advantages over traditional plasma and serum samples such as smaller blood volume required, storage at room temperature, and ability for sampling in remote locations. However, understanding the robustness of different analytes in DBS samples is essential, especially in older samples collected for longitudinal studies. Here we analyzed DBS samples collected in 2000-2001 and stored at room temperature and compared them to matched serum samples stored at -80°C to determine if they could be effectively used as specific time points in a longitudinal study following metabolic disease. Four hundred small molecules were identified in both the serum and DBS samples using gas chromatograph-mass spectrometry (GC-MS), liquid chromatography-MS (LC-MS) and LC-ion mobility spectrometry-MS (LC-IMS-MS). The identified polar metabolites overlapped well between the sample types, though only one statistically significant polar metabolite in a case-control study was conserved, indicating degradation occurs in the DBS samples affecting quantitation. Differences in the lipid identifications indicated that some oxidation occurs in the DBS samples. However, thirty-six statistically significant lipids correlated in both sample types indicating that lipid quantitation was more stable across the sample types.
International Nuclear Information System (INIS)
Venkataraman, G.
1992-01-01
Treating radiation gas as a classical gas, Einstein derived Planck's law of radiation by considering the dynamic equilibrium between atoms and radiation. Dissatisfied with this treatment, S.N. Bose derived Plank's law by another original way. He treated the problem in generality: he counted how many cells were available for the photon gas in phase space and distributed the photons into these cells. In this manner of distribution, there were three radically new ideas: The indistinguishability of particles, the spin of the photon (with only two possible orientations) and the nonconservation of photon number. This gave rise to a new discipline of quantum statistical mechanics. Physics underlying Bose's discovery, its significance and its role in development of the concept of ideal gas, spin-statistics theorem and spin particles are described. The book has been written in a simple and direct language in an informal style aiming to stimulate the curiosity of a reader. (M.G.B.)
Giuca, Maria Rita; Cappè, Maria; Carli, Elisabetta; Lardani, Lisa
2018-01-01
Aim The purpose of the present study was to evaluate the clinical defects and etiological factors potentially involved in the onset of MIH in a pediatric sample. Methods 120 children, selected from the university dental clinic, were included: 60 children (25 boys and 35 girls; average age: 9.8 ± 1.8 years) with MIH formed the test group and 60 children (27 boys and 33 girls; average age: 10.1 ± 2 years) without MIH constituted the control group. Distribution and severity of MIH defects were evaluated, and a questionnaire was used to investigate the etiological variables; chi-square, univariate, and multivariate statistical tests were performed (significance level set at p MIH defects: 55 molars and 75 incisors showed mild defects, 91 molars and 20 incisors had moderate lesions, and 40 molars and 3 incisors showed severe lesions. Univariate and multivariate statistical analysis showed a significant association (p MIH and ear, nose, and throat (ENT) disorders and the antibiotics used during pregnancy (0.019). Conclusions Moderate defects were more frequent in the molars, while mild lesions were more frequent in the incisors. Antibiotics used during pregnancy and ENT may be directly involved in the etiology of MIH in children. PMID:29861729
Statistical trend analysis methodology for rare failures in changing technical systems
International Nuclear Information System (INIS)
Ott, K.O.; Hoffmann, H.J.
1983-07-01
A methodology for a statistical trend analysis (STA) in failure rates is presented. It applies primarily to relatively rare events in changing technologies or components. The formulation is more general and the assumptions are less restrictive than in a previously published version. Relations of the statistical analysis and probabilistic assessment (PRA) are discussed in terms of categorization of decisions for action following particular failure events. The significance of tentatively identified trends is explored. In addition to statistical tests for trend significance, a combination of STA and PRA results quantifying the trend complement is proposed. The STA approach is compared with other concepts for trend characterization. (orig.)
Statistical distribution for generalized ideal gas of fractional-statistics particles
International Nuclear Information System (INIS)
Wu, Y.
1994-01-01
We derive the occupation-number distribution in a generalized ideal gas of particles obeying fractional statistics, including mutual statistics, by adopting a state-counting definition. When there is no mutual statistics, the statistical distribution interpolates between bosons and fermions, and respects a fractional exclusion principle (except for bosons). Anyons in a strong magnetic field at low temperatures constitute such a physical system. Applications to the thermodynamic properties of quasiparticle excitations in the Laughlin quantum Hall fluid are discussed
... Watchdog Ratings Feedback Contact Select Page Childhood Cancer Statistics Home > Cancer Resources > Childhood Cancer Statistics Childhood Cancer Statistics – Graphs and Infographics Number of Diagnoses Incidence Rates ...
Critical analysis of adsorption data statistically
Kaushal, Achla; Singh, S. K.
2017-10-01
Experimental data can be presented, computed, and critically analysed in a different way using statistics. A variety of statistical tests are used to make decisions about the significance and validity of the experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data was analysed statistically by hypothesis testing applying t test, paired t test and Chi-square test to (a) test the optimum value of the process pH, (b) verify the success of experiment and (c) study the effect of adsorbent dose in zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ 2 showed the results in favour of the data collected from the experiment and this has been shown on probability charts. K value for Langmuir isotherm was 0.8582 and m value for Freundlich adsorption isotherm obtained was 0.725, both are mango leaf powder.
Sun, Ying; Genton, Marc G.; Nychka, Douglas W.
2012-01-01
© 2012 John Wiley & Sons, Ltd. Band depth is an important nonparametric measure that generalizes order statistics and makes univariate methods based on order statistics possible for functional data. However, the computational burden of band depth
Can Money Buy Happiness? A Statistical Analysis of Predictors for User Satisfaction
Hunter, Ben; Perret, Robert
2011-01-01
2007 data from LibQUAL+[TM] and the ACRL Library Trends and Statistics database were analyzed to determine if there is a statistically significant correlation between library expenditures and usage statistics and library patron satisfaction across 73 universities. The results show that users of larger, better funded libraries have higher…
Nonparametric statistical inference
Gibbons, Jean Dickinson
2010-01-01
Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference.-Eugenia Stoimenova, Journal of Applied Statistics, June 2012… one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students.-Biometrics, 67, September 2011This excellently presente
Dowdy, Shirley; Chilko, Daniel
2011-01-01
Praise for the Second Edition "Statistics for Research has other fine qualities besides superior organization. The examples and the statistical methods are laid out with unusual clarity by the simple device of using special formats for each. The book was written with great care and is extremely user-friendly."-The UMAP Journal Although the goals and procedures of statistical research have changed little since the Second Edition of Statistics for Research was published, the almost universal availability of personal computers and statistical computing application packages have made it possible f
Griffiths, Dawn
2009-01-01
Wouldn't it be great if there were a statistics book that made histograms, probability distributions, and chi square analysis more enjoyable than going to the dentist? Head First Statistics brings this typically dry subject to life, teaching you everything you want and need to know about statistics through engaging, interactive, and thought-provoking material, full of puzzles, stories, quizzes, visual aids, and real-world examples. Whether you're a student, a professional, or just curious about statistical analysis, Head First's brain-friendly formula helps you get a firm grasp of statistics
A Nineteenth Century Statistical Society that Abandoned Statistics
Stamhuis, I.H.
2007-01-01
In 1857, a Statistical Society was founded in the Netherlands. Within this society, statistics was considered a systematic, quantitative, and qualitative description of society. In the course of time, the society attracted a wide and diverse membership, although the number of physicians on its rolls
Dai, Mingwei; Ming, Jingsi; Cai, Mingxuan; Liu, Jin; Yang, Can; Wan, Xiang; Xu, Zongben
2017-09-15
Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, known as 'polygenicity'. Tens of thousands of samples are often required to ensure statistical power of identifying these variants with small effects. However, it is often the case that a research group can only get approval for the access to individual-level genotype data with a limited sample size (e.g. a few hundreds or thousands). Meanwhile, summary statistics generated using single-variant-based analysis are becoming publicly available. The sample sizes associated with the summary statistics datasets are usually quite large. How to make the most efficient use of existing abundant data resources largely remains an open question. In this study, we propose a statistical approach, IGESS, to increasing statistical power of identifying risk variants and improving accuracy of risk prediction by i ntegrating individual level ge notype data and s ummary s tatistics. An efficient algorithm based on variational inference is developed to handle the genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over the methods which take either individual-level data or summary statistics data as input. We applied IGESS to perform integrative analysis of Crohns Disease from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and improve the risk prediction accuracy from 63.2% ( ±0.4% ) to 69.4% ( ±0.1% ) using about 240 000 variants. The IGESS software is available at https://github.com/daviddaigithub/IGESS . zbxu@xjtu.edu.cn or xwan@comp.hkbu.edu.hk or eeyang@hkbu.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Naghshpour, Shahdad
2012-01-01
Statistics is the branch of mathematics that deals with real-life problems. As such, it is an essential tool for economists. Unfortunately, the way you and many other economists learn the concept of statistics is not compatible with the way economists think and learn. The problem is worsened by the use of mathematical jargon and complex derivations. Here's a book that proves none of this is necessary. All the examples and exercises in this book are constructed within the field of economics, thus eliminating the difficulty of learning statistics with examples from fields that have no relation to business, politics, or policy. Statistics is, in fact, not more difficult than economics. Anyone who can comprehend economics can understand and use statistics successfully within this field, including you! This book utilizes Microsoft Excel to obtain statistical results, as well as to perform additional necessary computations. Microsoft Excel is not the software of choice for performing sophisticated statistical analy...
Active Learning with Rationales for Identifying Operationally Significant Anomalies in Aviation
Sharma, Manali; Das, Kamalika; Bilgic, Mustafa; Matthews, Bryan; Nielsen, David Lynn; Oza, Nikunj C.
2016-01-01
A major focus of the commercial aviation community is discovery of unknown safety events in flight operations data. Data-driven unsupervised anomaly detection methods are better at capturing unknown safety events compared to rule-based methods which only look for known violations. However, not all statistical anomalies that are discovered by these unsupervised anomaly detection methods are operationally significant (e.g., represent a safety concern). Subject Matter Experts (SMEs) have to spend significant time reviewing these statistical anomalies individually to identify a few operationally significant ones. In this paper we propose an active learning algorithm that incorporates SME feedback in the form of rationales to build a classifier that can distinguish between uninteresting and operationally significant anomalies. Experimental evaluation on real aviation data shows that our approach improves detection of operationally significant events by as much as 75% compared to the state-of-the-art. The learnt classifier also generalizes well to additional validation data sets.
Prevalence of significant bacteriuria among symptomatic and ...
African Journals Online (AJOL)
Data were analyzed using the Statistical Package for Social Sciences (SPSS) version 16.0 (SPSS, Inc., Chicago, Ill). Results: A total of 100 consenting participants were recruited into the study. The mean age was: 23.42 ± 8.31 years and a range of 14‑50 years. Only 9% (9/100) had significant bacteriuria while 44.4% (4/9) ...
Understanding and forecasting polar stratospheric variability with statistical models
Directory of Open Access Journals (Sweden)
C. Blume
2012-07-01
Full Text Available The variability of the north-polar stratospheric vortex is a prominent aspect of the middle atmosphere. This work investigates a wide class of statistical models with respect to their ability to model geopotential and temperature anomalies, representing variability in the polar stratosphere. Four partly nonstationary, nonlinear models are assessed: linear discriminant analysis (LDA; a cluster method based on finite elements (FEM-VARX; a neural network, namely the multi-layer perceptron (MLP; and support vector regression (SVR. These methods model time series by incorporating all significant external factors simultaneously, including ENSO, QBO, the solar cycle, volcanoes, to then quantify their statistical importance. We show that variability in reanalysis data from 1980 to 2005 is successfully modeled. The period from 2005 to 2011 can be hindcasted to a certain extent, where MLP performs significantly better than the remaining models. However, variability remains that cannot be statistically hindcasted within the current framework, such as the unexpected major warming in January 2009. Finally, the statistical model with the best generalization performance is used to predict a winter 2011/12 with warm and weak vortex conditions. A vortex breakdown is predicted for late January, early February 2012.
Clinical significance of altered nm23-H1, EGFR, RB and p53 expression in bilharzial bladder cancer
International Nuclear Information System (INIS)
Khaled, Hussein M; Bahnassy, Abeer A; Raafat, Amira A; Zekri, Abdel-Rahman N; Madboul, Maha S; Mokhtar, Nadia M
2009-01-01
Clinical characterization of bladder carcinomas is still inadequate using the standard clinico-pathological prognostic markers. We assessed the correlation between nm23-H1, Rb, EGFR and p53 in relation to the clinical outcome of patients with muscle invasive bilharzial bladder cancer (MI-BBC). nm23-H1, Rb, EGFR and p53 expression was assessed in 59 MI-BBC patients using immunohistochemistry and reverse transcription (RT-PCR) and was correlated to the standard clinico-pathological prognostic factors, patient's outcome and the overall survival (OS) rate. Overexpression of EGFR and p53 proteins was detected in 66.1% and 35.6%; respectively. Loss of nm23-H1and Rb proteins was detected in 42.4% and 57.6%; respectively. Increased EGFR and loss of nm23-H1 RNA were detected in 61.5% and 36.5%; respectively. There was a statistically significant correlation between p53 and EGFR overexpression (p < 0.0001), nm23 loss (protein and RNA), lymph node status (p < 0.0001); between the incidence of local recurrence and EGFR RNA overexpression (p= 0.003) as well as between the incidence of metastasis and altered Rb expression (p = 0.026), p53 overexpression (p < 0.0001) and mutation (p = 0.04). Advanced disease stage correlated significantly with increased EGFR (protein and RNA) (p = 0.003 & 0.01), reduced nm23-H1 RNA (p = 0.02), altered Rb (p = 0.023), and p53 overexpression (p = 0.004). OS rates correlated significantly, in univariate analysis, with p53 overexpression (p = 0.011), increased EGFR (protein and RNA, p = 0.034&0.031), nm23-H1 RNA loss (p = 0.021) and aberrations of ≥ 2 genes. However, multivariate analysis showed that only high EGFR overexpression, metastatic recurrence, high tumor grade and the combination of ≥ 2 affected markers were independent prognostic factors. nm23-H1, EGFR and p53 could be used as prognostic biomarkers in MI-BBC patients. In addition to the standard pathological prognostic factors, a combination of these markers (≥ 2) has
Statistical data analysis using SAS intermediate statistical methods
Marasinghe, Mervyn G
2018-01-01
The aim of this textbook (previously titled SAS for Data Analytics) is to teach the use of SAS for statistical analysis of data for advanced undergraduate and graduate students in statistics, data science, and disciplines involving analyzing data. The book begins with an introduction beyond the basics of SAS, illustrated with non-trivial, real-world, worked examples. It proceeds to SAS programming and applications, SAS graphics, statistical analysis of regression models, analysis of variance models, analysis of variance with random and mixed effects models, and then takes the discussion beyond regression and analysis of variance to conclude. Pedagogically, the authors introduce theory and methodological basis topic by topic, present a problem as an application, followed by a SAS analysis of the data provided and a discussion of results. The text focuses on applied statistical problems and methods. Key features include: end of chapter exercises, downloadable SAS code and data sets, and advanced material suitab...
Natural radionuclides in effluents release by a deactivated uranium mine
International Nuclear Information System (INIS)
Pereira, Wagner S.; Kelecom, Alphonse; Silva, Ademir X.; Lopes, José M.; Pinto, Carlos E.C.; Py Júnior, Delcy A.; Antunes, Marcos M.; Indústrias Nucleares do Brasil; Universidade Federal Fluminense; Coordenacao de Pos-Graduacao e Pesquisa de Engenharia
2017-01-01
The Ore Treatment Unit (OTU) is a mine and deactivated uranium plant in the city of Caldas, Minas Gerais, Brazil. This facility possesses three points of release of liquid effluents containing radionuclides: point 014, 025 and 076. At these points, the values of activity concentrations (AC) of the radionuclides U_n_a_t, "2"2"6Ra, "2"1"0Pb, "2"3"2Th and "2"2"8Ra were analyzed in 2012. The evaluation of point 014 by univariate statistics pointed four groups. [U_n_a_t > "2"2"8Ra > ("2"2"6Ra = "2"1"0Pb) >"2"3"2Th]. The multivariate statistics separated the radionuclides into two groups: [(U_n_a_t and "2"3"2Th) and ("2"2"6Ra, "2"2"8Ra and "2"1"0Pb)]. At point 025, the univariate statistics described three groups: [Un_a_t > ("2"2"8Ra = "2"1"0Pb) > ("2"2"6Ra = "2"3"2Th)] and the multivariate analysis also described three but different groups: [(U_n_a_t and "2"2"8Ra), ("2"2"6Ra and "2"1"0Pb) and "2"3"2Th]. In turn, point 076 showed another behavior. The univariate analysis showed only two groups: [(U_n_a_t) > ("2"2"6Ra, "2"2"8Ra, "2"1"0Pb, "2"3"2Th)]. Differently, the multivariate statistics defined three groups: [(U_n_a_t and "2"3"2Th), ("2"2"6Ra and "2"2"8Ra) and "2"1"0Pb].Thus, statistical analysis showed that each point has releases of effluents with different characteristics. Both the behaviors of releases, based on multivariate statistics, and of the AC magnitudes, based on the univariate statistics, are different between the points. The only common features were the greater magnitude of uranium and the smaller magnitude of thorium. (author)
Neuroendocrine Tumor: Statistics
... Tumor > Neuroendocrine Tumor: Statistics Request Permissions Neuroendocrine Tumor: Statistics Approved by the Cancer.Net Editorial Board , 01/ ... the body. It is important to remember that statistics on the survival rates for people with a ...
Institute of Scientific and Technical Information of China (English)
Wei Gao; Jing Wang; Chao Zhang; Ping Qin
2017-01-01
Objective:To determine the serum inflammatory cytokines and oxidative stress parameters of diabetic retinopathy (DR) patients to explore their possible role in the DR.Methods: 116 cases of type 2 diabetic patients were selected from June 2015 to June 2016 in our hospital as research subjects, divided into diabetic Diabetes without retinopathy (NDR group,n = 63) and diabetic with retinopathy patients (DR group,n = 53). And 60 cases of healthy check-ups of the same period in our hospital medical center were selected as normal control group (NC). The VEGF, IL-6, TNF-α , MDA and SOD levels of three groups of patients were detected. Results:The IL-6 levels of NC group, NDR group and DR group were increased gradually, and the difference was statistically significant (P<0.05). The TNF-α levels of NC group, NDR group and DR group were increased gradually, and the difference was statistically significant (P<0.05). The VEGF levels of NC group, NDR group and DR group were increased gradually, and the difference was statistically significant (P<0.05). The malondialdehyde (MDA) levels of NC group, NDR group and DR group increased gradually, and the difference was statistically significant (P<0.05). The superoxide dismutase (SOD) levels of NC group, NDR group and DR group were decreased gradually, and the difference was statistically significant (P<0.05). Conclusions: DR patients express high levels of IL-6, TNF-α and VEGF, and there exists significant oxidative stress in DR, which shows that the inflammation occurrence and oxidative stress state play an important role in the development of DR.
The significance test controversy revisited the fiducial Bayesian alternative
Lecoutre, Bruno
2014-01-01
The purpose of this book is not only to revisit the “significance test controversy,”but also to provide a conceptually sounder alternative. As such, it presents a Bayesian framework for a new approach to analyzing and interpreting experimental data. It also prepares students and researchers for reporting on experimental results. Normative aspects: The main views of statistical tests are revisited and the philosophies of Fisher, Neyman-Pearson and Jeffrey are discussed in detail. Descriptive aspects: The misuses of Null Hypothesis Significance Tests are reconsidered in light of Jeffreys’ Bayesian conceptions concerning the role of statistical inference in experimental investigations. Prescriptive aspects: The current effect size and confidence interval reporting practices are presented and seriously questioned. Methodological aspects are carefully discussed and fiducial Bayesian methods are proposed as a more suitable alternative for reporting on experimental results. In closing, basic routine procedures...
The SACE Review Panel's Final Report: Significant Flaws in the Analysis of Statistical Data
Gregory, Kelvin
2006-01-01
The South Australian Certificate of Education (SACE) is a credential and formal qualification within the Australian Qualifications Framework. A recent review of the SACE outlined a number of recommendations for significant changes to this certificate. These recommendations were the result of a process that began with the review panel…
Directory of Open Access Journals (Sweden)
J Sachithanandham
2014-01-01
Full Text Available Purpose: Opportunistic viral infections are one of the major causes of morbidity and mortality in HIV infection and their molecular detection in the whole blood could be a useful diagnostic tool. Objective: The frequency of opportunistic DNA virus infections among HIV-1-infected individuals using multiplex real-time PCR assays was studied. Materials and Methods: The subjects were in two groups; group 1: Having CD4 counts 350 cells/µl (n = 173. Individuals were classified by WHO clinical staging system. Samples from 70 healthy individuals were tested as controls. In-house qualitative multiplex real-time PCR was standardised and whole blood samples from 291 were tested, followed by quantitative real-time PCR for positives. In a proportion of samples genotypes of Epstein-Barr virus (EBV and CMV were determined. Results: The two major viral infections observed were EBV and CMV. The univariate analysis of CMV load showed significant association with cryptococcal meningitis, oral hairy leukoplakia (OHL, CMV retinitis, CD4 counts and WHO staging (P < 0.05 while the multivariate analysis showed an association with OHL (P = 0.02 and WHO staging (P = 0.05. Univariate analysis showed an association of EBV load with CD4 counts and WHO staging (P < 0.05 and multivariate analysis had association only with CD4 counts. The CMV load was significantly associated with elevated SGPT and SGOT level (P < 0.05 while the EBV had only with SGOT. Conclusion: This study showed an association of EBV and CMV load with CD4+ T cell counts, WHO staging and elevated liver enzymes. These viral infections can accelerate HIV disease and multiplex real-time PCR can be used for the early detection. Genotype 1 and 2 of EBV and genotype gB1 and gB2 of CMV were the prevalent in the HIV-1 subtype C-infected south Indians.
... this page: https://medlineplus.gov/usestatistics.html MedlinePlus Statistics To use the sharing features on this page, ... By Quarter View image full size Quarterly User Statistics Quarter Page Views Unique Visitors Oct-Dec-98 ...
Second Language Experience Facilitates Statistical Learning of Novel Linguistic Materials.
Potter, Christine E; Wang, Tianlin; Saffran, Jenny R
2017-04-01
Recent research has begun to explore individual differences in statistical learning, and how those differences may be related to other cognitive abilities, particularly their effects on language learning. In this research, we explored a different type of relationship between language learning and statistical learning: the possibility that learning a new language may also influence statistical learning by changing the regularities to which learners are sensitive. We tested two groups of participants, Mandarin Learners and Naïve Controls, at two time points, 6 months apart. At each time point, participants performed two different statistical learning tasks: an artificial tonal language statistical learning task and a visual statistical learning task. Only the Mandarin-learning group showed significant improvement on the linguistic task, whereas both groups improved equally on the visual task. These results support the view that there are multiple influences on statistical learning. Domain-relevant experiences may affect the regularities that learners can discover when presented with novel stimuli. Copyright © 2016 Cognitive Science Society, Inc.
Measuring radioactive half-lives via statistical sampling in practice
Lorusso, G.; Collins, S. M.; Jagan, K.; Hitt, G. W.; Sadek, A. M.; Aitken-Smith, P. M.; Bridi, D.; Keightley, J. D.
2017-10-01
The statistical sampling method for the measurement of radioactive decay half-lives exhibits intriguing features such as that the half-life is approximately the median of a distribution closely resembling a Cauchy distribution. Whilst initial theoretical considerations suggested that in certain cases the method could have significant advantages, accurate measurements by statistical sampling have proven difficult, for they require an exercise in non-standard statistical analysis. As a consequence, no half-life measurement using this method has yet been reported and no comparison with traditional methods has ever been made. We used a Monte Carlo approach to address these analysis difficulties, and present the first experimental measurement of a radioisotope half-life (211Pb) by statistical sampling in good agreement with the literature recommended value. Our work also focused on the comparison between statistical sampling and exponential regression analysis, and concluded that exponential regression achieves generally the highest accuracy.
Blakemore, J S
1962-01-01
Semiconductor Statistics presents statistics aimed at complementing existing books on the relationships between carrier densities and transport effects. The book is divided into two parts. Part I provides introductory material on the electron theory of solids, and then discusses carrier statistics for semiconductors in thermal equilibrium. Of course a solid cannot be in true thermodynamic equilibrium if any electrical current is passed; but when currents are reasonably small the distribution function is but little perturbed, and the carrier distribution for such a """"quasi-equilibrium"""" co
Statistical analogues of thermodynamic extremum principles
Ramshaw, John D.
2018-05-01
As shown by Jaynes, the canonical and grand canonical probability distributions of equilibrium statistical mechanics can be simply derived from the principle of maximum entropy, in which the statistical entropy S=- {k}{{B}}{\\sum }i{p}i{log}{p}i is maximised subject to constraints on the mean values of the energy E and/or number of particles N in a system of fixed volume V. The Lagrange multipliers associated with those constraints are then found to be simply related to the temperature T and chemical potential μ. Here we show that the constrained maximisation of S is equivalent to, and can therefore be replaced by, the essentially unconstrained minimisation of the obvious statistical analogues of the Helmholtz free energy F = E ‑ TS and the grand potential J = F ‑ μN. Those minimisations are more easily performed than the maximisation of S because they formally eliminate the constraints on the mean values of E and N and their associated Lagrange multipliers. This procedure significantly simplifies the derivation of the canonical and grand canonical probability distributions, and shows that the well known extremum principles for the various thermodynamic potentials possess natural statistical analogues which are equivalent to the constrained maximisation of S.
Statistics without Tears: Complex Statistics with Simple Arithmetic
Smith, Brian
2011-01-01
One of the often overlooked aspects of modern statistics is the analysis of time series data. Modern introductory statistics courses tend to rush to probabilistic applications involving risk and confidence. Rarely does the first level course linger on such useful and fascinating topics as time series decomposition, with its practical applications…
International Nuclear Information System (INIS)
Nemnes, G A; Anghel, D V
2010-01-01
We present a stochastic method for the simulation of the time evolution in systems which obey generalized statistics, namely fractional exclusion statistics and Gentile's statistics. The transition rates are derived in the framework of canonical ensembles. This approach introduces a tool for describing interacting fermionic and bosonic systems in non-equilibrium as ideal FES systems, in a computationally efficient manner. The two types of statistics are analyzed comparatively, indicating their intrinsic thermodynamic differences and revealing key aspects related to the species size
Sibling Competition & Growth Tradeoffs. Biological vs. Statistical Significance
Kramer, Karen L.; Veile, Amanda; Ot?rola-Castillo, Erik
2016-01-01
Early childhood growth has many downstream effects on future health and reproduction and is an important measure of offspring quality. While a tradeoff between family size and child growth outcomes is theoretically predicted in high-fertility societies, empirical evidence is mixed. This is often attributed to phenotypic variation in parental condition. However, inconsistent study results may also arise because family size confounds the potentially differential effects that older and younger s...
Principles of Statistics: What the Sports Medicine Professional Needs to Know.
Riemann, Bryan L; Lininger, Monica R
2018-07-01
Understanding the results and statistics reported in original research remains a large challenge for many sports medicine practitioners and, in turn, may be among one of the biggest barriers to integrating research into sports medicine practice. The purpose of this article is to provide minimal essentials a sports medicine practitioner needs to know about interpreting statistics and research results to facilitate the incorporation of the latest evidence into practice. Topics covered include the difference between statistical significance and clinical meaningfulness; effect sizes and confidence intervals; reliability statistics, including the minimal detectable difference and minimal important difference; and statistical power. Copyright © 2018 Elsevier Inc. All rights reserved.
DEFF Research Database (Denmark)
Tryggestad, Kjell
2004-01-01
The study aims is to describe how the inclusion and exclusion of materials and calculative devices construct the boundaries and distinctions between statistical facts and artifacts in economics. My methodological approach is inspired by John Graunt's (1667) Political arithmetic and more recent work...... within constructivism and the field of Science and Technology Studies (STS). The result of this approach is here termed reversible statistics, reconstructing the findings of a statistical study within economics in three different ways. It is argued that all three accounts are quite normal, albeit...... in different ways. The presence and absence of diverse materials, both natural and political, is what distinguishes them from each other. Arguments are presented for a more symmetric relation between the scientific statistical text and the reader. I will argue that a more symmetric relation can be achieved...
Wannier, Gregory Hugh
1966-01-01
Until recently, the field of statistical physics was traditionally taught as three separate subjects: thermodynamics, statistical mechanics, and kinetic theory. This text, a forerunner in its field and now a classic, was the first to recognize the outdated reasons for their separation and to combine the essentials of the three subjects into one unified presentation of thermal physics. It has been widely adopted in graduate and advanced undergraduate courses, and is recommended throughout the field as an indispensable aid to the independent study and research of statistical physics.Designed for
Doll, Corinne M; Aquino-Parsons, Christina; Pintilie, Melania; Klimowicz, Alexander C; Petrillo, Stephanie K; Milosevic, Michael; Craighead, Peter S; Clarke, Blaise; Lees-Miller, Susan P; Fyles, Anthony W; Magliocco, Anthony M
2013-03-01
ERCC1 (excision repair cross-complementation group 1) expression has been shown to be a molecular marker of cisplatin resistance in many tumor sites, but has not been well studied in cervical cancer patients. The purpose of this study was to measure tumoral ERCC1 in patients with locally advanced cervical cancer treated with chemoradiation therapy (CRT) in a large multicenter cohort, and to correlate expression with clinical outcome parameters. A total of 264 patients with locally advanced cervical cancer, treated with curative-intent radical CRT from 3 major Canadian cancer centers were evaluated. Pretreatment formalin-fixed, paraffin-embedded tumor specimens were retrieved, and tissue microarrays were constructed. Tumoral ERCC1 (FL297 antibody) was measured using AQUA (R) technology. Statistical analysis was performed to determine the significance of clinical factors and ERCC1 status with progression-free survival (PFS) and overall survival (OS) at 5 years. The majority of patients had International Federation of Gynecology and Obstetrics (FIGO) stage II disease (n=119, 45%); median tumor size was 5 cm. OS was associated with tumor size (HR 1.16, P=.018), pretreatment hemoglobin status (HR 2.33, P=.00027), and FIGO stage. In addition, tumoral ERCC1 status (nuclear to cytoplasmic ratio) was associated with PFS (HR 2.33 [1.05-5.18], P=.038) and OS (HR 3.13 [1.27-7.71], P=.013). ERCC1 status was not significant on multivariate analysis when the model was adjusted for the clinical factors: for PFS (HR 1.49 [0.61-3.6], P=.38); for OS (HR 2.42 [0.94-6.24] P=.067). In this large multicenter cohort of locally advanced cervical cancer patients treated with radical CRT, stage, tumor size, and pretreatment hemoglobin status were significantly associated with PFS and OS. ERCC1 status appears to have prognostic impact on univariate analysis in these patients, but was not independently associated with outcome on multivariate analysis. Copyright © 2013. Published by Elsevier
Energy Technology Data Exchange (ETDEWEB)
Doll, Corinne M., E-mail: Corinne.Doll@albertahealthservices.ca [Department of Oncology, University of Calgary, Calgary, AB (Canada); Aquino-Parsons, Christina [Department of Radiation Oncology, University of British Columbia, Vancouver, BC (Canada); Pintilie, Melania [Department of Biostatistics, Ontario Cancer Institute/Princess Margaret Hospital, University of Toronto, Toronto, ON (Canada); Klimowicz, Alexander C. [Department of Oncology, University of Calgary, Calgary, AB (Canada); Petrillo, Stephanie K. [Department of Pathology, University of Calgary, Calgary, AB (Canada); Milosevic, Michael [Department of Radiation Oncology, University Health Network, University of Toronto, Toronto, ON (Canada); Craighead, Peter S. [Department of Oncology, University of Calgary, Calgary, AB (Canada); Clarke, Blaise [Department of Pathology, University of Toronto, Toronto, ON (Canada); Lees-Miller, Susan P. [Departments of Biochemistry and Molecular Biology, and Oncology, University of Calgary, Calgary, AB (Canada); Fyles, Anthony W. [Department of Radiation Oncology, University Health Network, University of Toronto, Toronto, ON (Canada); Magliocco, Anthony M. [Department of Pathology, Lee Moffitt Cancer Center, Tampa, Florida (United States)
2013-03-01
Purpose: ERCC1 (excision repair cross-complementation group 1) expression has been shown to be a molecular marker of cisplatin resistance in many tumor sites, but has not been well studied in cervical cancer patients. The purpose of this study was to measure tumoral ERCC1 in patients with locally advanced cervical cancer treated with chemoradiation therapy (CRT) in a large multicenter cohort, and to correlate expression with clinical outcome parameters. Methods and Materials: A total of 264 patients with locally advanced cervical cancer, treated with curative-intent radical CRT from 3 major Canadian cancer centers were evaluated. Pretreatment formalin-fixed, paraffin-embedded tumor specimens were retrieved, and tissue microarrays were constructed. Tumoral ERCC1 (FL297 antibody) was measured using AQUA (R) technology. Statistical analysis was performed to determine the significance of clinical factors and ERCC1 status with progression-free survival (PFS) and overall survival (OS) at 5 years. Results: The majority of patients had International Federation of Gynecology and Obstetrics (FIGO) stage II disease (n=119, 45%); median tumor size was 5 cm. OS was associated with tumor size (HR 1.16, P=.018), pretreatment hemoglobin status (HR 2.33, P=.00027), and FIGO stage. In addition, tumoral ERCC1 status (nuclear to cytoplasmic ratio) was associated with PFS (HR 2.33 [1.05-5.18], P=.038) and OS (HR 3.13 [1.27-7.71], P=.013). ERCC1 status was not significant on multivariate analysis when the model was adjusted for the clinical factors: for PFS (HR 1.49 [0.61-3.6], P=.38); for OS (HR 2.42 [0.94-6.24] P=.067). Conclusions: In this large multicenter cohort of locally advanced cervical cancer patients treated with radical CRT, stage, tumor size, and pretreatment hemoglobin status were significantly associated with PFS and OS. ERCC1 status appears to have prognostic impact on univariate analysis in these patients, but was not independently associated with outcome on
International Nuclear Information System (INIS)
Doll, Corinne M.; Aquino-Parsons, Christina; Pintilie, Melania; Klimowicz, Alexander C.; Petrillo, Stephanie K.; Milosevic, Michael; Craighead, Peter S.; Clarke, Blaise; Lees-Miller, Susan P.; Fyles, Anthony W.; Magliocco, Anthony M.
2013-01-01
Purpose: ERCC1 (excision repair cross-complementation group 1) expression has been shown to be a molecular marker of cisplatin resistance in many tumor sites, but has not been well studied in cervical cancer patients. The purpose of this study was to measure tumoral ERCC1 in patients with locally advanced cervical cancer treated with chemoradiation therapy (CRT) in a large multicenter cohort, and to correlate expression with clinical outcome parameters. Methods and Materials: A total of 264 patients with locally advanced cervical cancer, treated with curative-intent radical CRT from 3 major Canadian cancer centers were evaluated. Pretreatment formalin-fixed, paraffin-embedded tumor specimens were retrieved, and tissue microarrays were constructed. Tumoral ERCC1 (FL297 antibody) was measured using AQUA (R) technology. Statistical analysis was performed to determine the significance of clinical factors and ERCC1 status with progression-free survival (PFS) and overall survival (OS) at 5 years. Results: The majority of patients had International Federation of Gynecology and Obstetrics (FIGO) stage II disease (n=119, 45%); median tumor size was 5 cm. OS was associated with tumor size (HR 1.16, P=.018), pretreatment hemoglobin status (HR 2.33, P=.00027), and FIGO stage. In addition, tumoral ERCC1 status (nuclear to cytoplasmic ratio) was associated with PFS (HR 2.33 [1.05-5.18], P=.038) and OS (HR 3.13 [1.27-7.71], P=.013). ERCC1 status was not significant on multivariate analysis when the model was adjusted for the clinical factors: for PFS (HR 1.49 [0.61-3.6], P=.38); for OS (HR 2.42 [0.94-6.24] P=.067). Conclusions: In this large multicenter cohort of locally advanced cervical cancer patients treated with radical CRT, stage, tumor size, and pretreatment hemoglobin status were significantly associated with PFS and OS. ERCC1 status appears to have prognostic impact on univariate analysis in these patients, but was not independently associated with outcome on
Kalman filter for statistical monitoring of forest cover across sub-continental regions [Symposium
Raymond L. Czaplewski
1991-01-01
The Kalman filter is a generalization of the composite estimator. The univariate composite estimate combines 2 prior estimates of population parameter with a weighted average where the scalar weight is inversely proportional to the variances. The composite estimator is a minimum variance estimator that requires no distributional assumptions other than estimates of the...
Probability theory for 3-layer remote sensing radiative transfer model: univariate case.
Ben-David, Avishai; Davidson, Charles E
2012-04-23
A probability model for a 3-layer radiative transfer model (foreground layer, cloud layer, background layer, and an external source at the end of line of sight) has been developed. The 3-layer model is fundamentally important as the primary physical model in passive infrared remote sensing. The probability model is described by the Johnson family of distributions that are used as a fit for theoretically computed moments of the radiative transfer model. From the Johnson family we use the SU distribution that can address a wide range of skewness and kurtosis values (in addition to addressing the first two moments, mean and variance). In the limit, SU can also describe lognormal and normal distributions. With the probability model one can evaluate the potential for detecting a target (vapor cloud layer), the probability of observing thermal contrast, and evaluate performance (receiver operating characteristics curves) in clutter-noise limited scenarios. This is (to our knowledge) the first probability model for the 3-layer remote sensing geometry that treats all parameters as random variables and includes higher-order statistics. © 2012 Optical Society of America
Managing Macroeconomic Risks by Using Statistical Simulation
Directory of Open Access Journals (Sweden)
Merkaš Zvonko
2017-06-01
Full Text Available The paper analyzes the possibilities of using statistical simulation in the macroeconomic risks measurement. At the level of the whole world, macroeconomic risks are, due to the excessive imbalance, significantly increased. Using analytical statistical methods and Monte Carlo simulation, the authors interpret the collected data sets, compare and analyze them in order to mitigate potential risks. The empirical part of the study is a qualitative case study that uses statistical methods and Monte Carlo simulation for managing macroeconomic risks, which is the central theme of this work. Application of statistical simulation is necessary because the system, for which it is necessary to specify the model, is too complex for an analytical approach. The objective of the paper is to point out the previous need for consideration of significant macroeconomic risks, particularly in terms of the number of the unemployed in the society, the movement of gross domestic product and the country’s credit rating, and the use of data previously processed by statistical methods, through statistical simulation, to analyze the existing model of managing the macroeconomic risks and suggest elements for a management model development that will allow, with the lowest possible probability and consequences, the emergence of the recent macroeconomic risks. The stochastic characteristics of the system, defined by random variables as input values defined by probability distributions, require the performance of a large number of iterations on which to record the output of the model and calculate the mathematical expectations. The paper expounds the basic procedures and techniques of discrete statistical simulation applied to systems that can be characterized by a number of events which represent a set of circumstances that have caused a change in the system’s state and the possibility of its application in the field of assessment of macroeconomic risks. The method has no
Attitudes toward statistics in medical postgraduates: measuring, evaluating and monitoring.
Zhang, Yuhai; Shang, Lei; Wang, Rui; Zhao, Qinbo; Li, Chanjuan; Xu, Yongyong; Su, Haixia
2012-11-23
In medical training, statistics is considered a very difficult course to learn and teach. Current studies have found that students' attitudes toward statistics can influence their learning process. Measuring, evaluating and monitoring the changes of students' attitudes toward statistics are important. Few studies have focused on the attitudes of postgraduates, especially medical postgraduates. Our purpose was to understand current attitudes regarding statistics held by medical postgraduates and explore their effects on students' achievement. We also wanted to explore the influencing factors and the sources of these attitudes and monitor their changes after a systematic statistics course. A total of 539 medical postgraduates enrolled in a systematic statistics course completed the pre-form of the Survey of Attitudes Toward Statistics -28 scale, and 83 postgraduates were selected randomly from among them to complete the post-form scale after the course. Most medical postgraduates held positive attitudes toward statistics, but they thought statistics was a very difficult subject. The attitudes mainly came from experiences in a former statistical or mathematical class. Age, level of statistical education, research experience, specialty and mathematics basis may influence postgraduate attitudes toward statistics. There were significant positive correlations between course achievement and attitudes toward statistics. In general, student attitudes showed negative changes after completing a statistics course. The importance of student attitudes toward statistics must be recognized in medical postgraduate training. To make sure all students have a positive learning environment, statistics teachers should measure their students' attitudes and monitor their change of status during a course. Some necessary assistance should be offered for those students who develop negative attitudes.
Johnson, Norman
This is author-approved bcc: This is the third volume of a collection of seminal papers in the statistical sciences written during the past 110 years. These papers have each had an outstanding influence on the development of statistical theory and practice over the last century. Each paper is preceded by an introduction written by an authority in the field providing background information and assessing its influence. Volume III concerntrates on articles from the 1980's while including some earlier articles not included in Volume I and II. Samuel Kotz is Professor of Statistics in the College of Business and Management at the University of Maryland. Norman L. Johnson is Professor Emeritus of Statistics at the University of North Carolina. Also available: Breakthroughs in Statistics Volume I: Foundations and Basic Theory Samuel Kotz and Norman L. Johnson, Editors 1993. 631 pp. Softcover. ISBN 0-387-94037-5 Breakthroughs in Statistics Volume II: Methodology and Distribution Samuel Kotz and Norman L. Johnson, Edi...
Prognostic significance of multiple kallikreins in high-grade astrocytoma
International Nuclear Information System (INIS)
Drucker, Kristen L.; Gianinni, Caterina; Decker, Paul A.; Diamandis, Eleftherios P.; Scarisbrick, Isobel A.
2015-01-01
Kallikreins have clinical value as prognostic markers in a subset of malignancies examined to date, including kallikrein 3 (prostate specific antigen) in prostate cancer. We previously demonstrated that kallikrein 6 is expressed at higher levels in grade IV compared to grade III astrocytoma and is associated with reduced survival of GBM patients. In this study we determined KLK1, KLK6, KLK7, KLK8, KLK9 and KLK10 protein expression in two independent tissue microarrays containing 60 grade IV and 8 grade III astrocytoma samples. Scores for staining intensity, percent of tumor stained and immunoreactivity scores (IR, product of intensity and percent) were determined and analyzed for correlation with patient survival. Grade IV glioma was associated with higher levels of kallikrein-immunostaining compared to grade III specimens. Univariable Cox proportional hazards regression analysis demonstrated that elevated KLK6- or KLK7-IR was associated with poor patient prognosis. In addition, an increased percent of tumor immunoreactive for KLK6 or KLK9 was associated with decreased survival in grade IV patients. Kaplan-Meier survival analysis indicated that patients with KLK6-IR < 10, KLK6 percent tumor core stained < 3, or KLK7-IR < 9 had a significantly improved survival. Multivariable analysis indicated that the significance of these parameters was maintained even after adjusting for gender and performance score. These data suggest that elevations in glioblastoma KLK6, KLK7 and KLK9 protein have utility as prognostic markers of patient survival. The online version of this article (doi:10.1186/s12885-015-1566-5) contains supplementary material, which is available to authorized users
Ocean Wave Slope Statistics from Automated Analysis of Sun Glitter Photographs
1985-06-01
8217*.... . .. , .. . .. I 1 SCONTROL MAPCROSSREF.LAdEf_ 2 Si4OuTINE HDSPLY ( HTST . No NAME. XO. XSTEPI 3 C 4 C SIUBROUTINE TO nISPLAY A UNIVARIATE HISTOGRAM...LYRANON. CSC, FESRUARV ?6s 1qA0. 7 C a C HTST z HISTOGRAM ARRAY. 9 C NT 0 ROW DIMFNSION OF HIST. to C N.1 x COLUMN DIMENSTnN OF MIST. it C 12 REAL HIST
A tilting approach to ranking influence
Genton, Marc G.; Hall, Peter
2014-01-01
We suggest a new approach, which is applicable for general statistics computed from random samples of univariate or vector-valued or functional data, to assessing the influence that individual data have on the value of a statistic, and to ranking
Ziekinski, A.F.; Haminiuk, C.W.I.; Nunes, C.A.; Schnitzler, E.; Ruth, van S.M.; Granato, D.
2014-01-01
The use of univariate, bivariate, and multivariate statistical techniques, such as analysis of variance, multiple comparisons of means, and linear correlations, has spread widely in the area of Food Science and Technology. However, the use of supervised and unsupervised statistical techniques
The significance of reporting to the thousandths place: Figuring out the laboratory limitations
Directory of Open Access Journals (Sweden)
Joely A. Straseski
2017-04-01
Full Text Available Objectives: A request to report laboratory values to a specific number of decimal places represents a delicate balance between clinical interpretation of a true analytical change versus laboratory understanding of analytical imprecision and significant figures. Prostate specific antigen (PSA was used as an example to determine if an immunoassay routinely reported to the hundredths decimal place based on significant figure assessment in our laboratory was capable of providing analytically meaningful results when reported to the thousandths places when requested by clinicians. Design and methods: Results of imprecision studies of a representative PSA assay (Roche MODULAR E170 employing two methods of statistical analysis are reported. Sample pools were generated with target values of 0.01 and 0.20Â Î¼g/L PSA as determined by the E170. Intra-assay imprecision studies were conducted and the resultant data were analyzed using two independent statistical methods to evaluate reporting limits. Results: These statistical methods indicated reporting results to the thousandths place at the two assessed concentrations was an appropriate reflection of the measurement imprecision for the representative assay. This approach used two independent statistical tests to determine the ability of an analytical system to support a desired reporting level. Importantly, data were generated during a routine intra-assay imprecision study, thus this approach does not require extra data collection by the laboratory. Conclusions: Independent statistical analysis must be used to determine appropriate significant figure limitations for clinically relevant analytes. Establishing these limits is the responsibility of the laboratory and should be determined prior to providing clinical results. Keywords: Significant figures, Imprecision, Prostate cancer, Prostate specific antigen, PSA
The new statistics: why and how.
Cumming, Geoff
2014-01-01
We need to make substantial changes to how we conduct research. First, in response to heightened concern that our published research literature is incomplete and untrustworthy, we need new requirements to ensure research integrity. These include prespecification of studies whenever possible, avoidance of selection and other inappropriate data-analytic practices, complete reporting, and encouragement of replication. Second, in response to renewed recognition of the severe flaws of null-hypothesis significance testing (NHST), we need to shift from reliance on NHST to estimation and other preferred techniques. The new statistics refers to recommended practices, including estimation based on effect sizes, confidence intervals, and meta-analysis. The techniques are not new, but adopting them widely would be new for many researchers, as well as highly beneficial. This article explains why the new statistics are important and offers guidance for their use. It describes an eight-step new-statistics strategy for research with integrity, which starts with formulation of research questions in estimation terms, has no place for NHST, and is aimed at building a cumulative quantitative discipline.
Potential errors and misuse of statistics in studies on leakage in endodontics.
Lucena, C; Lopez, J M; Pulgar, R; Abalos, C; Valderrama, M J
2013-04-01
To assess the quality of the statistical methodology used in studies of leakage in Endodontics, and to compare the results found using appropriate versus inappropriate inferential statistical methods. The search strategy used the descriptors 'root filling' 'microleakage', 'dye penetration', 'dye leakage', 'polymicrobial leakage' and 'fluid filtration' for the time interval 2001-2010 in journals within the categories 'Dentistry, Oral Surgery and Medicine' and 'Materials Science, Biomaterials' of the Journal Citation Report. All retrieved articles were reviewed to find potential pitfalls in statistical methodology that may be encountered during study design, data management or data analysis. The database included 209 papers. In all the studies reviewed, the statistical methods used were appropriate for the category attributed to the outcome variable, but in 41% of the cases, the chi-square test or parametric methods were inappropriately selected subsequently. In 2% of the papers, no statistical test was used. In 99% of cases, a statistically 'significant' or 'not significant' effect was reported as a main finding, whilst only 1% also presented an estimation of the magnitude of the effect. When the appropriate statistical methods were applied in the studies with originally inappropriate data analysis, the conclusions changed in 19% of the cases. Statistical deficiencies in leakage studies may affect their results and interpretation and might be one of the reasons for the poor agreement amongst the reported findings. Therefore, more effort should be made to standardize statistical methodology. © 2012 International Endodontic Journal.
Factors predicting radiation pneumonitis in lung cancer patients: a retrospective study
International Nuclear Information System (INIS)
Rancati, T.; Ceresoli, G.L.; Gagliardi, G.; Schipani, S.; Cattaneo, G.M.
2003-01-01
Purpose: To evaluate clinical and lung dose-volume histogram based factors as predictors of radiation pneumonitis (RP) in lung cancer patients (PTs) treated with thoracic irradiation. Methods and materials: Records of all lung cancer PTs irradiated at our Institution between 1994 and 2000 were retrospectively reviewed. Eighty-four PTs with small or non-small-cell lung cancer, irradiated at >40 Gy, with full 3D dosimetry data and a follow-up time of >6 months from start of treatment, were analysed for RP. Pneumonitis was scored on the basis of SWOG toxicity criteria and was considered a complication when grade≥II. The following clinical parameters were considered: gender, age, surgery, chemotherapy agents, presence of chronic obstructive pulmonary disease (COPD), performance status. Dosimetric factors including prescribed dose (D iso ), presence of final conformal boost, mean lung dose (D mean ), % of lung receiving ≥20, 25, 30, 35, 40, and 45 Gy (respectively V 20 →V 45 ), and normal tissue complication probability (NTCP) values were analysed. DVHs data and NTCP values were collected for both lungs considered as a paired organ. Median and quartile values were taken as cut-off for statistical analysis. Factors that influenced RP were assessed by univariate (log-rank) and multivariate analyses (Cox hazard model). Results: There were 14 PTs (16.6%) who had ≥grade II pulmonary toxicity. In the entire population, the univariate analysis revealed that many dosimetric parameters (D iso , V 20 , V 30 , V 40 , V 45 ) were significantly associated with RP. No significant correlation was found between the incidence of RP and D mean or NTCP values. Multivariate analysis revealed that the use of mitomycin (MMC) (P=0.005) and the presence of COPD (P=0.026) were the most important risk factor for RP. In the group without COPD (55 PTs, seven RP) a few dosimetric factors (D mean , V 20 , V 45 ) and NTCP values (all models) were associated with RP in the univariate analysis
Statistical density of nuclear excited states
Directory of Open Access Journals (Sweden)
V. M. Kolomietz
2015-10-01
Full Text Available A semi-classical approximation is applied to the calculations of single-particle and statistical level densities in excited nuclei. Landau's conception of quasi-particles with the nucleon effective mass m* < m is used. The approach provides the correct description of the continuum contribution to the level density for realistic finite-depth potentials. It is shown that the continuum states does not affect significantly the thermodynamic calculations for sufficiently small temperatures T ≤ 1 MeV but reduce strongly the results for the excitation energy at high temperatures. By use of standard Woods - Saxon potential and nucleon effective mass m* = 0.7m the A-dependency of the statistical level density parameter K was evaluated in a good qualitative agreement with experimental data.
Statistical monitoring of linear antenna arrays
Harrou, Fouzi
2016-11-03
The paper concerns the problem of monitoring linear antenna arrays using the generalized likelihood ratio (GLR) test. When an abnormal event (fault) affects an array of antenna elements, the radiation pattern changes and significant deviation from the desired design performance specifications can resulted. In this paper, the detection of faults is addressed from a statistical point of view as a fault detection problem. Specifically, a statistical method rested on the GLR principle is used to detect potential faults in linear arrays. To assess the strength of the GLR-based monitoring scheme, three case studies involving different types of faults were performed. Simulation results clearly shown the effectiveness of the GLR-based fault-detection method to monitor the performance of linear antenna arrays.
Paradigms and pragmatism: approaches to medical statistics.
Healy, M J
2000-01-01
Until recently, the dominant philosophy of science was that due to Karl Popper, with its doctrine that the proper task of science was the formulation of hypotheses followed by attempts at refuting them. In spite of the close analogy with significance testing, these ideas do not fit well with the practice of medical statistics. The same can be said of the later philosophy of Thomas Kuhn, who maintains that science proceeds by way of revolutionary upheavals separated by periods of relatively pedestrian research which are governed by what Kuhn refers to as paradigms. Through there have been paradigm shifts in the history of statistics, a degree of continuity can also be discerned. A current paradigm shift is embodied in the spread of Bayesian ideas. It may be that a future paradigm will emphasise the pragmatic approach to statistics that is associated with the name of Daniel Schwartz.
DEFF Research Database (Denmark)
Nielsen, Tine; Kreiner, Svend
Short abstract Motivated by experiencing with students’ psychological barriers for learning statistics we modified and extended the Statistical Anxiety Rating Scale (STARS) to develop a contemporary Danish measure of attitudes and relationship to statistics for use with higher education students...... with evidence of DIF in all cases: One TCA-item functioned differentially relative to age, one WS-item functioned differentially relative to statistics course (first or second), and two IA-items functioned differentially relative to statistics course and academic discipline (sociology, public health...
Lenard, Christopher; McCarthy, Sally; Mills, Terence
2014-01-01
There are many different aspects of statistics. Statistics involves mathematics, computing, and applications to almost every field of endeavour. Each aspect provides an opportunity to spark someone's interest in the subject. In this paper we discuss some ethical aspects of statistics, and describe how an introduction to ethics has been…
International Nuclear Information System (INIS)
1999-01-01
For the year 1998 and the year 1999, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-June 1999, Energy exports by recipient country in January-June 1999, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
2001-01-01
For the year 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions from the use of fossil fuels, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in 2000, Energy exports by recipient country in 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
2000-01-01
For the year 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g., Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-March 2000, Energy exports by recipient country in January-March 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
Serdobolskii, Vadim Ivanovich
2007-01-01
This monograph presents mathematical theory of statistical models described by the essentially large number of unknown parameters, comparable with sample size but can also be much larger. In this meaning, the proposed theory can be called "essentially multiparametric". It is developed on the basis of the Kolmogorov asymptotic approach in which sample size increases along with the number of unknown parameters.This theory opens a way for solution of central problems of multivariate statistics, which up until now have not been solved. Traditional statistical methods based on the idea of an infinite sampling often break down in the solution of real problems, and, dependent on data, can be inefficient, unstable and even not applicable. In this situation, practical statisticians are forced to use various heuristic methods in the hope the will find a satisfactory solution.Mathematical theory developed in this book presents a regular technique for implementing new, more efficient versions of statistical procedures. ...
Bulmer, M G
1979-01-01
There are many textbooks which describe current methods of statistical analysis, while neglecting related theory. There are equally many advanced textbooks which delve into the far reaches of statistical theory, while bypassing practical applications. But between these two approaches is an unfilled gap, in which theory and practice merge at an intermediate level. Professor M. G. Bulmer's Principles of Statistics, originally published in 1965, was created to fill that need. The new, corrected Dover edition of Principles of Statistics makes this invaluable mid-level text available once again fo
Industrial statistics with Minitab
Cintas, Pere Grima; Llabres, Xavier Tort-Martorell
2012-01-01
Industrial Statistics with MINITAB demonstrates the use of MINITAB as a tool for performing statistical analysis in an industrial context. This book covers introductory industrial statistics, exploring the most commonly used techniques alongside those that serve to give an overview of more complex issues. A plethora of examples in MINITAB are featured along with case studies for each of the statistical techniques presented. Industrial Statistics with MINITAB: Provides comprehensive coverage of user-friendly practical guidance to the essential statistical methods applied in industry.Explores
Statistical uncertainties and unrecognized relationships
International Nuclear Information System (INIS)
Rankin, J.P.
1985-01-01
Hidden relationships in specific designs directly contribute to inaccuracies in reliability assessments. Uncertainty factors at the system level may sometimes be applied in attempts to compensate for the impact of such unrecognized relationships. Often uncertainty bands are used to relegate unknowns to a miscellaneous category of low-probability occurrences. However, experience and modern analytical methods indicate that perhaps the dominant, most probable and significant events are sometimes overlooked in statistical reliability assurances. The author discusses the utility of two unique methods of identifying the otherwise often unforeseeable system interdependencies for statistical evaluations. These methods are sneak circuit analysis and a checklist form of common cause failure analysis. Unless these techniques (or a suitable equivalent) are also employed along with the more widely-known assurance tools, high reliability of complex systems may not be adequately assured. This concern is indicated by specific illustrations. 8 references, 5 figures
ASYMPTOTIC COMPARISONS OF U-STATISTICS, V-STATISTICS AND LIMITS OF BAYES ESTIMATES BY DEFICIENCIES
Toshifumi, Nomachi; Hajime, Yamato; Graduate School of Science and Engineering, Kagoshima University:Miyakonojo College of Technology; Faculty of Science, Kagoshima University
2001-01-01
As estimators of estimable parameters, we consider three statistics which are U-statistic, V-statistic and limit of Bayes estimate. This limit of Bayes estimate, called LB-statistic in this paper, is obtained from Bayes estimate of estimable parameter based on Dirichlet process, by letting its parameter tend to zero. For the estimable parameter with non-degenerate kernel, the asymptotic relative efficiencies of LB-statistic with respect to U-statistic and V-statistic and that of V-statistic w...
State Transportation Statistics 2014
2014-12-15
The Bureau of Transportation Statistics (BTS) presents State Transportation Statistics 2014, a statistical profile of transportation in the 50 states and the District of Columbia. This is the 12th annual edition of State Transportation Statistics, a ...
Attitudes toward statistics in medical postgraduates: measuring, evaluating and monitoring
2012-01-01
Background In medical training, statistics is considered a very difficult course to learn and teach. Current studies have found that students’ attitudes toward statistics can influence their learning process. Measuring, evaluating and monitoring the changes of students’ attitudes toward statistics are important. Few studies have focused on the attitudes of postgraduates, especially medical postgraduates. Our purpose was to understand current attitudes regarding statistics held by medical postgraduates and explore their effects on students’ achievement. We also wanted to explore the influencing factors and the sources of these attitudes and monitor their changes after a systematic statistics course. Methods A total of 539 medical postgraduates enrolled in a systematic statistics course completed the pre-form of the Survey of Attitudes Toward Statistics −28 scale, and 83 postgraduates were selected randomly from among them to complete the post-form scale after the course. Results Most medical postgraduates held positive attitudes toward statistics, but they thought statistics was a very difficult subject. The attitudes mainly came from experiences in a former statistical or mathematical class. Age, level of statistical education, research experience, specialty and mathematics basis may influence postgraduate attitudes toward statistics. There were significant positive correlations between course achievement and attitudes toward statistics. In general, student attitudes showed negative changes after completing a statistics course. Conclusions The importance of student attitudes toward statistics must be recognized in medical postgraduate training. To make sure all students have a positive learning environment, statistics teachers should measure their students’ attitudes and monitor their change of status during a course. Some necessary assistance should be offered for those students who develop negative attitudes. PMID:23173770
Glaz, Joseph
2009-01-01
Suitable for graduate students and researchers in applied probability and statistics, as well as for scientists in biology, computer science, pharmaceutical science and medicine, this title brings together a collection of chapters illustrating the depth and diversity of theory, methods and applications in the area of scan statistics.
Inverse statistical approach in heartbeat time series
International Nuclear Information System (INIS)
Ebadi, H; Shirazi, A H; Mani, Ali R; Jafari, G R
2011-01-01
We present an investigation on heart cycle time series, using inverse statistical analysis, a concept borrowed from studying turbulence. Using this approach, we studied the distribution of the exit times needed to achieve a predefined level of heart rate alteration. Such analysis uncovers the most likely waiting time needed to reach a certain change in the rate of heart beat. This analysis showed a significant difference between the raw data and shuffled data, when the heart rate accelerates or decelerates to a rare event. We also report that inverse statistical analysis can distinguish between the electrocardiograms taken from healthy volunteers and patients with heart failure
International Nuclear Information System (INIS)
Lankford, Scott P.; Pollack, Alan; Zagars, Gunar K.
1997-01-01
Purpose: Although the pretreatment serum prostate-specific antigen level (PSAL) is the single-most significant predictor of local and biochemical control in prostate cancer patients treated with radiotherapy, it is relatively insensitive for patients with a PSAL in the intermediate range (4-20 ng/ml). PSA density (PSAD) has been shown to be slightly more predictive of outcome than PSAL for this intermediate risk group; however, this improvement is small and of little use clinically. PSA cancer volume (PSACV), an estimate of cancer volume based on PSA, has recently been described and has been purported to be more significant t than PSAL in predicting early biochemical failure after radiotherapy. We report a detailed comparison between this new prognostic factor, PSAL, and PSAD. Methods and Materials: The records of 356 patients treated with definitive external beam radiotherapy for regionally localized (T1-4,Nx,M0) adenocarcinoma of the prostate were reviewed. Each patient had a PSAL, biopsy Gleason score, and pretreatment prostate volume by transrectal ultrasonography. The median PSAL was 9.3 ng/ml and 66% had Gleason scores in the 2-6 range. The median radiation dose was 66.0 Gy and the median follow-up for those living was 27 months. PSACV was calculated using a formula which takes into account PSAL, pretreatment prostate ultrasound volume, and Gleason score. The median PSACV was 1.43 cc. Biochemical failure was defined as increases in two consecutive follow-up PSA levels, one increase by a factor > 1.5, or an absolute increase of > 1 ng/ml. Local failure was defined as a cancer-positive prostate biopsy, obtained for evidence of tumor progression. Results: The distributions of PSACV and PSAL were similar and, when normalized by log transformation, were highly correlated (p < 0.0001, linear regression). There was a statistically significant relationship between PSACV and several potential prognostic factors including PSAL, PSAD, stage, Gleason score, and
Cristea, Ioana Alina; Ioannidis, John P A
2018-01-01
P values represent a widely used, but pervasively misunderstood and fiercely contested method of scientific inference. Display items, such as figures and tables, often containing the main results, are an important source of P values. We conducted a survey comparing the overall use of P values and the occurrence of significant P values in display items of a sample of articles in the three top multidisciplinary journals (Nature, Science, PNAS) in 2017 and, respectively, in 1997. We also examined the reporting of multiplicity corrections and its potential influence on the proportion of statistically significant P values. Our findings demonstrated substantial and growing reliance on P values in display items, with increases of 2.5 to 14.5 times in 2017 compared to 1997. The overwhelming majority of P values (94%, 95% confidence interval [CI] 92% to 96%) were statistically significant. Methods to adjust for multiplicity were almost non-existent in 1997, but reported in many articles relying on P values in 2017 (Nature 68%, Science 48%, PNAS 38%). In their absence, almost all reported P values were statistically significant (98%, 95% CI 96% to 99%). Conversely, when any multiplicity corrections were described, 88% (95% CI 82% to 93%) of reported P values were statistically significant. Use of Bayesian methods was scant (2.5%) and rarely (0.7%) articles relied exclusively on Bayesian statistics. Overall, wider appreciation of the need for multiplicity corrections is a welcome evolution, but the rapid growth of reliance on P values and implausibly high rates of reported statistical significance are worrisome.
Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James
2014-01-01
Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.
Statistics and Discoveries at the LHC (1/4)
CERN. Geneva
2010-01-01
The lectures will give an introduction to statistics as applied in particle physics and will provide all the necessary basics for data analysis at the LHC. Special emphasis will be placed on the the problems and questions that arise when searching for new phenomena, including p-values, discovery significance, limit setting procedures, treatment of small signals in the presence of large backgrounds. Specific issues that will be addressed include the advantages and drawbacks of different statistical test procedures (cut-based, likelihood-ratio, etc.), the look-elsewhere effect and treatment of systematic uncertainties.
Statistics and Discoveries at the LHC (3/4)
CERN. Geneva
2010-01-01
The lectures will give an introduction to statistics as applied in particle physics and will provide all the necessary basics for data analysis at the LHC. Special emphasis will be placed on the the problems and questions that arise when searching for new phenomena, including p-values, discovery significance, limit setting procedures, treatment of small signals in the presence of large backgrounds. Specific issues that will be addressed include the advantages and drawbacks of different statistical test procedures (cut-based, likelihood-ratio, etc.), the look-elsewhere effect and treatment of systematic uncertainties.
Statistics and Discoveries at the LHC (2/4)
CERN. Geneva
2010-01-01
The lectures will give an introduction to statistics as applied in particle physics and will provide all the necessary basics for data analysis at the LHC. Special emphasis will be placed on the the problems and questions that arise when searching for new phenomena, including p-values, discovery significance, limit setting procedures, treatment of small signals in the presence of large backgrounds. Specific issues that will be addressed include the advantages and drawbacks of different statistical test procedures (cut-based, likelihood-ratio, etc.), the look-elsewhere effect and treatment of systematic uncertainties.
Statistics and Discoveries at the LHC (4/4)
CERN. Geneva
2010-01-01
The lectures will give an introduction to statistics as applied in particle physics and will provide all the necessary basics for data analysis at the LHC. Special emphasis will be placed on the the problems and questions that arise when searching for new phenomena, including p-values, discovery significance, limit setting procedures, treatment of small signals in the presence of large backgrounds. Specific issues that will be addressed include the advantages and drawbacks of different statistical test procedures (cut-based, likelihood-ratio, etc.), the look-elsewhere effect and treatment of systematic uncertainties.
Directory of Open Access Journals (Sweden)
Joachim I. Krueger
2018-04-01
Full Text Available The practice of Significance Testing (ST remains widespread in psychological science despite continual criticism of its flaws and abuses. Using simulation experiments, we address four concerns about ST and for two of these we compare ST’s performance with prominent alternatives. We find the following: First, the 'p' values delivered by ST predict the posterior probability of the tested hypothesis well under many research conditions. Second, low 'p' values support inductive inferences because they are most likely to occur when the tested hypothesis is false. Third, 'p' values track likelihood ratios without raising the uncertainties of relative inference. Fourth, 'p' values predict the replicability of research findings better than confidence intervals do. Given these results, we conclude that 'p' values may be used judiciously as a heuristic tool for inductive inference. Yet, 'p' values cannot bear the full burden of inference. We encourage researchers to be flexible in their selection and use of statistical methods.
Performing Inferential Statistics Prior to Data Collection
Trafimow, David; MacDonald, Justin A.
2017-01-01
Typically, in education and psychology research, the investigator collects data and subsequently performs descriptive and inferential statistics. For example, a researcher might compute group means and use the null hypothesis significance testing procedure to draw conclusions about the populations from which the groups were drawn. We propose an…
International Nuclear Information System (INIS)
Kim, Kyu Tae; Kim, Oh Hwan
1999-01-01
A simplified statistical methodology is developed in order to both reduce over-conservatism of deterministic methodologies employed for PWR fuel rod internal pressure (RIP) calculation and simplify the complicated calculation procedure of the widely used statistical methodology which employs the response surface method and Monte Carlo simulation. The simplified statistical methodology employs the system moment method with a deterministic statistical methodology employs the system moment method with a deterministic approach in determining the maximum variance of RIP. The maximum RIP variance is determined with the square sum of each maximum value of a mean RIP value times a RIP sensitivity factor for all input variables considered. This approach makes this simplified statistical methodology much more efficient in the routine reload core design analysis since it eliminates the numerous calculations required for the power history-dependent RIP variance determination. This simplified statistical methodology is shown to be more conservative in generating RIP distribution than the widely used statistical methodology. Comparison of the significances of each input variable to RIP indicates that fission gas release model is the most significant input variable. (author). 11 refs., 6 figs., 2 tabs
Kanji, Gopal K
2006-01-01
This expanded and updated Third Edition of Gopal K. Kanji's best-selling resource on statistical tests covers all the most commonly used tests with information on how to calculate and interpret results with simple datasets. Each entry begins with a short summary statement about the test's purpose, and contains details of the test objective, the limitations (or assumptions) involved, a brief outline of the method, a worked example, and the numerical calculation. 100 Statistical Tests, Third Edition is the one indispensable guide for users of statistical materials and consumers of statistical information at all levels and across all disciplines.