Statistical analysis of survival data.
Crowley, J; Breslow, N
1984-01-01
A general review of the statistical techniques that the authors feel are most important in the analysis of survival data is presented. The emphasis is on the study of the duration of time between any two events as applied to people and on the nonparametric and semiparametric models most often used in these settings. The unifying concept is the hazard function, variously known as the risk, the force of mortality, or the force of transition.
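For readers unfamiliar with the term, the hazard function named in the abstract above can be written in its standard textbook form. This is a general definition, not notation taken from the review itself; T denotes a continuous survival time with density f and survival function S.

```latex
h(t) \;=\; \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
      \;=\; \frac{f(t)}{S(t)},
\qquad
S(t) \;=\; P(T > t) \;=\; \exp\!\left(-\int_0^t h(u)\,du\right).
```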
Statistical models and methods for reliability and survival analysis
Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Limnios, Nikolaos; Gerville-Reache, Leo
2013-01-01
Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical
Graphics and statistics for cardiology: survival analysis.
May, Susanne; McKnight, Barbara
2017-03-01
Reports of data in the medical literature frequently lack information needed to assess the validity and generalisability of study results. Some recommendations and standards for reporting have been developed over the last two decades, but few are available specifically for survival data. We provide recommendations for tabular and graphical representations of survival data. We argue that data and analytic software should be made available to promote reproducible research.
Tutorial: survival analysis--a statistic for clinical, efficacy, and theoretical applications.
Gruber, F A
1999-04-01
Current demands for increased research attention to therapeutic efficacy, efficiency, and also for improved developmental models call for analysis of longitudinal outcome data. Statistical treatment of longitudinal speech and language data is difficult, but there is a family of statistical techniques in common use in medicine, actuarial science, manufacturing, and sociology that has not been used in speech or language research. Survival analysis is introduced as a method that avoids many of the statistical problems of other techniques because it treats time as the outcome. In survival analysis, probabilities are calculated not just for groups but also for individuals in a group. This is a major advantage for clinical work. This paper provides a basic introduction to nonparametric and semiparametric survival analysis using speech outcomes as examples. A brief discussion of potential conflicts between actuarial analysis and clinical intuition is also provided.
Miller, Rupert G
2011-01-01
A concise summary of the statistical methods used in the analysis of survival data with censoring. Emphasizes recently developed nonparametric techniques. Outlines methods in detail and illustrates them with actual data. Discusses the theory behind each method. Includes numerous worked problems and numerical exercises.
Energy Technology Data Exchange (ETDEWEB)
Smith, Steven G.; Skalski, John R.; Schelechte, J. Warren [Univ. of Washington, Seattle, WA (United States). Center for Quantitative Science]
1994-12-01
Program SURPH is the culmination of several years of research to develop a comprehensive computer program to analyze survival studies of fish and wildlife populations. Development of this software was motivated by the advent of the PIT-tag (Passive Integrated Transponder) technology that permits the detection of salmonid smolt as they pass through hydroelectric facilities on the Snake and Columbia Rivers in the Pacific Northwest. Repeated detections of individually tagged smolt and analysis of their capture histories permit estimates of downriver survival probabilities. Eventual installation of detection facilities at adult fish ladders will also permit estimation of ocean survival and upstream survival of returning salmon using the statistical methods incorporated in SURPH.1. However, the utility of SURPH.1 extends well beyond the analysis of salmonid tagging studies. Release-recapture and radiotelemetry studies from a wide range of terrestrial and aquatic species have been analyzed using SURPH.1 to estimate discrete time survival probabilities and investigate survival relationships. The interactive computing environment of SURPH.1 was specifically developed to allow researchers to investigate the relationship between survival and capture processes and environmental, experimental and individual-based covariates. Program SURPH.1 represents a significant advancement in the ability of ecologists to investigate the interplay between morphologic, genetic, environmental and anthropogenic factors on the survival of wild species. It is hoped that this better understanding of risk factors affecting survival will lead to greater appreciation of the intricacies of nature and to improvements in the management of wild resources. This technical report is an introduction to SURPH.1 and provides a user guide for both the UNIX and MS-Windows® applications of the SURPH software.
Rupoli, S; Da Lio, L; Sisti, S; Campanati, G; Salvi, A; Brianzoni, M F; D'Amico, S; Cinciripini, A; Leoni, P
1994-04-01
In the present study we analyzed the prognostic significance of several clinical, hematological, and histological parameters recorded at diagnosis in a consecutive series of 72 patients with primary myelofibrosis (PMF). Univariate analysis showed that the most significant indicators of poor survival were the following: age greater than 60, splenomegaly, anemia (hemoglobin < 10 g/dl), leukopenia (WBC 14 x 10(9)/l), and any of these histological features: adipose tissue and megakaryocyte reduction, prominent osteoblastic rims along the trabecular bone, presence of peritrabecular megakaryocytes (Mk), absence of normal or giant Mk. The multivariate analysis showed that only the level of hemoglobin and the presence of both normal Mk and fever independently influenced the prognosis. These parameters were used to set up a prognostic scoring system, allowing a feasible prognosis to be made for each patient at the time of diagnosis and identifying those patients in urgent need of new therapeutic approaches.
Applied survival analysis using R
Moore, Dirk F
2016-01-01
Applied Survival Analysis Using R covers the main principles of survival analysis, gives examples of how it is applied, and teaches how to put those principles to use to analyze data using R as a vehicle. Survival data, where the primary outcome is time to a specific event, arise in many areas of biomedical research, including clinical trials, epidemiological studies, and studies of animals. Many survival methods are extensions of techniques used in linear regression and categorical data, while other aspects of this field are unique to survival data. This text employs numerous actual examples to illustrate survival curve estimation, comparison of survivals of different groups, proper accounting for censoring and truncation, model variable selection, and residual analysis. Because explaining survival analysis requires more advanced mathematics than many other statistical topics, this book is organized with basic concepts and most frequently used procedures covered in earlier chapters, with more advanced topics...
Mathematical and statistical analysis
Houston, A. Glen
1988-01-01
The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.
Statistical data analysis handbook
National Research Council Canada - National Science Library
Wall, Francis J
1986-01-01
It must be emphasized that this is not a text book on statistics. Instead it is a working tool that presents data analysis in clear, concise terms which can be readily understood even by those without formal training in statistics...
Analyzing sickness absence with statistical models for survival data
DEFF Research Database (Denmark)
Christensen, Karl Bang; Andersen, Per Kragh; Smith-Hansen, Lars
2007-01-01
OBJECTIVES: Sickness absence is the outcome in many epidemiologic studies and is often based on summary measures such as the number of sickness absences per year. In this study the use of modern statistical methods was examined by making better use of the available information. Since sickness absence data deal with events occurring over time, the use of statistical models for survival data has been reviewed, and the use of frailty models has been proposed for the analysis of such data. METHODS: Three methods for analyzing data on sickness absences were compared using a simulation study... between the psychosocial work environment and sickness absence were used to illustrate the results. RESULTS: Standard methods were found to underestimate true effect sizes by approximately one-tenth [method i] and one-third [method ii] and to have lower statistical power than frailty models. CONCLUSIONS...
Per Object statistical analysis
DEFF Research Database (Denmark)
2008-01-01
This RS code is to do Object-by-Object analysis of each Object's sub-objects, e.g. statistical analysis of an object's individual image data pixels. Statistics, such as percentiles (so-called "quartiles") are derived by the process, but the return of that can only be a Scene Variable, not an Object Variable. This procedure was developed in order to be able to export objects as ESRI shape data with the 90-percentile of the Hue of each object's pixels as an item in the shape attribute table. This procedure uses a sub-level single pixel chessboard segmentation, loops for each of the objects... an analysis of the values of the object's pixels in MS-Excel. The shell of the procedure could also be used for purposes other than just the derivation of Object - Sub-object statistics, e.g. rule-based assignment processes.
Survival analysis models and applications
Liu, Xian
2012-01-01
Survival analysis concerns sequential occurrences of events governed by probabilistic laws. Recent decades have witnessed many applications of survival analysis in various disciplines. This book introduces both classic survival models and theories along with newly developed techniques. Readers will learn how to perform analysis of survival data by following numerous empirical illustrations in SAS. Survival Analysis: Models and Applications: Presents basic techniques before leading onto some of the most advanced topics in survival analysis. Assumes only a minimal knowledge of SAS whilst enablin
Beginning statistics with data analysis
Mosteller, Frederick; Rourke, Robert EK
2013-01-01
This introduction to the world of statistics covers exploratory data analysis, methods for collecting data, formal statistical inference, and techniques of regression and analysis of variance. 1983 edition.
Associative Analysis in Statistics
Directory of Open Access Journals (Sweden)
Mihaela Muntean
2015-03-01
In recent years, interest in technologies such as in-memory analytics and associative search has increased. This paper explores how in-memory analytics and an associative model can be used in statistics. The word “associative” puts the emphasis on understanding how datasets relate to one another. The paper presents the main characteristics of the “associative” data model and shows how to design an associative model for the analysis of labor market indicators. The source is the EU Labor Force Survey. The paper also shows how to carry out associative analysis.
Applied multivariate statistical analysis
Härdle, Wolfgang Karl
2015-01-01
Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners. It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added. All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior. All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...
Survival-time statistics for sample space reducing stochastic processes.
Yadav, Avinash Chand
2016-04-01
Stochastic processes wherein the size of the state space is changing as a function of time offer models for the emergence of scale-invariant features observed in complex systems. I consider such a sample-space reducing (SSR) stochastic process that results in a random sequence of strictly decreasing integers {x(t)},0≤t≤τ, with boundary conditions x(0)=N and x(τ) = 1. This model is shown to be exactly solvable: P_{N}(τ), the probability that the process survives for time τ is analytically evaluated. In the limit of large N, the asymptotic form of this probability distribution is Gaussian, with mean and variance both varying logarithmically with system size: 〈τ〉∼lnN and σ_{τ}^{2}∼lnN. Correspondence can be made between survival-time statistics in the SSR process and record statistics of independent and identically distributed random variables.
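The logarithmic growth of the mean survival time reported above can be checked numerically. The sketch below is illustrative only and is not the author's code: it assumes the standard SSR jump rule (from state x, move to a uniformly chosen integer in 1..x-1), which the abstract does not spell out, and the function names are hypothetical. Under that assumed rule the expected number of jumps from N to 1 is the harmonic number H_{N-1}, which grows like ln N.

```python
import random


def ssr_survival_time(n, rng):
    """Count the jumps an SSR walk needs to go from state n down to state 1."""
    state, steps = n, 0
    while state > 1:
        # Assumed rule: jump uniformly to any strictly lower state.
        state = rng.randint(1, state - 1)
        steps += 1
    return steps


def harmonic(m):
    """Harmonic number H_m = 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / k for k in range(1, m + 1))


def mean_survival_time(n, runs, seed=0):
    """Monte Carlo estimate of the mean survival time starting from state n."""
    rng = random.Random(seed)
    return sum(ssr_survival_time(n, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    n = 1000
    print("simulated mean:", mean_survival_time(n, 20000))
    print("H_(N-1):       ", harmonic(n - 1))
```

Comparing the simulated mean with `harmonic(n - 1)` for several values of N is a quick way to see the ln N scaling.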
Inferential Statistics from Black Hispanic Breast Cancer Survival Data
Directory of Open Access Journals (Sweden)
Hafiz M. R. Khan
2014-01-01
In this paper we test the statistical probability models for breast cancer survival data for race and ethnicity. Data was collected from breast cancer patients diagnosed in the United States during the years 1973–2009. We selected a stratified random sample of Black Hispanic female patients from the Surveillance Epidemiology and End Results (SEER) database to derive the statistical probability models. We used three common model building criteria, the Akaike Information Criteria (AIC), Bayesian Information Criteria (BIC), and Deviance Information Criteria (DIC), to measure goodness of fit, and it was found that the survival data of Black Hispanic female patients better fit the exponentiated exponential probability model. A novel Bayesian method was used to derive the posterior density function for the model parameters as well as to derive the predictive inference for future response. We specifically focused on Black Hispanic race. The Markov Chain Monte Carlo (MCMC) method was used for obtaining the summary results of posterior parameters. Additionally, we reported predictive intervals for future survival times. These findings would be of great significance in treatment planning and healthcare resource allocation.
DEFF Research Database (Denmark)
Ris Hansen, Inge; Søgaard, Karen; Gram, Bibi
2015-01-01
This is the analysis plan for the multicentre randomised control study looking at the effect of training and exercises in chronic neck pain patients that is being conducted in Jutland and Funen, Denmark. This plan will be used as a work description for the analyses of the data collected....
Research design and statistical analysis
Myers, Jerome L; Lorch Jr, Robert F
2013-01-01
Research Design and Statistical Analysis provides comprehensive coverage of the design principles and statistical concepts necessary to make sense of real data. The book's goal is to provide a strong conceptual foundation to enable readers to generalize concepts to new research situations. Emphasis is placed on the underlying logic and assumptions of the analysis and what it tells the researcher, the limitations of the analysis, and the consequences of violating assumptions. Sampling, design efficiency, and statistical models are emphasized throughout. As per APA recommendations
Survival Predictions of Ceramic Crowns Using Statistical Fracture Mechanics.
Nasrin, S; Katsube, N; Seghi, R R; Rokhlin, S I
2017-05-01
This work establishes a survival probability methodology for interface-initiated fatigue failures of monolithic ceramic crowns under simulated masticatory loading. A complete 3-dimensional (3D) finite element analysis model of a minimally reduced molar crown was developed using commercially available hardware and software. Estimates of material surface flaw distributions and fatigue parameters for 3 reinforced glass-ceramics (fluormica [FM], leucite [LR], and lithium disilicate [LD]) and a dense sintered yttrium-stabilized zirconia (YZ) were obtained from the literature and incorporated into the model. Utilizing the proposed fracture mechanics-based model, crown survival probability as a function of loading cycles was obtained from simulations performed on the 4 ceramic materials utilizing identical crown geometries and loading conditions. The weaker ceramic materials (FM and LR) resulted in lower survival rates than the more recently developed higher-strength ceramic materials (LD and YZ). The simulated 10-y survival rate of crowns fabricated from YZ was only slightly better than those fabricated from LD. In addition, 2 of the model crown systems (FM and LD) were expanded to determine regional-dependent failure probabilities. This analysis predicted that the LD-based crowns were more likely to fail from fractures initiating from margin areas, whereas the FM-based crowns showed a slightly higher probability of failure from fractures initiating from the occlusal table below the contact areas. These 2 predicted fracture initiation locations have some agreement with reported fractographic analyses of failed crowns. In this model, we considered the maximum tensile stress tangential to the interfacial surface, as opposed to the more universally reported maximum principal stress, because it more directly impacts crack propagation. While the accuracy of these predictions needs to be experimentally verified, the model can provide a fundamental understanding of the
Frailty Models in Survival Analysis
Wienke, Andreas
2010-01-01
The concept of frailty offers a convenient way to introduce unobserved heterogeneity and associations into models for survival data. In its simplest form, frailty is an unobserved random proportionality factor that modifies the hazard function of an individual or a group of related individuals. "Frailty Models in Survival Analysis" presents a comprehensive overview of the fundamental approaches in the area of frailty models. The book extensively explores how univariate frailty models can represent unobserved heterogeneity. It also emphasizes correlated frailty models as extensions of
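The "random proportionality factor" described above is usually written as follows. This is the standard univariate formulation, given here as a general illustration rather than as the book's own notation; Z is the frailty, h_0 and H_0 are the baseline hazard and cumulative hazard, and the gamma case assumes Z has mean 1 and variance \sigma^2.

```latex
h(t \mid Z) = Z\,h_0(t),
\qquad
S(t \mid Z) = \exp\{-Z\,H_0(t)\},
\qquad
S(t) = \mathbb{E}\!\left[e^{-Z H_0(t)}\right]
     = \left(1 + \sigma^2 H_0(t)\right)^{-1/\sigma^2}
\quad \text{(gamma frailty)}.
```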
Mathematical Methods in Survival Analysis, Reliability and Quality of Life
Huber, Catherine; Mesbah, Mounir
2008-01-01
Reliability and survival analysis are important applications of stochastic mathematics (probability, statistics and stochastic processes) that are usually covered separately in spite of the similarity of the involved mathematical theory. This title aims to redress this situation: it includes 21 chapters divided into four parts: Survival analysis, Reliability, Quality of life, and Related topics. Many of these chapters were presented at the European Seminar on Mathematical Methods for Survival Analysis, Reliability and Quality of Life in 2006.
Statistical methods for bioimpedance analysis
Directory of Open Access Journals (Sweden)
Christian Tronstad
2014-04-01
This paper gives a basic overview of relevant statistical methods for the analysis of bioimpedance measurements, with an aim to answer questions such as: How do I begin with planning an experiment? How many measurements do I need to take? How do I deal with large amounts of frequency sweep data? Which statistical test should I use, and how do I validate my results? Beginning with the hypothesis and the research design, the methodological framework for making inferences based on measurements and statistical analysis is explained. This is followed by a brief discussion on correlated measurements and data reduction before an overview is given of statistical methods for comparison of groups, factor analysis, association, regression and prediction, explained in the context of bioimpedance research. The last chapter is dedicated to the validation of a new method by different measures of performance. A flowchart is presented for selection of statistical method, and a table is given for an overview of the most important terms of performance when evaluating new measurement technology.
Pyrotechnic Shock Analysis Using Statistical Energy Analysis
2015-10-23
29th Aerospace Testing Seminar, October 2015 Pyrotechnic Shock Analysis Using Statistical Energy Analysis James Ho-Jin Hwang Engineering...maximum structural response due to a pyrotechnic shock input using Statistical Energy Analysis (SEA). It had been previously understood that since the...pyrotechnic shock is not a steady state event, traditional SEA method may not be applicable. A new analysis methodology effectively utilizes the
Regularized Statistical Analysis of Anatomy
DEFF Research Database (Denmark)
Sjöstrand, Karl
2007-01-01
This thesis presents the application and development of regularized methods for the statistical analysis of anatomical structures. Focus is on structure-function relationships in the human brain, such as the connection between early onset of Alzheimer’s disease and shape changes of the corpus cal...
Survival probability and order statistics of diffusion on disordered media.
Acedo, L; Yuste, S B
2002-07-01
We investigate the first passage time t(j,N) to a given chemical or Euclidean distance of the first j of a set of N>1 independent random walkers all initially placed on a site of a disordered medium. To solve this order-statistics problem we assume that, for short times, the survival probability (the probability that a single random walker is not absorbed by a hyperspherical surface during some time interval) decays for disordered media in the same way as for Euclidean and some class of deterministic fractal lattices. This conjecture is checked by simulation on the incipient percolation aggregate embedded in two dimensions. Arbitrary moments of t(j,N) are expressed in terms of an asymptotic series in powers of 1/ln N, which is formally identical to those found for Euclidean and (some class of) deterministic fractal lattices. The agreement of the asymptotic expressions with simulation results for the two-dimensional percolation aggregate is good when the boundary is defined in terms of the chemical distance. The agreement worsens slightly when the Euclidean distance is used.
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Duality of circulation decay statistics and survival probability
2010-09-01
Survival probability and circulation decay history have both been used for setting wake turbulence separation standards. Conceptually a strong correlation should exist between these two characterizations of the vortex behavior, however, the literatur...
Incidence, Survival and Prevalence Statistics of Classical Myeloproliferative Neoplasm in Korea.
Lim, Yoojoo; Lee, Jeong Ok; Bang, Soo Mee
2016-10-01
The nationwide statistical analysis of each disease of classical myeloproliferative neoplasm (MPN) in Korea has not been reported yet. To this end, we have analyzed incidence rates, survival rates and treatment pattern of polycythemia vera (PV), primary myelofibrosis (MF) and essential thrombocythemia (ET) using Korea National Cancer Incidence Database (KNCIDB) and Health Insurance Review and Assessment Service (HIRA) database. Between 2003 and 2011, a total of 4,342 new cases of MPN were reported to the KNCIDB. ET was the most common, followed by MF and PV. The crude incidence rates for PV, MF, and ET have increased during the period, reaching 0.40, 0.15, and 0.84 per 100,000, respectively. Five-year relative survival rate of all MPN patients was 89.3%, with lowest relative survival rate with MF (53.1%). The prevalence of each disease estimated from HIRA data also increased during the study period. Notably, ET was found to be most prevalent. The prescription rate of hydroxyurea and phlebotomy to PV, MF and ET patients remained constant over the period, and the prescription rate of hydroxyurea was higher in patients with age over 60 years. This is the first Korean nationwide statistics of MPN, using central registry data. This set of data can be utilized to compare the Korean MPN status to international data and guidelines.
Statistical analysis of management data
Gatignon, Hubert
2013-01-01
This book offers a comprehensive approach to multivariate statistical analyses. It provides theoretical knowledge of the concepts underlying the most important multivariate techniques and an overview of actual applications.
Empirical likelihood method in survival analysis
Zhou, Mai
2015-01-01
Add the Empirical Likelihood to Your Nonparametric Toolbox. Empirical Likelihood Method in Survival Analysis explains how to use the empirical likelihood method for right censored survival data. The author uses R for calculating empirical likelihood and includes many worked out examples with the associated R code. The datasets and code are available for download on his website and CRAN. The book focuses on all the standard survival analysis topics treated with empirical likelihood, including hazard functions, cumulative distribution functions, analysis of the Cox model, and computation of empiric
A Statistical Analysis of Cryptocurrencies
Stephen Chan; Jeffrey Chu; Saralees Nadarajah; Joerg Osterrieder
2017-01-01
We analyze statistical properties of the largest cryptocurrencies (determined by market capitalization), of which Bitcoin is the most prominent example. We characterize their exchange rates versus the U.S. Dollar by fitting parametric distributions to them. It is shown that returns are clearly non-normal, however, no single distribution fits well jointly to all the cryptocurrencies analysed. We find that for the most popular currencies, such as Bitcoin and Litecoin, the generalized hyperbolic...
A Statistical Analysis of Cryptocurrencies
Directory of Open Access Journals (Sweden)
Stephen Chan
2017-05-01
We analyze statistical properties of the largest cryptocurrencies (determined by market capitalization), of which Bitcoin is the most prominent example. We characterize their exchange rates versus the U.S. Dollar by fitting parametric distributions to them. It is shown that returns are clearly non-normal; however, no single distribution fits well jointly to all the cryptocurrencies analysed. We find that for the most popular currencies, such as Bitcoin and Litecoin, the generalized hyperbolic distribution gives the best fit, while for the smaller cryptocurrencies the normal inverse Gaussian distribution, generalized t distribution, and Laplace distribution give good fits. The results are important for investment and risk management purposes.
Morphological Analysis for Statistical Machine Translation
National Research Council Canada - National Science Library
Lee, Young-Suk
2004-01-01
We present a novel morphological analysis technique which induces a morphological and syntactic symmetry between two languages with highly asymmetrical morphological structures to improve statistical...
Statistical Power in Meta-Analysis
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
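As a point of reference for the "analytical power" of a two-sample mean difference test mentioned above, a minimal sketch of the usual normal-approximation formula follows. It is not the study's own procedure: it assumes equal group sizes, a known common variance, a two-sided z-test, and a standardized effect size d, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (difference / common SD);
    both groups are assumed to have n_per_group observations.
    """
    z = NormalDist()  # standard normal
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Mean of the test statistic under the alternative hypothesis.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    for n in (20, 50, 64, 100):
        print(n, round(two_sample_power(0.5, n), 3))
```

With a zero effect size the function returns alpha, which is a useful sanity check; a simulated power estimate can then be compared against this value.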
Statistical methods for astronomical data analysis
Chattopadhyay, Asis Kumar
2014-01-01
This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminologies for astronomical phenomenon will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for ...
STATISTICAL ANALYSIS OF MONETARY POLICY INDICATORS VARIABILITY
Directory of Open Access Journals (Sweden)
ANAMARIA POPESCU
2016-10-01
This paper attempts to characterize the available statistical data through statistical indicators. The purpose of this paper is to present statistical indicators, primary and secondary, simple and synthetic, which are frequently used for the statistical characterization of statistical series. We can thus analyze the central tendency, variability, form, and concentration of data distributions using the analytical tools in Microsoft Excel, which enable automatic calculation of descriptive statistics through the Data Analysis option in the Tools menu. The links that exist between statistical variables can also be studied using two techniques, correlation and regression. From the analysis of monetary policy in the period 2003 - 2014 and information provided by the website of the National Bank of Romania (BNR), there seems to be a certain tendency towards eccentricity and asymmetry in the financial data series.
Relevance Vector Machine for Survival Analysis.
Kiaee, Farkhondeh; Sheikhzadeh, Hamid; Mahabadi, Samaneh Eftekhari
2016-03-01
An accelerated failure time (AFT) model has been widely used for the analysis of censored survival or failure time data. However, the AFT imposes the restrictive log-linear relation between the survival time and the explanatory variables. In this paper, we introduce a relevance vector machine survival (RVMS) model based on Weibull AFT model that enables the use of kernel framework to automatically learn the possible nonlinear effects of the input explanatory variables on target survival times. We take advantage of the Bayesian inference technique in order to estimate the model parameters. We also introduce two approaches to accelerate the RVMS training. In the first approach, an efficient smooth prior is employed that improves the degree of sparsity. In the second approach, a fast marginal likelihood maximization procedure is used for obtaining a sparse solution of survival analysis task by sequential addition and deletion of candidate basis functions. These two approaches, denoted by smooth RVMS and fast RVMS, typically use fewer basis functions than RVMS and improve the RVMS training time; however, they cause a slight degradation in the RVMS performance. We compare the RVMS and the two accelerated approaches with the previous sparse kernel survival analysis method on a synthetic data set as well as six real-world data sets. The proposed kernel survival analysis models have been discovered to be more accurate in prediction, although they benefit from extra sparsity. The main advantages of our proposed models are: 1) extra sparsity that leads to a better generalization and avoids overfitting; 2) automatic relevance sample determination based on data that provide more accuracy, in particular for highly censored survival data; and 3) flexibility to utilize arbitrary number and types of kernel functions (e.g., non-Mercer kernels and multikernel learning).
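The log-linear restriction of the AFT model referred to above can be made explicit. The first expression is the standard Weibull AFT form; the second is only a schematic of how a kernel expansion could replace the linear predictor, with weights w_j and kernel K introduced here for illustration rather than taken from the paper.

```latex
\log T_i = x_i^{\top}\beta + \sigma\,\varepsilon_i,
\qquad \varepsilon_i \sim \text{standard extreme value (minimum)}
\;\Longrightarrow\;
T_i \sim \mathrm{Weibull}\!\left(\text{shape } 1/\sigma,\ \text{scale } e^{x_i^{\top}\beta}\right);
\qquad
x_i^{\top}\beta \;\longrightarrow\; f(x_i) = w_0 + \sum_{j} w_j\,K(x_i, x_j).
```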
Analysis of survival data from telemetry projects
Bunck, C.M.; Winterstein, S.R.; Pollock, K.H.
1985-01-01
Telemetry techniques can be used to study the survival rates of animal populations and are particularly suitable for species or settings for which band recovery models are not. Statistical methods for estimating survival rates and parameters of survival distributions from observations of radio-tagged animals will be described. These methods have been applied to medical and engineering studies and to the study of nest success. Estimates and tests based on discrete models, originally introduced by Mayfield, and on continuous models, both parametric and nonparametric, will be described. Generalizations, including staggered entry of subjects into the study and identification of mortality factors will be considered. Additional discussion topics will include sample size considerations, relocation frequency for subjects, and use of covariates.
Statistical analysis with Excel for dummies
Schmuller, Joseph
2013-01-01
Take the mystery out of statistical terms and put Excel to work! If you need to create and interpret statistics in business or classroom settings, this easy-to-use guide is just what you need. It shows you how to use Excel's powerful tools for statistical analysis, even if you've never taken a course in statistics. Learn the meaning of terms like mean and median, margin of error, standard deviation, and permutations, and discover how to interpret the statistics of everyday life. You'll learn to use Excel formulas, charts, PivotTables, and other tools to make sense of everything fro
Attenuation caused by infrequently updated covariates in survival analysis
DEFF Research Database (Denmark)
Andersen, Per Kragh; Liestøl, Knut
2003-01-01
Attenuation; Cox regression model; Measurement errors; Survival analysis; Time-dependent covariates
Statistical analysis of medical data using SAS
Der, Geoff
2005-01-01
An Introduction to SAS; Describing and Summarizing Data; Basic Inference; Scatterplots Correlation: Simple Regression and Smoothing; Analysis of Variance and Covariance; Multiple Regression; Logistic Regression; The Generalized Linear Model; Generalized Additive Models; Nonlinear Regression Models; The Analysis of Longitudinal Data I; The Analysis of Longitudinal Data II: Models for Normal Response Variables; The Analysis of Longitudinal Data III: Non-Normal Response; Survival Analysis; Analysing Multivariate Data: Principal Components and Cluster Analysis; References
Survival analysis of orthodontic mini-implants.
Lee, Shin-Jae; Ahn, Sug-Joon; Lee, Jae Won; Kim, Seong-Hun; Kim, Tae-Woo
2010-02-01
Survival analysis is useful in clinical research because it focuses on comparing the survival distributions and the identification of risk factors. Our aim in this study was to investigate the survival characteristics and risk factors of orthodontic mini-implants with survival analyses. One hundred forty-one orthodontic patients (treated from October 1, 2000, to November 29, 2007) were included in this survival study. A total of 260 orthodontic mini-implants that had sandblasted (large grit) and acid-etched screw parts were placed between the maxillary second premolar and the first molar. Failures of the implants were recorded as event data, whereas implants that were removed because treatment ended and those that were not removed during the study period were recorded as censored data. A nonparametric life table method was used to visualize the hazard function, and Kaplan-Meier survival curves were generated to identify the variables associated with implant failure. Prognostic variables associated with implant failure were identified with the Cox proportional hazard model. Of the 260 implants, 22 failed. The hazard function for implant failure showed that the risk is highest immediately after placement. The survival function showed that the median survival time of orthodontic mini-implants is sufficient for relatively long orthodontic treatments. The Cox proportional hazard model identified that increasing age is a decisive factor for implant survival. The decreasing pattern of the hazard function suggested gradual osseointegration of orthodontic mini-implants. When implants are placed in a young patient, special caution is needed to lessen the increased probability of failure, especially immediately after placement.
Model selection criterion in survival analysis
Karabey, Uǧur; Tutkun, Nihal Ata
2017-07-01
Survival analysis deals with the time until the occurrence of an event of interest such as death, recurrence of an illness, failure of a piece of equipment, or divorce. Various survival models, with semi-parametric or parametric approaches, are used in the medical, natural, and social sciences. Deciding on the most appropriate model for the data is an important point of the analysis. In the literature, the Akaike information criterion and the Bayesian information criterion are used to select among nested models. In this study, the behavior of these information criteria is discussed for a real data set.
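As a rough illustration of the two criteria named in this abstract, the sketch below computes AIC = 2k - 2 log L and BIC = k log(n) - 2 log L from a maximized log-likelihood. The log-likelihood values, parameter counts, and sample size are hypothetical and are not taken from the study; whether n should be the number of subjects or the number of observed events is a modelling choice that the abstract does not settle.

```python
import math

def aic(loglik, k):
    """Akaike information criterion for a model with k free parameters."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion; n is the effective sample size."""
    return k * math.log(n) - 2 * loglik

# Hypothetical fits of two candidate survival models to the same data.
candidates = {
    "exponential": {"loglik": -104.2, "k": 1},
    "weibull": {"loglik": -100.0, "k": 2},
}
n = 100  # assumed sample size

for name, fit in candidates.items():
    print(name, aic(fit["loglik"], fit["k"]), bic(fit["loglik"], fit["k"], n))
```

Lower values indicate the preferred model under either criterion; BIC penalizes extra parameters more heavily once n exceeds about 7.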
[Statistical analysis using freely-available "EZR (Easy R)" software].
Kanda, Yoshinobu
2015-10-01
Clinicians must often perform statistical analyses for purposes such as evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions, but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical function of the R commander is limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application of statistical functions that are frequently used in clinical studies, such as survival analyses, including competing risk analyses and the use of time-dependent covariates, by point-and-click access. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and assure that the statistical process is overseen by a supervisor.
Hypothesis testing and statistical analysis of microbiome
Directory of Open Access Journals (Sweden)
Yinglin Xia
2017-09-01
After the initiation of the Human Microbiome Project in 2008, various biostatistical and bioinformatic tools for data analysis and computational methods have been developed and applied to microbiome studies. In this review and perspective, we discuss the research and statistical hypotheses in gut microbiome studies, focusing on mechanistic concepts that underlie the complex relationships among host, microbiome, and environment. We review the currently available statistical tools and highlight recent progress of newly developed statistical methods and models. Given the current challenges and limitations in biostatistical approaches and tools, we discuss the future direction in developing statistical methods and models for microbiome studies.
Statistical shape analysis with applications in R
Dryden, Ian L
2016-01-01
A thoroughly revised and updated edition of this introduction to modern statistical methods for shape analysis. Shape analysis is an important tool in the many disciplines where objects are compared using geometrical features. Examples include comparing brain shape in schizophrenia; investigating protein molecules in bioinformatics; and describing growth of organisms in biology. This book is a significant update of the highly-regarded 'Statistical Shape Analysis' by the same authors. The new edition lays the foundations of landmark shape analysis, including geometrical concepts and statistical techniques, and extends to include analysis of curves, surfaces, images and other types of object data. Key definitions and concepts are discussed throughout, and the relative merits of different approaches are presented. The authors have included substantial new material on recent statistical developments and offer numerous examples throughout the text. Concepts are introduced in an accessible manner, while reta...
Spatial analysis statistics, visualization, and computational methods
Oyana, Tonny J
2015-01-01
An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis, containing hands-on problem sets that can be worked out in MS Excel or ArcGIS, as well as detailed illustrations and numerous case studies. The book enables readers to: identify types and characterize non-spatial and spatial data; demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results; construct testable hypotheses that require inferential statistical analysis; process spatial data, extract explanatory variables, conduct statisti...
Advances in statistical models for data analysis
Minerva, Tommaso; Vichi, Maurizio
2015-01-01
This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.
Covariate analysis of bivariate survival data
Energy Technology Data Exchange (ETDEWEB)
Bennett, L.E.
1992-01-01
The methods developed are used to analyze the effects of covariates on bivariate survival data when censoring and ties are present. The proposed method provides models for bivariate survival data that include differential covariate effects and censored observations. The proposed models are based on an extension of the univariate Buckley-James estimators which replace censored data points by their expected values, conditional on the censoring time and the covariates. For the bivariate situation, it is necessary to determine the expectation of the failure times for one component conditional on the failure or censoring time of the other component. Two different methods have been developed to estimate these expectations. In the semiparametric approach these expectations are determined from a modification of Burke's estimate of the bivariate empirical survival function. In the parametric approach censored data points are also replaced by their conditional expected values where the expected values are determined from a specified parametric distribution. The model estimation will be based on the revised data set, comprised of uncensored components and expected values for the censored components. The variance-covariance matrix for the estimated covariate parameters has also been derived for both the semiparametric and parametric methods. Data from the Demographic and Health Survey was analyzed by these methods. The two outcome variables are post-partum amenorrhea and breastfeeding; education and parity were used as the covariates. Both the covariate parameter estimates and the variance-covariance estimates for the semiparametric and parametric models will be compared. In addition, a multivariate test statistic was used in the semiparametric model to examine contrasts. The significance of the statistic was determined from a bootstrap distribution of the test statistic.
SURVIVAL ANALYSIS AND LENGTH-BIASED SAMPLING
Directory of Open Access Journals (Sweden)
Masoud Asgharian
2010-12-01
When survival data are collected as part of a prevalent cohort study, the recruited cases have already experienced their initiating event. These prevalent cases are then followed for a fixed period of time at the end of which the subjects will either have failed or have been censored. When interest lies in estimating the survival distribution, from onset, of subjects with the disease, one must take into account that the survival times of the cases in a prevalent cohort study are left truncated. When it is possible to assume that there has not been any epidemic of the disease over the past period of time that covers the onset times of the subjects, one may assume that the underlying incidence process that generates the initiating event times is a stationary Poisson process. Under such an assumption, the survival times of the recruited subjects are called "length-biased". I discuss the challenges one is faced with in analyzing this type of data. To address the theoretical aspects of the work, I present asymptotic results for the NPMLE of the length-biased as well as the unbiased survival distribution. I also discuss estimating the unbiased survival function using only the follow-up time. This addresses the case that the onset times are either unknown or known with uncertainty. Some of our most recent work and open questions will be presented. These include some aspects of analysis of covariates, strong approximation, functional LIL and density estimation under length-biased sampling with right censoring. The results will be illustrated with survival data from patients with dementia, collected as part of the Canadian Study of Health and Aging (CSHA).
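To see why length-biased survival times need special treatment, the sketch below applies the standard length-bias weighting g(t) = t f(t) / E[T] to a discrete survival-time distribution. The function names and the two-point distribution are hypothetical; this illustrates only the sampling bias and is not the NPMLE discussed in the abstract.

```python
def length_biased_pmf(pmf):
    """Return the length-biased version of a discrete survival-time pmf.

    pmf maps each survival time t > 0 to its probability f(t).
    The length-biased probability is g(t) = t * f(t) / E[T].
    """
    mean = sum(t * p for t, p in pmf.items())
    return {t: t * p / mean for t, p in pmf.items()}

def mean_of(pmf):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(t * p for t, p in pmf.items())

# Hypothetical population: half the cases survive 1 year, half survive 3.
unbiased = {1.0: 0.5, 3.0: 0.5}
biased = length_biased_pmf(unbiased)

print(biased)                           # longer survivors are over-represented
print(mean_of(unbiased), mean_of(biased))
```

The mean of the length-biased distribution equals E[T^2] / E[T], which exceeds E[T] whenever survival times vary, so a naive analysis of a prevalent cohort overstates survival.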
Classification, (big) data analysis and statistical learning
Conversano, Claudio; Vichi, Maurizio
2018-01-01
This edited book focuses on the latest developments in classification, statistical learning, data analysis and related areas of data science, including statistical analysis of large datasets, big data analytics, time series clustering, integration of data from different sources, as well as social networks. It covers both methodological aspects as well as applications to a wide range of areas such as economics, marketing, education, social sciences, medicine, environmental sciences and the pharmaceutical industry. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field. The peer-reviewed contributions were presented at the 10th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in Santa Margherita di Pul...
Statistics and analysis of scientific data
Bonamente, Massimiliano
2013-01-01
Statistics and Analysis of Scientific Data covers the foundations of probability theory and statistics, and a number of numerical and analytical methods that are essential for the present-day analyst of scientific data. Topics covered include probability theory, distribution functions of statistics, fits to two-dimensional datasets and parameter estimation, Monte Carlo methods and Markov chains. Equal attention is paid to the theory and its practical application, and results from classic experiments in various fields are used to illustrate the importance of statistics in the analysis of scientific data. The main pedagogical method is a theory-then-application approach, where emphasis is placed first on a sound understanding of the underlying theory of a topic, which becomes the basis for an efficient and proactive use of the material for practical applications. The level is appropriate for undergraduates and beginning graduate students, and as a reference for the experienced researcher. Basic calculus is us...
Neyman, Markov processes and survival analysis.
Yang, Grace
2013-07-01
J. Neyman used stochastic processes extensively in his applied work. One example is the Fix and Neyman (F-N) competing risks model (1951) that uses finite homogeneous Markov processes to analyse clinical trials with breast cancer patients. We revisit the F-N model, and compare it with the Kaplan-Meier (K-M) formulation for right censored data. The comparison offers a way to generalize the K-M formulation to include risks of recovery and relapses in the calculation of a patient's survival probability. The generalization is to extend the F-N model to a nonhomogeneous Markov process. Closed-form solutions of the survival probability are available in special cases of the nonhomogeneous processes, like the popular multiple decrement model (including the K-M model) and Chiang's staging model, but these models do not consider recovery and relapses while the F-N model does. An analysis of sero-epidemiology current status data with recurrent events is illustrated. Fix and Neyman used Neyman's RBAN (regular best asymptotic normal) estimates for the risks, and provided a numerical example showing the importance of considering both the survival probability and the length of time of a patient living a normal life in the evaluation of clinical trials. The said extension would result in a complicated model and it is unlikely to find analytical closed-form solutions for survival analysis. With ever increasing computing power, numerical methods offer a viable way of investigating the problem.
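For reference, the Kaplan-Meier formulation for right-censored data mentioned in this abstract can be sketched as a product-limit estimator. The function name and the toy data are assumptions for illustration; the sketch follows the usual convention that subjects censored at time t are still at risk for events at t, and it does not cover the recovery and relapse transitions of the Fix and Neyman model.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if right-censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= removed
    return curve

# Hypothetical data: five subjects, two of them censored.
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
```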
Reproducible statistical analysis with multiple languages
DEFF Research Database (Denmark)
Lenth, Russell; Højsgaard, Søren
2011-01-01
This paper describes the system for making reproducible statistical analyses. differs from other systems for reproducible analysis in several ways. The two main differences are: (1) Several statistics programs can be used in the same document. (2) Documents can be prepared using OpenOffice or LaTeX. The main part of this paper is an example showing how to use and together in an OpenOffice text document. The paper also contains some practical considerations on the use of literate programming in statistics.
The implicative statistical analysis: an interdisciplinary paradigm
Iurato, Giuseppe
2012-01-01
In this brief note, which has simply the role of an epistemological survey paper, some of the main basic elements of Implicative Statistical Analysis (ISA) pattern are put into a possible critical comparison with some of the main aspects of Probability Theory, Inductive Inference Theory, Nonparametric and Multivariate Statistics, Optimization Theory and Dynamical System Theory which point out the very interesting multidisciplinary nature of the ISA pattern and related possible hints.
Foundation of statistical energy analysis in vibroacoustics
Le Bot, A
2015-01-01
This title deals with the statistical theory of sound and vibration. The foundation of statistical energy analysis is presented in great detail. In the modal approach, an introduction to random vibration with application to complex systems having a large number of modes is provided. For the wave approach, the phenomena of propagation, group speed, and energy transport are extensively discussed. Particular emphasis is given to the emergence of diffuse field, the central concept of the theory.
Statistical analysis of SAMPEX PET proton measurements
Pierrard, V; Heynderickx, D; Kruglanski, M; Looper, M; Blake, B; Mewaldt, D
2000-01-01
We present a statistical study of the distributions of proton counts from the Proton-Electron Telescope aboard the low-altitude polar satellite SAMPEX. Our statistical analysis shows that histograms of observed proton counts are generally distributed according to Poisson distributions but are sometimes quite different. The observed departures from Poisson distributions can be attributed to variations of the average flux or to the non-constancy of the detector lifetimes.
Making relative survival analysis relatively easy.
Pohar, Maja; Stare, Janez
2007-12-01
In survival analysis we are interested in the time from the beginning of an observation until a certain event (death, relapse, etc.). We assume that the final event is well defined, so that we are never in doubt whether the final event has occurred or not. In practice this is not always true. If we are interested in cause-specific deaths, then it may sometimes be difficult or even impossible to establish the cause of death, or there may be different causes of death, making it impossible to assign death to just one cause. Suicides of terminal cancer patients are a typical example. In such cases, standard survival techniques cannot be used for estimation of mortality due to a certain cause. The remedy is relative survival techniques, which compare the survival experience in a study cohort to the one expected should they follow the background population mortality rates. This enables the estimation of the proportion of deaths due to a certain cause. In this paper, we briefly review some of the techniques to model relative survival, and outline a new fitting method for the additive model, which solves the problem of dependency of the parameter estimation on the assumption about the baseline excess hazard. We then direct the reader's attention to our R package relsurv that provides functions for easy and flexible fitting of all the commonly used relative survival regression models. The basic features of the package have been described in detail elsewhere, but here we additionally explain the usage of the new fitting method and the interface for using population mortality data freely available on the Internet. The combination of the package and the data sets provides a powerful informational tool in the hands of a skilled statistician/informatician.
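The comparison of cohort survival with the survival expected under background population mortality can be sketched in a few lines. The example below is in Python rather than R and does not use or imitate the relsurv package; the function names and all numbers are hypothetical. It shows the relative survival ratio and the excess hazard of the additive model, in which the observed hazard is the sum of the population hazard and the excess hazard.

```python
def relative_survival(observed, expected):
    """Ratio of observed cohort survival to expected population survival."""
    return [o / e for o, e in zip(observed, expected)]

def excess_hazard(observed_hazard, population_hazard):
    """Additive model: observed hazard = population hazard + excess hazard."""
    return [h - p for h, p in zip(observed_hazard, population_hazard)]

# Hypothetical survival proportions at 1, 3 and 5 years of follow-up.
observed = [0.80, 0.60, 0.45]   # study cohort
expected = [0.95, 0.90, 0.85]   # matched background population

for year, r in zip([1, 3, 5], relative_survival(observed, expected)):
    print(year, round(r, 3))

# Hypothetical yearly hazards for the cohort and the background population.
print(excess_hazard([0.20, 0.15], [0.05, 0.05]))
```

A ratio below 1 indicates that the cohort survives worse than the background population, and the shortfall is attributed to the disease without needing cause-of-death information.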
Integrative Genomics with Mediation Analysis in a Survival Context
Directory of Open Access Journals (Sweden)
Szilárd Nemes
2013-01-01
DNA copy number aberrations (DCNA) and subsequent altered gene expression profiles may have a major impact on tumor initiation, on development, and eventually on recurrence and cancer-specific mortality. However, most methods employed in integrative genomic analysis of the two biological levels, DNA and RNA, do not consider survival time. In the present note, we propose the adoption of a survival analysis-based framework for the integrative analysis of DCNA and mRNA levels to reveal their implication on patient clinical outcome with the prerequisite that the effect of DCNA on survival is mediated by mRNA levels. The specific aim of the paper is to offer a feasible framework to test the DCNA-mRNA-survival pathway. We provide statistical inference algorithms for mediation based on asymptotic results. Furthermore, we illustrate the applicability of the method in an integrative genomic analysis setting by using a breast cancer data set consisting of 141 invasive breast tumors. In addition, we provide implementation in R.
Wright, Marvin N; Dankowski, Theresa; Ziegler, Andreas
2017-04-15
The most popular approach for analyzing survival data is the Cox regression model. The Cox model may, however, be misspecified, and its proportionality assumption may not always be fulfilled. An alternative approach for survival prediction is random forests for survival outcomes. The standard split criterion for random survival forests is the log-rank test statistic, which favors splitting variables with many possible split points. Conditional inference forests avoid this split variable selection bias. However, linear rank statistics are utilized by default in conditional inference forests to select the optimal splitting variable, which cannot detect non-linear effects in the independent variables. An alternative is to use maximally selected rank statistics for the split point selection. As in conditional inference forests, splitting variables are compared on the p-value scale. However, instead of the conditional Monte-Carlo approach used in conditional inference forests, p-value approximations are employed. We describe several p-value approximations and the implementation of the proposed random forest approach. A simulation study demonstrates that unbiased split variable selection is possible. However, there is a trade-off between unbiased split variable selection and runtime. In benchmark studies of prediction performance on simulated and real datasets, the new method performs better than random survival forests if informative dichotomous variables are combined with uninformative variables with more categories and better than conditional inference forests if non-linear covariate effects are included. In a runtime comparison, the method proves to be computationally faster than both alternatives, if a simple p-value approximation is used. Copyright © 2017 John Wiley & Sons, Ltd.
Statistical analysis of network data with R
Kolaczyk, Eric D
2014-01-01
Networks have permeated everyday life through everyday realities like the Internet, social networks, and viral marketing. As such, network analysis is an important growth area in the quantitative sciences, with roots in social network analysis going back to the 1930s and graph theory going back centuries. Measurement and analysis are integral components of network research. As a result, statistical methods play a critical role in network analysis. This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. This text builds on Eric D. Kolaczyk’s book Statistical Analysis of Network Data (Springer, 2009).
Cancer Statistics in Korea: Incidence, Mortality, Survival, and Prevalence in 2014.
Jung, Kyu-Won; Won, Young-Joo; Oh, Chang-Mo; Kong, Hyun-Joo; Lee, Duk Hyoung; Lee, Kang Hyun
2017-04-01
This study presents the 2014 nationwide cancer statistics in Korea, including cancer incidence, survival, prevalence, and mortality. Cancer incidence data from 1999 to 2014 were obtained from the Korea National Cancer Incidence Database and followed until December 31, 2015. Mortality data from 1983 to 2014 were obtained from Statistics Korea. The prevalence was defined as the number of cancer patients alive on January 1, 2015, among all cancer patients diagnosed since 1999. Crude and age-standardized rates (ASRs) for incidence, mortality, prevalence, and 5-year relative survivals were also calculated. In 2014, 217,057 Koreans were newly diagnosed with cancer and 76,611 died from it. The ASRs for cancer incidence and mortality in 2014 were 270.7 and 85.1 per 100,000, respectively. The all-cancer incidence rate increased significantly by 3.4% annually from 1999 to 2012 and started to decrease after 2012 (2012-2014; annual percent change, -6.6%). However, overall cancer mortality has decreased 2.7% annually since 2002. The 5-year relative survival rate for patients diagnosed with cancer between 2010 and 2014 was 70.3%, an improvement from the 41.2% for patients diagnosed between 1993 and 1995. Age-standardized cancer incidence rates have decreased since 2012 and mortality rates have also declined since 2002, while 5-year survival rates have improved remarkably from 1993-1995 to 2010-2014 in Korea.
Survival analysis of patients on maintenance hemodialysis
Directory of Open Access Journals (Sweden)
A Chandrashekar
2014-01-01
Despite the continuous improvement of dialysis technology and pharmacological treatment, mortality rates for dialysis patients are still high. A 2-year prospective study was conducted at a tertiary care hospital to determine the factors influencing survival among patients on maintenance hemodialysis. 96 patients with end-stage renal disease surviving more than 3 months on hemodialysis (8-12 h/week) were studied. Follow-up was censored at the time of death or at the end of the 2-year study period, whichever occurred first. Of the 96 patients studied (mean age 49.74 ± 14.55 years, 75% male and 44.7% diabetics), 19 died with an estimated mortality rate of 19.8%. On an age-adjusted multivariate analysis, female gender and hypokalemia independently predicted mortality. In Cox analyses, patient survival was associated with delivered dialysis dose (single pool Kt/V, hazard ratio [HR] = 0.01, P = 0.016), frequency of hemodialysis (HR = 3.81, P = 0.05) and serum albumin (HR = 0.24, P = 0.005). There was no significant difference between diabetic and non-diabetic patients in relation to death (Relative Risk = 1.109; 95% CI = 0.49-2.48, P = 0.803). This study revealed that mortality among hemodialysis patients remained high, mostly due to sepsis and ischemic heart disease. Patient survival was better with higher dialysis dose, increased frequency of dialysis and adequate serum albumin level. Efforts at minimizing infectious complications, preventing cardiovascular events and improving nutrition should increase survival among hemodialysis patients.
About Statistical Analysis of Qualitative Survey Data
Directory of Open Access Journals (Sweden)
Stefan Loehnert
2010-01-01
Gathered data is frequently not in a numerical form that allows immediate application of quantitative mathematical-statistical methods. This paper examines some basic aspects of how quantitative-based statistical methodology can be utilized in the analysis of qualitative data sets. The transformation of qualitative data into numeric values is considered the entry point to quantitative analysis. Related publications and the impacts of scale transformations are also discussed. Subsequently, it is shown how correlation coefficients can be used in conjunction with data aggregation constraints to construct relationship modelling matrices. For illustration, a case study is referenced in which ordinal-type ordered qualitative survey answers are allocated to process-defining procedures as aggregation levels. Finally, options for measuring the adherence of the gathered empirical data to such derived aggregation models are introduced, and a statistically based approach to checking the reliability of the chosen model specification is outlined.
Statistics and analysis of scientific data
Bonamente, Massimiliano
2017-01-01
The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include: • a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets; • a new chapter on the various measures of the mean including logarithmic averages; • new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors; • a new case study and additional worked examples; • mathematical derivations and theoretical background material have been appropriately marked, to improve the readabili...
The fuzzy approach to statistical analysis
Coppi, Renato; Gil, Maria A.; Kiers, Henk A. L.
2006-01-01
For the last decades, research studies have been developed in which a coalition of Fuzzy Sets Theory and Statistics has been established with different purposes. These namely are: (i) to introduce new data analysis problems in which the objective involves either fuzzy relationships or fuzzy terms;
FS5 sun exposure survivability analysis
Directory of Open Access Journals (Sweden)
Ming-Ying Hsu
2017-01-01
During the Acquisition and Safe Hold (ASH) mode, FORMOSAT-5 (FS5) satellite attitude is not fully controlled. Direct sun exposure on the Remote Sensing Instrument (RSI) satellite telescope sensor may occur. The sun exposure effect on RSI sensor performance is investigated to evaluate the instrument’s survivability in orbit. Both satellite spin speed and sun exposure duration are considered as the key parameters in this study. A simple radiometry technique is used to calculate the total sun radiance exposure to examine the RSI sensor integrity. Total sun irradiance on the sensor is computed by considering the spectral variation effect through the RSI’s five-band filter. Experiments that directly expose the sensor to the sun on the ground were performed with no obvious performance degradation found. Based on both the analysis and experiment results, it is concluded that the FS5 RSI sensor can survive direct sun exposure during the ASH mode.
Selected papers on analysis, probability, and statistics
Nomizu, Katsumi
1994-01-01
This book presents papers that originally appeared in the Japanese journal Sugaku. The papers fall into the general area of mathematical analysis as it pertains to probability and statistics, dynamical systems, differential equations and analytic function theory. Among the topics discussed are: stochastic differential equations, spectra of the Laplacian and Schrödinger operators, nonlinear partial differential equations which generate dissipative dynamical systems, fractal analysis on self-similar sets and the global structure of analytic functions.
Statistical analysis of next generation sequencing data
Nettleton, Dan
2014-01-01
Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...
Allemani, Claudia; Harewood, Rhea; Johnson, Christopher J; Carreira, Helena; Spika, Devon; Bonaventure, Audrey; Ward, Kevin; Weir, Hannah K; Coleman, Michel P
2017-12-15
Robust comparisons of population-based cancer survival estimates require tight adherence to the study protocol, standardized quality control, appropriate life tables of background mortality, and centralized analysis. The CONCORD program established worldwide surveillance of population-based cancer survival in 2015, analyzing individual data on 26 million patients (including 10 million US patients) diagnosed between 1995 and 2009 with 1 of 10 common malignancies. In this Cancer supplement, we analyzed data from 37 state cancer registries that participated in the second cycle of the CONCORD program (CONCORD-2), covering approximately 80% of the US population. Data quality checks were performed in 3 consecutive phases: protocol adherence, exclusions, and editorial checks. One-, 3-, and 5-year age-standardized net survival was estimated using the Pohar Perme estimator and state- and race-specific life tables of all-cause mortality for each year. The cohort approach was adopted for patients diagnosed between 2001 and 2003, and the complete approach for patients diagnosed between 2004 and 2009. Articles in this supplement report population coverage, data quality indicators, and age-standardized 5-year net survival by state, race, and stage at diagnosis. Examples of tables, bar charts, and funnel plots are provided in this article. Population-based cancer survival is a key measure of the overall effectiveness of services in providing equitable health care. The high quality of US cancer registry data, 80% population coverage, and use of an unbiased net survival estimator ensure that the survival trends reported in this supplement are robustly comparable by race and state. The results can be used by policymakers to identify and address inequities in cancer survival in each state and for the United States nationally. Cancer 2017;123:4982-93. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
Riaz, Ahsun; Gabr, Ahmed; Abouchaleh, Nadine; Ali, Rehan; Alasadi, Ali; Mora, Ronald; Kulik, Laura; Desai, Kush; Thornburg, Bartley; Mouli, Samdeep; Hickey, Ryan; Miller, Frank H; Yaghmai, Vahid; Ganger, Daniel; Lewandowski, Robert J; Salem, Riad
2017-08-18
Does imaging response predict survival in hepatocellular carcinoma (HCC)? We studied the ability of post-therapeutic imaging response to predict overall survival. Over 14 years, 948 HCC patients were treated with radioembolization. Patients with baseline metastases, vascular invasion, multifocal disease, Child-Pugh>B7 and transplanted/resected were excluded. This created our homogeneous study cohort of 134 Child-Pugh≤B7 patients with solitary HCC. Response (using European Association for Study of the Liver [EASL] and Response Evaluation Criteria in Solid Tumors 1.1 [RECIST 1.1] criteria) was associated with survival using Landmark and risk-of-death methodologies after reviewing 960 scans. In a sub-analysis, survival times of responders were compared to those of patients with stable disease (SD) and progressive disease (PD). Uni/multivariate survival analyses were performed at each Landmark. At the 3-month Landmark, responders survived longer than nonresponders by EASL (HR:0.46; CI:0.26-0.82; P=0.002) but not RECIST 1.1 criteria (HR:0.70; CI:0.37-1.32; P=0.32). At the 6-month Landmark, responders survived longer than nonresponders by EASL (HR:0.32; CI:0.15-0.77; P<0.001) and RECIST 1.1 criteria (HR:0.50; CI:0.29-0.87; P=0.021). At the 12-month Landmark, responders survived longer than nonresponders by EASL (HR:0.34; CI:0.15-0.77; P<0.001) and RECIST 1.1 criteria (HR:0.52; CI:0.27-0.98; P=0.049). At 6 months, risk of death was lower for responders by EASL (P<0.001) and RECIST 1.1 (P=0.0445). In sub-analyses, responders lived longer than patients with SD or PD. EASL response was a significant predictor of survival at 3, 6, and 12 month Landmarks on uni/multivariate analyses. Response to radioembolization in patients with solitary HCC can prognosticate improved survival. EASL necrosis criteria outperformed RECIST 1.1 size criteria in predicting survival. The therapeutic objective of radioembolization should be radiologic response and not solely to prevent progression
Urdapilleta, Eugenio
2011-02-01
The survival probability and the first-passage-time statistics are important quantities in different fields. The Wiener process is the simplest stochastic process with continuous variables, and important results can be explicitly found from it. The presence of a constant drift does not modify its simplicity; however, when the process has a time-dependent component the analysis becomes difficult. In this work we analyze the statistical properties of the Wiener process with an absorbing boundary, under the effect of an exponential time-dependent drift. Based on the backward Fokker-Planck formalism we set the time-inhomogeneous equation and conditions that rule the diffusion of the corresponding survival probability. We propose as the solution an expansion series in terms of the intensity of the exponential drift, resulting in a set of recurrence equations. We explicitly solve the expansion up to second order and comment on higher-order solutions. The first-passage-time density function arises naturally from the survival probability and preserves the proposed expansion. Explicit results, related properties, and limit behaviors are analyzed and extensively compared to numerical simulations.
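The kind of numerical simulation mentioned in this abstract can be illustrated with a small Monte Carlo sketch. It is not the author's code: the drift form mu + eps * exp(-t / tau), the parameter values, and the function name simulate_survival are all assumptions made for illustration. The sketch integrates the process with an Euler-Maruyama step and estimates the survival probability as the fraction of paths that have not yet reached the absorbing boundary.

```python
import math
import random


def simulate_survival(mu=1.0, eps=0.5, tau=1.0, sigma=1.0,
                      x0=0.0, x_thr=1.0, t_max=2.0, dt=1e-2,
                      n_paths=500, seed=0):
    """Estimate S(t) for dx = (mu + eps*exp(-t/tau)) dt + sigma dW.

    The drift form and all defaults are hypothetical. Returns a list with
    the estimated survival probability at t = 0, dt, 2*dt, ..., t_max.
    """
    rng = random.Random(seed)
    n_steps = int(round(t_max / dt))
    sqrt_dt = math.sqrt(dt)
    alive = [0] * (n_steps + 1)
    for _ in range(n_paths):
        x = x0
        alive[0] += 1
        for k in range(1, n_steps + 1):
            t = (k - 1) * dt
            drift = mu + eps * math.exp(-t / tau)
            # Euler-Maruyama update over one time step
            x += drift * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
            if x >= x_thr:
                break  # absorbed at the boundary: the path stops here
            alive[k] += 1
    return [count / n_paths for count in alive]
```

The first-passage-time density can then be approximated from the negative finite differences of the returned list divided by dt. A fixed time step slightly overestimates survival, because crossings between grid points are missed.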
Statistical Tools for Forensic Analysis of Toolmarks
Energy Technology Data Exchange (ETDEWEB)
David Baldwin; Max Morris; Stan Bajic; Zhigang Zhou; James Kreiser
2004-04-22
Recovery and comparison of toolmarks, footprint impressions, and fractured surfaces connected to a crime scene are of great importance in forensic science. The purpose of this project is to provide statistical tools for the validation of the proposition that particular manufacturing processes produce marks on the work-product (or tool) that are substantially different from tool to tool. The approach to validation involves the collection of digital images of toolmarks produced by various tool manufacturing methods on produced work-products and the development of statistical methods for data reduction and analysis of the images. The developed statistical methods provide a means to objectively calculate a "degree of association" between matches of similarly produced toolmarks. The basis for statistical method development relies on "discriminating criteria" that examiners use to identify features and spatial relationships in their analysis of forensic samples. The developed data reduction algorithms utilize the same rules used by examiners for classification and association of toolmarks.
Spatial Statistical Analysis of Large Astronomical Datasets
Szapudi, Istvan
2002-12-01
The future of astronomy will be dominated by large and complex databases. Megapixel CMB maps, joint analyses of surveys across several wavelengths, as envisioned in the planned National Virtual Observatory (NVO), and the TByte/day data rates of future surveys (Pan-STARRS) put stringent constraints on future data analysis methods: they have to achieve at least N log N scaling to be viable in the long term. This warrants special attention to computational requirements, which were ignored during the initial development of current analysis tools in favor of statistical optimality. Even an optimal measurement, however, has residual errors due to statistical sample variance. Hence a suboptimal technique with significantly smaller measurement errors than the unavoidable sample variance produces results which are nearly identical to those of a statistically optimal technique. For instance, for analyzing CMB maps, I present a suboptimal alternative, indistinguishable from the standard optimal method with N^3 scaling, that can be rendered N log N with a hierarchical representation of the data; a speed-up of a trillion times compared to other methods. In this spirit I will present a set of novel algorithms and methods for spatial statistical analyses of future large astronomical databases, such as galaxy catalogs, megapixel CMB maps, or any point source catalog.
Statistical Analysis of Iberian Peninsula Megaliths Orientations
González-García, A. C.
2009-08-01
Megalithic monuments have been intensively surveyed and studied from the archaeoastronomical point of view in the past decades. We have orientation measurements for over one thousand megalithic burial monuments in the Iberian Peninsula, from several different periods. These data, however, are not yet well understood. A way to classify and start to understand such orientations is by means of statistical analysis of the data. A first attempt is made with simple statistical variables and a plain comparison between the different areas. In order to minimise the subjectivity in the process, a further, more complex analysis is performed. Some interesting results linking the orientation and the geographical location will be presented. Finally I will present some models comparing the orientation of the megaliths in the Iberian Peninsula with the rising of the sun and the moon at several times of the year.
Multivariate analysis: A statistical approach for computations
Michu, Sachin; Kaushik, Vandana
2014-10-01
Multivariate analysis is a statistical approach commonly used in automotive diagnosis, education, the evaluation of clusters in finance, etc., and more recently in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion of factor analysis (FA) in image retrieval and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis provides an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include various attacks such as DDoS attacks and network scanning.
Vapor Pressure Data Analysis and Statistics
2016-12-01
there were flaws in the original data prior to its publication. 3. FITTING METHODS Our process for correlating experimental vapor pressure ...2. Penski, E.C. Vapor Pressure Data Analysis Methodology, Statistics, and Applications; CRDEC-TR-386; U.S. Army Chemical Research, Development, and... Chemical Biological Center: Aberdeen Proving Ground, MD, 2006; UNCLASSIFIED Report (ADA447993). 11. Kemme, H.R.; Kreps, S.I. Vapor Pressure of
[Clinical research XXI. From the clinical judgment to survival analysis].
Rivas-Ruiz, Rodolfo; Pérez-Rodríguez, Marcela; Palacios, Lino; Talavera, Juan O
2014-01-01
Decision making in health care implies knowledge of the clinical course of the disease. Knowing the course allows us to estimate the likelihood of occurrence of a phenomenon at a given time or its duration. Among the statistical models that give us a summary measure to estimate the time of occurrence of a phenomenon in a given population are linear regression (the outcome variable, time to the occurrence of the event, is continuous and normally distributed), logistic regression (the outcome variable is dichotomous, and it is evaluated at one single interval), and survival curves (the outcome event is dichotomous, and it can be evaluated at multiple intervals). The first reference we have of this type of analysis is the work of Edmond Halley, an English astronomer, physicist and mathematician, famous for calculating the orbit of the comet now recognized as the first periodic comet (1P/Halley's Comet). Halley also contributed in the area of health by estimating the mortality rate for a Polish population. The survival curve allows us to estimate the probability of an event occurring at different intervals. It also lets us estimate the median survival time of any phenomenon of interest (although the term used is survival, the outcome does not need to be death; it may be the occurrence of any other event).
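The survival curve and median survival time described in this abstract can be illustrated with a minimal Kaplan-Meier sketch. The abstract does not name an estimator, so the choice of Kaplan-Meier is an assumption, and the function names and follow-up data below are hypothetical. An event flag of 1 means the event was observed; 0 means the subject was censored.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # group all subjects sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving  # events and censored both leave the risk set
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical follow-up times (months) and event indicators.
times = [2, 3, 3, 5, 7, 8, 11, 12]
events = [1, 1, 0, 1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
```

With these made-up data the estimated curve first falls to 0.5 or below at month 8, so median_survival(curve) returns 8.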
Regression analysis of restricted mean survival time based on pseudo-observations
DEFF Research Database (Denmark)
Andersen, Per Kragh; Hansen, Mette Gerster; Klein, John P.
censoring; hazard function; health economics; regression model; survival analysis; mean survival time; restricted mean survival time; pseudo-observations
Regression Analysis of Restricted Mean Survival Time Based on Pseudo-Observations
DEFF Research Database (Denmark)
Andersen, Per Kragh; Hansen, Mette Gerster; Klein, John P.
2004-01-01
censoring; hazard function; health economics; mean survival time; pseudo-observations; regression model; restricted mean survival time; survival analysis
Statistical modelling of survival data with random effects h-likelihood approach
Ha, Il Do; Lee, Youngjo
2017-01-01
This book provides a groundbreaking introduction to the likelihood inference for correlated survival data via the hierarchical (or h-) likelihood in order to obtain the (marginal) likelihood and to address the computational difficulties in inferences and extensions. The approach presented in the book overcomes shortcomings in the traditional likelihood-based methods for clustered survival data such as intractable integration. The text includes technical materials such as derivations and proofs in each chapter, as well as recently developed software programs in R (“frailtyHL”), while the real-world data examples together with an R package, “frailtyHL” in CRAN, provide readers with useful hands-on tools. Reviewing new developments since the introduction of the h-likelihood to survival analysis (methods for interval estimation of the individual frailty and for variable selection of the fixed effects in the general class of frailty models) and guiding future directions, the book is of interest to research...
Statistical analysis of brake squeal noise
Oberst, S.; Lai, J. C. S.
2011-06-01
Despite substantial research efforts applied to the prediction of brake squeal noise since the early 20th century, the mechanisms behind its generation are still not fully understood. Squealing brakes are of significant concern to the automobile industry, mainly because of the costs associated with warranty claims. In order to remedy the problems inherent in designing quieter brakes and, therefore, to understand the mechanisms, a design of experiments study, using a noise dynamometer, was performed by a brake system manufacturer to determine the influence of geometrical parameters (namely, the number and location of slots) of brake pads on brake squeal noise. The experimental results were evaluated with a noise index and ranked for warm and cold brake stops. These data are analysed here using statistical descriptors based on population distributions, and a correlation analysis, to gain greater insight into the functional dependency between the time-averaged friction coefficient as the input and the peak sound pressure level data as the output quantity. The correlation analysis between the time-averaged friction coefficient and peak sound pressure data is performed by applying a semblance analysis and a joint recurrence quantification analysis. Linear measures are compared with complexity measures (nonlinear) based on statistics from the underlying joint recurrence plots. Results show that linear measures cannot be used to rank the noise performance of the four test pad configurations. On the other hand, the ranking of the noise performance of the test pad configurations based on the noise index agrees with that based on nonlinear measures: the higher the nonlinearity between the time-averaged friction coefficient and peak sound pressure, the worse the squeal. These results highlight the nonlinear character of brake squeal and indicate the potential of using nonlinear statistical analysis tools to analyse disc brake squeal.
The CALORIES trial: statistical analysis plan.
Harvey, Sheila E; Parrott, Francesca; Harrison, David A; Mythen, Michael; Rowan, Kathryn M
2014-12-01
The CALORIES trial is a pragmatic, open, multicentre, randomised controlled trial (RCT) of the clinical effectiveness and cost-effectiveness of early nutritional support via the parenteral route compared with early nutritional support via the enteral route in unplanned admissions to adult general critical care units (CCUs) in the United Kingdom. The trial derives from the need for a large, pragmatic RCT to determine the optimal route of delivery for early nutritional support in the critically ill. To describe the proposed statistical analyses for the evaluation of the clinical effectiveness in the CALORIES trial. With the primary and secondary outcomes defined precisely and the approach to safety monitoring and data collection summarised, the planned statistical analyses, including prespecified subgroups and secondary analyses, were developed and are described. The primary outcome is all-cause mortality at 30 days. The primary analysis will be reported as a relative risk and absolute risk reduction and tested with the Fisher exact test. Prespecified subgroup analyses will be based on age, degree of malnutrition, acute severity of illness, mechanical ventilation at admission to the CCU, presence of cancer and time from CCU admission to commencement of early nutritional support. Secondary analyses include adjustment for baseline covariates. In keeping with best trial practice, we have developed, described and published a statistical analysis plan for the CALORIES trial and are placing it in the public domain before inspecting data from the trial.
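The primary analysis described here (relative risk, absolute risk reduction, Fisher exact test) can be sketched for a single 2x2 table. This is an illustration only: the counts in the usage note are invented and are not trial data, the function names are hypothetical, and the two-sided p-value uses the common convention of summing all tables no more probable than the observed one, which the plan may or may not follow.

```python
from math import comb


def risk_summary(deaths_trt, n_trt, deaths_ctl, n_ctl):
    """Relative risk (treatment vs control) and absolute risk reduction."""
    risk_trt = deaths_trt / n_trt
    risk_ctl = deaths_ctl / n_ctl
    return risk_trt / risk_ctl, risk_ctl - risk_trt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # small relative tolerance guards against floating-point ties
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)
```

For example, with invented counts of 30 deaths among 100 patients in one arm and 40 among 100 in the other, risk_summary(30, 100, 40, 100) gives a relative risk of 0.75 and an absolute risk reduction of about 0.10, and fisher_exact_two_sided(30, 70, 40, 60) gives the corresponding p-value.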
Sensitivity analysis and related analysis : A survey of statistical techniques
Kleijnen, J.P.C.
1995-01-01
This paper reviews the state of the art in five related types of analysis, namely (i) sensitivity or what-if analysis, (ii) uncertainty or risk analysis, (iii) screening, (iv) validation, and (v) optimization. The main question is: when should which type of analysis be applied; which statistical
Statistical analysis of sleep spindle occurrences.
Panas, Dagmara; Malinowska, Urszula; Piotrowski, Tadeusz; Żygierewicz, Jarosław; Suffczyński, Piotr
2013-01-01
Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Currently spindles are often classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles and the time occurrences of spindles were used for statistical analysis. Gamma distribution parameters were fit to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but this dissociation is not spindle-type specific. Although this dissociation is not topographically specific, a single generator for all spindle types appears unlikely.
Coupling strength assumption in statistical energy analysis
Lafont, T.; Totaro, N.; Le Bot, A.
2017-04-01
This paper is a discussion of the hypothesis of weak coupling in statistical energy analysis (SEA). The examples of coupled oscillators and statistical ensembles of coupled plates excited by broadband random forces are discussed. In each case, a reference calculation is compared with the SEA calculation. First, it is shown that the main SEA relation, the coupling power proportionality, is always valid for two oscillators irrespective of the coupling strength. But the case of three subsystems, consisting of oscillators or ensembles of plates, indicates that the coupling power proportionality fails when the coupling is strong. Strong coupling leads to non-zero indirect coupling loss factors and, sometimes, even to a reversal of the energy flow direction from low to high vibrational temperature.
Acute pancreatitis: analysis of factors influencing survival.
Jacobs, M L; Daggett, W M; Civette, J M; Vasu, M A; Lawson, D W; Warshaw, A L; Nardi, G L; Bartlett, M K
1977-01-01
Of patients with acute pancreatitis (AP), there remains a group who suffer life-threatening complications despite current modes of therapy. To identify factors which distinguish this group from the entire patient population, a retrospective analysis of 519 cases of AP occurring over a 5-year period was undertaken. Thirty-one per cent of these patients had a history of alcoholism and 47% had a history of biliary disease. The overall mortality was 12.9%. Of symptoms and signs recorded at the time of admission, hypotension, tachycardia, fever, abdominal mass, and abnormal examination of the lung fields correlated positively with increased mortality. Seven features of the initial laboratory examination correlated with increased mortality. Shock, massive colloid requirement, hypocalcemia, renal failure, and respiratory failure requiring endotracheal intubation were complications associated with the poorest prognosis. Among patients in this series with three or more of these clinical characteristics, maximal nonoperative treatment yielded a survival rate of 29%, compared to the 64% survival rate for a group of patients treated operatively with cholecystostomy, gastrostomy, feeding jejunostomy, and sump drainage of the lesser sac and retroperitoneum.
Evaluating disease management program effectiveness: an introduction to survival analysis.
Linden, Ariel; Adams, John L; Roberts, Nancy
2004-01-01
Currently, the most widely used method in the disease management industry for evaluating program effectiveness is the "total population approach." This model is a pretest-posttest design, with the most basic limitation being that without a control group, there may be sources of bias and/or competing extraneous confounding factors that offer plausible rationale explaining the change from baseline. Survival analysis allows for the inclusion of data from censored cases, those subjects who either "survived" the program without experiencing the event (e.g., achievement of target clinical levels, hospitalization) or left the program prematurely, due to disenrollment from the health plan or program, or were lost to follow-up. Additionally, independent variables may be included in the model to help explain the variability in the outcome measure. In order to maximize the potential of this statistical method, validity of the model and research design must be assured. This paper reviews survival analysis as an alternative, and more appropriate, approach to evaluating DM program effectiveness than the current total population approach.
Statistical trend analysis methods for temporal phenomena
Energy Technology Data Exchange (ETDEWEB)
Lehtinen, E.; Pulkkinen, U. [VTT Automation, (Finland); Poern, K. [Poern Consulting, Nykoeping (Sweden)
1997-04-01
We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: the Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods. 14 refs, 10 figs.
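Of the non-parametric tests listed in this abstract, the Cox-Stuart test is the simplest to sketch. The version below follows the textbook form of the test, not necessarily the variant used in the report, and the function name is hypothetical. It pairs each value in the first half of a series (for point events, for example the successive inter-event times) with the corresponding value in the second half, discards ties, and applies a two-sided sign test with success probability 0.5.

```python
from math import comb


def cox_stuart(series):
    """Return (n_positive, n_effective, two_sided_p) for a trend test."""
    n = len(series)
    m = n // 2
    first = series[:m]
    second = series[n - m:]  # the middle value is skipped when n is odd
    diffs = [b - a for a, b in zip(first, second) if b != a]
    n_eff = len(diffs)
    n_pos = sum(1 for d in diffs if d > 0)

    def binom_cdf(k):
        # P(X <= k) for X ~ Binomial(n_eff, 0.5); zero when k < 0
        return sum(comb(n_eff, j) for j in range(k + 1)) / 2 ** n_eff

    p_low = binom_cdf(n_pos)               # P(X <= n_pos)
    p_high = 1.0 - binom_cdf(n_pos - 1)    # P(X >= n_pos)
    return n_pos, n_eff, min(1.0, 2.0 * min(p_low, p_high))
```

A small p-value with most differences positive suggests an increasing trend in the series; with most differences negative, a decreasing one. For inter-event times, an increasing series corresponds to a decreasing rate of occurrence.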
Analysis of Preference Data Using Intermediate Test Statistic ...
African Journals Online (AJOL)
The intermediate statistic is a link between the Friedman test statistic and the multinomial statistic. The statistic is based on ranking in a selected number of treatments, not necessarily all alternatives. We show that this statistic is transitive to well-known test statistics used for the analysis of preference data. Specifically, it is shown ...
Liu, Fangfang
The thesis is composed of three independent projects: (i) analyzing transposon-sequencing data to infer the functions of genes in bacterial growth (chapter 2), (ii) developing a semi-parametric Bayesian method for differential gene expression analysis with RNA-sequencing data (chapter 3), (iii) solving the group selection problem for survival data (chapter 4). All projects are motivated by statistical challenges raised in biological research. The first project is motivated by the need to develop statistical models to accommodate transposon insertion sequencing (Tn-Seq) data. Tn-Seq data consist of sequence reads around each transposon insertion site. The detection of a transposon insertion at a given site indicates that the disruption of the genomic sequence at this site does not cause essential function loss and the bacteria can still grow. Hence, such measurements have been used to infer the function of each gene in bacterial growth. We propose a zero-inflated Poisson regression method for analyzing the Tn-Seq count data, and derive an Expectation-Maximization (EM) algorithm to obtain parameter estimates. We also propose a multiple testing procedure that categorizes genes into each of the three states, hypo-tolerant, tolerant, and hyper-tolerant, while controlling the false discovery rate. Simulation studies show our method provides good estimation of model parameters and inference on gene functions. In the second project, we model the count data from an RNA-sequencing experiment for each gene using a Poisson-Gamma hierarchical model, or equivalently, a negative binomial (NB) model. We derive a full semi-parametric Bayesian approach with a Dirichlet process as the prior for the fold changes between two treatment means. An inference strategy using the Gibbs algorithm is developed for differential expression analysis. We evaluate our method with several simulation studies, and the results demonstrate that our method outperforms other methods including the popularly applied ones such as edge
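The EM idea behind a zero-inflated Poisson model can be sketched in its simplest form. The sketch below is an intercept-only simplification (one mixing weight and one Poisson rate, no covariates), not the regression model or the algorithm derived in the thesis; the function name and the sample counts are hypothetical. In the E-step each observed zero receives a posterior probability of being a structural zero; in the M-step the mixing weight and the rate are updated in closed form.

```python
import math


def fit_zip_em(counts, n_iter=1000, tol=1e-12):
    """Fit P(Y=0) = pi + (1-pi)*exp(-lam), P(Y=k>0) = (1-pi)*Poisson(k; lam)."""
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    pi, lam = 0.5, max(total / n, 1e-6)  # crude starting values
    for _ in range(n_iter):
        # E-step: probability that an observed zero is a structural zero
        z0 = pi / (pi + (1.0 - pi) * math.exp(-lam))
        # M-step: closed-form updates given the expected memberships
        new_pi = n_zero * z0 / n
        new_lam = total / (n - n_zero * z0)
        converged = abs(new_pi - pi) + abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam


# Made-up counts with an excess of zeros.
counts = [0] * 60 + [1] * 15 + [2] * 15 + [3] * 7 + [4] * 3
pi_hat, lam_hat = fit_zip_em(counts)
```

At convergence the fitted probability of a zero, pi_hat + (1 - pi_hat) * exp(-lam_hat), matches the observed fraction of zeros, which is a convenient sanity check. A regression version would replace the closed-form rate update with a weighted Poisson regression fit.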
Statistical analysis of solar proton events
Directory of Open Access Journals (Sweden)
V. Kurt
2004-06-01
A new catalogue of 253 solar proton events (SPEs) with energy >10 MeV and peak intensity >10 protons/cm2.s.sr (pfu) at the Earth's orbit for three complete 11-year solar cycles (1970-2002) is given. A statistical analysis of this data set of SPEs and their associated flares that occurred during this time period is presented. Of these proton events, 231 are flare related and only 22 are not associated with Hα flares. It is also noteworthy that 42 of these events are registered as Ground Level Enhancements (GLEs) in neutron monitors. The longitudinal distribution of the associated flares shows that a great number of these events are connected with west flares. This analysis enables one to understand the long-term dependence of the SPEs and the related flare characteristics on the solar cycle, which is useful for space weather prediction.
Wavelet and statistical analysis for melanoma classification
Nimunkar, Amit; Dhawan, Atam P.; Relue, Patricia A.; Patwardhan, Sachin V.
2002-05-01
The present work focuses on spatial/frequency analysis of epiluminescence images of dysplastic nevus and melanoma. A three-level wavelet decomposition was performed on skin-lesion images to obtain coefficients in the wavelet domain. A total of 34 features were obtained by computing ratios of the mean, variance, energy and entropy of the wavelet coefficients along with the mean and standard deviation of image intensity. An unpaired t-test for normally distributed features and the Wilcoxon rank-sum test for non-normally distributed features were performed for selecting statistically correlated features. For our data set, the statistical analysis of features reduced the feature set from 34 to 5 features. For classification, the discriminant functions were computed in the feature space using the Mahalanobis distance. ROC curves were generated and evaluated for false positive fraction from 0.1 to 0.4. Most of the discriminant functions provided a true positive rate for melanoma of 93% with a false positive rate up to 21%.
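As a minimal sketch of classification by Mahalanobis distance, the following assigns a feature vector to the class whose mean is nearest in that metric. It is restricted to two features so the covariance inverse can be written by hand; the class means, covariances and the `classify` helper are hypothetical illustrations, not values or code from the study.

```python
def mahalanobis_sq_2d(x, mean, cov):
    """Squared Mahalanobis distance of a 2-feature vector x from a class mean."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # Inverse of a 2x2 matrix is (1/det) * [[d, -b], [-c, a]].
    return (dx0 * (d * dx0 - b * dx1) + dx1 * (-c * dx0 + a * dx1)) / det


def classify(x, class_stats):
    """Return the label whose class mean is closest to x in Mahalanobis distance."""
    return min(
        class_stats,
        key=lambda label: mahalanobis_sq_2d(x, *class_stats[label]),
    )


# Hypothetical class statistics: label -> (mean vector, covariance matrix).
class_stats = {
    "nevus": ((0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),
    "melanoma": ((4.0, 4.0), ((2.0, 0.5), (0.5, 1.0))),
}
label = classify((3.5, 3.0), class_stats)
```

Unlike Euclidean distance, this metric rescales each direction by the class covariance, so a feature with large within-class spread contributes less to the distance.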
Statistical analysis of tourism destination competitiveness
Directory of Open Access Journals (Sweden)
Attilio Gardini
2013-05-01
The growing relevance of the tourism industry for modern advanced economies has increased the interest among researchers and policy makers in the statistical analysis of destination competitiveness. In this paper we outline a new model of destination competitiveness based on sound theoretical grounds and we develop a statistical test of the model on sample data based on Italian tourist destination decisions and choices. Our model focuses on the tourism decision process, which starts from the demand schedule for holidays and ends with the choice of a specific holiday destination. The demand schedule is a function of individual preferences and of destination positioning, while the final decision is a function of the initial demand schedule and the information concerning services for accommodation and recreation in the selected destinations. Moreover, we extend previous studies that focused on image or attributes (such as climate and scenery) by paying more attention to the services for accommodation and recreation in the holiday destinations. We test the proposed model using empirical data collected from a sample of 1,200 Italian tourists interviewed in 2007 (October - December). Data analysis shows that the selection probability for a destination included in the consideration set is not proportional to the share of inclusion, because the share of inclusion is determined by the brand image, while the selection of the effective holiday destination is influenced by the real supply conditions. The analysis of Italian tourists' preferences underlines the existence of a latent demand for foreign holidays, which points to a risk of market share reduction for the Italian tourism system in the global market. We also find a snowball effect which helps the most popular destinations, mainly in the northern Italian regions.
The dChip survival analysis module for microarray data
Directory of Open Access Journals (Sweden)
Minvielle Stéphane
2011-03-01
Background: Genome-wide expression signatures are emerging as potential markers for overall survival and disease recurrence risk, as evidenced by the recent commercialization of gene expression based biomarkers in breast cancer. Similar predictions have recently been carried out using genome-wide copy number alterations and microRNAs. Existing software packages for microarray data analysis provide functions to define expression-based survival gene signatures. However, there is no software that can perform survival analysis using SNP array data or draw survival curves interactively for expression-based sample clusters. Results: We have developed the survival analysis module in the dChip software that performs survival analysis across the genome for gene expression and copy number microarray data. Built on the current dChip software's microarray analysis functions such as chromosome display and clustering, the new survival functions include interactive exploring of Kaplan-Meier (K-M) plots using expression or copy number data, computing survival p-values from the log-rank test and Cox models, and using permutation to identify significant chromosome regions associated with survival. Conclusions: The dChip survival module provides a user-friendly way to perform survival analysis and visualize the results in the context of genes and cytobands. It requires no coding expertise and only a minimal learning curve for thousands of existing dChip users. The implementation in Visual C++ also enables fast computation. The software and demonstration data are freely available at http://dchip-surv.chenglilab.org.
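The log-rank p-value mentioned above can be illustrated with a small sketch. This is the textbook two-sample log-rank test in plain Python, not dChip's Visual C++ implementation; the function name and the toy follow-up data for two sample clusters are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi-square, p-value) on 1 degree of freedom.

    events are 1 for an observed event and 0 for a censored observation.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for x, _, _ in data if x >= t)               # at risk, both groups
        n_a = sum(1 for x, _, g in data if x >= t and g == 0)  # at risk, group A
        d = sum(1 for x, e, _ in data if x == t and e == 1)    # events at time t
        d_a = sum(1 for x, e, g in data if x == t and e == 1 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group-A event count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical survival times (months) for two expression-based clusters.
times_a, events_a = [5, 8, 12, 20, 24], [1, 1, 0, 1, 0]
times_b, events_b = [2, 3, 4, 7, 9], [1, 1, 1, 1, 0]
chi2, p_value = logrank_test(times_a, events_a, times_b, events_b)
```

The sketch assumes both groups are at risk at some event time; if the variance is zero, the statistic is undefined and the division fails.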
Multivariate statistical analysis of wildfires in Portugal
Costa, Ricardo; Caramelo, Liliana; Pereira, Mário
2013-04-01
Several studies demonstrate that wildfires in Portugal present high temporal and spatial variability as well as cluster behavior (Pereira et al., 2005, 2011). This study aims to contribute to the characterization of the fire regime in Portugal with the multivariate statistical analysis of the time series of number of fires and area burned in Portugal during the 1980 - 2009 period. The data used in the analysis is an extended version of the Rural Fire Portuguese Database (PRFD) (Pereira et al., 2011), provided by the National Forest Authority (Autoridade Florestal Nacional, AFN), the Portuguese Forest Service, which includes information for more than 500,000 fire records. There are many advanced techniques for examining the relationships among multiple time series at the same time (e.g., canonical correlation analysis, principal components analysis, factor analysis, path analysis, multiple analyses of variance, clustering systems). This study compares and discusses the results obtained with these different techniques. Pereira, M.G., Trigo, R.M., DaCamara, C.C., Pereira, J.M.C., Leite, S.M., 2005: "Synoptic patterns associated with large summer forest fires in Portugal". Agricultural and Forest Meteorology. 129, 11-25. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011. This work is supported by European Union Funds (FEDER/COMPETE - Operational Competitiveness Programme) and by national funds (FCT - Portuguese Foundation for Science and Technology) under the project FCOMP-01-0124-FEDER-022692, the project FLAIR (PTDC/AAC-AMB/104702/2008) and the EU 7th Framework Program through FUME (contract number 243888).
Statistical analysis of sleep spindle occurrences.
Directory of Open Access Journals (Sweden)
Dagmara Panas
Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Currently spindles are often classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles and the time occurrences of spindles were used for statistical analysis. Gamma distribution parameters were fit to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but this dissociation is not spindle-type specific. Although this dissociation is not topographically specific, a single generator for all spindle types appears unlikely.
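A rough sketch of the two quantities involved, under stated assumptions: the abstract does not say how the gamma parameters were fit, so the method of moments is used here as a stand-in, and the plain lag-1 serial correlation coefficient is shown rather than the authors' modified Wald-Wolfowitz test. The interval values are hypothetical.

```python
from statistics import mean, pvariance


def gamma_method_of_moments(intervals):
    """Estimate gamma shape and scale from the sample mean and variance."""
    m = mean(intervals)
    v = pvariance(intervals)
    shape = m * m / v   # mean^2 / variance
    scale = v / m       # variance / mean
    return shape, scale


def lag1_correlation(intervals):
    """Sample serial correlation between consecutive intervals."""
    m = mean(intervals)
    num = sum((a - m) * (b - m) for a, b in zip(intervals, intervals[1:]))
    den = sum((a - m) ** 2 for a in intervals)
    return num / den


# Hypothetical inter-spindle intervals in seconds.
intervals = [2.0, 4.0, 6.0, 8.0]
shape, scale = gamma_method_of_moments(intervals)
r1 = lag1_correlation(intervals)
```

A lag-1 correlation near zero is what one would expect if successive intervals were independent draws from the fitted distribution.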
Statistical Analysis of Bus Networks in India
Chatterjee, Atanu; Ramadurai, Gitakrishnan
2015-01-01
Through the past decade the field of network science has established itself as a common ground for the cross-fertilization of exciting inter-disciplinary studies which has motivated researchers to model almost every physical system as an interacting network consisting of nodes and links. Although public transport networks such as airline and railway networks have been extensively studied, the status of bus networks still remains in obscurity. In developing countries like India, where bus networks play an important role in day-to-day commutation, it is of significant interest to analyze its topological structure and answer some of the basic questions on its evolution, growth, robustness and resiliency. In this paper, we model the bus networks of major Indian cities as graphs in L-space, and evaluate their various statistical properties using concepts from network science. Our analysis reveals a wide spectrum of network topology with the common underlying feature of small-world property. We observe tha...
IIB osteosarcoma. Current management, local control, and survival statistics--São Paulo, Brazil.
Petrilli, S; Penna, V; Lopes, A; Figueiredo, M T; Gentil, F C
1991-09-01
Ninety-two patients with IIB osteosarcoma of the extremities were treated with intraarterial (IA) cisplatinum (CDDP) followed by surgery [amputation (61.6%) or resection with endoprosthesis (38.4%)]. Postoperative chemotherapy alternating adriamycin and CDDP was used. The total three-year survival was 62.1%, and the disease-free survival was 41.1%. The pathologic evaluation of the degree of tumor necrosis in response to the IA CDDP showed that in 53.2%, the necrosis was over 90%. The multivariate analysis of prognostic factors has shown that the highest survival was among females with tumors smaller than 15 cm. Patients with lesions equal to or larger than 15 cm were three times as likely to die of the disease. A second, more aggressive study is now underway, in which high dose methotrexate (HDMTX) is preoperatively combined with adriamycin and CDDP. Following operation, ifosfamide is added to the cases with a smaller degree of tumor necrosis, while the other group of patients will continue with HDMTX, in addition to CDDP and adriamycin (these last two drugs are used in both arms). Until now, complete remission has been achieved in 82% and 86%, respectively, with a follow-up examination varying from four to 26 months (average, 14 months). This is of extreme importance, because the majority of the authors' patients have tumors at initial evaluation larger than 10 cm in diameter.
Survival analysis and classification methods for forest fire size.
Directory of Open Access Journals (Sweden)
Pier-Olivier Tremblay
Factors affecting wildland-fire size distribution include weather, fuels, and fire suppression activities. We present a novel application of survival analysis to quantify the effects of these factors on a sample of sizes of lightning-caused fires from Alberta, Canada. Two events were observed for each fire: the size at initial assessment (by the first fire fighters to arrive at the scene) and the size at "being held" (a state when no further increase in size is expected). We developed a statistical classifier to try to predict cases where there will be a growth in fire size (i.e., the size at "being held" exceeds the size at initial assessment). Logistic regression was preferred over two alternative classifiers, with covariates consistent with similar past analyses. We conducted survival analysis on the group of fires exhibiting a size increase. A screening process selected three covariates: an index of fire weather at the day the fire started, the fuel type burning at initial assessment, and a factor for the type and capabilities of the method of initial attack. The Cox proportional hazards model performed better than three accelerated failure time alternatives. Both fire weather and fuel type were highly significant, with effects consistent with known fire behaviour. The effects of initial attack method were not statistically significant, but did suggest a reverse causality that could arise if fire management agencies were to dispatch resources based on a-priori assessment of fire growth potentials. We discuss how a more sophisticated analysis of larger data sets could produce unbiased estimates of fire suppression effect under such circumstances.
Survival analysis and classification methods for forest fire size.
Tremblay, Pier-Olivier; Duchesne, Thierry; Cumming, Steven G
2018-01-01
Factors affecting wildland-fire size distribution include weather, fuels, and fire suppression activities. We present a novel application of survival analysis to quantify the effects of these factors on a sample of sizes of lightning-caused fires from Alberta, Canada. Two events were observed for each fire: the size at initial assessment (by the first fire fighters to arrive at the scene) and the size at "being held" (a state when no further increase in size is expected). We developed a statistical classifier to try to predict cases where there will be a growth in fire size (i.e., the size at "being held" exceeds the size at initial assessment). Logistic regression was preferred over two alternative classifiers, with covariates consistent with similar past analyses. We conducted survival analysis on the group of fires exhibiting a size increase. A screening process selected three covariates: an index of fire weather at the day the fire started, the fuel type burning at initial assessment, and a factor for the type and capabilities of the method of initial attack. The Cox proportional hazards model performed better than three accelerated failure time alternatives. Both fire weather and fuel type were highly significant, with effects consistent with known fire behaviour. The effects of initial attack method were not statistically significant, but did suggest a reverse causality that could arise if fire management agencies were to dispatch resources based on a-priori assessment of fire growth potentials. We discuss how a more sophisticated analysis of larger data sets could produce unbiased estimates of fire suppression effect under such circumstances.
Tanavalee, Chotetawan; Luksanapruksa, Panya; Singhatanadgige, Weerasak
2016-06-01
Microsoft Excel (MS Excel) is a commonly used program for data collection and statistical analysis in biomedical research. However, this program has many limitations, including fewer functions that can be used for analysis and a limited number of total cells compared with dedicated statistical programs. MS Excel cannot complete analyses with blank cells, and cells must be selected manually for analysis. In addition, it requires multiple steps of data transformation and formulas to plot survival analysis graphs, among others. The Megastat add-on program, which will be supported by MS Excel 2016 soon, would eliminate some limitations of using statistical formulas within MS Excel.
Statistics Analysis Measures Painting of Cooling Tower
Directory of Open Access Journals (Sweden)
A. Zacharopoulou
2013-01-01
This study concerns the cooling tower of Megalopolis (constructed in 1975) and its protection from a corrosive environment. Maintenance of the cooling tower took place in 2008. The cooling tower had been badly damaged by corrosion of the reinforcement. Parabolic cooling towers at electrical power plants are a typical example of a structure exposed to a particularly aggressive environment. Protection of cooling towers is usually achieved through organic coatings. Because the environmental impacts differ between the internal and external sides of the cooling tower, a different paint system is required for each. The present study describes the damage caused by the corrosion process. The corrosive environments, the application of the paint system, the quality control process, the measurements and their statistical analysis, and the results are discussed. In the quality control process the following measurements were taken: (1) examination of adhesion with the cross-cut test, (2) examination of film thickness, and (3) checking of the pull-off resistance for concrete substrates and paintings. Finally, the study addresses the correlations among measurements, the analysis of failures in relation to the quality of repair, and the rehabilitation of the cooling tower. The study also made a first attempt to apply specific corrosion inhibitors in such a large structure.
FCS Vehicle Transportability, Survivability, and Reliability Analysis
National Research Council Canada - National Science Library
Dion-Schwarz, Cynthia; Hirsch, Leon; Koehn, Phillip; Macheret, Jenya; Sparrow, Dave
2005-01-01
.... The investigation into metrics for transportability revealed that the C130 Transportability requirement for FCS vehicles is a constraint that leads to a less survivable platform but without improving Unit of Action (UA) transportability...
Survival analysis using S analysis of time-to-event data
Tableman, Mara
2003-01-01
Survival Analysis Using S: Analysis of Time-to-Event Data is designed as a text for a one-semester or one-quarter course in survival analysis for upper-level or graduate students in statistics, biostatistics, and epidemiology. Prerequisites are a standard pre-calculus first course in probability and statistics, and a course in applied linear regression models. No prior knowledge of S or R is assumed. A wide choice of exercises is included, some intended for more advanced students with a first course in mathematical statistics. The authors emphasize parametric log-linear models, while also detailing nonparametric procedures along with model building and data diagnostics. Medical and public health researchers will find the discussion of cut point analysis with bootstrap validation, competing risks and the cumulative incidence estimator, and the analysis of left-truncated and right-censored data invaluable. The bootstrap procedure checks robustness of cut point analysis and determines cut point(s). In a chapter ...
Transit safety & security statistics & analysis 2002 annual report (formerly SAMIS)
2004-12-01
The Transit Safety & Security Statistics & Analysis 2002 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administration's (FTA's) National Tr...
Transit safety & security statistics & analysis 2003 annual report (formerly SAMIS)
2005-12-01
The Transit Safety & Security Statistics & Analysis 2003 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administration's (FTA's) National Tr...
Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data
Tekwe, C. D.
2012-05-24
MOTIVATION: Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and the parametric survival model and accelerated failure time-model with log-normal, log-logistic and Weibull distributions were used to detect any differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. RESULTS: Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the discrepancy widening with increasing missingness in the proportions. AVAILABILITY: The testing procedures discussed in this article can all be performed using readily available software such as R. The R codes are provided as supplemental materials. CONTACT: ctekwe@stat.tamu.edu.
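To make the left-censoring idea concrete, here is a minimal sketch of a log-normal log-likelihood in which peaks below a detection limit contribute the probability of falling under that limit instead of a density value. It covers a single group with no covariates, so it is only the building block of an AFT model, not the authors' supplemental R code; the function names, the detection limit and the intensities are all hypothetical.

```python
import math


def norm_logpdf(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def lognormal_left_censored_loglik(mu, sigma, intensities, detection_limit):
    """Log-likelihood for log-normal data; None marks a left-censored peak."""
    total = 0.0
    for y in intensities:
        if y is None:
            # Censored: all we know is that the intensity was below the limit.
            z = (math.log(detection_limit) - mu) / sigma
            total += math.log(norm_cdf(z))
        else:
            # Observed: log-normal density f(y) = phi(z) / (sigma * y).
            z = (math.log(y) - mu) / sigma
            total += norm_logpdf(z) - math.log(sigma) - math.log(y)
    return total


# Hypothetical peak intensities; None means missing below the detection limit.
intensities = [None, None, 120.0, 200.0, 340.0, 90.0]
detection_limit = 50.0
ll_plausible = lognormal_left_censored_loglik(math.log(100.0), 1.0, intensities, detection_limit)
ll_far_off = lognormal_left_censored_loglik(10.0, 1.0, intensities, detection_limit)
```

Maximizing this function over `mu` and `sigma` uses the missing peaks as information rather than discarding them, which is the source of the power gain the abstract reports as missingness grows.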
Statistical network analysis for analyzing policy networks
DEFF Research Database (Denmark)
Robins, Garry; Lewis, Jenny; Wang, Peng
2012-01-01
To analyze social network data using standard statistical approaches is to risk incorrect inference. The dependencies among observations implied in a network conceptualization undermine standard assumptions of the usual general linear models. One of the most quickly expanding areas of social...... and policy network methodology is the development of statistical modeling approaches that can accommodate such dependent data. In this article, we review three network statistical methods commonly used in the current literature: quadratic assignment procedures, exponential random graph models (ERGMs...
Statistical Analysis of Bus Networks in India.
Chatterjee, Atanu; Manohar, Manju; Ramadurai, Gitakrishnan
2016-01-01
In this paper, we model the bus networks of six major Indian cities as graphs in L-space, and evaluate their various statistical properties. While airline and railway networks have been extensively studied, a comprehensive study on the structure and growth of bus networks is lacking. In India, where bus transport plays an important role in day-to-day commutation, it is of significant interest to analyze its topological structure and answer basic questions on its evolution, growth, robustness and resiliency. Although the common feature of small-world property is observed, our analysis reveals a wide spectrum of network topologies arising due to significant variation in the degree-distribution patterns in the networks. We also observe that these networks, although robust and resilient to random attacks, are particularly degree-sensitive. Unlike real-world networks such as the Internet, WWW and airline networks, which are virtual, bus networks are physically constrained. Our findings therefore throw light on the evolution of such geographically constrained networks, which will help us design more efficient bus networks in the future.
Developments in statistical analysis in quantitative genetics
DEFF Research Database (Denmark)
Sorensen, Daniel
2009-01-01
A remarkable research impetus has taken place in statistical genetics since the last World Conference. This has been stimulated by breakthroughs in molecular genetics, automated data-recording devices and computer-intensive statistical methods. The latter were revolutionized by the bootstrap and ...
Analysis of Preference Data Using Intermediate Test Statistic Abstract
African Journals Online (AJOL)
PROF. O. E. OSUAGWU
2013-06-01
Jun 1, 2013 ... We show that this statistic is transitive to well-known test statistic being used for analysis of preference data. Specifically, it is shown that our link is equivalent to the ... Keywords:-Preference data, Friedman statistic, multinomial test statistic, intermediate test ... favourable ones would not be a big issue in.
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard; Hoffmeyer, P.
Statistical analyses are performed for material strength parameters from approximately 6700 specimens of structural timber. Non-parametric statistical analyses and fits to the following distribution types have been investigated: Normal, Lognormal, 2-parameter Weibull and 3-parameter Weibull. The statistical fits have generally been made using all data (100%) and the lower tail (30%) of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. 8 different databases are analysed. The results show that 2-parameter Weibull (and Normal) distributions give the best fits to the data available, especially if tail fits are used, whereas the LogNormal distribution generally gives poor fit and larger coefficients of variation, especially if tail fits are used.
Statistical convergence, selection principles and asymptotic analysis
Energy Technology Data Exchange (ETDEWEB)
Di Maio, G. [Dipartimento di Matematica, Seconda Universita di Napoli, Via Vivaldi 43, 81100 Caserta (Italy)], E-mail: giuseppe.dimaio@unina2.it; Djurcic, D. [Technical Faculty, University of Kragujevac, Svetog Save 65, 32000 Cacak (Serbia)], E-mail: dragandj@tfc.kg.ac.yu; Kocinac, Lj.D.R. [Faculty of Sciences and Mathematics, University of Nis, Visegradska 33, 18000 Nis (Serbia)], E-mail: lkocinac@ptt.rs; Zizovic, M.R. [Technical Faculty, University of Kragujevac, Svetog Save 65, 32000 Cacak (Serbia)], E-mail: zizo@tfc.kg.ac.yu
2009-12-15
We consider the set S of sequences of positive real numbers in the context of statistical convergence/divergence and show that some subclasses of S have certain nice selection and game-theoretic properties.
Survival analysis of piglet pre-weaning mortality
P. Carnier; E. Zanetti; F. Maretto; Cecchinato, A.
2010-01-01
Survival analysis methodology was applied in order to analyse sources of variation of pre-weaning survival time and to estimate variance components using data from a crossbred piglet population. A frailty sire model was used with the litter effect treated as an additional random source of variation. All the variables considered had a significant effect on survivability: sex, cross-fostering, parity of the nurse-sow and litter size. The variance estimates of sire and litter were close to 0.08...
Statistical analysis of microbiological diagnostic tests
Directory of Open Access Journals (Sweden)
C P Baveja
2017-01-01
No study in medical science is complete without application of statistical principles. Incorrect application of statistical tests causes incorrect interpretation of study results obtained through hard work. Yet statistics remains one of the most neglected and loathed areas, probably due to the lack of understanding of the basic principles. In microbiology, rapid progress is being made in the field of diagnostic tests, and a huge number of the studies being conducted are related to the evaluation of these tests. Therefore, a good knowledge of statistical principles will help a microbiologist to plan, conduct and interpret a study. The initial part of this review discusses study designs, types of variables, principles of sampling, calculation of sample size, types of errors and power of the study. Subsequently, the performance characteristics of a diagnostic test, the receiver operator characteristic curve and tests of significance are explained. Lack of a perfect gold standard test against which a test is being compared can hamper the study results; thus, it becomes essential to apply the remedial measures described here. Rapid computerisation has made statistical calculations much simpler, obviating the need for the routine researcher to rote learn the derivations and apply complex formulae. Thus, greater focus has been laid on developing an understanding of principles. Finally, it should be kept in mind that a diagnostic test may show exemplary statistical results, yet it may not be useful in the routine laboratory or in the field; thus, its operational characteristics are as important as the statistical results.
Directory of Open Access Journals (Sweden)
Ghazi Sharkas
2011-04-01
Background: Breast cancer is the most common cancer among Jordanian women, yet survival data are scarce. This study aims to assess the observed five-year survival rate of breast cancer in Jordan from 1997 to 2002 and to determine factors that may influence survival. Methods: Data were obtained from the Jordan Cancer Registry (JCR), which is a population-based registry. From 1997 to 2002, 2121 patients diagnosed with breast cancer were registered in the JCR. Relevant data were collected from JCR files, hospital medical records and histopathology reports. Each patient's status, whether alive or dead, was ascertained from the Department of Civil Status using patients' national numbers (ID). Statistical analysis was carried out using SPSS (version 10). Survival probabilities by age, morphology, grade, stage and other relevant variables were obtained with the Kaplan-Meier method. Results: The overall five-year survival for breast cancer in Jordan, regardless of stage or grade, was 64.2%, whereas it was 58% in the group aged less than 30 years. The best survival was in the age group 40-49 years (69.3%). The survival for adenocarcinoma was 57.4%, and for medullary carcinoma it was 82%. The survival rate was approximately 73.8% for well-differentiated, 55.6% for anaplastic, and 58% for poorly differentiated cancers. The five-year survival rate was 82.7% for stage I, 72.2% for stage II, 58.7% for stage III, and 34.6% for stage IV cancers. Conclusion: According to univariate analysis, stage, grade, age and laterality of breast cancer significantly influenced cancer survival. Cox regression analysis revealed that stage, grade and age correlated with prognosis, while laterality showed no significant effect on survival. Results demonstrated that overall survival was relatively poor. We hypothesized that this was due to low levels of awareness and lack of screening programs.
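The Kaplan-Meier method used for the survival probabilities above can be sketched in a few lines. This is the standard product-limit estimator written in plain Python, not the SPSS procedure the authors ran; the function name and the follow-up data are hypothetical and are not registry values.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone whose follow-up ends at t (dead or censored) leaves the risk set.
        at_risk -= sum(1 for x in times if x == t)
    return curve


# Hypothetical follow-up times in years.
times = [1, 2, 2, 3, 4, 5, 5, 5]
events = [1, 1, 0, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
```

Censored patients never lower the curve directly; they only shrink the risk set for later event times, which is why the estimate at five years can use patients with incomplete follow-up.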
Lucy Asher; Harvey, Naomi D.; Martin Green; England, Gary C.W.
2017-01-01
Epidemiology is the study of patterns of health-related states or events in populations. Statistical models developed for epidemiology could be usefully applied to behavioral states or events. The aim of this study is to present the application of epidemiological statistics to understand animal behavior where discrete outcomes are of interest, using data from guide dogs to illustrate. Specifically, survival analysis and multistate modeling are applied to data on guide dogs comparing dogs that...
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard
2003-01-01
The statistical fits have generally been made using all data and the lower tail of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. The results show that the 2-parameter Weibull distribution gives the best fits to the data available, especially if tail fits are used, whereas the Log Normal distribution generally gives a poor fit and larger coefficients of variation, especially if tail fits are used. The implications on the reliability level of typical structural elements and on partial safety factors...
Survival analysis of patients under chronic HIV-care and ...
African Journals Online (AJOL)
Background: Health care planning depends upon good knowledge of prevalence that requires a clear understanding of survival patterns of patients who receive medication, treatment and care. Survival analysis can bring to light the effect that some demographic, social, medical and clinical characteristics have on the ...
Potential density and tree survival: an analysis based on South ...
African Journals Online (AJOL)
Finally, we present a tree survival analysis, based on the Weibull distribution function, for the Nelshoogte replicated CCT study, which has been observed for almost 40 years after planting and provides information about tree survival in response to planting espacements ranging from 494 to 2 965 trees per hectare.
Multiple imputation of missing blood pressure covariates in survival analysis
Buuren, S. van; Boshuizen, H.C.; Knook, D.L.
1999-01-01
This paper studies a non-response problem in survival analysis where the occurrence of missing data in the risk factor is related to mortality. In a study to determine the influence of blood pressure on survival in the very old (85+ years), blood pressure measurements are missing in about 12.5 per
Survival analysis of mortality data among elderly patients in ...
African Journals Online (AJOL)
A study on the mortality among old patients 60 years or more, admitted at University of Ilorin Teaching Hospital (UITH), Ilorin was carried out using survival analysis approach. Results revealed that the median survival time, which is the time beyond which half of the patients are expected to stay in hospital before death was ...
Survival analysis of piglet pre-weaning mortality
Directory of Open Access Journals (Sweden)
P. Carnier
2010-04-01
Survival analysis methodology was applied to analyse sources of variation in pre-weaning survival time and to estimate variance components using data from a crossbred piglet population. A frailty sire model was used with the litter effect treated as an additional random source of variation. All the variables considered had a significant effect on survivability: sex, cross-fostering, parity of the nurse-sow and litter size. The variance estimates of sire and litter were close to 0.08 and 2 respectively and the heritability of pre-weaning survival was 0.03.
Information sources of company's competitive environment statistical analysis
Khvostenko, O.
2010-01-01
The article addresses the statistical analysis of a company's competitive environment and the information sources for that analysis. The main features of the information system and its significance for statistical research on the competitive environment are considered.
Meta-analysis of survival prediction with Palliative Performance Scale.
Downing, Michael; Lau, Francis; Lesperance, Mary; Karlson, Nicholas; Shaw, Jack; Kuziemsky, Craig; Bernard, Steve; Hanson, Laura; Olajide, Lola; Head, Barbara; Ritchie, Christine; Harrold, Joan; Casarett, David
2007-01-01
This paper aims to reconcile the use of Palliative Performance Scale (PPSv2) for survival prediction in palliative care through an international collaborative study by five research groups. The study involves an individual patient data meta-analysis on 1,808 patients from four original datasets to reanalyze their survival patterns by age, gender, cancer status, and initial PPS score. Our findings reveal a strong association between PPS and survival across the four datasets. The Kaplan-Meier survival curves show each PPS level as distinct, with a strong ordering effect in which higher PPS levels are associated with increased length of survival. Using a stratified Cox proportional hazard model to adjust for study differences, we found females lived significantly longer than males, with a further decrease in hazard for females not diagnosed with cancer. Further work is needed to refine the reporting of survival times/probabilities and to improve prediction accuracy with the inclusion of other variables in the models.
Statistical analysis of lineaments of Goa, India
Digital Repository Service at National Institute of Oceanography (India)
Iyer, S.D.; Banerjee, G.; Wagle, B.G.
statistically to obtain the nonlinear pattern in the form of a cosine wave. Three distinct peaks were found at azimuths of 40-45 degrees, 90-95 degrees and 140-145 degrees, which have peak values of 5.85, 6.80 respectively. These three peaks are correlated...
[Dealing with competing events in survival analysis].
Béchade, Clémence; Lobbedez, Thierry
2015-04-01
Survival analyses focus on the occurrence of an event of interest, in order to determine risk factors and estimate a risk. Competing events prevent the event of interest from being observed. When competing events are present, the risk estimate can be biased. The aim of this article is to explain why the Cox model is not appropriate when there are competing events, and to present the Fine and Gray model, which can help when dealing with competing risks. Copyright © 2015 Association Société de néphrologie. Published by Elsevier SAS. All rights reserved.
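The bias described in this abstract can be illustrated with a small sketch. It computes the nonparametric cumulative incidence function (the Aalen-Johansen form), which is not the Fine and Gray regression model the article presents; it only shows how competing events enter a risk estimate. The function name and the coding of `causes` (0 for censored, positive integers for event types) are assumptions made for this illustration.

```python
def cumulative_incidence(times, causes, cause):
    """Cumulative incidence of one event type in the presence of competing events.

    times  -- follow-up time per subject
    causes -- 0 if censored, otherwise an integer event type (1, 2, ...)
    cause  -- the event type of interest
    Returns [(t, CIF(t))] at each time where the event of interest occurs.
    """
    data = sorted(zip(times, causes))
    n = len(data)
    surv = 1.0   # probability of being free of ANY event just before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < n:
        t = data[i][0]
        at_risk = n - i          # subjects still under observation at time t
        d_any = d_cause = 0
        j = i
        while j < n and data[j][0] == t:   # group tied times
            c = data[j][1]
            if c != 0:
                d_any += 1
            if c == cause:
                d_cause += 1
            j += 1
        if d_cause:
            # increment = P(event-free so far) * cause-specific hazard at t
            cif += surv * d_cause / at_risk
            curve.append((t, cif))
        surv *= 1.0 - d_any / at_risk      # update overall event-free survival
        i = j
    return curve


times = [1, 2, 3, 4, 5]
causes = [1, 2, 1, 0, 1]   # type 2 is the competing event, 0 is censoring
cif_1 = cumulative_incidence(times, causes, cause=1)
cif_2 = cumulative_incidence(times, causes, cause=2)
```

In this toy data set the two cumulative incidences sum to at most 1, whereas one minus a Kaplan-Meier estimate that treats the competing event as censoring would give a larger value for the event of interest.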
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2015-02-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word "significant". (4) Overreliance on standard errors, which are often misunderstood.
Fundamentals of statistical experimental design and analysis
Easterling, Robert G
2015-01-01
Professionals in all areas - business; government; the physical, life, and social sciences; engineering; medicine, etc. - benefit from using statistical experimental design to better understand their worlds and then use that understanding to improve the products, processes, and programs they are responsible for. This book aims to provide the practitioners of tomorrow with a memorable, easy to read, engaging guide to statistics and experimental design. This book uses examples, drawn from a variety of established texts, and embeds them in a business or scientific context, seasoned with a dash of humor, to emphasize the issues and ideas that led to the experiment and the what-do-we-do-next? steps after the experiment. Graphical data displays are emphasized as means of discovery and communication and formulas are minimized, with a focus on interpreting the results that software produce. The role of subject-matter knowledge, and passion, is also illustrated. The examples do not require specialized knowledge, and t...
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2014-11-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1. P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. 2. Overemphasis on P values rather than on the actual size of the observed effect. 3. Overuse of statistical hypothesis testing, and being seduced by the word "significant". 4. Overreliance on standard errors, which are often misunderstood.
Microcomputer-assisted univariate survival data analysis using Kaplan-Meier life table estimators.
Campos-Filho, N; Franco, E L
1988-01-01
We describe a microcomputer program (KMSURV) for exploratory univariate statistical analysis of survival data which is directly applicable to the evaluation of clinical trials and to retrospective epidemiological studies of hospital registry-based data. The program calculates life-table-like information based on Kaplan-Meier's product-limit estimators of the survivorship function S(t) and provides summary measures of average survival times. In addition, two non-parametric tests for the comparison of survival distributions are performed. A report-quality, high resolution plot of the S(t) estimates for all groups being compared complements each set of analyses. KMSURV is not a simple adaptation of a mainframe statistical analysis package and, thus, it utilizes efficiently the interactive environment which is inherent in microcomputing.
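The product-limit estimator of S(t) mentioned in this abstract can be sketched in a few lines. This is a minimal illustration of the Kaplan-Meier formula, not a reconstruction of the KMSURV program; the function name and the 0/1 coding of `events` are assumptions.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate of the survivorship function.

    times  -- follow-up time per subject
    events -- 1 if the event was observed, 0 if the subject was censored
    Returns [(t, S(t))] at each distinct time with at least one observed event.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # every subject leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk   # multiply by the conditional survival at t
            curve.append((t, surv))
        at_risk -= leaving[t]           # drop events and censorings occurring at t
    return curve


km_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Subjects censored at a time t are counted as at risk at t, which is the usual convention when events and censorings are tied.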
Directory of Open Access Journals (Sweden)
Priya Ranganathan
2015-01-01
In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the 'P' value, explain the importance of 'confidence intervals' and clarify the importance of including both values in a paper.
Commentary Discrepancy between statistical analysis method and ...
African Journals Online (AJOL)
to strive for compatibility between study design and analysis plan. Many authors have reported on common discrepancies in medical research, specifically between analysis methods and study design.4,5 For instance, after reviewing several published studies, Varnell et al. observed that many studies had applied.
Confidence Levels in Statistical Analyses. Analysis of Variances. Case Study.
Directory of Open Access Journals (Sweden)
Ileana Brudiu
2010-05-01
Applying a statistical test to check statistical assumptions gives a positive or negative answer regarding the veracity of the stated hypothesis. In analysis of variance, a post hoc test is needed to determine which groups differ. Statistical estimation using confidence levels provides more information than a statistical test: it shows the high degree of uncertainty resulting from small samples and allows conclusions to be framed in terms such as "marginally significant" or "almost significant" (p being close to 0.05). The case study shows how statistical estimation complements the analysis of variance test and the Tukey test.
U.S. Geological Survey, Department of the Interior — These data provide information on the survival of California red-legged frogs in a unique ecosystem to better conserve this threatened species while restoring...
Why Flash Type Matters: A Statistical Analysis
Mecikalski, Retha M.; Bitzer, Phillip M.; Carey, Lawrence D.
2017-09-01
While the majority of research only differentiates between intracloud (IC) and cloud-to-ground (CG) flashes, there exists a third flash type, known as hybrid flashes. These flashes have extensive IC components as well as return strokes to ground but are misclassified as CG flashes in current flash type analyses due to the presence of a return stroke. In an effort to show that IC, CG, and hybrid flashes should be separately classified, the two-sample Kolmogorov-Smirnov (KS) test was applied to the flash sizes, flash initiation, and flash propagation altitudes for each of the three flash types. The KS test statistically showed that IC, CG, and hybrid flashes do not have the same parent distributions and thus should be separately classified. Separate classification of hybrid flashes will lead to improved lightning-related research, because unambiguously classified hybrid flashes occur on the same order of magnitude as CG flashes for multicellular storms.
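The two-sample Kolmogorov-Smirnov statistic used in this study is the largest absolute gap between the empirical distribution functions of the two samples. The sketch below computes only that statistic, on made-up numbers; the function name is an assumption and no lightning data from the paper are reproduced.

```python
def ks_two_sample_statistic(x, y):
    """Maximum absolute difference between the empirical CDFs of x and y."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # step both empirical CDFs past every observation equal to v
        while i < n and xs[i] == v:
            i += 1
        while j < m and ys[j] == v:
            j += 1
        d = max(d, abs(i / n - j / m))
    # once one sample is exhausted its CDF is 1, so the gap can only shrink
    return d


d_separated = ks_two_sample_statistic([1, 2, 3], [4, 5, 6])
d_identical = ks_two_sample_statistic([1, 2, 3, 4], [1, 2, 3, 4])
d_interleaved = ks_two_sample_statistic([1, 3, 5], [2, 4, 6])
```

A library routine such as SciPy's `scipy.stats.ks_2samp` additionally returns a p-value for the hypothesis that both samples come from the same distribution.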
Statistics over features: EEG signals analysis.
Derya Ubeyli, Elif
2009-08-01
This paper presented the usage of statistics over the set of the features representing the electroencephalogram (EEG) signals. Since classification is more accurate when the pattern is simplified through representation by important features, feature extraction and selection play an important role in classifying systems such as neural networks. Multilayer perceptron neural network (MLPNN) architectures were formulated and used as basis for detection of electroencephalographic changes. Three types of EEG signals (EEG signals recorded from healthy volunteers with eyes open, epilepsy patients in the epileptogenic zone during a seizure-free interval, and epilepsy patients during epileptic seizures) were classified. The selected Lyapunov exponents, wavelet coefficients and the power levels of power spectral density (PSD) values obtained by eigenvector methods of the EEG signals were used as inputs of the MLPNN trained with Levenberg-Marquardt algorithm. The classification results confirmed that the proposed MLPNN has potential in detecting the electroencephalographic changes.
Statistical power analysis for the behavioral sciences
National Research Council Canada - National Science Library
Cohen, Jacob
1988-01-01
.... A chapter has been added for power analysis in set correlation and multivariate methods (Chapter 10). Set correlation is a realization of the multivariate general linear model, and incorporates the standard multivariate methods...
Statistical methods for categorical data analysis
Powers, Daniel
2008-01-01
This book provides a comprehensive introduction to methods and models for categorical data analysis and their applications in social science research. Companion website also available, at https://webspace.utexas.edu/dpowers/www/
Statistical power analysis for the behavioral sciences
National Research Council Canada - National Science Library
Cohen, Jacob
1988-01-01
... offers a unifying framework and some new data-analytic possibilities. 2. A new chapter (Chapter 11) considers some general topics in power analysis in more integrated form than is possible in the earlier...
Statistical Analysis of the Flexographic Printing Quality
Directory of Open Access Journals (Sweden)
Agnė Matulaitienė
2014-02-01
Analysis of flexographic printing output quality was performed using the SPSS software package. Samples of defective products were collected over one year at an operating flexographic printing company. Each example of a defective product was described in detail and analyzed. The SPSS software package was chosen because of the large amount of data. Hypotheses based on the defect data were formulated and then accepted or rejected in the analysis. The results obtained are presented in charts.
Mediation analysis for survival data using semiparametric probit models.
Huang, Yen-Tsung; Cai, Tianxi
2016-06-01
Causal mediation modeling has become a popular approach for studying the effect of an exposure on an outcome through mediators. Currently, the literature on mediation analyses with survival outcomes has largely focused on settings with a single mediator and quantified the mediation effects on the hazard, log hazard and log survival time (Lange and Hansen 2011; VanderWeele 2011). In this article, we propose a multi-mediator model for survival data by employing a flexible semiparametric probit model. We characterize path-specific effects (PSEs) of the exposure on the outcome mediated through specific mediators. We derive closed form expressions for PSEs on a transformed survival time and the survival probabilities. Statistical inference on the PSEs is developed using a nonparametric maximum likelihood estimator under the semiparametric probit model and the functional Delta method. Results from simulation studies suggest that our proposed methods perform well in finite samples. We illustrate the utility of our method in a genomic study of glioblastoma multiforme survival. © 2015, The International Biometric Society.
Survival of Patients with Primary Brain Tumors: Comparison of Two Statistical Approaches.
Directory of Open Access Journals (Sweden)
Iveta Selingerová
We reviewed the survival time for patients with primary brain tumors undergoing treatment with stereotactic radiation methods at the Masaryk Memorial Cancer Institute Brno. We also identified risk factors and characteristics, and described their influence on survival time. In summarizing survival data, there are two functions of principal interest, namely, the survival function and the hazard function. In practice, both of them can depend on some characteristics. We focused on nonparametric methods, propose a method based on kernel smoothing, and compared our estimates with the results of the Cox regression model. The hazard function is conditional to age and gross tumor volume and visualized as a color-coded surface. A multivariate Cox model was also designed. There were 88 patients with primary brain cancer, treated with stereotactic radiation. The median survival of our patient cohort was 47.8 months. The estimate of the hazard function has two peaks (about 10 months and about 40 months). The survival time of patients was significantly different for various diagnoses (p≪0.001), KI (p = 0.047) and stereotactic methods (p = 0.033). Patients with a greater GTV had higher risk of death. The suitable threshold for GTV is 20 cm3. Younger patients with a survival time of about 50 months had a higher risk of death. In the multivariate Cox regression model, the selected variables were age, GTV, sex, diagnosis, KI, location, and some of their interactions. Kernel methods give us the possibility to evaluate continuous risk variables and based on the results offer risk-prone patients a different treatment, and can be useful for verifying assumptions of the Cox model or for finding thresholds of continuous variables.
Survival of Patients with Primary Brain Tumors: Comparison of Two Statistical Approaches
Selingerová, Iveta; Doleželová, Hana; Horová, Ivanka; Katina, Stanislav; Zelinka, Jiří
2016-01-01
Purpose We reviewed the survival time for patients with primary brain tumors undergoing treatment with stereotactic radiation methods at the Masaryk Memorial Cancer Institute Brno. We also identified risk factors and characteristics, and described their influence on survival time. Methods In summarizing survival data, there are two functions of principal interest, namely, the survival function and the hazard function. In practice, both of them can depend on some characteristics. We focused on nonparametric methods, propose a method based on kernel smoothing, and compared our estimates with the results of the Cox regression model. The hazard function is conditional to age and gross tumor volume and visualized as a color-coded surface. A multivariate Cox model was also designed. Results There were 88 patients with primary brain cancer, treated with stereotactic radiation. The median survival of our patient cohort was 47.8 months. The estimate of the hazard function has two peaks (about 10 months and about 40 months). The survival time of patients was significantly different for various diagnoses (p≪0.001), KI (p = 0.047) and stereotactic methods (p = 0.033). Patients with a greater GTV had higher risk of death. The suitable threshold for GTV is 20 cm3. Younger patients with a survival time of about 50 months had a higher risk of death. In the multivariate Cox regression model, the selected variables were age, GTV, sex, diagnosis, KI, location, and some of their interactions. Conclusion Kernel methods give us the possibility to evaluate continuous risk variables and based on the results offer risk-prone patients a different treatment, and can be useful for verifying assumptions of the Cox model or for finding thresholds of continuous variables. PMID:26863415
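Kernel smoothing of a hazard function, as referred to in this abstract, can be sketched in one dimension by smoothing the increments of the Nelson-Aalen cumulative hazard. This is a simplified, unconditional illustration: the paper's estimator is conditional on age and gross tumor volume, which this sketch does not attempt. The Epanechnikov kernel, the fixed bandwidth and all function names are assumptions chosen for the example.

```python
def nelson_aalen_increments(times, events):
    """Return [(t, d/n)] : events at t divided by the number at risk at t."""
    data = sorted(zip(times, events))
    n = len(data)
    increments = []
    i = 0
    while i < n:
        t = data[i][0]
        at_risk = n - i
        d = 0
        j = i
        while j < n and data[j][0] == t:   # group tied times
            d += data[j][1]                # events are coded 1, censorings 0
            j += 1
        if d:
            increments.append((t, d / at_risk))
        i = j
    return increments


def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def smoothed_hazard(t, times, events, bandwidth):
    """Kernel estimate of the hazard at time t from Nelson-Aalen increments."""
    increments = nelson_aalen_increments(times, events)
    total = sum(epanechnikov((t - tj) / bandwidth) * dh for tj, dh in increments)
    return total / bandwidth


h_at_4 = smoothed_hazard(4.0, [2, 4, 6], [1, 1, 1], bandwidth=1.0)
h_far = smoothed_hazard(100.0, [2, 4, 6], [1, 1, 1], bandwidth=1.0)
```

The bandwidth controls the trade-off between a noisy and an oversmoothed curve, and estimates near the ends of the follow-up period are subject to boundary effects that this sketch ignores.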
A statistical package for computing time and frequency domain analysis
Brownlow, J.
1978-01-01
The spectrum analysis (SPA) program is a general purpose digital computer program designed to aid in data analysis. The program does time and frequency domain statistical analyses as well as some preanalysis data preparation. The capabilities of the SPA program include linear trend removal and/or digital filtering of data, plotting and/or listing of both filtered and unfiltered data, time domain statistical characterization of data, and frequency domain statistical characterization of data.
Hayslett, H T
1991-01-01
Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
Directory of Open Access Journals (Sweden)
Sjoerd P. F. T. Nota
2015-01-01
Introduction. Chondrosarcomas are malignant bone tumors that are characterized by the production of chondroid tissue. Since radiation therapy and chemotherapy have limited effect on chondrosarcoma, treatment of most patients depends on surgical resection. We conducted this study to identify independent predictive factors and survival characteristics for conventional central chondrosarcoma and dedifferentiated central chondrosarcoma. Methods. A systematic literature review was performed in September 2014 using the PubMed, Embase, and Cochrane databases. Following a predefined selection procedure, we included 13 studies, comprising a total of 1114 patients. Results. The prognosis of central chondrosarcoma is generally good for the histologically low-grade tumors. Prognosis for the high-grade chondrosarcoma and the dedifferentiated chondrosarcoma is poor with lower survival rates. Poor prognostic factors in conventional chondrosarcoma for overall survival are high-grade tumors and axial/pelvic tumor location. In dedifferentiated chondrosarcoma the percentage of dedifferentiated component has significant influence on disease-free survival. Conclusion. Despite the fact that there are multiple prognostic factors identified, as shown in this study, there is a need for prospective and comparative studies. The resulting knowledge about prognostic factors and survival can give direction in the development of better therapies. This could eventually lead to an evidence-based foundation for treating chondrosarcoma patients.
Statistical analysis of Hasegawa-Wakatani turbulence
Anderson, Johan; Hnat, Bogdan
2017-06-01
Resistive drift wave turbulence is a multipurpose paradigm that can be used to understand transport at the edge of fusion devices. The Hasegawa-Wakatani model captures the essential physics of drift turbulence while retaining the simplicity needed to gain a qualitative understanding of this process. We provide a theoretical interpretation of numerically generated probability density functions (PDFs) of intermittent events in Hasegawa-Wakatani turbulence with enforced equipartition of energy in large scale zonal flows, and small scale drift turbulence. We find that for a wide range of adiabatic index values, the stochastic component representing the small scale turbulent eddies of the flow, obtained from the autoregressive integrated moving average model, exhibits super-diffusive statistics, consistent with intermittent transport. The PDFs of large events (above one standard deviation) are well approximated by the Laplace distribution, while small events often exhibit a Gaussian character. Furthermore, there exists a strong influence of zonal flows, for example, via shearing and then viscous dissipation maintaining a sub-diffusive character of the fluxes.
Book review: Statistical Analysis and Modelling of Spatial Point Patterns
DEFF Research Database (Denmark)
Møller, Jesper
2009-01-01
Statistical Analysis and Modelling of Spatial Point Patterns by J. Illian, A. Penttinen, H. Stoyan and D. Stoyan. Wiley (2008), ISBN 9780470014912.
Statistical Modelling of Wind Profiles - Data Analysis and Modelling
DEFF Research Database (Denmark)
Jónsson, Tryggvi; Pinson, Pierre
The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.
Sensitivity analysis of ranked data: from order statistics to quantiles
Heidergott, B.F.; Volk-Makarewicz, W.
2015-01-01
In this paper we provide the mathematical theory for sensitivity analysis of order statistics of continuous random variables, where the sensitivity is with respect to a distributional parameter. Sensitivity analysis of order statistics over a finite number of observations is discussed before
The Statistical Analysis of Failure Time Data
Kalbfleisch, John D
2011-01-01
Contains additional discussion and examples on left truncation as well as material on more general censoring and truncation patterns. Introduces the martingale and counting process formulation in a new chapter. Develops multivariate failure time data in a separate chapter and extends the material on Markov and semi-Markov formulations. Presents new examples and applications of data analysis.
Statistical Analysis Of Reconnaissance Geochemical Data From ...
African Journals Online (AJOL)
Five factors, whose structures were similar to the subjective groupings derived from the correlation matrix, were derived from R-mode factor analysis and have been interpreted in terms of underlying rock lithology, potential mineralization, and physico-chemical conditions in the environment. A high possibility of occurrence ...
Statistical inference of Minimum Rank Factor Analysis
Shapiro, A; Ten Berge, JMF
For any given number of factors, Minimum Rank Factor Analysis yields optimal communalities for an observed covariance matrix in the sense that the unexplained common variance with that number of factors is minimized, subject to the constraint that both the diagonal matrix of unique variances and the
NUCLEI SHAPE ANALYSIS, A STATISTICAL APPROACH
Directory of Open Access Journals (Sweden)
Alberto Nettel-Aguirre
2011-05-01
The method presented in our paper suggests the use of Functional Data Analysis (FDA) techniques in an attempt to characterise the nuclei of two types of cells, cancer and non-cancer, based on their two-dimensional profiles. The characteristics of the profile itself, as traced by its X and Y coordinates, their first and second derivatives, their variability and their use in characterization are the main focus of this approach, which is not constrained to star-shaped nuclei. Findings: Principal components created from the coordinates relate to shape, with significant differences between nuclei types. Characterisations for each type of profile were found.
Baltic sea algae analysis using Bayesian spatial statistics methods
Directory of Open Access Journals (Sweden)
Eglė Baltmiškytė
2013-03-01
Spatial statistics is one of the fields of statistics dealing with the analysis of spatially distributed data. Recently, Bayesian methods have often been applied to statistical data analysis. This article builds and describes a spatial data model for predicting algae quantity in the Baltic Sea. In the model, Black Carrageen is the dependent variable and depth, sand, pebble and boulders are the independent variables. Two models with different covariance functions (Gaussian and exponential) are built to identify the better-fitting model for predicting algae quantity. Unknown model parameters are estimated and the posterior distribution of the Bayesian kriging prediction is computed in the OpenBUGS modeling environment using Bayesian spatial statistics methods.
Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun
2016-09-14
Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprehends 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs . The Ontology
A statistical analysis of UK financial networks
Chu, J.; Nadarajah, S.
2017-04-01
In recent years, with a growing interest in big or large datasets, there has been a rise in the application of large graphs and networks to financial big data. Much of this research has focused on the construction and analysis of the network structure of stock markets, based on the relationships between stock prices. Motivated by Boginski et al. (2005), who studied the characteristics of a network structure of the US stock market, we construct network graphs of the UK stock market using the same method. We fit four distributions to the degree density of the vertices from these graphs, the Pareto I, Fréchet, lognormal, and generalised Pareto distributions, and assess the goodness of fit. Our results show that the degree density of the complements of the market graphs, constructed using a negative threshold value close to zero, can be fitted well with the Fréchet and lognormal distributions.
Statistical Performance Analysis and Modeling Techniques for Nanometer VLSI Designs
Shen, Ruijing; Yu, Hao
2012-01-01
Since process variation and chip performance uncertainties have become more pronounced as technologies scale down into the nanometer regime, accurate and efficient modeling or characterization of variations from the device to the architecture level have become imperative for the successful design of VLSI chips. This book provides readers with tools for variation-aware design methodologies and computer-aided design (CAD) of VLSI systems, in the presence of process variations at the nanometer scale. It presents the latest developments for modeling and analysis, with a focus on statistical interconnect modeling, statistical parasitic extractions, statistical full-chip leakage and dynamic power analysis considering spatial correlations, statistical analysis and modeling for large global interconnects and analog/mixed-signal circuits. Provides readers with timely, systematic and comprehensive treatments of statistical modeling and analysis of VLSI systems with a focus on interconnects, on-chip power grids and ...
Schiegnitz, E; Al-Nawas, B; Kämmerer, P W; Grötz, K A
2014-04-01
The aim of this comprehensive literature review is to provide recommendations and guidelines for dental implant therapy in patients with a history of radiation in the head and neck region. For the first time, a meta-analysis comparing the implant survival in irradiated and non-irradiated patients was performed. An extensive electronic search in the electronic databases of the National Library of Medicine was conducted for articles published between January 1990 and January 2013 to identify literature presenting survival data on the topic of dental implants in patients receiving radiotherapy for head and neck cancer. Review and meta-analysis were performed according to Preferred Reporting Items for Systematic Review and Meta-Analyses statement. For meta-analysis, only studies with a mean follow-up of at least 5 years were included. After screening 529 abstracts from the electronic database, we included 31 studies in qualitative and 8 in quantitative synthesis. The mean implant survival rate of all examined studies was 83 % (range, 34-100 %). Meta-analysis of the current literature (2007-2013) revealed no statistically significant difference in implant survival between non-irradiated native bone and irradiated native bone (odds ratio [OR], 1.44; confidence interval [CI], 0.67-3.1). In contrast, meta-analysis of the literature of the years 1990-2006 showed a significant difference in implant survival between non-irradiated and irradiated patients ([OR], 2.12; [CI], 1.69-2.65) with a higher implant survival in the non-irradiated bone. Meta-analysis of the implant survival regarding bone origin indicated a statistically significant higher implant survival in the irradiated native bone compared to the irradiated grafted bone ([OR], 1.82; [CI], 1.14-2.90). Within the limits of this meta-analytic approach to the literature, this study describes for the first time a comparable implant survival in non-irradiated and irradiated native bone in the current literature. Grafted
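The odds ratios and confidence intervals quoted above can be read with a small sketch. The code below is an illustration only, not the authors' pooled meta-analytic method: it computes an odds ratio with a Woolf (logit) 95% interval from a single 2x2 table and checks whether an interval excludes 1. The function names and the cell counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (logit) confidence interval for one 2x2 table.

    a, b: events / non-events in group 1 (hypothetical layout)
    c, d: events / non-events in group 2
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def excludes_one(lower, upper):
    """True when the interval does not contain OR = 1 (no-effect value)."""
    return lower > 1.0 or upper < 1.0


# Hypothetical counts, for illustration only.
print(odds_ratio_ci(10, 20, 5, 40))
# Intervals quoted in the abstract: 0.67-3.1 and 1.69-2.65.
print(excludes_one(0.67, 3.1), excludes_one(1.69, 2.65))
```

Under this reading, the 2007-2013 interval (0.67-3.1) contains 1 and so is consistent with no difference, while the 1990-2006 interval (1.69-2.65) does not.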
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Survival Analysis of Patients with End Stage Renal Disease
Urrutia, J. D.; Gayo, W. S.; Bautista, L. A.; Baccay, E. B.
2015-06-01
This paper provides a survival analysis of End Stage Renal Disease (ESRD) using Kaplan-Meier estimates and the Weibull distribution. The data were obtained from the records of V. L. Makabali Memorial Hospital with respect to time t (patient's age), covariates such as developed secondary disease (Pulmonary Congestion and Cardiovascular Disease) and gender, and the event of interest: the death of ESRD patients. Survival and hazard rates were estimated using NCSS for the Weibull distribution and SPSS for the Kaplan-Meier estimates. Both lead to the same conclusion: over time, the hazard rate increases and the survival rate decreases for ESRD patients diagnosed with Pulmonary Congestion, Cardiovascular Disease, or both. The analysis also shows that female patients have a greater risk of death than males. The probability of risk is given by the equation R = 1 - e^{-H(t)}, where e^{-H(t)} is the survival function and H(t) is the cumulative hazard function, which was obtained using Cox regression.
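The relation between cumulative hazard, survival and risk in this abstract, R = 1 - e^{-H(t)}, can be sketched numerically. The example below is a sketch under stated assumptions: it uses a Weibull cumulative hazard H(t) = (t / scale)^shape purely for illustration (the abstract says H(t) came from Cox regression), and the function names and parameter values are hypothetical.

```python
import math


def weibull_cumulative_hazard(t, shape, scale):
    """Cumulative hazard H(t) = (t / scale) ** shape of a Weibull model."""
    return (t / scale) ** shape


def survival_from_hazard(cum_hazard):
    """Survival function S(t) = exp(-H(t))."""
    return math.exp(-cum_hazard)


def risk_from_hazard(cum_hazard):
    """Probability of the event by time t: R = 1 - exp(-H(t))."""
    return 1.0 - math.exp(-cum_hazard)


# Hypothetical parameters: shape > 1 gives a hazard that increases with time.
for age in (20, 40, 60):
    H = weibull_cumulative_hazard(age, shape=2.0, scale=60.0)
    print(age, round(survival_from_hazard(H), 4), round(risk_from_hazard(H), 4))
```

Survival and risk always sum to 1 for the same H(t), so an increasing cumulative hazard implies decreasing survival and increasing risk.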
Nonparametric survival analysis of infectious disease data.
Kenah, Eben
2013-03-01
This paper develops nonparametric methods based on contact intervals for the analysis of infectious disease data. The contact interval from person i to person j is the time between the onset of infectiousness in i and infectious contact from i to j, where we define infectious contact as a contact sufficient to infect a susceptible individual. The hazard function of the contact interval distribution equals the hazard of infectious contact from i to j, so it provides a summary of the evolution of infectiousness over time. When who-infects-whom is observed, the Nelson-Aalen estimator produces an unbiased estimate of the cumulative hazard function of the contact interval distribution. When who-infects-whom is not observed, we use an EM algorithm to average the Nelson-Aalen estimates from all possible combinations of who-infected-whom consistent with the observed data. This converges to a nonparametric maximum likelihood estimate of the cumulative hazard function that we call the marginal Nelson-Aalen estimate. We study the behavior of these methods in simulations and use them to analyze household surveillance data from the 2009 influenza A(H1N1) pandemic.
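For readers unfamiliar with the Nelson-Aalen estimator mentioned above, a minimal sketch follows. It implements only the generic right-censored form, H(t) = sum of d_i / n_i over event times up to t; it does not implement the contact-interval setting or the EM-based marginal estimator described in the abstract. The function name and the data are hypothetical.

```python
from collections import Counter


def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard at each distinct event time.

    times:  observed times (event or censoring)
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, cumulative_hazard) pairs.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # everyone leaving the risk set at each time
    at_risk = len(times)
    cum_hazard = 0.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            cum_hazard += d / at_risk  # increment d_i / n_i
            curve.append((t, cum_hazard))
        at_risk -= exits[t]
    return curve


# Hypothetical data: five intervals, two of them censored.
print(nelson_aalen([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]))
```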
Nonparametric survival analysis of infectious disease data
Kenah, Eben
2012-01-01
This paper develops nonparametric methods based on contact intervals for the analysis of infectious disease data. The contact interval from person i to person j is the time between the onset of infectiousness in i and infectious contact from i to j, where we define infectious contact as a contact sufficient to infect a susceptible individual. The hazard function of the contact interval distribution equals the hazard of infectious contact from i to j, so it provides a summary of the evolution of infectiousness over time. When who-infects-whom is observed, the Nelson-Aalen estimator produces an unbiased estimate of the cumulative hazard function of the contact interval distribution. When who-infects-whom is not observed, we use an EM algorithm to average the Nelson-Aalen estimates from all possible combinations of who-infected-whom consistent with the observed data. This converges to a nonparametric maximum likelihood estimate of the cumulative hazard function that we call the marginal Nelson-Aalen estimate. We study the behavior of these methods in simulations and use them to analyze household surveillance data from the 2009 influenza A(H1N1) pandemic. PMID: 23772180
Comparative analysis of positive and negative attitudes toward statistics
Ghulami, Hassan Rahnaward; Ab Hamid, Mohd Rashid; Zakaria, Roslinazairimah
2015-02-01
Many statistics lecturers and statistics education researchers are interested in their students' attitudes toward statistics during the statistics course. In a statistics course, a positive attitude toward statistics is vital because it encourages students to take an interest in the course and to master the core content of the subject matter under study. Students who have negative attitudes toward statistics, by contrast, may feel depressed, especially in group assignments, are at risk of failure, are often highly emotional, and cannot move forward. Therefore, this study investigates students' attitudes towards learning statistics. Six latent constructs were used to measure students' attitudes toward learning statistics: affect, cognitive competence, value, difficulty, interest, and effort. The questionnaire was adopted and adapted from the reliable and validated Survey of Attitudes towards Statistics (SATS) instrument. The study was conducted among undergraduate engineering students at Universiti Malaysia Pahang (UMP). The respondents were students from different faculties who were taking the applied statistics course. The analysis found that the questionnaire is acceptable, and the relationships among the constructs were proposed and investigated. Students show full effort to master the statistics course, find the course enjoyable, are confident in their intellectual capacity, and have more positive than negative attitudes towards statistics learning. In conclusion, positive attitudes towards statistics were mostly exhibited in the affect, cognitive competence, value, interest and effort constructs, while negative attitudes were mostly exhibited in the difficulty construct.
CORSSA: The Community Online Resource for Statistical Seismicity Analysis
Michael, Andrew J.; Wiemer, Stefan
2010-01-01
Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools makes it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORSSA. It also describes its structure and contents.
Directory of Open Access Journals (Sweden)
Chenglei Liu
Dedifferentiated chondrosarcoma is a rare, highly malignant tumor with poor survival. Many issues remain unclear concerning the imaging features that can facilitate early diagnosis and the factors that might be related to outcomes. Twenty-three patients with pathologically confirmed dedifferentiated chondrosarcoma from 2008 to 2015 were retrospectively reviewed. The patients' clinical information, images from radiographs (n = 17), CT (n = 19), and MRI (n = 17), histological features, treatment and prognosis were analyzed. There were 12 males and 11 females, and the mean age was 50.39 years. Fourteen cases affected the axial bone (pelvis, spine), and 9 cases involved the appendicular bone. Seven (41.17%), 9 (47.36%), and 12 (66.66%) lesions showed a biphasic nature on radiograph, CT and MRI, respectively. Of the lesions, 17.39% (4/23) were accompanied by pathological fractures. Histologically, the cartilage component was considered histological Grade 1 in 12 patients and Grade 2 in 11 patients. The dedifferentiated component showed features of osteosarcoma in 8 cases, malignant fibrous histiocytoma in 3 cases, myofibroblastic sarcoma in 1 case and spindle cell sarcoma in 11 cases. Twenty-two cases were treated with surgical resection, and 17 cases achieved an adequate (wide or radical) surgical margin. In 8 cases, surgery was combined with adjuvant chemotherapy. The overall median survival time was nine months; 17.4% of patients survived to five years. Axial bone location, lung metastasis at diagnosis, inadequate surgical margin, incorrect diagnosis before surgery and pathological fractures were related to poorer outcome. Pre- or postoperative chemotherapy had no definitive effect on survival.
Survival analysis for customer satisfaction: A case study
Hadiyat, M. A.; Wahyudi, R. D.; Sari, Y.
2017-11-01
Most customer satisfaction surveys are conducted periodically to track their dynamics. One goal of such surveys is to evaluate the service design by recognizing the trend in satisfaction scores. Many researchers have recommended redesigning the service when satisfaction scores decrease, so that the service life cycle can be predicted qualitatively. However, these scores are usually measured on a Likert scale and have quantitative properties. Thus, they should also be analyzed with a quantitative model, so that the service life cycle can be predicted by applying survival analysis. This paper discusses a starting point for customer satisfaction survival analysis with a case study in healthcare service.
Statistical Analysis of Research Data | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Analysis of Research Data (SARD) course will be held on April 12-13, 2017 from 9:00 AM – 5:00 PM at the Natcher Conference Center, Balcony A on the Bethesda campus. SARD is designed to provide an overview of the general principles of statistical analysis of research data. The course will be taught by Paul W. Thurman of Columbia University.
Method for statistical data analysis of multivariate observations
Gnanadesikan, R
1997-01-01
A practical guide for multivariate statistical techniques, now updated and revised. In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte
Statistical evaluation of diagnostic performance topics in ROC analysis
Zou, Kelly H; Bandos, Andriy I; Ohno-Machado, Lucila; Rockette, Howard E
2016-01-01
Statistical evaluation of diagnostic performance in general and Receiver Operating Characteristic (ROC) analysis in particular are important for assessing the performance of medical tests and statistical classifiers, as well as for evaluating predictive models or algorithms. This book presents innovative approaches in ROC analysis, which are relevant to a wide variety of applications, including medical imaging, cancer research, epidemiology, and bioinformatics. Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis covers areas including monotone-transformation techniques in parametric ROC analysis, ROC methods for combined and pooled biomarkers, Bayesian hierarchical transformation models, sequential designs and inferences in the ROC setting, predictive modeling, multireader ROC analysis, and free-response ROC (FROC) methodology. The book is suitable for graduate-level students and researchers in statistics, biostatistics, epidemiology, public health, biomedical engineering, radiology, medi...
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Statistical models have now been developed in various directions to model data of various types and with complex relationships. A rich variety of advanced and recent statistical modelling methods is available, mostly in open source software (one example is R). However, these advanced methods are not very friendly to novice R users, since they are based on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and shiny), so that the most recent and advanced statistical modelling methods are readily available, accessible and applicable on the web. We have previously made interfaces in the form of e-tutorials for several modern and advanced statistical modelling methods in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including modelling using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and makes models easier to compare in order to find the most appropriate model for the data.
Analysis of thrips distribution: application of spatial statistics and Kriging
John Aleong; Bruce L. Parker; Margaret Skinner; Diantha Howard
1991-01-01
Kriging is a statistical technique that provides predictions for spatially and temporally correlated data. Observations of thrips distribution and density in Vermont soils are made in both space and time. Traditional statistical analysis of such data assumes that the counts taken over space and time are independent, which is not necessarily true. Therefore, to analyze...
Attitudes and Achievement in Statistics: A Meta-Analysis Study
Emmioglu, Esma; Capa-Aydin, Yesim
2012-01-01
This study examined the relationships among statistics achievement and four components of attitudes toward statistics (Cognitive Competence, Affect, Value, and Difficulty) as assessed by the SATS. Meta-analysis results revealed that the size of relationships differed by the geographical region in which the studies were conducted as well as by the…
The Importance of Statistical Modeling in Data Analysis and Inference
Rollins, Derrick, Sr.
2017-01-01
Statistical inference simply means to draw a conclusion based on information that comes from data. Error bars are the most commonly used tool for data analysis and inference in chemical engineering data studies. This work demonstrates, using common types of data collection studies, the importance of specifying the statistical model for sound…
Guidelines for Statistical Analysis of Percentage of Syllables Stuttered Data
Jones, Mark; Onslow, Mark; Packman, Ann; Gebski, Val
2006-01-01
Purpose: The purpose of this study was to develop guidelines for the statistical analysis of percentage of syllables stuttered (%SS) data in stuttering research. Method: Data on %SS from various independent sources were used to develop a statistical model to describe this type of data. On the basis of this model, %SS data were simulated with…
Statistical Analysis of the Exchange Rate of Bitcoin: e0133678
National Research Council Canada - National Science Library
Jeffrey Chu; Saralees Nadarajah; Stephen Chan
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar...
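The log-returns analysed in this study follow the standard definition r_t = ln(P_t / P_{t-1}). A minimal sketch is given below; the function name and the price series are hypothetical and are not data from the paper.

```python
import math


def log_returns(prices):
    """Log-returns r_t = ln(P_t / P_{t-1}) of a series of positive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Hypothetical daily exchange rates (USD per unit), for illustration only.
rates = [100.0, 110.0, 99.0, 99.0]
print(log_returns(rates))
```

One convenient property of this definition is that log-returns add across periods: their sum equals the log of the last price divided by the first.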
Statistical Analysis for Grinding Mechanism of Fine Ceramic Material
National Research Council Canada - National Science Library
NISHIOKA, Takao; TANAKA, Yoshio; YAMAKAWA, Akira; MIYAKE, Masaya
1994-01-01
.... Statistical analysis was conducted on the specific grinding energy and stock removal rate with respect to the maximum grain depth of cut by a new method of directly evaluating successive cutting point spacing...
Children in Africa: Key Statistics on Child Survival, Protection and Development
UNICEF, 2014
2014-01-01
This report presents key statistics relating to: (1) child malnutrition in Africa; (2) HIV/AIDS and malaria in Africa; (3) child marriage, birth registration and Female Genital Mutilation/Cutting (FGM/C); (4) education in Africa; (5) child mortality in Africa; (6) drinking water and sanitation in Africa; and (7) maternal health in Africa.…
Multimodality treatment of brain metastases: an institutional survival analysis of 275 patients
Directory of Open Access Journals (Sweden)
Demakas John J
2011-07-01
Background: Whole brain radiation therapy (WBRT), surgical resection, stereotactic radiosurgery (SRS), and combinations of the three modalities are used in the management of patients with metastatic brain tumors. We present the previously unreported survival outcomes of 275 patients treated for newly diagnosed brain metastases at Cancer Care Northwest and Gamma Knife of Spokane between 1998 and 2008. Methods: The effects that treatment regimen, age, Eastern Cooperative Oncology Group-Performance Status (ECOG-PS), primary tumor histology, number of brain metastases, and total volume of brain metastases have on patient overall survival were analyzed. Statistical analysis was performed using Kaplan-Meier survival curves, Andersen 95% confidence intervals, approximate confidence intervals for log hazard-ratios, and multivariate Cox proportional hazard models. Results: The median clinical follow up time was 7.2 months. On multivariate analysis, survival statistically favored patients treated with SRS alone when compared to patients treated with WBRT alone (p ... Conclusions: In our analysis, patients benefited from a combined modality treatment approach and physicians must consider patient age, performance status, and primary tumor histology when recommending specific treatment regimens.
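The Kaplan-Meier survival curves used in this analysis can be illustrated with a short sketch. The code below is a minimal product-limit estimator, S(t) = product of (1 - d_i / n_i) over event times up to t, plus a median read-off; it does not compute the Andersen confidence intervals or the Cox models mentioned in the abstract. The function names and the follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at each distinct event time.

    times:  follow-up times (event or censoring)
    events: 1 if death was observed, 0 if the patient was censored
    Returns a list of (time, survival_probability) pairs.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # deaths plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk  # product-limit step
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical follow-up in months for six patients, two censored.
months = [2, 5, 5, 7, 9, 12]
died = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(months, died)
print(curve)
print(median_survival(curve))
```

Censored patients leave the risk set without producing a step in the curve, which is what distinguishes this estimate from a simple proportion of survivors.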
Using multivariate statistical analysis to assess changes in water ...
African Journals Online (AJOL)
Abstract. Multivariate statistical analysis was used to investigate changes in water chemistry at 5 river sites in the Vaal Dam catch- ... analysis (CCA) showed that the environmental variables used in the analysis, discharge and month of sampling, explained ......
Statistical Learning in Specific Language Impairment: A Meta-Analysis
Lammertink, Imme; Boersma, Paul; Wijnen, Frank; Rispens, Judith
2017-01-01
Purpose: The current meta-analysis provides a quantitative overview of published and unpublished studies on statistical learning in the auditory verbal domain in people with and without specific language impairment (SLI). The database used for the meta-analysis is accessible online and open to updates (Community-Augmented Meta-Analysis), which…
Detecting errors in micro and trace analysis by using statistics
DEFF Research Database (Denmark)
Heydorn, K.
1993-01-01
to be in statistical control. Significant deviations between analytical results from different laboratories reveal the presence of systematic errors, and agreement between different laboratories indicates the absence of systematic errors. This statistical approach, referred to as the analysis of precision, was applied...... to results for chlorine in freshwater from BCR certification analyses by highly competent analytical laboratories in the EC. Titration showed systematic errors of several percent, while radiochemical neutron activation analysis produced results without detectable bias.
[Appropriate usage of statistical analysis in eye research].
Ge, Jian
2013-02-01
To avoid data bias in clinical research, it is essential to carefully select statistical analyses suited to the research purpose and design. Teamwork among statisticians, scientists and specialists helps ensure a reliable and scientific analysis of a study. The best way to analyze a study is to select the most appropriate statistical methods rather than the most complicated ones.
Advanced data analysis in neuroscience integrating statistical and computational models
Durstewitz, Daniel
2017-01-01
This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanatory frameworks, but become powerfu...
Breastfeeding, birth intervals and child survival: analysis of the 1997 ...
African Journals Online (AJOL)
Original article. Breastfeeding, birth intervals and child survival: analysis of the 1997 community and family survey data in southern Ethiopia. Markos Ezra, Eshetu Gurmu. Abstract. Background: This paper uses the 1997 community and family survey data to primarily address the question of whether or not short birth intervals ...
Use of parametric and non-parametric survival analysis techniques ...
African Journals Online (AJOL)
This paper presents parametric and non-parametric survival analysis procedures that can be used to compare acaricides. The effectiveness of Delta Tick Pour On and Delta Tick Spray in knocking down tsetse flies was determined. The two formulations were supplied by Chemplex. The comparison was based on data ...
Using Survival Analysis to Understand Graduation of Students with Disabilities
Schifter, Laura A.
2016-01-01
This study examined when students with disabilities graduated high school and how graduation patterns differed for students based on selected demographic and educational factors. Utilizing statewide data on students with disabilities from Massachusetts from 2005 through 2012, the author conducted discrete-time survival analysis to estimate the…
A gradient boosting algorithm for survival analysis via direct optimization of concordance index.
Chen, Yifei; Jia, Zhenyu; Mercola, Dan; Xie, Xiaohui
2013-01-01
Survival analysis focuses on modeling and predicting the time to an event of interest. Many statistical models have been proposed for survival analysis. They often impose strong assumptions on hazard functions, which describe how the risk of an event changes over time depending on covariates associated with each individual. In particular, the prevalent proportional hazards model assumes that covariates are multiplicatively related to the hazard. Here we propose a nonparametric model for survival analysis that does not explicitly assume particular forms of hazard functions. Our nonparametric model utilizes an ensemble of regression trees to determine how the hazard function varies according to the associated covariates. The ensemble model is trained using a gradient boosting method to optimize a smoothed approximation of the concordance index, which is one of the most widely used metrics in survival model performance evaluation. We implemented our model in a software package called GBMCI (gradient boosting machine for concordance index) and benchmarked the performance of our model against other popular survival models with a large-scale breast cancer prognosis dataset. Our experiment shows that GBMCI consistently outperforms other methods based on a number of covariate settings. GBMCI is implemented in R and is freely available online.
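The concordance index that GBMCI targets can be illustrated in its plain, unsmoothed form. The sketch below is not the GBMCI implementation (which is in R and optimizes a smoothed approximation); it only counts comparable pairs as in Harrell's C, with tied risk scores credited 0.5. The function name and the data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style concordance index for right-censored data.

    A pair (i, j) is comparable when subject i has the shorter time and
    its event was observed. The pair is concordant when subject i also
    has the higher predicted risk; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical example: higher score should mean earlier event.
print(concordance_index([1, 2, 3], [1, 1, 1], [3.0, 2.0, 1.0]))
```

Because this pairwise count is a step function of the scores, it is not differentiable, which is one reason a smoothed approximation is needed for gradient-based training.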
Korell, Julia; Coulter, Carolyn V; Duffull, Stephen B
2011-12-21
The aim of this work is to compare different labelling methods that are commonly used to estimate the lifespan of red blood cells (RBCs), e.g. in anaemia of renal failure, where the effect of treatment with erythropoietin depends on the lifespan of RBCs. A previously developed model for the survival time of RBCs that accounts for plausible physiological processes of RBC destruction was used to simulate ideal random and cohort labelling methods for RBCs, as well as the flaws associated with these methods (e.g. reuse of label and loss of the label from the surviving RBCs). Random labelling with radioactive chromium and cohort labelling using heavy nitrogen were considered. Blood sampling times were determined for RBC survival studies using both labelling methods by applying the theory of optimal design. It was assessed whether the underlying parameter values of the model are estimable from these studies, and the precision of the parameter estimates was calculated. In theory, parameter estimation would be possible for both types of ideal labelling methods without flaws. However, flaws associated with random labelling are significant and not all parameters controlling RBC survival in the model can be estimated with good precision. In contrast, cohort labelling shows good precision in the parameter estimates even in the presence of reuse and prolonged incorporation of the label. A model based analysis of RBC survival studies is recommended in future to account for limitations in methodology as well as likely causes of RBC destruction.
Basic statistical tools in research and data analysis
Directory of Open Access Journals (Sweden)
Zulfiqar Ali
2016-01-01
Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. Statistical analysis gives meaning to otherwise meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.
Numeric computation and statistical data analysis on the Java platform
Chekanov, Sergei V
2016-01-01
Numerical computation, knowledge discovery and statistical data analysis integrated with powerful 2D and 3D graphics for visualization are the key topics of this book. The Python code examples powered by the Java platform can easily be transformed to other programming languages, such as Java, Groovy, Ruby and BeanShell. This book equips the reader with a computational platform which, unlike other statistical programs, is not limited by a single programming language. The author focuses on practical programming aspects and covers a broad range of topics, from basic introduction to the Python language on the Java platform (Jython), to descriptive statistics, symbolic calculations, neural networks, non-linear regression analysis and many other data-mining topics. He discusses how to find regularities in real-world data, how to classify data, and how to process data for knowledge discoveries. The code snippets are so short that they easily fit into single pages. Numeric Computation and Statistical Data Analysis ...
Revisit of 1997 TNM staging system--survival analysis of 1112 lung cancer patients in Taiwan.
Perng, Reury-Perng; Chen, Chih-Yi; Chang, Gee-Chen; Hsia, Te-Chun; Hsu, Nan-Yung; Tsai, Ying-Huang; Tsai, Chun-Ming; Yang, Chih-Hsin; Chen, Yuh-Min; Yu, Chong-Jen; Lee, Jen-Jyh; Hsu, Han-Shui; Yu, Chih-Teng; Kao, Eing-Long; Chiu, Chao-Hua
2007-01-01
There is neither a nation-wide nor a large-scale, multi-institutional lung cancer database available for stage-by-stage survival analysis in Taiwan at present. Using the data element provided by the International Association for the Study of Lung Cancer, the Taiwan Lung Cancer Society initiated a project to include native lung cancer patients into a global database. A total of 1112 Taiwan lung cancer patients treated in 7 medical centers were enrolled. In small cell lung cancer, patients with ipsilateral pleural effusion had a survival between those with locoregional disease alone and those with distant metastasis; however, the difference was not statistically significant (P = 0.204). In non-small cell lung cancer, tumor size had significant survival influence for patients as a whole (P < 0.001) but it did not support the further division of stage IA according to tumor size (P = 0.122). The survival was compatible in stage IIIB and IV patients and therefore, the survival impact of pleural effusion cannot be determined. In patients with pIIIA-N2 disease, those who had station 8 nodal metastasis had inferior survival (P = 0.020) and station 5 superior survival (P = 0.010). In patients with distant metastasis, bone, liver, or distant lymph node metastasis predicted an inferior survival (all P values < 0.05). The present study provides for comparison in this area a stage-by-stage reference for the survival of lung cancer patients. Some factors other than current TNM descriptors need to be further investigated in constructing the next version of the staging system.
Asher, Lucy; Harvey, Naomi D.; Green, Martin; England, Gary C. W.
2017-01-01
Epidemiology is the study of patterns of health-related states or events in populations. Statistical models developed for epidemiology could be usefully applied to behavioral states or events. The aim of this study is to present the application of epidemiological statistics to understand animal behavior where discrete outcomes are of interest, using data from guide dogs to illustrate. Specifically, survival analysis and multistate modeling are applied to data on guide dogs comparing dogs that completed training and qualified as a guide dog, to those that were withdrawn from the training program. Survival analysis allows estimation of the time to (or between) binary events and the probability of the event occurring at or beyond a specified time point. Survival analysis, using a Cox proportional hazards model, was used to examine the time taken to withdraw a dog from training. Sex, breed, and other factors affected time to withdrawal. Bitches were withdrawn faster than dogs, Labradors were withdrawn faster, and Labrador × Golden Retrievers slower, than Golden Retriever × Labradors; and dogs not bred by Guide Dogs were withdrawn faster than those bred by Guide Dogs. Multistate modeling (MSM) can be used as an extension of survival analysis to incorporate more than two discrete events or states. Multistate models were used to investigate transitions between states of training to qualification as a guide dog or behavioral withdrawal, and from qualification as a guide dog to behavioral withdrawal. Sex, breed (with purebred Labradors and Golden retrievers differing from F1 crosses), and bred by Guide Dogs or not, affected movements between states. We postulate that survival analysis and MSM could be applied to a wide range of behavioral data and key examples are provided. PMID:28804710
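The quantity described in the abstract above, the probability that the event has not yet occurred at or beyond a given time, is commonly estimated with the Kaplan-Meier product-limit method. The following is a minimal sketch of that estimator, not the authors' analysis; the function name `kaplan_meier` and the durations and event indicators are hypothetical and chosen for illustration only.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : follow-up durations, one per subject
    events : 1 if the event (e.g. withdrawal) was observed, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # subjects leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        at_risk -= leaving[t]
    return curve


# Hypothetical durations (weeks) and event indicators, for illustration only.
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [1, 1, 0, 1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Subjects censored at the same time as an event are counted as still at risk at that time, which is the usual convention.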
Vulnerability survival analysis: a novel approach to vulnerability management
Farris, Katheryn A.; Sullivan, John; Cybenko, George
2017-05-01
Computer security vulnerabilities span across large, enterprise networks and have to be mitigated by security engineers on a routine basis. Presently, security engineers will assess their "risk posture" through quantifying the number of vulnerabilities with a high Common Vulnerability Scoring System (CVSS) score. Yet, little to no attention is given to the length of time by which vulnerabilities persist and survive on the network. In this paper, we review a novel approach to quantifying the length of time a vulnerability persists on the network, its time-to-death, and predictors of lower vulnerability survival rates. Our contribution is unique in that we apply the Cox proportional hazards regression model to real data from an operational IT environment. This paper provides a mathematical overview of the theory behind survival analysis methods, a description of our vulnerability data, and an interpretation of the results.
Breast Cancer Heterogeneity: MR Imaging Texture Analysis and Survival Outcomes.
Kim, Jae-Hun; Ko, Eun Sook; Lim, Yaeji; Lee, Kyung Soo; Han, Boo-Kyung; Ko, Eun Young; Hahn, Soo Yeon; Nam, Seok Jin
2017-03-01
Purpose To determine the relationship between tumor heterogeneity assessed by means of magnetic resonance (MR) imaging texture analysis and survival outcomes in patients with primary breast cancer. Materials and Methods Between January and August 2010, texture analysis of the entire primary breast tumor in 203 patients was performed with T2-weighted and contrast material-enhanced T1-weighted subtraction MR imaging for preoperative staging. Histogram-based uniformity and entropy were calculated. To dichotomize texture parameters for survival analysis, the 10-fold cross-validation method was used to determine cutoff points in the receiver operating characteristic curve analysis. The Cox proportional hazards model and Kaplan-Meier analysis were used to determine the association of texture parameters and morphologic or volumetric information obtained at MR imaging or clinical-pathologic variables with recurrence-free survival (RFS). Results There were 26 events, including 22 recurrences (10 local-regional and 12 distant) and four deaths, with a mean follow-up time of 56.2 months. In multivariate analysis, a higher N stage (RFS hazard ratio, 11.15 [N3 stage]; P = .002, Bonferroni-adjusted α = .0167), triple-negative subtype (RFS hazard ratio, 16.91; P breast cancers that appeared more heterogeneous on T2-weighted images (higher entropy) and those that appeared less heterogeneous on contrast-enhanced T1-weighted subtraction images (lower entropy) exhibited poorer RFS. © RSNA, 2016 Online supplemental material is available for this article.
Prognostic and survival analysis of presbyopia: The healthy twin study
Lira, Adiyani; Sung, Joohon
2015-12-01
Presbyopia, a vision condition in which the eye loses its flexibility to focus on near objects, is part of the ageing process and mostly becomes perceptible in the early or mid 40s. Age is well known to be its major risk factor, while sex, alcohol, poor nutrition, and ocular and systemic diseases are known as common risk factors. However, many other variables might influence the prognosis. In this paper we therefore developed a prognostic model to estimate survival from presbyopia. Data on 1645 participants in the Healthy Twin Study, a prospective cohort study that has recruited Korean adult twins and their family members from a nation-wide registry at public health agencies since 2005, were collected and analyzed by univariate analysis and by a Cox proportional hazards model to reveal the prognostic factors for presbyopia, while survival curves were calculated by the Kaplan-Meier method. Besides age, sex, diabetes, and myopia, the proposed model shows that education level (especially an engineering program) also contributes to the occurrence of presbyopia. Generally, at 47 years of age the chance of developing presbyopia becomes higher, with a survival probability of less than 50%. Furthermore, stratifying the survival curves shows that MZ twins have shorter survival, with an average onset age of about 45.8 years, compared with 47.5 years for DZ twins and siblings. By identifying the factors that have the strongest effects on, and are mainly associated with, presbyopia, we expect to help design interventions to control or delay its onset.
Direct Survival Analysis: a new stock assessment method
Directory of Open Access Journals (Sweden)
Eduardo Ferrandis
2007-03-01
In this work, a new stock assessment method, Direct Survival Analysis, is proposed and described. The parameter estimation of the Weibull survival model proposed by Ferrandis (2007) is obtained using trawl survey data. This estimation is used to establish a baseline survival function, which is in turn used to estimate the specific survival functions in the different cohorts considered through an adaptation of the separable model of the fishing mortality rates introduced by Pope and Shepherd (1982). It is thus possible to test hypotheses on the evolution of survival during the period studied and to identify trends in recruitment. A link is established between the preceding analysis of trawl survey data and the commercial catch-at-age data that are generally obtained to evaluate the population using analytical models. The estimated baseline survival, with the proposed versions of the stock and catch equations and the adaptation of the Separable Model, may be applied to commercial catch-at-age data. This makes it possible to estimate the survival corresponding to the landing data, the initial size of the cohort and finally, an effective age of first capture, in order to complete the parameter model estimation and consequently the estimation of the whole survival and mortality, along with the reference parameters that are useful for management purposes. Alternatively, this estimation of an effective age of first capture may be obtained by adapting the demographic structure of trawl survey data to that of the commercial fleet through suitable selectivity models of the commercial gears. The complete model provides the evaluation of the stock at any age. The coherence (and hence the mutual “calibration”) between the two kinds of information may be analysed and compared with results obtained by other methods, such as virtual population analysis (VPA), in order to improve the diagnosis of the state of exploitation of the population. The model may be
Simulation Experiments in Practice: Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic DOE and regression analysis assume a single simulation response that is normally and independen...
ALGORITHM OF PRIMARY STATISTICAL ANALYSIS OF ARRAYS OF EXPERIMENTAL DATA
Directory of Open Access Journals (Sweden)
LAUKHIN D. V.
2017-02-01
Annotation. Purpose. Construction of an algorithm for preliminary (primary) estimation of arrays of experimental data for further obtaining a mathematical model of the process under study. Methodology. The use of the main regularities of the theory of processing arrays of experimental values in the initial analysis of data. Originality. An algorithm for performing a primary statistical analysis of the arrays of experimental data is given. Practical value. Development of methods for revealing statistically unreliable values in arrays of experimental data for the purpose of their subsequent detailed analysis and construction of a mathematical model of the studied processes.
Statistical analysis of planktic foraminifera of the surface Continental ...
African Journals Online (AJOL)
Planktic foraminiferal assemblage recorded from selected samples obtained from shallow continental shelf sediments off southwestern Nigeria were subjected to statistical analysis. The Principal Component Analysis (PCA) was used to determine variants of planktic parameters. Values obtained for these parameters were ...
PRECISE - pregabalin in addition to usual care: Statistical analysis plan
S. Mathieson (Stephanie); L. Billot (Laurent); C. Maher (Chris); A.J. McLachlan (Andrew J.); J. Latimer (Jane); B.W. Koes (Bart); M.J. Hancock (Mark J.); I. Harris (Ian); R.O. Day (Richard O.); J. Pik (Justin); S. Jan (Stephen); C.-W.C. Lin (Chung-Wei Christine)
2016-01-01
Background: Sciatica is a severe, disabling condition that lacks high quality evidence for effective treatment strategies. This a priori statistical analysis plan describes the methodology of analysis for the PRECISE study. Methods/design: PRECISE is a prospectively registered, double
HistFitter software framework for statistical data analysis
Baak, M.; Côte, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-01-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with mu...
A Divergence Statistics Extension to VTK for Performance Analysis
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bennett, Janine Camille [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2015-02-01
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13]), where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit (VTK) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a statistical manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new divergence statistics engine is illustrated by the means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
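To make the idea of a divergence statistic concrete, here is a minimal sketch that measures the discrepancy between an observed empirical distribution and a theoretical one. It assumes the Kullback-Leibler divergence as the measure, which the abstract does not name, and it does not use or reproduce the VTK API; the function `kl_divergence` and the sample data are hypothetical.

```python
import math
from collections import Counter


def kl_divergence(observed, theoretical):
    """Kullback-Leibler divergence D(P || Q) in nats.

    observed    : list of sampled discrete values (defines the empirical P)
    theoretical : dict mapping each value to its probability under Q
    """
    counts = Counter(observed)
    n = len(observed)
    total = 0.0
    for value, count in counts.items():
        p = count / n
        q = theoretical.get(value, 0.0)
        if q == 0.0:
            # An observed value that Q deems impossible: infinite divergence.
            return math.inf
        total += p * math.log(p / q)
    return total


# Hypothetical example: four coin flips compared with a fair coin.
fair_coin = {"H": 0.5, "T": 0.5}
balanced = kl_divergence(["H", "T", "H", "T"], fair_coin)
skewed = kl_divergence(["H", "H", "H", "T"], fair_coin)
print(balanced, skewed)
```

The divergence is zero when the empirical frequencies match the theoretical probabilities exactly and grows as they drift apart; unlike a true distance it is not symmetric in its two arguments.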
Statistical analysis of spatial and spatio-temporal point patterns
Diggle, Peter J
2013-01-01
Written by a prominent statistician and author, the first edition of this bestseller broke new ground in the then emerging subject of spatial statistics with its coverage of spatial point patterns. Retaining all the material from the second edition and adding substantial new material, Statistical Analysis of Spatial and Spatio-Temporal Point Patterns, Third Edition presents models and statistical methods for analyzing spatially referenced point process data. Reflected in the title, this third edition now covers spatio-temporal point patterns. It explores the methodological developments from th
Longitudinal data analysis a handbook of modern statistical methods
Fitzmaurice, Garrett; Verbeke, Geert; Molenberghs, Geert
2008-01-01
Although many books currently available describe statistical models and methods for analyzing longitudinal data, they do not highlight connections between various research threads in the statistical literature. Responding to this void, Longitudinal Data Analysis provides a clear, comprehensive, and unified overview of state-of-the-art theory and applications. It also focuses on the assorted challenges that arise in analyzing longitudinal data. After discussing historical aspects, leading researchers explore four broad themes: parametric modeling, nonparametric and semiparametric methods, joint
Gene expression meta-analysis identifies chromosomal regions involved in ovarian cancer survival
DEFF Research Database (Denmark)
Thomassen, Mads; Jochumsen, Kirsten M; Mogensen, Ole
2009-01-01
Ovarian cancer cells exhibit complex karyotypic alterations causing deregulation of numerous genes. Some of these genes are probably causal for cancer formation and local growth, whereas others are causal for metastasis and recurrence. By using publicly available data sets, we have investigated...... the relation of gene expression and chromosomal position to identify chromosomal regions of importance for early recurrence of ovarian cancer. By use of Gene Set Enrichment Analysis, we have ranked chromosomal regions according to their association with survival. Over-representation analysis including 1...... summarized mutation load in these regions by a combined mutation score that is statistically significantly associated with survival by analysis in the data sets used for identification of the regions. Furthermore, the prognostic value of the combined mutation score was validated in an independent large data set...
Datema, Frank R; Moya, Ana; Krause, Peter; Bäck, Thomas; Willmes, Lars; Langeveld, Ton; Baatenburg de Jong, Robert J; Blom, Henk M
2012-01-01
Electronic patient files generate an enormous amount of medical data. These data can be used for research, such as prognostic modeling. Automatization of statistical prognostication processes allows automatic updating of models when new data is gathered. The increase of power behind an automated prognostic model makes its predictive capability more reliable. Cox proportional hazard regression is most frequently used in prognostication. Automatization of a Cox model is possible, but we expect the updating process to be time-consuming. A possible solution lies in an alternative modeling technique called random survival forests (RSFs). RSF is easily automated and is known to handle the proportionality assumption coherently and automatically. Performance of RSF has not yet been tested on a large head and neck oncological dataset. This study investigates performance of head and neck overall survival of RSF models. Performances are compared to a Cox model as the "gold standard." RSF might be an interesting alternative modeling approach for automatization when performances are similar. RSF models were created in R (Cox also in SPSS). Four RSF splitting rules were used: log-rank, conservation of events, log-rank score, and log-rank approximation. Models were based on historical data of 1371 patients with primary head-and-neck cancer, diagnosed between 1981 and 1998. Models contain 8 covariates: tumor site, T classification, N classification, M classification, age, sex, prior malignancies, and comorbidity. Model performances were determined by Harrell's concordance error rate, in which 33% of the original data served as a validation sample. RSF and Cox models delivered similar error rates. The Cox model performed slightly better (error rate, 0.2826). The log-rank splitting approach gave the best RSF performance (error rate, 0.2873). In accord with Cox and RSF models, high T classification, high N classification, and severe comorbidity are very important covariates in the
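The error rates quoted in the abstract above are based on Harrell's concordance. As an illustration only, the sketch below computes Harrell's concordance index C for right-censored data and reports the error rate as 1 − C, which is an assumption about how the study defines it. The function name `harrell_c` and the data are hypothetical; ties in event time are simply treated as not comparable.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       : follow-up durations
    events      : 1 if death was observed, 0 if censored
    risk_scores : model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data and predicted risks, for illustration only.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.5, 0.1]
c_index = harrell_c(times, events, risk_scores)
error_rate = 1.0 - c_index
print(c_index, error_rate)
```

A C of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so under this convention an error rate near 0.28 corresponds to a C near 0.72.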
Zain, Zakiyah; Aziz, Nazrina; Ahmad, Yuhaniz; Azwan, Zairul; Raduan, Farhana; Sagap, Ismail
2014-12-01
Colorectal cancer is the third and the second most common cancer worldwide in men and women respectively, and the second in Malaysia for both genders. Surgery, chemotherapy and radiotherapy are among the options available for treatment of patients with colorectal cancer. In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve several responses or endpoints, and this situation complicates the analysis. In the case of colorectal cancer, sets of responses concerned with survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For a patient, the time to recurrence is correlated with overall survival. In this study, global score test methodology is used to combine the univariate score statistics for comparing treatments with respect to each survival endpoint into a single statistic. The data on tumor recurrence and overall survival of colorectal cancer patients are taken from a Malaysian hospital. The results are found to be similar to those computed using the established Wei, Lin and Weissfeld method. Key factors such as ethnicity, gender, age and stage at diagnosis are also reported.
Adler, I D; Bootman, J; Favor, J; Hook, G; Schriever-Schwemmer, G; Welzl, G; Whorton, E; Yoshimura, I; Hayashi, M
1998-09-01
A workshop was held on September 13 and 14, 1993, at the GSF, Neuherberg, Germany, to start a discussion of experimental design and statistical analysis issues for three in vivo mutagenicity test systems, the micronucleus test in mouse bone marrow/peripheral blood, the chromosomal aberration tests in mouse bone marrow/differentiating spermatogonia, and the mouse dominant lethal test. The discussion has now come to conclusions which we would like to make generally known. Rather than dwell upon specific statistical tests which could be used for data analysis, serious consideration was given to test design. However, the test design, its power of detecting a given increase of adverse effects and the test statistics are interrelated. Detailed analyses of historical negative control data led to important recommendations for each test system. Concerning the statistical sensitivity parameters, a type I error of 0.05 (one-tailed), a type II error of 0.20 and a dose-related increase of twice the background (negative control) frequencies were generally adopted. It was recommended that sufficient observations (cells, implants) be planned for each analysis unit (animal) so that at least one adverse outcome (micronucleus, aberrant cell, dead implant) would likely be observed. The treated animal was the smallest unit of analysis allowed. On the basis of these general considerations the sample size was determined for each of the three assays. A minimum of 2000 immature erythrocytes/animal should be scored for micronuclei from each of at least 4 animals in each comparison group in the micronucleus assays. A minimum of 200 cells should be scored for chromosomal aberrations from each of at least 5 animals in each comparison group in the aberration assays. In the dominant lethal test, a minimum of 400 implants (40-50 pregnant females) are required per dose group for each mating period. The analysis unit for the dominant lethal test would be the treated male unless the background
[Prognostic factors in renal cancer with venous thrombus survival analysis.]
Pascual-Fernández, Angela; Calleja-Escudero, Jesús; Gómez de Segura, Cristina; Pesquera-Ortega, Laura; Taylor, James; Fajardo, José Antonio; González de Zárate, Javier; Monllor-Gisbert, Jesús; Cortiñas-González, José Ramón
2017-07-01
To analyze surgery for renal cancer with venous thrombus at different levels, perioperative complications and prognostic factors associated with overall, cancer-specific and disease-free survival. Retrospective analysis of 42 cases of renal cancer with venous thrombus operated on between 2005 and 2015. The level reached by the thrombus was established according to the Mayo Clinic classification. Postoperative complications were staged according to the Clavien-Dindo classification. The disease was most frequent in males. Mean age was 65.7 years. 16.6% were tumors with level II thrombus. A subcostal approach was used in 58.9%. Extracorporeal circulation with cardiac arrest and hypothermia was established in 2 patients. Resection of metastatic disease was performed in 3 patients during radical nephrectomy. The reoperation rate was 2.3%, while perioperative mortality was 4.7%. 30% presented with metastases at diagnosis. Twenty patients progressed at 15.5 months (3-55). Overall survival was 60 months. The cancer-specific mortality was 75%. Disease-free survival was 30% at 55 months. Surgical treatment of renal cancer with venous thrombus requires multidisciplinary management. The surgical technique varies according to the level reached by the venous thrombus. Tumor stage is the most important prognostic factor. Thrombus level influences prognosis, with longer survival for patients with tumor thrombus confined to the renal vein (pT3a) in comparison to tumors with thrombus in the atrium (pT3c).
Towards proper sampling and statistical analysis of defects
Directory of Open Access Journals (Sweden)
Cetin Ali
2014-06-01
Advancements in applied statistics with great relevance to defect sampling and analysis are presented. Three main issues are considered: (i) proper handling of multiple defect types, (ii) relating sample data originating from polished inspection surfaces (2D) to finite material volumes (3D), and (iii) application of advanced extreme value theory in statistical analysis of block maximum data. Original and rigorous, but practical mathematical solutions are presented. Finally, these methods are applied to make predictions regarding defect sizes in a steel alloy containing multiple defect types.
Statistical analysis of hydroclimatic time series: Uncertainty and insights
Koutsoyiannis, Demetris; Montanari, Alberto
2007-05-01
Today, hydrologic research and modeling depends largely on climatological inputs, whose physical and statistical behavior are the subject of many debates in the scientific community. A relevant ongoing discussion is focused on long-term persistence (LTP), a natural behavior identified in several studies of instrumental and proxy hydroclimatic time series, which, nevertheless, is neglected in some climatological studies. LTP may reflect a long-term variability of several factors and thus can support a more complete physical understanding and uncertainty characterization of climate. The implications of LTP in hydroclimatic research, especially in statistical questions and problems, may be substantial but appear to be not fully understood or recognized. To offer insights on these implications, we demonstrate by using analytical methods that the characteristics of temperature series, which appear to be compatible with the LTP hypothesis, imply a dramatic increase of uncertainty in statistical estimation and reduction of significance in statistical testing, in comparison with classical statistics. Therefore we maintain that statistical analysis in hydroclimatic research should be revisited in order not to derive misleading results and simultaneously that merely statistical arguments do not suffice to verify or falsify the LTP (or another) climatic hypothesis.
Directory of Open Access Journals (Sweden)
Irit Ben-Aharon
The role of bisphosphonates (BP) in early breast cancer (BC) has been considered controversial. We performed a meta-analysis of all randomized controlled trials (RCTs) that appraised the effects of BP on survival in early BC. RCTs were identified by searching the Cochrane Library, MEDLINE databases and conference proceedings. Hazard ratios (HRs) of overall survival (OS) and disease-free survival (DFS), and relative risks of adverse events, were estimated and pooled. Thirteen trials met the inclusion criteria, evaluating a total of 15,762 patients. Meta-analysis of ten trials which reported OS revealed no statistically significant benefit in OS for BP (HR 0.89, 95% CI = 0.79 to 1.01). Meta-analysis of nine trials which reported DFS revealed no benefit in DFS (HR 0.95 (0.81-1.12)). Meta-analysis by menopausal status showed a statistically significantly better DFS in the BP-treated patients (HR 0.81 (0.69-0.95)). In meta-regression, chemotherapy was negatively associated with HR of survival. Our meta-analysis indicates a positive effect for adjuvant BP on survival only in postmenopausal patients. Meta-regression demonstrated a negative association between chemotherapy use and BP effect on survival. Further large-scale RCTs are warranted to unravel the specific subgroups that would benefit from the addition of BP in the adjuvant setting.
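As an illustration of how hazard ratios can be pooled, the sketch below uses fixed-effect inverse-variance weighting on the log scale, recovering each standard error from the reported 95% CI. The abstract does not say which pooling model the authors used, so this is an assumption; the function names and any input numbers are hypothetical and are not taken from the trials.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_hr_and_se(hr: float, lower: float, upper: float) -> tuple[float, float]:
    """Convert a hazard ratio and its 95% CI to (log HR, standard error)."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z95)
    return math.log(hr), se


def pool_fixed_effect(studies: list[tuple[float, float, float]]) -> tuple[float, float, float]:
    """Inverse-variance pooled HR with 95% CI from (hr, lower, upper) tuples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, lower, upper in studies:
        log_hr, se = log_hr_and_se(hr, lower, upper)
        weight = 1.0 / se ** 2          # more precise studies weigh more
        weighted_sum += weight * log_hr
        weight_total += weight
    pooled = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (math.exp(pooled),
            math.exp(pooled - Z95 * pooled_se),
            math.exp(pooled + Z95 * pooled_se))
```

Pooling two identical hypothetical studies leaves the point estimate unchanged and narrows the interval, which is the basic reason a pooled estimate can be more precise than any single trial.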
Data analysis using the Gnu R system for statistical computation
Energy Technology Data Exchange (ETDEWEB)
Simone, James; /Fermilab
2011-07-01
R is a language system for statistical computation. It is widely used in statistics, bioinformatics, machine learning, data mining, quantitative finance, and the analysis of clinical drug trials. Among the advantages of R are: it has become the standard language for developing statistical techniques, it is being actively developed by a large and growing global user community, it is open source software, it is highly portable (Linux, OS-X and Windows), it has a built-in documentation system, it produces high quality graphics and it is easily extensible with over four thousand extension library packages available covering statistics and applications. This report gives a very brief introduction to R with some examples using lattice QCD simulation results. It then discusses the development of R packages designed for chi-square minimization fits for lattice n-pt correlation functions.
Statistical analysis of fNIRS data: a comprehensive review.
Tak, Sungho; Ye, Jong Chul
2014-01-15
Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.
Bayesian analysis: a new statistical paradigm for new technology.
Grunkemeier, Gary L; Payne, Nicola
2002-12-01
Full Bayesian analysis is an alternative statistical paradigm, as opposed to traditionally used methods, usually called frequentist statistics. Bayesian analysis is controversial because it requires assuming a prior distribution, which can be arbitrarily chosen; thus there is a subjective element, which is considered to be a major weakness. However, this could also be considered a strength since it provides a formal way of incorporating prior knowledge. Since it is flexible and permits repeated looks at evolving data, Bayesian analysis is particularly well suited to the evaluation of new medical technology. Bayesian analysis can refer to a range of things: from a simple, noncontroversial formula for inverting probabilities to an alternative approach to the philosophy of science. Its advantages include: (1) providing direct probability statements--which are what most people wrongly assume they are getting from conventional statistics; (2) formally incorporating previous information in statistical inference of a data set, a natural approach which we follow in everyday reasoning; and (3) flexible, adaptive research designs allowing multiple looks at accumulating study data. Its primary disadvantage is the element of subjectivity which some think is not scientific. We discuss and compare frequentist and Bayesian approaches and provide three examples of Bayesian analysis: (1) EKG interpretation, (2) a coin-tossing experiment, and (3) assessing the thromboembolic risk of a new mechanical heart valve.
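The coin-tossing example mentioned above can be sketched with a conjugate Beta prior, which makes the "prior knowledge plus data" update explicit. This is a minimal illustration, not the authors' worked example; the prior parameters and toss counts are hypothetical.

```python
def beta_binomial_update(prior_a: float, prior_b: float,
                         heads: int, tails: int) -> tuple[float, float]:
    """Posterior Beta(a, b) parameters after observing coin tosses.

    A Beta(prior_a, prior_b) prior on P(heads) combined with binomial data
    yields a Beta(prior_a + heads, prior_b + tails) posterior.
    """
    return prior_a + heads, prior_b + tails


def beta_mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical data: uniform prior Beta(1, 1), then 7 heads and 3 tails.
posterior = beta_binomial_update(1.0, 1.0, heads=7, tails=3)
posterior_mean = beta_mean(*posterior)
```

Because the posterior from one batch of tosses can serve as the prior for the next, the same update supports the repeated looks at accumulating data described in the abstract.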
Analysis of room transfer function and reverberant signal statistics
DEFF Research Database (Denmark)
Georganti, Eleftheria; Mourjopoulos, John; Jacobsen, Finn
2008-01-01
For some time now, statistical analysis has been a valuable tool in analyzing room transfer functions (RTFs). This work examines existing statistical time-frequency models and techniques for RTF analysis (e.g., Schroeder's stochastic model and the standard deviation over frequency bands for the RTF magnitude and phase). RTF fractional octave smoothing, as with 1/3-octave analysis, may lead to RTF simplifications that can be useful for several audio applications, like room compensation, room modeling, and auralisation purposes. The aim of this work is to identify the relationship of optimal response... “anechoic” and “reverberant” audio speech signals, in order to model the alterations due to room acoustics. The above results are obtained from both in-situ room response measurements and controlled acoustical response simulations.
Survival Analysis in Patients with Non- metastatic Squamous Cell Carcinoma of the Urinary Bladder
Directory of Open Access Journals (Sweden)
Ahmed M. Abdel-Rahim
2011-04-01
Background: We conducted a retrospective analysis to evaluate overall survival (OAS) and disease-free survival (DFS) rates in patients with squamous cell carcinoma of the urinary bladder according to different prognostic factors. Methods: This retrospective study analyzed the medical records of patients with non-metastatic squamous cell carcinoma of the urinary bladder. All men underwent radical cystectomy and women underwent anterior pelvic exenteration. Most patients had postoperative radiation therapy. The log-rank test examined differences in OAS and DFS rates. Results: The medical records of 106 patients were analyzed. The median follow-up from the date of enrollment was 30 months and ranged from 2 to 73 months. For the entire group, the three-year OAS rate was 46.9% and the DFS rate was 44%. For patients with P2 (tumor invasion into the muscularis propria) the three-year OAS rate was 53%, for P3 (tumor invasion into perivesical fat) it was 45%, and for P4 (tumor invasion into adjacent organs, pelvic wall or abdominal wall) it was 9%. The difference in OAS rates was statistically significant, in favor of P2 disease (P=0.0041). The three-year DFS rate was 50% for P2, 45% for P3 and 9% for P4 disease (P=0.0125). Administration of postoperative radiotherapy did not result in a statistically significant improvement in three-year OAS and DFS rates. Conclusion: Survival rates were significantly higher in patients with P2 and P3 disease than in those with P4 disease. Adjuvant radiotherapy did not result in a statistically significant survival improvement.
Predicting survival of Salmonella in low-water activity foods: an analysis of literature data.
Santillana Farakos, Sofia M; Schaffner, Donald W; Frank, Joseph F
2014-09-01
Factors such as temperature, water activity (aw), substrate, culture media, serotype, and strain influence the survival of Salmonella in low-aw foods. Predictive models for Salmonella survival in low-aw foods at temperatures ranging from 21 to 80 °C and water activities below 0.6 were previously developed. Literature data on survival of Salmonella in low-aw foods were analyzed in the present study to validate these predictive models and to determine global influencing factors. The results showed the Weibull model provided suitable fits to the data in 75% of the curves as compared with the log-linear model. The secondary models predicting the time required for log-decimal reduction (log δ) and shape factor (log β) values were useful in predicting the survival of Salmonella in low-aw foods. Statistical analysis indicated overall fail-safe secondary models, with 88% of the residuals in the acceptable and safe zones (survival kinetics of Salmonella in low-aw foods and its influencing factors.
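For readers unfamiliar with the two primary models compared above, the sketch below uses the common Weibull survival form log10(N/N0) = -(t/δ)^β, in which δ is the time to the first decimal reduction and β is the shape factor; the log-linear model is the special case β = 1. The exact parameterization used by the authors is not given in the abstract, so this form and the function name are assumptions.

```python
def log10_survival_weibull(t: float, delta: float, beta: float) -> float:
    """Log10 survival ratio log10(N/N0) at time t under the Weibull model.

    delta: time required for the first log-decimal reduction (same units as t)
    beta:  shape factor; beta < 1 gives tailing, beta > 1 gives a shoulder,
           beta == 1 reduces to the log-linear model.
    """
    if delta <= 0 or beta <= 0 or t < 0:
        raise ValueError("delta and beta must be positive and t non-negative")
    return -((t / delta) ** beta)
```

With β = 0.5 the curve tails: reaching a 2-log reduction takes four times δ rather than twice δ, which a log-linear fit would not capture.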
A novel statistic for genome-wide interaction analysis.
Directory of Open Access Journals (Sweden)
Xuesen Wu
2010-09-01
Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interaction analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001
Kalil, Andre C; Florescu, Diana F
2013-07-04
Despite the same manufacturer, the same drotrecogin alfa activated dose, and the same placebo-controlled design, the negative result from the PROWESS-SHOCK trial contradicted the survival benefit observed in the PROWESS trial. We hypothesize that the different results were due to factors other than the experimental therapy and performed an analysis of the clinical heterogeneity (differences related to the trials' clinical aspects) and the statistical heterogeneity (differences related to the trials' statistical aspects) between these trials. Baseline characteristics and co-interventions were analyzed by chi-square testing and mortality was analyzed by random-effects modeling and I2. Our findings show that clinical variables presented significant heterogeneity, and that up to 90% of the mortality differences between both trials were not due to chance. These results demonstrate that PROWESS and PROWESS-SHOCK are not comparable trials due to the highly significant clinical and statistical heterogeneity. We propose a new and pragmatic solution.
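The I2 statistic mentioned above is conventionally computed from Cochran's Q as the share of between-study variation not attributable to chance. The sketch below shows that standard formula; the function names are hypothetical and no trial data are reproduced.

```python
def cochran_q(effects: list[float], std_errors: list[float]) -> float:
    """Cochran's Q: weighted squared deviations from the pooled effect."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q: float, n_studies: int) -> float:
    """I^2 = max(0, (Q - df) / Q), with df = n_studies - 1."""
    df = n_studies - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q)
```

For two studies, a hypothetical Q of 10 gives I2 = 0.9, the kind of value that would be read as about 90% of the observed difference being due to heterogeneity rather than chance.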
Statistical Compilation of the ICT Sector and Policy Analysis | IDRC ...
International Development Research Centre (IDRC) Digital Library (Canada)
Statistical Compilation of the ICT Sector and Policy Analysis. As the presence and influence of information and communication technologies (ICTs) continues to widen and deepen, so too does its impact on economic development. However, much work needs to be done before the linkages between economic development ...
Statistical analysis of wind speed for electrical power generation in ...
African Journals Online (AJOL)
HOD
are employed to fit wind speed data of some selected sites in Northern Nigeria. This is because the design of wind energy conversion systems depends on the correct analysis of the site's renewable energy resources [13]. In addition, the statistical judgements are based on the accuracy in fitting the available data at the sites.
Using multivariate statistical analysis to assess changes in water ...
African Journals Online (AJOL)
Multivariate statistical analysis was used to investigate changes in water chemistry at 5 river sites in the Vaal Dam catchment, draining the Highveld grasslands. These grasslands receive more than 8 kg sulphur (S) ha⁻¹·year⁻¹ and 6 kg nitrogen (N) ha⁻¹·year⁻¹ via atmospheric deposition. It was hypothesised that between ...
A Statistical Analysis of Women's Perceptions on Politics and Peace ...
African Journals Online (AJOL)
This article is a statistical analysis of the perception that more women in politics would enhance peace building. The data was drawn from a comparative survey of 325 women and four men (community leaders) in the regions of the Niger Delta (Nigeria) and KwaZulu-Natal (South Africa). According to the findings, the ...
Stige, Leif Chr.; Langangen, Øystein; Yaragina, Natalia A.; Vikebø, Frode B.; Bogstad, Bjarte; Ottersen, Geir; Stenseth, Nils Chr.; Hjermann, Dag Ø.
2015-05-01
Understanding the causes of the large interannual fluctuations in the recruitment to many marine fishes is a key challenge in fisheries ecology. We here propose that the combination of mechanistic and statistical modelling of the pelagic early life stages (ELS) prior to recruitment can be a powerful approach for improving our understanding of local-scale and population-scale dynamics. Specifically, this approach allows separating effects of ocean transport and survival, and thereby enhances the knowledge of the processes that regulate recruitment. We analyse data on the pelagic eggs, larvae and post-larvae of Northeast Arctic cod and on copepod nauplii, the main prey of the cod larvae. The data originate from two surveys, one in spring and one in summer, for 30 years. A coupled physical-biological model is used to simulate the transport, ambient temperature and development of cod ELS from spawning through spring and summer. The predictions from this model are used as input in a statistical analysis of the summer data, to investigate effects of covariates thought to be linked to growth and survival. We find significant associations between the local-scale ambient copepod nauplii concentration and temperature in spring and the local-scale occurrence of cod (post)larvae in summer, consistent with effects on survival. Moreover, years with low copepod nauplii concentrations and low temperature in spring are significantly associated with lower mean length of the cod (post)larvae in summer, likely caused in part by higher mortality leading to increased dominance of young and hence small individuals. Finally, we find that the recruitment at age 3 is strongly associated with the mean body length of the cod ELS, highlighting the biological significance of the findings.
Multivariate statistical analysis of atom probe tomography data.
Parish, Chad M; Miller, Michael K
2010-10-01
The application of spectrum imaging multivariate statistical analysis methods, specifically principal component analysis (PCA), to atom probe tomography (APT) data has been investigated. The mathematical method of analysis is described and the results for two example datasets are analyzed and presented. The first dataset is from the analysis of a PM 2000 Fe-Cr-Al-Ti steel containing two different ultrafine precipitate populations. PCA properly describes the matrix and precipitate phases in a simple and intuitive manner. A second APT example is from the analysis of an irradiated reactor pressure vessel steel. Fine, nm-scale Cu-enriched precipitates having a core-shell structure were identified and qualitatively described by PCA. Advantages, disadvantages, and future prospects for implementing these data analysis methodologies for APT datasets, particularly with regard to quantitative analysis, are also discussed. Copyright 2010 Elsevier B.V. All rights reserved.
Bayesian survival analysis in clinical trials: What methods are used in practice?
Brard, Caroline; Le Teuff, Gwénaël; Le Deley, Marie-Cécile; Hampson, Lisa V
2017-02-01
Background Bayesian statistics are an appealing alternative to the traditional frequentist approach to designing, analysing, and reporting of clinical trials, especially in rare diseases. Time-to-event endpoints are widely used in many medical fields. There are additional complexities to designing Bayesian survival trials which arise from the need to specify a model for the survival distribution. The objective of this article was to critically review the use and reporting of Bayesian methods in survival trials. Methods A systematic review of clinical trials using Bayesian survival analyses was performed through PubMed and Web of Science databases. This was complemented by a full text search of the online repositories of pre-selected journals. Cost-effectiveness, dose-finding studies, meta-analyses, and methodological papers using clinical trials were excluded. Results In total, 28 articles met the inclusion criteria, 25 were original reports of clinical trials and 3 were re-analyses of a clinical trial. Most trials were in oncology (n = 25), were randomised controlled (n = 21) phase III trials (n = 13), and half considered a rare disease (n = 13). Bayesian approaches were used for monitoring in 14 trials and for the final analysis only in 14 trials. In the latter case, Bayesian survival analyses were used for the primary analysis in four cases, for the secondary analysis in seven cases, and for the trial re-analysis in three cases. Overall, 12 articles reported fitting Bayesian regression models (semi-parametric, n = 3; parametric, n = 9). Prior distributions were often incompletely reported: 20 articles did not define the prior distribution used for the parameter of interest. Over half of the trials used only non-informative priors for monitoring and the final analysis (n = 12) when it was specified. Indeed, no articles fitting Bayesian regression models placed informative priors on the parameter of interest. The prior for the treatment
Using Multivariate Statistical Analysis for Grouping of State Forest Enterprises
Directory of Open Access Journals (Sweden)
Atakan Öztürk
2010-11-01
The purpose of this study was to investigate the potential use of multivariate statistical analysis methods for grouping Forest Enterprises. The study involved 24 Forest Enterprises in the Eastern Black Sea Region. A total of 69 variables, classified as physical, economic, social, rural settlement, technical-managerial, and functional variables, were developed. Multivariate statistical methods such as factor, cluster and discriminant analyses were used to classify the 24 Forest Enterprises. The enterprises were classified into 2 groups: 22 enterprises fell in the first group and the remaining 2 in the second group.
Network similarity and statistical analysis of earthquake seismic data
Deyasi, Krishanu; Chakraborty, Abhijit; Banerjee, Anirban
2017-09-01
We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We calculate the conditional probability of the forthcoming occurrences of earthquakes in each region. The conditional probability of each event has been compared with their stationary distribution.
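The Jensen-Shannon divergence used for the clustering step can be sketched for two discrete distributions. In the paper it is applied to graph spectra; here the inputs are simply two normalized histograms on the same bins, which is an assumption made for illustration, and the function names are hypothetical.

```python
import math


def kl_divergence(p: list[float], q: list[float]) -> float:
    """Kullback-Leibler divergence KL(p || q) in nats; 0 * log(0) is taken as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def jensen_shannon_divergence(p: list[float], q: list[float]) -> float:
    """Symmetric, bounded divergence between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same bins")
    mixture = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
```

Unlike the Kullback-Leibler divergence, this quantity is symmetric and bounded above by ln 2, which makes its square root usable as a distance for hierarchical clustering.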
A Gradient Boosting Algorithm for Survival Analysis via Direct Optimization of Concordance Index
Directory of Open Access Journals (Sweden)
Yifei Chen
2013-01-01
statistical models have been proposed for survival analysis. They often impose strong assumptions on hazard functions, which describe how the risk of an event changes over time depending on covariates associated with each individual. In particular, the prevalent proportional hazards model assumes that covariates are multiplicatively related to the hazard. Here we propose a nonparametric model for survival analysis that does not explicitly assume particular forms of hazard functions. Our nonparametric model utilizes an ensemble of regression trees to determine how the hazard function varies according to the associated covariates. The ensemble model is trained using a gradient boosting method to optimize a smoothed approximation of the concordance index, which is one of the most widely used metrics in survival model performance evaluation. We implemented our model in a software package called GBMCI (gradient boosting machine for concordance index) and benchmarked the performance of our model against other popular survival models with a large-scale breast cancer prognosis dataset. Our experiment shows that GBMCI consistently outperforms other methods based on a number of covariate settings. GBMCI is implemented in R and is freely available online.
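For context, the plain (unsmoothed) concordance index can be sketched as follows. This is not the GBMCI implementation, which is in R and optimizes a smoothed approximation; it is a minimal illustration of the metric itself, with hypothetical function and argument names.

```python
def concordance_index(times: list[float], events: list[int],
                      risk_scores: list[float]) -> float:
    """Fraction of comparable pairs ordered correctly by predicted risk.

    A pair (i, j) is comparable when subject i has an observed event
    (events[i] == 1) strictly before time j. It is concordant when the
    subject who failed earlier has the higher risk score; ties count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to uninformative scores, and the step-like pair comparison is what makes a smoothed surrogate necessary for gradient-based training.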
Explorations in statistics: the analysis of ratios and normalized data.
Curran-Everett, Douglas
2013-09-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of Explorations in Statistics explores the analysis of ratios and normalized (or standardized) data. As researchers, we compute a ratio (a numerator divided by a denominator) to compute a proportion for some biological response or to derive some standardized variable. In each situation, we want to control for differences in the denominator when the thing we really care about is the numerator. But there is peril lurking in a ratio: only if the relationship between numerator and denominator is a straight line through the origin will the ratio be meaningful. If not, the ratio will misrepresent the true relationship between numerator and denominator. In contrast, regression techniques, which include analysis of covariance, are versatile: they can accommodate an analysis of the relationship between numerator and denominator when a ratio is useless.
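The peril described above is easy to see numerically. In the sketch below the numerator is an exact linear function of the denominator, with hypothetical intercept and slope values: the ratio is constant only when the intercept is zero.

```python
def ratios(denominators: list[float], intercept: float, slope: float) -> list[float]:
    """Ratio numerator/denominator when numerator = intercept + slope * denominator."""
    return [(intercept + slope * x) / x for x in denominators]


# Line through the origin: the ratio fully removes the denominator's effect.
through_origin = ratios([1.0, 2.0, 4.0], intercept=0.0, slope=3.0)

# Non-zero intercept: the ratio still depends on the denominator.
with_intercept = ratios([1.0, 2.0, 4.0], intercept=2.0, slope=3.0)
```

In the second case the ratio falls as the denominator grows even though the underlying relationship is perfectly linear, so comparing groups by their ratios would confound the response with the size of the denominator.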
Statistical analysis and interpolation of compositional data in materials science.
Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M
2015-02-09
Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
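Two basic operations of compositional data analysis, closure and the centered log-ratio (clr) transform, can be sketched as follows. This is a generic illustration under the usual definitions, not the authors' code; the part values are hypothetical.

```python
import math


def closure(parts: list[float], total: float = 1.0) -> list[float]:
    """Rescale strictly positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(composition: list[float]) -> list[float]:
    """Centered log-ratio transform: log of each part over the geometric mean."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


full = closure([2.0, 3.0, 5.0])   # three-part parent composition
sub = closure(full[:2])           # subcomposition of the first two parts
```

The ratio between two parts is the same in `full` and in `sub`, whereas the raw fractions themselves change once a component is dropped; this is why log-ratio methods are preferred over ordinary statistics on the fractions.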
Evaluation of parametric models by the prediction error in colorectal cancer survival analysis.
Baghestani, Ahmad Reza; Gohari, Mahmood Reza; Orooji, Arezoo; Pourhoseingholi, Mohamad Amin; Zali, Mohammad Reza
2015-01-01
The aim of this study is to determine the factors influencing predicted survival time for patients with colorectal cancer (CRC) using parametric models and to select the best model using the prediction error technique. Survival models are statistical techniques to estimate or predict the overall time up to specific events. Prediction is important in medical science and the accuracy of prediction is determined by a measurement, generally based on loss functions, called prediction error. A total of 600 colorectal cancer patients who were admitted to the Cancer Registry Center of Gastroenterology and Liver Disease Research Center, Taleghani Hospital, Tehran, were followed for at least 5 years and had complete information on the variables selected for this study. Body mass index (BMI), sex, family history of CRC, tumor site, stage of disease and histology of tumor were included in the analysis. Survival times were compared by the log-rank test, and multivariate analysis was carried out using parametric models including log-normal, Weibull and log-logistic regression. For selecting the best model, the prediction error by apparent loss was used. The log-rank test showed better survival for females, patients with BMI more than 25, patients with early stage at diagnosis and patients with colon tumor site. Prediction error by apparent loss was estimated and indicated that the Weibull model was the best one for multivariate analysis. BMI and stage were independent prognostic factors according to the Weibull model. In this study, according to prediction error, Weibull regression showed a better fit. Prediction error can serve as a criterion for selecting the model best able to make predictions from prognostic factors in survival analysis.
Feature-Based Statistical Analysis of Combustion Simulation Data
Energy Technology Data Exchange (ETDEWEB)
Bennett, J; Krishnamoorthy, V; Liu, S; Grout, R; Hawkes, E; Chen, J; Pascucci, V; Bremer, P T
2011-11-18
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion
SPA- STATISTICAL PACKAGE FOR TIME AND FREQUENCY DOMAIN ANALYSIS
Brownlow, J. D.
1994-01-01
The need for statistical analysis often arises when data is in the form of a time series. This type of data is usually a collection of numerical observations made at specified time intervals. Two kinds of analysis may be performed on the data. First, the time series may be treated as a set of independent observations using a time domain analysis to derive the usual statistical properties including the mean, variance, and distribution form. Secondly, the order and time intervals of the observations may be used in a frequency domain analysis to examine the time series for periodicities. In almost all practical applications, the collected data is actually a mixture of the desired signal and a noise signal which is collected over a finite time period with a finite precision. Therefore, any statistical calculations and analyses are actually estimates. The Spectrum Analysis (SPA) program was developed to perform a wide range of statistical estimation functions. SPA can provide the data analyst with a rigorous tool for performing time and frequency domain studies. In a time domain statistical analysis the SPA program will compute the mean, variance, standard deviation, mean square, and root mean square. It also lists the data maximum, data minimum, and the number of observations included in the sample. In addition, a histogram of the time domain data is generated, a normal curve is fit to the histogram, and a goodness-of-fit test is performed. These time domain calculations may be performed on both raw and filtered data. For a frequency domain statistical analysis the SPA program computes the power spectrum, cross spectrum, coherence, phase angle, amplitude ratio, and transfer function. The estimates of the frequency domain parameters may be smoothed with the use of Hann-Tukey, Hamming, Bartlett, or moving average windows. Various digital filters are available to isolate data frequency components. Frequency components with periods longer than the data collection interval
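The time domain quantities listed above can be sketched in a few lines. This is not SPA code; the function name is hypothetical, and the population (divide-by-n) variance is an assumption, since the abstract does not say which convention SPA uses.

```python
import math


def time_domain_summary(samples: list[float]) -> dict[str, float]:
    """Basic time domain statistics of a series treated as independent observations."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one observation is required")
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n   # population variance
    mean_square = sum(x * x for x in samples) / n
    return {
        "n": n,
        "min": min(samples),
        "max": max(samples),
        "mean": mean,
        "variance": variance,
        "std_dev": math.sqrt(variance),
        "mean_square": mean_square,
        "rms": math.sqrt(mean_square),
    }
```

With the divide-by-n convention the mean square equals the variance plus the squared mean, which gives a quick consistency check on the reported values.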
[Corneal transplant in a second level hospital. A survival analysis].
Hernández-Da Mota, Sergio E; Paniagua Jacobo, Margarita; Gómez Revuelta, Gustavo; Páez Martínez, Raymundo Mauricio
2013-01-01
To determine the long-term corneal graft survival in patients of General Hospital Dr. Miguel Silva. This was a retrospective cohort study. Records from patients who underwent corneal transplant surgery at General Hospital Dr. Miguel Silva were analyzed. The percentages of graft failure were obtained. Kaplan-Meier survival analysis was performed to evaluate the long-term cumulative probability of graft non-rejection in all patients according to diagnosis. Overall, 71.9% (CI 95%: 64.8-78.9) of the patients did not have any graft rejections, and 12.5% (CI 95%: 7-18) required a regraft and were considered graft failures. Patients with posttraumatic leucoma had a cumulative probability of non-rejection of 100%. Subjects with keratoconus had a 65% likelihood of non-rejection after 40 months of follow-up. The likelihood of non-rejection was greater than 80% at 100 months of follow-up in pseudophakic bullous keratopathy patients and 60% at 20 months of follow-up in inactive herpetic leucoma patients. Posttraumatic leucoma patients had the greatest cumulative survival probability compared with postherpetic leucoma patients and other patient groups.
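The Kaplan-Meier estimate behind cumulative probabilities such as those quoted above can be sketched as a product over event times. The function name and the toy follow-up data are hypothetical; here an "event" would be graft rejection and a 0 marks a censored patient.

```python
def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Kaplan-Meier curve as (time, survival probability) at each event time.

    times:  follow-up time of each subject
    events: 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving   # events and censored subjects both leave the risk set
    return curve
```

Censored subjects contribute to the risk set up to their last follow-up and then drop out without lowering the curve, which is how the method uses patients with differing follow-up durations.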
Building the Community Online Resource for Statistical Seismicity Analysis (CORSSA)
Michael, A. J.; Wiemer, S.; Zechar, J. D.; Hardebeck, J. L.; Naylor, M.; Zhuang, J.; Steacy, S.; Corssa Executive Committee
2010-12-01
Statistical seismology is critical to the understanding of seismicity, the testing of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is a web-based educational platform that is authoritative, up-to-date, prominent, and user-friendly. We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each containing between four and eight articles. The CORSSA web page, www.corssa.org, officially unveiled on September 6, 2010, debuts with an initial set of approximately 10 to 15 articles available online for viewing and commenting with additional articles to be added over the coming months. Each article will be peer-reviewed and will present a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles will include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. A special article will compare and review
CORSSA: Community Online Resource for Statistical Seismicity Analysis
Zechar, J. D.; Hardebeck, J. L.; Michael, A. J.; Naylor, M.; Steacy, S.; Wiemer, S.; Zhuang, J.
2011-12-01
Statistical seismology is critical to the understanding of seismicity, the evaluation of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA, www.corssa.org). We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each of which will contain between four and eight articles. CORSSA now includes seven articles with an additional six in draft form along with forums for discussion, a glossary, and news about upcoming meetings, special issues, and recent papers. Each article is peer-reviewed and presents a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles include: introductions to both CORSSA and statistical seismology; basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. We have also begun curating a collection of statistical seismology software packages.
Statistical Analysis of SAR Sea Clutter for Classification Purposes
Directory of Open Access Journals (Sweden)
Jaime Martín-de-Nicolás
2014-09-01
Statistical analysis of radar clutter has been one of the topics that has received the most effort in the last few decades. These studies usually focused on finding the statistical models that best fit the clutter distribution; however, the goal of this work is not the modeling of the clutter, but the study of the suitability of the statistical parameters to carry out a sea state classification. In order to achieve this objective and provide some relevance to this study, an important set of maritime and coastal Synthetic Aperture Radar data is considered. Due to the nature of the acquisition of data by SAR sensors, speckle noise is inherent to these data, and a specific study of how this noise affects the clutter distribution is also performed in this work. For completeness, a thorough study of the most suitable statistical parameters, as well as the most adequate classifier, is carried out, achieving excellent results in terms of classification success rates. These concluding results confirm that a sea state classification is not only viable, but also successful using statistical parameters different from those of the best modeling distribution and applying a speckle filter, which allows a better characterization of the parameters used to distinguish between different sea states.
Wavelet analysis in ecology and epidemiology: impact of statistical tests.
Cazelles, Bernard; Cazelles, Kévin; Chavez, Mario
2014-02-06
Wavelet analysis is now frequently used to extract information from ecological and epidemiological time series. Statistical hypothesis tests are conducted on associated wavelet quantities to assess the likelihood that they are due to a random process. Such random processes represent null models and are generally based on synthetic data that share some statistical characteristics with the original time series. This allows the comparison of null statistics with those obtained from original time series. When creating synthetic datasets, different techniques of resampling result in different characteristics shared by the synthetic time series. Therefore, it becomes crucial to consider the impact of the resampling method on the results. We have addressed this point by comparing seven different statistical testing methods applied with different real and simulated data. Our results show that statistical assessment of periodic patterns is strongly affected by the choice of the resampling method, so two different resampling techniques could lead to two different conclusions about the same time series. Moreover, our results clearly show the inadequacy of resampling series generated by white noise and red noise, which are nevertheless the methods currently used in the wide majority of wavelet applications. Our results highlight that the characteristics of a time series, namely its Fourier spectrum and autocorrelation, are important to consider when choosing the resampling technique. Results suggest that data-driven resampling methods should be used, such as the hidden Markov model algorithm and the 'beta-surrogate' method.
Univariate statistical analysis of environmental (compositional) data: problems and possibilities.
Filzmoser, Peter; Hron, Karel; Reimann, Clemens
2009-11-15
For almost 30 years it has been known that compositional (closed) data have special geometrical properties. In environmental sciences, where the concentration of chemical elements in different sample materials is investigated, almost all datasets are compositional. In general, compositional data are parts of a whole which only give relative information. Data that sum up to a constant, e.g. 100 wt.% or 1,000,000 mg/kg, are the best known example. It is widely neglected that the "closure" characteristic remains even if only one of all possible elements is measured; it is an inherent property of compositional data. No variable is free to vary independently of all the others. Existing transformations to "open" closed data are seldom applied. They are more complicated than a log transformation and the relationship to the original data unit is lost. Results obtained when using classical statistical techniques for data analysis appeared reasonable and the possible consequences of working with closed data were rarely questioned. Here the simple univariate case of data analysis is investigated. It can be demonstrated that data closure must be overcome prior to calculating even simple statistical measures like mean or standard deviation or plotting graphs of the data distribution, e.g. a histogram. Some measures like the standard deviation (or the variance) make no statistical sense with closed data and all statistical tests building on the standard deviation (or variance) will thus provide erroneous results if used with the original data.
STATISTICAL ANALYSIS OF THE HEAVY NEUTRAL ATOMS MEASURED BY IBEX
Energy Technology Data Exchange (ETDEWEB)
Park, Jeewoo; Kucharek, Harald; Möbius, Eberhard [Space Science Center and Department of Physics, University of New Hampshire, 8 College Road, Durham, NH 03824 (United States)]; Galli, André [Physics Institute, University of Bern, Bern 3012 (Switzerland)]; Livadiotis, George; Fuselier, Steve A.; McComas, David J., E-mail: jtl29@wildcats.unh.edu [Southwest Research Institute, P.O. Drawer 28510, San Antonio, TX 78228 (United States)]
2015-10-15
We investigate the directional distribution of heavy neutral atoms in the heliosphere by using heavy neutral maps generated with the IBEX-Lo instrument over three years from 2009 to 2011. The interstellar neutral (ISN) O and Ne gas flow was found in the first-year heavy neutral map at 601 keV and its flow direction and temperature were studied. However, due to the low counting statistics, researchers have not treated the full sky maps in detail. The main goal of this study is to evaluate the statistical significance of each pixel in the heavy neutral maps to get a better understanding of the directional distribution of heavy neutral atoms in the heliosphere. Here, we examine three statistical analysis methods: the signal-to-noise filter, the confidence limit method, and the cluster analysis method. These methods allow us to exclude background from areas where the heavy neutral signal is statistically significant. These methods also allow the consistent detection of heavy neutral atom structures. The main emission feature expands toward lower longitude and higher latitude from the observational peak of the ISN O and Ne gas flow. We call this emission the extended tail. It may be an imprint of the secondary oxygen atoms generated by charge exchange between ISN hydrogen atoms and oxygen ions in the outer heliosheath.
GNSS Spoofing Detection Based on Signal Power Measurements: Statistical Analysis
Directory of Open Access Journals (Sweden)
V. Dehghanian
2012-01-01
A threat to GNSS receivers is posed by a spoofing transmitter that emulates authentic signals but with randomized code phase and Doppler values over a small range. Such spoofing signals can result in large navigational solution errors that are passed on to the unsuspecting user with potentially dire consequences. An effective spoofing detection technique is developed in this paper; it is based on signal power measurements and can be readily applied to present consumer grade GNSS receivers with minimal firmware changes. An extensive statistical analysis is carried out based on formulating a multihypothesis detection problem. Expressions are developed to devise a set of thresholds required for signal detection and identification. The detection processing methods developed are further manipulated to exploit incidental antenna motion arising from user interaction with a GNSS handheld receiver to further enhance the detection performance of the proposed algorithm. The statistical analysis supports the effectiveness of the proposed spoofing detection technique under various multipath conditions.
Statistics in experimental design, preprocessing, and analysis of proteomics data.
Jung, Klaus
2011-01-01
High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), usually yield high-dimensional data sets of expression values for hundreds or thousands of proteins which are, however, observed on only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to obtain tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated. In particular, the focus is on studies for the detection of differentially regulated proteins. Furthermore, issues of sample size planning, statistical analysis of expression levels as well as methods for data preprocessing are covered.
Lifetime statistics of quantum chaos studied by a multiscale analysis
Di Falco, A.
2012-04-30
In a series of pump and probe experiments, we study the lifetime statistics of a quantum chaotic resonator when the number of open channels is greater than one. Our design embeds a stadium billiard into a two dimensional photonic crystal realized on a silicon-on-insulator substrate. We calculate resonances through a multiscale procedure that combines energy landscape analysis and wavelet transforms. Experimental data is found to follow the universal predictions arising from random matrix theory with an excellent level of agreement.
Statistical Analysis of the Exchange Rate of Bitcoin.
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate. PMID:26222702
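The log-returns analysed in this abstract follow the usual definition r_t = ln(P_t / P_{t-1}). A minimal sketch, assuming a plain list of positive exchange-rate values (the function name `log_returns` and the prices in the usage note are illustrative, not taken from the paper):

```python
import math


def log_returns(prices):
    """Log-returns r_t = ln(P_t / P_{t-1}) for consecutive positive prices."""
    return [math.log(later / earlier)
            for earlier, later in zip(prices, prices[1:])]
```

For example, `log_returns([100.0, 110.0, 99.0])` gives two values whose sum equals ln(99/100), since log-returns add up over consecutive periods. Fitting parametric distributions to such a series would be a separate step.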
Statistical analysis of the students' behavior in algebra
Bisson, Gilles; Bronner, Alain; Gordon, Mirta; Nicaud, Jean-François; Renaudie, David
2003-01-01
We present an analysis of behaviors of students solving algebra exercises with the Aplusix software. We built a set of statistics from the protocols (records of the interactions between the student and the software) in order to evaluate the correctness of the calculation steps and of the corresponding solutions. We have particularly studied the activities of college students (sixteen and seventeen years old). This study emphasizes the didactic variables which are relevant for the types of sel...
Development of statistical analysis for single dose bronchodilators.
Salsburg, D
1981-12-01
When measurements developed for the diagnosis of patients are used to detect treatment effects in clinical trials with chronic disease, problems in definition of response and in the statistical distributions of those measurements within patients have to be resolved before the results of clinical studies can be analyzed. An example of this process is shown in the development of the analysis of single-dose bronchodilator trials.
Lifetime statistics of quantum chaos studied by a multiscale analysis
Di Falco, A.; Krauss, T. F.; Fratalocchi, A.
2012-04-01
In a series of pump and probe experiments, we study the lifetime statistics of a quantum chaotic resonator when the number of open channels is greater than one. Our design embeds a stadium billiard into a two dimensional photonic crystal realized on a silicon-on-insulator substrate. We calculate resonances through a multiscale procedure that combines energy landscape analysis and wavelet transforms. Experimental data is found to follow the universal predictions arising from random matrix theory with an excellent level of agreement.
Statistical Challenges of Big Data Analysis in Medicine
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2015-01-01
Vol. 3, No. 1 (2015), pp. 24-27. ISSN 1805-8698. R&D Projects: GA ČR GA13-23940S. Grant - others: CESNET Development Fund (CZ) 494/2013. Institutional support: RVO:67985807. Keywords: big data; variable selection; classification; cluster analysis. Subject RIV: BB - Applied Statistics, Operational Research. http://www.ijbh.org/ijbh2015-1.pdf
Statistical and machine learning approaches for network analysis
Dehmer, Matthias
2012-01-01
Explore the multidisciplinary nature of complex networks through machine learning techniques. Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph measures for classification. By providing different approaches based on experimental data, the book uniquely sets itself apart from the current literature by exploring the application of machine learning techniques to various types of complex networks. Comprised of chapters written by internation
Metz, Anneke M
2008-01-01
There is an increasing need for students in the biological sciences to build a strong foundation in quantitative approaches to data analyses. Although most science, engineering, and math field majors are required to take at least one statistics course, statistical analysis is poorly integrated into undergraduate biology course work, particularly at the lower-division level. Elements of statistics were incorporated into an introductory biology course, including a review of statistics concepts and an opportunity for students to perform statistical analysis in a biological context. Learning gains were measured with an 11-item statistics learning survey instrument developed for the course. Students showed a statistically significant 25% (p < 0.005) increase in statistics knowledge after completing introductory biology. Students improved their scores on the survey after completing introductory biology, even if they had previously completed an introductory statistics course (9% improvement, p < 0.005). Students retested 1 yr after completing introductory biology showed no loss of their statistics knowledge as measured by this instrument, suggesting that the use of statistics in biology course work may aid long-term retention of statistics knowledge. No statistically significant differences in learning were detected between male and female students in the study.
A statistical analysis of human lymphocyte transformation data.
Harina, B M; Gill, T J; Rabin, B S; Taylor, F H
1979-06-01
The lymphocytes from 107 maternal-foetal pairs were examined for their in vitro responsiveness, as determined by the incorporation of tritiated thymidine following stimulation with phytohaemagglutinin (PHA), candida, varicella, mumps, streptokinase-streptodornase (SKSD) and tetanus toxoid. The data were collected and analysed in two sequential groups (forty-seven and sixty) in order to determine whether the results were reproducible. The variable chosen for analysis was the difference (d) between the square roots of the isotope incorporation in the stimulated and control cultures because it gave the most symmetrical distribution of the data. The experimental error in the determination of maternal lymphocyte stimulation was 1.4-8.6% and of the foetal lymphocytes, 1.0-16.6%, depending upon the antigen or mitogen and its concentration. The data in the two sets of patients were statistically the same in forty-eight of the fifty-six analyses (fourteen antigen or mitogen concentrations in autologous and AB plasma for maternal and foetal lymphocytes). The statistical limits of the distribution of responses for stimulation or suppression were set by an analysis of variance taking two standard deviations from the mean as the limits. When these limits were translated into stimulation indices, they varied for each antigen or mitogen and for different concentrations of the same antigen. Thus, a detailed statistical analysis of a large volume of lymphocyte transformation data indicates that the technique is reproducible and offers a reliable method for determining when significant differences from control values are present.
SAS and R data management, statistical analysis, and graphics
Kleinman, Ken
2009-01-01
An All-in-One Resource for Using SAS and R to Carry out Common Tasks. Provides a path between languages that is easier than reading complete documentation. SAS and R: Data Management, Statistical Analysis, and Graphics presents an easy way to learn how to perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferential procedures, regression analysis, and the creation of graphics, along with more complex applicat
Institutions Function and Failure Statistic and Analysis of Wind Turbine
Yang, Ma; Chengbing, He; Xinxin, Feng
Recently, as the installed capacity of wind turbines has increased continuously, research on the operation, reliability, maintenance and repair of wind power has become a key topic. Failure analysis can support the operation of wind plants, the management of spare components and accessories, and the maintenance and repair of wind turbines. In this paper, from the perspective of the structure and function of wind plants, the common faults of each part of the plant are collected and analysed statistically to find the fault patterns, causes and effects, from which corresponding measures are put forward.
Statistical Analysis of Hypercalcaemia Data related to Transferability
DEFF Research Database (Denmark)
Frølich, Anne; Nielsen, Bo Friis
2005-01-01
In this report we describe statistical analysis related to a study of hypercalcaemia carried out in the Copenhagen area in the ten year period from 1984 to 1994. Results from the study have previously been published in a number of papers [3, 4, 5, 6, 7, 8, 9] and in various abstracts and posters at conferences during the late eighties and early nineties. In this report we give a more detailed description of many of the analyses and provide some new results, primarily by simultaneous studies of several databases.
Statistical Analysis of 30 Years Rainfall Data: A Case Study
Arvind, G.; Ashok Kumar, P.; Girish Karthi, S.; Suribabu, C. R.
2017-07-01
Rainfall is a prime input for various engineering designs such as hydraulic structures, bridges and culverts, canals, storm water sewers and road drainage systems. A detailed statistical analysis for each region is essential to estimate the relevant input values for the design and analysis of engineering structures and also for crop planning. A rain gauge station located in Trichy district, where agriculture is the prime occupation, is selected for statistical analysis. The daily rainfall data for a period of 30 years are used to understand normal rainfall, deficit rainfall, excess rainfall and seasonal rainfall of the selected circle headquarters. Further, the various plotting position formulae available are used to evaluate the return period of monthly, seasonal and annual rainfall. This analysis will provide useful information for water resources planners, farmers and urban engineers to assess the availability of water and create storage accordingly. The mean, standard deviation and coefficient of variation of monthly and annual rainfall were calculated to check the rainfall variability. From the calculated results, the rainfall pattern is found to be erratic. The best fit probability distribution was identified based on the minimum deviation between actual and estimated values. The scientific results and the analysis paved the way to determine the proper onset and withdrawal of the monsoon, which were used for land preparation and sowing.
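The variability measures and return periods described in this abstract can be sketched as below. The abstract does not say which plotting position formulae were used; the Weibull formula, T = (n + 1) / m for the m-th largest of n values, is assumed here as one common choice. The function names and the sample values in the usage note are hypothetical.

```python
import statistics


def variability(values):
    """Return (mean, sample standard deviation, coefficient of variation in %)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample (n - 1) standard deviation
    return mean, sd, 100.0 * sd / mean


def weibull_return_periods(values):
    """Pair each value with its return period T = (n + 1) / m.

    m is the rank of the value when sorted in descending order
    (m = 1 for the largest). This is the Weibull plotting position,
    an assumption made for this sketch.
    """
    n = len(values)
    ranked = sorted(values, reverse=True)
    return [(x, (n + 1) / m) for m, x in enumerate(ranked, start=1)]
```

With three illustrative annual totals of 800, 1000 and 1200 mm, `variability` returns a mean of 1000 mm, a standard deviation of 200 mm and a coefficient of variation of 20%, and the largest total is assigned a return period of 4 years.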
Validation of statistical models for creep rupture by parametric analysis
Energy Technology Data Exchange (ETDEWEB)
Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)
2012-01-15
Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. Highlights: The paper discusses the validation of creep rupture models derived from statistical analysis. It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).
HistFitter: a flexible framework for statistical data analysis
Besjes, G J; Côté, D; Koutsman, A; Lorenz, J M; Short, D
2015-01-01
HistFitter is a software framework for statistical data analysis that has been used extensively in the ATLAS Collaboration to analyze data of proton-proton collisions produced by the Large Hadron Collider at CERN. Most notably, HistFitter has become a de-facto standard in searches for supersymmetric particles since 2012, with some usage for Exotic and Higgs boson physics. HistFitter coherently combines several statistics tools in a programmable and flexible framework that is capable of bookkeeping hundreds of data models under study using thousands of generated input histograms. HistFitter interfaces with the statistics tools HistFactory and RooStats to construct parametric models and to perform statistical tests of the data, and extends these tools in four key areas. The key innovations are to weave the concepts of control, validation and signal regions into the very fabric of HistFitter, and to treat these with rigorous methods. Multiple tools to visualize and interpret the results through a simple configura...
Multivariate statistical analysis: a high-dimensional approach
Serdobolskii, V
2000-01-01
In the last few decades the accumulation of large amounts of information in numerous applications has stimulated an increased interest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a deficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimating the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivariate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution, depending on the data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommen ...
The bivariate statistical analysis of environmental (compositional) data.
Filzmoser, Peter; Hron, Karel; Reimann, Clemens
2010-09-01
Environmental sciences usually deal with compositional (closed) data. Whenever the concentration of chemical elements is measured, the data will be closed, i.e. the relevant information is contained in the ratios between the variables rather than in the data values reported for the variables. Data closure has severe consequences for statistical data analysis. Most classical statistical methods are based on the usual Euclidean geometry - compositional data, however, do not plot into Euclidean space because they have their own geometry which is not linear but curved in the Euclidean sense. This has severe consequences for bivariate statistical analysis: correlation coefficients computed in the traditional way are likely to be misleading, and the information contained in scatterplots must be used and interpreted differently from sets of non-compositional data. As a solution, the ilr transformation applied to a variable pair can be used to display the relationship and to compute a measure of stability. This paper discusses how this measure is related to the usual correlation coefficient and how it can be used and interpreted. Moreover, recommendations are provided for how the scatterplot can still be used, and which alternatives exist for displaying the relationship between two variables. Copyright 2010 Elsevier B.V. All rights reserved.
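The idea that only ratios carry information can be illustrated with the isometric log-ratio (ilr) coordinate of a two-part composition, z = ln(x1 / x2) / sqrt(2). This is a sketch of the simplest two-part case only; how the paper applies the ilr transformation to a variable pair within a larger composition may differ, and the function name `ilr_pair` is hypothetical.

```python
import math


def ilr_pair(x1, x2):
    """ilr coordinate of the two-part composition (x1, x2), both positive."""
    return math.log(x1 / x2) / math.sqrt(2.0)
```

Because the coordinate depends only on the ratio x1 / x2, rescaling both parts by the same factor (for instance reporting in wt.% instead of mg/kg) leaves it unchanged: `ilr_pair(2, 4)` and `ilr_pair(20, 40)` agree.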
Statistical Analysis of Sport Movement Observations: the Case of Orienteering
Amouzandeh, K.; Karimipour, F.
2017-09-01
The study of movement observations is becoming more popular in several applications. In particular, analyzing sport movement time series has been considered a demanding area. However, most attempts at analyzing sport movement data have focused on spatial aspects of movement to extract some movement characteristics, such as spatial patterns and similarities. This paper proposes statistical analysis of sport movement observations, which refers to analyzing changes in the spatial movement attributes (e.g. distance, altitude and slope) and non-spatial movement attributes (e.g. speed and heart rate) of athletes. As the case study, an example dataset of movement observations acquired during the "orienteering" sport is presented and statistically analyzed.
The NIRS Analysis Package: noise reduction and statistical inference.
Directory of Open Access Journals (Sweden)
Tomer Fekete
Near infrared spectroscopy (NIRS) is a non-invasive optical imaging technique that can be used to measure cortical hemodynamic responses to specific stimuli or tasks. While analyses of NIRS data are normally adapted from established fMRI techniques, there are nevertheless substantial differences between the two modalities. Here, we investigate the impact of NIRS-specific noise, e.g., systemic (physiological), motion-related artifacts, and serial autocorrelations, upon the validity of statistical inference within the framework of the general linear model. We present a comprehensive framework for noise reduction and statistical inference, which is custom-tailored to the noise characteristics of NIRS. These methods have been implemented in a public domain Matlab toolbox, the NIRS Analysis Package (NAP). Finally, we validate NAP using both simulated and actual data, showing marked improvement in the detection power and reliability of NIRS.
Statistical Analysis of Radio Propagation Channel in Ruins Environment
Directory of Open Access Journals (Sweden)
Jiao He
2015-01-01
Cellphone-based localization systems for search and rescue in complex high-density ruins have attracted great interest in recent years, and the radio channel characteristics are critical for the design and development of such a system. This paper presents a spatial smoothing estimation via rotational invariance technique (SS-ESPRIT) for radio channel characterization of high-density ruins. The radio propagation at three typical mobile communication bands (0.9, 1.8, and 2 GHz) is investigated in two different scenarios. Channel parameters, such as arrival time, delays, and complex amplitudes, are statistically analyzed. Furthermore, a channel simulator is built based on these statistics. A comparison of average excess delay and delay spread shows good agreement between the measurements and the channel modeling results.
Directory of Open Access Journals (Sweden)
Su-jie ZHANG
2012-06-01
Objectives: To explore the factors influencing survival time in patients with lung cancer associated hypercalcemia. Methods: Thirty-four patients with pathologically confirmed lung cancer complicated with hypercalcemia, who were treated at the Department of Oncology in General Hospital of PLA from Jan. 2001 to Dec. 2010, were enrolled in this study. The clinical data analyzed included sex, age, pathological type of the malignancy, organ metastasis (bone, lung, liver, kidney, brain), number of distal metastatic sites, mental status, interval between final diagnosis of lung cancer and of hypercalcemia, peak value of blood calcium during the disease course, treatment methods and so on. Survival analysis was performed with the Kaplan-Meier method and Cox analysis, using the statistical software SPSS 18.0, to identify potential prognostic factors. Results: The highest blood calcium level ranged from 2.77 to 4.87 mmol/L, and the median value was 2.94 mmol/L. The patients' survival time after diagnosis of hypercalcemia varied from 1 day to 1067 days, and the median survival time was 92 days. With the log-rank test, age above 50 years, hypercalcemia occurring more than 90 days after diagnosis of cancer, central nervous system symptoms and renal metastasis were predictors of poor survival (P=0.048, P=0.001, P=0.000, P=0.003). In the Cox proportional hazards model analysis, age above 50 years, hypercalcemia occurring more than 90 days after cancer diagnosis, central nervous system symptoms and renal metastasis were significant prognostic factors for poor survival (HR=11.483, P=0.006; HR=4.371, P=0.002; HR=6.064, P=0.026; HR=8.502, P=0.011). Conclusions: Patients with lung cancer associated hypercalcemia have a short survival time and poor prognosis. Age above 50 years, hypercalcemia occurring more than 90 days after cancer diagnosis, central nervous system symptoms and renal metastasis are significant factors of poor prognosis.
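For readers unfamiliar with the Kaplan-Meier method mentioned in this abstract, a minimal sketch of the product-limit estimator follows. The follow-up times and event flags are hypothetical and are not the study's data; the function names are illustrative only.

```python
def kaplan_meier(times, events):
    """Product-limit estimate. Returns [(t, S(t))] at each distinct event time.

    times  : follow-up durations (e.g. days)
    events : 1 if death was observed at that time, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after time t.
        n_at_risk -= removed
    return curve

def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

# Hypothetical cohort of five patients; the fourth is censored at day 150.
curve = kaplan_meier([5, 30, 92, 150, 400], [1, 1, 1, 0, 1])
median = median_survival(curve)
```

Censored subjects contribute to the risk set up to their last follow-up but never trigger a step in the curve, which is why the estimator can use incomplete follow-up without bias under non-informative censoring.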
National Research Council Canada - National Science Library
Guyot, Patricia; Ades, A E; Ouwens, Mario J N M; Welton, Nicky J
2012-01-01
.... In order to enhance the quality of secondary data analyses, we propose a method which derives from the published Kaplan Meier survival curves a close approximation to the original individual patient...
International Conference on Modern Problems of Stochastic Analysis and Statistics
2017-01-01
This book brings together the latest findings in the area of stochastic analysis and statistics. The individual chapters cover a wide range of topics, including limit theorems, Markov processes, nonparametric methods, actuarial science, population dynamics, and many others. The volume is dedicated to Valentin Konakov, head of the International Laboratory of Stochastic Analysis and its Applications, on the occasion of his 70th birthday. Contributions were prepared by the participants of the international conference “Modern problems of stochastic analysis and statistics”, held at the Higher School of Economics in Moscow from May 29 - June 2, 2016. It offers a valuable reference resource for researchers and graduate students interested in modern stochastics.
Statistical analysis plan for the EuroHYP-1 trial
DEFF Research Database (Denmark)
Winkel, Per; Bath, Philip M; Gluud, Christian
2017-01-01
Score; (4) brain infarct size at 48 ± 24 hours; (5) EQ-5D-5L score, and (6) WHODAS 2.0 score. Other outcomes are: the primary safety outcome serious adverse events; and the incremental cost-effectiveness, and cost utility ratios. The analysis sets include (1) the intention-to-treat population, and (2) the per protocol population. The sample size is estimated to 800 patients (5% type 1 and 20% type 2 errors). All analyses are adjusted for the protocol-specified stratification variables (nationality of centre), and the minimisation variables. In the analysis, we use ordinal regression (the primary outcome), logistic regression (binary outcomes), general linear model (continuous outcomes), and the Poisson or negative binomial model (rate outcomes). DISCUSSION: Major adjustments compared with the original statistical analysis plan encompass: (1) adjustment of analyses by nationality; (2) power......
Composition and Statistical Analysis of Biophenols in Apulian Italian EVOOs
Centonze, Carla; Grasso, Maria Elena; Latronico, Maria Francesca; Mastrangelo, Pier Francesco; Maffia, Michele
2017-01-01
Extra-virgin olive oil (EVOO) is among the basic constituents of the Mediterranean diet. Its nutraceutical properties are due mainly, but not only, to a plethora of molecules with antioxidant activity known as biophenols. In this article, several biophenols were measured in EVOOs from South Apulia, Italy. Hydroxytyrosol, tyrosol and their conjugated structures to elenolic acid in different forms were identified and quantified by high performance liquid chromatography (HPLC) together with lignans, luteolin and α-tocopherol. The concentration of the analyzed metabolites was quite high in all the cultivars studied, but it was still possible to discriminate them through multivariate statistical analysis (MVA). Furthermore, principal component analysis (PCA) and orthogonal partial least-squares discriminant analysis (OPLS-DA) were also exploited for determining variances among samples depending on the interval time between harvesting and milling, on the age of the olive trees, and on the area where the olive trees were grown. PMID:29057813
Composition and Statistical Analysis of Biophenols in Apulian Italian EVOOs
Directory of Open Access Journals (Sweden)
Andrea Ragusa
2017-10-01
Extra-virgin olive oil (EVOO) is among the basic constituents of the Mediterranean diet. Its nutraceutical properties are due mainly, but not only, to a plethora of molecules with antioxidant activity known as biophenols. In this article, several biophenols were measured in EVOOs from South Apulia, Italy. Hydroxytyrosol, tyrosol and their conjugated structures to elenolic acid in different forms were identified and quantified by high performance liquid chromatography (HPLC) together with lignans, luteolin and α-tocopherol. The concentration of the analyzed metabolites was quite high in all the cultivars studied, but it was still possible to discriminate them through multivariate statistical analysis (MVA). Furthermore, principal component analysis (PCA) and orthogonal partial least-squares discriminant analysis (OPLS-DA) were also exploited for determining variances among samples depending on the interval time between harvesting and milling, on the age of the olive trees, and on the area where the olive trees were grown.
STATISTICS. The reusable holdout: Preserving validity in adaptive data analysis.
Dwork, Cynthia; Feldman, Vitaly; Hardt, Moritz; Pitassi, Toniann; Reingold, Omer; Roth, Aaron
2015-08-07
Misapplication of statistical data analysis is a common cause of spurious discoveries in scientific research. Existing approaches to ensuring the validity of inferences drawn from data assume a fixed procedure to be performed, selected before the data are examined. In common practice, however, data analysis is an intrinsically adaptive process, with new analyses generated on the basis of data exploration, as well as the results of previous analyses on the same data. We demonstrate a new approach for addressing the challenges of adaptivity based on insights from privacy-preserving data analysis. As an application, we show how to safely reuse a holdout data set many times to validate the results of adaptively chosen analyses. Copyright © 2015, American Association for the Advancement of Science.
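To make the holdout-reuse idea more concrete, here is a simplified sketch of a thresholded holdout mechanism in the spirit of this abstract: a query is answered from the training set unless its training and holdout averages disagree by more than a noisy threshold, in which case a noised holdout average is returned. The parameter values, function names and the omission of any query budget are assumptions made for illustration; this is not claimed to be the authors' exact algorithm.

```python
import random

def laplace(scale, rng):
    """Laplace(0, scale) noise, drawn as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def make_thresholdout(train, holdout, threshold=0.04, sigma=0.01, seed=0):
    """Return a function that answers averaging queries while protecting the holdout.

    threshold, sigma and seed are illustrative defaults, not recommended values.
    """
    rng = random.Random(seed)

    def mean(query, sample):
        return sum(query(x) for x in sample) / len(sample)

    def answer(query):
        train_mean = mean(query, train)
        holdout_mean = mean(query, holdout)
        # Consult the holdout only when the training estimate looks overfit.
        if abs(train_mean - holdout_mean) > threshold + laplace(sigma, rng):
            return holdout_mean + laplace(sigma, rng)
        return train_mean

    return answer
```

A caller would wrap both data sets once, for example `answer = make_thresholdout(train_rows, holdout_rows)`, and then route every adaptively chosen statistic through `answer(lambda row: ...)` instead of touching the holdout directly.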
PROGNOSTIC FACTORS AND SURVIVAL ANALYSIS IN ESOPHAGEAL CARCINOMA.
Tustumi, Francisco; Kimura, Cintia Mayumi Sakurai; Takeda, Flavio Roberto; Uema, Rodrigo Hideki; Salum, Rubens Antônio Aissar; Ribeiro-Junior, Ulysses; Cecconello, Ivan
2016-01-01
Despite recent advances in diagnosis and treatment, esophageal cancer still has high mortality. Prognostic factors associated with the patient and with the disease itself are multiple and poorly explored. This study assesses prognostic variables in esophageal cancer patients through a retrospective review of all patients with esophageal cancer diagnosed between 2009 and 2012 in an oncology referral center. They were divided according to histological diagnosis (444 squamous cell carcinoma patients and 105 adenocarcinoma), and their demographic, pathological and clinical characteristics were analyzed and compared to clinical stage and overall survival. No difference was noted between squamous cell carcinoma and esophageal adenocarcinoma overall survival curves. Squamous cell carcinoma presented 22.8% survival after five years against 20.2% for adenocarcinoma. When considering only patients treated with curative intent resection, the five-year survival rate was 56.6% for squamous cell carcinoma and 58% for adenocarcinoma. In patients with squamous cell carcinoma, poor differentiation histology and tumor size were associated with worse oncology stage, but this was not evidenced in adenocarcinoma. Weight loss (kg), BMI variation (kg/m²) and percentage of weight loss are factors that predict worse stage at diagnosis in squamous cell carcinoma. In adenocarcinoma, these findings were not statistically significant.
GIS-BASED SPATIAL STATISTICAL ANALYSIS OF COLLEGE GRADUATES EMPLOYMENT
Directory of Open Access Journals (Sweden)
R. Tang
2012-07-01
Knowledge of the distribution and employment status of college graduates is urgently needed for the proper allocation of human resources and the overall arrangement of strategic industry. This study provides empirical evidence regarding the use of geocoding and spatial analysis in studying the distribution and employment status of college graduates, based on 2004–2008 data from the Wuhan Municipal Human Resources and Social Security Bureau, China. The spatio-temporal distribution of employment units was analyzed with geocoding using ArcGIS software, and stepwise multiple linear regression in SPSS was used to predict employment and to identify spatially associated enterprise and professional demand in the future. The results show that the number of enterprises in the Wuhan East Lake High and New Technology Development Zone increased dramatically from 2004 to 2008, and that their distribution tended to shift southeastward. Furthermore, the models built by statistical analysis suggest that the specialty in which graduates major has an important impact on the number employed and on the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis, which helps to simulate the spatial distribution of employment status, is a potential tool for human resource development research.
Consolidity analysis for fully fuzzy functions, matrices, probability and statistics
Directory of Open Access Journals (Sweden)
Walaa Ibrahim Gabr
2015-03-01
The paper presents a comprehensive review of the know-how for developing the systems consolidity theory for modeling, analysis, optimization and design in a fully fuzzy environment. The development of the theory covered the handling of new functions of different dimensionalities, fuzzy analytic geometry, fuzzy vector analysis, functions of fuzzy complex variables, ordinary differentiation of fuzzy functions and partial fractions of fuzzy polynomials. The handling of fuzzy matrices covered determinants of fuzzy matrices, eigenvalues of fuzzy matrices, and the solution of least-squares fuzzy linear equations. The approach was also shown to be applicable in a systematic way to new fuzzy probabilistic and statistical problems, including the extension of conventional probabilistic and statistical analysis to fuzzy random data. Applications also covered the consolidity of fuzzy optimization problems. Various solved numerical examples demonstrate that the new consolidity concept is highly effective in expressing, in compact form, the propagation of fuzziness in linear, nonlinear, multivariable and dynamic problems with different types of complexities. Finally, it is demonstrated that the suggested fuzzy mathematics can be easily embedded within normal mathematics by building a special fuzzy functions library inside the computational Matlab Toolbox or using other similar software languages.
Statistical analysis of C/NOFS planar Langmuir probe data
Directory of Open Access Journals (Sweden)
E. Costa
2014-07-01
The planar Langmuir probe (PLP) onboard the Communication/Navigation Outage Forecasting System (C/NOFS) satellite has been monitoring ionospheric plasma densities and their irregularities with high resolution almost seamlessly since May 2008. Considering the recent changes in status of the C/NOFS mission, it may be interesting to summarize some statistical results from these measurements. PLP data from 2 different years (1 October 2008–30 September 2009 and 1 January 2012–31 December 2012) were selected for analysis. The first data set corresponds to solar minimum conditions and the second one is as close to solar maximum conditions of solar cycle 24 as possible at the time of the analysis. The results from the analysis show how the values of the standard deviation of the ion density which are greater than specified thresholds are statistically distributed as functions of several combinations of the following geophysical parameters: (i) solar activity, (ii) altitude range, (iii) longitude sector, (iv) local time interval, (v) geomagnetic latitude interval, and (vi) season.
The features of Drosophila core promoters revealed by statistical analysis
Directory of Open Access Journals (Sweden)
Trifonov Edward N
2006-06-01
Background: Experimental investigation of transcription is still a very labor- and time-consuming process. Only a few transcription initiation scenarios have been studied in detail. The mechanism of interaction between basal machinery and promoter, in particular core promoter elements, is not known for the majority of identified promoters. In this study, we reveal various transcription initiation mechanisms by statistical analysis of 3393 nonredundant Drosophila promoters. Results: Using Drosophila-specific position-weight matrices, we identified promoters containing TATA box, Initiator, Downstream Promoter Element (DPE), and Motif Ten Element (MTE), as well as core elements discovered in Human (TFIIB Recognition Element (BRE) and Downstream Core Element (DCE)). Promoters utilizing known synergetic combinations of two core elements (TATA_Inr, Inr_MTE, Inr_DPE, and DPE_MTE) were identified. We also establish the existence of promoters with potentially novel synergetic combinations: TATA_DPE and TATA_MTE. Our analysis revealed several motifs with the features of promoter elements, including possible novel core promoter element(s). Comparison of Human and Drosophila showed consistent percentages of promoters with TATA, Inr, DPE, and synergetic combinations thereof, as well as most of the same functional and mutual positions of the core elements. No statistical evidence of MTE utilization in Human was found. Distinct nucleosome positioning in particular promoter classes was revealed. Conclusion: We present lists of promoters that potentially utilize the aforementioned elements/combinations. The number of these promoters is two orders of magnitude larger than the number of promoters in which transcription initiation was experimentally studied. The sequences are ready to be experimentally tested or used for further statistical analysis. The developed approach may be utilized for other species.
[Use and comprehensibility of statistical analysis in Archivos de Bronconeumología (1970-1999)].
De Granda Orive, J I; García Río, F; Gutiérrez Jiménez, T; Escobar Sacristán, J; Gallego Rodríguez, V; Sáez Valls, R
2002-08-01
To describe the type of statistical analysis used most often in original research articles published in Archivos de Bronconeumología, and the evolution of statistical analysis over time in terms of complexity. To determine comprehensibility, taking bivariate analysis as the reference threshold. All articles published in the original research section of Archivos de Bronconeumología from 1970 through 1999 were reviewed manually. For each article we recorded the category or categories of statistical analysis and its comprehensibility (with a reference threshold set at category 7). We studied the following factors: year of publication, type of analysis, comprehensibility, maximum category achieved, subject area and number of authors. Eight hundred sixty original articles, with a mean of 5 ± 2 authors per article, were examined. The maximum category reached was a mean of 4.15 ± 4.61. The three types of analysis used most often in all articles were category 1 (descriptive only) at 49.4%, category 2 (t and z tests) at 26.4% and category 3 (bivariate tables) at 19.1%. Among the more complex analytical categories, the most often used were analysis of variance (category 8) at 9%, survival analysis (category 16) at 6.2%, and non-parametric correlations at 3.4%. Comparing results by decade, the proportion of articles with only descriptive analysis fell from 74% in the seventies to 63.9% in the eighties and to 36.1% in the nineties (90s vs. 80s, p Archivos de Bronconeumología increased over time, while comprehensibility decreased.
Kleijnen, J.P.C.
1995-01-01
This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for
SAS and R data management, statistical analysis, and graphics
Kleinman, Ken
2014-01-01
An Up-to-Date, All-in-One Resource for Using SAS and R to Perform Frequent Tasks. The first edition of this popular guide provided a path between SAS and R using an easy-to-understand, dictionary-like approach. Retaining the same accessible format, SAS and R: Data Management, Statistical Analysis, and Graphics, Second Edition explains how to easily perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferentia
Using R for Data Management, Statistical Analysis, and Graphics
Horton, Nicholas J
2010-01-01
This title offers quick and easy access to key elements of documentation. It includes worked examples across a wide variety of applications, tasks, and graphics. "Using R for Data Management, Statistical Analysis, and Graphics" presents an easy way to learn how to perform an analytical task in R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation and vast number of add-on packages. Organized by short, clear descriptive entries, the book covers many common tasks, such as data management, descriptive summaries, inferential proc
STATISTIC ANALYSIS OF INTERNATIONAL TOURISM ON ROMANIAN SEASIDE
Directory of Open Access Journals (Sweden)
MIRELA SECARĂ
2010-01-01
To meet European and international standards of touristic competition, Romanian tourism needs modernization, re-establishment and development, as well as the creation of modern touristic products that are competitive on this market. The use of modern methods of statistical analysis in the field of tourism facilitates the creation of information systems that serve as instruments for: evaluation of touristic demand and touristic supply, follow-up of touristic services of each touring form, follow-up of transportation services, leisure activities, hotel accommodation, touristic market study, and a complex flexible system of management and accountancy.
Statistical Analysis of Strength Data for an Aerospace Aluminum Alloy
Neergaard, L.; Malone, T.
2001-01-01
Aerospace vehicles are produced in limited quantities that do not always allow development of MIL-HDBK-5 A-basis design allowables. One method of examining production and composition variations is to perform 100% lot acceptance testing for aerospace Aluminum (Al) alloys. This paper discusses statistical trends seen in strength data for one Al alloy. A four-step approach reduced the data to residuals, visualized residuals as a function of time, grouped data with quantified scatter, and conducted analysis of variance (ANOVA).
Spatial Analysis Along Networks Statistical and Computational Methods
Okabe, Atsuyuki
2012-01-01
In the real world, there are numerous and various events that occur on and alongside networks, including the occurrence of traffic accidents on highways, the location of stores alongside roads, the incidence of crime on streets and the contamination along rivers. In order to carry out analyses of those events, the researcher needs to be familiar with a range of specific techniques. Spatial Analysis Along Networks provides a practical guide to the necessary statistical techniques and their computational implementation. Each chapter illustrates a specific technique, from Stochastic Point Process
Statistical Analysis of Designed Experiments Theory and Applications
Tamhane, Ajit C
2012-01-01
An indispensable guide to understanding and designing modern experiments. The tools and techniques of Design of Experiments (DOE) allow researchers to successfully collect, analyze, and interpret data across a wide array of disciplines. Statistical Analysis of Designed Experiments provides a modern and balanced treatment of DOE methodology with thorough coverage of the underlying theory and standard designs of experiments, guiding the reader through applications to research in various fields such as engineering, medicine, business, and the social sciences. The book supplies a foundation for the
Meta-analysis of the effects of beta blocker on survival time in cancer patients.
Choi, Chel Hun; Song, Taejong; Kim, Tae Hyun; Choi, Jun Kuk; Park, Jin-Young; Yoon, Aera; Lee, Yoo-Young; Kim, Tae-Joong; Bae, Duk-Soo; Lee, Jeong-Won; Kim, Byoung-Gie
2014-07-01
This study was to elucidate the potential benefit of beta blockers on cancer survival. We comprehensively searched PubMed, Embase, and the Cochrane Library from their inception to April 2013. Two authors independently screened and reviewed the eligibility of each study and coded the participants, treatment, and outcome characteristics. The primary outcomes were overall survival (OS) and disease-free survival (DFS). Twelve studies published between 1993 and 2013 were included in the final analysis. Four papers reported results from 10 independent groups, resulting in a total of 18 comparisons based on data obtained from 20,898 subjects. Effect sizes (hazard ratios, HR) were heterogeneous, and random-effects models were used in the analyses. The meta-analysis demonstrated that beta blocker use is associated with improved OS (HR 0.79; 95 % CI 0.67-0.93; p = 0.004) and DFS (HR 0.69; 95 % CI 0.53-0.91; p = 0.009). Although statistically not significant, the effect size was greater in patients with low-stage cancer or cancer treated primarily with surgery than in patients with high-stage cancer or cancer treated primarily without surgery (HR 0.60 vs. 0.78, and 0.60 vs. 0.80, respectively). Although only two study codes were analyzed, the studies using nonselective beta blockers showed that there was no overall effect on OS (HR 0.52, 95 % CI 0.09-3.04). This meta-analysis provides evidence that beta blocker use can be associated with the prolonged survival of cancer patients, especially patients with early-stage cancer treated primarily with surgery.
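The abstract above pools heterogeneous hazard ratios with a random-effects model. As an illustration of how such pooling is commonly done, the following sketch applies the DerSimonian-Laird estimator to log hazard ratios, recovering each standard error from a reported 95% confidence interval. The choice of DerSimonian-Laird is an assumption (the abstract does not name the estimator), and the function name and inputs are hypothetical.

```python
import math

def random_effects_pool(hrs, ci_lows, ci_highs):
    """DerSimonian-Laird pooling of hazard ratios given their 95% CIs.

    Returns (pooled_hr, ci_low, ci_high, tau2). Requires at least two studies.
    """
    y = [math.log(h) for h in hrs]
    # Standard error of log(HR) recovered from the width of the 95% CI.
    se = [(math.log(hi) - math.log(lo)) / (2 * 1.96)
          for lo, hi in zip(ci_lows, ci_highs)]
    w = [1.0 / s ** 2 for s in se]

    # Fixed-effect estimate and Cochran's Q.
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # Between-study variance, truncated at zero.
    k = len(y)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se_pooled),
            math.exp(pooled + 1.96 * se_pooled),
            tau2)
```

When the studies agree closely, tau2 collapses to zero and the result coincides with the fixed-effect estimate; larger heterogeneity widens the pooled interval and evens out the study weights.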
Improved statistical model checking methods for pathway analysis.
Koh, Chuan Hock; Palaniappan, Sucheendra K; Thiagarajan, P S; Wong, Limsoon
2012-01-01
Statistical model checking techniques have been shown to be effective for approximate model checking on large stochastic systems, where explicit representation of the state space is impractical. Importantly, these techniques ensure the validity of results with statistical guarantees on errors. There is an increasing interest in these classes of algorithms in computational systems biology since analysis using traditional model checking techniques does not scale well. In this context, we present two improvements to existing statistical model checking algorithms. Firstly, we construct an algorithm which removes the need of the user to define the indifference region, a critical parameter in previous sequential hypothesis testing algorithms. Secondly, we extend the algorithm to account for the case when there may be a limit on the computational resources that can be spent on verifying a property; i.e, if the original algorithm is not able to make a decision even after consuming the available amount of resources, we resort to a p-value based approach to make a decision. We demonstrate the improvements achieved by our algorithms in comparison to current algorithms first with a straightforward yet representative example, followed by a real biological model on cell fate of gustatory neurons with microRNAs.
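The sequential hypothesis testing that this abstract builds on can be sketched with Wald's sequential probability ratio test (SPRT), in which the indifference region is the interval (theta - delta, theta + delta) around the probability threshold theta. This is a generic textbook sketch of the baseline, not the improved algorithms proposed by the authors; the parameter names and the sampling budget are assumptions.

```python
import math

def sprt(sample, theta, delta, alpha=0.05, beta=0.05, max_samples=100000):
    """Wald's SPRT for H0: p >= theta + delta versus H1: p <= theta - delta.

    sample      : zero-argument callable returning True when one simulated run
                  satisfies the property
    delta       : half-width of the indifference region (0 < theta - delta,
                  theta + delta < 1)
    alpha, beta : bounds on the two error probabilities
    Returns (verdict, number_of_samples_used).
    """
    p0, p1 = theta + delta, theta - delta
    log_a = math.log((1 - beta) / alpha)   # accept H1 at or above this bound
    log_b = math.log(beta / (1 - alpha))   # accept H0 at or below this bound
    log_ratio = 0.0
    for n in range(1, max_samples + 1):
        if sample():
            log_ratio += math.log(p1 / p0)
        else:
            log_ratio += math.log((1 - p1) / (1 - p0))
        if log_ratio >= log_a:
            return "p <= theta - delta", n
        if log_ratio <= log_b:
            return "p >= theta + delta", n
    # Budget exhausted without a decision.
    return "undecided", max_samples
```

The "undecided" outcome corresponds to the resource-limited situation the abstract describes, where some fallback decision rule is needed; a narrower indifference region makes that outcome more likely because more samples are required to separate the two hypotheses.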
Layton, Danielle M; Clarke, Michael; Walton, Terry R
2012-01-01
This systematic review reports on the survival of feldspathic porcelain veneers. The Cochrane Library, MEDLINE (OVID), Embase, Web of Knowledge, selected journals, clinical trials registers, and conference proceedings were searched independently by two reviewers. Academic colleagues were also contacted to identify relevant research. Inclusion criteria were human cohort studies (prospective and retrospective) and controlled trials assessing outcomes of feldspathic porcelain veneers in more than 15 patients and with at least some of the veneers in situ for 5 years. Of 4,294 articles identified, 116 studies underwent full-text screenings and 69 were further reviewed for eligibility. Of these, 11 were included in the qualitative analysis and 6 (5 cohorts) were included in meta-analyses. Estimated cumulative survival and standard error for each study were assessed and used for meta-, sensitivity, and post hoc analyses. The I² statistic and the Cochran Q test and its associated P value were used to evaluate statistical heterogeneity, with a random-effects meta-analysis used when the P value for heterogeneity was less than .1. Galbraith, forest, and funnel plots explored heterogeneity, publication patterns, and small study biases. The estimated cumulative survival for feldspathic porcelain veneers was 95.7% (95% confidence interval [CI]: 92.9% to 98.4%) at 5 years and ranged from 64% to 95% at 10 years across three studies. A post hoc meta-analysis indicated that the 10-year best estimate may approach 95.6% (95% CI: 93.8% to 97.5%). High levels of statistical heterogeneity were found. When bonded to enamel substrate, feldspathic porcelain veneers have a very high 10-year survival rate that may approach 95%. Clinical heterogeneity is associated with differences in reported survival rates. Use of clinically relevant survival definitions and careful reporting of tooth characteristics, censorship, clustering, and precise results in future research would improve metaanalytic
Vibroacoustic optimization using a statistical energy analysis model
Culla, Antonio; D'Ambrogio, Walter; Fregolent, Annalisa; Milana, Silvia
2016-08-01
In this paper, an optimization technique for medium-high frequency dynamic problems based on the Statistical Energy Analysis (SEA) method is presented. Using a SEA model, the subsystem energies are controlled by internal loss factors (ILFs) and coupling loss factors (CLFs), which in turn depend on the physical parameters of the subsystems. A preliminary sensitivity analysis of subsystem energy to the CLFs is performed to select the CLFs that most affect the subsystem energies. Since the injected power depends not only on the external loads but also on the physical parameters of the subsystems, it must be taken into account under certain conditions. This is accomplished in the optimization procedure, where approximate relationships between CLFs, injected power and physical parameters are derived. The approach is applied to a typical aeronautical structure: the cabin of a helicopter.
Topics in statistical data analysis for high-energy physics
Cowan, G.
2013-06-27
These lectures concern two topics that are becoming increasingly important in the analysis of High Energy Physics (HEP) data: Bayesian statistics and multivariate methods. In the Bayesian approach we extend the interpretation of probability to cover not only the frequency of repeatable outcomes but also to include a degree of belief. In this way we are able to associate probability with a hypothesis and thus to answer directly questions that cannot be addressed easily with traditional frequentist methods. In multivariate analysis we try to exploit as much information as possible from the characteristics that we measure for each event to distinguish between event types. In particular we will look at a method that has gained popularity in HEP in recent years: the boosted decision tree (BDT).
On Understanding Statistical Data Analysis in Higher Education
Montalbano, Vera
2012-01-01
Data analysis is a powerful tool in all experimental sciences. Statistical methods such as sampling theory, the computer technologies necessary for handling large amounts of data, and skill in analysing the information contained in different types of graphs are all competences necessary for achieving an in-depth data analysis. In higher education, these topics are usually fragmented across different courses, interdisciplinary integration can be lacking, and the caution needed in using these topics can be missing or misunderstood. Students are often obliged to acquire these skills by themselves during the preparation of the final experimental thesis. A proposal for a learning path on nuclear phenomena is presented in order to develop these scientific competences in physics courses. An introduction to radioactivity and nuclear phenomenology is followed by measurements of natural radioactivity. Background and weak sources can be monitored for a long time in a physics laboratory. The data are collected and analyzed in a computer lab i...
Statistical learning analysis in neuroscience: aiming for transparency
Directory of Open Access Journals (Sweden)
Michael Hanke
2010-05-01
Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires "neuroscience-aware" technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here we review its features and applicability to various neural data modalities.
First statistical analysis of Geant4 quality software metrics
Ronchieri, Elisabetta; Grazia Pia, Maria; Giacomini, Francesco
2015-12-01
Geant4 is a simulation system of particle transport through matter, widely used in several experimental areas from high energy physics and nuclear experiments to medical studies. Some of its applications may involve critical use cases; therefore they would benefit from an objective assessment of the software quality of Geant4. In this paper, we provide a first statistical evaluation of software metrics data related to a set of Geant4 physics packages. The analysis aims at identifying risks for Geant4 maintainability, which would benefit from being addressed at an early stage. The findings of this pilot study set the grounds for further extensions of the analysis to the whole of Geant4 and to other high energy physics software systems.
Using Statistical Analysis Software to Advance Nitro Plasticizer Wettability
Energy Technology Data Exchange (ETDEWEB)
Shear, Trevor Allan [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-08-29
Statistical analysis in science is an extremely powerful tool that is often underutilized. Additionally, it is frequently the case that data is misinterpreted or not used to its fullest extent. Utilizing the advanced software JMP®, many aspects of experimental design and data analysis can be evaluated and improved. This overview will detail the features of JMP® and how they were used to advance a project, resulting in time and cost savings, as well as the collection of scientifically sound data. The project analyzed in this report addresses the inability of a nitro plasticizer to coat a gold coated quartz crystal sensor used in a quartz crystal microbalance. Through the use of the JMP® software, the wettability of the nitro plasticizer was increased by over 200% using an atmospheric plasma pen, ensuring good sample preparation and reliable results.
[Analysis of complaints in primary care using statistical process control].
Valdivia Pérez, Antonio; Arteaga Pérez, Lourdes; Escortell Mayor, Esperanza; Monge Corella, Susana; Villares Rodríguez, José Enrique
2009-08-01
To analyze patient complaints in a Primary Health Care District (PHCD) using statistical process control methods compared to multivariate methods, as regards their results and feasibility of application in this context. Descriptive study based on an aggregate analysis of administrative complaints. Complaints received between January 2005 and August 2008 in the Customer Management Department in the 3rd PHCD Management Office, Madrid Health Services. Complaints are registered through Itrack, a computer software tool used throughout the whole Community of Madrid. Total number of complaints, complaints sorted by Reason and Primary Health Care Team (PHCT), total number of patient visits (including visits on demand, appointment visits and home visits) and visits by PHCT and per month and year. Multivariate analysis and control charts were used. 44-month time series with a mean of 76 complaints per month, an increasing trend in the first three years and decreasing during summer months. Poisson regression detected an excess of complaints in 8 out of the 44 months in the series. The control chart detected the same 8 months plus two additional ones. Statistical process control can be useful for detecting an excess of complaints in a PHCD and enables comparisons to be made between different PHC teams. As it is a simple technique, it can be used for ongoing monitoring of customer perceived quality.
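The control-chart idea in the abstract above can be sketched with a simple c-chart for monthly complaint counts. This is an assumption-laden illustration, not the authors' procedure: it treats the counts as Poisson with a constant mean, ignores the number of patient visits as a denominator, and uses the conventional 3-sigma limits. The function name `c_chart_signals` and the counts in the usage note are hypothetical.

```python
import math


def c_chart_signals(counts, sigma=3.0):
    """Return (center, lower, upper, flagged_indices) for a c-chart.

    Assumes Poisson-distributed counts, so the variance equals the mean.
    """
    center = sum(counts) / len(counts)
    half_width = sigma * math.sqrt(center)
    upper = center + half_width
    # A count cannot be negative, so the lower limit is floored at zero.
    lower = max(0.0, center - half_width)
    # Flag the months whose count exceeds the upper control limit.
    flagged = [i for i, c in enumerate(counts) if c > upper]
    return center, lower, upper, flagged
```

With nine months of 10 complaints followed by one month of 30, the centre line is 12 and only the last month (index 9) exceeds the upper limit. A chart that divides by visits per month (a u-chart) would be the natural refinement when activity varies between months.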
Using zebrafish to learn statistical analysis and Mendelian genetics.
Lindemann, Samantha; Senkler, Jon; Auchter, Elizabeth; Liang, Jennifer O
2011-06-01
This project was developed to promote understanding of how mathematics and statistical analysis are used as tools in genetic research. It gives students the opportunity to carry out hypothesis-driven experiments in the classroom: students generate hypotheses about Mendelian and non-Mendelian inheritance patterns, gather raw data, and test their hypotheses using chi-square statistical analysis. In the first protocol, students are challenged to analyze inheritance patterns using GloFish, brightly colored, commercially available, transgenic zebrafish that express Green, Yellow, or Red Fluorescent Protein throughout their muscles. In the second protocol, students learn about genetic screens, microscopy, and developmental biology by analyzing the inheritance patterns of mutations that cause developmental defects. The difficulty of the experiments can be adapted for middle school to upper level undergraduate students. Since the GloFish experiments use only fish and materials that can be purchased from pet stores, they should be accessible to many schools. For each protocol, we provide detailed instructions, ideas for how the experiments fit into an undergraduate curriculum, raw data, and example analyses. Our plan is to have these protocols form the basis of a growing and adaptable educational tool available on the Zebrafish in the Classroom Web site.
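The chi-square test of a Mendelian hypothesis mentioned in the abstract above can be sketched as a goodness-of-fit calculation. The function name `chi_square_gof`, the offspring counts in the usage note and the small table of 5% critical values are illustrative assumptions, not material from the published protocols.

```python
def chi_square_gof(observed, expected_ratio):
    """Return (chi-square statistic, degrees of freedom).

    observed: offspring counts per phenotype class.
    expected_ratio: hypothesised ratio, e.g. [3, 1] for a monohybrid cross.
    """
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    # Expected counts under the hypothesised ratio.
    expected = [total * r / ratio_sum for r in expected_ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


# Standard chi-square critical values at the 5% level, keyed by df.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def rejects_hypothesis(observed, expected_ratio):
    """True if the data are inconsistent with the ratio at the 5% level."""
    stat, df = chi_square_gof(observed, expected_ratio)
    return stat > CRITICAL_05[df]
```

For hypothetical counts of 60 fluorescent and 40 non-fluorescent fish tested against a 3:1 ratio, the statistic is 12.0 on 1 degree of freedom, which exceeds 3.841, so the 3:1 hypothesis would be rejected. Counts of 75 and 25 give a statistic of 0 and no rejection.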
A biologist's guide to statistical thinking and analysis.
Fay, David S; Gerow, Ken
2013-07-09
The proper understanding and use of statistical tools are essential to the scientific enterprise. This is true both at the level of designing one's own experiments as well as for critically evaluating studies carried out by others. Unfortunately, many researchers who are otherwise rigorous and thoughtful in their scientific approach lack sufficient knowledge of this field. This methods chapter is written with such individuals in mind. Although the majority of examples are drawn from the field of Caenorhabditis elegans biology, the concepts and practical applications are also relevant to those who work in the disciplines of molecular genetics and cell and developmental biology. Our intent has been to limit theoretical considerations to a necessary minimum and to use common examples as illustrations for statistical analysis. Our chapter includes a description of basic terms and central concepts and also contains in-depth discussions on the analysis of means, proportions, ratios, probabilities, and correlations. We also address issues related to sample size, normality, outliers, and non-parametric approaches.
Pattern recognition in menstrual bleeding diaries by statistical cluster analysis
Directory of Open Access Journals (Sweden)
Wessel Jens
2009-07-01
Background: The aim of this paper is to empirically identify a treatment-independent statistical method to describe clinically relevant bleeding patterns by using bleeding diaries of clinical studies on various sex hormone containing drugs. Methods: We used four cluster analysis methods (single, average and complete linkage, as well as the method of Ward) for pattern recognition in menstrual bleeding diaries. The optimal number of clusters was determined using the semi-partial R², the cubic cluster criterion, the pseudo-F and the pseudo-t² statistics. Finally, the interpretability of the results from a gynecological point of view was assessed. Results: The method of Ward yielded distinct clusters of the bleeding diaries. The other methods successively chained the observations into one cluster. The optimal number of distinctive bleeding patterns was six. We found two desirable and four undesirable bleeding patterns. Cyclic and non-cyclic bleeding patterns were well separated. Conclusion: Using this cluster analysis with the method of Ward, medications and devices having an impact on bleeding can be easily compared and categorized.
Design and statistical analysis of oral medicine studies: common pitfalls.
Baccaglini, L; Shuster, J J; Cheng, J; Theriaque, D W; Schoenbach, V J; Tomar, S L; Poole, C
2010-04-01
A growing number of articles are emerging in the medical and statistics literature that describe epidemiologic and statistical flaws of research studies. Many examples of these deficiencies are encountered in the oral, craniofacial, and dental literature. However, only a handful of methodologic articles have been published in the oral literature warning investigators of potential errors that may arise early in the study and that can irreparably bias the final results. In this study, we briefly review some of the most common pitfalls that our team of epidemiologists and statisticians has identified during the review of submitted or published manuscripts and research grant applications. We use practical examples from the oral medicine and dental literature to illustrate potential shortcomings in the design and analysis of research studies, and how these deficiencies may affect the results and their interpretation. A good study design is essential, because errors in the analysis can be corrected if the design was sound, but flaws in study design can lead to data that are not salvageable. We recommend consultation with an epidemiologist or a statistician during the planning phase of a research study to optimize study efficiency, minimize potential sources of bias, and document the analytic plan.
Survival analysis of preweaning piglet survival in a dry-cured ham-producing crossbred line.
Cecchinato, A; Bonfatti, V; Gallo, L; Carnier, P
2008-10-01
The aim of this study was to investigate piglet preweaning survival and its relationship with a total merit index (TMI) used for selection of Large White terminal boars for dry-cured ham production. Data on 13,924 crossbred piglets (1,347 litters), originated by 189 Large White boars and 328 Large White-derived crossbred sows, were analyzed under a frailty proportional hazards model, assuming different baseline hazard functions and including sire and nursed litter as random effects. Estimated hazard ratios (HR) indicated that sex, cross-fostering, year-month of birth, parity of the nurse sow, size of the nursed litter, and class of TMI were significant effects for piglet preweaning survival. Female piglets had less risk of dying than males (HR = 0.81), as well as cross-fostered piglets (HR = 0.60). Survival increased when piglets were nursed by sows of third (HR = 0.85), fourth (HR = 0.76), and fifth (HR = 0.79) parity in comparison with first and second parity sows. Piglets of small (HR = 3.90) or very large litters (HR >1.60) had less chance of surviving in comparison with litters of intermediate size. Class of TMI exhibited an unfavorable relationship with survival (HR = 1.20 for the TMI top class). The modal estimates of sire variance under different baseline hazard functions were 0.06, whereas the variance for the nursed litter was close to 0.7. The estimate of the nursed litter effect variance was greater than that of the sire, which shows the importance of the common environment generated by the nurse sow. Relationships between sire rankings obtained from different survival models were high. The heritability estimate in equivalent scale was low and reached a value of 0.03. Nevertheless, the exploitable genetic variation for this trait justifies the inclusion of piglet preweaning survival in the current breeding program for selection of Large White terminal boars for dry-cured ham production.
Thermal analysis of ice and glass transitions in insects that do and do not survive freezing.
Rozsypal, Jan; Moos, Martin; Šimek, Petr; Koštál, Vladimír
2018-03-01
Some insects rely on the strategy of freeze tolerance for winter survival. During freezing, extracellular body water transitions from the liquid to solid phase and cells undergo freeze-induced dehydration. Here we present results of a thermal analysis (from differential scanning calorimetry) of ice fraction dynamics during gradual cooling after inoculative freezing in variously acclimated larvae of two drosophilid flies, Drosophila melanogaster and Chymomyza costata. Although the species and variants ranged broadly between 0 and close to 100% survival of freezing, there were relatively small differences in ice fraction dynamics. For instance, the maximum ice fraction (IFmax) ranged between 67.9 and 77.7% total body water (TBW). The C. costata larvae showed statistically significant phenotypic shifts in parameters of ice fraction dynamics (melting point and IFmax) upon entry into diapause, cold-acclimation, and feeding on a proline-augmented diet. These differences were mostly driven by colligative effects of accumulated proline (ranging between 6 and 487 mmol·kg⁻¹ TBW) and other metabolites. Our data suggest that these colligative effects per se do not represent a sufficient mechanistic explanation for high freeze tolerance observed in diapausing, cold-acclimated C. costata larvae. Instead, we hypothesize that accumulated proline exerts its protective role via a combination of mechanisms. Specifically, we found a tight association between proline-induced stimulation of glass transition in partially-frozen body liquids (vitrification) and survival of cryopreservation in liquid nitrogen. © 2018. Published by The Company of Biologists Ltd.
DEFF Research Database (Denmark)
Holbech, Henrik
Since 2012, European experts work towards the development and validation of an OECD test guideline for mollusc reproductive toxicity with the freshwater gastropod Lymnaea stagnalis. A ring-test involving six laboratories allowed studying reproducibility of results, based on survival and reproduction data of snails monitored over 56 days exposure to cadmium. A classical statistical analysis of data was initially conducted by hypothesis tests and fit of parametric concentration-response models. However, as mortality occurred in exposed snails, these analyses require to be refined, particularly... was twofold. First, we refined the statistical analyses of reproduction data accounting for mortality all along the test period. The variable “number of clutches/eggs produced per individual-day” was used for ECx modelling, as classically done in epidemiology in order to account for the time...
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Anghelache
2006-01-01
Using short-run statistical indicators is a compulsory requirement of current analysis. To this end, EUROSTAT has set up a system of short-run indicators and recommends its use by the member countries. On the basis of these indicators, regular (usually monthly) analyses are carried out in respect of: the determination of production dynamics; the evaluation of the short-run investment volume; the development of turnover; wage evolution; employment; the price indexes and the consumer price index (inflation); the volume of exports and imports; the extent to which imports are covered by exports; and the trade balance. The EUROSTAT system of indicators of conjuncture is conceived as an open system, so that it can be extended or restricted at any moment, allowing indicators to be amended or even removed, depending on domestic users' requirements as well as on the specific requirements of harmonization and integration. For short-run analysis, there is also the World Bank system of indicators of conjuncture, which relies on the data sources offered by the World Bank, the World Institute for Resources and the statistics of other international organizations. The system comprises indicators of social and economic development and focuses on indicators for the following three fields: human resources, environment and economic performance. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.
Theoretical assessment of image analysis: statistical vs structural approaches
Lei, Tianhu; Udupa, Jayaram K.
2003-05-01
Statistical and structural methods are two major approaches commonly used in image analysis and have demonstrated considerable success. The former is based on statistical properties and stochastic models of the image and the latter utilizes geometric and topological models. In this study, Markov random field (MRF) theory/model based image segmentation and Fuzzy Connectedness (FC) theory/fuzzy connected object delineation are chosen as the representatives for these two approaches, respectively. The comparative study is focused on their theoretical foundations and main operative procedures. The MRF is defined on a lattice and the associated neighborhood system and is based on the Markov property. The FC method is defined on a fuzzy digital space and is based on fuzzy relations. Locally, MRF is characterized by potentials of cliques, and FC is described by fuzzy adjacency and affinity relations. Globally, MRF is characterized by the Gibbs distribution, and FC is described by fuzzy connectedness. The task of MRF model based image segmentation is to seek a realization of the embedded MRF through a two-level operation: partitioning and labeling. The task of FC object delineation is to extract a fuzzy object from a given scene through a two-step operation: recognition and delineation. The theoretical foundations which underlie the statistical and structural approaches and the principles of the main operative procedures in image segmentation by these two approaches demonstrate more similarities than differences between them. The two approaches can also complement each other, particularly in seed selection, scale formation, affinity and object membership function design for FC and neighbor set selection and clique potential design for MRF.
The system for statistical analysis of logistic information
Directory of Open Access Journals (Sweden)
Khayrullin Rustam Zinnatullovich
2015-05-01
The current problem for managers in logistic and trading companies is the task of improving the operational business performance and developing the logistics support of sales. The development of logistics sales supposes development and implementation of a set of works for the development of the existing warehouse facilities, including both a detailed description of the work performed, and the timing of their implementation. Logistics engineering of a warehouse complex includes such tasks as: determining the number and the types of technological zones, calculation of the required number of loading-unloading places, development of storage structures, development of pre-sales preparation zones, development of specifications of storage types, selection of loading-unloading equipment, detailed planning of the warehouse logistics system, creation of architectural-planning decisions, selection of information-processing equipment, etc. The currently used ERP and WMS systems did not allow us to solve the full list of logistics engineering problems. In this regard, the development of specialized software products, taking into account the specifics of warehouse logistics, and subsequent integration of this software with ERP and WMS systems seems to be a current task. In this paper we suggest a system of statistical analysis of logistics information, designed to meet the challenges of logistics engineering and planning. The proposed specialized software is designed to improve the efficiency of the operating business and the development of logistics support of sales. The system is based on the methods of statistical data processing, the methods of assessment and prediction of logistics performance, the methods for the determination and calculation of the data required for registration, storage and processing of metal products, as well as the methods for planning the reconstruction and development
Jones, James H; Brown, Alison; Moyse, Daniel; Qi, Wenjing; Roy, Lance
2017-11-01
Electrical stimulation of the greater occipital nerves is performed to treat pain secondary to chronic daily headaches and occipital neuralgia. The use of fluoroscopy alone to guide the surgical placement of electrodes near the greater occipital nerves disregards the impact of tissue planes on lead stability and stimulation efficacy. We hypothesized that occipital neurostimulator (ONS) leads placed with ultrasonography combined with fluoroscopy would demonstrate increased survival rates and times when compared to ONS leads placed with fluoroscopy alone. A 2-arm retrospective chart review. A single academic medical center. This retrospective chart review analyzed the procedure notes and demographic data of patients who underwent the permanent implant of an ONS lead between July 2012 and August 2015. Patient data included the diagnosis (reason for implant), smoking tobacco use, disability, and age. ONS lead data included the date of permanent implant, the imaging modality used during permanent implant (fluoroscopy with or without ultrasonography), and, if applicable, the date and reason for lead removal. A total of 21 patients (53 leads) were included for the review. Chi-squared tests, Fisher's exact tests, 2-sample t-tests, and Wilcoxon rank-sum tests were used to compare fluoroscopy against combined fluoroscopy and ultrasonography as implant methods with respect to patient demographics. These tests were also used to evaluate the primary aim of this study, which was to compare the survival rates and times of ONS leads placed with combined ultrasonography and fluoroscopy versus those placed with fluoroscopy alone. Survival analysis was used to assess the effect of implant method, adjusted for patient demographics (age, smoking tobacco use, and disability), on the risk of lead explant. Data from 21 patients were collected, including a total of 53 ONS leads. There was no statistically significant difference in the lead survival rate or time, disability, or patient age
Statistical analysis of the autoregressive modeling of reverberant speech.
Gaubitch, Nikolay D; Ward, Darren B; Naylor, Patrick A
2006-12-01
Hands-free speech input is required in many modern telecommunication applications that employ autoregressive (AR) techniques such as linear predictive coding. When the hands-free input is obtained in enclosed reverberant spaces such as typical office rooms, the speech signal is distorted by the room transfer function. This paper utilizes theoretical results from statistical room acoustics to analyze the AR modeling of speech under these reverberant conditions. Three cases are considered: (i) AR coefficients calculated from a single observation; (ii) AR coefficients calculated jointly from an M-channel observation (M > 1); and (iii) AR coefficients calculated from the output of a delay-and-sum beamformer. The statistical analysis, with supporting simulations, shows that the spatial expectations of the AR coefficients for cases (i) and (ii) are approximately equal to the coefficients of the original speech, while for case (iii) there is a discrepancy due to spatial correlation between the microphones, which can be significant. It is subsequently demonstrated that at each individual source-microphone position (without spatial expectation), the M-channel AR coefficients from case (ii) provide the best approximation to the clean speech coefficients when microphones are closely spaced (<0.3 m).
Statistical analysis of the breaking processes of Ni nanowires
Energy Technology Data Exchange (ETDEWEB)
Garcia-Mochales, P [Departamento de Fisica de la Materia Condensada, Facultad de Ciencias, Universidad Autonoma de Madrid, c/ Francisco Tomas y Valiente 7, Campus de Cantoblanco, E-28049-Madrid (Spain); Paredes, R [Centro de Fisica, Instituto Venezolano de Investigaciones Cientificas, Apartado 20632, Caracas 1020A (Venezuela); Pelaez, S; Serena, P A [Instituto de Ciencia de Materiales de Madrid, Consejo Superior de Investigaciones Cientificas, c/ Sor Juana Ines de la Cruz 3, Campus de Cantoblanco, E-28049-Madrid (Spain)], E-mail: pedro.garciamochales@uam.es
2008-06-04
We have performed a massive statistical analysis on the breaking behaviour of Ni nanowires using molecular dynamic simulations. Three stretching directions, five initial nanowire sizes and two temperatures have been studied. We have constructed minimum cross-section histograms and analysed for the first time the role played by monomers and dimers. The shape of such histograms and the absolute number of monomers and dimers strongly depend on the stretching direction and the initial size of the nanowire. In particular, the statistical behaviour of the breakage final stages of narrow nanowires strongly differs from the behaviour obtained for large nanowires. We have analysed the structure around monomers and dimers. Their most probable local configurations differ from those usually appearing in static electron transport calculations. Their non-local environments show disordered regions along the nanowire if the stretching direction is [100] or [110]. Additionally, we have found that, at room temperature, [100] and [110] stretching directions favour the appearance of non-crystalline staggered pentagonal structures. These pentagonal Ni nanowires are reported in this work for the first time. This set of results suggests that experimental Ni conducting histograms could show a strong dependence on the orientation and temperature.
Analysis of filament statistics in fast camera data on MAST
Farley, Tom; Militello, Fulvio; Walkden, Nick; Harrison, James; Silburn, Scott; Bradley, James
2017-10-01
Coherent filamentary structures have been shown to play a dominant role in turbulent cross-field particle transport [D'Ippolito 2011]. An improved understanding of filaments is vital in order to control scrape-off layer (SOL) density profiles and thus control first wall erosion, impurity flushing and coupling of radio frequency heating in future devices. The Elzar code [T. Farley, 2017 in prep.] is applied to MAST data. The code uses information about the magnetic equilibrium to calculate the intensity of light emission along field lines as seen in the camera images, as a function of the field lines' radial and toroidal locations at the mid-plane. In this way a 'pseudo-inversion' of the intensity profiles in the camera images is achieved from which filaments can be identified and measured. In this work, a statistical analysis of the intensity fluctuations along field lines in the camera field of view is performed using techniques similar to those typically applied in standard Langmuir probe analyses. These filament statistics are interpreted in terms of the theoretical ergodic framework presented by F. Militello & J.T. Omotani, 2016, in order to better understand how time averaged filament dynamics produce the more familiar SOL density profiles. This work has received funding from the RCUK Energy programme (Grant Number EP/P012450/1), from Euratom (Grant Agreement No. 633053) and from the EUROfusion consortium.
Shetty, Ashish; Kaiwar, Anjali; Shubhashini, N; Ashwini, P; Naveen, DN; Adarsha, MS; Shetty, Mitha; Meena, N
2011-01-01
Background: Veneer restorations provide a valid conservative alternative to complete coverage as they avoid aggressive dental preparation, thus maintaining tooth structure. Initially, laminates were placed on the unprepared tooth surface. Although there is as yet no consensus as to whether or not teeth should be prepared for laminate veneers, currently, more conservative preparations have been advocated. Because of their esthetic appeal, biocompatibility and adherence to the physiology of minimal-invasive dentistry, porcelain laminate veneers have now become a restoration of choice. Currently, there is a lack of clinical consensus regarding the type of design preferred for laminates. Widely varying survival rates and methods for its estimation have been reported for porcelain veneers over approximately 2–10 years. Relatively few studies have been reported in the literature that use survival estimates, which allow for valid study comparisons between the types of preparation designs used. No survival analysis has been undertaken for the designs used. The purpose of this article is to attempt to review the survival rates of veneers based on different incisal preparation designs from both clinical and non-clinical studies. Aims and Objectives: The purpose of this study is to review both clinical and non-clinical studies to determine the survival rates of veneers based on different incisal preparation designs. A further objective of the study is to understand which is the most successful design in terms of preparation. Materials and Methods: This study evaluated the existing literature on survival rates of veneers based on incisal preparation designs. The search strategy involved MEDLINE, BITTORRENT and other databases. Statistical Analysis: Data were tabulated. Because of variability in the follow-up period in different studies, the follow-up period was extrapolated to 10 years in common for all of them. Accordingly, the failure rate was then estimated and The
Survival analysis of HIV-infected patients under antiretroviral ...
African Journals Online (AJOL)
Abstract. Background: The introduction of ART dramatically improved the survival and health quality of HIV-infected patients in the industrialized world; and the survival benefit of ART has been well studied too. However, in resource-poor settings, where such treatment was started only recently, limited data exist on treatment ...
A Statistical Analysis of Cointegration for I(2) Variables
DEFF Research Database (Denmark)
Johansen, Søren
1995-01-01
This paper discusses inference for I(2) variables in a VAR model. The estimation procedure suggested consists of two reduced rank regressions. The asymptotic distribution of the proposed estimators of the cointegrating coefficients is mixed Gaussian, which implies that asymptotic inference can be conducted using the χ² distribution. It is shown to what extent inference on the cointegration ranks can be conducted using the tables already prepared for the analysis of cointegration of I(1) variables. New tables are needed for the test statistics to control the size of the tests. This paper contains a multivariate test for the existence of I(2) variables. This test is illustrated using a data set consisting of U.K. and foreign prices and interest rates as well as the exchange rate.
Analysis of Official Suicide Statistics in Spain (1910-2011)
Directory of Open Access Journals (Sweden)
2017-01-01
In this article we examine the evolution of suicide rates in Spain from 1910 to 2011. As something new, we use standardised suicide rates, making them perfectly comparable geographically and in time, as they no longer reflect population structure. Using historical data from a series of socioeconomic variables for all Spain's provinces and applying new techniques for the statistical analysis of panel data, we are able to confirm many of the hypotheses established by Durkheim at the end of the 19th century, especially those related to fertility and marriage rates, age, sex and the aging index. Our findings, however, contradict Durkheim's approach regarding the impact of urbanisation processes and poverty on suicide.
LONG TERM SURVIVAL FOLLOWING TRAUMATIC BRAIN INJURY: A POPULATION BASED PARAMETRIC SURVIVAL ANALYSIS
Fuller, Gordon Ward; Ransom, Jeanine; Mandrekar, Jay; Brown, Allen W
2017-01-01
Background Long term mortality may be increased following traumatic brain injury (TBI); however the degree to which survival could be reduced is unknown. We aimed to model life expectancy following post-acute TBI to provide predictions of longevity and quantify differences in survivorship with the general population. Methods A population based retrospective cohort study using data from the Rochester Epidemiology Project (REP) was performed. A random sample of patients from Olmsted County, Minnesota with a confirmed TBI between 1987 and 2000 was identified and vital status determined in 2013. Parametric survival modelling was then used to develop a model to predict life expectancy following TBI conditional on age at injury. Survivorship following TBI was also compared with the general population and age and gender matched non-head injured REP controls. Results 769 patients were included in complete case analyses. Median follow up time was 16.1 years (IQR 9.0–20.4) with 120 deaths occurring in the cohort during the study period. Survival after acute TBI was well represented by a Gompertz distribution. Victims of TBI surviving for at least 6 months post-injury demonstrated a much higher ongoing mortality rate compared to the US general population and non-TBI controls (hazard ratio 1·47, 95% CI 1·15–1·87). US general population cohort life table data was used to update the Gompertz model’s shape and scale parameters to account for cohort effects and allow prediction of life expectancy in contemporary TBI. Conclusions Survivors of TBI have decreased life expectancy compared to the general population. This may be secondary to the head injury itself or result from patient characteristics associated with both the propensity for TBI and increased early mortality. Post-TBI life expectancy estimates may be useful to guide prognosis, in public health planning, for actuarial applications and in the extrapolation of outcomes for TBI economic models. PMID:27165161
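As an illustration of the parametric model named in this abstract, the following is a minimal sketch of a Gompertz survival curve and the life expectancy it implies. It assumes the common parametrisation with hazard h(t) = rate * exp(shape * t); the abstract does not state which parametrisation the authors used, and the function names and any parameter values passed in are hypothetical, not the fitted estimates.

```python
import math


def gompertz_hazard(t, shape, rate):
    """Hazard h(t) = rate * exp(shape * t) (assumed parametrisation)."""
    return rate * math.exp(shape * t)


def gompertz_survival(t, shape, rate):
    """Survival S(t) = exp(-(rate / shape) * (exp(shape * t) - 1))."""
    if shape == 0:
        # With zero shape the Gompertz model reduces to the exponential model.
        return math.exp(-rate * t)
    return math.exp(-(rate / shape) * math.expm1(shape * t))


def life_expectancy(shape, rate, horizon=120.0, step=0.01):
    """Approximate the area under S(t) on [0, horizon] by the trapezoid rule."""
    n = round(horizon / step)
    total = 0.0
    prev = 1.0  # S(0) = 1
    for i in range(1, n + 1):
        cur = gompertz_survival(i * step, shape, rate)
        total += 0.5 * (prev + cur) * step
        prev = cur
    return total
```

Conditioning on age at injury, as the abstract describes, would amount to fitting separate or covariate-dependent parameters; that step is not sketched here.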
Statistical energy analysis for a compact refrigeration compressor
Lim, Ji Min; Bolton, J. Stuart; Park, Sung-Un; Hwang, Seon-Woong
2005-09-01
Traditionally the prediction of the vibrational energy level of the components in a compressor is accomplished by using a deterministic model such as a finite element model. While a deterministic approach requires much detail and computational time for a complete dynamic analysis, statistical energy analysis (SEA) requires much less information and computing time. All of these benefits can be obtained by using data averaged over the frequency and spatial domains instead of the direct use of deterministic data. In this paper, SEA will be applied to a compact refrigeration compressor for the prediction of dynamic behavior of each subsystem. Since the compressor used in this application is compact and stiff, the modal densities of its various components are low, especially in the low frequency ranges, and most energy transfers in these ranges are achieved through the indirect coupling paths instead of via direct coupling. For this reason, experimental SEA (ESEA), a good tool for the consideration of the indirect coupling, was used to derive an SEA formulation. Direct comparison of SEA results and experimental data for an operating compressor will be introduced. The power transfer path analysis at certain frequencies made possible by using SEA will be also described to show the advantage of SEA in this application.
Spectral signature verification using statistical analysis and text mining
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature to arrive at a final qualitative assessment: the textual meta-data and the numerical spectral data. Results associated with the spectral data stored in the Signature Database (SigDB) are proposed. The numerical data comprising a sample material's spectrum is validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum is qualitatively analyzed using lexical analysis text mining. This technique analyzes the syntax of the meta-data to provide local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security (text encryption/decryption), biomedical, and marketing applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high and low quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is
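The spectral angle mapper comparison mentioned in this abstract can be sketched as follows. This is the standard SAM definition (the angle between two spectra treated as vectors), applied to a test spectrum and the mean of a population set; the function names are hypothetical and the abstract does not describe how the authors turn angles into a quality ranking.

```python
import math


def mean_spectrum(population):
    """Band-wise mean of a list of equal-length spectra."""
    n = len(population)
    return [sum(band) / n for band in zip(*population)]


def spectral_angle(a, b):
    """Angle in radians between two spectra; 0 means identical shape."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)
```

Because the angle ignores overall scale, a spectrum that differs from the population mean only by a constant gain factor scores close to zero.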
Classification of Malaysia aromatic rice using multivariate statistical analysis
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-05-01
Aromatic rice (Oryza sativa L.) is considered the best quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: they require lengthy training, are prone to fatigue as the number of samples increases, and are inconsistent. GC-MS analysis, on the other hand, requires detailed procedures and lengthy analysis and is quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual inspection of the PCA and LDA plots shows that the instrument was able to separate the samples into different clusters. The LDA and KNN results, with low misclassification error, support these findings, and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
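The leave-one-out evaluation of a KNN classifier described in this abstract can be sketched in a few lines. This is a generic illustration, not the authors' pipeline: the two-dimensional feature vectors and the labels "A" and "B" below are made-up stand-ins for e-nose sensor responses and rice varieties.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_error(xs, ys, k=3):
    """Hold out each sample in turn, classify it from the rest, count misses."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if knn_predict(train_x, train_y, xs[i], k) != ys[i]:
            errors += 1
    return errors / len(xs)


# Hypothetical, well-separated toy data (not measurements from the study).
features = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
loo_error = leave_one_out_error(features, labels, k=3)
```

With only a small number of samples per variety, leave-one-out uses almost all data for training in each round, which is a common reason to prefer it over a single train/test split.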
Classification of Malaysia aromatic rice using multivariate statistical analysis
Energy Technology Data Exchange (ETDEWEB)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A. [School of Mechatronic Engineering, Universiti Malaysia Perlis, Kampus Pauh Putra, 02600 Arau, Perlis (Malaysia); Omar, O. [Malaysian Agriculture Research and Development Institute (MARDI), Persiaran MARDI-UPM, 43400 Serdang, Selangor (Malaysia)
2015-05-15
Aromatic rice (Oryza sativa L.) is considered the best quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: they require lengthy training, are prone to fatigue as the number of samples increases, and are inconsistent. GC-MS analysis, on the other hand, requires detailed procedures and lengthy analysis and is quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual inspection of the PCA and LDA plots shows that the instrument was able to separate the samples into different clusters. The LDA and KNN results, with low misclassification error, support these findings, and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
A Statistic Analysis Of Romanian Seaside Hydro Tourism
Secara Mirela
2011-01-01
Tourism represents one of the ways of spending spare time for rest, recreation, treatment and entertainment, and the specific aspect of Constanta County economy is touristic and spa capitalization of Romanian seaside. In order to analyze hydro tourism on Romanian seaside we have used statistic indicators within tourism as well as statistic methods such as chronological series, interdependent statistic series, regression and statistic correlation. The major objective of this research is to rai...
Damiani, Lucas Petri; Berwanger, Otavio; Paisani, Denise; Laranjeira, Ligia Nasi; Suzumura, Erica Aranha; Amato, Marcelo Britto Passos; Carvalho, Carlos Roberto Ribeiro; Cavalcanti, Alexandre Biasi
2017-01-01
The Alveolar Recruitment for Acute Respiratory Distress Syndrome Trial (ART) is an international multicenter randomized pragmatic controlled trial with allocation concealment involving 120 intensive care units in Brazil, Argentina, Colombia, Italy, Poland, Portugal, Malaysia, Spain, and Uruguay. The primary objective of ART is to determine whether maximum stepwise alveolar recruitment associated with PEEP titration, adjusted according to the static compliance of the respiratory system (ART strategy), is able to increase 28-day survival in patients with acute respiratory distress syndrome compared to conventional treatment (ARDSNet strategy). Our objective here is to describe the data management process and statistical analysis plan. The statistical analysis plan was designed by the trial executive committee and reviewed and approved by the trial steering committee. We provide an overview of the trial design with a special focus on describing the primary (28-day survival) and secondary outcomes. We describe our data management process, data monitoring committee, interim analyses, and sample size calculation. We describe our planned statistical analyses for primary and secondary outcomes as well as pre-specified subgroup analyses. We also provide details for presenting results, including mock tables for baseline characteristics, adherence to the protocol and effect on clinical outcomes. According to best trial practice, we report our statistical analysis plan and data management plan prior to locking the database and beginning analyses. We anticipate that this document will prevent analysis bias and enhance the utility of the reported results. ClinicalTrials.gov number, NCT01374022.
Statistical Analysis Of Tank 5 Floor Sample Results
Energy Technology Data Exchange (ETDEWEB)
Shine, E. P.
2012-08-01
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements
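For the analytes with every measurement above the MDC, an upper 95% confidence bound on the mean can be sketched with the one-sided Student's t formula, mean + t * s / sqrt(n). This is only one common choice and is an assumption here: the abstract says the procedures were consistent with EPA guidance but does not state which UCL formula was applied. The function name and the small table of critical values are illustrative.

```python
import math
import statistics

# One-sided 95% Student's t critical values, keyed by degrees of freedom.
T_95 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015}


def ucl95(values):
    """Upper 95% confidence limit for the mean, assuming normal data."""
    n = len(values)
    df = n - 1
    if df not in T_95:
        raise ValueError("this sketch only covers 2 to 6 measurements")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return mean + T_95[df] * sd / math.sqrt(n)
```

With three composite samples there are only two degrees of freedom, so the critical value (2.920) is large and the UCL95 sits well above the sample mean.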
STATISTICAL ANALYSIS OF TANK 5 FLOOR SAMPLE RESULTS
Energy Technology Data Exchange (ETDEWEB)
Shine, E.
2012-03-14
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, radionuclide, inorganic, and anion concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements above their
Statistical Analysis of Tank 5 Floor Sample Results
Energy Technology Data Exchange (ETDEWEB)
Shine, E. P.
2013-01-31
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements
Statistical analysis of the operating parameters which affect cupola emissions
Energy Technology Data Exchange (ETDEWEB)
Davis, J.W.; Draper, A.B.
1977-12-01
A sampling program was undertaken to determine the operating parameters which affected air pollution emission from gray iron foundry cupolas. The experimental design utilized the analysis of variance routine. Four independent variables were selected for examination on the basis of previous work reported in the literature. These were: (1) blast rate; (2) iron-coke ratio; (3) blast temperature; and (4) cupola size. The last variable was chosen since it most directly affects melt rate. Emissions from cupolas for which concern has been expressed are particulate matter and carbon monoxide. The dependent variables were, therefore, particle loading, particle size distribution, and carbon monoxide concentration. Seven production foundries were visited and samples taken under conditions prescribed by the experimental plan. The data obtained from these tests were analyzed using the analysis of variance and other statistical techniques where applicable. The results indicated that blast rate, blast temperature, and cupola size affected particle emissions and the latter two also affected the particle size distribution. The particle size information was also unique in that it showed a consistent particle size distribution at all seven foundries with a sizable fraction of the particles less than 1.0 micrometers in diameter.
Criminal victimization in Ukraine: analysis of statistical data
Directory of Open Access Journals (Sweden)
Serhiy Nezhurbida
2007-12-01
The article is based on the analysis of statistical data provided by law-enforcement, judicial and other bodies of Ukraine. This analysis gives an accurate quantitative picture of the current status of crime victimization in Ukraine and characterizes its basic features (level, rate, structure, dynamics, etc.).
A statistical design for testing apomictic diversification through linkage analysis.
Zeng, Yanru; Hou, Wei; Song, Shuang; Feng, Sisi; Shen, Lin; Xia, Guohua; Wu, Rongling
2014-03-01
The capacity of apomixis to generate maternal clones through seed reproduction has made it a useful characteristic for the fixation of heterosis in plant breeding. It has been observed that apomixis displays pronounced intra- and interspecific diversification, but the genetic mechanisms underlying this diversification remain elusive, obstructing the exploitation of this phenomenon in practical breeding programs. By capitalizing on molecular information in mapping populations, we describe and assess a statistical design that deploys linkage analysis to estimate and test the pattern and extent of apomictic differences at various levels from genotypes to species. The design is based on two reciprocal crosses between two individuals each chosen from a hermaphrodite or monoecious species. A multinomial distribution likelihood is constructed by combining marker information from two crosses. The EM algorithm is implemented to estimate the rate of apomixis and test its difference between two plant populations or species as the parents. The design is validated by computer simulation. A real data analysis of two reciprocal crosses between hickory (Carya cathayensis) and pecan (C. illinoensis) demonstrates the utilization and usefulness of the design in practice. The design provides a tool to address fundamental and applied questions related to the evolution and breeding of apomixis.
Data Analysis & Statistical Methods for Command File Errors
Meshkat, Leila; Waggoner, Bruce; Bryant, Larry
2014-01-01
This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained with these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
Statistical analysis of cone penetration resistance of railway ballast
Directory of Open Access Journals (Sweden)
Saussine Gilles
2017-01-01
Dynamic penetrometer tests are widely used in geotechnical studies for soils characterization but their implementation tends to be difficult. The light penetrometer test is able to give information about a cone resistance useful in the field of geotechnics and recently validated as a parameter for the case of coarse granular materials. In order to characterize directly the railway ballast on track and sublayers of ballast, a huge test campaign has been carried out for more than 5 years in order to build up a database composed of 19,000 penetration tests including endoscopic video record on the French railway network. The main objective of this work is to give a first statistical analysis of cone resistance in the coarse granular layer which represents a major component of railway track: the ballast. The results show that the cone resistance (qd) increases with depth and presents strong variations corresponding to layers of different natures identified using the endoscopic records. In the first zone corresponding to the top 30cm, (qd) increases linearly with a slope of around 1MPa/cm for fresh ballast and fouled ballast. In the second zone below 30cm deep, (qd) increases more slowly with a slope of around 0,3MPa/cm and decreases below 50cm. These results show that there is no clear difference between fresh and fouled ballast. Hence, the (qd) sensitivity is important and increases with depth. The (qd) distribution for a set of tests does not follow a normal distribution. In the upper 30cm layer of ballast of track, data statistical treatment shows that train load and speed do not have any significant impact on the (qd) distribution for clean ballast; they increase by 50% the average value of (qd) for fouled ballast and increase the thickness as well. Below the 30cm upper layer, train load and speed have a clear impact on the (qd) distribution.
Statistical analysis of cone penetration resistance of railway ballast
Saussine, Gilles; Dhemaied, Amine; Delforge, Quentin; Benfeddoul, Selim
2017-06-01
Dynamic penetrometer tests are widely used in geotechnical studies for soils characterization but their implementation tends to be difficult. The light penetrometer test is able to give information about a cone resistance useful in the field of geotechnics and recently validated as a parameter for the case of coarse granular materials. In order to characterize directly the railway ballast on track and sublayers of ballast, a huge test campaign has been carried out for more than 5 years in order to build up a database composed of 19,000 penetration tests including endoscopic video record on the French railway network. The main objective of this work is to give a first statistical analysis of cone resistance in the coarse granular layer which represents a major component of railway track: the ballast. The results show that the cone resistance (qd) increases with depth and presents strong variations corresponding to layers of different natures identified using the endoscopic records. In the first zone corresponding to the top 30cm, (qd) increases linearly with a slope of around 1MPa/cm for fresh ballast and fouled ballast. In the second zone below 30cm deep, (qd) increases more slowly with a slope of around 0,3MPa/cm and decreases below 50cm. These results show that there is no clear difference between fresh and fouled ballast. Hence, the (qd) sensitivity is important and increases with depth. The (qd) distribution for a set of tests does not follow a normal distribution. In the upper 30cm layer of ballast of track, data statistical treatment shows that train load and speed do not have any significant impact on the (qd) distribution for clean ballast; they increase by 50% the average value of (qd) for fouled ballast and increase the thickness as well. Below the 30cm upper layer, train load and speed have a clear impact on the (qd) distribution.
Tucker Tensor analysis of Matern functions in spatial statistics
Litvinenko, Alexander
2017-11-18
In this work, we describe advanced numerical tools for working with multivariate functions and for the analysis of large data sets. These tools will drastically reduce the required computing time and the storage cost, and, therefore, will allow us to consider much larger data sets or finer meshes. Covariance matrices are crucial in spatio-temporal statistical tasks, but are often very expensive to compute and store, especially in 3D. Therefore, we approximate covariance functions by cheap surrogates in a low-rank tensor format. We apply the Tucker and canonical tensor decompositions to a family of Matern- and Slater-type functions with varying parameters and demonstrate numerically that their approximations exhibit exponentially fast convergence. We prove the exponential convergence of the Tucker and canonical approximations in tensor rank parameters. Several statistical operations are performed in this low-rank tensor format, including evaluating the conditional covariance matrix, spatially averaged estimation variance, computing a quadratic form, determinant, trace, loglikelihood, inverse, and Cholesky decomposition of a large covariance matrix. Low-rank tensor approximations substantially reduce the computing and storage costs. For example, the storage cost is reduced from an exponential O(n^d) to a linear scaling O(drn), where d is the spatial dimension, n is the number of mesh points in one direction, and r is the tensor rank. Prerequisites for applicability of the proposed techniques are the assumptions that the data, locations, and measurements lie on a tensor (axes-parallel) grid and that the covariance function depends on a distance, ||x-y||.
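The storage figures quoted in this abstract can be made concrete with a small sketch. It counts entries for a full d-way array with n points per direction and for a rank-r canonical representation (d factor matrices of size n by r), and shows how a single entry is recovered from the factors. The function names are hypothetical, and a Tucker representation would additionally store an r^d core, which is not counted here.

```python
def full_storage(n, d):
    """Entries in a full tensor with n mesh points in each of d directions."""
    return n ** d


def canonical_storage(n, d, r):
    """Entries in d factor matrices of size n x r (rank-r canonical format)."""
    return d * r * n


def canonical_entry(factors, index):
    """Entry at `index`: sum over rank terms of the product of factor values.

    factors[mode][i][k] is the value for mode `mode`, grid point i, term k.
    """
    rank = len(factors[0][0])
    total = 0.0
    for k in range(rank):
        term = 1.0
        for mode, i in enumerate(index):
            term *= factors[mode][i][k]
        total += term
    return total
```

For example, with n = 100, d = 3 and r = 10 the full array holds 1,000,000 entries while the canonical factors hold 3,000.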
STATISTICAL ANALYSIS OF RAW SUGAR MATERIAL FOR SUGAR PRODUCER COMPLEX
Directory of Open Access Journals (Sweden)
A. A. Gromkovskii
2015-01-01
Summary. The article examines statistical data on the development of the average weight and average sugar content of sugar beet roots. Successful forecasting of these raw-material indices is essential for solving control problems in the sugar-producing complex. Calculation of the autocorrelation function demonstrates that the trend component predominates in the growth of the raw-material characteristics. First- and second-order autoregressive models are proposed for constructing the prediction model. It is shown that, despite the small amount of experimental data provided by the laboratories of sugar-producing enterprises, the use of autoregression is justified. The proposed model correctly reproduces the dynamics of the raw-material indices over time, which is confirmed by the estimates. The article highlights that, when the trend component predominates in the dynamics of the studied sugar beet characteristics, the proposed prediction models provide a better forecast quality. Where the curve describing the change in raw-material performance contains oscillating portions, a better forecast requires a larger number of measurements. The article also presents the results of using Brown's adaptive prediction model to forecast sugar beet raw-material performance. The statistical analysis supports conclusions about the level of quality sufficient to describe changes in the raw-material indices for forecast development. Optimal data discount rates are identified; they are determined by the shape of the growth curves of beet root sugar content and mass during maturation. Conclusions are formulated on how forecast quality depends on these factors, which are set by the expert forecaster. The article gives a calculation expression, derived from experimental data, for computing changes in the raw-material characteristics of sugar beet during maturation.
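As a rough illustration of adaptive prediction in the spirit of Brown's model, the sketch below implements simple exponential smoothing, its most basic form. The smoothing constant alpha and the function names are assumptions for illustration; the article's own model, data and discount rates are not reproduced here.

```python
def brown_smooth(series, alpha):
    """Simple exponential smoothing: level_t = alpha * x_t + (1 - alpha) * level_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    level = series[0]  # initialise the level with the first observation
    levels = [level]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
        levels.append(level)
    return levels


def forecast_next(series, alpha):
    """One-step-ahead forecast: the last smoothed level."""
    return brown_smooth(series, alpha)[-1]
```

A larger alpha discounts old measurements faster, which suits a series dominated by trend; a smaller alpha damps oscillations at the cost of lagging behind the trend.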
Foster Care Reentry: A survival analysis assessing differences across permanency type.
Goering, Emily Smith; Shaw, Terry V
2017-06-01
Foster care reentry is an important factor for evaluating the overall success of permanency. Rates of reentry are typically only measured for 12-months and are often evaluated only for children who exit foster care to reunification and not across exit types, also known as 'permanency types'. This study examined the odds of reentry across multiple common permanency types for a cohort of 8107 children who achieved permanency between 2009 and 2013. Overall, 14% of children reentered care within 18-months with an average time to reentry of 6.36 months. A Kaplan-Meier survival analysis was used to assess differences in reentry across permanency types (including reunification, relative guardianship and non-relative guardianship). Children who achieved guardianship with kin had the lowest odds of reentry overall, followed by guardianship with non-kin, and reunification with family of origin. Children reunifying against the recommendations of Children and Family Services had the highest odds of reentry. A Cox regression survival analysis was conducted to assess odds of reentry across permanency type while controlling for demographics, services, and other risk factors. In the final model, only permanency type and cumulative risk were found to have a statistically significant impact on odds of reentry. Copyright © 2017 Elsevier Ltd. All rights reserved.
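The Kaplan-Meier (product-limit) estimate mentioned above can be sketched in a few lines. This is a minimal illustration on made-up inputs, not the study's analysis; `times` holds follow-up durations and `events` holds 1 for an observed reentry and 0 for a censored record (both names are assumptions).

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events observed at time t
        leaving = 0   # subjects leaving the risk set at t (events plus censored)
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            surv *= 1 - observed / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve
```

Censored records shrink the risk set without lowering the curve, which is what allows children with incomplete follow-up to contribute to the estimate.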
Directory of Open Access Journals (Sweden)
Jerneja Pikelj
2015-06-01
The paper has two practical purposes. The first one is to analyze how successfully R can be used for data analysis on surveys carried out by the Statistical Office of the Republic of Slovenia. In order to achieve this goal, we analyzed the data of the Monthly Statistical Survey on Earnings Paid by Legal Persons. The second purpose is to analyze how the assumption on the nonresponse mechanism, which occurs in the sample, impacts the estimated values of the unknown statistics in the survey. Depending on these assumptions, different approaches to adjust the problem caused by unit nonresponse are presented. We conclude the paper with the results of the analysis of the data and the main issues connected with the usage of R in official statistics.
The Inappropriate Symmetries of Multivariate Statistical Analysis in Geometric Morphometrics.
Bookstein, Fred L
In today's geometric morphometrics the commonest multivariate statistical procedures, such as principal component analysis or regressions of Procrustes shape coordinates on Centroid Size, embody a tacit roster of symmetries-axioms concerning the homogeneity of the multiple spatial domains or descriptor vectors involved-that do not correspond to actual biological fact. These techniques are hence inappropriate for any application regarding which we have a-priori biological knowledge to the contrary (e.g., genetic/morphogenetic processes common to multiple landmarks, the range of normal in anatomy atlases, the consequences of growth or function for form). But nearly every morphometric investigation is motivated by prior insights of this sort. We therefore need new tools that explicitly incorporate these elements of knowledge, should they be quantitative, to break the symmetries of the classic morphometric approaches. Some of these are already available in our literature but deserve to be known more widely: deflated (spatially adaptive) reference distributions of Procrustes coordinates, Sewall Wright's century-old variant of factor analysis, the geometric algebra of importing explicit biomechanical formulas into Procrustes space. Other methods, not yet fully formulated, might involve parameterized models for strain in idealized forms under load, principled approaches to the separation of functional from Brownian aspects of shape variation over time, and, in general, a better understanding of how the formalism of landmarks interacts with the many other approaches to quantification of anatomy. To more powerfully organize inferences from the high-dimensional measurements that characterize so much of today's organismal biology, tomorrow's toolkit must rely neither on principal component analysis nor on the Procrustes distance formula, but instead on sound prior biological knowledge as expressed in formulas whose coefficients are not all the same. I describe the problems of
Statistical analysis with measurement error or misclassification strategy, method and application
Yi, Grace Y
2017-01-01
This monograph on measurement error and misclassification covers a broad range of problems and emphasizes unique features in modeling and analyzing problems arising from medical research and epidemiological studies. Many measurement error and misclassification problems have been addressed in various fields over the years as well as with a wide spectrum of data, including event history data (such as survival data and recurrent event data), correlated data (such as longitudinal data and clustered data), multi-state event data, and data arising from case-control studies. Statistical Analysis with Measurement Error or Misclassification: Strategy, Method and Application brings together assorted methods in a single text and provides an update of recent developments for a variety of settings. Measurement error effects and strategies of handling mismeasurement for different models are closely examined in combination with applications to specific problems. Readers with diverse backgrounds and objectives can utilize th...
Rizk, J; Ouzzane, A; Flamand, V; Fantoni, J-C; Puech, P; Leroy, X; Villers, A
2015-03-01
To assess long-term biochemical recurrence free survival after radical prostatectomy according to open, laparoscopic and robot-assisted surgical approach and clinicopathological stage. A cohort study of 1313 consecutive patients treated by radical prostatectomy for localized or locally advanced prostate cancer between 2000 and 2013. Open surgery (63.7%), laparoscopy (10%) and robot-assisted laparoscopy (26.4%) were performed. Biochemical recurrence was defined by PSA > 0.1 ng/mL. Biochemical recurrence free survival was described by the Kaplan-Meier method and prognostic factors were analysed by multivariable Cox regression. Median follow-up was 57 months (IQR: 31-90). Ten-year biochemical recurrence free survival was 88.5%, 71.6% and 53.5% respectively for low, intermediate and high-risk D'Amico groups. On multivariable analysis, the worst prognostic factor was Gleason score (P …). Biochemical recurrence free survival (P=0.06) and positive surgical margins rate (P=0.87) were not statistically different between the three surgical approaches. Biochemical recurrence free survival in our study does not differ according to surgical approach and is similar to published series. Ten-year biochemical recurrence free survival for high-risk tumours without hormone therapy is 54%, justifying the role of surgery in the therapeutic conversations in this group of tumours. 3. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
Study of Hip Fracture Risk using Tree Structured Survival Analysis
Directory of Open Access Journals (Sweden)
Lu Y
2003-01-01
In this paper we studied the risk of hip fracture for post-menopausal women by classifying women into different subgroups based on their risk of hip fracture. The subgroups were generated such that all the women in a particular subgroup had relatively similar risk while women belonging to two different subgroups had rather different risks of hip fracture. We used Tree Structured Survival Analysis (TSSA) to derive the subgroups from the data of 7,665 women in the SOF (Study of Osteoporotic Fractures). Bone mineral density (BMD) of the forearm, femoral neck, hip, and spine was measured in all participants. The time from BMD measurement to hip fracture was recorded as the endpoint. A sample of 75% of the participants was used to form the prognostic subgroups (training dataset), while the remaining 25% served to confirm the results (validation dataset). From the training dataset, TSSA identified four subgroups whose hip fracture risk over a mean follow-up of 6.5 years was 19%, 9%, 4%, and 1%. Assignment to subgroups was based on the BMD of Ward's triangle and of the femoral neck, and on age. These results were reproduced in the validation dataset, which confirmed the usefulness of the classification rules in a clinical setting. TSSA made possible a sensible, meaningful, and reproducible identification of prognostic subgroups based on age and BMD values.
Statistical Analysis of Data with Non-Detectable Values
Energy Technology Data Exchange (ETDEWEB)
Frome, E.L.
2004-08-26
Environmental exposure measurements are, in general, positive and may be subject to left censoring, i.e. the measured value is less than a "limit of detection". In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. A basic problem of interest in environmental risk assessment is to determine if the mean concentration of an analyte is less than a prescribed action level. Parametric methods, used to determine acceptable levels of exposure, are often based on a two parameter lognormal distribution. The mean exposure level and/or an upper percentile (e.g. the 95th percentile) are used to characterize exposure levels, and upper confidence limits are needed to describe the uncertainty in these estimates. In certain situations it is of interest to estimate the probability of observing a future (or "missed") value of a lognormal variable. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left censored lognormal data are described and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on the 95th percentile (i.e. the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known but computational complexity has limited their use in routine data analysis with left censored data. The recent development of the R environment for statistical
Statistical analysis as approach to conductive heat transfer modelling
Antonyová, A.; Antony, P.
2013-04-01
The main inspiration for this article was the problem of high investment in the installation of building insulation. The question of its effectiveness and reliability after a period of 10 or 15 years was the topic of the international research project carried out at the University of Prešov in Prešov and Vienna University of Technology, entitled "Detection and Management of Risk Processes in Building Insulation" and numbered SRDA SK-AT-0008-10. The need to detect moisture, in particular, as a risk process in the space between the wall and the insulation led to the construction of new measuring equipment that tests moisture and temperature without destroying the insulation, and in this way describes the real situation in old buildings too. Further investigation allowed us to analyse a data set of 1680 measurements and to express conductive heat transfer using the methods of statistical analysis. The modelling comprises relationships between the environmental properties inside the building, in the space between the wall and the insulation, and in the ambient surroundings of the building. A radial distribution function also characterizes the connection of the temperature differences.
Utility green pricing programs: A statistical analysis of program effectiveness
Energy Technology Data Exchange (ETDEWEB)
Wiser, Ryan; Olson, Scott; Bird, Lori; Swezey, Blair
2004-02-01
Development of renewable energy. Such programs have grown in number in recent years. The design features and effectiveness of these programs varies considerably, however, leading a variety of stakeholders to suggest specific marketing and program design features that might improve customer response and renewable energy sales. This report analyzes actual utility green pricing program data to provide further insight into which program features might help maximize both customer participation in green pricing programs and the amount of renewable energy purchased by customers in those programs. Statistical analysis is performed on both the residential and non-residential customer segments. Data comes from information gathered through a questionnaire completed for 66 utility green pricing programs in early 2003. The questionnaire specifically gathered data on residential and non-residential participation, amount of renewable energy sold, program length, the type of renewable supply used, program price/cost premiums, types of consumer research and program evaluation performed, different sign-up options available, program marketing efforts, and ancillary benefits offered to participants.
Measurement of Plethysmogram and Statistical Method for Analysis
Shimizu, Toshihiro
The plethysmogram is measured at different points of the human body using a photo interrupter, and it depends sensitively on the physical and mental state of the body. In this paper, statistical methods of data analysis are investigated to discuss the dependence of the plethysmogram on stress and aging. The first is a representation method based on the return map, which provides useful information on the waveform, the fluctuation in phase and the fluctuation in amplitude. The return map method makes it possible to understand the fluctuation of the plethysmogram in amplitude and in phase more clearly and globally than the conventional power spectrum method. The second is the Lissajous plot and the correlation function, used to analyze the phase difference between the plethysmograms of the right fingertip and of the left fingertip. The third is the R-index, from which we can estimate "the age of the blood flow". The R-index is defined by the global character of the plethysmogram, which distinguishes it from the usual APG-index. The stress- and age-dependence of the plethysmogram is discussed using these methods.
Corrected Statistical Energy Analysis Model for Car Interior Noise
Directory of Open Access Journals (Sweden)
A. Putra
2015-01-01
Statistical energy analysis (SEA) is a well-known method to analyze the flow of acoustic and vibration energy in a complex structure. For an acoustic space where significant absorptive materials are present, the direct field component from the sound source dominates the total sound field rather than the reverberant field, although the latter is the basis for constructing the conventional SEA model. Such an environment can be found in a car interior, and thus a corrected SEA model is proposed here to address this situation. The model is developed by eliminating the direct field component from the total sound field, so that only the power after the first reflection is considered. A test car cabin was divided into two subsystems and, using a loudspeaker as a sound source, the power injection method in SEA was employed to obtain the corrected coupling loss factor and the damping loss factor from the corrected SEA model. These parameters were then used to predict the sound pressure level (SPL) in the interior cabin using the injected input power from the engine. The results show satisfactory agreement with the directly measured SPL.
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis: Preprint
Energy Technology Data Exchange (ETDEWEB)
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-12-08
Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature attributed only approximately 0.12% to the NRMSE of the power output as opposed to 7.44% from the forecasted solar irradiance.
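The normalized root mean squared error used as the headline metric above can be sketched as follows. The normalising constant is an assumption: conventions differ (plant capacity, mean or range of the observations), and the abstract does not say which one the study used, so it is left as an explicit parameter here.

```python
import math


def nrmse(forecast, actual, norm):
    """Root mean squared forecast error divided by a caller-chosen normaliser."""
    if len(forecast) != len(actual) or not actual:
        raise ValueError("forecast and actual must be non-empty and equally long")
    mse = sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)
    return math.sqrt(mse) / norm
```

For example, `nrmse(power_forecast, power_measured, norm=51.0)` would normalise by the 51 kW plant capacity; the variable names are hypothetical.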
Statistical analysis of CSP plants by simulating extensive meteorological series
Pavón, Manuel; Fernández, Carlos M.; Silva, Manuel; Moreno, Sara; Guisado, María V.; Bernardos, Ana
2017-06-01
The feasibility analysis of any power plant project needs an estimate of the amount of energy the plant will be able to deliver to the grid during its lifetime. To achieve this, the feasibility study requires precise knowledge of the solar resource over a long-term period. In Concentrating Solar Power (CSP) projects, financing institutions typically require several statistical probability of exceedance scenarios of the expected electric energy output. Currently, the industry assumes a correlation between probabilities of exceedance of annual Direct Normal Irradiance (DNI) and energy yield. In this work, this assumption is tested by simulating the energy yield of CSP plants using as input a 34-year series of measured meteorological parameters and solar irradiance. The results of this work show that, even if some correspondence between the probabilities of exceedance of annual DNI values and energy yields is found, the intra-annual distribution of DNI may significantly affect this correlation. This result highlights the need for standardized procedures for the elaboration of DNI time series representative of a given probability of exceedance of annual DNI.
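A probability of exceedance scenario can be read off an annual series empirically. The sketch below uses a simple nearest-rank rule and made-up function names; it is one possible convention, not the procedure used in the study.

```python
def exceedance_probability(series, threshold):
    """Fraction of years whose value is at least `threshold`."""
    return sum(1 for v in series if v >= threshold) / len(series)


def value_at_exceedance(series, pct):
    """Largest observed value met or exceeded in at least `pct` percent of years.

    `pct` is an integer percentage in 1..100 (e.g. 90 for a P90 scenario).
    """
    if not 1 <= pct <= 100:
        raise ValueError("pct must be an integer between 1 and 100")
    ordered = sorted(series, reverse=True)
    k = (pct * len(ordered) + 99) // 100  # ceil(pct * n / 100) in integer arithmetic
    return ordered[k - 1]
```

Applying `value_at_exceedance` separately to annual DNI and to simulated annual energy yield, and checking whether the same years define both scenarios, is one way to probe the assumed correlation.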
Statistical Analysis of Loss of Offsite Power Events
Directory of Open Access Journals (Sweden)
Andrija Volkanovski
2016-01-01
This paper presents the results of the statistical analysis of the loss of offsite power (LOOP) events registered in four reviewed databases. The reviewed databases include the IRSN (Institut de Radioprotection et de Sûreté Nucléaire) SAPIDE database and the GRS (Gesellschaft für Anlagen- und Reaktorsicherheit mbH) VERA database, reviewed over the period from 1992 to 2011. The US NRC (Nuclear Regulatory Commission) Licensee Event Reports (LERs) database and the IAEA International Reporting System (IRS) database were screened for relevant events registered over the period from 1990 to 2013. The number of LOOP events in each year in the analysed period and mode of operation are assessed during the screening. The LOOP frequencies obtained for the French and German nuclear power plants (NPPs) during critical operation are of the same order of magnitude with the plant related events as a dominant contributor. A frequency of one LOOP event per shutdown year is obtained for German NPPs in shutdown mode of operation. For the US NPPs, the obtained LOOP frequency for critical and shutdown mode is comparable to the one assessed in NUREG/CR-6890. A decreasing trend is obtained for the LOOP events registered in three databases (IRSN, GRS, and NRC).
Survival analysis of cervical cancer using stratified Cox regression
Purnami, S. W.; Inayati, K. D.; Sari, N. W. Wulan; Chosuvivatwong, V.; Sriplung, H.
2016-04-01
Cervical cancer is one of the most common causes of cancer death among women worldwide, including in Indonesia. Most cervical cancer patients come to the hospital already at an advanced stage. As a result, treatment becomes more difficult and the risk of death can even increase. One parameter that can be used to assess the success of treatment is the probability of survival. This study addresses the survival of cervical cancer patients at Dr. Soetomo Hospital using stratified Cox regression based on six factors: age, stage, treatment initiation, companion disease, complication, and anemia. The stratified Cox model is used because one independent variable, stage, does not satisfy the proportional hazards assumption. The results of the stratified Cox model show that complication is a significant factor influencing the survival probability of cervical cancer patients. The obtained hazard ratio is 7.35, meaning that a cervical cancer patient with a complication has a 7.35 times greater risk of dying than a patient without one. The adjusted survival curves show that stage IV had the lowest probability of survival.
Statistical Analysis of Development Trends in Global Renewable Energy
Directory of Open Access Journals (Sweden)
Marina D. Simonova
2016-01-01
The article focuses on the economic and statistical analysis of industries associated with the use of renewable energy sources in several countries. The dynamic development and implementation of technologies based on renewable energy sources (hereinafter RES) is the defining trend of world energy development. The uneven distribution of hydrocarbon reserves, the increasing demand of developing countries and the environmental risks associated with the production and consumption of fossil resources have led to an increasing interest of many states in this field. Creating low-carbon economies involves implementing plans to increase the proportion of clean energy through renewable energy sources, energy efficiency, and reduced greenhouse gas emissions. The priority given to this sector is a characteristic feature of the modern development of developed (USA, EU, Japan) and emerging economies (China, India, Brazil, etc.), as evidenced by the inclusion of this segment in state energy strategies and the revision of existing approaches to energy security. The analysis of the use of renewable energy and of its contribution to the value added of producer countries is of particular interest. Over the last decade, the share of energy produced from renewable sources in the energy balances of the world's largest economies increased significantly. Every year the power generating capacity based on renewable energy grows; this trend is especially apparent in China, the USA and European Union countries. There is a significant increase in direct investment in renewable energy. Total investment over the past ten years increased by a factor of 5.6. The most rapidly developing kinds are solar energy and wind power.
Statistical analysis and optimization of igbt manufacturing flow
Directory of Open Access Journals (Sweden)
Baranov V. V.
2015-02-01
The use of computer simulation, design and optimization of the technological processes for forming power electronic devices can significantly reduce development time, improve the accuracy of calculations, and help choose the best implementation options on the basis of strict mathematical analysis. One of the most common power electronic devices is the insulated gate bipolar transistor (IGBT), which combines the advantages of the MOSFET and the bipolar transistor. Meeting the high requirements for these devices is only possible by optimizing the device design and the manufacturing process parameters. Statistical analysis is therefore an important and necessary step in the modern cycle of IC design and manufacturing. A procedure for optimizing the IGBT threshold voltage was implemented. Through screening experiments according to the Plackett-Burman design, the most important input parameters (factors), those with the greatest impact on the output characteristic, were detected. The coefficients of the approximation polynomial that adequately describes the relationship between the input parameters and the investigated output characteristics were determined. Using the calculated approximation polynomial, a series of repeated calculations in a Monte Carlo cycle was carried out to determine the spread of threshold voltage values at selected ranges of input parameter deviation. Combinations of input process parameter values were drawn randomly from a normal distribution within a given range of changes. The procedure of IGBT process parameter optimization consists of the mathematical problem of determining the value ranges of the significant structural and technological input parameters that keep the IGBT threshold voltage within a given interval. The presented results demonstrate the effectiveness of the proposed optimization techniques.
Allen, Kirk
The Statistics Concept Inventory (SCI) is a multiple choice test designed to assess students' conceptual understanding of topics typically encountered in an introductory statistics course. This dissertation documents the development of the SCI from Fall 2002 up to Spring 2006. The first phase of the project essentially sought to answer the question: "Can you write a test to assess topics typically encountered in introductory statistics?" Book One presents the results utilized in answering this question in the affirmative. The bulk of the results present the development and evolution of the items, primarily relying on objective metrics to gauge effectiveness but also incorporating student feedback. The second phase boils down to: "Now that you have the test, what else can you do with it?" This includes an exploration of Cronbach's alpha, the most commonly-used measure of test reliability in the literature. An online version of the SCI was designed, and its equivalency to the paper version is assessed. Adding an extra wrinkle to the online SCI, subjects rated their answer confidence. These results show a general positive trend between confidence and correct responses. However, some items buck this trend, revealing potential sources of misunderstandings, with comparisons offered to the extant statistics and probability educational research. The third phase is a re-assessment of the SCI: "Are you sure?" A factor analytic study favored a uni-dimensional structure for the SCI, although maintaining the likelihood of a deeper structure if more items can be written to tap similar topics. A shortened version of the instrument is proposed, demonstrated to be able to maintain a reliability nearly identical to that of the full instrument. Incorporating student feedback and a faculty topics survey, improvements to the items and recommendations for further research are proposed. The state of the concept inventory movement is assessed, to offer a comparison to the work presented
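Cronbach's alpha, the reliability measure discussed above, follows directly from item and total-score variances. The sketch below assumes `scores` is a list of rows, one per respondent, with one numeric score per item; the name and layout are illustrative and are not the SCI data format.

```python
from statistics import variance


def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Alpha approaches 1 when items rise and fall together across respondents and falls toward 0 when they vary independently; the function is undefined if every respondent has the same total score.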
Individual patient data meta-analysis of survival data using Poisson regression models
Directory of Open Access Journals (Sweden)
Crowther Michael J
2012-03-01
Background: An Individual Patient Data (IPD) meta-analysis is often considered the gold-standard for synthesising survival data from clinical trials. An IPD meta-analysis can be achieved by either a two-stage or a one-stage approach, depending on whether the trials are analysed separately or simultaneously. A range of one-stage hierarchical Cox models have been previously proposed, but these are known to be computationally intensive and are not currently available in all standard statistical software. We describe an alternative approach using Poisson based Generalised Linear Models (GLMs). Methods: We illustrate, through application and simulation, the Poisson approach both classically and in a Bayesian framework, in two-stage and one-stage approaches. We outline the benefits of our one-stage approach through extension to modelling treatment-covariate interactions and non-proportional hazards. Ten trials of hypertension treatment, with all-cause death the outcome of interest, are used to apply and assess the approach. Results: We show that the Poisson approach obtains almost identical estimates to the Cox model, is additionally computationally efficient and directly estimates the baseline hazard. Some downward bias is observed in classical estimates of the heterogeneity in the treatment effect, with improved performance from the Bayesian approach. Conclusion: Our approach provides a highly flexible and computationally efficient framework, available in all standard statistical software, to the investigation of not only heterogeneity, but the presence of non-proportional hazards and treatment effect modifiers.
Individual patient data meta-analysis of survival data using Poisson regression models.
Crowther, Michael J; Riley, Richard D; Staessen, Jan A; Wang, Jiguang; Gueyffier, Francois; Lambert, Paul C
2012-03-23
An Individual Patient Data (IPD) meta-analysis is often considered the gold-standard for synthesising survival data from clinical trials. An IPD meta-analysis can be achieved by either a two-stage or a one-stage approach, depending on whether the trials are analysed separately or simultaneously. A range of one-stage hierarchical Cox models have been previously proposed, but these are known to be computationally intensive and are not currently available in all standard statistical software. We describe an alternative approach using Poisson based Generalised Linear Models (GLMs). We illustrate, through application and simulation, the Poisson approach both classically and in a Bayesian framework, in two-stage and one-stage approaches. We outline the benefits of our one-stage approach through extension to modelling treatment-covariate interactions and non-proportional hazards. Ten trials of hypertension treatment, with all-cause death the outcome of interest, are used to apply and assess the approach. We show that the Poisson approach obtains almost identical estimates to the Cox model, is additionally computationally efficient and directly estimates the baseline hazard. Some downward bias is observed in classical estimates of the heterogeneity in the treatment effect, with improved performance from the Bayesian approach. Our approach provides a highly flexible and computationally efficient framework, available in all standard statistical software, to the investigation of not only heterogeneity, but the presence of non-proportional hazards and treatment effect modifiers.
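The Poisson approach described here rests on splitting each patient's follow-up into intervals and modelling event counts with log person-time as an offset. The sketch below illustrates that idea for a single trial with a piecewise-constant baseline hazard and one treatment effect. It is an assumption-laden illustration, not the authors' implementation: the function names, the cut points and the four toy records are hypothetical, and the maximum-likelihood estimate is found by bisection on the Poisson score equation instead of with a GLM routine.

```python
import math

def tabulate(records, cuts):
    """Split follow-up at `cuts`; total events and person-time per interval and arm.

    records: (time, event, arm) with event in {0, 1} and arm in {0, 1}.
    cuts: increasing interval boundaries, e.g. [0, 5, math.inf].
    """
    n_int = len(cuts) - 1
    events = [[0, 0] for _ in range(n_int)]
    ptime = [[0.0, 0.0] for _ in range(n_int)]
    for time, event, arm in records:
        for j in range(n_int):
            lo, hi = cuts[j], cuts[j + 1]
            ptime[j][arm] += max(0.0, min(time, hi) - lo)   # exposure in (lo, hi]
            if event and lo < time <= hi:
                events[j][arm] += 1
    return events, ptime

def poisson_log_hazard_ratio(events, ptime):
    """MLE of the treatment log hazard ratio with a separate baseline rate per interval."""
    def score(beta):
        # Score for beta after profiling out the interval-specific baseline rates.
        total = 0.0
        for d, t in zip(events, ptime):
            d_all = d[0] + d[1]
            if d_all == 0:
                continue
            w = math.exp(beta) * t[1]
            total += d[1] - d_all * w / (t[0] + w)
        return total

    lo, hi = -10.0, 10.0          # search range for beta (an assumption)
    for _ in range(100):          # the score is decreasing in beta, so bisect
        mid = (lo + hi) / 2
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical toy trial: (follow-up time, died?, treated?)
toy = [(4, 1, 0), (6, 1, 0), (3, 1, 1), (7, 0, 1)]
ev, pt = tabulate(toy, [0, math.inf])
print(math.exp(poisson_log_hazard_ratio(ev, pt)))
```

With a single interval the estimate reduces to the ratio of the two arms' event rates. Adding cut points lets the baseline hazard vary over time, which is the feature that brings the Poisson model close to the Cox model.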
TECHNIQUE OF THE STATISTICAL ANALYSIS OF INVESTMENT APPEAL OF THE REGION
Directory of Open Access Journals (Sweden)
А. А. Vershinina
2014-01-01
This article presents a technique for the statistical analysis of a region's investment appeal with respect to foreign direct investment. The technique is defined, the stages of the analysis are set out, and the mathematical and statistical tools are considered.
A note on the statistical analysis of point judgment matrices
African Journals Online (AJOL)
There is scope for further research into statistical approaches for analyzing judgment matrices. In particular statistically based methods address rank reversal since standard errors are associated with estimates of the weights and thus the rankings are not stated with certainty. However, the weights are constrained to lie in a ...
Ye, Ding; Jiang, Danjie; Li, Yingjun; Jin, Mingjuan; Chen, Kun
2017-08-01
The prognostic value of long interspersed nucleotide element-1 (LINE-1) methylation in patients with colorectal cancer (CRC) remains uncertain. We have therefore performed a meta-analysis to elucidate this issue. The PubMed and Web of Science databases were searched for studies published up to 30 June 2016 which reported on an association between LINE-1 methylation and overall survival (OS), disease-free survival (DFS), or cancer-specific survival (CSS) among CRC patients. The reference lists of the identified studies were also analyzed to identify additional eligible studies. Hazard ratios (HRs) with 95% confidence intervals (CIs) were pooled using the fixed-effects or the random-effects model. Stratification analysis and meta-regression analysis were performed to detect the source of heterogeneity. Analyses of sensitivity and publication bias were also carried out. Thirteen independent studies involving 3620 CRC patients were recruited to the meta-analysis. LINE-1 hypomethylation was found to be significantly associated with shorter OS (HR 2.92, 95% CI 2.20-3.88, p LINE-1 hypomethylation and OS or DFS, with the exception being CSS. Moreover, meta-regression analysis suggested that one of the contributors to between-study heterogeneity on the association between LINE-1 methylation and CSS was statistical methodology. The subgroup analysis suggested that the association in studies using the Cox model statistical method (HR 2.76, 95% CI 1.90-4.01, p LINE-1 methylation is significantly associated with the survival of CRC patients and that it could be a predictive factor for CRC prognosis.
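Pooling hazard ratios under a fixed-effects model, as mentioned in this abstract, is commonly done by inverse-variance weighting on the log scale. The sketch below assumes each study reports an HR with a 95% confidence interval, from which the standard error of the log HR is recovered. The function name and the example values are hypothetical and are not taken from the thirteen studies.

```python
import math

def pool_fixed_effect(studies, z=1.96):
    """Inverse-variance fixed-effect pooling of hazard ratios.

    studies: list of (hr, ci_low, ci_high) with 95% confidence limits.
    Returns (pooled_hr, pooled_ci_low, pooled_ci_high).
    """
    num = den = 0.0
    for hr, low, high in studies:
        se = (math.log(high) - math.log(low)) / (2 * z)   # SE of log(HR) from the CI width
        weight = 1.0 / se ** 2
        num += weight * math.log(hr)
        den += weight
    log_pooled = num / den
    se_pooled = math.sqrt(1.0 / den)
    return (math.exp(log_pooled),
            math.exp(log_pooled - z * se_pooled),
            math.exp(log_pooled + z * se_pooled))

# Hypothetical study results, for illustration only.
print(pool_fixed_effect([(2.0, 1.0, 4.0), (1.5, 1.0, 2.25)]))
```

A random-effects model would add a between-study variance term to each weight; that step is omitted here.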
Pratt, T R; Pulling, C C; Stanton, M S
2000-07-01
Monitoring and reporting mechanisms are vital tools for clinicians to assess ICD system performance over time for optimal patient care. This article explores the various reporting mechanisms available to the clinician, both historical and current, and compares and contrasts two such methods. The lead survival rates obtained by return product analysis (RPA) are compared with those from an ongoing prospective chronic study that actively follows patients for clinical ICD system failures (Tachyarrhythmia Chronic Systems Study [TCSS]). Examination of available data shows that a prospective study such as the TCSS is capable of detecting clinically significant adverse events in 2.2% of the 3,958 leads followed. By comparison, RPA-based monitoring of the same leads detects "out of specification" events in 0.5% of the 78,571 leads followed. Statistical analyses of two separate families of leads (RV leads and SQ Patch leads) show that survival rates obtained by the two methods begin to differ at approximately 2 years of implant experience, with 95% confidence intervals no longer overlapping at 3 years. The authors conclude that prospective chronic device studies are a superior tool for the ongoing monitoring of implanted device performance compared to RPA-based reports.
Analysis of survival in breast cancer patients by using different parametric models
Enera Amran, Syahila; Asrul Afendi Abdullah, M.; Kek, Sie Long; Afiqah Muhamad Jamil, Siti
2017-09-01
In biomedical applications and clinical trials, right censoring often arises when studying time-to-event data: some individuals are still alive at the end of the study or are lost to follow-up at a certain time. Censored data must be handled properly to prevent bias in the analysis. This study therefore analyzed right-censored data with three parametric models: the exponential, Weibull, and log-logistic models. Data on breast cancer patients from Hospital Sultan Ismail, Johor Bahru, from 30 December 2008 to 15 February 2017 were used to illustrate right-censored data. The variables included in this study are the survival time of the breast cancer patients, t; the age of each patient, X1; and the treatment given to the patient, X2. To determine the best parametric model for the survival of breast cancer patients, the performance of each model was compared on the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the log-likelihood value using the statistical software R. For the breast cancer data, all three distributions were shown to be consistent with the data, with the plot of the cumulative hazard function resembling a straight line through the origin. The log-logistic model was the best-fitting parametric model compared with the exponential and Weibull models, since it had the smallest AIC and BIC values and the largest log-likelihood.
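The model comparison described above can be sketched for two of the three distributions. The study itself used R; this Python sketch is only an illustration under stated assumptions: no covariates, hypothetical toy data, hypothetical function names, the sample size in BIC taken as the number of patients, and the Weibull shape found by a simple grid search over the profile likelihood. The log-logistic model is left out because it has no comparably short closed form.

```python
import math

def exponential_loglik(times, events):
    """Maximised log-likelihood of the exponential model with right censoring."""
    d, total = sum(events), sum(times)
    rate = d / total                                  # closed-form MLE of the hazard
    return d * math.log(rate) - rate * total

def weibull_loglik(times, events):
    """Maximised Weibull log-likelihood via a grid over the shape parameter k."""
    d = sum(events)
    sum_log_event = sum(math.log(t) for t, e in zip(times, events) if e)
    best = -math.inf
    for i in range(1, 501):
        k = i / 100                                   # shapes 0.01 .. 5.00
        scale_k = sum(t ** k for t in times) / d      # MLE of scale**k for this k
        ll = d * math.log(k) - d * math.log(scale_k) + (k - 1) * sum_log_event - d
        best = max(best, ll)
    return best

def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * loglik

# Hypothetical survival times (months) and event indicators (1 = death, 0 = censored).
times = [2, 3, 5, 7, 11, 13]
events = [1, 1, 0, 1, 1, 0]
for name, ll, p in [("exponential", exponential_loglik(times, events), 1),
                    ("Weibull", weibull_loglik(times, events), 2)]:
    print(name, round(ll, 3), round(aic(ll, p), 3), round(bic(ll, p, len(times)), 3))
```

The model with the smallest AIC and BIC is preferred. Because the Weibull model contains the exponential as the special case k = 1, its log-likelihood can never be lower, and the penalty terms decide whether the extra parameter is worth keeping.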
Gene–gene interaction analysis for the survival phenotype based on the Cox model
Lee, Seungyeoun; Kwon, Min-Seok; Oh, Jung Mi; Park, Taesung
2012-01-01
Motivation: For the past few decades, many statistical methods in genome-wide association studies (GWAS) have been developed to identify SNP–SNP interactions for case-control studies. However, there has been less work for prospective cohort studies, involving the survival time. Recently, Gui et al. (2011) proposed a novel method, called Surv-MDR, for detecting gene–gene interactions associated with survival time. Surv-MDR is an extension of the multifactor dimensionality reduction (MDR) metho...
Infant and child mortality in Ethiopia: A statistical analysis approach ...
African Journals Online (AJOL)
... associated with child mortality. Furthermore, mother's education and birth order have a substantial impact on child mortality in Ethiopia. Finally, these findings indicate that an increase in mothers' education and improved health care services should in turn raise child survival and decrease child mortality in Ethiopia ...
Gula, Lorne J; Massel, David; Krahn, Andrew D; Skanes, Allan C; Yee, Raymond; Klein, George J
2007-02-01
Recalls and advisories of implanted cardioverter-defibrillators (ICDs) have become an unfortunate reality of cardiac rhythm management. With a paucity of data available on which to base replacement decisions, our goal is to model the potential risks and benefits of ICD generator replacement. The estimated risks are varied through a wide range to determine the potential range of outcomes. Using initial estimates of risk derived from real data on 2915 advisory devices from 17 implanting centers, a decision analysis and Markov model were used to estimate survival according to device replacement decision. Survival rates at 5 years with and without device replacement were estimated at 60.38% and 60.66%, respectively. This difference was not significantly different on comparative analysis, using variability determined by Monte Carlo simulation. One-way and two-way sensitivity analyses are presented, demonstrating the minimal effect of varying estimates of risk. Only variation in risk of device failure had a differential effect on survival, with a survival benefit at 7 years if annual risk of device failure is at least 1.8%. Little differential effect on survival was demonstrated by variation of estimates of arrhythmia risk, nonarrhythmic mortality, and postprocedure infection rate. Survival rates with a generator replacement or nonreplacement strategy in response to ICD recalls are similar and decrease nearly in parallel over time. The main factor with differential effect on survival is risk of device failure, although the level of this risk required to confer a survival advantage to a replacement strategy is quite large.
AMA Statistical Information Based Analysis of a Compressive Imaging System
Hope, D.; Prasad, S.
-based analysis of a compressive imaging system based on a new highly efficient and robust method that enables us to evaluate statistical entropies. Our method is based on the notion of density of states (DOS), which plays a major role in statistical mechanics by allowing one to express macroscopic thermal averages in terms of the number of configuration states of a system for a certain energy level. Instead of computing the number of states at a certain energy level, however, we compute the number of possible configurations (states) of a particular image scene that correspond to a certain probability value. This allows us to compute the probability for each possible state, or configuration, of the scene being imaged. We assess the performance of a single pixel compressive imaging system based on the amount of information encoded and transmitted in parameters that characterize the information in the scene. Amongst many examples, we study the problem of faint companion detection. Here, we show how information in the recorded images depends on the choice of basis for representing the scene and the amount of measurement noise. The noise creates confusion when associating a recorded image with the correct member of the ensemble that produced the image. We show that multiple measurements enable one to mitigate this confusion noise.
Statistical analysis of compressive low rank tomography with random measurements
Acharya, Anirudh; Guţă, Mădălin
2017-05-01
We consider the statistical problem of 'compressive' estimation of low rank states ($r \ll d$) with random basis measurements, where $r$ and $d$ are the rank and dimension of the state respectively. We investigate whether, for a fixed sample size $N$, the estimation error associated with a 'compressive' measurement setup is 'close' to that of the setting where a large number of bases are measured. We generalise and extend previous results, and show that the mean square error (MSE) associated with the Frobenius norm attains the optimal rate $rd/N$ with only $O(r \log d)$ random basis measurements for all states. An important tool in the analysis is the concentration of the Fisher information matrix (FIM). We demonstrate that although a concentration of the MSE follows from a concentration of the FIM for most states, the FIM fails to concentrate for states with eigenvalues close to zero. We analyse this phenomenon in the case of a single qubit and demonstrate a concentration of the MSE about its optimal value despite a lack of concentration of the FIM for states close to the boundary of the Bloch sphere. We also consider the estimation error in terms of a different metric, the quantum infidelity. We show that a concentration in the mean infidelity (MINF) does not exist uniformly over all states, highlighting the importance of loss function choice. Specifically, we show that for states that are nearly pure, the MINF scales as $1/\sqrt{N}$ but the constant converges to zero as the number of settings is increased. This demonstrates a lack of 'compressive' recovery for nearly pure states in this metric.
Tutorial on Biostatistics: Statistical Analysis for Correlated Binary Eye Data.
Ying, Gui-Shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard
2018-02-01
To describe and demonstrate methods for analyzing correlated binary eye data. We describe non-model based (McNemar's test, Cochran-Mantel-Haenszel test) and model-based methods (generalized linear mixed effects model, marginal model) for analyses involving both eyes. These methods were applied to: (1) CAPT (Complications of Age-related Macular Degeneration Prevention Trial) where one eye was treated and the other observed (paired design); (2) ETROP (Early Treatment for Retinopathy of Prematurity) where bilaterally affected infants had one eye treated conventionally and the other treated early and unilaterally affected infants had treatment assigned randomly; and (3) AREDS (Age-Related Eye Disease Study) where treatment was systemic and outcome was eye-specific (both eyes in the same treatment group). In the CAPT (n = 80), treatment group (30% vision loss in treated vs. 44% in observed eyes) was not statistically significant (p = 0.07) when inter-eye correlation was ignored, but was significant (p = 0.01) with McNemar's test and the marginal model. Using standard logistic regression for unfavorable vision in ETROP, standard errors and p-values were larger for person-level covariates and were smaller for ocular covariates than using models accounting for inter-eye correlation. For risk factors of geographic atrophy in AREDS, two-eye analyses accounting for inter-eye correlation yielded more power than one-eye analyses and provided larger standard errors and p-values than invalid two-eye analyses ignoring inter-eye correlation. Ignoring inter-eye correlation can lead to larger p-values for paired designs and smaller p-values when both eyes are in the same group. Marginal models or mixed effects models using the eye as the unit of analysis provide valid inference.
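For the paired design described above, McNemar's test uses only the discordant pairs. The sketch below computes the uncorrected chi-square statistic and its p-value on one degree of freedom. The counts are hypothetical, not the CAPT data, and the function name is an assumption.

```python
import math

def mcnemar(b, c):
    """McNemar's chi-square test (no continuity correction) for paired binary data.

    b: pairs where only the first eye (e.g. treated) had the outcome.
    c: pairs where only the second eye (e.g. observed) had the outcome.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    stat = (b - c) ** 2 / (b + c)
    # Survival function of chi-square(1): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical discordant-pair counts.
print(mcnemar(5, 15))
```

Concordant pairs (both eyes with the outcome, or neither) do not enter the statistic, which is why respecting the pairing can yield a smaller p-value than an analysis that treats the two eyes as independent samples.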
Probability and Statistics Questions and Tests : a critical analysis
Directory of Open Access Journals (Sweden)
Fabrizio Maturo
2015-06-01
In probability and statistics courses, a popular method of evaluating students is to assess them with multiple choice tests. These tests make it possible to evaluate certain types of skills, such as speed of response, short-term memory, mental clarity and the ability to compete. In our opinion, verification through testing can certainly be useful for analysing certain aspects and for speeding up the assessment process, but we should be aware of the limitations of such a standardized procedure and therefore rule out the idea that the assessment of pupils, classes and schools can be reduced to the processing of test results. In support of this thesis, the article discusses in detail the main limitations of tests, presents some recent models proposed in the literature and suggests some alternative assessment methods. Keywords: item response theory, assessment, tests, probability
Statistical analysis of unstructured amino acid residues in protein structures.
Lobanov, M Yu; Garbuzynskiy, S O; Galzitskaya, O V
2010-02-01
We have performed a statistical analysis of unstructured amino acid residues in protein structures available in the databank of protein structures. Data on the occurrence of disordered regions at the ends and in the middle part of protein chains have been obtained: in the regions near the ends (at distance less than 30 residues from the N- or C-terminus), there are 66% of unstructured residues (38% are near the N-terminus and 28% are near the C-terminus), although these terminal regions include only 23% of the amino acid residues. The frequencies of occurrence of unstructured residues have been calculated for each of 20 types in different positions in the protein chain. It has been shown that relative frequencies of occurrence of unstructured residues of 20 types at the termini of protein chains differ from the ones in the middle part of the protein chain; amino acid residues of the same type have different probabilities to be unstructured in the terminal regions and in the middle part of the protein chain. The obtained frequencies of occurrence of unstructured residues in the middle part of the protein chain have been used as a scale for predicting disordered regions from amino acid sequence using the method (FoldUnfold) previously developed by us. This scale of frequencies of occurrence of unstructured residues correlates with the contact scale (previously developed by us and used for the same purpose) at a level of 95%. Testing the new scale on a database of 427 unstructured proteins and 559 completely structured proteins has shown that this scale can be successfully used for the prediction of disordered regions in protein chains.
Statistical Analysis of Upper Ocean Time Series of Vertical Shear.
1982-05-01
... parameters are based on the statistical model for S7(NTz) from Section 4. 5.1 Preliminary Statistical Tests. 5.1.1 Autocorrelation and Run Test for Randomness ... estimating this interval directly from shear autocorrelation functions, and the second involves the use of the run test. In general, shear in the
Gini's ideas: new perspectives for modern multivariate statistical analysis
Directory of Open Access Journals (Sweden)
Angela Montanari
2013-05-01
Corrado Gini (1884-1964) may be considered the greatest Italian statistician. We believe that his important contributions to statistics, although mainly limited to the univariate context, may be profitably employed in modern multivariate statistical methods aimed at overcoming the curse of dimensionality by decomposing multivariate problems into a series of suitably posed univariate ones. In this paper we critically summarize Gini's proposals and consider their impact on multivariate statistical methods, both reviewing already well established applications and suggesting new perspectives. Particular attention will be devoted to classification and regression trees, multiple linear regression, linear dimension reduction methods and transvariation based discrimination.
Sparks, Ross; Carter, Chris; Donnelly, John B; O'Keefe, Christine M; Duncan, Jodie; Keighley, Tim; McAullay, Damien
2008-09-01
This paper is concerned with the challenge of enabling the use of confidential or private data for research and policy analysis, while protecting confidentiality and privacy by reducing the risk of disclosure of sensitive information. Traditional solutions to the problem of reducing disclosure risk include releasing de-identified data and modifying data before release. In this paper we discuss the alternative approach of using a remote analysis server which does not enable any data release, but instead is designed to deliver useful results of user-specified statistical analyses with a low risk of disclosure. The techniques described in this paper enable a user to conduct a wide range of methods in exploratory data analysis, regression and survival analysis, while at the same time reducing the risk that the user can read or infer any individual record attribute value. We illustrate our methods with examples from biostatistics using publicly available data. We have implemented our techniques into a software demonstrator called Privacy-Preserving Analytics (PPA), via a web-based interface to the R software. We believe that PPA may provide an effective balance between the competing goals of providing useful information and reducing disclosure risk in some situations.
Analysis of statistical model properties from discrete nuclear structure data
Firestone, Richard B.
2012-02-01
Experimental M1, E1, and E2 photon strengths have been compiled from experimental data in the Evaluated Nuclear Structure Data File (ENSDF) and the Evaluated Gamma-ray Activation File (EGAF). Over 20,000 Weisskopf reduced transition probabilities were recovered from the ENSDF and EGAF databases. These transition strengths have been analyzed for their dependence on transition energies, initial and final level energies, spin/parity dependence, and nuclear deformation. ENSDF BE1W values were found to increase exponentially with energy, possibly consistent with the Axel-Brink hypothesis, although considerable excess strength was observed for transitions between 4-8 MeV. No similar energy dependence was observed in EGAF or average resonance capture (ARC) data. BM1W average values were nearly constant at all energies above 1 MeV, with substantial excess strength below 1 MeV and between 4-8 MeV. BE2W values decreased exponentially by a factor of 1000 from 0 to 16 MeV. The distribution of ENSDF transition probabilities for all multipolarities could be described by a lognormal statistical distribution. BE1W, BM1W, and BE2W strengths all increased substantially for initial transition level energies between 4-8 MeV, possibly due to dominance of spin-flip and Pygmy resonance transitions at those excitations. Analysis of the ARC data indicated no transition probability dependence on final level spins or energies between 0-3 MeV. The comparison of favored to unfavored transition probabilities for odd-A or odd-Z targets indicated only partial support for the expected branching intensity ratios, with many unfavored transitions having nearly the same strength as favored ones. ARC BE2W transition strengths generally increased with greater deformation. Analysis of ARC data suggests that there is a large E2 admixture in M1 transitions with the mixing ratio δ ≈ 1.0. The ENSDF reduced transition strengths were considerably stronger than those derived from capture gamma ray
A statistical framework for differential network analysis from microarray data
Directory of Open Access Journals (Sweden)
Datta Somnath
2010-02-01
Background: It has long been known that genes do not act alone; rather, groups of genes act in concert during a biological process. Consequently, the expression levels of genes are dependent on each other. Experimental techniques to detect such interacting pairs of genes have been in place for quite some time. With the advent of microarray technology, newer computational techniques to detect such interaction or association between gene expressions are being proposed, which lead to an association network. While most microarray analyses look for genes that are differentially expressed, it is of potentially greater significance to identify how entire association network structures change between two or more biological settings, say normal versus diseased cell types. Results: We provide a recipe for conducting a differential analysis of networks constructed from microarray data under two experimental settings. At the core of our approach lies a connectivity score that represents the strength of genetic association or interaction between two genes. We use this score to propose formal statistical tests for each of the following queries: (i) whether the overall modular structures of the two networks are different, (ii) whether the connectivity of a particular set of "interesting genes" has changed between the two networks, and (iii) whether the connectivity of a given single gene has changed between the two networks. A number of examples of this score are provided. We applied our method to two types of simulated data: Gaussian networks and networks based on differential equations. We show that, for appropriate choices of the connectivity scores and tuning parameters, our method works well on simulated data. We also analyze a real data set involving normal versus heavy mice and identify an interesting set of genes that may play key roles in obesity. Conclusions: Examining changes in network structure can provide valuable information about the
Olive mill wastewater characteristics: modelling and statistical analysis
Directory of Open Access Journals (Sweden)
Martins-Dias, Susete
2004-09-01
A synthesis of the work carried out on Olive Mill Wastewater (OMW) characterisation is given, covering articles published over the last 50 years. Data on OMW characterisation found in the literature are summarised, and correlations among them and with phenolic compounds content are sought. This permits the characteristics of an OMW to be estimated from one simple measurement: the phenolic compounds concentration. A model based on OMW characterisations from six countries was developed, along with a model for Portuguese OMW. The statistical analysis of the correlations obtained indicates that the Chemical Oxygen Demand of a given OMW is a second-degree polynomial function of its phenolic compounds concentration. Tests of the significance of the regressions were carried out, based on multivariable ANOVA and on visual inspection of the distribution of the standardised residuals and their means at confidence levels of 95% and 99%, and clearly validated these models. This modelling work will help in the future planning, operation and monitoring of an OMW treatment plant.
Results after replantation of avulsed permanent teeth. III. Tooth loss and survival analysis.
Pohl, Yango; Wahl, Gerhard; Filippi, Andreas; Kirschner, Horst
2005-04-01
Avulsed permanent teeth were replanted following immediate extraoral endodontic treatment by insertion of posts from a retrograde direction. Some teeth were rescued in a physiologic environment (tissue culture medium contained in a tooth rescue box), and in some cases antiresorptive-regenerative therapy (ART) was used. The aim of the study was to identify variables that influence the incidence of tooth loss and the survival of avulsed and replanted permanent incisors. Twenty-eight permanent teeth in 24 patients aged 7-17 years were investigated. In all teeth extraoral endodontic treatment by retrograde insertion of posts was performed. All nine teeth with functional healing (FH) were in situ. Of the 19 teeth with non-FH, seven were removed to allow transplantations. Two teeth were removed due to severe infrapositions. One tooth was lost following a new trauma. No tooth was lost due to acute infections. In descriptive statistics the incidence of tooth loss was significantly related to healing (P = 0.0098, Fisher's exact test), to treatment planning, i.e. consecutive replantation of premolars and primary canines (P = 0.0001, Fisher's exact test) and to immediate physiologic rescue (P = 0.0394). ART was related to tooth loss when tested in teeth with a compromised periodontal ligament (P = 0.0389). No influence could be found for the parameters maturity, age and all other factors. In a regression analysis treatment planning was the only factor left which had a significant influence (P = 0.0002). The estimated mean survival time (Kaplan-Meier analysis) for all teeth was 57.3 months. The survival was significantly reduced (P = 0.0002, log rank test) when consecutive transplantations were intended and performed. No influence could be found for maturity, age and all other factors. The different findings to previous studies can be explained by the prevention of complications related to conventional endodontic treatment approaches. Statistics have to be carefully
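The Kaplan-Meier analysis mentioned in this abstract can be sketched as a product-limit estimator. The example treats tooth loss as the event and a tooth still in situ at last follow-up as censored. The follow-up times below are hypothetical and are not the study's data.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up times; events: 1 = event (tooth lost), 0 = censored.
    Returns [(time, survival_probability)] at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        lost = leaving = 0
        while i < len(data) and data[i][0] == t:   # group ties at time t
            lost += data[i][1]
            leaving += 1
            i += 1
        if lost:
            survival *= 1 - lost / at_risk
            curve.append((t, survival))
        at_risk -= leaving                          # events and censorings leave the risk set
    return curve

# Hypothetical follow-up in months for four replanted teeth.
print(kaplan_meier([12, 24, 36, 48], [1, 0, 1, 1]))
```

Censored teeth contribute to the risk set up to their last follow-up and then drop out without lowering the curve, which is how the estimator uses incomplete observations.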
Petocz, Agnes; Newbery, Glenn
2010-01-01
Statistics education in psychology often falls disappointingly short of its goals. The increasing use of qualitative approaches in statistics education research has extended and enriched our understanding of statistical cognition processes, and thus facilitated improvements in statistical education and practices. Yet conceptual analysis, a…
STATISTICAL ANALYSIS OF THE DEMOLITION OF THE HITCH DEVICES ELEMENTS
Directory of Open Access Journals (Sweden)
V. V. Artemchuk
2009-03-01
The results of statistical research of wear of automatic coupler body butts and thrust plates of electric locomotives are presented in the article. Due to the increased wear the mentioned elements require special attention.
Climate time series analysis classical statistical and bootstrap methods
Mudelsee, Manfred
2010-01-01
This book presents bootstrap resampling as a computationally intensive method able to meet the challenges posed by the complexities of analysing climate data. It shows how the bootstrap performs reliably in the most important statistical estimation techniques.
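As a rough illustration of bootstrap resampling, the sketch below computes a percentile confidence interval for a mean. It is a minimal example, not the book's method: the data values are hypothetical, and the plain resampling used here assumes independent observations, whereas serially dependent climate series would generally call for a block-based variant.

```python
import random
import statistics

def bootstrap_ci(values, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(values)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat([rng.choice(values) for _ in range(n)]) for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical annual temperature values (degrees C), for illustration only.
values = [14.2, 13.9, 14.8, 15.1, 14.0, 14.6, 15.3, 14.4, 14.9, 15.0]
lower, upper = bootstrap_ci(values)
print(lower, statistics.mean(values), upper)
```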
Gregor Mendel's Genetic Experiments: A Statistical Analysis after 150 Years
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2016-01-01
Roč. 12, č. 2 (2016), s. 20-26 ISSN 1801-5603 Institutional support: RVO:67985807 Keywords : genetics * history of science * biostatistics * design of experiments Subject RIV: BB - Applied Statistics, Operational Research
Directory of Open Access Journals (Sweden)
Kanyanat Kaewiad
2017-08-01
Encapsulation may protect viable probiotic cells. This study aims at the evaluation of a bambara groundnut protein isolate (BGPI)-alginate matrix designed for encapsulating a probiotic Lactobacillus rhamnosus GG. The response surface methodology was employed to find the optimal concentrations of BGPI and alginate for encapsulation efficiency and survival of encapsulated cells. The capsules were prepared at the optimal combination by the traditional extrusion method, composed of 8.66% w/v BGPI and 1.85% w/v alginate. The encapsulation efficiency was 97.24%, whereas the survival rates in an acidic condition and after the freeze-drying process were 95.56% and 95.20%, respectively, higher than those using either BGPI or alginate as the encapsulating agent individually. The designed capsules increased the probiotic L. rhamnosus GG survival relative to free cells in a simulated gastric fluid by 5.00 log cfu/ml after 3 h and in a simulated intestinal fluid by 8.06 log cfu/ml after 4 h. The shelf-life studies of the capsules over 6 months at 4 °C and 30 °C indicated that the remaining number of viable cells in a BGPI-alginate capsule was significantly higher than that of free cells at both temperatures. It was demonstrated that the BGPI-alginate capsule could be utilized as a new probiotic carrier for enhanced gastrointestinal transit and storage applied in food and/or pharmaceutical products.
Esteban, Laura; Clèries, Ramon; Gálvez, Jordi; Pareja, Laura; Escribà, Josep Maria; Sanz, Xavier; Izquierdo, Angel; Galcerán, Jaume; Ribes, Josepa
2013-03-07
The repertoire of statistical methods dealing with the descriptive analysis of the burden of a disease has been expanded and implemented in statistical software packages in recent years. The purpose of this paper is to present a web-based tool, REGSTATTOOLS (http://regstattools.net), intended to provide analysis for the burden of cancer, or other group of disease registry data. Three software applications are included in REGSTATTOOLS: SART (analysis of disease's rates and its time trends), RiskDiff (analysis of percent changes in the rates due to demographic factors and risk of developing or dying from a disease) and WAERS (relative survival analysis). We show a real-data application through the assessment of the burden of tobacco-related cancer incidence in two Spanish regions in the period 1995-2004. Making use of SART we show that lung cancer is the most common cancer among those cancers, with rising trends in incidence among women. We compared 2000-2004 data with that of 1995-1999 to assess percent changes in the number of cases as well as relative survival using RiskDiff and WAERS, respectively. We show that the net change increase in lung cancer cases among women was mainly attributable to an increased risk of developing lung cancer, whereas in men it is attributable to the increase in population size. Among men, lung cancer relative survival was higher in 2000-2004 than in 1995-1999, whereas it was similar among women when these time periods were compared. Unlike other similar applications, REGSTATTOOLS does not require local software installation and it is simple to use, fast and easy to interpret. It is a set of web-based statistical tools intended for automated calculation of population indicators that any professional in health or social sciences may require.
Advanced Online Survival Analysis Tool for Predictive Modelling in Clinical Data Science.
Montes-Torres, Julio; Subirats, José Luis; Ribelles, Nuria; Urda, Daniel; Franco, Leonardo; Alba, Emilio; Jerez, José Manuel
2016-01-01
One of the prevailing applications of machine learning is the use of predictive modelling in clinical survival analysis. In this work, we present our view of the current situation of computer tools for survival analysis, stressing the need of transferring the latest results in the field of machine learning to biomedical researchers. We propose a web based software for survival analysis called OSA (Online Survival Analysis), which has been developed as an open access and user friendly option to obtain discrete time, predictive survival models at individual level using machine learning techniques, and to perform standard survival analysis. OSA employs an Artificial Neural Network (ANN) based method to produce the predictive survival models. Additionally, the software can easily generate survival and hazard curves with multiple options to personalise the plots, obtain contingency tables from the uploaded data to perform different tests, and fit a Cox regression model from a number of predictor variables. In the Materials and Methods section, we depict the general architecture of the application and introduce the mathematical background of each of the implemented methods. The study concludes with examples of use showing the results obtained with public datasets.
Conjunction analysis and propositional logic in fMRI data analysis using Bayesian statistics.
Rudert, Thomas; Lohmann, Gabriele
2008-12-01
To evaluate logical expressions over different effects in data analyses using the general linear model (GLM) and to evaluate logical expressions over different posterior probability maps (PPMs). In functional magnetic resonance imaging (fMRI) data analysis, the GLM was applied to estimate unknown regression parameters. Based on the GLM, Bayesian statistics can be used to determine the probability of conjunction, disjunction, implication, or any other arbitrary logical expression over different effects or contrast. For second-level inferences, PPMs from individual sessions or subjects are utilized. These PPMs can be combined to a logical expression and its probability can be computed. The methods proposed in this article are applied to data from a STROOP experiment and the methods are compared to conjunction analysis approaches for test-statistics. The combination of Bayesian statistics with propositional logic provides a new approach for data analyses in fMRI. Two different methods are introduced for propositional logic: the first for analyses using the GLM and the second for common inferences about different probability maps. The methods introduced extend the idea of conjunction analysis to a full propositional logic and adapt it from test-statistics to Bayesian statistics. The new approaches allow inferences that are not possible with known standard methods in fMRI. (c) 2008 Wiley-Liss, Inc.
Statistics and data analysis for financial engineering with R examples
Ruppert, David
2015-01-01
The new edition of this influential textbook, geared towards graduate or advanced undergraduate students, teaches the statistics necessary for financial engineering. In doing so, it illustrates concepts using financial markets and economic data, R Labs with real-data exercises, and graphical and analytic methods for modeling and diagnosing modeling errors. Financial engineers now have access to enormous quantities of data. To make use of these data, the powerful methods in this book, particularly about volatility and risks, are essential. Strengths of this fully-revised edition include major additions to the R code and the advanced topics covered. Individual chapters cover, among other topics, multivariate distributions, copulas, Bayesian computations, risk management, multivariate volatility and cointegration. Suggested prerequisites are basic knowledge of statistics and probability, matrices and linear algebra, and calculus. There is an appendix on probability, statistics and linear algebra. Practicing fina...
Statistical analysis of natural disasters and related losses
Pisarenko, VF
2014-01-01
The study of disaster statistics and disaster occurrence is a complicated interdisciplinary field involving the interplay of new theoretical findings from several scientific fields like mathematics, physics, and computer science. Statistical studies on the mode of occurrence of natural disasters largely rely on fundamental findings in the statistics of rare events, which were derived in the 20th century. With regard to natural disasters, it is not so much the fact that the importance of this problem for mankind was recognized during the last third of the 20th century - the myths one encounters in ancient civilizations show that the problem of disasters has always been recognized - rather, it is the fact that mankind now possesses the necessary theoretical and practical tools to effectively study natural disasters, which in turn supports effective, major practical measures to minimize their impact. All the above factors have resulted in considerable progress in natural disaster research. Substantial accrued ma...
Nanotechnology in concrete: Critical review and statistical analysis
Glenn, Jonathan
This thesis investigates the use of nanotechnology in cement and concrete through an extensive literature search. A summary is presented. The research was divided into two categories: (1) nanoparticles and (2) nanofibers and nanotubes. The successes and challenges of each category are documented in this thesis. The data from the literature search are analyzed using statistical prediction based on Monte Carlo and Bayesian methods. The thesis shows how statistical prediction can be used to analyze patterns and trends and also to discover optimal additive dosages for concrete mixes.
Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity
Mukherjee, Shashi Bajaj; Sen, Pradip Kumar
2010-10-01
Studying periodic patterns would be expected to be a standard line of attack for recognizing genes in DNA sequences and for similar problems, yet surprisingly little significant work has been done in this direction. This paper studies statistical properties of the DNA sequences of a complete genome using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings, and a standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of the periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
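The mapping-plus-Fourier idea can be sketched as follows. This is only an illustration under stated assumptions: the binary indicator mapping (1 where a chosen base occurs, 0 elsewhere) is one possible mapping and not necessarily one used in the paper, and the toy sequence is hypothetical.

```python
import cmath

def power_spectrum(seq, base):
    """Power of the discrete Fourier transform of the indicator sequence of `base`."""
    n = len(seq)
    # Assumed mapping: 1.0 where the nucleotide equals `base`, else 0.0.
    x = [1.0 if ch == base else 0.0 for ch in seq]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(s) ** 2)
    return spectrum

# Hypothetical toy sequence with an exact period of three bases.
toy = "ATG" * 20
spectrum = power_spectrum(toy, "A")
# Skip k = 0, which only reflects the overall base count.
peak_k = max(range(1, len(spectrum)), key=spectrum.__getitem__)
period = len(toy) / peak_k
print(peak_k, period)
```

For the toy input the dominant non-zero frequency corresponds to a period of three positions. Real sequences would give a noisier spectrum, and the direct O(n²) transform here would normally be replaced by an FFT.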
Categorical and nonparametric data analysis choosing the best statistical technique
Nussbaum, E Michael
2014-01-01
Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain
Scarborough, Anita A.; Hebbeler, Kathleen M.; Spiker, Donna; Simeonsson, Rune J.
2011-01-01
Survival analysis was used to document the developmental achievements of 2298 kindergarten children who participated in the National Early Intervention Longitudinal Study, a study that followed children from entry to Part C early intervention (EI) through kindergarten. Survival functions were produced depicting the percentage of children at…
High-dimensional, massive sample-size Cox proportional hazards regression for survival analysis.
Mittal, Sushil; Madigan, David; Burd, Randall S; Suchard, Marc A
2014-04-01
Survival analysis endures as an old, yet active research field with applications that spread across many domains. Continuing improvements in data acquisition techniques pose constant challenges in applying existing survival analysis methods to these emerging data sets. In this paper, we present tools for fitting regularized Cox survival analysis models on high-dimensional, massive sample-size (HDMSS) data using a variant of the cyclic coordinate descent optimization technique tailored for the sparsity that HDMSS data often present. Experiments on two real data examples demonstrate that efficient analyses of HDMSS data using these tools result in improved predictive performance and calibration.
A Bayesian Statistical Analysis of the Enhanced Greenhouse Effect
de Vos, A.F.; Tol, R.S.J.
1998-01-01
This paper demonstrates that there is a robust statistical relationship between the records of the global mean surface air temperature and the atmospheric concentration of carbon dioxide over the period 1870-1991. As such, the enhanced greenhouse effect is a plausible explanation for the observed
A note on the statistical analysis of point judgment matrices
African Journals Online (AJOL)
by Saaty in the 1970s which has received considerable attention in the mathematical and statistical literature [11, 18]. The core of .... question is how to determine the weights associated with the objects. 3 Distributional approaches ..... Research Foundation of South Africa for financial support. The authors are also grateful.
Statistical analysis of DNT detection using chemically functionalized microcantilever arrays
DEFF Research Database (Denmark)
Bosco, Filippo; Bache, M.; Hwu, E.-T.
2012-01-01
from 1 to 2 cantilevers have been reported, without any information on repeatability and reliability of the presented data. In explosive detection high reliability is needed and thus a statistical measurement approach needs to be developed and implemented. We have developed a DVD-based read-out system...
Statistical analysis and optimization of copper biosorption capability ...
African Journals Online (AJOL)
enoh
2012-03-01
% glucose, 0.5% yeast extract, supplemented with 20 ml/L apple juice) with 15% ... represented by "+" sign, while dead cells, temperature of 25°C, dry weight of 0.13 ... optimum. Using the Microsoft Excel program, statistical t-.
A statistical analysis on the leak detection performance of ...
Indian Academy of Sciences (India)
This paper attempts to provide a statistical insight on the concepts of leak detection performance of WSNs when deployed on overground and underground pipelines. The approach in the study employs the hypothesis testing problem to formulate a solution on the detection plan. Through the hypothesis test, the maximum ...
Did Tanzania Achieve the Second Millennium Development Goal? Statistical Analysis
Magoti, Edwin
2016-01-01
Development Goal "Achieve universal primary education", the challenges faced, along with the way forward towards achieving the fourth Sustainable Development Goal "Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all". Statistics show that Tanzania has made very promising steps…
Herbal gardens of India: A statistical analysis report | Rao | African ...
African Journals Online (AJOL)
A knowledge system of the herbal garden in India was developed and these herbal gardens' information was statistically classified for efficient data processing, sharing and retrieving of information, which could act as a decision tool to the farmers, researchers, decision makers and policy makers in the field of medicinal ...
Statistical analysis of stream sediment geochemical data from Oyi ...
African Journals Online (AJOL)
Ife Journal of Science ... The results of concentrations of twenty-four elements treated with both univariate and multivariate statistical analytical techniques revealed that all the elements analyzed except Co, Cr, Fe and V ... The cumulative probability plots of the elements showed that Mn and Cu consisted of one population.
Analysis of breath samples for lung cancer survival
Energy Technology Data Exchange (ETDEWEB)
Schmekel, Birgitta [Division of Clinical Physiology, County Council of Östergötland, Linköping (Sweden); Clinical Physiology, Department of Medicine and Health, Faculty of Health Sciences, Linköping University, Linköping (Sweden); Winquist, Fredrik, E-mail: frw@ifm.liu.se [Department of Physics, Chemistry and Biology, Linköping University, Linköping SE-581 83 (Sweden); Vikström, Anders [Department of Pulmonary Medicine, University hospital of Linköping, County Council of Östergötland, Linköping (Sweden)
2014-08-20
Graphical abstract: Predictions of survival days for lung cancer patients. - Highlights: • Analyses of exhaled air offer a large diagnostic potential. • Patients with diagnosed lung cancer were studied using an electronic nose. • Excellent predictions and stable models of survival day were obtained. • Consecutive measurements were very important. - Abstract: Analyses of exhaled air by means of electronic noses offer a large diagnostic potential. Such analyses are non-invasive; samples can also be easily obtained from severely ill patients and repeated within short intervals. Lung cancer is the most deadly malignant tumor worldwide, and monitoring of lung cancer progression is of great importance and may help to decide the best therapy. In this report, twenty-two patients with diagnosed lung cancer and ten healthy volunteers were studied using breath samples collected several times at certain intervals and analysed by an electronic nose. The samples were divided into three sub-groups: group d for patients surviving less than one year, group s for patients surviving more than a year, and group h for the healthy volunteers. Prediction models based on partial least squares and artificial neural nets could not classify the collected groups d, s and h, but separated group d well from group h. Using an artificial neural net, group d could be separated from group s. Excellent predictions and stable models of survival day for group d were obtained, based on both partial least squares and artificial neural nets, with correlation coefficients of 0.981 and 0.985, respectively. Finally, the importance of consecutive measurements was shown.
Breastfeeding practices in a public health field practice area in Sri Lanka: a survival analysis
Directory of Open Access Journals (Sweden)
Agampodi Thilini C
2007-10-01
Abstract Background Exclusive breastfeeding up to the completion of the sixth month of age is the national infant feeding recommendation for Sri Lanka. The objective of the present study was to collect data on exclusive breastfeeding up to six months and to describe the association between exclusive breastfeeding and selected socio-demographic factors. Methods A clinic based cross-sectional study was conducted in the Medical Officer of Health area, Beruwala, Sri Lanka in June 2006. Mothers with infants aged 4 to 12 months, attending the 19 child welfare clinics in the area were included in the study. Infants with specific feeding problems (cleft lip and palate and primary lactose intolerance) were excluded. Cluster sampling technique was used and consecutive infants fulfilling the inclusion criteria were enrolled. A total of 219 mothers participated in the study. The statistical tests used were survival analysis (Kaplan-Meier survival curves and Cox proportional hazards model). Results All 219 mothers had initiated breastfeeding. The median duration of exclusive breastfeeding was four months (95% CI 3.75, 4.25). The rates of exclusive breastfeeding at 4 and 6 months were 61.6% (135/219) and 15.5% (24/155) respectively. Bivariate analysis showed that the Muslim ethnicity (p = 0.004, lower levels of parental education (p Conclusion The rate of breastfeeding initiation and exclusive breastfeeding up to the fourth month is very high in Medical Officer of Health area, Beruwala, Sri Lanka. However exclusive breastfeeding up to six months is still low and the prevalence of inappropriate feeding practices is high.
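A median duration of this kind is typically read off a Kaplan-Meier curve. The sketch below is a minimal product-limit estimator on hypothetical data; the durations and censoring flags are invented for illustration and are not the study's data. An event means cessation of exclusive breastfeeding, and a censored record means the infant was still exclusively breastfed when last observed.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        count = sum(1 for tt, _ in data if tt == t)         # all records at time t
        deaths = sum(1 for tt, e in data if tt == t and e)  # events at time t
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= count  # events and censored records both leave the risk set
        i += count
    return curve

def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

# Hypothetical durations in months; 1 = event observed, 0 = censored.
times = [1, 2, 3, 3, 4, 4, 4, 5, 6, 6]
events = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
print(median_survival(curve))
```

Censored records contribute to the risk set until they drop out, which is what distinguishes this estimate from a simple proportion of mothers still exclusively breastfeeding.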
Kenah, Eben; Britton, Tom; Halloran, M Elizabeth; Longini, Ira M
2016-04-01
Recent work has attempted to use whole-genome sequence data from pathogens to reconstruct the transmission trees linking infectors and infectees in outbreaks. However, transmission trees from one outbreak do not generalize to future outbreaks. Reconstruction of transmission trees is most useful to public health if it leads to generalizable scientific insights about disease transmission. In a survival analysis framework, estimation of transmission parameters is based on sums or averages over the possible transmission trees. A phylogeny can increase the precision of these estimates by providing partial information about who infected whom. The leaves of the phylogeny represent sampled pathogens, which have known hosts. The interior nodes represent common ancestors of sampled pathogens, which have unknown hosts. Starting from assumptions about disease biology and epidemiologic study design, we prove that there is a one-to-one correspondence between the possible assignments of interior node hosts and the transmission trees simultaneously consistent with the phylogeny and the epidemiologic data on person, place, and time. We develop algorithms to enumerate these transmission trees and show these can be used to calculate likelihoods that incorporate both epidemiologic data and a phylogeny. A simulation study confirms that this leads to more efficient estimates of hazard ratios for infectiousness and baseline hazards of infectious contact, and we use these methods to analyze data from a foot-and-mouth disease virus outbreak in the United Kingdom in 2001. These results demonstrate the importance of data on individuals who escape infection, which is often overlooked. The combination of survival analysis and algorithms linking phylogenies to transmission trees is a rigorous but flexible statistical foundation for molecular infectious disease epidemiology. PMID:27070316
Survival analysis of patients under chronic HIV-care and ...
African Journals Online (AJOL)
admin
2Department of Statistics, College of Computational & Natural Sciences, Addis University, e-mail wenchekoeshetu@yahoo.com. Original ... Studying the history of disease progression due to. HIV/AIDS and the treatments are useful ... including natural history of HIV infection, primary prophylaxis and immunization, nutrition, ...
DEFF Research Database (Denmark)
Andersen, J.S.; Bedaux, J.J.M.; Kooijman, S.A.L.M.
2000-01-01
This paper describes the influence of design characteristics on the statistical inference for an ecotoxicological hazard-based model using simulated survival data. The design characteristics of interest are the number and spacing of observations (counts) in time, the number and spacing of exposure...... concentrations (within c(min) and c(max)), and the initial number of individuals at time 0 in each concentration. A comparison of the coverage probabilities for confidence limits arising from the profile-likelihood approach and the Wald-based approach is carried out. The Wald-based approach is very sensitive...
Zhu, Xiaoyan; Zhou, Xiaobin; Zhang, Yuan; Sun, Xiao; Liu, Haihua; Zhang, Yingying
2017-12-01
Survival analysis methods have gained widespread use in the field of oncology. For achievement of reliable results, the methodological process and report quality is crucial. This review provides the first examination of methodological characteristics and reporting quality of survival analysis in articles published in leading Chinese oncology journals. To examine methodological and reporting quality of survival analysis, to identify some common deficiencies, to suggest desirable precautions in the analysis, and to offer related advice for authors, readers, and editors. A total of 242 survival analysis articles were included to be evaluated from 1492 articles published in 4 leading Chinese oncology journals in 2013. Articles were evaluated according to 16 established items for proper use and reporting of survival analysis. The application rates of Kaplan-Meier, life table, log-rank test, Breslow test, and Cox proportional hazards model (Cox model) were 91.74%, 3.72%, 78.51%, 0.41%, and 46.28%, respectively; no article used the parametric method for survival analysis. Multivariate Cox model was conducted in 112 articles (46.28%). Follow-up rates were mentioned in 155 articles (64.05%), of which 4 articles were under 80% and the lowest was 75.25%, and 55 articles were 100%. The report rates of all types of survival endpoint were lower than 10%. Eleven of 100 articles which reported a loss to follow-up had stated how to treat it in the analysis. One hundred thirty articles (53.72%) did not perform multivariate analysis. One hundred thirty-nine articles (57.44%) did not define the survival time. Violations and omissions of methodological guidelines included no mention of pertinent checks for proportional hazard assumption; no report of testing for interactions and collinearity between independent variables; no report of calculation method of sample size. Thirty-six articles (32.74%) reported the methods of independent variable selection. The above defects could make potentially inaccurate
Statistical methods for data analysis in particle physics
Lista, Luca
2017-01-01
This concise set of course-based notes provides the reader with the main concepts and tools needed to perform statistical analyses of experimental data, in particular in the field of high-energy physics (HEP). First, the book provides an introduction to probability theory and basic statistics, mainly intended as a refresher from readers’ advanced undergraduate studies, but also to help them clearly distinguish between the Frequentist and Bayesian approaches and interpretations in subsequent applications. More advanced concepts and applications are gradually introduced, culminating in the chapter on both discoveries and upper limits, as many applications in HEP concern hypothesis testing, where the main goal is often to provide better and better limits so as to eventually be able to distinguish between competing hypotheses, or to rule out some of them altogether. Many worked-out examples will help newcomers to the field and graduate students alike understand the pitfalls involved in applying theoretical conc...
Statistical analysis of motion contrast in optical coherence tomography angiography
Cheng, Yuxuan; Pan, Cong; Lu, Tongtong; Hong, Tianyu; Ding, Zhihua; Li, Peng
2015-01-01
Optical coherence tomography angiography (Angio-OCT), mainly based on the temporal dynamics of OCT scattering signals, has found a range of potential applications in clinical and scientific researches. In this work, based on the model of random phasor sums, temporal statistics of the complex-valued OCT signals are mathematically described. Statistical distributions of the amplitude differential (AD) and complex differential (CD) Angio-OCT signals are derived. The theories are validated through the flow phantom and live animal experiments. Using the model developed in this work, the origin of the motion contrast in Angio-OCT is mathematically explained, and the implications in the improvement of motion contrast are further discussed, including threshold determination and its residual classification error, averaging method, and scanning protocol. The proposed mathematical model of Angio-OCT signals can aid in the optimal design of the system and associated algorithms.
Common misconceptions about data analysis and statistics
Motulsky, Harvey J
2015-01-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word “significant”. (4) Overreliance on standard errors, which are often misunderstood. PMID:25692012
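The effect of reanalyzing data in many different ways can be made concrete with a small calculation. The sketch assumes every analysis is an independent test of a true null hypothesis, which is a simplification (repeated analyses of the same data set are usually correlated); the function name is illustrative.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests

for n_tests in (1, 5, 20):
    print(n_tests, round(familywise_error_rate(n_tests), 4))
```

Under that independence assumption, twenty analyses at the 0.05 level give roughly a 64% chance of at least one nominally "significant" result even when no real effect exists.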
The Digital Divide in Romania – A Statistical Analysis
Directory of Open Access Journals (Sweden)
Daniela BORISOV
2012-06-01
The digital divide is a subject of major importance in the current economic circumstances, in which Information and Communication Technologies (ICT) are seen as a significant determinant of domestic competitiveness and a contributor to better quality of life. The latest international reports on various aspects of ICT usage in modern society reveal a decrease of overall digital disparity towards the average trends of the worldwide ICT sector; this relates to the latest advances in mobile and computer penetration rates, both for personal use and for households and businesses. In Romania, the low starting point in the development of the economy and society in the ICT direction was, to some extent, compensated by the rapid annual growth of the last decade. Even with these dynamic developments, the statistical data still indicate poor positions in the European Union hierarchy; in this respect, the prospects of a rapid recovery of the low performance of Romanian ICT endowment and usage continue to be regarded as a challenge for progress in economic and societal terms. The paper presents several methods for assessing the current state of ICT-related aspects in terms of Internet usage, based on the latest data provided by international databases. The current position of the Romanian economy is judged against that of several other economies using statistical methods based on variability measurements: descriptive statistics indicators, static measures of disparities, and distance metrics.
Learning to Translate: A Statistical and Computational Analysis
Directory of Open Access Journals (Sweden)
Marco Turchi
2012-01-01
We present an extensive experimental study of Phrase-based Statistical Machine Translation, from the point of view of its learning capabilities. Very accurate learning curves are obtained, using high-performance computing, and extrapolations of the projected performance of the system under different conditions are provided. Our experiments confirm existing and mostly unpublished beliefs about the learning capabilities of statistical machine translation systems. We also provide insight into the way statistical machine translation learns from data, including the respective influence of translation and language models, the impact of phrase length on performance, and various unlearning and perturbation analyses. Our results support and illustrate the fact that performance improves by a constant amount for each doubling of the data, across different language pairs and different systems. This fundamental limitation seems to be a direct consequence of Zipf's law governing textual data. Although the rate of improvement may depend on both the data and the estimation method, it is unlikely that the general shape of the learning curve will change without major changes in the modeling and inference phases. Possible research directions that address this issue include the integration of linguistic rules or the development of active learning procedures.
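A constant gain per doubling of the data corresponds to a learning curve that is linear in the logarithm of the training-set size. The following is a minimal sketch under that assumption, fitting score = a + b·log2(n) by ordinary least squares. The function names and the data points are hypothetical and are not taken from the paper.

```python
import math

def fit_log_linear(sizes, scores):
    """Least-squares fit of score = a + b * log2(size).

    b is the estimated gain in score per doubling of the data.
    """
    xs = [math.log2(n) for n in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(scores) / len(scores)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def predict(a, b, size):
    """Extrapolate the fitted curve to a new training-set size."""
    return a + b * math.log2(size)

# Hypothetical learning-curve points: each doubling adds 2 points of score.
sizes = [1000, 2000, 4000, 8000]
scores = [20.0, 22.0, 24.0, 26.0]
a, b = fit_log_linear(sizes, scores)
```

Under this model, reaching a fixed score increment always costs twice as much data as the previous increment, which is the limitation the abstract refers to.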
Performance Analysis of Statistical Time Division Multiplexing Systems
Directory of Open Access Journals (Sweden)
Johnson A. AJIBOYE
2010-12-01
Multiplexing is a way of accommodating many low-capacity input sources over a high-capacity outgoing channel. Statistical Time Division Multiplexing (STDM) is a technique that allows more users to be multiplexed over the channel than the channel could otherwise accommodate. STDM exploits the time slots left unused by non-active users and allocates those slots to the active users. Therefore, STDM is appropriate for bursty sources. In this way STDM normally utilizes channel bandwidth better than traditional Time Division Multiplexing (TDM). In this work, the statistical multiplexer is viewed as an M/M/1 queuing system and the performance is measured by comparing analytical results to simulation results using Matlab. The index used to determine the performance of the statistical multiplexer is the number of packets both in the system and in the queue. Analytical results were also compared between M/M/1 and M/M/2 and between M/M/1 and M/D/1 queue systems. At high utilizations, M/M/2 performs better than M/M/1. M/D/1 also outperforms M/M/1.
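The performance index mentioned above, the mean number of packets in the system (L) and in the queue (Lq), has standard closed forms for these three queues. The sketch below uses the textbook steady-state formulas, not the authors' Matlab code. It assumes an arrival rate lam and a per-server service rate mu, and takes M/M/2 to mean two servers that each have rate mu. These parameter names are illustrative.

```python
def _check_stable(rho):
    # Steady-state formulas only hold for utilization strictly below 1.
    if not 0 <= rho < 1:
        raise ValueError("utilization must satisfy 0 <= rho < 1")

def mm1(lam, mu):
    """Return (L, Lq) for an M/M/1 queue."""
    rho = lam / mu
    _check_stable(rho)
    lq = rho ** 2 / (1 - rho)
    return lq + rho, lq

def md1(lam, mu):
    """Return (L, Lq) for an M/D/1 queue (Pollaczek-Khinchine formula)."""
    rho = lam / mu
    _check_stable(rho)
    lq = rho ** 2 / (2 * (1 - rho))
    return lq + rho, lq

def mm2(lam, mu):
    """Return (L, Lq) for an M/M/2 queue with two servers of rate mu each."""
    rho = lam / (2 * mu)  # per-server utilization
    _check_stable(rho)
    lq = 2 * rho ** 3 / (1 - rho ** 2)
    return lq + 2 * rho, lq
```

For example, with lam = 0.9 and mu = 1, `mm1` gives a mean queue length twice that of `md1`, because deterministic service removes the service-time variability.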
Energy Technology Data Exchange (ETDEWEB)
Okamoto, Atsutake; Tsuruta, Kohji; Tanaka, Yoshiaki (Tokyo Metropolitan Komagome Hospital (Japan)); Onodera, Tokio
1992-04-01
The present report is a retrospective analysis of the effect of intraoperative radiation therapy (IORT) for localized but unresectable pancreatic carcinoma. Thirteen of 30 patients treated by IORT in combination with external beam radiation therapy (EBRT) survived for more than one year. The longest survival period, attained by two patients, was 20 months. The 1- and 1.5-year survival rates were 46.5% and 20.8%, respectively, with a median survival of 11 months, whereas the 1-year survival rate was 0%, with a median survival of 6.2 months, for the 16 patients treated by IORT alone. There was a statistically significant difference in survival rate between the two groups (p<0.01). Therefore, additional EBRT may be indispensable for prolongation of the survival period. Moreover, IORT conferred the palliative benefit of relief of pain in more than half of the patients with severe pain. In postmortem examination of seven patients who survived for more than one year, the tumors were replaced by fibrous and hyalinized tissue, as a result of the effect of IORT, and degeneration and necrosis of tumor cells were seen in the center of the tumor, while viable tumor cells remained in the periphery, spreading to the retroperitoneal tissues or neighboring organs. These histopathological findings are distinctive features of carcinoma of the pancreas treated by IORT. (author).
Predicting secondary school dropout among South African adolescents: A survival analysis approach
National Research Council Canada - National Science Library
Xie, Hui (Jimmy); Caldwell, Linda L; Smith, Edward A; Weybright, Elizabeth H; Wegner, Lisa
2017-01-01
...% of the age appropriate population remain enrolled. Survival analysis was used to identify the risk of dropping out of secondary school for male and female adolescents and examine the influence of substance use and leisure experience predictors...
Advances in Statistical Methods for Meta-Analysis.
Hedges, Larry V.
1984-01-01
The adequacy of traditional effect size measures for research synthesis is challenged. Analogues to analysis of variance and multiple regression analysis for effect sizes are presented. The importance of tests for the consistency of effect sizes in interpreting results, and problems in obtaining well-specified models for meta-analysis are…
Bayesian Analysis for Dynamic Generalized Linear Latent Model with Application to Tree Survival Rate
Directory of Open Access Journals (Sweden)
Yu-sheng Cheng
2014-01-01
Logistic regression is the most popular regression technique for modeling categorical data, especially dichotomous variables. The classic logistic regression model is typically used to interpret the relationship between response variables and explanatory variables. However, in real applications, most data sets are collected in follow-up studies, which leads to temporal correlation among the data. In order to characterize the correlations among different variables, a new method based on latent variables is introduced in this study. At the same time, latent variables following an AR(1) model are used to depict time dependence. In the framework of Bayesian analysis, parameter estimates and statistical inferences are carried out via a Gibbs sampler with the Metropolis-Hastings (MH) algorithm. Model comparison, based on the Bayes factor, and forecasting/smoothing of the survival rate of the tree are established. A simulation study is conducted to assess the performance of the proposed method and a pika data set is analyzed to illustrate the real application. Since Bayes factor approaches vary significantly, efficiency tests have been performed in order to decide which solution provides a better tool for the analysis of real relational data sets.
Lee, Ho Jin; Kim, Dong Hyun; Lee, Seul; Koh, Myeong Seok; Kim, So Yeon; Lee, Ji Hyun; Lee, Suee; Oh, Sung Yong; Han, Jin Yeong; Kim, Hyo-Jin; Kim, Sung-Hyun
2015-11-01
This study investigated whether patients with acute promyelocytic leukemia (APL) truly fulfill the diagnostic criteria of overt disseminated intravascular coagulation (DIC), as proposed by the International Society on Thrombosis and Haemostasis (ISTH) and the Korean Society on Thrombosis and Hemostasis (KSTH), and analyzed which component of the criteria most contributes to bleeding diathesis. A single-center retrospective analysis was conducted on newly diagnosed APL patients between January 1995 and May 2012. A total of 46 newly diagnosed APL patients were analyzed. Of these, 27 patients (58.7%) showed initial bleeding. The median number of points per patient fulfilling the diagnostic criteria of overt DIC by the ISTH and the KSTH was 5 (range, 1 to 7) and 3 (range, 1 to 4), respectively. At diagnosis of APL, 22 patients (47.8%) fulfilled the overt DIC diagnostic criteria by either the ISTH or KSTH. In multivariate analysis of the ISTH or KSTH diagnostic criteria for overt DIC, the initial fibrinogen level was the only statistically significant factor associated with initial bleeding (p = 0.035), but it was not associated with overall survival (OS). Initial fibrinogen level is associated with initial presentation of bleeding of APL patients, but does not affect OS.
Multivariate Survival Mixed Models for Genetic Analysis of Longevity Traits
DEFF Research Database (Denmark)
Pimentel Maia, Rafael; Madsen, Per; Labouriau, Rodrigo
2014-01-01
A class of multivariate mixed survival models for continuous and discrete time with a complex covariance structure is introduced in a context of quantitative genetic applications. The methods introduced can be used in many applications in quantitative genetics although the discussion presented....... The discrete time models used are multivariate variants of the discrete relative risk models. These models allow for regular parametric likelihood-based inference by exploring a coincidence of their likelihood functions and the likelihood functions of suitably defined multivariate generalized linear mixed...... models. The models include a dispersion parameter, which is essential for obtaining a decomposition of the variance of the trait of interest as a sum of parcels representing the additive genetic effects, environmental effects and unspecified sources of variability; as required in quantitative genetic...
Multivariate Survival Mixed Models for Genetic Analysis of Longevity Traits
DEFF Research Database (Denmark)
Pimentel Maia, Rafael; Madsen, Per; Labouriau, Rodrigo
2013-01-01
A class of multivariate mixed survival models for continuous and discrete time with a complex covariance structure is introduced in a context of quantitative genetic applications. The methods introduced can be used in many applications in quantitative genetics although the discussion presented....... The discrete time models used are multivariate variants of the discrete relative risk models. These models allow for regular parametric likelihood-based inference by exploring a coincidence of their likelihood functions and the likelihood functions of suitably defined multivariate generalized linear mixed...... models. The models include a dispersion parameter, which is essential for obtaining a decomposition of the variance of the trait of interest as a sum of parcels representing the additive genetic effects, environmental effects and unspecified sources of variability; as required in quantitative genetic...
Up-to-date and precise estimates of cancer patient survival: model-based period analysis.
Brenner, Hermann; Hakulinen, Timo
2006-10-01
Monitoring of progress in cancer patient survival by cancer registries should be as up-to-date as possible. Period analysis has been shown to provide more up-to-date survival estimates than do traditional methods of survival analysis. However, there is a trade-off between up-to-dateness and the precision of period estimates, in that increasing the up-to-dateness of survival estimates by restricting the analysis to a relatively short, recent time period, such as the most recent calendar year for which cancer registry data are available, goes along with a loss of precision. The authors propose a model-based approach to maximize the up-to-dateness of period estimates at minimal loss of precision. The approach is illustrated for monitoring of 5-year relative survival of patients diagnosed with one of 20 common forms of cancer in Finland between 1953 and 2002 by use of data from the nationwide Finnish Cancer Registry. It is shown that the model-based approach provides survival estimates that are as up-to-date as the most up-to-date conventional period estimates and at the same time much more precise than the latter. The modeling approach may further enhance the use of period analysis for deriving up-to-date cancer survival rates.
Nikulin, M; Mesbah, M; Limnios, N
2004-01-01
Parametric and semiparametric models are tools with a wide range of applications to reliability, survival analysis, and quality of life. This self-contained volume examines these tools in survey articles written by experts currently working on the development and evaluation of models and methods. While a number of chapters deal with general theory, several explore more specific connections and recent results in "real-world" reliability theory, survival analysis, and related fields.
Acute Myeloid Leukemia: analysis of epidemiological profile and survival rate.
de Lima, Mariana Cardoso; da Silva, Denise Bousfield; Freund, Ana Paula Ferreira; Dacoregio, Juliana Shmitz; Costa, Tatiana El Jaick Bonifácio; Costa, Imaruí; Faraco, Daniel; Silva, Maurício Laerte
2016-01-01
To describe the epidemiological profile and the survival rate of patients with acute myeloid leukemia (AML) in a state reference pediatric hospital. Clinical-epidemiological, observational, retrospective, descriptive study. The study included new cases of patients with AML, diagnosed between 2004 and 2012, younger than 15 years. Of the 51 patients studied, 84% were white; 45% were females and 55%, males. Regarding age, 8% were younger than 1 year, 47% were aged between 1 and 10 years, and 45% were older than 10 years. The main signs/symptoms were fever (41.1%), asthenia/lack of appetite (35.2%), and hemorrhagic manifestations (27.4%). The most affected extra-medullary site was the central nervous system (14%). In 47% of patients, the white blood cell (WBC) count was below 10,000/mm³ at diagnosis. The minimal residual disease (MRD) was less than 0.1% on the 15th day of treatment in 16% of the sample. Medullary relapse occurred in 14% of cases. When comparing the bone marrow MRD with the vital status, it was observed that 71.42% of the patients with type M3 AML were alive, as were 54.05% of those with non-M3 AML. The death rate was 43% and the main proximate cause was septic shock (63.6%). In this study, the majority of patients were male, white, and older than 1 year. Most patients with WBC count <10,000/mm³ at diagnosis lived. Overall survival was higher in patients with MRD <0.1%. The prognosis was better in patients with AML-M3. Copyright © 2016 Sociedade Brasileira de Pediatria. Published by Elsevier Editora Ltda. All rights reserved.
Sutter, E Grant; Orenduff, Justin; Fox, Will J; Myers, Joshua; Garrigues, Grant E
2017-11-30
Baseball pitching imposes significant stress on the upper extremity and can lead to injury. Many studies have attempted to predict injury through pitching mechanics, most of which have used laboratory setups that are often not practical for population-based analysis. This study sought to predict injury risk in professional baseball pitchers using a statistical model based on video analysis evaluating delivery mechanics in a large population. Career data were collected and video analysis was performed on a random sample of former and current professional pitchers. Delivery mechanics were analyzed using 6 categories: mass and momentum, arm swing, posture, position at foot strike, path of arm acceleration, and finish. Effects of demographics and delivery scores on injury were determined using a survival analysis, and model validity was assessed. A total of 449 professional pitchers were analyzed. Risk of injury significantly increased with later birth date, role as reliever vs starter, and previous major injury. Risk of injury significantly decreased with increase in overall delivery score (7.8%) and independently with increase in score of the mass and momentum (16.5%), arm swing (12.0%), and position at foot strike (22.8%) categories. The accuracy of the model in predicting injury was significantly better when including total delivery score compared with demographic factors alone. This study presents a model that evaluates delivery mechanics and predicts injury risk of professional pitchers based on video analysis and demographic variables. This model can be used to assess injury risk of professional pitchers and can be potentially expanded to assess injury risk in pitchers at other levels. [Orthopedics. 201x; xx(x):xx-xx.]. Copyright 2017, SLACK Incorporated.
Toward a theory of statistical tree-shape analysis
DEFF Research Database (Denmark)
Feragen, Aasa; Lo, Pechin Chien Pau; de Bruijne, Marleen
2013-01-01
In order to develop statistical methods for shapes with a tree-structure, we construct a shape space framework for tree-shapes and study metrics on the shape space. This shape space has singularities, which correspond to topological transitions in the represented trees. We study two closely related...... metrics on the shape space, TED and QED. QED is a quotient Euclidean distance arising naturally from the shape space formulation, while TED is the classical tree edit distance. Using Gromov's metric geometry we gain new insight into the geometries defined by TED and QED. We show that the new metric QED...
Introduction to statistical data analysis for the life sciences
Ekstrom, Claus Thorn
2014-01-01
This text provides a computational toolbox that enables students to analyze real datasets and gain the confidence and skills to undertake more sophisticated analyses. Although accessible with any statistical software, the text encourages a reliance on R. For those new to R, an introduction to the software is available in an appendix. The book also includes end-of-chapter exercises as well as an entire chapter of case exercises that help students apply their knowledge to larger datasets and learn more about approaches specific to the life sciences.
Improving the Conduct and Reporting of Statistical Analysis in Psychology.
Sijtsma, Klaas; Veldkamp, Coosje L S; Wicherts, Jelte M
2016-03-01
We respond to the commentaries Waldman and Lilienfeld (Psychometrika, 2015) and Wigboldus and Dotch (Psychometrika, 2015) provided in response to Sijtsma's (Sijtsma in Psychometrika, 2015) discussion article on questionable research practices. Specifically, we discuss the fear of an increased dichotomy between substantive and statistical aspects of research that may arise when the latter aspects are laid entirely in the hands of a statistician, remedies for false positives and replication failure, and the status of data exploration, and we provide a re-definition of the concept of questionable research practices.
Bayesian statistical analysis of censored data in geotechnical engineering
DEFF Research Database (Denmark)
Ditlevsen, Ove Dalager; Tarp-Johansen, Niels Jacob; Denver, Hans
2000-01-01
The geotechnical engineer is often faced with the problem of how to assess the statistical properties of a soil parameter on the basis of a sample measured in-situ or in the laboratory with the defect that some values have been replaced by interval bounds because the corresponding soil parameter values...... is available about the soil parameter distribution. The present paper shows how a characteristic value by computer calculations can be assessed systematically from the actual sample of censored data supplemented with prior information from a soil parameter data base....
An invariant approach to statistical analysis of shapes
Lele, Subhash R
2001-01-01
INTRODUCTION: A Brief History of Morphometrics; Foundations for the Study of Biological Forms; Description of the Data Sets. MORPHOMETRIC DATA: Types of Morphometric Data; Landmark Homology and Correspondence; Collection of Landmark Coordinates; Reliability of Landmark Coordinate Data; Summary. STATISTICAL MODELS FOR LANDMARK COORDINATE DATA: Statistical Models in General; Models for Intra-Group Variability; Effect of Nuisance Parameters; Invariance and Elimination of Nuisance Parameters; A Definition of Form; Coordinate System Free Representation of Form; Est
JAWS data collection, analysis highlights, and microburst statistics
Mccarthy, J.; Roberts, R.; Schreiber, W.
1983-01-01
Organization, equipment, and the current status of the Joint Airport Weather Studies project initiated in relation to the microburst phenomenon are summarized. Some data collection techniques and preliminary statistics on microburst events recorded by Doppler radar are discussed as well. Radar studies show that microbursts occur much more often than expected, with the majority of the events being potentially dangerous to landing or departing aircraft. Seventy events were registered, with the differential velocities ranging from 10 to 48 m/s; headwind/tailwind velocity differentials over 20 m/s are considered seriously hazardous. It is noted that a correlation is yet to be established between the velocity differential and incoherent radar reflectivity.
Data analysis of asymmetric structures advanced approaches in computational statistics
Saito, Takayuki
2004-01-01
Data Analysis of Asymmetric Structures provides a comprehensive presentation of a variety of models and theories for the analysis of asymmetry and its applications and provides a wealth of new approaches in every section. It meets both the practical and theoretical needs of research professionals across a wide range of disciplines and considers data analysis in fields such as psychology, sociology, social science, ecology, and marketing. In seven comprehensive chapters this guide details theories, methods, and models for the analysis of asymmetric structures in a variety of disciplines and presents future opportunities and challenges affecting research developments and business applications.
Integrated survival analysis using an event-time approach in a Bayesian framework
Walsh, Daniel P.; Dreitz, VJ; Heisey, Dennis M.
2015-01-01
Event-time or continuous-time statistical approaches have been applied throughout the biostatistical literature and have led to numerous scientific advances. However, these techniques have traditionally relied on knowing failure times. This has limited application of these analyses, particularly, within the ecological field where fates of marked animals may be unknown. To address these limitations, we developed an integrated approach within a Bayesian framework to estimate hazard rates in the face of unknown fates. We combine failure/survival times from individuals whose fates are known and times of which are interval-censored with information from those whose fates are unknown, and model the process of detecting animals with unknown fates. This provides the foundation for our integrated model and permits necessary parameter estimation. We provide the Bayesian model, its derivation, and use simulation techniques to investigate the properties and performance of our approach under several scenarios. Lastly, we apply our estimation technique using a piece-wise constant hazard function to investigate the effects of year, age, chick size and sex, sex of the tending adult, and nesting habitat on mortality hazard rates of the endangered mountain plover (Charadrius montanus) chicks. Traditional models were inappropriate for this analysis because fates of some individual chicks were unknown due to failed radio transmitters. Simulations revealed biases of posterior mean estimates were minimal (≤ 4.95%), and posterior distributions behaved as expected with RMSE of the estimates decreasing as sample sizes, detection probability, and survival increased. We determined mortality hazard rates for plover chicks were highest at weights and/or whose nest was within agricultural habitats. Based on its performance, our approach greatly expands the range of problems for which event-time analyses can be used by eliminating the need for having completely known fate data.
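A piece-wise constant hazard function, as used in the plover application above, determines the survival function through the cumulative hazard: S(t) = exp(-Σ_j h_j · (time spent in interval j up to t)). The sketch below illustrates only that relationship, not the authors' Bayesian model. The function name, the interval breakpoints, and the hazard values are hypothetical.

```python
import math

def survival(t, breaks, hazards):
    """Survival probability at time t under a piece-wise constant hazard.

    breaks  : increasing interval start times, beginning at 0
    hazards : hazards[j] applies on [breaks[j], breaks[j + 1]);
              the last interval is open-ended
    """
    cumulative = 0.0
    for j, h in enumerate(hazards):
        start = breaks[j]
        end = breaks[j + 1] if j + 1 < len(breaks) else math.inf
        if t <= start:
            break
        # Add the hazard accumulated during the part of this interval before t.
        cumulative += h * (min(t, end) - start)
    return math.exp(-cumulative)

# Hypothetical daily hazards: 0.1 for days 0-5, 0.2 afterwards.
s7 = survival(7, [0, 5], [0.1, 0.2])  # exp(-(0.1 * 5 + 0.2 * 2))
```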
Radar Derived Spatial Statistics of Summer Rain. Volume 2; Data Reduction and Analysis
Konrad, T. G.; Kropfli, R. A.
1975-01-01
Data reduction and analysis procedures are discussed along with the physical and statistical descriptors used. The statistical modeling techniques are outlined and examples of the derived statistical characterization of rain cells in terms of the several physical descriptors are presented. Recommendations concerning analyses which can be pursued using the data base collected during the experiment are included.
Thomas-Bachli, A L; Pearl, D L; Berke, O; Parmley, E J; Barker, I K
2017-11-01
Surveillance of West Nile virus (WNv) in Ontario has included passive reporting of human cases and testing of trapped mosquitoes and dead birds found by the public. The dead bird surveillance programme was limited to testing within a public health unit (PHU) until a small number of birds test positive. These dead corvid and mosquito surveillance programmes have not been compared for their ability to provide early warning in geographic areas where human cases occur each year. Spatial scan statistics were applied to time-to-event survival data based on first cases of WNv in found dead corvids, mosquitoes and humans. Clusters identified using raw data were compared to clusters based on model-adjusted survival times to evaluate whether geographic and sociodemographic factors influenced their distribution. Statistically significant space-time clusters of PHUs with faster time to detection were found using each surveillance data stream. During 2002-2004, the corvid surveillance programme outperformed the mosquito programme in terms of time to WNv detection, while the clusters of first-positive mosquito pools were more spatially similar to first human cases. In 2006, a cluster of first-positive dead corvids was located in northern PHUs and preceded a cluster of early human cases that was identified after controlling for the influence of geographic region and sociodemographic profile. © 2017 Blackwell Verlag GmbH.
The R software fundamentals of programming and statistical analysis
Lafaye de Micheaux, Pierre; Liquet, Benoit
2013-01-01
The contents of The R Software are presented so as to be both comprehensive and easy for the reader to use. Besides its application as a self-learning text, this book can support lectures on R at any level from beginner to advanced. This book can serve as a textbook on R for beginners as well as more advanced users, working on Windows, macOS or Linux. The first part of the book deals with the heart of the R language and its fundamental concepts, including data organization, import and export, various manipulations, documentation, plots, programming and maintenance. The last chapter in this part deals with object-oriented programming as well as interfacing R with C/C++ or Fortran, and contains a section on debugging techniques. This is followed by the second part of the book, which provides detailed explanations on how to perform many standard statistical analyses, mainly in the Biostatistics field. Topics from mathematical and statistical settings that are included are matrix operations, integration, o...
A COMPARISON OF SOME STATISTICAL TECHNIQUES FOR ROAD ACCIDENT ANALYSIS
OPPE, S INST ROAD SAFETY RES, SWOV
1992-01-01
At the TRRL/SWOV Workshop on Accident Analysis Methodology, held in Amsterdam in 1988, the need to establish a methodology for the analysis of road accidents was firmly stated by all participants. Data from different countries cannot be compared because there is no agreement on research methodology,
Using multivariate statistical analysis to assess changes in water ...
African Journals Online (AJOL)
Canonical correspondence analysis (CCA) showed that the environmental variables used in the analysis, discharge and month of sampling, explained a small proportion of the total variance in the data set – less than 10% at each site. However, the total data set variance, explained by the 4 hypothetical axes generated by ...
Sealed-bid auction of Netherlands mussels: statistical analysis
Kleijnen, J.P.C.; van Schaik, F.D.J.
2011-01-01
This article presents an econometric analysis of the many data on the sealed-bid auction that sells mussels in Yerseke town, the Netherlands. The goals of this analysis are obtaining insight into the important factors that determine the price of these mussels, and quantifying the performance of an
Directory of Open Access Journals (Sweden)
Hawkins Neil
2010-06-01
Background: Data on survival endpoints are usually summarised using either hazard ratio, cumulative number of events, or median survival statistics. Network meta-analysis, an extension of traditional pairwise meta-analysis, is typically based on a single statistic. In this case, studies which do not report the chosen statistic are excluded from the analysis, which may introduce bias. Methods: In this paper we present a tutorial illustrating how network meta-analyses of survival endpoints can combine count and hazard ratio statistics in a single analysis on the hazard ratio scale. We also describe methods for accounting for the correlations in relative treatment effects (such as hazard ratios) that arise in trials with more than two arms. Combination of count and hazard ratio data in a single analysis is achieved by estimating the cumulative hazard for each trial arm reporting count data. Correlation in relative treatment effects in multi-arm trials is preserved by converting the relative treatment effect estimates (the hazard ratios) to arm-specific outcomes (hazards). Results: A worked example of an analysis of mortality data in chronic obstructive pulmonary disease (COPD) is used to illustrate the methods. The data set and WinBUGS code for fixed and random effects models are provided. Conclusions: By incorporating all data presentations in a single analysis, we avoid the potential selection bias associated with conducting an analysis for a single statistic and the potential difficulties of interpretation, misleading results and loss of available treatment comparisons associated with conducting separate analyses for different summary statistics.
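The step of estimating a cumulative hazard for each arm that reports only counts can be illustrated with the standard relationship H = -ln(1 - r/n), where r is the number of events among n patients. This is a minimal sketch, not the WinBUGS code provided with the paper. It assumes equal follow-up within a trial and proportional hazards, so that the ratio of cumulative hazards equals the hazard ratio. The function names are hypothetical.

```python
import math

def cumulative_hazard(events, n):
    """Cumulative hazard implied by `events` out of `n` patients at end of follow-up."""
    if not 0 < events < n:
        raise ValueError("need 0 < events < n for a finite, non-zero hazard")
    return -math.log(1 - events / n)

def log_hazard_ratio(events_t, n_t, events_c, n_c):
    """Log hazard ratio (treatment vs control) from arm-level counts.

    Assumes proportional hazards and equal follow-up in both arms.
    """
    return (math.log(cumulative_hazard(events_t, n_t))
            - math.log(cumulative_hazard(events_c, n_c)))
```

Putting count data on the log hazard ratio scale in this way is what lets trials reporting counts be pooled with trials reporting hazard ratios directly.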
Shouno, Hayaru; Kido, Shoji; Okada, Masato
2004-09-01
Bidirectional associative memory (BAM) is a kind of artificial neural network used to memorize and retrieve heterogeneous pattern pairs. Many efforts have been made to improve BAM from the viewpoint of computer application, but few theoretical studies have been done. We investigated the theoretical characteristics of BAM using a framework of statistical-mechanical analysis. To investigate the equilibrium state of BAM, we applied self-consistent signal to noise analysis (SCSNA) and obtained macroscopic parameter equations and the relative capacity. Moreover, to investigate not only the equilibrium state but also the retrieval process of reaching the equilibrium state, we applied statistical neurodynamics to the update rule of BAM and obtained evolution equations for the macroscopic parameters. These evolution equations are consistent with the results of SCSNA in the equilibrium state.
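The memorize-and-retrieve behaviour of a BAM can be shown with the classical Hebbian (outer-product) construction: bipolar pattern pairs are summed into a weight matrix and recalled by alternating forward and backward sign updates. The sketch below illustrates the basic network only, not the SCSNA or statistical neurodynamics analysis. The pattern values, the function names, and the rule that a zero net input maps to +1 are assumptions made for the example.

```python
def train(pairs):
    """Build the weight matrix W[i][j] = sum over stored pairs of x[i] * y[j]."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    W = [[0] * m for _ in range(n)]
    for x, y in pairs:
        for i in range(n):
            for j in range(m):
                W[i][j] += x[i] * y[j]
    return W

def _sign(s):
    # Assumption: a zero net input is mapped to +1.
    return 1 if s >= 0 else -1

def forward(W, x):
    """Update the y-layer from the x-layer."""
    return [_sign(sum(x[i] * W[i][j] for i in range(len(x))))
            for j in range(len(W[0]))]

def backward(W, y):
    """Update the x-layer from the y-layer."""
    return [_sign(sum(W[i][j] * y[j] for j in range(len(y))))
            for i in range(len(W))]

def recall(W, x, max_iters=10):
    """Alternate forward/backward updates until the pair (x, y) stops changing."""
    y = forward(W, x)
    for _ in range(max_iters):
        x_new = backward(W, y)
        y_new = forward(W, x_new)
        if x_new == x and y_new == y:
            break
        x, y = x_new, y_new
    return x, y

# Two hypothetical bipolar pattern pairs (the x-patterns are orthogonal).
X1, Y1 = [1, -1, 1, -1], [1, 1, -1]
X2, Y2 = [1, 1, -1, -1], [-1, 1, 1]
W = train([(X1, Y1), (X2, Y2)])
```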
Permanent teeth pulpotomy survival analysis: retrospective follow-up.
Kunert, Gustavo Golgo; Kunert, Itaborai Revoredo; da Costa Filho, Luiz Cesar; de Figueiredo, José Antônio Poli
2015-09-01
The aim of the present study is to evaluate risk factors influencing the success rates of pulpotomies both in young and adult populations. Pulpotomies (n=273) performed by a single endodontic specialist were analyzed, and data on success rates were collected. Additionally, possible explanatory variables were noted such as: age, gender, clinical findings (teeth, type of restoration after pulpotomy), radiographic findings (dentin bridge formation) and systemic conditions. The follow-up period varied from 1 to 29 years, and the results were analyzed by Kaplan-Meier survival curves and also by Cox regression. Age at the time of pulpotomy ranged from 8 to 79 and did not influence the success rates (p=0.35). The formation of dentin bridge had a strong protective effect (hazard ratio-HR=0.16, ppulpotomy had the smallest failure rate, and amalgam did not increase the risk of failure significantly in relation to prosthesis. Resin composite restorations following pulpotomy increased the risk of failure by 263% (HR=3.63, ppulpotomy may be a successful treatment at any age, and not only for young permanent teeth. It was also possible to conclude that the use of direct composite restorations following pulpotomies is associated with higher failure rates. Copyright © 2015 Elsevier Ltd. All rights reserved.
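For readers unfamiliar with the Kaplan-Meier curves mentioned in this abstract, the product-limit estimator can be sketched in a few lines. This is a generic textbook version, not the study's analysis; the follow-up times and failure indicators below are made up for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a failure occurred.

    times  : follow-up time for each subject
    events : 1 if the failure was observed at that time, 0 if censored
    """
    failures = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every subject leaves the risk set at its own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = failures.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up data in years (1 = failure, 0 = censored).
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Censored subjects reduce the risk set without lowering the curve, which is why long, uneven follow-up periods such as the 1 to 29 years reported here can still be analyzed together.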
Statistical analysis of questionnaires a unified approach based on R and Stata
Bartolucci, Francesco; Gnaldi, Michela
2015-01-01
Statistical Analysis of Questionnaires: A Unified Approach Based on R and Stata presents special statistical methods for analyzing data collected by questionnaires. The book takes an applied approach to testing and measurement tasks, mirroring the growing use of statistical methods and software in education, psychology, sociology, and other fields. It is suitable for graduate students in applied statistics and psychometrics and practitioners in education, health, and marketing. The book covers the foundations of classical test theory (CTT), test reliability, va
Statistical Analysis of Conductor Motion in LHC Superconducting Dipole Magnets
Calvi, M; Pugnat, P; Siemko, A
2004-01-01
Premature training quenches are usually caused by the transient energy release within the magnet coil as it is energised. The dominant disturbances originate in cable motion and produce observable rapid variations in voltage signals called spikes. The experimental set-up and the raw data treatment to detect these phenomena are briefly recalled. The statistical properties of different features of spikes are presented, for instance the maximal amplitude, the energy, the duration and the time correlation between events. The parameterisation of the mechanical activity of magnets is addressed. The mechanical activity of full-scale prototype and first preseries LHC dipole magnets is analysed and correlations with magnet manufacturing procedures and quench performance are established. The predictability of the quench occurrence is discussed and examples presented.
PHAST: Protein-like heteropolymer analysis by statistical thermodynamics
Frigori, Rafael B.
2017-06-01
PHAST is a software package written in standard Fortran, with MPI and CUDA extensions, able to efficiently perform parallel multicanonical Monte Carlo simulations of single or multiple heteropolymeric chains, as coarse-grained models for proteins. The outcome data can be straightforwardly analyzed within its microcanonical Statistical Thermodynamics module, which allows for computing the entropy, caloric curve, specific heat and free energies. As a case study, we investigate the aggregation of heteropolymers bioinspired by Aβ25-33 fragments and their cross-seeding with IAPP20-29 isoforms. Excellent parallel scaling is observed, even under numerically difficult first-order-like phase transitions, which are properly described by the built-in fully reconfigurable force fields. The package is free and open source, which should encourage users to readily adapt it to specific purposes.
Detailed statistical analysis plan for the pulmonary protection trial
DEFF Research Database (Denmark)
Buggeskov, Katrine B; Jakobsen, Janus C; Secher, Niels H
2014-01-01
BACKGROUND: Pulmonary dysfunction complicates cardiac surgery that includes cardiopulmonary bypass. The pulmonary protection trial evaluates effect of pulmonary perfusion on pulmonary function in patients suffering from chronic obstructive pulmonary disease. This paper presents the statistical plan...... for the main publication to avoid risk of outcome reporting bias, selective reporting, and data-driven results as an update to the published design and method for the trial. RESULTS: The pulmonary protection trial is a randomized, parallel group clinical trial that assesses the effect of pulmonary perfusion......: The pulmonary protection trial investigates the effect of pulmonary perfusion during cardiopulmonary bypass in chronic obstructive pulmonary disease patients. A preserved oxygenation index following pulmonary perfusion may indicate an effect and inspire to a multicenter confirmatory trial to assess a more...
Statistical mechanics analysis of LDPC coding in MIMO Gaussian channels
Energy Technology Data Exchange (ETDEWEB)
Alamino, Roberto C; Saad, David [Neural Computing Research Group, Aston University, Birmingham B4 7ET (United Kingdom)
2007-10-12
Using analytical methods of statistical mechanics, we analyse the typical behaviour of a multiple-input multiple-output (MIMO) Gaussian channel with binary inputs under low-density parity-check (LDPC) network coding and joint decoding. The saddle point equations for the replica symmetric solution are found in particular realizations of this channel, including a small and large number of transmitters and receivers. In particular, we examine the cases of a single transmitter, a single receiver and symmetric and asymmetric interference. Both dynamical and thermodynamical transitions from the ferromagnetic solution of perfect decoding to a non-ferromagnetic solution are identified for the cases considered, marking the practical and theoretical limits of the system under the current coding scheme. Numerical results are provided, showing the typical level of improvement/deterioration achieved with respect to the single transmitter/receiver result, for the various cases.
Statistical analysis of complex systems with nonclassical invariant measures
Fratalocchi, Andrea
2011-02-28
I investigate the problem of finding a statistical description of a complex many-body system whose invariant measure cannot be constructed stemming from classical thermodynamics ensembles. By taking solitons as a reference system and by employing a general formalism based on the Ablowitz-Kaup-Newell-Segur scheme, I demonstrate how to build an invariant measure and, within a one-dimensional phase space, how to develop a suitable thermodynamics. A detailed example is provided with a universal model of wave propagation, with reference to a transparent potential sustaining gray solitons. The system shows a rich thermodynamic scenario, with a free-energy landscape supporting phase transitions and controllable emergent properties. I finally discuss the origin of such behavior, trying to identify common denominators in the area of complex dynamics.
Statistical Analysis of Upper Bound using Data with Uncertainties
Tng, Barry Jia Hao
2014-01-01
Let $F$ be the unknown distribution of a non-negative continuous random variable. We would like to determine if $\mathrm{supp}(F) \subseteq [0,c]$ where $c$ is a constant (a proposed upper bound). Instead of directly observing $X_1,\dots,X_n \overset{\text{i.i.d.}}{\sim} F$, we only get to observe as data $Y_1,\dots,Y_n$ where $Y_i = X_i + \epsilon_i$, with $\epsilon_i$ being random variables representing errors. In this paper, we will explore methods to handle this statistical problem for two primary cases - parametric and nonparametric. The data from deep inelastic scattering experiments on measurements of $R=\sigma_L / \sigma_T$ would be used to test code which has been written to implement the discussed methods.
Statistical analysis of NOMAO customer votes for spots of France
Palovics, Robert; Benczur, Andras; Pap, Julia; Ermann, Leonardo; Phan, Samuel; Chepelianskii, Alexei D; Shepelyansky, Dima L
2015-01-01
We investigate the statistical properties of votes of customers for spots of France collected by the startup company NOMAO. The frequencies of votes per spot and per customer are characterized by power-law distributions which remain stable on a time scale of a decade when the number of votes is varied by almost two orders of magnitude. Using computer science methods we explore the spectrum and the eigenvalues of a matrix containing user ratings of geolocalized items. Eigenvalues nicely map to large towns and regions but show a certain level of instability as we modify the interpretation of the underlying matrix. We evaluate imputation strategies that provide improved prediction performance by reaching geographically smooth eigenvectors. We point to possible links between the distribution of votes and the phenomenon of self-organized criticality.
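A power-law exponent of the kind discussed in this abstract is commonly estimated by maximum likelihood. The sketch below uses the continuous approximation, alpha = 1 + n / sum(ln(x_i / x_min)), on a synthetic sample; it is an illustration only, not the authors' procedure, and the cutoff x_min, the seed and the sample are assumptions (vote counts are discrete, so a discrete estimator may be more appropriate in practice).

```python
import math
import random


def powerlaw_alpha_mle(xs, xmin):
    """Continuous maximum-likelihood estimate of alpha for p(x) ~ x**(-alpha), x >= xmin."""
    tail = [x for x in xs if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum <= 0.0:
        raise ValueError("need at least one observation strictly above xmin")
    return 1.0 + len(tail) / log_sum


# Synthetic data by inverse-transform sampling: x = xmin * (1 - u) ** (-1 / (alpha - 1)).
rng = random.Random(0)  # fixed seed so the sketch is reproducible
true_alpha = 2.5
sample = [(1.0 - rng.random()) ** (-1.0 / (true_alpha - 1.0)) for _ in range(20000)]
alpha_hat = powerlaw_alpha_mle(sample, xmin=1.0)
```

With 20000 points the standard error of the estimate is roughly (alpha - 1) / sqrt(n), so alpha_hat should land close to the value used to generate the sample.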
Statistical Analysis of Complexity Generators for Cost Estimation
Rowell, Ginger Holmes
1999-01-01
Predicting the cost of cutting edge new technologies involved with spacecraft hardware can be quite complicated. A new feature of the NASA Air Force Cost Model (NAFCOM), called the Complexity Generator, is being developed to model the complexity factors that drive the cost of space hardware. This parametric approach is also designed to account for the differences in cost, based on factors that are unique to each system and subsystem. The cost driver categories included in this model are weight, inheritance from previous missions, technical complexity, and management factors. This paper explains the Complexity Generator framework, the statistical methods used to select the best model within this framework, and the procedures used to find the region of predictability and the prediction intervals for the cost of a mission.
Statistical Lineament Analysis in South Greenland Based on Landsat Imagery
DEFF Research Database (Denmark)
Conradsen, Knut; Nilsson, Gert; Thyrsted, Tage
1986-01-01
Linear features, mapped visually from MSS channel-7 photoprints (1: 1 000 000) of Landsat images from South Greenland, were digitized and analyzed statistically. A sinusoidal curve was fitted to the frequency distribution which was then divided into ten significant classes of azimuthal trends. Maps...... showing the density of linear features for each of the ten classes indicate that many of the classes are distributed in zones defined by elongate maxima or rows of maxima. In cases where the elongate maxima and the linear feature direction of the class in question are parallel, a zone of major crustal...... discontinuity is inferred. In the area investigated, such zones coincide with geochemical boundaries and graben structures, and the intersections of some zones seem to control intrusion sites. In cases where there is no parallelism between the elongate maxima and the linear feature direction, an en echelon...
Comparative Analysis of Kernel Methods for Statistical Shape Learning
National Research Council Canada - National Science Library
Rathi, Yogesh; Dambreville, Samuel; Tannenbaum, Allen
2006-01-01
.... In this work, we perform a comparative analysis of shape learning techniques such as linear PCA, kernel PCA, locally linear embedding and propose a new method, kernelized locally linear embedding...
Consolidity analysis for fully fuzzy functions, matrices, probability and statistics
Walaa Ibrahim Gabr
2015-01-01
The paper presents a comprehensive review of the know-how for developing the systems consolidity theory for modeling, analysis, optimization and design in fully fuzzy environment. The solving of systems consolidity theory included its development for handling new functions of different dimensionalities, fuzzy analytic geometry, fuzzy vector analysis, functions of fuzzy complex variables, ordinary differentiation of fuzzy functions and partial fraction of fuzzy polynomials. On the other hand, ...
Adusumilli, Praveen; Konatam, Meher Lakshmi; Gundeti, Sadashivudu; Bala, Stalin; Maddali, Lakshmi Srinivas
2017-01-01
The advent of trastuzumab has brought tremendous changes in the survival of human epidermal growth factor receptor 2 (Her2)-positive breast cancer patients. Despite the availability of the drug, it is still out of reach for many patients. There is very limited real world data regarding treatment challenges and survival analysis of these patients. Primary objective is disease-free survival (DFS) and secondary objective is overall survival (OS) and toxicity profile. Statistical analysis is done using GraphPad Prism 7.02. This is a retrospective study of all patients diagnosed with Her2-positive (Her2+) nonmetastatic invasive breast cancer from January 2007 to December 2013. In the period of this study, 885 patients are diagnosed with carcinoma breast, of which 212 are Her2/neu positive (23.9%). Of the 212 patients, only 76 (35.8%) patients received trastuzumab along with chemotherapy. Patients receiving trastuzumab with chemotherapy have longer 5-year DFS compared to those receiving chemotherapy alone, 92% and 52.6%, respectively (P = 0.0001). Five-year OS is 90.5% and 41.7% in those patients who received chemotherapy with and without trastuzumab, respectively (P = 0.0001). Seven patients (9.45%) developed Grade II reversible diastolic dysfunction. Grade II/III peripheral neuropathy due to paclitaxel is the main adverse effect seen in 21 patients. In spite of improvement in DFS and OS with trastuzumab, the number of patients receiving targeted therapy is very low due to financial constraints which need to be addressed to bridge the gap in survival of Her2+ patients.
Jansen, Lina; Eberle, Andrea; Emrich, Katharina; Gondos, Adam; Holleczek, Bernd; Kajüter, Hiltraud; Maier, Werner; Nennecke, Alice; Pritzkuleit, Ron; Brenner, Hermann
2014-06-15
Although socioeconomic inequalities in cancer survival have been demonstrated both within and between countries, evidence on the variation of the inequalities over time past diagnosis is sparse. Furthermore, no comprehensive analysis of socioeconomic differences in cancer survival in Germany has been conducted. Therefore, we analyzed variations in cancer survival for patients diagnosed with one of the 25 most common cancer sites in 1997-2006 in ten population-based cancer registries in Germany (covering 32 million inhabitants). Patients were assigned a socioeconomic status according to the district of residence at diagnosis. Period analysis was used to derive 3-month, 5-year and conditional 1-year and 5-year age-standardized relative survival for 2002-2006 for each deprivation quintile in Germany. Relative survival of patients living in the most deprived district was compared to survival of patients living in all other districts by model-based period analysis. For 21 of 25 cancer sites, 5-year relative survival was lower in the most deprived districts than in all other districts combined. The median relative excess risk of death over the 25 cancer sites decreased from 1.24 in the first 3 months to 1.16 in the following 9 months to 1.08 in the following 4 years. Inequalities persisted after adjustment for stage. These major regional socioeconomic inequalities indicate a potential for improving cancer care and survival in Germany. Studies on individual-level patient data with access to treatment information should be conducted to examine the reasons for these socioeconomic inequalities in cancer survival in more detail. © 2013 UICC.
Multivariate Statistical Analysis of the Tularosa-Hueco Basin
Agrawala, G.; Walton, J. C.
2006-12-01
The border region is growing rapidly and experiencing a sharp decline both in water quality and availability, putting a strain on the quickly diminishing resource. Since water is used primarily for agricultural, domestic, commercial, livestock, mining and power generation, its rapid depletion is of major concern in the region. Tools such as Principal Component Analysis (PCA), Correspondence Analysis and Cluster Analysis have the potential to present new insight into this problem. The Tularosa-Hueco Basin is analyzed here using some of these Multivariate Analysis methods. PCA is applied to geo-chemical data from the region and a Cluster Analysis is applied to the results in order to group wells with similar characteristics. The derived Principal Axis and well groups are presented as biplots and overlaid on a digital elevation map of the region, providing a visualization of potential interactions and flow paths between surface water and ground water. Simulation by this modeling technique gives valuable insight into the water chemistry and the potential pollution threats to the already diminishing water resources.
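The two-stage workflow described here (PCA on geochemical variables, then clustering of the resulting scores) can be sketched with the standard library alone. Everything below is an assumption made for illustration: the six wells and their three concentrations are invented, the first principal axis is found by power iteration, and the "cluster analysis" is a simple two-group split of the first-axis scores rather than whatever method the authors used.

```python
import math


def standardize(rows):
    """Centre each variable and scale it to unit sample standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def first_principal_axis(z, iters=500):
    """Leading eigenvector of the correlation matrix, by power iteration."""
    n, p = len(z), len(z[0])
    corr = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def two_means_1d(scores, iters=50):
    """Split one-dimensional scores into two groups (label 0 = lower group)."""
    lo, hi = min(scores), max(scores)
    labels = [0] * len(scores)
    for _ in range(iters):
        labels = [0 if abs(s - lo) <= abs(s - hi) else 1 for s in scores]
        g0 = [s for s, lab in zip(scores, labels) if lab == 0]
        g1 = [s for s, lab in zip(scores, labels) if lab == 1]
        lo, hi = sum(g0) / len(g0), sum(g1) / len(g1)
    return labels


# Hypothetical wells: three ion concentrations per well (invented values).
wells = [[10, 20, 5], [12, 22, 6], [11, 19, 5.5],
         [50, 80, 30], [52, 85, 29], [49, 78, 31]]
z = standardize(wells)
axis = first_principal_axis(z)
scores = [sum(a * x for a, x in zip(axis, row)) for row in z]
labels = two_means_1d(scores)
```

In a real study the scores on the first two axes would be plotted as a biplot and the group labels mapped back onto well locations, as the abstract describes.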
Statistical and Spatial Analysis of Borderland Ground Water Geochemistry
Agrawala, G. K.; Woocay, A.; Walton, J. C.
2007-12-01
The border region is growing rapidly and experiencing a sharp decline both in water quality and availability, putting a strain on the quickly diminishing resource. Since water is used primarily for agricultural, domestic, commercial, livestock, mining and power generation, its rapid depletion is of major concern in the region. Tools such as Principal Component Analysis (PCA), Correspondence Analysis and Cluster Analysis have the potential to present new insight into this problem. The Borderland groundwater is analyzed here using some of these Multivariate Analysis methods. PCA is applied to geo-chemical data from the region and a Cluster Analysis is applied to the results in order to group wells with similar characteristics. The derived Principal Axis and well groups are presented as biplots and overlaid on a digital elevation map of the region, providing a visualization of potential interactions and flow paths between surface water and ground water. Simulation by this modeling technique gives valuable insight into the water chemistry and the potential pollution threats to the already diminishing water resources.
New Statistical Approach to the Analysis of Hierarchical Data
Neuman, S. P.; Guadagnini, A.; Riva, M.
2014-12-01
Many variables possess a hierarchical structure reflected in how their increments vary in space and/or time. Quite commonly the increments (a) fluctuate in a highly irregular manner; (b) possess symmetric, non-Gaussian frequency distributions characterized by heavy tails that often decay with separation distance or lag; (c) exhibit nonlinear power-law scaling of sample structure functions in a midrange of lags, with breakdown in such scaling at small and large lags; (d) show extended power-law scaling (ESS) at all lags; and (e) display nonlinear scaling of power-law exponent with order of sample structure function. Some interpret this to imply that the variables are multifractal, which explains neither breakdowns in power-law scaling nor ESS. We offer an alternative interpretation consistent with all above phenomena. It views data as samples from stationary, anisotropic sub-Gaussian random fields subordinated to truncated fractional Brownian motion (tfBm) or truncated fractional Gaussian noise (tfGn). The fields are scaled Gaussian mixtures with random variances. Truncation of fBm and fGn entails filtering out components below data measurement or resolution scale and above domain scale. Our novel interpretation of the data allows us to obtain maximum likelihood estimates of all parameters characterizing the underlying truncated sub-Gaussian fields. These parameters in turn make it possible to downscale or upscale all statistical moments to situations entailing smaller or larger measurement or resolution and sampling scales, respectively. They also allow one to perform conditional or unconditional Monte Carlo simulations of random field realizations corresponding to these scales. Aspects of our approach are illustrated on field and laboratory measured porous and fractured rock permeabilities, as well as soil texture characteristics and neural network estimates of unsaturated hydraulic parameters in a deep vadose zone near Phoenix, Arizona. We also use our approach
Chrcanovic, B R; Kisch, J; Albrektsson, T; Wennerberg, A
2016-11-01
Recent studies have suggested that the insertion of dental implants in patients diagnosed with bruxism negatively affected the implant failure rates. The aim of the present study was to investigate the association between bruxism and the risk of dental implant failure. This retrospective study is based on 2670 patients who received 10 096 implants at one specialist clinic. Implant- and patient-related data were collected. Descriptive statistics were used to describe the patients and implants. Multilevel mixed effects parametric survival analysis was used to test the association between bruxism and risk of implant failure adjusting for several potential confounders. Criteria from a recent international consensus (Lobbezoo et al., J Oral Rehabil, 40, 2013, 2) and from the International Classification of Sleep Disorders (International classification of sleep disorders, revised: diagnostic and coding manual, American Academy of Sleep Medicine, Chicago, 2014) were used to define and diagnose the condition. The number of implants with information available for all variables totalled 3549, placed in 994 patients, with 179 implants reported as failures. The implant failure rates were 13·0% (24/185) for bruxers and 4·6% (155/3364) for non-bruxers (P bruxism was a statistically significant risk factor for implant failure (HR 3·396; 95% CI 1·314, 8·777; P = 0·012), as well as implant length, implant diameter, implant surface, bone quantity D in relation to quantity A, bone quality 4 in relation to quality 1 (Lekholm and Zarb classification), smoking and the intake of proton pump inhibitors. It is suggested that bruxism may be associated with an increased risk of dental implant failure. © 2016 John Wiley & Sons Ltd.
Power flow as a complement to statistical energy analysis and finite element analysis
Cuschieri, J. M.
1987-01-01
Present methods of analysis of the structural response and the structure-borne transmission of vibrational energy use either finite element (FE) techniques or statistical energy analysis (SEA) methods. The FE methods are a very useful tool at low frequencies where the number of resonances involved in the analysis is rather small. On the other hand SEA methods can predict with acceptable accuracy the response and energy transmission between coupled structures at relatively high frequencies where the structural modal density is high and a statistical approach is the appropriate solution. In the mid-frequency range, a relatively large number of resonances exist which make the finite element method too costly. On the other hand SEA methods can only predict an average level form. In this mid-frequency range a possible alternative is to use power flow techniques, where the input and flow of vibrational energy to excited and coupled structural components can be expressed in terms of input and transfer mobilities. This power flow technique can be extended from low to high frequencies and this can be integrated with established FE models at low frequencies and SEA models at high frequencies to form a verification of the method. This method of structural analysis using power flow and mobility methods, and its integration with SEA and FE analysis, is applied to the case of two thin beams joined together at right angles.
Survival Analysis of Faculty Retention and Promotion in the Social Sciences by Gender.
Directory of Open Access Journals (Sweden)
Janet M Box-Steffensmeier
Recruitment and retention of talent is central to the research performance of universities. Existing research shows that, while men are more likely than women to be promoted at the different stages of the academic career, no such difference is found when it comes to faculty retention rates. Current research on faculty retention, however, focuses on careers in science, technology, engineering, and mathematics (STEM). We extend this line of inquiry to the social sciences. We follow 2,218 tenure-track assistant professors hired since 1990 in seven social science disciplines at nineteen U.S. universities from time of hire to time of departure. We also track their time to promotion to associate and full professor. Using survival analysis, we examine gender differences in time to departure and time to promotion. Our methods account for censoring and unobserved heterogeneity, as well as effect heterogeneity across disciplines and cohorts. We find no statistically significant differences between genders in faculty retention. However, we do find that men are more likely to be granted tenure than women. When it comes to promotion to full professor, the results are less conclusive, as the effect of gender is sensitive to model specification. The results corroborate previous findings about gender patterns in faculty retention and promotion. They suggest that advances have been made when it comes to gender equality in retention and promotion, but important differences still persist.
Ockham's razor and Bayesian analysis. [statistical theory for systems evaluation
Jefferys, William H.; Berger, James O.
1992-01-01
'Ockham's razor', the ad hoc principle enjoining the greatest possible simplicity in theoretical explanations, is presently shown to be justifiable as a consequence of Bayesian inference; Bayesian analysis can, moreover, clarify the nature of the 'simplest' hypothesis consistent with the given data. By choosing the prior probabilities of hypotheses, it becomes possible to quantify the scientific judgment that simpler hypotheses are more likely to be correct. Bayesian analysis also shows that a hypothesis with fewer adjustable parameters intrinsically possesses an enhanced posterior probability, due to the clarity of its predictions.
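The claim that a hypothesis with fewer adjustable parameters is automatically favoured when the data agree with it can be checked on a toy problem. The sketch below is an illustration of the general idea, not an example taken from the paper: it compares a "simple" hypothesis (a coin with success probability exactly 0.5) with a "flexible" one (unknown probability with a uniform prior), using the closed-form marginal likelihoods for k successes in n trials. The function names and the counts are made up.

```python
from math import comb


def evidence_simple(k, n):
    """Marginal likelihood of k successes in n trials when p is fixed at 0.5."""
    return comb(n, k) * 0.5 ** n


def evidence_flexible(k, n):
    """Marginal likelihood when p has a uniform prior on [0, 1].

    The integral of C(n, k) * p**k * (1 - p)**(n - k) over p equals 1 / (n + 1).
    """
    return 1.0 / (n + 1)


def bayes_factor_simple_vs_flexible(k, n):
    """Values above 1 favour the simple hypothesis, values below 1 the flexible one."""
    return evidence_simple(k, n) / evidence_flexible(k, n)


# Data compatible with the simple hypothesis: 10 successes in 20 trials.
bf_balanced = bayes_factor_simple_vs_flexible(10, 20)
# Data that the simple hypothesis predicts poorly: 18 successes in 20 trials.
bf_skewed = bayes_factor_simple_vs_flexible(18, 20)
```

The flexible hypothesis spreads its predictions evenly over all 21 possible outcomes, so it is penalised when the sharper prediction of the simple hypothesis turns out to be right, and rewarded only when the data fall far from that prediction.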
SEDA: A software package for the Statistical Earthquake Data Analysis
Lombardi, A. M.
2017-03-01
In this paper, the first version of the software SEDA (SEDAv1.0), designed to help seismologists statistically analyze earthquake data, is presented. The package consists of a user-friendly Matlab-based interface, which allows the user to easily interact with the application, and a computational core of Fortran codes, to guarantee the maximum speed. The primary factor driving the development of SEDA is to guarantee research reproducibility, which is a growing movement among scientists and highly recommended by the most important scientific journals. SEDAv1.0 is mainly devoted to producing accurate and fast outputs. Less care has been taken over the graphic appeal, which will be improved in the future. The main part of SEDAv1.0 is devoted to ETAS modeling. SEDAv1.0 contains a set of consistent tools on ETAS, allowing the estimation of parameters, the testing of the model on data, the simulation of catalogs, the identification of sequences and the calculation of forecasts. The peculiarities of routines inside SEDAv1.0 are discussed in this paper. More specific details on the software are presented in the manual accompanying the program package.
Statistical language analysis for automatic exfiltration event detection.
Energy Technology Data Exchange (ETDEWEB)
Robinson, David Gerald
2010-04-01
This paper discusses the recent development of a statistical approach for the automatic identification of anomalous network activity that is characteristic of exfiltration events. This approach is based on the language processing method referred to as latent Dirichlet allocation (LDA). Cyber security experts currently depend heavily on a rule-based framework for initial detection of suspect network events. The application of the rule set typically results in an extensive list of suspect network events that are then further explored manually for suspicious activity. The ability to identify anomalous network events is heavily dependent on the experience of the security personnel wading through the network log. Limitations of this approach are clear: rule-based systems only apply to exfiltration behavior that has previously been observed, and experienced cyber security personnel are rare commodities. Since the new methodology is not a discrete rule-based approach, it is more difficult for an insider to disguise the exfiltration events. A further benefit is that the methodology provides a risk-based approach that can be implemented in a continuous, dynamic or evolutionary fashion. This permits suspect network activity to be identified early with a quantifiable risk associated with decision making when responding to suspicious activity.
Statistics Education Research in Malaysia and the Philippines: A Comparative Analysis
Reston, Enriqueta; Krishnan, Saras; Idris, Noraini
2014-01-01
This paper presents a comparative analysis of statistics education research in Malaysia and the Philippines by modes of dissemination, research areas, and trends. An electronic search for published research papers in the area of statistics education from 2000-2012 yielded 20 for Malaysia and 19 for the Philippines. Analysis of these papers showed…
Spatial statistical analysis of dissatisfaction with the performance of ...
African Journals Online (AJOL)
The analysis reveals spatial clustering in the level of dissatisfaction with the performance of local government. It also reveals percentage of respondents dissatisfied with dwelling, mean sense of safety index, and percentage agree the country is going in the wrong direction, as significant predictors of the level of local ...
The statistical analysis of results of solidification of fly ash
Directory of Open Access Journals (Sweden)
Pliešovská Natália
1996-09-01
The analysis shows that there is no statistical dependence between the contents of heavy metals in fly ash on one side, and the leaching characteristics of heavy metals from the stabilized waste and from the waste itself on the other side.
Statistical Analysis of Hit/Miss Data (Preprint)
2012-07-01
HDBK-1823A, 2009). Other agencies and industries have also made use of this guidance (Gandossi et al., 2010) and (Drury et al., 2006). It should...2002. Drury, Ghylin, and Holness, Error Analysis and Threat Magnitude for Carry-on Bag Inspection, Proceedings of the Human Factors and Ergonomic
Statistical Analysis Of Trace Element Concentrations In Shale ...
African Journals Online (AJOL)
Principal component and regression analysis of geochemical data in sampled shale – carbonate sediments in Guyuk, Northeastern Nigeria reveal enrichments of four predictor elements, Ni, Co, Cr and Cu to gypsum mineralisation. Ratios of their enrichments are Cu(10:1), Ni(8:1), Co(58:1) and Cr(30:1). The >70% ...
Open Access Publishing Trend Analysis: Statistics beyond the Perception
Poltronieri, Elisabetta; Bravo, Elena; Curti, Moreno; Ferri, Maurizio; Mancini, Cristina
2016-01-01
Introduction: The purpose of this analysis was twofold: to track the number of open access journals acquiring impact factor, and to investigate the distribution of subject categories pertaining to these journals. As a case study, journals in which the researchers of the National Institute of Health (Istituto Superiore di Sanità) in Italy have…
Multivariate statistical analysis of a multi-step industrial processes
DEFF Research Database (Denmark)
Reinikainen, S.P.; Høskuldsson, Agnar
2007-01-01
multivariate multi-step processes, where results from each step are used to evaluate future results, is presented. The methods presented are based on Priority PLS Regression. The basic idea is to compute the weights in the regression analysis for given steps, but adjust all data by the resulting score vectors...
A statistical inference method for the stochastic reachability analysis
Bujorianu, L.M.
2005-01-01
Many control systems have large or infinite state spaces that cannot be easily abstracted. One method to analyse and verify these systems is reachability analysis. It is frequently used for air traffic control and power plants. Because of lack of complete information about the environment or
Statistical analysis of geodetic networks for detecting regional events
Granat, Robert
2004-01-01
We present an application of hidden Markov models (HMMs) to analysis of geodetic time series in Southern California. Our model fitting method uses a regularized version of the deterministic annealing expectation-maximization algorithm to ensure that model solutions are both robust and of high quality.
Sealed-Bid Auction of Dutch Mussels : Statistical Analysis
Kleijnen, J.P.C.; van Schaik, F.D.J.
2007-01-01
This article presents an econometric analysis of the many data on the sealed-bid auction that sells mussels in Yerseke town, the Netherlands. The goals of this analysis are obtaining insight into the important factors that determine the price of these mussels, and quantifying the performance of an
A comparative assessment of statistical methods for extreme weather analysis
Schlögl, Matthias; Laaha, Gregor
2017-04-01
Extreme weather exposure assessment is of major importance for scientists and practitioners alike. We compare different extreme value approaches and fitting methods with respect to their value for assessing extreme precipitation and temperature impacts. Based on an Austrian data set from 25 meteorological stations representing diverse meteorological conditions, we assess the added value of partial duration series over the standardly used annual maxima series in order to give recommendations for performing extreme value statistics of meteorological hazards. Results show the merits of the robust L-moment estimation, which yielded better results than maximum likelihood estimation in 62 % of all cases. At the same time, results question the general assumption of the threshold excess approach (employing partial duration series, PDS) being superior to the block maxima approach (employing annual maxima series, AMS) due to information gain. For low return periods (non-extreme events) the PDS approach tends to overestimate return levels as compared to the AMS approach, whereas an opposite behavior was found for high return levels (extreme events). In extreme cases, an inappropriate threshold was shown to lead to considerable biases that may outperform the possible gain of information from including additional extreme events by far. This effect was neither visible from the square-root criterion, nor from standardly used graphical diagnosis (mean residual life plot), but from a direct comparison of AMS and PDS in synoptic quantile plots. We therefore recommend performing AMS and PDS approaches simultaneously in order to select the best suited approach. This will make the analyses more robust, in cases where threshold selection and dependency introduces biases to the PDS approach, but also in cases where the AMS contains non-extreme events that may introduce similar biases. For assessing the performance of extreme events we recommend conditional performance measures that focus
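To make the L-moment fitting mentioned in the abstract above more concrete, here is a minimal sketch of how a return level could be estimated from an annual maxima series. It assumes a Gumbel distribution for simplicity (the abstract does not say which extreme value distribution was fitted), and the function names and the sample data are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def sample_l_moments(values):
    """Return the first two sample L-moments (l1, l2) of a data set."""
    x = sorted(values)
    n = len(x)
    b0 = sum(x) / n
    # b1 weights each order statistic by its rank (0-based) over (n - 1).
    b1 = sum(i * xi for i, xi in enumerate(x)) / (n * (n - 1))
    return b0, 2.0 * b1 - b0


def fit_gumbel_lmom(annual_maxima):
    """Estimate Gumbel location and scale by matching L-moments."""
    l1, l2 = sample_l_moments(annual_maxima)
    scale = l2 / math.log(2.0)
    loc = l1 - EULER_GAMMA * scale
    return loc, scale


def return_level(loc, scale, period_years):
    """Value exceeded on average once every `period_years` years."""
    non_exceedance = 1.0 - 1.0 / period_years
    return loc - scale * math.log(-math.log(non_exceedance))


# Hypothetical annual maximum daily precipitation values (mm).
maxima = [41.0, 55.5, 38.2, 62.1, 47.3, 71.8, 44.0, 58.9, 50.2, 66.4]
loc, scale = fit_gumbel_lmom(maxima)
levels = {T: return_level(loc, scale, T) for T in (2, 10, 100)}
```

A partial duration series would need a threshold and a different distribution (typically a generalized Pareto), which is where the threshold-selection biases discussed in the abstract would enter.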
Detailed statistical analysis plan for the pulmonary protection trial.
Buggeskov, Katrine B; Jakobsen, Janus C; Secher, Niels H; Jonassen, Thomas; Andersen, Lars W; Steinbrüchel, Daniel A; Wetterslev, Jørn
2014-12-23
Pulmonary dysfunction complicates cardiac surgery that includes cardiopulmonary bypass. The pulmonary protection trial evaluates the effect of pulmonary perfusion on pulmonary function in patients suffering from chronic obstructive pulmonary disease. This paper presents the statistical plan for the main publication to avoid risk of outcome reporting bias, selective reporting, and data-driven results as an update to the published design and method for the trial. The pulmonary protection trial is a randomized, parallel group clinical trial that assesses the effect of pulmonary perfusion with oxygenated blood or Custodiol™ HTK (histidine-tryptophan-ketoglutarate) solution versus no pulmonary perfusion in 90 chronic obstructive pulmonary disease patients. Patients, the statistician, and the conclusion drawers are blinded to intervention allocation. The primary outcome is the oxygenation index from 10 to 15 minutes after the end of cardiopulmonary bypass until 24 hours thereafter. Secondary outcome measures are oral tracheal intubation time, days alive outside the intensive care unit, days alive outside the hospital, and 30- and 90-day mortality, and one or more of the following selected serious adverse events: pneumothorax or pleural effusion requiring drainage, major bleeding, reoperation, severe infection, cerebral event, hyperkalemia, acute myocardial infarction, cardiac arrhythmia, renal replacement therapy, and readmission for a respiratory-related problem. The pulmonary protection trial investigates the effect of pulmonary perfusion during cardiopulmonary bypass in chronic obstructive pulmonary disease patients. A preserved oxygenation index following pulmonary perfusion may indicate an effect and inspire a multicenter confirmatory trial to assess a more clinically relevant outcome. ClinicalTrials.gov identifier: NCT01614951, registered on 6 June 2012.
Statistical Analysis of the Grid Connected Photovoltaic System Performance Ratio
Directory of Open Access Journals (Sweden)
Javier Vilariño-García
2017-05-01
A methodology is presented based on the application of analysis of variance and Tukey's method to a data set of solar radiation in the plane of the photovoltaic modules and the corresponding values of power delivered to the grid, recorded at 10-minute intervals from sunrise to sunset during the 52 weeks of the year 2013. These data were obtained through a monitoring system located in a photovoltaic plant of 10 MW rated power in Cordoba, consisting of 16 transformers and 98 inverters. The method compares the mean performance ratios of the transformer centers: an analysis of variance first tests, at a significance level of 5%, whether at least one mean differs significantly from the rest, and Tukey's test then identifies which transformer centers fall below average because of a fault that needs to be detected and corrected.
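The first step of the procedure described in the abstract above, a one-way analysis of variance across groups, can be sketched as follows. This is an illustration only: the function name and the toy numbers are hypothetical, and the sketch stops at the F statistic. Comparing it with the 5% critical value and running the Tukey follow-up would normally be left to a statistics library.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean(v for g in groups for v in g)
    group_means = [fmean(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical weekly performance ratios for three transformer centers.
centers = [
    [0.81, 0.83, 0.82, 0.80],
    [0.82, 0.84, 0.81, 0.83],
    [0.74, 0.76, 0.75, 0.73],  # a center that may be underperforming
]
f_stat, df_b, df_w = one_way_anova_f(centers)
```

A large F relative to the critical value for (df_b, df_w) degrees of freedom would indicate that at least one center mean differs from the others.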
Determinants of ICT Infrastructure: A Cross-Country Statistical Analysis
Krüger, Jens J.; Rhiel, Mathias
2016-01-01
We investigate economic and institutional determinants of ICT infrastructure for a broad cross section of more than 100 countries. The ICT variable is constructed from a principal components analysis. The explanatory variables are selected by variants of the Lasso estimator from the machine learning literature. In addition to least squares, we also apply robust and semiparametric regression estimators. The results show that the regressions are able to explain ICT infrastructure very well. Maj...
Practical guidance for statistical analysis of operational event data
Energy Technology Data Exchange (ETDEWEB)
Atwood, C.L.
1995-10-01
This report presents ways to avoid mistakes that are sometimes made in analysis of operational event data. It then gives guidance on what to do when a model is rejected, a list of standard types of models to consider, and principles for choosing one model over another. For estimating reliability, it gives advice on which failure modes to model, and moment formulas for combinations of failure modes. The issues are illustrated with many examples and case studies.
Statistical analysis of joint toxicity in biological growth experiments
DEFF Research Database (Denmark)
Spliid, Henrik; Tørslev, J.
1994-01-01
The authors formulate a model for the analysis of designed biological growth experiments where a mixture of toxicants is applied to biological target organisms. The purpose of such experiments is to assess the toxicity of the mixture in comparison with the toxicity observed when the toxicants are...... is applied on data from an experiment where inhibition of the growth of the bacteria Pseudomonas fluorescens caused by different mixtures of pentachlorophenol and aniline was studied....
Analysis of tensile bond strengths using Weibull statistics.
Burrow, Michael F; Thomas, David; Swain, Mike V; Tyas, Martin J
2004-09-01
Tensile strength tests of restorative resins bonded to dentin, and the resultant strengths of interfaces between the two, exhibit wide variability. Many variables can affect test results, including specimen preparation and storage, test rig design and experimental technique. However, the more fundamental source of variability, that associated with the brittle nature of the materials, has received little attention. This paper analyzes results from micro-tensile tests on unfilled resins and adhesive bonds between restorative resin composite and dentin in terms of reliability using the Weibull probability of failure method. Results for the tensile strengths of Scotchbond Multipurpose Adhesive (3M) and Clearfil LB Bond (Kuraray) bonding resins showed Weibull moduli (m) of 6.17 (95% confidence interval, 5.25-7.19) and 5.01 (95% confidence interval, 4.23-5.8). Analysis of results for micro-tensile tests on bond strengths to dentin gave moduli between 1.81 (Clearfil Liner Bond 2V) and 4.99 (Gluma One Bond, Kulzer). Material systems with m in this range do not have a well-defined strength. The Weibull approach also enables the size dependence of the strength to be estimated. An example where the bonding area was changed from 3.1 to 1.1 mm diameter is shown. Weibull analysis provides a method for determining the reliability of strength measurements in the analysis of data from bond strength and tensile tests on dental restorative materials.
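As a rough illustration of the Weibull quantities mentioned in the abstract above, the sketch below evaluates the two-parameter Weibull failure probability and the size dependence of strength. It assumes that strength scales with stressed area at equal failure probability, which the abstract does not state explicitly; the function names and the 30 MPa characteristic strength are hypothetical.

```python
import math


def weibull_failure_probability(stress, characteristic_strength, modulus):
    """Two-parameter Weibull probability of failure at a given stress."""
    return 1.0 - math.exp(-((stress / characteristic_strength) ** modulus))


def strength_ratio_for_area_change(area_old, area_new, modulus):
    """Ratio new/old of strength at equal failure probability.

    Assumes failure probability depends on area * stress**modulus,
    so a smaller bonded area gives a higher apparent strength.
    """
    return (area_old / area_new) ** (1.0 / modulus)


def circle_area(diameter_mm):
    return math.pi * (diameter_mm / 2.0) ** 2


# Diameters and modulus taken from the abstract; area scaling is an assumption.
ratio = strength_ratio_for_area_change(
    circle_area(3.1), circle_area(1.1), modulus=4.99
)

# Failure probability at 80% of a hypothetical 30 MPa characteristic strength.
p_fail = weibull_failure_probability(24.0, 30.0, modulus=4.99)
```

A low modulus makes the ratio large, which is one way to read the abstract's remark that systems with m in this range do not have a well-defined strength.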
The statistical analysis of single-subject data: a comparative examination.
Nourbakhsh, M R; Ottenbacher, K J
1994-08-01
The purposes of this study were to examine whether the use of three different statistical methods for analyzing single-subject data led to similar results and to identify components of graphed data that influence agreement (or disagreement) among the statistical procedures. Forty-two graphs containing single-subject data were examined. Twenty-one were AB charts of hypothetical data. The other 21 graphs appeared in Journal of Applied Behavioral Analysis, Physical Therapy, Journal of the Association for Persons With Severe Handicaps, and Journal of Behavior Therapy and Experimental Psychiatry. Three different statistical tests--the C statistic, the two-standard deviation band method, and the split-middle method of trend estimation--were used to analyze the 42 graphs. A relatively low degree of agreement (38%) was found among the three statistical tests. The highest rate of agreement for any two statistical procedures (71%) was found for the two-standard deviation band method and the C statistic. A logistic regression analysis revealed that overlap in single-subject graphed data was the best predictor of disagreement among the three statistical tests (beta = .49, P < .03). The results indicate that interpretation of data from single-subject research designs is directly influenced by the method of data analysis selected. Variation exists across both visual and statistical methods of data reduction. The advantages and disadvantages of statistical and visual analysis are described.
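One of the three procedures named in the abstract above, the two-standard deviation band method, can be sketched as follows. The decision rule used here (at least two consecutive treatment-phase points outside a band of two standard deviations around the baseline mean) is a common formulation and is an assumption, as are the function name and the example data.

```python
from statistics import mean, stdev


def two_sd_band(baseline, treatment, run_length=2):
    """Return (change_detected, (lower, upper)) for an AB design.

    The band is the baseline mean plus or minus two baseline standard
    deviations; a change is flagged when `run_length` consecutive
    treatment points fall outside the band.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    lower, upper = center - 2.0 * spread, center + 2.0 * spread

    run = 0
    for value in treatment:
        run = run + 1 if (value < lower or value > upper) else 0
        if run >= run_length:
            return True, (lower, upper)
    return False, (lower, upper)


# Hypothetical AB chart: baseline phase followed by treatment phase.
detected, band = two_sd_band([5, 6, 5, 7, 6, 5], [9, 10, 11])
```

Because the band depends only on the baseline phase, heavy overlap between phases makes the outcome sensitive to a few points, which fits the abstract's finding that overlap predicts disagreement among methods.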
Meta-analysis and The Cochrane Collaboration: 20 years of the Cochrane Statistical Methods Group.
McKenzie, Joanne E; Salanti, Georgia; Lewis, Steff C; Altman, Douglas G
2013-11-26
The Statistical Methods Group has played a pivotal role in The Cochrane Collaboration over the past 20 years. The Statistical Methods Group has determined the direction of statistical methods used within Cochrane reviews, developed guidance for these methods, provided training, and continued to discuss and consider new and controversial issues in meta-analysis. The contribution of Statistical Methods Group members to the meta-analysis literature has been extensive and has helped to shape the wider meta-analysis landscape. In this paper, marking the 20th anniversary of The Cochrane Collaboration, we reflect on the history of the Statistical Methods Group, beginning in 1993 with the identification of aspects of statistical synthesis for which consensus was lacking about the best approach. We highlight some landmark methodological developments that Statistical Methods Group members have contributed to in the field of meta-analysis. We discuss how the Group implements and disseminates statistical methods within The Cochrane Collaboration. Finally, we consider the importance of robust statistical methodology for Cochrane systematic reviews, note research gaps, and reflect on the challenges that the Statistical Methods Group faces in its future direction.
Vogl, Thomas J; Dommermuth, Alena; Heinle, Britta; Nour-Eldin, Nour-Eldin A; Lehnert, Thomas; Eichler, Katrin; Zangos, Stephan; Bechstein, Wolf O; Naguib, Nagy N N
2014-01-01
The purpose of this study was the evaluation of prognostic factors for long-term survival and progression-free survival (PFS) after treatment of colorectal cancer (CRC) liver metastases with magnetic resonance-guided laser-induced interstitial thermotherapy (LITT). We included 594 patients (mean age, 61.2 years) with CRC liver metastases who were treated with LITT. The statistical analysis of long-term survival and PFS was based on the Kaplan-Meier method. The Cox regression model tested different parameters that could be of prognostic value. The tested prognostic factors were the following: sex, age, the location of primary tumor, the number of metastases, the maximal diameter and total volume of metastases and necroses, the quotient of total volumes of metastases and necroses, the time of appearance of liver metastases and location in the liver, the TNM classification of CRC, extrahepatic metastases, and neoadjuvant treatments. The median survival was 25 months starting from the date of the first LITT. The 1-, 2-, 3-, 4-, and 5-year survival rates were 78%, 50.1%, 28%, 16.4%, and 7.8%, respectively. The median PFS was 13 months. The 1-, 2-, 3-, 4-, and 5-year PFS rates were 51.3%, 35.4%, 30.7%, 25.4%, and 22.3%, respectively. The number of metastases and their maximal diameter were the most important prognostic factors for both long-term survival and PFS. Long-term survival was also highly influenced by the initial involvement of the lymph nodes. For patients treated with LITT for CRC liver metastases, the number and size of metastases, together with the initial lymph node status, are significant prognostic factors for long-term survival.
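The Kaplan-Meier method on which the survival rates in the abstract above are based can be sketched in a few lines. This is a minimal product-limit estimator for illustration; the function names and the follow-up data are hypothetical and unrelated to the study's patients.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up duration for each patient (e.g. months)
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First event time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached within follow-up


# Hypothetical follow-up in months; 0 marks a censored patient.
months = [5, 8, 12, 12, 20, 25]
died = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(months, died)
median = median_survival(curve)
```

Censored patients leave the risk set without lowering the curve, which is why 1- to 5-year survival rates can be read off even when follow-up is incomplete.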
GDISC: a web portal for integrative analysis of gene-drug interaction for survival in cancer.
Spainhour, John Christian Givhan; Lim, Juho; Qiu, Peng
2017-05-01
Survival analysis has been applied to The Cancer Genome Atlas (TCGA) data. Although drug exposure records are available in TCGA, existing survival analyses typically did not consider drug exposure, partly due to naming inconsistencies in the data. We have spent extensive effort to standardize the drug exposure data, which enabled us to perform survival analysis on drug-stratified subpopulations of cancer patients. Using this strategy, we integrated gene copy number data, drug exposure data and patient survival data to infer gene-drug interactions that impact survival. The collection of all analyzed gene-drug interactions in 32 cancer types is organized and presented in a searchable web-portal called gene-drug Interaction for survival in cancer (GDISC). GDISC allows biologists and clinicians to interactively explore the gene-drug interactions identified in the context of TCGA, and discover interactions associated with their favorite cancer, drug and/or gene of interest. In addition, GDISC provides the standardized drug exposure data, which is a valuable resource for developing new methods for drug-specific analysis. GDISC is available at https://gdisc.bme.gatech.edu/. Contact: peng.qiu@bme.gatech.edu.
MethSurv: a web tool to perform multivariable survival analysis using DNA methylation data.
Modhukur, Vijayachitra; Iljasenko, Tatjana; Metsalu, Tauno; Lokk, Kaie; Laisk-Podar, Triin; Vilo, Jaak
2017-12-21
To develop a web tool for survival analysis based on CpG methylation patterns. We utilized methylome data from 'The Cancer Genome Atlas' and used the Cox proportional-hazards model to develop an interactive web interface for survival analysis. MethSurv enables survival analysis for a CpG located in or near a query gene. For further mining, cluster analysis for a query gene to associate methylation patterns with clinical characteristics and browsing of top biomarkers for each cancer type are provided. MethSurv includes 7358 methylomes from 25 different human cancers. The MethSurv tool is a valuable platform for researchers without programming skills to perform the initial assessment of methylation-based cancer biomarkers.
Statistical analysis of the ambiguities in the asteroid period determinations
Butkiewicz, M.; Kwiatkowski, T.; Bartczak, P.; Dudziński, G.
2014-07-01
A synodic period of an asteroid can be derived from its lightcurve by standard methods like Fourier-series fitting. A problem appears when results of observations are based on less than a full coverage of a lightcurve and/or high level of noise. Also, long gaps between individual lightcurves create an ambiguity in the cycle count which leads to aliases. Excluding binary systems and objects with non-principal-axis rotation, the rotation period is usually identical to the period of the second Fourier harmonic of the lightcurve. There are cases, however, where it may be connected with the 1st, 3rd, or 4th harmonic and it is difficult to choose among them when searching for the period. To help remove such uncertainties we analysed asteroid lightcurves for a range of shapes and observing/illuminating geometries. We simulated them using a modified internal code from the ISAM service (Marciniak et al. 2012, A&A 545, A131). In our computations, shapes of asteroids were modeled as Gaussian random spheres (Muinonen 1998, A&A, 332, 1087). A combination of Lommel-Seeliger and Lambert scattering laws was assumed. For each of the 100 shapes, we randomly selected 1000 positions of the spin axis, systematically changing the solar phase angle with a step of 5°. For each lightcurve, we determined its peak-to-peak amplitude, fitted the 6th-order Fourier series and derived the amplitudes of its harmonics. Instead of the number of the lightcurve extrema, which in many cases is subjective, we characterized each lightcurve by the order of the highest-amplitude Fourier harmonic. The goal of our simulations was to derive statistically significant conclusions (based on the underlying assumptions) about the dominance of different harmonics in the lightcurves of the specified amplitude and phase angle. The results, presented in the Figure, can be used in individual cases to estimate the probability that the obtained lightcurve is dominated by a specified Fourier harmonic. Some of the
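The characterization used in the abstract above, the order of the highest-amplitude Fourier harmonic of a lightcurve, can be sketched as follows. The sketch assumes the lightcurve is sampled at evenly spaced rotation phases over exactly one period with more than twice as many points as the series order, in which case the least-squares Fourier coefficients reduce to simple sums. The function names and the synthetic lightcurve are hypothetical and do not reproduce the authors' simulation code.

```python
import math


def harmonic_amplitudes(magnitudes, order=6):
    """Amplitudes of Fourier harmonics 1..order for evenly phased samples."""
    n = len(magnitudes)
    if n <= 2 * order:
        raise ValueError("need more than 2 * order evenly spaced samples")
    amplitudes = []
    for k in range(1, order + 1):
        a = 2.0 / n * sum(
            m * math.cos(2.0 * math.pi * k * j / n)
            for j, m in enumerate(magnitudes)
        )
        b = 2.0 / n * sum(
            m * math.sin(2.0 * math.pi * k * j / n)
            for j, m in enumerate(magnitudes)
        )
        amplitudes.append(math.hypot(a, b))
    return amplitudes


def dominant_harmonic(magnitudes, order=6):
    """Order (1-based) of the harmonic with the largest amplitude."""
    amplitudes = harmonic_amplitudes(magnitudes, order)
    return 1 + max(range(order), key=amplitudes.__getitem__)


# Synthetic double-peaked lightcurve: strong 2nd harmonic, weaker 1st and 3rd.
n_points = 48
lightcurve = [
    0.30 * math.cos(2 * 2 * math.pi * j / n_points)
    + 0.10 * math.cos(2 * math.pi * j / n_points + 0.5)
    + 0.05 * math.cos(3 * 2 * math.pi * j / n_points)
    for j in range(n_points)
]
dominant = dominant_harmonic(lightcurve)
```

For unevenly sampled or noisy observations, a general least-squares fit of the Fourier series would be needed instead of these direct sums.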
Statistical Analysis Software for the TRS-80 Microcomputer.
1981-09-01
007011260 11240 LC-LCMC 112500070 11210 11240 F0LsCe*PLC 11270 0X01l-FOX i1280 RETURN 67 11290 14Xa(4CX/DF)C (1/3) - (1-(21(9.OF)DI)/SQR(2/(9*OF)) 11300...Linear Regression"FRIT I007 PRINT#4 Analysis of Variance’ 100m KPB898 : zs-X : oOSUu ISO 1009 IF 10.4 0070 20 101001 10.I~3 THEN ZT=*20 10070 10120
Ball lightning diameter-lifetime statistical analysis of SKB databank
Amirov, Anvar Kh; Bychkov, Vladimir L.
1995-03-01
The significance of diameter as a factor affecting lifetime has been examined for the different ways in which Ball Lightning (BL) disappears. Methods of nonparametric regression analysis have been applied to pairs of diameter and radiation losses according to the way BL disappeared. BL diameter turned out to be a significant factor for BL lifetime in the cases of explosion and decay and insignificant in the case of extinction. The dependence of the logarithm of radiation losses on the logarithm of BL volume, obtained by nonparametric regression, turned out to differ according to the way BL disappeared.
Directory of Open Access Journals (Sweden)
Limor Amit
PURPOSE: To evaluate the effect of Bevacizumab in combination with chemotherapy on overall survival of patients with metastatic solid tumors. DESIGN: A systematic literature search to identify randomized trials comparing chemotherapy with and without Bevacizumab in metastatic cancer. The primary end point was overall survival (OS) and the secondary end points were progression free survival (PFS) and toxicity. A meta-analysis was performed for each tumor type and for the combination of all tumors. RESULTS: 24 randomized trials with 8 different types of malignancies were included in this meta-analysis. Patients treated with Bevacizumab had an OS benefit, hazard ratio (HR) 0.89 (95% CI 0.84-0.93, P<0.00001, I² = 4%). The combined analysis showed a PFS benefit with a HR of 0.71 (95% CI 0.68-0.74, P<0.00001, I² = 54%). The toxicity analysis showed a statistically significant increase in fatal adverse events (FAEs) in the Bevacizumab treatment arm, risk ratio (RR) 1.47 (95% CI 1.1-1.98). A separate analysis of the lung cancer trials showed an increased risk of fatal pulmonary hemorrhage with a RR of 5.65 (95% CI 1.26-25.26). The risk of G3-4 adverse events was increased: RR 1.2 (95% CI 1.15-1.24). CONCLUSION: In this combined analysis Bevacizumab improved OS (with little heterogeneity) and PFS. These results should be considered in the light of lack of markers predictive of response and the increased severe and fatal toxicity seen with Bevacizumab treatment.
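The relationship between a reported hazard ratio, its 95% confidence interval and its P value can be checked with a short calculation. The sketch below assumes the interval is symmetric on the log scale and uses a normal approximation, which is the usual convention but is not stated in the abstract; the function names are hypothetical, and the inputs are the pooled OS figures quoted above.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% normal quantile


def log_hr_standard_error(ci_lower, ci_upper, z_crit=Z_95):
    """Standard error of log(HR) recovered from a confidence interval."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)


def z_and_p_from_ci(hr, ci_lower, ci_upper):
    """Return (z statistic, two-sided P value) under a normal approximation."""
    z = math.log(hr) / log_hr_standard_error(ci_lower, ci_upper)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Pooled overall survival result from the abstract: HR 0.89 (95% CI 0.84-0.93).
z_os, p_os = z_and_p_from_ci(0.89, 0.84, 0.93)
```

Under these assumptions the recovered P value for the pooled OS hazard ratio falls below 0.00001, in line with the figure reported in the abstract. The same conversion is what an inverse-variance meta-analysis would apply to each trial before pooling.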