Application of Multivariate Statistical Analysis to Biomarkers in Se-Turkey Crude Oils
Gürgey, K.; Canbolat, S.
2017-11-01
Twenty-four crude oil samples were collected from the 24 oil fields distributed in different districts of SE-Turkey. API and Sulphur content (%), Stable Carbon Isotope, Gas Chromatography (GC), and Gas Chromatography-Mass Spectrometry (GC-MS) data were used to construct a geochemical data matrix. The aim of this study is to examine the genetic grouping or correlations in the crude oil samples, hence the number of source rocks present in the SE-Turkey. To achieve these aims, two of the multivariate statistical analysis techniques (Principle Component Analysis [PCA] and Cluster Analysis were applied to data matrix of 24 samples and 8 source specific biomarker variables/parameters. The results showed that there are 3 genetically different oil groups: Batman-Nusaybin Oils, Adıyaman-Kozluk Oils and Diyarbakir Oils, in addition to a one mixed group. These groupings imply that at least, three different source rocks are present in South-Eastern (SE) Turkey. Grouping of the crude oil samples appears to be consistent with the geographic locations of the oils fields, subsurface stratigraphy as well as geology of the area.
APPLICATION OF MULTIVARIATE STATISTICAL ANALYSIS TO BIOMARKERS IN SE-TURKEY CRUDE OILS
Directory of Open Access Journals (Sweden)
K. Gürgey
2017-11-01
Full Text Available Twenty-four crude oil samples were collected from the 24 oil fields distributed in different districts of SE-Turkey. API and Sulphur content (%, Stable Carbon Isotope, Gas Chromatography (GC, and Gas Chromatography-Mass Spectrometry (GC-MS data were used to construct a geochemical data matrix. The aim of this study is to examine the genetic grouping or correlations in the crude oil samples, hence the number of source rocks present in the SE-Turkey. To achieve these aims, two of the multivariate statistical analysis techniques (Principle Component Analysis [PCA] and Cluster Analysis were applied to data matrix of 24 samples and 8 source specific biomarker variables/parameters. The results showed that there are 3 genetically different oil groups: Batman-Nusaybin Oils, Adıyaman-Kozluk Oils and Diyarbakir Oils, in addition to a one mixed group. These groupings imply that at least, three different source rocks are present in South-Eastern (SE Turkey. Grouping of the crude oil samples appears to be consistent with the geographic locations of the oils fields, subsurface stratigraphy as well as geology of the area.
Analysis of biomarker data a practical guide
Looney, Stephen W
2015-01-01
A "how to" guide for applying statistical methods to biomarker data analysis Presenting a solid foundation for the statistical methods that are used to analyze biomarker data, Analysis of Biomarker Data: A Practical Guide features preferred techniques for biomarker validation. The authors provide descriptions of select elementary statistical methods that are traditionally used to analyze biomarker data with a focus on the proper application of each method, including necessary assumptions, software recommendations, and proper interpretation of computer output. In addition, the book discusses
Eppig, Joel S; Edmonds, Emily C; Campbell, Laura; Sanderson-Cimino, Mark; Delano-Wood, Lisa; Bondi, Mark W
2017-08-01
Research demonstrates heterogeneous neuropsychological profiles among individuals with mild cognitive impairment (MCI). However, few studies have included visuoconstructional ability or used latent mixture modeling to statistically identify MCI subtypes. Therefore, we examined whether unique neuropsychological MCI profiles could be ascertained using latent profile analysis (LPA), and subsequently investigated cerebrospinal fluid (CSF) biomarkers, genotype, and longitudinal clinical outcomes between the empirically derived classes. A total of 806 participants diagnosed by means of the Alzheimer's Disease Neuroimaging Initiative (ADNI) MCI criteria received a comprehensive neuropsychological battery assessing visuoconstructional ability, language, attention/executive function, and episodic memory. Test scores were adjusted for demographic characteristics using standardized regression coefficients based on "robust" normal control performance (n=260). Calculated Z-scores were subsequently used in the LPA, and CSF-derived biomarkers, genotype, and longitudinal clinical outcome were evaluated between the LPA-derived MCI classes. Statistical fit indices suggested a 3-class model was the optimal LPA solution. The three-class LPA consisted of a mixed impairment MCI class (n=106), an amnestic MCI class (n=455), and an LPA-derived normal class (n=245). Additionally, the amnestic and mixed classes were more likely to be apolipoprotein e4+ and have worse Alzheimer's disease CSF biomarkers than LPA-derived normal subjects. Our study supports significant heterogeneity in MCI neuropsychological profiles using LPA and extends prior work (Edmonds et al., 2015) by demonstrating a lower rate of progression in the approximately one-third of ADNI MCI individuals who may represent "false-positive" diagnoses. Our results underscore the importance of using sensitive, actuarial methods for diagnosing MCI, as current diagnostic methods may be over-inclusive. (JINS, 2017, 23, 564-576).
Directory of Open Access Journals (Sweden)
Hilko van der Voet
Full Text Available Nutrient recommendations in use today are often derived from relatively old data of few studies with few individuals. However, for many nutrients, including vitamin B-12, extensive data have now become available from both observational studies and randomized controlled trials, addressing the relation between intake and health-related status biomarkers. The purpose of this article is to provide new methodology for dietary planning based on dose-response data and meta-analysis. The methodology builds on existing work, and is consistent with current methodology and measurement error models for dietary assessment. The detailed purposes of this paper are twofold. Firstly, to define a Population Nutrient Level (PNL for dietary planning in groups. Secondly, to show how data from different sources can be combined in an extended meta-analysis of intake-status datasets for estimating PNL as well as other nutrient intake values, such as the Average Nutrient Requirement (ANR and the Individual Nutrient Level (INL. For this, a computational method is presented for comparing a bivariate lognormal distribution to a health criterion value. Procedures to meta-analyse available data in different ways are described. Example calculations on vitamin B-12 requirements were made for four models, assuming different ways of estimating the dose-response relation, and different values of the health criterion. Resulting estimates of ANRs and less so for INLs were found to be sensitive to model assumptions, whereas estimates of PNLs were much less sensitive to these assumptions as they were closer to the average nutrient intake in the available data.
Statistical Analysis and validation
Hoefsloot, H.C.J.; Horvatovich, P.; Bischoff, R.
2013-01-01
In this chapter guidelines are given for the selection of a few biomarker candidates from a large number of compounds with a relative low number of samples. The main concepts concerning the statistical validation of the search for biomarkers are discussed. These complicated methods and concepts are
Statistical data analysis handbook
National Research Council Canada - National Science Library
Wall, Francis J
1986-01-01
It must be emphasized that this is not a text book on statistics. Instead it is a working tool that presents data analysis in clear, concise terms which can be readily understood even by those without formal training in statistics...
Raunig, David L; McShane, Lisa M; Pennello, Gene; Gatsonis, Constantine; Carson, Paul L; Voyvodic, James T; Wahl, Richard L; Kurland, Brenda F; Schwarz, Adam J; Gönen, Mithat; Zahlmann, Gudrun; Kondratovich, Marina V; O'Donnell, Kevin; Petrick, Nicholas; Cole, Patricia E; Garra, Brian; Sullivan, Daniel C
2015-02-01
Technological developments and greater rigor in the quantitative measurement of biological features in medical images have given rise to an increased interest in using quantitative imaging biomarkers to measure changes in these features. Critical to the performance of a quantitative imaging biomarker in preclinical or clinical settings are three primary metrology areas of interest: measurement linearity and bias, repeatability, and the ability to consistently reproduce equivalent results when conditions change, as would be expected in any clinical trial. Unfortunately, performance studies to date differ greatly in designs, analysis method, and metrics used to assess a quantitative imaging biomarker for clinical use. It is therefore difficult or not possible to integrate results from different studies or to use reported results to design studies. The Radiological Society of North America and the Quantitative Imaging Biomarker Alliance with technical, radiological, and statistical experts developed a set of technical performance analysis methods, metrics, and study designs that provide terminology, metrics, and methods consistent with widely accepted metrological standards. This document provides a consistent framework for the conduct and evaluation of quantitative imaging biomarker performance studies so that results from multiple studies can be compared, contrasted, or combined. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
Directory of Open Access Journals (Sweden)
Joanna F Dipnall
Full Text Available Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study.The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted regression algorithm and logistic regression, to identify key biomarkers associated with depression in the National Health and Nutrition Examination Study (2009-2010. Depression was measured using the Patient Health Questionnaire-9 and 67 biomarkers were analysed. Covariates in this study included gender, age, race, smoking, food security, Poverty Income Ratio, Body Mass Index, physical activity, alcohol use, medical conditions and medications. The final imputed weighted multiple logistic regression model included possible confounders and moderators.After the creation of 20 imputation data sets from multiple chained regression sequences, machine learning boosted regression initially identified 21 biomarkers associated with depression. Using traditional logistic regression methods, including controlling for possible confounders and moderators, a final set of three biomarkers were selected. The final three biomarkers from the novel hybrid variable selection methodology were red cell distribution width (OR 1.15; 95% CI 1.01, 1.30, serum glucose (OR 1.01; 95% CI 1.00, 1.01 and total bilirubin (OR 0.12; 95% CI 0.05, 0.28. Significant interactions were found between total bilirubin with Mexican American/Hispanic group (p = 0.016, and current smokers (p<0.001.The systematic use of a hybrid methodology for variable selection, fusing data mining techniques using a machine learning algorithm with traditional statistical modelling, accounted for missing data and complex survey sampling
Dipnall, Joanna F; Pasco, Julie A; Berk, Michael; Williams, Lana J; Dodd, Seetal; Jacka, Felice N; Meyer, Denny
2016-01-01
Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study. The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted regression algorithm and logistic regression, to identify key biomarkers associated with depression in the National Health and Nutrition Examination Study (2009-2010). Depression was measured using the Patient Health Questionnaire-9 and 67 biomarkers were analysed. Covariates in this study included gender, age, race, smoking, food security, Poverty Income Ratio, Body Mass Index, physical activity, alcohol use, medical conditions and medications. The final imputed weighted multiple logistic regression model included possible confounders and moderators. After the creation of 20 imputation data sets from multiple chained regression sequences, machine learning boosted regression initially identified 21 biomarkers associated with depression. Using traditional logistic regression methods, including controlling for possible confounders and moderators, a final set of three biomarkers were selected. The final three biomarkers from the novel hybrid variable selection methodology were red cell distribution width (OR 1.15; 95% CI 1.01, 1.30), serum glucose (OR 1.01; 95% CI 1.00, 1.01) and total bilirubin (OR 0.12; 95% CI 0.05, 0.28). Significant interactions were found between total bilirubin with Mexican American/Hispanic group (p = 0.016), and current smokers (pmachine learning algorithm with traditional statistical modelling, accounted for missing data and complex survey sampling methodology and was demonstrated to be a useful tool for detecting three biomarkers associated with depression for future
Beginning statistics with data analysis
Mosteller, Frederick; Rourke, Robert EK
2013-01-01
This introduction to the world of statistics covers exploratory data analysis, methods for collecting data, formal statistical inference, and techniques of regression and analysis of variance. 1983 edition.
Imaging mass spectrometry statistical analysis.
Jones, Emrys A; Deininger, Sören-Oliver; Hogendoorn, Pancras C W; Deelder, André M; McDonnell, Liam A
2012-08-30
Imaging mass spectrometry is increasingly used to identify new candidate biomarkers. This clinical application of imaging mass spectrometry is highly multidisciplinary: expertise in mass spectrometry is necessary to acquire high quality data, histology is required to accurately label the origin of each pixel's mass spectrum, disease biology is necessary to understand the potential meaning of the imaging mass spectrometry results, and statistics to assess the confidence of any findings. Imaging mass spectrometry data analysis is further complicated because of the unique nature of the data (within the mass spectrometry field); several of the assumptions implicit in the analysis of LC-MS/profiling datasets are not applicable to imaging. The very large size of imaging datasets and the reporting of many data analysis routines, combined with inadequate training and accessible reviews, have exacerbated this problem. In this paper we provide an accessible review of the nature of imaging data and the different strategies by which the data may be analyzed. Particular attention is paid to the assumptions of the data analysis routines to ensure that the reader is apprised of their correct usage in imaging mass spectrometry research. Copyright © 2012 Elsevier B.V. All rights reserved.
Applied multivariate statistical analysis
Härdle, Wolfgang Karl
2015-01-01
Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners. It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added. All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior. All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...
Dipnall, Joanna F.
2016-01-01
Background Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study. Methods The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted regression algorithm and logistic regression, to identify key biomarkers associated with depression in the National Health and Nutrition Examination Study (2009–2010). Depression was measured using the Patient Health Questionnaire-9 and 67 biomarkers were analysed. Covariates in this study included gender, age, race, smoking, food security, Poverty Income Ratio, Body Mass Index, physical activity, alcohol use, medical conditions and medications. The final imputed weighted multiple logistic regression model included possible confounders and moderators. Results After the creation of 20 imputation data sets from multiple chained regression sequences, machine learning boosted regression initially identified 21 biomarkers associated with depression. Using traditional logistic regression methods, including controlling for possible confounders and moderators, a final set of three biomarkers were selected. The final three biomarkers from the novel hybrid variable selection methodology were red cell distribution width (OR 1.15; 95% CI 1.01, 1.30), serum glucose (OR 1.01; 95% CI 1.00, 1.01) and total bilirubin (OR 0.12; 95% CI 0.05, 0.28). Significant interactions were found between total bilirubin with Mexican American/Hispanic group (p = 0.016), and current smokers (p<0.001). Conclusion The systematic use of a hybrid methodology for variable selection, fusing data mining techniques using a machine learning algorithm with traditional statistical modelling, accounted for missing data and
Bioinformatics and biomarker discovery "Omic" data analysis for personalized medicine
Azuaje, Francisco
2010-01-01
This book is designed to introduce biologists, clinicians and computational researchers to fundamental data analysis principles, techniques and tools for supporting the discovery of biomarkers and the implementation of diagnostic/prognostic systems. The focus of the book is on how fundamental statistical and data mining approaches can support biomarker discovery and evaluation, emphasising applications based on different types of "omic" data. The book also discusses design factors, requirements and techniques for disease screening, diagnostic and prognostic applications. Readers are provided w
Per Object statistical analysis
DEFF Research Database (Denmark)
2008-01-01
of a specific class in turn, and uses as pair of PPO stages to derive the statistics and then assign them to the objects' Object Variables. It may be that this could all be done in some other, simply way, but several other ways that were tried did not succeed. The procedure ouptut has been tested against...
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
International Nuclear Information System (INIS)
Hahn, A.A.
1994-11-01
The complexity of instrumentation sometimes requires data analysis to be done before the result is presented to the control room. This tutorial reviews some of the theoretical assumptions underlying the more popular forms of data analysis and presents simple examples to illuminate the advantages and hazards of different techniques
DEFF Research Database (Denmark)
Ris Hansen, Inge; Søgaard, Karen; Gram, Bibi
2015-01-01
This is the analysis plan for the multicentre randomised control study looking at the effect of training and exercises in chronic neck pain patients that is being conducted in Jutland and Funen, Denmark. This plan will be used as a work description for the analyses of the data collected....
Research design and statistical analysis
Myers, Jerome L; Lorch Jr, Robert F
2013-01-01
Research Design and Statistical Analysis provides comprehensive coverage of the design principles and statistical concepts necessary to make sense of real data. The book's goal is to provide a strong conceptual foundation to enable readers to generalize concepts to new research situations. Emphasis is placed on the underlying logic and assumptions of the analysis and what it tells the researcher, the limitations of the analysis, and the consequences of violating assumptions. Sampling, design efficiency, and statistical models are emphasized throughout. As per APA recommendations
Regularized Statistical Analysis of Anatomy
DEFF Research Database (Denmark)
Sjöstrand, Karl
2007-01-01
This thesis presents the application and development of regularized methods for the statistical analysis of anatomical structures. Focus is on structure-function relationships in the human brain, such as the connection between early onset of Alzheimer’s disease and shape changes of the corpus...... and mind. Statistics represents a quintessential part of such investigations as they are preluded by a clinical hypothesis that must be verified based on observed data. The massive amounts of image data produced in each examination pose an important and interesting statistical challenge...... efficient algorithms which make the analysis of large data sets feasible, and gives examples of applications....
One of the main uses of biomarker measurements is to compare different populations to each other and to assess risk in comparison to established parameters. This is most often done using summary statistics such as central tendency, variance components, confidence intervals, excee...
Karaismailoğlu, Eda; Dikmen, Zeliha Günnur; Akbıyık, Filiz; Karaağaoğlu, Ahmet Ergun
2018-04-30
Background/aim: Myoglobin, cardiac troponin T, B-type natriuretic peptide (BNP), and creatine kinase isoenzyme MB (CK-MB) are frequently used biomarkers for evaluating risk of patients admitted to an emergency department with chest pain. Recently, time- dependent receiver operating characteristic (ROC) analysis has been used to evaluate the predictive power of biomarkers where disease status can change over time. We aimed to determine the best set of biomarkers that estimate cardiac death during follow-up time. We also obtained optimal cut-off values of these biomarkers, which differentiates between patients with and without risk of death. A web tool was developed to estimate time intervals in risk. Materials and methods: A total of 410 patients admitted to the emergency department with chest pain and shortness of breath were included. Cox regression analysis was used to determine an optimal set of biomarkers that can be used for estimating cardiac death and to combine the significant biomarkers. Time-dependent ROC analysis was performed for evaluating performances of significant biomarkers and a combined biomarker during 240 h. The bootstrap method was used to compare statistical significance and the Youden index was used to determine optimal cut-off values. Results : Myoglobin and BNP were significant by multivariate Cox regression analysis. Areas under the time-dependent ROC curves of myoglobin and BNP were about 0.80 during 240 h, and that of the combined biomarker (myoglobin + BNP) increased to 0.90 during the first 180 h. Conclusion: Although myoglobin is not clinically specific to a cardiac event, in our study both myoglobin and BNP were found to be statistically significant for estimating cardiac death. Using this combined biomarker may increase the power of prediction. Our web tool can be useful for evaluating the risk status of new patients and helping clinicians in making decisions.
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Biomarker identification and effect estimation on schizophrenia –a high dimensional data analysis
Directory of Open Access Journals (Sweden)
Yuanzhang eLi
2015-05-01
Full Text Available Biomarkers have been examined in schizophrenia research for decades. Medical morbidity and mortality rates, as well as personal and societal costs, are associated with schizophrenia patients. The identification of biomarkers and alleles, which often have a small effect individually, may help to develop new diagnostic tests for early identification and treatment. Currently, there is not a commonly accepted statistical approach to identify predictive biomarkers from high dimensional data. We used space Decomposition-Gradient-Regression method (DGR to select biomarkers, which are associated with the risk of schizophrenia. Then, we used the gradient scores, generated from the selected biomarkers, as the prediction factor in regression to estimate their effects. We also used an alternative approach, classification and regression tree (CART, to compare the biomarker selected by DGR and found about 70% of the selected biomarkers were the same. However, the advantage of DGR is that it can evaluate individual effects for each biomarker from their combined effect. In DGR analysis of serum specimens of US military service members with a diagnosis of schizophrenia from 1992 to 2005 and their controls, Alpha-1-Antitrypsin (AAT, Interleukin-6 receptor (IL-6r and Connective Tissue Growth Factor (CTGF were selected to identify schizophrenia for males; and Alpha-1-Antitrypsin (AAT, Apolipoprotein B (Apo B and Sortilin were selected for females. If these findings from military subjects are replicated by other studies, they suggest the possibility of a novel biomarker panel as an adjunct to earlier diagnosis and initiation of treatment.
The statistical analysis of anisotropies
International Nuclear Information System (INIS)
Webster, A.
1977-01-01
One of the many uses to which a radio survey may be put is an analysis of the distribution of the radio sources on the celestial sphere to find out whether they are bunched into clusters or lie in preferred regions of space. There are many methods of testing for clustering in point processes and since they are not all equally good this contribution is presented as a brief guide to what seems to be the best of them. The radio sources certainly do not show very strong clusering and may well be entirely unclustered so if a statistical method is to be useful it must be both powerful and flexible. A statistic is powerful in this context if it can efficiently distinguish a weakly clustered distribution of sources from an unclustered one, and it is flexible if it can be applied in a way which avoids mistaking defects in the survey for true peculiarities in the distribution of sources. The paper divides clustering statistics into two classes: number density statistics and log N/log S statistics. (Auth.)
An integrative multi-platform analysis for discovering biomarkers of osteosarcoma
International Nuclear Information System (INIS)
Li, Guodong; Zhang, Wenjuan; Zeng, Huazong; Chen, Lei; Wang, Wenjing; Liu, Jilong; Zhang, Zhiyu; Cai, Zhengdong
2009-01-01
SELDI-TOF-MS (Surface Enhanced Laser Desorption/Ionization-Time of Flight-Mass Spectrometry) has become an attractive approach for cancer biomarker discovery due to its ability to resolve low mass proteins and high-throughput capability. However, the analytes from mass spectrometry are described only by their mass-to-charge ratio (m/z) values without further identification and annotation. To discover potential biomarkers for early diagnosis of osteosarcoma, we designed an integrative workflow combining data sets from both SELDI-TOF-MS and gene microarray analysis. After extracting the information for potential biomarkers from SELDI data and microarray analysis, their associations were further inferred by link-test to identify biomarkers that could likely be used for diagnosis. Immuno-blot analysis was then performed to examine whether the expression of the putative biomarkers were indeed altered in serum from patients with osteosarcoma. Six differentially expressed protein peaks with strong statistical significances were detected by SELDI-TOF-MS. Four of the proteins were up-regulated and two of them were down-regulated. Microarray analysis showed that, compared with an osteoblastic cell line, the expression of 653 genes was changed more than 2 folds in three osteosarcoma cell lines. While expression of 310 genes was increased, expression of the other 343 genes was decreased. The two sets of biomarkers candidates were combined by the link-test statistics, indicating that 13 genes were potential biomarkers for early diagnosis of osteosarcoma. Among these genes, cytochrome c1 (CYC-1) was selected for further experimental validation. Link-test on datasets from both SELDI-TOF-MS and microarray high-throughput analysis can accelerate the identification of tumor biomarkers. The result confirmed that CYC-1 may be a promising biomarker for early diagnosis of osteosarcoma
Statistical analysis of environmental data
International Nuclear Information System (INIS)
Beauchamp, J.J.; Bowman, K.O.; Miller, F.L. Jr.
1975-10-01
This report summarizes the analyses of data obtained by the Radiological Hygiene Branch of the Tennessee Valley Authority from samples taken around the Browns Ferry Nuclear Plant located in Northern Alabama. The data collection was begun in 1968 and a wide variety of types of samples have been gathered on a regular basis. The statistical analysis of environmental data involving very low-levels of radioactivity is discussed. Applications of computer calculations for data processing are described
Statistical considerations on safety analysis
International Nuclear Information System (INIS)
Pal, L.; Makai, M.
2004-01-01
The authors have investigated the statistical methods applied to safety analysis of nuclear reactors and arrived at alarming conclusions: a series of calculations with the generally appreciated safety code ATHLET were carried out to ascertain the stability of the results against input uncertainties in a simple experimental situation. Scrutinizing those calculations, we came to the conclusion that the ATHLET results may exhibit chaotic behavior. A further conclusion is that the technological limits are incorrectly set when the output variables are correlated. Another formerly unnoticed conclusion of the previous ATHLET calculations that certain innocent looking parameters (like wall roughness factor, the number of bubbles per unit volume, the number of droplets per unit volume) can influence considerably such output parameters as water levels. The authors are concerned with the statistical foundation of present day safety analysis practices and can only hope that their own misjudgment will be dispelled. Until then, the authors suggest applying correct statistical methods in safety analysis even if it makes the analysis more expensive. It would be desirable to continue exploring the role of internal parameters (wall roughness factor, steam-water surface in thermal hydraulics codes, homogenization methods in neutronics codes) in system safety codes and to study their effects on the analysis. In the validation and verification process of a code one carries out a series of computations. The input data are not precisely determined because measured data have an error, calculated data are often obtained from a more or less accurate model. Some users of large codes are content with comparing the nominal output obtained from the nominal input, whereas all the possible inputs should be taken into account when judging safety. At the same time, any statement concerning safety must be aleatory, and its merit can be judged only when the probability is known with which the
Statistical evaluation of diagnostic performance topics in ROC analysis
Zou, Kelly H; Bandos, Andriy I; Ohno-Machado, Lucila; Rockette, Howard E
2016-01-01
Statistical evaluation of diagnostic performance in general and Receiver Operating Characteristic (ROC) analysis in particular are important for assessing the performance of medical tests and statistical classifiers, as well as for evaluating predictive models or algorithms. This book presents innovative approaches in ROC analysis, which are relevant to a wide variety of applications, including medical imaging, cancer research, epidemiology, and bioinformatics. Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis covers areas including monotone-transformation techniques in parametric ROC analysis, ROC methods for combined and pooled biomarkers, Bayesian hierarchical transformation models, sequential designs and inferences in the ROC setting, predictive modeling, multireader ROC analysis, and free-response ROC (FROC) methodology. The book is suitable for graduate-level students and researchers in statistics, biostatistics, epidemiology, public health, biomedical engineering, radiology, medi...
Statistical analysis of JET disruptions
International Nuclear Information System (INIS)
Tanga, A.; Johnson, M.F.
1991-07-01
In the operation of JET and of any tokamak many discharges are terminated by a major disruption. The disruptive termination of a discharge is usually an unwanted event which may cause damage to the structure of the vessel. In a reactor disruptions are potentially a very serious problem, hence the importance of studying them and devising methods to avoid disruptions. Statistical information has been collected about the disruptions which have occurred at JET over a long span of operations. The analysis is focused on the operational aspects of the disruptions rather than on the underlining physics. (Author)
Statistical Analysis of Protein Ensembles
Máté, Gabriell; Heermann, Dieter
2014-04-01
As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially that the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset compiled of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.
Statistical data analysis using SAS intermediate statistical methods
Marasinghe, Mervyn G
2018-01-01
The aim of this textbook (previously titled SAS for Data Analytics) is to teach the use of SAS for statistical analysis of data for advanced undergraduate and graduate students in statistics, data science, and disciplines involving analyzing data. The book begins with an introduction beyond the basics of SAS, illustrated with non-trivial, real-world, worked examples. It proceeds to SAS programming and applications, SAS graphics, statistical analysis of regression models, analysis of variance models, analysis of variance with random and mixed effects models, and then takes the discussion beyond regression and analysis of variance to conclude. Pedagogically, the authors introduce theory and methodological basis topic by topic, present a problem as an application, followed by a SAS analysis of the data provided and a discussion of results. The text focuses on applied statistical problems and methods. Key features include: end of chapter exercises, downloadable SAS code and data sets, and advanced material suitab...
Meta-Statistics for Variable Selection: The R Package BioMark
Directory of Open Access Journals (Sweden)
Ron Wehrens
2012-11-01
Full Text Available Biomarker identification is an ever more important topic in the life sciences. With the advent of measurement methodologies based on microarrays and mass spectrometry, thousands of variables are routinely being measured on complex biological samples. Often, the question is what makes two groups of samples different. Classical hypothesis testing suffers from the multiple testing problem; however, correcting for this often leads to a lack of power. In addition, choosing α cutoff levels remains somewhat arbitrary. Also in a regression context, a model depending on few but relevant variables will be more accurate and precise, and easier to interpret biologically.We propose an R package, BioMark, implementing two meta-statistics for variable selection. The first, higher criticism, presents a data-dependent selection threshold for significance, instead of a cookbook value of α = 0.05. It is applicable in all cases where two groups are compared. The second, stability selection, is more general, and can also be applied in a regression context. This approach uses repeated subsampling of the data in order to assess the variability of the model coefficients and selects those that remain consistently important. It is shown using experimental spike-in data from the field of metabolomics that both approaches work well with real data. BioMark also contains functionality for simulating data with specific characteristics for algorithm development and testing.
Dipnall, Joanna F.; Pasco, Julie A.; Berk, Michael; Williams, Lana J.; Dodd, Seetal; Jacka, Felice N.; Meyer, Denny
2016-01-01
Background Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study. Methods The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted reg...
Parametric statistical change point analysis
Chen, Jie
2000-01-01
This work is an in-depth study of the change point problem from a general point of view and a further examination of change point analysis of the most commonly used statistical models Change point problems are encountered in such disciplines as economics, finance, medicine, psychology, signal processing, and geology, to mention only several The exposition is clear and systematic, with a great deal of introductory material included Different models are presented in each chapter, including gamma and exponential models, rarely examined thus far in the literature Other models covered in detail are the multivariate normal, univariate normal, regression, and discrete models Extensive examples throughout the text emphasize key concepts and different methodologies are used, namely the likelihood ratio criterion, and the Bayesian and information criterion approaches A comprehensive bibliography and two indices complete the study
Quantitative Imaging Biomarkers: A Review of Statistical Methods for Computer Algorithm Comparisons
2014-01-01
Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research. PMID:24919829
Quantitative imaging biomarkers: a review of statistical methods for computer algorithm comparisons.
Obuchowski, Nancy A; Reeves, Anthony P; Huang, Erich P; Wang, Xiao-Feng; Buckler, Andrew J; Kim, Hyun J Grace; Barnhart, Huiman X; Jackson, Edward F; Giger, Maryellen L; Pennello, Gene; Toledano, Alicia Y; Kalpathy-Cramer, Jayashree; Apanasovich, Tatiyana V; Kinahan, Paul E; Myers, Kyle J; Goldgof, Dmitry B; Barboriak, Daniel P; Gillies, Robert J; Schwartz, Lawrence H; Sullivan, Daniel C
2015-02-01
Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
Statistical analysis and data management
International Nuclear Information System (INIS)
Anon.
1981-01-01
This report provides an overview of the history of the WIPP Biology Program. The recommendations of the American Institute of Biological Sciences (AIBS) for the WIPP biology program are summarized. The data sets available for statistical analyses and problems associated with these data sets are also summarized. Biological studies base maps are presented. A statistical model is presented to evaluate any correlation between climatological data and small mammal captures. No statistically significant relationship between variance in small mammal captures on Dr. Gennaro's 90m x 90m grid and precipitation records from the Duval Potash Mine were found
Statistical analysis of management data
Gatignon, Hubert
2013-01-01
This book offers a comprehensive approach to multivariate statistical analyses. It provides theoretical knowledge of the concepts underlying the most important multivariate techniques and an overview of actual applications.
A Statistical Analysis of Cryptocurrencies
Directory of Open Access Journals (Sweden)
Stephen Chan
2017-05-01
Full Text Available We analyze statistical properties of the largest cryptocurrencies (determined by market capitalization, of which Bitcoin is the most prominent example. We characterize their exchange rates versus the U.S. Dollar by fitting parametric distributions to them. It is shown that returns are clearly non-normal, however, no single distribution fits well jointly to all the cryptocurrencies analysed. We find that for the most popular currencies, such as Bitcoin and Litecoin, the generalized hyperbolic distribution gives the best fit, while for the smaller cryptocurrencies the normal inverse Gaussian distribution, generalized t distribution, and Laplace distribution give good fits. The results are important for investment and risk management purposes.
Integrative analysis to select cancer candidate biomarkers to targeted validation
Heberle, Henry; Domingues, Romênia R.; Granato, Daniela C.; Yokoo, Sami; Canevarolo, Rafael R.; Winck, Flavia V.; Ribeiro, Ana Carolina P.; Brandão, Thaís Bianca; Filgueiras, Paulo R.; Cruz, Karen S. P.; Barbuto, José Alexandre; Poppi, Ronei J.; Minghim, Rosane; Telles, Guilherme P.; Fonseca, Felipe Paiva; Fox, Jay W.; Santos-Silva, Alan R.; Coletta, Ricardo D.; Sherman, Nicholas E.; Paes Leme, Adriana F.
2015-01-01
Targeted proteomics has flourished as the method of choice for prospecting for and validating potential candidate biomarkers in many diseases. However, challenges still remain due to the lack of standardized routines that can prioritize a limited number of proteins to be further validated in human samples. To help researchers identify candidate biomarkers that best characterize their samples under study, a well-designed integrative analysis pipeline, comprising MS-based discovery, feature selection methods, clustering techniques, bioinformatic analyses and targeted approaches was performed using discovery-based proteomic data from the secretomes of three classes of human cell lines (carcinoma, melanoma and non-cancerous). Three feature selection algorithms, namely, Beta-binomial, Nearest Shrunken Centroids (NSC), and Support Vector Machine-Recursive Features Elimination (SVM-RFE), indicated a panel of 137 candidate biomarkers for carcinoma and 271 for melanoma, which were differentially abundant between the tumor classes. We further tested the strength of the pipeline in selecting candidate biomarkers by immunoblotting, human tissue microarrays, label-free targeted MS and functional experiments. In conclusion, the proposed integrative analysis was able to pre-qualify and prioritize candidate biomarkers from discovery-based proteomics to targeted MS. PMID:26540631
Statistical Power in Meta-Analysis
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
Statistical methods for astronomical data analysis
Chattopadhyay, Asis Kumar
2014-01-01
This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminologies for astronomical phenomenon will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for ...
2013-01-01
Background Calcium deficiency is a global public-health problem. Although the initial stage of calcium deficiency can lead to metabolic alterations or potential pathological changes, calcium deficiency is difficult to diagnose accurately. Moreover, the details of the molecular mechanism of calcium deficiency remain somewhat elusive. To accurately assess and provide appropriate nutritional intervention, we carried out a global analysis of metabolic alterations in response to calcium deficiency. Methods The metabolic alterations associated with calcium deficiency were first investigated in a rat model, using urinary metabonomics based on ultra-performance liquid chromatography coupled with quadrupole time-of-flight tandem mass spectrometry and multivariate statistical analysis. Correlations between dietary calcium intake and the biomarkers identified from the rat model were further analyzed to confirm the potential application of these biomarkers in humans. Results Urinary metabolic-profiling analysis could preliminarily distinguish between calcium-deficient and non-deficient rats after a 2-week low-calcium diet. We established an integrated metabonomics strategy for identifying reliable biomarkers of calcium deficiency using a time-course analysis of discriminating metabolites in a low-calcium diet experiment, repeating the low-calcium diet experiment and performing a calcium-supplement experiment. In total, 27 biomarkers were identified, including glycine, oxoglutaric acid, pyrophosphoric acid, sebacic acid, pseudouridine, indoxyl sulfate, taurine, and phenylacetylglycine. The integrated urinary metabonomics analysis, which combined biomarkers with regular trends of change (types A, B, and C), could accurately assess calcium-deficient rats at different stages and clarify the dynamic pathophysiological changes and molecular mechanism of calcium deficiency in detail. Significant correlations between calcium intake and two biomarkers, pseudouridine (Pearson
Statistical analysis with Excel for dummies
Schmuller, Joseph
2013-01-01
Take the mystery out of statistical terms and put Excel to work! If you need to create and interpret statistics in business or classroom settings, this easy-to-use guide is just what you need. It shows you how to use Excel's powerful tools for statistical analysis, even if you've never taken a course in statistics. Learn the meaning of terms like mean and median, margin of error, standard deviation, and permutations, and discover how to interpret the statistics of everyday life. You'll learn to use Excel formulas, charts, PivotTables, and other tools to make sense of everything fro
Welzenbach, Julia; Neuhoff, Christiane; Looft, Christian; Schellander, Karl; Tholen, Ernst; Große-Brinkhaus, Christine
2016-01-01
The aim of this study was to elucidate the underlying biochemical processes to identify potential key molecules of meat quality traits drip loss, pH of meat 1 h post-mortem (pH1), pH in meat 24 h post-mortem (pH24) and meat color. An untargeted metabolomics approach detected the profiles of 393 annotated and 1,600 unknown metabolites in 97 Duroc × Pietrain pigs. Despite obvious differences regarding the statistical approaches, the four applied methods, namely correlation analysis, principal component analysis, weighted network analysis (WNA) and random forest regression (RFR), revealed mainly concordant results. Our findings lead to the conclusion that meat quality traits pH1, pH24 and color are strongly influenced by processes of post-mortem energy metabolism like glycolysis and pentose phosphate pathway, whereas drip loss is significantly associated with metabolites of lipid metabolism. In case of drip loss, RFR was the most suitable method to identify reliable biomarkers and to predict the phenotype based on metabolites. On the other hand, WNA provides the best parameters to investigate the metabolite interactions and to clarify the complex molecular background of meat quality traits. In summary, it was possible to attain findings on the interaction of meat quality traits and their underlying biochemical processes. The detected key metabolites might be better indicators of meat quality especially of drip loss than the measured phenotype itself and potentially might be used as bio indicators. PMID:26919205
A Pilot Proteomic Analysis of Salivary Biomarkers in Autism Spectrum Disorder.
Ngounou Wetie, Armand G; Wormwood, Kelly L; Russell, Stefanie; Ryan, Jeanne P; Darie, Costel C; Woods, Alisa G
2015-06-01
Autism spectrum disorder (ASD) prevalence is increasing, with current estimates at 1/68-1/50 individuals diagnosed with an ASD. Diagnosis is based on behavioral assessments. Early diagnosis and intervention is known to greatly improve functional outcomes in people with ASD. Diagnosis, treatment monitoring and prognosis of ASD symptoms could be facilitated with biomarkers to complement behavioral assessments. Mass spectrometry (MS) based proteomics may help reveal biomarkers for ASD. In this pilot study, we have analyzed the salivary proteome in individuals with ASD compared to neurotypical control subjects, using MS-based proteomics. Our goal is to optimize methods for salivary proteomic biomarker discovery and to identify initial putative biomarkers in people with ASDs. The salivary proteome is virtually unstudied in ASD, and saliva could provide an easily accessible biomaterial for analysis. Using nano liquid chromatography-tandem mass spectrometry, we found statistically significant differences in several salivary proteins, including elevated prolactin-inducible protein, lactotransferrin, Ig kappa chain C region, Ig gamma-1 chain C region, Ig lambda-2 chain C regions, neutrophil elastase, polymeric immunoglobulin receptor and deleted in malignant brain tumors 1. Our results indicate that this is an effective method for identification of salivary protein biomarkers, support the concept that immune system and gastrointestinal disturbances may be present in individuals with ASDs and point toward the need for larger studies in behaviorally-characterized individuals. © 2015 International Society for Autism Research, Wiley Periodicals, Inc.
Collecting operational event data for statistical analysis
International Nuclear Information System (INIS)
Atwood, C.L.
1994-09-01
This report gives guidance for collecting operational data to be used for statistical analysis, especially analysis of event counts. It discusses how to define the purpose of the study, the unit (system, component, etc.) to be studied, events to be counted, and demand or exposure time. Examples are given of classification systems for events in the data sources. A checklist summarizes the essential steps in data collection for statistical analysis
Explorations in Statistics: The Analysis of Change
Curran-Everett, Douglas; Williams, Calvin L.
2015-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of "Explorations in Statistics" explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can…
Statistical shape analysis with applications in R
Dryden, Ian L
2016-01-01
A thoroughly revised and updated edition of this introduction to modern statistical methods for shape analysis Shape analysis is an important tool in the many disciplines where objects are compared using geometrical features. Examples include comparing brain shape in schizophrenia; investigating protein molecules in bioinformatics; and describing growth of organisms in biology. This book is a significant update of the highly-regarded `Statistical Shape Analysis’ by the same authors. The new edition lays the foundations of landmark shape analysis, including geometrical concepts and statistical techniques, and extends to include analysis of curves, surfaces, images and other types of object data. Key definitions and concepts are discussed throughout, and the relative merits of different approaches are presented. The authors have included substantial new material on recent statistical developments and offer numerous examples throughout the text. Concepts are introduced in an accessible manner, while reta...
Spatial analysis statistics, visualization, and computational methods
Oyana, Tonny J
2015-01-01
An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis-containing hands-on problem-sets that can be worked out in MS Excel or ArcGIS-as well as detailed illustrations and numerous case studies. The book enables readers to: Identify types and characterize non-spatial and spatial data Demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results Construct testable hypotheses that require inferential statistical analysis Process spatial data, extract explanatory variables, conduct statisti...
Advances in statistical models for data analysis
Minerva, Tommaso; Vichi, Maurizio
2015-01-01
This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.
Classification, (big) data analysis and statistical learning
Conversano, Claudio; Vichi, Maurizio
2018-01-01
This edited book focuses on the latest developments in classification, statistical learning, data analysis and related areas of data science, including statistical analysis of large datasets, big data analytics, time series clustering, integration of data from different sources, as well as social networks. It covers both methodological aspects as well as applications to a wide range of areas such as economics, marketing, education, social sciences, medicine, environmental sciences and the pharmaceutical industry. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field. The peer-reviewed contributions were presented at the 10th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in Santa Margherita di Pul...
Statistical hot spot analysis of reactor cores
International Nuclear Information System (INIS)
Schaefer, H.
1974-05-01
This report is an introduction into statistical hot spot analysis. After the definition of the term 'hot spot' a statistical analysis is outlined. The mathematical method is presented, especially the formula concerning the probability of no hot spots in a reactor core is evaluated. A discussion with the boundary conditions of a statistical hot spot analysis is given (technological limits, nominal situation, uncertainties). The application of the hot spot analysis to the linear power of pellets and the temperature rise in cooling channels is demonstrated with respect to the test zone of KNK II. Basic values, such as probability of no hot spots, hot spot potential, expected hot spot diagram and cumulative distribution function of hot spots, are discussed. It is shown, that the risk of hot channels can be dispersed equally over all subassemblies by an adequate choice of the nominal temperature distribution in the core
Statistics and analysis of scientific data
Bonamente, Massimiliano
2013-01-01
Statistics and Analysis of Scientific Data covers the foundations of probability theory and statistics, and a number of numerical and analytical methods that are essential for the present-day analyst of scientific data. Topics covered include probability theory, distribution functions of statistics, fits to two-dimensional datasheets and parameter estimation, Monte Carlo methods and Markov chains. Equal attention is paid to the theory and its practical application, and results from classic experiments in various fields are used to illustrate the importance of statistics in the analysis of scientific data. The main pedagogical method is a theory-then-application approach, where emphasis is placed first on a sound understanding of the underlying theory of a topic, which becomes the basis for an efficient and proactive use of the material for practical applications. The level is appropriate for undergraduates and beginning graduate students, and as a reference for the experienced researcher. Basic calculus is us...
Rweb:Web-based Statistical Analysis
Directory of Open Access Journals (Sweden)
Jeff Banfield
1999-03-01
Full Text Available Rweb is a freely accessible statistical analysis environment that is delivered through the World Wide Web (WWW. It is based on R, a well known statistical analysis package. The only requirement to run the basic Rweb interface is a WWW browser that supports forms. If you want graphical output you must, of course, have a browser that supports graphics. The interface provides access to WWW accessible data sets, so you may run Rweb on your own data. Rweb can provide a four window statistical computing environment (code input, text output, graphical output, and error information through browsers that support Javascript. There is also a set of point and click modules under development for use in introductory statistics courses.
Semiclassical analysis, Witten Laplacians, and statistical mechanis
Helffer, Bernard
2002-01-01
This important book explains how the technique of Witten Laplacians may be useful in statistical mechanics. It considers the problem of analyzing the decay of correlations, after presenting its origin in statistical mechanics. In addition, it compares the Witten Laplacian approach with other techniques, such as the transfer matrix approach and its semiclassical analysis. The author concludes by providing a complete proof of the uniform Log-Sobolev inequality. Contents: Witten Laplacians Approach; Problems in Statistical Mechanics with Discrete Spins; Laplace Integrals and Transfer Operators; S
A statistical approach to plasma profile analysis
International Nuclear Information System (INIS)
Kardaun, O.J.W.F.; McCarthy, P.J.; Lackner, K.; Riedel, K.S.
1990-05-01
A general statistical approach to the parameterisation and analysis of tokamak profiles is presented. The modelling of the profile dependence on both the radius and the plasma parameters is discussed, and pertinent, classical as well as robust, methods of estimation are reviewed. Special attention is given to statistical tests for discriminating between the various models, and to the construction of confidence intervals for the parameterised profiles and the associated global quantities. The statistical approach is shown to provide a rigorous approach to the empirical testing of plasma profile invariance. (orig.)
Reproducible statistical analysis with multiple languages
DEFF Research Database (Denmark)
Lenth, Russell; Højsgaard, Søren
2011-01-01
This paper describes the system for making reproducible statistical analyses. differs from other systems for reproducible analysis in several ways. The two main differences are: (1) Several statistics programs can be in used in the same document. (2) Documents can be prepared using OpenOffice or ......Office or \\LaTeX. The main part of this paper is an example showing how to use and together in an OpenOffice text document. The paper also contains some practical considerations on the use of literate programming in statistics....
Foundation of statistical energy analysis in vibroacoustics
Le Bot, A
2015-01-01
This title deals with the statistical theory of sound and vibration. The foundation of statistical energy analysis is presented in great detail. In the modal approach, an introduction to random vibration with application to complex systems having a large number of modes is provided. For the wave approach, the phenomena of propagation, group speed, and energy transport are extensively discussed. Particular emphasis is given to the emergence of diffuse field, the central concept of the theory.
A Statistical Toolkit for Data Analysis
International Nuclear Information System (INIS)
Donadio, S.; Guatelli, S.; Mascialino, B.; Pfeiffer, A.; Pia, M.G.; Ribon, A.; Viarengo, P.
2006-01-01
The present project aims to develop an open-source and object-oriented software Toolkit for statistical data analysis. Its statistical testing component contains a variety of Goodness-of-Fit tests, from Chi-squared to Kolmogorov-Smirnov, to less known, but generally much more powerful tests such as Anderson-Darling, Goodman, Fisz-Cramer-von Mises, Kuiper, Tiku. Thanks to the component-based design and the usage of the standard abstract interfaces for data analysis, this tool can be used by other data analysis systems or integrated in experimental software frameworks. This Toolkit has been released and is downloadable from the web. In this paper we describe the statistical details of the algorithms, the computational features of the Toolkit and describe the code validation
Analysis of statistical misconception in terms of statistical reasoning
Maryati, I.; Priatna, N.
2018-05-01
Reasoning skill is needed for everyone to face globalization era, because every person have to be able to manage and use information from all over the world which can be obtained easily. Statistical reasoning skill is the ability to collect, group, process, interpret, and draw conclusion of information. Developing this skill can be done through various levels of education. However, the skill is low because many people assume that statistics is just the ability to count and using formulas and so do students. Students still have negative attitude toward course which is related to research. The purpose of this research is analyzing students’ misconception in descriptive statistic course toward the statistical reasoning skill. The observation was done by analyzing the misconception test result and statistical reasoning skill test; observing the students’ misconception effect toward statistical reasoning skill. The sample of this research was 32 students of math education department who had taken descriptive statistic course. The mean value of misconception test was 49,7 and standard deviation was 10,6 whereas the mean value of statistical reasoning skill test was 51,8 and standard deviation was 8,5. If the minimal value is 65 to state the standard achievement of a course competence, students’ mean value is lower than the standard competence. The result of students’ misconception study emphasized on which sub discussion that should be considered. Based on the assessment result, it was found that students’ misconception happen on this: 1) writing mathematical sentence and symbol well, 2) understanding basic definitions, 3) determining concept that will be used in solving problem. In statistical reasoning skill, the assessment was done to measure reasoning from: 1) data, 2) representation, 3) statistic format, 4) probability, 5) sample, and 6) association.
Statistical analysis of network data with R
Kolaczyk, Eric D
2014-01-01
Networks have permeated everyday life through everyday realities like the Internet, social networks, and viral marketing. As such, network analysis is an important growth area in the quantitative sciences, with roots in social network analysis going back to the 1930s and graph theory going back centuries. Measurement and analysis are integral components of network research. As a result, statistical methods play a critical role in network analysis. This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. This text builds on Eric D. Kolaczyk’s book Statistical Analysis of Network Data (Springer, 2009).
Statistics and analysis of scientific data
Bonamente, Massimiliano
2017-01-01
The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include: • a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets. • a new chapter on the various measures of the mean including logarithmic averages. • new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors. • a new case study and additional worked examples. • mathematical derivations and theoretical background material have been appropriately marked,to improve the readabili...
Metabolome analysis for discovering biomarkers of gastroenterological cancer.
Suzuki, Makoto; Nishiumi, Shin; Matsubara, Atsuki; Azuma, Takeshi; Yoshida, Masaru
2014-09-01
Improvements in analytical technologies have made it possible to rapidly determine the concentrations of thousands of metabolites in any biological sample, which has resulted in metabolome analysis being applied to various types of research, such as clinical, cell biology, and plant/food science studies. The metabolome represents all of the end products and by-products of the numerous complex metabolic pathways operating in a biological system. Thus, metabolome analysis allows one to survey the global changes in an organism's metabolic profile and gain a holistic understanding of the changes that occur in organisms during various biological processes, e.g., during disease development. In clinical metabolomic studies, there is a strong possibility that differences in the metabolic profiles of human specimens reflect disease-specific states. Recently, metabolome analysis of biofluids, e.g., blood, urine, or saliva, has been increasingly used for biomarker discovery and disease diagnosis. Mass spectrometry-based techniques have been extensively used for metabolome analysis because they exhibit high selectivity and sensitivity during the identification and quantification of metabolites. Here, we describe metabolome analysis using liquid chromatography-mass spectrometry, gas chromatography-mass spectrometry, and capillary electrophoresis-mass spectrometry. Furthermore, the findings of studies that attempted to discover biomarkers of gastroenterological cancer are also outlined. Finally, we discuss metabolome analysis-based disease diagnosis. Copyright © 2014 Elsevier B.V. All rights reserved.
Statistical analysis on extreme wave height
Digital Repository Service at National Institute of Oceanography (India)
Teena, N.V.; SanilKumar, V.; Sudheesh, K.; Sajeev, R.
-294. • WAFO (2000) – A MATLAB toolbox for analysis of random waves and loads, Lund University, Sweden, homepage http://www.maths.lth.se/matstat/wafo/,2000. 15 Table 1: Statistical results of data and fitted distribution for cumulative distribution...
Applied Behavior Analysis and Statistical Process Control?
Hopkins, B. L.
1995-01-01
Incorporating statistical process control (SPC) methods into applied behavior analysis is discussed. It is claimed that SPC methods would likely reduce applied behavior analysts' intimate contacts with problems and would likely yield poor treatment and research decisions. Cases and data presented by Pfadt and Wheeler (1995) are cited as examples.…
The fuzzy approach to statistical analysis
Coppi, Renato; Gil, Maria A.; Kiers, Henk A. L.
2006-01-01
For the last decades, research studies have been developed in which a coalition of Fuzzy Sets Theory and Statistics has been established with different purposes. These namely are: (i) to introduce new data analysis problems in which the objective involves either fuzzy relationships or fuzzy terms;
Plasma data analysis using statistical analysis system
International Nuclear Information System (INIS)
Yoshida, Z.; Iwata, Y.; Fukuda, Y.; Inoue, N.
1987-01-01
Multivariate factor analysis has been applied to a plasma data base of REPUTE-1. The characteristics of the reverse field pinch plasma in REPUTE-1 are shown to be explained by four independent parameters which are described in the report. The well known scaling laws F/sub chi/ proportional to I/sub p/, T/sub e/ proportional to I/sub p/, and tau/sub E/ proportional to N/sub e/ are also confirmed. 4 refs., 8 figs., 1 tab
Statistical analysis of metallicity in spiral galaxies
Energy Technology Data Exchange (ETDEWEB)
Galeotti, P [Consiglio Nazionale delle Ricerche, Turin (Italy). Lab. di Cosmo-Geofisica; Turin Univ. (Italy). Ist. di Fisica Generale)
1981-04-01
A principal component analysis of metallicity and other integral properties of 33 spiral galaxies is presented; the involved parameters are: morphological type, diameter, luminosity and metallicity. From the statistical analysis it is concluded that the sample has only two significant dimensions and additonal tests, involving different parameters, show similar results. Thus it seems that only type and luminosity are independent variables, being the other integral properties of spiral galaxies correlated with them.
Selected papers on analysis, probability, and statistics
Nomizu, Katsumi
1994-01-01
This book presents papers that originally appeared in the Japanese journal Sugaku. The papers fall into the general area of mathematical analysis as it pertains to probability and statistics, dynamical systems, differential equations and analytic function theory. Among the topics discussed are: stochastic differential equations, spectra of the Laplacian and Schrödinger operators, nonlinear partial differential equations which generate dissipative dynamical systems, fractal analysis on self-similar sets and the global structure of analytic functions.
Statistical evaluation of vibration analysis techniques
Milner, G. Martin; Miller, Patrice S.
1987-01-01
An evaluation methodology is presented for a selection of candidate vibration analysis techniques applicable to machinery representative of the environmental control and life support system of advanced spacecraft; illustrative results are given. Attention is given to the statistical analysis of small sample experiments, the quantification of detection performance for diverse techniques through the computation of probability of detection versus probability of false alarm, and the quantification of diagnostic performance.
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard
2003-01-01
Statistical analyses are performed for material strength parameters from a large number of specimens of structural timber. Non-parametric statistical analysis and fits have been investigated for the following distribution types: Normal, Lognormal, 2 parameter Weibull and 3-parameter Weibull...... fits to the data available, especially if tail fits are used whereas the Log Normal distribution generally gives a poor fit and larger coefficients of variation, especially if tail fits are used. The implications on the reliability level of typical structural elements and on partial safety factors...... for timber are investigated....
The Statistical Analysis of Time Series
Anderson, T W
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences George
Developments in statistical analysis in quantitative genetics
DEFF Research Database (Denmark)
Sorensen, Daniel
2009-01-01
of genetic means and variances, models for the analysis of categorical and count data, the statistical genetics of a model postulating that environmental variance is partly under genetic control, and a short discussion of models that incorporate massive genetic marker information. We provide an overview......A remarkable research impetus has taken place in statistical genetics since the last World Conference. This has been stimulated by breakthroughs in molecular genetics, automated data-recording devices and computer-intensive statistical methods. The latter were revolutionized by the bootstrap...... and by Markov chain Monte Carlo (McMC). In this overview a number of specific areas are chosen to illustrate the enormous flexibility that McMC has provided for fitting models and exploring features of data that were previously inaccessible. The selected areas are inferences of the trajectories over time...
Statistical analysis of next generation sequencing data
Nettleton, Dan
2014-01-01
Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...
Robust statistics and geochemical data analysis
International Nuclear Information System (INIS)
Di, Z.
1987-01-01
Advantages of robust procedures over ordinary least-squares procedures in geochemical data analysis is demonstrated using NURE data from the Hot Springs Quadrangle, South Dakota, USA. Robust principal components analysis with 5% multivariate trimming successfully guarded the analysis against perturbations by outliers and increased the number of interpretable factors. Regression with SINE estimates significantly increased the goodness-of-fit of the regression and improved the correspondence of delineated anomalies with known uranium prospects. Because of the ubiquitous existence of outliers in geochemical data, robust statistical procedures are suggested as routine procedures to replace ordinary least-squares procedures
Labaj, Wojciech; Papiez, Anna; Polanski, Andrzej; Polanska, Joanna
2017-03-01
Large collections of data in studies on cancer such as leukaemia provoke the necessity of applying tailored analysis algorithms to ensure supreme information extraction. In this work, a custom-fit pipeline is demonstrated for thorough investigation of the voluminous MILE gene expression data set. Three analyses are accomplished, each for gaining a deeper understanding of the processes underlying leukaemia types and subtypes. First, the main disease groups are tested for differential expression against the healthy control as in a standard case-control study. Here, the basic knowledge on molecular mechanisms is confirmed quantitatively and by literature references. Second, pairwise comparison testing is performed for juxtaposing the main leukaemia types among each other. In this case by means of the Dice coefficient similarity measure the general relations are pointed out. Moreover, lists of candidate main leukaemia group biomarkers are proposed. Finally, with this approach being successful, the third analysis provides insight into all of the studied subtypes, followed by the emergence of four leukaemia subtype biomarkers. In addition, the class enhanced DEG signature obtained on the basis of novel pipeline processing leads to significantly better classification power of multi-class data classifiers. The developed methodology consisting of batch effect adjustment, adaptive noise and feature filtration coupled with adequate statistical testing and biomarker definition proves to be an effective approach towards knowledge discovery in high-throughput molecular biology experiments.
Analysis of photon statistics with Silicon Photomultiplier
International Nuclear Information System (INIS)
D'Ascenzo, N.; Saveliev, V.; Wang, L.; Xie, Q.
2015-01-01
The Silicon Photomultiplier (SiPM) is a novel silicon-based photodetector, which represents the modern perspective of low photon flux detection. The aim of this paper is to provide an introduction on the statistical analysis methods needed to understand and estimate in quantitative way the correct features and description of the response of the SiPM to a coherent source of light
Vapor Pressure Data Analysis and Statistics
2016-12-01
near 8, 2000, and 200, respectively. The A (or a) value is directly related to vapor pressure and will be greater for high vapor pressure materials...1, (10) where n is the number of data points, Yi is the natural logarithm of the i th experimental vapor pressure value, and Xi is the...VAPOR PRESSURE DATA ANALYSIS AND STATISTICS ECBC-TR-1422 Ann Brozena RESEARCH AND TECHNOLOGY DIRECTORATE
Statistical analysis of brake squeal noise
Oberst, S.; Lai, J. C. S.
2011-06-01
Despite substantial research efforts applied to the prediction of brake squeal noise since the early 20th century, the mechanisms behind its generation are still not fully understood. Squealing brakes are of significant concern to the automobile industry, mainly because of the costs associated with warranty claims. In order to remedy the problems inherent in designing quieter brakes and, therefore, to understand the mechanisms, a design of experiments study, using a noise dynamometer, was performed by a brake system manufacturer to determine the influence of geometrical parameters (namely, the number and location of slots) of brake pads on brake squeal noise. The experimental results were evaluated with a noise index and ranked for warm and cold brake stops. These data are analysed here using statistical descriptors based on population distributions, and a correlation analysis, to gain greater insight into the functional dependency between the time-averaged friction coefficient as the input and the peak sound pressure level data as the output quantity. The correlation analysis between the time-averaged friction coefficient and peak sound pressure data is performed by applying a semblance analysis and a joint recurrence quantification analysis. Linear measures are compared with complexity measures (nonlinear) based on statistics from the underlying joint recurrence plots. Results show that linear measures cannot be used to rank the noise performance of the four test pad configurations. On the other hand, the ranking of the noise performance of the test pad configurations based on the noise index agrees with that based on nonlinear measures: the higher the nonlinearity between the time-averaged friction coefficient and peak sound pressure, the worse the squeal. These results highlight the nonlinear character of brake squeal and indicate the potential of using nonlinear statistical analysis tools to analyse disc brake squeal.
Sensitivity analysis and related analysis : A survey of statistical techniques
Kleijnen, J.P.C.
1995-01-01
This paper reviews the state of the art in five related types of analysis, namely (i) sensitivity or what-if analysis, (ii) uncertainty or risk analysis, (iii) screening, (iv) validation, and (v) optimization. The main question is: when should which type of analysis be applied; which statistical
Analysis of Variance in Statistical Image Processing
Kurz, Ludwik; Hafed Benteftifa, M.
1997-04-01
A key problem in practical image processing is the detection of specific features in a noisy image. Analysis of variance (ANOVA) techniques can be very effective in such situations, and this book gives a detailed account of the use of ANOVA in statistical image processing. The book begins by describing the statistical representation of images in the various ANOVA models. The authors present a number of computationally efficient algorithms and techniques to deal with such problems as line, edge, and object detection, as well as image restoration and enhancement. By describing the basic principles of these techniques, and showing their use in specific situations, the book will facilitate the design of new algorithms for particular applications. It will be of great interest to graduate students and engineers in the field of image processing and pattern recognition.
Statistical Analysis of Zebrafish Locomotor Response.
Liu, Yiwen; Carmer, Robert; Zhang, Gaonan; Venkatraman, Prahatha; Brown, Skye Ashton; Pang, Chi-Pui; Zhang, Mingzhi; Ma, Ping; Leung, Yuk Fai
2015-01-01
Zebrafish larvae display rich locomotor behaviour upon external stimulation. The movement can be simultaneously tracked from many larvae arranged in multi-well plates. The resulting time-series locomotor data have been used to reveal new insights into neurobiology and pharmacology. However, the data are of large scale, and the corresponding locomotor behavior is affected by multiple factors. These issues pose a statistical challenge for comparing larval activities. To address this gap, this study has analyzed a visually-driven locomotor behaviour named the visual motor response (VMR) by the Hotelling's T-squared test. This test is congruent with comparing locomotor profiles from a time period. Different wild-type (WT) strains were compared using the test, which shows that they responded differently to light change at different developmental stages. The performance of this test was evaluated by a power analysis, which shows that the test was sensitive for detecting differences between experimental groups with sample numbers that were commonly used in various studies. In addition, this study investigated the effects of various factors that might affect the VMR by multivariate analysis of variance (MANOVA). The results indicate that the larval activity was generally affected by stage, light stimulus, their interaction, and location in the plate. Nonetheless, different factors affected larval activity differently over time, as indicated by a dynamical analysis of the activity at each second. Intriguingly, this analysis also shows that biological and technical repeats had negligible effect on larval activity. This finding is consistent with that from the Hotelling's T-squared test, and suggests that experimental repeats can be combined to enhance statistical power. Together, these investigations have established a statistical framework for analyzing VMR data, a framework that should be generally applicable to other locomotor data with similar structure.
Directory of Open Access Journals (Sweden)
Sudeepa Bhattacharyya
2006-01-01
Full Text Available Multiple Myeloma (MM is a severely debilitating neoplastic disease of B cell origin, with the primary source of morbidity and mortality associated with unrestrained bone destruction. Surface enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS was used to screen for potential biomarkers indicative of skeletal involvement in patients with MM. Serum samples from 48 MM patients, 24 with more than three bone lesions and 24 with no evidence of bone lesions were fractionated and analyzed in duplicate using copper ion loaded immobilized metal affinity SELDI chip arrays. The spectra obtained were compiled, normalized, and mass peaks with mass-to-charge ratios (m/z between 2000 and 20,000 Da identified. Peak information from all fractions was combined together and analyzed using univariate statistics, as well as a linear, partial least squares discriminant analysis (PLS-DA, and a non-linear, random forest (RF, classification algorithm. The PLS-DA model resulted in prediction accuracy between 96–100%, while the RF model was able to achieve a specificity and sensitivity of 87.5% each. Both models as well as multiple comparison adjusted univariate analysis identified a set of four peaks that were the most discriminating between the two groups of patients and hold promise as potential biomarkers for future diagnostic and/or therapeutic purposes.
On the Statistical Validation of Technical Analysis
Directory of Open Access Journals (Sweden)
Rosane Riera Freire
2007-06-01
Full Text Available Technical analysis, or charting, aims on visually identifying geometrical patterns in price charts in order to antecipate price "trends". In this paper we revisit the issue of thecnical analysis validation which has been tackled in the literature without taking care for (i the presence of heterogeneity and (ii statistical dependence in the analyzed data - various agglutinated return time series from distinct financial securities. The main purpose here is to address the first cited problem by suggesting a validation methodology that also "homogenizes" the securities according to the finite dimensional probability distribution of their return series. The general steps go through the identification of the stochastic processes for the securities returns, the clustering of similar securities and, finally, the identification of presence, or absence, of informatinal content obtained from those price patterns. We illustrate the proposed methodology with a real data exercise including several securities of the global market. Our investigation shows that there is a statistically significant informational content in two out of three common patterns usually found through technical analysis, namely: triangle, rectangle and head and shoulders.
Statistical trend analysis methods for temporal phenomena
International Nuclear Information System (INIS)
Lehtinen, E.; Pulkkinen, U.; Poern, K.
1997-04-01
We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of the hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly, if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: The Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered the Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted, that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods
Statistical trend analysis methods for temporal phenomena
Energy Technology Data Exchange (ETDEWEB)
Lehtinen, E.; Pulkkinen, U. [VTT Automation, (Finland); Poern, K. [Poern Consulting, Nykoeping (Sweden)
1997-04-01
We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of the hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly, if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: The Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered the Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted, that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods. 14 refs, 10 figs.
atBioNet– an integrated network analysis tool for genomics and biomarker discovery
Directory of Open Access Journals (Sweden)
Ding Yijun
2012-07-01
Full Text Available Abstract Background Large amounts of mammalian protein-protein interaction (PPI data have been generated and are available for public use. From a systems biology perspective, Proteins/genes interactions encode the key mechanisms distinguishing disease and health, and such mechanisms can be uncovered through network analysis. An effective network analysis tool should integrate different content-specific PPI databases into a comprehensive network format with a user-friendly platform to identify key functional modules/pathways and the underlying mechanisms of disease and toxicity. Results atBioNet integrates seven publicly available PPI databases into a network-specific knowledge base. Knowledge expansion is achieved by expanding a user supplied proteins/genes list with interactions from its integrated PPI network. The statistically significant functional modules are determined by applying a fast network-clustering algorithm (SCAN: a Structural Clustering Algorithm for Networks. The functional modules can be visualized either separately or together in the context of the whole network. Integration of pathway information enables enrichment analysis and assessment of the biological function of modules. Three case studies are presented using publicly available disease gene signatures as a basis to discover new biomarkers for acute leukemia, systemic lupus erythematosus, and breast cancer. The results demonstrated that atBioNet can not only identify functional modules and pathways related to the studied diseases, but this information can also be used to hypothesize novel biomarkers for future analysis. Conclusion atBioNet is a free web-based network analysis tool that provides a systematic insight into proteins/genes interactions through examining significant functional modules. The identified functional modules are useful for determining underlying mechanisms of disease and biomarker discovery. It can be accessed at: http
atBioNet--an integrated network analysis tool for genomics and biomarker discovery.
Ding, Yijun; Chen, Minjun; Liu, Zhichao; Ding, Don; Ye, Yanbin; Zhang, Min; Kelly, Reagan; Guo, Li; Su, Zhenqiang; Harris, Stephen C; Qian, Feng; Ge, Weigong; Fang, Hong; Xu, Xiaowei; Tong, Weida
2012-07-20
Large amounts of mammalian protein-protein interaction (PPI) data have been generated and are available for public use. From a systems biology perspective, Proteins/genes interactions encode the key mechanisms distinguishing disease and health, and such mechanisms can be uncovered through network analysis. An effective network analysis tool should integrate different content-specific PPI databases into a comprehensive network format with a user-friendly platform to identify key functional modules/pathways and the underlying mechanisms of disease and toxicity. atBioNet integrates seven publicly available PPI databases into a network-specific knowledge base. Knowledge expansion is achieved by expanding a user supplied proteins/genes list with interactions from its integrated PPI network. The statistically significant functional modules are determined by applying a fast network-clustering algorithm (SCAN: a Structural Clustering Algorithm for Networks). The functional modules can be visualized either separately or together in the context of the whole network. Integration of pathway information enables enrichment analysis and assessment of the biological function of modules. Three case studies are presented using publicly available disease gene signatures as a basis to discover new biomarkers for acute leukemia, systemic lupus erythematosus, and breast cancer. The results demonstrated that atBioNet can not only identify functional modules and pathways related to the studied diseases, but this information can also be used to hypothesize novel biomarkers for future analysis. atBioNet is a free web-based network analysis tool that provides a systematic insight into proteins/genes interactions through examining significant functional modules. The identified functional modules are useful for determining underlying mechanisms of disease and biomarker discovery. It can be accessed at: http://www.fda.gov/ScienceResearch/BioinformaticsTools/ucm285284.htm.
Statistical analysis of solar proton events
Directory of Open Access Journals (Sweden)
V. Kurt
2004-06-01
Full Text Available A new catalogue of 253 solar proton events (SPEs with energy >10MeV and peak intensity >10 protons/cm2.s.sr (pfu at the Earth's orbit for three complete 11-year solar cycles (1970-2002 is given. A statistical analysis of this data set of SPEs and their associated flares that occurred during this time period is presented. It is outlined that 231 of these proton events are flare related and only 22 of them are not associated with Ha flares. It is also noteworthy that 42 of these events are registered as Ground Level Enhancements (GLEs in neutron monitors. The longitudinal distribution of the associated flares shows that a great number of these events are connected with west flares. This analysis enables one to understand the long-term dependence of the SPEs and the related flare characteristics on the solar cycle which are useful for space weather prediction.
Recent advances in statistical energy analysis
Heron, K. H.
1992-01-01
Statistical Energy Analysis (SEA) has traditionally been developed using modal summation and averaging approach, and has led to the need for many restrictive SEA assumptions. The assumption of 'weak coupling' is particularly unacceptable when attempts are made to apply SEA to structural coupling. It is now believed that this assumption is more a function of the modal formulation rather than a necessary formulation of SEA. The present analysis ignores this restriction and describes a wave approach to the calculation of plate-plate coupling loss factors. Predictions based on this method are compared with results obtained from experiments using point excitation on one side of an irregular six-sided box structure. Conclusions show that the use and calculation of infinite transmission coefficients is the way forward for the development of a purely predictive SEA code.
STATISTICS, Program System for Statistical Analysis of Experimental Data
International Nuclear Information System (INIS)
Helmreich, F.
1991-01-01
1 - Description of problem or function: The package is composed of 83 routines, the most important of which are the following: BINDTR: Binomial distribution; HYPDTR: Hypergeometric distribution; POIDTR: Poisson distribution; GAMDTR: Gamma distribution; BETADTR: Beta-1 and Beta-2 distributions; NORDTR: Normal distribution; CHIDTR: Chi-square distribution; STUDTR : Distribution of 'Student's T'; FISDTR: Distribution of F; EXPDTR: Exponential distribution; WEIDTR: Weibull distribution; FRAKTIL: Calculation of the fractiles of the normal, chi-square, Student's, and F distributions; VARVGL: Test for equality of variance for several sample observations; ANPAST: Kolmogorov-Smirnov test and chi-square test of goodness of fit; MULIRE: Multiple linear regression analysis for a dependent variable and a set of independent variables; STPRG: Performs a stepwise multiple linear regression analysis for a dependent variable and a set of independent variables. At each step, the variable entered into the regression equation is the one which has the greatest amount of variance between it and the dependent variable. Any independent variable can be forced into or deleted from the regression equation, irrespective of its contribution to the equation. LTEST: Tests the hypotheses of linearity of the data. SPRANK: Calculates the Spearman rank correlation coefficient. 2 - Method of solution: VARVGL: The Bartlett's Test, the Cochran's Test and the Hartley's Test are performed in the program. MULIRE: The Gauss-Jordan method is used in the solution of the normal equations. STPRG: The abbreviated Doolittle method is used to (1) determine variables to enter into the regression, and (2) complete regression coefficient calculation. 3 - Restrictions on the complexity of the problem: VARVGL: The Hartley's Test is only performed if the sample observations are all of the same size
Sensitive Spectroscopic Analysis of Biomarkers in Exhaled Breath
Bicer, A.; Bounds, J.; Zhu, F.; Kolomenskii, A. A.; Kaya, N.; Aluauee, E.; Amani, M.; Schuessler, H. A.
2018-06-01
We have developed a novel optical setup which is based on a high finesse cavity and absorption laser spectroscopy in the near-IR spectral region. In pilot experiments, spectrally resolved absorption measurements of biomarkers in exhaled breath, such as methane and acetone, were carried out using cavity ring-down spectroscopy (CRDS). With a 172-cm-long cavity, an efficient optical path of 132 km was achieved. The CRDS technique is well suited for such measurements due to its high sensitivity and good spectral resolution. The detection limits for methane of 8 ppbv and acetone of 2.1 ppbv with spectral sampling of 0.005 cm-1 were achieved, which allowed to analyze multicomponent gas mixtures and to observe absorption peaks of 12CH4 and 13CH4. Further improvements of the technique have the potential to realize diagnostics of health conditions based on a multicomponent analysis of breath samples.
Occupational exposure to HDI: progress and challenges in biomarker analysis.
Flack, Sheila L; Ball, Louise M; Nylander-French, Leena A
2010-10-01
1,6-Hexamethylene diisocyanate (HDI) is extensively used in the automotive repair industry and is a commonly reported cause of occupational asthma in industrialized populations. However, the exact pathological mechanism remains uncertain. Characterization and quantification of biomarkers resulting from HDI exposure can fill important knowledge gaps between exposure, susceptibility, and the rise of immunological reactions and sensitization leading to asthma. Here, we discuss existing challenges in HDI biomarker analysis including the quantification of N-acetyl-1,6-hexamethylene diamine (monoacetyl-HDA) and N,N'-diacetyl-1,6-hexamethylene diamine (diacetyl-HDA) in urine samples based on previously established methods for HDA analysis. In addition, we describe the optimization of reaction conditions for the synthesis of monoacetyl-HDA and diacetyl-HDA, and utilize these standards for the quantification of these metabolites in the urine of three occupationally exposed workers. Diacetyl-HDA was present in untreated urine at 0.015-0.060 μg/l. Using base hydrolysis, the concentration range of monoacetyl-HDA in urine was 0.19-2.2 μg/l, 60-fold higher than in the untreated samples on average. HDA was detected only in one sample after base hydrolysis (0.026 μg/l). In contrast, acid hydrolysis yielded HDA concentrations ranging from 0.36 to 10.1 μg/l in these three samples. These findings demonstrate HDI metabolism via N-acetylation metabolic pathway and protein adduct formation resulting from occupational exposure to HDI. Copyright © 2010 Elsevier B.V. All rights reserved.
DEFF Research Database (Denmark)
Madsen, Tobias
2017-01-01
In the present thesis I develop, implement and apply statistical methods for detecting genomic elements implicated in cancer development and progression. This is done in two separate bodies of work. The first uses the somatic mutation burden to distinguish cancer driver mutations from passenger m...
Towards the disease biomarker in an individual patient using statistical health monitoring
Engel, J.; Blanchet, L.M.; Engelke, U.F.; Wevers, R.A.; Buydens, L.M.
2014-01-01
In metabolomics, identification of complex diseases is often based on application of (multivariate) statistical techniques to the data. Commonly, each disease requires its own specific diagnostic model, separating healthy and diseased individuals, which is not very practical in a diagnostic setting.
Statistical analysis of tourism destination competitiveness
Directory of Open Access Journals (Sweden)
Attilio Gardini
2013-05-01
Full Text Available The growing relevance of tourism industry for modern advanced economies has increased the interest among researchers and policy makers in the statistical analysis of destination competitiveness. In this paper we outline a new model of destination competitiveness based on sound theoretical grounds and we develop a statistical test of the model on sample data based on Italian tourist destination decisions and choices. Our model focuses on the tourism decision process which starts from the demand schedule for holidays and ends with the choice of a specific holiday destination. The demand schedule is a function of individual preferences and of destination positioning, while the final decision is a function of the initial demand schedule and the information concerning services for accommodation and recreation in the selected destinations. Moreover, we extend previous studies that focused on image or attributes (such as climate and scenery by paying more attention to the services for accommodation and recreation in the holiday destinations. We test the proposed model using empirical data collected from a sample of 1.200 Italian tourists interviewed in 2007 (October - December. Data analysis shows that the selection probability for the destination included in the consideration set is not proportional to the share of inclusion because the share of inclusion is determined by the brand image, while the selection of the effective holiday destination is influenced by the real supply conditions. The analysis of Italian tourists preferences underline the existence of a latent demand for foreign holidays which points out a risk of market share reduction for Italian tourism system in the global market. We also find a snow ball effect which helps the most popular destinations, mainly in the northern Italian regions.
Multivariate statistical analysis of wildfires in Portugal
Costa, Ricardo; Caramelo, Liliana; Pereira, Mário
2013-04-01
Several studies demonstrate that wildfires in Portugal present high temporal and spatial variability as well as cluster behavior (Pereira et al., 2005, 2011). This study aims to contribute to the characterization of the fire regime in Portugal with the multivariate statistical analysis of the time series of number of fires and area burned in Portugal during the 1980 - 2009 period. The data used in the analysis is an extended version of the Rural Fire Portuguese Database (PRFD) (Pereira et al, 2011), provided by the National Forest Authority (Autoridade Florestal Nacional, AFN), the Portuguese Forest Service, which includes information for more than 500,000 fire records. There are many multiple advanced techniques for examining the relationships among multiple time series at the same time (e.g., canonical correlation analysis, principal components analysis, factor analysis, path analysis, multiple analyses of variance, clustering systems). This study compares and discusses the results obtained with these different techniques. Pereira, M.G., Trigo, R.M., DaCamara, C.C., Pereira, J.M.C., Leite, S.M., 2005: "Synoptic patterns associated with large summer forest fires in Portugal". Agricultural and Forest Meteorology. 129, 11-25. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011 This work is supported by European Union Funds (FEDER/COMPETE - Operational Competitiveness Programme) and by national funds (FCT - Portuguese Foundation for Science and Technology) under the project FCOMP-01-0124-FEDER-022692, the project FLAIR (PTDC/AAC-AMB/104702/2008) and the EU 7th Framework Program through FUME (contract number 243888).
Lind, Mads V; Savolainen, Otto I; Ross, Alastair B
2016-08-01
Data quality is critical for epidemiology, and as scientific understanding expands, the range of data available for epidemiological studies and the types of tools used for measurement have also expanded. It is essential for the epidemiologist to have a grasp of the issues involved with different measurement tools. One tool that is increasingly being used for measuring biomarkers in epidemiological cohorts is mass spectrometry (MS), because of the high specificity and sensitivity of MS-based methods and the expanding range of biomarkers that can be measured. Further, the ability of MS to quantify many biomarkers simultaneously is advantageously compared to single biomarker methods. However, as with all methods used to measure biomarkers, there are a number of pitfalls to consider which may have an impact on results when used in epidemiology. In this review we discuss the use of MS for biomarker analyses, focusing on metabolites and their application and potential issues related to large-scale epidemiology studies, the use of MS "omics" approaches for biomarker discovery and how MS-based results can be used for increasing biological knowledge gained from epidemiological studies. Better understanding of the possibilities and possible problems related to MS-based measurements will help the epidemiologist in their discussions with analytical chemists and lead to the use of the most appropriate statistical tools for these data.
A statistical analysis of electrical cerebral activity
International Nuclear Information System (INIS)
Bassant, Marie-Helene
1971-01-01
The aim of this work was to study the statistical properties of the amplitude of the electroencephalographic signal. The experimental method is described (implantation of electrodes, acquisition and treatment of data). The program of the mathematical analysis is given (calculation of probability density functions, study of stationarity) and the validity of the tests discussed. The results concerned ten rabbits. Trips of EEG were sampled during 40 s. with very short intervals (500 μs). The probability density functions established for different brain structures (especially the dorsal hippocampus) and areas, were compared during sleep, arousal and visual stimulus. Using a Χ 2 test, it was found that the Gaussian distribution assumption was rejected in 96.7 per cent of the cases. For a given physiological state, there was no mathematical reason to reject the assumption of stationarity (in 96 per cent of the cases). (author) [fr
Statistical analysis of ultrasonic measurements in concrete
Chiang, Chih-Hung; Chen, Po-Chih
2002-05-01
Stress wave techniques such as measurements of ultrasonic pulse velocity are often used to evaluate concrete quality in structures. For proper interpretation of measurement results, the dependence of pulse transit time on the average acoustic impedance and the material homogeneity along the sound path need to be examined. Semi-direct measurement of pulse velocity could be more convenient than through transmission measurement. It is not necessary to assess both sides of concrete floors or walls. A novel measurement scheme is proposed and verified based on statistical analysis. It is shown that Semi-direct measurements are very effective for gathering large amount of pulse velocity data from concrete reference specimens. The variability of measurements is comparable with that reported by American Concrete Institute using either break-off or pullout tests.
Zhang, Bo; Chen, Zhen; Albert, Paul S
2012-01-01
High-dimensional biomarker data are often collected in epidemiological studies when assessing the association between biomarkers and human disease is of interest. We develop a latent class modeling approach for joint analysis of high-dimensional semicontinuous biomarker data and a binary disease outcome. To model the relationship between complex biomarker expression patterns and disease risk, we use latent risk classes to link the 2 modeling components. We characterize complex biomarker-specific differences through biomarker-specific random effects, so that different biomarkers can have different baseline (low-risk) values as well as different between-class differences. The proposed approach also accommodates data features that are common in environmental toxicology and other biomarker exposure data, including a large number of biomarkers, numerous zero values, and complex mean-variance relationship in the biomarkers levels. A Monte Carlo EM (MCEM) algorithm is proposed for parameter estimation. Both the MCEM algorithm and model selection procedures are shown to work well in simulations and applications. In applying the proposed approach to an epidemiological study that examined the relationship between environmental polychlorinated biphenyl (PCB) exposure and the risk of endometriosis, we identified a highly significant overall effect of PCB concentrations on the risk of endometriosis.
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard; Hoffmeyer, P.
Statistical analyses are performed for material strength parameters from approximately 6700 specimens of structural timber. Non-parametric statistical analyses and fits to the following distributions types have been investigated: Normal, Lognormal, 2 parameter Weibull and 3-parameter Weibull...
On two methods of statistical image analysis
Missimer, J; Knorr, U; Maguire, RP; Herzog, H; Seitz, RJ; Tellman, L; Leenders, K.L.
1999-01-01
The computerized brain atlas (CBA) and statistical parametric mapping (SPM) are two procedures for voxel-based statistical evaluation of PET activation studies. Each includes spatial standardization of image volumes, computation of a statistic, and evaluation of its significance. In addition,
Application of descriptive statistics in analysis of experimental data
Mirilović Milorad; Pejin Ivana
2008-01-01
Statistics today represent a group of scientific methods for the quantitative and qualitative investigation of variations in mass appearances. In fact, statistics present a group of methods that are used for the accumulation, analysis, presentation and interpretation of data necessary for reaching certain conclusions. Statistical analysis is divided into descriptive statistical analysis and inferential statistics. The values which represent the results of an experiment, and which are the subj...
Directory of Open Access Journals (Sweden)
Huarong Xu
2016-08-01
Full Text Available Polyamines, one of the most important kind of biomarkers in cancer research, were investigated in order to characterize different cancer types. An integrative approach which combined ultra-high performance liquid chromatography—tandem mass spectrometry detection and multiple statistical data processing strategies including outlier elimination, binary logistic regression analysis and cluster analysis had been developed to discover the characteristic biomarkers of lung and liver cancer. The concentrations of 14 polyamine metabolites in biosamples from lung (n = 50 and liver cancer patients (n = 50 were detected by a validated UHPLC-MS/MS method. Then the concentrations were converted into independent variables to characterize patients of lung and liver cancer by binary logic regression analysis. Significant independent variables were regarded as the potential biomarkers. Cluster analysis was engaged for further verifying. As a result, two values was discovered to identify lung and liver cancer, which were the product of the plasma concentration of putrescine and spermidine; and the ratio of the urine concentration of S-adenosyl-l-methionine and N-acetylspermidine. Results indicated that the established advanced method could be successfully applied to characterize lung and liver cancer, and may also enable a new way of discovering cancer biomarkers and characterizing other types of cancer.
Statistical analysis in MSW collection performance assessment.
Teixeira, Carlos Afonso; Avelino, Catarina; Ferreira, Fátima; Bentes, Isabel
2014-09-01
The increase of Municipal Solid Waste (MSW) generated over the last years forces waste managers pursuing more effective collection schemes, technically viable, environmentally effective and economically sustainable. The assessment of MSW services using performance indicators plays a crucial role for improving service quality. In this work, we focus on the relevance of regular system monitoring as a service assessment tool. In particular, we select and test a core-set of MSW collection performance indicators (effective collection distance, effective collection time and effective fuel consumption) that highlights collection system strengths and weaknesses and supports pro-active management decision-making and strategic planning. A statistical analysis was conducted with data collected in mixed collection system of Oporto Municipality, Portugal, during one year, a week per month. This analysis provides collection circuits' operational assessment and supports effective short-term municipality collection strategies at the level of, e.g., collection frequency and timetables, and type of containers. Copyright © 2014 Elsevier Ltd. All rights reserved.
Statistics Analysis Measures Painting of Cooling Tower
Directory of Open Access Journals (Sweden)
A. Zacharopoulou
2013-01-01
Full Text Available This study refers to the cooling tower of Megalopolis (construction 1975 and protection from corrosive environment. The maintenance of the cooling tower took place in 2008. The cooling tower was badly damaged from corrosion of reinforcement. The parabolic cooling towers (factory of electrical power are a typical example of construction, which has a special aggressive environment. The protection of cooling towers is usually achieved through organic coatings. Because of the different environmental impacts on the internal and external side of the cooling tower, a different system of paint application is required. The present study refers to the damages caused by corrosion process. The corrosive environments, the application of this painting, the quality control process, the measures and statistics analysis, and the results were discussed in this study. In the process of quality control the following measurements were taken into consideration: (1 examination of the adhesion with the cross-cut test, (2 examination of the film thickness, and (3 controlling of the pull-off resistance for concrete substrates and paintings. Finally, this study refers to the correlations of measurements, analysis of failures in relation to the quality of repair, and rehabilitation of the cooling tower. Also this study made a first attempt to apply the specific corrosion inhibitors in such a large structure.
Transit safety & security statistics & analysis 2002 annual report (formerly SAMIS)
2004-12-01
The Transit Safety & Security Statistics & Analysis 2002 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administrations (FTAs) National Tr...
Transit safety & security statistics & analysis 2003 annual report (formerly SAMIS)
2005-12-01
The Transit Safety & Security Statistics & Analysis 2003 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administrations (FTAs) National Tr...
Statistical analysis of long term spatial and temporal trends of ...
Indian Academy of Sciences (India)
Statistical analysis of long term spatial and temporal trends of temperature ... CGCM3; HadCM3; modified Mann–Kendall test; statistical analysis; Sutlej basin. ... Water Resources Systems Division, National Institute of Hydrology, Roorkee 247 ...
Quantitative plasma biomarker analysis in HDI exposure assessment.
Flack, Sheila L; Fent, Kenneth W; Trelles Gaines, Linda G; Thomasen, Jennifer M; Whittaker, Steve; Ball, Louise M; Nylander-French, Leena A
2010-01-01
Quantification of amines in biological samples is important for evaluating occupational exposure to diisocyanates. In this study, we describe the quantification of 1,6-hexamethylene diamine (HDA) levels in hydrolyzed plasma of 46 spray painters applying 1,6-hexamethylene diisocyanate (HDI)-containing paint in vehicle repair shops collected during repeated visits to their workplace and their relationship with dermal and inhalation exposure to HDI monomer. HDA was detected in 76% of plasma samples, as heptafluorobutyryl derivatives, and the range of HDA concentrations was HDA levels and HDI inhalation exposure measured on the same workday was low (N = 108, r = 0.22, P = 0.026) compared with the correlation between plasma HDA levels and inhalation exposure occurring approximately 20 to 60 days before blood collection (N = 29, r = 0.57, P = 0.0014). The correlation between plasma HDA levels and HDI dermal exposure measured on the same workday, although statistically significant, was low (N = 108, r = 0.22, P = 0.040) while the correlation between HDA and dermal exposure occurring approximately 20 to 60 days before blood collection was slightly improved (N = 29, r = 0.36, P = 0.053). We evaluated various workplace factors and controls (i.e. location, personal protective equipment use and paint booth type) as modifiers of plasma HDA levels. Workers using a downdraft-ventilated booth had significantly lower plasma HDA levels relative to semi-downdraft and crossdraft booth types (P = 0.0108); this trend was comparable to HDI inhalation and dermal exposure levels stratified by booth type. These findings indicate that HDA concentration in hydrolyzed plasma may be used as a biomarker of cumulative inhalation and dermal exposure to HDI and for investigating the effectiveness of exposure controls in the workplace.
Statistical approach to partial equilibrium analysis
Wang, Yougui; Stanley, H. E.
2009-04-01
A statistical approach to market equilibrium and efficiency analysis is proposed in this paper. One factor that governs the exchange decisions of traders in a market, named willingness price, is highlighted and constitutes the whole theory. The supply and demand functions are formulated as the distributions of corresponding willing exchange over the willingness price. The laws of supply and demand can be derived directly from these distributions. The characteristics of excess demand function are analyzed and the necessary conditions for the existence and uniqueness of equilibrium point of the market are specified. The rationing rates of buyers and sellers are introduced to describe the ratio of realized exchange to willing exchange, and their dependence on the market price is studied in the cases of shortage and surplus. The realized market surplus, which is the criterion of market efficiency, can be written as a function of the distributions of willing exchange and the rationing rates. With this approach we can strictly prove that a market is efficient in the state of equilibrium.
Analysis of Variance: What Is Your Statistical Software Actually Doing?
Li, Jian; Lomax, Richard G.
2011-01-01
Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs, mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…
Biomarkers and asthma management: analysis and potential applications
Richards, Levi B.; Neerincx, Anne H.; van Bragt, Job J. M. H.; Sterk, Peter J.; Bel, Elisabeth H. D.; Maitland-van der Zee, Anke H.
2018-01-01
Asthma features a high degree of heterogeneity in both pathophysiology and therapeutic response, resulting in many asthma patients being treated inadequately. Biomarkers indicative of underlying pathological processes could be used to identify disease subtypes, determine prognosis and to predict or
Mesothelin as a biomarker for ovarian carcinoma: a meta-analysis
Directory of Open Access Journals (Sweden)
KRISTIAN MADEIRA
2016-06-01
Full Text Available The objective of this work was to estimate the accuracy of mesothelin as a biomarker for ovarian cancer. A quantitative systematic review was performed. A comprehensive search of the Medline, LILACS, SCOPUS, Embase, Cochrane Central Register of Controlled Trials, Biomed Central, and ISI Web of Science databases was conducted from January 1990 to June 2015. For inclusion in this systematic review, the papers must have measured mesothelin levels in at least two histological diagnoses; ovarian cancer (borderline or ovarian tumor vs. benign or normal ovarian tissue. For each study, 2 x 2 contingency tables were constructed. We calculated the sensitivity, specificity and diagnostic odds ratio. The verification bias was performed according to QUADAS-2. Statistical analysis was performed with the software Stata 11, Meta-DiSc(r and RevMan 5.2. Twelve studies were analyzed, which included 1,561 women. The pooled sensitivity was 0.62 (CI 95% 0.58 - 0.66 and specificity was 0.94 (CI 95% 0.92 - 0.95. The DOR was 38.92 (CI 95% 17.82 - 84.99. Our systematic review shows that mesothelin cannot serve alone as a biomarker for the detection of ovarian cancer.
Searching for New Biomarkers and the Use of Multivariate Analysis in Gastric Cancer Diagnostics.
Kucera, Radek; Smid, David; Topolcan, Ondrej; Karlikova, Marie; Fiala, Ondrej; Slouka, David; Skalicky, Tomas; Treska, Vladislav; Kulda, Vlastimil; Simanek, Vaclav; Safanda, Martin; Pesta, Martin
2016-04-01
The first aim of this study was to search for new biomarkers to be used in gastric cancer diagnostics. The second aim was to verify the findings presented in literature on a sample of the local population and investigate the risk of gastric cancer in that population using a multivariant statistical analysis. We assessed a group of 36 patients with gastric cancer and 69 healthy individuals. We determined carcinoembryonic antigen, cancer antigen 19-9, cancer antigen 72-4, matrix metalloproteinases (-1, -2, -7, -8 and -9), osteoprotegerin, osteopontin, prothrombin induced by vitamin K absence-II, pepsinogen I, pepsinogen II, gastrin and Helicobacter pylori for each sample. The multivariate stepwise logistic regression identified the following biomarkers as the best gastric cancer predictors: CEA, CA72-4, pepsinogen I, Helicobacter pylori presence and MMP7. CEA and CA72-4 remain the best markers for gastric cancer diagnostics. We suggest a mathematical model for the assessment of risk of gastric cancer. Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
Statistical analysis of angular correlation measurements
International Nuclear Information System (INIS)
Oliveira, R.A.A.M. de.
1986-01-01
Obtaining the multipole mixing ratio, δ, of γ transitions in angular correlation measurements is a statistical problem characterized by the small number of angles in which the observation is made and by the limited statistic of counting, α. The inexistence of a sufficient statistics for the estimator of δ, is shown. Three different estimators for δ were constructed and their properties of consistency, bias and efficiency were tested. Tests were also performed in experimental results obtained in γ-γ directional correlation measurements. (Author) [pt
Surface Properties of TNOs: Preliminary Statistical Analysis
Antonieta Barucci, Maria; Fornasier, S.; Alvarez-Cantal, A.; de Bergh, C.; Merlin, F.; DeMeo, F.; Dumas, C.
2009-09-01
An overview of the surface properties based on the last results obtained during the Large Program performed at ESO-VLT (2007-2008) will be presented. Simultaneous high quality visible and near-infrared spectroscopy and photometry have been carried out on 40 objects with various dynamical properties, using FORS1 (V), ISAAC (J) and SINFONI (H+K bands) mounted respectively at UT2, UT1 and UT4 VLT-ESO telescopes (Cerro Paranal, Chile). For spectroscopy we computed the spectral slope for each object and searched for possible rotational inhomogeneities. A few objects show features in their visible spectra such as Eris, whose spectral bands are displaced with respect to pure methane-ice. We identify new faint absorption features on 10199 Chariklo and 42355 Typhon, possibly due to the presence of aqueous altered materials. The H+K band spectroscopy was performed with the new instrument SINFONI which is a 3D integral field spectrometer. While some objects show no diagnostic spectral bands, others reveal surface deposits of ices of H2O, CH3OH, CH4, and N2. To investigate the surface properties of these bodies, a radiative transfer model has been applied to interpret the entire 0.4-2.4 micron spectral region. The diversity of the spectra suggests that these objects represent a substantial range of bulk compositions. These different surface compositions can be diagnostic of original compositional diversity, interior source and/or different evolution with different physical processes affecting the surfaces. A statistical analysis is in progress to investigate the correlation of the TNOs’ surface properties with size and dynamical properties.
Time Series Analysis Based on Running Mann Whitney Z Statistics
A sensitive and objective time series analysis method based on the calculation of Mann Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-...
COMPARATIVE STATISTICAL ANALYSIS OF GENOTYPES’ COMBINING
Directory of Open Access Journals (Sweden)
V. Z. Stetsyuk
2015-05-01
The program provides the creation of desktop program complex for statistics calculations on a personal computer of doctor. Modern methods and tools for development of information systems were described to create program.
Statistical Analysis of Research Data | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Analysis of Research Data (SARD) course will be held on April 5-6, 2018 from 9 a.m.-5 p.m. at the National Institutes of Health's Natcher Conference Center, Balcony C on the Bethesda Campus. SARD is designed to provide an overview on the general principles of statistical analysis of research data. The first day will feature univariate data analysis, including descriptive statistics, probability distributions, one- and two-sample inferential statistics.
Kahle, E.; Uno, K. T.; Polissar, P. J.; Lepre, C. J.; deMenocal, P. B.
2013-12-01
Plio-Pleistocene sedimentary records from the Turkana Basin in eastern Africa provide a unique opportunity to compare a high-resolution record of climate and terrestrial vegetation with important changes in the record of human evolution. Molecular biomarkers from terrestrial vegetation can yield stable isotope ratios of hydrogen and carbon that reflect ancient climate and vegetation. However, the preservation of long-chain plant wax biomarkers in these paleosol, fluvial, and lacustrine sediments is not known, and this preservation must be studied to establish their utility for molecular stable isotope studies. We investigated leaf wax biomarkers in Nachukui Formation sediments deposited between 2.3 and 1.7 Ma to assess biomarker preservation. We analyzed n alkane and n alkanoic acid concentrations and, where suitable, molecular carbon and hydrogen isotope ratios. Molecular abundance distributions show a great deal of variance in biomarker preservation and plant-type source as indicated by the carbon preference index and average chain length. This variation suggests that some samples are suitable for isotopic analysis, while other samples lack primary terrestrial plant biomarker signatures. The biomarker signal in many samples contains significant additional material from unidentified sources. For example, the n-alkane distributions contain an unresolved complex mixture underlying the short and mid-chain n-alkanes. Samples from lacustrine intervals include long-chain diacids, hydroxy acids and (ω-1) ketoacids that suggest degradation of the original acids. Degradation of poorly preserved samples and the addition of non-terrestrial plant biomarkers may originate from a number of processes including forest fire or microbial alteration. Isotopic analysis of well-preserved terrestrial plant biomarkers will be presented along with examples where the original biomarker distribution has been altered.
Statistical network analysis for analyzing policy networks
DEFF Research Database (Denmark)
Robins, Garry; Lewis, Jenny; Wang, Peng
2012-01-01
and policy network methodology is the development of statistical modeling approaches that can accommodate such dependent data. In this article, we review three network statistical methods commonly used in the current literature: quadratic assignment procedures, exponential random graph models (ERGMs......To analyze social network data using standard statistical approaches is to risk incorrect inference. The dependencies among observations implied in a network conceptualization undermine standard assumptions of the usual general linear models. One of the most quickly expanding areas of social......), and stochastic actor-oriented models. We focus most attention on ERGMs by providing an illustrative example of a model for a strategic information network within a local government. We draw inferences about the structural role played by individuals recognized as key innovators and conclude that such an approach...
Statistical models and methods for reliability and survival analysis
Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Huber -Carol, Catherine; Limnios, Nikolaos; Gerville-Reache, Leo
2013-01-01
Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical
Statistical analysis of hydrodynamic cavitation events
Gimenez, G.; Sommer, R.
1980-10-01
The frequency (number of events per unit time) of pressure pulses produced by hydrodynamic cavitation bubble collapses is investigated using statistical methods. The results indicate that this frequency is distributed according to a normal law, its parameters not being time-evolving.
Statistical analysis of lineaments of Goa, India
Digital Repository Service at National Institute of Oceanography (India)
Iyer, S.D.; Banerjee, G.; Wagle, B.G.
statistically to obtain the nonlinear pattern in the form of a cosine wave. Three distinct peaks were found at azimuths of 40-45 degrees, 90-95 degrees and 140-145 degrees, which have peak values of 5.85, 6.80 respectively. These three peaks are correlated...
On statistical analysis of compound point process
Czech Academy of Sciences Publication Activity Database
Volf, Petr
2006-01-01
Roč. 35, 2-3 (2006), s. 389-396 ISSN 1026-597X R&D Projects: GA ČR(CZ) GA402/04/1294 Institutional research plan: CEZ:AV0Z10750506 Keywords : counting process * compound process * hazard function * Cox -model Subject RIV: BB - Applied Statistics, Operational Research
Statistical Analysis Of Reconnaissance Geochemical Data From ...
African Journals Online (AJOL)
, Co, Mo, Hg, Sb, Tl, Sc, Cr, Ni, La, W, V, U, Th, Bi, Sr and Ga in 56 stream sediment samples collected from Orle drainage system were subjected to univariate and multivariate statistical analyses. The univariate methods used include ...
Uncertainty analysis with statistically correlated failure data
International Nuclear Information System (INIS)
Modarres, M.; Dezfuli, H.; Roush, M.L.
1987-01-01
Likelihood of occurrence of the top event of a fault tree or sequences of an event tree is estimated from the failure probability of components that constitute the events of the fault/event tree. Component failure probabilities are subject to statistical uncertainties. In addition, there are cases where the failure data are statistically correlated. At present most fault tree calculations are based on uncorrelated component failure data. This chapter describes a methodology for assessing the probability intervals for the top event failure probability of fault trees or frequency of occurrence of event tree sequences when event failure data are statistically correlated. To estimate mean and variance of the top event, a second-order system moment method is presented through Taylor series expansion, which provides an alternative to the normally used Monte Carlo method. For cases where component failure probabilities are statistically correlated, the Taylor expansion terms are treated properly. Moment matching technique is used to obtain the probability distribution function of the top event through fitting the Johnson Ssub(B) distribution. The computer program, CORRELATE, was developed to perform the calculations necessary for the implementation of the method developed. (author)
Topic model-based mass spectrometric data analysis in cancer biomarker discovery studies.
Wang, Minkun; Tsai, Tsung-Heng; Di Poto, Cristina; Ferrarini, Alessia; Yu, Guoqiang; Ressom, Habtom W
2016-08-18
that incorporation of scan-level features have the potential to lead to more accurate purification results by alleviating the loss in information as a result of integrating peaks. We believe cancer biomarker discovery studies that use mass spectrometric analysis of human biospecimens can greatly benefit from topic model-based purification of the data prior to statistical and pathway analyses.
International Nuclear Information System (INIS)
Bruse, Jan L.; McLeod, Kristin; Biglino, Giovanni; Ntsinjana, Hopewell N.; Capelli, Claudio
2016-01-01
Medical image analysis in clinical practice is commonly carried out on 2D image data, without fully exploiting the detailed 3D anatomical information that is provided by modern non-invasive medical imaging techniques. In this paper, a statistical shape analysis method is presented, which enables the extraction of 3D anatomical shape features from cardiovascular magnetic resonance (CMR) image data, with no need for manual landmarking. The method was applied to repaired aortic coarctation arches that present complex shapes, with the aim of capturing shape features as biomarkers of potential functional relevance. The method is presented from the user-perspective and is evaluated by comparing results with traditional morphometric measurements. Steps required to set up the statistical shape modelling analyses, from pre-processing of the CMR images to parameter setting and strategies to account for size differences and outliers, are described in detail. The anatomical mean shape of 20 aortic arches post-aortic coarctation repair (CoA) was computed based on surface models reconstructed from CMR data. By analysing transformations that deform the mean shape towards each of the individual patient’s anatomy, shape patterns related to differences in body surface area (BSA) and ejection fraction (EF) were extracted. The resulting shape vectors, describing shape features in 3D, were compared with traditionally measured 2D and 3D morphometric parameters. The computed 3D mean shape was close to population mean values of geometric shape descriptors and visually integrated characteristic shape features associated with our population of CoA shapes. After removing size effects due to differences in body surface area (BSA) between patients, distinct 3D shape features of the aortic arch correlated significantly with EF (r = 0.521, p = .022) and were well in agreement with trends as shown by traditional shape descriptors. The suggested method has the potential to discover previously
Bruse, Jan L; McLeod, Kristin; Biglino, Giovanni; Ntsinjana, Hopewell N; Capelli, Claudio; Hsia, Tain-Yen; Sermesant, Maxime; Pennec, Xavier; Taylor, Andrew M; Schievano, Silvia
2016-05-31
Medical image analysis in clinical practice is commonly carried out on 2D image data, without fully exploiting the detailed 3D anatomical information that is provided by modern non-invasive medical imaging techniques. In this paper, a statistical shape analysis method is presented, which enables the extraction of 3D anatomical shape features from cardiovascular magnetic resonance (CMR) image data, with no need for manual landmarking. The method was applied to repaired aortic coarctation arches that present complex shapes, with the aim of capturing shape features as biomarkers of potential functional relevance. The method is presented from the user-perspective and is evaluated by comparing results with traditional morphometric measurements. Steps required to set up the statistical shape modelling analyses, from pre-processing of the CMR images to parameter setting and strategies to account for size differences and outliers, are described in detail. The anatomical mean shape of 20 aortic arches post-aortic coarctation repair (CoA) was computed based on surface models reconstructed from CMR data. By analysing transformations that deform the mean shape towards each of the individual patient's anatomy, shape patterns related to differences in body surface area (BSA) and ejection fraction (EF) were extracted. The resulting shape vectors, describing shape features in 3D, were compared with traditionally measured 2D and 3D morphometric parameters. The computed 3D mean shape was close to population mean values of geometric shape descriptors and visually integrated characteristic shape features associated with our population of CoA shapes. After removing size effects due to differences in body surface area (BSA) between patients, distinct 3D shape features of the aortic arch correlated significantly with EF (r = 0.521, p = .022) and were well in agreement with trends as shown by traditional shape descriptors. The suggested method has the potential to discover
Statistical analysis of medical data using SAS
Der, Geoff
2005-01-01
An Introduction to SASDescribing and Summarizing DataBasic InferenceScatterplots Correlation: Simple Regression and SmoothingAnalysis of Variance and CovarianceMultiple RegressionLogistic RegressionThe Generalized Linear ModelGeneralized Additive ModelsNonlinear Regression ModelsThe Analysis of Longitudinal Data IThe Analysis of Longitudinal Data II: Models for Normal Response VariablesThe Analysis of Longitudinal Data III: Non-Normal ResponseSurvival AnalysisAnalysis Multivariate Date: Principal Components and Cluster AnalysisReferences
Fundamentals of statistical experimental design and analysis
Easterling, Robert G
2015-01-01
Professionals in all areas - business; government; the physical, life, and social sciences; engineering; medicine, etc. - benefit from using statistical experimental design to better understand their worlds and then use that understanding to improve the products, processes, and programs they are responsible for. This book aims to provide the practitioners of tomorrow with a memorable, easy to read, engaging guide to statistics and experimental design. This book uses examples, drawn from a variety of established texts, and embeds them in a business or scientific context, seasoned with a dash of humor, to emphasize the issues and ideas that led to the experiment and the what-do-we-do-next? steps after the experiment. Graphical data displays are emphasized as means of discovery and communication and formulas are minimized, with a focus on interpreting the results that software produce. The role of subject-matter knowledge, and passion, is also illustrated. The examples do not require specialized knowledge, and t...
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2014-11-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason maybe that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1. P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. 2. Overemphasis on P values rather than on the actual size of the observed effect. 3. Overuse of statistical hypothesis testing, and being seduced by the word "significant". 4. Overreliance on standard errors, which are often misunderstood.
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2015-02-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word "significant". (4) Overreliance on standard errors, which are often misunderstood.
Statistical analysis of radioactivity in the environment
International Nuclear Information System (INIS)
Barnes, M.G.; Giacomini, J.J.
1980-05-01
The pattern of radioactivity in surface soils of Area 5 of the Nevada Test Site is analyzed statistically by means of kriging. The 1962 event code-named Smallboy effected the greatest proportion of the area sampled, but some of the area was also affected by a number of other events. The data for this study were collected on a regular grid to take advantage of the efficiency of grid sampling
Critical analysis of adsorption data statistically
Kaushal, Achla; Singh, S. K.
2017-10-01
Experimental data can be presented, computed, and critically analysed in a different way using statistics. A variety of statistical tests are used to make decisions about the significance and validity of the experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data was analysed statistically by hypothesis testing applying t test, paired t test and Chi-square test to (a) test the optimum value of the process pH, (b) verify the success of experiment and (c) study the effect of adsorbent dose in zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ 2 showed the results in favour of the data collected from the experiment and this has been shown on probability charts. K value for Langmuir isotherm was 0.8582 and m value for Freundlich adsorption isotherm obtained was 0.725, both are mango leaf powder.
Directory of Open Access Journals (Sweden)
Priya Ranganathan
2015-01-01
Full Text Available In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ′P′ value, explain the importance of ′confidence intervals′ and clarify the importance of including both values in a paper
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ‘P’ value, explain the importance of ‘confidence intervals’ and clarify the importance of including both values in a paper PMID:25878958
Tuuli, Methodius G; Odibo, Anthony O
2011-08-01
The objective of this article is to discuss the rationale for common statistical tests used for the analysis and interpretation of prenatal diagnostic imaging studies. Examples from the literature are used to illustrate descriptive and inferential statistics. The uses and limitations of linear and logistic regression analyses are discussed in detail.
Biomarker Identification and Pathway Analysis by Serum Metabolomics of Lung Cancer
Directory of Open Access Journals (Sweden)
Yingrong Chen
2015-01-01
Full Text Available Lung cancer is one of the most common causes of cancer death, for which no validated tumor biomarker is sufficiently accurate to be useful for diagnosis. Additionally, the metabolic alterations associated with the disease are unclear. In this study, we investigated the construction, interaction, and pathways of potential lung cancer biomarkers using metabolomics pathway analysis based on the Kyoto Encyclopedia of Genes and Genomes database and the Human Metabolome Database to identify the top altered pathways for analysis and visualization. We constructed a diagnostic model using potential serum biomarkers from patients with lung cancer. We assessed their specificity and sensitivity according to the area under the curve of the receiver operator characteristic (ROC curves, which could be used to distinguish patients with lung cancer from normal subjects. The pathway analysis indicated that sphingolipid metabolism was the top altered pathway in lung cancer. ROC curve analysis indicated that glycerophospho-N-arachidonoyl ethanolamine (GpAEA and sphingosine were potential sensitive and specific biomarkers for lung cancer diagnosis and prognosis. Compared with the traditional lung cancer diagnostic biomarkers carcinoembryonic antigen and cytokeratin 19 fragment, GpAEA and sphingosine were as good or more appropriate for detecting lung cancer. We report our identification of potential metabolic diagnostic and prognostic biomarkers of lung cancer and clarify the metabolic alterations in lung cancer.
Statistical learning methods in high-energy and astrophysics analysis
Energy Technology Data Exchange (ETDEWEB)
Zimmermann, J. [Forschungszentrum Juelich GmbH, Zentrallabor fuer Elektronik, 52425 Juelich (Germany) and Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)]. E-mail: zimmerm@mppmu.mpg.de; Kiesling, C. [Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)
2004-11-21
We discuss several popular statistical learning methods used in high-energy- and astro-physics analysis. After a short motivation for statistical learning we present the most popular algorithms and discuss several examples from current research in particle- and astro-physics. The statistical learning methods are compared with each other and with standard methods for the respective application.
Statistical learning methods in high-energy and astrophysics analysis
International Nuclear Information System (INIS)
Zimmermann, J.; Kiesling, C.
2004-01-01
We discuss several popular statistical learning methods used in high-energy- and astro-physics analysis. After a short motivation for statistical learning we present the most popular algorithms and discuss several examples from current research in particle- and astro-physics. The statistical learning methods are compared with each other and with standard methods for the respective application
Statistical analysis of partial reduced width distributions
International Nuclear Information System (INIS)
Tran Quoc Thuong.
1973-01-01
The aim of this study was to develop rigorous methods for analysing experimental event distributions according to a law in chi 2 and to check if the number of degrees of freedom ν is compatible with the value 1 for the reduced neutron width distribution. Two statistical methods were used (the maximum-likelihood method and the method of moments); it was shown, in a few particular cases, that ν is compatible with 1. The difference between ν and 1, if it exists, should not exceed 3%. These results confirm the validity of the compound nucleus model [fr
Statistical analysis of random duration times
International Nuclear Information System (INIS)
Engelhardt, M.E.
1996-04-01
This report presents basic statistical methods for analyzing data obtained by observing random time durations. It gives nonparametric estimates of the cumulative distribution function, reliability function and cumulative hazard function. These results can be applied with either complete or censored data. Several models which are commonly used with time data are discussed, and methods for model checking and goodness-of-fit tests are discussed. Maximum likelihood estimates and confidence limits are given for the various models considered. Some results for situations where repeated durations such as repairable systems are also discussed
Statistical analysis of random pulse trains
International Nuclear Information System (INIS)
Da Costa, G.
1977-02-01
Some experimental and theoretical results concerning the statistical properties of optical beams formed by a finite number of independent pulses are presented. The considered waves (corresponding to each pulse) present important spatial variations of the illumination distribution in a cross-section of the beam, due to the time-varying random refractive index distribution in the active medium. Some examples of this kind of emission are: (a) Free-running ruby laser emission; (b) Mode-locked pulse trains; (c) Randomly excited nonlinear media
Statistical analysis of dragline monitoring data
Energy Technology Data Exchange (ETDEWEB)
Mirabediny, H.; Baafi, E.Y. [University of Tehran, Tehran (Iran)
1998-07-01
Dragline monitoring systems are normally the best tool used to collect data on the machine performance and operational parameters of a dragline operation. This paper discusses results of a time study using data from a dragline monitoring system captured over a four month period. Statistical summaries of the time study in terms of average values, standard deviation and frequency distributions showed that the mode of operation and the geological conditions have a significant influence on the dragline performance parameters. 6 refs., 14 figs., 3 tabs.
Comparing Visual and Statistical Analysis of Multiple Baseline Design Graphs.
Wolfe, Katie; Dickenson, Tammiee S; Miller, Bridget; McGrath, Kathleen V
2018-04-01
A growing number of statistical analyses are being developed for single-case research. One important factor in evaluating these methods is the extent to which each corresponds to visual analysis. Few studies have compared statistical and visual analysis, and information about more recently developed statistics is scarce. Therefore, our purpose was to evaluate the agreement between visual analysis and four statistical analyses: improvement rate difference (IRD); Tau-U; Hedges, Pustejovsky, Shadish (HPS) effect size; and between-case standardized mean difference (BC-SMD). Results indicate that IRD and BC-SMD had the strongest overall agreement with visual analysis. Although Tau-U had strong agreement with visual analysis on raw values, it had poorer agreement when those values were dichotomized to represent the presence or absence of a functional relation. Overall, visual analysis appeared to be more conservative than statistical analysis, but further research is needed to evaluate the nature of these disagreements.
Grigsby, Claude C.; Zmuda, Michael A.; Boone, Derek W.; Highlander, Tyler C.; Kramer, Ryan M.; Rizki, Mateen M.
2012-06-01
A growing body of discoveries in molecular signatures has revealed that volatile organic compounds (VOCs), the small molecules associated with an individual's odor and breath, can be monitored to reveal the identity and presence of a unique individual, as well their overall physiological status. Given the analysis requirements for differential VOC profiling via gas chromatography/mass spectrometry, our group has developed a novel informatics platform, Metabolite Differentiation and Discovery Lab (MeDDL). In its current version, MeDDL is a comprehensive tool for time-series spectral registration and alignment, visualization, comparative analysis, and machine learning to facilitate the efficient analysis of multiple, large-scale biomarker discovery studies. The MeDDL toolset can therefore identify a large differential subset of registered peaks, where their corresponding intensities can be used as features for classification. This initial screening of peaks yields results sets that are typically too large for incorporation into a portable, electronic nose based system in addition to including VOCs that are not amenable to classification; consequently, it is also important to identify an optimal subset of these peaks to increase classification accuracy and to decrease the cost of the final system. MeDDL's learning tools include a classifier similar to a K-nearest neighbor classifier used in conjunction with a genetic algorithm (GA) that simultaneously optimizes the classifier and subset of features. The GA uses ROC curves to produce classifiers having maximal area under their ROC curve. Experimental results on over a dozen recognition problems show many examples of classifiers and feature sets that produce perfect ROC curves.
CONFIDENCE LEVELS AND/VS. STATISTICAL HYPOTHESIS TESTING IN STATISTICAL ANALYSIS. CASE STUDY
Directory of Open Access Journals (Sweden)
ILEANA BRUDIU
2009-05-01
Full Text Available Estimated parameters with confidence intervals and testing statistical assumptions used in statistical analysis to obtain conclusions on research from a sample extracted from the population. Paper to the case study presented aims to highlight the importance of volume of sample taken in the study and how this reflects on the results obtained when using confidence intervals and testing for pregnant. If statistical testing hypotheses not only give an answer "yes" or "no" to some questions of statistical estimation using statistical confidence intervals provides more information than a test statistic, show high degree of uncertainty arising from small samples and findings build in the "marginally significant" or "almost significant (p very close to 0.05.
NanoSIMS analysis of Archean fossils and biomarkers
International Nuclear Information System (INIS)
Kilburn, M.R.; Wacey, D.
2008-01-01
The study of fossils and biomarkers from Archean rocks is of vital importance to reveal how life arose on Earth and what we might expect to find on other planets such as Mars. The Cameca NanoSIMS 50 has the unique ability to measure stable isotopes and map biologically relevant elements at the micron-scale, in situ. This makes it the perfect tool for testing the biogenicity of a range of putative biomarkers from early Archean rocks (∼3.50 billion-year-old). NanoSIMS has been used to investigate ambient inclusion trails (AITs) in a 3.43 Ga beach sand deposit from the Pilbara craton, Western Australia. Chemical maps of the light elements necessary for life (C, N and O) and several transition metals commonly associated with biological processing (Ni, Zn and Fe), coupled with 13 C/ 12 C isotope ratios from carbonaceous linings, strongly suggest a biological component in the formation of AITs
Ceppi, Marcello; Gallo, Fabio; Bonassi, Stefano
2011-01-01
The most common study design performed in population studies based on the micronucleus (MN) assay, is the cross-sectional study, which is largely performed to evaluate the DNA damaging effects of exposure to genotoxic agents in the workplace, in the environment, as well as from diet or lifestyle factors. Sample size is still a critical issue in the design of MN studies since most recent studies considering gene-environment interaction, often require a sample size of several hundred subjects, which is in many cases difficult to achieve. The control of confounding is another major threat to the validity of causal inference. The most popular confounders considered in population studies using MN are age, gender and smoking habit. Extensive attention is given to the assessment of effect modification, given the increasing inclusion of biomarkers of genetic susceptibility in the study design. Selected issues concerning the statistical treatment of data have been addressed in this mini-review, starting from data description, which is a critical step of statistical analysis, since it allows to detect possible errors in the dataset to be analysed and to check the validity of assumptions required for more complex analyses. Basic issues dealing with statistical analysis of biomarkers are extensively evaluated, including methods to explore the dose-response relationship among two continuous variables and inferential analysis. A critical approach to the use of parametric and non-parametric methods is presented, before addressing the issue of most suitable multivariate models to fit MN data. In the last decade, the quality of statistical analysis of MN data has certainly evolved, although even nowadays only a small number of studies apply the Poisson model, which is the most suitable method for the analysis of MN data.
Occupational Exposure to HDI: Progress and Challenges in Biomarker Analysis
Flack, Sheila L.; Ball, Louise M.; Nylander-French, Leena A.
2010-01-01
1,6-hexamethylene diisocyanate (HDI) is extensively used in the automotive repair industry and is a commonly reported cause of occupational asthma in industrialized populations. However, the exact pathological mechanism remains uncertain. Characterization and quantification of biomarkers resulting from HDI exposure can fill important knowledge gaps between exposure, susceptibility, and the rise of immunological reactions and sensitization leading to asthma. Here, we discuss existing challenge...
Statistical models for competing risk analysis
International Nuclear Information System (INIS)
Sather, H.N.
1976-08-01
Research results on three new models for potential applications in competing risks problems. One section covers the basic statistical relationships underlying the subsequent competing risks model development. Another discusses the problem of comparing cause-specific risk structure by competing risks theory in two homogeneous populations, P1 and P2. Weibull models which allow more generality than the Berkson and Elveback models are studied for the effect of time on the hazard function. The use of concomitant information for modeling single-risk survival is extended to the multiple failure mode domain of competing risks. The model used to illustrate the use of this methodology is a life table model which has constant hazards within pre-designated intervals of the time scale. Two parametric models for bivariate dependent competing risks, which provide interesting alternatives, are proposed and examined
Statistical analysis of the ASME KIc database
International Nuclear Information System (INIS)
Sokolov, M.A.
1998-01-01
The American Society of Mechanical Engineers (ASME) K Ic curve is a function of test temperature (T) normalized to a reference nil-ductility temperature, RT NDT , namely, T-RT NDT . It was constructed as the lower boundary to the available K Ic database. Being a lower bound to the unique but limited database, the ASME K Ic curve concept does not discuss probability matters. However, a continuing evolution of fracture mechanics advances has led to employment of the Weibull distribution function to model the scatter of fracture toughness values in the transition range. The Weibull statistic/master curve approach was applied to analyze the current ASME K Ic database. It is shown that the Weibull distribution function models the scatter in K Ic data from different materials very well, while the temperature dependence is described by the master curve. Probabilistic-based tolerance-bound curves are suggested to describe lower-bound K Ic values
Statistical analysis of earthquake ground motion parameters
International Nuclear Information System (INIS)
1979-12-01
Several earthquake ground response parameters that define the strength, duration, and frequency content of the motions are investigated using regression analyses techniques; these techniques incorporate statistical significance testing to establish the terms in the regression equations. The parameters investigated are the peak acceleration, velocity, and displacement; Arias intensity; spectrum intensity; bracketed duration; Trifunac-Brady duration; and response spectral amplitudes. The study provides insight into how these parameters are affected by magnitude, epicentral distance, local site conditions, direction of motion (i.e., whether horizontal or vertical), and earthquake event type. The results are presented in a form so as to facilitate their use in the development of seismic input criteria for nuclear plants and other major structures. They are also compared with results from prior investigations that have been used in the past in the criteria development for such facilities
Statistical power analysis for the behavioral sciences
National Research Council Canada - National Science Library
Cohen, Jacob
1988-01-01
.... A chapter has been added for power analysis in set correlation and multivariate methods (Chapter 10). Set correlation is a realization of the multivariate general linear model, and incorporates the standard multivariate methods...
Statistical power analysis for the behavioral sciences
National Research Council Canada - National Science Library
Cohen, Jacob
1988-01-01
... offers a unifying framework and some new data-analytic possibilities. 2. A new chapter (Chapter 11) considers some general topics in power analysis in more integrted form than is possible in the earlier...
Statistical methods for categorical data analysis
Powers, Daniel
2008-01-01
This book provides a comprehensive introduction to methods and models for categorical data analysis and their applications in social science research. Companion website also available, at https://webspace.utexas.edu/dpowers/www/
Statistical Modelling of Wind Proles - Data Analysis and Modelling
DEFF Research Database (Denmark)
Jónsson, Tryggvi; Pinson, Pierre
The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.......The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles....
Sensitivity analysis of ranked data: from order statistics to quantiles
Heidergott, B.F.; Volk-Makarewicz, W.
2015-01-01
In this paper we provide the mathematical theory for sensitivity analysis of order statistics of continuous random variables, where the sensitivity is with respect to a distributional parameter. Sensitivity analysis of order statistics over a finite number of observations is discussed before
Hayslett, H T
1991-01-01
Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
Statistical analysis of disruptions in JET
International Nuclear Information System (INIS)
De Vries, P.C.; Johnson, M.F.; Segui, I.
2009-01-01
The disruption rate (the percentage of discharges that disrupt) in JET was found to drop steadily over the years. Recent campaigns (2005-2007) show a yearly averaged disruption rate of only 6% while from 1991 to 1995 this was often higher than 20%. Besides the disruption rate, the so-called disruptivity, or the likelihood of a disruption depending on the plasma parameters, has been determined. The disruptivity of plasmas was found to be significantly higher close to the three main operational boundaries for tokamaks; the low-q, high density and β-limit. The frequency at which JET operated close to the density-limit increased six fold over the last decade; however, only a small reduction in disruptivity was found. Similarly the disruptivity close to the low-q and β-limit was found to be unchanged. The most significant reduction in disruptivity was found far from the operational boundaries, leading to the conclusion that the improved disruption rate is due to a better technical capability of operating JET, instead of safer operations close to the physics limits. The statistics showed that a simple protection system was able to mitigate the forces of a large fraction of disruptions, although it has proved to be at present more difficult to ameliorate the heat flux.
The Statistical Analysis of Failure Time Data
Kalbfleisch, John D
2011-01-01
Contains additional discussion and examples on left truncation as well as material on more general censoring and truncation patterns.Introduces the martingale and counting process formulation swil lbe in a new chapter.Develops multivariate failure time data in a separate chapter and extends the material on Markov and semi Markov formulations.Presents new examples and applications of data analysis.
An analysis of UK wind farm statistics
International Nuclear Information System (INIS)
Milborrow, D.J.
1995-01-01
An analysis of key data for 22 completed wind projects shows 134 MW of plant cost Pound 152 million, giving an average cost of Pound 1136/kW. The energy generation potential of these windfarms is around 360 GWh, derived from sites with windspeeds between 6.2 and 8.8 m/s. Relationships between wind speed, energy production and cost were examined and it was found that costs increased with wind speed, due to the difficulties of access in hilly regions. It also appears that project costs fell with time and wind energy prices have fallen much faster than electricity prices. (Author)
A statistical analysis of UK financial networks
Chu, J.; Nadarajah, S.
2017-04-01
In recent years, with a growing interest in big or large datasets, there has been a rise in the application of large graphs and networks to financial big data. Much of this research has focused on the construction and analysis of the network structure of stock markets, based on the relationships between stock prices. Motivated by Boginski et al. (2005), who studied the characteristics of a network structure of the US stock market, we construct network graphs of the UK stock market using same method. We fit four distributions to the degree density of the vertices from these graphs, the Pareto I, Fréchet, lognormal, and generalised Pareto distributions, and assess the goodness of fit. Our results show that the degree density of the complements of the market graphs, constructed using a negative threshold value close to zero, can be fitted well with the Fréchet and lognormal distributions.
Phase II cancer clinical trials for biomarker-guided treatments.
Jung, Sin-Ho
2018-01-01
The design and analysis of cancer clinical trials with biomarker depend on various factors, such as the phase of trials, the type of biomarker, whether the used biomarker is validated or not, and the study objectives. In this article, we demonstrate the design and analysis of two Phase II cancer clinical trials, one with a predictive biomarker and the other with an imaging prognostic biomarker. Statistical testing methods and their sample size calculation methods are presented for each trial. We assume that the primary endpoint of these trials is a time to event variable, but this concept can be used for any type of endpoint.
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Comparative analysis of positive and negative attitudes toward statistics
Ghulami, Hassan Rahnaward; Ab Hamid, Mohd Rashid; Zakaria, Roslinazairimah
2015-02-01
Many statistics lecturers and statistics education researchers are interested to know the perception of their students' attitudes toward statistics during the statistics course. In statistics course, positive attitude toward statistics is a vital because it will be encourage students to get interested in the statistics course and in order to master the core content of the subject matters under study. Although, students who have negative attitudes toward statistics they will feel depressed especially in the given group assignment, at risk for failure, are often highly emotional, and could not move forward. Therefore, this study investigates the students' attitude towards learning statistics. Six latent constructs have been the measurement of students' attitudes toward learning statistic such as affect, cognitive competence, value, difficulty, interest, and effort. The questionnaire was adopted and adapted from the reliable and validate instrument of Survey of Attitudes towards Statistics (SATS). This study is conducted among engineering undergraduate engineering students in the university Malaysia Pahang (UMP). The respondents consist of students who were taking the applied statistics course from different faculties. From the analysis, it is found that the questionnaire is acceptable and the relationships among the constructs has been proposed and investigated. In this case, students show full effort to master the statistics course, feel statistics course enjoyable, have confidence that they have intellectual capacity, and they have more positive attitudes then negative attitudes towards statistics learning. In conclusion in terms of affect, cognitive competence, value, interest and effort construct the positive attitude towards statistics was mostly exhibited. While negative attitudes mostly exhibited by difficulty construct.
CORSSA: The Community Online Resource for Statistical Seismicity Analysis
Michael, Andrew J.; Wiemer, Stefan
2010-01-01
Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools make it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORRSA. It also describes its structure and contents.
Cui, Jiajia; Liu, Yuetao; Hu, Yinghuan; Tong, Jiayu; Li, Aiping; Qu, Tingli; Qin, Xuemei; Du, Guanhua
2017-01-05
Chronic atrophic gastritis (CAG) is one of the most important pre-cancerous states with a high prevalence. Exploring of the underlying mechanism and potential biomarkers is of significant importance for CAG. In the present work, 1 H NMR-based metabonomics with correlative analysis was performed to analyze the metabolic features of CAG. 19 plasma metabolites and 18 urine metabolites were enrolled to construct the circulatory and excretory metabolome of CAG, which was in response to alterations of energy metabolism, inflammation, immune dysfunction, as well as oxidative stress. 7 plasma biomarkers and 7 urine biomarkers were screened to elucidate the pathogenesis of CAG based on the further correlation analysis with biochemical indexes. Finally, 3 plasma biomarkers (arginine, succinate and 3-hydroxybutyrate) and 2 urine biomarkers (α-ketoglutarate and valine) highlighted the potential to indicate risks of CAG in virtue of correlation with pepsin activity and ROC analysis. Here, our results paved a way for elucidating the underlying mechanisms in the development of CAG, and provided new avenues for the diagnosis of CAG and presented potential drug targets for treatment of CAG. Copyright © 2016 Elsevier B.V. All rights reserved.
Method for statistical data analysis of multivariate observations
Gnanadesikan, R
1997-01-01
A practical guide for multivariate statistical techniques-- now updated and revised In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical analmodelling) are among statistical methods which are frequently needed in analyzing quantitative data, especially to model relationship between response and explanatory variables. Nowadays, statistical models have been developed into various directions to model various type and complex relationship of data. Rich varieties of advanced and recent statistical modelling are mostly available on open source software (one of them is R). However, these advanced statistical modelling, are not very friendly to novice R users, since they are based on programming script or command line interface. Our research aims to developed web interface (based on R and shiny), so that most recent and advanced statistical modelling are readily available, accessible and applicable on web. We have previously made interface in the form of e-tutorial for several modern and advanced statistical modelling on R especially for independent responses (including linear models/LM, generalized linier models/GLM, generalized additive model/GAM and generalized additive model for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including model using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/ MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web (interface) make the statistical modeling becomes easier to apply and easier to compare them in order to find the most appropriate model for the data.
Statistical Analysis and Modelling of Olkiluoto Structures
International Nuclear Information System (INIS)
Hellae, P.; Vaittinen, T.; Saksa, P.; Nummela, J.
2004-11-01
Posiva Oy is carrying out investigations for the disposal of the spent nuclear fuel at the Olkiluoto site in SW Finland. The investigations have focused on the central part of the island. The layout design of the entire repository requires characterization of notably larger areas and must rely at least at the current stage on borehole information from a rather sparse network and on the geophysical soundings providing information outside and between the holes. In this work, the structural data according to the current version of the Olkiluoto bedrock model is analyzed. The bedrock model relies much on the borehole data although results of the seismic surveys and, for example, pumping tests are used in determining the orientation and continuation of the structures. Especially in the analysis, questions related to the frequency of structures and size of the structures are discussed. The structures observed in the boreholes are mainly dipping gently to the southeast. About 9 % of the sample length belongs to structures. The proportion is higher in the upper parts of the rock. The number of fracture and crushed zones seems not to depend greatly on the depth, whereas the hydraulic features concentrate on the depth range above -100 m. Below level -300 m, the hydraulic conductivity occurs in connection of fractured zones. Especially the hydraulic features, but also fracture and crushed zones often occur in groups. The frequency of the structure (area of structures per total volume) is estimated to be of the order of 1/100m. The size of the local structures was estimated by calculating the intersection of the zone to the nearest borehole where the zone has not been detected. Stochastic models using the Fracman software by Golder Associates were generated based on the bedrock model data complemented with the magnetic ground survey data. The seismic surveys (from boreholes KR5, KR13, KR14, and KR19) were used as alternative input data. The generated models were tested by
Explorations in Statistics: The Analysis of Ratios and Normalized Data
Curran-Everett, Douglas
2013-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of "Explorations in Statistics" explores the analysis of ratios and normalized--or standardized--data. As researchers, we compute a ratio--a numerator divided by a denominator--to compute a…
Analysis of thrips distribution: application of spatial statistics and Kriging
John Aleong; Bruce L. Parker; Margaret Skinner; Diantha Howard
1991-01-01
Kriging is a statistical technique that provides predictions for spatially and temporally correlated data. Observations of thrips distribution and density in Vermont soils are made in both space and time. Traditional statistical analysis of such data assumes that the counts taken over space and time are independent, which is not necessarily true. Therefore, to analyze...
Statistical methods for the analysis of high-throughput metabolomics data
Directory of Open Access Journals (Sweden)
Fabian J. Theis
2013-01-01
Full Text Available Metabolomics is a relatively new high-throughput technology that aims at measuring all endogenous metabolites within a biological sample in an unbiased fashion. The resulting metabolic profiles may be regarded as functional signatures of the physiological state, and have been shown to comprise effects of genetic regulation as well as environmental factors. This potential to connect genotypic to phenotypic information promises new insights and biomarkers for different research fields, including biomedical and pharmaceutical research. In the statistical analysis of metabolomics data, many techniques from other omics fields can be reused. However recently, a number of tools specific for metabolomics data have been developed as well. The focus of this mini review will be on recent advancements in the analysis of metabolomics data especially by utilizing Gaussian graphical models and independent component analysis.
International Nuclear Information System (INIS)
2005-01-01
For the years 2004 and 2005 the figures shown in the tables of Energy Review are partly preliminary. The annual statistics published in Energy Review are presented in more detail in a publication called Energy Statistics that comes out yearly. Energy Statistics also includes historical time-series over a longer period of time (see e.g. Energy Statistics, Statistics Finland, Helsinki 2004.) The applied energy units and conversion coefficients are shown in the back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes, precautionary stock fees and oil pollution fees
Statistical analysis of the count and profitability of air conditioners.
Rady, El Houssainy A; Mohamed, Salah M; Abd Elmegaly, Alaa A
2018-08-01
This article presents the statistical analysis of the number and profitability of air conditioners in an Egyptian company. Checking the same distribution for each categorical variable has been made using Kruskal-Wallis test.
Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance
Glascock, M. D.; Neff, H.; Vaughn, K. J.
2004-06-01
The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.
Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance
International Nuclear Information System (INIS)
Glascock, M. D.; Neff, H.; Vaughn, K. J.
2004-01-01
The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.
Application of Ontology Technology in Health Statistic Data Analysis.
Guo, Minjiang; Hu, Hongpu; Lei, Xingyun
2017-01-01
Research Purpose: establish health management ontology for analysis of health statistic data. Proposed Methods: this paper established health management ontology based on the analysis of the concepts in China Health Statistics Yearbook, and used protégé to define the syntactic and semantic structure of health statistical data. six classes of top-level ontology concepts and their subclasses had been extracted and the object properties and data properties were defined to establish the construction of these classes. By ontology instantiation, we can integrate multi-source heterogeneous data and enable administrators to have an overall understanding and analysis of the health statistic data. ontology technology provides a comprehensive and unified information integration structure of the health management domain and lays a foundation for the efficient analysis of multi-source and heterogeneous health system management data and enhancement of the management efficiency.
Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance
Energy Technology Data Exchange (ETDEWEB)
Glascock, M. D.; Neff, H. [University of Missouri, Research Reactor Center (United States); Vaughn, K. J. [Pacific Lutheran University, Department of Anthropology (United States)
2004-06-15
The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.
International Nuclear Information System (INIS)
2001-01-01
For the year 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions from the use of fossil fuels, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in 2000, Energy exports by recipient country in 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
2000-01-01
For the year 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g., Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-March 2000, Energy exports by recipient country in January-March 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
1999-01-01
For the year 1998 and the year 1999, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-June 1999, Energy exports by recipient country in January-June 1999, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers
Keiffer, Greggory L.; Lane, Forrest C.
2016-01-01
Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…
Advanced data analysis in neuroscience integrating statistical and computational models
Durstewitz, Daniel
2017-01-01
This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanat ory frameworks, but become powerfu...
Agah, Elmira; Zardoui, Arshia; Saghazadeh, Amene; Ahmadi, Mona; Tafakhori, Abbas; Rezaei, Nima
2018-01-01
Identifying a reliable biomarker may accelerate diagnosis of multiple sclerosis (MS) and lead to early management of the disease. Accumulating evidence suggest that cerebrospinal fluid (CSF) and peripheral blood concentration of osteopontin (OPN) may have diagnostic and prognostic value in MS. We conducted a systematic review and meta-analysis of studies that measured peripheral blood and CSF levels of OPN in MS patients and controls to evaluate the diagnostic potential of this biomarker better. We searched PubMed, Web of Science and Scopus databases to find articles that measured OPN concentration in peripheral blood and CSF samples from MS patients up to October 19, 2016. Q statistic tests and the I2 index were applied for heterogeneity assessment. If the I2 index was less than 40%, the fixed-effects model was used for meta-analysis. Random-effects meta-analysis was chosen if the I2 value was greater than 40%. After removal of duplicates, 918 articles were identified, and 27 of them fulfilled the inclusion criteria. We included 22 eligible studies in the final meta-analysis. MS patients, in general, had considerably higher levels of OPN in their CSF and blood when compared to all types of controls (pCSF of MS subgroups (pCSF concentrations of OPN between MS patient subtypes. CIS patients had significantly lower levels of OPN both in their peripheral blood and CSF compared to patients with progressive subtypes of MS (pCSF concentration of OPN was significantly higher among RRMS patients compared to the CIS patients and SPMS patients (PCSF compared to patients with stable disease (P = 0.007). The result of this study confirms that increased levels of OPN exist in CSF and peripheral blood of MS patients and strengthens the evidence regarding the clinical utility of OPN as a promising and validated biomarker for MS.
MULTIPLEX ANALYSIS OF BIOMARKERS OF NEOANGIOGENESIS AND INFLAMMATION IN HEART TRANSPLANT RECIPIENTS
Directory of Open Access Journals (Sweden)
O. P. Shevchenko
2015-01-01
Full Text Available Aim of study: multiplex analysis of the levels of biomarkers of neoangiogenesis and inflammation in cardiac transplant recipients. Materials and methods. 59 pts. with heart failure III–IV according to NYHA FC, waiting for a heart transplant, aged 22 to 73 years, 48 males and 11 females. 41 recipient (30 men and 11 women had dilated cardiomyopathy, 18 – coronary heart disease (CHD. The concentration of VEGF-A, VEGF-D, PlGF, PDGF-BB, FGF, sCD40L, MCP-1 was measured using xMAP technology, the sets of reagents Simplex ProcartaPlexTM (Affymetrix, USA. Results. There are four levels of seven biomarkers of neoangiogenesis and inflammation method for multiplex analysis in patients with heart failure. A year after transplantation, the mean levels of biomarkers VEGF-A (p = 0.001, PDGF-BB (p = 0.018, MCP-1 (p = 0.003 was significantly decreased, and the others had a tendency to decrease relative to the level before transplantation. It was shown individual differences of levels of VEGF-A, VEGF-D and PlGF before and after transplantation. There were found different dynamics of the concentrations of biomarkers and growth factors before and after heart transplantation in patients with cardiovascular complications and without them. Conclusion. Multiplex analysis allows to measure the concentration range of analyte biomarkers of neoangiogenesis, inflammation in one sample of blood serum of patients with severe heart failure and after transplantation. There are marked individual differences in the concentration of biomarkers in different clinical situations that may have clinical significance in the conduct and supervision of recipients after transplantation.
Huang, Erich P; Wang, Xiao-Feng; Choudhury, Kingshuk Roy; McShane, Lisa M; Gönen, Mithat; Ye, Jingjing; Buckler, Andrew J; Kinahan, Paul E; Reeves, Anthony P; Jackson, Edward F; Guimaraes, Alexander R; Zahlmann, Gudrun
2015-02-01
Medical imaging serves many roles in patient care and the drug approval process, including assessing treatment response and guiding treatment decisions. These roles often involve a quantitative imaging biomarker, an objectively measured characteristic of the underlying anatomic structure or biochemical process derived from medical images. Before a quantitative imaging biomarker is accepted for use in such roles, the imaging procedure to acquire it must undergo evaluation of its technical performance, which entails assessment of performance metrics such as repeatability and reproducibility of the quantitative imaging biomarker. Ideally, this evaluation will involve quantitative summaries of results from multiple studies to overcome limitations due to the typically small sample sizes of technical performance studies and/or to include a broader range of clinical settings and patient populations. This paper is a review of meta-analysis procedures for such an evaluation, including identification of suitable studies, statistical methodology to evaluate and summarize the performance metrics, and complete and transparent reporting of the results. This review addresses challenges typical of meta-analyses of technical performance, particularly small study sizes, which often causes violations of assumptions underlying standard meta-analysis techniques. Alternative approaches to address these difficulties are also presented; simulation studies indicate that they outperform standard techniques when some studies are small. The meta-analysis procedures presented are also applied to actual [18F]-fluorodeoxyglucose positron emission tomography (FDG-PET) test-retest repeatability data for illustrative purposes. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
International Nuclear Information System (INIS)
2003-01-01
For the year 2002, part of the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also includes historical time-series over a longer period (see e.g. Energiatilastot 2001, Statistics Finland, Helsinki 2002). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supply and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees on oil pollution fees on energy products
International Nuclear Information System (INIS)
2004-01-01
For the year 2003 and 2004, the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also includes historical time-series over a longer period (see e.g. Energiatilastot, Statistics Finland, Helsinki 2003, ISSN 0785-3165). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-March 2004, Energy exports by recipient country in January-March 2004, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees on oil pollution fees
International Nuclear Information System (INIS)
2000-01-01
For the year 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy also includes historical time series over a longer period (see e.g., Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-June 2000, Energy exports by recipient country in January-June 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
Basic statistical tools in research and data analysis
Directory of Open Access Journals (Sweden)
Zulfiqar Ali
2016-01-01
Full Text Available Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into a lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.
Numeric computation and statistical data analysis on the Java platform
Chekanov, Sergei V
2016-01-01
Numerical computation, knowledge discovery and statistical data analysis integrated with powerful 2D and 3D graphics for visualization are the key topics of this book. The Python code examples powered by the Java platform can easily be transformed to other programming languages, such as Java, Groovy, Ruby and BeanShell. This book equips the reader with a computational platform which, unlike other statistical programs, is not limited by a single programming language. The author focuses on practical programming aspects and covers a broad range of topics, from basic introduction to the Python language on the Java platform (Jython), to descriptive statistics, symbolic calculations, neural networks, non-linear regression analysis and many other data-mining topics. He discusses how to find regularities in real-world data, how to classify data, and how to process data for knowledge discoveries. The code snippets are so short that they easily fit into single pages. Numeric Computation and Statistical Data Analysis ...
Analysis of room transfer function and reverberant signal statistics
DEFF Research Database (Denmark)
Georganti, Eleftheria; Mourjopoulos, John; Jacobsen, Finn
2008-01-01
For some time now, statistical analysis has been a valuable tool in analyzing room transfer functions (RTFs). This work examines existing statistical time-frequency models and techniques for RTF analysis (e.g., Schroeder's stochastic model and the standard deviation over frequency bands for the RTF...... magnitude and phase). RTF fractional octave smoothing, as with 1-slash 3 octave analysis, may lead to RTF simplifications that can be useful for several audio applications, like room compensation, room modeling, auralisation purposes. The aim of this work is to identify the relationship of optimal response...... and the corresponding ratio of the direct and reverberant signal. In addition, this work examines the statistical quantities for speech and audio signals prior to their reproduction within rooms and when recorded in rooms. Histograms and other statistical distributions are used to compare RTF minima of typical...
Statistical analysis of dynamic parameters of the core
International Nuclear Information System (INIS)
Ionov, V.S.
2007-01-01
The transients of various types were investigated for the cores of zero power critical facilities in RRC KI and NPP. Dynamic parameters of neutron transients were explored by tool statistical analysis. Its have sufficient duration, few channels for currents of chambers and reactivity and also some channels for technological parameters. On these values the inverse period. reactivity, lifetime of neutrons, reactivity coefficients and some effects of a reactivity are determinate, and on the values were restored values of measured dynamic parameters as result of the analysis. The mathematical means of statistical analysis were used: approximation(A), filtration (F), rejection (R), estimation of parameters of descriptive statistic (DSP), correlation performances (kk), regression analysis(KP), the prognosis (P), statistician criteria (SC). The calculation procedures were realized by computer language MATLAB. The reasons of methodical and statistical errors are submitted: inadequacy of model operation, precision neutron-physical parameters, features of registered processes, used mathematical model in reactivity meters, technique of processing for registered data etc. Examples of results of statistical analysis. Problems of validity of the methods used for definition and certification of values of statistical parameters and dynamic characteristics are considered (Authors)
ANALYSIS OR THE POTENTIAL SPERM BIOMARKER, SP22, IN HUMAN SEMEN
ANALYSIS OF THE POTENTIAL SPERM BIOMARKER SP22 IN HUMAN SEMEN Rebecca A. Morris Ph.D.1, Gary R. Klinefelter Ph.D.1, Naomi L. Roberts 1, Juan D. Suarez 1, Lillian F. Strader 1, Susan C. Jeffay 1 and Sally D. Perreault Ph.D.1 1 U.S. EPA / ORD / National Health a...
Symptom Cluster Research With Biomarkers and Genetics Using Latent Class Analysis.
Conley, Samantha
2017-12-01
The purpose of this article is to provide an overview of latent class analysis (LCA) and examples from symptom cluster research that includes biomarkers and genetics. A review of LCA with genetics and biomarkers was conducted using Medline, Embase, PubMed, and Google Scholar. LCA is a robust latent variable model used to cluster categorical data and allows for the determination of empirically determined symptom clusters. Researchers should consider using LCA to link empirically determined symptom clusters to biomarkers and genetics to better understand the underlying etiology of symptom clusters. The full potential of LCA in symptom cluster research has not yet been realized because it has been used in limited populations, and researchers have explored limited biologic pathways.
Proteomics analysis after traumatic brain injury in rats: the search for potential biomarkers
Directory of Open Access Journals (Sweden)
Jun Ding
2015-04-01
Full Text Available Many studies of protein expression after traumatic brain injury (TBI have identified biomarkers for diagnosing or determining the prognosis of TBI. In this study, we searched for additional protein markers of TBI using a fluid perfusion impact device to model TBI in S-D rats. Two-dimensional gel electrophoresis and mass spectrometry were used to identify differentially expressed proteins. After proteomic analysis, we detected 405 and 371 protein spots within a pH range of 3-10 from sham-treated and contused brain cortex, respectively. Eighty protein spots were differentially expressed in the two groups and 20 of these proteins were identified. This study validated the established biomarkers of TBI and identified potential biomarkers that could be examined in future work.
Directory of Open Access Journals (Sweden)
Regier N.
2013-04-01
Full Text Available Recently developed genomics tools have a promising potential to identify early biomarkers of exposure to toxicants. In the present work we used transcriptome analysis (RNA-seq of Elodea nuttallii –an invasive rooted macrophyte that is able to accumulate large amounts of metals- to identify biomarkers of Hg exposure. RNA-seq allowed identification of genes affected by Hg exposure and also unraveled plant response to the toxic metal: a change in energy/reserve metabolism caused by the inhibition of photosynthesis, and an adaptation of homeostasis networks to control accumulation of Hg. Data were validated by RT-qPCR and selected genes were further tested as biomarkers. Samples exposed in the field and to natural contaminated sediments clustered well with samples exposed to low metal concentrations under laboratory conditions. Our data suggest that this plant and/or this approach could be useful to develop new tests for water and sediment quality assessment.
Chen, Tung-Sheng; Chang, Mu-Hsin; Kuo, Wei-Wen; Lin, Yueh-Min; Yeh, Yu-Lan; Day, Cecilia Hsuan; Lin, Chien-Chung; Tsai, Fuu-Jen; Tsai, Chang-Hai; Huang, Chih-Yang
2013-04-01
Statistical and clinical reports indicate that betel nut chewing is strongly associated with progression of oral cancer because some ingredients in betel nuts are potential cancer promoters, especially arecoline. Early diagnosis for cancer biomarkers is the best strategy for prevention of cancer progression. Several methods are suggested for investigating cancer biomarkers. Among these methods, gel-based proteomics approach is the most powerful and recommended tool for investigating biomarkers due to its high-throughput. However, this proteomics approach is not suitable for screening biomarkers with molecular weight under 10 KDa because of the characteristics of gel electrophoresis. This study investigated biomarkers with molecular weight under 10 KDa in rats with arecoline challenge. The centrifuging vials with membrane (10 KDa molecular weight cut-off) played a crucial role in this study. After centrifuging, the filtrate (containing compounds with molecular weight under 10 KDa) was collected and spotted on a sample plate for MALDI-TOF mass spectrometry analysis. Compared to control, three extra peaks (m/z values were 1553.1611, 1668.2097 and 1740.1832, respectively) were found in sera and two extra peaks were found in heart tissue samples (408.9719 and 524.9961, respectively). These small compounds should play important roles and may be potential biomarker candidates in rats with arecoline. This study successfully reports a mass-based method for investigating biomarker candidates with small molecular weight in different types of sample (including serum and tissue). In addition, this reported method is more time-efficient (1 working day) than gel-based proteomics approach (5~7 working days).
Simulation Experiments in Practice: Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic DOE and regression analysis assume a single simulation response that is normally and independen...
Statistical analysis of planktic foraminifera of the surface Continental ...
African Journals Online (AJOL)
Planktic foraminiferal assemblage recorded from selected samples obtained from shallow continental shelf sediments off southwestern Nigeria were subjected to statistical analysis. The Principal Component Analysis (PCA) was used to determine variants of planktic parameters. Values obtained for these parameters were ...
Simulation Experiments in Practice : Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. Statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic
Simulation Experiments in Practice : Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is
PRECISE - pregabalin in addition to usual care: Statistical analysis plan
S. Mathieson (Stephanie); L. Billot (Laurent); C. Maher (Chris); A.J. McLachlan (Andrew J.); J. Latimer (Jane); B.W. Koes (Bart); M.J. Hancock (Mark J.); I. Harris (Ian); R.O. Day (Richard O.); J. Pik (Justin); S. Jan (Stephen); C.-W.C. Lin (Chung-Wei Christine)
2016-01-01
textabstractBackground: Sciatica is a severe, disabling condition that lacks high quality evidence for effective treatment strategies. This a priori statistical analysis plan describes the methodology of analysis for the PRECISE study. Methods/design: PRECISE is a prospectively registered, double
A Divergence Statistics Extension to VTK for Performance Analysis
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bennett, Janine Camille [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2015-02-01
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13], where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k -means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit ( VTK ) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a stasticial manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new diverence statistics engine is illustrated by the means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
HistFitter software framework for statistical data analysis
Baak, M.; Côte, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-01-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with mu...
Hattori, Satoshi; Zhou, Xiao-Hua
2018-02-10
Publication bias is one of the most important issues in meta-analysis. For standard meta-analyses to examine intervention effects, the funnel plot and the trim-and-fill method are simple and widely used techniques for assessing and adjusting for the influence of publication bias, respectively. However, their use may be subjective and can then produce misleading insights. To make a more objective inference for publication bias, various sensitivity analysis methods have been proposed, including the Copas selection model. For meta-analysis of diagnostic studies evaluating a continuous biomarker, the summary receiver operating characteristic (sROC) curve is a very useful method in the presence of heterogeneous cutoff values. To our best knowledge, no methods are available for evaluation of influence of publication bias on estimation of the sROC curve. In this paper, we introduce a Copas-type selection model for meta-analysis of diagnostic studies and propose a sensitivity analysis method for publication bias. Our method enables us to assess the influence of publication bias on the estimation of the sROC curve and then judge whether the result of the meta-analysis is sufficiently confident or should be interpreted with much caution. We illustrate our proposed method with real data. Copyright © 2017 John Wiley & Sons, Ltd.
Statistical analysis applied to safety culture self-assessment
International Nuclear Information System (INIS)
Macedo Soares, P.P.
2002-01-01
Interviews and opinion surveys are instruments used to assess the safety culture in an organization as part of the Safety Culture Enhancement Programme. Specific statistical tools are used to analyse the survey results. This paper presents an example of an opinion survey with the corresponding application of the statistical analysis and the conclusions obtained. Survey validation, Frequency statistics, Kolmogorov-Smirnov non-parametric test, Student (T-test) and ANOVA means comparison tests and LSD post-hoc multiple comparison test, are discussed. (author)
Longitudinal data analysis a handbook of modern statistical methods
Fitzmaurice, Garrett; Verbeke, Geert; Molenberghs, Geert
2008-01-01
Although many books currently available describe statistical models and methods for analyzing longitudinal data, they do not highlight connections between various research threads in the statistical literature. Responding to this void, Longitudinal Data Analysis provides a clear, comprehensive, and unified overview of state-of-the-art theory and applications. It also focuses on the assorted challenges that arise in analyzing longitudinal data. After discussing historical aspects, leading researchers explore four broad themes: parametric modeling, nonparametric and semiparametric methods, joint
Highly Robust Statistical Methods in Medical Image Analysis
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2012-01-01
Roč. 32, č. 2 (2012), s. 3-16 ISSN 0208-5216 R&D Projects: GA MŠk(CZ) 1M06014 Institutional research plan: CEZ:AV0Z10300504 Keywords : robust statistics * classification * faces * robust image analysis * forensic science Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.208, year: 2012 http://www.ibib.waw.pl/bbe/bbefulltext/BBE_32_2_003_FT.pdf
Network similarity and statistical analysis of earthquake seismic data
Deyasi, Krishanu; Chakraborty, Abhijit; Banerjee, Anirban
2016-01-01
We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We cal...
Using gene co-expression network analysis to predict biomarkers for chronic lymphocytic leukemia
Directory of Open Access Journals (Sweden)
Borlawsky Tara B
2010-10-01
Full Text Available Abstract Background Chronic lymphocytic leukemia (CLL is the most common adult leukemia. It is a highly heterogeneous disease, and can be divided roughly into indolent and progressive stages based on classic clinical markers. Immunoglobin heavy chain variable region (IgVH mutational status was found to be associated with patient survival outcome, and biomarkers linked to the IgVH status has been a focus in the CLL prognosis research field. However, biomarkers highly correlated with IgVH mutational status which can accurately predict the survival outcome are yet to be discovered. Results In this paper, we investigate the use of gene co-expression network analysis to identify potential biomarkers for CLL. Specifically we focused on the co-expression network involving ZAP70, a well characterized biomarker for CLL. We selected 23 microarray datasets corresponding to multiple types of cancer from the Gene Expression Omnibus (GEO and used the frequent network mining algorithm CODENSE to identify highly connected gene co-expression networks spanning the entire genome, then evaluated the genes in the co-expression network in which ZAP70 is involved. We then applied a set of feature selection methods to further select genes which are capable of predicting IgVH mutation status from the ZAP70 co-expression network. Conclusions We have identified a set of genes that are potential CLL prognostic biomarkers IL2RB, CD8A, CD247, LAG3 and KLRK1, which can predict CLL patient IgVH mutational status with high accuracies. Their prognostic capabilities were cross-validated by applying these biomarker candidates to classify patients into different outcome groups using a CLL microarray datasets with clinical information.
Kataoka, Hiroyuki; Saito, Keita; Kato, Hisato; Masuda, Kazufumi
2013-06-01
Early disease diagnosis is crucial for human healthcare and successful therapy. Since any changes in homeostatic balance can alter human emanations, the components of breath exhalations and skin emissions may be diagnostic biomarkers for various diseases and metabolic disorders. Since hundreds of endogenous and exogenous volatile organic compounds (VOCs) are released from the human body, analysis of these VOCs may be a noninvasive, painless, and easy diagnostic tool. Sampling and preconcentration by sorbent tubes/traps and solid-phase microextraction, in combination with GC or GC-MS, are usually used to analyze VOCs. In addition, GC-MS-olfactometry is useful for simultaneous analysis of odorants and odor quality. Direct MS techniques are also useful for the online real-time detection of VOCs. This review focuses on recent developments in sampling and analysis of volatile biomarkers in human odors and/or emanations, and discusses future use of VOC analysis.
Puchades-Carrasco, Leonor; Palomino-Schätzlein, Martina; Pérez-Rambla, Clara; Pineda-Lucena, Antonio
2016-05-01
Metabolomics, a systems biology approach focused on the global study of the metabolome, offers a tremendous potential in the analysis of clinical samples. Among other applications, metabolomics enables mapping of biochemical alterations involved in the pathogenesis of diseases, and offers the opportunity to noninvasively identify diagnostic, prognostic and predictive biomarkers that could translate into early therapeutic interventions. Particularly, metabolomics by Nuclear Magnetic Resonance (NMR) has the ability to simultaneously detect and structurally characterize an abundance of metabolic components, even when their identities are unknown. Analysis of the data generated using this experimental approach requires the application of statistical and bioinformatics tools for the correct interpretation of the results. This review focuses on the different steps involved in the metabolomics characterization of biofluids for clinical applications, ranging from the design of the study to the biological interpretation of the results. Particular emphasis is devoted to the specific procedures required for the processing and interpretation of NMR data with a focus on the identification of clinically relevant biomarkers. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Statistical margin to DNB safety analysis approach for LOFT
International Nuclear Information System (INIS)
Atkinson, S.A.
1982-01-01
A method was developed and used for LOFT thermal safety analysis to estimate the statistical margin to DNB for the hot rod, and to base safety analysis on desired DNB probability limits. This method is an advanced approach using response surface analysis methods, a very efficient experimental design, and a 2nd-order response surface equation with a 2nd-order error propagation analysis to define the MDNBR probability density function. Calculations for limiting transients were used in the response surface analysis thereby including transient interactions and trip uncertainties in the MDNBR probability density
Data analysis using the Gnu R system for statistical computation
Energy Technology Data Exchange (ETDEWEB)
Simone, James; /Fermilab
2011-07-01
R is a language system for statistical computation. It is widely used in statistics, bioinformatics, machine learning, data mining, quantitative finance, and the analysis of clinical drug trials. Among the advantages of R are: it has become the standard language for developing statistical techniques, it is being actively developed by a large and growing global user community, it is open source software, it is highly portable (Linux, OS-X and Windows), it has a built-in documentation system, it produces high quality graphics and it is easily extensible with over four thousand extension library packages available covering statistics and applications. This report gives a very brief introduction to R with some examples using lattice QCD simulation results. It then discusses the development of R packages designed for chi-square minimization fits for lattice n-pt correlation functions.
Directory of Open Access Journals (Sweden)
Susan K Morton
Full Text Available Arteriovenous fistula (AVF failure is a significant cause of morbidity and expense in patients on maintenance haemodialysis (HD. Circulating biomarkers could be valuable in detecting patients at risk of AVF failure and may identify targets to improve AVF outcome. Currently there is little consensus on the relationship between circulating biomarkers and AVF failure. The aim of this systematic review was to identify circulating biomarkers associated with AVF failure.Studies evaluating the association between circulating biomarkers and the presence or risk of AVF failure were systematically identified from the MEDLINE, EMBASE and Cochrane Library databases. No restrictions on the type of study were imposed. Concentrations of circulating biomarkers of routine HD patients with and without AVF failure were recorded and meta-analyses were performed on biomarkers that were assessed in three or more studies with a composite population of at least 100 participants. Biomarker concentrations were synthesized into inverse-variance random-effects models to calculate standardized mean differences (SMD and 95% confidence intervals (CI.Thirteen studies comprising a combined population of 1512 participants were included after screening 2835 unique abstracts. These studies collectively investigated 48 biomarkers, predominantly circulating molecules which were assessed as part of routine clinical care. Meta-analysis was performed on twelve eligible biomarkers. No significant association between any of the assessed biomarkers and AVF failure was observed.This paper is the first systematic review of biomarkers associated with AVF failure. Our results suggest that blood markers currently assessed do not identify an at-risk AVF. Further, rigorously designed studies assessing biological plausible biomarkers are needed to clarify whether assessment of circulating markers can be of any clinical value. PROSPERO registration number CRD42016033845.
Gao, Xueqin; Ke, Chaofu; Liu, Haixia; Liu, Wei; Li, Kang; Yu, Bo; Sun, Meng
2017-09-18
Coronary atherosclerosis (CAS) is the pathogenesis of coronary heart disease, which is a prevalent and chronic life-threatening disease. Initially, this disease is not always detected until a patient presents with seriously vascular occlusion. Therefore, new biomarkers for appropriate and timely diagnosis of early CAS is needed for screening to initiate therapy on time. In this study, we used an untargeted metabolomics approach to identify potential biomarkers that could enable highly sensitive and specific CAS detection. Score plots from partial least-squares discriminant analysis clearly separated early-stage CAS patients from controls. Meanwhile, the levels of 24 metabolites increased greatly and those of 18 metabolites decreased markedly in early CAS patients compared with the controls, which suggested significant metabolic dysfunction in phospholipid, sphingolipid, and fatty acid metabolism in the patients. Furthermore, binary logistic regression showed that nine metabolites could be used as a combinatorial biomarker to distinguish early-stage CAS patients from controls. The panel of nine metabolites was then tested with an independent cohort of samples, which also yielded satisfactory diagnostic accuracy (AUC = 0.890). In conclusion, our findings provide insight into the pathological mechanism of early-stage CAS and also supply a combinatorial biomarker to aid clinical diagnosis of early-stage CAS.
Quantitative proteomic analysis of microdissected oral epithelium for cancer biomarker discovery.
Xiao, Hua; Langerman, Alexander; Zhang, Yan; Khalid, Omar; Hu, Shen; Cao, Cheng-Xi; Lingen, Mark W; Wong, David T W
2015-11-01
Specific biomarkers are urgently needed for the detection and progression of oral cancer. The objective of this study was to discover cancer biomarkers from oral epithelium through utilizing high throughput quantitative proteomics approaches. Morphologically malignant, epithelial dysplasia, and adjacent normal epithelial tissues were laser capture microdissected (LCM) from 19 patients and used for proteomics analysis. Total proteins from each group were extracted, digested and then labelled with corresponding isobaric tags for relative and absolute quantitation (iTRAQ). Labelled peptides from each sample were combined and analyzed by liquid chromatography-mass spectrometry (LC-MS/MS) for protein identification and quantification. In total, 500 proteins were identified and 425 of them were quantified. When compared with adjacent normal oral epithelium, 17 and 15 proteins were consistently up-regulated or down-regulated in malignant and epithelial dysplasia, respectively. Half of these candidate biomarkers were discovered for oral cancer for the first time. Cornulin was initially confirmed in tissue protein extracts and was further validated in tissue microarray. Its presence in the saliva of oral cancer patients was also explored. Myoglobin and S100A8 were pre-validated by tissue microarray. These data demonstrated that the proteomic biomarkers discovered through this strategy are potential targets for oral cancer detection and salivary diagnostics. Copyright © 2015 Elsevier Ltd. All rights reserved.
Core cerebrospinal fluid biomarker profile in cerebral amyloid angiopathy: A meta-analysis.
Charidimou, Andreas; Friedrich, Jan O; Greenberg, Steven M; Viswanathan, Anand
2018-02-27
To perform a meta-analysis of 4 core CSF biomarkers (β-amyloid [Aβ]42, Aβ40, total tau [t-tau], and phosphorylated tau [p-tau]) to assess which of these are most altered in sporadic cerebral amyloid angiopathy (CAA). We systematically searched PubMed for eligible studies reporting data on CSF biomarkers reflecting amyloid precursor protein metabolism (Aβ42, Aβ40), neurodegeneration (t-tau), and tangle pathology (p-tau) in symptomatic sporadic CAA cohorts vs controls and patients with Alzheimer disease (AD). Biomarker performance was assessed in random-effects meta-analysis based on ratio of mean (RoM) biomarker concentrations: (1) in patients with CAA vs healthy controls and (2) in patients with CAA vs patients with AD. RoM >1 indicates higher biomarker concentration in patients with CAA vs comparison population and RoM <1 indicates higher concentration in comparison groups. Three studies met inclusion criteria. These comprised 5 CAA patient cohorts (n = 59 patients) vs healthy controls (n = 94 cases) and AD cohorts (n = 158). Three core biomarkers differentiated CAA from controls: CSF Aβ42 (RoM 0.49, 95% confidence interval [CI] 0.38-0.64, p < 0.003), Aβ40 (RoM 0.70, 95% CI 0.63-0.78, p < 0.0001), and t-tau (RoM 1.54, 95% CI 1.15-2.07, p = 0.004); p-tau was marginal (RoM 1.24, 95% CI 0.99-1.54, p = 0.062). Differentiation between CAA and AD was strong for CSF Aβ40 (RoM 0.76, 95% CI 0.69-0.83, p < 0.0001), but not Aβ42 (RoM 1.00; 95% CI 0.81-1.23, p = 0.970). For t-tau and p-tau, average CSF ratios in patients with CAA vs patients with AD were 0.63 (95% CI 0.54-0.74, p < 0.0001) and 0.60 (95% CI 0.50-0.71, p < 0.0001), respectively. Specific CSF patterns of Aβ42, Aβ40, t-tau, and p-tau might serve as molecular biomarkers of CAA, but analyses in larger CAA cohorts are needed. © 2018 American Academy of Neurology.
A κ-generalized statistical mechanics approach to income analysis
International Nuclear Information System (INIS)
Clementi, F; Gallegati, M; Kaniadakis, G
2009-01-01
This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low–middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful
A κ-generalized statistical mechanics approach to income analysis
Clementi, F.; Gallegati, M.; Kaniadakis, G.
2009-02-01
This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low-middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful.
Common pitfalls in statistical analysis: Linear regression analysis
Directory of Open Access Journals (Sweden)
Rakesh Aggarwal
2017-01-01
Full Text Available In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.
A novel statistic for genome-wide interaction analysis.
Directory of Open Access Journals (Sweden)
Xuesen Wu
2010-09-01
Full Text Available Although great progress in genome-wide association studies (GWAS has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked. The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001
Statistical Compilation of the ICT Sector and Policy Analysis | CRDI ...
International Development Research Centre (IDRC) Digital Library (Canada)
Statistical Compilation of the ICT Sector and Policy Analysis. As the presence and influence of information and communication technologies (ICTs) continues to widen and deepen, so too does its impact on economic development. However, much work needs to be done before the linkages between economic development ...
Multivariate statistical analysis of major and trace element data for ...
African Journals Online (AJOL)
Multivariate statistical analysis of major and trace element data for niobium exploration in the peralkaline granites of the anorogenic ring-complex province of Nigeria. PO Ogunleye, EC Ike, I Garba. Abstract. No Abstract Available Journal of Mining and Geology Vol.40(2) 2004: 107-117. Full Text: EMAIL FULL TEXT EMAIL ...
Statistical Compilation of the ICT Sector and Policy Analysis | IDRC ...
International Development Research Centre (IDRC) Digital Library (Canada)
Statistical Compilation of the ICT Sector and Policy Analysis. As the presence and influence of information and communication technologies (ICTs) continues to widen and deepen, so too does its impact on economic development. However, much work needs to be done before the linkages between economic development ...
Statistical analysis of the BOIL program in RSYST-III
International Nuclear Information System (INIS)
Beck, W.; Hausch, H.J.
1978-11-01
The paper describes a statistical analysis in the RSYST-III program system. Using the example of the BOIL program, it is shown how the effects of inaccurate input data on the output data can be discovered. The existing possibilities of data generation, data handling, and data evaluation are outlined. (orig.) [de
Statistical analysis of thermal conductivity of nanofluid containing ...
Indian Academy of Sciences (India)
Thermal conductivity measurements of nanofluids were analysed via two-factor completely randomized design and comparison of data means is carried out with Duncan's multiple-range test. Statistical analysis of experimental data show that temperature and weight fraction have a reasonable impact on the thermal ...
Multivariate statistical analysis of precipitation chemistry in Northwestern Spain
International Nuclear Information System (INIS)
Prada-Sanchez, J.M.; Garcia-Jurado, I.; Gonzalez-Manteiga, W.; Fiestras-Janeiro, M.G.; Espada-Rios, M.I.; Lucas-Dominguez, T.
1993-01-01
149 samples of rainwater were collected in the proximity of a power station in northwestern Spain at three rainwater monitoring stations. The resulting data are analyzed using multivariate statistical techniques. Firstly, the Principal Component Analysis shows that there are three main sources of pollution in the area (a marine source, a rural source and an acid source). The impact from pollution from these sources on the immediate environment of the stations is studied using Factorial Discriminant Analysis. 8 refs., 7 figs., 11 tabs
Implementation and statistical analysis of Metropolis algorithm for SU(3)
International Nuclear Information System (INIS)
Katznelson, E.; Nobile, A.
1984-12-01
In this paper we study the statistical properties of an implementation of the Metropolis algorithm for SU(3) gauge theory. It is shown that the results have normal distribution. We demonstrate that in this case error analysis can be carried on in a simple way and we show that applying it to both the measurement strategy and the output data analysis has an important influence on the performance and reliability of the simulation. (author)
Multivariate statistical analysis of precipitation chemistry in Northwestern Spain
Energy Technology Data Exchange (ETDEWEB)
Prada-Sanchez, J.M.; Garcia-Jurado, I.; Gonzalez-Manteiga, W.; Fiestras-Janeiro, M.G.; Espada-Rios, M.I.; Lucas-Dominguez, T. (University of Santiago, Santiago (Spain). Faculty of Mathematics, Dept. of Statistics and Operations Research)
1993-07-01
149 samples of rainwater were collected in the proximity of a power station in northwestern Spain at three rainwater monitoring stations. The resulting data are analyzed using multivariate statistical techniques. Firstly, the Principal Component Analysis shows that there are three main sources of pollution in the area (a marine source, a rural source and an acid source). The impact from pollution from these sources on the immediate environment of the stations is studied using Factorial Discriminant Analysis. 8 refs., 7 figs., 11 tabs.
Reducing bias in the analysis of counting statistics data
International Nuclear Information System (INIS)
Hammersley, A.P.; Antoniadis, A.
1997-01-01
In the analysis of counting statistics data it is common practice to estimate the variance of the measured data points as the data points themselves. This practice introduces a bias into the results of further analysis which may be significant, and under certain circumstances lead to false conclusions. In the case of normal weighted least squares fitting this bias is quantified and methods to avoid it are proposed. (orig.)
Fisher statistics for analysis of diffusion tensor directional information.
Hutchinson, Elizabeth B; Rutecki, Paul A; Alexander, Andrew L; Sutula, Thomas P
2012-04-30
A statistical approach is presented for the quantitative analysis of diffusion tensor imaging (DTI) directional information using Fisher statistics, which were originally developed for the analysis of vectors in the field of paleomagnetism. In this framework, descriptive and inferential statistics have been formulated based on the Fisher probability density function, a spherical analogue of the normal distribution. The Fisher approach was evaluated for investigation of rat brain DTI maps to characterize tissue orientation in the corpus callosum, fornix, and hilus of the dorsal hippocampal dentate gyrus, and to compare directional properties in these regions following status epilepticus (SE) or traumatic brain injury (TBI) with values in healthy brains. Direction vectors were determined for each region of interest (ROI) for each brain sample and Fisher statistics were applied to calculate the mean direction vector and variance parameters in the corpus callosum, fornix, and dentate gyrus of normal rats and rats that experienced TBI or SE. Hypothesis testing was performed by calculation of Watson's F-statistic and associated p-value giving the likelihood that grouped observations were from the same directional distribution. In the fornix and midline corpus callosum, no directional differences were detected between groups, however in the hilus, significant (pstatistical comparison of tissue structural orientation. Copyright © 2012 Elsevier B.V. All rights reserved.
HistFitter software framework for statistical data analysis
Energy Technology Data Exchange (ETDEWEB)
Baak, M. [CERN, Geneva (Switzerland); Besjes, G.J. [Radboud University Nijmegen, Nijmegen (Netherlands); Nikhef, Amsterdam (Netherlands); Cote, D. [University of Texas, Arlington (United States); Koutsman, A. [TRIUMF, Vancouver (Canada); Lorenz, J. [Ludwig-Maximilians-Universitaet Muenchen, Munich (Germany); Excellence Cluster Universe, Garching (Germany); Short, D. [University of Oxford, Oxford (United Kingdom)
2015-04-15
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fit to data and interpreted with statistical tests. Internally HistFitter uses the statistics packages RooStats and HistFactory. A key innovation of HistFitter is its design, which is rooted in analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple models at once that describe the data, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication quality style through a simple command-line interface. (orig.)
HistFitter software framework for statistical data analysis
International Nuclear Information System (INIS)
Baak, M.; Besjes, G.J.; Cote, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-01-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fit to data and interpreted with statistical tests. Internally HistFitter uses the statistics packages RooStats and HistFactory. A key innovation of HistFitter is its design, which is rooted in analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple models at once that describe the data, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication quality style through a simple command-line interface. (orig.)
The added value of biomarker analysis to the genesis of Plaggic Anthrosols.
van Mourik, Jan; Jansen, Boris
2015-04-01
Coversands (chemical poor Late-glacial aeolian sand deposits) dominate the surface geology of an extensive area in northwestern Europe. Plaggic Anthrosols occur in cultural landscapes, developed on coversands. They are the characteristic soils that developed on ancient fertilized arable fields. Plaggic Anthrosols have a complex genesis. They are records of aspects environmental and agricultural history. In previous studies information of the soil records was unlocked by application of pollen analysis, 14C and OSL dating. In this study we applied biomarker analysis to unlock additional information about the applied organic sources in the production of plaggic manure. Radiocarbon dating suggested the start of sedentary agriculture (after a period, characterized by shifting cultivation and Celtic fields) between 3000 and 2000 BP. In previous studies is assumed that farmers applied organic sods, dug on forest soils and heath to produce organic stable manure to fertilize the fields. The mineral fraction of the sods was supposed to be responsible for the development of the plaggic horizon and the raise of the land surface. Optically stimulated Luminescence dating however suggested that plaggic deposition on the fields started relatively late, in the 18th century. The use of ectorganic matter from the forest soils must have been ended in the 10th-12th century, due to commercial forest clear cuttings as recorded in archived documents. These deforestations resulted in the first extension of sand drifting and famers had to protect the valuable heath against this ' environmental catastrophe' . The use of heath for sheep grazing and other purposes as honey production could continue till the 18th century, as recorded in archived documents. In the course of the 18th century, the population growth resulted in increasing demand for food. The deep stable economy was introduced and the booming demand for manure resulted in intensive sod digging on the heath. This caused heath
Potential biomarkers of tardive dyskinesia: A multiplex analysis of blood serum
Boiko, Anastasia S; Kornetova, Elena G; Ivanova, Svetlana A.; Loonen, Antonius
2017-01-01
Potential biomarkers of tardive dyskinesia: a multiplex analysis of blood serum A.S. Boiko(1), E.G. Kornetova(2), S.A. Ivanova(1), A.J.M. Loonen(3) (1)Mental Health Research Institute, Tomsk National Research Medical Center of the Russian Academy of Sciences, Laboratory of Molecular Genetics and Biochemistry, Tomsk, Russia (2)Mental Health Research Institute, Tomsk National Research Medical Center of the Russian Academy of Sciences, Department of Endogenous Disorders, Tomsk, Russia (3)Univers...
STATCAT, Statistical Analysis of Parametric and Non-Parametric Data
International Nuclear Information System (INIS)
David, Hugh
1990-01-01
1 - Description of program or function: A suite of 26 programs designed to facilitate the appropriate statistical analysis and data handling of parametric and non-parametric data, using classical and modern univariate and multivariate methods. 2 - Method of solution: Data is read entry by entry, using a choice of input formats, and the resultant data bank is checked for out-of- range, rare, extreme or missing data. The completed STATCAT data bank can be treated by a variety of descriptive and inferential statistical methods, and modified, using other standard programs as required
Statistical analysis of absorptive laser damage in dielectric thin films
International Nuclear Information System (INIS)
Budgor, A.B.; Luria-Budgor, K.F.
1978-01-01
The Weibull distribution arises as an example of the theory of extreme events. It is commonly used to fit statistical data arising in the failure analysis of electrical components and in DC breakdown of materials. This distribution is employed to analyze time-to-damage and intensity-to-damage statistics obtained when irradiating thin film coated samples of SiO 2 , ZrO 2 , and Al 2 O 3 with tightly focused laser beams. The data used is furnished by Milam. The fit to the data is excellent; and least squared correlation coefficients greater than 0.9 are often obtained
Analytical and statistical analysis of elemental composition of lichens
International Nuclear Information System (INIS)
Calvelo, S.; Baccala, N.; Bubach, D.; Arribere, M.A.; Riberio Guevara, S.
1997-01-01
The elemental composition of lichens from remote southern South America regions has been studied with analytical and statistical techniques to determine if the values obtained reflect species, growth forms or habitat characteristics. The enrichment factors are calculated discriminated by species and collection site and compared with data available in the literature. The elemental concentrations are standardized and compared for different species. The information was statistically processed, a cluster analysis was performed using the three first principal axes of the PCA; the three groups formed are presented. Their relationship with the species, collection sites and the lichen growth forms are interpreted. (author)
Detecting errors in micro and trace analysis by using statistics
DEFF Research Database (Denmark)
Heydorn, K.
1993-01-01
By assigning a standard deviation to each step in an analytical method it is possible to predict the standard deviation of each analytical result obtained by this method. If the actual variability of replicate analytical results agrees with the expected, the analytical method is said...... to be in statistical control. Significant deviations between analytical results from different laboratories reveal the presence of systematic errors, and agreement between different laboratories indicate the absence of systematic errors. This statistical approach, referred to as the analysis of precision, was applied...
Multivariate statistical analysis of atom probe tomography data
International Nuclear Information System (INIS)
Parish, Chad M.; Miller, Michael K.
2010-01-01
The application of spectrum imaging multivariate statistical analysis methods, specifically principal component analysis (PCA), to atom probe tomography (APT) data has been investigated. The mathematical method of analysis is described and the results for two example datasets are analyzed and presented. The first dataset is from the analysis of a PM 2000 Fe-Cr-Al-Ti steel containing two different ultrafine precipitate populations. PCA properly describes the matrix and precipitate phases in a simple and intuitive manner. A second APT example is from the analysis of an irradiated reactor pressure vessel steel. Fine, nm-scale Cu-enriched precipitates having a core-shell structure were identified and qualitatively described by PCA. Advantages, disadvantages, and future prospects for implementing these data analysis methodologies for APT datasets, particularly with regard to quantitative analysis, are also discussed.
Using Pre-Statistical Analysis to Streamline Monitoring Assessments
International Nuclear Information System (INIS)
Reed, J.K.
1999-01-01
A variety of statistical methods exist to aid evaluation of groundwater quality and subsequent decision making in regulatory programs. These methods are applied because of large temporal and spatial extrapolations commonly applied to these data. In short, statistical conclusions often serve as a surrogate for knowledge. However, facilities with mature monitoring programs that have generated abundant data have inherently less uncertainty because of the sheer quantity of analytical results. In these cases, statistical tests can be less important, and ''expert'' data analysis should assume an important screening role.The WSRC Environmental Protection Department, working with the General Separations Area BSRI Environmental Restoration project team has developed a method for an Integrated Hydrogeological Analysis (IHA) of historical water quality data from the F and H Seepage Basins groundwater remediation project. The IHA combines common sense analytical techniques and a GIS presentation that force direct interactive evaluation of the data. The IHA can perform multiple data analysis tasks required by the RCRA permit. These include: (1) Development of a groundwater quality baseline prior to remediation startup, (2) Targeting of constituents for removal from RCRA GWPS, (3) Targeting of constituents for removal from UIC, permit, (4) Targeting of constituents for reduced, (5)Targeting of monitoring wells not producing representative samples, (6) Reduction in statistical evaluation, and (7) Identification of contamination from other facilities
Directory of Open Access Journals (Sweden)
Federica Villanova
Full Text Available Discovery of novel immune biomarkers for monitoring of disease prognosis and response to therapy in immune-mediated inflammatory diseases is an important unmet clinical need. Here, we establish a novel framework for immunological biomarker discovery, comparing a conventional (liquid flow cytometry platform (CFP and a unique lyoplate-based flow cytometry platform (LFP in combination with advanced computational data analysis. We demonstrate that LFP had higher sensitivity compared to CFP, with increased detection of cytokines (IFN-γ and IL-10 and activation markers (Foxp3 and CD25. Fluorescent intensity of cells stained with lyophilized antibodies was increased compared to cells stained with liquid antibodies. LFP, using a plate loader, allowed medium-throughput processing of samples with comparable intra- and inter-assay variability between platforms. Automated computational analysis identified novel immunophenotypes that were not detected with manual analysis. Our results establish a new flow cytometry platform for standardized and rapid immunological biomarker discovery with wide application to immune-mediated diseases.
Villanova, Federica; Di Meglio, Paola; Inokuma, Margaret; Aghaeepour, Nima; Perucha, Esperanza; Mollon, Jennifer; Nomura, Laurel; Hernandez-Fuentes, Maria; Cope, Andrew; Prevost, A Toby; Heck, Susanne; Maino, Vernon; Lord, Graham; Brinkman, Ryan R; Nestle, Frank O
2013-01-01
Discovery of novel immune biomarkers for monitoring of disease prognosis and response to therapy in immune-mediated inflammatory diseases is an important unmet clinical need. Here, we establish a novel framework for immunological biomarker discovery, comparing a conventional (liquid) flow cytometry platform (CFP) and a unique lyoplate-based flow cytometry platform (LFP) in combination with advanced computational data analysis. We demonstrate that LFP had higher sensitivity compared to CFP, with increased detection of cytokines (IFN-γ and IL-10) and activation markers (Foxp3 and CD25). Fluorescent intensity of cells stained with lyophilized antibodies was increased compared to cells stained with liquid antibodies. LFP, using a plate loader, allowed medium-throughput processing of samples with comparable intra- and inter-assay variability between platforms. Automated computational analysis identified novel immunophenotypes that were not detected with manual analysis. Our results establish a new flow cytometry platform for standardized and rapid immunological biomarker discovery with wide application to immune-mediated diseases.
Sahore, Vishal; Sonker, Mukul; Nielsen, Anna V; Knob, Radim; Kumar, Suresh; Woolley, Adam T
2018-01-01
We have developed multichannel integrated microfluidic devices for automated preconcentration, labeling, purification, and separation of preterm birth (PTB) biomarkers. We fabricated multilayer poly(dimethylsiloxane)-cyclic olefin copolymer (PDMS-COC) devices that perform solid-phase extraction (SPE) and microchip electrophoresis (μCE) for automated PTB biomarker analysis. The PDMS control layer had a peristaltic pump and pneumatic valves for flow control, while the PDMS fluidic layer had five input reservoirs connected to microchannels and a μCE system. The COC layers had a reversed-phase octyl methacrylate porous polymer monolith for SPE and fluorescent labeling of PTB biomarkers. We determined μCE conditions for two PTB biomarkers, ferritin (Fer) and corticotropin-releasing factor (CRF). We used these integrated microfluidic devices to preconcentrate and purify off-chip-labeled Fer and CRF in an automated fashion. Finally, we performed a fully automated on-chip analysis of unlabeled PTB biomarkers, involving SPE, labeling, and μCE separation with 1 h total analysis time. These integrated systems have strong potential to be combined with upstream immunoaffinity extraction, offering a compact sample-to-answer biomarker analysis platform. Graphical abstract Pressure-actuated integrated microfluidic devices have been developed for automated solid-phase extraction, fluorescent labeling, and microchip electrophoresis of preterm birth biomarkers.
Data management and statistical analysis for environmental assessment
International Nuclear Information System (INIS)
Wendelberger, J.R.; McVittie, T.I.
1995-01-01
Data management and statistical analysis for environmental assessment are important issues on the interface of computer science and statistics. Data collection for environmental decision making can generate large quantities of various types of data. A database/GIS system developed is described which provides efficient data storage as well as visualization tools which may be integrated into the data analysis process. FIMAD is a living database and GIS system. The system has changed and developed over time to meet the needs of the Los Alamos National Laboratory Restoration Program. The system provides a repository for data which may be accessed by different individuals for different purposes. The database structure is driven by the large amount and varied types of data required for environmental assessment. The integration of the database with the GIS system provides the foundation for powerful visualization and analysis capabilities
Explorations in statistics: the analysis of ratios and normalized data.
Curran-Everett, Douglas
2013-09-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of Explorations in Statistics explores the analysis of ratios and normalized-or standardized-data. As researchers, we compute a ratio-a numerator divided by a denominator-to compute a proportion for some biological response or to derive some standardized variable. In each situation, we want to control for differences in the denominator when the thing we really care about is the numerator. But there is peril lurking in a ratio: only if the relationship between numerator and denominator is a straight line through the origin will the ratio be meaningful. If not, the ratio will misrepresent the true relationship between numerator and denominator. In contrast, regression techniques-these include analysis of covariance-are versatile: they can accommodate an analysis of the relationship between numerator and denominator when a ratio is useless.
Statistical analysis and interpolation of compositional data in materials science.
Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M
2015-02-09
Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
Feature-Based Statistical Analysis of Combustion Simulation Data
Energy Technology Data Exchange (ETDEWEB)
Bennett, J; Krishnamoorthy, V; Liu, S; Grout, R; Hawkes, E; Chen, J; Pascucci, V; Bremer, P T
2011-11-18
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion
Directory of Open Access Journals (Sweden)
Peeyush Sahay
2009-10-01
Full Text Available Breath analysis, a promising new field of medicine and medical instrumentation, potentially offers noninvasive, real-time, and point-of-care (POC disease diagnostics and metabolic status monitoring. Numerous breath biomarkers have been detected and quantified so far by using the GC-MS technique. Recent advances in laser spectroscopic techniques and laser sources have driven breath analysis to new heights, moving from laboratory research to commercial reality. Laser spectroscopic detection techniques not only have high-sensitivity and high-selectivity, as equivalently offered by the MS-based techniques, but also have the advantageous features of near real-time response, low instrument costs, and POC function. Of the approximately 35 established breath biomarkers, such as acetone, ammonia, carbon dioxide, ethane, methane, and nitric oxide, 14 species in exhaled human breath have been analyzed by high-sensitivity laser spectroscopic techniques, namely, tunable diode laser absorption spectroscopy (TDLAS, cavity ringdown spectroscopy (CRDS, integrated cavity output spectroscopy (ICOS, cavity enhanced absorption spectroscopy (CEAS, cavity leak-out spectroscopy (CALOS, photoacoustic spectroscopy (PAS, quartz-enhanced photoacoustic spectroscopy (QEPAS, and optical frequency comb cavity-enhanced absorption spectroscopy (OFC-CEAS. Spectral fingerprints of the measured biomarkers span from the UV to the mid-IR spectral regions and the detection limits achieved by the laser techniques range from parts per million to parts per billion levels. Sensors using the laser spectroscopic techniques for a few breath biomarkers, e.g., carbon dioxide, nitric oxide, etc. are commercially available. This review presents an update on the latest developments in laser-based breath analysis.
Building the Community Online Resource for Statistical Seismicity Analysis (CORSSA)
Michael, A. J.; Wiemer, S.; Zechar, J. D.; Hardebeck, J. L.; Naylor, M.; Zhuang, J.; Steacy, S.; Corssa Executive Committee
2010-12-01
Statistical seismology is critical to the understanding of seismicity, the testing of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is a web-based educational platform that is authoritative, up-to-date, prominent, and user-friendly. We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each containing between four and eight articles. The CORSSA web page, www.corssa.org, officially unveiled on September 6, 2010, debuts with an initial set of approximately 10 to 15 articles available online for viewing and commenting with additional articles to be added over the coming months. Each article will be peer-reviewed and will present a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles will include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. A special article will compare and review
CORSSA: Community Online Resource for Statistical Seismicity Analysis
Zechar, J. D.; Hardebeck, J. L.; Michael, A. J.; Naylor, M.; Steacy, S.; Wiemer, S.; Zhuang, J.
2011-12-01
Statistical seismology is critical to the understanding of seismicity, the evaluation of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology-especially to those aspects with great impact on public policy-statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA, www.corssa.org). We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each will contain between four and eight articles. CORSSA now includes seven articles with an additional six in draft form along with forums for discussion, a glossary, and news about upcoming meetings, special issues, and recent papers. Each article is peer-reviewed and presents a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. We have also begun curating a collection of statistical seismology software packages.
Emwas, Abdul-Hamid M.
2016-01-08
NMR-based metabolomics has shown considerable promise in disease diagnosis and biomarker discovery because it allows one to non-destructively identify and quantify large numbers of novel metabolite biomarkers in both biofluids and tissues. Indeed, precise metabolite quantification is a necessary prerequisite to move any chemical biomarker or biomarker panel from the lab into the clinic. Among the many biofluids (urine, serum, plasma, cerebrospinal fluid and saliva) commonly used for disease diagnosis and prognosis, urine has several advantages. It is abundant, sterile, easily obtained, needs little sample preparation and does not require any invasive medical procedures for collection. Furthermore, urine captures and concentrates many “unwanted” or “undesirable” compounds throughout the body, thereby providing a rich source of potentially useful disease biomarkers. However, the incredible variation in urine chemical concentrations due to effects such as gender, age, diet, life style, health conditions, and physical activity make the analysis of urine and the identification of useful urinary biomarkers by NMR quite challenging. In this review, we discuss a number of the most significant issues regarding NMR-based urinary metabolomics with a specific emphasis on metabolite quantification for disease biomarker applications. We also propose a number of data collection and instrumental recommendations regarding NMR pulse sequences, acceptable acquisition parameter ranges, relaxation effects on quantitation, proper handling of instrumental differences, as well as recommendations regarding sample preparation and biomarker assessment.
Emwas, Abdul-Hamid M.; Roy, Raja; McKay, Ryan T.; Ryan, Danielle; Brennan, Lorraine; Tenori, Leonardo; Luchinat, Claudio; Gao, Xin; Zeri, Ana Carolina; Gowda, G. A. Nagana; Raftery, Daniel; Steinbeck, Christoph; Salek, Reza M; Wishart, David S.
2016-01-01
NMR-based metabolomics has shown considerable promise in disease diagnosis and biomarker discovery because it allows one to non-destructively identify and quantify large numbers of novel metabolite biomarkers in both biofluids and tissues. Indeed, precise metabolite quantification is a necessary prerequisite to move any chemical biomarker or biomarker panel from the lab into the clinic. Among the many biofluids (urine, serum, plasma, cerebrospinal fluid and saliva) commonly used for disease diagnosis and prognosis, urine has several advantages. It is abundant, sterile, easily obtained, needs little sample preparation and does not require any invasive medical procedures for collection. Furthermore, urine captures and concentrates many “unwanted” or “undesirable” compounds throughout the body, thereby providing a rich source of potentially useful disease biomarkers. However, the incredible variation in urine chemical concentrations due to effects such as gender, age, diet, life style, health conditions, and physical activity make the analysis of urine and the identification of useful urinary biomarkers by NMR quite challenging. In this review, we discuss a number of the most significant issues regarding NMR-based urinary metabolomics with a specific emphasis on metabolite quantification for disease biomarker applications. We also propose a number of data collection and instrumental recommendations regarding NMR pulse sequences, acceptable acquisition parameter ranges, relaxation effects on quantitation, proper handling of instrumental differences, as well as recommendations regarding sample preparation and biomarker assessment.
Perceptual and statistical analysis of cardiac phase and amplitude images
International Nuclear Information System (INIS)
Houston, A.; Craig, A.
1991-01-01
A perceptual experiment was conducted using cardiac phase and amplitude images. Estimates of statistical parameters were derived from the images and the diagnostic potential of human and statistical decisions compared. Five methods were used to generate the images from 75 gated cardiac studies, 39 of which were classified as pathological. The images were presented to 12 observers experienced in nuclear medicine. The observers rated the images using a five-category scale based on their confidence of an abnormality presenting. Circular and linear statistics were used to analyse phase and amplitude image data, respectively. Estimates of mean, standard deviation (SD), skewness, kurtosis and the first term of the spatial correlation function were evaluated in the region of the left ventricle. A receiver operating characteristic analysis was performed on both sets of data and the human and statistical decisions compared. For phase images, circular SD was shown to discriminate better between normal and abnormal than experienced observers, but no single statistic discriminated as well as the human observer for amplitude images. (orig.)
Metallothionein - immunohistochemical cancer biomarker: a meta-analysis.
Directory of Open Access Journals (Sweden)
Jaromir Gumulec
Full Text Available Metallothionein (MT has been extensively investigated as a molecular marker of various types of cancer. In spite of the fact that numerous reviews have been published in this field, no meta-analytical approach has been performed. Therefore, results of to-date immunohistochemistry-based studies were summarized using meta-analysis in this review. Web of science, PubMed, Embase and CENTRAL databases were searched (up to April 30, 2013 and the eligibility of individual studies and heterogeneity among the studies was assessed. Random and fixed effects model meta-analysis was employed depending on the heterogeneity, and publication bias was evaluated using funnel plots and Egger's tests. A total of 77 studies were included with 8,015 tissue samples (4,631 cases and 3,384 controls. A significantly positive association between MT staining and tumors (vs. healthy tissues was observed in head and neck (odds ratio, OR 9.95; 95% CI 5.82-17.03 and ovarian tumors (OR 7.83; 1.09-56.29, and a negative association was ascertained in liver tumors (OR 0.10; 0.03-0.30. No significant associations were identified in breast, colorectal, prostate, thyroid, stomach, bladder, kidney, gallbladder, and uterine cancers and in melanoma. While no associations were identified between MT and tumor staging, a positive association was identified with the tumor grade (OR 1.58; 1.08-2.30. In particular, strong associations were observed in breast, ovarian, uterine and prostate cancers. Borderline significant association of metastatic status and MT staining were determined (OR 1.59; 1.03-2.46, particularly in esophageal cancer. Additionally, a significant association between the patient prognosis and MT staining was also demonstrated (hazard ratio 2.04; 1.47-2.81. However, a high degree of inconsistence was observed in several tumor types, including colorectal, kidney and prostate cancer. Despite the ambiguity in some tumor types, conclusive results are provided in the tumors of
State analysis of BOP using statistical and heuristic methods
International Nuclear Information System (INIS)
Heo, Gyun Young; Chang, Soon Heung
2003-01-01
Under the deregulation environment, the performance enhancement of BOP in nuclear power plants is being highlighted. To analyze performance level of BOP, we use the performance test procedures provided from an authorized institution such as ASME. However, through plant investigation, it was proved that the requirements of the performance test procedures about the reliability and quantity of sensors was difficult to be satisfied. As a solution of this, state analysis method that are the expanded concept of signal validation, was proposed on the basis of the statistical and heuristic approaches. Authors recommended the statistical linear regression model by analyzing correlation among BOP parameters as a reference state analysis method. Its advantage is that its derivation is not heuristic, it is possible to calculate model uncertainty, and it is easy to apply to an actual plant. The error of the statistical linear regression model is below 3% under normal as well as abnormal system states. Additionally a neural network model was recommended since the statistical model is impossible to apply to the validation of all of the sensors and is sensitive to the outlier that is the signal located out of a statistical distribution. Because there are a lot of sensors need to be validated in BOP, wavelet analysis (WA) were applied as a pre-processor for the reduction of input dimension and for the enhancement of training accuracy. The outlier localization capability of WA enhanced the robustness of the neural network. The trained neural network restored the degraded signals to the values within ±3% of the true signals
Computerized statistical analysis with bootstrap method in nuclear medicine
International Nuclear Information System (INIS)
Zoccarato, O.; Sardina, M.; Zatta, G.; De Agostini, A.; Barbesti, S.; Mana, O.; Tarolo, G.L.
1988-01-01
Statistical analysis of data samples involves some hypothesis about the features of data themselves. The accuracy of these hypotheses can influence the results of statistical inference. Among the new methods of computer-aided statistical analysis, the bootstrap method appears to be one of the most powerful, thanks to its ability to reproduce many artificial samples starting from a single original sample and because it works without hypothesis about data distribution. The authors applied the bootstrap method to two typical situation of Nuclear Medicine Department. The determination of the normal range of serum ferritin, as assessed by radioimmunoassay and defined by the mean value ±2 standard deviations, starting from an experimental sample of small dimension, shows an unacceptable lower limit (ferritin plasmatic levels below zero). On the contrary, the results obtained by elaborating 5000 bootstrap samples gives ans interval of values (10.95 ng/ml - 72.87 ng/ml) corresponding to the normal ranges commonly reported. Moreover the authors applied the bootstrap method in evaluating the possible error associated with the correlation coefficient determined between left ventricular ejection fraction (LVEF) values obtained by first pass radionuclide angiocardiography with 99m Tc and 195m Au. The results obtained indicate a high degree of statistical correlation and give the range of r 2 values to be considered acceptable for this type of studies
Statistical analysis of the Ft. Calhoun reactor coolant pump system
International Nuclear Information System (INIS)
Heising, Carolyn D.
1998-01-01
In engineering science, statistical quality control techniques have traditionally been applied to control manufacturing processes. An application to commercial nuclear power plant maintenance and control is presented that can greatly improve plant safety. As a demonstration of such an approach to plant maintenance and control, a specific system is analyzed: the reactor coolant pumps (RCPs) of the Ft. Calhoun nuclear power plant. This research uses capability analysis, Shewhart X-bar, R-charts, canonical correlation methods, and design of experiments to analyze the process for the state of statistical control. The results obtained show that six out of ten parameters are under control specifications limits and four parameters are not in the state of statistical control. The analysis shows that statistical process control methods can be applied as an early warning system capable of identifying significant equipment problems well in advance of traditional control room alarm indicators Such a system would provide operators with ample time to respond to possible emergency situations and thus improve plant safety and reliability. (author)
Statistical analysis of the Ft. Calhoun reactor coolant pump system
International Nuclear Information System (INIS)
Patel, Bimal; Heising, C.D.
1997-01-01
In engineering science, statistical quality control techniques have traditionally been applied to control manufacturing processes. An application to commercial nuclear power plant maintenance and control is presented that can greatly improve plant safety. As a demonstration of such an approach, a specific system is analyzed: the reactor coolant pumps (RCPs) of the Ft. Calhoun nuclear power plant. This research uses capability analysis, Shewhart X-bar, R charts, canonical correlation methods, and design of experiments to analyze the process for the state of statistical control. The results obtained show that six out of ten parameters are under control specification limits and four parameters are not in the state of statistical control. The analysis shows that statistical process control methods can be applied as an early warning system capable of identifying significant equipment problems well in advance of traditional control room alarm indicators. Such a system would provide operators with ample time to respond to possible emergency situations and thus improve plant safety and reliability. (Author)
STATISTICAL ANALYSIS OF THE HEAVY NEUTRAL ATOMS MEASURED BY IBEX
International Nuclear Information System (INIS)
Park, Jeewoo; Kucharek, Harald; Möbius, Eberhard; Galli, André; Livadiotis, George; Fuselier, Steve A.; McComas, David J.
2015-01-01
We investigate the directional distribution of heavy neutral atoms in the heliosphere by using heavy neutral maps generated with the IBEX-Lo instrument over three years from 2009 to 2011. The interstellar neutral (ISN) O and Ne gas flow was found in the first-year heavy neutral map at 601 keV and its flow direction and temperature were studied. However, due to the low counting statistics, researchers have not treated the full sky maps in detail. The main goal of this study is to evaluate the statistical significance of each pixel in the heavy neutral maps to get a better understanding of the directional distribution of heavy neutral atoms in the heliosphere. Here, we examine three statistical analysis methods: the signal-to-noise filter, the confidence limit method, and the cluster analysis method. These methods allow us to exclude background from areas where the heavy neutral signal is statistically significant. These methods also allow the consistent detection of heavy neutral atom structures. The main emission feature expands toward lower longitude and higher latitude from the observational peak of the ISN O and Ne gas flow. We call this emission the extended tail. It may be an imprint of the secondary oxygen atoms generated by charge exchange between ISN hydrogen atoms and oxygen ions in the outer heliosheath
Software for statistical data analysis used in Higgs searches
International Nuclear Information System (INIS)
Gumpert, Christian; Moneta, Lorenzo; Cranmer, Kyle; Kreiss, Sven; Verkerke, Wouter
2014-01-01
The analysis and interpretation of data collected by the Large Hadron Collider (LHC) requires advanced statistical tools in order to quantify the agreement between observation and theoretical models. RooStats is a project providing a statistical framework for data analysis with the focus on discoveries, confidence intervals and combination of different measurements in both Bayesian and frequentist approaches. It employs the RooFit data modelling language where mathematical concepts such as variables, (probability density) functions and integrals are represented as C++ objects. RooStats and RooFit rely on the persistency technology of the ROOT framework. The usage of a common data format enables the concept of digital publishing of complicated likelihood functions. The statistical tools have been developed in close collaboration with the LHC experiments to ensure their applicability to real-life use cases. Numerous physics results have been produced using the RooStats tools, with the discovery of the Higgs boson by the ATLAS and CMS experiments being certainly the most popular among them. We will discuss tools currently used by LHC experiments to set exclusion limits, to derive confidence intervals and to estimate discovery significances based on frequentist statistics and the asymptotic behaviour of likelihood functions. Furthermore, new developments in RooStats and performance optimisation necessary to cope with complex models depending on more than 1000 variables will be reviewed
A general framework for the regression analysis of pooled biomarker assessments.
Liu, Yan; McMahan, Christopher; Gallagher, Colin
2017-07-10
As a cost-efficient data collection mechanism, the process of assaying pooled biospecimens is becoming increasingly common in epidemiological research; for example, pooling has been proposed for the purpose of evaluating the diagnostic efficacy of biological markers (biomarkers). To this end, several authors have proposed techniques that allow for the analysis of continuous pooled biomarker assessments. Regretfully, most of these techniques proceed under restrictive assumptions, are unable to account for the effects of measurement error, and fail to control for confounding variables. These limitations are understandably attributable to the complex structure that is inherent to measurements taken on pooled specimens. Consequently, in order to provide practitioners with the tools necessary to accurately and efficiently analyze pooled biomarker assessments, herein, a general Monte Carlo maximum likelihood-based procedure is presented. The proposed approach allows for the regression analysis of pooled data under practically all parametric models and can be used to directly account for the effects of measurement error. Through simulation, it is shown that the proposed approach can accurately and efficiently estimate all unknown parameters and is more computational efficient than existing techniques. This new methodology is further illustrated using monocyte chemotactic protein-1 data collected by the Collaborative Perinatal Project in an effort to assess the relationship between this chemokine and the risk of miscarriage. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Kimura, Yayoi; Yanagimachi, Masakatsu; Ino, Yoko; Aketagawa, Mao; Matsuo, Michie; Okayama, Akiko; Shimizu, Hiroyuki; Oba, Kunihiro; Morioka, Ichiro; Imagawa, Tomoyuki; Kaneko, Tetsuji; Yokota, Shumpei; Hirano, Hisashi; Mori, Masaaki
2017-01-01
Kawasaki disease (KD) is a systemic vasculitis and childhood febrile disease that can lead to cardiovascular complications. The diagnosis of KD depends on its clinical features, and thus it is sometimes difficult to make a definitive diagnosis. In order to identify diagnostic serum biomarkers for KD, we explored serum KD-related proteins, which differentially expressed during the acute and recovery phases of two patients by mass spectrometry (MS). We identified a total of 1,879 proteins by MS-based proteomic analysis. The levels of three of these proteins, namely lipopolysaccharide-binding protein (LBP), leucine-rich alpha-2-glycoprotein (LRG1), and angiotensinogen (AGT), were higher in acute phase patients. In contrast, the level of retinol-binding protein 4 (RBP4) was decreased. To confirm the usefulness of these proteins as biomarkers, we analyzed a total of 270 samples, including those collected from 55 patients with acute phase KD, by using western blot analysis and microarray enzyme-linked immunosorbent assays (ELISAs). Over the course of this experiment, we determined that the expression level of these proteins changes specifically in the acute phase of KD, rather than the recovery phase of KD or other febrile illness. Thus, LRG1 could be used as biomarkers to facilitate KD diagnosis based on clinical features. PMID:28262744
Conditional Probability Analysis: A Statistical Tool for Environmental Analysis.
The use and application of environmental conditional probability analysis (CPA) is relatively recent. The first presentation using CPA was made in 2002 at the New England Association of Environmental Biologists Annual Meeting in Newport. Rhode Island. CPA has been used since the...
A statistical test for outlier identification in data envelopment analysis
Directory of Open Access Journals (Sweden)
Morteza Khodabin
2010-09-01
Full Text Available In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these ‘‘deterministic’’ frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the presented method, each observation is deleted from the sample once and the resulting linear program is solved, leading to a distribution of efficiency estimates. Based on the achieved distribution, a pared test is designed to identify the potential outlier(s. We illustrate the method through a real data set. The method could be used in a first step, as an exploratory data analysis, before using any frontier estimation.
Statistical analysis of the determinations of the Sun's Galactocentric distance
Malkin, Zinovy
2013-02-01
Based on several tens of R0 measurements made during the past two decades, several studies have been performed to derive the best estimate of R0. Some used just simple averaging to derive a result, whereas others provided comprehensive analyses of possible errors in published results. In either case, detailed statistical analyses of data used were not performed. However, a computation of the best estimates of the Galactic rotation constants is not only an astronomical but also a metrological task. Here we perform an analysis of 53 R0 measurements (published in the past 20 years) to assess the consistency of the data. Our analysis shows that they are internally consistent. It is also shown that any trend in the R0 estimates from the last 20 years is statistically negligible, which renders the presence of a bandwagon effect doubtful. On the other hand, the formal errors in the published R0 estimates improve significantly with time.
Statistical analysis of first period of operation of FTU Tokamak
International Nuclear Information System (INIS)
Crisanti, F.; Apruzzese, G.; Frigione, D.; Kroegler, H.; Lovisetto, L.; Mazzitelli, G.; Podda, S.
1996-09-01
On the FTU Tokamak the plasma physics operations started on the 20/4/90. The first plasma had a plasma current Ip=0.75 MA for about a second. The experimental phase lasted until 7/7/94, when a long shut-down begun for installing the toroidal limiter in the inner side of the vacuum vessel. In these four years of operations plasma experiments have been successfully exploited, e.g. experiments of single and multiple pellet injections; full current drive up to Ip=300 KA was obtained by using waves at the frequency of the Lower Hybrid; analysis of ohmic plasma parameters with different materials (from the low Z silicon to high Z tungsten) as plasma facing element was performed. In this work a statistical analysis of the full period of operation is presented. Moreover, a comparison with the statistical data from other Tokamaks is attempted
Statistics in experimental design, preprocessing, and analysis of proteomics data.
Jung, Klaus
2011-01-01
High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), yield usually high-dimensional data sets of expression values for hundreds or thousands of proteins which are, however, observed on only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to receive tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated. In particular, focus is put on studies for the detection of differentially regulated proteins. Furthermore, issues of sample size planning, statistical analysis of expression levels as well as methods for data preprocessing are covered.
Statistical analysis of RHIC beam position monitors performance
Calaga, R.; Tomás, R.
2004-04-01
A detailed statistical analysis of beam position monitors (BPM) performance at RHIC is a critical factor in improving regular operations and future runs. Robust identification of malfunctioning BPMs plays an important role in any orbit or turn-by-turn analysis. Singular value decomposition and Fourier transform methods, which have evolved as powerful numerical techniques in signal processing, will aid in such identification from BPM data. This is the first attempt at RHIC to use a large set of data to statistically enhance the capability of these two techniques and determine BPM performance. A comparison from run 2003 data shows striking agreement between the two methods and hence can be used to improve BPM functioning at RHIC and possibly other accelerators.
Statistical analysis of RHIC beam position monitors performance
Directory of Open Access Journals (Sweden)
R. Calaga
2004-04-01
Full Text Available A detailed statistical analysis of beam position monitors (BPM performance at RHIC is a critical factor in improving regular operations and future runs. Robust identification of malfunctioning BPMs plays an important role in any orbit or turn-by-turn analysis. Singular value decomposition and Fourier transform methods, which have evolved as powerful numerical techniques in signal processing, will aid in such identification from BPM data. This is the first attempt at RHIC to use a large set of data to statistically enhance the capability of these two techniques and determine BPM performance. A comparison from run 2003 data shows striking agreement between the two methods and hence can be used to improve BPM functioning at RHIC and possibly other accelerators.
Yun, Yong-Huan; Deng, Bai-Chuan; Cao, Dong-Sheng; Wang, Wei-Ting; Liang, Yi-Zeng
2016-03-10
Biomarker discovery is one important goal in metabolomics, which is typically modeled as selecting the most discriminating metabolites for classification and often referred to as variable importance analysis or variable selection. Until now, a number of variable importance analysis methods to discover biomarkers in the metabolomics studies have been proposed. However, different methods are mostly likely to generate different variable ranking results due to their different principles. Each method generates a variable ranking list just as an expert presents an opinion. The problem of inconsistency between different variable ranking methods is often ignored. To address this problem, a simple and ideal solution is that every ranking should be taken into account. In this study, a strategy, called rank aggregation, was employed. It is an indispensable tool for merging individual ranking lists into a single "super"-list reflective of the overall preference or importance within the population. This "super"-list is regarded as the final ranking for biomarker discovery. Finally, it was used for biomarkers discovery and selecting the best variable subset with the highest predictive classification accuracy. Nine methods were used, including three univariate filtering and six multivariate methods. When applied to two metabolic datasets (Childhood overweight dataset and Tubulointerstitial lesions dataset), the results show that the performance of rank aggregation has improved greatly with higher prediction accuracy compared with using all variables. Moreover, it is also better than penalized method, least absolute shrinkage and selectionator operator (LASSO), with higher prediction accuracy or less number of selected variables which are more interpretable. Copyright © 2016 Elsevier B.V. All rights reserved.
Common pitfalls in statistical analysis: Odds versus risk
Ranganathan, Priya; Aggarwal, Rakesh; Pramesh, C. S.
2015-01-01
In biomedical research, we are often interested in quantifying the relationship between an exposure and an outcome. “Odds” and “Risk” are the most common terms which are used as measures of association between variables. In this article, which is the fourth in the series of common pitfalls in statistical analysis, we explain the meaning of risk and odds and the difference between the two. PMID:26623395
Statistical and machine learning approaches for network analysis
Dehmer, Matthias
2012-01-01
Explore the multidisciplinary nature of complex networks through machine learning techniques Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph measures for classification. By providing different approaches based on experimental data, the book uniquely sets itself apart from the current literature by exploring the application of machine learning techniques to various types of complex networks. Comprised of chapters written by internation
Statistical Challenges of Big Data Analysis in Medicine
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2015-01-01
Roč. 3, č. 1 (2015), s. 24-27 ISSN 1805-8698 R&D Projects: GA ČR GA13-23940S Grant - others:CESNET Development Fund(CZ) 494/2013 Institutional support: RVO:67985807 Keywords : big data * variable selection * classification * cluster analysis Subject RIV: BB - Applied Statistics, Operational Research http://www.ijbh.org/ijbh2015-1.pdf
Research and Development on Food Nutrition Statistical Analysis Software System
Du Li; Ke Yun
2013-01-01
Designing and developing a set of food nutrition component statistical analysis software can realize the automation of nutrition calculation, improve the nutrition processional professional’s working efficiency and achieve the informatization of the nutrition propaganda and education. In the software development process, the software engineering method and database technology are used to calculate the human daily nutritional intake and the intelligent system is used to evaluate the user’s hea...
Maximum Likelihood, Consistency and Data Envelopment Analysis: A Statistical Foundation
Rajiv D. Banker
1993-01-01
This paper provides a formal statistical basis for the efficiency evaluation techniques of data envelopment analysis (DEA). DEA estimators of the best practice monotone increasing and concave production function are shown to be also maximum likelihood estimators if the deviation of actual output from the efficient output is regarded as a stochastic variable with a monotone decreasing probability density function. While the best practice frontier estimator is biased below the theoretical front...
Lifetime statistics of quantum chaos studied by a multiscale analysis
Di Falco, A.
2012-04-30
In a series of pump and probe experiments, we study the lifetime statistics of a quantum chaotic resonator when the number of open channels is greater than one. Our design embeds a stadium billiard into a two dimensional photonic crystal realized on a silicon-on-insulator substrate. We calculate resonances through a multiscale procedure that combines energy landscape analysis and wavelet transforms. Experimental data is found to follow the universal predictions arising from random matrix theory with an excellent level of agreement.
Statistical Analysis of the Exchange Rate of Bitcoin
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate. PMID:26222702
Statistical Analysis of the Exchange Rate of Bitcoin.
Directory of Open Access Journals (Sweden)
Jeffrey Chu
Full Text Available Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
Statistical Analysis of the Exchange Rate of Bitcoin.
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
Analysis of spectral data with rare events statistics
International Nuclear Information System (INIS)
Ilyushchenko, V.I.; Chernov, N.I.
1990-01-01
The case is considered of analyzing experimental data, when the results of individual experimental runs cannot be summed due to large systematic errors. A statistical analysis of the hypothesis about the persistent peaks in the spectra has been performed by means of the Neyman-Pearson test. The computations demonstrate the confidence level for the hypothesis about the presence of a persistent peak in the spectrum is proportional to the square root of the number of independent experimental runs, K. 5 refs
Australasian Resuscitation In Sepsis Evaluation trial statistical analysis plan.
Delaney, Anthony; Peake, Sandra L; Bellomo, Rinaldo; Cameron, Peter; Holdgate, Anna; Howe, Belinda; Higgins, Alisa; Presneill, Jeffrey; Webb, Steve
2013-10-01
The Australasian Resuscitation In Sepsis Evaluation (ARISE) study is an international, multicentre, randomised, controlled trial designed to evaluate the effectiveness of early goal-directed therapy compared with standard care for patients presenting to the ED with severe sepsis. In keeping with current practice, and taking into considerations aspects of trial design and reporting specific to non-pharmacologic interventions, this document outlines the principles and methods for analysing and reporting the trial results. The document is prepared prior to completion of recruitment into the ARISE study, without knowledge of the results of the interim analysis conducted by the data safety and monitoring committee and prior to completion of the two related international studies. The statistical analysis plan was designed by the ARISE chief investigators, and reviewed and approved by the ARISE steering committee. The data collected by the research team as specified in the study protocol, and detailed in the study case report form were reviewed. Information related to baseline characteristics, characteristics of delivery of the trial interventions, details of resuscitation and other related therapies, and other relevant data are described with appropriate comparisons between groups. The primary, secondary and tertiary outcomes for the study are defined, with description of the planned statistical analyses. A statistical analysis plan was developed, along with a trial profile, mock-up tables and figures. A plan for presenting baseline characteristics, microbiological and antibiotic therapy, details of the interventions, processes of care and concomitant therapies, along with adverse events are described. The primary, secondary and tertiary outcomes are described along with identification of subgroups to be analysed. A statistical analysis plan for the ARISE study has been developed, and is available in the public domain, prior to the completion of recruitment into the
Precision Statistical Analysis of Images Based on Brightness Distribution
Directory of Open Access Journals (Sweden)
Muzhir Shaban Al-Ani
2017-07-01
Full Text Available Study the content of images is considered an important topic in which reasonable and accurate analysis of images are generated. Recently image analysis becomes a vital field because of huge number of images transferred via transmission media in our daily life. These crowded media with images lead to highlight in research area of image analysis. In this paper, the implemented system is passed into many steps to perform the statistical measures of standard deviation and mean values of both color and grey images. Whereas the last step of the proposed method concerns to compare the obtained results in different cases of the test phase. In this paper, the statistical parameters are implemented to characterize the content of an image and its texture. Standard deviation, mean and correlation values are used to study the intensity distribution of the tested images. Reasonable results are obtained for both standard deviation and mean value via the implementation of the system. The major issue addressed in the work is concentrated on brightness distribution via statistical measures applying different types of lighting.
SAS and R data management, statistical analysis, and graphics
Kleinman, Ken
2009-01-01
An All-in-One Resource for Using SAS and R to Carry out Common TasksProvides a path between languages that is easier than reading complete documentationSAS and R: Data Management, Statistical Analysis, and Graphics presents an easy way to learn how to perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferential procedures, regression analysis, and the creation of graphics, along with more complex applicat
Neutron activation and statistical analysis of pottery from Thera, Greece
International Nuclear Information System (INIS)
Kilikoglou, V.; Grimanis, A.P.; Karayannis, M.I.
1990-01-01
Neutron activation analysis, in combination with multivariate analysis of the generated data, was used for the chemical characterization of prehistoric pottery from the Greek islands of Thera, Melos (islands with similar geology) and Crete. The statistical procedure which proved that Theran pottery could be distinguished from Melian is described. This discrimination, attained for the first time, was mainly based on the concentrations of the trace elements Sm, Yb, Lu and Cr. Also, Cretan imports to both Thera and Melos were clearly separable from local products. (author) 22 refs.; 1 fig.; 4 tabs
Statistical Analysis of Hypercalcaemia Data related to Transferability
DEFF Research Database (Denmark)
Frølich, Anne; Nielsen, Bo Friis
2005-01-01
In this report we describe statistical analysis related to a study of hypercalcaemia carried out in the Copenhagen area in the ten year period from 1984 to 1994. Results from the study have previously been publised in a number of papers [3, 4, 5, 6, 7, 8, 9] and in various abstracts and posters...... at conferences during the late eighties and early nineties. In this report we give a more detailed description of many of the analysis and provide some new results primarily by simultaneous studies of several databases....
Analysis of Preference Data Using Intermediate Test Statistic Abstract
African Journals Online (AJOL)
PROF. O. E. OSUAGWU
2013-06-01
Jun 1, 2013 ... West African Journal of Industrial and Academic Research Vol.7 No. 1 June ... Keywords:-Preference data, Friedman statistic, multinomial test statistic, intermediate test statistic. ... new method and consequently a new statistic ...
Statistical Analysis of 30 Years Rainfall Data: A Case Study
Arvind, G.; Ashok Kumar, P.; Girish Karthi, S.; Suribabu, C. R.
2017-07-01
Rainfall is a prime input for various engineering design such as hydraulic structures, bridges and culverts, canals, storm water sewer and road drainage system. The detailed statistical analysis of each region is essential to estimate the relevant input value for design and analysis of engineering structures and also for crop planning. A rain gauge station located closely in Trichy district is selected for statistical analysis where agriculture is the prime occupation. The daily rainfall data for a period of 30 years is used to understand normal rainfall, deficit rainfall, Excess rainfall and Seasonal rainfall of the selected circle headquarters. Further various plotting position formulae available is used to evaluate return period of monthly, seasonally and annual rainfall. This analysis will provide useful information for water resources planner, farmers and urban engineers to assess the availability of water and create the storage accordingly. The mean, standard deviation and coefficient of variation of monthly and annual rainfall was calculated to check the rainfall variability. From the calculated results, the rainfall pattern is found to be erratic. The best fit probability distribution was identified based on the minimum deviation between actual and estimated values. The scientific results and the analysis paved the way to determine the proper onset and withdrawal of monsoon results which were used for land preparation and sowing.
Validation of statistical models for creep rupture by parametric analysis
Energy Technology Data Exchange (ETDEWEB)
Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)
2012-01-15
Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. - Highlights: Black-Right-Pointing-Pointer The paper discusses the validation of creep rupture models derived from statistical analysis. Black-Right-Pointing-Pointer It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. Black-Right-Pointing-Pointer The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. Black-Right-Pointing-Pointer The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).
Kukla, Piotr; Kosior, Dariusz A; Tomaszewski, Andrzej; Ptaszyńska-Kopczyńska, Katarzyna; Widejko, Katarzyna; Długopolski, Robert; Skrzyński, Andrzej; Błaszczak, Piotr; Fijorek, Kamil; Kurzyna, Marcin
2017-07-01
Electrocardiography (ECG) is still one of the first tests performed at admission, mostly in patients (pts) with chest pain or dyspnea. The aim of this study was to assess the correlation between electrocardiographic abnormalities and cardiac biomarkers as well as echocardiographic parameter in patients with acute pulmonary embolism. We performed a retrospective analysis of 614 pts. (F/M 334/280; mean age of 67.9 ± 16.6 years) with confirmed acute pulmonary embolism (APE) who were enrolled to the ZATPOL-2 Registry between 2012 and 2014. Elevated cardiac biomarkers were observed in 358 pts (74.4%). In this group the presence of atrial fibrillation (p = .008), right axis deviation (p = .004), S 1 Q 3 T 3 sign (p electrocardiogram" were as follows: increased heart rate (OR 1.09, 95% CI 1.02-1.17, p = .012), elevated troponin concentration (OR 3.33, 95% CI 1.94-5.72, p = .000), and right ventricular overload (OR 2.30, 95% CI 1.17-4.53, p = .016). Electrocardiographic signs of right ventricular strain are strongly related to elevated cardiac biomarkers and echocardiographic signs of right ventricular overload. ECG may be used in preliminary risk stratification of patient with intermediate- or high-risk forms of APE. © 2017 Wiley Periodicals, Inc.
Multivariate statistical analysis a high-dimensional approach
Serdobolskii, V
2000-01-01
In the last few decades the accumulation of large amounts of in formation in numerous applications. has stimtllated an increased in terest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a de ficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimat ing the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivari ate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution in dependence of data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommen ...
Statistical wind analysis for near-space applications
Roney, Jason A.
2007-09-01
Statistical wind models were developed based on the existing observational wind data for near-space altitudes between 60 000 and 100 000 ft (18 30 km) above ground level (AGL) at two locations, Akon, OH, USA, and White Sands, NM, USA. These two sites are envisioned as playing a crucial role in the first flights of high-altitude airships. The analysis shown in this paper has not been previously applied to this region of the stratosphere for such an application. Standard statistics were compiled for these data such as mean, median, maximum wind speed, and standard deviation, and the data were modeled with Weibull distributions. These statistics indicated, on a yearly average, there is a lull or a “knee” in the wind between 65 000 and 72 000 ft AGL (20 22 km). From the standard statistics, trends at both locations indicated substantial seasonal variation in the mean wind speed at these heights. The yearly and monthly statistical modeling indicated that Weibull distributions were a reasonable model for the data. Forecasts and hindcasts were done by using a Weibull model based on 2004 data and comparing the model with the 2003 and 2005 data. The 2004 distribution was also a reasonable model for these years. Lastly, the Weibull distribution and cumulative function were used to predict the 50%, 95%, and 99% winds, which are directly related to the expected power requirements of a near-space station-keeping airship. These values indicated that using only the standard deviation of the mean may underestimate the operational conditions.
Bayesian Sensitivity Analysis of Statistical Models with Missing Data.
Zhu, Hongtu; Ibrahim, Joseph G; Tang, Niansheng
2014-04-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.
STATISTICAL ANALYSIS OF SPORT MOVEMENT OBSERVATIONS: THE CASE OF ORIENTEERING
Directory of Open Access Journals (Sweden)
K. Amouzandeh
2017-09-01
Full Text Available Study of movement observations is becoming more popular in several applications. Particularly, analyzing sport movement time series has been considered as a demanding area. However, most of the attempts made on analyzing movement sport data have focused on spatial aspects of movement to extract some movement characteristics, such as spatial patterns and similarities. This paper proposes statistical analysis of sport movement observations, which refers to analyzing changes in the spatial movement attributes (e.g. distance, altitude and slope and non-spatial movement attributes (e.g. speed and heart rate of athletes. As the case study, an example dataset of movement observations acquired during the “orienteering” sport is presented and statistically analyzed.
Noise removing in encrypted color images by statistical analysis
Islam, N.; Puech, W.
2012-03-01
Cryptographic techniques are used to secure confidential data from unauthorized access but these techniques are very sensitive to noise. A single bit change in encrypted data can have catastrophic impact over the decrypted data. This paper addresses the problem of removing bit error in visual data which are encrypted using AES algorithm in the CBC mode. In order to remove the noise, a method is proposed which is based on the statistical analysis of each block during the decryption. The proposed method exploits local statistics of the visual data and confusion/diffusion properties of the encryption algorithm to remove the errors. Experimental results show that the proposed method can be used at the receiving end for the possible solution for noise removing in visual data in encrypted domain.
Statistical Analysis Of Failure Strength Of Material Using Weibull Distribution
International Nuclear Information System (INIS)
Entin Hartini; Mike Susmikanti; Antonius Sitompul
2008-01-01
In evaluation of ceramic and glass materials strength a statistical approach is necessary Strength of ceramic and glass depend on its measure and size distribution of flaws in these material. The distribution of strength for ductile material is narrow and close to a Gaussian distribution while strength of brittle materials as ceramic and glass following Weibull distribution. The Weibull distribution is an indicator of the failure of material strength resulting from a distribution of flaw size. In this paper, cumulative probability of material strength to failure probability, cumulative probability of failure versus fracture stress and cumulative probability of reliability of material were calculated. Statistical criteria calculation supporting strength analysis of Silicon Nitride material were done utilizing MATLAB. (author)
Multivariate statistical pattern recognition system for reactor noise analysis
International Nuclear Information System (INIS)
Gonzalez, R.C.; Howington, L.C.; Sides, W.H. Jr.; Kryter, R.C.
1976-01-01
A multivariate statistical pattern recognition system for reactor noise analysis was developed. The basis of the system is a transformation for decoupling correlated variables and algorithms for inferring probability density functions. The system is adaptable to a variety of statistical properties of the data, and it has learning, tracking, and updating capabilities. System design emphasizes control of the false-alarm rate. The ability of the system to learn normal patterns of reactor behavior and to recognize deviations from these patterns was evaluated by experiments at the ORNL High-Flux Isotope Reactor (HFIR). Power perturbations of less than 0.1 percent of the mean value in selected frequency ranges were detected by the system
Reactor noise analysis by statistical pattern recognition methods
International Nuclear Information System (INIS)
Howington, L.C.; Gonzalez, R.C.
1976-01-01
A multivariate statistical pattern recognition system for reactor noise analysis is presented. The basis of the system is a transformation for decoupling correlated variables and algorithms for inferring probability density functions. The system is adaptable to a variety of statistical properties of the data, and it has learning, tracking, updating, and data compacting capabilities. System design emphasizes control of the false-alarm rate. Its abilities to learn normal patterns, to recognize deviations from these patterns, and to reduce the dimensionality of data with minimum error were evaluated by experiments at the Oak Ridge National Laboratory (ORNL) High-Flux Isotope Reactor. Power perturbations of less than 0.1 percent of the mean value in selected frequency ranges were detected by the pattern recognition system
Multivariate statistical pattern recognition system for reactor noise analysis
International Nuclear Information System (INIS)
Gonzalez, R.C.; Howington, L.C.; Sides, W.H. Jr.; Kryter, R.C.
1975-01-01
A multivariate statistical pattern recognition system for reactor noise analysis was developed. The basis of the system is a transformation for decoupling correlated variables and algorithms for inferring probability density functions. The system is adaptable to a variety of statistical properties of the data, and it has learning, tracking, and updating capabilities. System design emphasizes control of the false-alarm rate. The ability of the system to learn normal patterns of reactor behavior and to recognize deviations from these patterns was evaluated by experiments at the ORNL High-Flux Isotope Reactor (HFIR). Power perturbations of less than 0.1 percent of the mean value in selected frequency ranges were detected by the system. 19 references
Statistical Analysis of Sport Movement Observations: the Case of Orienteering
Amouzandeh, K.; Karimipour, F.
2017-09-01
Study of movement observations is becoming more popular in several applications. Particularly, analyzing sport movement time series has been considered as a demanding area. However, most of the attempts made on analyzing movement sport data have focused on spatial aspects of movement to extract some movement characteristics, such as spatial patterns and similarities. This paper proposes statistical analysis of sport movement observations, which refers to analyzing changes in the spatial movement attributes (e.g. distance, altitude and slope) and non-spatial movement attributes (e.g. speed and heart rate) of athletes. As the case study, an example dataset of movement observations acquired during the "orienteering" sport is presented and statistically analyzed.
Statistical Mechanics Analysis of ATP Binding to a Multisubunit Enzyme
International Nuclear Information System (INIS)
Zhang Yun-Xin
2014-01-01
Due to inter-subunit communication, multisubunit enzymes usually hydrolyze ATP in a concerted fashion. However, so far the principle of this process remains poorly understood. In this study, from the viewpoint of statistical mechanics, a simple model is presented. In this model, we assume that the binding of ATP will change the potential of the corresponding enzyme subunit, and the degree of this change depends on the state of its adjacent subunits. The probability of enzyme in a given state satisfies the Boltzmann's distribution. Although it looks much simple, this model can fit the recent experimental data of chaperonin TRiC/CCT well. From this model, the dominant state of TRiC/CCT can be obtained. This study provide a new way to understand biophysical processe by statistical mechanics analysis. (interdisciplinary physics and related areas of science and technology)
Statistical methods for data analysis in particle physics
AUTHOR|(CDS)2070643
2015-01-01
This concise set of course-based notes provides the reader with the main concepts and tools to perform statistical analysis of experimental data, in particular in the field of high-energy physics (HEP). First, an introduction to probability theory and basic statistics is given, mainly as reminder from advanced undergraduate studies, yet also in view to clearly distinguish the Frequentist versus Bayesian approaches and interpretations in subsequent applications. More advanced concepts and applications are gradually introduced, culminating in the chapter on upper limits as many applications in HEP concern hypothesis testing, where often the main goal is to provide better and better limits so as to be able to distinguish eventually between competing hypotheses or to rule out some of them altogether. Many worked examples will help newcomers to the field and graduate students to understand the pitfalls in applying theoretical concepts to actual data
Statistical analysis of subjective preferences for video enhancement
Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli
2010-02-01
Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.
Statistical Analysis of Radio Propagation Channel in Ruins Environment
Directory of Open Access Journals (Sweden)
Jiao He
2015-01-01
Full Text Available The cellphone based localization system for search and rescue in complex high density ruins has attracted a great interest in recent years, where the radio channel characteristics are critical for design and development of such a system. This paper presents a spatial smoothing estimation via rotational invariance technique (SS-ESPRIT for radio channel characterization of high density ruins. The radio propagations at three typical mobile communication bands (0.9, 1.8, and 2 GHz are investigated in two different scenarios. Channel parameters, such as arrival time, delays, and complex amplitudes, are statistically analyzed. Furthermore, a channel simulator is built based on these statistics. By comparison analysis of average excess delay and delay spread, the validation results show a good agreement between the measurements and channel modeling results.
Statistical analysis of the spatial distribution of galaxies and clusters
International Nuclear Information System (INIS)
Cappi, Alberto
1993-01-01
This thesis deals with the analysis of the distribution of galaxies and clusters, describing some observational problems and statistical results. First chapter gives a theoretical introduction, aiming to describe the framework of the formation of structures, tracing the history of the Universe from the Planck time, t_p = 10"-"4"3 sec and temperature corresponding to 10"1"9 GeV, to the present epoch. The most usual statistical tools and models of the galaxy distribution, with their advantages and limitations, are described in chapter two. A study of the main observed properties of galaxy clustering, together with a detailed statistical analysis of the effects of selecting galaxies according to apparent magnitude or diameter, is reported in chapter three. Chapter four delineates some properties of groups of galaxies, explaining the reasons of discrepant results on group distributions. Chapter five is a study of the distribution of galaxy clusters, with different statistical tools, like correlations, percolation, void probability function and counts in cells; it is found the same scaling-invariant behaviour of galaxies. Chapter six describes our finding that rich galaxy clusters too belong to the fundamental plane of elliptical galaxies, and gives a discussion of its possible implications. Finally chapter seven reviews the possibilities offered by multi-slit and multi-fibre spectrographs, and I present some observational work on nearby and distant galaxy clusters. In particular, I show the opportunities offered by ongoing surveys of galaxies coupled with multi-object fibre spectrographs, focusing on the ESO Key Programme A galaxy redshift survey in the south galactic pole region to which I collaborate and on MEFOS, a multi-fibre instrument with automatic positioning. Published papers related to the work described in this thesis are reported in the last appendix. (author) [fr
STATISTICAL ANALYSIS OF TANK 18F FLOOR SAMPLE RESULTS
Energy Technology Data Exchange (ETDEWEB)
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 18F as per the statistical sampling plan developed by Shine [1]. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL [2]. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis samples results [3] to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL{sub 95%}) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 18F. The uncertainty is quantified in this report by an upper 95% confidence limit (UCL{sub 95%}) on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL{sub 95%} was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
Statistical Analysis Of Tank 19F Floor Sample Results
International Nuclear Information System (INIS)
Harris, S.
2010-01-01
Representative sampling has been completed for characterization of the residual material on the floor of Tank 19F as per the statistical sampling plan developed by Harris and Shine. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis samples results to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current scrape sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 19F. The uncertainty is quantified in this report by an UCL95% on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
Energy Technology Data Exchange (ETDEWEB)
Fossi, M. C.; Mori, G.; Baroni, D.; Bianchi, N. [Siena Univ., Siena (Italy). Dipt. di Scienze Ambientali
2001-08-01
Toxicological risk assessment in the Orbetello lagoon (Grosseto, Italy) was carried by two approaches: biomonitoring based on estimates of residue levels in indicator species and biomarkers studies by which their responses to chemical and environmental stress were evaluated. In specimens of Carcinus aestuarii sampled in three differently impacted areas of the lagoon, levels of chlorinated hydrocarbons (DDTs, PCBs and HCBs), heavy metals (Pb, Cd and Hg) and 3 specific biomarkers (mixed function oxidase (MFO) induction, butyrylcholinesterase (BChE) inhibition and porphyrin accumulation) were measured. Overall results indicate that the lagoon is highly polluted. Of the three study sites, the highest concentrations of HCBs, DDTs and PCBs were observed in specimens from the mouth of the river Albegna, in which butyrylcholinesterase induction usually attributed to organophosphates (OPs) and carbamates (CBs), was considerable, as well. Specimens from S. Liberata, once known to be the most pristine site, showed clear signs of environmental degradation with high levels of Pb, Cd and organochlorine compounds, including PCBs. Benzopyrene monooxygenase (BPMO) values also seem to confirm such chemical stress. High levels of Hg and largely accumulated protoporphyrins and total porphyrins in C. aestuarii of the Sitoco site are only partially ascribed to the occurrence of Hg, as the presence of some unknown xenobiotics is likely. [Italian] In questo studio e' stato valutato il potenziale pericolo di composti inquinanti su una comunita' naturale della Laguna di Orbetello (Grosseto) utilizzando sia indagini di biomonitoraggio basate sulla stima dei livelli di residui in organismi bioindicatori, si una metodologia innovativa come lo studio di biomarkers (intendendo con cio' la valutazione delle risposte che un organismo genera nei confronti di uno strss chimico-ambientale). Su esemplari di Carcinus aestuarii, scelti come organismi bioindicatori e campionati in tre aree
Irigoyen, Antonio; Jimenez-Luna, Cristina; Benavides, Manuel; Caba, Octavio; Gallego, Javier; Ortuño, Francisco Manuel; Guillen-Ponce, Carmen; Rojas, Ignacio; Aranda, Enrique; Torres, Carolina; Prados, Jose
2018-01-01
Applying differentially expressed genes (DEGs) to identify feasible biomarkers in diseases can be a hard task when working with heterogeneous datasets. Expression data are strongly influenced by technology, sample preparation processes, and/or labeling methods. The proliferation of different microarray platforms for measuring gene expression increases the need to develop models able to compare their results, especially when different technologies can lead to signal values that vary greatly. Integrative meta-analysis can significantly improve the reliability and robustness of DEG detection. The objective of this work was to develop an integrative approach for identifying potential cancer biomarkers by integrating gene expression data from two different platforms. Pancreatic ductal adenocarcinoma (PDAC), where there is an urgent need to find new biomarkers due its late diagnosis, is an ideal candidate for testing this technology. Expression data from two different datasets, namely Affymetrix and Illumina (18 and 36 PDAC patients, respectively), as well as from 18 healthy controls, was used for this study. A meta-analysis based on an empirical Bayesian methodology (ComBat) was then proposed to integrate these datasets. DEGs were finally identified from the integrated data by using the statistical programming language R. After our integrative meta-analysis, 5 genes were commonly identified within the individual analyses of the independent datasets. Also, 28 novel genes that were not reported by the individual analyses ('gained' genes) were also discovered. Several of these gained genes have been already related to other gastroenterological tumors. The proposed integrative meta-analysis has revealed novel DEGs that may play an important role in PDAC and could be potential biomarkers for diagnosing the disease.
Statistical analysis plan for the EuroHYP-1 trial
DEFF Research Database (Denmark)
Winkel, Per; Bath, Philip M; Gluud, Christian
2017-01-01
Score; (4) brain infarct size at 48 +/-24 hours; (5) EQ-5D-5 L score, and (6) WHODAS 2.0 score. Other outcomes are: the primary safety outcome serious adverse events; and the incremental cost-effectiveness, and cost utility ratios. The analysis sets include (1) the intention-to-treat population, and (2...... outcome), logistic regression (binary outcomes), general linear model (continuous outcomes), and the Poisson or negative binomial model (rate outcomes). DISCUSSION: Major adjustments compared with the original statistical analysis plan encompass: (1) adjustment of analyses by nationality; (2) power......) the per protocol population. The sample size is estimated to 800 patients (5% type 1 and 20% type 2 errors). All analyses are adjusted for the protocol-specified stratification variables (nationality of centre), and the minimisation variables. In the analysis, we use ordinal regression (the primary...
Data and statistical methods for analysis of trends and patterns
International Nuclear Information System (INIS)
Atwood, C.L.; Gentillon, C.D.; Wilson, G.E.
1992-11-01
This report summarizes topics considered at a working meeting on data and statistical methods for analysis of trends and patterns in US commercial nuclear power plants. This meeting was sponsored by the Office of Analysis and Evaluation of Operational Data (AEOD) of the Nuclear Regulatory Commission (NRC). Three data sets are briefly described: Nuclear Plant Reliability Data System (NPRDS), Licensee Event Report (LER) data, and Performance Indicator data. Two types of study are emphasized: screening studies, to see if any trends or patterns appear to be present; and detailed studies, which are more concerned with checking the analysis assumptions, modeling any patterns that are present, and searching for causes. A prescription is given for a screening study, and ideas are suggested for a detailed study, when the data take of any of three forms: counts of events per time, counts of events per demand, and non-event data
STATISTICS. The reusable holdout: Preserving validity in adaptive data analysis.
Dwork, Cynthia; Feldman, Vitaly; Hardt, Moritz; Pitassi, Toniann; Reingold, Omer; Roth, Aaron
2015-08-07
Misapplication of statistical data analysis is a common cause of spurious discoveries in scientific research. Existing approaches to ensuring the validity of inferences drawn from data assume a fixed procedure to be performed, selected before the data are examined. In common practice, however, data analysis is an intrinsically adaptive process, with new analyses generated on the basis of data exploration, as well as the results of previous analyses on the same data. We demonstrate a new approach for addressing the challenges of adaptivity based on insights from privacy-preserving data analysis. As an application, we show how to safely reuse a holdout data set many times to validate the results of adaptively chosen analyses. Copyright © 2015, American Association for the Advancement of Science.
International Conference on Modern Problems of Stochastic Analysis and Statistics
2017-01-01
This book brings together the latest findings in the area of stochastic analysis and statistics. The individual chapters cover a wide range of topics from limit theorems, Markov processes, nonparametric methods, acturial science, population dynamics, and many others. The volume is dedicated to Valentin Konakov, head of the International Laboratory of Stochastic Analysis and its Applications on the occasion of his 70th birthday. Contributions were prepared by the participants of the international conference of the international conference “Modern problems of stochastic analysis and statistics”, held at the Higher School of Economics in Moscow from May 29 - June 2, 2016. It offers a valuable reference resource for researchers and graduate students interested in modern stochastics.
Prescott, Jeffrey William
2013-02-01
The importance of medical imaging for clinical decision making has been steadily increasing over the last four decades. Recently, there has also been an emphasis on medical imaging for preclinical decision making, i.e., for use in pharamaceutical and medical device development. There is also a drive towards quantification of imaging findings by using quantitative imaging biomarkers, which can improve sensitivity, specificity, accuracy and reproducibility of imaged characteristics used for diagnostic and therapeutic decisions. An important component of the discovery, characterization, validation and application of quantitative imaging biomarkers is the extraction of information and meaning from images through image processing and subsequent analysis. However, many advanced image processing and analysis methods are not applied directly to questions of clinical interest, i.e., for diagnostic and therapeutic decision making, which is a consideration that should be closely linked to the development of such algorithms. This article is meant to address these concerns. First, quantitative imaging biomarkers are introduced by providing definitions and concepts. Then, potential applications of advanced image processing and analysis to areas of quantitative imaging biomarker research are described; specifically, research into osteoarthritis (OA), Alzheimer's disease (AD) and cancer is presented. Then, challenges in quantitative imaging biomarker research are discussed. Finally, a conceptual framework for integrating clinical and preclinical considerations into the development of quantitative imaging biomarkers and their computer-assisted methods of extraction is presented.
Song, Yimeng; Zhong, Lijun; Zhou, Juntuo; Lu, Min; Xing, Tianying; Ma, Lulin; Shen, Jing
2017-12-01
Renal cell carcinoma (RCC) is a malignant and metastatic cancer with 95% mortality, and clear cell RCC (ccRCC) is the most observed among the five major subtypes of RCC. Specific biomarkers that can distinguish cancer tissues from adjacent normal tissues should be developed to diagnose this disease in early stages and conduct a reliable prognostic evaluation. Data-independent acquisition (DIA) strategy has been widely employed in proteomic analysis because of various advantages, including enhanced protein coverage and reliable data acquisition. In this study, a DIA workflow is constructed on a quadrupole-Orbitrap LC-MS platform to reveal dysregulated proteins between ccRCC and adjacent normal tissues. More than 4000 proteins are identified, 436 of these proteins are dysregulated in ccRCC tissues. Bioinformatic analysis reveals that multiple pathways and Gene Ontology items are strongly associated with ccRCC. The expression levels of L-lactate dehydrogenase A chain, annexin A4, nicotinamide N-methyltransferase, and perilipin-2 examined through RT-qPCR, Western blot, and immunohistochemistry confirm the validity of the proteomic analysis results. The proposed DIA workflow yields optimum time efficiency and data reliability and provides a good choice for proteomic analysis in biological and clinical studies, and these dysregulated proteins might be potential biomarkers for ccRCC diagnosis. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Acoustic and temporal analysis of speech: A potential biomarker for schizophrenia.
LENUS (Irish Health Repository)
Rapcan, Viliam
2010-11-01
Currently, there are no established objective biomarkers for the diagnosis or monitoring of schizophrenia. It has been previously reported that there are notable qualitative differences in the speech of schizophrenics. The objective of this study was to determine whether a quantitative acoustic and temporal analysis of speech may be a potential biomarker for schizophrenia. In this study, 39 schizophrenic patients and 18 controls were digitally recorded reading aloud an emotionally neutral text passage from a children\\'s story. Temporal, energy and vocal pitch features were automatically extracted from the recordings. A classifier based on linear discriminant analysis was employed to differentiate between controls and schizophrenic subjects. Processing the recordings with the algorithm developed demonstrated that it is possible to differentiate schizophrenic patients and controls with a classification accuracy of 79.4% (specificity=83.6%, sensitivity=75.2%) based on speech pause related parameters extracted from recordings carried out in standard office (non-studio) environments. Acoustic and temporal analysis of speech may represent a potential tool for the objective analysis in schizophrenia.
Consolidity analysis for fully fuzzy functions, matrices, probability and statistics
Directory of Open Access Journals (Sweden)
Walaa Ibrahim Gabr
2015-03-01
Full Text Available The paper presents a comprehensive review of the know-how for developing the systems consolidity theory for modeling, analysis, optimization and design in fully fuzzy environment. The solving of systems consolidity theory included its development for handling new functions of different dimensionalities, fuzzy analytic geometry, fuzzy vector analysis, functions of fuzzy complex variables, ordinary differentiation of fuzzy functions and partial fraction of fuzzy polynomials. On the other hand, the handling of fuzzy matrices covered determinants of fuzzy matrices, the eigenvalues of fuzzy matrices, and solving least-squares fuzzy linear equations. The approach demonstrated to be also applicable in a systematic way in handling new fuzzy probabilistic and statistical problems. This included extending the conventional probabilistic and statistical analysis for handling fuzzy random data. Application also covered the consolidity of fuzzy optimization problems. Various numerical examples solved have demonstrated that the new consolidity concept is highly effective in solving in a compact form the propagation of fuzziness in linear, nonlinear, multivariable and dynamic problems with different types of complexities. Finally, it is demonstrated that the implementation of the suggested fuzzy mathematics can be easily embedded within normal mathematics through building special fuzzy functions library inside the computational Matlab Toolbox or using other similar software languages.
FADTTS: functional analysis of diffusion tensor tract statistics.
Zhu, Hongtu; Kong, Linglong; Li, Runze; Styner, Martin; Gerig, Guido; Lin, Weili; Gilmore, John H
2011-06-01
The aim of this paper is to present a functional analysis of a diffusion tensor tract statistics (FADTTS) pipeline for delineating the association between multiple diffusion properties along major white matter fiber bundles with a set of covariates of interest, such as age, diagnostic status and gender, and the structure of the variability of these white matter tract properties in various diffusion tensor imaging studies. The FADTTS integrates five statistical tools: (i) a multivariate varying coefficient model for allowing the varying coefficient functions in terms of arc length to characterize the varying associations between fiber bundle diffusion properties and a set of covariates, (ii) a weighted least squares estimation of the varying coefficient functions, (iii) a functional principal component analysis to delineate the structure of the variability in fiber bundle diffusion properties, (iv) a global test statistic to test hypotheses of interest, and (v) a simultaneous confidence band to quantify the uncertainty in the estimated coefficient functions. Simulated data are used to evaluate the finite sample performance of FADTTS. We apply FADTTS to investigate the development of white matter diffusivities along the splenium of the corpus callosum tract and the right internal capsule tract in a clinical study of neurodevelopment. FADTTS can be used to facilitate the understanding of normal brain development, the neural bases of neuropsychiatric disorders, and the joint effects of environmental and genetic factors on white matter fiber bundles. The advantages of FADTTS compared with the other existing approaches are that they are capable of modeling the structured inter-subject variability, testing the joint effects, and constructing their simultaneous confidence bands. However, FADTTS is not crucial for estimation and reduces to the functional analysis method for the single measure. Copyright © 2011 Elsevier Inc. All rights reserved.
Kleijnen, J.P.C.
1995-01-01
This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for
Signal processing and statistical analysis of spaced-based measurements
International Nuclear Information System (INIS)
Iranpour, K.
1996-05-01
The reports deals with data obtained by the ROSE rocket project. This project was designed to investigate the low altitude auroral instabilities in the electrojet region. The spectral and statistical analyses indicate the existence of unstable waves in the ionized gas in the region. An experimentally obtained dispersion relation for these waves were established. It was demonstrated that the characteristic phase velocities are much lower than what is expected from the standard theoretical results. This analysis of the ROSE data indicate the cascading of energy from lower to higher frequencies. 44 refs., 54 figs
Statistical Analysis of Designed Experiments Theory and Applications
Tamhane, Ajit C
2012-01-01
A indispensable guide to understanding and designing modern experiments The tools and techniques of Design of Experiments (DOE) allow researchers to successfully collect, analyze, and interpret data across a wide array of disciplines. Statistical Analysis of Designed Experiments provides a modern and balanced treatment of DOE methodology with thorough coverage of the underlying theory and standard designs of experiments, guiding the reader through applications to research in various fields such as engineering, medicine, business, and the social sciences. The book supplies a foundation for the
A Statistical Analysis of Cointegration for I(2) Variables
DEFF Research Database (Denmark)
Johansen, Søren
1995-01-01
be conducted using the ¿ sup2/sup distribution. It is shown to what extent inference on the cointegration ranks can be conducted using the tables already prepared for the analysis of cointegration of I(1) variables. New tables are needed for the test statistics to control the size of the tests. This paper...... contains a multivariate test for the existence of I(2) variables. This test is illustrated using a data set consisting of U.K. and foreign prices and interest rates as well as the exchange rate....
Using R for Data Management, Statistical Analysis, and Graphics
Horton, Nicholas J
2010-01-01
This title offers quick and easy access to key element of documentation. It includes worked examples across a wide variety of applications, tasks, and graphics. "Using R for Data Management, Statistical Analysis, and Graphics" presents an easy way to learn how to perform an analytical task in R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation and vast number of add-on packages. Organized by short, clear descriptive entries, the book covers many common tasks, such as data management, descriptive summaries, inferential proc
Statistical analysis of anomalous transport in resistive interchange turbulence
International Nuclear Information System (INIS)
Sugama, Hideo; Wakatani, Masahiro.
1992-01-01
A new anomalous transport model for resistive interchange turbulence is derived from statistical analysis applying two-scale direct-interaction approximation to resistive magnetohydrodynamic equations with a gravity term. Our model is similar to the K-ε model for eddy viscosity of turbulent shear flows in that anomalous transport coefficients are expressed in terms of by the turbulent kinetic energy K and its dissipation rate ε while K and ε are determined by transport equations. This anomalous transport model can describe some nonlocal effects such as those from boundary conditions which cannot be treated by conventional models based on the transport coefficients represented by locally determined plasma parameters. (author)
Spatial Analysis Along Networks Statistical and Computational Methods
Okabe, Atsuyuki
2012-01-01
In the real world, there are numerous and various events that occur on and alongside networks, including the occurrence of traffic accidents on highways, the location of stores alongside roads, the incidence of crime on streets and the contamination along rivers. In order to carry out analyses of those events, the researcher needs to be familiar with a range of specific techniques. Spatial Analysis Along Networks provides a practical guide to the necessary statistical techniques and their computational implementation. Each chapter illustrates a specific technique, from Stochastic Point Process
Statistical analysis of the W Cyg light curve
International Nuclear Information System (INIS)
Klyus, I.A.
1983-01-01
A statistical analysis of the light curve of W Cygni has been carried out. The process of brightness variations brightness of the star is shown to be a stationary stochastic one. The hypothesis of stationarity of the process was checked at the significance level of α=0.05. Oscillations of the brightness with average durations of 131 and 250 days have been found. It is proved that oscillations are narrow-band noise, i.e. cycles. Peaks on the power spectrum corresponding to these cycles exceed 99% confidence interval. It has been stated that the oscillations are independent
CFAssay: statistical analysis of the colony formation assay
International Nuclear Information System (INIS)
Braselmann, Herbert; Michna, Agata; Heß, Julia; Unger, Kristian
2015-01-01
Colony formation assay is the gold standard to determine cell reproductive death after treatment with ionizing radiation, applied for different cell lines or in combination with other treatment modalities. Associated linear-quadratic cell survival curves can be calculated with different methods. For easy code exchange and methodological standardisation among collaborating laboratories a software package CFAssay for R (R Core Team, R: A Language and Environment for Statistical Computing, 2014) was established to perform thorough statistical analysis of linear-quadratic cell survival curves after treatment with ionizing radiation and of two-way designs of experiments with chemical treatments only. CFAssay offers maximum likelihood and related methods by default and the least squares or weighted least squares method can be optionally chosen. A test for comparision of cell survival curves and an ANOVA test for experimental two-way designs are provided. For the two presented examples estimated parameters do not differ much between maximum-likelihood and least squares. However the dispersion parameter of the quasi-likelihood method is much more sensitive for statistical variation in the data than the multiple R 2 coefficient of determination from the least squares method. The dispersion parameter for goodness of fit and different plot functions in CFAssay help to evaluate experimental data quality. As open source software interlaboratory code sharing between users is facilitated
Yokoi, Norihide; Beppu, Masayuki; Yoshida, Eri; Hoshikawa, Ritsuko; Hidaka, Shihomi; Matsubara, Toshiya; Shinohara, Masami; Irino, Yasuhiro; Hatano, Naoya; Seino, Susumu
2015-01-01
Biomarkers for the development of type 2 diabetes (T2D) are useful for prediction and intervention of the disease at earlier stages. In this study, we performed a longitudinal study of changes in metabolites using an animal model of T2D, the spontaneously diabetic Torii (SDT) rat. Fasting plasma samples of SDT and control Sprague-Dawley (SD) rats were collected from 6 to 24 weeks of age, and subjected to gas chromatography–mass spectrometry-based metabolome analysis. Fifty-nine hydrophilic me...
Implementation of statistical analysis methods for medical physics data
International Nuclear Information System (INIS)
Teixeira, Marilia S.; Pinto, Nivia G.P.; Barroso, Regina C.; Oliveira, Luis F.
2009-01-01
The objective of biomedical research with different radiation natures is to contribute for the understanding of the basic physics and biochemistry of the biological systems, the disease diagnostic and the development of the therapeutic techniques. The main benefits are: the cure of tumors through the therapy, the anticipated detection of diseases through the diagnostic, the using as prophylactic mean for blood transfusion, etc. Therefore, for the better understanding of the biological interactions occurring after exposure to radiation, it is necessary for the optimization of therapeutic procedures and strategies for reduction of radioinduced effects. The group pf applied physics of the Physics Institute of UERJ have been working in the characterization of biological samples (human tissues, teeth, saliva, soil, plants, sediments, air, water, organic matrixes, ceramics, fossil material, among others) using X-rays diffraction and X-ray fluorescence. The application of these techniques for measurement, analysis and interpretation of the biological tissues characteristics are experimenting considerable interest in the Medical and Environmental Physics. All quantitative data analysis must be initiated with descriptive statistic calculation (means and standard deviations) in order to obtain a previous notion on what the analysis will reveal. It is well known que o high values of standard deviation found in experimental measurements of biologicals samples can be attributed to biological factors, due to the specific characteristics of each individual (age, gender, environment, alimentary habits, etc). This work has the main objective the development of a program for the use of specific statistic methods for the optimization of experimental data an analysis. The specialized programs for this analysis are proprietary, another objective of this work is the implementation of a code which is free and can be shared by the other research groups. As the program developed since the
Directory of Open Access Journals (Sweden)
Astrid Wachter
2015-11-01
Full Text Available Mastering the systematic analysis of tumor tissues on a large scale has long been a technical challenge for proteomics. In 2001, reverse phase protein arrays (RPPA were added to the repertoire of existing immunoassays, which, for the first time, allowed a profiling of minute amounts of tumor lysates even after microdissection. A characteristic feature of RPPA is its outstanding sample capacity permitting the analysis of thousands of samples in parallel as a routine task. Until today, the RPPA approach has matured to a robust and highly sensitive high-throughput platform, which is ideally suited for biomarker discovery. Concomitant with technical advancements, new bioinformatic tools were developed for data normalization and data analysis as outlined in detail in this review. Furthermore, biomarker signatures obtained by different RPPA screens were compared with another or with that obtained by other proteomic formats, if possible. Options for overcoming the downside of RPPA, which is the need to steadily validate new antibody batches, will be discussed. Finally, a debate on using RPPA to advance personalized medicine will conclude this article.
Data analysis for radiological characterisation: Geostatistical and statistical complementarity
International Nuclear Information System (INIS)
Desnoyers, Yvon; Dubot, Didier
2012-01-01
Radiological characterisation may cover a large range of evaluation objectives during a decommissioning and dismantling (D and D) project: removal of doubt, delineation of contaminated materials, monitoring of the decontamination work and final survey. At each stage, collecting relevant data to be able to draw the conclusions needed is quite a big challenge. In particular two radiological characterisation stages require an advanced sampling process and data analysis, namely the initial categorization and optimisation of the materials to be removed and the final survey to demonstrate compliance with clearance levels. On the one hand the latter is widely used and well developed in national guides and norms, using random sampling designs and statistical data analysis. On the other hand a more complex evaluation methodology has to be implemented for the initial radiological characterisation, both for sampling design and for data analysis. The geostatistical framework is an efficient way to satisfy the radiological characterisation requirements providing a sound decision-making approach for the decommissioning and dismantling of nuclear premises. The relevance of the geostatistical methodology relies on the presence of a spatial continuity for radiological contamination. Thus geo-statistics provides reliable methods for activity estimation, uncertainty quantification and risk analysis, leading to a sound classification of radiological waste (surfaces and volumes). This way, the radiological characterization of contaminated premises can be divided into three steps. First, the most exhaustive facility analysis provides historical and qualitative information. Then, a systematic (exhaustive or not) surface survey of the contamination is implemented on a regular grid. Finally, in order to assess activity levels and contamination depths, destructive samples are collected at several locations within the premises (based on the surface survey results) and analysed. Combined with
Topics in statistical data analysis for high-energy physics
International Nuclear Information System (INIS)
Cowan, G.
2011-01-01
These lectures concert two topics that are becoming increasingly important in the analysis of high-energy physics data: Bayesian statistics and multivariate methods. In the Bayesian approach, we extend the interpretation of probability not only to cover the frequency of repeatable outcomes but also to include a degree of belief. In this way we are able to associate probability with a hypothesis and thus to answer directly questions that cannot be addressed easily with traditional frequentist methods. In multivariate analysis, we try to exploit as much information as possible from the characteristics that we measure for each event to distinguish between event types. In particular we will look at a method that has gained popularity in high-energy physics in recent years: the boosted decision tree. Finally, we give a brief sketch of how multivariate methods may be applied in a search for a new signal process. (author)
Using Statistical Analysis Software to Advance Nitro Plasticizer Wettability
Energy Technology Data Exchange (ETDEWEB)
Shear, Trevor Allan [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-08-29
Statistical analysis in science is an extremely powerful tool that is often underutilized. Additionally, it is frequently the case that data is misinterpreted or not used to its fullest extent. Utilizing the advanced software JMP®, many aspects of experimental design and data analysis can be evaluated and improved. This overview will detail the features of JMP® and how they were used to advance a project, resulting in time and cost savings, as well as the collection of scientifically sound data. The project analyzed in this report addresses the inability of a nitro plasticizer to coat a gold coated quartz crystal sensor used in a quartz crystal microbalance. Through the use of the JMP® software, the wettability of the nitro plasticizer was increased by over 200% using an atmospheric plasma pen, ensuring good sample preparation and reliable results.
Avoiding Pitfalls in the Statistical Analysis of Heterogeneous Tumors
Directory of Open Access Journals (Sweden)
Judith-Anne W. Chapman
2009-01-01
Full Text Available Information about tumors is usually obtained from a single assessment of a tumor sample, performed at some point in the course of the development and progression of the tumor, with patient characteristics being surrogates for natural history context. Differences between cells within individual tumors (intratumor heterogeneity and between tumors of different patients (intertumor heterogeneity may mean that a small sample is not representative of the tumor as a whole, particularly for solid tumors which are the focus of this paper. This issue is of increasing importance as high-throughput technologies generate large multi-feature data sets in the areas of genomics, proteomics, and image analysis. Three potential pitfalls in statistical analysis are discussed (sampling, cut-points, and validation and suggestions are made about how to avoid these pitfalls.
Statistical analysis of magnetically soft particles in magnetorheological elastomers
Gundermann, T.; Cremer, P.; Löwen, H.; Menzel, A. M.; Odenbach, S.
2017-04-01
The physical properties of magnetorheological elastomers (MRE) are a complex issue and can be influenced and controlled in many ways, e.g. by applying a magnetic field, by external mechanical stimuli, or by an electric potential. In general, the response of MRE materials to these stimuli is crucially dependent on the distribution of the magnetic particles inside the elastomer. Specific knowledge of the interactions between particles or particle clusters is of high relevance for understanding the macroscopic rheological properties and provides an important input for theoretical calculations. In order to gain a better insight into the correlation between the macroscopic effects and microstructure and to generate a database for theoretical analysis, x-ray micro-computed tomography (X-μCT) investigations as a base for a statistical analysis of the particle configurations were carried out. Different MREs with quantities of 2-15 wt% (0.27-2.3 vol%) of iron powder and different allocations of the particles inside the matrix were prepared. The X-μCT results were edited by an image processing software regarding the geometrical properties of the particles with and without the influence of an external magnetic field. Pair correlation functions for the positions of the particles inside the elastomer were calculated to statistically characterize the distributions of the particles in the samples.
Statistical analysis of installed wind capacity in the United States
International Nuclear Information System (INIS)
Staid, Andrea; Guikema, Seth D.
2013-01-01
There is a large disparity in the amount of wind power capacity installed in each of the states in the U.S. It is often thought that the different policies of individual state governments are the main reason for these differences, but this may not necessarily be the case. The aim of this paper is to use statistical methods to study the factors that have the most influence on the amount of installed wind capacity in each state. From this analysis, we were able to use these variables to accurately predict the installed wind capacity and to gain insight into the driving factors for wind power development and the reasons behind the differences among states. Using our best model, we find that the most important variables for explaining the amount of wind capacity have to do with the physical and geographic characteristics of the state as opposed to policies in place that favor renewable energy. - Highlights: • We conduct a statistical analysis of factors influencing wind capacity in the U.S. • We find that state policies do not strongly influence the differences among states. • Driving factors are wind resources, cropland area, and available percentage of land
Pattern recognition in menstrual bleeding diaries by statistical cluster analysis
Directory of Open Access Journals (Sweden)
Wessel Jens
2009-07-01
Full Text Available Abstract Background The aim of this paper is to empirically identify a treatment-independent statistical method to describe clinically relevant bleeding patterns by using bleeding diaries of clinical studies on various sex hormone containing drugs. Methods We used the four cluster analysis methods single, average and complete linkage as well as the method of Ward for the pattern recognition in menstrual bleeding diaries. The optimal number of clusters was determined using the semi-partial R2, the cubic cluster criterion, the pseudo-F- and the pseudo-t2-statistic. Finally, the interpretability of the results from a gynecological point of view was assessed. Results The method of Ward yielded distinct clusters of the bleeding diaries. The other methods successively chained the observations into one cluster. The optimal number of distinctive bleeding patterns was six. We found two desirable and four undesirable bleeding patterns. Cyclic and non cyclic bleeding patterns were well separated. Conclusion Using this cluster analysis with the method of Ward medications and devices having an impact on bleeding can be easily compared and categorized.
Statistical mechanical analysis of LMFBR fuel cladding tubes
International Nuclear Information System (INIS)
Poncelet, J.-P.; Pay, A.
1977-01-01
The most important design requirement on fuel pin cladding for LMFBR's is its mechanical integrity. Disruptive factors include internal pressure from mixed oxide fuel fission gas release, thermal stresses and high temperature creep, neutron-induced differential void-swelling as a source of stress in the cladding and irradiation creep of stainless steel material, corrosion by fission products. Under irradiation these load-restraining mechanisms are accentuated by stainless steel embrittlement and strength alterations. To account for the numerous uncertainties involved in the analysis by theoretical models and computer codes statistical tools are unavoidably requested, i.e. Monte Carlo simulation methods. Thanks to these techniques, uncertainties in nominal characteristics, material properties and environmental conditions can be linked up in a correct way and used for a more accurate conceptual design. First, a thermal creep damage index is set up through a sufficiently sophisticated clad physical analysis including arbitrary time dependence of power and neutron flux as well as effects of sodium temperature, burnup and steel mechanical behavior. Although this strain limit approach implies a more general but time consuming model., on the counterpart the net output is improved and e.g. clad temperature, stress and strain maxima may be easily assessed. A full spectrum of variables are statistically treated to account for their probability distributions. Creep damage probability may be obtained and can contribute to a quantitative fuel probability estimation
Evaluating biomarkers for prognostic enrichment of clinical trials.
Kerr, Kathleen F; Roth, Jeremy; Zhu, Kehao; Thiessen-Philbrook, Heather; Meisner, Allison; Wilson, Francis Perry; Coca, Steven; Parikh, Chirag R
2017-12-01
A potential use of biomarkers is to assist in prognostic enrichment of clinical trials, where only patients at relatively higher risk for an outcome of interest are eligible for the trial. We investigated methods for evaluating biomarkers for prognostic enrichment. We identified five key considerations when considering a biomarker and a screening threshold for prognostic enrichment: (1) clinical trial sample size, (2) calendar time to enroll the trial, (3) total patient screening costs and the total per-patient trial costs, (4) generalizability of trial results, and (5) ethical evaluation of trial eligibility criteria. Items (1)-(3) are amenable to quantitative analysis. We developed the Biomarker Prognostic Enrichment Tool for evaluating biomarkers for prognostic enrichment at varying levels of screening stringency. We demonstrate that both modestly prognostic and strongly prognostic biomarkers can improve trial metrics using Biomarker Prognostic Enrichment Tool. Biomarker Prognostic Enrichment Tool is available as a webtool at http://prognosticenrichment.com and as a package for the R statistical computing platform. In some clinical settings, even biomarkers with modest prognostic performance can be useful for prognostic enrichment. In addition to the quantitative analysis provided by Biomarker Prognostic Enrichment Tool, investigators must consider the generalizability of trial results and evaluate the ethics of trial eligibility criteria.
EBprot: Statistical analysis of labeling-based quantitative proteomics data.
Koh, Hiromi W L; Swa, Hannah L F; Fermin, Damian; Ler, Siok Ghee; Gunaratne, Jayantha; Choi, Hyungwon
2015-08-01
Labeling-based proteomics is a powerful method for detection of differentially expressed proteins (DEPs). The current data analysis platform typically relies on protein-level ratios, which is obtained by summarizing peptide-level ratios for each protein. In shotgun proteomics, however, some proteins are quantified with more peptides than others, and this reproducibility information is not incorporated into the differential expression (DE) analysis. Here, we propose a novel probabilistic framework EBprot that directly models the peptide-protein hierarchy and rewards the proteins with reproducible evidence of DE over multiple peptides. To evaluate its performance with known DE states, we conducted a simulation study to show that the peptide-level analysis of EBprot provides better receiver-operating characteristic and more accurate estimation of the false discovery rates than the methods based on protein-level ratios. We also demonstrate superior classification performance of peptide-level EBprot analysis in a spike-in dataset. To illustrate the wide applicability of EBprot in different experimental designs, we applied EBprot to a dataset for lung cancer subtype analysis with biological replicates and another dataset for time course phosphoproteome analysis of EGF-stimulated HeLa cells with multiplexed labeling. Through these examples, we show that the peptide-level analysis of EBprot is a robust alternative to the existing statistical methods for the DE analysis of labeling-based quantitative datasets. The software suite is freely available on the Sourceforge website http://ebprot.sourceforge.net/. All MS data have been deposited in the ProteomeXchange with identifier PXD001426 (http://proteomecentral.proteomexchange.org/dataset/PXD001426/). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Directory of Open Access Journals (Sweden)
A. Lucacchini
2011-11-01
Full Text Available Fibromyalgia (FM is characterized by the presence of chronic widespread pain throughout the musculoskeletal system and diffuse tenderness. Unfortunately, no laboratory tests have been appropriately validated for FM and correlated with the subsets and activity. The aim of this study was to apply a proteomic technique in saliva of FM patients: the Surface Enhance Laser Desorption/Ionization Time-of-Flight (SELDI-TOF. For this study, 57 FM patients and 35 HC patients were enrolled. The proteomic analysis of saliva was carried out using SELDI-TOF. The analysis was performed using different chip arrays with different characteristics of binding. The statistical analysis was performed using cluster analysis and the difference between two groups was underlined using Student’s t-test. Spectra analysis highlighted the presence of several peaks differently expressed in FM patients compared with controls. The preliminary results obtained by SELDI-TOF analysis were compared with those obtained in our previous study performed on whole saliva of FM patients by using electrophoresis. The m/z of two peaks, increased in FM patients, seem to overlap well with the molecular weight of calgranulin A and C and Rho GDP-dissociation inhibitor 2, which we had found up-regulated in our previous study. These preliminary results showed the possibility of identifying potential salivary biomarker through salivary proteomic analysis with MALDI-TOF and SELDI-TOF in FM patients. The peaks observed allow us to focus on some of the particular pathogenic aspects of FM, the oxidative stress which contradistinguishes this condition, the involvement of proteins related to the cytoskeletal arrangements, and central sensibilization.
International Nuclear Information System (INIS)
Lacombe, J.P.
1985-12-01
Statistic study of Poisson non-homogeneous and spatial processes is the first part of this thesis. A Neyman-Pearson type test is defined concerning the intensity measurement of these processes. Conditions are given for which consistency of the test is assured, and others giving the asymptotic normality of the test statistics. Then some techniques of statistic processing of Poisson fields and their applications to a particle multidetector study are given. Quality tests of the device are proposed togetherwith signal extraction methods [fr
USE OF MULTIPARAMETER ANALYSIS OF LABORATORY BIOMARKERS TO ASSESS RHEUMATOID ARTHRITIS ACTIVITY
Directory of Open Access Journals (Sweden)
A. A. Novikov
2015-01-01
Full Text Available The key component in the management of patients with rheumatoid arthritis (RA is regular control of RA activity. The quantitative assessment of a patient’s status allows the development of standardized indications for anti-rheumatic therapy.Objective: to identify the laboratory biomarkers able to reflect RA activity.Subjects and methods. Fifty-eight patients with RA and 30 age- and sex-matched healthy donors were examined. The patients were divided into high/moderate and mild disease activity groups according to DAS28. The serum concentrations of 30 biomarkers were measured using immunonephelometric assay, enzyme immunoassay, and xMAP technology.Results and discussion. Multivariate analysis could identify the factors mostly related to high/moderate RA activity according to DAS28, such as fibroblast growth factor-2, monocyte chemoattractant protein-1, interleukins (IL 1α, 6, and 15, and tumor necrosis factor-α and could create a prognostic model for RA activity assessment. ROC analysis has shown that this model has excellent diagnostic efficiency in differentiating high/moderate versus low RA activity.Conclusion. To create a subjective assessment-independent immunological multiparameter index of greater diagnostic accuracy than the laboratory parameters routinely used in clinical practice may be a qualitatively new step in assessing and monitoring RA activity.
Mass spectrum analysis of serum biomarker proteins from patients with schizophrenia.
Zhou, Na; Wang, Jie; Yu, Yaqin; Shi, Jieping; Li, Xiaokun; Xu, Bin; Yu, Qiong
2014-05-01
Diagnosis of schizophrenia does not have a clear objective test at present, so we aimed to identify the potential biomarkers for the diagnosis of schizophrenia by comparison of serum protein profiling between first-episode schizophrenia patients and healthy controls. The combination of a magnetic bead separation system with matrix-assisted laser desorption/ionization time-of-flight tandem mass spectrometry (MALDI-TOF/TOF-MS) was used to analyze the serum protein spectra of 286 first-episode patients with schizophrenia, 41 chronic disease patients and 304 healthy controls. FlexAnlysis 3.0 and ClinProTools(TM) 2.1 software was used to establish a diagnostic model for schizophrenia. The results demonstrated that 10 fragmented peptides demonstrated an optimal discriminatory performance. Among these fragmented peptides, the peptide with m/z 1206.58 was identified as a fragment of fibrinopeptide A. Receiver operating characteristic analysis for m/z 1206.58 showed that the area under the curve was 0.981 for schizophrenia vs healthy controls, and 0.999 for schizophrenia vs other chronic disease controls. From our result, we consider that the analysis of serum protein spectrum using the magnetic bead separation system and MALDI-TOF/TOF-MS is an objective diagnostic tool. We conclude that fibrinopeptide A has the potential to be a biomarker for diagnosis of schizophrenia. This protein may also help to elucidate schizophrenia disease pathogenesis. Copyright © 2013 John Wiley & Sons, Ltd.
Ahmed, Farid E
2009-03-01
Sample preparation and fractionation technologies are one of the most crucial processes in proteomic analysis and biomarker discovery in solubilized samples. Chromatographic or electrophoretic proteomic technologies are also available for separation of cellular protein components. There are, however, considerable limitations in currently available proteomic technologies as none of them allows for the analysis of the entire proteome in a simple step because of the large number of peptides, and because of the wide concentration dynamic range of the proteome in clinical blood samples. The results of any undertaken experiment depend on the condition of the starting material. Therefore, proper experimental design and pertinent sample preparation is essential to obtain meaningful results, particularly in comparative clinical proteomics in which one is looking for minor differences between experimental (diseased) and control (nondiseased) samples. This review discusses problems associated with general and specialized strategies of sample preparation and fractionation, dealing with samples that are solution or suspension, in a frozen tissue state, or formalin-preserved tissue archival samples, and illustrates how sample processing might influence detection with mass spectrometric techniques. Strategies that dramatically improve the potential for cancer biomarker discovery in minimally invasive, blood-collected human samples are also presented.
Naboulsi, Wael; Megger, Dominik A; Bracht, Thilo; Kohl, Michael; Turewicz, Michael; Eisenacher, Martin; Voss, Don Marvin; Schlaak, Jörg F; Hoffmann, Andreas-Claudius; Weber, Frank; Baba, Hideo A; Meyer, Helmut E; Sitek, Barbara
2016-01-04
Hepatocellular carcinoma (HCC) is one of the most aggressive tumors, and the treatment outcome of this disease is improved when the cancer is diagnosed at an early stage. This requires biomarkers allowing an accurate and early tumor diagnosis. To identify potential markers for such applications, we analyzed a patient cohort consisting of 50 patients (50 HCC and 50 adjacent nontumorous tissue samples as controls) using two independent proteomics approaches. We performed label-free discovery analysis on 19 HCC and corresponding tissue samples. The data were analyzed considering events known to take place in early events of HCC development, such as abnormal regulation of Wnt/b-catenin and activation of receptor tyrosine kinases (RTKs). 31 proteins were selected for verification experiments. For this analysis, the second set of the patient cohort (31 HCC and corresponding tissue samples) was analyzed using selected (multiple) reaction monitoring (SRM/MRM). We present the overexpression of ATP-dependent RNA helicase (DDX39), Fibulin-5 (FBLN5), myristoylated alanine-rich C-kinase substrate (MARCKS), and Serpin H1 (SERPINH1) in HCC for the first time. We demonstrate Versican core protein (VCAN) to be significantly associated with well differentiated and low-stage HCC. We revealed for the first time the evidence of VCAN as a potential biomarker for early-HCC diagnosis.
A meta-analysis of biomarkers related to oxidative stress and nitric oxide pathway in migraine.
Neri, Monica; Frustaci, Alessandra; Milic, Mirta; Valdiglesias, Vanessa; Fini, Massimo; Bonassi, Stefano; Barbanti, Piero
2015-09-01
Oxidative and nitrosative stress are considered key events in the still unclear pathophysiology of migraine. Studies comparing the level of biomarkers related to nitric oxide (NO) pathway/oxidative stress in the blood/urine of migraineurs vs. unaffected controls were extracted from the PubMed database. Summary estimates of mean ratios (MR) were carried out whenever a minimum of three papers were available. Nineteen studies were included in the meta-analyses, accounting for more than 1000 patients and controls, and compared with existing literature. Most studies measuring superoxide dismutase (SOD) showed lower activity in cases, although the meta-analysis in erythrocytes gave null results. On the contrary, plasma levels of thiobarbituric acid reactive substances (TBARS), an aspecific biomarker of oxidative damage, showed a meta-MR of 2.20 (95% CI: 1.65-2.93). As for NOs, no significant results were found in plasma, serum and urine. However, higher levels were shown during attacks, in patients with aura, and an effect of diet was found. The analysis of glutathione precursor homocysteine and asymmetric dimethylarginine (ADMA), an NO synthase inhibitor, gave inconclusive results. The role of the oxidative pathway in migraine is still uncertain. Interesting evidence emerged for TBARS and SOD, and concerning the possible role of diet in the control of NOx levels. © International Headache Society 2015.
Directory of Open Access Journals (Sweden)
Antonio Cerasa
2015-01-01
Full Text Available Presently, there are no valid biomarkers to identify individuals with eating disorders (ED. The aim of this work was to assess the feasibility of a machine learning method for extracting reliable neuroimaging features allowing individual categorization of patients with ED. Support Vector Machine (SVM technique, combined with a pattern recognition method, was employed utilizing structural magnetic resonance images. Seventeen females with ED (six with diagnosis of anorexia nervosa and 11 with bulimia nervosa were compared against 17 body mass index-matched healthy controls (HC. Machine learning allowed individual diagnosis of ED versus HC with an Accuracy ≥ 0.80. Voxel-based pattern recognition analysis demonstrated that voxels influencing the classification Accuracy involved the occipital cortex, the posterior cerebellar lobule, precuneus, sensorimotor/premotor cortices, and the medial prefrontal cortex, all critical regions known to be strongly involved in the pathophysiological mechanisms of ED. Although these findings should be considered preliminary given the small size investigated, SVM analysis highlights the role of well-known brain regions as possible biomarkers to distinguish ED from HC at an individual level, thus encouraging the translational implementation of this new multivariate approach in the clinical practice.
Cerasa, Antonio; Castiglioni, Isabella; Salvatore, Christian; Funaro, Angela; Martino, Iolanda; Alfano, Stefania; Donzuso, Giulia; Perrotta, Paolo; Gioia, Maria Cecilia; Gilardi, Maria Carla; Quattrone, Aldo
2015-01-01
Presently, there are no valid biomarkers to identify individuals with eating disorders (ED). The aim of this work was to assess the feasibility of a machine learning method for extracting reliable neuroimaging features allowing individual categorization of patients with ED. Support Vector Machine (SVM) technique, combined with a pattern recognition method, was employed utilizing structural magnetic resonance images. Seventeen females with ED (six with diagnosis of anorexia nervosa and 11 with bulimia nervosa) were compared against 17 body mass index-matched healthy controls (HC). Machine learning allowed individual diagnosis of ED versus HC with an Accuracy ≥ 0.80. Voxel-based pattern recognition analysis demonstrated that voxels influencing the classification Accuracy involved the occipital cortex, the posterior cerebellar lobule, precuneus, sensorimotor/premotor cortices, and the medial prefrontal cortex, all critical regions known to be strongly involved in the pathophysiological mechanisms of ED. Although these findings should be considered preliminary given the small size investigated, SVM analysis highlights the role of well-known brain regions as possible biomarkers to distinguish ED from HC at an individual level, thus encouraging the translational implementation of this new multivariate approach in the clinical practice. PMID:26648660
Festing, Michael F W
2014-01-01
The safety of chemicals, drugs, novel foods and genetically modified crops is often tested using repeat-dose sub-acute toxicity tests in rats or mice. It is important to avoid misinterpretations of the results as these tests are used to help determine safe exposure levels in humans. Treated and control groups are compared for a range of haematological, biochemical and other biomarkers which may indicate tissue damage or other adverse effects. However, the statistical analysis and presentation of such data poses problems due to the large number of statistical tests which are involved. Often, it is not clear whether a "statistically significant" effect is real or a false positive (type I error) due to sampling variation. The author's conclusions appear to be reached somewhat subjectively by the pattern of statistical significances, discounting those which they judge to be type I errors and ignoring any biomarker where the p-value is greater than p = 0.05. However, by using standardised effect sizes (SESs) a range of graphical methods and an over-all assessment of the mean absolute response can be made. The approach is an extension, not a replacement of existing methods. It is intended to assist toxicologists and regulators in the interpretation of the results. Here, the SES analysis has been applied to data from nine published sub-acute toxicity tests in order to compare the findings with those of the author's. Line plots, box plots and bar plots show the pattern of response. Dose-response relationships are easily seen. A "bootstrap" test compares the mean absolute differences across dose groups. In four out of seven papers where the no observed adverse effect level (NOAEL) was estimated by the authors, it was set too high according to the bootstrap test, suggesting that possible toxicity is under-estimated.
Directory of Open Access Journals (Sweden)
Michael F W Festing
Full Text Available The safety of chemicals, drugs, novel foods and genetically modified crops is often tested using repeat-dose sub-acute toxicity tests in rats or mice. It is important to avoid misinterpretations of the results as these tests are used to help determine safe exposure levels in humans. Treated and control groups are compared for a range of haematological, biochemical and other biomarkers which may indicate tissue damage or other adverse effects. However, the statistical analysis and presentation of such data poses problems due to the large number of statistical tests which are involved. Often, it is not clear whether a "statistically significant" effect is real or a false positive (type I error due to sampling variation. The author's conclusions appear to be reached somewhat subjectively by the pattern of statistical significances, discounting those which they judge to be type I errors and ignoring any biomarker where the p-value is greater than p = 0.05. However, by using standardised effect sizes (SESs a range of graphical methods and an over-all assessment of the mean absolute response can be made. The approach is an extension, not a replacement of existing methods. It is intended to assist toxicologists and regulators in the interpretation of the results. Here, the SES analysis has been applied to data from nine published sub-acute toxicity tests in order to compare the findings with those of the author's. Line plots, box plots and bar plots show the pattern of response. Dose-response relationships are easily seen. A "bootstrap" test compares the mean absolute differences across dose groups. In four out of seven papers where the no observed adverse effect level (NOAEL was estimated by the authors, it was set too high according to the bootstrap test, suggesting that possible toxicity is under-estimated.
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Anghelache
2006-01-01
Full Text Available Using the short-run statistic indicators is a compulsory requirement implied in the current analysis. Therefore, there is a system of EUROSTAT indicators on short run which has been set up in this respect, being recommended for utilization by the member-countries. On the basis of these indicators, there are regular, usually monthly, analysis being achieved in respect of: the production dynamic determination; the evaluation of the short-run investment volume; the development of the turnover; the wage evolution: the employment; the price indexes and the consumer price index (inflation; the volume of exports and imports and the extent to which the imports are covered by the exports and the sold of trade balance. The EUROSTAT system of indicators of conjuncture is conceived as an open system, so that it can be, at any moment extended or restricted, allowing indicators to be amended or even removed, depending on the domestic users requirements as well as on the specific requirements of the harmonization and integration. For the short-run analysis, there is also the World Bank system of indicators of conjuncture, which is utilized, relying on the data sources offered by the World Bank, The World Institute for Resources or other international organizations statistics. The system comprises indicators of the social and economic development and focuses on the indicators for the following three fields: human resources, environment and economic performances. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Mitrut
2006-03-01
Full Text Available Using the short-run statistic indicators is a compulsory requirement implied in the current analysis. Therefore, there is a system of EUROSTAT indicators on short run which has been set up in this respect, being recommended for utilization by the member-countries. On the basis of these indicators, there are regular, usually monthly, analysis being achieved in respect of: the production dynamic determination; the evaluation of the short-run investment volume; the development of the turnover; the wage evolution: the employment; the price indexes and the consumer price index (inflation; the volume of exports and imports and the extent to which the imports are covered by the exports and the sold of trade balance. The EUROSTAT system of indicators of conjuncture is conceived as an open system, so that it can be, at any moment extended or restricted, allowing indicators to be amended or even removed, depending on the domestic users requirements as well as on the specific requirements of the harmonization and integration. For the short-run analysis, there is also the World Bank system of indicators of conjuncture, which is utilized, relying on the data sources offered by the World Bank, The World Institute for Resources or other international organizations statistics. The system comprises indicators of the social and economic development and focuses on the indicators for the following three fields: human resources, environment and economic performances. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.
Statistical Analysis of Environmental Tritium around Wolsong Site
Energy Technology Data Exchange (ETDEWEB)
Kim, Ju Youl [FNC Technology Co., Yongin (Korea, Republic of)
2010-04-15
To find the relationship among airborne tritium, tritium in rainwater, TFWT (Tissue Free Water Tritium) and TBT (Tissue Bound Tritium), statistical analysis is conducted based on tritium data measured at KHNP employees' house around Wolsong nuclear power plants during 10 years from 1999 to 2008. The results show that tritium in such media exhibits a strong seasonal and annual periodicity. Tritium concentration in rainwater is observed to be highly correlated with TFWT and directly transmitted to TFWT without delay. The response of environmental radioactivity of tritium around Wolsong site is analyzed using time-series technique and non-parametric trend analysis. Tritium in the atmosphere and rainwater is strongly auto-correlated by seasonal and annual periodicity. TFWT concentration in pine needle is proven to be more sensitive to rainfall phenomenon than other weather variables. Non-parametric trend analysis of TFWT concentration within pine needle shows a increasing slope in terms of confidence level of 95%. This study demonstrates a usefulness of time-series and trend analysis for the interpretation of environmental radioactivity relationship with various environmental media.
Statistical analysis and Kalman filtering applied to nuclear materials accountancy
International Nuclear Information System (INIS)
Annibal, P.S.
1990-08-01
Much theoretical research has been carried out on the development of statistical methods for nuclear material accountancy. In practice, physical, financial and time constraints mean that the techniques must be adapted to give an optimal performance in plant conditions. This thesis aims to bridge the gap between theory and practice, to show the benefits to be gained from a knowledge of the facility operation. Four different aspects are considered; firstly, the use of redundant measurements to reduce the error on the estimate of the mass of heavy metal in an 'accountancy tank' is investigated. Secondly, an analysis of the calibration data for the same tank is presented, establishing bounds for the error and suggesting a means of reducing them. Thirdly, a plant-specific method of producing an optimal statistic from the input, output and inventory data, to help decide between 'material loss' and 'no loss' hypotheses, is developed and compared with existing general techniques. Finally, an application of the Kalman Filter to materials accountancy is developed, to demonstrate the advantages of state-estimation techniques. The results of the analyses and comparisons illustrate the importance of taking into account a complete and accurate knowledge of the plant operation, measurement system, and calibration methods, to derive meaningful results from statistical tests on materials accountancy data, and to give a better understanding of critical random and systematic error sources. The analyses were carried out on the head-end of the Fast Reactor Reprocessing Plant, where fuel from the prototype fast reactor is cut up and dissolved. However, the techniques described are general in their application. (author)
The system for statistical analysis of logistic information
Directory of Open Access Journals (Sweden)
Khayrullin Rustam Zinnatullovich
2015-05-01
Full Text Available The current problem for managers in logistic and trading companies is the task of improving the operational business performance and developing the logistics support of sales. The development of logistics sales supposes development and implementation of a set of works for the development of the existing warehouse facilities, including both a detailed description of the work performed, and the timing of their implementation. Logistics engineering of warehouse complex includes such tasks as: determining the number and the types of technological zones, calculation of the required number of loading-unloading places, development of storage structures, development and pre-sales preparation zones, development of specifications of storage types, selection of loading-unloading equipment, detailed planning of warehouse logistics system, creation of architectural-planning decisions, selection of information-processing equipment, etc. The currently used ERP and WMS systems did not allow us to solve the full list of logistics engineering problems. In this regard, the development of specialized software products, taking into account the specifics of warehouse logistics, and subsequent integration of these software with ERP and WMS systems seems to be a current task. In this paper we suggest a system of statistical analysis of logistics information, designed to meet the challenges of logistics engineering and planning. The system is based on the methods of statistical data processing.The proposed specialized software is designed to improve the efficiency of the operating business and the development of logistics support of sales. The system is based on the methods of statistical data processing, the methods of assessment and prediction of logistics performance, the methods for the determination and calculation of the data required for registration, storage and processing of metal products, as well as the methods for planning the reconstruction and development
Using robust statistics to improve neutron activation analysis results
International Nuclear Information System (INIS)
Zahn, Guilherme S.; Genezini, Frederico A.; Ticianelli, Regina B.; Figueiredo, Ana Maria G.
2011-01-01
Neutron activation analysis (NAA) is an analytical technique where an unknown sample is submitted to a neutron flux in a nuclear reactor, and its elemental composition is calculated by measuring the induced activity produced. By using the relative NAA method, one or more well-characterized samples (usually certified reference materials - CRMs) are irradiated together with the unknown ones, and the concentration of each element is then calculated by comparing the areas of the gamma ray peaks related to that element. When two or more CRMs are used as reference, the concentration of each element can be determined by several different ways, either using more than one gamma ray peak for that element (when available), or using the results obtained in the comparison with each CRM. Therefore, determining the best estimate for the concentration of each element in the sample can be a delicate issue. In this work, samples from three CRMs were irradiated together and the elemental concentration in one of them was calculated using the other two as reference. Two sets of peaks were analyzed for each element: a smaller set containing only the literature-recommended gamma-ray peaks and a larger one containing all peaks related to that element that could be quantified in the gamma-ray spectra; the most recommended transition was also used as a benchmark. The resulting data for each element was then reduced using up to five different statistical approaches: the usual (and not robust) unweighted and weighted means, together with three robust means: the Limitation of Relative Statistical Weight, Normalized Residuals and Rajeval. The resulting concentration values were then compared to the certified value for each element, allowing for discussion on both the performance of each statistical tool and on the best choice of peaks for each element. (author)
Page, Robert B; Monaghan, James R; Samuels, Amy K; Smith, Jeramiah J; Beachy, Christopher K; Voss, S Randal
2007-02-01
Ambystomatid salamanders offer several advantages for endocrine disruption research, including genomic and bioinformatics resources, an accessible laboratory model (Ambystoma mexicanum), and natural lineages that are broadly distributed among North American habitats. We used microarray analysis to measure the relative abundance of transcripts isolated from A. mexicanum epidermis (skin) after exogenous application of thyroid hormone (TH). Only one gene had a >2-fold change in transcript abundance after 2 days of TH treatment. However, hundreds of genes showed significantly different transcript levels at days 12 and 28 in comparison to day 0. A list of 123 TH-responsive genes was identified using statistical, BLAST, and fold level criteria. Cluster analysis identified two groups of genes with similar transcription patterns: up-regulated versus down-regulated. Most notably, several keratins exhibited dramatic (1000 fold) increases or decreases in transcript abundance. Keratin gene expression changes coincided with morphological remodeling of epithelial tissues. This suggests that keratin loci can be developed as sensitive biomarkers to assay temporal disruptions of larval-to-adult gene expression programs. Our study has identified the first collection of loci that are regulated during TH-induced metamorphosis in a salamander, thus setting the stage for future investigations of TH disruption in the Mexican axolotl and other salamanders of the genus Ambystoma.
Analysis of neutron flux measurement systems using statistical functions
International Nuclear Information System (INIS)
Pontes, Eduardo Winston
1997-01-01
This work develops an integrated analysis for neutron flux measurement systems using the concepts of cumulants and spectra. Its major contribution is the generalization of Campbell's theorem in the form of spectra in the frequency domain, and its application to the analysis of neutron flux measurement systems. Campbell's theorem, in its generalized form, constitutes an important tool, not only to find the nth-order frequency spectra of the radiation detector, but also in the system analysis. The radiation detector, an ionization chamber for neutrons, is modeled for cylindrical, plane and spherical geometries. The detector current pulses are characterized by a vector of random parameters, and the associated charges, statistical moments and frequency spectra of the resulting current are calculated. A computer program is developed for application of the proposed methodology. In order for the analysis to integrate the associated electronics, the signal processor is studied, considering analog and digital configurations. The analysis is unified by developing the concept of equivalent systems that can be used to describe the cumulants and spectra in analog or digital systems. The noise in the signal processor input stage is analysed in terms of second order spectrum. Mathematical expressions are presented for cumulants and spectra up to fourth order, for important cases of filter positioning relative to detector spectra. Unbiased conventional estimators for cumulants are used, and, to evaluate systems precision and response time, expressions are developed for their variances. Finally, some possibilities for obtaining neutron radiation flux as a function of cumulants are discussed. In summary, this work proposes some analysis tools which make possible important decisions in the design of better neutron flux measurement systems. (author)
Log-Normality and Multifractal Analysis of Flame Surface Statistics
Saha, Abhishek; Chaudhuri, Swetaprovo; Law, Chung K.
2013-11-01
The turbulent flame surface is typically highly wrinkled and folded at a multitude of scales controlled by various flame properties. It is useful if the information contained in this complex geometry can be projected onto a simpler regular geometry for the use of spectral, wavelet or multifractal analyses. Here we investigate local flame surface statistics of turbulent flame expanding under constant pressure. First the statistics of local length ratio is experimentally obtained from high-speed Mie scattering images. For spherically expanding flame, length ratio on the measurement plane, at predefined equiangular sectors is defined as the ratio of the actual flame length to the length of a circular-arc of radius equal to the average radius of the flame. Assuming isotropic distribution of such flame segments we convolute suitable forms of the length-ratio probability distribution functions (pdfs) to arrive at corresponding area-ratio pdfs. Both the pdfs are found to be near log-normally distributed and shows self-similar behavior with increasing radius. Near log-normality and rather intermittent behavior of the flame-length ratio suggests similarity with dissipation rate quantities which stimulates multifractal analysis. Currently at Indian Institute of Science, India.
Analysis of filament statistics in fast camera data on MAST
Farley, Tom; Militello, Fulvio; Walkden, Nick; Harrison, James; Silburn, Scott; Bradley, James
2017-10-01
Coherent filamentary structures have been shown to play a dominant role in turbulent cross-field particle transport [D'Ippolito 2011]. An improved understanding of filaments is vital in order to control scrape off layer (SOL) density profiles and thus control first wall erosion, impurity flushing and coupling of radio frequency heating in future devices. The Elzar code [T. Farley, 2017 in prep.] is applied to MAST data. The code uses information about the magnetic equilibrium to calculate the intensity of light emission along field lines as seen in the camera images, as a function of the field lines' radial and toroidal locations at the mid-plane. In this way a `pseudo-inversion' of the intensity profiles in the camera images is achieved from which filaments can be identified and measured. In this work, a statistical analysis of the intensity fluctuations along field lines in the camera field of view is performed using techniques similar to those typically applied in standard Langmuir probe analyses. These filament statistics are interpreted in terms of the theoretical ergodic framework presented by F. Militello & J.T. Omotani, 2016, in order to better understand how time averaged filament dynamics produce the more familiar SOL density profiles. This work has received funding from the RCUK Energy programme (Grant Number EP/P012450/1), from Euratom (Grant Agreement No. 633053) and from the EUROfusion consortium.
Statistical analysis of hydrologic data for Yucca Mountain
International Nuclear Information System (INIS)
Rutherford, B.M.; Hall, I.J.; Peters, R.R.; Easterling, R.G.; Klavetter, E.A.
1992-02-01
The geologic formations in the unsaturated zone at Yucca Mountain are currently being studied as the host rock for a potential radioactive waste repository. Data from several drill holes have been collected to provide the preliminary information needed for planning site characterization for the Yucca Mountain Project. Hydrologic properties have been measured on the core samples and the variables analyzed here are thought to be important in the determination of groundwater travel times. This report presents a statistical analysis of four hydrologic variables: saturated-matrix hydraulic conductivity, maximum moisture content, suction head, and calculated groundwater travel time. It is important to modelers to have as much information about the distribution of values of these variables as can be obtained from the data. The approach taken in this investigation is to (1) identify regions at the Yucca Mountain site that, according to the data, are distinctly different; (2) estimate the means and variances within these regions; (3) examine the relationships among the variables; and (4) investigate alternative statistical methods that might be applicable when more data become available. The five different functional stratigraphic units at three different locations are compared and grouped into relatively homogeneous regions. Within these regions, the expected values and variances associated with core samples of different sizes are estimated. The results provide a rough estimate of the distribution of hydrologic variables for small core sections within each region
A statistically self-consistent type Ia supernova data analysis
International Nuclear Information System (INIS)
Lago, B.L.; Calvao, M.O.; Joras, S.E.; Reis, R.R.R.; Waga, I.; Giostri, R.
2011-01-01
Full text: The type Ia supernovae are one of the main cosmological probes nowadays and are used as standardized candles in distance measurements. The standardization processes, among which SALT2 and MLCS2k2 are the most used ones, are based on empirical relations and leave room for a residual dispersion in the light curves of the supernovae. This dispersion is introduced in the chi squared used to fit the parameters of the model in the expression for the variance of the data, as an attempt to quantify our ignorance in modeling the supernovae properly. The procedure used to assign a value to this dispersion is statistically inconsistent and excludes the possibility of comparing different cosmological models. In addition, the SALT2 light curve fitter introduces parameters on the model for the variance that are also used in the model for the data. In the chi squared statistics context the minimization of such a quantity yields, in the best case scenario, a bias. An iterative method has been developed in order to perform the minimization of this chi squared but it is not well grounded, although it is used by several groups. We propose an analysis of the type Ia supernovae data that is based on the likelihood itself and makes it possible to address both inconsistencies mentioned above in a straightforward way. (author)
Detecting fire in video stream using statistical analysis
Directory of Open Access Journals (Sweden)
Koplík Karel
2017-01-01
Full Text Available The real time fire detection in video stream is one of the most interesting problems in computer vision. In fact, in most cases it would be nice to have fire detection algorithm implemented in usual industrial cameras and/or to have possibility to replace standard industrial cameras with one implementing the fire detection algorithm. In this paper, we present new algorithm for detecting fire in video. The algorithm is based on tracking suspicious regions in time with statistical analysis of their trajectory. False alarms are minimized by combining multiple detection criteria: pixel brightness, trajectories of suspicious regions for evaluating characteristic fire flickering and persistence of alarm state in sequence of frames. The resulting implementation is fast and therefore can run on wide range of affordable hardware.
Statistical mechanical analysis of LMFBR fuel cladding tubes
International Nuclear Information System (INIS)
Poncelet, J.-P.; Pay, A.
1977-01-01
The most important design requirement on fuel pin cladding for LMFBR's is its mechanical integrity. Disruptive factors include internal pressure from mixed oxide fuel fission gas release, thermal stresses and high temperature creep, neutron-induced differential void-swelling as a source of stress in the cladding and irradiation creep of stainless steel material, corrosion by fission products. Under irradiation these load-restraining mechanisms are accentuated by stainless steel embrittlement and strength alterations. To account for the numerous uncertainties involved in the analysis by theoretical models and computer codes statistical tools are unavoidably requested, i.e. Monte Carlo simulation methods. Thanks to these techniques, uncertainties in nominal characteristics, material properties and environmental conditions can be linked up in a correct way and used for a more accurate conceptual design. (Auth.)
Langmuir waveforms at interplanetary shocks: STEREO statistical analysis
Briand, C.
2016-12-01
Wave-particle interactions and particle acceleration are the two main processes allowing energy dissipation at non collisional shocks. Ion acceleration has been deeply studied for many years, also for their central role in the shock front reformation. Electron dynamics is also important in the shock dynamics through the instabilities they can generate which may impact the ion dynamics.Particle measurements can be efficiently completed by wave measurements to determine the characteristics of the electron beams and study the turbulence of the medium. Electric waveforms obtained from the S/WAVES instrument of the STEREO mission between 2007 to 2014 are analyzed. Thus, clear signature of Langmuir waves are observed on 41 interplanetary shocks. These data enable a statistical analysis and to deduce some characteristics of the electron dynamics on different shocks sources (SIR or ICME) and types (quasi-perpendicular or quasi-parallel). The conversion process between electrostatic to electromagnetic waves has also been tested in several cases.
Analysis of Official Suicide Statistics in Spain (1910-2011
Directory of Open Access Journals (Sweden)
2017-01-01
Full Text Available In this article we examine the evolution of suicide rates in Spain from 1910 to 2011. As something new, we use standardised suicide rates, making them perfectly comparable geographically and in time, as they no longer reflect population structure. Using historical data from a series of socioeconomic variables for all Spain's provinces and applying new techniques for the statistical analysis of panel data, we are able to confirm many of the hypotheses established by Durkheim at the end of the 19th century, especially those related to fertility and marriage rates, age, sex and the aging index. Our findings, however, contradict Durkheim's approach regarding the impact of urbanisation processes and poverty on suicide.
Higher order statistical moment application for solar PV potential analysis
Basri, Mohd Juhari Mat; Abdullah, Samizee; Azrulhisham, Engku Ahmad; Harun, Khairulezuan
2016-10-01
Solar photovoltaic energy could be as alternative energy to fossil fuel, which is depleting and posing a global warming problem. However, this renewable energy is so variable and intermittent to be relied on. Therefore the knowledge of energy potential is very important for any site to build this solar photovoltaic power generation system. Here, the application of higher order statistical moment model is being analyzed using data collected from 5MW grid-connected photovoltaic system. Due to the dynamic changes of skewness and kurtosis of AC power and solar irradiance distributions of the solar farm, Pearson system where the probability distribution is calculated by matching their theoretical moments with that of the empirical moments of a distribution could be suitable for this purpose. On the advantage of the Pearson system in MATLAB, a software programming has been developed to help in data processing for distribution fitting and potential analysis for future projection of amount of AC power and solar irradiance availability.
Statistical uncertainty analysis of radon transport in nonisothermal, unsaturated soils
International Nuclear Information System (INIS)
Holford, D.J.; Owczarski, P.C.; Gee, G.W.; Freeman, H.D.
1990-10-01
To accurately predict radon fluxes soils to the atmosphere, we must know more than the radium content of the soil. Radon flux from soil is affected not only by soil properties, but also by meteorological factors such as air pressure and temperature changes at the soil surface, as well as the infiltration of rainwater. Natural variations in meteorological factors and soil properties contribute to uncertainty in subsurface model predictions of radon flux, which, when coupled with a building transport model, will also add uncertainty to predictions of radon concentrations in homes. A statistical uncertainty analysis using our Rn3D finite-element numerical model was conducted to assess the relative importance of these meteorological factors and the soil properties affecting radon transport. 10 refs., 10 figs., 3 tabs
Book Chapter 18, titled Application in pesticide analysis: Liquid chromatography - A review of the state of science for biomarker discovery and identification, will be published in the book titled High Performance Liquid Chromatography in Pesticide Residue Analysis (Part of the C...
On the analysis of line profile variations: A statistical approach
International Nuclear Information System (INIS)
McCandliss, S.R.
1988-01-01
This study is concerned with the empirical characterization of the line profile variations (LPV), which occur in many of and Wolf-Rayet stars. The goal of the analysis is to gain insight into the physical mechanisms producing the variations. The analytic approach uses a statistical method to quantify the significance of the LPV and to identify those regions in the line profile which are undergoing statistically significant variations. Line positions and flux variations are then measured and subject to temporal and correlative analysis. Previous studies of LPV have for the most part been restricted to observations of a single line. Important information concerning the range and amplitude of the physical mechanisms involved can be obtained by simultaneously observing spectral features formed over a range of depths in the extended mass losing atmospheres of massive, luminous stars. Time series of a Wolf-Rayet and two of stars with nearly complete spectral coverage from 3940 angstrom to 6610 angstrom and with spectral resolution of R = 10,000 are analyzed here. These three stars exhibit a wide range of both spectral and temporal line profile variations. The HeII Pickering lines of HD 191765 show a monotonic increase in the peak rms variation amplitude with lines formed at progressively larger radii in the Wolf-Rayet star wind. Two times scales of variation have been identified in this star: a less than one day variation associated with small scale flickering in the peaks of the line profiles and a greater than one day variation associated with large scale asymmetric changes in the overall line profile shapes. However, no convincing period phenomena are evident at those periods which are well sampled in this time series
Statistical analysis of the uncertainty related to flood hazard appraisal
Notaro, Vincenza; Freni, Gabriele
2015-12-01
The estimation of flood hazard frequency statistics for an urban catchment is of great interest in practice. It provides the evaluation of potential flood risk and related damage and supports decision making for flood risk management. Flood risk is usually defined as function of the probability, that a system deficiency can cause flooding (hazard), and the expected damage, due to the flooding magnitude (damage), taking into account both the exposure and the vulnerability of the goods at risk. The expected flood damage can be evaluated by an a priori estimation of potential damage caused by flooding or by interpolating real damage data. With regard to flood hazard appraisal several procedures propose to identify some hazard indicator (HI) such as flood depth or the combination of flood depth and velocity and to assess the flood hazard corresponding to the analyzed area comparing the HI variables with user-defined threshold values or curves (penalty curves or matrixes). However, flooding data are usually unavailable or piecemeal allowing for carrying out a reliable flood hazard analysis, therefore hazard analysis is often performed by means of mathematical simulations aimed at evaluating water levels and flow velocities over catchment surface. As results a great part of the uncertainties intrinsic to flood risk appraisal can be related to the hazard evaluation due to the uncertainty inherent to modeling results and to the subjectivity of the user defined hazard thresholds applied to link flood depth to a hazard level. In the present work, a statistical methodology was proposed for evaluating and reducing the uncertainties connected with hazard level estimation. The methodology has been applied to a real urban watershed as case study.
A STATISTICAL ANALYSIS OF LARYNGEAL MALIGNANCIES AT OUR INSTITUTION
Directory of Open Access Journals (Sweden)
Bharathi Mohan Mathan
2017-03-01
Full Text Available BACKGROUND Malignancies of larynx are an increasing global burden with a distribution of approximately 2-5% of all malignancies with an incidence of 3.6/1,00,000 for men and 1.3/1,00,000 for women with a male-to-female ratio of 4:1. Smoking and alcohol are major established risk factors. More than 90-95% of all malignancies are squamous cell type. Three main subsite of laryngeal malignancies are glottis, supraglottis and subglottis. Improved surgical techniques and advanced chemoradiotherapy has increased the overall 5 year survival rate. The above study is statistical analysis of laryngeal malignancies at our institution for a period of one year and analysis of pattern of distribution, aetiology, sites and subsites and causes for recurrence. MATERIALS AND METHODS Based on the statistical data available in the institution for the period of one year from January 2016-December 2016, all laryngeal malignancies were analysed with respect to demographic pattern, age, gender, site, subsite, aetiology, staging, treatment received and probable cause for failure of treatment. Patients were followed up for 12 months period during the study. RESULTS Total number of cases studied are 27 (twenty seven. Male cases are 23 and female cases are 4, male-to-female ratio is 5.7:1, most common age is above 60 years, most common site is supraglottis, most common type is moderately-differentiated squamous cell carcinoma, most common cause for relapse or recurrence is advanced stage of disease and poor differentiation. CONCLUSION The commonest age occurrence at the end of the study is above 60 years and male-to-female ratio is 5.7:1, which is slightly above the international standards. Most common site is supraglottis and not glottis. The relapse and recurrences are higher compared to the international standards.
Spectral signature verification using statistical analysis and text mining
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature to arrive at a final qualitative assessment; the textual meta-data and numerical spectral data. Results associated with the spectral data stored in the Signature Database1 (SigDB) are proposed. The numerical data comprising a sample material's spectrum is validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum is qualitatively analyzed using lexical analysis text mining. This technique analyzes to understand the syntax of the meta-data to provide local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security2 (text encryption/decryption), biomedical3 , and marketing4 applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high and low quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is
Wavelet Statistical Analysis of Low-Latitude Geomagnetic Measurements
Papa, A. R.; Akel, A. F.
2009-05-01
Following previous works by our group (Papa et al., JASTP, 2006), where we analyzed a series of records acquired at the Vassouras National Geomagnetic Observatory in Brazil for the month of October 2000, we introduced a wavelet analysis for the same type of data and for other periods. It is well known that wavelets allow a more detailed study in several senses: the time window for analysis can be drastically reduced if compared to other traditional methods (Fourier, for example) and at the same time allow an almost continuous accompaniment of both amplitude and frequency of signals as time goes by. This advantage brings some possibilities for potentially useful forecasting methods of the type also advanced by our group in previous works (see for example, Papa and Sosman, JASTP, 2008). However, the simultaneous statistical analysis of both time series (in our case amplitude and frequency) is a challenging matter and is in this sense that we have found what we consider our main goal. Some possible trends for future works are advanced.
Classification of Malaysia aromatic rice using multivariate statistical analysis
International Nuclear Information System (INIS)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-01-01
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy training time, and prone to fatigue as the number of sample increased and inconsistent. The GC–MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties
Classification of Malaysia aromatic rice using multivariate statistical analysis
Energy Technology Data Exchange (ETDEWEB)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A. [School of Mechatronic Engineering, Universiti Malaysia Perlis, Kampus Pauh Putra, 02600 Arau, Perlis (Malaysia); Omar, O. [Malaysian Agriculture Research and Development Institute (MARDI), Persiaran MARDI-UPM, 43400 Serdang, Selangor (Malaysia)
2015-05-15
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy training time, and prone to fatigue as the number of sample increased and inconsistent. The GC–MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
Classification of Malaysia aromatic rice using multivariate statistical analysis
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-05-01
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy training time, and prone to fatigue as the number of sample increased and inconsistent. The GC-MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
Directory of Open Access Journals (Sweden)
John P. Jakupciak
2013-01-01
Full Text Available Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations.
The YKL-40 protein is a potential biomarker for COPD: a meta-analysis and systematic review
Directory of Open Access Journals (Sweden)
Tong X
2018-01-01
Full Text Available Xiang Tong,1 Dongguang Wang,1 Sitong Liu,1 Yao Ma,1,2 Zhenzhen Li,3 Panwen Tian,1,4 Hong Fan1 1Department of Respiratory and Critical Care Medicine, West China Hospital/West China School of Medicine, Sichuan University, Chengdu, People’s Republic of China; 2The Center of Gerontology and Geriatrics, West China Hospital/West China School of Medicine, Sichuan University, Chengdu, People’s Republic of China; 3Health Management Center, West China Hospital/West China School of Medicine, Sichuan University, Chengdu, People’s Republic of China; 4Lung Cancer Center, West China Hospital/West China School of Medicine, Sichuan University, Chengdu, People’s Republic of China Background: Many studies have found that YKL-40 may play an important pathogenic role in COPD. However, the results of these studies were inconsistent. Therefore, we performed a systematic review and meta-analysis to investigate the role of YKL-40 in COPD. Methods: We performed a systematic literature search in many database and commercial internet search engines to identify studies involving the role of YKL-40 in patients with COPD. The standardized mean difference (SMD and Fisher’s Z-value with its 95% confidence interval (CI were used to investigate the effect sizes. Results: A total of 15 eligible articles including 16 case–control/cohort groups were included in the meta-analysis. The results indicated that the serum YKL-40 levels in patients with COPD were significantly higher than those in healthy controls (SMD =1.58, 95% CI =0.68–2.49, P=0.001, and it was correlated with lung function (pooled r=-0.32; Z=-0.33; P<0.001. The results of subgroup analysis found that the serum YKL-40 levels were statistically different between the exacerbation group and the stable group in patients with COPD (SMD =1.55, 95% CI =0.81–2.30, P<0.001. Moreover, the results indicated that the sputum YKL-40 levels in patients with COPD were also significantly higher than those in healthy
A random-sum Wilcoxon statistic and its application to analysis of ROC and LROC data.
Tang, Liansheng Larry; Balakrishnan, N
2011-01-01
The Wilcoxon-Mann-Whitney statistic is commonly used for a distribution-free comparison of two groups. One requirement for its use is that the sample sizes of the two groups are fixed. This is violated in some of the applications such as medical imaging studies and diagnostic marker studies; in the former, the violation occurs since the number of correctly localized abnormal images is random, while in the latter the violation is due to some subjects not having observable measurements. For this reason, we propose here a random-sum Wilcoxon statistic for comparing two groups in the presence of ties, and derive its variance as well as its asymptotic distribution for large sample sizes. The proposed statistic includes the regular Wilcoxon rank-sum statistic. Finally, we apply the proposed statistic for summarizing location response operating characteristic data from a liver computed tomography study, and also for summarizing diagnostic accuracy of biomarker data.
Picard de Muller, Gaël; Ait-Belkacem, Rima; Bonnel, David; Longuespée, Rémi; Stauber, Jonathan
2017-12-01
Mass spectrometry imaging datasets are mostly analyzed in terms of average intensity in regions of interest. However, biological tissues have different morphologies with several sizes, shapes, and structures. The important biological information, contained in this highly heterogeneous cellular organization, could be hidden by analyzing the average intensities. Finding an analytical process of morphology would help to find such information, describe tissue model, and support identification of biomarkers. This study describes an informatics approach for the extraction and identification of mass spectrometry image features and its application to sample analysis and modeling. For the proof of concept, two different tissue types (healthy kidney and CT-26 xenograft tumor tissues) were imaged and analyzed. A mouse kidney model and tumor model were generated using morphometric - number of objects and total surface - information. The morphometric information was used to identify m/z that have a heterogeneous distribution. It seems to be a worthwhile pursuit as clonal heterogeneity in a tumor is of clinical relevance. This study provides a new approach to find biomarker or support tissue classification with more information. [Figure not available: see fulltext.
Loonen, Anne J. M.; de Jager, Cornelis P. C.; Tosserams, Janna; Kusters, Ron; Hilbink, Mirrian; Wever, Peter C.; van den Brule, Adriaan J. C.
2014-01-01
Molecular pathogen detection from blood is still expensive and the exact clinical value remains to be determined. The use of biomarkers may assist in preselecting patients for immediate molecular testing besides blood culture. In this study, 140 patients with ≥ 2 SIRS criteria and clinical signs of infection presenting at the emergency department of our hospital were included. C-reactive protein (CRP), neutrophil-lymphocyte count ratio (NLCR), procalcitonin (PCT) and soluble urokinase plasminogen activator receptor (suPAR) levels were determined. One ml EDTA blood was obtained and selective pathogen DNA isolation was performed with MolYsis (Molzym). DNA samples were analysed for the presence of pathogens, using both the MagicPlex Sepsis Test (Seegene) and SepsiTest (Molzym), and results were compared to blood cultures. Fifteen patients had to be excluded from the study, leaving 125 patients for further analysis. Of the 125 patient samples analysed, 27 presented with positive blood cultures of which 7 were considered to be contaminants. suPAR, PCT, and NLCR values were significantly higher in patients with positive blood cultures compared to patients without (p molecular assays perform poorly when one ml whole blood is used from emergency care unit patients. NLCR is a cheap, fast, easy to determine, and rapidly available biomarker, and therefore seems most promising in differentiating BSI from non-BSI patients for subsequent pathogen identification using molecular diagnostics. PMID:24475269
A Statistic Analysis Of Romanian Seaside Hydro Tourism
Secara Mirela
2011-01-01
Tourism represents one of the ways of spending spare time for rest, recreation, treatment and entertainment, and the specific aspect of Constanta County economy is touristic and spa capitalization of Romanian seaside. In order to analyze hydro tourism on Romanian seaside we have used statistic indicators within tourism as well as statistic methods such as chronological series, interdependent statistic series, regression and statistic correlation. The major objective of this research is to rai...
Tucker tensor analysis of Matern functions in spatial statistics
Litvinenko, Alexander
2018-04-20
Low-rank Tucker tensor methods in spatial statistics 1. Motivation: improve statistical models 2. Motivation: disadvantages of matrices 3. Tools: Tucker tensor format 4. Tensor approximation of Matern covariance function via FFT 5. Typical statistical operations in Tucker tensor format 6. Numerical experiments
Predicting Smoking Status Using Machine Learning Algorithms and Statistical Analysis
Directory of Open Access Journals (Sweden)
Charles Frank
2018-03-01
Full Text Available Smoking has been proven to negatively affect health in a multitude of ways. As of 2009, smoking has been considered the leading cause of preventable morbidity and mortality in the United States, continuing to plague the country’s overall health. This study aims to investigate the viability and effectiveness of some machine learning algorithms for predicting the smoking status of patients based on their blood tests and vital readings results. The analysis of this study is divided into two parts: In part 1, we use One-way ANOVA analysis with SAS tool to show the statistically significant difference in blood test readings between smokers and non-smokers. The results show that the difference in INR, which measures the effectiveness of anticoagulants, was significant in favor of non-smokers which further confirms the health risks associated with smoking. In part 2, we use five machine learning algorithms: Naïve Bayes, MLP, Logistic regression classifier, J48 and Decision Table to predict the smoking status of patients. To compare the effectiveness of these algorithms we use: Precision, Recall, F-measure and Accuracy measures. The results show that the Logistic algorithm outperformed the four other algorithms with Precision, Recall, F-Measure, and Accuracy of 83%, 83.4%, 83.2%, 83.44%, respectively.
Criminal victimization in Ukraine: analysis of statistical data
Directory of Open Access Journals (Sweden)
Serhiy Nezhurbida
2007-12-01
Full Text Available The article is based on the analysis of statistical data provided by law-enforcement, judicial and other bodies of Ukraine. The given analysis allows us to give an accurate quantity of a current status of crime victimization in Ukraine, to characterize its basic features (level, rate, structure, dynamics, and etc.. L’article se concentre sur l’analyse des données statystiques fournies par les institutions de contrôle sociale (forces de police et magistrature et par d’autres organes institutionnels ukrainiens. Les analyses effectuées attirent l'attention sur la situation actuelle des victimes du crime en Ukraine et aident à délinéer leur principales caractéristiques (niveau, taux, structure, dynamiques, etc.L’articolo si basa sull’analisi dei dati statistici forniti dalle agenzie del controllo sociale (forze dell'ordine e magistratura e da altri organi istituzionali ucraini. Le analisi effettuate forniscono molte informazioni sulla situazione attuale delle vittime del crimine in Ucraina e aiutano a delinearne le caratteristiche principali (livello, tasso, struttura, dinamiche, ecc..
FTree query construction for virtual screening: a statistical analysis.
Gerlach, Christof; Broughton, Howard; Zaliani, Andrea
2008-02-01
FTrees (FT) is a known chemoinformatic tool able to condense molecular descriptions into a graph object and to search for actives in large databases using graph similarity. The query graph is classically derived from a known active molecule, or a set of actives, for which a similar compound has to be found. Recently, FT similarity has been extended to fragment space, widening its capabilities. If a user were able to build a knowledge-based FT query from information other than a known active structure, the similarity search could be combined with other, normally separate, fields like de-novo design or pharmacophore searches. With this aim in mind, we performed a comprehensive analysis of several databases in terms of FT description and provide a basic statistical analysis of the FT spaces so far at hand. Vendors' catalogue collections and MDDR as a source of potential or known "actives", respectively, have been used. With the results reported herein, a set of ranges, mean values and standard deviations for several query parameters are presented in order to set a reference guide for the users. Applications on how to use this information in FT query building are also provided, using a newly built 3D-pharmacophore from 57 5HT-1F agonists and a published one which was used for virtual screening for tRNA-guanine transglycosylase (TGT) inhibitors.
A statistical design for testing apomictic diversification through linkage analysis.
Zeng, Yanru; Hou, Wei; Song, Shuang; Feng, Sisi; Shen, Lin; Xia, Guohua; Wu, Rongling
2014-03-01
The capacity of apomixis to generate maternal clones through seed reproduction has made it a useful characteristic for the fixation of heterosis in plant breeding. It has been observed that apomixis displays pronounced intra- and interspecific diversification, but the genetic mechanisms underlying this diversification remains elusive, obstructing the exploitation of this phenomenon in practical breeding programs. By capitalizing on molecular information in mapping populations, we describe and assess a statistical design that deploys linkage analysis to estimate and test the pattern and extent of apomictic differences at various levels from genotypes to species. The design is based on two reciprocal crosses between two individuals each chosen from a hermaphrodite or monoecious species. A multinomial distribution likelihood is constructed by combining marker information from two crosses. The EM algorithm is implemented to estimate the rate of apomixis and test its difference between two plant populations or species as the parents. The design is validated by computer simulation. A real data analysis of two reciprocal crosses between hickory (Carya cathayensis) and pecan (C. illinoensis) demonstrates the utilization and usefulness of the design in practice. The design provides a tool to address fundamental and applied questions related to the evolution and breeding of apomixis.
Metaviz: interactive statistical and visual analysis of metagenomic data.
Wagner, Justin; Chelaru, Florin; Kancherla, Jayaram; Paulson, Joseph N; Zhang, Alexander; Felix, Victor; Mahurkar, Anup; Elmqvist, Niklas; Corrada Bravo, Héctor
2018-04-06
Large studies profiling microbial communities and their association with healthy or disease phenotypes are now commonplace. Processed data from many of these studies are publicly available but significant effort is required for users to effectively organize, explore and integrate it, limiting the utility of these rich data resources. Effective integrative and interactive visual and statistical tools to analyze many metagenomic samples can greatly increase the value of these data for researchers. We present Metaviz, a tool for interactive exploratory data analysis of annotated microbiome taxonomic community profiles derived from marker gene or whole metagenome shotgun sequencing. Metaviz is uniquely designed to address the challenge of browsing the hierarchical structure of metagenomic data features while rendering visualizations of data values that are dynamically updated in response to user navigation. We use Metaviz to provide the UMD Metagenome Browser web service, allowing users to browse and explore data for more than 7000 microbiomes from published studies. Users can also deploy Metaviz as a web service, or use it to analyze data through the metavizr package to interoperate with state-of-the-art analysis tools available through Bioconductor. Metaviz is free and open source with the code, documentation and tutorials publicly accessible.
Data Analysis & Statistical Methods for Command File Errors
Meshkat, Leila; Waggoner, Bruce; Bryant, Larry
2014-01-01
This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis, and maximum likelihood estimation to see how much of the variability in the error rates can be explained with these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on the what these statistics bore out as critical drivers to the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
Recurrence time statistics: versatile tools for genomic DNA sequence analysis.
Cao, Yinhe; Tung, Wen-Wen; Gao, J B
2004-01-01
With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cervisivae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.
A statistical method for draft tube pressure pulsation analysis
International Nuclear Information System (INIS)
Doerfler, P K; Ruchonnet, N
2012-01-01
Draft tube pressure pulsation (DTPP) in Francis turbines is composed of various components originating from different physical phenomena. These components may be separated because they differ by their spatial relationships and by their propagation mechanism. The first step for such an analysis was to distinguish between so-called synchronous and asynchronous pulsations; only approximately periodic phenomena could be described in this manner. However, less regular pulsations are always present, and these become important when turbines have to operate in the far off-design range, in particular at very low load. The statistical method described here permits to separate the stochastic (random) component from the two traditional 'regular' components. It works in connection with the standard technique of model testing with several pressure signals measured in draft tube cone. The difference between the individual signals and the averaged pressure signal, together with the coherence between the individual pressure signals is used for analysis. An example reveals that a generalized, non-periodic version of the asynchronous pulsation is important at low load.
Directory of Open Access Journals (Sweden)
Supaporn Kradtap Hartwell
2012-01-01
Full Text Available Flow injection/sequential injection analysis (FIA/SIA systems are suitable for carrying out automatic wet chemical/biochemical reactions with reduced volume and time consumption. Various parts of the system such as pump, valve, and reactor may be built or adapted from available materials. Therefore the systems can be at lower cost as compared to other instrumentation-based analysis systems. Their applications for determination of biomarkers for liver diseases have been demonstrated in various formats of operation but only a few and limited types of biomarkers have been used as model analytes. This paper summarizes these applications for different types of reactions as a guide for using flow-based systems in more biomarker and/or multibiomarker studies.
Hendikawati, P.; Arifudin, R.; Zahid, M. Z.
2018-03-01
This study aims to design an android Statistics Data Analysis application that can be accessed through mobile devices to making it easier for users to access. The Statistics Data Analysis application includes various topics of basic statistical along with a parametric statistics data analysis application. The output of this application system is parametric statistics data analysis that can be used for students, lecturers, and users who need the results of statistical calculations quickly and easily understood. Android application development is created using Java programming language. The server programming language uses PHP with the Code Igniter framework, and the database used MySQL. The system development methodology used is the Waterfall methodology with the stages of analysis, design, coding, testing, and implementation and system maintenance. This statistical data analysis application is expected to support statistical lecturing activities and make students easier to understand the statistical analysis of mobile devices.
Statistical analysis of cone penetration resistance of railway ballast
Directory of Open Access Journals (Sweden)
Saussine Gilles
2017-01-01
Full Text Available Dynamic penetrometer tests are widely used in geotechnical studies for soils characterization but their implementation tends to be difficult. The light penetrometer test is able to give information about a cone resistance useful in the field of geotechnics and recently validated as a parameter for the case of coarse granular materials. In order to characterize directly the railway ballast on track and sublayers of ballast, a huge test campaign has been carried out for more than 5 years in order to build up a database composed of 19,000 penetration tests including endoscopic video record on the French railway network. The main objective of this work is to give a first statistical analysis of cone resistance in the coarse granular layer which represents a major component of railway track: the ballast. The results show that the cone resistance (qd increases with depth and presents strong variations corresponding to layers of different natures identified using the endoscopic records. In the first zone corresponding to the top 30cm, (qd increases linearly with a slope of around 1MPa/cm for fresh ballast and fouled ballast. In the second zone below 30cm deep, (qd increases more slowly with a slope of around 0,3MPa/cm and decreases below 50cm. These results show that there is no clear difference between fresh and fouled ballast. Hence, the (qd sensitivity is important and increases with depth. The (qd distribution for a set of tests does not follow a normal distribution. In the upper 30cm layer of ballast of track, data statistical treatment shows that train load and speed do not have any significant impact on the (qd distribution for clean ballast; they increase by 50% the average value of (qd for fouled ballast and increase the thickness as well. Below the 30cm upper layer, train load and speed have a clear impact on the (qd distribution.
Vector field statistical analysis of kinematic and force trajectories.
Pataky, Todd C; Robinson, Mark A; Vanrenterghem, Jos
2013-09-27
When investigating the dynamics of three-dimensional multi-body biomechanical systems it is often difficult to derive spatiotemporally directed predictions regarding experimentally induced effects. A paradigm of 'non-directed' hypothesis testing has emerged in the literature as a result. Non-directed analyses typically consist of ad hoc scalar extraction, an approach which substantially simplifies the original, highly multivariate datasets (many time points, many vector components). This paper describes a commensurately multivariate method as an alternative to scalar extraction. The method, called 'statistical parametric mapping' (SPM), uses random field theory to objectively identify field regions which co-vary significantly with the experimental design. We compared SPM to scalar extraction by re-analyzing three publicly available datasets: 3D knee kinematics, a ten-muscle force system, and 3D ground reaction forces. Scalar extraction was found to bias the analyses of all three datasets by failing to consider sufficient portions of the dataset, and/or by failing to consider covariance amongst vector components. SPM overcame both problems by conducting hypothesis testing at the (massively multivariate) vector trajectory level, with random field corrections simultaneously accounting for temporal correlation and vector covariance. While SPM has been widely demonstrated to be effective for analyzing 3D scalar fields, the current results are the first to demonstrate its effectiveness for 1D vector field analysis. It was concluded that SPM offers a generalized, statistically comprehensive solution to scalar extraction's over-simplification of vector trajectories, thereby making it useful for objectively guiding analyses of complex biomechanical systems. © 2013 Published by Elsevier Ltd. All rights reserved.
RNA STRAND: The RNA Secondary Structure and Statistical Analysis Database
Directory of Open Access Journals (Sweden)
Andronescu Mirela
2008-08-01
Full Text Available Abstract Background The ability to access, search and analyse secondary structures of a large set of known RNA molecules is very important for deriving improved RNA energy models, for evaluating computational predictions of RNA secondary structures and for a better understanding of RNA folding. Currently there is no database that can easily provide these capabilities for almost all RNA molecules with known secondary structures. Results In this paper we describe RNA STRAND – the RNA secondary STRucture and statistical ANalysis Database, a curated database containing known secondary structures of any type and organism. Our new database provides a wide collection of known RNA secondary structures drawn from public databases, searchable and downloadable in a common format. Comprehensive statistical information on the secondary structures in our database is provided using the RNA Secondary Structure Analyser, a new tool we have developed to analyse RNA secondary structures. The information thus obtained is valuable for understanding to which extent and with which probability certain structural motifs can appear. We outline several ways in which the data provided in RNA STRAND can facilitate research on RNA structure, including the improvement of RNA energy models and evaluation of secondary structure prediction programs. In order to keep up-to-date with new RNA secondary structure experiments, we offer the necessary tools to add solved RNA secondary structures to our database and invite researchers to contribute to RNA STRAND. Conclusion RNA STRAND is a carefully assembled database of trusted RNA secondary structures, with easy on-line tools for searching, analyzing and downloading user selected entries, and is publicly available at http://www.rnasoft.ca/strand.
Tucker Tensor analysis of Matern functions in spatial statistics
Litvinenko, Alexander
2018-03-09
In this work, we describe advanced numerical tools for working with multivariate functions and for the analysis of large data sets. These tools will drastically reduce the required computing time and the storage cost, and, therefore, will allow us to consider much larger data sets or finer meshes. Covariance matrices are crucial in spatio-temporal statistical tasks, but are often very expensive to compute and store, especially in 3D. Therefore, we approximate covariance functions by cheap surrogates in a low-rank tensor format. We apply the Tucker and canonical tensor decompositions to a family of Matern- and Slater-type functions with varying parameters and demonstrate numerically that their approximations exhibit exponentially fast convergence. We prove the exponential convergence of the Tucker and canonical approximations in tensor rank parameters. Several statistical operations are performed in this low-rank tensor format, including evaluating the conditional covariance matrix, spatially averaged estimation variance, computing a quadratic form, determinant, trace, loglikelihood, inverse, and Cholesky decomposition of a large covariance matrix. Low-rank tensor approximations reduce the computing and storage costs essentially. For example, the storage cost is reduced from an exponential O(n^d) to a linear scaling O(drn), where d is the spatial dimension, n is the number of mesh points in one direction, and r is the tensor rank. Prerequisites for applicability of the proposed techniques are the assumptions that the data, locations, and measurements lie on a tensor (axes-parallel) grid and that the covariance function depends on a distance, ||x-y||.
The Inappropriate Symmetries of Multivariate Statistical Analysis in Geometric Morphometrics.
Bookstein, Fred L
In today's geometric morphometrics the commonest multivariate statistical procedures, such as principal component analysis or regressions of Procrustes shape coordinates on Centroid Size, embody a tacit roster of symmetries -axioms concerning the homogeneity of the multiple spatial domains or descriptor vectors involved-that do not correspond to actual biological fact. These techniques are hence inappropriate for any application regarding which we have a-priori biological knowledge to the contrary (e.g., genetic/morphogenetic processes common to multiple landmarks, the range of normal in anatomy atlases, the consequences of growth or function for form). But nearly every morphometric investigation is motivated by prior insights of this sort. We therefore need new tools that explicitly incorporate these elements of knowledge, should they be quantitative, to break the symmetries of the classic morphometric approaches. Some of these are already available in our literature but deserve to be known more widely: deflated (spatially adaptive) reference distributions of Procrustes coordinates, Sewall Wright's century-old variant of factor analysis, the geometric algebra of importing explicit biomechanical formulas into Procrustes space. Other methods, not yet fully formulated, might involve parameterized models for strain in idealized forms under load, principled approaches to the separation of functional from Brownian aspects of shape variation over time, and, in general, a better understanding of how the formalism of landmarks interacts with the many other approaches to quantification of anatomy. To more powerfully organize inferences from the high-dimensional measurements that characterize so much of today's organismal biology, tomorrow's toolkit must rely neither on principal component analysis nor on the Procrustes distance formula, but instead on sound prior biological knowledge as expressed in formulas whose coefficients are not all the same. I describe the problems
Statistical Analysis of Data with Non-Detectable Values
Energy Technology Data Exchange (ETDEWEB)
Frome, E.L.
2004-08-26
Environmental exposure measurements are, in general, positive and may be subject to left censoring, i.e. the measured value is less than a ''limit of detection''. In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. A basic problem of interest in environmental risk assessment is to determine if the mean concentration of an analyte is less than a prescribed action level. Parametric methods, used to determine acceptable levels of exposure, are often based on a two parameter lognormal distribution. The mean exposure level and/or an upper percentile (e.g. the 95th percentile) are used to characterize exposure levels, and upper confidence limits are needed to describe the uncertainty in these estimates. In certain situations it is of interest to estimate the probability of observing a future (or ''missed'') value of a lognormal variable. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left censored lognormal data are described and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on the 95th percentile (i.e. the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known but computational complexity has limited their use in routine data analysis with left censored data. The recent development of the R environment for statistical
Park, Sungjin; Moon, SeongRyeol; Lee, Kiyoung; Park, Ie Byung; Lee, Dae Ho; Nam, Seungyoon
2018-01-01
Diabetic nephropathy (DN), a major diabetic microvascular complication, has a long and growing list of biomarkers, including microRNA biomarkers, which have not been consistent across preclinical and clinical studies. This meta-analysis aims to identify significant blood- and urine-incident microRNAs as diagnostic/prognostic biomarker candidates for DN. PubMed, Web of Science, and Cochrane Library were searched from their earliest records through 12th Dec 2016. Relevant publications for the meta-analysis included (1) human participants; (2) microRNAs in blood and urine; (3) DN studies; and (4) English language. Four reviewers, including two physicians, independently and blindly extracted published data regarding microRNA profiles in blood and/or urine from subjects with diabetic nephropathy. A random-effect model was used to pool the data. Statistical associations between diabetic nephropathy and urinary or blood microRNA expression levels were assessed. Fourteen out of 327 studies (n=2,747 patients) were selected. Blood or urinary microRNA expression data of diabetic nephropathy were pooled for this analysis. The hsa-miR-126 family was significantly (OR: 0.57; 95% CI: 0.44-0.74; p-value diabetic kidney disease, while its urinary level was upregulated (OR: 2931.12; 95% CI: 9.96-862623.21; p-value = 0.0059). The hsa-miR-770 family microRNA were significantly (OR: 10.24; 95% CI: 2.37-44.25; p-value = 0.0018) upregulated in both blood and urine from patients with diabetic nephropathy. Our meta-analysis suggests that hsa-miR-126 and hsa-miR-770 family microRNA may have important diagnostic and pathogenetic implications for DN, which warrants further systematic clinical studies. © 2018 The Author(s). Published by S. Karger AG, Basel.
Directory of Open Access Journals (Sweden)
Lee Jae-Mok
2011-07-01
Full Text Available Abstract Background The inflammatory disease periodontitis results in tooth loss and can even lead to diseases of the whole body if not treated. Gingival crevicular fluid (GCF reflects the condition of the gingiva and contains proteins transuded from serum or cells at inflamed sites. In this study, we aimed to discover potential protein biomarkers for periodontitis in GCF proteome using LC-MS/MS. Results We identified 305 proteins from GCF of healthy individuals and periodontitis patients collected using a sterile gel loading tip by ESI-MS/MS coupled to nano-LC. Among these proteins, about 45 proteins were differentially expressed in the GCF proteome of moderate periodontitis patients when compared to the healthy individuals. We first identified azurocidin in the GCF, but not the saliva, as an upregulated protein in the periodontitis patients and verified its increased expression during periodontitis by ELISA using the GCF of the classified periodontitis patients compared to the healthy individuals. In addition, we found that azurocidin inhibited the differentiation of bone marrow-derived macrophages to osteoclasts. Conclusions Our results show that GCF collection using a gel loading tip and subsequent LC-MS/MS analysis following 1D-PAGE proteomic separation are effective for the analysis of the GCF proteome. Our current results also suggest that azurocidin could be a potential biomarker candidate for the early detection of inflammatory periodontal destruction by gingivitis and some chronic periodontitis. Our data also suggest that azurocidin may have an inhibitory role in osteoclast differentiation and, thus, a protective role in alveolar bone loss during the early stages of periodontitis.
Analysis and Evaluation of Statistical Models for Integrated Circuits Design
Directory of Open Access Journals (Sweden)
Sáenz-Noval J.J.
2011-10-01
Full Text Available Statistical models for integrated circuits (IC allow us to estimate the percentage of acceptable devices in the batch before fabrication. Actually, Pelgrom is the statistical model most accepted in the industry; however it was derived from a micrometer technology, which does not guarantee reliability in nanometric manufacturing processes. This work considers three of the most relevant statistical models in the industry and evaluates their limitations and advantages in analog design, so that the designer has a better criterion to make a choice. Moreover, it shows how several statistical models can be used for each one of the stages and design purposes.
Biomarkers in Alzheimer’s Disease Analysis by Mass Spectrometry-Based Proteomics
Directory of Open Access Journals (Sweden)
Yahui Liu
2014-05-01
Full Text Available Alzheimer’s disease (AD is a common chronic and destructive disease. The early diagnosis of AD is difficult, thus the need for clinically applicable biomarkers development is growing rapidly. There are many methods to biomarker discovery and identification. In this review, we aim to summarize Mass spectrometry (MS-based proteomics studies on AD and discuss thoroughly the methods to identify candidate biomarkers in cerebrospinal fluid (CSF and blood. This review will also discuss the potential research areas on biomarkers.
Plutonium metal exchange program : current status and statistical analysis
Energy Technology Data Exchange (ETDEWEB)
Tandon, L. (Lav); Eglin, J. L. (Judith Lynn); Michalak, S. E. (Sarah E.); Picard, R. R.; Temer, D. J. (Donald J.)
2004-01-01
The Rocky Flats Plutonium (Pu) Metal Sample Exchange program was conducted to insure the quality and intercomparability of measurements such as Pu assay, Pu isotopics, and impurity analyses. The Rocky Flats program was discontinued in 1989 after more than 30 years. In 2001, Los Alamos National Laboratory (LANL) reestablished the Pu Metal Exchange program. In addition to the Atomic Weapons Establishment (AWE) at Aldermaston, six Department of Energy (DOE) facilities Argonne East, Argonne West, Livermore, Los Alamos, New Brunswick Laboratory, and Savannah River are currently participating in the program. Plutonium metal samples are prepared and distributed to the sites for destructive measurements to determine elemental concentration, isotopic abundance, and both metallic and nonmetallic impurity levels. The program provides independent verification of analytical measurement capabilies for each participating facility and allows problems in analytical methods to be identified. The current status of the program will be discussed with emphasis on the unique statistical analysis and modeling of the data developed for the program. The discussion includes the definition of the consensus values for each analyte (in the presence and absence of anomalous values and/or censored values), and interesting features of the data and the results.
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis: Preprint
Energy Technology Data Exchange (ETDEWEB)
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-12-08
Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature attributed only approximately 0.12% to the NRMSE of the power output as opposed to 7.44% from the forecasted solar irradiance.
Corrected Statistical Energy Analysis Model for Car Interior Noise
Directory of Open Access Journals (Sweden)
A. Putra
2015-01-01
Full Text Available Statistical energy analysis (SEA is a well-known method to analyze the flow of acoustic and vibration energy in a complex structure. For an acoustic space where significant absorptive materials are present, direct field component from the sound source dominates the total sound field rather than a reverberant field, where the latter becomes the basis in constructing the conventional SEA model. Such environment can be found in a car interior and thus a corrected SEA model is proposed here to counter this situation. The model is developed by eliminating the direct field component from the total sound field and only the power after the first reflection is considered. A test car cabin was divided into two subsystems and by using a loudspeaker as a sound source, the power injection method in SEA was employed to obtain the corrected coupling loss factor and the damping loss factor from the corrected SEA model. These parameters were then used to predict the sound pressure level in the interior cabin using the injected input power from the engine. The results show satisfactory agreement with the directly measured SPL.
Multivariate statistical analysis - an application to lunar materials
International Nuclear Information System (INIS)
Deb, M.
1978-01-01
The compositional characteristics of clinopyroxenes and spinels - two minerals considered to be very useful in deciphering lunar history, have been studied using the multivariate statistical method of principal component analysis. The mineral-chemical data used are from certain lunar rocks and fines collected by Apollo 11, 12, 14 and 15 and Luna 16 and 20 missions, representing mainly the mare basalts and also non-mare basalts, breccia and rock fragments from the highland regions, in which a large number of these minerals have been analyzed. The correlations noted in the mineral compositions, indicating substitutional relationships, have been interpreted on the basis of available crystal-chemical and petrological informations. Compositional trends for individual specimens have been delineated and compared by producing ''principal latent vector diagrams''. The percent variance of the principal components denoted by the eigenvalues, have been evaluated in terms of the crystallization history of the samples. Some of the major petrogenetic implications of this study concern the role of early formed cumulate phases in the near-surface fractionation of mare basalts, mixing of mineral compositions in the highland regolith and the subsolidus reduction trends in lunar spinels. (auth.)
STATISTICAL ANALYSIS OF DAMAGEABILITY OF THE BYPASS ENGINES COMPRESSOR BLADES
Directory of Open Access Journals (Sweden)
Boris A. Chichkov
2018-01-01
Full Text Available Aircraft gas turbine engines during the operation are exposed to damage of flowing parts. The elements of the engine design, appreciably determining operational characteristics are rotor blades. Character of typical damages for various types of engines depends on appointment and a geographical place of the aircraft operation on which one or another engine is installed. For example, the greatest problem for turboshaft engines operated in the dusty air conditions is erosive wear of a rotor blade airfoil. Among principal causes of flowing parts damages of bypass engine compressors are foreign object damages. Independently there are the damages caused by fatigue of a rotor blade material at dangerous blade mode. Pieces of the ice formed in the input unit, birds and the like can also be a source of danger. The foreign objects getting into the engine from runway are nuts, bolts, pieces of tire protectors, lock-wire, elements from earlier flying off aircraft, etc. The entry of foreign objects into the engine depends on both an operation mode (during the operation on the ground, on takeoff, on landing roll using the reverse and so on, and the aircraft engine position.Thus the foreign objects entered into the flowing path of bypass engine damage blade cascade of low and high pressure. Foreign objects entered into the flowing part of the engine with rotor blades result in dents on edges and blade shroud, deformations of edges, breakage, camber of peripheral parts and are distributed "nonlinear" on path length (steps. The article presents the results of the statistical analysis of three types engine compressors damageability over the period of more than three years. Damages are divided according to types of engines in whole and to their separate steps, depths and lengths, blades damage location. The results of the analysis make it possible to develop recommendations to carry out the optical-visual control procedures.
Statistical Analysis of Development Trends in Global Renewable Energy
Directory of Open Access Journals (Sweden)
Marina D. Simonova
2016-01-01
Full Text Available The article focuses on the economic and statistical analysis of industries associated with the use of renewable energy sources in several countries. The dynamic development and implementation of technologies based on renewable energy sources (hereinafter RES is the defining trend of world energy development. The uneven distribution of hydrocarbon reserves, increasing demand of developing countries and environmental risks associated with the production and consumption of fossil resources has led to an increasing interest of many states to this field. Creating low-carbon economies involves the implementation of plans to increase the proportion of clean energy through renewable energy sources, energy efficiency, reduce greenhouse gas emissions. The priority of this sector is a characteristic feature of modern development of developed (USA, EU, Japan and emerging economies (China, India, Brazil, etc., as evidenced by the inclusion of the development of this segment in the state energy strategies and the revision of existing approaches to energy security. The analysis of the use of renewable energy, its contribution to value added of countries-producers is of a particular interest. Over the last decade, the share of energy produced from renewable sources in the energy balances of the world's largest economies increased significantly. Every year the number of power generating capacity based on renewable energy is growing, especially, this trend is apparent in China, USA and European Union countries. There is a significant increase in direct investment in renewable energy. The total investment over the past ten years increased by 5.6 times. The most rapidly developing kinds are solar energy and wind power.
TECHNIQUE OF THE STATISTICAL ANALYSIS OF INVESTMENT APPEAL OF THE REGION
Directory of Open Access Journals (Sweden)
А. А. Vershinina
2014-01-01
Full Text Available The technique of the statistical analysis of investment appeal of the region is given in scientific article for direct foreign investments. Definition of a technique of the statistical analysis is given, analysis stages reveal, the mathematico-statistical tools are considered.
Directory of Open Access Journals (Sweden)
Ainara Sistiaga
Full Text Available Our understanding of early human diets is based on reconstructed biomechanics of hominin jaws, bone and teeth isotopic data, tooth wear patterns, lithic, taphonomic and zooarchaeological data, which do not provide information about the relative amounts of different types of foods that contributed most to early human diets. Faecal biomarkers are proving to be a valuable tool in identifying relative proportions of plant and animal tissues in Palaeolithic diets. A limiting factor in the application of the faecal biomarker approach is the striking absence of data related to the occurrence of faecal biomarkers in non-human primate faeces. In this study we explored the nature and proportions of sterols and stanols excreted by our closest living relatives. This investigation reports the first faecal biomarker data for wild chimpanzee (Pan troglodytes and mountain gorilla (Gorilla beringei. Our results suggest that the chemometric analysis of faecal biomarkers is a useful tool for distinguishing between NHP and human faecal matter, and hence, it could provide information for palaeodietary research and early human diets.
Sistiaga, Ainara; Wrangham, Richard; Rothman, Jessica M; Summons, Roger E
2015-01-01
Our understanding of early human diets is based on reconstructed biomechanics of hominin jaws, bone and teeth isotopic data, tooth wear patterns, lithic, taphonomic and zooarchaeological data, which do not provide information about the relative amounts of different types of foods that contributed most to early human diets. Faecal biomarkers are proving to be a valuable tool in identifying relative proportions of plant and animal tissues in Palaeolithic diets. A limiting factor in the application of the faecal biomarker approach is the striking absence of data related to the occurrence of faecal biomarkers in non-human primate faeces. In this study we explored the nature and proportions of sterols and stanols excreted by our closest living relatives. This investigation reports the first faecal biomarker data for wild chimpanzee (Pan troglodytes) and mountain gorilla (Gorilla beringei). Our results suggest that the chemometric analysis of faecal biomarkers is a useful tool for distinguishing between NHP and human faecal matter, and hence, it could provide information for palaeodietary research and early human diets.
Energy Technology Data Exchange (ETDEWEB)
Brink, N.W. van den; Ruiter-Dijkman, E.M. De; Broekhuizen, S.; Reijnders, P.J.H.; Bosveld, A.T.C.
2000-03-01
Biomarkers are valuable instruments to assess the risks from exposure of organisms to organochlorines. In general, however, these biomarkers are either destructive to the animal of interest or extremely difficult to obtain otherwise. In this paper, the authors present a nondestructive biomarker for exposure to cytochrome P450-inducing organochlorines. This marker is based on a pattern analysis of metabolizable and nonmetabolizable polychlorinated biphenyl (PCB) congeners, which occur in several kinds of tissues (and even blood) that can be obtained without serious effects on the organism involved. The fraction of metabolizable PCB congeners is negatively correlated with exposure to PCBs, which are known to induce specific P450 isoenzymes. This relation can be modeled by a logistic curve, which can be used to define critical levels of exposure. In addition, this method creates an opportunity to analyze biomarker responses in archived tissues stored at standard freezing temperatures ({minus}20 C), at which responses to established biomarkers deteriorate. Furthermore, this method facilitates attribution of the enzyme induction to certain classes of compounds.
Directory of Open Access Journals (Sweden)
Felipe Gomes Naveca
2018-05-01
Full Text Available BACKGROUND Infection with Zika virus (ZIKV manifests in a broad spectrum of disease ranging from mild illness to severe neurological complications and little is known about Zika immunopathogenesis. OBJECTIVES To define the immunologic biomarkers that correlate with acute ZIKV infection. METHODS We characterized the levels of circulating cytokines, chemokines, and growth factors in 54 infected patients of both genders at five different time points after symptom onset using microbeads multiplex immunoassay; comparison to 100 age-matched controls was performed for statistical analysis and data mining. FINDINGS ZIKV-infected patients present a striking systemic inflammatory response with high levels of pro-inflammatory mediators. Despite the strong inflammatory pattern, IL-1Ra and IL-4 are also induced during the acute infection. Interestingly, the inflammatory cytokines IL-1β, IL-13, IL-17, TNF-α, and IFN-γ; chemokines CXCL8, CCL2, CCL5; and the growth factor G-CSF, displayed a bimodal distribution accompanying viremia. While this is the first manuscript to document bimodal distributions of viremia in ZIKV infection, this has been documented in other viral infections, with a primary viremia peak during mild systemic disease and a secondary peak associated with distribution of the virus to organs and tissues. MAIN CONCLUSIONS Biomarker network analysis demonstrated distinct dynamics in concurrence with the bimodal viremia profiles at different time points during ZIKV infection. Such a robust cytokine and chemokine response has been associated with blood-brain barrier permeability and neuroinvasiveness in other flaviviral infections. High-dimensional data analysis further identified CXCL10, a chemokine involved in foetal neuron apoptosis and Guillain-Barré syndrome, as the most promising biomarker of acute ZIKV infection for potential clinical application.
Keshishian, Hasmik; Burgess, Michael W; Specht, Harrison; Wallace, Luke; Clauser, Karl R; Gillette, Michael A; Carr, Steven A
2017-08-01
Proteomic characterization of blood plasma is of central importance to clinical proteomics and particularly to biomarker discovery studies. The vast dynamic range and high complexity of the plasma proteome have, however, proven to be serious challenges and have often led to unacceptable tradeoffs between depth of coverage and sample throughput. We present an optimized sample-processing pipeline for analysis of the human plasma proteome that provides greatly increased depth of detection, improved quantitative precision and much higher sample analysis throughput as compared with prior methods. The process includes abundant protein depletion, isobaric labeling at the peptide level for multiplexed relative quantification and ultra-high-performance liquid chromatography coupled to accurate-mass, high-resolution tandem mass spectrometry analysis of peptides fractionated off-line by basic pH reversed-phase (bRP) chromatography. The overall reproducibility of the process, including immunoaffinity depletion, is high, with a process replicate coefficient of variation (CV) of 4,500 proteins are detected and quantified per patient sample on average, with two or more peptides per protein and starting from as little as 200 μl of plasma. The approach can be multiplexed up to 10-plex using tandem mass tags (TMT) reagents, further increasing throughput, albeit with some decrease in the number of proteins quantified. In addition, we provide a rapid protocol for analysis of nonfractionated depleted plasma samples analyzed in 10-plex. This provides ∼600 quantified proteins for each of the ten samples in ∼5 h of instrument time.
Parallelization of the Physical-Space Statistical Analysis System (PSAS)
Larson, J. W.; Guo, J.; Lyster, P. M.
1999-01-01
Atmospheric data assimilation is a method of combining observations with model forecasts to produce a more accurate description of the atmosphere than the observations or forecast alone can provide. Data assimilation plays an increasingly important role in the study of climate and atmospheric chemistry. The NASA Data Assimilation Office (DAO) has developed the Goddard Earth Observing System Data Assimilation System (GEOS DAS) to create assimilated datasets. The core computational components of the GEOS DAS include the GEOS General Circulation Model (GCM) and the Physical-space Statistical Analysis System (PSAS). The need for timely validation of scientific enhancements to the data assimilation system poses computational demands that are best met by distributed parallel software. PSAS is implemented in Fortran 90 using object-based design principles. The analysis portions of the code solve two equations. The first of these is the "innovation" equation, which is solved on the unstructured observation grid using a preconditioned conjugate gradient (CG) method. The "analysis" equation is a transformation from the observation grid back to a structured grid, and is solved by a direct matrix-vector multiplication. Use of a factored-operator formulation reduces the computational complexity of both the CG solver and the matrix-vector multiplication, rendering the matrix-vector multiplications as a successive product of operators on a vector. Sparsity is introduced to these operators by partitioning the observations using an icosahedral decomposition scheme. PSAS builds a large (approx. 128MB) run-time database of parameters used in the calculation of these operators. Implementing a message passing parallel computing paradigm into an existing yet developing computational system as complex as PSAS is nontrivial. One of the technical challenges is balancing the requirements for computational reproducibility with the need for high performance. The problem of computational
Reiser, Vladimír; Smith, Ryan C; Xue, Jiyan; Kurtz, Marc M; Liu, Rong; Legrand, Cheryl; He, Xuanmin; Yu, Xiang; Wong, Peggy; Hinchcliffe, John S; Tanen, Michael R; Lazar, Gloria; Zieba, Renata; Ichetovkin, Marina; Chen, Zhu; O'Neill, Edward A; Tanaka, Wesley K; Marton, Matthew J; Liao, Jason; Morris, Mark; Hailman, Eric; Tokiwa, George Y; Plump, Andrew S
2011-11-01
With expanding biomarker discovery efforts and increasing costs of drug development, it is critical to maximize the value of mass-limited clinical samples. The main limitation of available methods is the inability to isolate and analyze, from a single sample, molecules requiring incompatible extraction methods. Thus, we developed a novel semiautomated method for tissue processing and tissue milling and division (TMAD). We used a SilverHawk atherectomy catheter to collect atherosclerotic plaques from patients requiring peripheral atherectomy. Tissue preservation by flash freezing was compared with immersion in RNAlater®, and tissue grinding by traditional mortar and pestle was compared with TMAD. Comparators were protein, RNA, and lipid yield and quality. Reproducibility of analyte yield from aliquots of the same tissue sample processed by TMAD was also measured. The quantity and quality of biomarkers extracted from tissue prepared by TMAD was at least as good as that extracted from tissue stored and prepared by traditional means. TMAD enabled parallel analysis of gene expression (quantitative reverse-transcription PCR, microarray), protein composition (ELISA), and lipid content (biochemical assay) from as little as 20 mg of tissue. The mean correlation was r = 0.97 in molecular composition (RNA, protein, or lipid) between aliquots of individual samples generated by TMAD. We also demonstrated that it is feasible to use TMAD in a large-scale clinical study setting. The TMAD methodology described here enables semiautomated, high-throughput sampling of small amounts of heterogeneous tissue specimens by multiple analytical techniques with generally improved quality of recovered biomolecules.
Moon, Myungjin; Nakai, Kenta
2018-04-01
Currently, cancer biomarker discovery is one of the important research topics worldwide. In particular, detecting significant genes related to cancer is an important task for early diagnosis and treatment of cancer. Conventional studies mostly focus on genes that are differentially expressed in different states of cancer; however, noise in gene expression datasets and insufficient information in limited datasets impede precise analysis of novel candidate biomarkers. In this study, we propose an integrative analysis of gene expression and DNA methylation using normalization and unsupervised feature extractions to identify candidate biomarkers of cancer using renal cell carcinoma RNA-seq datasets. Gene expression and DNA methylation datasets are normalized by Box-Cox transformation and integrated into a one-dimensional dataset that retains the major characteristics of the original datasets by unsupervised feature extraction methods, and differentially expressed genes are selected from the integrated dataset. Use of the integrated dataset demonstrated improved performance as compared with conventional approaches that utilize gene expression or DNA methylation datasets alone. Validation based on the literature showed that a considerable number of top-ranked genes from the integrated dataset have known relationships with cancer, implying that novel candidate biomarkers can also be acquired from the proposed analysis method. Furthermore, we expect that the proposed method can be expanded for applications involving various types of multi-omics datasets.
Multi-analyte analysis of saliva biomarkers as predictors of periodontal and pre-implant disease
Braun, Thomas; Giannobile, William V; Herr, Amy E; Singh, Anup K; Shelburne, Charlie
2015-04-07
The present invention relates to methods of measuring biomarkers to determine the probability of a periodontal and/or peri-implant disease. More specifically, the invention provides a panel of biomarkers that, when used in combination, can allow determination of the probability of a periodontal and/or peri-implant disease state with extremely high accuracy.
Biomarkers for atopic dermatitis : a systematic review and meta-analysis
Thijs, Judith; Krastev, Todor; Weidinger, Stephan; Buckens, Constantinus F; de Bruin-Weller, Marjolein; Bruijnzeel-Koomen, Carla; Flohr, Carsten; Hijnen, DirkJan
2015-01-01
PURPOSE OF REVIEW: A large number of studies investigating the correlation between severity of atopic dermatitis and various biomarkers have been published over the past decades. The aim of this review was to identify, evaluate and synthesize the evidence examining the correlation of biomarkers with
Cellular Analysis of Boltzmann Most Probable Ideal Gas Statistics
Cahill, Michael E.
2018-04-01
Exact treatment of Boltzmann's Most Probable Statistics for an Ideal Gas of Identical Mass Particles having Translational Kinetic Energy gives a Distribution Law for Velocity Phase Space Cell j which relates the Particle Energy and the Particle Population according toB e(j) = A - Ψ(n(j) + 1)where A & B are the Lagrange Multipliers and Ψ is the Digamma Function defined byΨ(x + 1) = d/dx ln(x!)A useful sufficiently accurate approximation for Ψ is given byΨ(x +1) ≈ ln(e-γ + x)where γ is the Euler constant (≈.5772156649) & so the above distribution equation is approximatelyB e(j) = A - ln(e-γ + n(j))which can be inverted to solve for n(j) givingn(j) = (eB (eH - e(j)) - 1) e-γwhere B eH = A + γ& where B eH is a unitless particle energy which replaces the parameter A. The 2 approximate distribution equations imply that eH is the highest particle energy and the highest particle population isnH = (eB eH - 1) e-γwhich is due to the facts that population becomes negative if e(j) > eH and kinetic energy becomes negative if n(j) > nH.An explicit construction of Cells in Velocity Space which are equal in volume and homogeneous for almost all cells is shown to be useful in the analysis.Plots for sample distribution properties using e(j) as the independent variable are presented.
Mascaró, Maite; Sacristán, Ana Isabel; Rufino, Marta M.
2016-01-01
For the past 4 years, we have been involved in a project that aims to enhance the teaching and learning of experimental analysis and statistics, of environmental and biological sciences students, through computational programming activities (using R code). In this project, through an iterative design, we have developed sequences of R-code-based…
Olive mill wastewater characteristics: modelling and statistical analysis
Directory of Open Access Journals (Sweden)
Martins-Dias, Susete
2004-09-01
Full Text Available A synthesis of the work carried out on Olive Mill Wastewater (OMW characterisation is given, covering articles published over the last 50 years. Data on OMW characterisation found in the literature are summarised and correlations between them and with phenolic compounds content are sought. This permits the characteristics of an OMW to be estimated from one simple measurement: the phenolic compounds concentration. A model based on OMW characterisations accounting 6 countries was developed along with a model for Portuguese OMW. The statistical analysis of the correlations obtained indicates that Chemical Oxygen Demand of a given OMW is a second-degree polynomial function of its phenolic compounds concentration. Tests to evaluate the regressions significance were carried out, based on multivariable ANOVA analysis, on visual standardised residuals distribution and their means for confidence levels of 95 and 99 %, validating clearly these models. This modelling work will help in the future planning, operation and monitoring of an OMW treatment plant.Presentamos una síntesis de los trabajos realizados en los últimos 50 años relacionados con la caracterización del alpechín. Realizamos una recopilación de los datos publicados, buscando correlaciones entre los datos relativos al alpechín y los compuestos fenólicos. Esto permite la determinación de las características del alpechín a partir de una sola medida: La concentración de compuestos fenólicos. Proponemos dos modelos, uno basado en datos relativos a seis países y un segundo aplicado únicamente a Portugal. El análisis estadístico de las correlaciones obtenidas indica que la demanda química de oxígeno de un determinado alpechín es una función polinómica de segundo grado de su concentración de compuestos fenólicos. Se comprobó la significancia de esta correlación mediante la aplicación del análisis multivariable ANOVA, y además se evaluó la distribución de residuos y sus
A statistical framework for differential network analysis from microarray data
Directory of Open Access Journals (Sweden)
Datta Somnath
2010-02-01
Full Text Available Abstract Background It has been long well known that genes do not act alone; rather groups of genes act in consort during a biological process. Consequently, the expression levels of genes are dependent on each other. Experimental techniques to detect such interacting pairs of genes have been in place for quite some time. With the advent of microarray technology, newer computational techniques to detect such interaction or association between gene expressions are being proposed which lead to an association network. While most microarray analyses look for genes that are differentially expressed, it is of potentially greater significance to identify how entire association network structures change between two or more biological settings, say normal versus diseased cell types. Results We provide a recipe for conducting a differential analysis of networks constructed from microarray data under two experimental settings. At the core of our approach lies a connectivity score that represents the strength of genetic association or interaction between two genes. We use this score to propose formal statistical tests for each of following queries: (i whether the overall modular structures of the two networks are different, (ii whether the connectivity of a particular set of "interesting genes" has changed between the two networks, and (iii whether the connectivity of a given single gene has changed between the two networks. A number of examples of this score is provided. We carried out our method on two types of simulated data: Gaussian networks and networks based on differential equations. We show that, for appropriate choices of the connectivity scores and tuning parameters, our method works well on simulated data. We also analyze a real data set involving normal versus heavy mice and identify an interesting set of genes that may play key roles in obesity. Conclusions Examining changes in network structure can provide valuable information about the
Statistical analysis of complex systems with nonclassical invariant measures
Fratalocchi, Andrea
2011-01-01
I investigate the problem of finding a statistical description of a complex many-body system whose invariant measure cannot be constructed stemming from classical thermodynamics ensembles. By taking solitons as a reference system and by employing a
Practical application and statistical analysis of titrimetric monitoring ...
African Journals Online (AJOL)
2008-09-18
Sep 18, 2008 ... The statistical tests showed that, depending on the titrant concentration ... The ASD process offers the possibility of transferring waste streams into ..... (1993) Weak acid/bases and pH control in anaerobic system – A review.
STATISTICAL ANALYSIS OF THE DEMOLITION OF THE HITCH DEVICES ELEMENTS
Directory of Open Access Journals (Sweden)
V. V. Artemchuk
2009-03-01
Full Text Available The results of statistical research of wear of automatic coupler body butts and thrust plates of electric locomotives are presented in the article. Due to the increased wear the mentioned elements require special attention.
Gregor Mendel's Genetic Experiments: A Statistical Analysis after 150 Years
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2016-01-01
Roč. 12, č. 2 (2016), s. 20-26 ISSN 1801-5603 Institutional support: RVO:67985807 Keywords : genetics * history of science * biostatistics * design of experiments Subject RIV: BB - Applied Statistics, Operational Research
Analysis of Statistical Methods Currently used in Toxicology Journals.
Na, Jihye; Yang, Hyeri; Bae, SeungJin; Lim, Kyung-Min
2014-09-01
Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by the studies are used consistently and conducted based on sound statistical grounds. The purpose of this paper is to describe statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Science and described methodologies used to provide descriptive and inferential statistics. One hundred thirteen endpoints were observed in those 30 papers, and most studies had sample size less than 10, with the median and the mode being 6 and 3 & 6, respectively. Mean (105/113, 93%) was dominantly used to measure central tendency, and standard error of the mean (64/113, 57%) and standard deviation (39/113, 34%) were used to measure dispersion, while few studies provide justifications regarding why the methods being selected. Inferential statistics were frequently conducted (93/113, 82%), with one-way ANOVA being most popular (52/93, 56%), yet few studies conducted either normality or equal variance test. These results suggest that more consistent and appropriate use of statistical method is necessary which may enhance the role of toxicology in public health.
Directory of Open Access Journals (Sweden)
Claudia Bühnemann
Full Text Available Driven by genomic somatic variation, tumour tissues are typically heterogeneous, yet unbiased quantitative methods are rarely used to analyse heterogeneity at the protein level. Motivated by this problem, we developed automated image segmentation of images of multiple biomarkers in Ewing sarcoma to generate distributions of biomarkers between and within tumour cells. We further integrate high dimensional data with patient clinical outcomes utilising random survival forest (RSF machine learning. Using material from cohorts of genetically diagnosed Ewing sarcoma with EWSR1 chromosomal translocations, confocal images of tissue microarrays were segmented with level sets and watershed algorithms. Each cell nucleus and cytoplasm were identified in relation to DAPI and CD99, respectively, and protein biomarkers (e.g. Ki67, pS6, Foxo3a, EGR1, MAPK localised relative to nuclear and cytoplasmic regions of each cell in order to generate image feature distributions. The image distribution features were analysed with RSF in relation to known overall patient survival from three separate cohorts (185 informative cases. Variation in pre-analytical processing resulted in elimination of a high number of non-informative images that had poor DAPI localisation or biomarker preservation (67 cases, 36%. The distribution of image features for biomarkers in the remaining high quality material (118 cases, 104 features per case were analysed by RSF with feature selection, and performance assessed using internal cross-validation, rather than a separate validation cohort. A prognostic classifier for Ewing sarcoma with low cross-validation error rates (0.36 was comprised of multiple features, including the Ki67 proliferative marker and a sub-population of cells with low cytoplasmic/nuclear ratio of CD99. Through elimination of bias, the evaluation of high-dimensionality biomarker distribution within cell populations of a tumour using random forest analysis in quality
Bühnemann, Claudia; Li, Simon; Yu, Haiyue; Branford White, Harriet; Schäfer, Karl L; Llombart-Bosch, Antonio; Machado, Isidro; Picci, Piero; Hogendoorn, Pancras C W; Athanasou, Nicholas A; Noble, J Alison; Hassan, A Bassim
2014-01-01
Driven by genomic somatic variation, tumour tissues are typically heterogeneous, yet unbiased quantitative methods are rarely used to analyse heterogeneity at the protein level. Motivated by this problem, we developed automated image segmentation of images of multiple biomarkers in Ewing sarcoma to generate distributions of biomarkers between and within tumour cells. We further integrate high dimensional data with patient clinical outcomes utilising random survival forest (RSF) machine learning. Using material from cohorts of genetically diagnosed Ewing sarcoma with EWSR1 chromosomal translocations, confocal images of tissue microarrays were segmented with level sets and watershed algorithms. Each cell nucleus and cytoplasm were identified in relation to DAPI and CD99, respectively, and protein biomarkers (e.g. Ki67, pS6, Foxo3a, EGR1, MAPK) localised relative to nuclear and cytoplasmic regions of each cell in order to generate image feature distributions. The image distribution features were analysed with RSF in relation to known overall patient survival from three separate cohorts (185 informative cases). Variation in pre-analytical processing resulted in elimination of a high number of non-informative images that had poor DAPI localisation or biomarker preservation (67 cases, 36%). The distribution of image features for biomarkers in the remaining high quality material (118 cases, 104 features per case) were analysed by RSF with feature selection, and performance assessed using internal cross-validation, rather than a separate validation cohort. A prognostic classifier for Ewing sarcoma with low cross-validation error rates (0.36) was comprised of multiple features, including the Ki67 proliferative marker and a sub-population of cells with low cytoplasmic/nuclear ratio of CD99. Through elimination of bias, the evaluation of high-dimensionality biomarker distribution within cell populations of a tumour using random forest analysis in quality controlled tumour
2010-05-05
...] Guidance for Industry on Documenting Statistical Analysis Programs and Data Files; Availability AGENCY... documenting statistical analyses and data files submitted to the Center for Veterinary Medicine (CVM) for the... on Documenting Statistical Analysis Programs and Data Files; Availability'' giving interested persons...
Conjunction analysis and propositional logic in fMRI data analysis using Bayesian statistics.
Rudert, Thomas; Lohmann, Gabriele
2008-12-01
To evaluate logical expressions over different effects in data analyses using the general linear model (GLM) and to evaluate logical expressions over different posterior probability maps (PPMs). In functional magnetic resonance imaging (fMRI) data analysis, the GLM was applied to estimate unknown regression parameters. Based on the GLM, Bayesian statistics can be used to determine the probability of conjunction, disjunction, implication, or any other arbitrary logical expression over different effects or contrast. For second-level inferences, PPMs from individual sessions or subjects are utilized. These PPMs can be combined to a logical expression and its probability can be computed. The methods proposed in this article are applied to data from a STROOP experiment and the methods are compared to conjunction analysis approaches for test-statistics. The combination of Bayesian statistics with propositional logic provides a new approach for data analyses in fMRI. Two different methods are introduced for propositional logic: the first for analyses using the GLM and the second for common inferences about different probability maps. The methods introduced extend the idea of conjunction analysis to a full propositional logic and adapt it from test-statistics to Bayesian statistics. The new approaches allow inferences that are not possible with known standard methods in fMRI. (c) 2008 Wiley-Liss, Inc.
Statistical Analysis of CMC Constituent and Processing Data
Fornuff, Jonathan
2004-01-01
observed using statistical analysis software. The ultimate purpose of this study is to determine what variations in material processing can lead to the most critical changes in the materials property. The work I have taken part in this summer explores, in general, the key properties needed In this study SiC/SiC composites of varying architectures, utilizing a boron-nitride (BN)
Pre-pregnancy obesity and maternal nutritional biomarker status during pregnancy: a factor analysis.
Tomedi, Laura E; Chang, Chung-Chou H; Newby, P K; Evans, Rhobert W; Luther, James F; Wisner, Katherine L; Bodnar, Lisa M
2013-08-01
Pre-pregnancy obesity has been associated with adverse birth outcomes. Poor essential fatty acid (EFA) and micronutrient status during pregnancy may contribute to these associations. We assessed the associations between pre-pregnancy BMI and nutritional patterns of maternal micronutrient and EFA status during mid-pregnancy. A cross-sectional analysis from a prospective cohort study. Women provided non-fasting blood samples at ≥ 20 weeks’ gestation that were assayed for red cell EFA; plasma folate, homocysteine and ascorbic acid; and serum retinol, 25-hydroxyvitamin D, a-tocopherol, soluble transferrin receptors and carotenoids. These nutritional biomarkers were employed in a factor analysis and three patterns were derived: EFA, Micronutrients and Carotenoids. The Antidepressant Use During Pregnancy Study, Pittsburgh, PA, USA. Pregnant women (n 129). After adjustment for parity, race/ethnicity and age, obese pregnant women were 3.0 (95% CI 1.1, 7.7) times more likely to be in the lowest tertile of the EFA pattern and 4.5 (95% CI 1.7, 12.3) times more likely to be in the lowest tertile of the Carotenoid pattern compared with their lean counterparts. We found no association between pre-pregnancy obesity and the Micronutrient pattern after confounder adjustment. Our results suggest that obese pregnant women have diminished EFA and carotenoid concentrations.
Visibility graph analysis of heart rate time series and bio-marker of congestive heart failure
Bhaduri, Anirban; Bhaduri, Susmita; Ghosh, Dipak
2017-09-01
Study of RR interval time series for Congestive Heart Failure had been an area of study with different methods including non-linear methods. In this article the cardiac dynamics of heart beat are explored in the light of complex network analysis, viz. visibility graph method. Heart beat (RR Interval) time series data taken from Physionet database [46, 47] belonging to two groups of subjects, diseased (congestive heart failure) (29 in number) and normal (54 in number) are analyzed with the technique. The overall results show that a quantitative parameter can significantly differentiate between the diseased subjects and the normal subjects as well as different stages of the disease. Further, the data when split into periods of around 1 hour each and analyzed separately, also shows the same consistent differences. This quantitative parameter obtained using the visibility graph analysis thereby can be used as a potential bio-marker as well as a subsequent alarm generation mechanism for predicting the onset of Congestive Heart Failure.
Biomarkers of tolerance: searching for the hidden phenotype.
Perucha, Esperanza; Rebollo-Mesa, Irene; Sagoo, Pervinder; Hernandez-Fuentes, Maria P
2011-08-01
Induction of transplantation tolerance remains the ideal long-term clinical and logistic solution to the current challenges facing the management of renal allograft recipients. In this review, we describe the recent studies and advances made in identifying biomarkers of renal transplant tolerance, from study inceptions, to the lessons learned and their implications for current and future studies with the same goal. With the age of biomarker discovery entering a new dimension of high-throughput technologies, here we also review the current approaches, developments, and pitfalls faced in the subsequent statistical analysis required to identify valid biomarker candidates.
Statistical analysis of natural disasters and related losses
Pisarenko, VF
2014-01-01
The study of disaster statistics and disaster occurrence is a complicated interdisciplinary field involving the interplay of new theoretical findings from several scientific fields like mathematics, physics, and computer science. Statistical studies on the mode of occurrence of natural disasters largely rely on fundamental findings in the statistics of rare events, which were derived in the 20th century. With regard to natural disasters, it is not so much the fact that the importance of this problem for mankind was recognized during the last third of the 20th century - the myths one encounters in ancient civilizations show that the problem of disasters has always been recognized - rather, it is the fact that mankind now possesses the necessary theoretical and practical tools to effectively study natural disasters, which in turn supports effective, major practical measures to minimize their impact. All the above factors have resulted in considerable progress in natural disaster research. Substantial accrued ma...
Statistics and data analysis for financial engineering with R examples
Ruppert, David
2015-01-01
The new edition of this influential textbook, geared towards graduate or advanced undergraduate students, teaches the statistics necessary for financial engineering. In doing so, it illustrates concepts using financial markets and economic data, R Labs with real-data exercises, and graphical and analytic methods for modeling and diagnosing modeling errors. Financial engineers now have access to enormous quantities of data. To make use of these data, the powerful methods in this book, particularly about volatility and risks, are essential. Strengths of this fully-revised edition include major additions to the R code and the advanced topics covered. Individual chapters cover, among other topics, multivariate distributions, copulas, Bayesian computations, risk management, multivariate volatility and cointegration. Suggested prerequisites are basic knowledge of statistics and probability, matrices and linear algebra, and calculus. There is an appendix on probability, statistics and linear algebra. Practicing fina...
Statistical cluster analysis and diagnosis of nuclear system level performance
International Nuclear Information System (INIS)
Teichmann, T.; Levine, M.M.; Samanta, P.K.; Kato, W.Y.
1985-01-01
The complexity of individual nuclear power plants and the importance of maintaining reliable and safe operations makes it desirable to complement the deterministic analyses of these plants by corresponding statistical surveys and diagnoses. Based on such investigations, one can then explore, statistically, the anticipation, prevention, and when necessary, the control of such failures and malfunctions. This paper, and the accompanying one by Samanta et al., describe some of the initial steps in exploring the feasibility of setting up such a program on an integrated and global (industry-wide) basis. The conceptual statistical and data framework was originally outlined in BNL/NUREG-51609, NUREG/CR-3026, and the present work aims at showing how some important elements might be implemented in a practical way (albeit using hypothetical or simulated data)
Certification of medical librarians, 1949--1977 statistical analysis.
Schmidt, D
1979-01-01
The Medical Library Association's Code for Training and Certification of Medical Librarians was in effect from 1949 to August 1977, a period during which 3,216 individuals were certified. Statistics on each type of certificate granted each year are provided. Because 54.5% of those granted certification were awarded it in the last three-year, two-month period of the code's existence, these applications are reviewed in greater detail. Statistics on each type of certificate granted each year are provided. Because 54.5% of those granted certification were awarded it in the last three-year, two-month period of the code's existence, these applications are reviewed in greater detail. Statistics on MLA membership, sex, residence, library school, and method of meeting requirements are detailed. Questions relating to certification under the code now in existence are raised.
Multivariate Statistical Methods as a Tool of Financial Analysis of Farm Business
Czech Academy of Sciences Publication Activity Database
Novák, J.; Sůvová, H.; Vondráček, Jiří
2002-01-01
Roč. 48, č. 1 (2002), s. 9-12 ISSN 0139-570X Institutional research plan: AV0Z1030915 Keywords : financial analysis * financial ratios * multivariate statistical methods * correlation analysis * discriminant analysis * cluster analysis Subject RIV: BB - Applied Statistics, Operational Research
Categorical and nonparametric data analysis choosing the best statistical technique
Nussbaum, E Michael
2014-01-01
Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain
Statistical analysis and planning of multihundred-watt impact tests
International Nuclear Information System (INIS)
Martz, H.F. Jr.; Waterman, M.S.
1977-10-01
Modular multihundred-watt (MHW) radioisotope thermoelectric generators (RTG's) are used as a power source for spacecraft. Due to possible environmental contamination by radioactive materials, numerous tests are required to determine and verify the safety of the RTG. There are results available from 27 fueled MHW impact tests regarding hoop failure, fingerprint failure, and fuel failure. Data from the 27 tests are statistically analyzed for relationships that exist between the test design variables and the failure types. Next, these relationships are used to develop a statistical procedure for planning and conducting either future MHW impact tests or similar tests on other RTG fuel sources. Finally, some conclusions are given
Biomarkers of Pediatric Brain Tumors
Directory of Open Access Journals (Sweden)
Mark D Russell
2013-03-01
Full Text Available Background and Need for Novel Biomarkers: Brain tumors are the leading cause of death by solid tumors in children. Although improvements have been made in their radiological detection and treatment, our capacity to promptly diagnose pediatric brain tumors in their early stages remains limited. This contrasts several other cancers where serum biomarkers such as CA 19-9 and CA 125 facilitate early diagnosis and treatment. Aim: The aim of this article is to review the latest literature and highlight biomarkers which may be of clinical use in the common types of primary pediatric brain tumor. Methods: A PubMed search was performed to identify studies reporting biomarkers in the bodily fluids of pediatric patients with brain tumors. Details regarding the sample type (serum, cerebrospinal fluid or urine, biomarkers analyzed, methodology, tumor type and statistical significance were recorded. Results: A total of 12 manuscripts reporting 19 biomarkers in 367 patients vs. 397 controls were identified in the literature. Of the 19 biomarkers identified, 12 were isolated from cerebrospinal fluid, 2 from serum, 3 from urine, and 2 from multiple bodily fluids. All but one study reported statistically significant differences in biomarker expression between patient and control groups.Conclusions: This review identifies a panel of novel biomarkers for pediatric brain tumors. It provides a platform for the further studies necessary to validate these biomarkers and, in addition, highlights several techniques through which new biomarkers can be discovered.
Bruña, Ricardo; Poza, Jesús; Gómez, Carlos; García, María; Fernández, Alberto; Hornero, Roberto
2012-06-01
Alzheimer's disease (AD) is the most common cause of dementia. Over the last few years, a considerable effort has been devoted to exploring new biomarkers. Nevertheless, a better understanding of brain dynamics is still required to optimize therapeutic strategies. In this regard, the characterization of mild cognitive impairment (MCI) is crucial, due to the high conversion rate from MCI to AD. However, only a few studies have focused on the analysis of magnetoencephalographic (MEG) rhythms to characterize AD and MCI. In this study, we assess the ability of several parameters derived from information theory to describe spontaneous MEG activity from 36 AD patients, 18 MCI subjects and 26 controls. Three entropies (Shannon, Tsallis and Rényi entropies), one disequilibrium measure (based on Euclidean distance ED) and three statistical complexities (based on Lopez Ruiz-Mancini-Calbet complexity LMC) were used to estimate the irregularity and statistical complexity of MEG activity. Statistically significant differences between AD patients and controls were obtained with all parameters (p validation procedure was applied. The accuracies reached 83.9% and 65.9% to discriminate AD and MCI subjects from controls, respectively. Our findings suggest that MCI subjects exhibit an intermediate pattern of abnormalities between normal aging and AD. Furthermore, the proposed parameters provide a new description of brain dynamics in AD and MCI.
Common pitfalls in statistical analysis: The perils of multiple testing
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2016-01-01
Multiple testing refers to situations where a dataset is subjected to statistical testing multiple times - either at multiple time-points or through multiple subgroups or for multiple end-points. This amplifies the probability of a false-positive finding. In this article, we look at the consequences of multiple testing and explore various methods to deal with this issue. PMID:27141478
Statistical analysis of agarwood oil compounds in discriminating the ...
African Journals Online (AJOL)
Enhancing and improving the discrimination technique is the main aim to determine or grade the good quality of agarwood oil. In this paper, all statistical works were performed via SPSS software. Two parameters involved are abundance of compound (%) and quality of t agarwood oil either low or high quality. The result ...
Bayesian statistical analysis of censored data in geotechnical engineering
DEFF Research Database (Denmark)
Ditlevsen, Ove Dalager; Tarp-Johansen, Niels Jacob; Denver, Hans
2000-01-01
The geotechnical engineer is often faced with the problem ofhow to assess the statistical properties of a soil parameter on the basis ofa sample measured in-situ or in the laboratory with the defect that somevalues have been replaced by interval bounds because the corresponding soilparameter values...
Statistical models and NMR analysis of polymer microstructure
Statistical models can be used in conjunction with NMR spectroscopy to study polymer microstructure and polymerization mechanisms. Thus, Bernoullian, Markovian, and enantiomorphic-site models are well known. Many additional models have been formulated over the years for additional situations. Typica...
Toward a theory of statistical tree-shape analysis
DEFF Research Database (Denmark)
Feragen, Aasa; Lo, Pechin Chien Pau; de Bruijne, Marleen
2013-01-01
In order to develop statistical methods for shapes with a tree-structure, we construct a shape space framework for tree-shapes and study metrics on the shape space. This shape space has singularities, which correspond to topological transitions in the represented trees. We study two closely relat...
Statistical analysis of lightning electric field measured under Malaysian condition
Salimi, Behnam; Mehranzamir, Kamyar; Abdul-Malek, Zulkurnain
2014-02-01
Lightning is an electrical discharge during thunderstorms that can be either within clouds (Inter-Cloud), or between clouds and ground (Cloud-Ground). The Lightning characteristics and their statistical information are the foundation for the design of lightning protection system as well as for the calculation of lightning radiated fields. Nowadays, there are various techniques to detect lightning signals and to determine various parameters produced by a lightning flash. Each technique provides its own claimed performances. In this paper, the characteristics of captured broadband electric fields generated by cloud-to-ground lightning discharges in South of Malaysia are analyzed. A total of 130 cloud-to-ground lightning flashes from 3 separate thunderstorm events (each event lasts for about 4-5 hours) were examined. Statistical analyses of the following signal parameters were presented: preliminary breakdown pulse train time duration, time interval between preliminary breakdowns and return stroke, multiplicity of stroke, and percentages of single stroke only. The BIL model is also introduced to characterize the lightning signature patterns. Observations on the statistical analyses show that about 79% of lightning signals fit well with the BIL model. The maximum and minimum of preliminary breakdown time duration of the observed lightning signals are 84 ms and 560 us, respectively. The findings of the statistical results show that 7.6% of the flashes were single stroke flashes, and the maximum number of strokes recorded was 14 multiple strokes per flash. A preliminary breakdown signature in more than 95% of the flashes can be identified.
Statistical analysis of the profile of consumer Internet services
Directory of Open Access Journals (Sweden)
Arzhenovskii Sergei Valentinovich
2014-09-01
Full Text Available Article is devoted to the construction of the Russian Internet user profile. Statistical methods of summary, grouping and the graphical representation of information about Internet consumer by socio-demographic characteristics and settlement are used. RLMS at 2005-2012 years are the information base.
Statistical Analysis of Geo-electric Imaging and Geotechnical Test ...
Indian Academy of Sciences (India)
12
On the other hand cost-effective geoelctric imaging methods provide 2-D / 3-D .... SPSS (Statistical package for social sciences) have been used to carry out linear ..... P W J 1997 Theory of ionic surface electrical conduction in porous media;.
Statistical analysis of the potassium concentration obtained through
International Nuclear Information System (INIS)
Pereira, Joao Eduardo da Silva; Silva, Jose Luiz Silverio da; Pires, Carlos Alberto da Fonseca; Strieder, Adelir Jose
2007-01-01
The present work was developed in outcrops of Santa Maria region, southern Brazil, Rio Grande do Sul State. Statistic evaluations were applied in different rock types. The possibility to distinguish different geologic units, sedimentary and volcanic (acid and basic types) by means of the statistic analyses from the use of airborne gamma-ray spectrometry integrating potash radiation emissions data with geological and geochemistry data is discussed. This Project was carried out at 1973 by Geological Survey of Brazil/Companhia de Pesquisas de Recursos Minerais. The Camaqua Project evaluated the behavior of potash concentrations generating XYZ Geosof 1997 format, one grid, thematic map and digital thematic map files from this total area. Using these data base, the integration of statistics analyses in sedimentary formations which belong to the Depressao Central do Rio Grande do Sul and/or to volcanic rocks from Planalto da Serra Geral at the border of Parana Basin was tested. Univariate statistics model was used: the media, the standard media error, and the trust limits were estimated. The Tukey's Test was used in order to compare mean values. The results allowed to create criteria to distinguish geological formations based on their potash content. The back-calibration technique was employed to transform K radiation to percentage. Inside this context it was possible to define characteristic values from radioactive potash emissions and their trust ranges in relation to geologic formations. The potash variable when evaluated in relation to geographic Universal Transverse Mercator coordinates system showed a spatial relation following one polynomial model of second order, with one determination coefficient. The statistica 7.1 software Generalist Linear Models produced by Statistics Department of Federal University of Santa Maria/Brazil was used. (author)
Directory of Open Access Journals (Sweden)
Bin Hu
2013-12-01
Full Text Available The clinical value of Serum alpha-fetoprotein (AFP to detect early hepatocellular carcinoma (HCC has been questioned due to its low sensitivity and specificity found in recent years. Other than AFP, several new serum biomarkers including the circulating AFP isoform AFP-L3, des-gamma-carboxy prothrombin (DCP and Golgi protein-73 (GP73 have been identified as useful HCC markers. In this investigation, we review the current knowledge about these HCC-related biomarkers, and sum up the results of our meta-analysis on studies that have addressed the utility of these biomarkers in early detection and prognostic prediction of HCC. A systematic search in PubMed, Web of Science, and the Cochrane Library was performed for articles published in English from 1999 to 2012, focusing on serum biomarkers for HCC detection. Data on sensitivity and specificity of tests were extracted from 40 articles that met the inclusion criteria, and the summary receiver operating characteristic curve (sROC was obtained. A meta-analysis was carried out in which the area under the curve (AUC for each biomarker or biomarker combinations (AFP, DCP, GP73, AFP-L3, AFP + DCP, AFP + AFP-L3, and AFP + GP73 was used to compare the diagnostic accuracy of different biomarker tests. The AUC of AFP, DCP, GP73, AFP-L3, AFP + DCP, AFP + AFP-L3, and AFP + GP73 are 0.835, 0.797, 0.914, 0.710, 0.874, 0.748, and 0.932 respectively. A combination of AFP + GP73 is superior to AFP in detecting HCC and differentiating HCC patients from non-HCC patients, and may prove to be a useful marker in the diagnosis and screening of HCC. In addition, the AUC of GP73, AFP + DCP and AFP + GP73 are better than that of AFP. The clinical value of GP73, AFP + DCP, or AFP + GP73 as serological markers for HCC diagnosis needs to be addressed further in future studies.
Wang, Yimin; Ye, Fang; Huang, Chanyan; Xue, Faling; Li, Yingyuan; Gao, Shaowei; Qiu, Zeting; Li, Si; Chen, Qinchang; Zhou, Huaqiang; Song, Yiyan; Huang, Wenqi; Tan, Wulin; Wang, Zhongxing
2018-03-15
Neuropathic pain is one of the common complications after spinal cord injury (SCI), affecting patients' life quality. The molecular mechanism for neuropathic pain after SCI is still unclear. We aimed to discover potential genes and MicroRNAs(miRNAs) related to neuropathic pain by bioinformatics method. Microarray data of GSE69901 were obtained from Gene Expression Omnibus (GEO) database. Peripheral blood samples from patients with or without neuropathic pain after spinal cord injury (SCI) were collected. 12 samples with neuropathic pain and 13 samples without pain as control were included in the downloaded microarray. Differentially expressed genes (DEGs) between neuropathic pain group and control group were detected using GEO2R online tool. Functional enrichment analysis of DEGs was performed using DAVID database. Protein-protein interaction (PPI) network was constructed from STRING database. MiRNAs targeting these DEGs were obtained from miRNet database. A merged miRNA-DEG network was constructed and analyzed with Cytoscape software. Total 1134 DEGs were identified between patients with or without neuropathic pain(case and control) and 454 biological processes were enriched. We identified 4 targeted miRNAs, including mir-204-5p, mir-519d-3p, mir-20b-5p, mir-6838-5p, which may be the potential biomarker for SCI patients. Protein modification and regulation biological process of central nervous system may be a risk factor of in SCI patients. Certain genes and miRNAs may be potential biomarkers for the prediction of and potential targets for prevention and treatment of neuropathic pain after SCI.This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal. http
Screening of plasma biomarkers in patients with unstable angina pectoris with proteomics analysis
Directory of Open Access Journals (Sweden)
Shui-wang HU
2017-08-01
Full Text Available Objective To analyze and compare the differentially expressed plasma proteins between patients with stable angina pectoris (SAP and unstable angina pectoris (UAP, and search for the biomarkers that maybe used for early diagnosis of UAP. Methods Sixty plasma samples were collected respectively from normal controls group (N group, SAP group and UAP group during Jun. 2014 to Apr. 2015 from the Third Affiliated Hospital of Southern Medical University. Ten samples (100μl of each group were selected randomly to pool into 3 groups severally. After removing high-abundance proteins from plasma, two- dimensional difference gel electrophoresis (DIGE was used to isolate the total proteins, and then the protein spots with more than 2-fold changes between UAP and SAP were picked up after the differential software analysis. Afterward, the varied proteins were identified by matrix assisted laser desorption ionization-time of flight/time of flight (MALDI-TOF/TOF mass spectrometry (MS. Finally, 40 plasma samples were collected respectively from N, SAP and UAP group, and the UAP specific differential proteins were selected to be verified by ELISA. Results A total of 10 varied protein spots with more than 2-fold changes in UAP and SAP were found including 9 up-regulated proteins and 1 down-regulated one. MS identification indicated that the up-regulated proteins included fibrinogen gamma chain (FGG, complement C4-B (C4B, immunoglobulin (Ig kappa chain C region (IGKC and hemoglobin subunit alpha (HBA1, whereas the down-regulated one was haptoglobin (HP. After comparing the varied proteins with that in N group, 2 specifically UAP-related proteins, IGKC and HP, were detected totally. IGKC was selected to validate by ELISA, and the corresponding results showed that IGKC was increased specifically in UAP plasma (P<0.05 when compared with N and SAP group, which was consistent with DIGE. Conclusion IGKC and HP have been detected as specifically related proteins to UAP
Gene Expression and Proteome Analysis as Sources of Biomarkers in Basal Cell Carcinoma
Ghita, Mihaela Adriana; Voiculescu, Suzana; Rosca, Adrian E.; Moraru, Liliana; Greabu, Maria
2016-01-01
Basal cell carcinoma (BCC) is the world's leading skin cancer in terms of frequency at the moment and its incidence continues to rise each year, leading to profound negative psychosocial and economic consequences. UV exposure is the most important environmental factor in the development of BCC in genetically predisposed individuals, this being reflected by the anatomical distribution of lesions mainly on sun-exposed skin areas. Early diagnosis and prompt management are of crucial importance in order to prevent local tissue destruction and subsequent disfigurement. Although various noninvasive or minimal invasive techniques have demonstrated their utility in increasing diagnostic accuracy of BCC and progress has been made in its treatment options, recurrent, aggressive, and metastatic variants of BCC still pose significant challenge for the healthcare system. Analysis of gene expression and proteomic profiling of tumor cells and of tumoral microenvironment in various tissues strongly suggests that certain molecules involved in skin cancer pathogenic pathways might represent novel predictive and prognostic biomarkers in BCC. PMID:27578920
The Potential of Angiogenin as a Serum Biomarker for Diseases: Systematic Review and Meta-Analysis
Directory of Open Access Journals (Sweden)
Dongdong Yu
2018-01-01
Full Text Available Background. Angiogenin (ANG is a multifunctional angiogenic protein that participates in both normal development and diseases. Abnormal serum ANG levels are commonly reported in various diseases. However, whether ANG can serve as a diagnostic or prognostic marker for different diseases remains a matter of debate. Methods. Here, we performed a systematic review and meta-analysis of the literature utilizing PubMed, Web of Science, and Scopus search engines to identify all publications comparing plasma or serum ANG levels between patients with different diseases and healthy controls, as were studies evaluating circulating ANG levels in healthy populations, pregnant women, or other demographic populations. Results. This study demonstrated that the serum ANG concentration in healthy populations was 336.14 ± 142.83 ng/ml and remained relatively stable in different populations and regions. We noted no significant differences in serum ANG levels between patients and healthy controls, except in cases in which patients suffered from cancer or cardiovascular diseases. The serum ANG concentrations were significantly higher in patients who developed colorectal cancer, acute myeloid leukemia, multiple myeloma, myelodysplastic syndromes, and heart failure than those in healthy controls. Conclusion. ANG has the potential of being a serum biomarker for cancers and cardiovascular diseases.
Cordeiro, Fernanda Bertuccez; Cataldi, Thais Regiani; Perkel, Kayla Jane; do Vale Teixeira da Costa, Lívia; Rochetti, Raquel Cellin; Stevanato, Juliana; Eberlin, Marcos Nogueira; Zylbersztejn, Daniel Suslik; Cedenho, Agnaldo Pereira; Turco, Edson Guimarães Lo
2015-12-01
The aim of the present study was to analyze the lipid profile of follicular fluid from patients with endometriosis and endometrioma who underwent in vitro fertilization treatment (IVF). The control group (n = 10) was composed of women with tubal factor or minimal male factor infertility who had positive pregnancy outcomes after IVF. The endometriosis group consisted of women with endometriosis diagnosed by videolaparoscopy (n = 10), and from the same patients, the endometriomas fluids were collected, which composed the endometrioma group (n = 10). From the follicular fluid and endometriomas, lipids were extracted by the Bligh and Dyer method, and the samples were analyzed by tandem mass spectrometry. We observed phosphatidylglycerol phosphate, phosphatidylcholine, phosphatidylserine, and phosphatidylnositol bisphosphate in the control group. In the endometriosis group, sphingolipids and phosphatidylcholines were more abundant, while in the endometrioma group, sphingolipids and phosphatidylcholines with different m/z from the endometriosis group were found in high abundance. This analysis demonstrated that there is a differential representation of these lipids according to their respective groups. In addition, the lipids found are involved in important mechanisms related to endometriosis progress in the ovary. Thus, the metabolomic approach for the study of lipids may be helpful in potential biomarker discovery.
Huang, Peiwu; Ren, Xiaohu; Huang, Zhijun; Yang, Xifei; Hong, Wenxu; Zhang, Yanfang; Zhang, Hang; Liu, Wei; Huang, Haiyan; Huang, Xinfeng; Wu, Desheng; Yang, Linqing; Tang, Haiyan; Zhou, Li; Li, Xuan; Liu, Jianjun
2014-08-17
Trichloroethylene (TCE) is an industrial solvent with widespread occupational exposure and also a major environmental contaminant. Occupational medicamentosa-like dermatitis induced by trichloroethylene (OMLDT) is an autoimmune disease and it has become one major hazard in China. In this study, sera from 3 healthy controls and 3 OMLDT patients at different disease stages were used for a screening study by 2D-DIGE and MALDI-TOF-MS/MS. Eight proteins including transthyretin (TTR), retinol binding protein 4 (RBP4), haptoglobin, clusterin, serum amyloid A protein (SAA), apolipoprotein A-I, apolipoprotein C-III and apolipoprotein C-II were found to be significantly altered among the healthy, acute-stage, healing-stage and healed-stage groups. Specifically, the altered expression of TTR, RBP4 and haptoglobin were further validated by Western blot analysis and ELISA. Our data not only suggested that TTR, RBP4 and haptoglobin could serve as potential serum biomarkers of OMLDT, but also indicated that measurement of TTR, RBP4 and haptoglobin or their combination could help aid in the diagnosis, monitoring the progression and therapy of the disease. Copyright © 2014. Published by Elsevier Ireland Ltd.
Evaluation of the TRPM2 channel as a biomarker in breast cancer using public databases analysis.
Sumoza-Toledo, Adriana; Espinoza-Gabriel, Mario Iván; Montiel-Condado, Dvorak
Breast cancer is one of the most common malignancies affecting women. Recent investigations have revealed a major role of ion channels in cancer. The transient receptor potential melastatin-2 (TRPM2) is a plasma membrane and lysosomal channel with important roles in cell migration and cell death in immune cells and tumor cells. In this study, we investigated the prognostic value of TRPM2 channel in breast cancer, analyzing public databases compiled in Oncomine™ (Thermo Fisher, Ann Arbor, MI) and online Kaplan-Meier Plotter platforms. The results revealed that TRPM2 mRNA overexpression is significant in situ and invasive breast carcinoma compared to normal breast tissue. Furthermore, multi-gene validation using Oncomine™ showed that this channel is coexpressed with proteins related to cellular migration, transformation, and apoptosis. On the other hand, Kaplan-Meier analysis exhibited that low expression of TRPM2 could be used to predict poor outcome in ER- and HER2+ breast carcinoma patients. TRPM2 is a promising biomarker for aggressiveness of breast cancer, and a potential target for the development of new therapies. Copyright © 2016 Hospital Infantil de México Federico Gómez. Publicado por Masson Doyma México S.A. All rights reserved.
Gene Expression and Proteome Analysis as Sources of Biomarkers in Basal Cell Carcinoma
Directory of Open Access Journals (Sweden)
Mihai Lupu
2016-01-01
Full Text Available Basal cell carcinoma (BCC is the world’s leading skin cancer in terms of frequency at the moment and its incidence continues to rise each year, leading to profound negative psychosocial and economic consequences. UV exposure is the most important environmental factor in the development of BCC in genetically predisposed individuals, this being reflected by the anatomical distribution of lesions mainly on sun-exposed skin areas. Early diagnosis and prompt management are of crucial importance in order to prevent local tissue destruction and subsequent disfigurement. Although various noninvasive or minimal invasive techniques have demonstrated their utility in increasing diagnostic accuracy of BCC and progress has been made in its treatment options, recurrent, aggressive, and metastatic variants of BCC still pose significant challenge for the healthcare system. Analysis of gene expression and proteomic profiling of tumor cells and of tumoral microenvironment in various tissues strongly suggests that certain molecules involved in skin cancer pathogenic pathways might represent novel predictive and prognostic biomarkers in BCC.
Statistical methods for data analysis in particle physics
Lista, Luca
2017-01-01
This concise set of course-based notes provides the reader with the main concepts and tools needed to perform statistical analyses of experimental data, in particular in the field of high-energy physics (HEP). First, the book provides an introduction to probability theory and basic statistics, mainly intended as a refresher from readers’ advanced undergraduate studies, but also to help them clearly distinguish between the Frequentist and Bayesian approaches and interpretations in subsequent applications. More advanced concepts and applications are gradually introduced, culminating in the chapter on both discoveries and upper limits, as many applications in HEP concern hypothesis testing, where the main goal is often to provide better and better limits so as to eventually be able to distinguish between competing hypotheses, or to rule out some of them altogether. Many worked-out examples will help newcomers to the field and graduate students alike understand the pitfalls involved in applying theoretical co...
A Statistical Framework for the Functional Analysis of Metagenomes
Energy Technology Data Exchange (ETDEWEB)
Sharon, Itai; Pati, Amrita; Markowitz, Victor; Pinter, Ron Y.
2008-10-01
Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. They present a statistical framework for assessing these frequencies based on the Lander-Waterman theory developed originally for Whole Genome Shotgun (WGS) sequencing projects. They also provide a novel method for assessing the reliability of the estimations which can be used for removing seemingly unreliable measurements. They tested their method on a wide range of datasets, including simulated genomes and real WGS data from sequencing projects of whole genomes. Results suggest that their framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.
Common misconceptions about data analysis and statistics1
Motulsky, Harvey J
2015-01-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word “significant”. (4) Overreliance on standard errors, which are often misunderstood. PMID:25692012
Statistical analysis of field data for aircraft warranties
Lakey, Mary J.
Air Force and Navy maintenance data collection systems were researched to determine their scientific applicability to the warranty process. New and unique algorithms were developed to extract failure distributions which were then used to characterize how selected families of equipment typically fails. Families of similar equipment were identified in terms of function, technology and failure patterns. Statistical analyses and applications such as goodness-of-fit test, maximum likelihood estimation and derivation of confidence intervals for the probability density function parameters were applied to characterize the distributions and their failure patterns. Statistical and reliability theory, with relevance to equipment design and operational failures were also determining factors in characterizing the failure patterns of the equipment families. Inferences about the families with relevance to warranty needs were then made.
STATISTICAL ANALYSYS OF THE SCFE OF A BRAZILAN MINERAL COAL
Directory of Open Access Journals (Sweden)
DARIVA Cláudio
1997-01-01
Full Text Available The influence of some process variables on the productivity of the fractions (liquid yield times fraction percent obtained from SCFE of a Brazilian mineral coal using isopropanol and ethanol as primary solvents is analyzed using statistical techniques. A full factorial 23 experimental design was adopted to investigate the effects of process variables (temperature, pressure and cosolvent concentration on the extraction products. The extracts were analyzed by the Preparative Liquid Chromatography-8 fractions method (PLC-8, a reliable, non destructive solvent fractionation method, especially developed for coal-derived liquids. Empirical statistical modeling was carried out in order to reproduce the experimental data. Correlations obtained were always greater than 0.98. Four specific process criteria were used to allow process optimization. Results obtained show that it is not possible to maximize both extract productivity and purity (through the minimization of heavy fraction content simultaneously by manipulating the mentioned process variables.
STATISTICAL ANALYSIS OF RAW SUGAR MATERIAL FOR SUGAR PRODUCER COMPLEX
A. A. Gromkovskii; O. I. Sherstyuk
2015-01-01
Summary. In the article examines the statistical data on the development of average weight and average sugar content of sugar beet roots. The successful solution of the problem of forecasting these raw indices is essential for solving problems of sugar producing complex control. In the paper by calculating the autocorrelation function demonstrated that the predominant trend component of the growth raw characteristics. For construct the prediction model is proposed to use an autoregressive fir...
Statistical analysis of personal dosimetry of exposed workers
International Nuclear Information System (INIS)
Sanchez Munoz, F. J.; Alejo Luque, L.; Mas Munoz, I.; Serrada Hierro, A.
2013-01-01
The dosimetry centers accredited by the Nuclear Safety Council (CSN) normally report overcoming legal limits, or some fraction thereof, but do not provide comparative dosimetric criteria indicating if assigned to a given dose is large TPE or small relative to that of their peers. In order to help to resolve the difficulties mentioned ds, it has developed an application that statistically processes the dosimetric data provided by the National Dosimetry Center. (Author)
Solar radiation data - statistical analysis and simulation models
Energy Technology Data Exchange (ETDEWEB)
Mustacchi, C; Cena, V; Rocchi, M; Haghigat, F
1984-01-01
The activities consisted in collecting meteorological data on magnetic tape for ten european locations (with latitudes ranging from 42/sup 0/ to 56/sup 0/ N), analysing the multi-year sequences, developing mathematical models to generate synthetic sequences having the same statistical properties of the original data sets, and producing one or more Short Reference Years (SRY's) for each location. The meteorological parameters examinated were (for all the locations) global + diffuse radiation on horizontal surface, dry bulb temperature, sunshine duration. For some of the locations additional parameters were available, namely, global, beam and diffuse radiation on surfaces other than horizontal, wet bulb temperature, wind velocity, cloud type, cloud cover. The statistical properties investigated were mean, variance, autocorrelation, crosscorrelation with selected parameters, probability density function. For all the meteorological parameters, various mathematical models were built: linear regression, stochastic models of the AR and the DAR type. In each case, the model with the best statistical behaviour was selected for the production of a SRY for the relevant parameter/location.
Olcott Marshall, Alison; Cestari, Nicholas A
2015-09-01
One of the major exploration targets for current and future Mars missions are lithofacies suggestive of biotic activity. Although such lithofacies are not confirmation of biotic activity, they provide a way to identify samples for further analyses. To test the efficacy of this approach, we identified carbonate samples from the Eocene Green River Formation as "microbial" or "non-microbial" based on the macroscale morphology of their laminations. These samples were then crushed and analyzed by gas chromatography/mass spectroscopy (GC/MS) to determine their lipid biomarker composition. GC/MS analysis revealed that carbonates visually identified as "microbial" contained a higher concentration of more diverse biomarkers than those identified as "non-microbial," suggesting that this could be a viable detection strategy for selecting samples for further analysis or caching on Mars.
Wu, Xiao; Huang, Yue; Sun, Jinghan; Wen, Yongqing; Qin, Feng; Zhao, Longshan; Xiong, Zhili
2018-01-01
A HILIC-UHPLC-MS/MS untargeted urinary metabonomic method combined with quantitative analysis of five potential polar biomarkers in rat urine was developed and validated, to further understand the anti-osteoporosis effect of Gushudan(GSD) and its mechanism on prednisolone-induced osteoporosis(OP) rats in this study. The metabolites were separated and identified on Waters BEH HILIC (2.1mm×100mm, 1.7μm) column using the Waters ACQUITY™ ultra performance liquid chromatography system (Waters Corporation, Milford, USA) coupled with a Micromass Quattro Micro™ API mass spectrometer (Waters Corp, Milford, MA, USA). Principal component analysis (PCA) was used to identify potential biomarkers. Primary potential polar biomarkers including creatinine, taurine, betaine, hypoxanthine and cytosine, which were related to energy metabolism, lipid metabolism and amino acid metabolism, were found in the untargeted metabonomic research. Moreover, these targeted biomarkers were further separated and quantified in multiple-reaction monitoring (MRM) with positive ionization mode, using tinidazole as internal standard (I.S.). Good linearities (r>0.99) were obtained for all the analytes with the low limit of quantification from 1.00 to 12.8μg/mL. The relative standard deviation (RSD) of the intra-day and inter-day precisions were within 15.0% and the accuracy ranged from -14.3% to 13.5%. The recovery was more than 85.0%. And the validated method was successfully applied to investigate the urine samples of the control group, prednisolone-induced osteoporosis model group and Gushudan-treatment group in rats. Compared to the control group, the level of creatinine, taurine, betaine, hypoxanthine and cytosine in the model group revealed a significant decrease trend (p<0.05), while the Gushudan-treatment group showed no statistically differences by an independent sample t-test. This paper provided a better understanding of the therapeutic effect and mechanism of GSD on prednisolone
Shaikh, Masood Ali
2017-09-01
Assessment of research articles in terms of study designs used, statistical tests applied and the use of statistical analysis programmes help determine research activity profile and trends in the country. In this descriptive study, all original articles published by Journal of Pakistan Medical Association (JPMA) and Journal of the College of Physicians and Surgeons Pakistan (JCPSP), in the year 2015 were reviewed in terms of study designs used, application of statistical tests, and the use of statistical analysis programmes. JPMA and JCPSP published 192 and 128 original articles, respectively, in the year 2015. Results of this study indicate that cross-sectional study design, bivariate inferential statistical analysis entailing comparison between two variables/groups, and use of statistical software programme SPSS to be the most common study design, inferential statistical analysis, and statistical analysis software programmes, respectively. These results echo previously published assessment of these two journals for the year 2014.
Directory of Open Access Journals (Sweden)
Higgins LeeAnn
2010-06-01
Full Text Available Abstract Background Ovarian cancer is the most lethal gynecologic malignancy, with the majority of cases diagnosed at an advanced stage when treatments are less successful. Novel serum protein markers are needed to detect ovarian cancer in its earliest stage; when detected early, survival rates are over 90%. The identification of new serum biomarkers is hindered by the presence of a small number of highly abundant proteins that comprise approximately 95% of serum total protein. In this study, we used pooled serum depleted of the most highly abundant proteins to reduce the dynamic range of proteins, and thereby enhance the identification of serum biomarkers using the quantitative proteomic method iTRAQ®. Results Medium and low abundance proteins from 6 serum pools of 10 patients each from women with serous ovarian carcinoma, and 6 non-cancer control pools were labeled with isobaric tags using iTRAQ® to determine the relative abundance of serum proteins identified by MS. A total of 220 unique proteins were identified and fourteen proteins were elevated in ovarian cancer compared to control serum pools, including several novel candidate ovarian cancer biomarkers: extracellular matrix protein-1, leucine-rich alpha-2 glycoprotein-1, lipopolysaccharide binding protein-1, and proteoglycan-4. Western immunoblotting validated the relative increases in serum protein levels for several of the proteins identified. Conclusions This study provides the first analysis of immunodepleted serum in combination with iTRAQ® to measure relative protein expression in ovarian cancer patients for the pursuit of serum biomarkers. Several candidate biomarkers were identified which warrant further development.
Normality Tests for Statistical Analysis: A Guide for Non-Statisticians
Ghasemi, Asghar; Zahediasl, Saleh
2012-01-01
Statistical errors are common in scientific literature and about 50% of the published articles have at least one error. The assumption of normality needs to be checked for many statistical procedures, namely parametric tests, because their validity depends on it. The aim of this commentary is to overview checking for normality in statistical analysis using SPSS. PMID:23843808
Radar Derived Spatial Statistics of Summer Rain. Volume 2; Data Reduction and Analysis
Konrad, T. G.; Kropfli, R. A.
1975-01-01
Data reduction and analysis procedures are discussed along with the physical and statistical descriptors used. The statistical modeling techniques are outlined and examples of the derived statistical characterization of rain cells in terms of the several physical descriptors are presented. Recommendations concerning analyses which can be pursued using the data base collected during the experiment are included.
A note on the statistical analysis of point judgment matrices
Directory of Open Access Journals (Sweden)
MG Kabera
2013-06-01
Full Text Available The Analytic Hierarchy Process is a multicriteria decision making technique developed by Saaty in the 1970s. The core of the approach is the pairwise comparison of objects according to a single criterion using a 9-point ratio scale and the estimation of weights associated with these objects based on the resultant judgment matrix. In the present paper some statistical approaches to extracting the weights of objects from a judgment matrix are reviewed and new ideas which are rooted in the traditional method of paired comparisons are introduced.
Statistical analysis of s-wave neutron reduced widths
International Nuclear Information System (INIS)
Pandita Anita; Agrawal, H.M.
1992-01-01
The fluctuations of the s-wave neutron reduced widths for many nuclei have been analyzed with emphasis on recent measurements by a statistical procedure which is based on the method of maximum likelihood. It is shown that the s-wave neutron reduced widths of nuclei follow single channel Porter Thomas distribution (x 2 -distribution with degree of freedom ν = 1) for most of the cases. However there are apparent deviations from ν = 1 and possible explanation and significance of this deviation is given. These considerations are likely to modify the evaluation of neutron cross section. (author)
Introduction to statistical data analysis for the life sciences
Ekstrom, Claus Thorn
2014-01-01
This text provides a computational toolbox that enables students to analyze real datasets and gain the confidence and skills to undertake more sophisticated analyses. Although accessible with any statistical software, the text encourages a reliance on R. For those new to R, an introduction to the software is available in an appendix. The book also includes end-of-chapter exercises as well as an entire chapter of case exercises that help students apply their knowledge to larger datasets and learn more about approaches specific to the life sciences.
An invariant approach to statistical analysis of shapes
Lele, Subhash R
2001-01-01
INTRODUCTIONA Brief History of MorphometricsFoundations for the Study of Biological FormsDescription of the data SetsMORPHOMETRIC DATATypes of Morphometric DataLandmark Homology and CorrespondenceCollection of Landmark CoordinatesReliability of Landmark Coordinate DataSummarySTATISTICAL MODELS FOR LANDMARK COORDINATE DATAStatistical Models in GeneralModels for Intra-Group VariabilityEffect of Nuisance ParametersInvariance and Elimination of Nuisance ParametersA Definition of FormCoordinate System Free Representation of FormEst
JAWS data collection, analysis highlights, and microburst statistics
Mccarthy, J.; Roberts, R.; Schreiber, W.
1983-01-01
Organization, equipment, and the current status of the Joint Airport Weather Studies project initiated in relation to the microburst phenomenon are summarized. Some data collection techniques and preliminary statistics on microburst events recorded by Doppler radar are discussed as well. Radar studies show that microbursts occur much more often than expected, with majority of the events being potentially dangerous to landing or departing aircraft. Seventy events were registered, with the differential velocities ranging from 10 to 48 m/s; headwind/tailwind velocity differentials over 20 m/s are considered seriously hazardous. It is noted that a correlation is yet to be established between the velocity differential and incoherent radar reflectivity.
Symbolic Data Analysis Conceptual Statistics and Data Mining
Billard, Lynne
2012-01-01
With the advent of computers, very large datasets have become routine. Standard statistical methods don't have the power or flexibility to analyse these efficiently, and extract the required knowledge. An alternative approach is to summarize a large dataset in such a way that the resulting summary dataset is of a manageable size and yet retains as much of the knowledge in the original dataset as possible. One consequence of this is that the data may no longer be formatted as single values, but be represented by lists, intervals, distributions, etc. The summarized data have their own internal s
Wang, Yumei; Yin, Xiaoling; Yang, Fang
2018-02-01
Sepsis is an inflammatory-related disease, and severe sepsis would induce multiorgan dysfunction, which is the most common cause of death of patients in noncoronary intensive care units. Progression of novel therapeutic strategies has proven to be of little impact on the mortality of severe sepsis, and unfortunately, its mechanisms still remain poorly understood. In this study, we analyzed gene expression profiles of severe sepsis with failure of lung, kidney, and liver for the identification of potential biomarkers. We first downloaded the gene expression profiles from the Gene Expression Omnibus and performed preprocessing of raw microarray data sets and identification of differential expression genes (DEGs) through the R programming software; then, significantly enriched functions of DEGs in lung, kidney, and liver failure sepsis samples were obtained from the Database for Annotation, Visualization, and Integrated Discovery; finally, protein-protein interaction network was constructed for DEGs based on the STRING database, and network modules were also obtained through the MCODE cluster method. As a result, lung failure sepsis has the highest number of DEGs of 859, whereas the number of DEGs in kidney and liver failure sepsis samples is 178 and 175, respectively. In addition, 17 overlaps were obtained among the three lists of DEGs. Biological processes related to immune and inflammatory response were found to be significantly enriched in DEGs. Network and module analysis identified four gene clusters in which all or most of genes were upregulated. The expression changes of Icam1 and Socs3 were further validated through quantitative PCR analysis. This study should shed light on the development of sepsis and provide potential therapeutic targets for sepsis-induced multiorgan failure.
Data analysis of asymmetric structures advanced approaches in computational statistics
Saito, Takayuki
2004-01-01
Data Analysis of Asymmetric Structures provides a comprehensive presentation of a variety of models and theories for the analysis of asymmetry and its applications and provides a wealth of new approaches in every section. It meets both the practical and theoretical needs of research professionals across a wide range of disciplines and considers data analysis in fields such as psychology, sociology, social science, ecology, and marketing. In seven comprehensive chapters this guide details theories, methods, and models for the analysis of asymmetric structures in a variety of disciplines and presents future opportunities and challenges affecting research developments and business applications.
Statistical image analysis of cerebral blood flow in moyamoya disease
International Nuclear Information System (INIS)
Yamada, Masaru; Yuzawa, Izumi; Suzuki, Sachio; Kurata, Akira; Fujii, Kiyotaka; Asano, Yuji
2007-01-01
The Summary of this study was to investigate pathophysiology of moyamoya disease, we analyzed brain single photon emission tomography (SPECT) images of patients with this disease by using interface software for a 3-dimensional (3D) data extraction format. Presenting symptoms were transient ischemic attack (TIA) in 21 patients and hemorrhage in 6 patients. All the patients underwent brain SPECT scan of 123 I-iofetamine (IMP) at rest and after acetazolamide challenge (17 mg/kg iv, 2-day method). Cerebral blood flow (CBF) was quantitatively measured using arterial blood sampling and an autoradiography model. The group of the patients who presented with TIAs showed decreased CBF in the frontal lobe at rest compared to that of patients with hemorrhage, but Z-score ((mean-patient data)/ standard deviation (SD)) did not reach statistical significance. Significant CBF decrease after acetazolamide challenge was observed in a wider cerebral cortical area in the TIA group than in the hemorrhagic group. The brain region of hemodynamic ischemia (stage II) correlated well with the responsible cortical area for clinical symptoms of TIA. A hemodynamic ischemia stage image clearly represented recovery of reserve capacity after bypass surgery. Statistical evaluation of SPECT may be useful to understand and clarify the pathophysiology of this disease. (author)