WorldWideScience

Sample records for survey statistical analysis

  1. About Statistical Analysis of Qualitative Survey Data

    Directory of Open Access Journals (Sweden)

    Stefan Loehnert

    2010-01-01

    Gathered data are frequently not in a numerical form that allows immediate application of quantitative mathematical-statistical methods. This paper examines some basic aspects of how quantitative statistical methodology can be utilized in the analysis of qualitative data sets. The transformation of qualitative data into numeric values is considered the entry point to quantitative analysis. Related publications and the impacts of scale transformations are also discussed. Subsequently, it is shown how correlation coefficients can be used in conjunction with data aggregation constraints to construct relationship modelling matrices. For illustration, a case study is referenced in which ordinal qualitative survey answers are allocated to process-defining procedures as aggregation levels. Finally, options for measuring the adherence of the gathered empirical data to such derived aggregation models are introduced, and a statistically based approach to evaluating the reliability of the chosen model specification is outlined.

  2. Sensitivity analysis and related analysis : A survey of statistical techniques

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    1995-01-01

    This paper reviews the state of the art in five related types of analysis, namely (i) sensitivity or what-if analysis, (ii) uncertainty or risk analysis, (iii) screening, (iv) validation, and (v) optimization. The main question is: when should which type of analysis be applied; which statistical

  3. Usage of R in Official Statistics: Survey Data Analysis at the Statistical Office of the Republic of Slovenia

    Directory of Open Access Journals (Sweden)

    Jerneja Pikelj

    2015-06-01

    The paper has two practical purposes. The first is to analyze how successfully R can be used for data analysis on surveys carried out by the Statistical Office of the Republic of Slovenia. To this end, we analyzed the data of the Monthly Statistical Survey on Earnings Paid by Legal Persons. The second purpose is to analyze how the assumption about the nonresponse mechanism that occurs in the sample affects the estimated values of the unknown statistics in the survey. Depending on these assumptions, different approaches to adjusting for unit nonresponse are presented. We conclude the paper with the results of the data analysis and the main issues connected with the usage of R in official statistics.
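
The effect of a nonresponse assumption on an estimate can be illustrated with a weighting-class adjustment. The sketch below is not the paper's method or its R code. It assumes that nonresponse is random within each class, and the field names (`cls`, `weight`, `responded`, `y`) and all data values are hypothetical.

```python
from collections import defaultdict


def nonresponse_adjusted_mean(records):
    """Weighted mean of y over respondents, with weighting-class adjustment.

    Each record is a dict with hypothetical keys:
      'cls'       adjustment class (e.g. a size or activity group)
      'weight'    design weight of the sampled unit
      'responded' True if the unit answered
      'y'         reported value (ignored for nonrespondents)
    Assumption: within a class, respondents and nonrespondents are alike.
    """
    sampled = defaultdict(float)    # total design weight per class
    responded = defaultdict(float)  # design weight of respondents per class
    for r in records:
        sampled[r["cls"]] += r["weight"]
        if r["responded"]:
            responded[r["cls"]] += r["weight"]

    num = den = 0.0
    for r in records:
        if not r["responded"]:
            continue
        # Inflate each respondent's weight so the class total is preserved.
        adj = r["weight"] * sampled[r["cls"]] / responded[r["cls"]]
        num += adj * r["y"]
        den += adj
    return num / den


# Made-up sample: class A has 50% response, class B has full response.
sample = [
    {"cls": "A", "weight": 10, "responded": True, "y": 100},
    {"cls": "A", "weight": 10, "responded": True, "y": 200},
    {"cls": "A", "weight": 10, "responded": False, "y": None},
    {"cls": "A", "weight": 10, "responded": False, "y": None},
    {"cls": "B", "weight": 10, "responded": True, "y": 300},
    {"cls": "B", "weight": 10, "responded": True, "y": 500},
]
adjusted = nonresponse_adjusted_mean(sample)
naive = sum(r["y"] for r in sample if r["responded"]) / 4
```

In this toy data the adjusted mean is lower than the naive respondent mean because the low-earning class A is under-represented among respondents. A different assumption about the nonresponse mechanism would call for a different adjustment.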

  4. Gravitational lensing statistics with extragalactic surveys; 2, Analysis of the Jodrell Bank-VLA Astrometric Survey

    NARCIS (Netherlands)

    Helbig, P.; Marlow, D. R.; Quast, R.; Wilkinson, P. N.; Browne, I. W. A.; Koopmans, L. V. E.

    1999-01-01

    Published in: Astron. Astrophys. Suppl. Ser. 136 (1999) no. 2, pp. 297-305. Citations recorded in [Science Citation Index]. Abstract: We present constraints on the cosmological constant $\lambda_{0}$ from gravitational lensing statistics of the Jodrell Bank-VLA Astrometric Survey (JVAS). Although this

  5. Statistical analysis of astrometric errors for the most productive asteroid surveys

    Science.gov (United States)

    Vereš, Peter; Farnocchia, Davide; Chesley, Steven R.; Chamberlin, Alan B.

    2017-11-01

    We performed a statistical analysis of the astrometric errors for the major asteroid surveys. We analyzed the astrometric residuals as a function of observation epoch, observed brightness and rate of motion, finding that astrometric errors are larger for faint observations and some stations improved their astrometric quality over time. Based on this statistical analysis we develop a new weighting scheme to be used when performing asteroid orbit determination. The proposed weights result in ephemeris predictions that can be conservative by a factor as large as 1.5. However, the new scheme is robust with respect to outliers and handles the larger errors for faint detections.

  6. STATISTICAL ANALYSIS AND OPINION SURVEY UPON DICTATORSHIP AS A PEDAGOGICAL STRATEGY OF THE TEACHING OF HISTORY.

    Directory of Open Access Journals (Sweden)

    Vitória A. da Fonseca

    2016-07-01

    This paper presents a teaching practice whose purpose was to enable high school students to understand the issues surrounding dictatorship as a theme in the teaching of history. Considering the importance of practice as a tool that makes up a learning path, the activity involved debate, a survey and statistical analysis. It is worth highlighting the students' engagement in this activity and the mapping of their opinions about the dictatorship.

  7. Statistical imprints of CMB B-type polarization leakage in an incomplete sky survey analysis

    Science.gov (United States)

    Santos, Larissa; Wang, Kai; Hu, Yangrui; Fang, Wenjuan; Zhao, Wen

    2017-01-01

    One of the main goals of modern cosmology is to search for primordial gravitational waves by looking for their imprints in the B-type polarization of the cosmic microwave background radiation. However, this signal is contaminated by various sources, including cosmic weak lensing, foreground radiation, instrumental noise, and the E-to-B leakage caused by partial sky surveys, all of which must be well understood to avoid misinterpreting the observed data. In this paper, we adopt the E/B decomposition method suggested by Smith in 2006 and study the imprints of E-to-B leakage residuals in the constructed B-type polarization maps, $\mathcal{B}(\hat{n})$, by employing various statistical tools. We find that the effects of E-to-B leakage are negligible for the $\mathcal{B}$-mode power spectrum, as well as for the skewness and kurtosis analyses of $\mathcal{B}$-maps. However, when employing morphological statistical tools, including Minkowski functionals and/or Betti numbers, we find that the effect of leakage can be detected at a very high confidence level. This shows that in morphological analysis the leakage can play a significant role as a contaminant in measuring the primordial B-mode signal and must be taken into account for a correct interpretation of the data.

  8. Associative Analysis in Statistics

    Directory of Open Access Journals (Sweden)

    Mihaela Muntean

    2015-03-01

    In recent years, interest in technologies such as in-memory analytics and associative search has increased. This paper explores how in-memory analytics and an associative model can be used in statistics. The word “associative” puts the emphasis on understanding how datasets relate to one another. The paper presents the main characteristics of the “associative” data model. It also presents how to design an associative model for labor market indicators analysis, using the EU Labor Force Survey as the source, and how to carry out associative analysis.

  9. Applied multivariate statistical analysis

    CERN Document Server

    Härdle, Wolfgang Karl

    2015-01-01

    Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners.  It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added.  All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior.  All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...

  10. Mathematical and statistical analysis

    Science.gov (United States)

    Houston, A. Glen

    1988-01-01

    The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.

  11. Statistical Survey of Non-Formal Education

    Directory of Open Access Journals (Sweden)

    Ondřej Nývlt

    2012-12-01

    focused on a programme within a regular education system. Labour market flexibility and new requirements on employees create a new domain of education called non-formal education. Is there a reliable statistical source with a good methodological definition for the Czech Republic? The Labour Force Survey (LFS) has been the basic statistical source for time comparison of non-formal education for the last ten years. Furthermore, a special Adult Education Survey (AES) in 2011 focused on individual components of non-formal education in detail. In general, the goal of the EU is to use data from both internationally comparable surveys for analyses of particular fields of lifelong learning, so that annual LFS data can be supplemented with detailed information from the AES at five-year intervals. This article describes the reliability of statistical data about non-formal education. This analysis is usually connected with sampling and non-sampling errors.

  12. Statistical Analysis of the Worker Engagement Survey Administered at the Worker Safety and Security Team Festival

    Energy Technology Data Exchange (ETDEWEB)

    Davis, Adam Christopher [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2015-08-25

    The Worker Safety and Security Team (WSST) at Los Alamos National Laboratory holds an annual festival, WSST-fest, to engage workers and inform them about safety- and security-related matters. As part of the 2015 WSST-fest, workers were given the opportunity to participate in a survey assessing their engagement in their organizations and work environments. A total of 789 workers participated in the 23-question survey where they were also invited, optionally, to identify themselves, their organization, and to give open-ended feedback. The survey consisted of 23 positive statements (e.g. “My organization is a good place to work.”) with which the respondent could express a level of agreement. The text of these statements is provided in Table 1. The level of agreement corresponds to a 5-level Likert scale ranging from “Strongly Disagree” to “Strongly Agree.” In addition to assessing the overall positivity or negativity of the scores, the results were partitioned into several cohorts based on the response meta-data (self-identification, comments, etc.) to explore trends. Survey respondents were presented with the options to identify themselves, their organizations and to provide comments. These options suggested the following questions about the data set.
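
Scoring a 5-level Likert item and splitting the results by cohort, as described above, can be sketched as follows. This is only an illustration, not the report's analysis. The three middle labels, the cohort names and the responses are assumptions, since the abstract gives only the two endpoint labels.

```python
from statistics import mean

# Endpoint labels come from the abstract; the middle three are assumed.
LIKERT = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def cohort_means(responses):
    """Map (cohort, label) pairs to the mean numeric score per cohort."""
    by_cohort = {}
    for cohort, label in responses:
        by_cohort.setdefault(cohort, []).append(LIKERT[label])
    return {cohort: mean(scores) for cohort, scores in by_cohort.items()}


# Hypothetical responses to one statement, split by self-identification.
sample = [
    ("identified", "Agree"),
    ("identified", "Strongly Agree"),
    ("anonymous", "Neutral"),
    ("anonymous", "Disagree"),
]
scores = cohort_means(sample)
```

Treating Likert levels as equally spaced numbers is itself a modelling choice. An ordinal summary, such as the share of respondents who agree, avoids that assumption.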

  13. US Geological Survey nutrient preservation experiment : experimental design, statistical analysis, and interpretation of analytical results

    Science.gov (United States)

    Patton, Charles J.; Gilroy, Edward J.

    1999-01-01

    This report describes the experimental details and interprets results from a study conducted by the U.S. Geological Survey (USGS) in 1992 to assess the effect of different sample-processing treatments on the stability of eight nutrient species in samples of surface-, ground-, and municipal-supply water during storage at 4 degrees Celsius for about 30 days. Over a 7-week period, splits of filtered- and whole-water samples from 15 stations in the continental United States were preserved at collection sites with sulfuric acid (U.S. Environmental Protection Agency protocol), mercury (II) chloride (former U.S. Geological Survey protocol), and ASTM (American Society for Testing and Materials) Type I deionized water (control) and then shipped by overnight express to the USGS National Water Quality Laboratory (NWQL). At the NWQL, the eight nutrient species were determined in splits from each of the 15 stations, typically, within 24 hours of collection and at intervals of 3, 7, 14, 22, and 35 days thereafter. Ammonium, nitrate plus nitrite, nitrite, and orthophosphate were determined only in filtered-water splits. Kjeldahl nitrogen and phosphorus were determined in both filtered-water and whole-water splits.

  14. Statistical data analysis handbook

    National Research Council Canada - National Science Library

    Wall, Francis J

    1986-01-01

    It must be emphasized that this is not a text book on statistics. Instead it is a working tool that presents data analysis in clear, concise terms which can be readily understood even by those without formal training in statistics...

  15. Statistics, data mining, and machine learning in astronomy a practical Python guide for the analysis of survey data

    CERN Document Server

    Ivezic, Željko; VanderPlas, Jacob T; Gray, Alexander

    2014-01-01

    As telescopes, detectors, and computers grow ever more powerful, the volume of data at the disposal of astronomers and astrophysicists will enter the petabyte domain, providing accurate measurements for billions of celestial objects. This book provides a comprehensive and accessible introduction to the cutting-edge statistical methods needed to efficiently analyze complex data sets from astronomical surveys such as the Panoramic Survey Telescope and Rapid Response System, the Dark Energy Survey, and the upcoming Large Synoptic Survey Telescope. It serves as a practical handbook for graduate s

  16. An evaluation of the quality of statistical design and analysis of published medical research: results from a systematic survey of general orthopaedic journals.

    Science.gov (United States)

    Parsons, Nick R; Price, Charlotte L; Hiskens, Richard; Achten, Juul; Costa, Matthew L

    2012-04-25

    The application of statistics in reported research in trauma and orthopaedic surgery has become ever more important and complex. Despite the extensive use of statistical analysis, it is still a subject which is often not conceptually well understood, resulting in clear methodological flaws and inadequate reporting in many papers. A detailed statistical survey sampled 100 representative orthopaedic papers using a validated questionnaire that assessed the quality of the trial design and statistical analysis methods. The survey found evidence of failings in study design, statistical methodology and presentation of the results. Overall, in 17% (95% confidence interval; 10-26%) of the studies investigated the conclusions were not clearly justified by the results, in 39% (30-49%) of studies a different analysis should have been undertaken and in 17% (10-26%) a different analysis could have made a difference to the overall conclusions. It is only by an improved dialogue between statistician, clinician, reviewer and journal editor that the failings in design methodology and analysis highlighted by this survey can be addressed.
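
A 95% confidence interval for a proportion such as 17 out of 100 papers can be computed in several ways. The abstract does not say which method the authors used, so the Wilson score interval below is an assumption chosen for illustration.

```python
import math


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion.

    z defaults to the two-sided 95% normal quantile.
    """
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# 17 of the 100 sampled papers, as reported in the abstract.
low, high = wilson_interval(17, 100)
```

For 17 out of 100 this gives an interval of roughly 11% to 26%, close to the reported range. An exact (Clopper-Pearson) interval would differ slightly at the lower end.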

  17. An evaluation of the quality of statistical design and analysis of published medical research: results from a systematic survey of general orthopaedic journals

    Directory of Open Access Journals (Sweden)

    Parsons Nick R

    2012-04-01

    Abstract. Background: The application of statistics in reported research in trauma and orthopaedic surgery has become ever more important and complex. Despite the extensive use of statistical analysis, it is still a subject which is often not conceptually well understood, resulting in clear methodological flaws and inadequate reporting in many papers. Methods: A detailed statistical survey sampled 100 representative orthopaedic papers using a validated questionnaire that assessed the quality of the trial design and statistical analysis methods. Results: The survey found evidence of failings in study design, statistical methodology and presentation of the results. Overall, in 17% (95% confidence interval; 10–26%) of the studies investigated the conclusions were not clearly justified by the results, in 39% (30–49%) of studies a different analysis should have been undertaken and in 17% (10–26%) a different analysis could have made a difference to the overall conclusions. Conclusion: It is only by an improved dialogue between statistician, clinician, reviewer and journal editor that the failings in design methodology and analysis highlighted by this survey can be addressed.

  18. Per Object statistical analysis

    DEFF Research Database (Denmark)

    2008-01-01

    This RS code is to do Object-by-Object analysis of each Object's sub-objects, e.g. statistical analysis of an object's individual image data pixels. Statistics, such as percentiles (so-called "quartiles") are derived by the process, but the return of that can only be a Scene Variable, not an Object...... an analysis of the values of the object's pixels in MS-Excel. The shell of the procedure could also be used for purposes other than just the derivation of Object - Sub-object statistics, e.g. rule-based assignment processes....... Variable. This procedure was developed in order to be able to export objects as ESRI shape data with the 90-percentile of the Hue of each object's pixels as an item in the shape attribute table. This procedure uses a sub-level single pixel chessboard segmentation, loops for each of the objects...

  19. Statistical literacy and sample survey results

    Science.gov (United States)

    McAlevey, Lynn; Sullivan, Charles

    2010-10-01

    Sample surveys are widely used in the social sciences and business. The news media almost daily quote from them, yet they are widely misused. Using students with prior managerial experience embarking on an MBA course, we show that common sample survey results are misunderstood even by those managers who have previously done a statistics course. In general, they fare no better than managers who have never studied statistics. There are implications for teaching, especially in business schools, as well as for consulting.

  20. Statistical Analysis of Demographic and Temporal Differences in LANL's 2014 Voluntary Protection Program Survey

    Energy Technology Data Exchange (ETDEWEB)

    Davis, Adam Christopher [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Booth, Steven Richard [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2015-08-20

    Voluntary Protection Program (VPP) surveys were conducted in 2013 and 2014 to assess the degree to which workers at Los Alamos National Laboratory feel that their safety is valued by their management and peers. The goal of this analysis is to determine whether the difference between the VPP survey scores in 2013 and 2014 is significant, and to present the data in a way such that it can help identify either positive changes or potential opportunities for improvement. Data for several questions intended to identify the demographic groups of the respondent are included in both the 2013 and 2014 VPP survey results. These can be used to identify any significant differences among groups of employees as well as to identify any temporal trends in these cohorts.

  1. Statistical Literacy and Sample Survey Results

    Science.gov (United States)

    McAlevey, Lynn; Sullivan, Charles

    2010-01-01

    Sample surveys are widely used in the social sciences and business. The news media almost daily quote from them, yet they are widely misused. Using students with prior managerial experience embarking on an MBA course, we show that common sample survey results are misunderstood even by those managers who have previously done a statistics course. In…

  2. Beginning statistics with data analysis

    CERN Document Server

    Mosteller, Frederick; Rourke, Robert EK

    2013-01-01

    This introduction to the world of statistics covers exploratory data analysis, methods for collecting data, formal statistical inference, and techniques of regression and analysis of variance. 1983 edition.

  3. Challenges in dental statistics: survey methodology topics

    OpenAIRE

    Pizzo, Giuseppe; Milani, Silvano; Spada, Elena; Ottolenghi, Livia

    2013-01-01

    This paper gathers some contributions concerning survey methodology in dental research, as discussed during the first Workshop of the SISMEC STATDENT working group on statistical methods and applications in dentistry, held in Ancona on the 28th September 2011. The first contribution deals with the European Global Oral Health Indicators Development (EGOHID) Project which proposed a comprehensive and standardized system of epidemiological tools (questionnaires and clinical forms) for national da...

  4. Key Statistics from the National Survey of Family Growth: Vasectomy

    Science.gov (United States)


  5. Statistical Analysis Plan

    DEFF Research Database (Denmark)

    Ris Hansen, Inge; Søgaard, Karen; Gram, Bibi

    2015-01-01

    This is the analysis plan for the multicentre randomised control study looking at the effect of training and exercises in chronic neck pain patients that is being conducted in Jutland and Funen, Denmark. This plan will be used as a work description for the analyses of the data collected....

  6. Impact of the Global Food Safety Initiative on Food Safety Worldwide: Statistical Analysis of a Survey of International Food Processors.

    Science.gov (United States)

    Crandall, Philip G; Mauromoustakos, Andy; O'Bryan, Corliss A; Thompson, Kevin C; Yiannas, Frank; Bridges, Kerry; Francois, Catherine

    2017-10-01

    In 2000, the Consumer Goods Forum established the Global Food Safety Initiative (GFSI) to increase the safety of the world's food supply and to harmonize food safety regulations worldwide. In 2013, a university research team in conjunction with Diversey Consulting (Sealed Air), the Consumer Goods Forum, and officers of GFSI solicited input from more than 15,000 GFSI-certified food producers worldwide to determine whether GFSI certification had lived up to these expectations. A total of 828 usable questionnaires were analyzed, representing about 2,300 food manufacturing facilities and food suppliers in 21 countries, mainly across Western Europe, Australia, New Zealand, and North America. Nearly 90% of these certified suppliers perceived GFSI as being beneficial for addressing their food safety concerns, and respondents were eight times more likely to repeat the certification process knowing what it entailed. Nearly three-quarters (74%) of these food manufacturers would choose to go through the certification process again even if certification were not required by one of their current retail customers. Important drivers for becoming GFSI certified included continuing to do business with an existing customer, starting to do business with a new customer, reducing the number of third-party food safety audits, and continuing improvement of their food safety program. Although 50% or fewer respondents stated that they saw actual increases in sales, customers, suppliers, or employees, significantly more companies agreed than disagreed that there was an increase in these key performance indicators in the year following GFSI certification. A majority of respondents (81%) agreed that there was a substantial investment in staff time since certification, and 50% agreed there was a significant capital investment. This survey is the largest and most representative of global food manufacturers conducted to date.

  7. Challenges in dental statistics: survey methodology topics

    Directory of Open Access Journals (Sweden)

    Giuseppe Pizzo

    2013-12-01

    This paper gathers some contributions concerning survey methodology in dental research, as discussed during the first Workshop of the SISMEC STATDENT working group on statistical methods and applications in dentistry, held in Ancona on the 28th September 2011. The first contribution deals with the European Global Oral Health Indicators Development (EGOHID) Project which proposed a comprehensive and standardized system of epidemiological tools (questionnaires and clinical forms) for national data collection on oral health in Europe. The second contribution regards the design and conduct of trials to evaluate the clinical efficacy and safety of toothbrushes and mouthrinses. Finally, a flexible and effective tool used to trace dental age reference charts tailored to Italian children is presented.

  8. The implicative statistical analysis: an interdisciplinary paradigm

    OpenAIRE

    Iurato, Giuseppe

    2012-01-01

    In this brief note, which has simply the role of an epistemological survey paper, some of the main basic elements of Implicative Statistical Analysis (ISA) pattern are put into a possible critical comparison with some of the main aspects of Probability Theory, Inductive Inference Theory, Nonparametric and Multivariate Statistics, Optimization Theory and Dynamical System Theory which point out the very interesting multidisciplinary nature of the ISA pattern and related possible hints.

  9. Research design and statistical analysis

    CERN Document Server

    Myers, Jerome L; Lorch Jr, Robert F

    2013-01-01

    Research Design and Statistical Analysis provides comprehensive coverage of the design principles and statistical concepts necessary to make sense of real data.  The book's goal is to provide a strong conceptual foundation to enable readers to generalize concepts to new research situations.  Emphasis is placed on the underlying logic and assumptions of the analysis and what it tells the researcher, the limitations of the analysis, and the consequences of violating assumptions.  Sampling, design efficiency, and statistical models are emphasized throughout. As per APA recommendations

  10. Statistical mapping of count survey data

    Science.gov (United States)

    Royle, J. Andrew; Link, W.A.; Sauer, J.R.; Scott, J. Michael; Heglund, Patricia J.; Morrison, Michael L.; Haufler, Jonathan B.; Wall, William A.

    2002-01-01

    We apply a Poisson mixed model to the problem of mapping (or predicting) bird relative abundance from counts collected from the North American Breeding Bird Survey (BBS). The model expresses the logarithm of the Poisson mean as a sum of a fixed term (which may depend on habitat variables) and a random effect which accounts for remaining unexplained variation. The random effect is assumed to be spatially correlated, thus providing a more general model than the traditional Poisson regression approach. Consequently, the model is capable of improved prediction when data are autocorrelated. Moreover, formulation of the mapping problem in terms of a statistical model facilitates a wide variety of inference problems which are cumbersome or even impossible using standard methods of mapping. For example, assessment of prediction uncertainty, including the formal comparison of predictions at different locations, or through time, using the model-based prediction variance is straightforward under the Poisson model (not so with many nominally model-free methods). Also, ecologists may generally be interested in quantifying the response of a species to particular habitat covariates or other landscape attributes. Proper accounting for the uncertainty in these estimated effects is crucially dependent on specification of a meaningful statistical model. Finally, the model may be used to aid in sampling design, by modifying the existing sampling plan in a manner which minimizes some variance-based criterion. Model fitting under this model is carried out using a simulation technique known as Markov Chain Monte Carlo. Application of the model is illustrated using Mourning Dove (Zenaida macroura) counts from Pennsylvania BBS routes. We produce both a model-based map depicting relative abundance, and the corresponding map of prediction uncertainty. We briefly address the issue of spatial sampling design under this model. Finally, we close with some discussion of mapping in relation to
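
The model structure described in this abstract can be written compactly. The notation below is a sketch and is not taken from the paper: the symbols $y_i$, $\lambda_i$, $\mathbf{x}_i$, $\boldsymbol{\beta}$, $z_i$ and $\boldsymbol{\Sigma}$ are chosen here for illustration, and the Gaussian form of the spatially correlated random effect is an assumption.

```latex
\begin{aligned}
y_i \mid \lambda_i &\sim \mathrm{Poisson}(\lambda_i), \\
\log \lambda_i &= \mathbf{x}_i^{\top}\boldsymbol{\beta} + z_i, \\
\mathbf{z} = (z_1, \dots, z_n)^{\top} &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}).
\end{aligned}
```

Here $y_i$ is the count on route $i$, $\mathbf{x}_i^{\top}\boldsymbol{\beta}$ is the fixed term that may depend on habitat variables, and the off-diagonal entries of $\boldsymbol{\Sigma}$ carry the spatial correlation. Setting $\boldsymbol{\Sigma} = \mathbf{0}$ recovers ordinary Poisson regression.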

  11. Statistical methods for bioimpedance analysis

    Directory of Open Access Journals (Sweden)

    Christian Tronstad

    2014-04-01

    This paper gives a basic overview of relevant statistical methods for the analysis of bioimpedance measurements, with an aim to answer questions such as: How do I begin with planning an experiment? How many measurements do I need to take? How do I deal with large amounts of frequency sweep data? Which statistical test should I use, and how do I validate my results? Beginning with the hypothesis and the research design, the methodological framework for making inferences based on measurements and statistical analysis is explained. This is followed by a brief discussion on correlated measurements and data reduction before an overview is given of statistical methods for comparison of groups, factor analysis, association, regression and prediction, explained in the context of bioimpedance research. The last chapter is dedicated to the validation of a new method by different measures of performance. A flowchart is presented for selection of statistical method, and a table is given for an overview of the most important terms of performance when evaluating new measurement technology.

  12. Pyrotechnic Shock Analysis Using Statistical Energy Analysis

    Science.gov (United States)

    2015-10-23

    29th Aerospace Testing Seminar, October 2015. Pyrotechnic Shock Analysis Using Statistical Energy Analysis. James Ho-Jin Hwang Engineering ... maximum structural response due to a pyrotechnic shock input using Statistical Energy Analysis (SEA). It had been previously understood that since the ... pyrotechnic shock is not a steady state event, the traditional SEA method may not be applicable. A new analysis methodology effectively utilizes the

  13. A survey of statistical downscaling techniques

    Energy Technology Data Exchange (ETDEWEB)

    Zorita, E.; Storch, H. von [GKSS-Forschungszentrum Geesthacht GmbH (Germany). Inst. fuer Hydrophysik

    1997-12-31

    The derivation of regional information from integrations of coarse-resolution General Circulation Models (GCM) is generally referred to as downscaling. The most relevant statistical downscaling techniques are described here and some particular examples are worked out in detail. They are classified into three main groups: linear methods, classification methods and deterministic non-linear methods. Their performance in a particular example, winter rainfall in the Iberian peninsula, is compared to a simple downscaling analog method. It is found that the analog method performs as well as the more complicated methods. Downscaling analysis can also be used as a tool to validate the regional performance of global climate models by analyzing the covariability of the simulated large-scale climate and the regional climates. (orig.) [German abstract, translated] The derivation of regional information from integrations of coarse-resolution climate models is referred to as 'regionalization'. This contribution describes the most important statistical regionalization methods and also gives some detailed examples. Regionalization methods can be classified into three main groups: linear methods, classification methods and non-linear deterministic methods. These methods are applied to precipitation on the Iberian Peninsula and compared with the results of a simple analog model. It is found that the results of the more complicated methods can essentially also be achieved with the analog method. A further application of regionalization methods is the validation of global climate models, by comparing the simulated and the observed covariability between the large-scale and the regional climate. (orig.)
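
The idea behind an analog method can be sketched in a few lines: find the archived large-scale pattern closest to the target pattern and reuse the local value observed alongside it. The sketch assumes Euclidean distance on raw pattern values. The function name, the two-value patterns and the rainfall figures are hypothetical and do not come from the paper.

```python
import math


def analog_downscale(target, archive_patterns, archive_local):
    """Return (local value, index) of the closest archived analog.

    target           large-scale pattern to downscale (sequence of floats)
    archive_patterns historical large-scale patterns, same length as target
    archive_local    local observation recorded with each archived pattern
    """
    best = min(
        range(len(archive_patterns)),
        key=lambda i: math.dist(target, archive_patterns[i]),
    )
    return archive_local[best], best


# Hypothetical archive: two pressure values per day and the local rainfall.
patterns = [(1010.0, 1005.0), (995.0, 1000.0), (1020.0, 1018.0)]
rainfall = [12.0, 48.5, 0.0]
value, index = analog_downscale((997.0, 1001.0), patterns, rainfall)
```

In practice the patterns would usually be reduced first, for example to a few leading principal components, before distances are computed. That step is omitted here.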

  14. Regularized Statistical Analysis of Anatomy

    DEFF Research Database (Denmark)

    Sjöstrand, Karl

    2007-01-01

    This thesis presents the application and development of regularized methods for the statistical analysis of anatomical structures. Focus is on structure-function relationships in the human brain, such as the connection between early onset of Alzheimer’s disease and shape changes of the corpus cal...

  15. Bayesian Inference in Statistical Analysis

    CERN Document Server

    Box, George E P

    2011-01-01

    The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob

  16. Statistical analysis of survival data.

    Science.gov (United States)

    Crowley, J; Breslow, N

    1984-01-01

    A general review of the statistical techniques that the authors feel are most important in the analysis of survival data is presented. The emphasis is on the study of the duration of time between any two events as applied to people and on the nonparametric and semiparametric models most often used in these settings. The unifying concept is the hazard function, variously known as the risk, the force of mortality, or the force of transition.

  17. Spatial Statistical Analysis of Large Astronomical Datasets

    Science.gov (United States)

    Szapudi, Istvan

    2002-12-01

    The future of astronomy will be dominated by large and complex databases. Megapixel CMB maps, joint analyses of surveys across several wavelengths, as envisioned in the planned National Virtual Observatory (NVO), and the TByte/day data rates of future surveys (Pan-STARRS) put stringent constraints on future data analysis methods: they have to achieve at least N log N scaling to be viable in the long term. This warrants special attention to computational requirements, which were ignored during the initial development of current analysis tools in favor of statistical optimality. Even an optimal measurement, however, has residual errors due to statistical sample variance. Hence a suboptimal technique with significantly smaller measurement errors than the unavoidable sample variance produces results which are nearly identical to those of a statistically optimal technique. For instance, for analyzing CMB maps, I present a suboptimal alternative, indistinguishable from the standard optimal method with N^3 scaling, that can be rendered N log N with a hierarchical representation of the data; a speed-up of a trillion times compared to other methods. In this spirit I will present a set of novel algorithms and methods for spatial statistical analyses of future large astronomical databases, such as galaxy catalogs, megapixel CMB maps, or any point source catalog.
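
    To give a feel for the scaling argument in this abstract, the sketch below compares raw operation counts for cubic and N log N algorithms. The pixel counts are illustrative assumptions, constant factors are ignored, and the abstract does not say which map size underlies its quoted speed-up.

```python
import math

def speedup(n: int) -> float:
    """Ratio of N^3 to N*log2(N) operation counts, ignoring constant factors."""
    return n**3 / (n * math.log2(n))

# Hypothetical map sizes (number of pixels); not taken from the abstract.
for n in (10**5, 10**6, 10**7):
    print(f"N = {n:>10,d}   N^3 / (N log2 N) = {speedup(n):.2e}")
```

    Under these assumptions the ratio grows roughly as N^2 / log N, so it passes a factor of a trillion somewhere between one and ten million pixels.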

  18. New developments in survey data collection methodology for official statistics

    NARCIS (Netherlands)

    Bethlehem, J.

    2010-01-01

    There is a growing demand for statistical information in society. National statistical institutes have to satisfy this demand. The way they attempt to accomplish this changes over time. Changes in survey methodology for official statistics may have been caused by new developments, for example in

  19. Statistical Signatures of Panspermia in Exoplanet Surveys

    Science.gov (United States)

    Lin, Henry W.; Loeb, Abraham

    2015-09-01

    A fundamental astrobiological question is whether life can be transported between extrasolar systems. We propose a new strategy to answer this question based on the principle that life which arose via spreading will exhibit more clustering than life which arose spontaneously. We develop simple statistical models of panspermia to illustrate observable consequences of these excess correlations. Future searches for biosignatures in the atmospheres of exoplanets could test these predictions: a smoking gun signature of panspermia would be the detection of large regions in the Milky Way where life saturates its environment interspersed with voids where life is very uncommon. In a favorable scenario, detection of as few as ∼25 biologically active exoplanets could yield a 5σ detection of panspermia. Detectability of position-space correlations is possible unless the timescale for life to become observable once seeded is longer than the timescale for stars to redistribute in the Milky Way.

  20. Statistical Analysis of Iberian Peninsula Megaliths Orientations

    Science.gov (United States)

    González-García, A. C.

    2009-08-01

    Megalithic monuments have been intensively surveyed and studied from the archaeoastronomical point of view in the past decades. We have orientation measurements for over one thousand megalithic burial monuments in the Iberian Peninsula, from several different periods. These data, however, are not yet well understood. One way to classify and start to understand such orientations is by means of statistical analysis of the data. A first attempt is made with simple statistical variables and a mere comparison between the different areas. In order to minimise the subjectivity in the process, a further, more complicated analysis is performed. Some interesting results linking the orientation and the geographical location will be presented. Finally, I will present some models comparing the orientation of the megaliths in the Iberian Peninsula with the rising of the sun and the moon at several times of the year.

  1. Measurement error models for survey statistics and economic archaeology

    OpenAIRE

    Groß, Marcus

    2016-01-01

    The present work is concerned with so-called measurement error models in applied statistics. The data analyzed and processed come from two very different fields: on the one hand, survey and register data, which are used in survey statistics, and on the other hand, anthropological data on prehistoric skeletons. For both fields the problem arises that some variables cannot be measured with sufficient accuracy. This can be due to privacy or measuring inaccuracies. This circumstance can be summa...

  2. A survey of abstracts of high-impact clinical journals indicated most statistical methods presented are summary statistics.

    Science.gov (United States)

    Taback, Nathan; Krzyzanowska, Monika K

    2008-03-01

    To assess what statistical methods are commonly used in high-impact clinical research and how they are presented in abstracts of articles published in high-impact medical journals. A cross-sectional survey of abstracts of original articles published in July 2003 in four high-impact medical journals was conducted. The primary outcome was the distribution of statistical methods used in study results presented in the abstract of articles. Seventy articles met inclusion criteria. One hundred twenty-five unique statistical method presentations were analyzed. Sixty-eight percent of statistical methods used summary statistics, and 27.2% used regression analysis. When summary statistics were used, clinical evidence was presented with a P-value or confidence interval (CI) in 51.8% of statistical methods compared to 72.5% when summary statistics were not used (P=0.0282). Clinical evidence was presented verbally in 7.1% of statistical methods when summary statistics were used and in 20.0% when summary statistics were not used (P=0.0323). Summary statistics are the most frequently used statistical method to generate high-impact clinical evidence presented in the abstract of a medical article. Evidence described by summary statistics is significantly associated with less frequent reporting of a P-value or CI, and less frequent verbal presentations.
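
    The two comparisons reported in this abstract can be approximately re-derived from its percentages. The sketch below is not the authors' code: the cell counts are reconstructed from the rounded percentages (68% of 125 presentations taken as 85), and the use of a Pearson chi-square test without continuity correction is an assumption, since the abstract does not name the test.

```python
import math

def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With one degree of freedom the upper-tail
    probability of the chi-square distribution is erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2))

# Counts reconstructed from the reported percentages (assumption):
# rows = summary statistics used (85) / not used (40).
p_or_ci = chi_square_2x2(44, 41, 29, 11)  # 44/85 = 51.8%, 29/40 = 72.5%
verbal = chi_square_2x2(6, 79, 8, 32)     # 6/85 = 7.1%, 8/40 = 20.0%

print(f"P-value or CI reported: chi2 = {p_or_ci[0]:.3f}, p = {p_or_ci[1]:.4f}")
print(f"Verbal presentation:    chi2 = {verbal[0]:.3f}, p = {verbal[1]:.4f}")
```

    With these reconstructed counts the p-values should land close to the reported P=0.0282 and P=0.0323, which suggests, but does not prove, that an uncorrected chi-square test was used.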

  3. Systematic survey of the design, statistical analysis, and reporting of studies published in the 2008 volume of the Journal of Cerebral Blood Flow and Metabolism.

    Science.gov (United States)

    Vesterinen, Hanna M; Vesterinen, Hanna V; Egan, Kieren; Deister, Amelie; Schlattmann, Peter; Macleod, Malcolm R; Dirnagl, Ulrich

    2011-04-01

    Translating experimental findings into clinically effective therapies is one of the major bottlenecks of modern medicine. As this has been particularly true for cerebrovascular research, attention has turned to the quality and validity of experimental cerebrovascular studies. We set out to assess the study design, statistical analyses, and reporting of cerebrovascular research. We assessed all original articles published in the Journal of Cerebral Blood Flow and Metabolism during the year 2008 against a checklist designed to capture the key attributes relating to study design, statistical analyses, and reporting. A total of 156 original publications were included (animal, in vitro, human). Few studies reported a primary research hypothesis, statement of purpose, or measures to safeguard internal validity (such as randomization, blinding, exclusion or inclusion criteria). Many studies lacked sufficient information regarding methods and results to form a reasonable judgment about their validity. In nearly 20% of studies, statistical tests were either not appropriate or information to allow assessment of appropriateness was lacking. This study identifies a number of factors that should be addressed if the quality of research in basic and translational biomedicine is to be improved. We support the widespread implementation of the ARRIVE (Animal Research Reporting In Vivo Experiments) statement for the reporting of experimental studies in biomedicine, improved training in proper study design and analysis, and the adoption by reviewers and editors of a more constructively critical approach in the assessment of manuscripts for publication.

  4. "TNOs are Cool": A survey of the trans-Neptunian region. XIII. Statistical analysis of multiple trans-Neptunian objects observed with Herschel Space Observatory

    Science.gov (United States)

    Kovalenko, I. D.; Doressoundiram, A.; Lellouch, E.; Vilenius, E.; Müller, T.; Stansberry, J.

    2017-11-01

    Context: Gravitationally bound multiple systems provide an opportunity to estimate the mean bulk density of the objects, whereas this characteristic is not available for single objects. Being a primitive population of the outer solar system, binary and multiple trans-Neptunian objects (TNOs) provide unique information about bulk density and internal structure, improving our understanding of their formation and evolution. Aims: The goal of this work is to analyse parameters of multiple trans-Neptunian systems observed with the Herschel and Spitzer space telescopes. In particular, statistical analysis is performed for radiometric size and geometric albedo, obtained from photometric observations, and for estimated bulk density. Methods: We use Monte Carlo simulation to estimate the real size distribution of TNOs. For this purpose, we expand the dataset of diameters by adopting the Minor Planet Center database list with available values of the absolute magnitude therein, and the albedo distribution derived from Herschel radiometric measurements. We use the 2-sample Anderson-Darling non-parametric statistical method for testing whether two samples of diameters, for binary and single TNOs, come from the same distribution. Additionally, we use Spearman's coefficient as a measure of rank correlation between parameters. Uncertainties of estimated parameters together with lack of data are taken into account. Conclusions about correlations between parameters are based on statistical hypothesis testing. Results: We have found that the difference in size distributions of multiple and single TNOs is biased by small objects. The test on correlations between parameters shows that the effective diameter of binary TNOs strongly correlates with heliocentric orbital inclination and with the magnitude difference between the components of the binary system. The correlation between diameter and magnitude difference implies that small and large binaries are formed by different mechanisms. Furthermore
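
    The rank correlation mentioned in the Methods can be illustrated with a small stand-alone sketch. This is a generic implementation of Spearman's coefficient (Pearson correlation of average ranks), not the authors' pipeline, and the diameters and inclinations below are hypothetical values invented for the example.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical effective diameters (km) and orbital inclinations (degrees).
diameters = [120.0, 250.0, 310.0, 480.0, 700.0, 900.0]
inclinations = [2.1, 5.4, 4.9, 12.0, 17.3, 28.6]
print(f"Spearman rho = {spearman_rho(diameters, inclinations):.3f}")
```

    Because only ranks enter the calculation, the coefficient is insensitive to the strongly skewed size distribution typical of such samples.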

  5. Pseudo-populations a basic concept in statistical surveys

    CERN Document Server

    Quatember, Andreas

    2015-01-01

    This book emphasizes that artificial or pseudo-populations play an important role in statistical surveys from finite universes in two ways: firstly, the concept of pseudo-populations may substantially improve users’ understanding of various aspects of sampling theory and survey methodology; an example of this scenario is the Horvitz-Thompson estimator. Secondly, statistical procedures exist in which pseudo-populations actually have to be generated. An example of such a scenario can be found in simulation studies in the field of survey sampling, where close-to-reality pseudo-populations are generated from known sample and population data to form the basis for the simulation process. The chapters focus on estimation methods, sampling techniques, nonresponse, questioning designs and statistical disclosure control. This book is a valuable reference in understanding the importance of the pseudo-population concept and applying it in teaching and research.
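
    As a rough illustration of the link between the Horvitz-Thompson estimator and a pseudo-population, the sketch below estimates a population total by weighting each sampled value with the inverse of its inclusion probability, then builds a pseudo-population by replicating each value that many times. The sample values and inclusion probabilities are hypothetical, and the replication step assumes the inverse probabilities are whole numbers; it is not taken from the book.

```python
def horvitz_thompson_total(y, pi):
    """Estimate the population total as the sum of y_i / pi_i over the sample."""
    return sum(yi / p for yi, p in zip(y, pi))

def pseudo_population(y, pi):
    """Replicate each sampled value 1 / pi_i times (integer weights assumed)."""
    population = []
    for yi, p in zip(y, pi):
        population.extend([yi] * round(1 / p))
    return population

# Hypothetical sample values and first-order inclusion probabilities.
y = [12.0, 7.0, 30.0, 18.0]
pi = [0.25, 0.25, 0.5, 0.5]

estimate = horvitz_thompson_total(y, pi)
pseudo = pseudo_population(y, pi)
print(estimate, sum(pseudo), len(pseudo))
```

    The total of the pseudo-population equals the Horvitz-Thompson estimate, which is the sense in which the estimator can be read as "the total of an artificial population built from the sample".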

  6. Statistical analysis of management data

    CERN Document Server

    Gatignon, Hubert

    2013-01-01

    This book offers a comprehensive approach to multivariate statistical analyses. It provides theoretical knowledge of the concepts underlying the most important multivariate techniques and an overview of actual applications.

  7. Statistical Estimators Using Jointly Administrative and Survey Data to Produce French Structural Business Statistics

    Directory of Open Access Journals (Sweden)

    Brion Philippe

    2015-12-01

    Using as much administrative data as possible is a general trend among most national statistical institutes. Different kinds of administrative sources, from tax authorities or other administrative bodies, are very helpful material in the production of business statistics. However, these sources often have to be completed by information collected through statistical surveys. This article describes the way Insee has implemented such a strategy in order to produce French structural business statistics. The originality of the French procedure is that administrative and survey variables are used jointly for the same enterprises, unlike the majority of multisource systems, in which the two kinds of sources generally complement each other for different categories of units. The idea is to use, as much as possible, the richness of the administrative sources combined with the timeliness of a survey, even if the latter is conducted only on a sample of enterprises. One main issue is the classification of enterprises within the NACE nomenclature, which is a cornerstone variable in producing the breakdown of the results by industry. At a given date, two values of the corresponding code may coexist: the value of the register, not necessarily up to date, and the value resulting from the data collected via the survey, but only from a sample of enterprises. Using all this information together requires the implementation of specific statistical estimators combining some properties of the difference estimators with calibration techniques. This article presents these estimators, as well as their statistical properties, and compares them with those of other methods.

  8. Statistical trend analysis methods for temporal phenomena

    Energy Technology Data Exchange (ETDEWEB)

    Lehtinen, E.; Pulkkinen, U. [VTT Automation (Finland)]; Poern, K. [Poern Consulting, Nykoeping (Sweden)]

    1997-04-01

    We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: the Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods. 14 refs, 10 figs.
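
    Of the non-parametric tests listed in this abstract, the Cox-Stuart test is the simplest to sketch: pair each observation in the first half of the series with its counterpart in the second half and apply a sign test to the differences. The version below is a generic textbook form, not the report's implementation, and the yearly event counts are hypothetical.

```python
from math import comb

def cox_stuart(values):
    """Two-sided Cox-Stuart sign test for a monotonic trend.

    Returns (n_positive, n_negative, p_value), where the p-value comes from
    the exact Binomial(m, 0.5) distribution of the sign counts.
    """
    n = len(values)
    half = n // 2
    first = values[:half]
    second = values[n - half:]  # skips the middle observation when n is odd
    diffs = [b - a for a, b in zip(first, second) if b != a]  # drop ties
    m = len(diffs)
    if m == 0:
        return 0, 0, 1.0
    pos = sum(d > 0 for d in diffs)
    neg = m - pos
    tail = sum(comb(m, i) for i in range(min(pos, neg) + 1)) / 2**m
    return pos, neg, min(1.0, 2 * tail)

# Hypothetical yearly counts of events over twelve years.
counts = [2, 3, 1, 4, 3, 5, 6, 5, 7, 8, 7, 9]
pos, neg, p = cox_stuart(counts)
print(f"increases = {pos}, decreases = {neg}, two-sided p = {p:.5f}")
```

    The test makes no assumption about the form of the intensity function, which matches the report's motivation for preferring tests that remain valid when the Poisson assumption is doubtful.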

  9. Statistical Literacy Among Academic Pathologists: A Survey Study to Gauge Knowledge of Frequently Used Statistical Tests Among Trainees and Faculty.

    Science.gov (United States)

    Schmidt, Robert L; Chute, Deborah J; Colbert-Getz, Jorie M; Firpo-Betancourt, Adolfo; James, Daniel S; Karp, Julie K; Miller, Douglas C; Milner, Danny A; Smock, Kristi J; Sutton, Ann T; Walker, Brandon S; White, Kristie L; Wilson, Andrew R; Wojcik, Eva M; Yared, Marwan A; Factor, Rachel E

    2017-02-01

    Statistical literacy can be defined as understanding the statistical tests and terminology needed for the design, analysis, and conclusions of original research or laboratory testing. Little is known about the statistical literacy of clinical or anatomic pathologists. To determine the statistical methods most commonly used in pathology studies from the literature and to assess familiarity and knowledge level of these statistical tests by pathology residents and practicing pathologists. The most frequently used statistical methods were determined by a review of 1100 research articles published in 11 pathology journals during 2015. Familiarity with statistical methods was determined by a survey of pathology trainees and practicing pathologists at 9 academic institutions in which pathologists were asked to rate their knowledge of the methods identified by the focused review of the literature. We identified 18 statistical tests that appear frequently in published pathology studies. On average, pathologists reported a knowledge level between "no knowledge" and "basic knowledge" of most statistical tests. Knowledge of tests was higher for more frequently used tests. Greater statistical knowledge was associated with a focus on clinical pathology versus anatomic pathology, having had a statistics course, having an advanced degree other than an MD degree, and publishing research. Statistical knowledge was not associated with length of pathology practice. An audit of pathology literature reveals that knowledge of about 12 statistical tests would be sufficient to provide statistical literacy for pathologists. On average, most pathologists report they can interpret commonly used tests but are unable to perform them. Most pathologists indicated that they would benefit from additional statistical training.

  10. 2010 National Beneficiary Survey: Methodology and Descriptive Statistics.

    OpenAIRE

    Debra Wright; Gina Livermore; Denise Hoffman; Eric Grau; Maura Bardos

    2012-01-01

    This report presents the sampling design and data collection activities for round 4 (2010) of the Social Security Administration’s National Beneficiary Survey (NBS). It also provides descriptive statistics on working-age individuals receiving Supplemental Security Income and Social Security Disability Insurance benefits, based on the nationally representative sample from the 2010 NBS.

  11. A Statistical Analysis of Cryptocurrencies

    OpenAIRE

    Stephen Chan; Jeffrey Chu; Saralees Nadarajah; Joerg Osterrieder

    2017-01-01

    We analyze statistical properties of the largest cryptocurrencies (determined by market capitalization), of which Bitcoin is the most prominent example. We characterize their exchange rates versus the U.S. Dollar by fitting parametric distributions to them. It is shown that returns are clearly non-normal, however, no single distribution fits well jointly to all the cryptocurrencies analysed. We find that for the most popular currencies, such as Bitcoin and Litecoin, the generalized hyperbolic...

  12. Water Quality Stressor Information from Clean Water Act Statewide Statistical Surveys

    Data.gov (United States)

    U.S. Environmental Protection Agency — Stressors assessed by statewide statistical surveys and their state and national attainment categories. Statewide statistical surveys are water quality assessments...

  13. Water Quality attainment Information from Clean Water Act Statewide Statistical Surveys

    Data.gov (United States)

    U.S. Environmental Protection Agency — Designated uses assessed by statewide statistical surveys and their state and national attainment categories. Statewide statistical surveys are water quality...

  14. A Statistical Analysis of Cryptocurrencies

    Directory of Open Access Journals (Sweden)

    Stephen Chan

    2017-05-01

    We analyze statistical properties of the largest cryptocurrencies (determined by market capitalization), of which Bitcoin is the most prominent example. We characterize their exchange rates versus the U.S. Dollar by fitting parametric distributions to them. It is shown that returns are clearly non-normal; however, no single distribution fits well jointly to all the cryptocurrencies analysed. We find that for the most popular currencies, such as Bitcoin and Litecoin, the generalized hyperbolic distribution gives the best fit, while for the smaller cryptocurrencies the normal inverse Gaussian distribution, generalized t distribution, and Laplace distribution give good fits. The results are important for investment and risk management purposes.
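
    The idea of comparing parametric fits to returns can be sketched with two of the simplest candidates. The example below fits a normal and a Laplace distribution by maximum likelihood and compares their log-likelihoods; the returns are synthetic draws generated for the illustration, not cryptocurrency data, and the paper's richer families (generalized hyperbolic, normal inverse Gaussian, generalized t) are not covered.

```python
import math
import random
import statistics

def normal_loglik(x):
    """Maximised log-likelihood of a normal fit (ML mean and variance)."""
    mu = statistics.fmean(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    return -0.5 * len(x) * (math.log(2 * math.pi * var) + 1)

def laplace_loglik(x):
    """Maximised log-likelihood of a Laplace fit (median and mean abs. deviation)."""
    med = statistics.median(x)
    scale = sum(abs(v - med) for v in x) / len(x)
    return -len(x) * (math.log(2 * scale) + 1)

# Synthetic heavy-tailed "returns": the difference of two exponential draws
# follows a Laplace distribution (here with scale 1/50).
random.seed(0)
returns = [random.expovariate(50) - random.expovariate(50) for _ in range(2000)]

ll_normal = normal_loglik(returns)
ll_laplace = laplace_loglik(returns)
print(f"normal: {ll_normal:.1f}   Laplace: {ll_laplace:.1f}")
```

    Both models have two parameters, so the higher log-likelihood also wins on AIC or BIC; with models of different size one would compare penalised criteria instead.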

  15. Risk analysis methodology survey

    Science.gov (United States)

    Batson, Robert G.

    1987-01-01

    NASA regulations require that formal risk analysis be performed on a program at each of several milestones as it moves toward full-scale development. Program risk analysis is discussed as a systems analysis approach, an iterative process (identification, assessment, management), and a collection of techniques. These techniques, which range from simple to complex network-based simulation, were surveyed. A Program Risk Analysis Handbook was prepared in order to provide both analyst and manager with a guide for selection of the most appropriate technique.

  16. Morphological Analysis for Statistical Machine Translation

    National Research Council Canada - National Science Library

    Lee, Young-Suk

    2004-01-01

    We present a novel morphological analysis technique which induces a morphological and syntactic symmetry between two languages with highly asymmetrical morphological structures to improve statistical...

  17. Using Person Fit Statistics to Detect Outliers in Survey Research.

    Science.gov (United States)

    Felt, John M; Castaneda, Ruben; Tiemensma, Jitske; Depaoli, Sarah

    2017-01-01

    Context: When working with health-related questionnaires, outlier detection is important. However, traditional methods of outlier detection (e.g., boxplots) can miss participants with "atypical" responses to the questions that otherwise have similar total (subscale) scores. In addition to detecting outliers, it can be of clinical importance to determine the reason for the outlier status or "atypical" response. Objective: The aim of the current study was to illustrate how to derive person fit statistics for outlier detection through a statistical method examining person fit with a health-based questionnaire. Design and Participants: Patients treated for Cushing's syndrome (n = 394) were recruited from the Cushing's Support and Research Foundation's (CSRF) listserv and Facebook page. Main Outcome Measure: Patients were directed to an online survey containing the CushingQoL (English version). A two-dimensional graded response model was estimated, and person fit statistics were generated using the Zh statistic. Results: Conventional outlier detection methods revealed no outliers reflecting extreme scores on the subscales of the CushingQoL. However, person fit statistics identified 18 patients with "atypical" response patterns, which would have been otherwise missed (Zh > |±2.00|). Conclusion: While the conventional methods of outlier detection indicated no outliers, person fit statistics identified several patients with "atypical" response patterns who otherwise appeared average. Person fit statistics allow researchers to delve further into the underlying problems experienced by these "atypical" patients treated for Cushing's syndrome. Annotated code is provided to aid other researchers in using this method.

  18. Non-gaussian statistics of pencil beam surveys

    Science.gov (United States)

    Amendola, Luca

    1994-01-01

    We study the effect of the non-Gaussian clustering of galaxies on the statistics of pencil beam surveys. We derive the probability from the power spectrum peaks by means of Edgeworth expansion and find that the higher order moments of the galaxy distribution play a dominant role. The probability of obtaining the 128 Mpc/h periodicity found in pencil beam surveys is raised by more than one order of magnitude, up to 1%. Further data are needed to decide if non-Gaussian distribution alone is sufficient to explain the 128 Mpc/h periodicity, or if extra large-scale power is necessary.

  19. Statistical Power in Meta-Analysis

    Science.gov (United States)

    Liu, Jin

    2015-01-01

    Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…

  20. Statistical methods for astronomical data analysis

    CERN Document Server

    Chattopadhyay, Asis Kumar

    2014-01-01

    This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminologies for astronomical phenomena will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for ...

  1. STATISTICAL ANALYSIS OF MONETARY POLICY INDICATORS VARIABILITY

    Directory of Open Access Journals (Sweden)

    ANAMARIA POPESCU

    2016-10-01

    This paper attempts to characterize the available statistical data through statistical indicators. Its purpose is to present the statistical indicators, primary and secondary, simple and synthetic, that are frequently used for the statistical characterization of statistical series. We can thus analyze the central tendency, variability, shape and concentration of the data distributions using the analytical tools in Microsoft Excel, which enable automatic calculation of descriptive statistics through the Data Analysis option in the Tools menu. We also study the links that exist between statistical variables using two techniques, correlation and regression. The analysis of monetary policy in the period 2003-2014, based on information provided by the website of the National Bank of Romania (BNR), suggests a certain tendency towards eccentricity and asymmetry in the financial data series.

  2. Statistics available for site studies in registers and surveys at Statistics Sweden

    Energy Technology Data Exchange (ETDEWEB)

    Haldorson, Marie [Statistics Sweden, Oerebro (Sweden)]

    2000-03-01

    Statistics Sweden (SCB) has produced this report on behalf of the Swedish Nuclear Fuel and Waste Management Company (SKB), as part of the data to be used by SKB in conducting studies of potential sites. The report goes over the statistics obtainable from SCB in the form of registers and surveys. The purpose is to identify the variables that are available, and to specify their degree of geographical detail and the time series that are available. Chapter two describes the statistical registers available at SCB, registers that share the common feature that they provide total coverage, i.e. they contain all 'objects' of a given type, such as population, economic activities (e.g. from statements of employees' earnings provided to the tax authorities), vehicles, enterprises or real estate. SCB has exclusive responsibility for seven of the nine registers included in the chapter, while two registers are ordered by public authorities with statistical responsibilities. Chapter three describes statistical surveys that are conducted by SCB, with the exception of the National Forest Inventory, which is carried out by the Swedish University of Agricultural Sciences. In terms of geographical breakdown, the degree of detail in the surveys varies, but all provide some possibility of reporting data at lower than the national level. The level involved may be county, municipality, yield district, coastal district or category of enterprises, e.g. aquaculture. Six of the nine surveys included in the chapter have been ordered by public authorities with statistical responsibilities, while SCB has exclusive responsibility for the others. Chapter four presents an overview of the statistics on land use maintained by SCB. This chapter does not follow the same pattern as chapters two and three but instead gives a more general account. The conclusion can be drawn that there are good prospects that SKB can make use of SCB's data as background information or in other ways when

  3. Statistical analysis with Excel for dummies

    CERN Document Server

    Schmuller, Joseph

    2013-01-01

    Take the mystery out of statistical terms and put Excel to work! If you need to create and interpret statistics in business or classroom settings, this easy-to-use guide is just what you need. It shows you how to use Excel's powerful tools for statistical analysis, even if you've never taken a course in statistics. Learn the meaning of terms like mean and median, margin of error, standard deviation, and permutations, and discover how to interpret the statistics of everyday life. You'll learn to use Excel formulas, charts, PivotTables, and other tools to make sense of everything fro

  4. Sevelamer hydrochloride dose-dependent increase in prevalence of severe acidosis in hemodialysis patients: analysis of nationwide statistical survey in Japan.

    Science.gov (United States)

    Oka, Yoshinari; Miyazaki, Masashi; Matsuda, Hiroaki; Takatsu, Shigeko; Katsube, Ryouichi; Mori, Toshiko; Takehara, Kiyoto; Umeda, Yuzo; Uno, Futoshi

    2014-02-01

    Metabolic acidosis has a negative impact on prognosis of dialysis patients. The aim of this study was to determine the prevalence of severe metabolic acidosis in dialysis patients treated with sevelamer hydrochloride. In 2004, a nationwide survey (101,516 dialysis patients) was conducted by the Japanese Society for Dialysis Therapy. We analyzed 32,686 dialysis patients whose bicarbonate levels were measured in the survey. Sevelamer hydrochloride was prescribed to 9,231 dialysis patients while 23,455 dialysis patients were not prescribed sevelamer hydrochloride. In the present study, we defined severe acidosis as bicarbonate ... acidosis increased significantly with increased dose of sevelamer hydrochloride (R² = 0.885, P ... acidosis in 10% and 15% of patients were 3.5 g/day (95% confidence interval [95%CI], 2.8-4.4) and 7.7 g/day (95%CI = 5.9-10.9), respectively. Severe acidosis was noted in 4.5% of patients who were not treated with sevelamer hydrochloride and in 16.1% of patients treated with sevelamer hydrochloride at ≥ 5.25 g/day (P < 0.0001). The results call for careful monitoring of serum bicarbonate level in hemodialysis patients treated with sevelamer hydrochloride. © 2013 The Authors. Therapeutic Apheresis and Dialysis © 2013 International Society for Apheresis.

  5. A Statistical Analysis of Women's Perceptions on Politics and Peace ...

    African Journals Online (AJOL)

    This article is a statistical analysis of the perception that more women in politics would enhance peace building. The data was drawn from a comparative survey of 325 women and four men (community leaders) in the regions of the Niger Delta (Nigeria) and KwaZulu-Natal (South Africa). According to the findings, the ...

  6. Hypothesis testing and statistical analysis of microbiome

    Directory of Open Access Journals (Sweden)

    Yinglin Xia

    2017-09-01

    Full Text Available After the initiation of the Human Microbiome Project in 2008, various biostatistical and bioinformatic tools for data analysis and computational methods have been developed and applied to microbiome studies. In this review and perspective, we discuss the research and statistical hypotheses in gut microbiome studies, focusing on mechanistic concepts that underlie the complex relationships among host, microbiome, and environment. We review the currently available statistical tools and highlight recent progress in newly developed statistical methods and models. Given the current challenges and limitations in biostatistical approaches and tools, we discuss the future direction in developing statistical methods and models for microbiome studies.

  7. Statistical shape analysis with applications in R

    CERN Document Server

    Dryden, Ian L

    2016-01-01

    A thoroughly revised and updated edition of this introduction to modern statistical methods for shape analysis. Shape analysis is an important tool in the many disciplines where objects are compared using geometrical features. Examples include comparing brain shape in schizophrenia; investigating protein molecules in bioinformatics; and describing growth of organisms in biology. This book is a significant update of the highly regarded 'Statistical Shape Analysis' by the same authors. The new edition lays the foundations of landmark shape analysis, including geometrical concepts and statistical techniques, and extends to include analysis of curves, surfaces, images and other types of object data. Key definitions and concepts are discussed throughout, and the relative merits of different approaches are presented. The authors have included substantial new material on recent statistical developments and offer numerous examples throughout the text. Concepts are introduced in an accessible manner, while reta...

  8. Spatial analysis statistics, visualization, and computational methods

    CERN Document Server

    Oyana, Tonny J

    2015-01-01

    An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis, containing hands-on problem sets that can be worked out in MS Excel or ArcGIS, as well as detailed illustrations and numerous case studies. The book enables readers to: identify types and characterize non-spatial and spatial data; demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results; construct testable hypotheses that require inferential statistical analysis; process spatial data, extract explanatory variables, conduct statisti...

  9. Quantifying the impact of rising food prices on child mortality in India: a cross-district statistical analysis of the District Level Household Survey.

    Science.gov (United States)

    Fledderjohann, Jasmine; Vellakkal, Sukumar; Khan, Zaky; Ebrahim, Shah; Stuckler, David

    2016-04-01

    Rates of child malnutrition and mortality in India remain high. We tested the hypothesis that rising food prices are contributing to India's slow progress in improving childhood survival. Using rounds 2 and 3 (2002-08) of the Indian District Level Household Survey, we calculated neonatal, infant and under-five mortality rates in 364 districts, and merged these with district-level food price data from the National Sample Survey Office. Multivariate models were estimated, stratified into 27 less deprived states and territories and 8 deprived states ('Empowered Action Groups'). Between 2002 and 2008, the real price of food in India rose by 11.7%. A 1% increase in total food prices was associated with a 0.49% increase in neonatal (95% confidence interval (CI): 0.13% to 0.85%), but not infant or under-five mortality rates. Disaggregating by type of food and level of deprivation, in the eight deprived states, we found an elevation in neonatal mortality rates of 0.33% for each 1% increase in the price of meat (95% CI: 0.06% to 0.60%) and 0.10% for a 1% increase in dairy (95% CI: 0.01% to 0.20%). We also detected an adverse association of the price of dairy with infant (b = 0.09%; 95% CI: 0.01% to 0.16%) and under-five mortality rates (b = 0.10%; 95% CI: 0.03% to 0.17%). These associations were not detected in less deprived states and territories. Rising food prices, particularly of high-protein meat and dairy products, were associated with worse child mortality outcomes. These adverse associations were concentrated in the most deprived states. © The Author 2016. Published by Oxford University Press on behalf of the International Epidemiological Association.
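
    Statements of the form "a 1% increase in price was associated with a 0.49% increase in mortality" read like elasticities, that is, slopes of a regression on log-transformed variables. The sketch below is only an illustration of that reading, assuming a single predictor and synthetic, noise-free numbers; the function name `loglog_slope` and all values are hypothetical, and the study itself estimated multivariate models that are not reproduced here.

```python
import math

def loglog_slope(x, y):
    """OLS slope of log(y) on log(x): approximate % change in y per 1% change in x."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx

# Synthetic illustration only (not data from the study): rates built with exponent 0.49.
prices = [100.0, 105.0, 110.0, 120.0, 135.0]
rates = [40.0 * (p / 100.0) ** 0.49 for p in prices]

print(round(loglog_slope(prices, rates), 2))
```

    Because the synthetic rates are generated from a power law without noise, the fitted slope recovers the exponent used to build them; with real district data the slope would carry sampling error and would need the confidence interval reported alongside it.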

  10. Advances in statistical models for data analysis

    CERN Document Server

    Minerva, Tommaso; Vichi, Maurizio

    2015-01-01

    This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.

  11. Engaging Students in Survey Research Projects across Research Methods and Statistics Courses

    Science.gov (United States)

    Lovekamp, William E.; Soboroff, Shane D.; Gillespie, Michael D.

    2017-01-01

    One innovative way to help students make sense of survey research has been to create a multifaceted, collaborative assignment that promotes critical thinking, comparative analysis, self-reflection, and statistical literacy. We use a short questionnaire adapted from the Higher Education Research Institute's Cooperative Institutional Research…

  12. Comparative analysis of positive and negative attitudes toward statistics

    Science.gov (United States)

    Ghulami, Hassan Rahnaward; Ab Hamid, Mohd Rashid; Zakaria, Roslinazairimah

    2015-02-01

    Many statistics lecturers and statistics education researchers want to know their students' attitudes toward statistics during a statistics course. A positive attitude toward statistics is vital because it encourages students to take an interest in the course and to master the core content of the subject matter. Students with negative attitudes toward statistics, by contrast, tend to feel depressed, especially in group assignments, are at risk of failure, are often highly emotional, and struggle to move forward. This study therefore investigates students' attitudes toward learning statistics. Six latent constructs are used to measure these attitudes: affect, cognitive competence, value, difficulty, interest, and effort. The questionnaire was adopted and adapted from a reliable and validated instrument, the Survey of Attitudes towards Statistics (SATS). The study was conducted among undergraduate engineering students at Universiti Malaysia Pahang (UMP). The respondents were students from different faculties who were taking the applied statistics course. The analysis finds that the questionnaire is acceptable, and relationships among the constructs are proposed and investigated. The students show full effort to master the statistics course, find the course enjoyable, are confident in their intellectual capacity, and hold more positive than negative attitudes toward learning statistics. In conclusion, positive attitudes were mostly exhibited in the affect, cognitive competence, value, interest, and effort constructs, while negative attitudes were mostly exhibited in the difficulty construct.

  13. Classification, (big) data analysis and statistical learning

    CERN Document Server

    Conversano, Claudio; Vichi, Maurizio

    2018-01-01

    This edited book focuses on the latest developments in classification, statistical learning, data analysis and related areas of data science, including statistical analysis of large datasets, big data analytics, time series clustering, integration of data from different sources, as well as social networks. It covers both methodological aspects as well as applications to a wide range of areas such as economics, marketing, education, social sciences, medicine, environmental sciences and the pharmaceutical industry. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field. The peer-reviewed contributions were presented at the 10th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in Santa Margherita di Pul...

  14. Statistics and analysis of scientific data

    CERN Document Server

    Bonamente, Massimiliano

    2013-01-01

    Statistics and Analysis of Scientific Data covers the foundations of probability theory and statistics, and a number of numerical and analytical methods that are essential for the present-day analyst of scientific data. Topics covered include probability theory, distribution functions of statistics, fits to two-dimensional datasheets and parameter estimation, Monte Carlo methods and Markov chains. Equal attention is paid to the theory and its practical application, and results from classic experiments in various fields are used to illustrate the importance of statistics in the analysis of scientific data. The main pedagogical method is a theory-then-application approach, where emphasis is placed first on a sound understanding of the underlying theory of a topic, which becomes the basis for an efficient and proactive use of the material for practical applications. The level is appropriate for undergraduates and beginning graduate students, and as a reference for the experienced researcher. Basic calculus is us...

  15. Reproducible statistical analysis with multiple languages

    DEFF Research Database (Denmark)

    Lenth, Russell; Højsgaard, Søren

    2011-01-01

    This paper describes the system for making reproducible statistical analyses. differs from other systems for reproducible analysis in several ways. The two main differences are: (1) Several statistics programs can be used in the same document. (2) Documents can be prepared using OpenOffice or LaTeX. The main part of this paper is an example showing how to use and together in an OpenOffice text document. The paper also contains some practical considerations on the use of literate programming in statistics....

  16. Foundation of statistical energy analysis in vibroacoustics

    CERN Document Server

    Le Bot, A

    2015-01-01

    This title deals with the statistical theory of sound and vibration. The foundation of statistical energy analysis is presented in great detail. In the modal approach, an introduction to random vibration with application to complex systems having a large number of modes is provided. For the wave approach, the phenomena of propagation, group speed, and energy transport are extensively discussed. Particular emphasis is given to the emergence of diffuse field, the central concept of the theory.

  17. Statistical analysis of SAMPEX PET proton measurements

    CERN Document Server

    Pierrard, V; Heynderickx, D; Kruglanski, M; Looper, M; Blake, B; Mewaldt, D

    2000-01-01

    We present a statistical study of the distributions of proton counts from the Proton-Electron Telescope aboard the low-altitude polar satellite SAMPEX. Our statistical analysis shows that histograms of observed proton counts are generally distributed according to Poisson distributions but are sometimes quite different. The observed departures from Poisson distributions can be attributed to variations of the average flux or to the non-constancy of the detector lifetimes.
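
    One simple way to see whether counts look Poisson is the variance-to-mean ratio (index of dispersion), which is close to 1 for Poisson data and larger when the underlying rate varies. The sketch below is a minimal illustration on simulated counts, not the authors' method; the function names, the rates 2, 5 and 8, and the sample size are all assumptions chosen for the example.

```python
import math
import random
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean; near 1 for Poisson counts."""
    return variance(counts) / mean(counts)

def poisson_sample(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Simulated counts with a constant average flux (hypothetical rate).
steady = [poisson_sample(rng, 5.0) for _ in range(2000)]

# Simulated counts whose rate alternates between two levels:
# same overall mean as above, but extra variance from the changing rate.
varying = [poisson_sample(rng, 2.0 if i % 2 else 8.0) for i in range(2000)]

print("steady flux :", round(dispersion_index(steady), 2))
print("varying flux:", round(dispersion_index(varying), 2))
```

    In this simulation the constant-rate series gives an index near 1, while the alternating-rate series gives a clearly larger value, which is the kind of departure from a Poisson distribution that a varying average flux would produce.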

  18. Teaching statistics in biology: using inquiry-based learning to strengthen understanding of statistical analysis in biology laboratory courses.

    Science.gov (United States)

    Metz, Anneke M

    2008-01-01

    There is an increasing need for students in the biological sciences to build a strong foundation in quantitative approaches to data analyses. Although most science, engineering, and math field majors are required to take at least one statistics course, statistical analysis is poorly integrated into undergraduate biology course work, particularly at the lower-division level. Elements of statistics were incorporated into an introductory biology course, including a review of statistics concepts and opportunity for students to perform statistical analysis in a biological context. Learning gains were measured with an 11-item statistics learning survey instrument developed for the course. Students showed a statistically significant 25% (p < 0.005) increase in statistics knowledge after completing introductory biology. Students improved their scores on the survey after completing introductory biology, even if they had previously completed an introductory statistics course (9% improvement, p < 0.005). Students retested 1 yr after completing introductory biology showed no loss of their statistics knowledge as measured by this instrument, suggesting that the use of statistics in biology course work may aid long-term retention of statistics knowledge. No statistically significant differences in learning were detected between male and female students in the study.

  19. Opinion Polls and Statistical Surveys: What They Really Tell Us

    Indian Academy of Sciences (India)

    Author Affiliations. Rajeeva L Karandikar1 Ayanendranath Basu2. Statistics & Mathematics Unit, Indian Statistical Institute, 7, SJS Sansanwal Marg, New Delhi 110 016, India. Applied Statistics Unit, Indian Statistical Institute, 203, BT Road, Calcutta 700 035, India.

  20. Statistical analysis of network data with R

    CERN Document Server

    Kolaczyk, Eric D

    2014-01-01

    Networks have permeated everyday life through everyday realities like the Internet, social networks, and viral marketing. As such, network analysis is an important growth area in the quantitative sciences, with roots in social network analysis going back to the 1930s and graph theory going back centuries. Measurement and analysis are integral components of network research. As a result, statistical methods play a critical role in network analysis. This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. This text builds on Eric D. Kolaczyk’s book Statistical Analysis of Network Data (Springer, 2009).

  1. Statistics and analysis of scientific data

    CERN Document Server

    Bonamente, Massimiliano

    2017-01-01

    The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include: • a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets. • a new chapter on the various measures of the mean including logarithmic averages. • new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors. • a new case study and additional worked examples. • mathematical derivations and theoretical background material have been appropriately marked, to improve the readabili...

  2. The fuzzy approach to statistical analysis

    NARCIS (Netherlands)

    Coppi, Renato; Gil, Maria A.; Kiers, Henk A. L.

    2006-01-01

    For the last decades, research studies have been developed in which a coalition of Fuzzy Sets Theory and Statistics has been established with different purposes. These namely are: (i) to introduce new data analysis problems in which the objective involves either fuzzy relationships or fuzzy terms;

  3. Selected papers on analysis, probability, and statistics

    CERN Document Server

    Nomizu, Katsumi

    1994-01-01

    This book presents papers that originally appeared in the Japanese journal Sugaku. The papers fall into the general area of mathematical analysis as it pertains to probability and statistics, dynamical systems, differential equations and analytic function theory. Among the topics discussed are: stochastic differential equations, spectra of the Laplacian and Schrödinger operators, nonlinear partial differential equations which generate dissipative dynamical systems, fractal analysis on self-similar sets and the global structure of analytic functions.

  4. Statistical analysis of next generation sequencing data

    CERN Document Server

    Nettleton, Dan

    2014-01-01

    Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...

  5. Statistical Tools for Forensic Analysis of Toolmarks

    Energy Technology Data Exchange (ETDEWEB)

    David Baldwin; Max Morris; Stan Bajic; Zhigang Zhou; James Kreiser

    2004-04-22

    Recovery and comparison of toolmarks, footprint impressions, and fractured surfaces connected to a crime scene are of great importance in forensic science. The purpose of this project is to provide statistical tools for the validation of the proposition that particular manufacturing processes produce marks on the work-product (or tool) that are substantially different from tool to tool. The approach to validation involves the collection of digital images of toolmarks produced by various tool manufacturing methods on produced work-products and the development of statistical methods for data reduction and analysis of the images. The developed statistical methods provide a means to objectively calculate a "degree of association" between matches of similarly produced toolmarks. The basis for statistical method development relies on "discriminating criteria" that examiners use to identify features and spatial relationships in their analysis of forensic samples. The developed data reduction algorithms utilize the same rules used by examiners for classification and association of toolmarks.

  6. Multivariate analysis: A statistical approach for computations

    Science.gov (United States)

    Michu, Sachin; Kaushik, Vandana

    2014-10-01

    Multivariate analysis is a statistical approach commonly used in automotive diagnosis, education, evaluating clusters in finance, etc., and more recently in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion about factor analysis (FA) in image retrieval methods and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis proposes an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include various attacks on the network, such as DDoS attacks and network scanning.

  7. 2016 Workplace and Gender Relations Survey of Active Duty Members: Statistical Methodology Report

    Science.gov (United States)

    2017-03-01

    MEMBERS: STATISTICAL METHODOLOGY REPORT Office of People Analytics (OPA) Defense Research, Surveys, and Statistics Center 4800 Mark Center Drive...Introduction The Defense Research, Surveys, and Statistics Center, Office of People Analytics (OPA), conducts both web-based and paper-and-pen surveys to...the 2014 RMWS. Both weighting methodologies used the statistical computing software R and specifically functions from the packages “gbm” (Ridgeway

  8. Vapor Pressure Data Analysis and Statistics

    Science.gov (United States)

    2016-12-01

    there were flaws in the original data prior to its publication. 3. FITTING METHODS Our process for correlating experimental vapor pressure ...2. Penski, E.C. Vapor Pressure Data Analysis Methodology, Statistics, and Applications; CRDEC-TR-386; U.S. Army Chemical Research, Development, and... Chemical Biological Center: Aberdeen Proving Ground, MD, 2006; UNCLASSIFIED Report (ADA447993). 11. Kemme, H.R.; Kreps, S.I. Vapor Pressure of

  9. Statistical analysis of brake squeal noise

    Science.gov (United States)

    Oberst, S.; Lai, J. C. S.

    2011-06-01

    Despite substantial research efforts applied to the prediction of brake squeal noise since the early 20th century, the mechanisms behind its generation are still not fully understood. Squealing brakes are of significant concern to the automobile industry, mainly because of the costs associated with warranty claims. In order to remedy the problems inherent in designing quieter brakes and, therefore, to understand the mechanisms, a design of experiments study, using a noise dynamometer, was performed by a brake system manufacturer to determine the influence of geometrical parameters (namely, the number and location of slots) of brake pads on brake squeal noise. The experimental results were evaluated with a noise index and ranked for warm and cold brake stops. These data are analysed here using statistical descriptors based on population distributions, and a correlation analysis, to gain greater insight into the functional dependency between the time-averaged friction coefficient as the input and the peak sound pressure level data as the output quantity. The correlation analysis between the time-averaged friction coefficient and peak sound pressure data is performed by applying a semblance analysis and a joint recurrence quantification analysis. Linear measures are compared with complexity measures (nonlinear) based on statistics from the underlying joint recurrence plots. Results show that linear measures cannot be used to rank the noise performance of the four test pad configurations. On the other hand, the ranking of the noise performance of the test pad configurations based on the noise index agrees with that based on nonlinear measures: the higher the nonlinearity between the time-averaged friction coefficient and peak sound pressure, the worse the squeal. These results highlight the nonlinear character of brake squeal and indicate the potential of using nonlinear statistical analysis tools to analyse disc brake squeal.

  10. The CALORIES trial: statistical analysis plan.

    Science.gov (United States)

    Harvey, Sheila E; Parrott, Francesca; Harrison, David A; Mythen, Michael; Rowan, Kathryn M

    2014-12-01

    The CALORIES trial is a pragmatic, open, multicentre, randomised controlled trial (RCT) of the clinical effectiveness and cost-effectiveness of early nutritional support via the parenteral route compared with early nutritional support via the enteral route in unplanned admissions to adult general critical care units (CCUs) in the United Kingdom. The trial derives from the need for a large, pragmatic RCT to determine the optimal route of delivery for early nutritional support in the critically ill. To describe the proposed statistical analyses for the evaluation of the clinical effectiveness in the CALORIES trial. With the primary and secondary outcomes defined precisely and the approach to safety monitoring and data collection summarised, the planned statistical analyses, including prespecified subgroups and secondary analyses, were developed and are described. The primary outcome is all-cause mortality at 30 days. The primary analysis will be reported as a relative risk and absolute risk reduction and tested with the Fisher exact test. Prespecified subgroup analyses will be based on age, degree of malnutrition, acute severity of illness, mechanical ventilation at admission to the CCU, presence of cancer and time from CCU admission to commencement of early nutritional support. Secondary analyses include adjustment for baseline covariates. In keeping with best trial practice, we have developed, described and published a statistical analysis plan for the CALORIES trial and are placing it in the public domain before inspecting data from the trial.
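
    The primary analysis named above (relative risk, absolute risk reduction, Fisher exact test) can be sketched for a single 2x2 table as follows. This is a minimal illustration, not the trial's analysis code: the function names are hypothetical, the arm sizes and death counts are invented for the example, and the two-sided p-value uses the common convention of summing all tables no more probable than the observed one.

```python
from math import comb

def risk_summary(events_treat, n_treat, events_ctrl, n_ctrl):
    """Return (relative risk, absolute risk reduction) for treatment vs control."""
    risk_treat = events_treat / n_treat
    risk_ctrl = events_ctrl / n_ctrl
    return risk_treat / risk_ctrl, risk_ctrl - risk_treat

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical counts for illustration only: deaths at 30 days per arm.
deaths_a, n_a = 30, 100   # arm A
deaths_b, n_b = 40, 100   # arm B

rr, arr = risk_summary(deaths_a, n_a, deaths_b, n_b)
p = fisher_exact_two_sided(deaths_a, n_a - deaths_a, deaths_b, n_b - deaths_b)
print(f"RR = {rr:.2f}, ARR = {arr:.2f}, Fisher p = {p:.3f}")
```

    Exact integer arithmetic via `math.comb` keeps the hypergeometric probabilities accurate even for arms with thousands of patients; for routine work an established statistics library would normally be preferred over a hand-written test.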

  11. Statistical analysis of sleep spindle occurrences.

    Science.gov (United States)

    Panas, Dagmara; Malinowska, Urszula; Piotrowski, Tadeusz; Żygierewicz, Jarosław; Suffczyński, Piotr

    2013-01-01

    Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Currently spindles are often classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles and the time occurrences of spindles were used for statistical analysis. Gamma distribution parameters were fit to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but this dissociation is not spindle-type specific. Although this dissociation is not topographically specific, a single generator for all spindle types appears unlikely.
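
    The two ingredients mentioned above, gamma parameters for the inter-spindle intervals and a lag-1 correlation between successive intervals, can be sketched as below. This is a simplified stand-in under stated assumptions: the gamma fit uses the method of moments (the abstract does not say which estimator was used), the correlation is the plain lag-1 serial correlation coefficient rather than the modified Wald-Wolfowitz test, and the intervals are simulated, not EEG data.

```python
import random
from statistics import mean, pvariance

def gamma_moments_fit(intervals):
    """Method-of-moments gamma fit: returns (shape, scale)."""
    m = mean(intervals)
    v = pvariance(intervals)
    return m * m / v, v / m

def lag1_correlation(intervals):
    """Serial correlation between each interval and the next one."""
    m = mean(intervals)
    num = sum((x - m) * (y - m) for x, y in zip(intervals, intervals[1:]))
    den = sum((x - m) ** 2 for x in intervals)
    return num / den

# Simulated independent inter-event intervals (hypothetical shape 2, scale 3 seconds).
rng = random.Random(1)
intervals = [rng.gammavariate(2.0, 3.0) for _ in range(5000)]

shape, scale = gamma_moments_fit(intervals)
print("shape ~", round(shape, 2), "scale ~", round(scale, 2))
print("lag-1 correlation ~", round(lag1_correlation(intervals), 3))
```

    For independent intervals the lag-1 correlation stays near zero, so a value well away from zero would suggest that successive intervals are not generated independently.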

  12. Coupling strength assumption in statistical energy analysis

    Science.gov (United States)

    Lafont, T.; Totaro, N.; Le Bot, A.

    2017-04-01

    This paper is a discussion of the hypothesis of weak coupling in statistical energy analysis (SEA). The examples of coupled oscillators and statistical ensembles of coupled plates excited by broadband random forces are discussed. In each case, a reference calculation is compared with the SEA calculation. First, it is shown that the main SEA relation, the coupling power proportionality, is always valid for two oscillators irrespective of the coupling strength. But the case of three subsystems, consisting of oscillators or ensembles of plates, indicates that the coupling power proportionality fails when the coupling is strong. Strong coupling leads to non-zero indirect coupling loss factors and, sometimes, even to a reversal of the energy flow direction from low to high vibrational temperature.

  13. 2015 Workplace and Gender Relations Survey of Reserve Component Members: Statistical Methodology Report

    Science.gov (United States)

    2016-03-17

    completion use the same methodology as Step 1 (CHAID and logistic model). Step 3: Create final weights – The weights were poststratified to match...2015 Workplace and Gender Relations Survey of Reserve Component Members Statistical Methodology Report Additional copies of this report...RESERVE COMPONENT MEMBERS: STATISTICAL METHODOLOGY REPORT Defense Research, Surveys, and Statistics Center (RSSC) Defense Manpower Data Center

  14. Cross-linked survey analysis is an approach for separating cause and effect in survey research.

    Science.gov (United States)

    Redelmeier, Donald A; Thiruchelvam, Deva; Lustig, Andrew J

    2015-01-01

    We developed a new research approach, called cross-linked survey analysis, to explore how an acute exposure might lead to changes in survey responses. The goal was to identify associations between exposures and outcomes while reducing some ambiguities related to interpreting cause and effect in survey responses from a population-based community questionnaire. Cross-linked survey analysis differs from a cross-sectional, longitudinal, and panel survey analysis by individualizing the timeline to the unique history of each respondent. Cross-linked survey analysis, unlike a repeated-measures self-matching design, does not track changes in a repeated survey question given to the same respondent at multiple time points. Pilot data from three analyses (n = 1,177 respondents) illustrate how a cross-linked survey analysis can control for population shifts, temporal trends, and reverse causality. Accompanying graphs provide an intuitive display to readers, summarize results, and show differences in response distributions. Population-based individual-level linkages also reduce selection bias and increase statistical power compared with a single-center cross-sectional survey. Cross-linked survey analysis has limitations related to unmeasured confounding, pragmatics, survivor bias, statistical models, and the underlying artifacts in survey responses. We suggest that a cross-linked survey analysis may help in epidemiology science using survey data. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. Complex Surveys A Guide to Analysis Using R

    CERN Document Server

    Lumley, Thomas

    2010-01-01

    A complete guide to carrying out complex survey analysis using R. As survey analysis continues to serve as a core component of sociological research, researchers are increasingly relying upon data gathered from complex surveys to carry out traditional analyses. Complex Surveys is a practical guide to the analysis of this kind of data using R, the freely available and downloadable statistical programming language. As creator of the specific survey package for R, the author provides the ultimate presentation of how to successfully use the software for analyzing data from complex surveys while al

  16. Analysis of Preference Data Using Intermediate Test Statistic ...

    African Journals Online (AJOL)

    Intermediate statistic is a link between Friedman test statistic and the multinomial statistic. The statistic is based on ranking in a selected number of treatments, not necessarily all alternatives. We show that this statistic is transitive to well-known test statistic being used for analysis of preference data. Specifically, it is shown ...

  17. Statistical analysis of solar proton events

    Directory of Open Access Journals (Sweden)

    V. Kurt

    2004-06-01

    Full Text Available A new catalogue of 253 solar proton events (SPEs) with energy >10 MeV and peak intensity >10 protons/cm2.s.sr (pfu) at the Earth's orbit for three complete 11-year solar cycles (1970-2002) is given. A statistical analysis of this data set of SPEs and their associated flares that occurred during this time period is presented. It is outlined that 231 of these proton events are flare related and only 22 of them are not associated with Hα flares. It is also noteworthy that 42 of these events are registered as Ground Level Enhancements (GLEs) in neutron monitors. The longitudinal distribution of the associated flares shows that a great number of these events are connected with west flares. This analysis enables one to understand the long-term dependence of the SPEs and the related flare characteristics on the solar cycle which are useful for space weather prediction.

  18. A statistical evaluation of factors influencing aerial survey results on brown bears

    Data.gov (United States)

    US Fish and Wildlife Service, Department of the Interior — This report is a statistical evaluation of factors influencing aerial survey results on Brown Bears. The purpose of this study was to provide a statistical...

  19. Wavelet and statistical analysis for melanoma classification

    Science.gov (United States)

    Nimunkar, Amit; Dhawan, Atam P.; Relue, Patricia A.; Patwardhan, Sachin V.

    2002-05-01

    The present work focuses on spatial/frequency analysis of epiluminescence images of dysplastic nevus and melanoma. A three-level wavelet decomposition was performed on skin-lesion images to obtain coefficients in the wavelet domain. A total of 34 features were obtained by computing ratios of the mean, variance, energy and entropy of the wavelet coefficients along with the mean and standard deviation of image intensity. An unpaired t-test for normally distributed features and the Wilcoxon rank-sum test for non-normally distributed features were performed to select statistically correlated features. For our data set, the statistical analysis of features reduced the feature set from 34 to 5 features. For classification, the discriminant functions were computed in the feature space using the Mahalanobis distance. ROC curves were generated and evaluated for false positive fraction from 0.1 to 0.4. Most of the discriminant functions provided a true positive rate for melanoma of 93% with a false positive rate up to 21%.

  20. Statistical analysis of tourism destination competitiveness

    Directory of Open Access Journals (Sweden)

    Attilio Gardini

    2013-05-01

    Full Text Available The growing relevance of the tourism industry for modern advanced economies has increased the interest among researchers and policy makers in the statistical analysis of destination competitiveness. In this paper we outline a new model of destination competitiveness based on sound theoretical grounds and we develop a statistical test of the model on sample data based on Italian tourist destination decisions and choices. Our model focuses on the tourism decision process which starts from the demand schedule for holidays and ends with the choice of a specific holiday destination. The demand schedule is a function of individual preferences and of destination positioning, while the final decision is a function of the initial demand schedule and the information concerning services for accommodation and recreation in the selected destinations. Moreover, we extend previous studies that focused on image or attributes (such as climate and scenery) by paying more attention to the services for accommodation and recreation in the holiday destinations. We test the proposed model using empirical data collected from a sample of 1,200 Italian tourists interviewed in 2007 (October - December). Data analysis shows that the selection probability for the destination included in the consideration set is not proportional to the share of inclusion because the share of inclusion is determined by the brand image, while the selection of the effective holiday destination is influenced by the real supply conditions. The analysis of Italian tourists' preferences underlines the existence of a latent demand for foreign holidays which points out a risk of market share reduction for the Italian tourism system in the global market. We also find a snowball effect which helps the most popular destinations, mainly in the northern Italian regions.

  1. Multivariate statistical analysis of wildfires in Portugal

    Science.gov (United States)

    Costa, Ricardo; Caramelo, Liliana; Pereira, Mário

    2013-04-01

    Several studies demonstrate that wildfires in Portugal present high temporal and spatial variability as well as cluster behavior (Pereira et al., 2005, 2011). This study aims to contribute to the characterization of the fire regime in Portugal with the multivariate statistical analysis of the time series of number of fires and area burned in Portugal during the 1980 - 2009 period. The data used in the analysis is an extended version of the Rural Fire Portuguese Database (PRFD) (Pereira et al., 2011), provided by the National Forest Authority (Autoridade Florestal Nacional, AFN), the Portuguese Forest Service, which includes information for more than 500,000 fire records. There are many advanced techniques for examining the relationships among multiple time series at the same time (e.g., canonical correlation analysis, principal components analysis, factor analysis, path analysis, multiple analyses of variance, clustering systems). This study compares and discusses the results obtained with these different techniques. Pereira, M.G., Trigo, R.M., DaCamara, C.C., Pereira, J.M.C., Leite, S.M., 2005: "Synoptic patterns associated with large summer forest fires in Portugal". Agricultural and Forest Meteorology. 129, 11-25. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011. This work is supported by European Union Funds (FEDER/COMPETE - Operational Competitiveness Programme) and by national funds (FCT - Portuguese Foundation for Science and Technology) under the project FCOMP-01-0124-FEDER-022692, the project FLAIR (PTDC/AAC-AMB/104702/2008) and the EU 7th Framework Program through FUME (contract number 243888).

  2. Statistical analysis of sleep spindle occurrences.

    Directory of Open Access Journals (Sweden)

    Dagmara Panas

    Full Text Available Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Currently spindles are often classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles and the time occurrences of spindles were used for statistical analysis. Gamma distribution parameters were fit to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but this dissociation is not spindle-type specific. Although this dissociation is not topographically specific, a single generator for all spindle types appears unlikely.

  3. A survey of statistics in three UK general practice journal

    OpenAIRE

    Rigby, A S; Armstrong, G K; Campbell, M J; Summerton, N

    2004-01-01

    Abstract Background Many medical specialities have reviewed the statistical content of their journals. To our knowledge this has not been done in general practice. Given the main role of a general practitioner as a diagnostician we thought it would be of interest to see whether the statistical methods reported reflect the diagnostic process. Methods Hand search of three UK journals of general practice namely the British Medical Journal (general practice section), British Journal of General Pr...

  4. Statistical Analysis of Bus Networks in India

    CERN Document Server

    Chatterjee, Atanu; Ramadurai, Gitakrishnan

    2015-01-01

    Through the past decade the field of network science has established itself as a common ground for the cross-fertilization of exciting inter-disciplinary studies which has motivated researchers to model almost every physical system as an interacting network consisting of nodes and links. Although public transport networks such as airline and railway networks have been extensively studied, the status of bus networks still remains in obscurity. In developing countries like India, where bus networks play an important role in day-to-day commutation, it is of significant interest to analyze its topological structure and answer some of the basic questions on its evolution, growth, robustness and resiliency. In this paper, we model the bus networks of major Indian cities as graphs in L-space, and evaluate their various statistical properties using concepts from network science. Our analysis reveals a wide spectrum of network topology with the common underlying feature of small-world property. We observe tha...

  5. A survey of statistics in three UK general practice journal

    Directory of Open Access Journals (Sweden)

    Campbell Michael J

    2004-12-01

    Full Text Available Abstract Background Many medical specialities have reviewed the statistical content of their journals. To our knowledge this has not been done in general practice. Given the main role of a general practitioner as a diagnostician we thought it would be of interest to see whether the statistical methods reported reflect the diagnostic process. Methods Hand search of three UK journals of general practice namely the British Medical Journal (general practice section), British Journal of General Practice and Family Practice over a one-year period (1 January to 31 December 2000). Results A wide variety of statistical techniques were used. The most common methods included t-tests and Chi-squared tests. There were few articles reporting likelihood ratios and other useful diagnostic methods. There was evidence that the journals with the more thorough statistical review process reported a more complex and wider variety of statistical techniques. Conclusions The BMJ had a wider range and greater diversity of statistical methods than the other two journals. However, in all three journals there was a dearth of papers reflecting the diagnostic process. Across all three journals there were relatively few papers describing randomised controlled trials thus recognising the difficulty of implementing this design in general practice.

  6. Statistics Analysis Measures Painting of Cooling Tower

    Directory of Open Access Journals (Sweden)

    A. Zacharopoulou

    2013-01-01

    Full Text Available This study refers to the cooling tower of Megalopolis (construction 1975) and its protection from a corrosive environment. The maintenance of the cooling tower took place in 2008. The cooling tower was badly damaged from corrosion of reinforcement. The parabolic cooling towers (factory of electrical power) are a typical example of construction, which has a special aggressive environment. The protection of cooling towers is usually achieved through organic coatings. Because of the different environmental impacts on the internal and external side of the cooling tower, a different system of paint application is required. The present study refers to the damages caused by the corrosion process. The corrosive environments, the application of this painting, the quality control process, the measures and statistics analysis, and the results were discussed in this study. In the process of quality control the following measurements were taken into consideration: (1) examination of the adhesion with the cross-cut test, (2) examination of the film thickness, and (3) controlling of the pull-off resistance for concrete substrates and paintings. Finally, this study refers to the correlations of measurements, analysis of failures in relation to the quality of repair, and rehabilitation of the cooling tower. Also this study made a first attempt to apply the specific corrosion inhibitors in such a large structure.

  7. Void statistics of the CfA redshift survey

    Science.gov (United States)

    Vogeley, Michael S.; Geller, Margaret J.; Huchra, John P.

    1991-01-01

    Clustering properties of two samples from the CfA redshift survey, each containing about 2500 galaxies, are studied. A comparison of the velocity distributions via a K-S test reveals structure on scales comparable with the extent of the survey. The void probability function (VPF) is employed for these samples to examine the structure and to test for scaling relations in the galaxy distribution. The galaxy correlation function is calculated via moments of galaxy counts. The shape and amplitude of the correlation function roughly agree with previous determinations. The VPFs for distance-limited samples of the CfA survey do not match the scaling relation predicted by the hierarchical clustering models. On scales not greater than 10/h Mpc, the VPFs for these samples roughly follow the hierarchical pattern. A variant of the VPF which uses nearly all the data in magnitude-limited samples is introduced; it accounts for the variation of the sampling density with velocity in a magnitude-limited survey.

  8. Transit safety & security statistics & analysis 2002 annual report (formerly SAMIS)

    Science.gov (United States)

    2004-12-01

    The Transit Safety & Security Statistics & Analysis 2002 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administration's (FTA's) National Tr...

  9. Transit safety & security statistics & analysis 2003 annual report (formerly SAMIS)

    Science.gov (United States)

    2005-12-01

    The Transit Safety & Security Statistics & Analysis 2003 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administration's (FTA's) National Tr...

  10. Surveys Assessing Students' Attitudes toward Statistics: A Systematic Review of Validity and Reliability

    Science.gov (United States)

    Nolan, Meaghan M.; Beran, Tanya; Hecker, Kent G.

    2012-01-01

    Students with positive attitudes toward statistics are likely to show strong academic performance in statistics courses. Multiple surveys measuring students' attitudes toward statistics exist; however, a comparison of the validity and reliability of interpretations based on their scores is needed. A systematic review of relevant electronic…

  11. Survey of Native English Speakers and Spanish-Speaking English Language Learners in Tertiary Introductory Statistics

    Science.gov (United States)

    Lesser, Lawrence M.; Wagler, Amy E.; Esquinca, Alberto; Valenzuela, M. Guadalupe

    2013-01-01

    The framework of linguistic register and case study research on Spanish-speaking English language learners (ELLs) learning statistics informed the construction of a quantitative instrument, the Communication, Language, And Statistics Survey (CLASS). CLASS aims to assess whether ELLs and non-ELLs approach the learning of statistics differently with…

  12. The Business Of Urban Animals Survey: the facts and statistics on companion animals in Canada.

    Science.gov (United States)

    Perrin, Terri

    2009-01-01

    At the first Banff Summit for Urban Animal Strategies (BSUAS) in 2006, delegates clearly indicated that a lack of reliable Canadian statistics hampers municipal leaders and legislators in their efforts to develop urban animal strategies that create and sustain a healthy community for pets and people. To gain a better understanding of the situation, BSUAS municipal delegates and other industry stakeholders partnered with Ipsos Reid, one of the world's leading polling firms, to conduct a national survey on the "Business of Urban Animals." The results of the survey, summarized in this article, were presented at the BSUAS meeting in October 2008. In addition, each participating community will receive a comprehensive written analysis, as well as a customized report. The online survey was conducted from September 22 to October 1, 2008. There were 7208 participants, including 3973 pet and 3235 non-pet owners from the Ipsos-Reid's proprietary Canadian online panel. The national results were weighted to reflect the true population distribution across Canada and the panel was balanced on all major demographics to mirror Statistics Canada census information. The margin of error for the national results is +/- 1.15%.
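
    The reported margin of error can be checked with a short calculation. The sketch below is an illustration only: it assumes a simple random sample, a 95% confidence level (z = 1.96) and the worst-case proportion p = 0.5, none of which the abstract states, and it ignores the weighting the survey applied. The function name is hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval for a proportion.

    Assumes simple random sampling; p=0.5 is the worst case, z=1.96 is 95% confidence.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# 7208 participants, as reported in the abstract above.
moe = margin_of_error(7208)
print(f"{100 * moe:.2f}%")  # prints "1.15%"
```

    Under these assumptions the figure matches the 1.15% quoted for the national results; the published value may have been derived differently.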

  13. Statistical network analysis for analyzing policy networks

    DEFF Research Database (Denmark)

    Robins, Garry; Lewis, Jenny; Wang, Peng

    2012-01-01

    To analyze social network data using standard statistical approaches is to risk incorrect inference. The dependencies among observations implied in a network conceptualization undermine standard assumptions of the usual general linear models. One of the most quickly expanding areas of social...... and policy network methodology is the development of statistical modeling approaches that can accommodate such dependent data. In this article, we review three network statistical methods commonly used in the current literature: quadratic assignment procedures, exponential random graph models (ERGMs...

  14. Statistical Analysis of Bus Networks in India.

    Science.gov (United States)

    Chatterjee, Atanu; Manohar, Manju; Ramadurai, Gitakrishnan

    2016-01-01

    In this paper, we model the bus networks of six major Indian cities as graphs in L-space, and evaluate their various statistical properties. While airline and railway networks have been extensively studied, a comprehensive study on the structure and growth of bus networks is lacking. In India, where bus transport plays an important role in day-to-day commutation, it is of significant interest to analyze its topological structure and answer basic questions on its evolution, growth, robustness and resiliency. Although the common feature of small-world property is observed, our analysis reveals a wide spectrum of network topologies arising due to significant variation in the degree-distribution patterns in the networks. We also observe that these networks, although robust and resilient to random attacks, are particularly degree-sensitive. Unlike real-world networks, such as Internet, WWW and airline, that are virtual, bus networks are physically constrained. Our findings therefore throw light on the evolution of such geographically constrained networks that will help us in designing more efficient bus networks in the future.
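
    A minimal sketch of the L-space representation named in the abstract, assuming the common convention in which each stop is a node and two stops are linked when they are consecutive on at least one route. The route lists and function names are hypothetical and do not come from the paper.

```python
from collections import defaultdict

def build_l_space(routes):
    """Build an undirected L-space graph: stops are nodes, consecutive stops share an edge."""
    adjacency = defaultdict(set)
    for stops in routes:
        for a, b in zip(stops, stops[1:]):
            if a != b:  # ignore repeated stops that would create self-loops
                adjacency[a].add(b)
                adjacency[b].add(a)
    return adjacency

def degree_distribution(adjacency):
    """Map each degree k to the number of stops that have exactly k neighbours."""
    counts = defaultdict(int)
    for neighbours in adjacency.values():
        counts[len(neighbours)] += 1
    return dict(sorted(counts.items()))

# Three made-up routes sharing the interchange stop "C".
routes = [
    ["A", "B", "C", "D"],
    ["E", "C", "F"],
    ["B", "C", "G"],
]
graph = build_l_space(routes)
print(degree_distribution(graph))  # {1: 5, 2: 1, 5: 1}
```

    The degree distribution computed this way is the kind of quantity the abstract refers to when it describes the networks as degree-sensitive: removing the single high-degree stop "C" disconnects the toy network.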

  15. Developments in statistical analysis in quantitative genetics

    DEFF Research Database (Denmark)

    Sorensen, Daniel

    2009-01-01

    A remarkable research impetus has taken place in statistical genetics since the last World Conference. This has been stimulated by breakthroughs in molecular genetics, automated data-recording devices and computer-intensive statistical methods. The latter were revolutionized by the bootstrap and ...

  16. Statistical analysis on the factors affecting agricultural landowners’ willingness to enroll in a tree planting program

    Science.gov (United States)

    Taeyoung Kim; Christian Langpap

    2015-01-01

    This report provides a statistical analysis of the data collected from two survey regions of the United States, the Pacific Northwest and the Southeast. The survey asked about individual agricultural landowners’ characteristics, characteristics of their land, and the landowners’ willingness to enroll in a tree planting program under incentive payments for carbon...

  17. Analysis of Preference Data Using Intermediate Test Statistic Abstract

    African Journals Online (AJOL)

    PROF. O. E. OSUAGWU

    2013-06-01

    Jun 1, 2013 ... We show that this statistic is transitive to well-known test statistic being used for analysis of preference data. Specifically, it is shown that our link is equivalent to the ... Keywords:-Preference data, Friedman statistic, multinomial test statistic, intermediate test ... favourable ones would not be a big issue in.

  18. Statistical Analysis of Data for Timber Strengths

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard; Hoffmeyer, P.

    Statistical analyses are performed for material strength parameters from approximately 6700 specimens of structural timber. Non-parametric statistical analyses and fits to the following distributions types have been investigated: Normal, Lognormal, 2 parameter Weibull and 3-parameter Weibull. The......-parameter Weibull (and Normal) distributions give the best fits to the data available, especially if tail fits are used whereas the LogNormal distribution generally gives poor fit and larger coefficients of variation, especially if tail fits are used........ The statistical fits have generally been made using all data (100%) and the lower tail (30%) of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. 8 different databases are analysed. The results show that 2...

  19. Statistical convergence, selection principles and asymptotic analysis

    Energy Technology Data Exchange (ETDEWEB)

    Di Maio, G. [Dipartimento di Matematica, Seconda Universita di Napoli, Via Vivaldi 43, 81100 Caserta (Italy)], E-mail: giuseppe.dimaio@unina2.it; Djurcic, D. [Technical Faculty, University of Kragujevac, Svetog Save 65, 32000 Cacak (Serbia)], E-mail: dragandj@tfc.kg.ac.yu; Kocinac, Lj.D.R. [Faculty of Sciences and Mathematics, University of Nis, Visegradska 33, 18000 Nis (Serbia)], E-mail: lkocinac@ptt.rs; Zizovic, M.R. [Technical Faculty, University of Kragujevac, Svetog Save 65, 32000 Cacak (Serbia)], E-mail: zizo@tfc.kg.ac.yu

    2009-12-15

    We consider the set S of sequences of positive real numbers in the context of statistical convergence/divergence and show that some subclasses of S have certain nice selection and game-theoretic properties.

  20. Using spatiotemporal statistical models to estimate animal abundance and infer ecological dynamics from survey counts

    Science.gov (United States)

    Conn, Paul B.; Johnson, Devin S.; Ver Hoef, Jay M.; Hooten, Mevin B.; London, Joshua M.; Boveng, Peter L.

    2015-01-01

    Ecologists often fit models to survey data to estimate and explain variation in animal abundance. Such models typically require that animal density remains constant across the landscape where sampling is being conducted, a potentially problematic assumption for animals inhabiting dynamic landscapes or otherwise exhibiting considerable spatiotemporal variation in density. We review several concepts from the burgeoning literature on spatiotemporal statistical models, including the nature of the temporal structure (i.e., descriptive or dynamical) and strategies for dimension reduction to promote computational tractability. We also review several features as they specifically relate to abundance estimation, including boundary conditions, population closure, choice of link function, and extrapolation of predicted relationships to unsampled areas. We then compare a suite of novel and existing spatiotemporal hierarchical models for animal count data that permit animal density to vary over space and time, including formulations motivated by resource selection and allowing for closed populations. We gauge the relative performance (bias, precision, computational demands) of alternative spatiotemporal models when confronted with simulated and real data sets from dynamic animal populations. For the latter, we analyze spotted seal (Phoca largha) counts from an aerial survey of the Bering Sea where the quantity and quality of suitable habitat (sea ice) changed dramatically while surveys were being conducted. Simulation analyses suggested that multiple types of spatiotemporal models provide reasonable inference (low positive bias, high precision) about animal abundance, but have potential for overestimating precision. Analysis of spotted seal data indicated that several model formulations, including those based on a log-Gaussian Cox process, had a tendency to overestimate abundance. By contrast, a model that included a population closure assumption and a scale prior on total

  1. Statistical analysis of microbiological diagnostic tests

    Directory of Open Access Journals (Sweden)

    C P Baveja

    2017-01-01

    Full Text Available No study in medical science is complete without application of statistical principles. Incorrect application of statistical tests causes incorrect interpretation of the study results obtained through hard work. Yet statistics remains one of the most neglected and loathed areas, probably due to the lack of understanding of the basic principles. In microbiology, rapid progress is being made in the field of diagnostic tests, and a huge number of the studies being conducted are related to the evaluation of these tests. Therefore, a good knowledge of statistical principles will aid a microbiologist to plan, conduct and interpret the result. The initial part of this review discusses the study designs, types of variables, principles of sampling, calculation of sample size, types of errors and power of the study. Subsequently, the performance characteristics of a diagnostic test, the receiver operator characteristic curve and tests of significance are explained. Lack of a perfect gold standard test against which our test is being compared can hamper the study results; thus, it becomes essential to apply the remedial measures described here. Rapid computerisation has made statistical calculations much simpler, obviating the need for the routine researcher to rote learn the derivations and apply the complex formulae. Thus, greater focus has been laid on developing an understanding of principles. Finally, it should be kept in mind that a diagnostic test may show exemplary statistical results, yet it may not be useful in the routine laboratory or in the field; thus, its operational characteristics are as important as the statistical results.
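
    The performance characteristics of a diagnostic test mentioned above are usually computed from a 2x2 table against a reference standard. The sketch below uses the standard textbook definitions; the counts and the function name are invented for illustration and are not taken from the review.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard performance measures from a 2x2 table (test result vs. reference standard).

    Raises ZeroDivisionError for degenerate tables (e.g. specificity of exactly 1).
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_positive": sensitivity / (1.0 - specificity),
        "lr_negative": (1.0 - sensitivity) / specificity,
    }

# Hypothetical counts: 100 diseased and 200 non-diseased specimens.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Predictive values depend on the prevalence in the sample, whereas sensitivity, specificity and the likelihood ratios do not, which is one reason the latter are preferred when comparing tests across settings.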

  2. The U.S. geological survey rass-statpac system for management and statistical reduction of geochemical data

    Science.gov (United States)

    VanTrump, G.; Miesch, A.T.

    1977-01-01

    RASS is an acronym for Rock Analysis Storage System and STATPAC, for Statistical Package. The RASS and STATPAC computer programs are integrated into the RASS-STATPAC system for the management and statistical reduction of geochemical data. The system, in its present form, has been in use for more than 9 yr by scores of U.S. Geological Survey geologists, geochemists, and other scientists engaged in a broad range of geologic and geochemical investigations. The principal advantage of the system is the flexibility afforded the user both in data searches and retrievals and in the manner of statistical treatment of data. The statistical programs provide for most types of statistical reduction normally used in geochemistry and petrology, but also contain bridges to other program systems for statistical processing and automatic plotting. © 1977.

  3. Statistical Analysis of Data for Timber Strengths

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard

    2003-01-01

    . The statistical fits have generally been made using all data and the lower tail of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. The results show that the 2-parameter Weibull distribution gives the best...... fits to the data available, especially if tail fits are used whereas the Log Normal distribution generally gives a poor fit and larger coefficients of variation, especially if tail fits are used. The implications on the reliability level of typical structural elements and on partial safety factors...

  4. Information sources of company's competitive environment statistical analysis

    OpenAIRE

    Khvostenko, O.

    2010-01-01

    The article is dedicated to a problem of the company's competitive environment statistical analysis and its information sources. The main features of information system and its significance in the competitive environment statistical research have been considered.

  5. Statistical analysis of lineaments of Goa, India

    Digital Repository Service at National Institute of Oceanography (India)

    Iyer, S.D.; Banerjee, G.; Wagle, B.G.

    statistically to obtain the nonlinear pattern in the form of a cosine wave. Three distinct peaks were found at azimuths of 40-45 degrees, 90-95 degrees and 140-145 degrees, which have peak values of 5.85, 6.80 respectively. These three peaks are correlated...

  6. Statistical models and methods for reliability and survival analysis

    CERN Document Server

    Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Limnios, Nikolaos; Gerville-Reache, Leo

    2013-01-01

    Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical

  7. Statistical analysis of medical data using SAS

    CERN Document Server

    Der, Geoff

    2005-01-01

    An Introduction to SAS; Describing and Summarizing Data; Basic Inference; Scatterplots Correlation: Simple Regression and Smoothing; Analysis of Variance and Covariance; Multiple Regression; Logistic Regression; The Generalized Linear Model; Generalized Additive Models; Nonlinear Regression Models; The Analysis of Longitudinal Data I; The Analysis of Longitudinal Data II: Models for Normal Response Variables; The Analysis of Longitudinal Data III: Non-Normal Response; Survival Analysis; Analysing Multivariate Data: Principal Components and Cluster Analysis; References

  8. Survey of editors and reviewers of high-impact psychology journals: statistical and research design problems in submitted manuscripts.

    Science.gov (United States)

    Harris, Alex; Reeder, Rachelle; Hyun, Jenny

    2011-01-01

    The authors surveyed 21 editors and reviewers from major psychology journals to identify and describe the statistical and design errors they encounter most often and to get their advice regarding prevention of these problems. Content analysis of the text responses revealed themes in 3 major areas: (a) problems with research design and reporting (e.g., lack of an a priori power analysis, lack of congruence between research questions and study design/analysis, failure to adequately describe statistical procedures); (b) inappropriate data analysis (e.g., improper use of analysis of variance, too many statistical tests without adjustments, inadequate strategy for addressing missing data); and (c) misinterpretation of results. If researchers attended to these common methodological and analytic issues, the scientific quality of manuscripts submitted to high-impact psychology journals might be significantly improved.

  9. The use of test scores from large-scale assessment surveys: psychometric and statistical considerations

    Directory of Open Access Journals (Sweden)

    Henry Braun

    2017-11-01

    Full Text Available Background: Economists are making increasing use of measures of student achievement obtained through large-scale survey assessments such as NAEP, TIMSS, and PISA. The construction of these measures, employing plausible value (PV) methodology, is quite different from that of the more familiar test scores associated with assessments such as the SAT or ACT. These differences have important implications both for utilization and interpretation. Although much has been written about PVs, it appears that there are still misconceptions about whether and how to employ them in secondary analyses. Methods: We address a range of technical issues, including those raised in a recent article that was written to inform economists using these databases. First, an extensive review of the relevant literature was conducted, with particular attention to key publications that describe the derivation and psychometric characteristics of such achievement measures. Second, a simulation study was carried out to compare the statistical properties of estimates based on the use of PVs with those based on other, commonly used methods. Results: It is shown, through both theoretical analysis and simulation, that under fairly general conditions appropriate use of PV yields approximately unbiased estimates of model parameters in regression analyses of large scale survey data. The superiority of the PV methodology is particularly evident when measures of student achievement are employed as explanatory variables. Conclusions: The PV methodology used to report student test performance in large scale surveys remains the state-of-the-art for secondary analyses of these databases.

  10. Common misconceptions about data analysis and statistics.

    Science.gov (United States)

    Motulsky, Harvey J

    2015-02-01

    Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word "significant". (4) Overreliance on standard errors, which are often misunderstood.

  11. Fundamentals of statistical experimental design and analysis

    CERN Document Server

    Easterling, Robert G

    2015-01-01

    Professionals in all areas - business; government; the physical, life, and social sciences; engineering; medicine, etc. - benefit from using statistical experimental design to better understand their worlds and then use that understanding to improve the products, processes, and programs they are responsible for. This book aims to provide the practitioners of tomorrow with a memorable, easy to read, engaging guide to statistics and experimental design. This book uses examples, drawn from a variety of established texts, and embeds them in a business or scientific context, seasoned with a dash of humor, to emphasize the issues and ideas that led to the experiment and the what-do-we-do-next? steps after the experiment. Graphical data displays are emphasized as means of discovery and communication and formulas are minimized, with a focus on interpreting the results that software produces. The role of subject-matter knowledge, and passion, is also illustrated. The examples do not require specialized knowledge, and t...

  12. Common misconceptions about data analysis and statistics.

    Science.gov (United States)

    Motulsky, Harvey J

    2014-11-01

    Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1. P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. 2. Overemphasis on P values rather than on the actual size of the observed effect. 3. Overuse of statistical hypothesis testing, and being seduced by the word "significant". 4. Overreliance on standard errors, which are often misunderstood.

  13. Common pitfalls in statistical analysis: "P" values, statistical significance and confidence intervals

    Directory of Open Access Journals (Sweden)

    Priya Ranganathan

    2015-01-01

    Full Text Available In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the 'P' value, explain the importance of 'confidence intervals' and clarify the importance of including both values in a paper.

  14. Predicting Survey Responses: How and Why Semantics Shape Survey Statistics on Organizational Behaviour

    Science.gov (United States)

    Arnulf, Jan Ketil; Larsen, Kai Rune; Martinsen, Øyvind Lund; Bong, Chih How

    2014-01-01

    Some disciplines in the social sciences rely heavily on collecting survey responses to detect empirical relationships among variables. We explored whether these relationships were a priori predictable from the semantic properties of the survey items, using language processing algorithms which are now available as new research methods. Language processing algorithms were used to calculate the semantic similarity among all items in state-of-the-art surveys from Organisational Behaviour research. These surveys covered areas such as transformational leadership, work motivation and work outcomes. This information was used to explain and predict the response patterns from real subjects. Semantic algorithms explained 60–86% of the variance in the response patterns and allowed remarkably precise prediction of survey responses from humans, except in a personality test. Even the relationships between independent and their purported dependent variables were accurately predicted. This raises concern about the empirical nature of data collected through some surveys if results are already given a priori through the way subjects are being asked. Survey response patterns seem heavily determined by semantics. Language algorithms may suggest these prior to administering a survey. This study suggests that semantic algorithms are becoming new tools for the social sciences, opening perspectives on survey responses that prevalent psychometric theory cannot explain. PMID:25184672

  15. Predicting survey responses: how and why semantics shape survey statistics on organizational behaviour.

    Directory of Open Access Journals (Sweden)

    Jan Ketil Arnulf

    Full Text Available Some disciplines in the social sciences rely heavily on collecting survey responses to detect empirical relationships among variables. We explored whether these relationships were a priori predictable from the semantic properties of the survey items, using language processing algorithms which are now available as new research methods. Language processing algorithms were used to calculate the semantic similarity among all items in state-of-the-art surveys from Organisational Behaviour research. These surveys covered areas such as transformational leadership, work motivation and work outcomes. This information was used to explain and predict the response patterns from real subjects. Semantic algorithms explained 60-86% of the variance in the response patterns and allowed remarkably precise prediction of survey responses from humans, except in a personality test. Even the relationships between independent and their purported dependent variables were accurately predicted. This raises concern about the empirical nature of data collected through some surveys if results are already given a priori through the way subjects are being asked. Survey response patterns seem heavily determined by semantics. Language algorithms may suggest these prior to administering a survey. This study suggests that semantic algorithms are becoming new tools for the social sciences, opening perspectives on survey responses that prevalent psychometric theory cannot explain.

  16. Predicting survey responses: how and why semantics shape survey statistics on organizational behaviour.

    Science.gov (United States)

    Arnulf, Jan Ketil; Larsen, Kai Rune; Martinsen, Øyvind Lund; Bong, Chih How

    2014-01-01

    Some disciplines in the social sciences rely heavily on collecting survey responses to detect empirical relationships among variables. We explored whether these relationships were a priori predictable from the semantic properties of the survey items, using language processing algorithms which are now available as new research methods. Language processing algorithms were used to calculate the semantic similarity among all items in state-of-the-art surveys from Organisational Behaviour research. These surveys covered areas such as transformational leadership, work motivation and work outcomes. This information was used to explain and predict the response patterns from real subjects. Semantic algorithms explained 60-86% of the variance in the response patterns and allowed remarkably precise prediction of survey responses from humans, except in a personality test. Even the relationships between independent and their purported dependent variables were accurately predicted. This raises concern about the empirical nature of data collected through some surveys if results are already given a priori through the way subjects are being asked. Survey response patterns seem heavily determined by semantics. Language algorithms may suggest these prior to administering a survey. This study suggests that semantic algorithms are becoming new tools for the social sciences, opening perspectives on survey responses that prevalent psychometric theory cannot explain.

  17. Commentary Discrepancy between statistical analysis method and ...

    African Journals Online (AJOL)

    to strive for compatibility between study design and analysis plan. Many authors have reported on common discrepancies in medical research, specifically between analysis methods and study design.4,5 For instance, after reviewing several published studies, Varnell et al. observed that many studies had applied.

  18. Confidence Levels in Statistical Analyses. Analysis of Variances. Case Study.

    Directory of Open Access Journals (Sweden)

    Ileana Brudiu

    2010-05-01

    Full Text Available Applying a statistical test to check statistical assumptions offers a positive or negative response regarding the veracity of the issued hypothesis. In the case of analysis of variance, it is necessary to apply a post hoc test to determine differences within the group. Statistical estimation using confidence levels provides more information than a statistical test: it shows the high degree of uncertainty resulting from small samples and frames conclusions in terms such as "marginally significant" or "almost significant" (p being close to 0.05). The case study shows how statistical estimation complements the application of the analysis of variance test and the Tukey test.

  19. Why Flash Type Matters: A Statistical Analysis

    Science.gov (United States)

    Mecikalski, Retha M.; Bitzer, Phillip M.; Carey, Lawrence D.

    2017-09-01

    While the majority of research only differentiates between intracloud (IC) and cloud-to-ground (CG) flashes, there exists a third flash type, known as hybrid flashes. These flashes have extensive IC components as well as return strokes to ground but are misclassified as CG flashes in current flash type analyses due to the presence of a return stroke. In an effort to show that IC, CG, and hybrid flashes should be separately classified, the two-sample Kolmogorov-Smirnov (KS) test was applied to the flash sizes, flash initiation, and flash propagation altitudes for each of the three flash types. The KS test statistically showed that IC, CG, and hybrid flashes do not have the same parent distributions and thus should be separately classified. Separate classification of hybrid flashes will lead to improved lightning-related research, because unambiguously classified hybrid flashes occur on the same order of magnitude as CG flashes for multicellular storms.
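
The two-sample Kolmogorov-Smirnov statistic mentioned in this abstract can be illustrated with a minimal sketch. It only computes the statistic D, the largest gap between the two empirical distribution functions; the p-value calculation is omitted, and the function name and the sample values are hypothetical, not taken from the study.

```python
from bisect import bisect_right


def ks_two_sample_statistic(x, y):
    """Return D = max |F_x(t) - F_y(t)| over all observed values t.

    F_x and F_y are the empirical distribution functions of the two samples.
    """
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for t in xs + ys:
        fx = bisect_right(xs, t) / len(xs)  # fraction of x values <= t
        fy = bisect_right(ys, t) / len(ys)  # fraction of y values <= t
        d = max(d, abs(fx - fy))
    return d


# Hypothetical flash sizes for two flash types (illustrative numbers only).
type_a_sizes = [1.0, 2.0, 3.0, 4.0]
type_b_sizes = [3.0, 4.0, 5.0, 6.0]
d_stat = ks_two_sample_statistic(type_a_sizes, type_b_sizes)
```

A large D relative to the sample sizes is evidence that the two samples do not share a parent distribution, which is the kind of conclusion the abstract reports for the three flash types.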

  20. Statistics over features: EEG signals analysis.

    Science.gov (United States)

    Derya Ubeyli, Elif

    2009-08-01

    This paper presented the usage of statistics over the set of the features representing the electroencephalogram (EEG) signals. Since classification is more accurate when the pattern is simplified through representation by important features, feature extraction and selection play an important role in classifying systems such as neural networks. Multilayer perceptron neural network (MLPNN) architectures were formulated and used as basis for detection of electroencephalographic changes. Three types of EEG signals (EEG signals recorded from healthy volunteers with eyes open, epilepsy patients in the epileptogenic zone during a seizure-free interval, and epilepsy patients during epileptic seizures) were classified. The selected Lyapunov exponents, wavelet coefficients and the power levels of power spectral density (PSD) values obtained by eigenvector methods of the EEG signals were used as inputs of the MLPNN trained with Levenberg-Marquardt algorithm. The classification results confirmed that the proposed MLPNN has potential in detecting the electroencephalographic changes.

  1. Statistical power analysis for the behavioral sciences

    National Research Council Canada - National Science Library

    Cohen, Jacob

    1988-01-01

    .... A chapter has been added for power analysis in set correlation and multivariate methods (Chapter 10). Set correlation is a realization of the multivariate general linear model, and incorporates the standard multivariate methods...

  2. Statistical methods for categorical data analysis

    CERN Document Server

    Powers, Daniel

    2008-01-01

    This book provides a comprehensive introduction to methods and models for categorical data analysis and their applications in social science research. Companion website also available, at https://webspace.utexas.edu/dpowers/www/

  3. Statistical power analysis for the behavioral sciences

    National Research Council Canada - National Science Library

    Cohen, Jacob

    1988-01-01

    ... offers a unifying framework and some new data-analytic possibilities. 2. A new chapter (Chapter 11) considers some general topics in power analysis in more integrated form than is possible in the earlier...

  4. Statistical Analysis of the Flexographic Printing Quality

    Directory of Open Access Journals (Sweden)

    Agnė Matulaitienė

    2014-02-01

    Full Text Available Analysis of flexographic printing output quality was performed using the SPSS software package. Samples of defective products were collected for one year in an existing flexographic printing company. Each example of a defective product was described in detail and analyzed. The SPSS software package was chosen because of the large amount of data. Hypotheses based on the defect data were formulated and then accepted or rejected in the analysis. The results obtained are presented in charts.

  5. A statistical package for computing time and frequency domain analysis

    Science.gov (United States)

    Brownlow, J.

    1978-01-01

    The spectrum analysis (SPA) program is a general purpose digital computer program designed to aid in data analysis. The program does time and frequency domain statistical analyses as well as some preanalysis data preparation. The capabilities of the SPA program include linear trend removal and/or digital filtering of data, plotting and/or listing of both filtered and unfiltered data, time domain statistical characterization of data, and frequency domain statistical characterization of data.
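
Linear trend removal, one of the preprocessing steps listed in this abstract, can be sketched as follows. The abstract does not say how the SPA program implements it, so this sketch assumes an ordinary least-squares line fitted against the sample index; the function name is hypothetical.

```python
def remove_linear_trend(y):
    """Subtract the least-squares straight line (fitted against index 0..n-1) from y."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    # Ordinary least-squares slope and intercept of y against the sample index.
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # The residuals are the detrended series.
    return [v - (intercept + slope * t) for t, v in enumerate(y)]


# A purely linear series has (numerically) zero residuals after detrending.
detrended = remove_linear_trend([2.0, 4.0, 6.0, 8.0])
```

Detrending in this way is commonly done before frequency-domain analysis so that a slow drift does not dominate the low-frequency end of the spectrum.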

  6. Statistics

    CERN Document Server

    Hayslett, H T

    1991-01-01

    Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the

  7. Statistical analysis of Hasegawa-Wakatani turbulence

    Science.gov (United States)

    Anderson, Johan; Hnat, Bogdan

    2017-06-01

    Resistive drift wave turbulence is a multipurpose paradigm that can be used to understand transport at the edge of fusion devices. The Hasegawa-Wakatani model captures the essential physics of drift turbulence while retaining the simplicity needed to gain a qualitative understanding of this process. We provide a theoretical interpretation of numerically generated probability density functions (PDFs) of intermittent events in Hasegawa-Wakatani turbulence with enforced equipartition of energy in large scale zonal flows, and small scale drift turbulence. We find that for a wide range of adiabatic index values, the stochastic component representing the small scale turbulent eddies of the flow, obtained from the autoregressive integrated moving average model, exhibits super-diffusive statistics, consistent with intermittent transport. The PDFs of large events (above one standard deviation) are well approximated by the Laplace distribution, while small events often exhibit a Gaussian character. Furthermore, there exists a strong influence of zonal flows, for example, via shearing and then viscous dissipation maintaining a sub-diffusive character of the fluxes.

  8. Predicting survey responses: how and why semantics shape survey statistics on organizational behaviour

    National Research Council Canada - National Science Library

    Arnulf, Jan Ketil; Larsen, Kai Rune; Martinsen, Øyvind Lund; Bong, Chih How

    2014-01-01

    .... We explored whether these relationships were a priori predictable from the semantic properties of the survey items, using language processing algorithms which are now available as new research methods...

  9. Book review: Statistical Analysis and Modelling of Spatial Point Patterns

    DEFF Research Database (Denmark)

    Møller, Jesper

    2009-01-01

    Statistical Analysis and Modelling of Spatial Point Patterns by J. Illian, A. Penttinen, H. Stoyan and D. Stoyan. Wiley (2008), ISBN 9780470014912.

  10. Statistical Modelling of Wind Profiles - Data Analysis and Modelling

    DEFF Research Database (Denmark)

    Jónsson, Tryggvi; Pinson, Pierre

    The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.

  11. Sensitivity analysis of ranked data: from order statistics to quantiles

    NARCIS (Netherlands)

    Heidergott, B.F.; Volk-Makarewicz, W.

    2015-01-01

    In this paper we provide the mathematical theory for sensitivity analysis of order statistics of continuous random variables, where the sensitivity is with respect to a distributional parameter. Sensitivity analysis of order statistics over a finite number of observations is discussed before

  12. The Statistical Analysis of Failure Time Data

    CERN Document Server

    Kalbfleisch, John D

    2011-01-01

    Contains additional discussion and examples on left truncation as well as material on more general censoring and truncation patterns. Introduces the martingale and counting process formulations in a new chapter. Develops multivariate failure time data in a separate chapter and extends the material on Markov and semi-Markov formulations. Presents new examples and applications of data analysis.

  13. Statistical Analysis Of Reconnaissance Geochemical Data From ...

    African Journals Online (AJOL)

    Five factors, whose structures were similar to the subjective groupings derived from the correlation matrix, were derived from R-mode factor analysis and have been interpreted in terms of underlying rock lithology, potential mineralization, and physico-chemical conditions in the environment. A high possibility of occurrence ...

  14. Statistical inference of Minimum Rank Factor Analysis

    NARCIS (Netherlands)

    Shapiro, A; Ten Berge, JMF

    For any given number of factors, Minimum Rank Factor Analysis yields optimal communalities for an observed covariance matrix in the sense that the unexplained common variance with that number of factors is minimized, subject to the constraint that both the diagonal matrix of unique variances and the

  15. Assessing attitudes towards statistics among medical students: psychometric properties of the Serbian version of the Survey of Attitudes Towards Statistics (SATS).

    Directory of Open Access Journals (Sweden)

    Dejana Stanisavljevic

    Full Text Available BACKGROUND: Medical statistics has become important and relevant for future doctors, enabling them to practice evidence based medicine. Recent studies report that students' attitudes towards statistics play an important role in their statistics achievements. The aim of the study was to test the psychometric properties of the Serbian version of the Survey of Attitudes Towards Statistics (SATS) in order to acquire a valid instrument to measure attitudes inside the Serbian educational context. METHODS: The validation study was performed on a cohort of 417 medical students who were enrolled in an obligatory introductory statistics course. The SATS adaptation was based on an internationally accepted methodology for translation and cultural adaptation. Psychometric properties of the Serbian version of the SATS were analyzed through the examination of factorial structure and internal consistency. RESULTS: Most medical students held positive attitudes towards statistics. The average total SATS score was above neutral (4.3±0.8), and varied from 1.9 to 6.2. Confirmatory factor analysis validated the six-factor structure of the questionnaire (Affect, Cognitive Competence, Value, Difficulty, Interest and Effort). Values for fit indices TLI (0.940) and CFI (0.961) were above the cut-off of ≥0.90. The RMSEA value of 0.064 (0.051-0.078) was below the suggested value of ≤0.08. Cronbach's alpha of the entire scale was 0.90, indicating scale reliability. In a multivariate regression model, self-rating of ability in mathematics and current grade point average were significantly associated with the total SATS score after adjusting for age and gender. CONCLUSION: Present study provided the evidence for the appropriate metric properties of the Serbian version of SATS. Confirmatory factor analysis validated the six-factor structure of the scale. The SATS might be reliable and a valid instrument for identifying medical students' attitudes towards statistics in the
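
Cronbach's alpha, the internal-consistency measure reported in this abstract, follows the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below is a minimal illustration of that formula; the function name and the toy scores are hypothetical and unrelated to the SATS data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    items: list of k lists, each holding one item's scores for the same
    n respondents, in the same respondent order.
    """
    k = len(items)
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Two hypothetical items answered by four respondents.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```

The formula is undefined when every respondent has the same total score, because the variance of the totals is then zero; a production implementation would need to guard against that case.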

  16. Eros or Ethnos: Pioneering statistical survey on prostitution at the beginning of the 20th century.

    Science.gov (United States)

    Kuhar, Martin

    2015-01-01

    The earliest serious investigation into prostitution in Croatia was a survey conducted in 1907 by the physician Fran Gundrum. His study was an attempt at a comprehensive exploration of prostitution, which tried to reconstruct demographic, anthropologic, and sociologic features of prostitutes. I present an analysis of his study and argue that Gundrum consistently found himself vacillating between blaming society and charging the nature of women to explain the existence of prostitution. This ambivalence was a result of embracing both the power of Enlightenment, which believed that human morality could be improved by the process of learning, and the notion of hereditary degeneration, which regarded human improvement by reeducation as futile. Heavily influenced by his Catholic upbringing and political conservatism, Gundrum married the "scientific" notion of innate prostitution with a pervasive view of women as flirtatious and materialistic. His survey reveals the typical personality of the period, a scientific enthusiast advocating the medical control of the population and the use of statistics in realizing that goal. It was, essentially, an attempt to construct and verify widespread attitudes toward public health as a method of monitoring venereal diseases and social control in general. Copyright © 2015 Elsevier Inc. All rights reserved.

  17. NUCLEI SHAPE ANALYSIS, A STATISTICAL APPROACH

    Directory of Open Access Journals (Sweden)

    Alberto Nettel-Aguirre

    2011-05-01

    Full Text Available The method presented in our paper suggests the use of Functional Data Analysis (FDA) techniques in an attempt to characterise the nuclei of two types of cells, cancer and non-cancer, based on their two-dimensional profiles. The characteristics of the profile itself, as traced by its X and Y coordinates, their first and second derivatives, their variability and use in characterization are the main focus of this approach, which is not constrained to star-shaped nuclei. Findings: Principal components created from the coordinates relate to shape, with significant differences between nuclei types. Characterisations for each type of profile were found.

  18. Baltic sea algae analysis using Bayesian spatial statistics methods

    Directory of Open Access Journals (Sweden)

    Eglė Baltmiškytė

    2013-03-01

    Full Text Available Spatial statistics is the field of statistics that deals with the analysis of spatially distributed data. Bayesian methods have recently become common in statistical data analysis. This article builds and describes a spatial data model for predicting algae quantity in the Baltic Sea. Black Carrageen is the dependent variable, and depth, sand, pebble and boulders are the independent variables in the described model. Two models with different covariance functions (Gaussian and exponential) are built to determine which fits best for algae quantity prediction. Unknown model parameters are estimated and the Bayesian kriging posterior predictive distribution is computed in the OpenBUGS modeling environment using Bayesian spatial statistics methods.

  19. The Ontology of Biological and Clinical Statistics (OBCS) for standardized and reproducible statistical analysis.

    Science.gov (United States)

    Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun

    2016-09-14

    Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprehends 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs . 

  20. A statistical survey on death and digital practices. Reflexivity on methodological biases

    Directory of Open Access Journals (Sweden)

    Hélène Bourdeloie

    2016-07-01

    Full Text Available Constructing a questionnaire, both in terms of methodology and ethics, supposes an exercise in reflexivity, especially when the context relates to a taboo subject such as death. Drawing on a statistical survey aimed mainly at understanding the role of digital technologies in mourning practices, this paper explores a raft of methodological and ethical questions raised by the different steps spanning the design, communication and administration of the survey. We pinpoint the limits of statistical data and the need to supplement these with a qualitative approach as well as “quali-quantitative” data to decipher socio-digital uses in mourning, which relates to the emotive dimension.

  1. A statistical analysis of UK financial networks

    Science.gov (United States)

    Chu, J.; Nadarajah, S.

    2017-04-01

    In recent years, with a growing interest in big or large datasets, there has been a rise in the application of large graphs and networks to financial big data. Much of this research has focused on the construction and analysis of the network structure of stock markets, based on the relationships between stock prices. Motivated by Boginski et al. (2005), who studied the characteristics of a network structure of the US stock market, we construct network graphs of the UK stock market using the same method. We fit four distributions to the degree density of the vertices from these graphs, the Pareto I, Fréchet, lognormal, and generalised Pareto distributions, and assess the goodness of fit. Our results show that the degree density of the complements of the market graphs, constructed using a negative threshold value close to zero, can be fitted well with the Fréchet and lognormal distributions.
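    The fit-and-assess step described in this abstract can be sketched for one of the four distributions. The sketch below is an illustration only, not the authors' code: it assumes a lognormal model, uses synthetic values in place of real vertex degrees, and takes the Kolmogorov-Smirnov distance as the goodness-of-fit measure (the abstract does not say which measure was used). The names fit_lognormal, lognormal_cdf and ks_distance are hypothetical.

```python
import math
import random


def fit_lognormal(sample):
    """Maximum-likelihood estimates (mu, sigma) of a lognormal distribution."""
    logs = [math.log(x) for x in sample]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma


def lognormal_cdf(x, mu, sigma):
    """Cumulative distribution function of the lognormal distribution."""
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def ks_distance(sample, cdf):
    """Largest gap between the empirical CDF of the sample and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Synthetic stand-in for vertex degrees (assumption: not real market data).
random.seed(0)
degrees = [random.lognormvariate(2.0, 0.5) for _ in range(2000)]

mu_hat, sigma_hat = fit_lognormal(degrees)
d_stat = ks_distance(degrees, lambda x: lognormal_cdf(x, mu_hat, sigma_hat))
print(mu_hat, sigma_hat, d_stat)
```

    A small distance indicates that the fitted curve tracks the empirical distribution closely. Comparing several candidate distributions would mean repeating the fit and the distance computation for each one.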

  2. Statistical Performance Analysis and Modeling Techniques for Nanometer VLSI Designs

    CERN Document Server

    Shen, Ruijing; Yu, Hao

    2012-01-01

    Since process variation and chip performance uncertainties have become more pronounced as technologies scale down into the nanometer regime, accurate and efficient modeling or characterization of variations from the device to the architecture level has become imperative for the successful design of VLSI chips. This book provides readers with tools for variation-aware design methodologies and computer-aided design (CAD) of VLSI systems, in the presence of process variations at the nanometer scale. It presents the latest developments for modeling and analysis, with a focus on statistical interconnect modeling, statistical parasitic extractions, statistical full-chip leakage and dynamic power analysis considering spatial correlations, statistical analysis and modeling for large global interconnects and analog/mixed-signal circuits. Provides readers with timely, systematic and comprehensive treatments of statistical modeling and analysis of VLSI systems with a focus on interconnects, on-chip power grids and ...

  3. Statistics

    Science.gov (United States)

    Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.

  4. CORSSA: The Community Online Resource for Statistical Seismicity Analysis

    Science.gov (United States)

    Michael, Andrew J.; Wiemer, Stefan

    2010-01-01

    Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools makes it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORSSA. It also describes its structure and contents.

  5. Statistical Analysis of Research Data | Center for Cancer Research

    Science.gov (United States)

    Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Analysis of Research Data (SARD) course will be held on April 12-13, 2017 from 9:00 AM – 5:00 PM at the Natcher Conference Center, Balcony A on the Bethesda campus. SARD is designed to provide an overview of the general principles of statistical analysis of research data. The course will be taught by Paul W. Thurman of Columbia University.

  6. Method for statistical data analysis of multivariate observations

    CERN Document Server

    Gnanadesikan, R

    1997-01-01

    A practical guide for multivariate statistical techniques, now updated and revised. In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte

  7. Statistical evaluation of diagnostic performance topics in ROC analysis

    CERN Document Server

    Zou, Kelly H; Bandos, Andriy I; Ohno-Machado, Lucila; Rockette, Howard E

    2016-01-01

    Statistical evaluation of diagnostic performance in general and Receiver Operating Characteristic (ROC) analysis in particular are important for assessing the performance of medical tests and statistical classifiers, as well as for evaluating predictive models or algorithms. This book presents innovative approaches in ROC analysis, which are relevant to a wide variety of applications, including medical imaging, cancer research, epidemiology, and bioinformatics. Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis covers areas including monotone-transformation techniques in parametric ROC analysis, ROC methods for combined and pooled biomarkers, Bayesian hierarchical transformation models, sequential designs and inferences in the ROC setting, predictive modeling, multireader ROC analysis, and free-response ROC (FROC) methodology. The book is suitable for graduate-level students and researchers in statistics, biostatistics, epidemiology, public health, biomedical engineering, radiology, medi...

  8. Online Statistical Modeling (Regression Analysis) for Independent Responses

    Science.gov (United States)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Statistical models have been developed in various directions to handle various types of data and complex relationships. A rich variety of advanced and recent statistical models is available in open source software (one example is R). However, these advanced statistical models are not very friendly to novice R users, since they rely on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and shiny), so that the most recent and advanced statistical models are readily available, accessible and applicable on the web. We have previously made interfaces in the form of e-tutorials for several modern and advanced statistical models in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including models using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modeling easier to apply and makes it easier to compare models in order to find the most appropriate one for the data.

  9. Analysis of thrips distribution: application of spatial statistics and Kriging

    Science.gov (United States)

    John Aleong; Bruce L. Parker; Margaret Skinner; Diantha Howard

    1991-01-01

    Kriging is a statistical technique that provides predictions for spatially and temporally correlated data. Observations of thrips distribution and density in Vermont soils are made in both space and time. Traditional statistical analysis of such data assumes that the counts taken over space and time are independent, which is not necessarily true. Therefore, to analyze...

  10. Attitudes and Achievement in Statistics: A Meta-Analysis Study

    Science.gov (United States)

    Emmioglu, Esma; Capa-Aydin, Yesim

    2012-01-01

    This study examined the relationships among statistics achievement and four components of attitudes toward statistics (Cognitive Competence, Affect, Value, and Difficulty) as assessed by the SATS. Meta-analysis results revealed that the size of relationships differed by the geographical region in which the studies were conducted as well as by the…

  11. The Importance of Statistical Modeling in Data Analysis and Inference

    Science.gov (United States)

    Rollins, Derrick, Sr.

    2017-01-01

    Statistical inference simply means to draw a conclusion based on information that comes from data. Error bars are the most commonly used tool for data analysis and inference in chemical engineering data studies. This work demonstrates, using common types of data collection studies, the importance of specifying the statistical model for sound…

  12. Guidelines for Statistical Analysis of Percentage of Syllables Stuttered Data

    Science.gov (United States)

    Jones, Mark; Onslow, Mark; Packman, Ann; Gebski, Val

    2006-01-01

    Purpose: The purpose of this study was to develop guidelines for the statistical analysis of percentage of syllables stuttered (%SS) data in stuttering research. Method: Data on %SS from various independent sources were used to develop a statistical model to describe this type of data. On the basis of this model, %SS data were simulated with…

  13. Violence against women in Yemen: official statistics and results from an exploratory victim survey

    NARCIS (Netherlands)

    Ba-Obeid, M.; Bijleveld, C.C.J.H.

    2002-01-01

    This article presents official statistics on violence against women in Yemen, as a threshold indicator of victimization incidence. Next, we present the findings from an exploratory survey into the prevalence of violent victimisation among a stratified sample of 120 women in Sana'a. We distinguish

  14. Adapting the Survey of Attitudes towards Statistics (SATS-36) for Estonian Secondary School Students

    Science.gov (United States)

    Hommik, Carita; Luik, Piret

    2017-01-01

    The purpose of this study is to adapt the Survey of Attitudes Towards Statistics (SATS-36) for Estonian secondary school students in order to develop a valid instrument to measure students' attitudes within the Estonian educational context. The SATS-36 was administered to Estonian-speaking secondary school students before their compulsory…

  15. Validation of survey information on smoking and alcohol consumption against import statistics, Greenland 1993–2010

    Directory of Open Access Journals (Sweden)

    Peter Bjerregaard

    2013-03-01

    Full Text Available Background. Questionnaires are widely used to obtain information on health-related behaviour, and they are more often than not the only method that can be used to assess the distribution of behaviour in subgroups of the population. No validation studies of reported consumption of tobacco or alcohol have been published from circumpolar indigenous communities. Objective. The purpose of the study is to compare information on the consumption of tobacco and alcohol obtained from 3 population surveys in Greenland with import statistics. Design. Estimates of consumption of cigarettes and alcohol using several different survey instruments in cross-sectional population studies from 1993–1994, 1999–2001 and 2005–2010 were compared with import statistics from the same years. Results. For cigarettes, survey results accounted for virtually the total import. Alcohol consumption was significantly under-reported with reporting completeness ranging from 40% to 51% for different estimates of habitual weekly consumption in the 3 study periods. Including an estimate of binge drinking increased the estimated total consumption to 78% of the import. Conclusion. Compared with import statistics, questionnaire-based population surveys capture the consumption of cigarettes well in Greenland. Consumption of alcohol is under-reported, but asking about binge episodes in addition to the usual intake considerably increased the reported intake in this population and made it more in agreement with import statistics. It is unknown to what extent these findings at the population level can be inferred to population subgroups.

  16. Statistical Analysis of the Exchange Rate of Bitcoin: e0133678

    National Research Council Canada - National Science Library

    Jeffrey Chu; Saralees Nadarajah; Stephen Chan

    2015-01-01

      Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar...
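      The quantity analysed in this record, the log-return of an exchange rate, is the natural logarithm of the ratio of consecutive prices. The sketch below shows the computation on a hypothetical price series. The values are made up for illustration and are not Bitcoin data, and the function name log_returns is an assumption.

```python
import math


def log_returns(prices):
    """Log-return between each pair of consecutive prices: ln(p_t / p_{t-1})."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


# Hypothetical daily exchange rates (illustrative values only).
prices = [100.0, 110.0, 99.0, 99.0]
r = log_returns(prices)

# Log-returns add up over time: their sum equals ln(last price / first price).
total = sum(r)
print(r, total)
```

      The additivity shown in the last step is one reason log-returns, rather than simple percentage changes, are the usual input for fitting return distributions.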

  17. Statistical Analysis for Grinding Mechanism of Fine Ceramic Material

    National Research Council Canada - National Science Library

    NISHIOKA, Takao; TANAKA, Yoshio; YAMAKAWA, Akira; MIYAKE, Masaya

    1994-01-01

    .... Statistical analysis was conducted on the specific grinding energy and stock removal rate with respect to the maximum grain depth of cut by a new method of directly evaluating successive cutting point spacing...

  18. [Statistical analysis using freely-available "EZR (Easy R)" software].

    Science.gov (United States)

    Kanda, Yoshinobu

    2015-10-01

    Clinicians must often perform statistical analyses for purposes such as evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions, but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical function of the R commander is limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application of statistical functions that are frequently used in clinical studies, such as survival analyses, including competing risk analyses and the use of time-dependent covariates and so on, by point-and-click access. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and assure that the statistical process is overseen by a supervisor.

  19. Using multivariate statistical analysis to assess changes in water ...

    African Journals Online (AJOL)

    Abstract. Multivariate statistical analysis was used to investigate changes in water chemistry at 5 river sites in the Vaal Dam catch- ... analysis (CCA) showed that the environmental variables used in the analysis, discharge and month of sampling, explained ......

  20. Statistical Learning in Specific Language Impairment: A Meta-Analysis

    Science.gov (United States)

    Lammertink, Imme; Boersma, Paul; Wijnen, Frank; Rispens, Judith

    2017-01-01

    Purpose: The current meta-analysis provides a quantitative overview of published and unpublished studies on statistical learning in the auditory verbal domain in people with and without specific language impairment (SLI). The database used for the meta-analysis is accessible online and open to updates (Community-Augmented Meta-Analysis), which…

  1. Detecting errors in micro and trace analysis by using statistics

    DEFF Research Database (Denmark)

    Heydorn, K.

    1993-01-01

    to be in statistical control. Significant deviations between analytical results from different laboratories reveal the presence of systematic errors, and agreement between different laboratories indicates the absence of systematic errors. This statistical approach, referred to as the analysis of precision, was applied...... to results for chlorine in freshwater from BCR certification analyses by highly competent analytical laboratories in the EC. Titration showed systematic errors of several percent, while radiochemical neutron activation analysis produced results without detectable bias....

  2. [Appropriate usage of statistical analysis in eye research].

    Science.gov (United States)

    Ge, Jian

    2013-02-01

    To avoid data bias in clinical research, it is essential to carefully select statistical analyses suited to the research purpose and design. Teamwork among statistician, scientist and specialist helps to ensure a reliable and scientific analysis of a study. The best way to analyze a study is to select the most appropriate statistical methods rather than the most complicated ones.

  3. Advanced data analysis in neuroscience integrating statistical and computational models

    CERN Document Server

    Durstewitz, Daniel

    2017-01-01

    This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanatory frameworks, but become powerfu...

  4. THE INTEGRATED SHORT-TERM STATISTICAL SURVEYS: EXPERIENCE OF NBS IN MOLDOVA

    Directory of Open Access Journals (Sweden)

    Oleg CARA

    2012-07-01

    Full Text Available The users’ rising need for relevant, reliable, coherent, timely data for the early diagnosis of the economic vulnerability and of the turning points in the business cycles, especially during a financial and economic crisis, asks for a prompt answer, coordinated by statistical institutions. High quality short term statistics are of special interest for the emerging market economies, such as the Moldavian one, being extremely vulnerable when facing economic recession. Responding to the challenges of producing a coherent and adequate image of the economic activity, by using the system of indicators and definitions efficiently applied at the level of the European Union, the National Bureau of Statistics (NBS) of the Republic of Moldova has launched the development process of an integrated system of short term statistics (STS) based on the advanced international experience. Thus, in 2011, NBS implemented the integrated statistical survey on STS based on consistent concepts, harmonized with the EU standards. The integration of the production processes, which were previously separated, is based on a common technical infrastructure, standardized procedures and techniques for data production. The achievement of this complex survey with holistic approach has allowed the consolidation of the statistical data quality, comparable at European level and the significant reduction of information burden on business units, especially of small size. The reformation of STS based on the integrated survey has been possible thanks to the consistent methodological and practical support given to NBS by the National Institute of Statistics (INS) of Romania, for which we would like to thank our Romanian colleagues.

  5. Basic statistical tools in research and data analysis

    Directory of Open Access Journals (Sweden)

    Zulfiqar Ali

    2016-01-01

    Full Text Available Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.

  6. Numeric computation and statistical data analysis on the Java platform

    CERN Document Server

    Chekanov, Sergei V

    2016-01-01

    Numerical computation, knowledge discovery and statistical data analysis integrated with powerful 2D and 3D graphics for visualization are the key topics of this book. The Python code examples powered by the Java platform can easily be transformed to other programming languages, such as Java, Groovy, Ruby and BeanShell. This book equips the reader with a computational platform which, unlike other statistical programs, is not limited by a single programming language. The author focuses on practical programming aspects and covers a broad range of topics, from basic introduction to the Python language on the Java platform (Jython), to descriptive statistics, symbolic calculations, neural networks, non-linear regression analysis and many other data-mining topics. He discusses how to find regularities in real-world data, how to classify data, and how to process data for knowledge discoveries. The code snippets are so short that they easily fit into single pages. Numeric Computation and Statistical Data Analysis ...

  7. Statistical assessment on a combined analysis of GRYN-ROMN-UCBN upland vegetation vital signs

    Science.gov (United States)

    Irvine, Kathryn M.; Rodhouse, Thomas J.

    2014-01-01

    As of 2013, Rocky Mountain and Upper Columbia Basin Inventory and Monitoring Networks have multiple years of vegetation data and Greater Yellowstone Network has three years of vegetation data and monitoring is ongoing in all three networks. Our primary objective is to assess whether a combined analysis of these data aimed at exploring correlations with climate and weather data is feasible. We summarize the core survey design elements across protocols and point out the major statistical challenges for a combined analysis at present. The dissimilarity in response designs between ROMN and UCBN-GRYN network protocols presents a statistical challenge that has not been resolved yet. However, the UCBN and GRYN data are compatible as they implement a similar response design; therefore, a combined analysis is feasible and will be pursued in future. When data collected by different networks are combined, the survey design describing the merged dataset is (likely) a complex survey design. A complex survey design is the result of combining datasets from different sampling designs. A complex survey design is characterized by unequal probability sampling, varying stratification, and clustering (see Lohr 2010 Chapter 7 for general overview). Statistical analysis of complex survey data requires modifications to standard methods, one of which is to include survey design weights within a statistical model. We focus on this issue for a combined analysis of upland vegetation from these networks, leaving other topics for future research. We conduct a simulation study on the possible effects of equal versus unequal probability selection of points on parameter estimates of temporal trend using available packages within the R statistical computing package. We find that, as written, using lmer or lm for trend detection in a continuous response and clm and clmm for visually estimated cover classes with “raw” GRTS design weights specified for the weight argument leads to substantially
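    The idea of including survey design weights in a trend model can be illustrated with a weighted least-squares line, where each point's weight is the inverse of its inclusion probability. This is a minimal sketch under stated assumptions. It is written in Python rather than R, it does not reproduce the lm, lmer, clm or clmm calls named in the abstract, and the function name weighted_trend and all data values are hypothetical.

```python
def weighted_trend(years, responses, weights):
    """Weighted least-squares intercept and slope of response on year."""
    sw = sum(weights)
    xbar = sum(w * x for w, x in zip(weights, years)) / sw
    ybar = sum(w * y for w, y in zip(weights, responses)) / sw
    sxy = sum(w * (x - xbar) * (y - ybar)
              for w, x, y in zip(weights, years, responses))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(weights, years))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical data: survey year (coded 0..3) and a continuous response.
years = [0, 1, 2, 3]
cover = [1.0, 3.0, 2.0, 5.0]

# Hypothetical inclusion probabilities; design weight = 1 / probability.
inclusion_prob = [0.5, 0.25, 0.5, 0.25]
design_weights = [1.0 / p for p in inclusion_prob]

equal_fit = weighted_trend(years, cover, [1.0] * len(years))
weighted_fit = weighted_trend(years, cover, design_weights)
print(equal_fit, weighted_fit)
```

    With equal weights the function reduces to ordinary least squares, so comparing the two fits shows how much the unequal selection probabilities move the estimated trend for a given dataset.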

  8. Simulation Experiments in Practice: Statistical Design and Regression Analysis

    OpenAIRE

    Kleijnen, J.P.C.

    2007-01-01

    In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic DOE and regression analysis assume a single simulation response that is normally and independen...
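    The contrast with changing one factor at a time can be made concrete with a two-level full factorial design. In the sketch below, the simulate function is a hypothetical stand-in for a simulation model, and its coefficients are invented for illustration. Because the coded design columns are orthogonal, each least-squares coefficient is the average of the column multiplied by the output.

```python
from itertools import product


def full_factorial(k):
    """All 2**k combinations of k factors at coded levels -1 and +1."""
    return [list(levels) for levels in product((-1, 1), repeat=k)]


def simulate(x1, x2):
    """Hypothetical simulation response with an interaction between factors."""
    return 10.0 + 2.0 * x1 - 3.0 * x2 + 1.5 * x1 * x2


design = full_factorial(2)
outputs = [simulate(x1, x2) for x1, x2 in design]
n = len(design)

# Orthogonal columns: each regression coefficient is sum(column * y) / n.
beta0 = sum(outputs) / n
beta1 = sum(x1 * y for (x1, x2), y in zip(design, outputs)) / n
beta2 = sum(x2 * y for (x1, x2), y in zip(design, outputs)) / n
beta12 = sum(x1 * x2 * y for (x1, x2), y in zip(design, outputs)) / n
print(beta0, beta1, beta2, beta12)
```

    The four runs recover both main effects and the interaction term. A one-factor-at-a-time plan starting from a base point never varies both factors together, so it cannot separate the interaction from the main effects.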

  9. ALGORITHM OF PRIMARY STATISTICAL ANALYSIS OF ARRAYS OF EXPERIMENTAL DATA

    Directory of Open Access Journals (Sweden)

    LAUKHIN D. V.

    2017-02-01

    Full Text Available Annotation. Purpose. Construction of an algorithm for preliminary (primary estimation of arrays of experimental data for further obtaining a mathematical model of the process under study. Methodology. The use of the main regularities of the theory of processing arrays of experimental values in the initial analysis of data. Originality. An algorithm for performing a primary statistical analysis of the arrays of experimental data is given. Practical value. Development of methods for revealing statistically unreliable values in arrays of experimental data for the purpose of their subsequent detailed analysis and construction of a mathematical model of the studied processes.

  10. Statistical analysis of planktic foraminifera of the surface Continental ...

    African Journals Online (AJOL)

    Planktic foraminiferal assemblage recorded from selected samples obtained from shallow continental shelf sediments off southwestern Nigeria were subjected to statistical analysis. The Principal Component Analysis (PCA) was used to determine variants of planktic parameters. Values obtained for these parameters were ...

  11. PRECISE - pregabalin in addition to usual care: Statistical analysis plan

    NARCIS (Netherlands)

    S. Mathieson (Stephanie); L. Billot (Laurent); C. Maher (Chris); A.J. McLachlan (Andrew J.); J. Latimer (Jane); B.W. Koes (Bart); M.J. Hancock (Mark J.); I. Harris (Ian); R.O. Day (Richard O.); J. Pik (Justin); S. Jan (Stephen); C.-W.C. Lin (Chung-Wei Christine)

    2016-01-01

    textabstractBackground: Sciatica is a severe, disabling condition that lacks high quality evidence for effective treatment strategies. This a priori statistical analysis plan describes the methodology of analysis for the PRECISE study. Methods/design: PRECISE is a prospectively registered, double

  12. HistFitter software framework for statistical data analysis

    CERN Document Server

    Baak, M.; Côte, D.; Koutsman, A.; Lorenz, J.; Short, D.

    2015-01-01

    We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with mu...

  13. A Divergence Statistics Extension to VTK for Performance Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Pebay, Philippe Pierre [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bennett, Janine Camille [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-02-01

    This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13]), where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit (VTK) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a statistical manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new divergence statistics engine is illustrated by means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
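    The notion of a divergence between an observed empirical distribution and a theoretical one can be illustrated with the Kullback-Leibler divergence. This sketch is an assumption-laden illustration: the abstract does not say which divergence measures the engine computes, the code is plain Python rather than the VTK C++ API, and the function names and sample values are hypothetical.

```python
import math
from collections import Counter


def empirical_distribution(observations):
    """Relative frequency of each distinct observed value."""
    counts = Counter(observations)
    n = len(observations)
    return {value: count / n for value, count in counts.items()}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) = sum of p(x) * ln(p(x) / q(x)).

    Returns infinity when p puts mass on a value that q considers impossible.
    """
    total = 0.0
    for x, px in p.items():
        if px > 0.0:
            qx = q.get(x, 0.0)
            if qx == 0.0:
                return math.inf
            total += px * math.log(px / qx)
    return total


# Hypothetical observations, compared against an "ideal" uniform distribution.
observed = [1, 1, 1, 2, 2, 3, 4, 4]
uniform = {value: 0.25 for value in (1, 2, 3, 4)}

p_hat = empirical_distribution(observed)
d = kl_divergence(p_hat, uniform)
print(d)
```

    The result is zero only when the two distributions coincide and grows as the observed frequencies drift from the theoretical ones. Unlike a true distance it is not symmetric in its two arguments.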

  14. Gravitational lensing statistics with extragalactic surveys - IV. Joint constraints on lambda(0) and Omega(0) from gravitational lensing statistics and CMB anisotropies

    NARCIS (Netherlands)

    Macias-Perez, JF; Helbig, P; Quast, R; Wilkinson, A; Davies, R

    We present constraints on the cosmological constant lambda(0) and the density parameter Omega(0) from joint constraints from the analyses of gravitational lensing statistics of the Jodrell Bank-VLA Astrometric Survey (JVAS), optical gravitational lens surveys from the literature and CMB

  15. Statistical analysis of spatial and spatio-temporal point patterns

    CERN Document Server

    Diggle, Peter J

    2013-01-01

    Written by a prominent statistician and author, the first edition of this bestseller broke new ground in the then emerging subject of spatial statistics with its coverage of spatial point patterns. Retaining all the material from the second edition and adding substantial new material, Statistical Analysis of Spatial and Spatio-Temporal Point Patterns, Third Edition presents models and statistical methods for analyzing spatially referenced point process data. Reflected in the title, this third edition now covers spatio-temporal point patterns. It explores the methodological developments from th

  16. Longitudinal data analysis a handbook of modern statistical methods

    CERN Document Server

    Fitzmaurice, Garrett; Verbeke, Geert; Molenberghs, Geert

    2008-01-01

    Although many books currently available describe statistical models and methods for analyzing longitudinal data, they do not highlight connections between various research threads in the statistical literature. Responding to this void, Longitudinal Data Analysis provides a clear, comprehensive, and unified overview of state-of-the-art theory and applications. It also focuses on the assorted challenges that arise in analyzing longitudinal data. After discussing historical aspects, leading researchers explore four broad themes: parametric modeling, nonparametric and semiparametric methods, joint

  17. Streamstats: U.S. Geological Survey Web Application for Streamflow Statistics for Connecticut

    Science.gov (United States)

    Ahearn, Elizabeth A.; Ries, Kernell G.; Steeves, Peter A.

    2006-01-01

    An important mission of the U.S. Geological Survey (USGS) is to provide information on streamflow in the Nation's rivers. Streamflow statistics are used by water managers, engineers, scientists, and others to protect people and property during floods and droughts, and to manage land, water, and biological resources. Common uses for streamflow statistics include dam, bridge, and culvert design; water-supply planning and management; water-use appropriations and permitting; wastewater and industrial discharge permitting; hydropower-facility design and regulation; and flood-plain mapping for establishing flood-insurance rates and land-use zones. In an effort to improve access to published streamflow statistics, and to make the process of computing streamflow statistics for ungaged stream sites easier, more accurate, and more consistent, the USGS and the Environmental Systems Research Institute, Inc. (ESRI) developed StreamStats (Ries and others, 2004). StreamStats is a Geographic Information System (GIS)-based Web application for serving previously published streamflow statistics and basin characteristics for USGS data-collection stations, and computing streamflow statistics and basin characteristics for ungaged stream sites. The USGS, in cooperation with the Connecticut Department of Environmental Protection and the Connecticut Department of Transportation, has implemented StreamStats for Connecticut.

  18. MANAGERIAL DECISION IN INNOVATIVE EDUCATION SYSTEMS STATISTICAL SURVEY BASED ON SAMPLE THEORY

    Directory of Open Access Journals (Sweden)

    Gheorghe SĂVOIU

    2012-12-01

    Full Text Available Before formulating the statistical hypotheses and the econometric testing itself, a breakdown of some of the technical issues is required. These issues relate to managerial decision in innovative educational systems, to the educational managerial phenomenon tested through statistical and mathematical methods, and to the significant difference in perceiving the current qualities, knowledge, experience, behaviour and desirable health, obtained through a questionnaire applied to a stratified population at the end, in the educational environment, either with educational activities, or with simultaneously managerial and educational activities. The details of the research, which focuses on survey theory and turns the questionnaires and the statistical data processed from them into a working tool, are summarized below.

  19. Recommendations for statistical designs of in vivo mutagenicity tests with regard to subsequent statistical analysis.

    Science.gov (United States)

    Adler, I D; Bootman, J; Favor, J; Hook, G; Schriever-Schwemmer, G; Welzl, G; Whorton, E; Yoshimura, I; Hayashi, M

    1998-09-01

    A workshop was held on September 13 and 14, 1993, at the GSF, Neuherberg, Germany, to start a discussion of experimental design and statistical analysis issues for three in vivo mutagenicity test systems, the micronucleus test in mouse bone marrow/peripheral blood, the chromosomal aberration tests in mouse bone marrow/differentiating spermatogonia, and the mouse dominant lethal test. The discussion has now come to conclusions which we would like to make generally known. Rather than dwell upon specific statistical tests which could be used for data analysis, serious consideration was given to test design. However, the test design, its power of detecting a given increase of adverse effects and the test statistics are interrelated. Detailed analyses of historical negative control data led to important recommendations for each test system. Concerning the statistical sensitivity parameters, a type I error of 0.05 (one tailed), a type II error of 0.20 and a dose related increase of twice the background (negative control) frequencies were generally adopted. It was recommended that sufficient observations (cells, implants) be planned for each analysis unit (animal) so that at least one adverse outcome (micronucleus, aberrant cell, dead implant) would likely be observed. The treated animal was the smallest unit of analysis allowed. On the basis of these general considerations the sample size was determined for each of the three assays. A minimum of 2000 immature erythrocytes/animal should be scored for micronuclei from each of at least 4 animals in each comparison group in the micronucleus assays. A minimum of 200 cells should be scored for chromosomal aberrations from each of at least 5 animals in each comparison group in the aberration assays. In the dominant lethal test, a minimum of 400 implants (40-50 pregnant females) are required per dose group for each mating period. The analysis unit for the dominant lethal test would be the treated male unless the background

  20. Towards proper sampling and statistical analysis of defects

    Directory of Open Access Journals (Sweden)

    Cetin Ali

    2014-06-01

    Full Text Available Advancements in applied statistics with great relevance to defect sampling and analysis are presented. Three main issues are considered: (i) proper handling of multiple defect types, (ii) relating sample data originating from polished inspection surfaces (2D) to finite material volumes (3D), and (iii) application of advanced extreme value theory in statistical analysis of block maximum data. Original and rigorous, but practical mathematical solutions are presented. Finally, these methods are applied to make predictions regarding defect sizes in a steel alloy containing multiple defect types.

  1. Statistical analysis of hydroclimatic time series: Uncertainty and insights

    Science.gov (United States)

    Koutsoyiannis, Demetris; Montanari, Alberto

    2007-05-01

    Today, hydrologic research and modeling depends largely on climatological inputs, whose physical and statistical behavior are the subject of many debates in the scientific community. A relevant ongoing discussion is focused on long-term persistence (LTP), a natural behavior identified in several studies of instrumental and proxy hydroclimatic time series, which, nevertheless, is neglected in some climatological studies. LTP may reflect a long-term variability of several factors and thus can support a more complete physical understanding and uncertainty characterization of climate. The implications of LTP in hydroclimatic research, especially in statistical questions and problems, may be substantial but appear to be not fully understood or recognized. To offer insights on these implications, we demonstrate by using analytical methods that the characteristics of temperature series, which appear to be compatible with the LTP hypothesis, imply a dramatic increase of uncertainty in statistical estimation and reduction of significance in statistical testing, in comparison with classical statistics. Therefore we maintain that statistical analysis in hydroclimatic research should be revisited in order not to derive misleading results and simultaneously that merely statistical arguments do not suffice to verify or falsify the LTP (or another) climatic hypothesis.

  2. Guidelines to Statistical Analysis of Microbial Composition Data Inferred from Metagenomic Sequencing.

    Science.gov (United States)

    Odintsova, Vera; Tyakht, Alexander; Alexeev, Dmitry

    2017-01-01

    Metagenomics, the application of high-throughput DNA sequencing for surveys of environmental samples, has revolutionized our view on the taxonomic and genetic composition of complex microbial communities. An enormous richness of microbiota keeps unfolding in the context of various fields ranging from biomedicine and food industry to geology. Primary analysis of metagenomic reads allows one to infer semi-quantitative data describing the community structure. However, such compositional data possess specific statistical properties that must be considered during preprocessing, hypothesis testing and interpreting the results of statistical tests. Failure to account for these specifics may lead to essentially wrong conclusions as a result of the survey. Here we introduce a researcher new to the field of metagenomics to the basic properties of microbial compositional data, including statistical power and proposed distribution models, review the publicly available software tools developed specifically for such data, and outline recommendations for the application of the methods.

  3. Data analysis using the Gnu R system for statistical computation

    Energy Technology Data Exchange (ETDEWEB)

    Simone, James; /Fermilab

    2011-07-01

    R is a language system for statistical computation. It is widely used in statistics, bioinformatics, machine learning, data mining, quantitative finance, and the analysis of clinical drug trials. Among the advantages of R are: it has become the standard language for developing statistical techniques, it is being actively developed by a large and growing global user community, it is open source software, it is highly portable (Linux, OS-X and Windows), it has a built-in documentation system, it produces high quality graphics and it is easily extensible with over four thousand extension library packages available covering statistics and applications. This report gives a very brief introduction to R with some examples using lattice QCD simulation results. It then discusses the development of R packages designed for chi-square minimization fits for lattice n-pt correlation functions.

  4. Statistical analysis of fNIRS data: a comprehensive review.

    Science.gov (United States)

    Tak, Sungho; Ye, Jong Chul

    2014-01-15

    Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.

  5. Bayesian analysis: a new statistical paradigm for new technology.

    Science.gov (United States)

    Grunkemeier, Gary L; Payne, Nicola

    2002-12-01

    Full Bayesian analysis is an alternative statistical paradigm, as opposed to traditionally used methods, usually called frequentist statistics. Bayesian analysis is controversial because it requires assuming a prior distribution, which can be arbitrarily chosen; thus there is a subjective element, which is considered to be a major weakness. However, this could also be considered a strength since it provides a formal way of incorporating prior knowledge. Since it is flexible and permits repeated looks at evolving data, Bayesian analysis is particularly well suited to the evaluation of new medical technology. Bayesian analysis can refer to a range of things: from a simple, noncontroversial formula for inverting probabilities to an alternative approach to the philosophy of science. Its advantages include: (1) providing direct probability statements--which are what most people wrongly assume they are getting from conventional statistics; (2) formally incorporating previous information in statistical inference of a data set, a natural approach which we follow in everyday reasoning; and (3) flexible, adaptive research designs allowing multiple looks at accumulating study data. Its primary disadvantage is the element of subjectivity which some think is not scientific. We discuss and compare frequentist and Bayesian approaches and provide three examples of Bayesian analysis: (1) EKG interpretation, (2) a coin-tossing experiment, and (3) assessing the thromboembolic risk of a new mechanical heart valve.
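    The coin-tossing example mentioned in the abstract can be illustrated with a minimal sketch. It assumes a conjugate Beta prior on the probability of heads, and the counts are hypothetical; none of the names or numbers below come from the paper. The sketch also shows the "repeated looks" property: updating in two stages gives the same posterior as a single update on all the data.

```python
def beta_binomial_update(prior_a, prior_b, heads, tails):
    """Return the posterior Beta(a, b) parameters after observing coin tosses."""
    return prior_a + heads, prior_b + tails


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical data: uniform Beta(1, 1) prior, then 7 heads and 3 tails.
batch_a, batch_b = beta_binomial_update(1.0, 1.0, heads=7, tails=3)

# The same data seen in two interim looks: (4 heads, 1 tail), then (3 heads, 2 tails).
look_a, look_b = beta_binomial_update(1.0, 1.0, heads=4, tails=1)
seq_a, seq_b = beta_binomial_update(look_a, look_b, heads=3, tails=2)

posterior_mean = beta_mean(batch_a, batch_b)
print(batch_a, batch_b, seq_a, seq_b, round(posterior_mean, 4))
```

    The posterior mean is a direct probability statement about the coin, which is the kind of statement the abstract lists as an advantage of the Bayesian approach.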

  6. Analysis of room transfer function and reverberant signal statistics

    DEFF Research Database (Denmark)

    Georganti, Eleftheria; Mourjopoulos, John; Jacobsen, Finn

    2008-01-01

    For some time now, statistical analysis has been a valuable tool in analyzing room transfer functions (RTFs). This work examines existing statistical time-frequency models and techniques for RTF analysis (e.g., Schroeder's stochastic model and the standard deviation over frequency bands for the RTF magnitude and phase). RTF fractional octave smoothing, as with 1/3 octave analysis, may lead to RTF simplifications that can be useful for several audio applications, like room compensation, room modeling, auralisation purposes. The aim of this work is to identify the relationship of optimal response... “anechoic” and “reverberant” audio speech signals, in order to model the alterations due to room acoustics. The above results are obtained from both in-situ room response measurements and controlled acoustical response simulations.

  7. SPORTS ORGANIZATIONS MANAGEMENT IMPROVEMENT: A SURVEY ANALYSIS

    Directory of Open Access Journals (Sweden)

    Alin Molcut

    2015-07-01

    Full Text Available Sport organizations exist to perform tasks that can only be executed through cooperative effort, and sport management is responsible for the performance and success of these organizations. The main aim of the paper is to analyze several issues in the management of sports organizations in order to assess the quality of their management. To this end, a questionnaire was designed for performing a survey analysis through a statistical approach. The investigation was conducted over a period of 3 months and questioned a number of football managers and coaches, all of them active in football clubs in the counties of Timis and Arad at the level of training for children and juniors. The results suggest that there is a significant interest in the improvement of management across children's teams and under-21 clubs, with emphasis on players' participation and rewarding performance. Furthermore, we can state that the sports clubs have an established vision and mission, and that the clubs' general objectives refer to both sporting performance and financial performance.

  8. What do Indian children drink when they do not receive water? Statistical analysis of water and alternative beverage consumption from the 2005-2006 Indian National Family Health Survey.

    Science.gov (United States)

    Fledderjohann, Jasmine; Doyle, Pat; Campbell, Oona; Ebrahim, Shah; Basu, Sanjay; Stuckler, David

    2015-07-05

    Over 1.2 billion people lack access to clean water. However, little is known about what children drink when there is no clean water. We investigated the prevalence of receiving no water and what Indian children drink instead. We analysed children's beverage consumption using representative data from India's National Family and Health Survey (NFHS-3, 2005-2006). Consumption was based on mothers' reports (n = 22,668) for children aged 6-59 months (n = 30,656). About 10 % of Indian children had no water in the last 24 h, corresponding to 12,700,000 children nationally (95 % CI: 12,260,000 to 13,200,000). Among children who received no water, 23 % received breast or fresh milk and 24 % consumed formula, "other liquid", juice, or two or more beverages. Children over 2 were more likely to consume non-milk beverages, including tea, coffee, and juice, than those under 2 years. Those in the lowest two wealth quintiles were 16 % less likely to have received water (OR = 0.84; 95 % CI: 0.74 to 0.96). Compared to those living in households with bottled, piped, or tanker water, children were significantly less likely to receive water in households using well water (OR = 0.75; 95 % CI: 0.64 to 0.89) or river, spring, or rain water (OR = 0.70; 95 % CI: 0.53 to 0.92) in the last 24 h. About 13 million Indian children aged 6-59 months received no water in the last 24 h. Further research is needed to assess the risks potentially arising from insufficient water, caffeinated beverages, and high sugar drinks at early stages of life.

  9. A novel statistic for genome-wide interaction analysis.

    Directory of Open Access Journals (Sweden)

    Xuesen Wu

    2010-09-01

    Full Text Available Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interaction analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001 ... analysis is a valuable tool for finding remaining missing heritability unexplained by the current GWAS, and the developed novel statistic is able to search significant interaction between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.

  10. Statistical Compilation of the ICT Sector and Policy Analysis | IDRC ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    Statistical Compilation of the ICT Sector and Policy Analysis. As the presence and influence of information and communication technologies (ICTs) continues to widen and deepen, so too does its impact on economic development. However, much work needs to be done before the linkages between economic development ...

  11. statistical analysis of wind speed for electrical power generation in ...

    African Journals Online (AJOL)

    HOD

    are employed to fit wind speed data of some selected sites in Northern Nigeria. This is because the design of wind energy conversion systems depends on the correct analysis of the site renewable energy resources. [13]. In addition, the statistical judgements are based on the accuracy in fitting the available data at the sites.

  12. Using multivariate statistical analysis to assess changes in water ...

    African Journals Online (AJOL)

    Multivariate statistical analysis was used to investigate changes in water chemistry at 5 river sites in the Vaal Dam catchment, draining the Highveld grasslands. These grasslands receive more than 8 kg sulphur (S) ha-1·year-1 and 6 kg nitrogen (N) ha-1·year-1 via atmospheric deposition. It was hypothesised that between ...

  13. Statistical Compilation of the ICT Sector and Policy Analysis | CRDI ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    Statistical Compilation of the ICT Sector and Policy Analysis. As the presence and influence of information and communication technologies (ICTs) continues to widen and deepen, so too does its impact on economic development. However, much work needs to be done before the linkages between economic development ...

  14. Cosmological constraints with weak lensing peak counts and second-order statistics in a large-field survey

    Science.gov (United States)

    Peel, Austin; Lin, Chieh-An; Lanusse, Francois; Leonard, Adrienne; Starck, Jean-Luc; Kilbinger, Martin

    2017-01-01

    Peak statistics in weak lensing maps access the non-Gaussian information contained in the large-scale distribution of matter in the Universe. They are therefore a promising complementary probe to two-point and higher-order statistics to constrain our cosmological models. To prepare for the high precision afforded by next-generation weak lensing surveys, we assess the constraining power of peak counts in a simulated Euclid-like survey on the cosmological parameters Ωm, σ8, and w0^de. In particular, we study how CAMELUS, a fast stochastic model for predicting peaks, can be applied to such large surveys. The algorithm avoids the need for time-costly N-body simulations, and its stochastic approach provides full PDF information of observables. We measure the abundance histogram of peaks in a mock shear catalogue of approximately 5,000 deg^2 using a multiscale mass map filtering technique, and we then constrain the parameters of the mock survey using CAMELUS combined with approximate Bayesian computation, a robust likelihood-free inference algorithm. We find that peak statistics yield a tight but significantly biased constraint in the σ8-Ωm plane, indicating the need to better understand and control the model's systematics before applying it to a real survey of this size or larger. We perform a calibration of the model to remove the bias and compare results to those from the two-point correlation functions (2PCF) measured on the same field. In this case, we find the derived parameter Σ8 = σ8 (Ωm/0.27)^α = 0.76 (+0.02/-0.03) with α = 0.65 for peaks, while for 2PCF the values are Σ8 = 0.76 (+0.02/-0.01) and α = 0.70. We conclude that the constraining power can therefore be comparable between the two weak lensing observables in large-field surveys. Furthermore, the tilt in the σ8-Ωm degeneracy direction for peaks with respect to that of 2PCF suggests that a combined analysis would yield tighter constraints than either measure alone. As expected, w0^de cannot be

  15. The Statistics Concept Inventory: Development and analysis of a cognitive assessment instrument in statistics

    Science.gov (United States)

    Allen, Kirk

    The Statistics Concept Inventory (SCI) is a multiple choice test designed to assess students' conceptual understanding of topics typically encountered in an introductory statistics course. This dissertation documents the development of the SCI from Fall 2002 up to Spring 2006. The first phase of the project essentially sought to answer the question: "Can you write a test to assess topics typically encountered in introductory statistics?" Book One presents the results utilized in answering this question in the affirmative. The bulk of the results present the development and evolution of the items, primarily relying on objective metrics to gauge effectiveness but also incorporating student feedback. The second phase boils down to: "Now that you have the test, what else can you do with it?" This includes an exploration of Cronbach's alpha, the most commonly-used measure of test reliability in the literature. An online version of the SCI was designed, and its equivalency to the paper version is assessed. Adding an extra wrinkle to the online SCI, subjects rated their answer confidence. These results show a general positive trend between confidence and correct responses. However, some items buck this trend, revealing potential sources of misunderstandings, with comparisons offered to the extant statistics and probability educational research. The third phase is a re-assessment of the SCI: "Are you sure?" A factor analytic study favored a uni-dimensional structure for the SCI, although maintaining the likelihood of a deeper structure if more items can be written to tap similar topics. A shortened version of the instrument is proposed, demonstrated to be able to maintain a reliability nearly identical to that of the full instrument. Incorporating student feedback and a faculty topics survey, improvements to the items and recommendations for further research are proposed. 
    The state of the concept inventory movement is assessed, to offer a comparison to the work presented

  16. Evaluation of heart failure biomarker tests: a survey of statistical considerations.

    Science.gov (United States)

    De, Arkendra; Meier, Kristen; Tang, Rong; Li, Meijuan; Gwise, Thomas; Gomatam, Shanti; Pennello, Gene

    2013-08-01

    Biomarkers assessing cardiovascular function can encompass a wide range of biochemical or physiological measurements. Medical tests that measure biomarkers are typically evaluated for measurement validation and clinical performance in the context of their intended use. General statistical principles for the evaluation of medical tests are discussed in this paper in the context of heart failure. Statistical aspects of study design and analysis to be considered while assessing the quality of measurements and the clinical performance of tests are highlighted. A discussion of statistical considerations for specific clinical uses is also provided. The remarks in this paper mainly focus on methods and considerations for statistical evaluation of medical tests from the perspective of bias and precision. With such an evaluation of performance, healthcare professionals could have information that leads to a better understanding on the strengths and limitations of tests related to heart failure.

  17. Multivariate statistical analysis of atom probe tomography data.

    Science.gov (United States)

    Parish, Chad M; Miller, Michael K

    2010-10-01

    The application of spectrum imaging multivariate statistical analysis methods, specifically principal component analysis (PCA), to atom probe tomography (APT) data has been investigated. The mathematical method of analysis is described and the results for two example datasets are analyzed and presented. The first dataset is from the analysis of a PM 2000 Fe-Cr-Al-Ti steel containing two different ultrafine precipitate populations. PCA properly describes the matrix and precipitate phases in a simple and intuitive manner. A second APT example is from the analysis of an irradiated reactor pressure vessel steel. Fine, nm-scale Cu-enriched precipitates having a core-shell structure were identified and qualitatively described by PCA. Advantages, disadvantages, and future prospects for implementing these data analysis methodologies for APT datasets, particularly with regard to quantitative analysis, are also discussed. Copyright 2010 Elsevier B.V. All rights reserved.

  18. Using Multivariate Statistical Analysis for Grouping of State Forest Enterprises

    Directory of Open Access Journals (Sweden)

    Atakan Öztürk

    2010-11-01

    Full Text Available The purpose of this study was to investigate the possible uses of multivariate statistical analysis methods for grouping Forest Enterprises. The study involved 24 Forest Enterprises in the Eastern Black Sea Region. A total of 69 variables, classified as physical, economic, social, rural settlements, technical-managerial, and functional variables, were developed. Multivariate statistics such as factor, cluster and discriminant analyses were used to classify the 24 Forest Enterprises. These enterprises were classified into 2 groups: 22 enterprises were in the first group, while the remaining 2 enterprises were in the second group.

  19. Network similarity and statistical analysis of earthquake seismic data

    Science.gov (United States)

    Deyasi, Krishanu; Chakraborty, Abhijit; Banerjee, Anirban

    2017-09-01

    We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We calculate the conditional probability of the forthcoming occurrences of earthquakes in each region. The conditional probability of each event has been compared with their stationary distribution.
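    As a rough illustration of the distance measure named in the abstract, the sketch below computes the Jensen-Shannon divergence between two discrete distributions. It is a generic textbook definition using base-2 logarithms; how the authors turn graph spectra into distributions is not stated in the abstract, so the input vectors here are purely hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two non-negative weight vectors."""
    total_p, total_q = sum(p), sum(q)
    p = [x / total_p for x in p]  # normalise to probability distributions
    q = [x / total_q for x in q]
    m = [0.5 * (a + b) for a, b in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical histograms standing in for two binned spectra.
identical = js_divergence([1, 2, 3], [1, 2, 3])
disjoint = js_divergence([1, 0], [0, 1])
print(identical, disjoint)
```

    With base-2 logarithms the divergence is bounded between 0 (identical distributions) and 1 (disjoint support), which makes it convenient as input to a hierarchical clustering.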

  20. Explorations in statistics: the analysis of ratios and normalized data.

    Science.gov (United States)

    Curran-Everett, Douglas

    2013-09-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of Explorations in Statistics explores the analysis of ratios and normalized (or standardized) data. As researchers, we compute a ratio (a numerator divided by a denominator) to compute a proportion for some biological response or to derive some standardized variable. In each situation, we want to control for differences in the denominator when the thing we really care about is the numerator. But there is peril lurking in a ratio: only if the relationship between numerator and denominator is a straight line through the origin will the ratio be meaningful. If not, the ratio will misrepresent the true relationship between numerator and denominator. In contrast, regression techniques (these include analysis of covariance) are versatile: they can accommodate an analysis of the relationship between numerator and denominator when a ratio is useless.
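    The point about straight lines through the origin can be checked with a few lines of arithmetic. The sketch below uses made-up slopes, intercepts, and denominator values (none are from the article): when the numerator is proportional to the denominator the ratio is constant, but a nonzero intercept makes the ratio drift with the denominator even though the underlying relationship has not changed.

```python
def response(x, intercept, slope):
    """Hypothetical linear relationship between denominator x and numerator y."""
    return intercept + slope * x


def ratio(numerator, denominator):
    return numerator / denominator


xs = [5.0, 10.0, 20.0]  # illustrative denominator values

# Line through the origin (y = 2x): the ratio is the same for every x.
through_origin = [ratio(response(x, 0.0, 2.0), x) for x in xs]

# Same slope but a nonzero intercept (y = 10 + 2x): the ratio depends on x.
offset_line = [ratio(response(x, 10.0, 2.0), x) for x in xs]

print(through_origin)
print(offset_line)
```

    In the second case a comparison of ratios would suggest a difference between subjects that is only a consequence of their different denominators, which is the situation where a regression on the denominator is the safer choice.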

  1. Statistical analysis and interpolation of compositional data in materials science.

    Science.gov (United States)

    Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M

    2015-02-09

    Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
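    A small sketch can make the constant-sum constraint and the invariance requirement concrete. It assumes the standard log-ratio approach to compositional data (closure, pairwise log-ratios, and the centered log-ratio transform); the abstract does not say which transforms the authors use, and the concentrations below are invented for illustration.

```python
import math


def closure(parts):
    """Rescale non-negative parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def log_ratio(parts, i, j):
    """Log-ratio of two parts; unchanged by rescaling or by dropping other parts."""
    return math.log(parts[i] / parts[j])


def clr(parts):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in closure(parts)]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


full = closure([2.0, 3.0, 5.0])   # hypothetical three-element composition
sub = closure(full[:2])           # subcomposition: third element not measured

lr_full = log_ratio(full, 0, 1)
lr_sub = log_ratio(sub, 0, 1)
clr_full = clr(full)
print(lr_full, lr_sub, clr_full)
```

    The raw fractions of the first two elements change when the third is dropped (0.2 and 0.3 become 0.4 and 0.6), while their log-ratio does not, which is the kind of invariance the abstract asks of a physically meaningful analysis.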

  2. Feature-Based Statistical Analysis of Combustion Simulation Data

    Energy Technology Data Exchange (ETDEWEB)

    Bennett, J; Krishnamoorthy, V; Liu, S; Grout, R; Hawkes, E; Chen, J; Pascucci, V; Bremer, P T

    2011-11-18

    We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. 
    We highlight the utility of this new framework for combustion

  3. SPA- STATISTICAL PACKAGE FOR TIME AND FREQUENCY DOMAIN ANALYSIS

    Science.gov (United States)

    Brownlow, J. D.

    1994-01-01

    The need for statistical analysis often arises when data is in the form of a time series. This type of data is usually a collection of numerical observations made at specified time intervals. Two kinds of analysis may be performed on the data. First, the time series may be treated as a set of independent observations using a time domain analysis to derive the usual statistical properties including the mean, variance, and distribution form. Secondly, the order and time intervals of the observations may be used in a frequency domain analysis to examine the time series for periodicities. In almost all practical applications, the collected data is actually a mixture of the desired signal and a noise signal which is collected over a finite time period with a finite precision. Therefore, any statistical calculations and analyses are actually estimates. The Spectrum Analysis (SPA) program was developed to perform a wide range of statistical estimation functions. SPA can provide the data analyst with a rigorous tool for performing time and frequency domain studies. In a time domain statistical analysis the SPA program will compute the mean, variance, standard deviation, mean square, and root mean square. It also lists the data maximum, data minimum, and the number of observations included in the sample. In addition, a histogram of the time domain data is generated, a normal curve is fit to the histogram, and a goodness-of-fit test is performed. These time domain calculations may be performed on both raw and filtered data. For a frequency domain statistical analysis the SPA program computes the power spectrum, cross spectrum, coherence, phase angle, amplitude ratio, and transfer function. The estimates of the frequency domain parameters may be smoothed with the use of Hann-Tukey, Hamming, Bartlett, or moving average windows. Various digital filters are available to isolate data frequency components. Frequency components with periods longer than the data collection interval
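
    The two kinds of analysis listed above can be sketched in a few lines of NumPy. This is not the SPA program or its interface; the signal, sampling rate, and variable names are assumptions chosen for illustration, and the spectrum is a plain Hamming-windowed periodogram.

```python
import numpy as np

# Hypothetical signal: a 5 Hz sine sampled at 100 Hz for 2 s, plus noise.
rng = np.random.default_rng(0)
fs = 100.0
t = np.arange(200) / fs
x = np.sin(2 * np.pi * 5.0 * t) + 0.1 * rng.standard_normal(t.size)

# Time-domain summary statistics.
stats = {
    "n": x.size,
    "mean": x.mean(),
    "variance": x.var(ddof=1),
    "std": x.std(ddof=1),
    "mean_square": np.mean(x**2),
    "rms": np.sqrt(np.mean(x**2)),
    "min": x.min(),
    "max": x.max(),
}

# Frequency-domain estimate: periodogram of the de-meaned data
# after applying a Hamming window.
window = np.hamming(x.size)
spectrum = np.fft.rfft((x - x.mean()) * window)
power = np.abs(spectrum) ** 2 / np.sum(window**2)
freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
peak_freq = freqs[np.argmax(power)]

print(stats)
print("dominant frequency (Hz):", peak_freq)
```

    Because the record is finite and noisy, every number printed here is an estimate, which is the point the abstract makes about time series data in general.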

  4. Statistical evaluation of Pacific Northwest Residential Energy Consumption Survey weather data

    Energy Technology Data Exchange (ETDEWEB)

    Tawil, J.J.

    1986-02-01

    This report addresses an issue relating to energy consumption and conservation in the residential sector. BPA has obtained two meteorological data bases for use with its 1983 Pacific Northwest Residential Energy Survey (PNWRES). One data base consists of temperature data from weather stations; these have been aggregated to form a second data base that covers the National Oceanic and Atmospheric Administration (NOAA) climatic divisions. At BPA's request, Pacific Northwest Laboratory has produced a household energy use model for both electricity and natural gas in order to determine whether the statistically estimated parameters of the model significantly differ when the two different meteorological data bases are used.

  5. Understanding of statistical terms routinely used in meta-analyses: an international survey among researchers.

    Science.gov (United States)

    Mavros, Michael N; Alexiou, Vangelis G; Vardakas, Konstantinos Z; Falagas, Matthew E

    2013-01-01

    Biomedical literature is increasingly enriched with literature reviews and meta-analyses. We sought to assess the understanding of statistical terms routinely used in such studies, among researchers. An online survey posing 4 clinically-oriented multiple-choice questions was conducted in an international sample of randomly selected corresponding authors of articles indexed by PubMed. A total of 315 unique complete forms were analyzed (participation rate 39.4%), mostly from Europe (48%), North America (31%), and Asia/Pacific (17%). Only 10.5% of the participants correctly answered all 4 "interpretation" questions while 9.2% answered all questions incorrectly. Regarding each question, 51.1%, 71.4%, and 40.6% of the participants correctly interpreted statistical significance of a given odds ratio, risk ratio, and weighted mean difference with 95% confidence intervals respectively, while 43.5% correctly replied that no statistical model can adjust for clinical heterogeneity. Clinicians had more correct answers than non-clinicians (mean score ± standard deviation: 2.27±1.06 versus 1.83±1.14, p [...] researchers, randomly selected from a diverse international sample of biomedical scientists, misinterpreted statistical terms commonly reported in meta-analyses. Authors could be prompted to explicitly interpret their findings to prevent misunderstandings and readers are encouraged to keep up with basic biostatistics.
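
    The kind of interpretation the survey questions asked for can be sketched as follows. The intervals are invented for illustration and do not come from the study; the helper name `excludes_null` is an assumption. The rule shown is the usual one: a 95% confidence interval indicates significance at the two-sided 5% level when it excludes the null value, which is 1 for ratio measures and 0 for difference measures.

```python
def excludes_null(lower, upper, null_value):
    """Return True if the confidence interval [lower, upper] excludes the null value."""
    return not (lower <= null_value <= upper)

# Hypothetical 95% confidence intervals, for illustration only.
odds_ratio_ci = (1.10, 2.40)   # ratio measure: null value is 1
risk_ratio_ci = (0.85, 1.30)   # ratio measure: null value is 1
mean_diff_ci = (-3.2, -0.4)    # difference measure: null value is 0

print("odds ratio significant:     ", excludes_null(*odds_ratio_ci, 1.0))
print("risk ratio significant:     ", excludes_null(*risk_ratio_ci, 1.0))
print("mean difference significant:", excludes_null(*mean_diff_ci, 0.0))
```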

  6. An Experience of Statistical Method Application in Forest Survey at Angara River Region in 1932

    Directory of Open Access Journals (Sweden)

    L. N. Vashchuk

    2014-10-01

    Full Text Available A report of the Angara forest economic expedition, which carried out a forest economic survey in 1932 on the left bank of the Angara River, has been found. The survey covered parts of Krasnoyarsk Territory and Irkutsk Region, a total area of 18641.8 thousand ha. The report describes the forest inventory technology and results that have not previously been published. The survey was conducted by a statistical method: a sample was taken by continuous enumeration of trees on sample plots (SP) arranged systematically across the area, followed by mathematical-statistical extrapolation of the sample results to the entire survey area. To do this, sight lines (sights) were cut in the latitudinal direction 16 km apart. Along the cut sights, a 0.1 ha (10 × 100 m) SP was established every 2 km. In total, 32 forest inventory sights were cut, with a total length of 9931 km, incorporating 4817 SP. The accuracy with which forest resource characteristics could be determined from smaller sample plots was also investigated. For this purpose, within each SP a smaller area of 0.01 ha (10 × 10 m) was delineated, where an independent continuous enumeration of trees was conducted, and sample trees were cut, measured and bucked into assortments to explore the assortment structure of the stand. At each «sample cutting area», all trees of 44 cm DBH and above were felled. On half of the sample plot, a 5 × 10 m area at the eastern end, all trees of 24 cm DBH and above were felled and measured. At every fifth «sample cutting area», all trees of 12 cm DBH and above were cut down and measured. From the results of the work, a detailed description of the forest resources of the whole Angara river basin and of 17 forest exploitation areas was compiled.
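
    The arithmetic behind this sampling design can be worked through in a short sketch. The area, plot size, and plot count are the figures quoted in the abstract; the per-plot volumes are invented, and treating the plots as a simple random sample is a simplifying assumption, since the plots in the report were laid out systematically.

```python
import math
import statistics

# Figures quoted in the abstract.
total_area_ha = 18_641.8 * 1000    # 18641.8 thousand ha
plot_area_ha = 10 * 100 / 10_000   # one 10 m x 100 m plot, in hectares
n_plots = 4817

sampled_area_ha = n_plots * plot_area_ha
sampling_fraction = sampled_area_ha / total_area_ha

# Hypothetical stem volumes (m^3) on five plots; NOT data from the report.
plot_volumes_m3 = [14.2, 9.8, 21.5, 0.0, 17.3]

# Expansion estimate: mean volume per hectare times the total area, with
# the standard error of the mean under simple random sampling (assumed).
per_ha = [v / plot_area_ha for v in plot_volumes_m3]
mean_per_ha = statistics.mean(per_ha)
se_per_ha = statistics.stdev(per_ha) / math.sqrt(len(per_ha))
total_volume_m3 = mean_per_ha * total_area_ha

print(f"sampled area: {sampled_area_ha:.1f} ha")
print(f"sampling fraction: {sampling_fraction:.2e}")
print(f"mean volume: {mean_per_ha:.1f} m^3/ha (SE {se_per_ha:.1f})")
```

    The sampled area is a very small fraction of the territory, which is why the extrapolation step and its accuracy matter so much in a survey of this kind.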

  7. GASPS—A Herschel Survey of Gas and Dust in Protoplanetary Disks: Summary and Initial Statistics

    NARCIS (Netherlands)

    Dent, W. R. F.; Thi, W. F.; Kamp, I.; Williams, J. P.; Menard, F.; Andrews, S.; Ardila, D.; Aresu, G.; Augereau, J. -C.; Barrado y Navascues, D.; Brittain, S.; Carmona, A.; Ciardi, D.; Danchi, W.; Donaldson, J.; Duchene, G.; Eiroa, C.; Fedele, D.; Grady, C.; de Gregorio-Molsalvo, I.; Howard, C.; Huelamo, N.; Krivov, A.; Lebreton, J.; Liseau, R.; Martin-Zaidi, C.; Mathews, G.; Meeus, G.; Mendigutia, I.; Montesinos, B.; Morales-Calderon, M.; Mora, A.; Nomura, H.; Pantin, E.; Pascucci, I.; Phillips, N.; Pinte, C.; Podio, L.; Ramsay, S. K.; Riaz, B.; Riviere-Marichalar, P.; Roberge, A.; Sandell, G.; Solano, E.; Tilling, I.; Torrelles, J. M.; Vandenbusche, B.; Vicente, S.; White, G. J.; Woitke, P.

    We describe a large-scale far-infrared line and continuum survey of protoplanetary disk through to young debris disk systems carried out using the PACS instrument on the Herschel Space Observatory. This Open Time Key program, known as GASPS (Gas Survey of Protoplanetary Systems), targeted ~250 young

  8. Building the Community Online Resource for Statistical Seismicity Analysis (CORSSA)

    Science.gov (United States)

    Michael, A. J.; Wiemer, S.; Zechar, J. D.; Hardebeck, J. L.; Naylor, M.; Zhuang, J.; Steacy, S.; Corssa Executive Committee

    2010-12-01

    Statistical seismology is critical to the understanding of seismicity, the testing of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is a web-based educational platform that is authoritative, up-to-date, prominent, and user-friendly. We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each containing between four and eight articles. The CORSSA web page, www.corssa.org, officially unveiled on September 6, 2010, debuts with an initial set of approximately 10 to 15 articles available online for viewing and commenting with additional articles to be added over the coming months. Each article will be peer-reviewed and will present a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles will include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. A special article will compare and review

  9. CORSSA: Community Online Resource for Statistical Seismicity Analysis

    Science.gov (United States)

    Zechar, J. D.; Hardebeck, J. L.; Michael, A. J.; Naylor, M.; Steacy, S.; Wiemer, S.; Zhuang, J.

    2011-12-01

    Statistical seismology is critical to the understanding of seismicity, the evaluation of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology, especially to those aspects with great impact on public policy, statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA, www.corssa.org). We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each of which will contain between four and eight articles. CORSSA now includes seven articles with an additional six in draft form along with forums for discussion, a glossary, and news about upcoming meetings, special issues, and recent papers. Each article is peer-reviewed and presents a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. We have also begun curating a collection of statistical seismology software packages.

  10. Heliostat mirror survey and analysis

    Energy Technology Data Exchange (ETDEWEB)

    Lind, M.A.; Buckwalter, C.Q.; Daniel, J.L.; Hartman, J.S.; Thomas, M.T.; Pederson, L.R.

    1979-09-01

    The mirrors used on concentrating solar systems must be able to withstand severe and sustained environmental stresses for long periods of time if they are to be economically acceptable. Little is known about how commercially produced wet process silvered second surface mirrors will withstand the test of time in solar applications. Field experience in existing systems has shown that the performance of the reflective surface varies greatly with time and is influenced to a large extent by the construction details of the mirror module. Degradation of the reflective layer has been seen that ranges from non-observable to severe. The exact mechanisms involved in the degradation process are not well understood from either the phenomenological or microanalytical points of view and are thus subject to much debate. The three chapters of this report summarize the work recently performed in three general areas that are key to understanding and ultimately controlling the degradation phenomena. These areas are: a survey of the present commercial mirroring industry, the microanalytical examination of numerous degraded and nondegraded mirrors, and an investigation of several novel techniques that might be used to extend the life of heliostat mirrors. Appendices include: (a) list of mirror manufacturers and (b) recommended specifications for second surface silvered mirrors for central receiver heliostat applications. (WHK)

  11. Statistical Analysis of SAR Sea Clutter for Classification Purposes

    Directory of Open Access Journals (Sweden)

    Jaime Martín-de-Nicolás

    2014-09-01

    Full Text Available Statistical analysis of radar clutter has been one of the topics on which the most effort has been spent in the last few decades. These studies usually focused on finding the statistical models that best fitted the clutter distribution; however, the goal of this work is not the modeling of the clutter, but the study of the suitability of the statistical parameters to carry out a sea state classification. In order to achieve this objective and provide some relevance to this study, an important set of maritime and coastal Synthetic Aperture Radar data is considered. Due to the nature of the acquisition of data by SAR sensors, speckle noise is inherent to these data, and a specific study of how this noise affects the clutter distribution is also performed in this work. For completeness, a thorough study of the most suitable statistical parameters, as well as the most adequate classifier, is carried out, achieving excellent results in terms of classification success rates. These concluding results confirm that a sea state classification is not only viable, but also successful using statistical parameters different from those of the best modeling distribution and applying a speckle filter, which allows a better characterization of the parameters used to distinguish between different sea states.

  12. Wavelet analysis in ecology and epidemiology: impact of statistical tests.

    Science.gov (United States)

    Cazelles, Bernard; Cazelles, Kévin; Chavez, Mario

    2014-02-06

    Wavelet analysis is now frequently used to extract information from ecological and epidemiological time series. Statistical hypothesis tests are conducted on associated wavelet quantities to assess the likelihood that they are due to a random process. Such random processes represent null models and are generally based on synthetic data that share some statistical characteristics with the original time series. This allows the comparison of null statistics with those obtained from original time series. When creating synthetic datasets, different techniques of resampling result in different characteristics shared by the synthetic time series. Therefore, it becomes crucial to consider the impact of the resampling method on the results. We have addressed this point by comparing seven different statistical testing methods applied to different real and simulated data. Our results show that statistical assessment of periodic patterns is strongly affected by the choice of the resampling method, so two different resampling techniques could lead to two different conclusions about the same time series. Moreover, our results clearly show the inadequacy of resampling series generated by white noise and red noise, which are nevertheless the methods currently used in the wide majority of wavelet applications. Our results highlight that the characteristics of a time series, namely its Fourier spectrum and autocorrelation, are important to consider when choosing the resampling technique. Results suggest that data-driven resampling methods, such as the hidden Markov model algorithm and the 'beta-surrogate' method, should be used.
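
    The general shape of such a surrogate-based test can be sketched as below. To stay short, the sketch swaps in two simplifications that are not the authors' method: it tests the largest periodogram ordinate rather than a wavelet quantity, and it uses red-noise (AR(1)) surrogates, which is the common null model the abstract criticizes. The series and function names are assumptions. Replacing `ar1_surrogate` with another generator is exactly the choice the abstract says can change the conclusion.

```python
import numpy as np

def ar1_surrogate(series, rng):
    """Red-noise (AR(1)) surrogate matching lag-1 autocorrelation and variance."""
    x = series - series.mean()
    phi = np.sum(x[1:] * x[:-1]) / np.sum(x * x)   # lag-1 autocorrelation
    sigma = np.sqrt(x.var() * (1.0 - phi**2))      # innovation std deviation
    out = np.empty_like(x)
    out[0] = rng.normal(0.0, np.sqrt(x.var()))
    for i in range(1, x.size):
        out[i] = phi * out[i - 1] + rng.normal(0.0, sigma)
    return out

def peak_power(series):
    """Largest periodogram ordinate, excluding frequency zero."""
    x = series - series.mean()
    return np.max(np.abs(np.fft.rfft(x))[1:] ** 2)

# Hypothetical series: a period-16 oscillation buried in white noise.
rng = np.random.default_rng(1)
t = np.arange(256)
observed = np.sin(2 * np.pi * t / 16) + rng.standard_normal(t.size)

# Null distribution of the statistic from 500 surrogate series.
null = np.array([peak_power(ar1_surrogate(observed, rng)) for _ in range(500)])
p_value = (1 + np.sum(null >= peak_power(observed))) / (1 + null.size)

print("Monte Carlo p-value:", p_value)
```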

  13. Survey of Task Analysis Methods

    Science.gov (United States)

    1978-02-14

    Taylor, for example, referred to task analysis in his work on scientific management (65). In the same time frame, the Gilbreths developed the first...ciation, Washington, D. C., 1965. 21. Gilbreth, F. B. Bricklaying System, M. C. Clark, New York, 1909. 22. Gilbreth, F

  14. Univariate statistical analysis of environmental (compositional) data: problems and possibilities.

    Science.gov (United States)

    Filzmoser, Peter; Hron, Karel; Reimann, Clemens

    2009-11-15

    For almost 30 years it has been known that compositional (closed) data have special geometrical properties. In environmental sciences, where the concentration of chemical elements in different sample materials is investigated, almost all datasets are compositional. In general, compositional data are parts of a whole which only give relative information. Data that sum to a constant, e.g. 100 wt.% or 1,000,000 mg/kg, are the best-known example. It is widely neglected that the "closure" characteristic remains even if only one of all possible elements is measured; it is an inherent property of compositional data. No variable is free to vary independently of all the others. Existing transformations to "open" closed data are seldom applied. They are more complicated than a log transformation and the relationship to the original data unit is lost. Results obtained when using classical statistical techniques for data analysis appeared reasonable and the possible consequences of working with closed data were rarely questioned. Here the simple univariate case of data analysis is investigated. It can be demonstrated that data closure must be overcome prior to calculating even simple statistical measures like mean or standard deviation or plotting graphs of the data distribution, e.g. a histogram. Some measures like the standard deviation (or the variance) make no statistical sense with closed data and all statistical tests building on the standard deviation (or variance) will thus provide erroneous results if used with the original data.

  15. STATISTICAL ANALYSIS OF THE HEAVY NEUTRAL ATOMS MEASURED BY IBEX

    Energy Technology Data Exchange (ETDEWEB)

    Park, Jeewoo; Kucharek, Harald; Möbius, Eberhard [Space Science Center and Department of Physics, University of New Hampshire, 8 College Road, Durham, NH 03824 (United States); Galli, André [Physics Institute, University of Bern, Bern 3012 (Switzerland); Livadiotis, George; Fuselier, Steve A.; McComas, David J., E-mail: jtl29@wildcats.unh.edu [Southwest Research Institute, P.O. Drawer 28510, San Antonio, TX 78228 (United States)

    2015-10-15

    We investigate the directional distribution of heavy neutral atoms in the heliosphere by using heavy neutral maps generated with the IBEX-Lo instrument over three years from 2009 to 2011. The interstellar neutral (ISN) O and Ne gas flow was found in the first-year heavy neutral map at 601 keV and its flow direction and temperature were studied. However, due to the low counting statistics, researchers have not treated the full sky maps in detail. The main goal of this study is to evaluate the statistical significance of each pixel in the heavy neutral maps to get a better understanding of the directional distribution of heavy neutral atoms in the heliosphere. Here, we examine three statistical analysis methods: the signal-to-noise filter, the confidence limit method, and the cluster analysis method. These methods allow us to exclude background from areas where the heavy neutral signal is statistically significant. These methods also allow the consistent detection of heavy neutral atom structures. The main emission feature expands toward lower longitude and higher latitude from the observational peak of the ISN O and Ne gas flow. We call this emission the extended tail. It may be an imprint of the secondary oxygen atoms generated by charge exchange between ISN hydrogen atoms and oxygen ions in the outer heliosheath.
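
    The abstract does not give the details of its signal-to-noise filter, so the following is only a generic sketch of the idea for low-count maps. The counts, the flat background, the threshold, and the Poisson noise model are all assumptions made for illustration and are not values from the paper.

```python
import numpy as np

# Hypothetical counts map (latitude x longitude pixels) and a flat expected
# background per pixel; neither comes from the IBEX data.
counts = np.array([[0, 1, 2],
                   [1, 9, 4],
                   [0, 2, 1]], dtype=float)
background = 1.0

# For Poisson counts the standard deviation is sqrt(N), so a simple
# signal-to-noise ratio is (N - background) / sqrt(N); empty pixels get 0.
safe_counts = np.maximum(counts, 1.0)   # avoids division by zero
snr = np.where(counts > 0, (counts - background) / np.sqrt(safe_counts), 0.0)

threshold = 2.0   # illustrative cut, not the value used in the paper
significant = snr > threshold

print(np.round(snr, 2))
print(significant)
```

    With these made-up numbers only the central pixel passes the cut, which shows how a filter of this kind separates a localized signal from pixels that are consistent with background.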

  16. Applied statistical training to strengthen analysis and health research capacity in Rwanda.

    Science.gov (United States)

    Thomson, Dana R; Semakula, Muhammed; Hirschhorn, Lisa R; Murray, Megan; Ndahindwa, Vedaste; Manzi, Anatole; Mukabutera, Assumpta; Karema, Corine; Condo, Jeanine; Hedt-Gauthier, Bethany

    2016-09-29

    To guide efficient investment of limited health resources in sub-Saharan Africa, local researchers need to be involved in, and guide, health system and policy research. While extensive survey and census data are available to health researchers and program officers in resource-limited countries, local involvement and leadership in research is limited due to inadequate experience, lack of dedicated research time and weak interagency connections, among other challenges. Many research-strengthening initiatives host prolonged fellowships out-of-country, yet their approaches have not been evaluated for effectiveness in involvement and development of local leadership in research. We developed, implemented and evaluated a multi-month, deliverable-driven, survey analysis training based in Rwanda to strengthen skills of five local research leaders, 15 statisticians, and a PhD candidate. Research leaders applied with a specific research question relevant to country challenges and committed to leading an analysis to publication. Statisticians with prerequisite statistical training and experience with a statistical software applied to participate in class-based trainings and complete an assigned analysis. Both statisticians and research leaders were provided ongoing in-country mentoring for analysis and manuscript writing. Participants reported a high level of skill, knowledge and collaborator development from class-based trainings and out-of-class mentorship that were sustained 1 year later. Five of six manuscripts were authored by multi-institution teams and submitted to international peer-reviewed scientific journals, and three-quarters of the participants mentored others in survey data analysis or conducted an additional survey analysis in the year following the training. Our model was effective in utilizing existing survey data and strengthening skills among full-time working professionals without disrupting ongoing work commitments and using few resources. 
    Critical to our

  17. GNSS Spoofing Detection Based on Signal Power Measurements: Statistical Analysis

    Directory of Open Access Journals (Sweden)

    V. Dehghanian

    2012-01-01

    Full Text Available A threat to GNSS receivers is posed by a spoofing transmitter that emulates authentic signals but with randomized code phase and Doppler values over a small range. Such spoofing signals can result in large navigational solution errors that are passed onto the unsuspecting user with potentially dire consequences. An effective spoofing detection technique based on signal power measurements is developed in this paper; it can be readily applied to present consumer grade GNSS receivers with minimal firmware changes. An extensive statistical analysis is carried out based on formulating a multihypothesis detection problem. Expressions are developed to devise a set of thresholds required for signal detection and identification. The detection processing methods developed are further manipulated to exploit incidental antenna motion arising from user interaction with a GNSS handheld receiver to further enhance the detection performance of the proposed algorithm. The statistical analysis supports the effectiveness of the proposed spoofing detection technique under various multipath conditions.

  18. Statistics in experimental design, preprocessing, and analysis of proteomics data.

    Science.gov (United States)

    Jung, Klaus

    2011-01-01

    High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), usually yield high-dimensional data sets of expression values for hundreds or thousands of proteins which are, however, observed on only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to obtain tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated. In particular, focus is put on studies for the detection of differentially regulated proteins. Furthermore, issues of sample size planning, statistical analysis of expression levels as well as methods for data preprocessing are covered.

  19. Complex surveys analysis of categorical data

    CERN Document Server

    Mukhopadhyay, Parimal

    2016-01-01

    The primary objective of this book is to study some of the research topics in the area of analysis of complex surveys which have not been covered in any book yet. It discusses the analysis of categorical data using three models: a full model, a log-linear model and a logistic regression model. It is a valuable resource for survey statisticians and practitioners in the field of sociology, biology, economics, psychology and other areas who have to use these procedures in their day-to-day work. It is also useful for courses on sampling and complex surveys at the upper-undergraduate and graduate levels. The importance of sample surveys today cannot be overstated. From voters’ behaviour to fields such as industry, agriculture, economics, sociology, psychology, investigators generally resort to survey sampling to obtain an assessment of the behaviour of the population they are interested in. Many large-scale sample surveys collect data using complex survey designs like multistage stratified cluster designs. The o...

  20. Lifetime statistics of quantum chaos studied by a multiscale analysis

    KAUST Repository

    Di Falco, A.

    2012-04-30

    In a series of pump and probe experiments, we study the lifetime statistics of a quantum chaotic resonator when the number of open channels is greater than one. Our design embeds a stadium billiard into a two dimensional photonic crystal realized on a silicon-on-insulator substrate. We calculate resonances through a multiscale procedure that combines energy landscape analysis and wavelet transforms. Experimental data is found to follow the universal predictions arising from random matrix theory with an excellent level of agreement.

  1. Statistical Analysis of the Exchange Rate of Bitcoin.

    Science.gov (United States)

    Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen

    2015-01-01

    Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
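
    The workflow described in this abstract (compute log-returns, fit candidate distributions, compare the fits) can be sketched as follows. The prices are invented, only two simple candidates (normal and Laplace) stand in for the fifteen distributions of the paper, and the use of AIC as the comparison criterion is an assumption, since the abstract does not say how the best fit was judged.

```python
import numpy as np

# Hypothetical daily exchange rates (USD per bitcoin); NOT real market data.
prices = np.array([230.0, 235.5, 228.1, 240.3, 238.7, 251.2, 249.9, 262.4])

# Log-returns: r_t = ln(P_t / P_{t-1}).
log_returns = np.diff(np.log(prices))

def normal_loglik(r):
    """Maximised log-likelihood of a normal distribution fitted to r."""
    sigma2 = r.var()   # maximum-likelihood variance (ddof=0)
    return -0.5 * r.size * (np.log(2 * np.pi * sigma2) + 1.0)

def laplace_loglik(r):
    """Maximised log-likelihood of a Laplace distribution fitted to r."""
    scale = np.mean(np.abs(r - np.median(r)))   # ML scale about the median
    return -r.size * (np.log(2 * scale) + 1.0)

def aic(loglik, n_params):
    """Akaike information criterion; smaller is better."""
    return 2 * n_params - 2 * loglik

fits = {
    "normal": aic(normal_loglik(log_returns), 2),
    "laplace": aic(laplace_loglik(log_returns), 2),
}
best = min(fits, key=fits.get)

print(np.round(log_returns, 4))
print(fits, "->", best)
```

    With so few made-up observations the ranking itself carries no meaning; the sketch only shows the mechanics of comparing fitted distributions on the same log-returns.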

  2. Statistical Analysis of the Exchange Rate of Bitcoin.

    Directory of Open Access Journals (Sweden)

    Jeffrey Chu

    Full Text Available Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.

  3. Statistical Analysis of the Exchange Rate of Bitcoin

    Science.gov (United States)

    Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen

    2015-01-01

    Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate. PMID:26222702

  4. Statistical analysis of the students' behavior in algebra

    OpenAIRE

    Bisson, Gilles; Bronner, Alain; Gordon, Mirta; Nicaud, Jean-François; Renaudie, David

    2003-01-01

    We present an analysis of behaviors of students solving algebra exercises with the Aplusix software. We built a set of statistics from the protocols (records of the interactions between the student and the software) in order to evaluate the correctness of the calculation steps and of the corresponding solutions. We have particularly studied the activities of college students (sixteen and seventeen years old). This study emphasizes the didactic variables which are relevant for the types of sel...

  5. Development of statistical analysis for single dose bronchodilators.

    Science.gov (United States)

    Salsburg, D

    1981-12-01

    When measurements developed for the diagnosis of patients are used to detect treatment effects in clinical trials with chronic disease, problems in definition of response and in the statistical distributions of those measurements within patients have to be resolved before the results of clinical studies can be analyzed. An example of this process is shown in the development of the analysis of single-dose bronchodilator trials.

  6. Lifetime statistics of quantum chaos studied by a multiscale analysis

    Science.gov (United States)

    Di Falco, A.; Krauss, T. F.; Fratalocchi, A.

    2012-04-01

    In a series of pump and probe experiments, we study the lifetime statistics of a quantum chaotic resonator when the number of open channels is greater than one. Our design embeds a stadium billiard into a two dimensional photonic crystal realized on a silicon-on-insulator substrate. We calculate resonances through a multiscale procedure that combines energy landscape analysis and wavelet transforms. Experimental data is found to follow the universal predictions arising from random matrix theory with an excellent level of agreement.

  7. Statistical Challenges of Big Data Analysis in Medicine

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2015-01-01

    Vol. 3, No. 1 (2015), pp. 24-27. ISSN 1805-8698. R&D Projects: GA ČR GA13-23940S. Grant - others: CESNET Development Fund (CZ) 494/2013. Institutional support: RVO:67985807. Keywords: big data, variable selection, classification, cluster analysis. Subject RIV: BB - Applied Statistics, Operational Research. http://www.ijbh.org/ijbh2015-1.pdf

  8. Statistical and machine learning approaches for network analysis

    CERN Document Server

    Dehmer, Matthias

    2012-01-01

    Explore the multidisciplinary nature of complex networks through machine learning techniques. Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph measures for classification. By providing different approaches based on experimental data, the book uniquely sets itself apart from the current literature by exploring the application of machine learning techniques to various types of complex networks. Comprised of chapters written by internation

  9. A statistical analysis of human lymphocyte transformation data.

    Science.gov (United States)

    Harina, B M; Gill, T J; Rabin, B S; Taylor, F H

    1979-06-01

    The lymphocytes from 107 maternal-foetal pairs were examined for their in vitro responsiveness, as determined by the incorporation of tritiated thymidine following stimulation with phytohaemagglutinin (PHA), candida, varicella, mumps, streptokinase-streptodornase (SKSD) and tetanus toxoid. The data were collected and analysed in two sequential groups (forty-seven and sixty) in order to determine whether the results were reproducible. The variable chosen for analysis was the difference (d) between the square roots of the isotope incorporation in the stimulated and control cultures because it gave the most symmetrical distribution of the data. The experimental error in the determination of maternal lymphocyte stimulation was 1.4-8.6% and of the foetal lymphocytes, 1.0-16.6%, depending upon the antigen or mitogen and its concentration. The data in the two sets of patients were statistically the same in forty-eight of the fifty-six analyses (fourteen antigen or mitogen concentrations in autologous and AB plasma for maternal and foetal lymphocytes). The statistical limits of the distribution of responses for stimulation or suppression were set by an analysis of variance taking two standard deviations from the mean as the limits. When these limits were translated into stimulation indices, they varied for each antigen or mitogen and for different concentrations of the same antigen. Thus, a detailed statistical analysis of a large volume of lymphocyte transformation data indicates that the technique is reproducible and offers a reliable method for determining when significant differences from control values are present.
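
    The response variable and the two-standard-deviation limits described in this abstract can be illustrated with a short sketch. It is an assumption-laden simplification: the counts are hypothetical, and the limits are computed from a plain sample standard deviation rather than from the analysis of variance used in the study.

```python
import math
import statistics


def sqrt_difference(stimulated, control):
    """d = sqrt(incorporation in stimulated culture) - sqrt(incorporation in control)."""
    return math.sqrt(stimulated) - math.sqrt(control)


def two_sd_limits(values):
    """Lower and upper limits at two sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean - 2 * sd, mean + 2 * sd


# Hypothetical (stimulated, control) isotope counts, for illustration only.
cultures = [(8100, 900), (6400, 1600), (10000, 2500), (4900, 400)]
d_values = [sqrt_difference(s, c) for s, c in cultures]
lower, upper = two_sd_limits(d_values)

# A response outside [lower, upper] would be flagged as stimulation or suppression.
flags = [d < lower or d > upper for d in d_values]
```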

  10. Fitting statistical distributions to sea duck count data: implications for survey design and abundance estimation

    Science.gov (United States)

    Zipkin, Elise F.; Leirness, Jeffery B.; Kinlan, Brian P.; O'Connell, Allan F.; Silverman, Emily D.

    2014-01-01

    Determining appropriate statistical distributions for modeling animal count data is important for accurate estimation of abundance, distribution, and trends. In the case of sea ducks along the U.S. Atlantic coast, managers want to estimate local and regional abundance to detect and track population declines, to define areas of high and low use, and to predict the impact of future habitat change on populations. In this paper, we used a modified marked point process to model survey data that recorded flock sizes of Common eiders, Long-tailed ducks, and Black, Surf, and White-winged scoters. The data come from an experimental aerial survey, conducted by the United States Fish & Wildlife Service (USFWS) Division of Migratory Bird Management, during which east-west transects were flown along the Atlantic Coast from Maine to Florida during the winters of 2009–2011. To model the number of flocks per transect (the points), we compared the fit of four statistical distributions (zero-inflated Poisson, zero-inflated geometric, zero-inflated negative binomial and negative binomial) to data on the number of species-specific sea duck flocks that were recorded for each transect flown. To model the flock sizes (the marks), we compared the fit of flock size data for each species to seven statistical distributions: positive Poisson, positive negative binomial, positive geometric, logarithmic, discretized lognormal, zeta and Yule–Simon. Akaike’s Information Criterion and Vuong’s closeness tests indicated that the negative binomial and discretized lognormal were the best distributions for all species for the points and marks, respectively. These findings have important implications for estimating sea duck abundances as the discretized lognormal is a more skewed distribution than the Poisson and negative binomial, which are frequently used to model avian counts; the lognormal is also less heavy-tailed than the power law distributions (e.g., zeta and Yule–Simon), which are

  11. SAS and R: data management, statistical analysis, and graphics

    CERN Document Server

    Kleinman, Ken

    2009-01-01

    An All-in-One Resource for Using SAS and R to Carry out Common Tasks. Provides a path between languages that is easier than reading complete documentation. SAS and R: Data Management, Statistical Analysis, and Graphics presents an easy way to learn how to perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferential procedures, regression analysis, and the creation of graphics, along with more complex applicat

  12. Institutions Function and Failure Statistic and Analysis of Wind Turbine

    Science.gov (United States)

    Yang, Ma; Chengbing, He; Xinxin, Feng

    Recently, as the installed capacity of wind turbines has increased continuously, research on the operation, reliability, maintenance and repair of wind power has become a key point. Failure analysis can support the operation of wind plants, the management of spare components and accessories, and the maintenance and repair of wind turbines. In this paper, from the viewpoint of the structure and function of wind plants, we compile statistics on and analyze the common faults of each part of the plant, then identify the fault patterns, fault causes and fault effects, and put forward corresponding measures.

  13. Statistical Analysis of Hypercalcaemia Data related to Transferability

    DEFF Research Database (Denmark)

    Frølich, Anne; Nielsen, Bo Friis

    2005-01-01

    In this report we describe statistical analysis related to a study of hypercalcaemia carried out in the Copenhagen area in the ten year period from 1984 to 1994. Results from the study have previously been published in a number of papers [3, 4, 5, 6, 7, 8, 9] and in various abstracts and posters...... at conferences during the late eighties and early nineties. In this report we give a more detailed description of many of the analyses and provide some new results primarily by simultaneous studies of several databases....

  14. Health and human rights: a statistical measurement framework using household survey data in Uganda.

    Science.gov (United States)

    Wesonga, Ronald; Owino, Abraham; Ssekiboobo, Agnes; Atuhaire, Leonard; Jehopio, Peter

    2015-05-03

    Health is intertwined with human rights as is clearly reflected in the right to life. Promotion of health practices in the context of human rights can be accomplished if there is a better understanding of the level of human rights observance. In this paper, we evaluate and present an appraisal of the possibility of applying household surveys to study the determinants of health and human rights and also derive the probability that human rights are observed, an important ingredient of the national planning framework. Data from the Uganda National Governance Baseline Survey were used. A conceptual framework for predictors of a hybrid dependent variable was developed and both bivariate and multivariate statistical techniques employed. Multivariate post-estimation computations were derived after evaluations of the significance of coefficients of health and human rights predictors. Findings show that household characteristics of respondents considered in this study were statistically significant (p ... human rights observance. For example, a unit increase of respondents' schooling levels results in an increase of about 34% level of positively assessing human rights observance. Additionally, the study establishes, through the three models presented, that household assessment of health and human rights observance was 20% which also represents how much of the entire continuum of human rights is demanded. Findings propose important evidence for monitoring and evaluation of health in the context of human rights using household survey data. They provide a benchmark for health and human rights assessments with a focus on international and national development plans to achieve socio-economic transformation and health in society.

  15. Current trends of liposuction in India: Survey and Analysis.

    Science.gov (United States)

    Methil, Bijoy

    2015-01-01

    Liposuction is the commonest aesthetic procedure performed by Indian plastic surgeons. However, there exists substantial disparity amongst Indian surgeons about guidelines concerning liposuction. To address this disparity, a nationwide email survey (Association of Plastic Surgeons of India [APSI] database) was started in December 2013 and continued for 5 months. The survey was developed with software from www.fluidsurveys.com. The study was designed to cover most aspects of patient selection, perioperative management, technical considerations, postoperative management and complications. This is the first survey to be conducted in India for an extremely popular procedure. It is also one of the most exhaustive surveys that have been conducted in terms of the topics covered. One hundred and eighteen surgeons (including a majority of the cosmetic surgery stalwarts in the country) completed the survey. As expected, the results show a disparity in most parameters but also consolidation on some issues. Liposuction is considered extremely safe (86.1%). The majority of surgeons (70.3%) aspirated >5 L at one time. The majority (80.2%) felt that the limits for liposuction should be relative and not absolute. The survey highlights lack of standardization with respect to infiltration solutions. The commonest complications observed were contour irregularities, followed by seroma and inadequate skin redrape. The amount of aspirate is the only factor that achieves statistical significance with respect to major complications. A review of the current evidence and recommendations has been incorporated, along with an in-depth analysis of the survey.

  16. Statistical Analysis of 30 Years Rainfall Data: A Case Study

    Science.gov (United States)

    Arvind, G.; Ashok Kumar, P.; Girish Karthi, S.; Suribabu, C. R.

    2017-07-01

    Rainfall is a prime input for various engineering designs such as hydraulic structures, bridges and culverts, canals, storm water sewers and road drainage systems. A detailed statistical analysis of each region is essential to estimate the relevant input value for the design and analysis of engineering structures and also for crop planning. A rain gauge station located in Trichy district, where agriculture is the prime occupation, is selected for statistical analysis. The daily rainfall data for a period of 30 years are used to understand normal rainfall, deficit rainfall, excess rainfall and seasonal rainfall of the selected circle headquarters. Further, various available plotting position formulae are used to evaluate the return period of monthly, seasonal and annual rainfall. This analysis will provide useful information for water resources planners, farmers and urban engineers to assess the availability of water and create storage accordingly. The mean, standard deviation and coefficient of variation of monthly and annual rainfall were calculated to check the rainfall variability. From the calculated results, the rainfall pattern is found to be erratic. The best-fit probability distribution was identified based on the minimum deviation between actual and estimated values. The scientific results and the analysis paved the way to determine the proper onset and withdrawal of monsoon, results which were used for land preparation and sowing.
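
    The descriptive statistics and the return-period calculation mentioned in this abstract can be sketched as below. The annual totals are hypothetical, and the Weibull plotting position T = (n + 1) / m is used as one well-known example; the abstract does not name the specific plotting position formulae that were compared.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)


def weibull_return_periods(values):
    """Pair each value with its return period T = (n + 1) / m.

    m is the rank of the value when sorted in descending order (largest = 1).
    """
    n = len(values)
    ranked = sorted(values, reverse=True)
    return [(value, (n + 1) / rank) for rank, value in enumerate(ranked, start=1)]


# Hypothetical annual rainfall totals in mm, for illustration only.
annual_rainfall = [812.0, 640.5, 1020.3, 455.8, 903.1, 770.4]
mean_mm = statistics.fmean(annual_rainfall)
sd_mm = statistics.stdev(annual_rainfall)
cv_percent = coefficient_of_variation(annual_rainfall)
return_periods = weibull_return_periods(annual_rainfall)
```

    A large coefficient of variation is the kind of evidence behind a statement that the rainfall pattern is erratic.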

  17. Validation of statistical models for creep rupture by parametric analysis

    Energy Technology Data Exchange (ETDEWEB)

    Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)]

    2012-01-15

    Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. Highlights: ► The paper discusses the validation of creep rupture models derived from statistical analysis. ► It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. ► The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. ► The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).

  18. HistFitter: a flexible framework for statistical data analysis

    CERN Document Server

    Besjes, G J; Côté, D; Koutsman, A; Lorenz, J M; Short, D

    2015-01-01

    HistFitter is a software framework for statistical data analysis that has been used extensively in the ATLAS Collaboration to analyze data of proton-proton collisions produced by the Large Hadron Collider at CERN. Most notably, HistFitter has become a de-facto standard in searches for supersymmetric particles since 2012, with some usage for Exotic and Higgs boson physics. HistFitter coherently combines several statistics tools in a programmable and flexible framework that is capable of bookkeeping hundreds of data models under study using thousands of generated input histograms. HistFitter interfaces with the statistics tools HistFactory and RooStats to construct parametric models and to perform statistical tests of the data, and extends these tools in four key areas. The key innovations are to weave the concepts of control, validation and signal regions into the very fabric of HistFitter, and to treat these with rigorous methods. Multiple tools to visualize and interpret the results through a simple configura...

  19. Multivariate statistical analysis: a high-dimensional approach

    CERN Document Server

    Serdobolskii, V

    2000-01-01

    In the last few decades the accumulation of large amounts of information in numerous applications has stimulated an increased interest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a deficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimating the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivariate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution, depending on the data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommen...

  20. The bivariate statistical analysis of environmental (compositional) data.

    Science.gov (United States)

    Filzmoser, Peter; Hron, Karel; Reimann, Clemens

    2010-09-01

    Environmental sciences usually deal with compositional (closed) data. Whenever the concentration of chemical elements is measured, the data will be closed, i.e. the relevant information is contained in the ratios between the variables rather than in the data values reported for the variables. Data closure has severe consequences for statistical data analysis. Most classical statistical methods are based on the usual Euclidean geometry - compositional data, however, do not plot into Euclidean space because they have their own geometry which is not linear but curved in the Euclidean sense. This has severe consequences for bivariate statistical analysis: correlation coefficients computed in the traditional way are likely to be misleading, and the information contained in scatterplots must be used and interpreted differently from sets of non-compositional data. As a solution, the ilr transformation applied to a variable pair can be used to display the relationship and to compute a measure of stability. This paper discusses how this measure is related to the usual correlation coefficient and how it can be used and interpreted. Moreover, recommendations are provided for how the scatterplot can still be used, and which alternatives exist for displaying the relationship between two variables. Copyright 2010 Elsevier B.V. All rights reserved.
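
    The effect of closure and the idea of an isometric log-ratio (ilr) coordinate can be illustrated for the simplest case of a two-part composition, where the single ilr coordinate is ln(x1 / x2) / sqrt(2). This sketch is an assumption-based illustration only: the concentrations are hypothetical, and the measure of stability proposed in the paper is not reproduced here.

```python
import math
import statistics


def ilr_pair(x1, x2):
    """ilr coordinate of a two-part composition (x1, x2) with positive parts."""
    return math.log(x1 / x2) / math.sqrt(2)


# Hypothetical concentrations of two elements in five samples (any common unit).
element_a = [12.0, 15.5, 9.8, 20.1, 14.3]
element_b = [3.1, 4.0, 2.2, 5.6, 3.9]

z = [ilr_pair(a, b) for a, b in zip(element_a, element_b)]

# Only the ratio matters: rescaling both parts (for example mg/kg to percent)
# leaves the coordinate unchanged.
rescaled = [ilr_pair(a / 1e4, b / 1e4) for a, b in zip(element_a, element_b)]

# A small variance of z means the ratio of the two parts is nearly constant.
ratio_variance = statistics.variance(z)
```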

  1. Statistical Analysis of Sport Movement Observations: the Case of Orienteering

    Science.gov (United States)

    Amouzandeh, K.; Karimipour, F.

    2017-09-01

    Study of movement observations is becoming more popular in several applications. Particularly, analyzing sport movement time series has been considered as a demanding area. However, most of the attempts made on analyzing movement sport data have focused on spatial aspects of movement to extract some movement characteristics, such as spatial patterns and similarities. This paper proposes statistical analysis of sport movement observations, which refers to analyzing changes in the spatial movement attributes (e.g. distance, altitude and slope) and non-spatial movement attributes (e.g. speed and heart rate) of athletes. As the case study, an example dataset of movement observations acquired during the "orienteering" sport is presented and statistically analyzed.

  2. The NIRS Analysis Package: noise reduction and statistical inference.

    Directory of Open Access Journals (Sweden)

    Tomer Fekete

    Full Text Available Near infrared spectroscopy (NIRS) is a non-invasive optical imaging technique that can be used to measure cortical hemodynamic responses to specific stimuli or tasks. While analyses of NIRS data are normally adapted from established fMRI techniques, there are nevertheless substantial differences between the two modalities. Here, we investigate the impact of NIRS-specific noise, e.g., systemic (physiological), motion-related artifacts, and serial autocorrelations, upon the validity of statistical inference within the framework of the general linear model. We present a comprehensive framework for noise reduction and statistical inference, which is custom-tailored to the noise characteristics of NIRS. These methods have been implemented in a public domain Matlab toolbox, the NIRS Analysis Package (NAP). Finally, we validate NAP using both simulated and actual data, showing marked improvement in the detection power and reliability of NIRS.

  3. Statistical Analysis of Radio Propagation Channel in Ruins Environment

    Directory of Open Access Journals (Sweden)

    Jiao He

    2015-01-01

    Full Text Available Cellphone-based localization systems for search and rescue in complex high-density ruins have attracted great interest in recent years, and the radio channel characteristics are critical for the design and development of such a system. This paper presents a spatial smoothing estimation via rotational invariance technique (SS-ESPRIT) for radio channel characterization of high-density ruins. The radio propagation at three typical mobile communication bands (0.9, 1.8, and 2 GHz) is investigated in two different scenarios. Channel parameters, such as arrival time, delays, and complex amplitudes, are statistically analyzed. Furthermore, a channel simulator is built based on these statistics. A comparison of average excess delay and delay spread shows good agreement between the measurements and the channel modeling results.
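
    The two validation quantities named in this abstract, average excess delay and delay spread, are commonly computed from a power delay profile as power-weighted moments of the path delays. The sketch below uses those standard textbook definitions with a hypothetical profile; the abstract does not state the exact definitions used in the paper.

```python
import math


def mean_excess_delay(delays, powers):
    """Power-weighted mean of the path delays."""
    total = sum(powers)
    return sum(p * t for t, p in zip(delays, powers)) / total


def rms_delay_spread(delays, powers):
    """Square root of the power-weighted variance of the path delays."""
    total = sum(powers)
    mean = mean_excess_delay(delays, powers)
    second_moment = sum(p * t * t for t, p in zip(delays, powers)) / total
    return math.sqrt(max(second_moment - mean * mean, 0.0))


# Hypothetical power delay profile: excess delays in ns, linear path powers.
delays_ns = [0.0, 50.0, 120.0, 300.0]
powers = [1.0, 0.5, 0.25, 0.05]

tau_mean = mean_excess_delay(delays_ns, powers)
tau_rms = rms_delay_spread(delays_ns, powers)
```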

  4. STATISTICAL ANALYSIS OF SPORT MOVEMENT OBSERVATIONS: THE CASE OF ORIENTEERING

    Directory of Open Access Journals (Sweden)

    K. Amouzandeh

    2017-09-01

    Full Text Available Study of movement observations is becoming more popular in several applications. Particularly, analyzing sport movement time series has been considered as a demanding area. However, most of the attempts made on analyzing movement sport data have focused on spatial aspects of movement to extract some movement characteristics, such as spatial patterns and similarities. This paper proposes statistical analysis of sport movement observations, which refers to analyzing changes in the spatial movement attributes (e.g. distance, altitude and slope) and non-spatial movement attributes (e.g. speed and heart rate) of athletes. As the case study, an example dataset of movement observations acquired during the “orienteering” sport is presented and statistically analyzed.

  5. Cosmological constraints with weak-lensing peak counts and second-order statistics in a large-field survey

    Science.gov (United States)

    Peel, Austin; Lin, Chieh-An; Lanusse, François; Leonard, Adrienne; Starck, Jean-Luc; Kilbinger, Martin

    2017-03-01

    Peak statistics in weak-lensing maps access the non-Gaussian information contained in the large-scale distribution of matter in the Universe. They are therefore a promising complementary probe to two-point and higher-order statistics to constrain our cosmological models. Next-generation galaxy surveys, with their advanced optics and large areas, will measure the cosmic weak-lensing signal with unprecedented precision. To prepare for these anticipated data sets, we assess the constraining power of peak counts in a simulated Euclid-like survey on the cosmological parameters $\Omega_m$, $\sigma_8$, and $w_0^{\rm de}$. In particular, we study how Camelus, a fast stochastic model for predicting peaks, can be applied to such large surveys. The algorithm avoids the need for time-costly N-body simulations, and its stochastic approach provides full PDF information of observables. Considering peaks with a signal-to-noise ratio ≥ 1, we measure the abundance histogram in a mock shear catalogue of approximately 5000 deg$^2$ using a multiscale mass-map filtering technique. We constrain the parameters of the mock survey using Camelus combined with approximate Bayesian computation, a robust likelihood-free inference algorithm. Peak statistics yield a tight but significantly biased constraint in the $\sigma_8$-$\Omega_m$ plane, as measured by the width $\Delta\Sigma_8$ of the $1\sigma$ contour. We find $\Sigma_8 = \sigma_8(\Omega_m/0.27)^\alpha = 0.77_{-0.05}^{+0.06}$ with $\alpha = 0.75$ for a flat $\Lambda$CDM model. The strong bias indicates the need to better understand and control the model systematics before applying it to a real survey of this size or larger. We perform a calibration of the model and compare results to those from the two-point correlation functions $\xi_\pm$ measured on the same field. We calibrate the $\xi_\pm$ result as well, since its contours are also biased, although not as severely as for peaks. In this case, we find for peaks $\Sigma_8 = 0.76_{-0.03}^{+0.02}$ with $\alpha = 0.65$, while for the combined $\xi_+$ and $\xi_-$ statistics the values are $\Sigma_8 = 0.76_{-0.01}^{+0.02}$ and α = 0

  6. Sentiment analysis algorithms and applications: A survey

    Directory of Open Access Journals (Sweden)

    Walaa Medhat

    2014-12-01

    Full Text Available Sentiment Analysis (SA) is an ongoing field of research in text mining. SA is the computational treatment of opinions, sentiments and subjectivity of text. This survey paper provides a comprehensive overview of the latest updates in this field. Many recently proposed algorithm enhancements and various SA applications are investigated and presented briefly in this survey. These articles are categorized according to their contributions to the various SA techniques. The fields related to SA (transfer learning, emotion detection, and building resources) that have recently attracted researchers are discussed. The main aim of this survey is to give a nearly complete picture of SA techniques and the related fields, with brief details. The main contributions of this paper include the sophisticated categorization of a large number of recent articles and the illustration of the recent trend of research in sentiment analysis and its related areas.

  7. International Conference on Modern Problems of Stochastic Analysis and Statistics

    CERN Document Server

    2017-01-01

    This book brings together the latest findings in the area of stochastic analysis and statistics. The individual chapters cover a wide range of topics from limit theorems, Markov processes, nonparametric methods, actuarial science, population dynamics, and many others. The volume is dedicated to Valentin Konakov, head of the International Laboratory of Stochastic Analysis and its Applications, on the occasion of his 70th birthday. Contributions were prepared by the participants of the international conference “Modern problems of stochastic analysis and statistics”, held at the Higher School of Economics in Moscow from May 29 to June 2, 2016. It offers a valuable reference resource for researchers and graduate students interested in modern stochastics.

  8. Statistical analysis plan for the EuroHYP-1 trial

    DEFF Research Database (Denmark)

    Winkel, Per; Bath, Philip M; Gluud, Christian

    2017-01-01

    Score; (4) brain infarct size at 48 +/-24 hours; (5) EQ-5D-5 L score, and (6) WHODAS 2.0 score. Other outcomes are: the primary safety outcome serious adverse events; and the incremental cost-effectiveness, and cost utility ratios. The analysis sets include (1) the intention-to-treat population, and (2...... outcome), logistic regression (binary outcomes), general linear model (continuous outcomes), and the Poisson or negative binomial model (rate outcomes). DISCUSSION: Major adjustments compared with the original statistical analysis plan encompass: (1) adjustment of analyses by nationality; (2) power......) the per protocol population. The sample size is estimated to 800 patients (5% type 1 and 20% type 2 errors). All analyses are adjusted for the protocol-specified stratification variables (nationality of centre), and the minimisation variables. In the analysis, we use ordinal regression (the primary...

  9. Composition and Statistical Analysis of Biophenols in Apulian Italian EVOOs

    Science.gov (United States)

    Centonze, Carla; Grasso, Maria Elena; Latronico, Maria Francesca; Mastrangelo, Pier Francesco; Maffia, Michele

    2017-01-01

    Extra-virgin olive oil (EVOO) is among the basic constituents of the Mediterranean diet. Its nutraceutical properties are due mainly, but not only, to a plethora of molecules with antioxidant activity known as biophenols. In this article, several biophenols were measured in EVOOs from South Apulia, Italy. Hydroxytyrosol, tyrosol and their conjugated structures to elenolic acid in different forms were identified and quantified by high performance liquid chromatography (HPLC) together with lignans, luteolin and α-tocopherol. The concentration of the analyzed metabolites was quite high in all the cultivars studied, but it was still possible to discriminate them through multivariate statistical analysis (MVA). Furthermore, principal component analysis (PCA) and orthogonal partial least-squares discriminant analysis (OPLS-DA) were also exploited for determining variances among samples depending on the interval time between harvesting and milling, on the age of the olive trees, and on the area where the olive trees were grown. PMID:29057813

  10. Composition and Statistical Analysis of Biophenols in Apulian Italian EVOOs

    Directory of Open Access Journals (Sweden)

    Andrea Ragusa

    2017-10-01

    Full Text Available Extra-virgin olive oil (EVOO) is among the basic constituents of the Mediterranean diet. Its nutraceutical properties are due mainly, but not only, to a plethora of molecules with antioxidant activity known as biophenols. In this article, several biophenols were measured in EVOOs from South Apulia, Italy. Hydroxytyrosol, tyrosol and their conjugated structures to elenolic acid in different forms were identified and quantified by high performance liquid chromatography (HPLC) together with lignans, luteolin and α-tocopherol. The concentration of the analyzed metabolites was quite high in all the cultivars studied, but it was still possible to discriminate them through multivariate statistical analysis (MVA). Furthermore, principal component analysis (PCA) and orthogonal partial least-squares discriminant analysis (OPLS-DA) were also exploited for determining variances among samples depending on the interval time between harvesting and milling, on the age of the olive trees, and on the area where the olive trees were grown.

  11. STATISTICS. The reusable holdout: Preserving validity in adaptive data analysis.

    Science.gov (United States)

    Dwork, Cynthia; Feldman, Vitaly; Hardt, Moritz; Pitassi, Toniann; Reingold, Omer; Roth, Aaron

    2015-08-07

    Misapplication of statistical data analysis is a common cause of spurious discoveries in scientific research. Existing approaches to ensuring the validity of inferences drawn from data assume a fixed procedure to be performed, selected before the data are examined. In common practice, however, data analysis is an intrinsically adaptive process, with new analyses generated on the basis of data exploration, as well as the results of previous analyses on the same data. We demonstrate a new approach for addressing the challenges of adaptivity based on insights from privacy-preserving data analysis. As an application, we show how to safely reuse a holdout data set many times to validate the results of adaptively chosen analyses. Copyright © 2015, American Association for the Advancement of Science.
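
The holdout-reuse idea described above can be illustrated with a minimal Python sketch. It is loosely modeled on the thresholding-plus-noise mechanism associated with this line of work and is not the authors' exact algorithm: the function names, the default `threshold` and `noise_scale` values, and the omission of any query budget are all assumptions made for illustration.

```python
import random


def laplace(rng, scale):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def thresholdout(train_vals, holdout_vals, threshold=0.04, noise_scale=0.01, rng=None):
    """Answer one query (a mean of per-sample values) while shielding the holdout.

    If the training and holdout means agree to within a noisy threshold, the
    training mean is returned and nothing about the holdout leaks. Otherwise a
    noise-perturbed holdout mean is returned.
    (threshold and noise_scale are illustrative defaults, not tuned values.)
    """
    if rng is None:
        rng = random.Random()
    train_mean = sum(train_vals) / len(train_vals)
    holdout_mean = sum(holdout_vals) / len(holdout_vals)
    gap = abs(train_mean - holdout_mean)
    if gap > threshold + laplace(rng, noise_scale):
        # The analyst has likely overfit to the training set: report the
        # holdout estimate, blurred with fresh noise.
        return holdout_mean + laplace(rng, noise_scale)
    return train_mean


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical per-sample correctness indicators for some chosen classifier.
    train = [1, 1, 1, 1, 1, 0, 1, 1]
    holdout = [1, 0, 1, 0, 1, 0, 1, 0]
    print(thresholdout(train, holdout, rng=rng))
```

The design choice worth noting is that the holdout is consulted only through a noisy comparison, so each adaptively chosen query reveals little about it.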

  12. GIS-BASED SPATIAL STATISTICAL ANALYSIS OF COLLEGE GRADUATES EMPLOYMENT

    Directory of Open Access Journals (Sweden)

    R. Tang

    2012-07-01

    Full Text Available It is urgently necessary to understand the distribution and employment status of college graduates for proper allocation of human resources and overall arrangement of strategic industry. This study provides empirical evidence regarding the use of geocoding and spatial analysis in studying the distribution and employment status of college graduates, based on 2004–2008 data from the Wuhan Municipal Human Resources and Social Security Bureau, China. The spatio-temporal distribution of employment units was analyzed with geocoding using ArcGIS software, and the stepwise multiple linear regression method in SPSS software was used to predict employment and to identify spatially associated future demand for enterprises and professionals. The results show that the number of enterprises in the Wuhan East Lake High and New Technology Development Zone increased dramatically from 2004 to 2008, and that they tended to spread southeastward. Furthermore, the models built by statistical analysis suggest that the specialty in which graduates major has an important impact on the number employed and on the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis, which helps to simulate the spatial distribution of employment status, is a potential tool for human resource development research.

  13. Gis-Based Spatial Statistical Analysis of College Graduates Employment

    Science.gov (United States)

    Tang, R.

    2012-07-01

    It is urgently necessary to understand the distribution and employment status of college graduates for proper allocation of human resources and overall arrangement of strategic industry. This study provides empirical evidence regarding the use of geocoding and spatial analysis in studying the distribution and employment status of college graduates, based on 2004-2008 data from the Wuhan Municipal Human Resources and Social Security Bureau, China. The spatio-temporal distribution of employment units was analyzed with geocoding using ArcGIS software, and the stepwise multiple linear regression method in SPSS software was used to predict employment and to identify spatially associated future demand for enterprises and professionals. The results show that the number of enterprises in the Wuhan East Lake High and New Technology Development Zone increased dramatically from 2004 to 2008, and that they tended to spread southeastward. Furthermore, the models built by statistical analysis suggest that the specialty in which graduates major has an important impact on the number employed and on the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis, which helps to simulate the spatial distribution of employment status, is a potential tool for human resource development research.

  14. Consolidity analysis for fully fuzzy functions, matrices, probability and statistics

    Directory of Open Access Journals (Sweden)

    Walaa Ibrahim Gabr

    2015-03-01

    Full Text Available The paper presents a comprehensive review of the know-how for developing the systems consolidity theory for modeling, analysis, optimization and design in a fully fuzzy environment. The development of the systems consolidity theory included handling new functions of different dimensionalities, fuzzy analytic geometry, fuzzy vector analysis, functions of fuzzy complex variables, ordinary differentiation of fuzzy functions and partial fractions of fuzzy polynomials. The handling of fuzzy matrices, in turn, covered determinants of fuzzy matrices, the eigenvalues of fuzzy matrices, and solving least-squares fuzzy linear equations. The approach is also shown to be systematically applicable to new fuzzy probabilistic and statistical problems. This included extending conventional probabilistic and statistical analysis to handle fuzzy random data. Applications also covered the consolidity of fuzzy optimization problems. The numerical examples solved demonstrate that the new consolidity concept is highly effective at describing, in compact form, the propagation of fuzziness in linear, nonlinear, multivariable and dynamic problems with different types of complexities. Finally, it is demonstrated that the suggested fuzzy mathematics can be easily embedded within normal mathematics by building a special fuzzy functions library inside the computational Matlab Toolbox or using other similar software languages.

  15. Statistical analysis of C/NOFS planar Langmuir probe data

    Directory of Open Access Journals (Sweden)

    E. Costa

    2014-07-01

    Full Text Available The planar Langmuir probe (PLP) onboard the Communication/Navigation Outage Forecasting System (C/NOFS) satellite has been monitoring ionospheric plasma densities and their irregularities with high resolution almost seamlessly since May 2008. Considering the recent changes in status of the C/NOFS mission, it may be interesting to summarize some statistical results from these measurements. PLP data from 2 different years (1 October 2008–30 September 2009 and 1 January 2012–31 December 2012) were selected for analysis. The first data set corresponds to solar minimum conditions and the second one is as close to solar maximum conditions of solar cycle 24 as possible at the time of the analysis. The results from the analysis show how the values of the standard deviation of the ion density which are greater than specified thresholds are statistically distributed as functions of several combinations of the following geophysical parameters: (i) solar activity, (ii) altitude range, (iii) longitude sector, (iv) local time interval, (v) geomagnetic latitude interval, and (vi) season.

  16. The features of Drosophila core promoters revealed by statistical analysis

    Directory of Open Access Journals (Sweden)

    Trifonov Edward N

    2006-06-01

    Full Text Available Background: Experimental investigation of transcription is still a very labor- and time-consuming process. Only a few transcription initiation scenarios have been studied in detail. The mechanism of interaction between basal machinery and promoter, in particular core promoter elements, is not known for the majority of identified promoters. In this study, we reveal various transcription initiation mechanisms by statistical analysis of 3393 nonredundant Drosophila promoters. Results: Using Drosophila-specific position-weight matrices, we identified promoters containing TATA box, Initiator, Downstream Promoter Element (DPE), and Motif Ten Element (MTE), as well as core elements discovered in Human (TFIIB Recognition Element (BRE) and Downstream Core Element (DCE)). Promoters utilizing known synergetic combinations of two core elements (TATA_Inr, Inr_MTE, Inr_DPE, and DPE_MTE) were identified. We also establish the existence of promoters with potentially novel synergetic combinations: TATA_DPE and TATA_MTE. Our analysis revealed several motifs with the features of promoter elements, including possible novel core promoter element(s). Comparison of Human and Drosophila showed consistent percentages of promoters with TATA, Inr, DPE, and synergetic combinations thereof, as well as most of the same functional and mutual positions of the core elements. No statistical evidence of MTE utilization in Human was found. Distinct nucleosome positioning in particular promoter classes was revealed. Conclusion: We present lists of promoters that potentially utilize the aforementioned elements/combinations. The number of these promoters is two orders of magnitude larger than the number of promoters in which transcription initiation was experimentally studied. The sequences are ready to be experimentally tested or used for further statistical analysis. The developed approach may be utilized for other species.

  17. Hierarchical model analysis of the Atlantic Flyway Breeding Waterfowl Survey

    Science.gov (United States)

    Sauer, John R.; Zimmerman, Guthrie S.; Klimstra, Jon D.; Link, William A.

    2014-01-01

    We used log-linear hierarchical models to analyze data from the Atlantic Flyway Breeding Waterfowl Survey. The survey has been conducted by state biologists each year since 1989 in the northeastern United States from Virginia north to New Hampshire and Vermont. Although yearly population estimates from the survey are used by the United States Fish and Wildlife Service for estimating regional waterfowl population status for mallards (Anas platyrhynchos), black ducks (Anas rubripes), wood ducks (Aix sponsa), and Canada geese (Branta canadensis), they are not routinely adjusted to control for time of day effects and other survey design issues. The hierarchical model analysis permits estimation of year effects and population change while accommodating the repeated sampling of plots and controlling for time of day effects in counting. We compared population estimates from the current stratified random sample analysis to population estimates from hierarchical models with alternative model structures that describe year to year changes as random year effects, a trend with random year effects, or year effects modeled as 1-year differences. Patterns of population change from the hierarchical model results generally were similar to the patterns described by stratified random sample estimates, but significant visibility differences occurred between twilight and midday counts in all species. Controlling for the effects of time of day resulted in larger population estimates for all species in the hierarchical model analysis relative to the stratified random sample analysis. The hierarchical models also provided a convenient means of estimating population trend as derived statistics from the analysis. We detected significant declines in mallard and American black ducks and significant increases in wood ducks and Canada geese, a trend that had not been significant for 3 of these 4 species in the prior analysis. We recommend using hierarchical models for analysis of the Atlantic

  18. Sensitivity analysis and optimization of system dynamics models : Regression analysis and statistical design of experiments

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    1995-01-01

    This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for

  19. SAS and R data management, statistical analysis, and graphics

    CERN Document Server

    Kleinman, Ken

    2014-01-01

    An Up-to-Date, All-in-One Resource for Using SAS and R to Perform Frequent Tasks. The first edition of this popular guide provided a path between SAS and R using an easy-to-understand, dictionary-like approach. Retaining the same accessible format, SAS and R: Data Management, Statistical Analysis, and Graphics, Second Edition explains how to easily perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferentia

  20. Using R for Data Management, Statistical Analysis, and Graphics

    CERN Document Server

    Horton, Nicholas J

    2010-01-01

    This title offers quick and easy access to key elements of documentation. It includes worked examples across a wide variety of applications, tasks, and graphics. "Using R for Data Management, Statistical Analysis, and Graphics" presents an easy way to learn how to perform an analytical task in R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation and vast number of add-on packages. Organized by short, clear descriptive entries, the book covers many common tasks, such as data management, descriptive summaries, inferential proc

  1. STATISTIC ANALYSIS OF INTERNATIONAL TOURISM ON ROMANIAN SEASIDE

    Directory of Open Access Journals (Sweden)

    MIRELA SECARĂ

    2010-01-01

    Full Text Available In order to meet European and international touristic competition standards, modernization, re-establishment and development of Romanian tourism are necessary as well as creation of modern touristic products that are competitive on this market. The use of modern methods of statistic analysis in the field of tourism facilitates the achievement of systems of information that are the instruments for: evaluation of touristic demand and touristic supply, follow-up of touristic services of each touring form, follow-up of transportation services, leisure activities, hotel accommodation, touristic market study, and a complex flexible system of management and accountancy.

  2. Statistical Analysis of Strength Data for an Aerospace Aluminum Alloy

    Science.gov (United States)

    Neergaard, L.; Malone, T.

    2001-01-01

    Aerospace vehicles are produced in limited quantities that do not always allow development of MIL-HDBK-5 A-basis design allowables. One method of examining production and composition variations is to perform 100% lot acceptance testing for aerospace Aluminum (Al) alloys. This paper discusses statistical trends seen in strength data for one Al alloy. A four-step approach reduced the data to residuals, visualized residuals as a function of time, grouped data with quantified scatter, and conducted analysis of variance (ANOVA).

  3. Spatial Analysis Along Networks Statistical and Computational Methods

    CERN Document Server

    Okabe, Atsuyuki

    2012-01-01

    In the real world, there are numerous and various events that occur on and alongside networks, including the occurrence of traffic accidents on highways, the location of stores alongside roads, the incidence of crime on streets and the contamination along rivers. In order to carry out analyses of those events, the researcher needs to be familiar with a range of specific techniques. Spatial Analysis Along Networks provides a practical guide to the necessary statistical techniques and their computational implementation. Each chapter illustrates a specific technique, from Stochastic Point Process

  4. Statistical Analysis of Designed Experiments Theory and Applications

    CERN Document Server

    Tamhane, Ajit C

    2012-01-01

    An indispensable guide to understanding and designing modern experiments. The tools and techniques of Design of Experiments (DOE) allow researchers to successfully collect, analyze, and interpret data across a wide array of disciplines. Statistical Analysis of Designed Experiments provides a modern and balanced treatment of DOE methodology with thorough coverage of the underlying theory and standard designs of experiments, guiding the reader through applications to research in various fields such as engineering, medicine, business, and the social sciences. The book supplies a foundation for the

  5. Improved statistical model checking methods for pathway analysis.

    Science.gov (United States)

    Koh, Chuan Hock; Palaniappan, Sucheendra K; Thiagarajan, P S; Wong, Limsoon

    2012-01-01

    Statistical model checking techniques have been shown to be effective for approximate model checking on large stochastic systems, where explicit representation of the state space is impractical. Importantly, these techniques ensure the validity of results with statistical guarantees on errors. There is an increasing interest in these classes of algorithms in computational systems biology since analysis using traditional model checking techniques does not scale well. In this context, we present two improvements to existing statistical model checking algorithms. Firstly, we construct an algorithm which removes the need for the user to define the indifference region, a critical parameter in previous sequential hypothesis testing algorithms. Secondly, we extend the algorithm to account for the case when there may be a limit on the computational resources that can be spent on verifying a property; i.e., if the original algorithm is not able to make a decision even after consuming the available amount of resources, we resort to a p-value based approach to make a decision. We demonstrate the improvements achieved by our algorithms in comparison to current algorithms first with a straightforward yet representative example, followed by a real biological model on cell fate of gustatory neurons with microRNAs.
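
To make the role of the indifference region concrete, the following Python sketch shows a classical Wald sequential probability ratio test of the kind the abstract refers to as "previous sequential hypothesis testing algorithms". It is the baseline that requires a user-chosen half-width `delta`, not the improved algorithm of the paper; the function name, the returned labels, and the default error rates are assumptions for illustration.

```python
import math


def sprt(samples, theta, delta, alpha=0.01, beta=0.01):
    """Wald's SPRT for a Bernoulli success probability p.

    Tests H0: p >= theta + delta against H1: p <= theta - delta, where
    (theta - delta, theta + delta) is the indifference region.
    Each sample is True if the simulated trace satisfied the property.
    Returns (verdict, number_of_samples_used).
    """
    p0, p1 = theta + delta, theta - delta
    accept_h1 = math.log((1 - beta) / alpha)   # upper stopping bound
    accept_h0 = math.log(beta / (1 - alpha))   # lower stopping bound
    llr = 0.0  # running log-likelihood ratio log(L1 / L0)
    used = 0
    for holds in samples:
        used += 1
        if holds:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= accept_h1:
            return "p <= theta - delta", used
        if llr <= accept_h0:
            return "p >= theta + delta", used
    # Samples (the resource budget) ran out before either bound was crossed.
    return "undecided", used


if __name__ == "__main__":
    print(sprt([True] * 20, theta=0.5, delta=0.1))
```

The "undecided" outcome corresponds to the resource-limited situation the abstract mentions, for which the authors fall back on a p-value based decision.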

  6. Inappropriate survey design analysis of the Korean National Health and Nutrition Examination Survey may produce biased results.

    Science.gov (United States)

    Kim, Yangho; Park, Sunmin; Kim, Nam-Soo; Lee, Byung-Kook

    2013-03-01

    The inherent nature of the Korean National Health and Nutrition Examination Survey (KNHANES) design requires special analysis by incorporating sample weights, stratification, and clustering not used in ordinary statistical procedures. This study investigated the proportion of research papers that have used an appropriate statistical methodology out of the research papers analyzing the KNHANES cited in the PubMed online system from 2007 to 2012. We also compared differences in mean and regression estimates between the ordinary statistical data analyses without sampling weight and design-based data analyses using the KNHANES 2008 to 2010. Of the 247 research articles cited in PubMed, only 19.8% of all articles used survey design analysis, compared with 80.2% of articles that used ordinary statistical analysis, treating KNHANES data as if it were collected using a simple random sampling method. Means and standard errors differed between the ordinary statistical data analyses and design-based analyses, and the standard errors in the design-based analyses tended to be larger than those in the ordinary statistical data analyses. Ignoring complex survey design can result in biased estimates and overstated significance levels. Sample weights, stratification, and clustering of the design must be incorporated into analyses to ensure the development of appropriate estimates and standard errors of these estimates.
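
The effect of ignoring sample weights can be seen in a small Python sketch. The data and weights below are hypothetical and are not KNHANES values; the sketch covers only unequal weighting (via Kish's effective sample size) and leaves out stratification and clustering, which a full design-based analysis would also need.

```python
def naive_mean(values):
    """Mean that treats the data as a simple random sample."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Design-weighted mean: each respondent counts for `weight` people."""
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)


def kish_effective_n(weights):
    """Kish's effective sample size under unequal weighting.

    Equals len(weights) when all weights are equal and shrinks as the
    weights become more variable, which inflates standard errors.
    """
    return sum(weights) ** 2 / sum(w * w for w in weights)


if __name__ == "__main__":
    # Hypothetical measurements and sampling weights (illustrative only).
    values = [10, 10, 20, 20]
    weights = [1, 1, 3, 3]
    print(naive_mean(values))              # unweighted estimate
    print(weighted_mean(values, weights))  # weighted estimate
    print(kish_effective_n(weights))       # effective n, below the nominal 4
```

Because the effective sample size falls below the nominal one, standard errors computed as if the data were a simple random sample tend to be too small, which is consistent with the larger design-based standard errors reported above.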

  7. On the choice of statistical models for estimating occurrence and extinction from animal surveys

    Science.gov (United States)

    Dorazio, R.M.

    2007-01-01

    In surveys of natural animal populations the number of animals that are present and available to be detected at a sample location is often low, resulting in few or no detections. Low detection frequencies are especially common in surveys of imperiled species; however, the choice of sampling method and protocol also may influence the size of the population that is vulnerable to detection. In these circumstances, probabilities of animal occurrence and extinction will generally be estimated more accurately if the models used in data analysis account for differences in abundance among sample locations and for the dependence between site-specific abundance and detection. Simulation experiments are used to illustrate conditions wherein these types of models can be expected to outperform alternative estimators of population site occupancy and extinction. © 2007 by the Ecological Society of America.

  8. Vibroacoustic optimization using a statistical energy analysis model

    Science.gov (United States)

    Culla, Antonio; D'Ambrogio, Walter; Fregolent, Annalisa; Milana, Silvia

    2016-08-01

    In this paper, an optimization technique for medium-high frequency dynamic problems based on Statistical Energy Analysis (SEA) method is presented. Using a SEA model, the subsystem energies are controlled by internal loss factors (ILF) and coupling loss factors (CLF), which in turn depend on the physical parameters of the subsystems. A preliminary sensitivity analysis of subsystem energy to CLF's is performed to select CLF's that are most effective on subsystem energies. Since the injected power depends not only on the external loads but on the physical parameters of the subsystems as well, it must be taken into account under certain conditions. This is accomplished in the optimization procedure, where approximate relationships between CLF's, injected power and physical parameters are derived. The approach is applied on a typical aeronautical structure: the cabin of a helicopter.

  9. Topics in statistical data analysis for high-energy physics

    CERN Document Server

    Cowan, G.

    2013-06-27

    These lectures concern two topics that are becoming increasingly important in the analysis of High Energy Physics (HEP) data: Bayesian statistics and multivariate methods. In the Bayesian approach we extend the interpretation of probability to cover not only the frequency of repeatable outcomes but also to include a degree of belief. In this way we are able to associate probability with a hypothesis and thus to answer directly questions that cannot be addressed easily with traditional frequentist methods. In multivariate analysis we try to exploit as much information as possible from the characteristics that we measure for each event to distinguish between event types. In particular we will look at a method that has gained popularity in HEP in recent years: the boosted decision tree (BDT).

  10. On Understanding Statistical Data Analysis in Higher Education

    CERN Document Server

    Montalbano, Vera

    2012-01-01

    Data analysis is a powerful tool in all experimental sciences. Statistical methods such as sampling theory, the computer technologies necessary for handling large amounts of data, and skill in analysing information contained in different types of graphs are all competences necessary for achieving an in-depth data analysis. In higher education, these topics are usually fragmented across different courses, interdisciplinary integration can be lacking, and the caution needed in using these topics can be missing or misunderstood. Students are often obliged to acquire these skills by themselves during the preparation of the final experimental thesis. A proposal for a learning path on nuclear phenomena is presented in order to develop these scientific competences in physics courses. An introduction to radioactivity and nuclear phenomenology is followed by measurements of natural radioactivity. Background and weak sources can be monitored for a long time in a physics laboratory. The data are collected and analyzed in a computer lab i...

  11. Statistical learning analysis in neuroscience: aiming for transparency

    Directory of Open Access Journals (Sweden)

    Michael Hanke

    2010-05-01

    Full Text Available Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires "neuroscience-aware" technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here we review its features and applicability to various neural data modalities.

  12. First statistical analysis of Geant4 quality software metrics

    Science.gov (United States)

    Ronchieri, Elisabetta; Grazia Pia, Maria; Giacomini, Francesco

    2015-12-01

    Geant4 is a simulation system of particle transport through matter, widely used in several experimental areas from high energy physics and nuclear experiments to medical studies. Some of its applications may involve critical use cases; therefore they would benefit from an objective assessment of the software quality of Geant4. In this paper, we provide a first statistical evaluation of software metrics data related to a set of Geant4 physics packages. The analysis aims at identifying risks for Geant4 maintainability, which would benefit from being addressed at an early stage. The findings of this pilot study set the grounds for further extensions of the analysis to the whole of Geant4 and to other high energy physics software systems.

  13. Using Statistical Analysis Software to Advance Nitro Plasticizer Wettability

    Energy Technology Data Exchange (ETDEWEB)

    Shear, Trevor Allan [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2017-08-29

    Statistical analysis in science is an extremely powerful tool that is often underutilized. Additionally, it is frequently the case that data is misinterpreted or not used to its fullest extent. Utilizing the advanced software JMP®, many aspects of experimental design and data analysis can be evaluated and improved. This overview will detail the features of JMP® and how they were used to advance a project, resulting in time and cost savings, as well as the collection of scientifically sound data. The project analyzed in this report addresses the inability of a nitro plasticizer to coat a gold coated quartz crystal sensor used in a quartz crystal microbalance. Through the use of the JMP® software, the wettability of the nitro plasticizer was increased by over 200% using an atmospheric plasma pen, ensuring good sample preparation and reliable results.

  14. [Analysis of complaints in primary care using statistical process control].

    Science.gov (United States)

    Valdivia Pérez, Antonio; Arteaga Pérez, Lourdes; Escortell Mayor, Esperanza; Monge Corella, Susana; Villares Rodríguez, José Enrique

    2009-08-01

    To analyze patient complaints in a Primary Health Care District (PHCD) using statistical process control methods compared to multivariate methods, as regards their results and feasibility of application in this context. Descriptive study based on an aggregate analysis of administrative complaints. Complaints received between January 2005 and August 2008 in the Customer Management Department in the 3rd PHCD Management Office, Madrid Health Services. Complaints are registered through Itrack, a computer software tool used throughout the whole Community of Madrid. Total number of complaints, complaints sorted by Reason and Primary Health Care Team (PHCT), total number of patient visits (including visits on demand, appointment visits and home visits) and visits by PHCT and per month and year. Multivariate analysis and control charts were used. 44-month time series with a mean of 76 complaints per month, an increasing trend in the first three years and decreasing during summer months. Poisson regression detected an excess of complaints in 8 out of the 44 months in the series. The control chart detected the same 8 months plus two additional ones. Statistical process control can be useful for detecting an excess of complaints in a PHCD and enables comparisons to be made between different PHC teams. As it is a simple technique, it can be used for ongoing monitoring of customer perceived quality.
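
A control chart of the kind mentioned above can be sketched in a few lines of Python. The abstract does not say which chart type was used, so this example assumes a c-chart (counts per month with 3-sigma Poisson limits) and uses made-up monthly counts; a u-chart based on complaints per visit would be a natural alternative when visit volume varies.

```python
import math


def c_chart_limits(counts):
    """Return (lower, center, upper) 3-sigma limits for a c-chart.

    Assumes monthly counts are roughly Poisson, so the standard deviation
    is the square root of the mean count.
    """
    center = sum(counts) / len(counts)
    sigma = math.sqrt(center)
    return max(0.0, center - 3 * sigma), center, center + 3 * sigma


def flag_excess(counts):
    """Indices of months whose count exceeds the upper control limit."""
    _, _, upper = c_chart_limits(counts)
    return [i for i, c in enumerate(counts) if c > upper]


if __name__ == "__main__":
    # Hypothetical monthly complaint counts (not the study's data).
    monthly = [70, 80, 75, 79, 76, 130]
    print(c_chart_limits(monthly))
    print(flag_excess(monthly))
```

In practice the limits are often computed from a baseline period that excludes suspected out-of-control months, since including them widens the limits.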

  15. Using zebrafish to learn statistical analysis and Mendelian genetics.

    Science.gov (United States)

    Lindemann, Samantha; Senkler, Jon; Auchter, Elizabeth; Liang, Jennifer O

    2011-06-01

    This project was developed to promote understanding of how mathematics and statistical analysis are used as tools in genetic research. It gives students the opportunity to carry out hypothesis-driven experiments in the classroom: students generate hypotheses about Mendelian and non-Mendelian inheritance patterns, gather raw data, and test their hypotheses using chi-square statistical analysis. In the first protocol, students are challenged to analyze inheritance patterns using GloFish, brightly colored, commercially available, transgenic zebrafish that express Green, Yellow, or Red Fluorescent Protein throughout their muscles. In the second protocol, students learn about genetic screens, microscopy, and developmental biology by analyzing the inheritance patterns of mutations that cause developmental defects. The difficulty of the experiments can be adapted for middle school to upper level undergraduate students. Since the GloFish experiments use only fish and materials that can be purchased from pet stores, they should be accessible to many schools. For each protocol, we provide detailed instructions, ideas for how the experiments fit into an undergraduate curriculum, raw data, and example analyses. Our plan is to have these protocols form the basis of a growing and adaptable educational tool available on the Zebrafish in the Classroom Web site.
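
The chi-square test that students apply to inheritance data can be sketched as follows. The offspring counts are hypothetical, not data from the project, and the p-value helper is valid only for one degree of freedom (two phenotype classes), where the chi-square tail probability reduces to a complementary error function.

```python
import math


def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic against a Mendelian ratio.

    observed       -- counts per phenotype class, e.g. [fluorescent, plain]
    expected_ratio -- hypothesized ratio, e.g. (3, 1) for a monohybrid cross
    Returns (statistic, expected_counts).
    """
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


if __name__ == "__main__":
    # Hypothetical offspring counts from a cross expected to give 3:1.
    stat, expected = chi_square_gof([78, 22], (3, 1))
    print(expected, stat, p_value_df1(stat))
```

A p-value above the chosen significance level (commonly 0.05) means the counts are consistent with the hypothesized ratio; it does not prove the ratio is correct.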

  16. A biologist's guide to statistical thinking and analysis.

    Science.gov (United States)

    Fay, David S; Gerow, Ken

    2013-07-09

    The proper understanding and use of statistical tools are essential to the scientific enterprise. This is true both at the level of designing one's own experiments as well as for critically evaluating studies carried out by others. Unfortunately, many researchers who are otherwise rigorous and thoughtful in their scientific approach lack sufficient knowledge of this field. This methods chapter is written with such individuals in mind. Although the majority of examples are drawn from the field of Caenorhabditis elegans biology, the concepts and practical applications are also relevant to those who work in the disciplines of molecular genetics and cell and developmental biology. Our intent has been to limit theoretical considerations to a necessary minimum and to use common examples as illustrations for statistical analysis. Our chapter includes a description of basic terms and central concepts and also contains in-depth discussions on the analysis of means, proportions, ratios, probabilities, and correlations. We also address issues related to sample size, normality, outliers, and non-parametric approaches.

  17. Pattern recognition in menstrual bleeding diaries by statistical cluster analysis

    Directory of Open Access Journals (Sweden)

    Wessel Jens

    2009-07-01

    Full Text Available Background: The aim of this paper is to empirically identify a treatment-independent statistical method to describe clinically relevant bleeding patterns by using bleeding diaries of clinical studies on various sex hormone containing drugs. Methods: We used four cluster analysis methods (single, average and complete linkage, as well as the method of Ward) for pattern recognition in menstrual bleeding diaries. The optimal number of clusters was determined using the semi-partial R², the cubic cluster criterion, the pseudo-F statistic and the pseudo-t² statistic. Finally, the interpretability of the results from a gynecological point of view was assessed. Results: The method of Ward yielded distinct clusters of the bleeding diaries. The other methods successively chained the observations into one cluster. The optimal number of distinctive bleeding patterns was six. We found two desirable and four undesirable bleeding patterns. Cyclic and non-cyclic bleeding patterns were well separated. Conclusion: Using this cluster analysis with the method of Ward, medications and devices having an impact on bleeding can be easily compared and categorized.

  18. Design and statistical analysis of oral medicine studies: common pitfalls.

    Science.gov (United States)

    Baccaglini, L; Shuster, J J; Cheng, J; Theriaque, D W; Schoenbach, V J; Tomar, S L; Poole, C

    2010-04-01

    A growing number of articles are emerging in the medical and statistics literature that describe epidemiologic and statistical flaws of research studies. Many examples of these deficiencies are encountered in the oral, craniofacial, and dental literature. However, only a handful of methodologic articles have been published in the oral literature warning investigators of potential errors that may arise early in the study and that can irreparably bias the final results. In this study, we briefly review some of the most common pitfalls that our team of epidemiologists and statisticians has identified during the review of submitted or published manuscripts and research grant applications. We use practical examples from the oral medicine and dental literature to illustrate potential shortcomings in the design and analysis of research studies, and how these deficiencies may affect the results and their interpretation. A good study design is essential, because errors in the analysis can be corrected if the design was sound, but flaws in study design can lead to data that are not salvageable. We recommend consultation with an epidemiologist or a statistician during the planning phase of a research study to optimize study efficiency, minimize potential sources of bias, and document the analytic plan.

  19. Guidelines for the Content of Statistical Analysis Plans in Clinical Trials.

    Science.gov (United States)

    Gamble, Carrol; Krishan, Ashma; Stocken, Deborah; Lewis, Steff; Juszczak, Edmund; Doré, Caroline; Williamson, Paula R; Altman, Douglas G; Montgomery, Alan; Lim, Pilar; Berlin, Jesse; Senn, Stephen; Day, Simon; Barbachano, Yolanda; Loder, Elizabeth

    2017-12-19

    While guidance on statistical principles for clinical trials exists, there is an absence of guidance covering the required content of statistical analysis plans (SAPs) to support transparency and reproducibility. To develop recommendations for a minimum set of items that should be addressed in SAPs for clinical trials, developed with input from statisticians, previous guideline authors, journal editors, regulators, and funders. Funders and regulators (n = 39) of randomized trials were contacted and the literature was searched to identify existing guidance; a survey of current practice was conducted across the network of UK Clinical Research Collaboration-registered trial units (n = 46, 1 unit had 2 responders) and a Delphi survey (n = 73 invited participants) was conducted to establish consensus on SAPs. The Delphi survey was sent to statisticians in trial units who completed the survey of current practice (n = 46), CONSORT (Consolidated Standards of Reporting Trials) and SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials) guideline authors (n = 16), pharmaceutical industry statisticians (n = 3), journal editors (n = 9), and regulators (n = 2) (3 participants were included in 2 groups each), culminating in a consensus meeting attended by experts (N = 12) with representatives from each group. The guidance subsequently underwent critical review by statisticians from the surveyed trial units and members of the expert panel of the consensus meeting (N = 51), followed by piloting of the guidance document in the SAPs of 5 trials. No existing guidance was identified. The registered trials unit survey (46 responses) highlighted diversity in current practice and confirmed support for developing guidance. The Delphi survey (54 of 73, 74% participants completing both rounds) reached consensus on 42% (n = 46) of 110 items. The expert panel (N = 12) agreed that 63 items should be included in the guidance

  20. Short-run and Current Analysis Model in Statistics

    Directory of Open Access Journals (Sweden)

    Constantin Anghelache

    2006-01-01

    Full Text Available Using short-run statistical indicators is a compulsory requirement of current analysis. A system of EUROSTAT short-run indicators has therefore been set up and is recommended for use by the member countries. On the basis of these indicators, regular (usually monthly) analyses are carried out of: the dynamics of production; the volume of short-run investment; the development of turnover; the evolution of wages; employment; the price indexes and the consumer price index (inflation); the volume of exports and imports, the extent to which imports are covered by exports, and the trade balance. The EUROSTAT system of conjuncture indicators is conceived as an open system, so that it can be extended or restricted at any moment, allowing indicators to be amended or even removed, depending on the requirements of domestic users as well as on the specific requirements of harmonization and integration. For short-run analysis there is also the World Bank system of conjuncture indicators, which relies on the data sources offered by the World Bank, the World Institute for Resources or the statistics of other international organizations. The system comprises indicators of social and economic development and focuses on indicators for the following three fields: human resources, environment and economic performance. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.

  1. Theoretical assessment of image analysis: statistical vs structural approaches

    Science.gov (United States)

    Lei, Tianhu; Udupa, Jayaram K.

    2003-05-01

    Statistical and structural methods are two major approaches commonly used in image analysis and have demonstrated considerable success. The former is based on statistical properties and stochastic models of the image and the latter utilizes geometric and topological models. In this study, Markov random field (MRF) theory/model based image segmentation and Fuzzy Connectedness (FC) theory/fuzzy connected object delineation are chosen as the representatives for these two approaches, respectively. The comparative study is focused on their theoretical foundations and main operative procedures. The MRF is defined on a lattice and the associated neighborhood system and is based on the Markov property. The FC method is defined on a fuzzy digital space and is based on fuzzy relations. Locally, MRF is characterized by potentials of cliques, and FC is described by fuzzy adjacency and affinity relations. Globally, MRF is characterized by the Gibbs distribution, and FC is described by fuzzy connectedness. The task of MRF model based image segmentation is to seek a realization of the embedded MRF through a two-level operation: partitioning and labeling. The task of FC object delineation is to extract a fuzzy object from a given scene through a two-step operation: recognition and delineation. The theoretical foundations which underlie the statistical and structural approaches, and the principles of the main operative procedures in image segmentation by these two approaches, demonstrate more similarities than differences between them. The two approaches can also complement each other, particularly in seed selection, scale formation, and affinity and object membership function design for FC, and in neighbor set selection and clique potential design for MRF.

  2. The system for statistical analysis of logistic information

    Directory of Open Access Journals (Sweden)

    Khayrullin Rustam Zinnatullovich

    2015-05-01

    Full Text Available The current problem for managers in logistic and trading companies is the task of improving the operational business performance and developing the logistics support of sales. The development of logistics sales supposes development and implementation of a set of works for the development of the existing warehouse facilities, including both a detailed description of the work performed, and the timing of their implementation. Logistics engineering of warehouse complex includes such tasks as: determining the number and the types of technological zones, calculation of the required number of loading-unloading places, development of storage structures, development and pre-sales preparation zones, development of specifications of storage types, selection of loading-unloading equipment, detailed planning of warehouse logistics system, creation of architectural-planning decisions, selection of information-processing equipment, etc. The currently used ERP and WMS systems did not allow us to solve the full list of logistics engineering problems. In this regard, the development of specialized software products, taking into account the specifics of warehouse logistics, and subsequent integration of this software with ERP and WMS systems seems to be a current task. In this paper we suggest a system of statistical analysis of logistics information, designed to meet the challenges of logistics engineering and planning. The system is based on the methods of statistical data processing. The proposed specialized software is designed to improve the efficiency of the operating business and the development of logistics support of sales. The system is based on the methods of statistical data processing, the methods of assessment and prediction of logistics performance, the methods for the determination and calculation of the data required for registration, storage and processing of metal products, as well as the methods for planning the reconstruction and development

  3. Statistical analysis of the autoregressive modeling of reverberant speech.

    Science.gov (United States)

    Gaubitch, Nikolay D; Ward, Darren B; Naylor, Patrick A

    2006-12-01

    Hands-free speech input is required in many modern telecommunication applications that employ autoregressive (AR) techniques such as linear predictive coding. When the hands-free input is obtained in enclosed reverberant spaces such as typical office rooms, the speech signal is distorted by the room transfer function. This paper utilizes theoretical results from statistical room acoustics to analyze the AR modeling of speech under these reverberant conditions. Three cases are considered: (i) AR coefficients calculated from a single observation; (ii) AR coefficients calculated jointly from an M-channel observation (M > 1); and (iii) AR coefficients calculated from the output of a delay-and-sum beamformer. The statistical analysis, with supporting simulations, shows that the spatial expectation of the AR coefficients for cases (i) and (ii) is approximately equal to the coefficients of the original speech, while for case (iii) there is a discrepancy due to spatial correlation between the microphones, which can be significant. It is subsequently demonstrated that at each individual source-microphone position (without spatial expectation), the M-channel AR coefficients from case (ii) provide the best approximation to the clean speech coefficients when microphones are closely spaced (<0.3 m).
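
    For readers unfamiliar with how AR coefficients are obtained in linear predictive coding, the following minimal sketch estimates them from an autocorrelation sequence with the Levinson-Durbin recursion. It covers only the single-observation case (i), is not taken from the paper, and uses hypothetical function names; the sign convention assumed here is x[n] ≈ a[1]·x[n-1] + ... + a[p]·x[n-p].

```python
def autocorr(x, max_lag):
    """Biased autocorrelation estimate r[0..max_lag] of a real sequence."""
    n = len(x)
    return [
        sum(x[i] * x[i - lag] for i in range(lag, n)) / n
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for AR predictor coefficients.

    Returns (coeffs, err): coeffs[k-1] multiplies x[n-k], and err is the
    final prediction-error power.
    """
    a = [0.0] * (order + 1)  # a[0] is unused padding
    err = r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for model order m.
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        k_m = acc / err
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m ** 2
    return a[1:], err


# Autocorrelation of an ideal AR(1) process with coefficient 0.5: r[k] = 0.5**k.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

    In the multichannel case (ii) of the abstract, one plausible reading is that the autocorrelation sequences of the M channels are combined before the recursion is run; that step is an assumption and is omitted here.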

  4. Statistical analysis of the breaking processes of Ni nanowires

    Energy Technology Data Exchange (ETDEWEB)

    Garcia-Mochales, P [Departamento de Fisica de la Materia Condensada, Facultad de Ciencias, Universidad Autonoma de Madrid, c/ Francisco Tomas y Valiente 7, Campus de Cantoblanco, E-28049-Madrid (Spain); Paredes, R [Centro de Fisica, Instituto Venezolano de Investigaciones Cientificas, Apartado 20632, Caracas 1020A (Venezuela); Pelaez, S; Serena, P A [Instituto de Ciencia de Materiales de Madrid, Consejo Superior de Investigaciones Cientificas, c/ Sor Juana Ines de la Cruz 3, Campus de Cantoblanco, E-28049-Madrid (Spain)], E-mail: pedro.garciamochales@uam.es

    2008-06-04

    We have performed a massive statistical analysis of the breaking behaviour of Ni nanowires using molecular dynamics simulations. Three stretching directions, five initial nanowire sizes and two temperatures have been studied. We have constructed minimum cross-section histograms and analysed for the first time the role played by monomers and dimers. The shape of such histograms and the absolute number of monomers and dimers strongly depend on the stretching direction and the initial size of the nanowire. In particular, the statistical behaviour of the final breakage stages of narrow nanowires differs strongly from the behaviour obtained for large nanowires. We have analysed the structure around monomers and dimers. Their most probable local configurations differ from those usually appearing in static electron transport calculations. Their non-local environments show disordered regions along the nanowire if the stretching direction is [100] or [110]. Additionally, we have found that, at room temperature, the [100] and [110] stretching directions favour the appearance of non-crystalline staggered pentagonal structures. These pentagonal Ni nanowires are reported in this work for the first time. This set of results suggests that experimental Ni conductance histograms could show a strong dependence on the orientation and temperature.

  5. Analysis of filament statistics in fast camera data on MAST

    Science.gov (United States)

    Farley, Tom; Militello, Fulvio; Walkden, Nick; Harrison, James; Silburn, Scott; Bradley, James

    2017-10-01

    Coherent filamentary structures have been shown to play a dominant role in turbulent cross-field particle transport [D'Ippolito 2011]. An improved understanding of filaments is vital in order to control scrape-off layer (SOL) density profiles and thus control first wall erosion, impurity flushing and coupling of radio frequency heating in future devices. The Elzar code [T. Farley, 2017 in prep.] is applied to MAST data. The code uses information about the magnetic equilibrium to calculate the intensity of light emission along field lines as seen in the camera images, as a function of the field lines' radial and toroidal locations at the mid-plane. In this way a 'pseudo-inversion' of the intensity profiles in the camera images is achieved, from which filaments can be identified and measured. In this work, a statistical analysis of the intensity fluctuations along field lines in the camera field of view is performed using techniques similar to those typically applied in standard Langmuir probe analyses. These filament statistics are interpreted in terms of the theoretical ergodic framework presented by F. Militello & J.T. Omotani, 2016, in order to better understand how time-averaged filament dynamics produce the more familiar SOL density profiles. This work has received funding from the RCUK Energy programme (Grant Number EP/P012450/1), from Euratom (Grant Agreement No. 633053) and from the EUROfusion consortium.

  6. A statistical study towards high-mass BGPS clumps with the MALT90 survey

    Science.gov (United States)

    Liu, Xiao-Lan; Xu, Jin-Long; Ning, Chang-Chun; Zhang, Chuan-Peng; Liu, Xiao-Tao

    2018-01-01

    In this work, we perform a statistical investigation towards 50 high-mass clumps using data from the Bolocam Galactic Plane Survey (BGPS) and Millimetre Astronomy Legacy Team 90-GHz survey (MALT90). Eleven dense molecular lines (N2H+(1–0), HNC(1–0), HCO+(1–0), HCN(1–0), HN13C(1–0), H13CO+(1–0), C2H(1–0), HC3N(10–9), SiO(2–1), 13CS(2–1) and HNCO(4_{4,0}–3_{0,3})) are detected. N2H+ and HNC are shown to be good tracers for clumps in various evolutionary stages since they are detected in all the fields. The detection rates of N-bearing molecules decrease as the clumps evolve, but those of O-bearing species increase with evolution. Furthermore, the abundance ratios [N2H+]/[HCO+] and log([HC3N]/[HCO+]) decline with log([HCO+]) as two linear functions, respectively. This suggests that N2H+ and HC3N transform to HCO+ as the clumps evolve. We also find that C2H is the most abundant molecule, with an order of magnitude of 10^-8. In addition, three new infall candidates, G010.214–00.324, G011.121–00.128 and G012.215–00.118(a), are discovered to have large-scale infall motions and infall rates with an order of magnitude of 10^-3 M⊙ yr^-1.

  7. GIS application on spatial landslide analysis using statistical based models

    Science.gov (United States)

    Pradhan, Biswajeet; Lee, Saro; Buchroithner, Manfred F.

    2009-09-01

    This paper presents the assessment results of three spatially based probabilistic models using Geoinformation Techniques (GIT) for landslide susceptibility analysis at Penang Island in Malaysia. Landslide locations within the study area were identified by interpreting aerial photographs and satellite images, supported by field surveys. Maps of the topography, soil type, lineaments and land cover were constructed from the spatial data sets. Ten landslide-related factors were extracted from the spatial database, and the frequency ratio, fuzzy logic, and bivariate logistic regression coefficients of each factor were computed. Finally, landslide susceptibility maps were drawn for the study area using the frequency ratio, fuzzy logic and bivariate logistic regression models. For verification, the results of the analyses were compared with actual landslide locations in the study area. The verification results show that the bivariate logistic regression model provides slightly higher prediction accuracy than the frequency ratio and fuzzy logic models.
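
    The frequency ratio mentioned above is commonly defined, for each class of a factor, as the share of landslide cells falling in that class divided by the share of the study area occupied by that class. The sketch below assumes that definition; the slope classes and cell counts are invented for illustration and do not come from the paper.

```python
def frequency_ratios(cells_per_class, landslides_per_class):
    """Frequency ratio per class: landslide share divided by area share.

    A ratio above 1 means landslides are over-represented in the class
    relative to the area the class covers.
    """
    total_cells = sum(cells_per_class.values())
    total_slides = sum(landslides_per_class.values())
    ratios = {}
    for cls, cells in cells_per_class.items():
        slide_share = landslides_per_class.get(cls, 0) / total_slides
        area_share = cells / total_cells
        ratios[cls] = slide_share / area_share
    return ratios


# Hypothetical slope-angle classes (degrees) with raster cell counts.
slope_cells = {"0-15": 600, "15-30": 300, ">30": 100}
slope_slides = {"0-15": 10, "15-30": 30, ">30": 60}
slope_fr = frequency_ratios(slope_cells, slope_slides)
```

    Repeating this for each factor gives one ratio table per factor; how the tables are combined into a susceptibility map is model-specific and is not shown here.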

  8. A Statistical Analysis of Cointegration for I(2) Variables

    DEFF Research Database (Denmark)

    Johansen, Søren

    1995-01-01

    This paper discusses inference for I(2) variables in a VAR model. The estimation procedure suggested consists of two reduced rank regressions. The asymptotic distribution of the proposed estimators of the cointegrating coefficients is mixed Gaussian, which implies that asymptotic inference can be conducted using the χ² distribution. It is shown to what extent inference on the cointegration ranks can be conducted using the tables already prepared for the analysis of cointegration of I(1) variables. New tables are needed for the test statistics to control the size of the tests. This paper contains a multivariate test for the existence of I(2) variables. This test is illustrated using a data set consisting of U.K. and foreign prices and interest rates as well as the exchange rate.

  9. Analysis of Official Suicide Statistics in Spain (1910-2011)

    Directory of Open Access Journals (Sweden)

    2017-01-01

    Full Text Available In this article we examine the evolution of suicide rates in Spain from 1910 to 2011. As something new, we use standardised suicide rates, making them perfectly comparable geographically and in time, as they no longer reflect population structure. Using historical data from a series of socioeconomic variables for all Spain's provinces and applying new techniques for the statistical analysis of panel data, we are able to confirm many of the hypotheses established by Durkheim at the end of the 19th century, especially those related to fertility and marriage rates, age, sex and the aging index. Our findings, however, contradict Durkheim's approach regarding the impact of urbanisation processes and poverty on suicide.

  10. Statistical energy analysis for a compact refrigeration compressor

    Science.gov (United States)

    Lim, Ji Min; Bolton, J. Stuart; Park, Sung-Un; Hwang, Seon-Woong

    2005-09-01

    Traditionally, the vibrational energy level of the components in a compressor is predicted using a deterministic model such as a finite element model. While a deterministic approach requires much detail and computational time for a complete dynamic analysis, statistical energy analysis (SEA) requires much less information and computing time. These benefits are obtained by using data averaged over the frequency and spatial domains instead of using deterministic data directly. In this paper, SEA will be applied to a compact refrigeration compressor for the prediction of the dynamic behavior of each subsystem. Since the compressor used in this application is compact and stiff, the modal densities of its various components are low, especially in the low frequency ranges, and most energy transfers in these ranges are achieved through indirect coupling paths instead of via direct coupling. For this reason, experimental SEA (ESEA), a good tool for the consideration of indirect coupling, was used to derive an SEA formulation. A direct comparison of SEA results and experimental data for an operating compressor will be introduced. The power transfer path analysis at certain frequencies made possible by SEA will also be described to show the advantage of SEA in this application.
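
    As background for the SEA formulation referred to above, the standard steady-state power balance for two coupled subsystems can be written as follows. This is the textbook form, not the paper's compressor model; the symbols are assumptions introduced here: P_i is the input power, E_i the band- and space-averaged energy, η_i the damping loss factor, η_ij the coupling loss factor, n_i the modal density, and ω the band centre angular frequency.

```latex
\begin{align}
P_1 &= \omega\,\eta_1 E_1 + \omega\,(\eta_{12} E_1 - \eta_{21} E_2),\\
P_2 &= \omega\,\eta_2 E_2 + \omega\,(\eta_{21} E_2 - \eta_{12} E_1),\\
n_1\,\eta_{12} &= n_2\,\eta_{21}.
\end{align}
```

    The first two lines state that the power injected into each subsystem is either dissipated there or exchanged with its neighbour; the third is the reciprocity relation that links the two coupling loss factors.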

  11. Spectral signature verification using statistical analysis and text mining

    Science.gov (United States)

    DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.

    2016-05-01

    In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature to arrive at a final qualitative assessment: the textual meta-data and the numerical spectral data. Results associated with the spectral data stored in the Signature Database [1] (SigDB) are proposed. The numerical data comprising a sample material's spectrum are validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum are qualitatively analyzed using lexical analysis text mining. This technique analyzes the syntax of the meta-data to provide local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security [2] (text encryption/decryption), biomedical [3], and marketing [4] applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high and low quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is
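
    The spectral angle mapper comparison mentioned above treats each spectrum as a vector and measures the angle between a test spectrum and a reference, here the mean of a population set. The sketch below is a minimal illustration with invented reflectance values and hypothetical function names; it is not the SigDB implementation.

```python
import math


def mean_spectrum(spectra):
    """Band-wise mean of a list of equally sampled spectra."""
    return [sum(band) / len(spectra) for band in zip(*spectra)]


def spectral_angle(a, b):
    """Angle in radians between two spectra; 0 means identical shape."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)


# Hypothetical population of reference spectra (four bands each).
population = [
    [0.10, 0.20, 0.40, 0.30],
    [0.12, 0.22, 0.38, 0.28],
    [0.08, 0.18, 0.42, 0.32],
]
reference = mean_spectrum(population)

# Rank hypothetical test spectra: a smaller angle is closer to the mean.
tests = {
    "good": [0.20, 0.40, 0.80, 0.60],
    "poor": [0.40, 0.30, 0.10, 0.20],
}
ranking = sorted(tests, key=lambda name: spectral_angle(tests[name], reference))
```

    Because the angle ignores overall scale, a spectrum that is simply brighter or darker than the reference still scores well, which is why "good" above, a doubled copy of the mean shape, ranks first.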

  12. Classification of Malaysia aromatic rice using multivariate statistical analysis

    Science.gov (United States)

    Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.

    2015-05-01

    Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance a specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: they require lengthy training, are prone to fatigue as the number of samples increases, and give inconsistent results. GC-MS analysis, on the other hand, requires detailed procedures and lengthy analysis and is quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN, with low misclassification error, support the above findings, and we may conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.
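
    The leave-one-out evaluation of a K-Nearest Neighbours classifier described above can be sketched in a few lines. The two-sensor readings and variety labels below are invented for illustration, and the function names are hypothetical; the e-nose's actual sensor array and preprocessing are not described in the abstract.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Majority label among the k training samples nearest to query.

    Ties in the vote go to the label of the nearer sample, because
    Counter.most_common keeps first-seen order among equal counts.
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_error(samples, k):
    """Leave-one-out misclassification rate of a k-NN classifier."""
    errors = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # hold out sample i
        if knn_predict(rest, features, k) != label:
            errors += 1
    return errors / len(samples)


# Hypothetical e-nose readings (two sensors) for two rice varieties.
samples = [
    ((1.0, 1.1), "variety_A"),
    ((1.2, 0.9), "variety_A"),
    ((0.9, 1.0), "variety_A"),
    ((3.0, 3.1), "variety_B"),
    ((3.2, 2.9), "variety_B"),
    ((2.9, 3.0), "variety_B"),
]
error_rate = loo_error(samples, k=3)
```

    Each sample is classified by a model that has never seen it, so the resulting error rate is a less optimistic estimate than the error measured on the training data itself.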

  13. Classification of Malaysia aromatic rice using multivariate statistical analysis

    Energy Technology Data Exchange (ETDEWEB)

    Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A. [School of Mechatronic Engineering, Universiti Malaysia Perlis, Kampus Pauh Putra, 02600 Arau, Perlis (Malaysia); Omar, O. [Malaysian Agriculture Research and Development Institute (MARDI), Persiaran MARDI-UPM, 43400 Serdang, Selangor (Malaysia)

    2015-05-15

    Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance a specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: they require lengthy training, are prone to fatigue as the number of samples increases, and give inconsistent results. GC-MS analysis, on the other hand, requires detailed procedures and lengthy analysis and is quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN, with low misclassification error, support the above findings, and we may conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.

  14. A statistical survey of ultralow-frequency wave power and polarization in the Hermean magnetosphere.

    Science.gov (United States)

    James, Matthew K; Bunce, Emma J; Yeoman, Timothy K; Imber, Suzanne M; Korth, Haje

    2016-09-01

    We present a statistical survey of ultralow-frequency wave activity within the Hermean magnetosphere using the entire MErcury Surface, Space ENvironment, GEochemistry, and Ranging magnetometer data set. This study is focused upon wave activity with frequencies [...] Wave activity is mapped to the magnetic equatorial plane of the magnetosphere and to magnetic latitude and local times on Mercury using the KT14 magnetic field model. Wave power mapped to the planetary surface indicates the average location of the polar cap boundary. Compressional wave power is dominant throughout most of the magnetosphere, while azimuthal wave power close to the dayside magnetopause provides evidence that interactions between the magnetosheath and the magnetopause such as the Kelvin-Helmholtz instability may be driving wave activity. Further evidence of this is found in the average wave polarization: left-handed polarized waves dominate the dawnside magnetosphere, while right-handed polarized waves dominate the duskside. A possible field line resonance event is also presented, where a time-of-flight calculation is used to provide an estimated local plasma mass density of ∼240 amu cm^-3.

  15. A Statistic Analysis Of Romanian Seaside Hydro Tourism

    OpenAIRE

    Secara Mirela

    2011-01-01

    Tourism represents one of the ways of spending spare time for rest, recreation, treatment and entertainment, and the specific aspect of the Constanta County economy is the touristic and spa capitalization of the Romanian seaside. In order to analyze hydro tourism on the Romanian seaside we have used statistical indicators within tourism as well as statistical methods such as chronological series, interdependent statistical series, regression and statistical correlation. The major objective of this research is to rai...

  16. Accounting providing of statistical analysis of intangible assets renewal under marketing strategy

    Directory of Open Access Journals (Sweden)

    I.R. Polishchuk

    2016-12-01

    Full Text Available The article analyzes the content of the Regulations on accounting policies of the surveyed enterprises in terms of the operations concerning the amortization of intangible assets on the following criteria: assessment on admission, determination of useful life, the period of depreciation, residual value, depreciation method, reflection in the financial statements, a unit of account, revaluation, formation of fair value. The factors affecting the accounting policies and determining the mechanism for evaluating the completeness and timeliness of intangible assets renewal are characterized. An algorithm for selecting the method of intangible assets amortization is proposed. The knowledge base of statistical analysis of timeliness and completeness of intangible assets renewal in terms of the developed internal reporting is expanded. Statistical indicators to assess the effectiveness of the amortization policy for intangible assets are proposed. The marketing strategies depending on the condition and amount of intangible assets in relation to increasing marketing potential for continuity of economic activity are described.

  17. A statistical survey of dayside pulsed ionospheric flows as seen by the CUTLASS Finland HF radar

    Directory of Open Access Journals (Sweden)

    K. A. McWilliams

    Full Text Available Nearly two years of 2-min resolution data and 7- to 21-s resolution data from the CUTLASS Finland HF radar have undergone Fourier analysis in order to study statistically the occurrence rates and repetition frequencies of pulsed ionospheric flows in the noon-sector high-latitude ionosphere. Pulsed ionospheric flow bursts are believed to be the ionospheric footprint of newly reconnected geomagnetic field lines, which occur during episodes of magnetic flux transfer to the terrestrial magnetosphere - flux transfer events or FTEs. The distribution of pulsed ionospheric flows were found to be well grouped in the radar field of view, and to be in the vicinity of the radar signature of the cusp footprint. Two thirds of the pulsed ionospheric flow intervals included in the statistical study occurred when the interplanetary magnetic field had a southward component, supporting the hypothesis that pulsed ionospheric flows are a reconnection-related phenomenon. The occurrence rate of the pulsed ionospheric flow fluctuation period was independent of the radar scan mode. The statistical results obtained from the radar data are compared to occurrence rates and repetition frequencies of FTEs derived from spacecraft data near the magnetopause reconnection region, and to ground-based optical measurements of poleward moving auroral forms. The distributions obtained by the various instruments in different regions of the magnetosphere were remarkably similar. The radar, therefore, appears to give an unbiased sample of magnetopause activity in its routine observations of the cusp footprint.

    Key words: Magnetospheric physics (magnetosphere-ionosphere interactions; plasma convection; solar wind-magnetosphere interactions)

  18. A statistical survey of dayside pulsed ionospheric flows as seen by the CUTLASS Finland HF radar

    Directory of Open Access Journals (Sweden)

    K. A. McWilliams

    2000-04-01

    Full Text Available Nearly two years of 2-min resolution data and 7- to 21-s resolution data from the CUTLASS Finland HF radar have undergone Fourier analysis in order to study statistically the occurrence rates and repetition frequencies of pulsed ionospheric flows in the noon-sector high-latitude ionosphere. Pulsed ionospheric flow bursts are believed to be the ionospheric footprint of newly reconnected geomagnetic field lines, which occur during episodes of magnetic flux transfer to the terrestrial magnetosphere - flux transfer events or FTEs. The distribution of pulsed ionospheric flows was found to be well grouped in the radar field of view, and to be in the vicinity of the radar signature of the cusp footprint. Two thirds of the pulsed ionospheric flow intervals included in the statistical study occurred when the interplanetary magnetic field had a southward component, supporting the hypothesis that pulsed ionospheric flows are a reconnection-related phenomenon. The occurrence rate of the pulsed ionospheric flow fluctuation period was independent of the radar scan mode. The statistical results obtained from the radar data are compared to occurrence rates and repetition frequencies of FTEs derived from spacecraft data near the magnetopause reconnection region, and to ground-based optical measurements of poleward moving auroral forms. The distributions obtained by the various instruments in different regions of the magnetosphere were remarkably similar. The radar, therefore, appears to give an unbiased sample of magnetopause activity in its routine observations of the cusp footprint.

    Key words: Magnetospheric physics (magnetosphere-ionosphere interactions; plasma convection; solar wind-magnetosphere interactions)

  19. Statistical Analysis Of Tank 5 Floor Sample Results

    Energy Technology Data Exchange (ETDEWEB)

    Shine, E. P.

    2012-08-01

    Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements
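
    The upper 95% confidence bound (UCL95) on a mean concentration named in this abstract can be illustrated with a small worked example. The sketch below assumes the simplest case, a one-sided Student-t bound on approximately normal data, and it ignores the triplicate measurements within each composite. The abstract does not say which of the EPA-recommended procedures the report actually used, and the concentrations and the function name `ucl95` are made up for illustration.

```python
import math
import statistics

# One-sided 95% Student-t critical values, keyed by degrees of freedom.
T95_ONE_SIDED = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015}


def ucl95(values):
    """Upper one-sided 95% confidence limit for the mean (t-based, assumes near-normal data)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    df = n - 1
    if df not in T95_ONE_SIDED:
        raise ValueError("extend T95_ONE_SIDED for this sample size")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return mean + T95_ONE_SIDED[df] * sd / math.sqrt(n)


# Hypothetical analyte concentrations for three composite samples (not from the report).
composites = [10.0, 12.0, 14.0]
print(round(ucl95(composites), 3))
```

    With only three composites the t multiplier is large (2.920 at 2 degrees of freedom), so the bound sits well above the sample mean.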

  20. STATISTICAL ANALYSIS OF TANK 5 FLOOR SAMPLE RESULTS

    Energy Technology Data Exchange (ETDEWEB)

    Shine, E.

    2012-03-14

    Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, radionuclide, inorganic, and anion concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements above their

  1. Statistical Analysis of Tank 5 Floor Sample Results

    Energy Technology Data Exchange (ETDEWEB)

    Shine, E. P.

    2013-01-31

    Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements

  2. Statistical analysis of the operating parameters which affect cupola emissions

    Energy Technology Data Exchange (ETDEWEB)

    Davis, J.W.; Draper, A.B.

    1977-12-01

    A sampling program was undertaken to determine the operating parameters which affected air pollution emission from gray iron foundry cupolas. The experimental design utilized the analysis of variance routine. Four independent variables were selected for examination on the basis of previous work reported in the literature. These were: (1) blast rate; (2) iron-coke ratio; (3) blast temperature; and (4) cupola size. The last variable was chosen since it most directly affects melt rate. Emissions from cupolas for which concern has been expressed are particulate matter and carbon monoxide. The dependent variables were, therefore, particle loading, particle size distribution, and carbon monoxide concentration. Seven production foundries were visited and samples taken under conditions prescribed by the experimental plan. The data obtained from these tests were analyzed using the analysis of variance and other statistical techniques where applicable. The results indicated that blast rate, blast temperature, and cupola size affected particle emissions and the latter two also affected the particle size distribution. The particle size information was also unique in that it showed a consistent particle size distribution at all seven foundries with a sizable fraction of the particles less than 1.0 micrometers in diameter.
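
    The analysis of variance used in this study can be illustrated with a one-factor example. The sketch below computes the F statistic for a single factor only. The study itself examined four independent variables, so this is a simplification, and the particle loadings grouped by blast-rate level are hypothetical values chosen to keep the arithmetic easy to follow.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical particle loadings at low, medium and high blast rate (made-up numbers).
loadings = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(loadings)
print(f_stat, df_b, df_w)
```

    A large F relative to the F distribution with (df_between, df_within) degrees of freedom suggests that the factor affects the response. A multi-factor design such as the one described would partition the variation further.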

  3. Criminal victimization in Ukraine: analysis of statistical data

    Directory of Open Access Journals (Sweden)

    Serhiy Nezhurbida

    2007-12-01

    Full Text Available The article is based on an analysis of statistical data provided by law-enforcement, judicial and other bodies of Ukraine. The analysis gives an accurate quantitative picture of the current status of crime victimization in Ukraine and characterizes its basic features (level, rate, structure, dynamics, etc.).

  4. A statistical design for testing apomictic diversification through linkage analysis.

    Science.gov (United States)

    Zeng, Yanru; Hou, Wei; Song, Shuang; Feng, Sisi; Shen, Lin; Xia, Guohua; Wu, Rongling

    2014-03-01

    The capacity of apomixis to generate maternal clones through seed reproduction has made it a useful characteristic for the fixation of heterosis in plant breeding. It has been observed that apomixis displays pronounced intra- and interspecific diversification, but the genetic mechanisms underlying this diversification remain elusive, obstructing the exploitation of this phenomenon in practical breeding programs. By capitalizing on molecular information in mapping populations, we describe and assess a statistical design that deploys linkage analysis to estimate and test the pattern and extent of apomictic differences at various levels from genotypes to species. The design is based on two reciprocal crosses between two individuals each chosen from a hermaphrodite or monoecious species. A multinomial distribution likelihood is constructed by combining marker information from two crosses. The EM algorithm is implemented to estimate the rate of apomixis and test its difference between two plant populations or species as the parents. The design is validated by computer simulation. A real data analysis of two reciprocal crosses between hickory (Carya cathayensis) and pecan (C. illinoensis) demonstrates the utilization and usefulness of the design in practice. The design provides a tool to address fundamental and applied questions related to the evolution and breeding of apomixis.

  5. Data Analysis & Statistical Methods for Command File Errors

    Science.gov (United States)

    Meshkat, Leila; Waggoner, Bruce; Bryant, Larry

    2014-01-01

    This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained with these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.

  6. Statistics for Community Governance: The Yawuru Indigenous Population Survey, Western Australia

    Directory of Open Access Journals (Sweden)

    John Taylor

    2014-04-01

    Full Text Available This article presents a case study of an exercise in Aboriginal community governance in Australia. It sets out the background events that led the Yawuru Native Title Holders Aboriginal Corporation in the town of Broome on Australia’s northwest coast to secure information for its own needs as an act of self-determination and essential governance, and it presents some of the key findings from that exercise. As the Indigenous rights agenda shifts from the pursuit of restitution to the management and implementation of benefits, those with proprietary rights are finding it increasingly necessary to build internal capacity for post-native title governance and community planning, including in the area of information retrieval and application. As an incorporated land-holding group, the Yawuru people of Broome are amongst the first in Australia to move in this area of information gathering, certainly in terms of the degree of local control, participation, and conceptual thinking around the logistics and rationale for such an exercise. An innovative addition has been the incorporation of survey output data into a Geographic Information System to provide for spatial analysis and a decision support mechanism for local community planning. In launching and administering the "Knowing our Community" household survey in Broome, the Yawuru have set a precedent in the acquisition and application of demographic information for internal planning and community development in the post-native title determination era.

  7. Statistical analysis of cone penetration resistance of railway ballast

    Directory of Open Access Journals (Sweden)

    Saussine Gilles

    2017-01-01

    Full Text Available Dynamic penetrometer tests are widely used in geotechnical studies for soils characterization but their implementation tends to be difficult. The light penetrometer test is able to give information about a cone resistance useful in the field of geotechnics and recently validated as a parameter for the case of coarse granular materials. In order to characterize directly the railway ballast on track and sublayers of ballast, a huge test campaign has been carried out for more than 5 years in order to build up a database composed of 19,000 penetration tests including endoscopic video record on the French railway network. The main objective of this work is to give a first statistical analysis of cone resistance in the coarse granular layer which represents a major component of railway track: the ballast. The results show that the cone resistance (qd) increases with depth and presents strong variations corresponding to layers of different natures identified using the endoscopic records. In the first zone corresponding to the top 30cm, (qd) increases linearly with a slope of around 1MPa/cm for fresh ballast and fouled ballast. In the second zone below 30cm deep, (qd) increases more slowly with a slope of around 0,3MPa/cm and decreases below 50cm. These results show that there is no clear difference between fresh and fouled ballast. Hence, the (qd) sensitivity is important and increases with depth. The (qd) distribution for a set of tests does not follow a normal distribution. In the upper 30cm layer of ballast of track, data statistical treatment shows that train load and speed do not have any significant impact on the (qd) distribution for clean ballast; they increase by 50% the average value of (qd) for fouled ballast and increase the thickness as well. Below the 30cm upper layer, train load and speed have a clear impact on the (qd) distribution.

  8. Statistical analysis of cone penetration resistance of railway ballast

    Science.gov (United States)

    Saussine, Gilles; Dhemaied, Amine; Delforge, Quentin; Benfeddoul, Selim

    2017-06-01

    Dynamic penetrometer tests are widely used in geotechnical studies for soils characterization but their implementation tends to be difficult. The light penetrometer test is able to give information about a cone resistance useful in the field of geotechnics and recently validated as a parameter for the case of coarse granular materials. In order to characterize directly the railway ballast on track and sublayers of ballast, a huge test campaign has been carried out for more than 5 years in order to build up a database composed of 19,000 penetration tests including endoscopic video record on the French railway network. The main objective of this work is to give a first statistical analysis of cone resistance in the coarse granular layer which represents a major component of railway track: the ballast. The results show that the cone resistance (qd) increases with depth and presents strong variations corresponding to layers of different natures identified using the endoscopic records. In the first zone corresponding to the top 30cm, (qd) increases linearly with a slope of around 1MPa/cm for fresh ballast and fouled ballast. In the second zone below 30cm deep, (qd) increases more slowly with a slope of around 0,3MPa/cm and decreases below 50cm. These results show that there is no clear difference between fresh and fouled ballast. Hence, the (qd) sensitivity is important and increases with depth. The (qd) distribution for a set of tests does not follow a normal distribution. In the upper 30cm layer of ballast of track, data statistical treatment shows that train load and speed do not have any significant impact on the (qd) distribution for clean ballast; they increase by 50% the average value of (qd) for fouled ballast and increase the thickness as well. Below the 30cm upper layer, train load and speed have a clear impact on the (qd) distribution.

  9. Tucker Tensor analysis of Matern functions in spatial statistics

    KAUST Repository

    Litvinenko, Alexander

    2017-11-18

    In this work, we describe advanced numerical tools for working with multivariate functions and for the analysis of large data sets. These tools will drastically reduce the required computing time and the storage cost, and, therefore, will allow us to consider much larger data sets or finer meshes. Covariance matrices are crucial in spatio-temporal statistical tasks, but are often very expensive to compute and store, especially in 3D. Therefore, we approximate covariance functions by cheap surrogates in a low-rank tensor format. We apply the Tucker and canonical tensor decompositions to a family of Matern- and Slater-type functions with varying parameters and demonstrate numerically that their approximations exhibit exponentially fast convergence. We prove the exponential convergence of the Tucker and canonical approximations in tensor rank parameters. Several statistical operations are performed in this low-rank tensor format, including evaluating the conditional covariance matrix, spatially averaged estimation variance, computing a quadratic form, determinant, trace, loglikelihood, inverse, and Cholesky decomposition of a large covariance matrix. Low-rank tensor approximations reduce the computing and storage costs essentially. For example, the storage cost is reduced from an exponential O(n^d) to a linear scaling O(drn), where d is the spatial dimension, n is the number of mesh points in one direction, and r is the tensor rank. Prerequisites for applicability of the proposed techniques are the assumptions that the data, locations, and measurements lie on a tensor (axes-parallel) grid and that the covariance function depends on a distance, ||x-y||.
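
    The storage figures quoted in this abstract can be checked with simple arithmetic. The sketch below counts stored entries for a full tensor (n^d) and for d factor matrices of size n by r (d*r*n), as stated in the abstract. The extra r^d core tensor of a Tucker representation is included as a separate term. The function names and the sample values of n, d and r are illustrative assumptions.

```python
def full_storage(n, d):
    """Entries in a full d-dimensional tensor with n mesh points per direction."""
    return n ** d


def factor_storage(n, d, r):
    """Entries in d factor matrices of size n x r (the O(drn) term)."""
    return d * r * n


def tucker_storage(n, d, r):
    """Factor matrices plus an r^d core tensor, as in a Tucker representation."""
    return factor_storage(n, d, r) + r ** d


n, d, r = 100, 3, 10  # illustrative values only
print(full_storage(n, d))       # 1000000
print(factor_storage(n, d, r))  # 3000
print(tucker_storage(n, d, r))  # 4000
```

    For these values the low-rank formats need a few thousand entries instead of a million, and the gap widens quickly as n or d grows while r stays small.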

  10. STATISTICAL ANALYSIS OF RAW SUGAR MATERIAL FOR SUGAR PRODUCER COMPLEX

    Directory of Open Access Journals (Sweden)

    A. A. Gromkovskii

    2015-01-01

    Full Text Available Summary. The article examines statistical data on the development of the average weight and average sugar content of sugar beet roots. Successful forecasting of these raw-material indices is essential for solving control problems in the sugar-producing complex. Calculation of the autocorrelation function demonstrates that the trend component predominates in the growth of the raw-material characteristics. First- and second-order autoregressive models are proposed for constructing the prediction model. It is shown that, despite the small amount of experimental data provided by the laboratories of sugar-producing enterprises, the use of autoregression is justified. The proposed model correctly tracks the dynamics of the raw-material indices over time, which the estimates confirm. The article highlights that, when the trend component predominates in the dynamics of the studied sugar beet characteristics, the proposed prediction models provide a better-quality forecast. Where the curve describing the change in raw-material performance contains oscillating portions, a better forecast requires a larger number of measurements. The article also presents the results of using Brown's adaptive prediction model to forecast sugar beet raw-material performance. The statistical analysis supports conclusions about the level of quality sufficient to describe changes in the raw-material indices for forecast development. The optimal data discount rates are identified; they are determined by the shape of the growth curves of beet root sugar content and mass during maturation. Conclusions are formulated on how forecast quality depends on these factors, which are set by the expert forecaster. The article gives the calculated expression, derived from experimental data, that makes it possible to calculate changes in the raw-material characteristics of sugar beet during maturation.
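
    The adaptive Brown's model mentioned in this abstract is not specified further, so the sketch below assumes the common double exponential smoothing form, in which a smoothing constant `alpha` plays the role of the data discount rate. The series values, the choice of `alpha`, and the function name `brown_forecast` are hypothetical and are not taken from the article.

```python
def brown_forecast(series, alpha, horizon=1):
    """Forecast `horizon` steps ahead with Brown's double exponential smoothing."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not series:
        raise ValueError("series must not be empty")
    # Initialise both smoothed series at the first observation (one common choice).
    s1 = s2 = series[0]
    for x in series[1:]:
        s1 = alpha * x + (1 - alpha) * s1   # first smoothing of the data
        s2 = alpha * s1 + (1 - alpha) * s2  # second smoothing of the smoothed data
    level = 2 * s1 - s2
    trend = alpha / (1 - alpha) * (s1 - s2)
    return level + horizon * trend


# Hypothetical weekly measurements of a steadily growing index (made-up numbers).
index_values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(brown_forecast(index_values, alpha=0.5))
```

    A larger `alpha` discounts older measurements faster, which helps when the trend changes but makes the forecast more sensitive to noise in short series.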

  11. Summary Health Statistics for U.S. Adults: National Health Interview Survey, 2009. Data from the National Health Interview Survey. Vital and Health Statistics. Series 10, Number 249. DHHS Publication No. (PHS) 2011-1577

    Science.gov (United States)

    Pleis, J. R.; Ward, B. W.; Lucas, J. W.

    2010-01-01

    Objectives: This report presents health statistics from the 2009 National Health Interview Survey (NHIS) for the civilian noninstitutionalized adult population, classified by sex, age, race and ethnicity, education, family income, poverty status, health insurance coverage, marital status, and place and region of residence. Estimates are presented…

  12. GASPS--A Herschel Survey of Gas and Dust in Protoplanetary Disks: Summary and Initial Statistics

    Science.gov (United States)

    Dent, W.R.F.; Thi, W. F.; Kamp, I.; Williams, J. P.; Menard, F.; Andrews, S.; Ardila, D.; Aresu, G.; Augereau, J.-C.; Barrado y Navascues, D.

    2013-01-01

    We describe a large-scale far-infrared line and continuum survey of protoplanetary disk through to young debris disk systems carried out using the PACS instrument on the Herschel Space Observatory. This Open Time Key program, known as GASPS (Gas Survey of Protoplanetary Systems), targeted approx. 250 young stars in narrow wavelength regions covering the [OI] fine structure line at 63 micron, the brightest far-infrared line in such objects. A subset of the brightest targets were also surveyed in [OI] 145 micron, [CII] at 157 micron, as well as several transitions of H2O and high-excitation CO lines at selected wavelengths between 78 and 180 micron. Additionally, GASPS included continuum photometry at 70, 100 and 160 micron, around the peak of the dust emission. The targets were SED Class II-III T Tauri stars and debris disks from seven nearby young associations, along with a comparable sample of isolated Herbig AeBe stars. The aim was to study the global gas and dust content in a wide sample of circumstellar disks, combining the results with models in a systematic way. In this overview paper we review the scientific aims, target selection and observing strategy of the program. We summarize some of the initial results, showing line identifications, listing the detections, and giving a first statistical study of line detectability. The [OI] line at 63 micron was the brightest line seen in almost all objects, by a factor of 10. Overall [OI] 63 micron detection rates were 49%, with 100% of HAeBe stars and 43% of T Tauri stars detected. A comparison with published disk dust masses (derived mainly from sub-mm continuum, assuming standard values of the mm mass opacity) shows a dust mass threshold for [OI] 63 micron detection of approx. 10^-5 Msolar. Normalizing to a distance of 140 pc, 84% of objects with dust masses >=10^-5 Msolar can be detected in this line in the present survey; 32% of those of mass 10^-6 to 10^-5 Msolar, and only a very small number

  13. GASPS—A Herschel Survey of Gas and Dust in Protoplanetary Disks: Summary and Initial Statistics

    Science.gov (United States)

    Dent, W. R. F.; Thi, W. F.; Kamp, I.; Williams, J. P.; Menard, F.; Andrews, S.; Ardila, D.; Aresu, G.; Augereau, J.-C.; Barrado y Navascues, D.; Brittain, S.; Carmona, A.; Ciardi, D.; Danchi, W.; Donaldson, J.; Duchene, G.; Eiroa, C.; Fedele, D.; Grady, C.; de Gregorio-Molsalvo, I.; Howard, C.; Huélamo, N.; Krivov, A.; Lebreton, J.; Liseau, R.; Martin-Zaidi, C.; Mathews, G.; Meeus, G.; Mendigutía, I.; Montesinos, B.; Morales-Calderon, M.; Mora, A.; Nomura, H.; Pantin, E.; Pascucci, I.; Phillips, N.; Pinte, C.; Podio, L.; Ramsay, S. K.; Riaz, B.; Riviere-Marichalar, P.; Roberge, A.; Sandell, G.; Solano, E.; Tilling, I.; Torrelles, J. M.; Vandenbusche, B.; Vicente, S.; White, G. J.; Woitke, P.

    2013-05-01

    We describe a large-scale far-infrared line and continuum survey of protoplanetary disk through to young debris disk systems carried out using the PACS instrument on the Herschel Space Observatory. This Open Time Key program, known as GASPS (Gas Survey of Protoplanetary Systems), targeted ~250 young stars in narrow wavelength regions covering the [OI] fine structure line at 63 μm, the brightest far-infrared line in such objects. A subset of the brightest targets were also surveyed in [OI] 145 μm, [CII] at 157 μm, as well as several transitions of H2O and high-excitation CO lines at selected wavelengths between 78 and 180 μm. Additionally, GASPS included continuum photometry at 70, 100 and 160 μm, around the peak of the dust emission. The targets were SED Class II-III T Tauri stars and debris disks from seven nearby young associations, along with a comparable sample of isolated Herbig AeBe stars. The aim was to study the global gas and dust content in a wide sample of circumstellar disks, combining the results with models in a systematic way. In this overview paper we review the scientific aims, target selection and observing strategy of the program. We summarise some of the initial results, showing line identifications, listing the detections, and giving a first statistical study of line detectability. The [OI] line at 63 μm was the brightest line seen in almost all objects, by a factor of ~10. Overall [OI] 63 μm detection rates were 49%, with 100% of HAeBe stars and 43% of T Tauri stars detected. A comparison with published disk dust masses (derived mainly from sub-mm continuum, assuming standard values of the mm mass opacity) shows a dust mass threshold for [OI] 63 μm detection of ~10^-5 Msolar. Normalising to a distance of 140 pc, 84% of objects with dust masses >=10^-5 Msolar can be detected in this line in the present survey; 32% of those of mass 10^-6 to 10^-5 Msolar, and only a very small number of unusual objects with lower masses can be detected. This is

  14. The Inappropriate Symmetries of Multivariate Statistical Analysis in Geometric Morphometrics.

    Science.gov (United States)

    Bookstein, Fred L

    In today's geometric morphometrics the commonest multivariate statistical procedures, such as principal component analysis or regressions of Procrustes shape coordinates on Centroid Size, embody a tacit roster of symmetries (axioms concerning the homogeneity of the multiple spatial domains or descriptor vectors involved) that do not correspond to actual biological fact. These techniques are hence inappropriate for any application regarding which we have a-priori biological knowledge to the contrary (e.g., genetic/morphogenetic processes common to multiple landmarks, the range of normal in anatomy atlases, the consequences of growth or function for form). But nearly every morphometric investigation is motivated by prior insights of this sort. We therefore need new tools that explicitly incorporate these elements of knowledge, should they be quantitative, to break the symmetries of the classic morphometric approaches. Some of these are already available in our literature but deserve to be known more widely: deflated (spatially adaptive) reference distributions of Procrustes coordinates, Sewall Wright's century-old variant of factor analysis, the geometric algebra of importing explicit biomechanical formulas into Procrustes space. Other methods, not yet fully formulated, might involve parameterized models for strain in idealized forms under load, principled approaches to the separation of functional from Brownian aspects of shape variation over time, and, in general, a better understanding of how the formalism of landmarks interacts with the many other approaches to quantification of anatomy. To more powerfully organize inferences from the high-dimensional measurements that characterize so much of today's organismal biology, tomorrow's toolkit must rely neither on principal component analysis nor on the Procrustes distance formula, but instead on sound prior biological knowledge as expressed in formulas whose coefficients are not all the same. I describe the problems of

  15. A new statistic to express the uncertainty of kriging predictions for purposes of survey planning.

    Science.gov (United States)

    Lark, R. M.; Lapworth, D. J.

    2014-05-01

    It is well known that one advantage of kriging for spatial prediction is that, given the random effects model, the prediction error variance can be computed a priori for alternative sampling designs. This allows one to compare sampling schemes, in particular sampling at different densities, and so to decide on one which meets requirements in terms of the uncertainty of the resulting predictions. However, the planning of sampling schemes must account not only for statistical considerations, but also for logistics and cost. This requires effective communication between statisticians, soil scientists and data users/sponsors such as managers, regulators or civil servants. In our experience the latter parties are not necessarily able to interpret the prediction error variance as a measure of uncertainty for decision making. In some contexts (particularly the solution of very specific problems at large cartographic scales, e.g. site remediation and precision farming) it is possible to translate uncertainty of predictions into a loss function directly comparable with the cost incurred in increasing precision. Often, however, sampling must be planned for more generic purposes (e.g. baseline or exploratory geochemical surveys). In this latter context the prediction error variance may be of limited value to a non-statistician who has to make a decision on sample intensity and associated cost. We propose an alternative criterion for these circumstances to aid communication between statisticians and data users about the uncertainty of geostatistical surveys based on different sampling intensities. The criterion is the consistency of estimates made from two non-coincident instantiations of a proposed sample design. We consider square sample grids in which one instantiation is offset from the second by half the grid spacing along the rows and along the columns.
If a sample grid is coarse relative to the important scales of variation in the target property then the consistency of predictions

  16. Statistical Analysis of Data with Non-Detectable Values

    Energy Technology Data Exchange (ETDEWEB)

    Frome, E.L.

    2004-08-26

    Environmental exposure measurements are, in general, positive and may be subject to left censoring, i.e. the measured value is less than a "limit of detection". In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. A basic problem of interest in environmental risk assessment is to determine if the mean concentration of an analyte is less than a prescribed action level. Parametric methods, used to determine acceptable levels of exposure, are often based on a two parameter lognormal distribution. The mean exposure level and/or an upper percentile (e.g. the 95th percentile) are used to characterize exposure levels, and upper confidence limits are needed to describe the uncertainty in these estimates. In certain situations it is of interest to estimate the probability of observing a future (or "missed") value of a lognormal variable. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left censored lognormal data are described and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on the 95th percentile (i.e. the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known but computational complexity has limited their use in routine data analysis with left censored data. The recent development of the R environment for statistical
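
    The maximum likelihood idea for left-censored lognormal data described in this record can be sketched as follows. This is only an illustrative sketch, not the report's own code: the function names, the coarse grid search (standing in for a proper optimizer) and the sample values are all assumptions. Each detected value contributes the lognormal density, and each non-detect contributes the probability of falling below its limit of detection.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def censored_lognormal_loglik(mu, sigma, detects, limits):
    """Log-likelihood of log-scale parameters (mu, sigma).

    detects: measured values above the limit of detection (LOD)
    limits:  the LOD that applied to each non-detect (left-censored value)
    """
    ll = 0.0
    for x in detects:
        z = (math.log(x) - mu) / sigma
        # lognormal log-density of an observed value
        ll += -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma) - math.log(x)
    for lod in limits:
        z = (math.log(lod) - mu) / sigma
        # probability that the true value lies below the LOD (guarded against underflow)
        ll += math.log(max(norm_cdf(z), 1e-300))
    return ll

def grid_mle(detects, limits):
    """Coarse grid search over (mu, sigma); a stand-in for a real optimizer."""
    best = (None, None, -math.inf)
    for i in range(151):            # mu from -1.0 to 2.0 in steps of 0.02
        mu = -1.0 + 0.02 * i
        for j in range(141):        # sigma from 0.2 to 3.0 in steps of 0.02
            sigma = 0.2 + 0.02 * j
            ll = censored_lognormal_loglik(mu, sigma, detects, limits)
            if ll > best[2]:
                best = (mu, sigma, ll)
    return best

# Hypothetical exposure data: five detects and three non-detects with LOD 0.5
detects = [0.8, 1.2, 2.5, 3.1, 5.0]
limits = [0.5, 0.5, 0.5]
mu_hat, sigma_hat, ll_hat = grid_mle(detects, limits)
print(mu_hat, sigma_hat, ll_hat)
```

    Because the non-detects carry information that some values lie below the limit, the estimated log-scale mean falls below the mean of the logged detects alone, which is the point of using the censored likelihood rather than discarding or substituting the non-detects.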

  17. Business Sample Survey Measurement on Statistical Thinking and Methods Adoption: the Case of Croatian Small Enterprises

    National Research Council Canada - National Science Library

    Zmuk, Berislav

    2015-01-01

    The objective of this research is to investigate attitudes of management in Croatian small enterprises that use statistical methods towards statistical thinking in order to gain an insight into related issues...

  18. The statistical investigation of the First and Second Byurakan survey galaxies and their neighbors

    Science.gov (United States)

    Nazaryan, Tigran A.

    2014-05-01

    In the thesis we study close pairs of galaxies with the aim of understanding the influence of gravitational interaction on nuclear activity and star formation of paired galaxies. For this purpose we investigate dependences of integral parameters of galaxies, their star formation and properties of nuclei on kinematic parameters of systems and their large-scale environment. The thesis has an introduction, three main chapters, a summary, lists of abbreviations and references, and three appendices. In the first chapter, the methods used to select the sample of pairs of galaxies and to measure the physical parameters of the First Byurakan Survey (Markarian) galaxies and their neighbors are presented, and the databases in appendices A and B are described, which contain parameters of neighbors of Markarian galaxies measured by us, and the parameters of pairs having Markarian galaxies, based on the Sloan Digital Sky Survey (SDSS) data. The selection effects of the sample of pairs are discussed, and a statistical comparison of Markarian galaxies and their neighbors is made. The results of a statistical study of star formation and activity of nuclei in pairs having Markarian galaxies are presented, as well as the correlations between properties of galaxies in pairs and the physical mechanisms behind them. In the second chapter, the results of a statistical study of the Second Byurakan Survey (SBS) galaxies and their neighbors, and of star formation and activity of nuclei in those pairs, are presented and discussed. In the third chapter, possibilities of using supernovae as indicators of star formation are discussed, the sample of supernovae in pairs of galaxies is presented, and star formation in pairs of interacting galaxies is studied by means of that sample of supernovae. A conclusion about the nature of progenitors of different types of supernovae is also drawn. A short summary of the main results of the study concludes the thesis. The thesis has 158 pages. The main results

  19. Guasom Analysis Of The Alhambra Survey

    Science.gov (United States)

    Garabato, Daniel; Manteiga, Minia; Dafonte, Carlos; Álvarez, Marco A.

    2017-10-01

    GUASOM is a data mining tool designed for knowledge discovery in large astronomical spectrophotometric archives, developed in the framework of Gaia DPAC (Data Processing and Analysis Consortium). Our tool is based on a type of unsupervised-learning artificial neural network called self-organizing maps (SOMs). SOMs permit the grouping and visualization of large amounts of data for which there is no a priori knowledge, and hence they are very useful for analyzing the huge amount of information present in modern spectrophotometric surveys. SOMs are used to organize the information in clusters of objects, as homogeneously as possible according to their spectral energy distributions, and to project them onto a 2D grid where the data structure can be visualized. Each cluster has a representative, called a prototype, which is a virtual pattern that best represents or resembles the set of input patterns belonging to that cluster. Prototypes make it easier to determine the physical nature and properties of the objects populating each cluster. Our algorithm has been tested on the ALHAMBRA survey spectrophotometric observations; here we present our results concerning the survey segmentation, visualization of the data structure, separation between types of objects (stars and galaxies), data homogeneity of neurons, cluster prototypes, redshift distribution and crossmatch with other databases (Simbad).

  20. Turking Statistics: Student-Generated Surveys Increase Student Engagement and Performance

    Science.gov (United States)

    Whitley, Cameron T.; Dietz, Thomas

    2018-01-01

    Thirty years ago, Hubert M. Blalock Jr. published an article in "Teaching Sociology" about the importance of teaching statistics. We honor Blalock's legacy by assessing how using Amazon Mechanical Turk (MTurk) in statistics classes can enhance student learning and increase statistical literacy among social science graduate students. In…

  1. Statistical analysis as approach to conductive heat transfer modelling

    Science.gov (United States)

    Antonyová, A.; Antony, P.

    2013-04-01

    The main inspiration for this article was the problem of high investment in the installation of building insulation. The question of its effectiveness and reliability after a period of 10 or 15 years was the topic of the international research project "Detection and Management of Risk Processes in Building Insulation" (SRDA SK-AT-0008-10), carried out at the University of Prešov in Prešov and the Vienna University of Technology. The need to detect moisture in particular as a risk process in the space between the wall and the insulation led to the construction of new measuring equipment that tests moisture and temperature without destroying the insulation and can therefore describe the real situation in old buildings as well. Further investigation allowed us to analyse a data set of 1680 measurements and to express conductive heat transfer using the methods of statistical analysis. The modelling comprises relationships among the environmental properties inside the building, in the space between the wall and the insulation, and in the ambient surroundings of the building. A radial distribution function also characterizes the connection of the temperature differences.

  2. Utility green pricing programs: A statistical analysis of program effectiveness

    Energy Technology Data Exchange (ETDEWEB)

    Wiser, Ryan; Olson, Scott; Bird, Lori; Swezey, Blair

    2004-02-01

    Development of renewable energy. Such programs have grown in number in recent years. The design features and effectiveness of these programs vary considerably, however, leading a variety of stakeholders to suggest specific marketing and program design features that might improve customer response and renewable energy sales. This report analyzes actual utility green pricing program data to provide further insight into which program features might help maximize both customer participation in green pricing programs and the amount of renewable energy purchased by customers in those programs. Statistical analysis is performed on both the residential and non-residential customer segments. Data comes from information gathered through a questionnaire completed for 66 utility green pricing programs in early 2003. The questionnaire specifically gathered data on residential and non-residential participation, amount of renewable energy sold, program length, the type of renewable supply used, program price/cost premiums, types of consumer research and program evaluation performed, different sign-up options available, program marketing efforts, and ancillary benefits offered to participants.

  3. Measurement of Plethysmogram and Statistical Method for Analysis

    Science.gov (United States)

    Shimizu, Toshihiro

    The plethysmogram, which depends sensitively on the physical and mental state of the human body, is measured at different points of the body using a photo interrupter. In this paper statistical methods of data analysis are investigated to discuss the dependence of the plethysmogram on stress and aging. The first is a representation method based on the return map, which provides useful information on the waveform, the fluctuation in phase and the fluctuation in amplitude. The return map method makes it possible to understand the fluctuation of the plethysmogram in amplitude and in phase more clearly and globally than the conventional power spectrum method. The second is the Lissajous plot and the correlation function, used to analyze the phase difference between the plethysmograms of the right fingertip and of the left fingertip. The third is the R-index, from which we can estimate "the age of the blood flow". The R-index is defined by the global character of the plethysmogram, which distinguishes it from the usual APG-index. The stress- and age-dependence of the plethysmogram is discussed using these methods.

  4. Corrected Statistical Energy Analysis Model for Car Interior Noise

    Directory of Open Access Journals (Sweden)

    A. Putra

    2015-01-01

    Full Text Available Statistical energy analysis (SEA) is a well-known method to analyze the flow of acoustic and vibration energy in a complex structure. For an acoustic space where significant absorptive materials are present, the direct field component from the sound source, rather than the reverberant field, dominates the total sound field, whereas the reverberant field is the basis for constructing the conventional SEA model. Such an environment can be found in a car interior, and thus a corrected SEA model is proposed here to handle this situation. The model is developed by eliminating the direct field component from the total sound field, so that only the power after the first reflection is considered. A test car cabin was divided into two subsystems and, by using a loudspeaker as a sound source, the power injection method in SEA was employed to obtain the corrected coupling loss factor and the damping loss factor from the corrected SEA model. These parameters were then used to predict the sound pressure level (SPL) in the interior cabin using the injected input power from the engine. The results show satisfactory agreement with the directly measured SPL.

  5. Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis: Preprint

    Energy Technology Data Exchange (ETDEWEB)

    Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad

    2015-12-08

    Uncertainties associated with solar forecasts present challenges to maintaining grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature contributed only approximately 0.12% to the NRMSE of the power output, as opposed to 7.44% from the forecasted solar irradiance.
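
    As a rough illustration of the NRMSE metric named in this record, the sketch below divides the root mean squared error by a normalizing constant. The record does not state which normalization the study used; normalizing by the 51-kW plant capacity is an assumption here, as are the function name and the sample values.

```python
import math

def nrmse(forecast, observed, norm):
    """Root mean squared error divided by a normalizing constant `norm`."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(forecast)
    return math.sqrt(mse) / norm

# Hypothetical power values in kW; 51.0 is the plant capacity (assumed normalizer)
observed = [0.0, 12.0, 30.0, 45.0, 20.0]
forecast = [0.0, 15.0, 27.0, 40.0, 24.0]
print(nrmse(forecast, observed, norm=51.0))
```

    Other common normalizers are the mean or the range of the observations; the choice changes the scale of the metric but not the ranking of forecasts evaluated on the same data.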

  6. Statistical analysis of CSP plants by simulating extensive meteorological series

    Science.gov (United States)

    Pavón, Manuel; Fernández, Carlos M.; Silva, Manuel; Moreno, Sara; Guisado, María V.; Bernardos, Ana

    2017-06-01

    The feasibility analysis of any power plant project needs an estimate of the amount of energy the plant will be able to deliver to the grid during its lifetime. To achieve this, the feasibility study requires precise knowledge of the solar resource over a long-term period. In Concentrating Solar Power (CSP) projects, financing institutions typically require several statistical probability of exceedance scenarios of the expected electric energy output. Currently, the industry assumes a correlation between probabilities of exceedance of annual Direct Normal Irradiance (DNI) and energy yield. In this work, this assumption is tested by simulating the energy yield of CSP plants using as input a 34-year series of measured meteorological parameters and solar irradiance. The results of this work show that, even if some correspondence between the probabilities of exceedance of annual DNI values and energy yields is found, the intra-annual distribution of DNI may significantly affect this correlation. This result highlights the need for standardized procedures for elaborating DNI time series representative of a given probability of exceedance of annual DNI.
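
    A probability of exceedance scenario such as P50 or P90 is the value that the annual quantity exceeds with the stated probability, so P90 corresponds to the 10th percentile of the annual series. The sketch below reads this off an empirical sample by linear interpolation. It is only an illustration: the function name and the annual DNI figures are hypothetical, and the record does not say how the authors computed their scenarios.

```python
def exceedance_value(annual_values, prob):
    """Value exceeded with probability `prob` (e.g. 0.9 for P90).

    Uses linear interpolation between order statistics of the sample.
    """
    xs = sorted(annual_values)
    n = len(xs)
    # P(X > v) = prob corresponds to the (1 - prob) quantile
    pos = (1.0 - prob) * (n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])

# Hypothetical annual DNI totals in kWh/m^2 (not data from the study)
annual_dni = [2010, 1950, 2100, 1890, 2050, 1980, 2120, 1930, 2070, 2000, 1960]
print("P50:", exceedance_value(annual_dni, 0.5))
print("P90:", exceedance_value(annual_dni, 0.9))
```

    Applying the same function to simulated annual energy yields and comparing the years selected at each probability level is one simple way to check how well DNI scenarios and yield scenarios line up.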

  7. Statistical Analysis of Loss of Offsite Power Events

    Directory of Open Access Journals (Sweden)

    Andrija Volkanovski

    2016-01-01

    Full Text Available This paper presents the results of the statistical analysis of the loss of offsite power (LOOP) events registered in four reviewed databases. The reviewed databases include the IRSN (Institut de Radioprotection et de Sûreté Nucléaire) SAPIDE database and the GRS (Gesellschaft für Anlagen- und Reaktorsicherheit mbH) VERA database, reviewed over the period from 1992 to 2011. The US NRC (Nuclear Regulatory Commission) Licensee Event Reports (LERs) database and the IAEA International Reporting System (IRS) database were screened for relevant events registered over the period from 1990 to 2013. The number of LOOP events in each year of the analysed period and the mode of operation are assessed during the screening. The LOOP frequencies obtained for the French and German nuclear power plants (NPPs) during critical operation are of the same order of magnitude, with plant-related events as the dominant contributor. A frequency of one LOOP event per shutdown year is obtained for German NPPs in shutdown mode of operation. For the US NPPs, the obtained LOOP frequency for critical and shutdown mode is comparable to the one assessed in NUREG/CR-6890. A decreasing trend is obtained for the LOOP events registered in three databases (IRSN, GRS, and NRC).

  8. Statistical Analysis of Development Trends in Global Renewable Energy

    Directory of Open Access Journals (Sweden)

    Marina D. Simonova

    2016-01-01

    Full Text Available The article focuses on the economic and statistical analysis of industries associated with the use of renewable energy sources in several countries. The dynamic development and implementation of technologies based on renewable energy sources (hereinafter RES) is the defining trend of world energy development. The uneven distribution of hydrocarbon reserves, the increasing demand of developing countries and the environmental risks associated with the production and consumption of fossil resources have led to increasing interest in this field from many states. Creating low-carbon economies involves implementing plans to increase the proportion of clean energy through renewable energy sources, energy efficiency, and reduced greenhouse gas emissions. The priority given to this sector is a characteristic feature of the modern development of developed (USA, EU, Japan) and emerging economies (China, India, Brazil, etc.), as evidenced by the inclusion of the development of this segment in state energy strategies and the revision of existing approaches to energy security. The analysis of the use of renewable energy and of its contribution to the value added of producer countries is of particular interest. Over the last decade, the share of energy produced from renewable sources in the energy balances of the world's largest economies increased significantly. Every year the power-generating capacity based on renewable energy grows; this trend is especially apparent in China, the USA and European Union countries. There is a significant increase in direct investment in renewable energy. The total investment over the past ten years increased 5.6-fold. The most rapidly developing kinds are solar energy and wind power.

  9. Statistical analysis and optimization of igbt manufacturing flow

    Directory of Open Access Journals (Sweden)

    Baranov V. V.

    2015-02-01

    Full Text Available The use of computer simulation in the design and optimization of technological processes for forming power electronic devices can significantly reduce development time, improve the accuracy of calculations, and help choose the best implementation options on the basis of strict mathematical analysis. One of the most common power electronic devices is the insulated gate bipolar transistor (IGBT), which combines the advantages of the MOSFET and the bipolar transistor. The demanding requirements for these devices can be met only by optimizing the device design and the manufacturing process parameters. Statistical analysis is therefore an important and necessary step in the modern cycle of IC design and manufacturing. A procedure for optimizing the IGBT threshold voltage was implemented. Through screening experiments according to the Plackett-Burman design, the most important input parameters (factors), those with the greatest impact on the output characteristic, were detected. The coefficients of the approximation polynomial that adequately describes the relationship between the input parameters and the investigated output characteristics were determined. Using the calculated approximation polynomial, a series of repeated calculations in a Monte Carlo cycle was carried out to determine the spread of threshold voltage values at selected ranges of deviation of the input parameters. Combinations of input process parameter values were drawn randomly from a normal distribution within a given range of changes. The procedure for optimizing the IGBT process parameters consists of the mathematical problem of determining the range of values of the significant structural and technological input parameters that keeps the IGBT threshold voltage within a given interval. The presented results demonstrate the effectiveness of the proposed optimization techniques.

  10. Hierarchical models and the analysis of bird survey information

    Science.gov (United States)

    Sauer, J.R.; Link, W.A.

    2003-01-01

    Management of birds often requires analysis of collections of estimates. We describe a hierarchical modeling approach to the analysis of these data, in which parameters associated with the individual species estimates are treated as random variables, and probability statements are made about the species parameters conditioned on the data. A Markov-Chain Monte Carlo (MCMC) procedure is used to fit the hierarchical model. This approach is computer intensive, and is based upon simulation. MCMC allows for estimation both of parameters and of derived statistics. To illustrate the application of this method, we use the case in which we are interested in attributes of a collection of estimates of population change. Using data for 28 species of grassland-breeding birds from the North American Breeding Bird Survey, we estimate the number of species with increasing populations, provide precision-adjusted rankings of species trends, and describe a measure of population stability as the probability that the trend for a species is within a certain interval. Hierarchical models can be applied to a variety of bird survey applications, and we are investigating their use in estimation of population change from survey data.

  11. TECHNIQUE OF THE STATISTICAL ANALYSIS OF INVESTMENT APPEAL OF THE REGION

    Directory of Open Access Journals (Sweden)

    А. А. Vershinina

    2014-01-01

    Full Text Available The article presents a technique for the statistical analysis of the investment appeal of a region for foreign direct investment. The technique is defined, the stages of the analysis are set out, and the mathematical and statistical tools are considered.

  12. OSPAR standard method and software for statistical analysis of beach litter data.

    Science.gov (United States)

    Schulz, Marcus; van Loon, Willem; Fleet, David M; Baggelaar, Paul; van der Meulen, Eit

    2017-09-15

    The aim of this study is to develop standard statistical methods and software for the analysis of beach litter data. The optimal ensemble of statistical methods comprises the Mann-Kendall trend test, the Theil-Sen slope estimation, the Wilcoxon step trend test and basic descriptive statistics. The application of Litter Analyst, a tailor-made software for analysing the results of beach litter surveys, to OSPAR beach litter data from seven beaches bordering on the south-eastern North Sea, revealed 23 significant trends in the abundances of beach litter types for the period 2009-2014. Litter Analyst revealed a large variation in the abundance of litter types between beaches. To reduce the effects of spatial variation, trend analysis of beach litter data can most effectively be performed at the beach or national level. Spatial aggregation of beach litter data within a region is possible, but resulted in a considerable reduction in the number of significant trends.
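
    Two of the methods named in this record, the Mann-Kendall trend test and the Theil-Sen slope estimate, can be sketched in a few lines. This is not the Litter Analyst implementation; the function names and the litter counts are hypothetical, and the variance formula used here assumes no tied values and relies on a normal approximation that is rough for series this short.

```python
import math
from itertools import combinations
from statistics import median

def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def mann_kendall(y):
    """Mann-Kendall S statistic, z-score and two-sided p-value (no tie correction)."""
    n = len(y)
    s = sum(sign(y[j] - y[i]) for i, j in combinations(range(n), 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)   # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return s, z, p

def theil_sen_slope(t, y):
    """Median of the slopes between all pairs of points."""
    slopes = [(y[j] - y[i]) / (t[j] - t[i])
              for i, j in combinations(range(len(y)), 2) if t[j] != t[i]]
    return median(slopes)

# Hypothetical yearly counts of one litter type on one beach
years = [2009, 2010, 2011, 2012, 2013, 2014]
counts = [10, 12, 15, 19, 24, 30]
s, z, p = mann_kendall(counts)
print("S =", s, "z =", round(z, 3), "p =", round(p, 4))
print("Theil-Sen slope per year:", theil_sen_slope(years, counts))
```

    Both statistics depend only on the ordering of the values and on pairwise slopes, which makes them robust to the skewed, outlier-prone counts typical of litter surveys.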

  13. A note on the statistical analysis of point judgment matrices

    African Journals Online (AJOL)

    There is scope for further research into statistical approaches for analyzing judgment matrices. In particular statistically based methods address rank reversal since standard errors are associated with estimates of the weights and thus the rankings are not stated with certainty. However, the weights are constrained to lie in a ...

  14. Computation on Bus Delay at Stops in Beijing through Statistical Analysis

    Directory of Open Access Journals (Sweden)

    Shaokuan Chen

    2013-01-01

    Full Text Available Delays at bus stops have seriously affected the efficiency of bus operation and the improvement of the level of service of public transportation, and have greatly influenced passengers' preference for bus services. In this paper, an analysis of the arriving, dwelling and leaving processes of buses and a method to calculate bus delays at stops are proposed, based on survey data from three bus routes in Beijing. Statistical analysis is then used to evaluate the average times that buses spend docking at curbside and bay-style stops, respectively. Moreover, it is noted that different passenger load factors in buses significantly influence the average boarding and alighting time per person. The effectiveness and accuracy of the proposed methods are illustrated through case studies. This study is helpful for planners and operators evaluating the efficiency and level of service of urban public transportation.

  15. Gypsum plasterboard walls: inspection, pathological characterization and statistical survey using an expert system

    Directory of Open Access Journals (Sweden)

    Gaião, C.

    2012-06-01

    Full Text Available This paper presents an expert system to support the inspection and diagnosis of partition walls or wall coverings mounted using the Drywall (DW construction method. This system includes a classification of anomalies in DW and their probable causes. This inspection system was used in a field work that included the observation of 121 DWs. This paper includes a statistical analysis of the anomalies observed during these inspections and their probable causes. The correlation between anomalies and causes in the sample is also thoroughly analyzed. Anomalies are also evaluated for area affected, size, repair urgency and aesthetic value of the affected area. The conclusions of the statistical analysis allowed the creation of an inventory of preventive measures to be implemented in the design, execution and use phases in order to lessen the magnitude or eradicate the occurrence of anomalies in DW. These measures could directly help improve the quality of construction.

  16. AMA Statistical Information Based Analysis of a Compressive Imaging System

    Science.gov (United States)

    Hope, D.; Prasad, S.

    -based analysis of a compressive imaging system based on a new highly efficient and robust method that enables us to evaluate statistical entropies. Our method is based on the notion of density of states (DOS), which plays a major role in statistical mechanics by allowing one to express macroscopic thermal averages in terms of the number of configuration states of a system for a certain energy level. Instead of computing the number of states at a certain energy level, however, we compute the number of possible configurations (states) of a particular image scene that correspond to a certain probability value. This allows us to compute the probability for each possible state, or configuration, of the scene being imaged. We assess the performance of a single pixel compressive imaging system based on the amount of information encoded and transmitted in parameters that characterize the information in the scene. Amongst many examples, we study the problem of faint companion detection. Here, we show how information in the recorded images depends on the choice of basis for representing the scene and the amount of measurement noise. The noise creates confusion when associating a recorded image with the correct member of the ensemble that produced the image. We show that multiple measurements enable one to mitigate this confusion noise.

  17. A statistical evaluation of the design and precision of the shrimp trawl survey off West Greenland

    DEFF Research Database (Denmark)

    Folmer, Ole; Pennington, M.

    2000-01-01

    Stocks of Pandalus borealis off West Greenland have been assessed using a research trawl survey since 1988. The survey has used a design of randomly placed stations, stratified (on depth data where available, using small blocks elsewhere), with sampling effort proportional to stratum area. In some years, a two-stage adaptive sampling scheme was used to place more stations into strata with large first-stage variation in catches. The design of the survey was reviewed in 1998. Modifications in survey design suggested were to shorten tow duration, to pool strata so that effort could be allocated more efficiently, to put a higher proportion of stations in high-density areas and to abandon two-stage sampling. All these changes were implemented for the 1998 survey, except that tow duration was reduced to 30 min at 25% of the stations. To analyze the efficiency of the present survey design, various...

  18. Statistical analysis of compressive low rank tomography with random measurements

    Science.gov (United States)

    Acharya, Anirudh; Guţă, Mădălin

    2017-05-01

    We consider the statistical problem of ‘compressive’ estimation of low rank states (r ≪ d) with random basis measurements, where r, d are the rank and dimension of the state respectively. We investigate whether for a fixed sample size N, the estimation error associated with a ‘compressive’ measurement setup is ‘close’ to that of the setting where a large number of bases are measured. We generalise and extend previous results, and show that the mean square error (MSE) associated with the Frobenius norm attains the optimal rate rd/N with only O(r log d) random basis measurements for all states. An important tool in the analysis is the concentration of the Fisher information matrix (FIM). We demonstrate that although a concentration of the MSE follows from a concentration of the FIM for most states, the FIM fails to concentrate for states with eigenvalues close to zero. We analyse this phenomenon in the case of a single qubit and demonstrate a concentration of the MSE about its optimal despite a lack of concentration of the FIM for states close to the boundary of the Bloch sphere. We also consider the estimation error in terms of a different metric, the quantum infidelity. We show that a concentration in the mean infidelity (MINF) does not exist uniformly over all states, highlighting the importance of loss function choice. Specifically, we show that for states that are nearly pure, the MINF scales as 1/√N but the constant converges to zero as the number of settings is increased. This demonstrates a lack of ‘compressive’ recovery for nearly pure states in this metric.

  19. Tutorial on Biostatistics: Statistical Analysis for Correlated Binary Eye Data.

    Science.gov (United States)

    Ying, Gui-Shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard

    2018-02-01

    To describe and demonstrate methods for analyzing correlated binary eye data. We describe non-model based (McNemar's test, Cochran-Mantel-Haenszel test) and model-based methods (generalized linear mixed effects model, marginal model) for analyses involving both eyes. These methods were applied to: (1) CAPT (Complications of Age-related Macular Degeneration Prevention Trial) where one eye was treated and the other observed (paired design); (2) ETROP (Early Treatment for Retinopathy of Prematurity) where bilaterally affected infants had one eye treated conventionally and the other treated early and unilaterally affected infants had treatment assigned randomly; and (3) AREDS (Age-Related Eye Disease Study) where treatment was systemic and outcome was eye-specific (both eyes in the same treatment group). In the CAPT (n = 80), treatment group (30% vision loss in treated vs. 44% in observed eyes) was not statistically significant (p = 0.07) when inter-eye correlation was ignored, but was significant (p = 0.01) with McNemar's test and the marginal model. Using standard logistic regression for unfavorable vision in ETROP, standard errors and p-values were larger for person-level covariates and were smaller for ocular covariates than using models accounting for inter-eye correlation. For risk factors of geographic atrophy in AREDS, two-eye analyses accounting for inter-eye correlation yielded more power than one-eye analyses and provided larger standard errors and p-values than invalid two-eye analyses ignoring inter-eye correlation. Ignoring inter-eye correlation can lead to larger p-values for paired designs and smaller p-values when both eyes are in the same group. Marginal models or mixed effects models using the eye as the unit of analysis provide valid inference.
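    For the paired design mentioned in this record, McNemar's test uses only the discordant pairs (persons in whom exactly one eye had the outcome). The following is a minimal sketch of the exact two-sided version, assuming hypothetical counts; it is not the tutorial's code and does not reproduce the CAPT numbers.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: pairs where only the first eye (e.g. treated) had the outcome.
    c: pairs where only the second eye (e.g. observed) had the outcome.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence either way
    k = min(b, c)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)

# Hypothetical discordant counts, for illustration only.
print(mcnemar_exact(4, 15))
```

    Concordant pairs do not enter the statistic, which is why the paired analysis can be more powerful than a comparison that treats the two eyes as independent samples.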

  20. Probability and Statistics Questions and Tests : a critical analysis

    Directory of Open Access Journals (Sweden)

    Fabrizio Maturo

    2015-06-01

    Full Text Available In probability and statistics courses, a popular method for evaluating students is to assess them using multiple choice tests. These tests make it possible to evaluate certain types of skills such as fast response, short-term memory, mental clarity and ability to compete. In our opinion, verification through testing can certainly be useful for the analysis of certain aspects, and to speed up the process of assessment, but we should be aware of the limitations of such a standardized procedure and therefore rule out that the assessments of pupils, classes and schools can be reduced to the processing of test results. To support this thesis, this article discusses in detail the main limitations of tests, presents some recent models which have been proposed in the literature and suggests some alternative evaluation methods. Italian title: Quesiti e test di Probabilità e Statistica: un'analisi critica. Keywords: item response theory, evaluation, test, probability

  1. Statistical analysis of unstructured amino acid residues in protein structures.

    Science.gov (United States)

    Lobanov, M Yu; Garbuzynskiy, S O; Galzitskaya, O V

    2010-02-01

    We have performed a statistical analysis of unstructured amino acid residues in protein structures available in the databank of protein structures. Data on the occurrence of disordered regions at the ends and in the middle part of protein chains have been obtained: in the regions near the ends (at distance less than 30 residues from the N- or C-terminus), there are 66% of unstructured residues (38% are near the N-terminus and 28% are near the C-terminus), although these terminal regions include only 23% of the amino acid residues. The frequencies of occurrence of unstructured residues have been calculated for each of 20 types in different positions in the protein chain. It has been shown that relative frequencies of occurrence of unstructured residues of 20 types at the termini of protein chains differ from the ones in the middle part of the protein chain; amino acid residues of the same type have different probabilities to be unstructured in the terminal regions and in the middle part of the protein chain. The obtained frequencies of occurrence of unstructured residues in the middle part of the protein chain have been used as a scale for predicting disordered regions from amino acid sequence using the method (FoldUnfold) previously developed by us. This scale of frequencies of occurrence of unstructured residues correlates with the contact scale (previously developed by us and used for the same purpose) at a level of 95%. Testing the new scale on a database of 427 unstructured proteins and 559 completely structured proteins has shown that this scale can be successfully used for the prediction of disordered regions in protein chains.

  2. Statistical Analysis of Upper Ocean Time Series of Vertical Shear.

    Science.gov (United States)

    1982-05-01

    SHEAR ... 5.1 Preliminary Statistical Tests ... 5.1.1 Autocorrelation and Run Test for Randomness ... parameters are based on the statistical model for S7(NTz) from Section 4. 5.1 PRELIMINARY STATISTICAL TESTS 5.1.1 Autocorrelation and Run Test for Randomness ... estimating this interval directly from shear autocorrelation functions, and the second involves the use of the run test. In general, shear in the
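    The run test named in this record can be sketched as follows. The report's own procedure is not visible in the excerpt, so this is an assumption: the standard Wald-Wolfowitz runs test about the median with a normal approximation, applied to a hypothetical series.

```python
from math import erfc, sqrt
from statistics import median

def runs_test(series):
    """Wald-Wolfowitz runs test about the median.

    Returns (number of runs, z statistic, two-sided p-value).
    Too few runs (z < 0) suggests positive serial correlation.
    """
    med = median(series)
    signs = [x > med for x in series if x != med]  # drop ties with the median
    n1 = sum(signs)
    n2 = len(signs) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("series must have values on both sides of the median")
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / sqrt(variance)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return runs, z, p_value

# Hypothetical, strongly trending series: very few runs are expected.
print(runs_test([1, 2, 3, 4, 5, 6, 7, 8]))
```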

  3. Gini's ideas: new perspectives for modern multivariate statistical analysis

    Directory of Open Access Journals (Sweden)

    Angela Montanari

    2013-05-01

    Full Text Available Corrado Gini (1884-1964) may be considered the greatest Italian statistician. We believe that his important contributions to statistics, however mainly limited to the univariate context, may be profitably employed in modern multivariate statistical methods, aimed at overcoming the curse of dimensionality by decomposing multivariate problems into a series of suitably posed univariate ones. In this paper we critically summarize Gini’s proposals and consider their impact on multivariate statistical methods, both reviewing already well established applications and suggesting new perspectives. Particular attention will be devoted to classification and regression trees, multiple linear regression, linear dimension reduction methods and transvariation based discrimination.

  4. Analysis of statistical model properties from discrete nuclear structure data

    Science.gov (United States)

    Firestone, Richard B.

    2012-02-01

    Experimental M1, E1, and E2 photon strengths have been compiled from experimental data in the Evaluated Nuclear Structure Data File (ENSDF) and the Evaluated Gamma-ray Activation File (EGAF). Over 20,000 Weisskopf reduced transition probabilities were recovered from the ENSDF and EGAF databases. These transition strengths have been analyzed for their dependence on transition energies, initial and final level energies, spin/parity dependence, and nuclear deformation. ENSDF BE1W values were found to increase exponentially with energy, possibly consistent with the Axel-Brink hypothesis, although considerable excess strength was observed for transitions between 4-8 MeV. No similar energy dependence was observed in EGAF or ARC data. BM1W average values were nearly constant at all energies above 1 MeV with substantial excess strength below 1 MeV and between 4-8 MeV. BE2W values decreased exponentially by a factor of 1000 from 0 to 16 MeV. The distribution of ENSDF transition probabilities for all multipolarities could be described by a lognormal statistical distribution. BE1W, BM1W, and BE2W strengths all increased substantially for initial transition level energies between 4-8 MeV, possibly due to dominance of spin-flip and Pygmy resonance transitions at those excitations. Analysis of the average resonance capture data indicated no transition probability dependence on final level spins or energies between 0-3 MeV. The comparison of favored to unfavored transition probabilities for odd-A or odd-Z targets indicated only partial support for the expected branching intensity ratios, with many unfavored transitions having nearly the same strength as favored ones. Average resonance capture BE2W transition strengths generally increased with greater deformation. Analysis of ARC data suggests that there is a large E2 admixture in M1 transitions with the mixing ratio δ ≈ 1.0. The ENSDF reduced transition strengths were considerably stronger than those derived from capture gamma ray

  5. Analysis of statistical model properties from discrete nuclear structure data

    Directory of Open Access Journals (Sweden)

    Firestone Richard B.

    2012-02-01

    Full Text Available Experimental M1, E1, and E2 photon strengths have been compiled from experimental data in the Evaluated Nuclear Structure Data File (ENSDF) and the Evaluated Gamma-ray Activation File (EGAF). Over 20,000 Weisskopf reduced transition probabilities were recovered from the ENSDF and EGAF databases. These transition strengths have been analyzed for their dependence on transition energies, initial and final level energies, spin/parity dependence, and nuclear deformation. ENSDF BE1W values were found to increase exponentially with energy, possibly consistent with the Axel-Brink hypothesis, although considerable excess strength was observed for transitions between 4-8 MeV. No similar energy dependence was observed in EGAF or ARC data. BM1W average values were nearly constant at all energies above 1 MeV with substantial excess strength below 1 MeV and between 4-8 MeV. BE2W values decreased exponentially by a factor of 1000 from 0 to 16 MeV. The distribution of ENSDF transition probabilities for all multipolarities could be described by a lognormal statistical distribution. BE1W, BM1W, and BE2W strengths all increased substantially for initial transition level energies between 4-8 MeV, possibly due to dominance of spin-flip and Pygmy resonance transitions at those excitations. Analysis of the average resonance capture data indicated no transition probability dependence on final level spins or energies between 0-3 MeV. The comparison of favored to unfavored transition probabilities for odd-A or odd-Z targets indicated only partial support for the expected branching intensity ratios, with many unfavored transitions having nearly the same strength as favored ones. Average resonance capture BE2W transition strengths generally increased with greater deformation. Analysis of ARC data suggests that there is a large E2 admixture in M1 transitions with the mixing ratio δ ≈ 1.0. The ENSDF reduced transition strengths were considerably stronger than those derived from

  6. A statistical framework for differential network analysis from microarray data

    Directory of Open Access Journals (Sweden)

    Datta Somnath

    2010-02-01

    Full Text Available Abstract Background It has long been well known that genes do not act alone; rather, groups of genes act in concert during a biological process. Consequently, the expression levels of genes are dependent on each other. Experimental techniques to detect such interacting pairs of genes have been in place for quite some time. With the advent of microarray technology, newer computational techniques to detect such interaction or association between gene expressions are being proposed, which lead to an association network. While most microarray analyses look for genes that are differentially expressed, it is of potentially greater significance to identify how entire association network structures change between two or more biological settings, say normal versus diseased cell types. Results We provide a recipe for conducting a differential analysis of networks constructed from microarray data under two experimental settings. At the core of our approach lies a connectivity score that represents the strength of genetic association or interaction between two genes. We use this score to propose formal statistical tests for each of the following queries: (i) whether the overall modular structures of the two networks are different, (ii) whether the connectivity of a particular set of "interesting genes" has changed between the two networks, and (iii) whether the connectivity of a given single gene has changed between the two networks. A number of examples of this score are provided. We carried out our method on two types of simulated data: Gaussian networks and networks based on differential equations. We show that, for appropriate choices of the connectivity scores and tuning parameters, our method works well on simulated data. We also analyze a real data set involving normal versus heavy mice and identify an interesting set of genes that may play key roles in obesity. Conclusions Examining changes in network structure can provide valuable information about the

  7. Olive mill wastewater characteristics: modelling and statistical analysis

    Directory of Open Access Journals (Sweden)

    Martins-Dias, Susete

    2004-09-01

    Full Text Available A synthesis of the work carried out on Olive Mill Wastewater (OMW) characterisation is given, covering articles published over the last 50 years. Data on OMW characterisation found in the literature are summarised and correlations between them and with phenolic compounds content are sought. This permits the characteristics of an OMW to be estimated from one simple measurement: the phenolic compounds concentration. A model based on OMW characterisations covering 6 countries was developed along with a model for Portuguese OMW. The statistical analysis of the correlations obtained indicates that the Chemical Oxygen Demand of a given OMW is a second-degree polynomial function of its phenolic compounds concentration. Tests to evaluate the significance of the regressions were carried out, based on multivariable ANOVA analysis and on the visual distribution of standardised residuals and their means for confidence levels of 95 and 99 %, clearly validating these models. This modelling work will help in the future planning, operation and monitoring of an OMW treatment plant.
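    The relationship reported in this record (Chemical Oxygen Demand as a second-degree polynomial of phenolic compounds concentration) can be fitted by ordinary least squares. The sketch below is an illustration only: the function name and the data points are hypothetical, no coefficients from the paper are reproduced, and the normal equations are solved directly rather than with a statistics package.

```python
def fit_quadratic(x, y):
    """Least-squares coefficients (a0, a1, a2) of y = a0 + a1*x + a2*x**2."""
    # Build the 3x3 normal equations (X^T X) a = X^T y as an augmented matrix.
    power_sums = [sum(xi ** k for xi in x) for k in range(5)]
    rhs = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    m = [[power_sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

# Hypothetical measurements: phenolic concentration (g/L) and COD (g/L).
phenols = [0.0, 1.0, 2.0, 3.0, 4.0]
cod = [2.0, 5.5, 10.0, 15.5, 22.0]
print(fit_quadratic(phenols, cod))
```

    At least three distinct concentrations are needed; with fewer, the normal equations are singular and the division by the pivot fails.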

  8. On Conceptual Analysis as the Primary Qualitative Approach to Statistics Education Research in Psychology

    Science.gov (United States)

    Petocz, Agnes; Newbery, Glenn

    2010-01-01

    Statistics education in psychology often falls disappointingly short of its goals. The increasing use of qualitative approaches in statistics education research has extended and enriched our understanding of statistical cognition processes, and thus facilitated improvements in statistical education and practices. Yet conceptual analysis, a…

  9. STATISTICAL ANALYSIS OF THE DEMOLITION OF THE HITCH DEVICES ELEMENTS

    Directory of Open Access Journals (Sweden)

    V. V. Artemchuk

    2009-03-01

    Full Text Available The results of statistical research of wear of automatic coupler body butts and thrust plates of electric locomotives are presented in the article. Due to the increased wear the mentioned elements require special attention.

  10. Climate time series analysis classical statistical and bootstrap methods

    CERN Document Server

    Mudelsee, Manfred

    2010-01-01

    This book presents bootstrap resampling as a computationally intensive method able to meet the challenges posed by the complexities of analysing climate data. It shows how the bootstrap performs reliably in the most important statistical estimation techniques.

  11. Gregor Mendel's Genetic Experiments: A Statistical Analysis after 150 Years

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2016-01-01

    Vol. 12, No. 2 (2016), pp. 20-26. ISSN 1801-5603. Institutional support: RVO:67985807. Keywords: genetics; history of science; biostatistics; design of experiments. Subject RIV: BB - Applied Statistics, Operational Research

  12. A framework for relating the structures and recovery statistics in pressure time-series surveys for dust devils

    Science.gov (United States)

    Jackson, Brian; Lorenz, Ralph; Davis, Karan

    2018-01-01

    Dust devils are likely the dominant source of dust for the martian atmosphere, but the amount and frequency of dust-lifting depend on the statistical distribution of dust devil parameters. Dust devils exhibit pressure perturbations and, if they pass near a barometric sensor, they may register as a discernible dip in a pressure time-series. Leveraging this fact, several surveys using barometric sensors on landed spacecraft have revealed dust devil structures and occurrence rates. Powerful as they are, such surveys suffer from non-trivial biases that skew the inferred dust devil properties. For example, such surveys are most sensitive to dust devils with the widest and deepest pressure profiles, but the recovered profiles will be distorted, broader and shallower than the actual profiles. In addition, such surveys often do not provide wind speed measurements alongside the pressure time series, and so the durations of the dust devil signals in the time series cannot be directly converted to profile widths. Fortunately, simple statistical and geometric considerations can de-bias these surveys, allowing conversion of the duration of dust devil signals into physical widths, given only a distribution of likely translation velocities, and the recovery of the underlying distributions of physical parameters. In this study, we develop a scheme for de-biasing such surveys. Applying our model to an in-situ survey using data from the Phoenix lander suggests a larger dust flux and a dust devil occurrence rate about ten times larger than previously inferred. Comparing our results to dust devil track surveys suggests only about one in five low-pressure cells lifts sufficient dust to leave a visible track.
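    The conversion of signal duration into physical width, given only a distribution of translation velocities, can be sketched with a simple Monte Carlo draw. This is a deliberately simplified illustration and not the authors' de-biasing scheme: it assumes the vortex passes directly over the sensor (so width = speed × duration), and the uniform speed distribution and all names are hypothetical.

```python
import random
from statistics import median

def sample_widths(duration_s, speed_sampler, n_draws=10000, seed=0):
    """Monte Carlo draws of width (m) = translation speed (m/s) * duration (s)."""
    rng = random.Random(seed)
    return [duration_s * speed_sampler(rng) for _ in range(n_draws)]

def uniform_speed(rng):
    # Hypothetical speed distribution: uniform between 2 and 8 m/s.
    return rng.uniform(2.0, 8.0)

# A 10 s pressure dip maps to a distribution of widths, not a single value.
widths = sample_widths(duration_s=10.0, speed_sampler=uniform_speed)
print("median width (m):", round(median(widths), 1))
```

    A fuller treatment would also account for the miss distance between the vortex centre and the sensor, which is the geometric bias the abstract refers to.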

  13. Conjunction analysis and propositional logic in fMRI data analysis using Bayesian statistics.

    Science.gov (United States)

    Rudert, Thomas; Lohmann, Gabriele

    2008-12-01

    To evaluate logical expressions over different effects in data analyses using the general linear model (GLM) and to evaluate logical expressions over different posterior probability maps (PPMs). In functional magnetic resonance imaging (fMRI) data analysis, the GLM was applied to estimate unknown regression parameters. Based on the GLM, Bayesian statistics can be used to determine the probability of conjunction, disjunction, implication, or any other arbitrary logical expression over different effects or contrast. For second-level inferences, PPMs from individual sessions or subjects are utilized. These PPMs can be combined to a logical expression and its probability can be computed. The methods proposed in this article are applied to data from a STROOP experiment and the methods are compared to conjunction analysis approaches for test-statistics. The combination of Bayesian statistics with propositional logic provides a new approach for data analyses in fMRI. Two different methods are introduced for propositional logic: the first for analyses using the GLM and the second for common inferences about different probability maps. The methods introduced extend the idea of conjunction analysis to a full propositional logic and adapt it from test-statistics to Bayesian statistics. The new approaches allow inferences that are not possible with known standard methods in fMRI. (c) 2008 Wiley-Liss, Inc.
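    The idea of attaching a probability to a logical expression over several effects can be illustrated with paired draws from a joint posterior. This is a sketch under stated assumptions, not the method of the article: the draws, the threshold and the function name are hypothetical, and an effect is taken to be "present" in a draw when it exceeds the threshold.

```python
def logic_probabilities(samples_a, samples_b, threshold=0.0):
    """Estimate posterior probabilities of logical expressions over two effects.

    samples_a[i] and samples_b[i] must come from the same joint posterior draw.
    """
    if len(samples_a) != len(samples_b) or not samples_a:
        raise ValueError("need two non-empty sample lists of equal length")
    pairs = [(a > threshold, b > threshold) for a, b in zip(samples_a, samples_b)]
    n = len(pairs)
    return {
        "A and B": sum(p and q for p, q in pairs) / n,
        "A or B": sum(p or q for p, q in pairs) / n,
        "A implies B": sum((not p) or q for p, q in pairs) / n,
    }

# Hypothetical joint posterior draws of two effect sizes.
print(logic_probabilities([1.0, 1.0, -1.0, -1.0], [1.0, -1.0, 1.0, -1.0]))
```

    Working on paired draws keeps any posterior dependence between the two effects; multiplying two marginal probabilities would be valid only if the effects were independent.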

  14. Statistics and data analysis for financial engineering with R examples

    CERN Document Server

    Ruppert, David

    2015-01-01

    The new edition of this influential textbook, geared towards graduate or advanced undergraduate students, teaches the statistics necessary for financial engineering. In doing so, it illustrates concepts using financial markets and economic data, R Labs with real-data exercises, and graphical and analytic methods for modeling and diagnosing modeling errors. Financial engineers now have access to enormous quantities of data. To make use of these data, the powerful methods in this book, particularly about volatility and risks, are essential. Strengths of this fully-revised edition include major additions to the R code and the advanced topics covered. Individual chapters cover, among other topics, multivariate distributions, copulas, Bayesian computations, risk management, multivariate volatility and cointegration. Suggested prerequisites are basic knowledge of statistics and probability, matrices and linear algebra, and calculus. There is an appendix on probability, statistics and linear algebra. Practicing fina...

  15. Statistical analysis of natural disasters and related losses

    CERN Document Server

    Pisarenko, VF

    2014-01-01

    The study of disaster statistics and disaster occurrence is a complicated interdisciplinary field involving the interplay of new theoretical findings from several scientific fields like mathematics, physics, and computer science. Statistical studies on the mode of occurrence of natural disasters largely rely on fundamental findings in the statistics of rare events, which were derived in the 20th century. With regard to natural disasters, it is not so much the fact that the importance of this problem for mankind was recognized during the last third of the 20th century - the myths one encounters in ancient civilizations show that the problem of disasters has always been recognized - rather, it is the fact that mankind now possesses the necessary theoretical and practical tools to effectively study natural disasters, which in turn supports effective, major practical measures to minimize their impact. All the above factors have resulted in considerable progress in natural disaster research. Substantial accrued ma...

  16. GASPS-A Herschel Survey of Gas and Dust in Protoplanetary Disks: Summary and Initial Statistics

    OpenAIRE

    Dent, W. R. F.; Thi, W. F.; Kamp, I.; Williams, J. P.; Menard, F.; Andrews, S.; Ardila, D.; Aresu, G.; Augereau, J.-C.; Barrado y Navascues, D.; Brittain, S.; Carmona, A.; Ciardi, D.; Danchi, W.; Donaldson, J.

    2013-01-01

    We describe a large-scale far-infrared line and continuum survey of protoplanetary disk through to young debris disk systems carried out using the PACS instrument on the Herschel Space Observatory. This Open Time Key program, known as GASPS (Gas Survey of Protoplanetary Systems), targeted ~250 young stars in narrow wavelength regions covering the [OI] fine structure line at 63 μm, the brightest far-infrared line in such objects. A subset of the brightest targets were also surveyed in [OI]145 μ...

  17. GASPS—A Herschel Survey of Gas and Dust in Protoplanetary Disks: Summary and Initial Statistics

    OpenAIRE

    Dent, W. R. F.; Ardila, D.; Ciardi, D.

    2013-01-01

    We describe a large-scale far-infrared line and continuum survey of protoplanetary disk through to young debris disk systems carried out using the PACS instrument on the Herschel Space Observatory. This Open Time Key program, known as GASPS (Gas Survey of Protoplanetary Systems), targeted ∼250 young stars in narrow wavelength regions covering the [OI] fine structure line at 63 μm, the brightest far-infrared line in such objects. A subset of the brightest targets were also surveyed in [OI]145 μm...

  18. U.S. Geological Survey Gap Analysis Program

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — The Gap Analysis Program (GAP) is an element of the U.S. Geological Survey (USGS). GAP helps to implement the Department of Interior's goals of inventory,...

  19. Impact analysis of critical success factors on the benefits from statistical process control implementation

    Directory of Open Access Journals (Sweden)

    Fabiano Rodrigues Soriano

    Full Text Available Abstract Statistical Process Control (SPC) is a set of statistical techniques focused on process control, monitoring and analyzing the causes of variation in the quality characteristics and/or in the parameters used to control and improve processes. Implementing SPC in organizations is a complex task. The reasons for its failure are related to organizational or social factors such as lack of top management commitment and little understanding of its potential benefits. Other aspects concern technical factors such as lack of training in and understanding of the statistical techniques. The main aim of the present article is to understand the interrelations between conditioning factors associated with top management commitment (Support), SPC Training and Application, as well as to understand the relationships between these factors and the benefits associated with the implementation of the program. Partial Least Squares Structural Equation Modeling (PLS-SEM) was used in the analysis since the main goal is to establish the causal relations. A cross-sectional survey was used as the research method to collect information from a sample of Brazilian auto-parts companies, which were selected according to guides from the auto-parts industry associations. A total of 170 companies were contacted by e-mail and by phone and invited to participate in the survey. However, just 93 companies agreed to participate, and only 43 answered the questionnaire. The results showed that senior management support considerably affects the way companies develop their training programs. In turn, this training affects the way companies apply the techniques, which is reflected in the benefits obtained from implementing the program. It was observed that the managerial and technical aspects are closely connected to each other and that they are represented by the ratio between top management and training support. The technical aspects observed through SPC

  20. Nanotechnology in concrete: Critical review and statistical analysis

    Science.gov (United States)

    Glenn, Jonathan

    This thesis investigates, through an extensive literature search, the use of nanotechnology in the field of cement and concrete. A summary is presented. The research was divided into two categories: (1) nanoparticles and (2) nanofibers and nanotubes. The successes and challenges of each category are documented in this thesis. The data from the literature search are analyzed using statistical prediction based on Monte Carlo and Bayesian methods. The thesis shows how statistical prediction can be used to analyze patterns and trends and also to discover optimal additive dosages for concrete mixes.

  1. Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity

    Science.gov (United States)

    Mukherjee, Shashi Bajaj; Sen, Pradip Kumar

    2010-10-01

    Studying periodic patterns is a natural line of attack for recognizing genes in DNA sequences and for similar problems. Surprisingly, very little significant work has been done in this direction. This paper studies statistical properties of DNA sequences of complete genomes using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings, and a standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
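    The pipeline described in this record (map the sequence to numbers, then apply a Fourier transform) can be sketched with one common choice of mapping. The abstract does not say which mappings the authors used, so the binary indicator mapping below is an assumption, as are the function names and the toy sequences; the score is the spectral power at frequency 1/3, the periodicity usually associated with codon structure.

```python
import cmath

def period3_power(seq, base):
    """Spectral power at frequency 1/3 of the 0/1 indicator sequence of one base."""
    n = len(seq)
    total = sum(cmath.exp(-2j * cmath.pi * k / 3)
                for k, ch in enumerate(seq) if ch == base)
    return abs(total) ** 2 / n

def period3_score(seq):
    """Sum of the period-3 power over the four indicator sequences A, C, G, T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(period3_power(seq, base) for base in "ACGT")

# Toy sequences: one with perfect period-3 structure, one with period 4.
print(period3_score("ATG" * 30))
print(period3_score("ACGT" * 30))
```

    In practice the score would be computed in a sliding window along the genome and compared between annotated coding and non-coding regions; the threshold separating the two is data-dependent and is not given here.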

  2. Categorical and nonparametric data analysis choosing the best statistical technique

    CERN Document Server

    Nussbaum, E Michael

    2014-01-01

    Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain

  3. Local starburst galaxies and their descendants. Statistics from the Sloan Digital Sky Survey

    Science.gov (United States)

    Bergvall, Nils; Marquart, Thomas; Way, Michael J.; Blomqvist, Anna; Holst, Emma; Ostlin, Goran; Zackrisson, Erik

    2016-01-01

    Despite strong interest in the starburst phenomenon in extragalactic astronomy, the concept remains ill-defined. Here we use a strict definition of starburst to examine the statistical properties of starburst galaxies in the local universe. We also seek to establish links between starburst galaxies, post-starburst (hereafter postburst) galaxies, and active galaxies. Data were selected from the Sloan Digital Sky Survey DR7. We applied a novel method of treating dust attenuation and derived star formation rates, ages, and stellar masses assuming a two-component stellar population model. Dynamical masses are calculated from the width of the H-alpha line. These masses agree excellently with the photometric masses. The mass (gas+stars) range is approximately $10^{9}$ to $10^{11.5}$ solar masses. As a selection criterion for starburst galaxies, we use the birthrate parameter, $b = \mathrm{SFR}/\langle\mathrm{SFR}\rangle$, requiring that $b$ is greater than 3. For postburst galaxies, we use the equivalent width of H$\delta$ in absorption, with the criterion that $\mathrm{EW}_{\mathrm{H}\delta,\mathrm{abs}}$ is greater than 6 Å. We find that only 1% of star-forming galaxies are starburst galaxies. They contribute 3-6% to the stellar production and are therefore unimportant for the local star formation activity. The median starburst age is 70 Myr, roughly independent of mass, indicating that star formation is mainly regulated by local feedback processes. The b-parameter strongly depends on burst age. Values close to b = 60 are found at ages of approximately 10 Myr, while almost no starbursts are found at ages greater than 1 Gyr. The median baryonic burst mass fraction of sub-L galaxies is 5% and decreases slowly towards high masses. The median mass fraction of the recent burst in the postburst sample is 5-10%. A smaller fraction of the postburst galaxies, however, originates in non-bursting galaxies. The age-mass distribution of the postburst progenitors (with mass fractions greater than 3%) is bimodal with a break at logM(solar mass

  4. Child Mortality in a Developing Country: A Statistical Analysis

    Science.gov (United States)

    Uddin, Md. Jamal; Hossain, Md. Zakir; Ullah, Mohammad Ohid

    2009-01-01

    This study uses data from the "Bangladesh Demographic and Health Survey (BDHS) 1999-2000" to investigate the predictors of child (age 1-4 years) mortality in a developing country like Bangladesh. Cross-tabulation and multiple logistic regression techniques have been used to estimate the predictors of child mortality. The…

  5. Integrating the statistical analysis of spatial data in ecology

    Science.gov (United States)

    A. M. Liebhold; J. Gurevitch

    2002-01-01

    In many areas of ecology there is an increasing emphasis on spatial relationships. Often ecologists are interested in new ways of analyzing data with the objective of quantifying spatial patterns, and in designing surveys and experiments in light of the recognition that there may be underlying spatial pattern in biotic responses. In doing so, ecologists have adopted a...

  6. A Bayesian Statistical Analysis of the Enhanced Greenhouse Effect

    NARCIS (Netherlands)

    de Vos, A.F.; Tol, R.S.J.

    1998-01-01

    This paper demonstrates that there is a robust statistical relationship between the records of the global mean surface air temperature and the atmospheric concentration of carbon dioxide over the period 1870-1991. As such, the enhanced greenhouse effect is a plausible explanation for the observed

  7. A note on the statistical analysis of point judgment matrices

    African Journals Online (AJOL)

    by Saaty in the 1970s which has received considerable attention in the mathematical and statistical literature [11, 18]. The core of .... question is how to determine the weights associated with the objects. 3 Distributional approaches ..... Research Foundation of South Africa for financial support. The authors are also grateful.

  8. Statistical analysis of DNT detection using chemically functionalized microcantilever arrays

    DEFF Research Database (Denmark)

    Bosco, Filippo; Bache, M.; Hwu, E.-T.

    2012-01-01

    from 1 to 2 cantilevers have been reported, without any information on repeatability and reliability of the presented data. In explosive detection high reliability is needed and thus a statistical measurement approach needs to be developed and implemented. We have developed a DVD-based read-out system...

  9. Statistical analysis and optimization of copper biosorption capability ...

    African Journals Online (AJOL)

    enoh

    2012-03-01

    % glucose, 0.5% yeast extract, supplemented with 20 ml/L apple juice) with 15% ... represented by "+" sign, while dead cells, temperature of 25°C, dry weight of 0.13 ... optimum. Using the Microsoft Excel program, statistical t-.

  10. A statistical analysis on the leak detection performance of ...

    Indian Academy of Sciences (India)

    This paper attempts to provide statistical insight into the leak detection performance of WSNs when deployed on overground and underground pipelines. The approach in the study employs the hypothesis testing problem to formulate a solution on the detection plan. Through the hypothesis test, the maximum ...

  11. Did Tanzania Achieve the Second Millennium Development Goal? Statistical Analysis

    Science.gov (United States)

    Magoti, Edwin

    2016-01-01

    Development Goal "Achieve universal primary education", the challenges faced, along with the way forward towards achieving the fourth Sustainable Development Goal "Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all". Statistics show that Tanzania has made very promising steps…

  12. Herbal gardens of India: A statistical analysis report | Rao | African ...

    African Journals Online (AJOL)

    A knowledge system of the herbal garden in India was developed and these herbal gardens' information was statistically classified for efficient data processing, sharing and retrieving of information, which could act as a decision tool to the farmers, researchers, decision makers and policy makers in the field of medicinal ...

  13. Statistical analysis of stream sediment geochemical data from Oyi ...

    African Journals Online (AJOL)

    Ife Journal of Science ... The results of concentrations of twenty-four elements treated with both univariate and multivariate statistical analytical techniques revealed that all the elements analyzed except Co, Cr, Fe and V ... The cumulative probability plots of the elements showed that Mn and Cu consisted of one population.

  14. Multivariate Statistical Analysis Software Technologies for Astrophysical Research Involving Large Data Bases

    Science.gov (United States)

    Djorgovski, S. G.

    1994-01-01

    We developed a package to process and analyze the data from the digital version of the Second Palomar Sky Survey. This system, called SKICAT, incorporates the latest in machine learning and expert systems software technology, in order to classify the detected objects objectively and uniformly, and facilitate handling of the enormous data sets from digital sky surveys and other sources. The system provides a powerful, integrated environment for the manipulation and scientific investigation of catalogs from virtually any source. It serves three principal functions: image catalog construction, catalog management, and catalog analysis. Through use of the GID3* Decision Tree artificial induction software, SKICAT automates the process of classifying objects within CCD and digitized plate images. To exploit these catalogs, the system also provides tools to merge them into a large, complex database which may be easily queried and modified when new data or better methods of calibrating or classifying become available. The most innovative feature of SKICAT is the facility it provides to experiment with and apply the latest in machine learning technology to the tasks of catalog construction and analysis. SKICAT provides a unique environment for implementing these tools for any number of future scientific purposes. Initial scientific verification and performance tests have been made using galaxy counts and measurements of galaxy clustering from small subsets of the survey data, and a search for very high redshift quasars. All of the tests were successful and produced new and interesting scientific results. Attachments to this report give detailed accounts of the technical aspects of the SKICAT system, and of some of the scientific results achieved to date. We also developed a user-friendly package for multivariate statistical analysis of small and moderate-size data sets, called STATPROG. The package was tested extensively on a number of real scientific applications and has

  15. A Survey of Models and Algorithms for Social Influence Analysis

    Science.gov (United States)

    Sun, Jimeng; Tang, Jie

    Social influence is the behavioral change of a person because of the perceived relationship with other people, organizations and society in general. Social influence has been a widely accepted phenomenon in social networks for decades. Many applications, such as marketing, advertisement and recommendations, have been built around the implicit notion of social influence between people. With the exponential growth of online social network services such as Facebook and Twitter, social influence can for the first time be measured over a large population. In this chapter, we survey the research on social influence analysis with a focus on the computational aspects. First, we present statistical measurements related to social influence. Second, we describe the literature on social similarity and influences. Third, we present the research on social influence maximization, which has many practical applications including marketing and advertisement.

  16. Survey Response Rates and Survey Administration in Counseling and Clinical Psychology: A Meta-Analysis

    Science.gov (United States)

    Van Horn, Pamela S.; Green, Kathy E.; Martinussen, Monica

    2009-01-01

    This article reports results of a meta-analysis of survey response rates in published research in counseling and clinical psychology over a 20-year span and describes reported survey administration procedures in those fields. Results of 308 survey administrations showed a weighted average response rate of 49.6%. Among possible moderators, response…

  17. Understanding of Statistical Terms Routinely Used in Presentations: A Survey among Residents who participate at a Summer School

    Directory of Open Access Journals (Sweden)

    Cosmina-Ioana BONDOR

    2014-12-01

    Full Text Available Aim: The aim of our study was to investigate the understanding of statistical terms commonly used in lectures presented at summer schools for residents and young specialists. Material and Method: A survey was distributed to all the participants at the “Diabetic neuropathy from theory to practice” Summer School, 2014. The program was addressed to residents or young specialists in diabetes, neurology, surgery, and orthopedics from Romania. The survey consisted of 6 multiple-choice questions, of which the first four evaluated the understanding of statistical terms. Results: There were 51 (42.5%) participants who completed the questionnaires. Of 204 total questions, 81 (39.7%) had correct answers. For question 1, which evaluated relative risk, only 3 (5.9%) respondents answered correctly, while for question 2 (number needed to treat) 40 (78.4%) of the answers were correct. For question 3 (sensitivity), 22 (43.1%) respondents answered correctly, while for question 4 (Receiver Operating Characteristic curves) only 16 (31.4%) respondents provided a correct answer. The overall mean score of correct answers was 1.56±0.91. Conclusion: Our study showed that the young specialists who participated in the survey were not familiar with simple statistical terms commonly used in presentations.

  18. A clinical and statistical survey of cutaneous changes in the first 120 hours of life

    Directory of Open Access Journals (Sweden)

    Dinkar J Sadana

    2014-01-01

    Full Text Available Background: The spectrum of dermatological manifestations during neonatal period varies from transient self-limiting conditions to serious dermatoses; the latter, fortunately few, are disproportionately stressful to the parents, who due to lack of specialized pediatric dermatology clinics frequently get tossed between a dermatologist and a pediatrician. Objectives: This study was formulated to record cutaneous changes over the first five postnatal days of life and to statistically correlate those changes occurring in ≥ 11 neonates with three (parity, associated illnesses, and mode of delivery) maternal and three (sex, birth weight, and gestational age) neonatal factors. Methods: This descriptive, cross-sectional study at a tertiary care hospital entailed recording detailed dermatological examination of 300 neonates having some (physiological and/or pathological) cutaneous changes and their statistical evaluation using the Chi-square test and significance (P < 0.05) as above. Results: Superficial cutaneous desquamation (SCD), Mongolian spots (MS), and erythema toxicum neonatorum (ETN) were the first three common changes among a total of 15 conditions observed overall; these three, as also milia and icterus, revealed statistical significance with both maternal as well as neonatal factors. Lanugo and napkin dermatitis (ND) were statistically significant with respect to two neonatal factors and cradle cap (CC), a single maternal factor. Gestational age was of statistical significance regarding five cutaneous changes, associated maternal illness during pregnancy regarding four, birth weight as well as parity regarding three each, and sex of the neonate as well as mode of delivery regarding two each. Conclusion: Despite observing a statistically significant correlation of eight cutaneous changes with three maternal and/or three neonatal factors, more extensive studies in neonatal dermatology are required for validation of these unique statistical

  19. The Fragility of Statistically Significant Findings From Randomized Trials in Sports Surgery: A Systematic Survey.

    Science.gov (United States)

    Khan, Moin; Evaniew, Nathan; Gichuru, Mark; Habib, Anthony; Ayeni, Olufemi R; Bedi, Asheesh; Walsh, Michael; Devereaux, P J; Bhandari, Mohit

    2017-07-01

    High-quality, evidence-based orthopaedic care relies on the generation and translation of robust research evidence. The Fragility Index is a novel method for evaluating the robustness of statistically significant findings from randomized controlled trials (RCTs). It is defined as the minimum number of patients in 1 arm of a trial that would have to change status from a nonevent to an event to alter the results of the trial from statistically significant to nonsignificant. To calculate the Fragility Index of statistically significant results from clinical trials in sports medicine and arthroscopic surgery to characterize the robustness of the RCTs in these fields. A search was conducted in Medline, EMBASE, and PubMed for RCTs related to sports medicine and arthroscopic surgery from January 1, 2005, to October 30, 2015. Two reviewers independently assessed titles and abstracts for study eligibility, performed data extraction, and assessed risk of bias. The Fragility Index was calculated using the Fisher exact test for all statistically significant dichotomous outcomes from parallel-group RCTs. Bivariate correlation was performed to evaluate associations between the Fragility Index and trial characteristics. A total of 48 RCTs were included. The median sample size was 64 (interquartile range [IQR], 48.5-89.5), and the median total number of outcome events was 19 (IQR, 10-27). The median Fragility Index was 2 (IQR, 1-2.8), meaning that changing 2 patients from a nonevent to an event in the treatment arm changed the result to a statistically nonsignificant result, or P ≥ .05. Most statistically significant RCTs in sports medicine and arthroscopic surgery are not robust because their statistical significance can be reversed by changing the outcome status on only a few patients in 1 treatment group. Future work is required to determine whether routine reporting of the Fragility Index enhances clinicians' ability to detect trial results that should be viewed cautiously.
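
    The Fragility Index definition given in this abstract can be sketched in a few lines. The sketch below is a rough illustration, not the authors' code: it assumes a two-sided Fisher exact test, a significance threshold of 0.05, and that nonevents are converted to events in the arm with fewer events. The function names and the example counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def fragility_index(events_1, total_1, events_2, total_2, alpha=0.05):
    """Minimum number of nonevent-to-event changes in the arm with fewer events
    (an assumed convention) needed to reach P >= alpha. Returns None when the
    starting result is not significant or significance cannot be removed."""
    if events_1 > events_2:
        events_1, total_1, events_2, total_2 = events_2, total_2, events_1, total_1
    flips = 0
    while fisher_exact_two_sided(events_1, total_1 - events_1,
                                 events_2, total_2 - events_2) < alpha:
        if events_1 == total_1:
            return None
        events_1 += 1
        flips += 1
    return flips if flips > 0 else None

# Hypothetical trial: 1/40 events in one arm versus 9/40 in the other.
print(fragility_index(1, 40, 9, 40))
```

    A small return value means that changing the outcome of only a few patients would remove statistical significance, which is the sense of "not robust" used in the abstract.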

  20. The relationship between procrastination, learning strategies and statistics anxiety among Iranian college students: a canonical correlation analysis.

    Science.gov (United States)

    Vahedi, Shahrum; Farrokhi, Farahman; Gahramani, Farahnaz; Issazadegan, Ali

    2012-01-01

    Approximately 66-80% of graduate students experience statistics anxiety, and some researchers propose that many students identify statistics courses as the most anxiety-inducing courses in their academic curricula. As such, it is likely that statistics anxiety is, in part, responsible for many students delaying enrollment in these courses for as long as possible. This paper proposes a canonical model that treats academic procrastination (AP) and learning strategies (LS) as predictor variables and statistics anxiety (SA) as the explained variable. A questionnaire survey was used for data collection, and 246 female college students participated in this study. To examine the mutually independent relations between procrastination, learning strategies and statistics anxiety variables, a canonical correlation analysis was computed. Findings show that two canonical functions were statistically significant. The set of variables (metacognitive self-regulation, source management, preparing homework, preparing for test and preparing term papers) helped predict changes of statistics anxiety with respect to fearful behavior, Attitude towards math and class, Performance, but not Anxiety. These findings could be used in educational and psychological interventions in the context of statistics anxiety reduction.

  1. Survey of statistical and sampling needs for environmental monitoring of commercial low-level radioactive waste disposal facilities

    Energy Technology Data Exchange (ETDEWEB)

    Eberhardt, L.L.; Thomas, J.M.

    1986-07-01

    This project was designed to develop guidance for implementing 10 CFR Part 61 and to determine the overall needs for sampling and statistical work in characterizing, surveying, monitoring, and closing commercial low-level waste sites. When cost-effectiveness and statistical reliability are of prime importance, then double sampling, compositing, and stratification (with optimal allocation) are identified as key issues. If the principal concern is avoiding questionable statistical practice, then the applicability of kriging (for assessing spatial pattern), methods for routine monitoring, and use of standard textbook formulae in reporting monitoring results should be reevaluated. Other important issues identified include sampling for estimating model parameters and the use of data from left-censored (less than detectable limits) distributions.

  2. Statistical methods for data analysis in particle physics

    CERN Document Server

    Lista, Luca

    2017-01-01

    This concise set of course-based notes provides the reader with the main concepts and tools needed to perform statistical analyses of experimental data, in particular in the field of high-energy physics (HEP). First, the book provides an introduction to probability theory and basic statistics, mainly intended as a refresher from readers’ advanced undergraduate studies, but also to help them clearly distinguish between the Frequentist and Bayesian approaches and interpretations in subsequent applications. More advanced concepts and applications are gradually introduced, culminating in the chapter on both discoveries and upper limits, as many applications in HEP concern hypothesis testing, where the main goal is often to provide better and better limits so as to eventually be able to distinguish between competing hypotheses, or to rule out some of them altogether. Many worked-out examples will help newcomers to the field and graduate students alike understand the pitfalls involved in applying theoretical conc...

  4. Statistical analysis of motion contrast in optical coherence tomography angiography

    CERN Document Server

    Cheng, Yuxuan; Pan, Cong; Lu, Tongtong; Hong, Tianyu; Ding, Zhihua; Li, Peng

    2015-01-01

    Optical coherence tomography angiography (Angio-OCT), mainly based on the temporal dynamics of OCT scattering signals, has found a range of potential applications in clinical and scientific research. In this work, based on the model of random phasor sums, temporal statistics of the complex-valued OCT signals are mathematically described. Statistical distributions of the amplitude differential (AD) and complex differential (CD) Angio-OCT signals are derived. The theories are validated through flow phantom and live animal experiments. Using the model developed in this work, the origin of the motion contrast in Angio-OCT is mathematically explained, and the implications for improving motion contrast are further discussed, including threshold determination and its residual classification error, averaging method, and scanning protocol. The proposed mathematical model of Angio-OCT signals can aid in the optimal design of the system and associated algorithms.

  5. Common misconceptions about data analysis and statistics

    Science.gov (United States)

    Motulsky, Harvey J

    2015-01-01

    Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word “significant”. (4) Overreliance on standard errors, which are often misunderstood. PMID:25692012
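
    The first mistake listed in this abstract, reanalyzing with additional replicates until the desired result appears, can be illustrated with a small simulation. This is a sketch under stated assumptions and is not taken from the article: the data are pure noise (no true effect), the test is a two-sided z-test with known unit variance, and the sample sizes, step, trial count and seed are arbitrary illustrative choices.

```python
import math
import random

def z_test_p(sample):
    """Two-sided P value for mean = 0, assuming a known standard deviation of 1."""
    z = sum(sample) / math.sqrt(len(sample))
    return math.erfc(abs(z) / math.sqrt(2))

def false_positive_rate(peek, n_start=10, n_max=50, step=5, trials=2000, seed=0):
    """Fraction of null experiments declared significant at P < 0.05.

    peek=False: test once, at n_max observations.
    peek=True:  test at n_start, then add `step` replicates and retest
                until P < 0.05 or n_max is reached (optional stopping).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if peek:
            data = [rng.gauss(0, 1) for _ in range(n_start)]
            significant = z_test_p(data) < 0.05
            while not significant and len(data) < n_max:
                data.extend(rng.gauss(0, 1) for _ in range(step))
                significant = z_test_p(data) < 0.05
        else:
            data = [rng.gauss(0, 1) for _ in range(n_max)]
            significant = z_test_p(data) < 0.05
        hits += significant
    return hits / trials

print("single planned test:", false_positive_rate(peek=False))
print("retest after adding replicates:", false_positive_rate(peek=True))
```

    Under these assumptions the single planned test stays near the nominal 5% false positive rate, while repeated testing with added replicates pushes it well above that level.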

  6. A clinical and statistical survey of cutaneous changes in the first 120 hours of life.

    Science.gov (United States)

    Sadana, Dinkar J; Sharma, Yugal K; Chaudhari, Nitin D; Dash, Kedarnath; Rizvi, Alia; Jethani, Sumit

    2014-11-01

    The spectrum of dermatological manifestations during neonatal period varies from transient self-limiting conditions to serious dermatoses; the latter, fortunately few, are disproportionately stressful to the parents, who due to lack of specialized pediatric dermatology clinics frequently get tossed between a dermatologist and a pediatrician. This study was formulated to record cutaneous changes over the first five postnatal days of life and to statistically correlate those changes occurring in ≥ 11 neonates with three (parity, associated illnesses, and mode of delivery) maternal and three (sex, birth weight, and gestational age) neonatal factors. This descriptive, cross-sectional study at a tertiary care hospital entailed recording detailed dermatological examination of 300 neonates having some (physiological and/or pathological) cutaneous changes and their statistical evaluation using the Chi-square test and significance (P < 0.05) as above. Superficial cutaneous desquamation (SCD), Mongolian spots (MS), and erythema toxicum neonatorum (ETN) were the first three common changes among a total of 15 conditions observed overall; these three, as also milia and icterus, revealed statistical significance with both maternal as well as neonatal factors. Lanugo and napkin dermatitis (ND) were statistically significant with respect to two neonatal factors and cradle cap (CC), a single maternal factor. Gestational age was of statistical significance regarding five cutaneous changes, associated maternal illness during pregnancy regarding four, birth weight as well as parity regarding three each, and sex of the neonate as well as mode of delivery regarding two each. Despite observing a statistically significant correlation of eight cutaneous changes with three maternal and/or three neonatal factors, more extensive studies in neonatal dermatology are required for validation of these unique statistical correlations.

  7. The Digital Divide in Romania – A Statistical Analysis

    Directory of Open Access Journals (Sweden)

    Daniela BORISOV

    2012-06-01

    Full Text Available The digital divide is a subject of major importance in the current economic circumstances, in which Information and Communication Technologies (ICT) are seen as a significant determinant of domestic competitiveness and a contributor to better quality of life. The latest international reports on various aspects of ICT usage in modern society reveal a decrease of overall digital disparity relative to the average trends of the worldwide ICT sector; this relates to the latest advances in mobile and computer penetration rates, both for personal use and for households and businesses. In Romania, the low starting point in the development of the economy and society in the ICT direction was, to some extent, compensated by the rapid annual growth of the last decade. Even with these dynamic developments, the statistical data still indicate poor positions in the European Union hierarchy; in this respect, the prospects of a rapid recovery from the low performance of Romanian ICT endowment and usage continue to be regarded as a challenge for progress in economic and societal terms. The paper presents several methods for assessing the current state of ICT-related aspects in terms of Internet usage, based on the latest data provided by international databases. The current position of the Romanian economy is judged against several economies using statistical methods based on variability measurements: descriptive statistics indicators, static measures of disparities and distance metrics.

  8. Learning to Translate: A Statistical and Computational Analysis

    Directory of Open Access Journals (Sweden)

    Marco Turchi

    2012-01-01

    Full Text Available We present an extensive experimental study of Phrase-based Statistical Machine Translation, from the point of view of its learning capabilities. Very accurate Learning Curves are obtained, using high-performance computing, and extrapolations of the projected performance of the system under different conditions are provided. Our experiments confirm existing and mostly unpublished beliefs about the learning capabilities of statistical machine translation systems. We also provide insight into the way statistical machine translation learns from data, including the respective influence of translation and language models, the impact of phrase length on performance, and various unlearning and perturbation analyses. Our results support and illustrate the fact that performance improves by a constant amount for each doubling of the data, across different language pairs, and different systems. This fundamental limitation seems to be a direct consequence of Zipf's law governing textual data. Although the rate of improvement may depend on both the data and the estimation method, it is unlikely that the general shape of the learning curve will change without major changes in the modeling and inference phases. Possible research directions that address this issue include the integration of linguistic rules or the development of active learning procedures.
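
    The statement that performance improves by a constant amount for each doubling of the data corresponds to a learning curve that is linear in the base-2 logarithm of the training-set size. The sketch below fits such a curve by ordinary least squares. The training sizes and scores are hypothetical numbers chosen for illustration and are not results from the paper.

```python
import math

def fit_log_linear(sizes, scores):
    """Least-squares fit of score = intercept + gain * log2(size).

    Returns (intercept, gain), where gain is the improvement per doubling.
    """
    xs = [math.log2(n) for n in sizes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(scores) / len(scores)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, scores))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    gain = sxy / sxx
    return y_mean - gain * x_mean, gain

def predict(intercept, gain, size):
    """Extrapolated score at a given training-set size."""
    return intercept + gain * math.log2(size)

# Hypothetical data: sentence pairs used for training and a translation quality score.
sizes = [10_000, 20_000, 40_000, 80_000, 160_000]
scores = [20.0, 21.5, 23.0, 24.5, 26.0]

intercept, gain = fit_log_linear(sizes, scores)
print("gain per doubling:", round(gain, 3))
print("extrapolated score at 320,000 pairs:", round(predict(intercept, gain, 320_000), 3))
```

    Under this model each further fixed gain requires twice as much data as the previous one, which is why the abstract describes the behaviour as a fundamental limitation.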

  9. Performance Analysis of Statistical Time Division Multiplexing Systems

    Directory of Open Access Journals (Sweden)

    Johnson A. AJIBOYE

    2010-12-01

    Full Text Available Multiplexing is a way of accommodating many low-capacity input sources over a high-capacity outgoing channel. Statistical Time Division Multiplexing (STDM) is a technique that allows more users to be multiplexed over the channel than the channel could otherwise accommodate. STDM normally exploits the time slots left unused by non-active users and allocates those slots to the active users. Therefore, STDM is appropriate for bursty sources. In this way STDM normally utilizes channel bandwidth better than traditional Time Division Multiplexing (TDM). In this work, the statistical multiplexer is viewed as an M/M/1 queuing system and the performance is measured by comparing analytical results to simulation results using Matlab. The index used to determine the performance of the statistical multiplexer is the number of packets both in the system and in the queue. Analytical results were also compared between the M/M/1 and M/M/2 queue systems and between the M/M/1 and M/D/1 queue systems. At high utilizations, M/M/2 performs better than M/M/1. M/D/1 also outperforms M/M/1.
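
    The analytical side of the comparison described in this abstract can be sketched with the standard steady-state formulas for the mean number of packets in the system (L) and in the queue (Lq). The arrival rate, service rate and function names below are illustrative assumptions, and the M/M/2 case assumes two servers that each work at the same rate as the single server.

```python
def mm1(lam, mu):
    """M/M/1: returns (L, Lq) for arrival rate lam and service rate mu."""
    rho = lam / mu
    if rho >= 1:
        raise ValueError("queue is unstable: utilization must be below 1")
    lq = rho ** 2 / (1 - rho)
    return lq + rho, lq

def md1(lam, mu):
    """M/D/1 (deterministic service): returns (L, Lq)."""
    rho = lam / mu
    if rho >= 1:
        raise ValueError("queue is unstable: utilization must be below 1")
    lq = rho ** 2 / (2 * (1 - rho))
    return lq + rho, lq

def mm2(lam, mu):
    """M/M/2 with two servers of rate mu each: returns (L, Lq)."""
    rho = lam / (2 * mu)
    if rho >= 1:
        raise ValueError("queue is unstable: utilization must be below 1")
    lq = 2 * rho ** 3 / (1 - rho ** 2)
    return lq + 2 * rho, lq

# Illustrative high-utilization case: 0.9 packets arrive per unit service time.
for name, model in (("M/M/1", mm1), ("M/D/1", md1), ("M/M/2", mm2)):
    system, queue = model(0.9, 1.0)
    print(f"{name}: L = {system:.3f}, Lq = {queue:.3f}")
```

    With these inputs the M/D/1 queue holds half as many waiting packets as M/M/1, and M/M/2 holds far fewer packets in the system, which is consistent with the ordering reported in the abstract.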

  10. Advances in Statistical Methods for Meta-Analysis.

    Science.gov (United States)

    Hedges, Larry V.

    1984-01-01

    The adequacy of traditional effect size measures for research synthesis is challenged. Analogues to analysis of variance and multiple regression analysis for effect sizes are presented. The importance of tests for the consistency of effect sizes in interpreting results, and problems in obtaining well-specified models for meta-analysis are…

  11. Research design and statistical methods in Indian medical journals: a retrospective survey.

    Science.gov (United States)

    Hassan, Shabbeer; Yellur, Rajashree; Subramani, Pooventhan; Adiga, Poornima; Gokhale, Manoj; Iyer, Manasa S; Mayya, Shreemathi S

    2015-01-01

    Good quality medical research generally requires not only an expertise in the chosen medical field of interest but also a sound knowledge of statistical methodology. The number of medical research articles which have been published in Indian medical journals has increased quite substantially in the past decade. The aim of this study was to collate all evidence on study design quality and statistical analyses used in selected leading Indian medical journals. Ten (10) leading Indian medical journals were selected based on impact factors and all original research articles published in 2003 (N = 588) and 2013 (N = 774) were categorized and reviewed. A validated checklist on study design, statistical analyses, results presentation, and interpretation was used for review and evaluation of the articles. Main outcomes considered in the present study were - study design types and their frequencies, error/defects proportion in study design, statistical analyses, and implementation of CONSORT checklist in RCT (randomized clinical trials). From 2003 to 2013: The proportion of erroneous statistical analyses did not decrease (χ2=0.592, Φ=0.027, p=0.4418), 25% (80/320) in 2003 compared to 22.6% (111/490) in 2013. Compared with 2003, significant improvement was seen in 2013; the proportion of papers using statistical tests increased significantly (χ2=26.96, Φ=0.16, pdesign decreased significantly (χ2=16.783, Φ=0.12 pdesigns has remained very low (7.3%, 43/588) with majority showing some errors (41 papers, 95.3%). Majority of the published studies were retrospective in nature both in 2003 [79.1% (465/588)] and in 2013 [78.2% (605/774)]. Major decreases in error proportions were observed in both results presentation (χ2=24.477, Φ=0.17, pdesigns have decreased significantly. Randomized clinical trials are quite rarely published and have high proportion of methodological problems.
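
    The first comparison quoted in this abstract (erroneous statistical analyses in 80 of 320 papers in 2003 versus 111 of 490 in 2013) can be recomputed with a short sketch. It assumes a Pearson chi-square test on a 2x2 table without continuity correction, with the phi coefficient taken as the square root of chi-square over the total count. The function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]. Returns (chi2, phi, p)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    phi = math.sqrt(chi2 / n)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, phi, p

# Rows: 2003 and 2013. Columns: papers with and without erroneous analyses.
chi2, phi, p = chi_square_2x2(80, 320 - 80, 111, 490 - 111)
print(f"chi2 = {chi2:.3f}, phi = {phi:.3f}, p = {p:.4f}")
```

    Under these assumptions the computed values agree with those quoted in the abstract (χ2=0.592, Φ=0.027, p=0.4418) to within rounding.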

  12. Investigating Faculty Familiarity with Assessment Terminology by Applying Cluster Analysis to Interpret Survey Data

    Science.gov (United States)

    Raker, Jeffrey R.; Holme, Thomas A.

    2014-01-01

    A cluster analysis was conducted with a set of survey data on chemistry faculty familiarity with 13 assessment terms. Cluster groupings suggest a high, middle, and low overall familiarity with the terminology and an independent high and low familiarity with terms related to fundamental statistics. The six resultant clusters were found to be…

  13. A survey of gunshot residue analysis methods.

    Science.gov (United States)

    Singer, R L; Davis, D; Houck, M M

    1996-03-01

    A survey was sent to 80 forensic laboratories in 44 States and two Canadian Provinces concerning methodology in analyzing gunshot residue (GSR) and interpreting the results. Of the 80 surveys, 50 (63%) were returned completed. Questions included standard procedures, collection methods, thresholding problems and specificity of data. These results are compared to a previous survey reported in 1990. Implications for the interpretation and future study of these methods are discussed.

  14. Improving the sampling strategy of the Joint Danube Survey 3 (2013) by means of multivariate statistical techniques applied on selected physico-chemical and biological data.

    Science.gov (United States)

    Hamchevici, Carmen; Udrea, Ion

    2013-11-01

    The concept of a basin-wide Joint Danube Survey (JDS) was launched by the International Commission for the Protection of the Danube River (ICPDR) as a tool for investigative monitoring under the Water Framework Directive (WFD), with a frequency of 6 years. The first JDS was carried out in 2001, and its success in providing key information for characterisation of the Danube River Basin District as required by the WFD led to the organisation of the second JDS in 2007, which was the world's biggest river research expedition in that year. This paper presents an approach for improving the survey strategy for the next planned survey, JDS3 (2013), by means of several multivariate statistical techniques. In order to design the optimum structure in terms of parameters and sampling sites, principal component analysis (PCA), factor analysis (FA) and cluster analysis were applied to JDS2 data for 13 selected physico-chemical elements and one biological element measured at 78 sampling sites located on the main course of the Danube. Results from PCA/FA showed that most of the dataset variance (above 75%) was explained by five varifactors loaded with 8 out of 14 variables: physical (transparency and total suspended solids), relevant nutrients (N-nitrates and P-orthophosphates), feedback effects of primary production (pH, alkalinity and dissolved oxygen) and algal biomass. Taking into account the representation of the factor scores given by FA versus sampling sites and the major groups generated by the clustering procedure, the spatial network of the next survey could be carefully tailored, reducing the number of sampling sites by more than 30%. A target-oriented sampling strategy based on the selected multivariate statistics can provide a strong reduction in the dimensionality of the original data, and in the corresponding costs as well, without any loss of information.

  15. A STATISTICAL SURVEY OF DIOXIN-LIKE COMPOUNDS IN U.S. PORK FAT

    Science.gov (United States)

    The purpose of this paper is to report on the results of a joint survey of the United States Department of Agriculture (USDA) and the United States Environmental Protection Agency (EPA) on the rate of occurrence and concentration of chlorinated dibenzo-p-dioxins (CDDs), chlorinat...

  16. A Statistical Study of Brown Dwarf Companions from the SDSS-III MARVELS Survey

    Science.gov (United States)

    Grieves, Nolan; Ge, Jian; Thomas, Neil; Ma, Bo; De Lee, Nathan M.; Lee, Brian L.; Fleming, Scott W.; Sithajan, Sirinrat; Varosi, Frank; Liu, Jian; Zhao, Bo; Li, Rui; Agol, Eric; MARVELS Team

    2016-01-01

    We present 23 new Brown Dwarf (BD) candidates from the Multi-object APO Radial-Velocity Exoplanet Large-Area Survey (MARVELS) of the Sloan Digital Sky Survey III (SDSS-III). The BD candidates were selected from the processed MARVELS data using the latest University of Florida 2D pipeline, which shows significant improvement and reduction of systematic errors over the 1D pipeline results included in the SDSS Data Release 12. This sample is the largest BD yield from a single radial velocity survey. Of the 23 candidates, 18 are around main sequence stars and 5 are around giant stars. Given a giant contamination rate of ~24% for the MARVELS survey, we find a BD occurrence rate around main sequence stars of ~0.7%, which agrees with previous studies and confirms the BD desert, while the BD occurrence rate around the MARVELS giant stars is ~0.6%. Preliminary results show that our new candidates around solar type stars support a two population hypothesis, where BDs are divided at a mass of ~42.5 MJup. BDs less massive than 42.5 MJup have eccentricity distributions consistent with planet-planet scattering models, where BDs more massive than 42.5 MJup have both period and eccentricity distributions similar to that of stellar binaries. Special Brown Dwarf systems such as multiple BD systems and highly eccentric BDs will also be presented.

  17. A Survey of Word Reordering in Statistical Machine Translation : Computational Models and Language Phenomena

    NARCIS (Netherlands)

    Bisazza, A.; Federico, M.

    Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor of its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears

  18. 76 FR 67405 - Proposed Information Collection; Comment Request; Federal Statistical System Public Opinion Survey

    Science.gov (United States)

    2011-11-01

    ... From the Federal Register Online via the Government Publishing Office DEPARTMENT OF COMMERCE U.S. Census Bureau Proposed Information Collection; Comment Request; Federal Statistical System Public Opinion... opinion data will enable the Census Bureau to better understand public perceptions, which will provide...

  19. Nature and statistical properties of quasar associated absorption systems in the XQ-100 Legacy Survey

    DEFF Research Database (Denmark)

    Perrotta, Serena; D'Odorico, Valentina; Prochaska, J. Xavier

    2016-01-01

    We statistically study the physical properties of a sample of narrow absorption line (NAL) systems looking for empirical evidences to distinguish between intrinsic and intervening NALs without taking into account any a priori definition or velocity cut-off. We analyze the spectra of 100 quasars...

  20. New statistical potential for quality assessment of protein models and a survey of energy functions

    Directory of Open Access Journals (Sweden)

    Rykunov Dmitry

    2010-03-01

    Background: Scoring functions, such as molecular mechanic forcefields and statistical potentials, are fundamentally important tools in protein structure modeling and quality assessment. Results: The performances of a number of publicly available scoring functions are compared with statistical rigor, with an emphasis on knowledge-based potentials. We explored the effect on accuracy of alternative choices for representing interaction center types and other features of scoring functions, such as using information on solvent accessibility and on torsion angles, and accounting for secondary structure preferences and side chain orientation. Partially based on the observations made, we present a novel residue-based statistical potential, which employs a shuffled reference state definition and takes into account the mutual orientation of residue side chains. The atom- and residue-level statistical potentials proposed in this work, and Linux executables to calculate the energy of a given protein, can be downloaded from http://www.fiserlab.org/potentials. Conclusions: Among the most influential terms we observed a critical role of a proper reference state definition and the benefits of including information about the microenvironment of interaction centers. Molecular mechanical potentials were also tested and found to be over-sensitive to small local imperfections in a structure, requiring unfeasibly long energy relaxation before energy scores started to correlate with model quality.

  1. Geostatistical and multivariate statistical analysis of heavily and manifoldly contaminated soil samples.

    Science.gov (United States)

    Schaefer, Kristin; Einax, Jürgen W; Simeonov, Vasil; Tsakovski, Stefan

    2010-04-01

    The surroundings of the former Kremikovtzi steel mill near Sofia (Bulgaria) are influenced by various emissions from the factory. In addition to steel and alloys, they produce different products based on inorganic compounds in different smelters. Soil in this region is multiply contaminated. We collected 65 soil samples and analyzed 15 elements by different methods of atomic spectroscopy for a survey of this field site. Here we present a novel hybrid approach for environmental risk assessment of polluted soil combining geostatistical methods and source apportionment modeling. We could distinguish areas with heavily and slightly polluted soils in the vicinity of the iron smelter by applying unsupervised pattern recognition methods. This result was supported by geostatistical methods such as semivariogram analysis and kriging. The modes of action of the metals examined differ significantly in such a way that iron and lead account for the main pollutants of the iron smelter, whereas, e.g., arsenic shows a haphazard distribution. The application of factor analysis and source-apportionment modeling on absolute principal component scores revealed novel information about the composition of the emissions from the different stacks. It is possible to estimate the impact of every element examined on the pollution due to their emission source. This investigation allows an objective assessment of the different spatial distributions of the elements examined in the soil of the Kremikovtzi region. The geostatistical analysis illustrates this distribution and is supported by multivariate statistical analysis revealing relations between the elements.

  2. Toward a theory of statistical tree-shape analysis

    DEFF Research Database (Denmark)

    Feragen, Aasa; Lo, Pechin Chien Pau; de Bruijne, Marleen

    2013-01-01

    In order to develop statistical methods for shapes with a tree-structure, we construct a shape space framework for tree-shapes and study metrics on the shape space. This shape space has singularities, which correspond to topological transitions in the represented trees. We study two closely related...... metrics on the shape space, TED and QED. QED is a quotient Euclidean distance arising naturally from the shape space formulation, while TED is the classical tree edit distance. Using Gromov's metric geometry we gain new insight into the geometries defined by TED and QED. We show that the new metric QED...

  3. Introduction to statistical data analysis for the life sciences

    CERN Document Server

    Ekstrom, Claus Thorn

    2014-01-01

    This text provides a computational toolbox that enables students to analyze real datasets and gain the confidence and skills to undertake more sophisticated analyses. Although accessible with any statistical software, the text encourages a reliance on R. For those new to R, an introduction to the software is available in an appendix. The book also includes end-of-chapter exercises as well as an entire chapter of case exercises that help students apply their knowledge to larger datasets and learn more about approaches specific to the life sciences.

  4. Improving the Conduct and Reporting of Statistical Analysis in Psychology.

    Science.gov (United States)

    Sijtsma, Klaas; Veldkamp, Coosje L S; Wicherts, Jelte M

    2016-03-01

    We respond to the commentaries Waldman and Lilienfeld (Psychometrika, 2015) and Wigboldus and Dotch (Psychometrika, 2015) provided in response to Sijtsma's (Sijtsma in Psychometrika, 2015) discussion article on questionable research practices. Specifically, we discuss the fear of an increased dichotomy between substantive and statistical aspects of research that may arise when the latter aspects are laid entirely in the hands of a statistician, remedies for false positives and replication failure, and the status of data exploration, and we provide a re-definition of the concept of questionable research practices.

  5. Bayesian statistical analysis of censored data in geotechnical engineering

    DEFF Research Database (Denmark)

    Ditlevsen, Ove Dalager; Tarp-Johansen, Niels Jacob; Denver, Hans

    2000-01-01

    The geotechnical engineer is often faced with the problem of how to assess the statistical properties of a soil parameter on the basis of a sample measured in-situ or in the laboratory with the defect that some values have been replaced by interval bounds because the corresponding soil parameter values...... is available about the soil parameter distribution. The present paper shows how a characteristic value by computer calculations can be assessed systematically from the actual sample of censored data supplemented with prior information from a soil parameter data base....

  6. An invariant approach to statistical analysis of shapes

    CERN Document Server

    Lele, Subhash R

    2001-01-01

    INTRODUCTION: A Brief History of Morphometrics; Foundations for the Study of Biological Forms; Description of the Data Sets. MORPHOMETRIC DATA: Types of Morphometric Data; Landmark Homology and Correspondence; Collection of Landmark Coordinates; Reliability of Landmark Coordinate Data; Summary. STATISTICAL MODELS FOR LANDMARK COORDINATE DATA: Statistical Models in General; Models for Intra-Group Variability; Effect of Nuisance Parameters; Invariance and Elimination of Nuisance Parameters; A Definition of Form; Coordinate System Free Representation of Form; Est

  7. JAWS data collection, analysis highlights, and microburst statistics

    Science.gov (United States)

    Mccarthy, J.; Roberts, R.; Schreiber, W.

    1983-01-01

    Organization, equipment, and the current status of the Joint Airport Weather Studies project initiated in relation to the microburst phenomenon are summarized. Some data collection techniques and preliminary statistics on microburst events recorded by Doppler radar are discussed as well. Radar studies show that microbursts occur much more often than expected, with the majority of the events being potentially dangerous to landing or departing aircraft. Seventy events were registered, with the differential velocities ranging from 10 to 48 m/s; headwind/tailwind velocity differentials over 20 m/s are considered seriously hazardous. It is noted that a correlation is yet to be established between the velocity differential and incoherent radar reflectivity.

  8. Data analysis of asymmetric structures advanced approaches in computational statistics

    CERN Document Server

    Saito, Takayuki

    2004-01-01

    Data Analysis of Asymmetric Structures provides a comprehensive presentation of a variety of models and theories for the analysis of asymmetry and its applications and provides a wealth of new approaches in every section. It meets both the practical and theoretical needs of research professionals across a wide range of disciplines and  considers data analysis in fields such as psychology, sociology, social science, ecology, and marketing. In seven comprehensive chapters this guide details theories, methods, and models for the analysis of asymmetric structures in a variety of disciplines and presents future opportunities and challenges affecting research developments and business applications.

  9. Radar Derived Spatial Statistics of Summer Rain. Volume 2; Data Reduction and Analysis

    Science.gov (United States)

    Konrad, T. G.; Kropfli, R. A.

    1975-01-01

    Data reduction and analysis procedures are discussed along with the physical and statistical descriptors used. The statistical modeling techniques are outlined and examples of the derived statistical characterization of rain cells in terms of the several physical descriptors are presented. Recommendations concerning analyses which can be pursued using the data base collected during the experiment are included.

  10. The R software fundamentals of programming and statistical analysis

    CERN Document Server

    Lafaye de Micheaux, Pierre; Liquet, Benoit

    2013-01-01

    The contents of The R Software are presented so as to be both comprehensive and easy for the reader to use. Besides its application as a self-learning text, this book can support lectures on R at any level from beginner to advanced. This book can serve as a textbook on R for beginners as well as more advanced users, working on Windows, macOS or Linux. The first part of the book deals with the heart of the R language and its fundamental concepts, including data organization, import and export, various manipulations, documentation, plots, programming and maintenance. The last chapter in this part deals with object-oriented programming as well as interfacing R with C/C++ or Fortran, and contains a section on debugging techniques. This is followed by the second part of the book, which provides detailed explanations on how to perform many standard statistical analyses, mainly in the Biostatistics field. Topics from mathematical and statistical settings that are included are matrix operations, integration, o...

  11. A COMPARISON OF SOME STATISTICAL TECHNIQUES FOR ROAD ACCIDENT ANALYSIS

    NARCIS (Netherlands)

    OPPE, S INST ROAD SAFETY RES, SWOV

    1992-01-01

    At the TRRL/SWOV Workshop on Accident Analysis Methodology, held in Amsterdam in 1988, the need to establish a methodology for the analysis of road accidents was firmly stated by all participants. Data from different countries cannot be compared because there is no agreement on research methodology,

  12. Using multivariate statistical analysis to assess changes in water ...

    African Journals Online (AJOL)

    Canonical correspondence analysis (CCA) showed that the environmental variables used in the analysis, discharge and month of sampling, explained a small proportion of the total variance in the data set – less than 10% at each site. However, the total data set variance, explained by the 4 hypothetical axes generated by ...

  13. Sealed-bid auction of Netherlands mussels: statistical analysis

    NARCIS (Netherlands)

    Kleijnen, J.P.C.; van Schaik, F.D.J.

    2011-01-01

    This article presents an econometric analysis of the many data on the sealed-bid auction that sells mussels in Yerseke town, the Netherlands. The goals of this analysis are obtaining insight into the important factors that determine the price of these mussels, and quantifying the performance of an

  14. Einstein Slew Survey: Data analysis innovations

    Science.gov (United States)

    Elvis, Martin S.; Plummer, David; Schachter, Jonathan F.; Fabbiano, G.

    1992-01-01

    Several new methods were needed in order to make the Einstein Slew X-ray Sky Survey. The innovations which enabled the Slew Survey to be done are summarized. These methods included an experimental approach to large projects, parallel processing on a LAN, percolation source detection, minimum action identifications, and rapid dissemination of the whole data base.

  15. Greek Alcohol Survey: Results and Analysis.

    Science.gov (United States)

    Jensen, Wesley; And Others

    Alcohol use among 458 members of Greek fraternities and sororities at the University of North Dakota was surveyed. The survey instrument, which was an adaptation of a questionnaire developed by Michael A. Looney, was directed to frequency of use, amounts consumed, type of beverage, attitudes, and demographic information. It was found that…

  16. Analysis of Bidirectional Associative Memory using Self-consistent Signal to Noise Analysis and Statistical Neurodynamics

    Science.gov (United States)

    Shouno, Hayaru; Kido, Shoji; Okada, Masato

    2004-09-01

    Bidirectional associative memory (BAM) is a kind of artificial neural network used to memorize and retrieve heterogeneous pattern pairs. Many efforts have been made to improve BAM from the viewpoint of computer application, but few theoretical studies have been done. We investigated the theoretical characteristics of BAM using a framework of statistical-mechanical analysis. To investigate the equilibrium state of BAM, we applied self-consistent signal to noise analysis (SCSNA) and obtained macroscopic parameter equations and the relative capacity. Moreover, to investigate not only the equilibrium state but also the retrieval process of reaching the equilibrium state, we applied statistical neurodynamics to the update rule of BAM and obtained evolution equations for the macroscopic parameters. These evolution equations are consistent with the results of SCSNA in the equilibrium state.
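    The abstract refers to the update rule of BAM without stating it. As a minimal sketch, assuming the standard textbook formulation (bipolar ±1 patterns, Hebbian outer-product weights, alternating sign updates between the two layers), storing and retrieving pattern pairs could look like the code below. The function names and the toy patterns are hypothetical and are not taken from the paper, whose analysis concerns the macroscopic behaviour of such networks rather than an implementation.

```python
def train(pairs):
    """Hebbian weights: W[i][j] = sum over stored pairs of x[i] * y[j]."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    w = [[0] * m for _ in range(n)]
    for x, y in pairs:
        for i in range(n):
            for j in range(m):
                w[i][j] += x[i] * y[j]
    return w

def sign_vec(fields, previous):
    """Sign threshold; a unit with zero input keeps its previous state."""
    return [1 if h > 0 else -1 if h < 0 else p for h, p in zip(fields, previous)]

def recall(w, x, max_iters=20):
    """Alternate x -> y -> x updates until the pair (x, y) stops changing."""
    n, m = len(w), len(w[0])
    y = [1] * m  # arbitrary initial state for the second layer
    for _ in range(max_iters):
        new_y = sign_vec([sum(w[i][j] * x[i] for i in range(n)) for j in range(m)], y)
        new_x = sign_vec([sum(w[i][j] * new_y[j] for j in range(m)) for i in range(n)], x)
        if new_x == x and new_y == y:
            break
        x, y = new_x, new_y
    return x, y

# Two toy pattern pairs (the x patterns are mutually orthogonal).
x1, y1 = [1, 1, 1, 1, 1, 1, 1, 1], [1, -1, 1]
x2, y2 = [1, -1, 1, -1, 1, -1, 1, -1], [-1, -1, 1]
w = train([(x1, y1), (x2, y2)])

noisy_x1 = [1, 1, 1, 1, 1, 1, 1, -1]  # x1 with its last element flipped
restored_x, restored_y = recall(w, noisy_x1)
print(restored_x == x1, restored_y == y1)
```

    With only two stored pairs the network is far below capacity, so a single corrupted element is corrected; the relative capacity discussed in the abstract concerns how many pairs can be stored before such retrieval fails.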

  17. Hydration sites of unpaired RNA bases: a statistical analysis of the PDB structures

    Directory of Open Access Journals (Sweden)

    Carugo Oliviero

    2011-10-01

    Background: Hydration is crucial for RNA structure and function. X-ray crystallography is the most commonly used method to determine RNA structures and hydration and, therefore, statistical surveys are based on crystallographic results, the number of which is quickly increasing. Results: A statistical analysis of the water molecule distribution in high-resolution X-ray structures of unpaired RNA nucleotides showed that: different bases have the same penchant to be surrounded by water molecules; clusters of water molecules indicate possible hydration sites, which, in some cases, match those of the major and minor grooves of RNA and DNA double helices; complex hydrogen bond networks characterize the solvation of the nucleotides, resulting in a significant rigidity of the base and its surrounding water molecules. Interestingly, the hydration sites around unpaired RNA bases do not match, in general, the positions that are occupied by the second nucleotide when the base-pair is formed. Conclusions: The hydration sites around unpaired RNA bases were found. They do not replicate the atom positions of complementary bases in the Watson-Crick pairs.

  18. 2012 Anthropometric Survey of U.S. Army Pilot Personnel: Methods and Summary Statistics

    Science.gov (United States)

    2016-05-01

    Development and Engineering Center. Goals of the survey were to acquire a large body of data from comparably measured males and females to serve the Army...II Pilots Race/Ethnicity Males Females Frequency Percent Frequency Percent White, not of Hispanic descent 842 86.18 30 71.43 Black, not of Hispanic...Representative Age/Race Group Weights for ANSUR II Female Pilots Database (n=42) Age Group White, Not of Hispanic Descent Black, Not of

  19. A Clinical and Statistical Survey of Cutaneous Changes in the First 120 Hours of Life

    OpenAIRE

    Sadana, Dinkar J; Sharma, Yugal K; Chaudhari, Nitin D; Dash, Kedarnath; Rizvi, Alia; Jethani, Sumit

    2014-01-01

    Background: The spectrum of dermatological manifestations during neonatal period varies from transient self-limiting conditions to serious dermatoses; the latter, fortunately few, are disproportionately stressful to the parents, who due to lack of specialized pediatric dermatology clinics frequently get tossed between a dermatologist and a pediatrician. Objectives: This study was formulated to record cutaneous changes over the first five postnatal days of life and to statistically correlate t...

  20. Statistical analysis of questionnaires a unified approach based on R and Stata

    CERN Document Server

    Bartolucci, Francesco; Gnaldi, Michela

    2015-01-01

    Statistical Analysis of Questionnaires: A Unified Approach Based on R and Stata presents special statistical methods for analyzing data collected by questionnaires. The book takes an applied approach to testing and measurement tasks, mirroring the growing use of statistical methods and software in education, psychology, sociology, and other fields. It is suitable for graduate students in applied statistics and psychometrics and practitioners in education, health, and marketing. The book covers the foundations of classical test theory (CTT), test reliability, va

  1. Statistical Analysis of Conductor Motion in LHC Superconducting Dipole Magnets

    CERN Document Server

    Calvi, M; Pugnat, P; Siemko, A

    2004-01-01

    Premature training quenches are usually caused by the transient energy release within the magnet coil as it is energised. The dominant disturbances originate in cable motion and produce observable rapid variations in voltage signals called spikes. The experimental set-up and the raw data treatment used to detect these phenomena are briefly recalled. The statistical properties of different features of spikes are presented, such as the maximal amplitude, the energy, the duration and the time correlation between events. The parameterisation of the mechanical activity of magnets is addressed. The mechanical activity of full-scale prototype and first preseries LHC dipole magnets is analysed, and correlations with magnet manufacturing procedures and quench performance are established. The predictability of the quench occurrence is discussed and examples are presented.

  2. PHAST: Protein-like heteropolymer analysis by statistical thermodynamics

    Science.gov (United States)

    Frigori, Rafael B.

    2017-06-01

    PHAST is a software package written in standard Fortran, with MPI and CUDA extensions, able to efficiently perform parallel multicanonical Monte Carlo simulations of single or multiple heteropolymeric chains, as coarse-grained models for proteins. The outcome data can be straightforwardly analyzed within its microcanonical Statistical Thermodynamics module, which allows for computing the entropy, caloric curve, specific heat and free energies. As a case study, we investigate the aggregation of heteropolymers bioinspired by Aβ25-33 fragments and their cross-seeding with IAPP20-29 isoforms. Excellent parallel scaling is observed, even under numerically difficult first-order-like phase transitions, which are properly described by the built-in fully reconfigurable force fields. The package is free and open source, which should encourage users to adapt it to specific purposes.

  3. Detailed statistical analysis plan for the pulmonary protection trial

    DEFF Research Database (Denmark)

    Buggeskov, Katrine B; Jakobsen, Janus C; Secher, Niels H

    2014-01-01

    BACKGROUND: Pulmonary dysfunction complicates cardiac surgery that includes cardiopulmonary bypass. The pulmonary protection trial evaluates effect of pulmonary perfusion on pulmonary function in patients suffering from chronic obstructive pulmonary disease. This paper presents the statistical plan...... for the main publication to avoid risk of outcome reporting bias, selective reporting, and data-driven results as an update to the published design and method for the trial. RESULTS: The pulmonary protection trial is a randomized, parallel group clinical trial that assesses the effect of pulmonary perfusion......: The pulmonary protection trial investigates the effect of pulmonary perfusion during cardiopulmonary bypass in chronic obstructive pulmonary disease patients. A preserved oxygenation index following pulmonary perfusion may indicate an effect and inspire to a multicenter confirmatory trial to assess a more...

  4. Statistical mechanics analysis of LDPC coding in MIMO Gaussian channels

    Energy Technology Data Exchange (ETDEWEB)

    Alamino, Roberto C; Saad, David [Neural Computing Research Group, Aston University, Birmingham B4 7ET (United Kingdom)

    2007-10-12

    Using analytical methods of statistical mechanics, we analyse the typical behaviour of a multiple-input multiple-output (MIMO) Gaussian channel with binary inputs under low-density parity-check (LDPC) network coding and joint decoding. The saddle point equations for the replica symmetric solution are found in particular realizations of this channel, including a small and large number of transmitters and receivers. In particular, we examine the cases of a single transmitter, a single receiver and symmetric and asymmetric interference. Both dynamical and thermodynamical transitions from the ferromagnetic solution of perfect decoding to a non-ferromagnetic solution are identified for the cases considered, marking the practical and theoretical limits of the system under the current coding scheme. Numerical results are provided, showing the typical level of improvement/deterioration achieved with respect to the single transmitter/receiver result, for the various cases.

  5. Statistical analysis of complex systems with nonclassical invariant measures

    KAUST Repository

    Fratalocchi, Andrea

    2011-02-28

    I investigate the problem of finding a statistical description of a complex many-body system whose invariant measure cannot be constructed stemming from classical thermodynamics ensembles. By taking solitons as a reference system and by employing a general formalism based on the Ablowitz-Kaup-Newell-Segur scheme, I demonstrate how to build an invariant measure and, within a one-dimensional phase space, how to develop a suitable thermodynamics. A detailed example is provided with a universal model of wave propagation, with reference to a transparent potential sustaining gray solitons. The system shows a rich thermodynamic scenario, with a free-energy landscape supporting phase transitions and controllable emergent properties. I finally discuss the origin of such behavior, trying to identify common denominators in the area of complex dynamics.

  6. Statistical Analysis of Upper Bound using Data with Uncertainties

    CERN Document Server

    Tng, Barry Jia Hao

    2014-01-01

    Let $F$ be the unknown distribution of a non-negative continuous random variable. We would like to determine if $\mathrm{supp}(F) \subseteq [0,c]$ where $c$ is a constant (a proposed upper bound). Instead of directly observing $X_1,\dots,X_n \overset{\mathrm{i.i.d.}}{\sim} F$, we only get to observe as data $Y_1,\dots,Y_n$ where $Y_i = X_i + \epsilon_i$, with $\epsilon_i$ being random variables representing errors. In this paper, we will explore methods to handle this statistical problem for two primary cases - parametric and nonparametric. The data from deep inelastic scattering experiments on measurements of $R=\sigma_L / \sigma_T$ would be used to test code which has been written to implement the discussed methods.

  7. Statistical analysis of NOMAO customer votes for spots of France

    CERN Document Server

    Palovics, Robert; Benczur, Andras; Pap, Julia; Ermann, Leonardo; Phan, Samuel; Chepelianskii, Alexei D; Shepelyansky, Dima L

    2015-01-01

    We investigate the statistical properties of votes of customers for spots of France collected by the startup company NOMAO. The frequencies of votes per spot and per customer are characterized by power-law distributions which remain stable on a time scale of a decade when the number of votes is varied by almost two orders of magnitude. Using computer science methods we explore the spectrum and the eigenvalues of a matrix containing user ratings of geolocalized items. Eigenvalues map nicely to large towns and regions but show a certain level of instability as we modify the interpretation of the underlying matrix. We evaluate imputation strategies that provide improved prediction performance by reaching geographically smooth eigenvectors. We point out possible links between the distribution of votes and the phenomenon of self-organized criticality.

  8. Statistical Analysis of Complexity Generators for Cost Estimation

    Science.gov (United States)

    Rowell, Ginger Holmes

    1999-01-01

    Predicting the cost of cutting edge new technologies involved with spacecraft hardware can be quite complicated. A new feature of the NASA Air Force Cost Model (NAFCOM), called the Complexity Generator, is being developed to model the complexity factors that drive the cost of space hardware. This parametric approach is also designed to account for the differences in cost, based on factors that are unique to each system and subsystem. The cost driver categories included in this model are weight, inheritance from previous missions, technical complexity, and management factors. This paper explains the Complexity Generator framework, the statistical methods used to select the best model within this framework, and the procedures used to find the region of predictability and the prediction intervals for the cost of a mission.

  9. Statistical Lineament Analysis in South Greenland Based on Landsat Imagery

    DEFF Research Database (Denmark)

    Conradsen, Knut; Nilsson, Gert; Thyrsted, Tage

    1986-01-01

    Linear features, mapped visually from MSS channel-7 photoprints (1: 1 000 000) of Landsat images from South Greenland, were digitized and analyzed statistically. A sinusoidal curve was fitted to the frequency distribution which was then divided into ten significant classes of azimuthal trends. Maps...... showing the density of linear features for each of the ten classes indicate that many of the classes are distributed in zones defined by elongate maxima or rows of maxima. In cases where the elongate maxima and the linear feature direction of the class in question are parallel, a zone of major crustal...... discontinuity is inferred. In the area investigated, such zones coincide with geochemical boundaries and graben structures, and the intersections of some zones seem to control intrusion sites. In cases where there is no parallelism between the elongate maxima and the linear feature direction, an en echelon...

  10. Comparative Analysis of Kernel Methods for Statistical Shape Learning

    National Research Council Canada - National Science Library

    Rathi, Yogesh; Dambreville, Samuel; Tannenbaum, Allen

    2006-01-01

    .... In this work, we perform a comparative analysis of shape learning techniques such as linear PCA, kernel PCA, locally linear embedding and propose a new method, kernelized locally linear embedding...

  11. Consolidity analysis for fully fuzzy functions, matrices, probability and statistics

    OpenAIRE

    Walaa Ibrahim Gabr

    2015-01-01

    The paper presents a comprehensive review of the know-how for developing the systems consolidity theory for modeling, analysis, optimization and design in fully fuzzy environment. The solving of systems consolidity theory included its development for handling new functions of different dimensionalities, fuzzy analytic geometry, fuzzy vector analysis, functions of fuzzy complex variables, ordinary differentiation of fuzzy functions and partial fraction of fuzzy polynomials. On the other hand, ...

  12. Statistical methods for temporal and space-time analysis of community composition data.

    Science.gov (United States)

    Legendre, Pierre; Gauthier, Olivier

    2014-03-07

    This review focuses on the analysis of temporal beta diversity, which is the variation in community composition along time in a study area. Temporal beta diversity is measured by the variance of the multivariate community composition time series and that variance can be partitioned using appropriate statistical methods. Some of these methods are classical, such as simple or canonical ordination, whereas others are recent, including the methods of temporal eigenfunction analysis developed for multiscale exploration (i.e. addressing several scales of variation) of univariate or multivariate response data, reviewed, to our knowledge for the first time in this review. These methods are illustrated with ecological data from 13 years of benthic surveys in Chesapeake Bay, USA. The following methods are applied to the Chesapeake data: distance-based Moran's eigenvector maps, asymmetric eigenvector maps, scalogram, variation partitioning, multivariate correlogram, multivariate regression tree, and two-way MANOVA to study temporal and space-time variability. Local (temporal) contributions to beta diversity (LCBD indices) are computed and analysed graphically and by regression against environmental variables, and the role of species in determining the LCBD values is analysed by correlation analysis. A tutorial detailing the analyses in the R language is provided in an appendix.

  13. Multivariate Statistical Analysis of the Tularosa-Hueco Basin

    Science.gov (United States)

    Agrawala, G.; Walton, J. C.

    2006-12-01

    The border region is growing rapidly and experiencing a sharp decline in both water quality and availability, putting a strain on the quickly diminishing resource. Since water is used primarily for agricultural, domestic, commercial, livestock, mining and power generation, its rapid depletion is of major concern in the region. Tools such as Principal Component Analysis (PCA), Correspondence Analysis and Cluster Analysis have the potential to present new insight into this problem. The Tularosa-Hueco Basin is analyzed here using some of these Multivariate Analysis methods. PCA is applied to geo-chemical data from the region and a Cluster Analysis is applied to the results in order to group wells with similar characteristics. The derived Principal Axis and well groups are presented as biplots and overlaid on a digital elevation map of the region, providing a visualization of potential interactions and flow paths between surface water and ground water. Simulation by this modeling technique gives valuable insight into the water chemistry and the potential pollution threats to the already diminishing water resources.

  14. Statistical and Spatial Analysis of Borderland Ground Water Geochemistry

    Science.gov (United States)

    Agrawala, G. K.; Woocay, A.; Walton, J. C.

    2007-12-01

    The border region is growing rapidly and experiencing a sharp decline in both water quality and availability, putting a strain on the quickly diminishing resource. Since water is used primarily for agricultural, domestic, commercial, livestock, mining and power generation, its rapid depletion is of major concern in the region. Tools such as Principal Component Analysis (PCA), Correspondence Analysis and Cluster Analysis have the potential to present new insight into this problem. The Borderland groundwater is analyzed here using some of these Multivariate Analysis methods. PCA is applied to geo-chemical data from the region and a Cluster Analysis is applied to the results in order to group wells with similar characteristics. The derived Principal Axis and well groups are presented as biplots and overlaid on a digital elevation map of the region, providing a visualization of potential interactions and flow paths between surface water and ground water. Simulation by this modeling technique gives valuable insight into the water chemistry and the potential pollution threats to the already diminishing water resources.

  15. New Statistical Approach to the Analysis of Hierarchical Data

    Science.gov (United States)

    Neuman, S. P.; Guadagnini, A.; Riva, M.

    2014-12-01

    Many variables possess a hierarchical structure reflected in how their increments vary in space and/or time. Quite commonly the increments (a) fluctuate in a highly irregular manner; (b) possess symmetric, non-Gaussian frequency distributions characterized by heavy tails that often decay with separation distance or lag; (c) exhibit nonlinear power-law scaling of sample structure functions in a midrange of lags, with breakdown in such scaling at small and large lags; (d) show extended power-law scaling (ESS) at all lags; and (e) display nonlinear scaling of power-law exponent with order of sample structure function. Some interpret this to imply that the variables are multifractal, which explains neither breakdowns in power-law scaling nor ESS. We offer an alternative interpretation consistent with all above phenomena. It views data as samples from stationary, anisotropic sub-Gaussian random fields subordinated to truncated fractional Brownian motion (tfBm) or truncated fractional Gaussian noise (tfGn). The fields are scaled Gaussian mixtures with random variances. Truncation of fBm and fGn entails filtering out components below data measurement or resolution scale and above domain scale. Our novel interpretation of the data allows us to obtain maximum likelihood estimates of all parameters characterizing the underlying truncated sub-Gaussian fields. These parameters in turn make it possible to downscale or upscale all statistical moments to situations entailing smaller or larger measurement or resolution and sampling scales, respectively. They also allow one to perform conditional or unconditional Monte Carlo simulations of random field realizations corresponding to these scales. Aspects of our approach are illustrated on field and laboratory measured porous and fractured rock permeabilities, as well as soil texture characteristics and neural network estimates of unsaturated hydraulic parameters in a deep vadose zone near Phoenix, Arizona. 
We also use our approach

  16. Power flow as a complement to statistical energy analysis and finite element analysis

    Science.gov (United States)

    Cuschieri, J. M.

    1987-01-01

    Present methods of analysis of the structural response and the structure-borne transmission of vibrational energy use either finite element (FE) techniques or statistical energy analysis (SEA) methods. The FE methods are a very useful tool at low frequencies where the number of resonances involved in the analysis is rather small. On the other hand, SEA methods can predict with acceptable accuracy the response and energy transmission between coupled structures at relatively high frequencies where the structural modal density is high and a statistical approach is the appropriate solution. In the mid-frequency range, a relatively large number of resonances exist, which makes the finite element method too costly. On the other hand, SEA methods can only predict an average level form. In this mid-frequency range a possible alternative is to use power flow techniques, where the input and flow of vibrational energy to excited and coupled structural components can be expressed in terms of input and transfer mobilities. This power flow technique can be extended from low to high frequencies and this can be integrated with established FE models at low frequencies and SEA models at high frequencies to form a verification of the method. This method of structural analysis using power flow and mobility methods, and its integration with SEA and FE analysis, is applied to the case of two thin beams joined together at right angles.

  17. A Statistical Survey of Peculiar L and T Dwarfs in SDSS, 2MASS, and WISE

    Science.gov (United States)

    Kellogg, Kendra; Metchev, Stanimir; Miles-Páez, Paulo A.; Tannock, Megan E.

    2017-09-01

    We present the final results from a targeted search for brown dwarfs with unusual near-infrared colors. From a positional cross-match of the Sloan Digital Sky Survey (SDSS), 2-Micron All-Sky Survey (2MASS), and Wide-Field Infrared Survey Explorer (WISE) catalogs, we have identified 144 candidate peculiar L and T dwarfs. Spectroscopy confirms that 20 of the objects are peculiar or are candidate binaries. Of the 420 objects in our full sample 9 are young (≲200 Myr; 2.1%) and another 8 (1.9%) are unusually red, with no signatures of youth. With a spectroscopic J-Ks color of 2.58 ± 0.11 mag, one of the new objects, the L6 dwarf 2MASS J03530419+0418193, is among the reddest field dwarfs currently known and is one of the reddest objects with no signatures of youth known to date. We have also discovered another potentially very-low-gravity object, the L1 dwarf 2MASS J00133470+1109403, and independently identified the young L7 dwarf 2MASS J00440332+0228112, which was first reported by Schneider and collaborators. Our results confirm that signatures of low gravity are no longer discernible in low to moderate resolution spectra of objects older than ~200 Myr. The 1.9% of unusually red L dwarfs that do not show other signatures of youth could be slightly older, up to ~400 Myr. In this case a red J-Ks color may be more diagnostic of moderate youth than individual spectral features. However, it is also possible that these objects are relatively metal-rich, and thus have enhanced atmospheric dust content.

  18. Navy-Wide Personnel Survey (NPS) 1997: Statistical Tables for Officers and Enlisted Personnel

    Science.gov (United States)

    1998-02-01

    to take part in this survey. Your participation is voluntary. Please take the time to give careful, frank answers. It should take about forty...Filipino; Pacific Islander (Guamanian, Samoan, etc.) 10. Did you receive premarital counseling? Eskimo/Aleut; None of the above 11. If yes...schedule; Availability of health care and education services for special needs

  19. Ockham's razor and Bayesian analysis. [statistical theory for systems evaluation

    Science.gov (United States)

    Jefferys, William H.; Berger, James O.

    1992-01-01

    'Ockham's razor', the ad hoc principle enjoining the greatest possible simplicity in theoretical explanations, is presently shown to be justifiable as a consequence of Bayesian inference; Bayesian analysis can, moreover, clarify the nature of the 'simplest' hypothesis consistent with the given data. By choosing the prior probabilities of hypotheses, it becomes possible to quantify the scientific judgment that simpler hypotheses are more likely to be correct. Bayesian analysis also shows that a hypothesis with fewer adjustable parameters intrinsically possesses an enhanced posterior probability, due to the clarity of its predictions.
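
    The claim that a hypothesis with fewer adjustable parameters gains posterior weight can be illustrated with a small worked example. The sketch below is not taken from the paper: it assumes a hypothetical coin-tossing experiment, compares a fair-coin hypothesis (no free parameter) with an unknown-bias hypothesis (uniform prior on the bias), and the function name is made up for illustration.

```python
from math import comb

def bayes_factor_fair_vs_unknown(heads, tosses):
    """Bayes factor B01 = P(data | fair coin) / P(data | unknown bias)."""
    # H0: the coin is fair, so there is no adjustable parameter.
    m0 = comb(tosses, heads) * 0.5 ** tosses
    # H1: bias p is unknown with a uniform prior on [0, 1].
    # Integrating the binomial likelihood over p gives 1 / (tosses + 1).
    m1 = 1.0 / (tosses + 1)
    return m0 / m1

# 5 heads in 10 tosses: both hypotheses fit, the simpler one is favored.
print(bayes_factor_fair_vs_unknown(5, 10))  # about 2.71
# 9 heads in 10 tosses: the data now favor the flexible hypothesis.
print(bayes_factor_fair_vs_unknown(9, 10))  # about 0.11
```

    The flexible hypothesis spreads its predictions over every possible number of heads, so it is penalized whenever the sharper prediction of the simpler hypothesis turns out to be adequate.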

  20. Statistical associations between housing quality and health among Finnish households with children - Results from two (repeated) national surveys.

    Science.gov (United States)

    Turunen, Mari; Iso-Markku, Kati; Pekkonen, Maria; Haverinen-Shaughnessy, Ulla

    2017-01-01

    The overall aim of the study was to assess housing and health issues related to Finnish housing stock and possible changes occurring in the course of time. Based on two housing and health questionnaire surveys, first one in 2007 and the second one in 2011, we examined factors associated with housing satisfaction and health symptoms that residents themselves reported on a general population level. A special emphasis was on housing quality and health issues among households with children. The total number of survey responses was 2674, response rate being slightly lower in the 2011 (29%) survey than in 2007 (43%). Differences in housing and health issues observed between 2007 and 2011 surveys were relatively small. From the various housing factors studied, largest differences between surveys were seen in thermal comfort during summer, which could be attributed to climate factors. From the five health outcome variables studied, only self-reported upper respiratory symptoms appeared to have significant temporal variation between the surveys. Overall, issues related to crowding, inaccessibility, use of chemicals, indoor air quality (e.g. ventilation adequacy), and dampness and mold could cause more unsatisfactory housing conditions among the families with children. Respondents who had children reported respiratory symptoms less commonly, whereas risk for respiratory infections was increased among these respondents. Modeling self-reported health symptoms led to selection of nine to twelve statistically significant housing variables together with up to five socio-economic variables, i.e. complex models which are difficult to interpret quantitatively. The models' sensitivity for properly indicating symptoms was rather low, varying from 4% to 22%, which illustrates that it is quite impossible to predict individuals' symptoms with a set of housing characteristics. 
However, the associations observed on the population level may be used to develop policies that are protective of

  1. Spatial and Statistical Analysis of Leptospirosis in Guilan Province, Iran

    Science.gov (United States)

    Nia, A. Mohammadi; Alimohammadi, A.; Habibi, R.; Shirzadi, M. R.

    2015-12-01

    Leptospirosis is the most underdiagnosed water-borne bacterial zoonosis in the world and especially affects tropical and humid regions. According to the World Health Organization (WHO), the number of human cases is not known precisely. Available reports show that worldwide incidence varies from 0.1-1 per 100 000 per year in temperate climates to 10-100 per 100 000 in the humid tropics. Pathogenic bacteria spread through the urine of rats are the main cause of water and soil infection. Rice field farmers, who are in contact with infected water or soil, bear the greatest burden of leptospirosis. In recent years, this zoonotic disease has been endemic in the north of Iran. Guilan, the second-largest rice-producing province (average = 750 000 000 kg, 40% of the country's production) after Mazandaran, has one of the largest rural populations (male = 487 679, female = 496 022) and numbers of rice workers (47 621 insured workers) among Iran's provinces. The main objectives of this study were to analyse the yearly spatial distribution and the possible spatial clusters of leptospirosis to better understand its epidemiology in the province. The survey was performed during 2009-2013 at the rural district level throughout the study area. Global clustering methods, including the average nearest neighbour distance, Moran's I and General G indices, were used to investigate the annual spatial distribution of the disease. Finally, significant spatial clusters were detected with the objective of informing priority areas for public health planning and resource allocation.
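
    For readers unfamiliar with the global Moran's I index named in this abstract, the following is a minimal sketch of the standard formula, I = (n / sum of weights) * (weighted cross-product of deviations / sum of squared deviations). The district values and the chain-shaped neighbour matrix are hypothetical and are not data from the study.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    w_sum = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    ss = sum(d * d for d in dev)
    return (n / w_sum) * (cross / ss)

# Hypothetical example: four districts in a row, each adjacent to the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], chain))  # positive: similar values are neighbours
print(morans_i([1, 4, 1, 4], chain))  # negative: neighbours are dissimilar
```

    Values near zero suggest no spatial autocorrelation; judging significance requires a permutation or normal-approximation test, which this sketch leaves out.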

  2. SEDA: A software package for the Statistical Earthquake Data Analysis

    Science.gov (United States)

    Lombardi, A. M.

    2017-03-01

    In this paper, the first version of the software SEDA (SEDAv1.0), designed to help seismologists statistically analyze earthquake data, is presented. The package consists of a user-friendly Matlab-based interface, which allows the user to easily interact with the application, and a computational core of Fortran codes, to guarantee the maximum speed. The primary aim driving the development of SEDA is to guarantee research reproducibility, which is a growing movement among scientists and is highly recommended by the most important scientific journals. SEDAv1.0 is mainly devoted to producing accurate and fast outputs. Less care has been taken over graphic appeal, which will be improved in the future. The main part of SEDAv1.0 is devoted to ETAS modeling. SEDAv1.0 contains a set of consistent tools for ETAS, allowing the estimation of parameters, the testing of the model on data, the simulation of catalogs, the identification of sequences and the calculation of forecasts. The peculiarities of the routines inside SEDAv1.0 are discussed in this paper. More specific details on the software are presented in the manual accompanying the program package.

  3. Statistical language analysis for automatic exfiltration event detection.

    Energy Technology Data Exchange (ETDEWEB)

    Robinson, David Gerald

    2010-04-01

    This paper discusses the recent development of a statistical approach for the automatic identification of anomalous network activity that is characteristic of exfiltration events. This approach is based on the language processing method referred to as latent Dirichlet allocation (LDA). Cyber security experts currently depend heavily on a rule-based framework for initial detection of suspect network events. The application of the rule set typically results in an extensive list of suspect network events that are then further explored manually for suspicious activity. The ability to identify anomalous network events is heavily dependent on the experience of the security personnel wading through the network log. Limitations of this approach are clear: rule-based systems only apply to exfiltration behavior that has previously been observed, and experienced cyber security personnel are rare commodities. Since the new methodology is not a discrete rule-based approach, it is more difficult for an insider to disguise the exfiltration events. A further benefit is that the methodology provides a risk-based approach that can be implemented in a continuous, dynamic or evolutionary fashion. This permits suspect network activity to be identified early with a quantifiable risk associated with decision making when responding to suspicious activity.

  4. Mapping indoor radon-222 in Denmark: Design and test of the statistical model used in the second nationwide survey

    DEFF Research Database (Denmark)

    Andersen, C.E.; Ulbak, K.; Damkjær, A.

    2001-01-01

    is to describe the design of this model, and to report results of model tests. The model is based on a transformation of the data to normality and on analytical (conditionally) unbiased estimators of the quantities of interest. Bayesian statistics are used to minimize the effect of small sample size. In each...... important outcome of the survey is the prediction of the fraction of houses in each municipality with an annual average radon concentration above 200 Bq m(-3). To obtain the most accurate estimate and to assess the associated uncertainties, a statistical model has been developed. The purpose of this paper......%-confidence interval = [3.4,4.5]) is consistent with the weighted sum of the observations for Denmark taken as a whole (4.6% with 95%-confidence interval = [3.8,5.6]). The total number of single-family houses within each municipality is used as weight. Model estimates are also found to be consistent...

  5. Statistics Education Research in Malaysia and the Philippines: A Comparative Analysis

    Science.gov (United States)

    Reston, Enriqueta; Krishnan, Saras; Idris, Noraini

    2014-01-01

    This paper presents a comparative analysis of statistics education research in Malaysia and the Philippines by modes of dissemination, research areas, and trends. An electronic search for published research papers in the area of statistics education from 2000-2012 yielded 20 for Malaysia and 19 for the Philippines. Analysis of these papers showed…

  6. Spatial statistical analysis of dissatisfaction with the performance of ...

    African Journals Online (AJOL)

    The analysis reveals spatial clustering in the level of dissatisfaction with the performance of local government. It also reveals percentage of respondents dissatisfied with dwelling, mean sense of safety index, and percentage agree the country is going in the wrong direction, as significant predictors of the level of local ...

  7. The statistical analysis of results of solidification of fly ash

    Directory of Open Access Journals (Sweden)

    Pliešovská Natália

    1996-09-01

    Full Text Available The analysis shows that there is no statistical dependence between the contents of heavy metals in fly ash on one side, and the leaching characteristics of heavy metals from the stabilized waste and from the waste itself on the other side.

  8. Statistical Analysis of Hit/Miss Data (Preprint)

    Science.gov (United States)

    2012-07-01

    HDBK-1823A, 2009). Other agencies and industries have also made use of this guidance (Gandossi et al., 2010) and (Drury et al., 2006). It should...2002. Drury, Ghylin, and Holness, Error Analysis and Threat Magnitude for Carry-on Bag Inspection, Proceedings of the Human Factors and Ergonomic

  9. Statistical Analysis Of Trace Element Concentrations In Shale ...

    African Journals Online (AJOL)

    Principal component and regression analysis of geochemical data in sampled shale-carbonate sediments in Guyuk, Northeastern Nigeria reveals enrichments of four predictor elements, Ni, Co, Cr and Cu, to gypsum mineralisation. Ratios of their enrichments are Cu (10:1), Ni (8:1), Co (58:1) and Cr (30:1). The >70% ...

  10. Open Access Publishing Trend Analysis: Statistics beyond the Perception

    Science.gov (United States)

    Poltronieri, Elisabetta; Bravo, Elena; Curti, Moreno; Ferri, Maurizio; Mancini, Cristina

    2016-01-01

    Introduction: The purpose of this analysis was twofold: to track the number of open access journals acquiring impact factor, and to investigate the distribution of subject categories pertaining to these journals. As a case study, journals in which the researchers of the National Institute of Health (Istituto Superiore di Sanità) in Italy have…

  11. Multivariate statistical analysis of a multi-step industrial processes

    DEFF Research Database (Denmark)

    Reinikainen, S.P.; Høskuldsson, Agnar

    2007-01-01

    multivariate multi-step processes, where results from each step are used to evaluate future results, is presented. The methods presented are based on Priority PLS Regression. The basic idea is to compute the weights in the regression analysis for given steps, but adjust all data by the resulting score vectors...

  12. A statistical inference method for the stochastic reachability analysis

    NARCIS (Netherlands)

    Bujorianu, L.M.

    2005-01-01

    Many control systems have large, infinite state space that can not be easily abstracted. One method to analyse and verify these systems is reachability analysis. It is frequently used for air traffic control and power plants. Because of lack of complete information about the environment or

  13. Statistical analysis of geodetic networks for detecting regional events

    Science.gov (United States)

    Granat, Robert

    2004-01-01

    We present an application of hidden Markov models (HMMs) to analysis of geodetic time series in Southern California. Our model fitting method uses a regularized version of the deterministic annealing expectation-maximization algorithm to ensure that model solutions are both robust and of high quality.

  14. Sealed-Bid Auction of Dutch Mussels : Statistical Analysis

    NARCIS (Netherlands)

    Kleijnen, J.P.C.; van Schaik, F.D.J.

    2007-01-01

    This article presents an econometric analysis of the many data on the sealed-bid auction that sells mussels in Yerseke town, the Netherlands. The goals of this analysis are obtaining insight into the important factors that determine the price of these mussels, and quantifying the performance of an

  15. A comparative assessment of statistical methods for extreme weather analysis

    Science.gov (United States)

    Schlögl, Matthias; Laaha, Gregor

    2017-04-01

    Extreme weather exposure assessment is of major importance for scientists and practitioners alike. We compare different extreme value approaches and fitting methods with respect to their value for assessing extreme precipitation and temperature impacts. Based on an Austrian data set from 25 meteorological stations representing diverse meteorological conditions, we assess the added value of partial duration series over the standardly used annual maxima series in order to give recommendations for performing extreme value statistics of meteorological hazards. Results show the merits of the robust L-moment estimation, which yielded better results than maximum likelihood estimation in 62 % of all cases. At the same time, results question the general assumption of the threshold excess approach (employing partial duration series, PDS) being superior to the block maxima approach (employing annual maxima series, AMS) due to information gain. For low return periods (non-extreme events) the PDS approach tends to overestimate return levels as compared to the AMS approach, whereas an opposite behavior was found for high return levels (extreme events). In extreme cases, an inappropriate threshold was shown to lead to considerable biases that may outperform the possible gain of information from including additional extreme events by far. This effect was neither visible from the square-root criterion, nor from standardly used graphical diagnosis (mean residual life plot), but from a direct comparison of AMS and PDS in synoptic quantile plots. We therefore recommend performing AMS and PDS approaches simultaneously in order to select the best suited approach. This will make the analyses more robust, in cases where threshold selection and dependency introduces biases to the PDS approach, but also in cases where the AMS contains non-extreme events that may introduce similar biases. 
For assessing the performance of extreme events we recommend conditional performance measures that focus
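
    The two ingredients named in this abstract, annual maxima series (block maxima) and L-moment estimation, can be sketched in a few lines. This is only an illustrative outline with hypothetical function names and made-up input, not the authors' code: it extracts one maximum per year and computes the first two sample L-moments from probability-weighted moments. Fitting an extreme value distribution would additionally need the higher L-moment ratios.

```python
def annual_maxima(records):
    """Block maxima: the largest value per year from (year, value) pairs."""
    maxima = {}
    for year, value in records:
        if year not in maxima or value > maxima[year]:
            maxima[year] = value
    return [maxima[year] for year in sorted(maxima)]

def first_two_l_moments(sample):
    """Sample L-location (l1) and L-scale (l2); needs at least two values."""
    x = sorted(sample)
    n = len(x)
    b0 = sum(x) / n                                        # mean
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))   # rank-weighted mean
    return b0, 2 * b1 - b0

# Hypothetical daily precipitation records as (year, mm) pairs.
records = [(2000, 12.0), (2000, 41.5), (2001, 33.0), (2001, 8.2), (2002, 27.4)]
ams = annual_maxima(records)
print(ams)
print(first_two_l_moments(ams))
```

    Because L-moments are linear combinations of the ordered observations, a single outlier influences them less than it influences the conventional variance, which is the usual argument for their robustness.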

  16. Detailed statistical analysis plan for the pulmonary protection trial.

    Science.gov (United States)

    Buggeskov, Katrine B; Jakobsen, Janus C; Secher, Niels H; Jonassen, Thomas; Andersen, Lars W; Steinbrüchel, Daniel A; Wetterslev, Jørn

    2014-12-23

    Pulmonary dysfunction complicates cardiac surgery that includes cardiopulmonary bypass. The pulmonary protection trial evaluates effect of pulmonary perfusion on pulmonary function in patients suffering from chronic obstructive pulmonary disease. This paper presents the statistical plan for the main publication to avoid risk of outcome reporting bias, selective reporting, and data-driven results as an update to the published design and method for the trial. The pulmonary protection trial is a randomized, parallel group clinical trial that assesses the effect of pulmonary perfusion with oxygenated blood or Custodiol™ HTK (histidine-tryptophan-ketoglutarate) solution versus no pulmonary perfusion in 90 chronic obstructive pulmonary disease patients. Patients, the statistician, and the conclusion drawers are blinded to intervention allocation. The primary outcome is the oxygenation index from 10 to 15 minutes after the end of cardiopulmonary bypass until 24 hours thereafter. Secondary outcome measures are oral tracheal intubation time, days alive outside the intensive care unit, days alive outside the hospital, and 30- and 90-day mortality, and one or more of the following selected serious adverse events: pneumothorax or pleural effusion requiring drainage, major bleeding, reoperation, severe infection, cerebral event, hyperkaliemia, acute myocardial infarction, cardiac arrhythmia, renal replacement therapy, and readmission for a respiratory-related problem. The pulmonary protection trial investigates the effect of pulmonary perfusion during cardiopulmonary bypass in chronic obstructive pulmonary disease patients. A preserved oxygenation index following pulmonary perfusion may indicate an effect and inspire to a multicenter confirmatory trial to assess a more clinically relevant outcome. ClinicalTrials.gov identifier: NCT01614951, registered on 6 June 2012.

  17. Statistical Analysis of the Grid Connected Photovoltaic System Performance Ratio

    Directory of Open Access Journals (Sweden)

    Javier Vilariño-García

    2017-05-01

    Full Text Available A methodology is presented based on applying analysis of variance and Tukey's method to a data set of solar radiation in the plane of the photovoltaic modules and the corresponding values of power delivered to the grid, recorded at 10-minute intervals from sunrise to sunset during the 52 weeks of 2013. These data were obtained through a monitoring system at a photovoltaic plant of 10 MW rated power located in Cordoba, consisting of 16 transformers and 98 inverters. The mean performance ratios of the processing centers are compared: an analysis of variance detects, at a significance level of 5%, whether at least one mean differs significantly from the rest, and Tukey's test then identifies which processing center or centers are below average because of a fault that needs to be detected and corrected.
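
    The first step of the procedure described here, a one-way analysis of variance across processing centers, can be sketched as follows. The performance-ratio values are invented for illustration and the function name is hypothetical; the sketch computes only the F statistic and leaves out both the comparison with the critical value at the 5% level and the Tukey follow-up test.

```python
def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical performance ratios for three processing centers.
centers = [
    [0.80, 0.82, 0.81],
    [0.79, 0.81, 0.80],
    [0.70, 0.72, 0.71],  # noticeably lower than the other two
]
print(one_way_anova_f(centers))  # about 91
```

    A large F value indicates that at least one center's mean differs from the others; a pairwise procedure such as Tukey's test is then needed to say which one.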

  18. Methods of learning in statistical education: Design and analysis of a randomized trial

    Science.gov (United States)

    Boyd, Felicity Turner

    Background. Recent psychological and technological advances suggest that active learning may enhance understanding and retention of statistical principles. A randomized trial was designed to evaluate the addition of innovative instructional methods within didactic biostatistics courses for public health professionals. Aims. The primary objectives were to evaluate and compare the addition of two active learning methods (cooperative and internet) on students' performance; assess their impact on performance after adjusting for differences in students' learning style; and examine the influence of learning style on trial participation. Methods. Consenting students enrolled in a graduate introductory biostatistics course were randomized to cooperative learning, internet learning, or control after completing a pretest survey. The cooperative learning group participated in eight small group active learning sessions on key statistical concepts, while the internet learning group accessed interactive mini-applications on the same concepts. Controls received no intervention. Students completed evaluations after each session and a post-test survey. Study outcome was performance quantified by examination scores. Intervention effects were analyzed by generalized linear models using intent-to-treat analysis and marginal structural models accounting for reported participation. Results. Of 376 enrolled students, 265 (70%) consented to randomization; 69, 100, and 96 students were randomized to the cooperative, internet, and control groups, respectively. Intent-to-treat analysis showed no differences between study groups; however, 51% of students in the intervention groups had dropped out after the second session. After accounting for reported participation, expected examination scores were 2.6 points higher (of 100 points) after completing one cooperative learning session (95% CI: 0.3, 4.9) and 2.4 points higher after one internet learning session (95% CI: 0.0, 4.7), versus

  19. Determinants of ICT Infrastructure: A Cross-Country Statistical Analysis

    OpenAIRE

    Jens J. Krüger; Rhiel, Mathias

    2016-01-01

    We investigate economic and institutional determinants of ICT infrastructure for a broad cross section of more than 100 countries. The ICT variable is constructed from a principal components analysis. The explanatory variables are selected by variants of the Lasso estimator from the machine learning literature. In addition to least squares, we also apply robust and semiparametric regression estimators. The results show that the regressions are able to explain ICT infrastructure very well. Maj...

  20. Practical guidance for statistical analysis of operational event data

    Energy Technology Data Exchange (ETDEWEB)

    Atwood, C.L.

    1995-10-01

    This report presents ways to avoid mistakes that are sometimes made in analysis of operational event data. It then gives guidance on what to do when a model is rejected, a list of standard types of models to consider, and principles for choosing one model over another. For estimating reliability, it gives advice on which failure modes to model, and moment formulas for combinations of failure modes. The issues are illustrated with many examples and case studies.

  1. Statistical analysis of joint toxicity in biological growth experiments

    DEFF Research Database (Denmark)

    Spliid, Henrik; Tørslev, J.

    1994-01-01

    The authors formulate a model for the analysis of designed biological growth experiments where a mixture of toxicants is applied to biological target organisms. The purpose of such experiments is to assess the toxicity of the mixture in comparison with the toxicity observed when the toxicants are...... is applied on data from an experiment where inhibition of the growth of the bacteria Pseudomonas fluorescens caused by different mixtures of pentachlorophenol and aniline was studied....

  2. Analysis of tensile bond strengths using Weibull statistics.

    Science.gov (United States)

    Burrow, Michael F; Thomas, David; Swain, Mike V; Tyas, Martin J

    2004-09-01

    Tensile strength tests of restorative resins bonded to dentin, and the resultant strengths of interfaces between the two, exhibit wide variability. Many variables can affect test results, including specimen preparation and storage, test rig design and experimental technique. However, the more fundamental source of variability, that associated with the brittle nature of the materials, has received little attention. This paper analyzes results from micro-tensile tests on unfilled resins and adhesive bonds between restorative resin composite and dentin in terms of reliability using the Weibull probability of failure method. Results for the tensile strengths of Scotchbond Multipurpose Adhesive (3M) and Clearfil LB Bond (Kuraray) bonding resins showed Weibull moduli (m) of 6.17 (95% confidence interval, 5.25-7.19) and 5.01 (95% confidence interval, 4.23-5.8). Analysis of results for micro-tensile tests on bond strengths to dentin gave moduli between 1.81 (Clearfil Liner Bond 2V) and 4.99 (Gluma One Bond, Kulzer). Material systems with m in this range do not have a well-defined strength. The Weibull approach also enables the size dependence of the strength to be estimated. An example where the bonding area was changed from 3.1 to 1.1 mm diameter is shown. Weibull analysis provides a method for determining the reliability of strength measurements in the analysis of data from bond strength and tensile tests on dental restorative materials.
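
    The Weibull modulus m discussed above is commonly estimated by linearizing the failure probability, P = 1 - exp(-(s/s0)^m), and fitting a straight line to ln(-ln(1 - P)) against ln(s). The sketch below is an illustration under stated assumptions, not the authors' procedure: the probability estimator (i - 0.5)/n, the function names, and the area-scaling helper (the usual weakest-link relation) are all choices made here for the example.

```python
import math

def weibull_fit(strengths):
    """Least-squares estimate of (m, s0) from a list of failure strengths."""
    data = sorted(strengths)
    n = len(data)
    x = [math.log(s) for s in data]
    # Assumed probability estimator for the i-th ranked specimen: (i - 0.5) / n
    y = [math.log(-math.log(1.0 - (i + 0.5) / n)) for i in range(n)]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    m = sxy / sxx                      # slope = Weibull modulus
    intercept = y_bar - m * x_bar      # intercept = -m * ln(s0)
    s0 = math.exp(-intercept / m)      # characteristic strength
    return m, s0

def strength_ratio(d_large, d_small, m):
    """Expected strength gain when the bonded diameter shrinks.

    Assumes weakest-link scaling with bonded area: s_small / s_large = (A_large / A_small)^(1/m).
    """
    area_ratio = (d_large / d_small) ** 2
    return area_ratio ** (1.0 / m)

# Diameters taken from the abstract; m = 4.99 is the highest dentin-bond modulus it reports
print(strength_ratio(3.1, 1.1, 4.99))
```

    The lower the modulus, the wider the scatter in strength and the stronger the predicted dependence on specimen size.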

  3. The statistical analysis of single-subject data: a comparative examination.

    Science.gov (United States)

    Nourbakhsh, M R; Ottenbacher, K J

    1994-08-01

    The purposes of this study were to examine whether the use of three different statistical methods for analyzing single-subject data led to similar results and to identify components of graphed data that influence agreement (or disagreement) among the statistical procedures. Forty-two graphs containing single-subject data were examined. Twenty-one were AB charts of hypothetical data. The other 21 graphs appeared in Journal of Applied Behavioral Analysis, Physical Therapy, Journal of the Association for Persons With Severe Handicaps, and Journal of Behavior Therapy and Experimental Psychiatry. Three different statistical tests--the C statistic, the two-standard deviation band method, and the split-middle method of trend estimation--were used to analyze the 42 graphs. A relatively low degree of agreement (38%) was found among the three statistical tests. The highest rate of agreement for any two statistical procedures (71%) was found for the two-standard deviation band method and the C statistic. A logistic regression analysis revealed that overlap in single-subject graphed data was the best predictor of disagreement among the three statistical tests (beta = .49, P < .03). The results indicate that interpretation of data from single-subject research designs is directly influenced by the method of data analysis selected. Variation exists across both visual and statistical methods of data reduction. The advantages and disadvantages of statistical and visual analysis are described.
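
    Two of the three procedures named above are simple enough to sketch directly. The code below follows the textbook descriptions of the two-standard deviation band method and the C statistic; it is an assumption-laden illustration (function names, the "two consecutive points" rule, and the sample data are chosen here), not the analysis code used in the study.

```python
import math
from statistics import mean, stdev

def two_sd_band(baseline, treatment):
    """True if at least two consecutive treatment points fall outside
    the band mean(baseline) +/- 2 * SD(baseline)."""
    centre = mean(baseline)
    spread = stdev(baseline)
    low, high = centre - 2 * spread, centre + 2 * spread
    run = 0
    for value in treatment:
        if value < low or value > high:
            run += 1
            if run >= 2:
                return True
        else:
            run = 0
    return False

def c_statistic(series):
    """Return (C, z) for a time series.

    C = 1 - sum((x_i - x_{i+1})^2) / (2 * sum((x_i - mean)^2)),
    z = C / sqrt((n - 2) / ((n - 1) * (n + 1))).
    """
    n = len(series)
    centre = mean(series)
    successive = sum((series[i] - series[i + 1]) ** 2 for i in range(n - 1))
    total = 2 * sum((x - centre) ** 2 for x in series)
    c = 1 - successive / total
    se = math.sqrt((n - 2) / ((n - 1) * (n + 1)))
    return c, c / se

# Hypothetical AB data: five baseline points, then three treatment points
print(two_sd_band([10, 11, 9, 10, 10], [12, 13, 12]))
print(c_statistic([1, 2, 3, 4, 5, 6, 7, 8]))
```

    Because the two methods respond to different features of the data (level shift versus serial trend), it is unsurprising that they can disagree on the same graph.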

  4. Meta-analysis and The Cochrane Collaboration: 20 years of the Cochrane Statistical Methods Group.

    Science.gov (United States)

    McKenzie, Joanne E; Salanti, Georgia; Lewis, Steff C; Altman, Douglas G

    2013-11-26

    The Statistical Methods Group has played a pivotal role in The Cochrane Collaboration over the past 20 years. The Statistical Methods Group has determined the direction of statistical methods used within Cochrane reviews, developed guidance for these methods, provided training, and continued to discuss and consider new and controversial issues in meta-analysis. The contribution of Statistical Methods Group members to the meta-analysis literature has been extensive and has helped to shape the wider meta-analysis landscape. In this paper, marking the 20th anniversary of The Cochrane Collaboration, we reflect on the history of the Statistical Methods Group, beginning in 1993 with the identification of aspects of statistical synthesis for which consensus was lacking about the best approach. We highlight some landmark methodological developments that Statistical Methods Group members have contributed to in the field of meta-analysis. We discuss how the Group implements and disseminates statistical methods within The Cochrane Collaboration. Finally, we consider the importance of robust statistical methodology for Cochrane systematic reviews, note research gaps, and reflect on the challenges that the Statistical Methods Group faces in its future direction.

  5. Statistical analysis of the ambiguities in the asteroid period determinations

    Science.gov (United States)

    Butkiewicz, M.; Kwiatkowski, T.; Bartczak, P.; Dudziński, G.

    2014-07-01

    A synodic period of an asteroid can be derived from its lightcurve by standard methods like Fourier-series fitting. A problem appears when results of observations are based on less than a full coverage of a lightcurve and/or high level of noise. Also, long gaps between individual lightcurves create an ambiguity in the cycle count which leads to aliases. Excluding binary systems and objects with non-principal-axis rotation, the rotation period is usually identical to the period of the second Fourier harmonic of the lightcurve. There are cases, however, where it may be connected with the 1st, 3rd, or 4th harmonic and it is difficult to choose among them when searching for the period. To help remove such uncertainties we analysed asteroid lightcurves for a range of shapes and observing/illuminating geometries. We simulated them using a modified internal code from the ISAM service (Marciniak et al. 2012, A&A 545, A131). In our computations, shapes of asteroids were modeled as Gaussian random spheres (Muinonen 1998, A&A, 332, 1087). A combination of Lommel-Seeliger and Lambert scattering laws was assumed. For each of the 100 shapes, we randomly selected 1000 positions of the spin axis, systematically changing the solar phase angle with a step of 5°. For each lightcurve, we determined its peak-to-peak amplitude, fitted the 6th-order Fourier series and derived the amplitudes of its harmonics. Instead of the number of the lightcurve extrema, which in many cases is subjective, we characterized each lightcurve by the order of the highest-amplitude Fourier harmonic. The goal of our simulations was to derive statistically significant conclusions (based on the underlying assumptions) about the dominance of different harmonics in the lightcurves of the specified amplitude and phase angle. The results, presented in the Figure, can be used in individual cases to estimate the probability that the obtained lightcurve is dominated by a specified Fourier harmonic. Some of the
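
    Characterizing a lightcurve by its highest-amplitude Fourier harmonic can be sketched as below. The example assumes a lightcurve sampled at evenly spaced rotation phases over exactly one full rotation, in which case projecting onto sines and cosines gives the same coefficients as a least-squares fit; the function names and the synthetic lightcurve are hypothetical and this is not the ISAM-based code used by the authors.

```python
import math

def harmonic_amplitudes(mags, order=6):
    """Amplitudes of harmonics 1..order for evenly phased samples
    covering one full rotation (requires len(mags) > 2 * order)."""
    n = len(mags)
    amps = []
    for k in range(1, order + 1):
        a = 2.0 / n * sum(y * math.cos(2 * math.pi * k * i / n) for i, y in enumerate(mags))
        b = 2.0 / n * sum(y * math.sin(2 * math.pi * k * i / n) for i, y in enumerate(mags))
        amps.append(math.hypot(a, b))
    return amps

def dominant_harmonic(mags, order=6):
    """Order (1-based) of the highest-amplitude harmonic."""
    amps = harmonic_amplitudes(mags, order)
    return 1 + max(range(len(amps)), key=lambda k: amps[k])

# Synthetic lightcurve: weak first harmonic plus a stronger second harmonic
n = 64
curve = [0.1 * math.cos(2 * math.pi * i / n) + 0.3 * math.cos(4 * math.pi * i / n + 0.4)
         for i in range(n)]
print(dominant_harmonic(curve))
```

    For unevenly sampled or incomplete lightcurves the projection shortcut no longer holds and a proper least-squares fit of the Fourier series is required.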

  6. Pharmacists' satisfaction with their work: Analysis of an alumni survey.

    Science.gov (United States)

    Gustafsson, Maria; Mattsson, Sofia; Wallman, Andy; Gallego, Gisselle

    2017-09-01

    The level of job satisfaction among practicing pharmacists is important because it has been found to affect job performance and employee turnover. The Swedish pharmacy market has undergone major changes in recent years, and little is known about pharmacists' job satisfaction. The objective of this study was to investigate the level of job satisfaction and associated factors among graduates from the web-based pharmacy programs at Umeå University. Job satisfaction of pharmacists was measured as part of an alumni survey conducted with those who graduated from the pharmacy programmes between 2006 and 2014. Data analysis included descriptive statistics, and logistic regression was used to explore factors affecting job satisfaction. The total number of graduates who completed the survey was 222 (response rate 43%). The majority of respondents were female (95%), and most were employed at a community pharmacy (85%). The mean age was 39.7 years. The majority of graduates (91%) were satisfied with their job "most of the time" or "all of the time", and 87% of the respondents would "definitely" or "maybe" choose the same career again. The multivariate analysis showed that increasing years in the current position (OR: 0.672 (0.519-0.871)) was associated with lower job satisfaction. Older age (OR: 1.123 (1.022-1.234)), the perception that the knowledge and skills acquired during university education is useful in the current job (OR: 4.643 (1.255-17.182)) and access to continuing professional development (OR: 9.472 (1.965-45.662)) were associated with higher job satisfaction. Most graduates from the web-based pharmacy programmes were satisfied with their current job. Access to continuing professional development seems to be important for the level of job satisfaction among pharmacists. Copyright © 2017 Elsevier Inc. All rights reserved.
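
    Odds ratios from a logistic regression are exponentiated coefficients, so a reported OR and its interval can be mapped back to the log-odds scale. The sketch below assumes the intervals in parentheses are 95% Wald confidence intervals, which the abstract does not state explicitly; the function name is hypothetical.

```python
import math

def log_odds_from_or(odds_ratio, lower, upper, z=1.96):
    """Recover the coefficient and its standard error from OR (lower-upper),
    assuming a Wald interval exp(beta +/- z * se)."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Years in current position: OR 0.672 (0.519-0.871), as reported above
beta, se = log_odds_from_or(0.672, 0.519, 0.871)
print(f"beta = {beta:.3f}, se = {se:.3f}")

# Under the model, the odds multiply by OR for each additional unit,
# so after three additional units they are multiplied by OR ** 3.
print(0.672 ** 3)
```

    Very wide intervals, such as the one reported for continuing professional development, correspond to large standard errors and usually reflect small cell counts.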

  7. Statistical Analysis Software for the TRS-80 Microcomputer.

    Science.gov (United States)

    1981-09-01

    [OCR residue of a BASIC program listing; the only legible fragments are the labels "Linear Regression" and "Analysis of Variance".]

  8. Ball lightning diameter-lifetime statistical analysis of SKB databank

    Science.gov (United States)

    Amirov, Anvar Kh; Bychkov, Vladimir L.

    1995-03-01

    The significance of diameter as a factor for lifetime has been examined for the different modes of Ball Lightning (BL) disappearance. Methods of non-parametric regression analysis were applied to diameter-radiation loss pairs, grouped by mode of BL disappearance. BL diameter turned out to be a significant factor for BL lifetime in the cases of explosion and decay, and insignificant in the case of extinction. The dependence of the logarithm of radiation losses on the logarithm of BL volume, obtained by non-parametric regression, turned out to differ according to the mode of BL disappearance.

  9. Assessing Statistically Significant Heavy-Metal Concentrations in Abandoned Mine Areas via Hot Spot Analysis of Portable XRF Data.

    Science.gov (United States)

    Kim, Sung-Min; Choi, Yosoon

    2017-06-18

    To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH), high content with a low z-score (HL), low content with a high z-score (LH), and low content with a low z-score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1-4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required.
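
    The Getis-Ord Gi* z-score and the four-group labelling can be sketched as follows. The statistic is written in its standard form; the binary neighbour weights, the cut-off values, the function names, and the sample data are all hypothetical choices for illustration and are not taken from the study.

```python
import math

def gi_star(values, weights_row):
    """Getis-Ord Gi* z-score for one location.

    `values` holds the attribute at every location; `weights_row` holds the
    spatial weights from the target location to every location (itself included).
    """
    n = len(values)
    x_bar = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - x_bar ** 2)
    sum_w = sum(weights_row)
    sum_w2 = sum(w * w for w in weights_row)
    numerator = sum(w * v for w, v in zip(weights_row, values)) - x_bar * sum_w
    denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
    return numerator / denominator

def classify(content, z, content_cut, z_cut):
    """Label a sample HH, HL, LH, or LL (first letter: content, second: z-score).
    The cut-offs are assumptions supplied by the caller."""
    return ("H" if content >= content_cut else "L") + ("H" if z >= z_cut else "L")

# Hypothetical transect of nine samples with a cluster of high values in the middle
values = [1, 1, 1, 10, 10, 10, 1, 1, 1]
# Binary weights: each sample is a neighbour of itself and of adjacent samples
weights = [[1 if abs(i - j) <= 1 else 0 for j in range(9)] for i in range(9)]
z_scores = [gi_star(values, weights[i]) for i in range(9)]
labels = [classify(v, z, content_cut=5, z_cut=1.96) for v, z in zip(values, z_scores)]
print(labels)
```

    In this toy transect only the centre of the cluster reaches the 1.96 cut-off; the high values at the cluster edges are labelled HL, which illustrates why such samples call for additional surveys of their surroundings.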

  10. Assessing Statistically Significant Heavy-Metal Concentrations in Abandoned Mine Areas via Hot Spot Analysis of Portable XRF Data

    Directory of Open Access Journals (Sweden)

    Sung-Min Kim

    2017-06-01

    Full Text Available To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH), high content with a low z-score (HL), low content with a high z-score (LH), and low content with a low z-score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1–4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required.

  11. Analysis of compressive fracture in rock using statistical techniques

    Energy Technology Data Exchange (ETDEWEB)

    Blair, S.C.

    1994-12-01

    Fracture of rock in compression is analyzed using a field-theory model, and the processes of crack coalescence and fracture formation and the effect of grain-scale heterogeneities on macroscopic behavior of rock are studied. The model is based on observations of fracture in laboratory compression tests, and incorporates assumptions developed using fracture mechanics analysis of rock fracture. The model represents grains as discrete sites, and uses superposition of continuum and crack-interaction stresses to create cracks at these sites. The sites are also used to introduce local heterogeneity. Clusters of cracked sites can be analyzed using percolation theory. Stress-strain curves for simulated uniaxial tests were analyzed by studying the location of cracked sites, and partitioning of strain energy for selected intervals. Results show that the model implicitly predicts both development of shear-type fracture surfaces and a strength-vs-size relation that are similar to those observed for real rocks. Results of a parameter-sensitivity analysis indicate that heterogeneity in the local stresses, attributed to the shape and loading of individual grains, has a first-order effect on strength, and that increasing local stress heterogeneity lowers compressive strength following an inverse power law. Peak strength decreased with increasing lattice size and decreasing mean site strength, and was independent of site-strength distribution. A model for rock fracture based on a nearest-neighbor algorithm for stress redistribution is also presented and used to simulate laboratory compression tests, with promising results.

  12. Statistical methods for the forensic analysis of striated tool marks

    Energy Technology Data Exchange (ETDEWEB)

    Hoeksema, Amy Beth [Iowa State Univ., Ames, IA (United States)

    2013-01-01

    In forensics, fingerprints can be used to uniquely identify suspects in a crime. Similarly, a tool mark left at a crime scene can be used to identify the tool that was used. However, the current practice of identifying matching tool marks involves visual inspection of marks by forensic experts which can be a very subjective process. As a result, declared matches are often successfully challenged in court, so law enforcement agencies are particularly interested in encouraging research in more objective approaches. Our analysis is based on comparisons of profilometry data, essentially depth contours of a tool mark surface taken along a linear path. In current practice, for stronger support of a match or non-match, multiple marks are made in the lab under the same conditions by the suspect tool. We propose the use of a likelihood ratio test to analyze the difference between a sample of comparisons of lab tool marks to a field tool mark, against a sample of comparisons of two lab tool marks. Chumbley et al. (2010) point out that the angle of incidence between the tool and the marked surface can have a substantial impact on the tool mark and on the effectiveness of both manual and algorithmic matching procedures. To better address this problem, we describe how the analysis can be enhanced to model the effect of tool angle and allow for angle estimation for a tool mark left at a crime scene. With sufficient development, such methods may lead to more defensible forensic analyses.

  13. Limitations of Using Microsoft Excel Version 2016 (MS Excel 2016) for Statistical Analysis for Medical Research.

    Science.gov (United States)

    Tanavalee, Chotetawan; Luksanapruksa, Panya; Singhatanadgige, Weerasak

    2016-06-01

    Microsoft Excel (MS Excel) is a commonly used program for data collection and statistical analysis in biomedical research. However, this program has many limitations, including fewer functions that can be used for analysis and a limited number of total cells compared with dedicated statistical programs. MS Excel cannot complete analyses with blank cells, and cells must be selected manually for analysis. In addition, it requires multiple steps of data transformation and formulas to plot survival analysis graphs, among others. The Megastat add-on program, which will be supported by MS Excel 2016 soon, would eliminate some limitations of using statistical formulas within MS Excel.

  14. Fourier Transform Raman and Statistical Analysis of Thermally Altered Samples of Amber.

    Science.gov (United States)

    Badea, Georgiana I; Caggiani, Maria C; Colomban, Philippe; Mangone, Annarosa; Teodor, Eugenia D; Teodor, Eugen S; Radu, Gabriel L

    2015-12-01

    We report the experimental results that refer to a Fourier transform Raman (FT-Raman) survey of thermally altered Baltic and Romanian amber and the related statistical interpretation of data using principal component analysis (PCA). Although FT-Raman spectra show several small changes in the characteristic features of the investigated amber samples which may be used for discrimination, their visual recognition is relatively difficult, especially when interpreting data from archeological samples, and thus multivariate data analysis may be the solution to more accurately assign the geological origin based on overall characteristic spectral features. The two categories of amber have different behavior in terms of degradation during the experimental alteration, and Romanian amber is more susceptible to physico-chemical transformations by the aggressive environment when compared with Baltic amber. The obtained data were in accordance with the Fourier transform infrared (FT-IR) remarks published previously in a dedicated journal. The Raman technique is an alternative method that requires little to no sample preparation, water does not cause interference, and the spectra can be collected from a small volume (1-50 μm in diameter).

  15. Instrument and Survey Analysis Technical Report: Program Implementation Survey. Technical Report #1112

    Science.gov (United States)

    Alonzo, Julie; Tindal, Gerald

    2011-01-01

    This technical document provides guidance to educators on the creation and interpretation of survey instruments, particularly as they relate to an analysis of program implementation. Illustrative examples are drawn from a survey of educators related to the use of the easyCBM learning system. This document includes specific sections on…

  16. Comparability of mixed IC₅₀ data - a statistical analysis.

    Directory of Open Access Journals (Sweden)

    Tuomo Kalliokoski

    Full Text Available The biochemical half maximal inhibitory concentration (IC50) is the most commonly used metric for on-target activity in lead optimization. It is used to guide lead optimization and to build large-scale chemogenomics analyses and off-target activity and toxicity models based on public data. However, the use of public biochemical IC50 data is problematic, because they are assay specific and comparable only under certain conditions. For large-scale analysis it is not feasible to check each data entry manually, and it is very tempting to mix all available IC50 values from public databases even if assay information is not reported. As previously reported for Ki database analysis, we first analyzed the types of errors, the redundancy and the variability that can be found in the ChEMBL IC50 database. To assess the variability of IC50 data independently measured in two different labs, ChEMBL was searched for cases with at least ten IC50 values for identical protein-ligand systems against the same target. As an insufficient number of cases of this type is available, the variability of IC50 data was assessed by comparing all pairs of independent IC50 measurements on identical protein-ligand systems. The standard deviation of IC50 data is only 25% larger than the standard deviation of Ki data, suggesting that mixing IC50 data from different assays, even without knowing the details of the assay conditions, adds only a moderate amount of noise to the overall data. The standard deviation of public ChEMBL IC50 data was, as expected, greater than the standard deviation of in-house intra-laboratory/inter-day IC50 data. Augmenting mixed public IC50 data with public Ki data does not deteriorate the quality of the mixed IC50 data, if the Ki is corrected by an offset. For a broad dataset such as the ChEMBL database, a Ki-IC50 conversion factor of 2 was found to be the most reasonable.
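
    On a logarithmic scale a multiplicative conversion factor becomes a constant offset. The sketch below assumes the factor is applied as IC50 ≈ 2 × Ki (the direction is an assumption here, chosen because IC50 is never smaller than Ki under the Cheng-Prusoff relation for competitive inhibition); the function names are hypothetical.

```python
import math

def pki_to_pic50(pki, factor=2.0):
    """Convert pKi to an approximate pIC50, assuming IC50 = factor * Ki.

    With factor = 2 the offset is log10(2), about 0.30 log units.
    """
    return pki - math.log10(factor)

def ki_to_ic50(ki_nm, factor=2.0):
    """Same conversion on the linear (nM) scale."""
    return factor * ki_nm

# A hypothetical ligand with Ki = 10 nM (pKi = 8.0)
print(ki_to_ic50(10.0))      # nM
print(pki_to_pic50(8.0))     # pIC50
```

    An offset of about 0.3 log units is small compared with typical between-assay scatter, which is consistent with the abstract's observation that adding corrected Ki values does not degrade the mixed data set.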

  17. Statistical Power Analysis with Missing Data A Structural Equation Modeling Approach

    CERN Document Server

    Davey, Adam

    2009-01-01

    Statistical power analysis has revolutionized the ways in which we conduct and evaluate research. Similar developments in the statistical analysis of incomplete (missing) data are gaining more widespread applications. This volume brings statistical power and incomplete data together under a common framework, in a way that is readily accessible to those with only an introductory familiarity with structural equation modeling. It answers many practical questions such as: how missing data affects the statistical power in a study; how much power is likely with different amounts and types

  18. The art of data analysis how to answer almost any question using basic statistics

    CERN Document Server

    Jarman, Kristin H

    2013-01-01

    A friendly and accessible approach to applying statistics in the real world. With an emphasis on critical thinking, The Art of Data Analysis: How to Answer Almost Any Question Using Basic Statistics presents fun and unique examples, guides readers through the entire data collection and analysis process, and introduces basic statistical concepts along the way. Leaving proofs and complicated mathematics behind, the author portrays the more engaging side of statistics and emphasizes its role as a problem-solving tool. In addition, light-hearted case studies

  19. First metatarsocuneiform motion: a radiographic and statistical analysis.

    Science.gov (United States)

    Fritz, G R; Prieskorn, D

    1995-03-01

    Fifty volunteers with 100 asymptomatic feet were evaluated by physical examination, radiographic analysis, and questionnaire. This investigation was used to evaluate first metatarsocuneiform motion and establish normal values at this joint. Normal first ray sagittal range of motion was 4.37 degrees (SD, +/- 3.4 degrees). The shape of the distal cuneiform was then categorized by three classification methods. Multiple independent variables were cross-referenced to determine their relationship with motion and shape at the distal cuneiform. Hyperflexibility of the thumb correlated with first ray hypermobility. No correlation was found between first ray motion and sex, age, intermetatarsal angle, side, skin stretch, hyperextension of the knee, hyperextension of the elbow, or shape of the distal cuneiform.

  20. Bayesian statistics, factor analysis, and PET images I. Mathematical background.

    Science.gov (United States)

    Phillips, P R

    1989-01-01

    The problem of image reconstruction in positron emission tomography (PET) is examined, although the approach is quite general and may have other applications. The approach is based on the maximum-likelihood method of L.A. Shepp and Y. Vardi (1982), with their assumption that the number of image pixels is greater than the number of data points. In this situation a (nonunique) solution can be written down directly, although it is not guaranteed to be positive definite. The arbitrariness in this solution can be precisely characterized by a geometric argument. A unique solution can be obtained only by introducing prior information. It is suggested that factor analysis is an efficient way to do this. In the simplest application of the method, the solution is written as the sum of two parts, r(alpha) + t(alpha), where r(alpha) is determined solely by the data and t(alpha) is determined by r(alpha) and the prior information.