WorldWideScience

Sample records for statistical genetic analysis

  1. Developments in statistical analysis in quantitative genetics

    DEFF Research Database (Denmark)

    Sorensen, Daniel

    2009-01-01

    A remarkable research impetus has taken place in statistical genetics since the last World Conference. This has been stimulated by breakthroughs in molecular genetics, automated data-recording devices and computer-intensive statistical methods. The latter were revolutionized by the bootstrap and by Markov chain Monte Carlo (McMC). In this overview a number of specific areas are chosen to illustrate the enormous flexibility that McMC has provided for fitting models and exploring features of data that were previously inaccessible. The selected areas are inferences of the trajectories over time of genetic means and variances, models for the analysis of categorical and count data, the statistical genetics of a model postulating that environmental variance is partly under genetic control, and a short discussion of models that incorporate massive genetic marker information. We provide an overview...

  2. Statistical methods in spatial genetics

    DEFF Research Database (Denmark)

    Guillot, Gilles; Leblois, Raphael; Coulon, Aurelie

    2009-01-01

    The joint analysis of spatial and genetic data is rapidly becoming the norm in population genetics. More and more studies explicitly describe and quantify the spatial organization of genetic variation and try to relate it to underlying ecological processes. As it has become increasingly difficult to keep abreast of the latest methodological developments, we review the statistical toolbox available to analyse population genetic data in a spatially explicit framework. We mostly focus on statistical concepts but also discuss practical aspects of the analytical methods, highlighting not only...

  3. Statistics for Learning Genetics

    Science.gov (United States)

    Charles, Abigail Sheena

    This study investigated the knowledge and skills that biology students may need to help them understand statistics/mathematics as it applies to genetics. The data are based on analyses of current representative genetics texts, practicing genetics professors' perspectives, and more directly, students' perceptions of, and performance in, doing statistically-based genetics problems. This issue is at the emerging edge of modern college-level genetics instruction, and this study attempts to identify key theoretical components for creating a specialized biological statistics curriculum. The goal of this curriculum will be to prepare biology students with the skills for assimilating quantitatively-based genetic processes, increasingly at the forefront of modern genetics. To fulfill this, two college-level classes at two universities were surveyed. One university was located in the northeastern US and the other in the West Indies. There was a sample size of 42 students and a supplementary interview was administered to a select 9 students. Interviews were also administered to professors in the field in order to gain insight into the teaching of statistics in genetics. Key findings indicated that students had very little to no background in statistics (55%). Although students did perform well on exams, with 60% of the population receiving an A or B grade, 77% of them did not offer good explanations on a probability question associated with the normal distribution provided in the survey. The scope and presentation of the applicable statistics/mathematics in some of the most used textbooks in genetics teaching, as well as genetics syllabi used by instructors, do not help the issue. It was found that the textbooks often either did not give effective explanations for students or completely left out certain topics. The omission of certain statistically or mathematically oriented topics was also seen in the genetics syllabi reviewed for this study. Nonetheless

  4. Gregor Mendel's Genetic Experiments: A Statistical Analysis after 150 Years

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2016-01-01

    Vol. 12, No. 2 (2016), pp. 20-26. ISSN 1801-5603. Institutional support: RVO:67985807. Keywords: genetics; history of science; biostatistics; design of experiments. Subject RIV: BB - Applied Statistics, Operational Research

  5. Integrated analysis of genetic data with R

    Directory of Open Access Journals (Sweden)

    Zhao Jing

    2006-01-01

    Genetic data are now widely available. There is, however, an apparent lack of concerted effort to produce software systems for statistical analysis of genetic data compared with other fields of statistics. It is often a tremendous task for end-users to tailor them for particular data, especially when genetic data are analysed in conjunction with a large number of covariates. Here, R (http://www.r-project.org), a free, flexible and platform-independent environment for statistical modelling and graphics, is explored as an integrated system for genetic data analysis. An overview of some packages currently available for analysis of genetic data is given. This is followed by examples of package development and practical applications. With clear advantages in data management, graphics, statistical analysis, programming, internet capability and use of available codes, it is a feasible, although challenging, task to develop it into an integrated platform for genetic analysis; this will require the joint efforts of many researchers.

  6. Linear Mixed Models in Statistical Genetics

    NARCIS (Netherlands)

    R. de Vlaming (Ronald)

    2017-01-01

    One of the goals of statistical genetics is to elucidate the genetic architecture of phenotypes (i.e., observable individual characteristics) that are affected by many genetic variants (e.g., single-nucleotide polymorphisms; SNPs). A particular aim is to identify specific SNPs that

  7. Statistical aspects of forensic genetics

    DEFF Research Database (Denmark)

    Tvedebrink, Torben

    This PhD thesis deals with statistical models intended for forensic genetics, which is the part of forensic medicine concerned with analysis of DNA evidence from criminal cases together with calculation of alleged paternity and affinity in family reunification cases. The main focus of the thesis is on crime cases as these differ from the other types of cases since the biological material often is used for person identification contrary to affinity. Common to all cases, however, is that the DNA is used as evidence in order to assess the probability of observing the biological material given different... of the DNA evidence under competing hypotheses the biological evidence may be used in the court’s deliberation and trial on equal footing with other evidence and expert statements. These probabilities are based on population genetic models whose assumptions must be validated. The thesis’s first two articles...

  8. Statistical methods and challenges in connectome genetics

    KAUST Repository

    Pluta, Dustin; Yu, Zhaoxia; Shen, Tong; Chen, Chuansheng; Xue, Gui; Ombao, Hernando

    2018-01-01

    The study of genetic influences on brain connectivity, known as connectome genetics, is an exciting new direction of research in imaging genetics. We here review recent results and current statistical methods in this area, and discuss some

  9. A Powerful Approach to Estimating Annotation-Stratified Genetic Covariance via GWAS Summary Statistics.

    Science.gov (United States)

    Lu, Qiongshi; Li, Boyang; Ou, Derek; Erlendsdottir, Margret; Powles, Ryan L; Jiang, Tony; Hu, Yiming; Chang, David; Jin, Chentian; Dai, Wei; He, Qidu; Liu, Zefeng; Mukherjee, Shubhabrata; Crane, Paul K; Zhao, Hongyu

    2017-12-07

    Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses, we demonstrate that our method provides accurate covariance estimates, thereby enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (N total ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in single-nucleotide polymorphisms (SNPs) with high minor allele frequencies and in SNPs located in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD's correlation with cognitive traits and hints at an autoimmune component for ALS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  10. COMPUTER METHODS OF GENETIC ANALYSIS.

    Directory of Open Access Journals (Sweden)

    A. L. Osipov

    2017-02-01

    The basic statistical methods used in the genetic analysis of human traits are considered: segregation analysis, linkage analysis and allelic association analysis. Software supporting the implementation of these methods was developed.

  11. Statistical methods and challenges in connectome genetics

    KAUST Repository

    Pluta, Dustin

    2018-03-12

    The study of genetic influences on brain connectivity, known as connectome genetics, is an exciting new direction of research in imaging genetics. We here review recent results and current statistical methods in this area, and discuss some of the persistent challenges and possible directions for future work.

  12. The Analysis of Polyploid Genetic Data.

    Science.gov (United States)

    Meirmans, Patrick G; Liu, Shenglin; van Tienderen, Peter H

    2018-03-16

    Though polyploidy is an important aspect of the evolutionary genetics of both plants and animals, the development of population genetic theory of polyploids has seriously lagged behind that of diploids. This is unfortunate since the analysis of polyploid genetic data (and the interpretation of the results) requires even more scrutiny than with diploid data. This is because of several polyploidy-specific complications in segregation and genotyping such as tetrasomy, double reduction, and missing dosage information. Here, we review the theoretical and statistical aspects of the population genetics of polyploids. We discuss several widely used types of inferences, including genetic diversity, Hardy-Weinberg equilibrium, population differentiation, genetic distance, and detecting population structure. For each, we point out how the statistical approach, expected result, and interpretation differ between different ploidy levels. We also discuss for each type of inference what biases may arise from the polyploid-specific complications and how these biases can be overcome. From our overview, it is clear that the statistical toolbox that is available for the analysis of genetic data is flexible and still expanding. Modern sequencing techniques will soon be able to overcome some of the current limitations to the analysis of polyploid data, though the techniques are lagging behind those available for diploids. Furthermore, the availability of more data may aggravate the biases that can arise, and increase the risk of false inferences. Therefore, simulations such as we used throughout this review are an important tool to verify the results of analyses of polyploid genetic data.

  13. DHLAS: A web-based information system for statistical genetic analysis of HLA population data.

    Science.gov (United States)

    Thriskos, P; Zintzaras, E; Germenis, A

    2007-03-01

    DHLAS (database HLA system) is a user-friendly, web-based information system for the analysis of human leukocyte antigens (HLA) data from population studies. DHLAS has been developed using JAVA and the R system; it runs on a Java Virtual Machine and its user interface is web-based, powered by the servlet engine TOMCAT. It utilizes STRUTS, a Model-View-Controller framework, and uses several GNU packages to perform several of its tasks. The database engine it relies upon for fast access is MySQL, but others can be used as well. The system estimates metrics, performs statistical testing and produces graphs required for HLA population studies: (i) Hardy-Weinberg equilibrium (calculated using both asymptotic and exact tests), (ii) genetic distances (Euclidean or Nei), (iii) phylogenetic trees using the unweighted pair group method with averages and the neighbor-joining method, (iv) linkage disequilibrium (pairwise and overall, including variance estimations), (v) haplotype frequencies (estimated using the expectation-maximization algorithm) and (vi) discriminant analysis. The main merit of DHLAS is the incorporation of a database; thus, the data can be stored and manipulated along with integrated genetic data analysis procedures. In addition, it has an open architecture allowing the inclusion of other functions and procedures.
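
    The asymptotic Hardy-Weinberg test listed under (i) can be illustrated with a minimal sketch. It is not taken from DHLAS: it assumes a single biallelic locus (HLA loci are multiallelic), the function name `hwe_chi_square` is hypothetical, and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Asymptotic chi-square test of Hardy-Weinberg equilibrium.

    Illustrative only: one biallelic locus, genotype counts AA, AB, BB.
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    print(hwe_chi_square(25, 50, 25))  # counts exactly at HW proportions
```

    For small samples or rare alleles the asymptotic approximation is unreliable, which is presumably why the abstract also mentions exact tests.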

  14. A weighted U statistic for association analyses considering genetic heterogeneity.

    Science.gov (United States)

    Wei, Changshuai; Elston, Robert C; Lu, Qing

    2016-07-20

    Converging evidence suggests that common complex diseases with the same or similar clinical manifestations could have different underlying genetic etiologies. While current research interests have shifted toward uncovering rare variants and structural variations predisposing to human diseases, the impact of heterogeneity in genetic studies of complex diseases has been largely overlooked. Most of the existing statistical methods assume the disease under investigation has a homogeneous genetic effect and could, therefore, have low power if the disease undergoes heterogeneous pathophysiological and etiological processes. In this paper, we propose a heterogeneity-weighted U (HWU) method for association analyses considering genetic heterogeneity. HWU can be applied to various types of phenotypes (e.g., binary and continuous) and is computationally efficient for high-dimensional genetic data. Through simulations, we showed the advantage of HWU when the underlying genetic etiology of a disease was heterogeneous, as well as the robustness of HWU against different model assumptions (e.g., phenotype distributions). Using HWU, we conducted a genome-wide analysis of nicotine dependence from the Study of Addiction: Genetics and Environments dataset. The genome-wide analysis of nearly one million genetic markers took 7h, identifying heterogeneous effects of two new genes (i.e., CYP3A5 and IKBKB) on nicotine dependence. Copyright © 2016 John Wiley & Sons, Ltd.

  15. Testing Genetic Pleiotropy with GWAS Summary Statistics for Marginal and Conditional Analyses.

    Science.gov (United States)

    Deng, Yangqing; Pan, Wei

    2017-12-01

    There is growing interest in testing genetic pleiotropy, which is when a single genetic variant influences multiple traits. Several methods have been proposed; however, these methods have some limitations. First, all the proposed methods are based on the use of individual-level genotype and phenotype data; in contrast, for logistical, and other, reasons, summary statistics of univariate SNP-trait associations are typically only available based on meta- or mega-analyzed large genome-wide association study (GWAS) data. Second, existing tests are based on marginal pleiotropy, which cannot distinguish between direct and indirect associations of a single genetic variant with multiple traits due to correlations among the traits. Hence, it is useful to consider conditional analysis, in which a subset of traits is adjusted for another subset of traits. For example, in spite of substantial lowering of low-density lipoprotein cholesterol (LDL) with statin therapy, some patients still maintain high residual cardiovascular risk, and, for these patients, it might be helpful to reduce their triglyceride (TG) level. For this purpose, in order to identify new therapeutic targets, it would be useful to identify genetic variants with pleiotropic effects on LDL and TG after adjusting the latter for LDL; otherwise, a pleiotropic effect of a genetic variant detected by a marginal model could simply be due to its association with LDL only, given the well-known correlation between the two types of lipids. Here, we develop a new pleiotropy testing procedure based only on GWAS summary statistics that can be applied for both marginal analysis and conditional analysis. Although the main technical development is based on published union-intersection testing methods, care is needed in specifying conditional models to avoid invalid statistical estimation and inference. In addition to the previously used likelihood ratio test, we also propose using generalized estimating equations under the

  16. A weighted U-statistic for genetic association analyses of sequencing data.

    Science.gov (United States)

    Wei, Changshuai; Li, Ming; He, Zihuai; Vsevolozhskaya, Olga; Schaid, Daniel J; Lu, Qing

    2014-12-01

    With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL4 and very low density lipoprotein cholesterol. © 2014 WILEY PERIODICALS, INC.

  17. SimHap GUI: an intuitive graphical user interface for genetic association analysis.

    Science.gov (United States)

    Carter, Kim W; McCaskie, Pamela A; Palmer, Lyle J

    2008-12-25

    Researchers wishing to conduct genetic association analysis involving single nucleotide polymorphisms (SNPs) or haplotypes are often confronted with the lack of user-friendly graphical analysis tools, requiring sophisticated statistical and informatics expertise to perform relatively straightforward tasks. Tools such as the SimHap package for the R statistics language provide the necessary statistical operations to conduct sophisticated genetic analysis, but lack a graphical user interface that allows anyone but a professional statistician to effectively utilise the tool. We have developed SimHap GUI, a cross-platform integrated graphical analysis tool for conducting epidemiological, single SNP and haplotype-based association analysis. SimHap GUI features a novel workflow interface that guides the user through each logical step of the analysis process, making it accessible to both novice and advanced users. This tool provides a seamless interface to the SimHap R package, while providing enhanced functionality such as sophisticated data checking, automated data conversion, and real-time estimations of haplotype simulation progress. SimHap GUI provides a novel, easy-to-use, cross-platform solution for conducting a range of genetic and non-genetic association analyses. This provides a free alternative to commercial statistics packages that is specifically designed for genetic association analysis.

  18. A robust statistical method for association-based eQTL analysis.

    Directory of Open Access Journals (Sweden)

    Ning Jiang

    It has been well established that the theoretical kernel of the recently surging genome-wide association study (GWAS) is statistical inference of linkage disequilibrium (LD) between a tested genetic marker and a putative locus affecting a disease trait. However, LD analysis is vulnerable to several confounding factors, of which population stratification is the most prominent. Whilst many methods have been proposed to correct for the influence either through predicting the structure parameters or correcting inflation in the test statistic due to the stratification, these may not be feasible or may impose further statistical problems in practical implementation. We propose here a novel statistical method to control spurious LD in GWAS from population structure by incorporating a control marker into testing for significance of genetic association of a polymorphic marker with phenotypic variation of a complex trait. The method avoids the need for structure prediction, which may be infeasible or inadequate in practice, and accounts properly for a varying effect of population stratification on different regions of the genome under study. Utility and statistical properties of the new method were tested through an intensive computer simulation study and an association-based genome-wide mapping of expression quantitative trait loci in genetically divergent human populations. The analyses show that the new method confers improved statistical power for detecting genuine genetic association in subpopulations and effective control of spurious associations stemming from population structure when compared with two other popularly implemented methods in the GWAS literature.

  19. MetaGenyo: a web tool for meta-analysis of genetic association studies.

    Science.gov (United States)

    Martorell-Marugan, Jordi; Toro-Dominguez, Daniel; Alarcon-Riquelme, Marta E; Carmona-Saez, Pedro

    2017-12-16

    Genetic association studies (GAS) aim to evaluate the association between genetic variants and phenotypes. In the last few years, the number of studies of this type has increased exponentially, but the results are not always reproducible due to experimental designs, low sample sizes and other methodological errors. In this field, meta-analysis techniques are becoming very popular tools to combine results across studies to increase statistical power and to resolve discrepancies in genetic association studies. A meta-analysis summarizes research findings, increases statistical power and enables the identification of genuine associations between genotypes and phenotypes. Meta-analysis techniques are increasingly used in GAS, but the number of published meta-analyses containing errors is also increasing. Although there are several software packages that implement meta-analysis, none of them are specifically designed for genetic association studies and in most cases their use requires advanced programming or scripting expertise. We have developed MetaGenyo, a web tool for meta-analysis in GAS. MetaGenyo implements a complete and comprehensive workflow that can be executed in an easy-to-use environment without programming knowledge. MetaGenyo has been developed to guide users through the main steps of a GAS meta-analysis, covering Hardy-Weinberg test, statistical association for different genetic models, analysis of heterogeneity, testing for publication bias, subgroup analysis and robustness testing of the results. MetaGenyo is a useful tool to conduct comprehensive genetic association meta-analysis. The application is freely available at http://bioinfo.genyo.es/metagenyo/ .
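
    Two of the steps named above, pooling association estimates across studies and analysing heterogeneity, can be sketched as follows. This is an illustration under stated assumptions, not MetaGenyo's code: it assumes a fixed-effect inverse-variance model on per-study log odds ratios, with Cochran's Q and the I² statistic as heterogeneity measures, and the function name `fixed_effect_meta` is hypothetical.

```python
import math


def fixed_effect_meta(log_ors, std_errs):
    """Fixed-effect inverse-variance pooling of per-study log odds ratios.

    Returns (pooled log OR, its standard error, Cochran's Q, I^2 in percent).
    """
    if len(log_ors) != len(std_errs) or len(log_ors) < 2:
        raise ValueError("need matching estimates and standard errors from >= 2 studies")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in std_errs]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, q, i2


if __name__ == "__main__":
    pooled, se, q, i2 = fixed_effect_meta([0.10, 0.35, 0.22], [0.12, 0.15, 0.10])
    low, high = pooled - 1.96 * se, pooled + 1.96 * se
    print(f"pooled OR = {math.exp(pooled):.3f} "
          f"(95% CI {math.exp(low):.3f} to {math.exp(high):.3f}), "
          f"Q = {q:.3f}, I2 = {i2:.1f}%")
```

    When heterogeneity is substantial, a random-effects model is the usual alternative; the abstract does not say which models the tool offers.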

  20. An integrated system for genetic analysis

    Directory of Open Access Journals (Sweden)

    Duan Xiao

    2006-04-01

    Background: Large-scale genetic mapping projects require data management systems that can handle complex phenotypes and detect and correct high-throughput genotyping errors, yet are easy to use. Description: We have developed an Integrated Genotyping System (IGS) to meet this need. IGS securely stores, edits and analyses genotype and phenotype data. It stores information about DNA samples, plates, primers, markers and genotypes generated by a genotyping laboratory. Data are structured so that statistical genetic analysis of both case-control and pedigree data is straightforward. Conclusion: IGS can model complex phenotypes and contain genotypes from whole genome association studies. The database makes it possible to integrate genetic analysis with data curation. The IGS web site (http://bioinformatics.well.ox.ac.uk/project-igs.shtml) contains further information.

  1. Statistical Analysis of Big Data on Pharmacogenomics

    Science.gov (United States)

    Fan, Jianqing; Liu, Han

    2013-01-01

    This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrices for understanding correlation structure, inverse covariance matrices for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecular mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905

  2. Event History Analysis in Quantitative Genetics

    DEFF Research Database (Denmark)

    Maia, Rafael Pimentel

    Event history analysis is a class of statistical methods specially designed to analyze time-to-event characteristics, e.g. the time until death. The aim of the thesis was to present adequate multivariate versions of mixed survival models that properly represent the genetic aspects related to a given...

  3. PopSc: Computing Toolkit for Basic Statistics of Molecular Population Genetics Simultaneously Implemented in Web-Based Calculator, Python and R.

    Science.gov (United States)

    Chen, Shi-Yi; Deng, Feilong; Huang, Ying; Li, Cao; Liu, Linhai; Jia, Xianbo; Lai, Song-Jia

    2016-01-01

    Although various computer tools have been elaborately developed to calculate a series of statistics in molecular population genetics for both small- and large-scale DNA data, there is no efficient and easy-to-use toolkit available yet for exclusively focusing on the steps of mathematical calculation. Here, we present PopSc, a bioinformatic toolkit for calculating 45 basic statistics in molecular population genetics, which could be categorized into three classes, including (i) genetic diversity of DNA sequences, (ii) statistical tests for neutral evolution, and (iii) measures of genetic differentiation among populations. In contrast to the existing computer tools, PopSc was designed to directly accept the intermediate metadata, such as allele frequencies, rather than the raw DNA sequences or genotyping results. PopSc is first implemented as the web-based calculator with user-friendly interface, which greatly facilitates the teaching of population genetics in class and also promotes the convenient and straightforward calculation of statistics in research. Additionally, we also provide the Python library and R package of PopSc, which can be flexibly integrated into other advanced bioinformatic packages of population genetics analysis.
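
    The idea of computing population-genetic statistics directly from allele frequencies, rather than from raw sequences, can be illustrated with a short sketch. It does not use the PopSc library, and the abstract does not state whether these exact quantities are among its 45 statistics. The sketch assumes Nei's gene diversity (expected heterozygosity) and G_ST with equally weighted populations; the function names are hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) from allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-8:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def g_st(pop_freqs):
    """Nei's G_ST = (H_T - H_S) / H_T for one locus.

    pop_freqs: list of per-population allele-frequency lists (same allele
    order in each). Populations are weighted equally, an assumption made
    here for simplicity.
    """
    k = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    if any(len(f) != n_alleles for f in pop_freqs):
        raise ValueError("all populations need the same number of alleles")
    # H_S: mean within-population gene diversity.
    h_s = sum(gene_diversity(f) for f in pop_freqs) / k
    # H_T: gene diversity of the pooled (mean) allele frequencies.
    mean_freqs = [sum(f[i] for f in pop_freqs) / k for i in range(n_alleles)]
    h_t = gene_diversity(mean_freqs)
    if h_t == 0.0:
        raise ValueError("locus is monomorphic across all populations")
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    populations = [[0.8, 0.2], [0.3, 0.7]]
    print(gene_diversity(populations[0]), g_st(populations))
```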

  4. PopSc: Computing Toolkit for Basic Statistics of Molecular Population Genetics Simultaneously Implemented in Web-Based Calculator, Python and R.

    Directory of Open Access Journals (Sweden)

    Shi-Yi Chen

    Although various computer tools have been elaborately developed to calculate a series of statistics in molecular population genetics for both small- and large-scale DNA data, there is no efficient and easy-to-use toolkit available yet for exclusively focusing on the steps of mathematical calculation. Here, we present PopSc, a bioinformatic toolkit for calculating 45 basic statistics in molecular population genetics, which could be categorized into three classes, including (i) genetic diversity of DNA sequences, (ii) statistical tests for neutral evolution, and (iii) measures of genetic differentiation among populations. In contrast to the existing computer tools, PopSc was designed to directly accept the intermediate metadata, such as allele frequencies, rather than the raw DNA sequences or genotyping results. PopSc is first implemented as the web-based calculator with user-friendly interface, which greatly facilitates the teaching of population genetics in class and also promotes the convenient and straightforward calculation of statistics in research. Additionally, we also provide the Python library and R package of PopSc, which can be flexibly integrated into other advanced bioinformatic packages of population genetics analysis.

  5. Statistical methods to detect novel genetic variants using publicly available GWAS summary data.

    Science.gov (United States)

    Guo, Bin; Wu, Baolin

    2018-03-01

    We propose statistical methods to detect novel genetic variants using only genome-wide association studies (GWAS) summary data without access to raw genotype and phenotype data. With more and more summary data being posted for public access in the post-GWAS era, the proposed methods are practically very useful to identify additional interesting genetic variants and shed light on the underlying disease mechanism. We illustrate the utility of our proposed methods with application to GWAS meta-analysis results of fasting glucose from the international MAGIC consortium. We found several novel genome-wide significant loci that are worth further study. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Hierarchical linear modeling of longitudinal pedigree data for genetic association analysis

    DEFF Research Database (Denmark)

    Tan, Qihua; B Hjelmborg, Jacob V; Thomassen, Mads

    2014-01-01

    Genetic association analysis on complex phenotypes under a longitudinal design involving pedigrees encounters the problem of correlation within pedigrees, which could affect statistical assessment of the genetic effects. Approaches have been proposed to integrate kinship correlation into the mixed-effect models to explicitly model the genetic relationship. These have proved to be an efficient way of dealing with sample clustering in pedigree data. Although current algorithms implemented in popular statistical packages are useful for adjusting relatedness in the mixed modeling of genetic effects... associated with blood pressure with estimated inflation factors of 0.99, suggesting that our modeling of random effects efficiently handles the genetic relatedness in pedigrees. Application to simulated data captures important variants specified in the simulation. Our results show that the method is useful...
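
    The inflation factor of 0.99 quoted in this record can be read against the usual genomic-control definition, sketched below. The abstract does not say how the authors computed it, so this is an assumption: lambda is taken as the median of the observed 1-df chi-square association statistics divided by the median of the chi-square distribution with one degree of freedom (about 0.4549). The function name `inflation_factor` is hypothetical.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic inflation factor lambda from 1-df chi-square test statistics.

    Values near 1 suggest the test statistics follow the null distribution
    overall, i.e. relatedness or stratification is adequately handled.
    """
    if not chi2_stats:
        raise ValueError("no test statistics supplied")
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def inflation_factor_from_z(z_scores):
    """Same quantity when association results are reported as z-scores."""
    return inflation_factor([z * z for z in z_scores])


if __name__ == "__main__":
    print(inflation_factor([0.10, 0.30, 0.45, 0.90, 2.50]))
```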

  7. Network statistics of genetically-driven gene co-expression modules in mouse crosses

    Directory of Open Access Journals (Sweden)

    Marie-Pier Scott-Boyer

    2013-12-01

    In biology, networks are used in different contexts as ways to represent relationships between entities, such as interactions between genes, proteins or metabolites. Despite progress in the analysis of such networks and their potential to better understand the collective impact of genes on complex traits, one remaining challenge is to establish the biologic validity of gene co-expression networks and to determine what governs their organization. We used WGCNA to construct and analyze seven gene expression datasets from several tissues of mouse recombinant inbred strains (RIS). For six of the seven networks, we found that linkage to module QTLs (mQTLs) could be established for 29.3% of gene co-expression modules detected in the several mouse RIS. For about 74.6% of such genetically-linked modules, the mQTL was on the same chromosome as the one contributing most genes to the module, with genes originating from that chromosome showing higher connectivity than other genes in the modules. Such modules (that we considered as genetically-driven) had network statistic properties (density, centralization and heterogeneity) that set them apart from other modules in the network. Altogether, a sizeable portion of gene co-expression modules detected in mouse RIS panels had genetic determinants as their main organizing principle. In addition to providing a biologic interpretation validation for these modules, these genetic determinants imparted on them particular properties that set them apart from other modules in the network, to the point that they can be predicted to a large extent on the basis of their network statistics.

  8. Epileptic MEG Spike Detection Using Statistical Features and Genetic Programming with KNN

    Directory of Open Access Journals (Sweden)

    Turky N. Alotaiby

    2017-01-01

    Epilepsy is a neurological disorder that affects millions of people worldwide. Monitoring brain activity and identifying the seizure source, which starts with spike detection, are important steps for epilepsy treatment. Magnetoencephalography (MEG) is an emerging epileptic diagnostic tool with high-density sensors; this makes manual analysis a challenging task due to the vast amount of MEG data. This paper explores the use of eight statistical features and genetic programming (GP) with the K-nearest neighbor (KNN) for interictal spike detection. The proposed method comprises three stages: preprocessing, genetic programming-based feature generation, and classification. The effectiveness of the proposed approach has been evaluated using real MEG data obtained from 28 epileptic patients. It has achieved a 91.75% average sensitivity and 92.99% average specificity.

  9. Multivariate Methods for Meta-Analysis of Genetic Association Studies.

    Science.gov (United States)

    Dimou, Niki L; Pantavou, Katerina G; Braliou, Georgia G; Bagos, Pantelis G

    2018-01-01

    Multivariate meta-analysis of genetic association studies and genome-wide association studies has received remarkable attention as it improves the precision of the analysis. Here, we review, summarize and present in a unified framework methods for multivariate meta-analysis of genetic association studies and genome-wide association studies. Starting with the statistical methods used for robust analysis and genetic model selection, we present in brief univariate methods for meta-analysis and we then scrutinize multivariate methodologies. Multivariate models of meta-analysis for single gene-disease association studies, including models for haplotype association studies, multiple linked polymorphisms and multiple outcomes, are discussed. The popular Mendelian randomization approach and special cases of meta-analysis addressing issues such as the assumption of the mode of inheritance, deviation from Hardy-Weinberg Equilibrium and gene-environment interactions are also presented. All available methods are enriched with practical applications, and methodologies that could be developed in the future are discussed. Links for all available software implementing multivariate meta-analysis methods are also provided.

  10. A novel statistic for genome-wide interaction analysis.

    Directory of Open Access Journals (Sweden)

    Xuesen Wu

    2010-09-01

    Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interaction analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001...... analysis is a valuable tool for finding remaining missing heritability unexplained by the current GWAS, and the developed novel statistic is able to search for significant interaction between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.

  11. Analysis of the genetic diversity of selected East African sweet potato

    African Journals Online (AJOL)

    The genetic relationship of the germplasm was evaluated using Jaccard's coefficient for dissimilarity analysis, an unweighted pair group method with arithmetic means (UPGMA) tree and principal coordinate analysis (PCoA) in DARwin software, while summary statistics were computed using PowerMarker and Popgene ...
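
    As an illustration of the dissimilarity measure named in this record, the Jaccard dissimilarity between two binary marker profiles is one minus the number of markers present in both divided by the number present in at least one. The sketch below is not taken from DARwin; the accession profiles are invented, and returning 0.0 for two all-absent profiles is a convention chosen here.

```python
def jaccard_dissimilarity(profile_a, profile_b):
    """Jaccard dissimilarity between two 0/1 marker profiles (1 = allele or band present)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    pairs = list(zip(profile_a, profile_b))
    present_in_both = sum(1 for a, b in pairs if a == 1 and b == 1)
    present_in_either = sum(1 for a, b in pairs if a == 1 or b == 1)
    if present_in_either == 0:
        return 0.0  # assumption: two empty profiles are treated as identical
    return 1.0 - present_in_both / present_in_either


# Hypothetical presence/absence scores for two accessions at five markers.
accession_a = [1, 1, 0, 1, 0]
accession_b = [1, 0, 0, 1, 1]
print(jaccard_dissimilarity(accession_a, accession_b))
```

    A matrix of such pairwise values is the usual input to UPGMA clustering and to principal coordinate analysis.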

  12. GENES - a software package for analysis in experimental statistics and quantitative genetics

    Directory of Open Access Journals (Sweden)

    Cosme Damião Cruz

    2013-06-01

    GENES is a software package used for data analysis and processing with different biometric models and is essential in genetic studies applied to plant and animal breeding. It allows parameter estimation to analyze biological phenomena and is fundamental for the decision-making process and predictions of success and viability of selection strategies. The program can be downloaded from the Internet (http://www.ufv.br/dbg/genes/genes.htm or http://www.ufv.br/dbg/biodata.htm) and is available in Portuguese, English and Spanish. Specific literature (http://www.livraria.ufv.br/) and a set of sample files are also provided, making GENES easy to use. The software is integrated into the programs MS Word, MS Excel and Paint, ensuring simplicity and effectiveness in data import and export of results, figures and data. It is also compatible with the free software R and Matlab, through the supply of useful scripts available for complementary analyses in different areas, including genome wide selection, prediction of breeding values and use of neural networks in genetic improvement.

  13. A statistical simulation model for field testing of non-target organisms in environmental risk assessment of genetically modified plants.

    Science.gov (United States)

    Goedhart, Paul W; van der Voet, Hilko; Baldacchino, Ferdinando; Arpaia, Salvatore

    2014-04-01

    Genetic modification of plants may result in unintended effects causing potentially adverse effects on the environment. A comparative safety assessment is therefore required by authorities, such as the European Food Safety Authority, in which the genetically modified plant is compared with its conventional counterpart. Part of the environmental risk assessment is a comparative field experiment in which the effect on non-target organisms is compared. Statistical analysis of such trials comes in two flavors: difference testing and equivalence testing. It is important to know the statistical properties of these, for example, the power to detect environmental change of a given magnitude, before the start of an experiment. Such prospective power analysis can best be studied by means of a statistical simulation model. This paper describes a general framework for simulating data typically encountered in environmental risk assessment of genetically modified plants. The simulation model, available as Supplementary Material, can be used to generate count data having different statistical distributions, possibly with excess zeros. In addition, the model employs completely randomized or randomized block experiments, can be used to simulate single or multiple trials across environments, enables genotype by environment interaction by adding random variety effects, and finally includes repeated measures in time following a constant, linear or quadratic pattern in time, possibly with some form of autocorrelation. The model also allows a set of reference varieties to be added to the GM plant and its comparator to assess the natural variation, which can then be used to set limits of concern for equivalence testing. The different count distributions are described in some detail and some examples of how to use the simulation model to study various aspects, including a prospective power analysis, are provided.
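
    To make the idea of count data with excess zeros concrete, the toy sketch below draws zero-inflated Poisson counts: each observation is a structural zero with probability pi_zero and an ordinary Poisson count otherwise. It is not the simulation model from the paper's Supplementary Material; the function names, the choice of the Poisson distribution and all parameter values are assumptions made for illustration.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_zip_counts(n, lam, pi_zero, rng):
    """Return n counts: a structural zero with probability pi_zero, else a Poisson(lam) draw."""
    return [0 if rng.random() < pi_zero else poisson_draw(lam, rng) for _ in range(n)]


rng = random.Random(2014)  # fixed seed so the sketch is reproducible
plot_counts = simulate_zip_counts(n=10, lam=3.0, pi_zero=0.3, rng=rng)
print(plot_counts)
```

    A prospective power analysis would repeat such draws many times for GM and comparator plots, apply the planned difference or equivalence test to each simulated trial, and report the fraction of trials in which the test reaches its conclusion.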

  14. Statistical data analysis using SAS intermediate statistical methods

    CERN Document Server

    Marasinghe, Mervyn G

    2018-01-01

    The aim of this textbook (previously titled SAS for Data Analytics) is to teach the use of SAS for statistical analysis of data for advanced undergraduate and graduate students in statistics, data science, and disciplines involving analyzing data. The book begins with an introduction beyond the basics of SAS, illustrated with non-trivial, real-world, worked examples. It proceeds to SAS programming and applications, SAS graphics, statistical analysis of regression models, analysis of variance models, analysis of variance with random and mixed effects models, and then takes the discussion beyond regression and analysis of variance to conclude. Pedagogically, the authors introduce theory and methodological basis topic by topic, present a problem as an application, followed by a SAS analysis of the data provided and a discussion of results. The text focuses on applied statistical problems and methods. Key features include: end of chapter exercises, downloadable SAS code and data sets, and advanced material suitab...

  15. Analysis of genetic effects of nuclear-cytoplasmic interaction on quantitative traits: genetic model for diploid plants.

    Science.gov (United States)

    Han, Lide; Yang, Jian; Zhu, Jun

    2007-06-01

    A genetic model was proposed for simultaneously analyzing genetic effects of nuclear, cytoplasm, and nuclear-cytoplasmic interaction (NCI) as well as their genotype by environment (GE) interaction for quantitative traits of diploid plants. In the model, the NCI effects were further partitioned into additive and dominance nuclear-cytoplasmic interaction components. Mixed linear model approaches were used for statistical analysis. On the basis of diallel cross designs, Monte Carlo simulations showed that the genetic model was robust for estimating variance components under several situations without specific effects. Random genetic effects were predicted by an adjusted unbiased prediction (AUP) method. Data on four quantitative traits (boll number, lint percentage, fiber length, and micronaire) in Upland cotton (Gossypium hirsutum L.) were analyzed as a worked example to show the effectiveness of the model.

  16. Analysis of conditional genetic effects and variance components in developmental genetics.

    Science.gov (United States)

    Zhu, J

    1995-12-01

    A genetic model with additive-dominance effects and genotype x environment interactions is presented for quantitative traits with time-dependent measures. The genetic model for phenotypic means at time t conditional on phenotypic means measured at the previous time (t-1) is defined. Statistical methods are proposed for analyzing conditional genetic effects and conditional genetic variance components. Conditional variances can be estimated by the minimum norm quadratic unbiased estimation (MINQUE) method. An adjusted unbiased prediction (AUP) procedure is suggested for predicting conditional genetic effects. A worked example from cotton fruiting data is given for comparison of unconditional and conditional genetic variances and additive effects.

  17. Teaching biology through statistics: application of statistical methods in genetics and zoology courses.

    Science.gov (United States)

    Colon-Berlingeri, Migdalisel; Burrowes, Patricia A

    2011-01-01

    Incorporation of mathematics into biology curricula is critical to underscore for undergraduate students the relevance of mathematics to most fields of biology and the usefulness of developing quantitative process skills demanded in modern biology. At our institution, we have made significant changes to better integrate mathematics into the undergraduate biology curriculum. The curricular revision included changes in the suggested course sequence, addition of statistics and precalculus as prerequisites to core science courses, and incorporating interdisciplinary (math-biology) learning activities in genetics and zoology courses. In this article, we describe the activities developed for these two courses and the assessment tools used to measure the learning that took place with respect to biology and statistics. We distinguished the effectiveness of these learning opportunities in helping students improve their understanding of the math and statistical concepts addressed and, more importantly, their ability to apply them to solve a biological problem. We also identified areas that need emphasis in both biology and mathematics courses. In light of our observations, we recommend best practices that biology and mathematics academic departments can implement to train undergraduates for the demands of modern biology.

  18. GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research--an update.

    Science.gov (United States)

    Peakall, Rod; Smouse, Peter E

    2012-10-01

    GenAlEx: Genetic Analysis in Excel is a cross-platform package for population genetic analyses that runs within Microsoft Excel. GenAlEx offers analysis of diploid codominant, haploid and binary genetic loci and DNA sequences. Both frequency-based (F-statistics, heterozygosity, HWE, population assignment, relatedness) and distance-based (AMOVA, PCoA, Mantel tests, multivariate spatial autocorrelation) analyses are provided. New features include calculation of new estimators of population structure: G'(ST), G''(ST), Jost's D(est) and F'(ST) through AMOVA, Shannon Information analysis, linkage disequilibrium analysis for biallelic data and novel heterogeneity tests for spatial autocorrelation analysis. Export to more than 30 other data formats is provided. Teaching tutorials and expanded step-by-step output options are included. The comprehensive guide has been fully revised. GenAlEx is written in VBA and provided as a Microsoft Excel Add-in (compatible with Excel 2003, 2007, 2010 on PC; Excel 2004, 2011 on Macintosh). GenAlEx, and supporting documentation and tutorials are freely available at: http://biology.anu.edu.au/GenAlEx. rod.peakall@anu.edu.au.
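
    One of the frequency-based quantities listed in this record, heterozygosity, has a simple textbook form: expected heterozygosity at a locus is one minus the sum of squared allele frequencies. The sketch below shows only that formula, without any sample-size correction. GenAlEx itself is written in VBA and its implementation is not reproduced here; the allele frequencies are invented.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical locus with three alleles.
print(round(expected_heterozygosity([0.5, 0.3, 0.2]), 2))
```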

  19. 12th Workshop on Stochastic Models, Statistics and Their Applications

    CERN Document Server

    Rafajłowicz, Ewaryst; Szajowski, Krzysztof

    2015-01-01

    This volume presents the latest advances and trends in stochastic models and related statistical procedures. Selected peer-reviewed contributions focus on statistical inference, quality control, change-point analysis and detection, empirical processes, time series analysis, survival analysis and reliability, statistics for stochastic processes, big data in technology and the sciences, statistical genetics, experiment design, and stochastic models in engineering. Stochastic models and related statistical procedures play an important part in furthering our understanding of the challenging problems currently arising in areas of application such as the natural sciences, information technology, engineering, image analysis, genetics, energy and finance, to name but a few. This collection arises from the 12th Workshop on Stochastic Models, Statistics and Their Applications, Wroclaw, Poland.

  20. Statistical framework for detection of genetically modified organisms based on Next Generation Sequencing.

    Science.gov (United States)

    Willems, Sander; Fraiture, Marie-Alice; Deforce, Dieter; De Keersmaecker, Sigrid C J; De Loose, Marc; Ruttink, Tom; Herman, Philippe; Van Nieuwerburgh, Filip; Roosens, Nancy

    2016-02-01

    Because the number and diversity of genetically modified (GM) crops have significantly increased, their analysis based on real-time PCR (qPCR) methods is becoming increasingly complex and laborious. While several pioneers already investigated Next Generation Sequencing (NGS) as an alternative to qPCR, its practical use has not been assessed for routine analysis. In this study a statistical framework was developed to predict the number of NGS reads needed to detect transgene sequences, to prove their integration into the host genome and to identify the specific transgene event in a sample with known composition. This framework was validated by applying it to experimental data from food matrices composed of pure GM rice, processed GM rice (noodles) or a 10% GM/non-GM rice mixture, revealing some influential factors. Finally, feasibility of NGS for routine analysis of GM crops was investigated by applying the framework to samples commonly encountered in routine analysis of GM crops. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  1. Genetic Code Analysis Toolkit: A novel tool to explore the coding properties of the genetic code and DNA sequences

    Science.gov (United States)

    Kraljić, K.; Strüngmann, L.; Fimmel, E.; Gumbel, M.

    2018-01-01

    The genetic code is degenerate and it is assumed that redundancy provides error detection and correction mechanisms in the translation process. However, the biological meaning of the code's structure is still under current research. This paper presents a Genetic Code Analysis Toolkit (GCAT) which provides workflows and algorithms for the analysis of the structure of nucleotide sequences. In particular, sets or sequences of codons can be transformed and tested for circularity, comma-freeness, dichotomic partitions and others. GCAT comes with a fertile editor custom-built to work with the genetic code and a batch mode for multi-sequence processing. With the ability to read FASTA files or load sequences from GenBank, the tool can be used for the mathematical and statistical analysis of existing sequence data. GCAT is Java-based and provides a plug-in concept for extensibility. Availability: Open source. Homepage: http://www.gcat.bio/
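
    Comma-freeness, one of the properties mentioned in this record, can be checked directly from its usual definition: a set of codons is comma-free if, for every ordered pair of codons from the set written side by side, neither of the two trinucleotides read in the shifted frames belongs to the set. The sketch below follows that definition. It is not GCAT code (GCAT is Java-based); the function name and example sets are illustrative.

```python
def is_comma_free(codons):
    """Return True if no codon of the set appears in a shifted frame of any concatenated pair."""
    code = set(codons)
    if any(len(codon) != 3 for codon in code):
        raise ValueError("every codon must have exactly three letters")
    for first in code:
        for second in code:
            pair = first + second
            # Frames shifted by one and by two positions inside the six-letter word.
            if pair[1:4] in code or pair[2:5] in code:
                return False
    return True


print(is_comma_free({"ACG", "TCG"}))  # no shifted reading hits the set
print(is_comma_free({"ACG", "CGA"}))  # ACGACG contains CGA in a shifted frame
```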

  2. Bayesian analysis of genetic association across tree-structured routine healthcare data in the UK Biobank

    DEFF Research Database (Denmark)

    Cortes, Adrian; Dendrou, Calliope A; Motyer, Allan

    2017-01-01

    Genetic discovery from the multitude of phenotypes extractable from routine healthcare data can transform understanding of the human phenome and accelerate progress toward precision medicine. However, a critical question when analyzing high-dimensional and heterogeneous data is how best...... to interrogate increasingly specific subphenotypes while retaining statistical power to detect genetic associations. Here we develop and employ a new Bayesian analysis framework that exploits the hierarchical structure of diagnosis classifications to analyze genetic variants against UK Biobank disease phenotypes...... derived from self-reporting and hospital episode statistics. Our method displays a more than 20% increase in power to detect genetic effects over other approaches and identifies new associations between classical human leukocyte antigen (HLA) alleles and common immune-mediated diseases (IMDs). By applying...

  3. Bayesian analysis of genetic association across tree-structured routine healthcare data in the UK Biobank.

    Science.gov (United States)

    Cortes, Adrian; Dendrou, Calliope A; Motyer, Allan; Jostins, Luke; Vukcevic, Damjan; Dilthey, Alexander; Donnelly, Peter; Leslie, Stephen; Fugger, Lars; McVean, Gil

    2017-09-01

    Genetic discovery from the multitude of phenotypes extractable from routine healthcare data can transform understanding of the human phenome and accelerate progress toward precision medicine. However, a critical question when analyzing high-dimensional and heterogeneous data is how best to interrogate increasingly specific subphenotypes while retaining statistical power to detect genetic associations. Here we develop and employ a new Bayesian analysis framework that exploits the hierarchical structure of diagnosis classifications to analyze genetic variants against UK Biobank disease phenotypes derived from self-reporting and hospital episode statistics. Our method displays a more than 20% increase in power to detect genetic effects over other approaches and identifies new associations between classical human leukocyte antigen (HLA) alleles and common immune-mediated diseases (IMDs). By applying the approach to genetic risk scores (GRSs), we show the extent of genetic sharing among IMDs and expose differences in disease perception or diagnosis with potential clinical implications.

  4. Integrated genetic analysis microsystems

    International Nuclear Information System (INIS)

    Lagally, Eric T; Mathies, Richard A

    2004-01-01

    With the completion of the Human Genome Project and the ongoing DNA sequencing of the genomes of other animals, bacteria, plants and others, a wealth of new information about the genetic composition of organisms has become available. However, as the demand for sequence information grows, so does the workload required both to generate this sequence and to use it for targeted genetic analysis. Microfabricated genetic analysis systems are well poised to assist in the collection and use of these data through increased analysis speed, lower analysis cost and higher parallelism leading to increased assay throughput. In addition, such integrated microsystems may point the way to targeted genetic experiments on single cells and in other areas that are otherwise very difficult. Concomitant with these advantages, such systems, when fully integrated, should be capable of forming portable systems for high-speed in situ analyses, enabling a new standard in disciplines such as clinical chemistry, forensics, biowarfare detection and epidemiology. This review will discuss the various technologies available for genetic analysis on the microscale, and efforts to integrate them to form fully functional robust analysis devices. (topical review)

  5. Improved score statistics for meta-analysis in single-variant and gene-level association studies.

    Science.gov (United States)

    Yang, Jingjing; Chen, Sai; Abecasis, Gonçalo

    2018-06-01

    Meta-analysis is now an essential tool for genetic association studies, allowing them to combine large studies and greatly accelerating the pace of genetic discovery. Although the standard meta-analysis methods perform equivalently to the more cumbersome joint analysis under ideal settings, they result in substantial power loss under unbalanced settings with various case-control ratios. Here, we investigate the power loss problem of the standard meta-analysis methods for unbalanced studies, and further propose novel meta-analysis methods performing equivalently to the joint analysis under both balanced and unbalanced settings. We derive improved meta-score-statistics that can accurately approximate the joint-score-statistics with combined individual-level data, for both linear and logistic regression models, with and without covariates. In addition, we propose a novel approach to adjust for population stratification by correcting for known population structures through minor allele frequencies. In the simulated gene-level association studies under unbalanced settings, our method recovered up to 85% power loss caused by the standard methods. We further showed the power gain of our methods in gene-level tests with 26 unbalanced studies of age-related macular degeneration. In addition, we took the meta-analysis of three unbalanced studies of type 2 diabetes as an example to discuss the challenges of meta-analyzing multi-ethnic samples. In summary, our improved meta-score-statistics with corrections for population stratification can be used to construct both single-variant and gene-level association studies, providing a useful framework for ensuring well-powered, convenient, cross-study analyses. © 2018 WILEY PERIODICALS, INC.

  6. Analysis of a genetically structured variance heterogeneity model using the Box-Cox transformation.

    Science.gov (United States)

    Yang, Ye; Christensen, Ole F; Sorensen, Daniel

    2011-02-01

    Over recent years, statistical support for the presence of genetic factors operating at the level of the environmental variance has come from fitting a genetically structured heterogeneous variance model to field or experimental data in various species. Misleading results may arise due to skewness of the marginal distribution of the data. To investigate how the scale of measurement affects inferences, the genetically structured heterogeneous variance model is extended to accommodate the family of Box-Cox transformations. Litter size data in rabbits and pigs that had previously been analysed in the untransformed scale were reanalysed in a scale equal to the mode of the marginal posterior distribution of the Box-Cox parameter. In the rabbit data, the statistical evidence for a genetic component at the level of the environmental variance is considerably weaker than that resulting from an analysis in the original metric. In the pig data, the statistical evidence is stronger, but the coefficient of correlation between additive genetic effects affecting mean and variance changes sign, compared to the results in the untransformed scale. The study confirms that inferences on variances can be strongly affected by the presence of asymmetry in the distribution of data. We recommend that to avoid one important source of spurious inferences, future work seeking support for a genetic component acting on environmental variation using a parametric approach based on normality assumptions confirms that these are met.
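
    For reference, the Box-Cox family mentioned in this record maps a positive observation y to (y^lambda - 1) / lambda when lambda is non-zero and to log(y) when lambda is zero. The sketch below shows the transformation only; it does not reproduce the paper's Bayesian model, in which the Box-Cox parameter is inferred jointly with the other parameters, and the litter sizes used are invented.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive data")
    if lam == 0:
        return math.log(y)  # limiting case as lam approaches 0
    return (y ** lam - 1.0) / lam


# Hypothetical litter sizes transformed on three different scales.
litter_sizes = [4, 9, 12]
for lam in (1.0, 0.5, 0.0):
    print(lam, [round(box_cox(y, lam), 3) for y in litter_sizes])
```

    With lambda equal to 1 the data are only shifted by one unit, so a posterior mode of the Box-Cox parameter far from 1 indicates that the original scale departs from the normality assumptions.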

  7. A Review of Pathway-Based Analysis Tools That Visualize Genetic Variants

    Directory of Open Access Journals (Sweden)

    Elisa Cirillo

    2017-11-01

    Pathway analysis is a powerful method for data analysis in genomics, most often applied to gene expression analysis. It is also promising for single-nucleotide polymorphism (SNP) data analysis, such as genome-wide association study data, because it allows the interpretation of variants with respect to the biological processes in which the affected genes and proteins are involved. Such analyses support an interactive evaluation of the possible effects of variations on function, regulation or interaction of gene products. Current pathway analysis software often does not support data visualization of variants in pathways as an alternate method to interpret genetic association results, and specific statistical methods for pathway analysis of SNP data are not combined with these visualization features. In this review, we first describe the visualization options of the tools that were identified by a literature review, in order to provide insight for improvements in this developing field. Tool evaluation was performed using a computational epistatic dataset of gene–gene interactions for obesity risk. Next, we report the necessity to include in these tools statistical methods designed for the pathway-based analysis with SNP data, expressly aiming to define features for more comprehensive pathway-based analysis tools. We conclude by recognizing that pathway analysis of genetic variations data requires a sophisticated combination of the most useful and informative visual aspects of the various tools evaluated.

  8. A review of statistical methods for testing genetic anticipation: looking for an answer in Lynch syndrome

    DEFF Research Database (Denmark)

    Boonstra, Philip S; Gruber, Stephen B; Raymond, Victoria M

    2010-01-01

    Anticipation, manifested through decreasing age of onset or increased severity in successive generations, has been noted in several genetic diseases. Statistical methods for genetic anticipation range from a simple use of the paired t-test for age of onset restricted to affected parent-child pairs......, and this right truncation effect is more pronounced in children than in parents. In this study, we first review different statistical methods for testing genetic anticipation in affected parent-child pairs that address the issue of bias due to right truncation. Using affected parent-child pair data, we compare...... the issue of multiplex ascertainment and its effect on the different methods. We then focus on exploring genetic anticipation in Lynch syndrome and analyze new data on the age of onset in affected parent-child pairs from families seen at the University of Michigan Cancer Genetics clinic with a mutation...
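
    The simplest method named in this record, the paired t-test on age of onset in affected parent-child pairs, can be sketched as follows. The statistic is the mean of the within-pair differences divided by its standard error. The ages below are invented, and this naive test ignores the right-truncation bias that the review sets out to address.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(parent_ages, child_ages):
    """Paired t statistic for parent minus child age of onset (n - 1 degrees of freedom)."""
    if len(parent_ages) != len(child_ages):
        raise ValueError("each parent needs exactly one paired child")
    diffs = [p - c for p, c in zip(parent_ages, child_ages)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


# Hypothetical ages of onset in four affected parent-child pairs.
parents = [52, 61, 48, 57]
children = [45, 50, 47, 49]
print(round(paired_t_statistic(parents, children), 3))
```

    A large positive value would suggest earlier onset in children, but because younger generations have had less time to develop disease, part of such a difference can be an artifact of truncated follow-up.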

  9. GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research—an update

    Science.gov (United States)

    Peakall, Rod; Smouse, Peter E.

    2012-01-01

    Summary: GenAlEx: Genetic Analysis in Excel is a cross-platform package for population genetic analyses that runs within Microsoft Excel. GenAlEx offers analysis of diploid codominant, haploid and binary genetic loci and DNA sequences. Both frequency-based (F-statistics, heterozygosity, HWE, population assignment, relatedness) and distance-based (AMOVA, PCoA, Mantel tests, multivariate spatial autocorrelation) analyses are provided. New features include calculation of new estimators of population structure: G′ST, G′′ST, Jost’s Dest and F′ST through AMOVA, Shannon Information analysis, linkage disequilibrium analysis for biallelic data and novel heterogeneity tests for spatial autocorrelation analysis. Export to more than 30 other data formats is provided. Teaching tutorials and expanded step-by-step output options are included. The comprehensive guide has been fully revised. Availability and implementation: GenAlEx is written in VBA and provided as a Microsoft Excel Add-in (compatible with Excel 2003, 2007, 2010 on PC; Excel 2004, 2011 on Macintosh). GenAlEx, and supporting documentation and tutorials are freely available at: http://biology.anu.edu.au/GenAlEx. Contact: rod.peakall@anu.edu.au PMID:22820204

  10. A strategy analysis for genetic association studies with known inbreeding

    Directory of Open Access Journals (Sweden)

    del Giacco Stefano

    2011-07-01

    Background: Association studies consist in identifying the genetic variants that are related to a specific disease through the use of statistical multiple hypothesis testing or segregation analysis in pedigrees. This type of study has been very successful in the case of Mendelian monogenic disorders, while it has been less successful in identifying genetic variants related to complex diseases, where the onset depends on the interactions between different genes and the environment. Current technology makes it possible to genotype more than a million markers, and this number has been increasing rapidly in recent years with imputation based on template sets and whole-genome sequencing. This type of data introduces a great amount of noise in the statistical analysis and usually requires a great number of samples. Current methods seldom take into account gene-gene and gene-environment interactions, which are fundamental especially in complex diseases. In this paper we propose to use a non-parametric additive model to detect the genetic variants related to diseases, which accounts for interactions of unknown order. Although this is not new to the current literature, we show that in an isolated population, where the most related subjects also share most of their genetic code, the use of additive models may be improved if the available genealogical tree is taken into account. Specifically, we form a sample of cases and controls with the highest inbreeding by means of the Hungarian method, and estimate the set of genes/environmental variables associated with the disease by means of Random Forest. Results: We have evidence, from statistical theory, simulations and two applications, that we have built a suitable procedure to eliminate stratification between cases and controls and that it also has enough precision in identifying genetic variants responsible for a disease.
This procedure has been successfully used for the beta-thalassemia, which is

  11. Beginning statistics with data analysis

    CERN Document Server

    Mosteller, Frederick; Rourke, Robert EK

    2013-01-01

    This introduction to the world of statistics covers exploratory data analysis, methods for collecting data, formal statistical inference, and techniques of regression and analysis of variance. 1983 edition.

  12. A functional U-statistic method for association analysis of sequencing data.

    Science.gov (United States)

    Jadhav, Sneha; Tong, Xiaoran; Lu, Qing

    2017-11-01

    Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying a common genetic mechanism. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, the functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we applied our method to the sequencing data from the Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.

  13. Analysis of a genetically structured variance heterogeneity model using the Box-Cox transformation

    DEFF Research Database (Denmark)

    Yang, Ye; Christensen, Ole Fredslund; Sorensen, Daniel

    2011-01-01

    of the marginal distribution of the data. To investigate how the scale of measurement affects inferences, the genetically structured heterogeneous variance model is extended to accommodate the family of Box–Cox transformations. Litter size data in rabbits and pigs that had previously been analysed...... in the untransformed scale were reanalysed in a scale equal to the mode of the marginal posterior distribution of the Box–Cox parameter. In the rabbit data, the statistical evidence for a genetic component at the level of the environmental variance is considerably weaker than that resulting from an analysis...... in the original metric. In the pig data, the statistical evidence is stronger, but the coefficient of correlation between additive genetic effects affecting mean and variance changes sign, compared to the results in the untransformed scale. The study confirms that inferences on variances can be strongly affected...
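
    The Box–Cox family mentioned in this record can be illustrated with a small sketch. It is only an illustration under stated assumptions: the record describes a Bayesian analysis that uses the mode of the marginal posterior of the Box–Cox parameter, whereas the code below swaps in a plain grid search over the profile log-likelihood of an i.i.d. normal model. The function names and the litter-size values are hypothetical and are not taken from the study.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y for parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        return math.log(y)  # limiting case as lam -> 0
    return (y ** lam - 1.0) / lam


def profile_loglik(data, lam):
    """Profile log-likelihood of lam, assuming the transformed data are i.i.d. normal."""
    n = len(data)
    z = [box_cox(y, lam) for y in data]
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n
    # The second term is the log-Jacobian of the transformation.
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(y) for y in data)


# Hypothetical litter sizes, made up purely for illustration.
litter_sizes = [6, 8, 9, 9, 10, 11, 11, 12, 13, 15]

# Grid search over lam in [-2, 3] in steps of 0.1 (an assumed, arbitrary grid).
grid = [i / 10 for i in range(-20, 31)]
best_lam = max(grid, key=lambda lam: profile_loglik(litter_sizes, lam))
print("lambda maximising the profile log-likelihood:", best_lam)
```

    A value of lam equal to 1 leaves the scale unchanged apart from a shift, and lam equal to 0 corresponds to the log scale, which is why the choice of lam can alter inferences about variances.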

  14. Host traits explain the genetic structure of parasites: a meta-analysis

    Czech Academy of Sciences Publication Activity Database

    Blasco-Costa, Maria Isabel; Poulin, R.

    2013-01-01

    Vol. 140, No. 10 (2013), pp. 1316-1322 ISSN 0031-1820 EU Projects: European Commission(XE) 252124 - PARAPOPGENE Institutional support: RVO:60077344 Keywords : meta-analysis * host traits * parasite traits * F-statistics * population genetic structure * dispersal * autogenic life cycle * allogenic life cycle Subject RIV: EH - Ecology, Behaviour Impact factor: 2.350, year: 2013

  15. Genetic Analysis of Elevated Mastitis Risk Based on Mastitis Indicator Data

    DEFF Research Database (Denmark)

    Sørensen, Lars Peter; Løvendahl, Peter

    Whole-genome sequences and multiple trait phenotypes from large numbers of individuals will soon be available. Well established statistical modeling approaches enable the genetic analyses of complex trait phenotypes while accounting for a variety of additive and non-additive genetic mechanisms....... These modeling approaches have proven to be highly useful to determine population genetic parameters as well as prediction of genetic risk or value. We present statistical modelling approaches that use prior biological information for evaluating the collective action of sets of genetic variants. We have applied...

  16. Research design and statistical analysis

    CERN Document Server

    Myers, Jerome L; Lorch Jr, Robert F

    2013-01-01

    Research Design and Statistical Analysis provides comprehensive coverage of the design principles and statistical concepts necessary to make sense of real data.  The book's goal is to provide a strong conceptual foundation to enable readers to generalize concepts to new research situations.  Emphasis is placed on the underlying logic and assumptions of the analysis and what it tells the researcher, the limitations of the analysis, and the consequences of violating assumptions.  Sampling, design efficiency, and statistical models are emphasized throughout. As per APA recommendations

  17. Metabolome Comparison of Transgenic and Non-transgenic Rice by Statistical Analysis of FTIR and NMR Spectra

    Directory of Open Access Journals (Sweden)

    Keykhosrow Keymanesh

    2009-06-01

    Full Text Available Modern biotechnology, based on recombinant DNA techniques, has made it possible to introduce new traits with great potential for crop improvement. However, concerns about unintended effects of gene transformation that may threaten the environment or consumer health have persuaded scientists to set up pre-release tests on genetically modified organisms. Assessment of the ‘substantial equivalence’ concept, established by comparing a genetically modified organism with a comparator that has a history of safe use, could be the first step of a comprehensive risk assessment. The metabolite level most richly reflects changes that stem from genetic or environmental factors. Since assessing all metabolites in detail is very costly and practically impossible, statistical evaluation of processed grain spectroscopic data could be a time- and cost-effective substitute for complex chemical analysis. To investigate the ability of multivariate statistical techniques to compare metabolomes, and to test a method for such comparisons with available tools, a transgenic rice and its traditionally bred parent were used as test material. Discriminant analysis was applied as a supervised method and principal component analysis as an unsupervised classification method to the processed data, which were extracted from Fourier transform infrared spectroscopy and nuclear magnetic resonance spectral data of powdered rice, rice extraction and barley grain samples, the last of which was considered as a control. The results confirmed the capability of statistics, even with initial data processing applications, in metabolome studies. This study also confirms that the supervised method yields more distinctive results.

  18. metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis.

    Science.gov (United States)

    Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti

    2016-07-01

    A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco. Contact: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  19. Attitudes towards genetic testing: analysis of contradictions

    DEFF Research Database (Denmark)

    Jallinoja, P; Hakonen, A; Aro, A R

    1998-01-01

    A survey study was conducted among 1169 people to evaluate attitudes towards genetic testing in Finland. Here we present an analysis of the contradictions detected in people's attitudes towards genetic testing. This analysis focuses on the approval of genetic testing as an individual choice...... and on the confidence in control of the process of genetic testing and its implications. Our analysis indicated that some of the respondents have contradictory attitudes towards genetic testing. It is proposed that contradictory attitudes towards genetic testing should be given greater significance both in scientific...... studies on attitudes towards genetic testing as well as in the health care context, e.g. in genetic counselling.

  20. Applications of modern statistical methods to analysis of data in physical science

    Science.gov (United States)

    Wicker, James Eric

    Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance

  1. Two-level mixed modeling of longitudinal pedigree data for genetic association analysis

    DEFF Research Database (Denmark)

    Tan, Q.

    2013-01-01

    Genetic association analysis on complex phenotypes under a longitudinal design involving pedigrees encounters the problem of correlation within pedigrees which could affect statistical assessment of the genetic effects on both the mean level of the phenotype and its rate of change over the time...... of follow-up. Approaches have been proposed to integrate kinship correlation into the mixed effect models to explicitly model the genetic relationship which have been proven as an efficient way for dealing with sample clustering in pedigree data. Although useful for adjusting relatedness in the mixed...... assess the genetic associations with the mean level and the rate of change in a phenotype both with kinship correlation integrated in the mixed effect models. We apply our method to longitudinal pedigree data to estimate the genetic effects on systolic blood pressure measured over time in large pedigrees...

  2. A hybrid correlation analysis with application to imaging genetics

    Science.gov (United States)

    Hu, Wenxing; Fang, Jian; Calhoun, Vince D.; Wang, Yu-Ping

    2018-03-01

    Investigating the association between brain regions and genes continues to be a challenging topic in imaging genetics. Current brain region of interest (ROI)-gene association studies normally reduce data dimension by averaging the value of voxels in each ROI. This averaging may lead to a loss of information due to the existence of functional sub-regions. Pearson correlation is widely used for association analysis. However, it only detects linear correlation whereas nonlinear correlation may exist among ROIs. In this work, we introduced distance correlation to ROI-gene association analysis, which can detect both linear and nonlinear correlations and overcome the limitation of averaging operations by taking advantage of the information at each voxel. Nevertheless, distance correlation usually has a much lower value than Pearson correlation. To address this problem, we proposed a hybrid correlation analysis approach, by applying canonical correlation analysis (CCA) to the distance covariance matrix instead of directly computing distance correlation. Incorporating CCA into distance correlation approach may be more suitable for complex disease study because it can detect highly associated pairs of ROI and gene groups, and may improve the distance correlation level and statistical power. In addition, we developed a novel nonlinear CCA, called distance kernel CCA, which seeks the optimal combination of features with the most significant dependence. This approach was applied to imaging genetic data from the Philadelphia Neurodevelopmental Cohort (PNC). Experiments showed that our hybrid approach produced more consistent results than conventional CCA across resampling and both the correlation and statistical significance were increased compared to distance correlation analysis. Further gene enrichment analysis and region of interest (ROI) analysis confirmed the associations of the identified genes with brain ROIs. 
Therefore, our approach provides a powerful tool for finding

  3. Statistical methods for the analysis of high-throughput metabolomics data

    Directory of Open Access Journals (Sweden)

    Fabian J. Theis

    2013-01-01

    Full Text Available Metabolomics is a relatively new high-throughput technology that aims at measuring all endogenous metabolites within a biological sample in an unbiased fashion. The resulting metabolic profiles may be regarded as functional signatures of the physiological state, and have been shown to comprise effects of genetic regulation as well as environmental factors. This potential to connect genotypic to phenotypic information promises new insights and biomarkers for different research fields, including biomedical and pharmaceutical research. In the statistical analysis of metabolomics data, many techniques from other omics fields can be reused. However recently, a number of tools specific for metabolomics data have been developed as well. The focus of this mini review will be on recent advancements in the analysis of metabolomics data especially by utilizing Gaussian graphical models and independent component analysis.

  4. Statistical data analysis handbook

    National Research Council Canada - National Science Library

    Wall, Francis J

    1986-01-01

    It must be emphasized that this is not a text book on statistics. Instead it is a working tool that presents data analysis in clear, concise terms which can be readily understood even by those without formal training in statistics...

  5. FADTTS: functional analysis of diffusion tensor tract statistics.

    Science.gov (United States)

    Zhu, Hongtu; Kong, Linglong; Li, Runze; Styner, Martin; Gerig, Guido; Lin, Weili; Gilmore, John H

    2011-06-01

    The aim of this paper is to present a functional analysis of a diffusion tensor tract statistics (FADTTS) pipeline for delineating the association between multiple diffusion properties along major white matter fiber bundles with a set of covariates of interest, such as age, diagnostic status and gender, and the structure of the variability of these white matter tract properties in various diffusion tensor imaging studies. The FADTTS integrates five statistical tools: (i) a multivariate varying coefficient model for allowing the varying coefficient functions in terms of arc length to characterize the varying associations between fiber bundle diffusion properties and a set of covariates, (ii) a weighted least squares estimation of the varying coefficient functions, (iii) a functional principal component analysis to delineate the structure of the variability in fiber bundle diffusion properties, (iv) a global test statistic to test hypotheses of interest, and (v) a simultaneous confidence band to quantify the uncertainty in the estimated coefficient functions. Simulated data are used to evaluate the finite sample performance of FADTTS. We apply FADTTS to investigate the development of white matter diffusivities along the splenium of the corpus callosum tract and the right internal capsule tract in a clinical study of neurodevelopment. FADTTS can be used to facilitate the understanding of normal brain development, the neural bases of neuropsychiatric disorders, and the joint effects of environmental and genetic factors on white matter fiber bundles. The advantages of FADTTS compared with the other existing approaches are that it is capable of modeling the structured inter-subject variability, testing the joint effects, and constructing their simultaneous confidence bands. However, FADTTS is not crucial for estimation and reduces to the functional analysis method for the single measure. Copyright © 2011 Elsevier Inc. All rights reserved.

  6. RAPD markers for genetic diversity analysis in rice (Oryza sativa L)

    Energy Technology Data Exchange (ETDEWEB)

    Alvarez, A; Fuentes, Jorge L [Centro de Estudios Aplicados al Desarrollo Nuclear, La Habana (Cuba); Deus, Juan E [Instituto de Investigaciones del Arroz, Habana (Cuba); Duque, Maria C [Centro Internacional de la Agricultura Tropical. Proyecto de Arroz , Cali (Colombia)

    1999-07-01

    The establishment of relationships between genotypes existing in gene banks that may be used in new crosses, and knowledge about genetic diversity in available germplasm, are very useful for plant breeders. In this work, a genetic diversity analysis among 20 varieties of the Cuban rice germplasm bank was performed by using RAPD markers. Twenty-four decamer primers were screened, which produced 61 polymorphic bands out of 105 consistent and reproducible amplified fragments (58.1 %). The proportion of polymorphic bands varied for each primer, with an average of 3 polymorphic bands per primer; these results agreed with previous reports on RAPD polymorphism in rice germplasm. Depending on the primer, 1 to 7 distinct patterns were obtained among the screened genotypes. Pair-wise genetic distances between genotypes were computed based on Dice's coefficient. Three major, statistically robust groups were obtained in the UPGMA dendrogram (A, B and C), which clearly corresponded to different genetic pools. Additionally, more insight could be gained from the sub-grouping pattern within group A, which included the principal semi-dwarf commercial varieties. The present study demonstrated the efficiency of RAPD markers for genetic diversity analysis in closely related germplasm, particularly for the semi-dwarf Cuban commercial rice cultivars. Also, the existence of a narrow genetic base among these varieties has been confirmed, pointing to the urgent need to widen it.
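
    The pair-wise distance step in this record can be sketched as follows. This is a minimal illustration, assuming each genotype is scored as a 0/1 vector over bands (1 = band present); the variety names and band profiles are hypothetical, and the function names are introduced here only for the example. Only the figures 61 and 105 come from the abstract.

```python
from itertools import combinations


def dice_similarity(a, b):
    """Dice coefficient 2*n11 / (n_a + n_b) for two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    if total == 0:
        # The coefficient is undefined when neither profile has a band;
        # returning 0.0 is an arbitrary convention chosen for this sketch.
        return 0.0
    return 2.0 * shared / total


def dice_distances(profiles):
    """Distance (1 - Dice similarity) for every unordered pair of genotypes."""
    return {
        (u, v): 1.0 - dice_similarity(profiles[u], profiles[v])
        for u, v in combinations(sorted(profiles), 2)
    }


# Hypothetical presence/absence scores for three varieties over six bands.
band_profiles = {
    "V1": [1, 1, 0, 1, 0, 1],
    "V2": [1, 1, 0, 0, 0, 1],
    "V3": [0, 0, 1, 1, 1, 0],
}
distances = dice_distances(band_profiles)

# Share of polymorphic bands reported in the abstract: 61 of 105 fragments.
polymorphic_pct = round(100 * 61 / 105, 1)

for pair, d in distances.items():
    print(pair, round(d, 3))
print("polymorphic bands (%):", polymorphic_pct)
```

    A distance table of this kind is the usual input to average-linkage (UPGMA) clustering, which is how a dendrogram like the one described would typically be built.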

  7. RAPD markers for genetic diversity analysis in rice (Oryza sativa L)

    International Nuclear Information System (INIS)

    Alvarez, A.; Fuentes, Jorge L.; Deus, Juan E.; Duque, Maria C.

    1999-01-01

    The establishment of relationships between genotypes existing in gene banks that may be used in new crosses, and knowledge about genetic diversity in available germplasm, are very useful for plant breeders. In this work, a genetic diversity analysis among 20 varieties of the Cuban rice germplasm bank was performed by using RAPD markers. Twenty-four decamer primers were screened, which produced 61 polymorphic bands out of 105 consistent and reproducible amplified fragments (58.1 %). The proportion of polymorphic bands varied for each primer, with an average of 3 polymorphic bands per primer; these results agreed with previous reports on RAPD polymorphism in rice germplasm. Depending on the primer, 1 to 7 distinct patterns were obtained among the screened genotypes. Pair-wise genetic distances between genotypes were computed based on Dice's coefficient. Three major, statistically robust groups were obtained in the UPGMA dendrogram (A, B and C), which clearly corresponded to different genetic pools. Additionally, more insight could be gained from the sub-grouping pattern within group A, which included the principal semi-dwarf commercial varieties. The present study demonstrated the efficiency of RAPD markers for genetic diversity analysis in closely related germplasm, particularly for the semi-dwarf Cuban commercial rice cultivars. Also, the existence of a narrow genetic base among these varieties has been confirmed, pointing to the urgent need to widen it.

  8. A statistical assessment of differences and equivalences between genetically modified and reference plant varieties

    NARCIS (Netherlands)

    Voet, van der H.; Perry, J.N.; Amzal, B.; Paoletti, C.

    2011-01-01

    Background - Safety assessment of genetically modified organisms is currently often performed by comparative evaluation. However, natural variation of plant characteristics between commercial varieties is usually not considered explicitly in the statistical computations underlying the assessment.

  9. Application of Multivariate Statistical Analysis to Biomarkers in Se-Turkey Crude Oils

    Science.gov (United States)

    Gürgey, K.; Canbolat, S.

    2017-11-01

    Twenty-four crude oil samples were collected from the 24 oil fields distributed in different districts of SE-Turkey. API and Sulphur content (%), Stable Carbon Isotope, Gas Chromatography (GC), and Gas Chromatography-Mass Spectrometry (GC-MS) data were used to construct a geochemical data matrix. The aim of this study is to examine the genetic grouping or correlations in the crude oil samples, and hence the number of source rocks present in SE-Turkey. To achieve these aims, two multivariate statistical analysis techniques (Principal Component Analysis [PCA] and Cluster Analysis) were applied to a data matrix of 24 samples and 8 source-specific biomarker variables/parameters. The results showed that there are 3 genetically different oil groups: Batman-Nusaybin Oils, Adıyaman-Kozluk Oils and Diyarbakir Oils, in addition to one mixed group. These groupings imply that at least three different source rocks are present in South-Eastern (SE) Turkey. Grouping of the crude oil samples appears to be consistent with the geographic locations of the oil fields, subsurface stratigraphy as well as geology of the area.

  10. APPLICATION OF MULTIVARIATE STATISTICAL ANALYSIS TO BIOMARKERS IN SE-TURKEY CRUDE OILS

    Directory of Open Access Journals (Sweden)

    K. Gürgey

    2017-11-01

    Full Text Available Twenty-four crude oil samples were collected from the 24 oil fields distributed in different districts of SE-Turkey. API and Sulphur content (%), Stable Carbon Isotope, Gas Chromatography (GC), and Gas Chromatography-Mass Spectrometry (GC-MS) data were used to construct a geochemical data matrix. The aim of this study is to examine the genetic grouping or correlations in the crude oil samples, and hence the number of source rocks present in SE-Turkey. To achieve these aims, two multivariate statistical analysis techniques (Principal Component Analysis [PCA] and Cluster Analysis) were applied to a data matrix of 24 samples and 8 source-specific biomarker variables/parameters. The results showed that there are 3 genetically different oil groups: Batman-Nusaybin Oils, Adıyaman-Kozluk Oils and Diyarbakir Oils, in addition to one mixed group. These groupings imply that at least three different source rocks are present in South-Eastern (SE) Turkey. Grouping of the crude oil samples appears to be consistent with the geographic locations of the oil fields, subsurface stratigraphy as well as geology of the area.

  11. Advanced statistical methods in data science

    CERN Document Server

    Chen, Jiahua; Lu, Xuewen; Yi, Grace; Yu, Hao

    2016-01-01

    This book gathers invited presentations from the 2nd Symposium of the ICSA-CANADA Chapter held at the University of Calgary from August 4-6, 2015. The aim of this Symposium was to promote advanced statistical methods in big-data sciences, to allow researchers to exchange ideas on statistics and data science, and to embrace the challenges and opportunities of statistics and data science in the modern world. It addresses diverse themes in advanced statistical analysis in big-data sciences, including methods for administrative data analysis, survival data analysis, missing data analysis, high-dimensional and genetic data analysis, longitudinal and functional data analysis, the design and analysis of studies with response-dependent and multi-phase designs, time series and robust statistics, statistical inference based on likelihood, empirical likelihood and estimating functions. The editorial group selected 14 high-quality presentations from this successful symposium and invited the presenters to prepare a fu...

  12. Statistical Power in Meta-Analysis

    Science.gov (United States)

    Liu, Jin

    2015-01-01

    Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
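
    The analytical power mentioned in this record can be illustrated with a short sketch. It is a simplification under stated assumptions, not the study's procedure: a two-sided z-test with a known common standard deviation and equal group sizes, plus a fixed-effect (inverse-variance) pooling of several such studies. The function names and the numeric inputs are hypothetical.

```python
from statistics import NormalDist


def two_sample_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test (known sigma, equal n)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the mean difference is sigma * sqrt(2 / n).
    ncp = delta / (sigma * (2 / n_per_group) ** 0.5)
    return std_normal.cdf(ncp - z_crit) + std_normal.cdf(-ncp - z_crit)


def fixed_effect_meta_power(delta, sigma, n_per_group_list, alpha=0.05):
    """Power of the pooled test when studies are combined by inverse-variance weights."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Each study's mean difference has variance 2 * sigma^2 / n_i.
    precisions = [n / (2 * sigma ** 2) for n in n_per_group_list]
    pooled_variance = 1 / sum(precisions)
    ncp = delta / pooled_variance ** 0.5
    return std_normal.cdf(ncp - z_crit) + std_normal.cdf(-ncp - z_crit)


# Hypothetical inputs: true difference 0.5, SD 1, 64 subjects per group.
print(round(two_sample_power(0.5, 1.0, 64), 3))
# Two hypothetical studies with 32 subjects per group each.
print(round(fixed_effect_meta_power(0.5, 1.0, [32, 32]), 3))
```

    Under these assumptions, pooling two studies of 32 per group gives the same power as a single study of 64 per group, because the pooled variance depends only on the total sample size.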

  13. Some Conceptual Deficiencies in "Developmental" Behavior Genetics.

    Science.gov (United States)

    Gottlieb, Gilbert

    1995-01-01

    Criticizes the application of the statistical procedures of the population-genetic approach within evolutionary biology to the study of psychological development. Argues that the application of the statistical methods of population genetics--primarily the analysis of variance--to the causes of psychological development is bound to result in a…

  14. Rweb: Web-based Statistical Analysis

    Directory of Open Access Journals (Sweden)

    Jeff Banfield

    1999-03-01

    Full Text Available Rweb is a freely accessible statistical analysis environment that is delivered through the World Wide Web (WWW). It is based on R, a well-known statistical analysis package. The only requirement to run the basic Rweb interface is a WWW browser that supports forms. If you want graphical output you must, of course, have a browser that supports graphics. The interface provides access to WWW accessible data sets, so you may run Rweb on your own data. Rweb can provide a four window statistical computing environment (code input, text output, graphical output, and error information) through browsers that support Javascript. There is also a set of point and click modules under development for use in introductory statistics courses.

  15. Regularized Statistical Analysis of Anatomy

    DEFF Research Database (Denmark)

    Sjöstrand, Karl

    2007-01-01

    This thesis presents the application and development of regularized methods for the statistical analysis of anatomical structures. Focus is on structure-function relationships in the human brain, such as the connection between early onset of Alzheimer’s disease and shape changes of the corpus...... and mind. Statistics represents a quintessential part of such investigations as they are preluded by a clinical hypothesis that must be verified based on observed data. The massive amounts of image data produced in each examination pose an important and interesting statistical challenge...... efficient algorithms which make the analysis of large data sets feasible, and gives examples of applications....

  16. Joint multi-population analysis for genetic linkage of bipolar disorder or "wellness" to chromosome 4p.

    Science.gov (United States)

    Visscher, P M; Haley, C S; Ewald, H; Mors, O; Egeland, J; Thiel, B; Ginns, E; Muir, W; Blackwood, D H

    2005-02-05

    To test the hypothesis that the same genetic loci confer susceptibility to, or protection from, disease in different populations, and that a combined analysis would improve the map resolution of a common susceptibility locus, we analyzed data from three studies that had reported linkage to bipolar disorder in a small region on chromosome 4p. Data sets comprised phenotypic information and genetic marker data on Scottish, Danish, and USA extended pedigrees. Across the three data sets, 913 individuals appeared in the pedigrees, of whom 462 were classified as either unaffected (323) or affected (139) with unipolar or bipolar disorder. A consensus linkage map was created from 14 microsatellite markers in a 33 cM region. Phenotypic and genetic data were analyzed using a variance component (VC) and an allele sharing method. All previously reported elevated test statistics in the region were confirmed with one or both analysis methods, indicating the presence of one or more susceptibility genes to bipolar disorder in the three populations in the studied chromosome segment. When the results from both the VC and allele sharing method were considered, there was strong evidence for a susceptibility locus in the data from Scotland, some evidence in the data from Denmark and relatively less evidence in the data from the USA. The test statistics from the Scottish data set dominated the test statistics from the other studies, and no improved map resolution for a putative genetic locus underlying susceptibility in all three studies was obtained. Studies reporting linkage to the same region require careful scrutiny and preferably joint or meta analysis on the same basis in order to ensure that the results are truly comparable. (c) 2004 Wiley-Liss, Inc.

  17. Statistical Methods for Population Genetic Inference Based on Low-Depth Sequencing Data from Modern and Ancient DNA

    DEFF Research Database (Denmark)

    Korneliussen, Thorfinn Sand

    Due to the recent advances in DNA sequencing technology, genomic data are being generated at an unprecedented rate and we are gaining access to entire genomes at population level. The technology does, however, not give direct access to the genetic variation and the many levels of preprocessing...... that is required before being able to make inferences from the data introduces multiple levels of uncertainty, especially for low-depth data. Therefore methods that take into account the inherent uncertainty are needed for being able to make robust inferences in the downstream analysis of such data. This poses...... a problem for a range of key summary statistics within population genetics where existing methods are based on the assumption that the true genotypes are known. Motivated by this I present: 1) a new method for the estimation of relatedness between pairs of individuals, 2) a new method for estimating...

  18. 4P: fast computing of population genetics statistics from large DNA polymorphism panels.

    Science.gov (United States)

    Benazzo, Andrea; Panziera, Alex; Bertorelle, Giorgio

    2015-01-01

    Massive DNA sequencing has significantly increased the amount of data available for population genetics and molecular ecology studies. However, the parallel computation of simple statistics within and between populations from large panels of polymorphic sites is not yet available, making the exploratory analyses of a set or subset of data a very laborious task. Here, we present 4P (parallel processing of polymorphism panels), a stand-alone software program for the rapid computation of genetic variation statistics (including the joint frequency spectrum) from millions of DNA variants in multiple individuals and multiple populations. It handles a standard input file format commonly used to store DNA variation from empirical or simulation experiments. The computational performance of 4P was evaluated using large SNP (single nucleotide polymorphism) datasets from human genomes or obtained by simulations. 4P was faster or much faster than other comparable programs, and the impact of parallel computing using multicore computers or servers was evident. 4P is a useful tool for biologists who need a simple and rapid computer program to run exploratory population genetics analyses in large panels of genomic data. It is also particularly suitable to analyze multiple data sets produced in simulation studies. Unix, Windows, and MacOs versions are provided, as well as the source code for easier pipeline implementations.

  19. Genetic analysis

    NARCIS (Netherlands)

    Koornneef, M.; Alonso-Blanco, C.; Stam, P.

    2006-01-01

    The Mendelian analysis of genetic variation, available as induced mutants or as natural variation, requires a number of steps that are described in this chapter. These include the determination of the number of genes involved in the observed trait's variation, the determination of dominance

  20. Investigating the genetic relationship between Alzheimer’s disease and cancer using GWAS summary statistics

    NARCIS (Netherlands)

    Feng, Yen Chen Anne; Cho, Kelly; Lindstrom, Sara; Kraft, Peter; Cormack, Jean; Blalock, Kendra; Campbell, Peter T.; Casey, Graham; Conti, David V.; Edlund, Christopher K.; Figueiredo, Jane; James Gauderman, W.; Gong, Jian; Green, Roger C.; Gruber, Stephen B.; Harju, John F.; Harrison, Tabitha A.; Jacobs, Eric J; Jenkins, Mark A.; Jiao, Shuo; Li, Li; Lin, Yi; Manion, Frank J.; Moreno, Victor; Mukherjee, Bhramar; Peters, Ulrike; Raskin, Leon; Schumacher, Fredrick R.; Seminara, Daniela; Severi, Gianluca; Stenzel, Stephanie L.; Thomas, Duncan C.; Hopper, John L.; Southey, Melissa C.; Makalic, Enes; Schmidt, Daniel F.; Fletcher, Olivia; Peto, Julian; Gibson, Lorna; Dos-Santos-Silva, Isabel; Hunter, David J.; Lindström, Sara; Kraft, Peter; Ahsan, Habib; Whittemore, Alice S.; Waisfisz, Quinten; Meijers-Heijboer, Hanne; Adank, Muriel A.; van der Luijt, Rob B.; Uitterlinden, Andre G; Hofman, Albert; Meindl, Alfons; Schmutzler, Rita K.; Müller-Myhsok, Bertram; Lichtner, Peter; Nevanlinna, Heli; Muranen, Taru A.; Aittomäki, Kristiina; Blomqvist, Carl; Chang-Claude, Jenny; Hein, Rebecca; Dahmen, Norbert; Beckman, Lars; Crisponi, Laura; Hall, Per; Czene, Kamila; Irwanto, Astrid; Liu, Jianjun; Easton, Douglas F.; Turnbull, Clare A.; Rahman, Nazneen; Kote-Jarai, Zsofia; Muir, Kenneth; Giles, Graham G.; Severi, Gianluca; Neal, David E.; Donovan, Jenny L.; Hamdy, Freddie C.; Wiklund, Fredrik; Gronberg, Henrik; Haiman, Christopher; Schumacher, Fred; Travis, Ruth C.; Riboli, Elio; Kraft, Peter; Hunter, David J.; Gapstur, Susan M.; Berndt, Sonja I.; Chanock, Stephen J.; Han, Younghun; Su, Li; Wei, Yongyue; Hung, Rayjean J.; Brhane, Yonathan; McLaughlin, John; Brennan, Paul; McKay, James D.; Bickeböller, Heike; Rosenberger, Albert; Houlston, Richard S.; Caporaso, Neil E; Landi, Maria Teresa; Heinrich, Joachim; Risch, Angela; Wu, Xifeng; Ye, Yuanqing; Christiani, David C.; Amos, Christopher I; Liang, Liming; Driver, Jane A.; IGAP Consortium, Colorectal Transdisciplinary Study (CORECT); Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE)

    2017-01-01

    Growing evidence from both epidemiology and basic science suggests an inverse association between Alzheimer’s disease (AD) and cancer. We examined the genetic relationship between AD and various cancer types using GWAS summary statistics from the IGAP and GAME-ON consortia. Sample size ranged from

  1. Statistical methods for astronomical data analysis

    CERN Document Server

    Chattopadhyay, Asis Kumar

    2014-01-01

    This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminologies for astronomical phenomena will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for ...

  2. Arthritis Genetics Analysis Aids Drug Discovery

    Science.gov (United States)

    ... NIH Research Matters, January 13, 2014. Arthritis Genetics Analysis Aids Drug Discovery. An international research team identified 42 new ...

  3. Genetic markers as a predictive tool based on statistics in medical practice: ethical considerations through the analysis of the use of HLA-B27 in rheumatology in France

    Directory of Open Access Journals (Sweden)

    Hélène Colineaux

    2015-10-01

    INTRODUCTION. The use of genetic predictive markers in medical practice does not necessarily bear the same kind of medical and ethical consequences as that of genes directly involved in monogenic diseases. However, the French bioethics law framed in the same way the production and use of any genetic information. It seems therefore necessary to explore the practical and ethical context of the actual use of predictive markers in order to highlight their specific stakes. In this study, we document the uses of HLA-B*27, which are an interesting example of the multiple features of genetic predictive markers in general medical practice. MATERIAL & METHODS. The aims of this monocentric and qualitative study were to identify concrete and ethical issues of using the HLA-B*27 marker and the interests and limits of the legal framework as perceived by prescribers. In this regard, a thematic and descriptive analysis of five rheumatologists’ semi-structured and face-to-face interviews was performed. RESULTS. According to most of the interviewees, HLA-B*27 is an overframed test because they considered that this test is not really genetic or at least does not have the same nature as classical genetic tests; HLA-B*27 is not concerned by the ethical challenges of genetic tests; the major ethical stake of this marker is not linked to its genetic nature but rather to the complexity of the probabilistic information. This study also shows that HLA-B*27, validated for a certain usage, may be used in different ways in practice. DISCUSSION. This marker and its clinical uses underline the challenges of translating both statistical concepts and a unifying legal framework into clinical practice. This study identifies some new aspects and stakes of genetics in medicine and shows the need for additional studies about the use of predictive genetic markers, in order to provide a better basis for decisions and legal framework regarding these practices.

  4. The use of statistical tools in field testing of putative effects of genetically modified plants on nontarget organisms.

    Science.gov (United States)

    Semenov, Alexander V; Elsas, Jan Dirk; Glandorf, Debora C M; Schilthuizen, Menno; Boer, Willem F

    2013-08-01

    To fulfill existing guidelines, applicants that aim to place their genetically modified (GM) insect-resistant crop plants on the market are required to provide data from field experiments that address the potential impacts of the GM plants on nontarget organisms (NTOs). Such data may be based on varied experimental designs. The recent EFSA guidance document for environmental risk assessment (2010) does not provide clear and structured suggestions that address the statistics of field trials on effects on NTOs. This review examines existing practices in GM plant field testing such as the way of randomization, replication, and pseudoreplication. Emphasis is placed on the importance of design features used for the field trials in which effects on NTOs are assessed. The importance of statistical power and the positive and negative aspects of various statistical models are discussed. Equivalence and difference testing are compared, and the importance of checking the distribution of experimental data is stressed to decide on the selection of the proper statistical model. While for continuous data (e.g., pH and temperature) classical statistical approaches - for example, analysis of variance (ANOVA) - are appropriate, for discontinuous data (counts) only generalized linear models (GLM) are shown to be efficient. There is no golden rule as to which statistical test is the most appropriate for any experimental situation. In particular, in experiments in which block designs are used and covariates play a role, GLMs should be used. Generic advice is offered that will help in both the setting up of field testing and the interpretation and analysis of the data obtained in this testing. The combination of decision trees and a checklist for field trials, which are provided, will help in the interpretation of the statistical analyses of field trials and to assess whether such analyses were correctly applied. We offer generic advice to risk assessors and applicants that will

  5. Genetic algorithms and the analysis of SnIa data

    International Nuclear Information System (INIS)

    Nesseris, Savvas

    2011-01-01

    The Genetic Algorithm is a heuristic that can be used to produce model independent solutions to an optimization problem, thus making it ideal for use in cosmology and more specifically in the analysis of type Ia supernovae data. In this work we use the Genetic Algorithms (GA) in order to derive a null test on the spatially flat cosmological constant model ΛCDM. This is done in two steps: first, we apply the GA to the Constitution SNIa data in order to acquire a model independent reconstruction of the expansion history of the Universe H(z) and second, we use the reconstructed H(z) in conjunction with the Om statistic, which is constant only for the ΛCDM model, to derive our constraints. We find that while ΛCDM is consistent with the data at the 2σ level, some deviations from ΛCDM model at low redshifts can be accommodated.
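
    As a rough illustration of why the Om statistic can serve as a null test, the sketch below assumes the commonly used definition Om(z) = (h^2(z) - 1) / ((1 + z)^3 - 1) with h(z) = H(z)/H0. The abstract itself does not state the formula, and the function names and the w = -0.9 comparison model are hypothetical choices made for this example only.

```python
def hubble_sq(z, omega_m, w=-1.0):
    """Normalised expansion rate squared, h^2(z) = H^2(z) / H0^2.

    Assumes a spatially flat universe with matter plus dark energy of
    constant equation of state w; w = -1 is the LCDM case.
    """
    dark_energy = (1.0 - omega_m) * (1.0 + z) ** (3.0 * (1.0 + w))
    return omega_m * (1.0 + z) ** 3 + dark_energy


def om_statistic(z, h_sq):
    """Om(z) = (h^2(z) - 1) / ((1 + z)^3 - 1); undefined at z = 0."""
    return (h_sq - 1.0) / ((1.0 + z) ** 3 - 1.0)


def om_values(redshifts, omega_m, w=-1.0):
    """Evaluate Om(z) on a list of redshifts for the assumed model."""
    return [om_statistic(z, hubble_sq(z, omega_m, w)) for z in redshifts]


if __name__ == "__main__":
    zs = [0.1, 0.5, 1.0, 1.5]
    print("LCDM  :", om_values(zs, omega_m=0.3))          # flat in z
    print("w=-0.9:", om_values(zs, omega_m=0.3, w=-0.9))  # drifts with z
```

    Under these assumptions Om(z) equals the matter density parameter at every redshift for flat ΛCDM, so any redshift dependence in a reconstructed Om(z) would indicate a departure from that model.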

  6. Methods for meta-analysis of multiple traits using GWAS summary statistics.

    Science.gov (United States)

    Ray, Debashree; Boehnke, Michael

    2018-03-01

    Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses

  7. Genetic programming based models in plant tissue culture: An addendum to traditional statistical approach.

    Science.gov (United States)

    Mridula, Meenu R; Nair, Ashalatha S; Kumar, K Satheesh

    2018-02-01

    In this paper, we compared the efficacy of observation based modeling approach using a genetic algorithm with the regular statistical analysis as an alternative methodology in plant research. Preliminary experimental data on in vitro rooting was taken for this study with an aim to understand the effect of charcoal and naphthalene acetic acid (NAA) on successful rooting and also to optimize the two variables for maximum result. Observation-based modelling, as well as traditional approach, could identify NAA as a critical factor in rooting of the plantlets under the experimental conditions employed. Symbolic regression analysis using the software deployed here optimised the treatments studied and was successful in identifying the complex non-linear interaction among the variables, with minimalistic preliminary data. The presence of charcoal in the culture medium has a significant impact on root generation by reducing basal callus mass formation. Such an approach is advantageous for establishing in vitro culture protocols as these models will have significant potential for saving time and expenditure in plant tissue culture laboratories, and it further reduces the need for specialised background.

  8. Genetic diversity of grape germplasm as revealed by microsatellite ...

    African Journals Online (AJOL)

    aghomotsegin

    In this work, cluster analysis and principal component analysis (PCA) were used to study the genetic ... Key words: Vitis vinifera L., simple sequence repeat (SSR), genetic diversity, .... The data were used for the following statistical analyses.

  9. Genetic structure of populations and differentiation in forest trees

    Science.gov (United States)

    Raymond P. Guries; F. Thomas Ledig

    1981-01-01

    Electrophoretic techniques permit population biologists to analyze genetic structure of natural populations by using large numbers of allozyme loci. Several methods of analysis have been applied to allozyme data, including chi-square contingency tests, F-statistics, and genetic distance. This paper compares such statistics for pitch pine (Pinus rigida...

  10. Microsatellite data analysis for population genetics

    Science.gov (United States)

    Theories and analytical tools of population genetics have been widely applied for addressing various questions in the fields of ecological genetics, conservation biology, and any context where the role of dispersal or gene flow is important. Underlying much of population genetics is the analysis of ...

  11. The Analysis of Polyploid Genetic Data

    NARCIS (Netherlands)

    Meirmans, P.G.; Liu, S.; van Tienderen, P.H.

    2018-01-01

    Though polyploidy is an important aspect of the evolutionary genetics of both plants and animals, the development of population genetic theory of polyploids has seriously lagged behind that of diploids. This is unfortunate since the analysis of polyploid genetic data—and the interpretation of the

  12. Statistical design of personalized medicine interventions: The Clarification of Optimal Anticoagulation through Genetics (COAG) trial

    Directory of Open Access Journals (Sweden)

    Gage Brian F

    2010-11-01

    Background: There is currently much interest in pharmacogenetics: determining variation in genes that regulate drug effects, with a particular emphasis on improving drug safety and efficacy. The ability to determine such variation motivates the application of personalized drug therapies that utilize a patient's genetic makeup to determine a safe and effective drug at the correct dose. To ascertain whether a genotype-guided drug therapy improves patient care, a personalized medicine intervention may be evaluated within the framework of a randomized controlled trial. The statistical design of this type of personalized medicine intervention requires special considerations: the distribution of relevant allelic variants in the study population; and whether the pharmacogenetic intervention is equally effective across subpopulations defined by allelic variants. Methods: The statistical design of the Clarification of Optimal Anticoagulation through Genetics (COAG) trial serves as an illustrative example of a personalized medicine intervention that uses each subject's genotype information. The COAG trial is a multicenter, double blind, randomized clinical trial that will compare two approaches to initiation of warfarin therapy: genotype-guided dosing, the initiation of warfarin therapy based on algorithms using clinical information and genotypes for polymorphisms in CYP2C9 and VKORC1; and clinical-guided dosing, the initiation of warfarin therapy based on algorithms using only clinical information. Results: We determine an absolute minimum detectable difference of 5.49% based on an assumed 60% population prevalence of zero or multiple genetic variants in either CYP2C9 or VKORC1 and an assumed 15% relative effectiveness of genotype-guided warfarin initiation for those with zero or multiple genetic variants. Thus we calculate a sample size of 1238 to achieve a power level of 80% for the primary outcome. We show that reasonable departures from these

  13. An investigation of the statistical power of neutrality tests based on comparative and population genetic data

    DEFF Research Database (Denmark)

    Zhai, Weiwei; Nielsen, Rasmus; Slatkin, Montgomery

    2009-01-01

    In this report, we investigate the statistical power of several tests of selective neutrality based on patterns of genetic diversity within and between species. The goal is to compare tests based solely on population genetic data with tests using comparative data or a combination of comparative...... and population genetic data. We show that in the presence of repeated selective sweeps on relatively neutral background, tests based on the d(N)/d(S) ratios in comparative data almost always have more power to detect selection than tests based on population genetic data, even if the overall level of divergence...... selection. The Hudson-Kreitman-Aguadé test is the most powerful test for detecting positive selection among the population genetic tests investigated, whereas McDonald-Kreitman test typically has more power to detect negative selection. We discuss our findings in the light of the discordant results obtained...

  14. A Statistical Toolkit for Data Analysis

    International Nuclear Information System (INIS)

    Donadio, S.; Guatelli, S.; Mascialino, B.; Pfeiffer, A.; Pia, M.G.; Ribon, A.; Viarengo, P.

    2006-01-01

    The present project aims to develop an open-source and object-oriented software Toolkit for statistical data analysis. Its statistical testing component contains a variety of Goodness-of-Fit tests, from Chi-squared to Kolmogorov-Smirnov, to less known, but generally much more powerful tests such as Anderson-Darling, Goodman, Fisz-Cramer-von Mises, Kuiper, Tiku. Thanks to the component-based design and the usage of the standard abstract interfaces for data analysis, this tool can be used by other data analysis systems or integrated in experimental software frameworks. This Toolkit has been released and is downloadable from the web. In this paper we describe the statistical details of the algorithms, the computational features of the Toolkit and the code validation.

  15. An analysis of the genetic diversity and genetic structure of ...

    African Journals Online (AJOL)

    Scientific approaches to conservation of threatened species depend on a good understanding of the genetic information of wild and artificial populations. In this paper, the genetic diversity and structure of 10 Eucommia ulmoides populations were analyzed using inter-simple sequence repeat (ISSR) markers.

  16. Statistical considerations on safety analysis

    International Nuclear Information System (INIS)

    Pal, L.; Makai, M.

    2004-01-01

    The authors have investigated the statistical methods applied to safety analysis of nuclear reactors and arrived at alarming conclusions: a series of calculations with the generally appreciated safety code ATHLET were carried out to ascertain the stability of the results against input uncertainties in a simple experimental situation. Scrutinizing those calculations, we came to the conclusion that the ATHLET results may exhibit chaotic behavior. A further conclusion is that the technological limits are incorrectly set when the output variables are correlated. Another formerly unnoticed conclusion of the previous ATHLET calculations is that certain innocent looking parameters (like wall roughness factor, the number of bubbles per unit volume, the number of droplets per unit volume) can considerably influence such output parameters as water levels. The authors are concerned with the statistical foundation of present day safety analysis practices and can only hope that their own misjudgment will be dispelled. Until then, the authors suggest applying correct statistical methods in safety analysis even if it makes the analysis more expensive. It would be desirable to continue exploring the role of internal parameters (wall roughness factor, steam-water surface in thermal hydraulics codes, homogenization methods in neutronics codes) in system safety codes and to study their effects on the analysis. In the validation and verification process of a code one carries out a series of computations. The input data are not precisely determined because measured data have an error and calculated data are often obtained from a more or less accurate model. Some users of large codes are content with comparing the nominal output obtained from the nominal input, whereas all the possible inputs should be taken into account when judging safety. At the same time, any statement concerning safety must be aleatory, and its merit can be judged only when the probability is known with which the

  17. Statistical shape analysis with applications in R

    CERN Document Server

    Dryden, Ian L

    2016-01-01

    A thoroughly revised and updated edition of this introduction to modern statistical methods for shape analysis. Shape analysis is an important tool in the many disciplines where objects are compared using geometrical features. Examples include comparing brain shape in schizophrenia; investigating protein molecules in bioinformatics; and describing growth of organisms in biology. This book is a significant update of the highly-regarded ‘Statistical Shape Analysis’ by the same authors. The new edition lays the foundations of landmark shape analysis, including geometrical concepts and statistical techniques, and extends to include analysis of curves, surfaces, images and other types of object data. Key definitions and concepts are discussed throughout, and the relative merits of different approaches are presented. The authors have included substantial new material on recent statistical developments and offer numerous examples throughout the text. Concepts are introduced in an accessible manner, while reta...

  18. Spatial analysis statistics, visualization, and computational methods

    CERN Document Server

    Oyana, Tonny J

    2015-01-01

    An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis, containing hands-on problem sets that can be worked out in MS Excel or ArcGIS, as well as detailed illustrations and numerous case studies. The book enables readers to: identify types and characterize non-spatial and spatial data; demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results; construct testable hypotheses that require inferential statistical analysis; process spatial data, extract explanatory variables, conduct statisti...

  19. Genetic variability of indigenous cowpea genotypes as determined ...

    African Journals Online (AJOL)

    Bayesian statistics coupled with the Markov chain Monte Carlo technique was applied to determine population structure, while the genetic variability was established by analysis of molecular variance. UPGMA analysis allowed the separation of the genotypes into three groups, but no relationship between the genetic and ...

  20. Heuristic versus statistical physics approach to optimization problems

    International Nuclear Information System (INIS)

    Jedrzejek, C.; Cieplinski, L.

    1995-01-01

    Optimization is a crucial ingredient of many calculation schemes in science and engineering. In this paper we assess several classes of methods: heuristic algorithms, methods directly relying on statistical physics such as the mean-field method and simulated annealing; and Hopfield-type neural networks and genetic algorithms partly related to statistical physics. We perform the analysis for three types of problems: (1) the Travelling Salesman Problem, (2) vector quantization, and (3) traffic control problem in multistage interconnection network. In general, heuristic algorithms perform better (except for genetic algorithms) and much faster but have to be specific for every problem. The key to improving the performance could be to include heuristic features into general purpose statistical physics methods. (author)

  1. Statistical power to detect genetic (co)variance of complex traits using SNP data in unrelated samples.

    Directory of Open Access Journals (Sweden)

    Peter M Visscher

    2014-04-01

    We have recently developed analysis methods (GREML) to estimate the genetic variance of a complex trait/disease and the genetic correlation between two complex traits/diseases using genome-wide single nucleotide polymorphism (SNP) data in unrelated individuals. Here we use analytical derivations and simulations to quantify the sampling variance of the estimate of the proportion of phenotypic variance captured by all SNPs for quantitative traits and case-control studies. We also derive the approximate sampling variance of the estimate of a genetic correlation in a bivariate analysis, when two complex traits are either measured on the same or different individuals. We show that the sampling variance is inversely proportional to the number of pairwise contrasts in the analysis and to the variance in SNP-derived genetic relationships. For bivariate analysis, the sampling variance of the genetic correlation additionally depends on the harmonic mean of the proportion of variance explained by the SNPs for the two traits and the genetic correlation between the traits, and depends on the phenotypic correlation when the traits are measured on the same individuals. We provide an online tool for calculating the power of detecting genetic (co)variation using genome-wide SNP data. The new theory and online tool will be helpful to plan experimental designs to estimate the missing heritability that has not yet been fully revealed through genome-wide association studies, and to estimate the genetic overlap between complex traits (diseases), in particular when the traits (diseases) are not measured on the same samples.
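
    The proportionality stated in this abstract can be turned into a back-of-the-envelope calculation. The sketch below is an assumption-laden illustration, not the authors' online tool: it takes the sampling variance to be exactly 1 / (number of pairs x variance of relationships), which fixes a proportionality constant the abstract does not give, and it uses a default relationship variance of 2e-5 purely as a placeholder value. The function name and parameters are hypothetical.

```python
import math


def se_snp_heritability(n_individuals, var_relationship=2e-5):
    """Approximate standard error of a SNP-heritability estimate.

    Assumption: var(estimate) = 1 / (n_pairs * var_relationship), i.e.
    inversely proportional to the number of pairwise contrasts and to
    the variance of SNP-derived relationships, with constant 1.
    var_relationship = 2e-5 is an illustrative placeholder only.
    """
    n_pairs = n_individuals * (n_individuals - 1) / 2
    return math.sqrt(1.0 / (n_pairs * var_relationship))


if __name__ == "__main__":
    for n in (2500, 5000, 10000, 20000):
        print(n, round(se_snp_heritability(n), 4))
```

    Because the number of pairs grows roughly with the square of the sample size, doubling the number of individuals approximately halves the standard error under these assumptions.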

  2. Smoking and caffeine consumption: a genetic analysis of their association.

    Science.gov (United States)

    Treur, Jorien L; Taylor, Amy E; Ware, Jennifer J; Nivard, Michel G; Neale, Michael C; McMahon, George; Hottenga, Jouke-Jan; Baselmans, Bart M L; Boomsma, Dorret I; Munafò, Marcus R; Vink, Jacqueline M

    2017-07-01

    Smoking and caffeine consumption show a strong positive correlation, but the mechanism underlying this association is unclear. Explanations include shared genetic/environmental factors or causal effects. This study employed three methods to investigate the association between smoking and caffeine. First, bivariate genetic models were applied to data of 10 368 twins from the Netherlands Twin Register in order to estimate genetic and environmental correlations between smoking and caffeine use. Second, from the summary statistics of meta-analyses of genome-wide association studies on smoking and caffeine, the genetic correlation was calculated by LD-score regression. Third, causal effects were tested using Mendelian randomization analysis in 6605 Netherlands Twin Register participants and 5714 women from the Avon Longitudinal Study of Parents and Children. Through twin modelling, a genetic correlation of r = 0.47 and an environmental correlation of r = 0.30 were estimated between current smoking (yes/no) and coffee use (high/low). Between current smoking and total caffeine use, this was r = 0.44 and r = 0.00, respectively. LD-score regression also indicated sizeable genetic correlations between smoking and coffee use (r = 0.44 between smoking heaviness and cups of coffee per day, r = 0.28 between smoking initiation and coffee use and r = 0.25 between smoking persistence and coffee use). Consistent with the relatively high genetic correlations and lower environmental correlations, Mendelian randomization provided no evidence for causal effects of smoking on caffeine or vice versa. Genetic factors thus explain most of the association between smoking and caffeine consumption. These findings suggest that quitting smoking may be more difficult for heavy caffeine consumers, given their genetic susceptibility. © 2016 The Authors. Addiction Biology published by John Wiley & Sons Ltd on behalf of Society for the Study of Addiction.

  3. Genetics researchers’ and IRB professionals’ attitudes toward genetic research review: a comparative analysis

    Science.gov (United States)

    Edwards, Karen L.; Lemke, Amy A.; Trinidad, Susan B.; Lewis, Susan M.; Starks, Helene; Snapinn, Katherine W.; Griffin, Mary Quinn; Wiesner, Georgia L.; Burke, Wylie

    2012-01-01

    Purpose: Genetic research involving human participants can pose challenging questions related to ethical and regulatory standards for research oversight. However, few empirical studies describe how genetic researchers and institutional review board (IRB) professionals conceptualize ethical issues in genetic research or where common ground might exist. Methods: Parallel online surveys collected information from human genetic researchers (n = 351) and IRB professionals (n = 208) regarding their views about human participant oversight for genetic protocols. Results: A range of opinions was observed within groups on most issues. In both groups, a minority thought it likely that people would be harmed by participation in genetic research or identified from coded genetic data. A majority of both groups agreed that reconsent should be required for four of the six scenarios presented. Statistically significant differences were observed between groups on some issues, with more genetic researcher respondents trusting the confidentiality of coded data, fewer expecting harms from reidentification, and fewer considering reconsent necessary in certain scenarios. Conclusions: The range of views observed within and between IRB and genetic researcher groups highlights the complexity and unsettled nature of many ethical issues in genome research. Our findings also identify areas where researcher and IRB views diverge and areas of common ground. PMID:22241102

  4. Genetic analysis in Bartter syndrome from India.

    Science.gov (United States)

    Sharma, Pradeep Kumar; Saikia, Bhaskar; Sharma, Rachna; Ankur, Kumar; Khilnani, Praveen; Aggarwal, Vinay Kumar; Cheong, Hae

    2014-10-01

    Bartter syndrome is a group of inherited, salt-losing tubulopathies presenting as hypokalemic metabolic alkalosis with normotensive hyperreninemia and hyperaldosteronism. Around 150 cases have been reported in the literature to date. Mutations leading to salt-losing tubulopathies are not routinely tested for in the Indian population. The authors performed genetic analysis of Bartter syndrome for the first time in two cases from India. The first case was antenatal Bartter syndrome presenting with massive polyuria and hyperkalemia. Mutational analysis revealed compound heterozygous mutations in the KCNJ1 (ROMK) gene [p(Leu220Phe), p(Thr191Pro)]. The second case had a phenotypic presentation of classical Bartter syndrome; however, genetic analysis revealed only a novel heterozygous mutation in the SLC12A gene, p(Ala232Thr). Bartter syndrome is a clinical diagnosis, and genetic analysis is recommended for prognostication and genetic counseling.

  5. Application of descriptive statistics in analysis of experimental data

    OpenAIRE

    Mirilović Milorad; Pejin Ivana

    2008-01-01

    Statistics today represent a group of scientific methods for the quantitative and qualitative investigation of variations in mass appearances. In fact, statistics present a group of methods that are used for the accumulation, analysis, presentation and interpretation of data necessary for reaching certain conclusions. Statistical analysis is divided into descriptive statistical analysis and inferential statistics. The values which represent the results of an experiment, and which are the subj...

  6. Gregor Mendel, His Experiments and Their Statistical Evaluation

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2014-01-01

    Vol. 99, No. 1 (2014), pp. 87-99 ISSN 1211-8788 Institutional support: RVO:67985807 Keywords: Mendel * history of genetics * Mendel-Fisher controversy * statistical analysis * binomial distribution * numerical simulation Subject RIV: BB - Applied Statistics, Operational Research http://www.mzm.cz/fileadmin/user_upload/publikace/casopisy/amm_sb_99_1_2014/08kalina.pdf

  7. Statistical Analysis of Research Data | Center for Cancer Research

    Science.gov (United States)

    Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Analysis of Research Data (SARD) course will be held on April 5-6, 2018 from 9 a.m.-5 p.m. at the National Institutes of Health's Natcher Conference Center, Balcony C on the Bethesda Campus. SARD is designed to provide an overview on the general principles of statistical analysis of research data. The first day will feature univariate data analysis, including descriptive statistics, probability distributions, one- and two-sample inferential statistics.

  8. Statistical analysis with Excel for dummies

    CERN Document Server

    Schmuller, Joseph

    2013-01-01

    Take the mystery out of statistical terms and put Excel to work! If you need to create and interpret statistics in business or classroom settings, this easy-to-use guide is just what you need. It shows you how to use Excel's powerful tools for statistical analysis, even if you've never taken a course in statistics. Learn the meaning of terms like mean and median, margin of error, standard deviation, and permutations, and discover how to interpret the statistics of everyday life. You'll learn to use Excel formulas, charts, PivotTables, and other tools to make sense of everything fro

  9. Statistical analysis of dynamic parameters of the core

    International Nuclear Information System (INIS)

    Ionov, V.S.

    2007-01-01

    Transients of various types were investigated for the cores of zero-power critical facilities at RRC KI and NPP. Dynamic parameters of the neutron transients were explored by means of statistical analysis. The transients have sufficient duration, a few channels for chamber currents and reactivity, and also some channels for technological parameters. From these values the inverse period, reactivity, neutron lifetime, reactivity coefficients and some reactivity effects are determined, and from them the values of the measured dynamic parameters were reconstructed as a result of the analysis. The following mathematical tools of statistical analysis were used: approximation (A), filtration (F), rejection (R), estimation of parameters of descriptive statistics (DSP), correlation characteristics (kk), regression analysis (KP), prognosis (P), and statistical criteria (SC). The calculation procedures were implemented in MATLAB. The sources of methodical and statistical errors are presented: inadequacy of the model, precision of neutron-physical parameters, features of the registered processes, the mathematical model used in reactivity meters, the processing technique for registered data, etc. Examples of results of the statistical analysis are given. Problems of the validity of the methods used for the definition and certification of values of statistical parameters and dynamic characteristics are considered (Authors)

  10. An efficient Bayesian meta-analysis approach for studying cross-phenotype genetic associations.

    Directory of Open Access Journals (Sweden)

    Arunabha Majumdar

    2018-02-01

    Simultaneous analysis of genetic associations with multiple phenotypes may reveal shared genetic susceptibility across traits (pleiotropy). For a locus exhibiting overall pleiotropy, it is important to identify which specific traits underlie this association. We propose a Bayesian meta-analysis approach (termed CPBayes) that uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. This method uses a unified Bayesian statistical framework based on a spike and slab prior. CPBayes performs a fully Bayesian analysis by employing the Markov Chain Monte Carlo (MCMC) technique Gibbs sampling. It takes into account heterogeneity in the size and direction of the genetic effects across traits. It can be applied to both cohort data and separate studies of multiple traits having overlapping or non-overlapping subjects. Simulations show that CPBayes can produce higher accuracy in the selection of associated traits underlying a pleiotropic signal than the subset-based meta-analysis ASSET. We used CPBayes to undertake a genome-wide pleiotropic association study of 22 traits in the large Kaiser GERA cohort and detected six independent pleiotropic loci associated with at least two phenotypes. This includes a locus at chromosomal region 1q24.2 which exhibits an association simultaneously with the risk of five different diseases: Dermatophytosis, Hemorrhoids, Iron Deficiency, Osteoporosis and Peripheral Vascular Disease. We provide an R-package 'CPBayes' implementing the proposed method.

  11. CONFIDENCE LEVELS AND/VS. STATISTICAL HYPOTHESIS TESTING IN STATISTICAL ANALYSIS. CASE STUDY

    Directory of Open Access Journals (Sweden)

    ILEANA BRUDIU

    2009-05-01

    Parameter estimation with confidence intervals and statistical hypothesis testing are used in statistical analysis to draw conclusions about a population from a sample extracted from it. The case study presented in this paper aims to highlight the importance of the sample size taken in the study and how it is reflected in the results obtained when using confidence intervals and testing for pregnant [subjects]. Whereas statistical hypothesis testing only gives a "yes" or "no" answer to some questions, statistical estimation using confidence intervals provides more information than a test statistic: it shows the high degree of uncertainty arising from small samples and from findings described as "marginally significant" or "almost significant" (p very close to 0.05).
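
    The sample-size point made in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the 60% observed proportion, the null value of 0.5, the choice of a Wald interval and a z-test, and the function names are all assumptions made for illustration.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Wald (normal-approximation) confidence interval for a proportion.

    z=1.96 gives an approximate 95% interval.
    """
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

def two_sided_p_value(successes, n, p0=0.5):
    """Two-sided p-value of a one-sample z-test of H0: p = p0."""
    p_hat = successes / n
    z_stat = (p_hat - p0) / math.sqrt(p0 * (1.0 - p0) / n)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z_stat) / math.sqrt(2.0))

if __name__ == "__main__":
    # Hypothetical data: the same 60% observed proportion at three sample sizes.
    for n in (20, 200, 2000):
        successes = round(0.6 * n)
        low, high = proportion_ci(successes, n)
        p = two_sided_p_value(successes, n)
        print(f"n={n:5d}  95% CI=({low:.3f}, {high:.3f})  p={p:.4f}")
```

    With the same observed proportion, the interval narrows as n grows while the test moves from non-significant to significant. The interval shows how much uncertainty remains, which a bare "yes" or "no" from the test does not.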

  12. Collecting operational event data for statistical analysis

    International Nuclear Information System (INIS)

    Atwood, C.L.

    1994-09-01

    This report gives guidance for collecting operational data to be used for statistical analysis, especially analysis of event counts. It discusses how to define the purpose of the study, the unit (system, component, etc.) to be studied, events to be counted, and demand or exposure time. Examples are given of classification systems for events in the data sources. A checklist summarizes the essential steps in data collection for statistical analysis

  13. Guidelines for collecting and maintaining archives for genetic monitoring

    Science.gov (United States)

    Jennifer A. Jackson; Linda Laikre; C. Scott Baker; Katherine C. Kendall; F. W. Allendorf; M. K. Schwartz

    2011-01-01

    Rapid advances in molecular genetic techniques and the statistical analysis of genetic data have revolutionized the way that populations of animals, plants and microorganisms can be monitored. Genetic monitoring is the practice of using molecular genetic markers to track changes in the abundance, diversity or distribution of populations, species or ecosystems over time...

  14. Statistics and analysis of scientific data

    CERN Document Server

    Bonamente, Massimiliano

    2013-01-01

    Statistics and Analysis of Scientific Data covers the foundations of probability theory and statistics, and a number of numerical and analytical methods that are essential for the present-day analyst of scientific data. Topics covered include probability theory, distribution functions of statistics, fits to two-dimensional datasheets and parameter estimation, Monte Carlo methods and Markov chains. Equal attention is paid to the theory and its practical application, and results from classic experiments in various fields are used to illustrate the importance of statistics in the analysis of scientific data. The main pedagogical method is a theory-then-application approach, where emphasis is placed first on a sound understanding of the underlying theory of a topic, which becomes the basis for an efficient and proactive use of the material for practical applications. The level is appropriate for undergraduates and beginning graduate students, and as a reference for the experienced researcher. Basic calculus is us...

  15. AMOVA-based clustering of population genetic data

    NARCIS (Netherlands)

    Meirmans, P.G.

    2012-01-01

    Determining the genetic structure of populations is becoming an increasingly important aspect of genetic studies. One of the most frequently used methods is the calculation of F-statistics using an Analysis of Molecular Variance (AMOVA). However, this has the drawback that the population hierarchy

  16. Rapid Genetic Analysis in Congenital Hyperinsulinism

    DEFF Research Database (Denmark)

    Christesen, Henrik Thybo; Brusgaard, Klaus; Alm, Jan

    2007-01-01

    BACKGROUND: In severe, medically unresponsive congenital hyperinsulinism (CHI), the histological differentiation of focal versus diffuse disease is vital, since the surgical management is completely different. Genetic analysis may help in the differential diagnosis, as focal CHI is associated...... with a paternal germline ABCC8 or KCNJ11 mutation and a focal loss of maternal chromosome 11p15, whereas a maternal mutation, or homozygous/compound heterozygous ABCC8 and KCNJ11 mutations predict diffuse-type disease. However, genotyping usually takes too long to be helpful in the absence of a founder mutation....... METHODS: In 4 patients, a rapid genetic analysis of the ABCC8 and KCNJ11 genes was performed within 2 weeks on request prior to the decision of pancreatic surgery. RESULTS: Two patients had no mutations, rendering the genetic analysis non-informative. Peroperative multiple biopsies showed diffuse disease...

  17. Methods for statistical data analysis of multivariate observations

    CERN Document Server

    Gnanadesikan, R

    1997-01-01

    A practical guide for multivariate statistical techniques, now updated and revised. In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte

  18. Advances in statistical models for data analysis

    CERN Document Server

    Minerva, Tommaso; Vichi, Maurizio

    2015-01-01

    This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.

  19. Statistical power and utility of meta-analysis methods for cross-phenotype genome-wide association studies.

    Science.gov (United States)

    Zhu, Zhaozhong; Anttila, Verneri; Smoller, Jordan W; Lee, Phil H

    2018-01-01

    Advances in recent genome wide association studies (GWAS) suggest that pleiotropic effects on human complex traits are widespread. A number of classic and recent meta-analysis methods have been used to identify genetic loci with pleiotropic effects, but the overall performance of these methods is not well understood. In this work, we use extensive simulations and case studies of GWAS datasets to investigate the power and type-I error rates of ten meta-analysis methods. We specifically focus on three conditions commonly encountered in the studies of multiple traits: (1) extensive heterogeneity of genetic effects; (2) characterization of trait-specific association; and (3) inflated correlation of GWAS due to overlapping samples. Although the statistical power is highly variable under distinct study conditions, we found the superior power of several methods under diverse heterogeneity. In particular, classic fixed-effects model showed surprisingly good performance when a variant is associated with more than a half of study traits. As the number of traits with null effects increases, ASSET performed the best along with competitive specificity and sensitivity. With opposite directional effects, CPASSOC featured the first-rate power. However, caution is advised when using CPASSOC for studying genetically correlated traits with overlapping samples. We conclude with a discussion of unresolved issues and directions for future research.
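
    The classic fixed-effects model mentioned in this abstract can be sketched as an inverse-variance weighted average. The Python below is a minimal illustration, not the authors' pipeline: it assumes independent per-trait effect estimates (no overlapping samples), and the function name and the numbers are hypothetical.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis.

    betas: per-trait (or per-study) effect estimates for one variant.
    ses:   their standard errors, assumed independent of each other.
    Returns the pooled effect, its standard error, the z statistic and
    the two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_stat = pooled_beta / pooled_se
    p_value = math.erfc(abs(z_stat) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z_stat, p_value

if __name__ == "__main__":
    # Hypothetical estimates: same-direction effects reinforce each other...
    print(fixed_effects_meta([0.10, 0.12, 0.08], [0.05, 0.05, 0.05]))
    # ...while opposite-direction effects of equal size cancel out.
    print(fixed_effects_meta([0.10, -0.10], [0.05, 0.05]))
```

    The second call shows why a plain fixed-effects model loses power when a variant has opposite directional effects across traits: the pooled estimate is pulled toward zero even though each trait is individually associated.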

  20. Comparison of Pyrolysis Mass Spectrometry and Near Infrared Spectroscopy for Genetic Analysis of Lignocellulose Chemical Composition in Populus

    Directory of Open Access Journals (Sweden)

    Jianxing Zhang

    2014-03-01

    Genetic analysis of wood chemical composition is often limited by the cost and throughput of direct analytical methods. The speed and low cost of Fourier transform near infrared (FT-NIR) overcomes many of these limitations, but it is an indirect method relying on calibration models that are typically developed and validated with small sample sets. In this study, we used >1500 young greenhouse grown trees from a clonally propagated single Populus family, grown at low and high nitrogen, and compared FT-NIR calibration sample sizes of 150, 250, 500 and 750 on calibration and prediction model statistics, and heritability estimates developed with pyrolysis molecular beam mass spectrometry (pyMBMS) wood chemical composition. As calibration sample size increased from 150 to 750, predictive model statistics improved slightly. Overall, stronger calibration and prediction statistics were obtained with lignin, S-lignin, S/G ratio, and m/z 144 (an ion from cellulose), than with C5 and C6 carbohydrates, and m/z 114 (an ion from xylan). Although small differences in model statistics were observed between the 250 and 500 sample calibration sets, when predicted values were used for calculating genetic control, the 500 sample set gave substantially more similar results to those obtained with the pyMBMS data. With the 500 sample calibration models, genetic correlations obtained with FT-NIR and pyMBMS methods were similar. Quantitative trait loci (QTL) analysis with pyMBMS and FT-NIR predictions identified only three common loci for lignin traits. FT-NIR identified four QTLs that were not found with pyMBMS data, and these QTLs were for the less well predicted carbohydrate traits.

  1. Application of Statistical Tools for Data Analysis and Interpretation in Rice Plant Pathology

    Directory of Open Access Journals (Sweden)

    Parsuram Nayak

    2018-01-01

    There has been a significant advancement in the application of statistical tools in plant pathology during the past four decades. These tools include multivariate analysis of disease dynamics involving principal component analysis, cluster analysis, factor analysis, pattern analysis, discriminant analysis, multivariate analysis of variance, correspondence analysis, canonical correlation analysis, redundancy analysis, genetic diversity analysis, and stability analysis, which involves joint regression, additive main effects and multiplicative interactions, and genotype-by-environment interaction biplot analysis. The advanced statistical tools, such as non-parametric analysis of disease association, meta-analysis, Bayesian analysis, and decision theory, take an important place in analysis of disease dynamics. Disease forecasting methods by simulation models for plant diseases have great potential in practical disease control strategies. Common mathematical tools such as monomolecular, exponential, logistic, Gompertz and linked differential equations take an important place in growth curve analysis of disease epidemics. The highly informative means of displaying a range of numerical data through construction of box and whisker plots has been suggested. The probable applications of recent advanced tools of linear and non-linear mixed models like the linear mixed model, generalized linear model, and generalized linear mixed models have been presented. The most recent technologies such as micro-array analysis, though cost effective, provide estimates of gene expressions for thousands of genes simultaneously and need attention by the molecular biologists. Some of these advanced tools can be well applied in different branches of rice research, including crop improvement, crop production, crop protection, social sciences as well as agricultural engineering. The rice research scientists should take advantage of these new opportunities adequately in

  2. Statistical models and methods for reliability and survival analysis

    CERN Document Server

    Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Limnios, Nikolaos; Gerville-Reache, Leo

    2013-01-01

    Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical

  3. Classification, (big) data analysis and statistical learning

    CERN Document Server

    Conversano, Claudio; Vichi, Maurizio

    2018-01-01

    This edited book focuses on the latest developments in classification, statistical learning, data analysis and related areas of data science, including statistical analysis of large datasets, big data analytics, time series clustering, integration of data from different sources, as well as social networks. It covers both methodological aspects as well as applications to a wide range of areas such as economics, marketing, education, social sciences, medicine, environmental sciences and the pharmaceutical industry. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field. The peer-reviewed contributions were presented at the 10th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in Santa Margherita di Pul...

  4. Statistical hot spot analysis of reactor cores

    International Nuclear Information System (INIS)

    Schaefer, H.

    1974-05-01

    This report is an introduction to statistical hot spot analysis. After the definition of the term 'hot spot', a statistical analysis is outlined. The mathematical method is presented; in particular, the formula for the probability of no hot spots in a reactor core is evaluated. The boundary conditions of a statistical hot spot analysis are discussed (technological limits, nominal situation, uncertainties). The application of the hot spot analysis to the linear power of pellets and the temperature rise in cooling channels is demonstrated with respect to the test zone of KNK II. Basic values, such as probability of no hot spots, hot spot potential, expected hot spot diagram and cumulative distribution function of hot spots, are discussed. It is shown that the risk of hot channels can be dispersed equally over all subassemblies by an adequate choice of the nominal temperature distribution in the core

  5. Predicting Flowering Behavior and Exploring Its Genetic Determinism in an Apple Multi-family Population Based on Statistical Indices and Simplified Phenotyping.

    Science.gov (United States)

    Durand, Jean-Baptiste; Allard, Alix; Guitton, Baptiste; van de Weg, Eric; Bink, Marco C A M; Costes, Evelyne

    2017-01-01

    Irregular flowering over years is commonly observed in fruit trees. The early prediction of tree behavior is highly desirable in breeding programmes. This study aims at performing such predictions, combining simplified phenotyping and statistics methods. Sequences of vegetative vs. floral annual shoots (AS) were observed along axes in trees belonging to five apple related full-sib families. Sequences were analyzed using Markovian and linear mixed models including year and site effects. Indices of flowering irregularity, periodicity and synchronicity were estimated, at tree and axis scales. They were used to predict tree behavior and detect QTL with a Bayesian pedigree-based analysis, using an integrated genetic map containing 6,849 SNPs. The combination of a Biennial Bearing Index (BBI) with an autoregressive coefficient (γ_g) efficiently predicted and classified the genotype behaviors, despite few misclassifications. Four QTLs common to BBIs and γ_g and one for synchronicity were highlighted and revealed the complex genetic architecture of the traits. Irregularity resulted from high AS synchronism, whereas regularity resulted from either asynchronous locally alternating or continual regular AS flowering. A relevant and time-saving method, based on a posteriori sampling of axes and statistical indices is proposed, which is efficient to evaluate the tree breeding values for flowering regularity and could be transferred to other species.

  6. The statistical analysis of anisotropies

    International Nuclear Information System (INIS)

    Webster, A.

    1977-01-01

    One of the many uses to which a radio survey may be put is an analysis of the distribution of the radio sources on the celestial sphere to find out whether they are bunched into clusters or lie in preferred regions of space. There are many methods of testing for clustering in point processes and since they are not all equally good this contribution is presented as a brief guide to what seems to be the best of them. The radio sources certainly do not show very strong clustering and may well be entirely unclustered, so if a statistical method is to be useful it must be both powerful and flexible. A statistic is powerful in this context if it can efficiently distinguish a weakly clustered distribution of sources from an unclustered one, and it is flexible if it can be applied in a way which avoids mistaking defects in the survey for true peculiarities in the distribution of sources. The paper divides clustering statistics into two classes: number density statistics and log N/log S statistics. (Auth.)

  7. Basic statistical tools in research and data analysis

    Directory of Open Access Journals (Sweden)

    Zulfiqar Ali

    2016-01-01

    Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.

  8. Analysis of genetic diversity in pigeonpea germplasm using ...

    Indian Academy of Sciences (India)

    Navya

    2016-11-25

    Nov 25, 2016 ... accessions from Orissa (105) and AP (15) do not group with any Indian accessions. ... In the present work, comparison between SSAP and REMAP revealed ... (sequence-specific amplified polymorphism) for genetic analysis of sweet potato. ... Sharma, V. and Nandinemi, M.R. 2014 Assessment of genetic ...

  9. Reproducible statistical analysis with multiple languages

    DEFF Research Database (Denmark)

    Lenth, Russell; Højsgaard, Søren

    2011-01-01

    This paper describes the system for making reproducible statistical analyses. differs from other systems for reproducible analysis in several ways. The two main differences are: (1) Several statistics programs can be used in the same document. (2) Documents can be prepared using OpenOffice or ......Office or LaTeX. The main part of this paper is an example showing how to use and together in an OpenOffice text document. The paper also contains some practical considerations on the use of literate programming in statistics....

  10. Common pitfalls in statistical analysis: "P" values, statistical significance and confidence intervals

    Directory of Open Access Journals (Sweden)

    Priya Ranganathan

    2015-01-01

    In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the 'P' value, explain the importance of 'confidence intervals' and clarify the importance of including both values in a paper

  11. A statistical model for mapping morphological shape

    Directory of Open Access Journals (Sweden)

    Li Jiahan

    2010-07-01

    Background: Living things come in all shapes and sizes, from bacteria, plants, and animals to humans. Knowledge about the genetic mechanisms for biological shape has far-reaching implications for a wide spectrum of scientific disciplines including anthropology, agriculture, developmental biology, evolution and biomedicine. Results: We derived a statistical model for mapping specific genes or quantitative trait loci (QTLs) that control morphological shape. The model was formulated within the mixture framework, in which different types of shape are thought to result from genotypic discrepancies at a QTL. The EM algorithm was implemented to estimate QTL genotype-specific shapes based on a shape correspondence analysis. Computer simulation was used to investigate the statistical property of the model. Conclusion: By identifying specific QTLs for morphological shape, the model developed will help to ask, disseminate and address many major integrative biological and genetic questions and challenges in the genetic control of biological shape and function.

  12. Genetic differentiation and origin of the Jordanian population: an analysis of Alu insertion polymorphisms.

    Science.gov (United States)

    Bahri, Raoudha; El Moncer, Wifak; Al-Batayneh, Khalid; Sadiq, May; Esteban, Esther; Moral, Pedro; Chaabani, Hassen

    2012-05-01

    Although much of Jordan is covered by desert, its north-western region forms part of the Fertile Crescent region that had given a rich past to Jordanians. This past, scarcely described by historians, is not yet clarified by sufficient genetic data. Thus in this paper we aim to determine the genetic differentiation of the Jordanian population and to discuss its origin. A total of 150 unrelated healthy Jordanians were investigated for ten Alu insertion polymorphisms. Genetic relationships among populations were estimated by a principal component (PC) plot based on the analyses of the R-matrix software. Statistical analysis showed that the Jordanian population is not significantly different from the United Arab Emirates population or the North Africans. This observation, well represented in PC plot, suggests a common origin of these populations belonging respectively to ancient Mesopotamia, Arabia, and North Africa. Our results are compatible with ancient peoples' movements from Arabia to ancient Mesopotamia and North Africa as proposed by historians and supported by previous genetic results. The original genetic profile of the Jordanian population, very likely Arabian Semitic, has not been subject to significant change despite the succession of several civilizations.

  13. Study design and statistical analysis of data in human population studies with the micronucleus assay.

    Science.gov (United States)

    Ceppi, Marcello; Gallo, Fabio; Bonassi, Stefano

    2011-01-01

    The most common study design in population studies based on the micronucleus (MN) assay is the cross-sectional study, which is largely performed to evaluate the DNA damaging effects of exposure to genotoxic agents in the workplace, in the environment, as well as from diet or lifestyle factors. Sample size is still a critical issue in the design of MN studies, since most recent studies considering gene-environment interaction often require a sample size of several hundred subjects, which is in many cases difficult to achieve. The control of confounding is another major threat to the validity of causal inference. The most popular confounders considered in population studies using MN are age, gender and smoking habit. Extensive attention is given to the assessment of effect modification, given the increasing inclusion of biomarkers of genetic susceptibility in the study design. Selected issues concerning the statistical treatment of data have been addressed in this mini-review, starting from data description, which is a critical step of statistical analysis, since it allows possible errors in the dataset to be detected and the validity of assumptions required for more complex analyses to be checked. Basic issues dealing with statistical analysis of biomarkers are extensively evaluated, including methods to explore the dose-response relationship between two continuous variables and inferential analysis. A critical approach to the use of parametric and non-parametric methods is presented, before addressing the issue of the most suitable multivariate models to fit MN data. In the last decade, the quality of statistical analysis of MN data has certainly evolved, although even nowadays only a small number of studies apply the Poisson model, which is the most suitable method for the analysis of MN data.
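    As a rough illustration of why a Poisson model suits MN counts, the sketch below compares micronucleus frequencies between two groups using a Poisson rate ratio with a Wald confidence interval on the log scale. It is not taken from the mini-review: the function name, the argument names and the example counts are hypothetical, and a real analysis would normally fit a Poisson regression that also adjusts for confounders such as age, gender and smoking habit.

```python
from math import exp, log, sqrt


def poisson_rate_ratio(mn_exposed, cells_exposed, mn_control, cells_control, z=1.96):
    """Rate ratio of micronuclei per scored cell, exposed vs. control.

    Counts are treated as Poisson with the number of scored cells as the
    exposure. Returns (rate_ratio, ci_low, ci_high) using a Wald interval
    on the log scale; z=1.96 gives an approximate 95% interval.
    All names here are hypothetical and chosen for illustration only.
    """
    if min(mn_exposed, mn_control) <= 0:
        raise ValueError("both micronucleus counts must be positive")
    if min(cells_exposed, cells_control) <= 0:
        raise ValueError("both cell totals must be positive")

    rate_exposed = mn_exposed / cells_exposed
    rate_control = mn_control / cells_control
    ratio = rate_exposed / rate_control

    # Standard error of log(ratio) for two independent Poisson counts.
    se_log_ratio = sqrt(1.0 / mn_exposed + 1.0 / mn_control)
    ci_low = exp(log(ratio) - z * se_log_ratio)
    ci_high = exp(log(ratio) + z * se_log_ratio)
    return ratio, ci_low, ci_high


# Hypothetical data: 30 MN in 10,000 cells (exposed), 15 MN in 10,000 cells (control).
ratio, ci_low, ci_high = poisson_rate_ratio(30, 10_000, 15, 10_000)
```

    An interval that excludes 1 would suggest a difference in MN frequency between the groups, subject to the confounding issues discussed in the abstract.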

  14. Statistics and analysis of scientific data

    CERN Document Server

    Bonamente, Massimiliano

    2017-01-01

    The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include: • a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets. • a new chapter on the various measures of the mean including logarithmic averages. • new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors. • a new case study and additional worked examples. • mathematical derivations and theoretical background material have been appropriately marked, to improve the readabili...

  15. Statistical evaluation of diagnostic performance topics in ROC analysis

    CERN Document Server

    Zou, Kelly H; Bandos, Andriy I; Ohno-Machado, Lucila; Rockette, Howard E

    2016-01-01

    Statistical evaluation of diagnostic performance in general and Receiver Operating Characteristic (ROC) analysis in particular are important for assessing the performance of medical tests and statistical classifiers, as well as for evaluating predictive models or algorithms. This book presents innovative approaches in ROC analysis, which are relevant to a wide variety of applications, including medical imaging, cancer research, epidemiology, and bioinformatics. Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis covers areas including monotone-transformation techniques in parametric ROC analysis, ROC methods for combined and pooled biomarkers, Bayesian hierarchical transformation models, sequential designs and inferences in the ROC setting, predictive modeling, multireader ROC analysis, and free-response ROC (FROC) methodology. The book is suitable for graduate-level students and researchers in statistics, biostatistics, epidemiology, public health, biomedical engineering, radiology, medi...

  16. An Efficient Stepwise Statistical Test to Identify Multiple Linked Human Genetic Variants Associated with Specific Phenotypic Traits.

    Directory of Open Access Journals (Sweden)

    Iksoo Huh

    Full Text Available Recent advances in genotyping methodologies have allowed genome-wide association studies (GWAS) to accurately identify genetic variants that associate with common or pathological complex traits. Although most GWAS have focused on associations with single genetic variants, joint identification of multiple genetic variants, and how they interact, is essential for understanding the genetic architecture of complex phenotypic traits. Here, we propose an efficient stepwise method based on the Cochran-Mantel-Haenszel test (for stratified categorical data) to identify causal joint multiple genetic variants in GWAS. This method combines the CMH statistic with a stepwise procedure to detect multiple genetic variants associated with specific categorical traits, using a series of associated I × J contingency tables and a null hypothesis of no phenotype association. Through a new stratification scheme based on the sum of minor allele count criteria, we make the method more feasible for GWAS data having sample sizes of several thousands. We also examine the properties of the proposed stepwise method via simulation studies, and show that the stepwise CMH test performs better than other existing methods (e.g., logistic regression and detection of associations by Markov blanket) for identifying multiple genetic variants. Finally, we apply the proposed approach to two genomic sequencing datasets to detect linked genetic variants associated with bipolar disorder and obesity, respectively.
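    The building block of this approach, the Cochran-Mantel-Haenszel statistic, can be sketched for the simplest case of stratified 2 × 2 tables. This is only an illustration under stated assumptions: the abstract works with general I × J tables and a stepwise procedure, neither of which is reproduced here, and the function name, table layout and example counts are hypothetical.

```python
from math import erfc, sqrt


def cmh_2x2(tables, continuity=True):
    """Cochran-Mantel-Haenszel test for K stratified 2 x 2 tables.

    Each table is a tuple (a, b, c, d):
        a = cases carrying the variant,    b = cases not carrying it,
        c = controls carrying the variant, d = controls not carrying it.
    Returns (statistic, p_value); the statistic is referred to a
    chi-square distribution with 1 degree of freedom.
    """
    sum_observed = sum_expected = sum_variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        sum_observed += a
        sum_expected += row1 * col1 / n  # E[a] under no association
        sum_variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))

    if sum_variance == 0:
        raise ValueError("no informative strata")

    deviation = abs(sum_observed - sum_expected)
    if continuity:
        deviation = max(deviation - 0.5, 0.0)
    statistic = deviation ** 2 / sum_variance

    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical strata (for example, groups defined by minor allele count).
strata = [(20, 10, 10, 20), (15, 15, 12, 18)]
statistic, p_value = cmh_2x2(strata)
```

    Pooling evidence across strata in this way tests association with the trait while holding the stratifying variable fixed, which is the property the stepwise procedure in the abstract builds on.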

  17. Bayesian Inference in Statistical Analysis

    CERN Document Server

    Box, George E P

    2011-01-01

    The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob

  18. Analysis of Variance: What Is Your Statistical Software Actually Doing?

    Science.gov (United States)

    Li, Jian; Lomax, Richard G.

    2011-01-01

    Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs, mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…

  19. Comparing Visual and Statistical Analysis of Multiple Baseline Design Graphs.

    Science.gov (United States)

    Wolfe, Katie; Dickenson, Tammiee S; Miller, Bridget; McGrath, Kathleen V

    2018-04-01

    A growing number of statistical analyses are being developed for single-case research. One important factor in evaluating these methods is the extent to which each corresponds to visual analysis. Few studies have compared statistical and visual analysis, and information about more recently developed statistics is scarce. Therefore, our purpose was to evaluate the agreement between visual analysis and four statistical analyses: improvement rate difference (IRD); Tau-U; Hedges, Pustejovsky, Shadish (HPS) effect size; and between-case standardized mean difference (BC-SMD). Results indicate that IRD and BC-SMD had the strongest overall agreement with visual analysis. Although Tau-U had strong agreement with visual analysis on raw values, it had poorer agreement when those values were dichotomized to represent the presence or absence of a functional relation. Overall, visual analysis appeared to be more conservative than statistical analysis, but further research is needed to evaluate the nature of these disagreements.

  20. Sensitivity analysis and related analysis : A survey of statistical techniques

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    1995-01-01

    This paper reviews the state of the art in five related types of analysis, namely (i) sensitivity or what-if analysis, (ii) uncertainty or risk analysis, (iii) screening, (iv) validation, and (v) optimization. The main question is: when should which type of analysis be applied; which statistical

  1. Applications of statistical physics and information theory to the analysis of DNA sequences

    Science.gov (United States)

    Grosse, Ivo

    2000-10-01

    DNA carries the genetic information of most living organisms, and the goal of genome projects is to uncover that genetic information. One basic task in the analysis of DNA sequences is the recognition of protein coding genes. Powerful computer programs for gene recognition have been developed, but most of them are based on statistical patterns that vary from species to species. In this thesis I address the question of whether there exist universal statistical patterns that are different in coding and noncoding DNA of all living species, regardless of their phylogenetic origin. In search of such species-independent patterns I study the mutual information function of genomic DNA sequences, and find that it shows persistent period-three oscillations. To understand the biological origin of the observed period-three oscillations, I compare the mutual information function of genomic DNA sequences to the mutual information function of stochastic model sequences. I find that the pseudo-exon model is able to reproduce the mutual information function of genomic DNA sequences. Moreover, I find that a generalization of the pseudo-exon model can connect the existence and the functional form of long-range correlations to the presence and the length distributions of coding and noncoding regions. Based on these theoretical studies I am able to find an information-theoretical quantity, the average mutual information (AMI), whose probability distributions are significantly different in coding and noncoding DNA, while they are almost identical in all studied species. These findings show that there exist universal statistical patterns that are different in coding and noncoding DNA of all studied species, and they suggest that the AMI may be used to identify genes in different living species, irrespective of their taxonomic origin.
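    The mutual information function referred to above can be sketched as follows, assuming the usual definition I(k) = sum over nucleotide pairs (a, b) of p_ab(k) * log2(p_ab(k) / (p_a * p_b)), where p_ab(k) is the frequency of finding a and b separated by k positions. The abstract does not spell out its estimator, so this plain plug-in version and the function name are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(sequence, k):
    """Plug-in estimate of the mutual information (in bits) between
    symbols that are k positions apart in `sequence`."""
    if k < 1:
        raise ValueError("k must be a positive integer")

    # Count all pairs (sequence[i], sequence[i + k]).
    pair_counts = Counter(zip(sequence, sequence[k:]))
    total = sum(pair_counts.values())
    if total == 0:
        raise ValueError("sequence is too short for this distance k")

    # Marginal counts taken from the same pair table, so the estimate
    # is a proper (non-negative) mutual information.
    first_counts, second_counts = Counter(), Counter()
    for (a, b), count in pair_counts.items():
        first_counts[a] += count
        second_counts[b] += count

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / total
        p_a = first_counts[a] / total
        p_b = second_counts[b] / total
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


def mutual_information_function(sequence, max_k):
    """Return [I(1), I(2), ..., I(max_k)] for the given sequence."""
    return [mutual_information(sequence, k) for k in range(1, max_k + 1)]
```

    Evaluating `mutual_information_function` on a coding region and inspecting whether the values at k = 3, 6, 9, ... stand out from their neighbours is one simple way to look for the period-three signal described in the abstract.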

  2. Gene set analysis for interpreting genetic studies

    DEFF Research Database (Denmark)

    Pers, Tune H

    2016-01-01

    Interpretation of genome-wide association study (GWAS) results is lagging behind the discovery of new genetic associations. Consequently, there is an urgent need for data-driven methods for interpreting genetic association studies. Gene set analysis (GSA) can identify aetiologic pathways...

  3. Quantitative genetic analysis of total glucosinolate, oil and protein ...

    African Journals Online (AJOL)

    Quantitative genetic analysis of total glucosinolate, oil and protein contents in Ethiopian mustard ( Brassica carinata A. Braun) ... Seeds were analyzed using HPLC (glucosinolates), NMR (oil) and NIRS (protein). Analyses of variance, Hayman's method of diallel analysis and a mixed linear model of genetic analysis were ...

  4. Evolutionary Computation Methods and their applications in Statistics

    Directory of Open Access Journals (Sweden)

    Francesco Battaglia

    2013-05-01

    Full Text Available A brief discussion of the genesis of evolutionary computation methods, their relationship to artificial intelligence, and the contribution of genetics and Darwin’s theory of natural evolution is provided. Then, the main evolutionary computation methods are illustrated: evolution strategies, genetic algorithms, estimation of distribution algorithms, differential evolution, and a brief description of some evolutionary behavior methods such as ant colony and particle swarm optimization. We also discuss the role of the genetic algorithm for multivariate probability distribution random generation, rather than as a function optimizer. Finally, some relevant applications of genetic algorithm to statistical problems are reviewed: selection of variables in regression, time series model building, outlier identification, cluster analysis, design of experiments.

  5. Genetic diversity analysis in rice mutants using isozyme and Morphological markers

    Energy Technology Data Exchange (ETDEWEB)

    Fuentes, Jorge L; Alvarez, Alba [Centro de Estudios Aplicados al Desarrollo Nuclear, La Habana (Cuba); Deus, Juan E [Instituto de Investigaciones del Arroz. Bauta, La Habana (Cuba); Duque, Miriam C [Centro Internacional de Agricultura Tropical, Cali (Colombia); Cornide, Maria T [Centro Nacional de Investigaciones Cientificas, La Habana (Cuba)

    1999-07-01

    In this work, isozyme and agromorphologic variability of radiation-induced rice mutants with different cytoplasm base was surveyed. Agromorphologic data (plant type, lodging resistance, life cycle and yielding) were transformed into binary data. These markers, along with isozyme (Peroxidases, Esterases, Catalases, Alcohol Dehydrogenases and Polyphenoloxidase) data, were considered for genetic diversity analyses in order to estimate the extent of diversity generated by ionizing radiation. Genetic Similarity between individuals was obtained based on Dice's Coefficient. The UPGMA phenogram defined three main clusters that clearly corresponded to the different cytoplasm sources. However, further discrimination between control varieties and their mutants could be obtained. Bootstrapping analysis was performed to estimate the robustness of the group in the phenogram. According to their bootstrap P value (99.6%), Basmati-370 mutant lines could be considered statistically different from their control. This analysis is suggested as a useful supporting tool for an accurate varietal validation. A Multiple Correspondence Analysis (MCA) showed the dispersion of individuals around the three principal axes of variation. In general the UPGMA phenogram pattern was corroborated by MCA. Variables such as life cycle, presence of bands Est-a and Prx-m and the absence of Est-i, Prx-h and Prx-i accounted for the higher contribution to variation. The adequacy of morphological and isozyme descriptors for new mutant lines validation is also discussed.
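    Dice's coefficient for binary marker profiles, the similarity measure named above, can be sketched as below. The line names and marker vectors are invented for illustration and are not data from the study, and returning 0.0 when neither profile has any band present is a convention chosen here, not something stated in the abstract.

```python
def dice_similarity(profile_x, profile_y):
    """Dice's coefficient between two binary (0/1) marker profiles.

    With a = markers present in both profiles, and b, c = markers present
    in only one of them, the coefficient is 2a / (2a + b + c). Shared
    absences (0, 0) are ignored.
    """
    if len(profile_x) != len(profile_y):
        raise ValueError("profiles must score the same set of markers")

    both = sum(1 for x, y in zip(profile_x, profile_y) if x and y)
    only_one = sum(1 for x, y in zip(profile_x, profile_y) if bool(x) != bool(y))

    denominator = 2 * both + only_one
    if denominator == 0:
        return 0.0  # assumed convention when no band is present in either profile
    return 2 * both / denominator


def similarity_matrix(profiles):
    """Pairwise Dice similarities for a dict {line_name: binary profile}."""
    names = list(profiles)
    return {
        (p, q): dice_similarity(profiles[p], profiles[q])
        for p in names
        for q in names
    }


# Hypothetical profiles: 1 = band or trait state present, 0 = absent.
lines = {
    "control": [1, 1, 0, 0, 1],
    "mutant_1": [1, 0, 1, 0, 1],
    "mutant_2": [1, 1, 0, 0, 1],
}
matrix = similarity_matrix(lines)
```

    A matrix of this kind (or the corresponding distances, 1 minus similarity) is the usual input to UPGMA clustering, and resampling the marker columns with replacement is one common way to obtain bootstrap support for the resulting groups.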

  6. Genetic diversity analysis in rice mutants using isozyme and Morphological markers

    International Nuclear Information System (INIS)

    Fuentes, Jorge L.; Alvarez, Alba; Deus, Juan E.; Duque, Miriam C.; Cornide, Maria T.

    1999-01-01

    In this work, isozyme and agromorphologic variability of radiation-induced rice mutants with different cytoplasm base was surveyed. Agromorphologic data (plant type, lodging resistance, life cycle and yielding) were transformed into binary data. These markers, along with isozyme (Peroxidases, Esterases, Catalases, Alcohol Dehydrogenases and Polyphenoloxidase) data, were considered for genetic diversity analyses in order to estimate the extent of diversity generated by ionizing radiation. Genetic Similarity between individuals was obtained based on Dice's Coefficient. The UPGMA phenogram defined three main clusters that clearly corresponded to the different cytoplasm sources. However, further discrimination between control varieties and their mutants could be obtained. Bootstrapping analysis was performed to estimate the robustness of the group in the phenogram. According to their bootstrap P value (99.6%), Basmati-370 mutant lines could be considered statistically different from their control. This analysis is suggested as a useful supporting tool for an accurate varietal validation. A Multiple Correspondence Analysis (MCA) showed the dispersion of individuals around the three principal axes of variation. In general the UPGMA phenogram pattern was corroborated by MCA. Variables such as life cycle, presence of bands Est-a and Prx-m and the absence of Est-i, Prx-h and Prx-i accounted for the higher contribution to variation. The adequacy of morphological and isozyme descriptors for new mutant lines validation is also discussed.

  7. Predicting Flowering Behavior and Exploring Its Genetic Determinism in an Apple Multi-family Population Based on Statistical Indices and Simplified Phenotyping

    Directory of Open Access Journals (Sweden)

    Jean-Baptiste Durand

    2017-06-01

    Full Text Available Irregular flowering over years is commonly observed in fruit trees. The early prediction of tree behavior is highly desirable in breeding programmes. This study aims at performing such predictions, combining simplified phenotyping and statistical methods. Sequences of vegetative vs. floral annual shoots (AS) were observed along axes in trees belonging to five related apple full-sib families. Sequences were analyzed using Markovian and linear mixed models including year and site effects. Indices of flowering irregularity, periodicity and synchronicity were estimated, at tree and axis scales. They were used to predict tree behavior and detect QTL with a Bayesian pedigree-based analysis, using an integrated genetic map containing 6,849 SNPs. The combination of a Biennial Bearing Index (BBI) with an autoregressive coefficient (γg) efficiently predicted and classified the genotype behaviors, despite a few misclassifications. Four QTLs common to BBIs and γg and one for synchronicity were highlighted and revealed the complex genetic architecture of the traits. Irregularity resulted from high AS synchronism, whereas regularity resulted from either asynchronous locally alternating or continual regular AS flowering. A relevant and time-saving method, based on a posteriori sampling of axes and statistical indices, is proposed, which is efficient to evaluate the tree breeding values for flowering regularity and could be transferred to other species.

  8. Management of Uncertainty by Statistical Process Control and a Genetic Tuned Fuzzy System

    Directory of Open Access Journals (Sweden)

    Stephan Birle

    2016-01-01

    Full Text Available In the food industry, bioprocesses like fermentation are often a crucial part of the manufacturing process and decisive for the final product quality. In general, they are characterized by highly nonlinear dynamics and uncertainties that make it difficult to control these processes by the use of traditional control techniques. In this context, fuzzy logic controllers offer quite a straightforward way to control processes that are affected by nonlinear behavior and uncertain process knowledge. However, in order to maintain process safety and product quality it is necessary to specify the controller performance and to tune the controller parameters. In this work, an approach is presented to establish an intelligent control system for oxidoreductive yeast propagation as a representative process biased by the aforementioned uncertainties. The presented approach is based on statistical process control and fuzzy logic feedback control. As the cognitive uncertainty among different experts about the limits that define the control performance as still acceptable may differ a lot, a data-driven design method is performed. Based upon a historic data pool, statistical process corridors are derived for the controller inputs control error and change in control error. This approach follows the hypothesis that if the control performance criteria stay within predefined statistical boundaries, the final process state meets the required quality definition. In order to keep the process on its optimal growth trajectory (model-based reference trajectory), a fuzzy logic controller is used that alters the process temperature. Additionally, in order to stay within the process corridors, a genetic algorithm was applied to tune the input and output fuzzy sets of a preliminarily parameterized fuzzy controller. The presented experimental results show that the genetic tuned fuzzy controller is able to keep the process within its allowed limits. The average absolute error to the

  9. Statistical modeling of biomedical corpora: mining the Caenorhabditis Genetic Center Bibliography for genes related to life span

    Directory of Open Access Journals (Sweden)

    Jordan MI

    2006-05-01

    Full Text Available Abstract Background The statistical modeling of biomedical corpora could yield integrated, coarse-to-fine views of biological phenomena that complement discoveries made from analysis of molecular sequence and profiling data. Here, the potential of such modeling is demonstrated by examining the 5,225 free-text items in the Caenorhabditis Genetic Center (CGC) Bibliography using techniques from statistical information retrieval. Items in the CGC biomedical text corpus were modeled using the Latent Dirichlet Allocation (LDA) model. LDA is a hierarchical Bayesian model which represents a document as a random mixture over latent topics; each topic is characterized by a distribution over words. Results An LDA model estimated from CGC items had better predictive performance than two standard models (unigram and mixture of unigrams) trained using the same data. To illustrate the practical utility of LDA models of biomedical corpora, a trained CGC LDA model was used for a retrospective study of nematode genes known to be associated with life span modification. Corpus-, document-, and word-level LDA parameters were combined with terms from the Gene Ontology to enhance the explanatory value of the CGC LDA model, and to suggest additional candidates for age-related genes. A novel, pairwise document similarity measure based on the posterior distribution on the topic simplex was formulated and used to search the CGC database for "homologs" of a "query" document discussing the life span-modifying clk-2 gene. Inspection of these document homologs enabled and facilitated the production of hypotheses about the function and role of clk-2. Conclusion Like other graphical models for genetic, genomic and other types of biological data, LDA provides a method for extracting unanticipated insights and generating predictions amenable to subsequent experimental validation.

  10. Integrating molecular QTL data into genome-wide genetic association analysis: Probabilistic assessment of enrichment and colocalization.

    Directory of Open Access Journals (Sweden)

    Xiaoquan Wen

    2017-03-01

    Full Text Available We propose a novel statistical framework for integrating the results from molecular quantitative trait loci (QTL) mapping into genome-wide genetic association analysis of complex traits, with the primary objectives of quantitatively assessing the enrichment of the molecular QTLs in complex trait-associated genetic variants and the colocalizations of the two types of association signals. We introduce a natural Bayesian hierarchical model that treats the latent association status of molecular QTLs as SNP-level annotations for candidate SNPs of complex traits. We detail a computational procedure to seamlessly perform enrichment, fine-mapping and colocalization analyses, which is a distinct feature compared to the existing colocalization analysis procedures in the literature. The proposed approach is computationally efficient and requires only summary-level statistics. We evaluate and demonstrate the proposed computational approach through extensive simulation studies and analyses of blood lipid data and the whole blood eQTL data from the GTEx project. In addition, a useful utility from our proposed method enables the computation of expected colocalization signals using simple characteristics of the association data. Using this utility, we further illustrate the importance of enrichment analysis on the ability to discover colocalized signals and the potential limitations of currently available molecular QTL data. The software pipeline that implements the proposed computation procedures, enloc, is freely available at https://github.com/xqwen/integrative.

  11. Online Statistical Modeling (Regression Analysis) for Independent Responses

    Science.gov (United States)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Statistical models have now been developed in various directions to model various types of data and complex relationships. A rich variety of advanced and recent statistical modelling methods is available in open source software (one example is R). However, these advanced methods are not very friendly to novice R users, since they rely on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and shiny), so that the most recent and advanced statistical modelling methods are readily available, accessible and applicable on the web. We have previously made interfaces in the form of e-tutorials for several modern and advanced statistical modelling methods in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including models using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and makes it easier to compare models in order to find the most appropriate model for the data.

  12. Application of Ontology Technology in Health Statistic Data Analysis.

    Science.gov (United States)

    Guo, Minjiang; Hu, Hongpu; Lei, Xingyun

    2017-01-01

    Research Purpose: to establish a health management ontology for the analysis of health statistic data. Proposed Methods: this paper established a health management ontology based on the analysis of the concepts in the China Health Statistics Yearbook, and used Protégé to define the syntactic and semantic structure of health statistical data. Six classes of top-level ontology concepts and their subclasses were extracted, and the object properties and data properties were defined to establish the construction of these classes. By ontology instantiation, we can integrate multi-source heterogeneous data and enable administrators to have an overall understanding and analysis of the health statistic data. Ontology technology provides a comprehensive and unified information integration structure for the health management domain and lays a foundation for the efficient analysis of multi-source and heterogeneous health system management data and the enhancement of management efficiency.

  13. Explorations in Statistics: The Analysis of Change

    Science.gov (United States)

    Curran-Everett, Douglas; Williams, Calvin L.

    2015-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of "Explorations in Statistics" explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can…

  14. Common pitfalls in statistical analysis: “P” values, statistical significance and confidence intervals

    Science.gov (United States)

    Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc

    2015-01-01

    In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ‘P’ value, explain the importance of ‘confidence intervals’ and clarify the importance of including both values in a paper. PMID: 25878958

  15. msap: a tool for the statistical analysis of methylation-sensitive amplified polymorphism data.

    Science.gov (United States)

    Pérez-Figueroa, A

    2013-05-01

    In this study msap, an R package which analyses methylation-sensitive amplified polymorphism (MSAP or MS-AFLP) data, is presented. The program provides a deep analysis of epigenetic variation starting from a binary data matrix indicating the banding pattern between the isoschizomeric endonucleases HpaII and MspI, with differential sensitivity to cytosine methylation. After comparing the restriction fragments, the program determines if each fragment is susceptible to methylation (representative of epigenetic variation) or if there is no evidence of methylation (representative of genetic variation). The package provides, in a user-friendly command line interface, a pipeline of different analyses of the variation (genetic and epigenetic) among user-defined groups of samples, as well as the classification of the methylation occurrences in those groups. Statistical testing provides support to the analyses. A comprehensive report of the analyses and several useful plots could help researchers to assess the epigenetic and genetic variation in their MSAP experiments. msap is downloadable from CRAN (http://cran.r-project.org/) and its own webpage (http://msap.r-forge.R-project.org/). The package is intended to be easy to use even for those people unfamiliar with the R command line environment. Advanced users may take advantage of the available source code to adapt msap to more complex analyses. © 2013 Blackwell Publishing Ltd.

  16. A statistical assessment of differences and equivalences between genetically modified and reference plant varieties

    Directory of Open Access Journals (Sweden)

    Amzal Billy

    2011-02-01

    Full Text Available Abstract Background Safety assessment of genetically modified organisms is currently often performed by comparative evaluation. However, natural variation of plant characteristics between commercial varieties is usually not considered explicitly in the statistical computations underlying the assessment. Results Statistical methods are described for the assessment of the difference between a genetically modified (GM) plant variety and a conventional non-GM counterpart, and for the assessment of the equivalence between the GM variety and a group of reference plant varieties which have a history of safe use. It is proposed to present the results of both difference and equivalence testing for all relevant plant characteristics simultaneously in one or a few graphs, as an aid for further interpretation in safety assessment. A procedure is suggested to derive equivalence limits from the observed results for the reference plant varieties using a specific implementation of the linear mixed model. Three different equivalence tests are defined to classify any result in one of four equivalence classes. The performance of the proposed methods is investigated by a simulation study, and the methods are illustrated on compositional data from a field study on maize grain. Conclusions A clear distinction of practical relevance is shown between difference and equivalence testing. The proposed tests are shown to have appropriate performance characteristics by simulation, and the proposed simultaneous graphical representation of results was found to be helpful for the interpretation of results from a practical field trial data set.

  17. Neural networks for genetic epidemiology: past, present, and future

    Directory of Open Access Journals (Sweden)

    Motsinger-Reif Alison A

    2008-07-01

    Full Text Available Abstract During the past two decades, the field of human genetics has experienced an information explosion. The completion of the human genome project and the development of high throughput SNP technologies have created a wealth of data; however, the analysis and interpretation of these data have created a research bottleneck. While technology facilitates the measurement of hundreds or thousands of genes, statistical and computational methodologies are lacking for the analysis of these data. New statistical methods and variable selection strategies must be explored for identifying disease susceptibility genes for common, complex diseases. Neural networks (NN) are a class of pattern recognition methods that have been successfully implemented for data mining and prediction in a variety of fields. The application of NN for statistical genetics studies is an active area of research. Neural networks have been applied in both linkage and association analysis for the identification of disease susceptibility genes. In the current review, we consider how NN have been used for both linkage and association analyses in genetic epidemiology. We discuss both the successes of these initial NN applications, and the questions that arose during the previous studies. Finally, we introduce evolutionary computing strategies, Genetic Programming Neural Networks (GPNN) and Grammatical Evolution Neural Networks (GENN), for using NN in association studies of complex human diseases that address some of the caveats illuminated by previous work.

  18. Genetic analysis of bulimia nervosa: methods and sample description.

    Science.gov (United States)

    Kaye, Walter H; Devlin, Bernie; Barbarich, Nicole; Bulik, Cynthia M; Thornton, Laura; Bacanu, Silviu-Alin; Fichter, Manfred M; Halmi, Katherine A; Kaplan, Allan S; Strober, Michael; Woodside, D Blake; Bergen, Andrew W; Crow, Scott; Mitchell, James; Rotondo, Alessandro; Mauri, Mauro; Cassano, Giovanni; Keel, Pamela; Plotnicov, Katherine; Pollice, Christine; Klump, Kelly L; Lilenfeld, Lisa R; Ganjei, J Kelly; Quadflieg, Norbert; Berrettini, Wade H

    2004-05-01

    Twin and family studies suggest that genetic variants contribute to the pathogenesis of bulimia nervosa (BN) and anorexia nervosa (AN). The Price Foundation has supported an international, multisite study of families with these disorders to identify these genetic variations. The current study presents the clinical characteristics of this sample as well as a description of the study methodology. All probands met modified criteria for BN or bulimia nervosa with a history of AN (BAN) as defined in the 4th ed. of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV; American Psychiatric Association, 1994). All affected relatives met DSM-IV criteria for BN, AN, BAN, or eating disorders not otherwise specified (EDNOS). Probands and affected relatives were assessed diagnostically using both trained-rater and self-report assessments. DNA samples were collected from probands, affected relatives, and available biologic parents. Assessments were obtained from 163 BN probands and 165 BAN probands. Overall, there were 365 relative pairs available for linkage analysis. Of the affected relatives of BN probands, 62 were diagnosed as BN (34.8%), 49 as BAN (27.5%), 35 as AN (19.7%), and 32 as EDNOS (18.0%). For the relatives of BAN probands, 42 were diagnosed as BN (22.5%), 67 as BAN (35.8%), 48 as AN (25.7%), and 30 as EDNOS (16.0%). This study represents the largest genetic study of eating disorders to date. Clinical data indicate that although there are a large number of individuals with BN disorders, a range of eating pathology is represented in the sample, allowing for the examination of several different phenotypes in molecular genetic analyses. Copyright 2004 by Wiley Periodicals, Inc. Int J Eat Disord 35: 556-570, 2004.
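
    The percentages quoted for affected relatives can be reproduced from the counts in the abstract. The short check below is only an arithmetic illustration; the dictionary and function names are hypothetical.

```python
def percentages(counts):
    """Convert diagnosis counts to percentages of the group total (one decimal)."""
    total = sum(counts.values())
    return {diagnosis: round(100 * n / total, 1) for diagnosis, n in counts.items()}


# Counts of affected relatives as reported in the abstract.
relatives_of_bn_probands = {"BN": 62, "BAN": 49, "AN": 35, "EDNOS": 32}
relatives_of_ban_probands = {"BN": 42, "BAN": 67, "AN": 48, "EDNOS": 30}

bn_pct = percentages(relatives_of_bn_probands)
ban_pct = percentages(relatives_of_ban_probands)
```

    The group totals implied by these counts are 178 relatives of BN probands and 187 relatives of BAN probands.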

  19. TECHNIQUE OF THE STATISTICAL ANALYSIS OF INVESTMENT APPEAL OF THE REGION

    Directory of Open Access Journals (Sweden)

    А. А. Vershinina

    2014-01-01

    Full Text Available This article presents a technique for the statistical analysis of a region's investment appeal with respect to foreign direct investment. The technique is defined, the stages of the analysis are laid out, and the mathematical and statistical tools are considered.

  20. Multivariate genetic analysis of brain structure in an extended twin design

    DEFF Research Database (Denmark)

    Posthuma, D; de Geus, E.J.; Neale, M.C.

    2000-01-01

    quantitative scale and thus can be assessed in affected and unaffected individuals. Continuous measures increase the statistical power to detect genetic effects (Neale et al., 1994), and allow studies to be designed to collect data from informative subjects such as extreme concordant or discordant pairs....... Intermediate phenotypes for discrete traits, such as psychiatric disorders, can be neurotransmitter levels, brain function, or structure. In this paper we conduct a multivariate analysis of data from 111 twin pairs and 34 additional siblings on cerebellar volume, intracranial space, and body height....... The analysis is carried out on the raw data and specifies a model for the mean and the covariance structure. Results suggest that cerebellar volume and intracranial space vary with age and sex. Brain volumes tend to decrease slightly with age, and males generally have a larger brain volume than females...

  1. Statistical analysis of network data with R

    CERN Document Server

    Kolaczyk, Eric D

    2014-01-01

    Networks have permeated everyday life through everyday realities like the Internet, social networks, and viral marketing. As such, network analysis is an important growth area in the quantitative sciences, with roots in social network analysis going back to the 1930s and graph theory going back centuries. Measurement and analysis are integral components of network research. As a result, statistical methods play a critical role in network analysis. This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. This text builds on Eric D. Kolaczyk’s book Statistical Analysis of Network Data (Springer, 2009).

  2. Semiclassical analysis, Witten Laplacians, and statistical mechanics

    CERN Document Server

    Helffer, Bernard

    2002-01-01

    This important book explains how the technique of Witten Laplacians may be useful in statistical mechanics. It considers the problem of analyzing the decay of correlations, after presenting its origin in statistical mechanics. In addition, it compares the Witten Laplacian approach with other techniques, such as the transfer matrix approach and its semiclassical analysis. The author concludes by providing a complete proof of the uniform Log-Sobolev inequality. Contents: Witten Laplacians Approach; Problems in Statistical Mechanics with Discrete Spins; Laplace Integrals and Transfer Operators; S

  3. A novel genome-information content-based statistic for genome-wide association analysis designed for next-generation sequencing data.

    Science.gov (United States)

    Luo, Li; Zhu, Yun; Xiong, Momiao

    2012-06-01

    The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistic for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistic, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T² statistic, the collapsing method, the multivariate and collapsing (CMC) method, the individual χ² test, the weighted-sum statistic, and the variable threshold statistic. Finally, we apply the seven statistics to a published resequencing dataset from the ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.

  4. Gene flow analysis method, the D-statistic, is robust in a wide parameter space.

    Science.gov (United States)

    Zheng, Yichen; Janke, Axel

    2018-01-08

    We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space and thus its applicability to a wide taxonomic range has not been systematically studied. Divergence time, population size, time of gene flow, distance of outgroup and number of loci were examined in a sensitivity analysis. The sensitivity study shows that the primary determinant of the D-statistic is the relative population size, i.e. the population size scaled by the number of generations since divergence. This is consistent with the fact that the main confounding factor in gene flow detection is incomplete lineage sorting, which dilutes the signal. The sensitivity of the D-statistic is also affected by the direction of gene flow, size and number of loci. In addition, we examined the ability of the f-statistics, [Formula: see text] and [Formula: see text], to estimate the fraction of a genome affected by gene flow; while these statistics are difficult to apply to practical questions in biology due to lack of knowledge of when the gene flow happened, they can be used to compare datasets with identical or similar demographic background. The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances (divergence times) but it is sensitive to population size. The D-statistic should only be applied with critical reservation to taxa where population sizes are large relative to branch lengths in generations.
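
    For readers unfamiliar with the D-statistic, the sketch below shows the commonly used ABBA-BABA site-count form, D = (nABBA - nBABA) / (nABBA + nBABA), for a four-taxon arrangement (P1, P2, P3, outgroup). It is an illustration under simple assumptions (one aligned sequence per taxon, outgroup allele treated as ancestral, no significance test) and is not taken from the paper; the function name is hypothetical.

```python
def d_statistic(p1, p2, p3, outgroup):
    """ABBA-BABA D from four aligned sequences of equal length."""
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if len({a, b, c, o}) != 2:
            continue  # keep biallelic sites only
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)
```

    Under this convention a positive D indicates an excess of derived alleles shared between P2 and P3, and a value near zero is what incomplete lineage sorting alone is expected to produce.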

  5. A statistical approach to plasma profile analysis

    International Nuclear Information System (INIS)

    Kardaun, O.J.W.F.; McCarthy, P.J.; Lackner, K.; Riedel, K.S.

    1990-05-01

    A general statistical approach to the parameterisation and analysis of tokamak profiles is presented. The modelling of the profile dependence on both the radius and the plasma parameters is discussed, and pertinent, classical as well as robust, methods of estimation are reviewed. Special attention is given to statistical tests for discriminating between the various models, and to the construction of confidence intervals for the parameterised profiles and the associated global quantities. The statistical approach is shown to provide a rigorous approach to the empirical testing of plasma profile invariance. (orig.)

  6. Study designs, use of statistical tests, and statistical analysis software choice in 2015: Results from two Pakistani monthly Medline indexed journals.

    Science.gov (United States)

    Shaikh, Masood Ali

    2017-09-01

    Assessment of research articles in terms of study designs used, statistical tests applied and the use of statistical analysis programmes helps determine the research activity profile and trends in the country. In this descriptive study, all original articles published by Journal of Pakistan Medical Association (JPMA) and Journal of the College of Physicians and Surgeons Pakistan (JCPSP), in the year 2015 were reviewed in terms of study designs used, application of statistical tests, and the use of statistical analysis programmes. JPMA and JCPSP published 192 and 128 original articles, respectively, in the year 2015. Results of this study indicate that the cross-sectional study design, bivariate inferential statistical analysis entailing comparison between two variables/groups, and the statistical software programme SPSS were the most common study design, inferential statistical analysis, and statistical analysis software programme, respectively. These results echo the previously published assessment of these two journals for the year 2014.

  7. Statistical analysis of brake squeal noise

    Science.gov (United States)

    Oberst, S.; Lai, J. C. S.

    2011-06-01

    Despite substantial research efforts applied to the prediction of brake squeal noise since the early 20th century, the mechanisms behind its generation are still not fully understood. Squealing brakes are of significant concern to the automobile industry, mainly because of the costs associated with warranty claims. In order to remedy the problems inherent in designing quieter brakes and, therefore, to understand the mechanisms, a design of experiments study, using a noise dynamometer, was performed by a brake system manufacturer to determine the influence of geometrical parameters (namely, the number and location of slots) of brake pads on brake squeal noise. The experimental results were evaluated with a noise index and ranked for warm and cold brake stops. These data are analysed here using statistical descriptors based on population distributions, and a correlation analysis, to gain greater insight into the functional dependency between the time-averaged friction coefficient as the input and the peak sound pressure level data as the output quantity. The correlation analysis between the time-averaged friction coefficient and peak sound pressure data is performed by applying a semblance analysis and a joint recurrence quantification analysis. Linear measures are compared with complexity measures (nonlinear) based on statistics from the underlying joint recurrence plots. Results show that linear measures cannot be used to rank the noise performance of the four test pad configurations. On the other hand, the ranking of the noise performance of the test pad configurations based on the noise index agrees with that based on nonlinear measures: the higher the nonlinearity between the time-averaged friction coefficient and peak sound pressure, the worse the squeal. These results highlight the nonlinear character of brake squeal and indicate the potential of using nonlinear statistical analysis tools to analyse disc brake squeal.

  8. The Statistical Analysis of Time Series

    CERN Document Server

    Anderson, T W

    2011-01-01

    The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences George

  9. Analysis of genetic structure and relationship among nine ...

    Indian Academy of Sciences (India)

    These results indicated that the clustering analysis using the Structure program might provide an ..... of the current genetic relations among the breeds, and contribute to ... sis of the genetic structure of the Canary goat populations using.

  10. Analysis of room transfer function and reverberant signal statistics

    DEFF Research Database (Denmark)

    Georganti, Eleftheria; Mourjopoulos, John; Jacobsen, Finn

    2008-01-01

    For some time now, statistical analysis has been a valuable tool in analyzing room transfer functions (RTFs). This work examines existing statistical time-frequency models and techniques for RTF analysis (e.g., Schroeder's stochastic model and the standard deviation over frequency bands for the RTF...... magnitude and phase). RTF fractional octave smoothing, as with 1/3 octave analysis, may lead to RTF simplifications that can be useful for several audio applications, like room compensation, room modeling, auralisation purposes. The aim of this work is to identify the relationship of optimal response...... and the corresponding ratio of the direct and reverberant signal. In addition, this work examines the statistical quantities for speech and audio signals prior to their reproduction within rooms and when recorded in rooms. Histograms and other statistical distributions are used to compare RTF minima of typical...

  11. Timing Analysis of Genetic Logic Circuits using D-VASim

    DEFF Research Database (Denmark)

    Baig, Hasan; Madsen, Jan

    and propagation delay analysis of single as well as cascaded genetic logic circuits can be performed. D-VASim allows the user to change the circuit parameters during runtime simulation to observe their effects on the circuit's timing behavior. The results obtained from D-VASim can be used not only to characterize the timing...... delay analysis may play a very significant role in the designing of genetic logic circuits. In this demonstration, we present the capability of D-VASim (Dynamic Virtual Analyzer and Simulator) to perform the timing and propagation delay analysis of genetic logic circuits. Using D-VASim, the timing...... behavior of genetic logic circuits but also to analyze the timing constraints of cascaded genetic logic circuits....

  12. The relationship between the number of loci and the statistical support for the topology of UPGMA trees obtained from genetic distance data.

    Science.gov (United States)

    Highton, R

    1993-12-01

    An analysis of the relationship between the number of loci utilized in an electrophoretic study of genetic relationships and the statistical support for the topology of UPGMA trees is reported for two published data sets. These are Highton and Larson (Syst. Zool. 28:579-599, 1979), an analysis of the relationships of 28 species of plethodonine salamanders, and Hedges (Syst. Zool. 35:1-21, 1986), a similar study of 30 taxa of Holarctic hylid frogs. As the number of loci increases, the statistical support for the topology at each node in UPGMA trees was determined by both the bootstrap and jackknife methods. The results show that the bootstrap and jackknife probabilities supporting the topology at some nodes of UPGMA trees increase as the number of loci utilized in a study is increased, as expected for nodes that have groupings that reflect phylogenetic relationships. The pattern of increase varies and is especially rapid in the case of groups with no close relatives. At nodes that likely do not represent correct phylogenetic relationships, the bootstrap probabilities do not increase and often decline with the addition of more loci.
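
    The idea of bootstrapping over loci can be sketched in a few lines. The example below is a deliberate simplification and not the procedure of the paper: it assumes a distance is available per locus, averages those distances over resampled loci, and reports how often a chosen pair of taxa has the smallest average distance, which is the first cluster UPGMA would form. The function name and the data layout are hypothetical.

```python
import random


def bootstrap_pair_support(per_locus_dist, pair, n_boot=1000, seed=0):
    """Fraction of bootstrap replicates in which `pair` is the closest pair.

    per_locus_dist : list with one dict per locus, each mapping
                     frozenset({taxon_a, taxon_b}) -> distance at that locus
    pair           : two taxon names whose grouping is being assessed
    """
    rng = random.Random(seed)
    n_loci = len(per_locus_dist)
    target = frozenset(pair)
    hits = 0
    for _ in range(n_boot):
        # Resample loci with replacement, keeping the number of loci fixed.
        sample = [rng.choice(per_locus_dist) for _ in range(n_loci)]
        mean_dist = {k: sum(locus[k] for locus in sample) / n_loci for k in sample[0]}
        # UPGMA joins the pair with the smallest distance first.
        if min(mean_dist, key=mean_dist.get) == target:
            hits += 1
    return hits / n_boot
```

    Repeating the calculation with increasing numbers of loci would show whether support for a grouping rises, as the abstract reports for nodes that reflect phylogenetic relationships.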

  13. Transit safety & security statistics & analysis 2002 annual report (formerly SAMIS)

    Science.gov (United States)

    2004-12-01

    The Transit Safety & Security Statistics & Analysis 2002 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administration's (FTA's) National Tr...

  14. Transit safety & security statistics & analysis 2003 annual report (formerly SAMIS)

    Science.gov (United States)

    2005-12-01

    The Transit Safety & Security Statistics & Analysis 2003 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administration's (FTA's) National Tr...

  15. A statistical design for testing apomictic diversification through linkage analysis.

    Science.gov (United States)

    Zeng, Yanru; Hou, Wei; Song, Shuang; Feng, Sisi; Shen, Lin; Xia, Guohua; Wu, Rongling

    2014-03-01

    The capacity of apomixis to generate maternal clones through seed reproduction has made it a useful characteristic for the fixation of heterosis in plant breeding. It has been observed that apomixis displays pronounced intra- and interspecific diversification, but the genetic mechanisms underlying this diversification remain elusive, obstructing the exploitation of this phenomenon in practical breeding programs. By capitalizing on molecular information in mapping populations, we describe and assess a statistical design that deploys linkage analysis to estimate and test the pattern and extent of apomictic differences at various levels from genotypes to species. The design is based on two reciprocal crosses between two individuals each chosen from a hermaphrodite or monoecious species. A multinomial distribution likelihood is constructed by combining marker information from two crosses. The EM algorithm is implemented to estimate the rate of apomixis and test its difference between two plant populations or species as the parents. The design is validated by computer simulation. A real data analysis of two reciprocal crosses between hickory (Carya cathayensis) and pecan (C. illinoensis) demonstrates the utilization and usefulness of the design in practice. The design provides a tool to address fundamental and applied questions related to the evolution and breeding of apomixis.

  16. Genetic analysis of Mexican Criollo cattle populations.

    Science.gov (United States)

    Ulloa-Arvizu, R; Gayosso-Vázquez, A; Ramos-Kuri, M; Estrada, F J; Montaño, M; Alonso, R A

    2008-10-01

    The objective of this study was to evaluate the genetic structure of Mexican Criollo cattle populations using microsatellite genetic markers. DNA samples were collected from 168 animals from four Mexican Criollo cattle populations, geographically isolated in remote areas of Sierra Madre Occidental (West Highlands). Samples were also included from two breeds of Iberian origin, the fighting bull (n = 24) and the milking Central American Criollo (n = 24), and one Asiatic breed, Guzerat (n = 32). Genetic analysis consisted of the estimation of the genetic diversity in each population by the allele number and the average expected heterozygosity found in nine microsatellite loci. Furthermore, genetic relationships among the populations were defined by their genetic distances. Our data show that Mexican cattle populations have a relatively high level of genetic diversity based on both the mean number of alleles (10.2-13.6) and the expected heterozygosity (0.71-0.85). The degree of observed homozygosity within the Criollo populations was remarkable and probably caused by inbreeding (reduced effective population size), possibly due to reproductive structure within populations. Our data show that considerable genetic differentiation has occurred among the Criollo cattle populations in different regions of Mexico.
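
    The two diversity measures named in this abstract, allele number and expected heterozygosity, are simple to compute from allele calls at a locus. The sketch below uses the basic form He = 1 - sum(p_i^2) without a small-sample correction, which may differ from the estimator the authors used. The function names and the input layout are hypothetical.

```python
from collections import Counter


def allele_number(alleles):
    """Number of distinct alleles observed at one locus."""
    return len(set(alleles))


def expected_heterozygosity(alleles):
    """He = 1 - sum of squared allele frequencies at one locus (uncorrected)."""
    counts = Counter(alleles)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def mean_expected_heterozygosity(loci):
    """Average He over loci; `loci` is a list of allele lists, one per locus."""
    return sum(expected_heterozygosity(a) for a in loci) / len(loci)
```

    For example, a locus with two alleles at equal frequency gives He = 0.5, and the value approaches 1 as the number of equally frequent alleles grows.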

  17. Statistical Modelling of Wind Profiles - Data Analysis and Modelling

    DEFF Research Database (Denmark)

    Jónsson, Tryggvi; Pinson, Pierre

    The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.

  18. Statistical analysis of long term spatial and temporal trends of ...

    Indian Academy of Sciences (India)

    Statistical analysis of long term spatial and temporal trends of temperature ... CGCM3; HadCM3; modified Mann–Kendall test; statistical analysis; Sutlej basin. ... Water Resources Systems Division, National Institute of Hydrology, Roorkee 247 ...

  19. MONITORING OF GENETIC DIVERSITY IN FARMED DEER POPULATIONS USING MICROSATELLITE MARKERS

    Directory of Open Access Journals (Sweden)

    Pavol Bajzík

    2011-12-01

    Full Text Available Deer (Cervidae) are among the most important species used as farmed animals. We focused on assessing the genetic diversity among five deer populations. The analysis was performed on a total of 183 animals originating from the Czech Republic, Hungary, New Zealand, Poland and the Slovak Republic. Genetic variability was investigated using 8 microsatellite markers used in deer. Statistical data for all populations were obtained on the basis of Nei statistics, using the POWERMARKER 3.23 programme. A graphical view of relationships among populations and among individuals within populations was obtained using the Dendroscope software. Molecular genetic data combined with evaluation in statistical programmes could lead to a complex view of populations and the differences among them. doi:10.5219/172

  20. CORSSA: The Community Online Resource for Statistical Seismicity Analysis

    Science.gov (United States)

    Michael, Andrew J.; Wiemer, Stefan

    2010-01-01

    Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools makes it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORSSA. It also describes its structure and contents.

  1. Multivariate statistical analysis a high-dimensional approach

    CERN Document Server

    Serdobolskii, V

    2000-01-01

    In the last few decades the accumulation of large amounts of information in numerous applications has stimulated an increased interest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a deficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimating the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivariate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution in dependence of data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommen...

  2. Applied multivariate statistical analysis

    CERN Document Server

    Härdle, Wolfgang Karl

    2015-01-01

    Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners.  It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added.  All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior.  All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...

  3. Comparative linkage meta-analysis reveals regionally-distinct, disparate genetic architectures: application to bipolar disorder and schizophrenia.

    Directory of Open Access Journals (Sweden)

    Brady Tang

    2011-04-01

    Full Text Available New high-throughput, population-based methods and next-generation sequencing capabilities hold great promise in the quest for common and rare variant discovery and in the search for "missing heritability." However, the optimal analytic strategies for approaching such data are still actively debated, representing the latest rate-limiting step in genetic progress. Since it is likely a majority of common variants of modest effect have been identified through the application of tagSNP-based microarray platforms (i.e., GWAS), alternative approaches robust to detection of low-frequency (1-5% MAF) and rare (<1%) variants are of great importance. Of direct relevance, we have available an accumulated wealth of linkage data collected through traditional genetic methods over several decades, the full value of which has not been exhausted. To that end, we compare results from two different linkage meta-analysis methods--GSMA and MSP--applied to the same set of 13 bipolar disorder and 16 schizophrenia GWLS datasets. Interestingly, we find that the two methods implicate distinct, largely non-overlapping, genomic regions. Furthermore, based on the statistical methods themselves and our contextualization of these results within the larger genetic literatures, our findings suggest, for each disorder, distinct genetic architectures may reside within disparate genomic regions. Thus, comparative linkage meta-analysis (CLMA) may be used to optimize low-frequency and rare variant discovery in the modern genomic era.

  4. Statistical evaluation of vibration analysis techniques

    Science.gov (United States)

    Milner, G. Martin; Miller, Patrice S.

    1987-01-01

    An evaluation methodology is presented for a selection of candidate vibration analysis techniques applicable to machinery representative of the environmental control and life support system of advanced spacecraft; illustrative results are given. Attention is given to the statistical analysis of small sample experiments, the quantification of detection performance for diverse techniques through the computation of probability of detection versus probability of false alarm, and the quantification of diagnostic performance.
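
    Quantifying detection performance as probability of detection versus probability of false alarm amounts to sweeping a decision threshold over detector scores. The sketch below is a generic illustration, not the evaluation procedure of the report; the function name, the score lists, and the convention that higher scores indicate a fault are assumptions.

```python
def detection_curve(scores_fault, scores_normal, thresholds):
    """Return (threshold, P_detection, P_false_alarm) for each threshold.

    scores_fault  : detector outputs recorded when a fault was present
    scores_normal : detector outputs recorded under normal operation
    A score at or above the threshold counts as an alarm.
    """
    curve = []
    for t in thresholds:
        p_detect = sum(s >= t for s in scores_fault) / len(scores_fault)
        p_false = sum(s >= t for s in scores_normal) / len(scores_normal)
        curve.append((t, p_detect, p_false))
    return curve
```

    With small samples, as the abstract emphasizes, each point on such a curve carries considerable uncertainty, so comparisons between techniques need interval estimates rather than point values alone.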

  5. HistFitter software framework for statistical data analysis

    CERN Document Server

    Baak, M.; Côte, D.; Koutsman, A.; Lorenz, J.; Short, D.

    2015-01-01

    We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with mu...

  6. A Statistical Framework for the Functional Analysis of Metagenomes

    Energy Technology Data Exchange (ETDEWEB)

    Sharon, Itai; Pati, Amrita; Markowitz, Victor; Pinter, Ron Y.

    2008-10-01

    Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. They present a statistical framework for assessing these frequencies based on the Lander-Waterman theory developed originally for Whole Genome Shotgun (WGS) sequencing projects. They also provide a novel method for assessing the reliability of the estimations which can be used for removing seemingly unreliable measurements. They tested their method on a wide range of datasets, including simulated genomes and real WGS data from sequencing projects of whole genomes. Results suggest that their framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.

  7. Genetics of healthy aging in Europe: the EU-integrated project GEHA (GEnetics of Healthy Aging)

    DEFF Research Database (Denmark)

    Franceschi, Claudio; Bezrukov, Vladyslav; Blanché, Hélène

    2007-01-01

    The aim of the 5-year European Union (EU)-Integrated Project GEnetics of Healthy Aging (GEHA), constituted by 25 partners (24 from Europe plus the Beijing Genomics Institute from China), is to identify genes involved in healthy aging and longevity, which allow individuals to survive to advanced old......DNA). The genetic analysis will be performed by 9 high-throughput platforms, within the framework of centralized databases for phenotypic, genetic, and mtDNA data. Additional advanced approaches (bioinformatics, advanced statistics, mathematical modeling, functional genomics and proteomics, molecular biology...... age in good cognitive and physical function and in the absence of major age-related diseases. To achieve this aim a coherent, tightly integrated program of research that unites demographers, geriatricians, geneticists, genetic epidemiologists, molecular biologists, bioinformaticians, and statisticians...

  8. A statistical approach to quantification of genetically modified organisms (GMO) using frequency distributions.

    Science.gov (United States)

    Gerdes, Lars; Busch, Ulrich; Pecoraro, Sven

    2014-12-14

    According to Regulation (EU) No 619/2011, trace amounts of non-authorised genetically modified organisms (GMO) in feed are tolerated within the EU if certain prerequisites are met. Tolerable traces must not exceed the so-called 'minimum required performance limit' (MRPL), which was defined according to the mentioned regulation to correspond to 0.1% mass fraction per ingredient. Therefore, not yet authorised GMO (and some GMO whose approvals have expired) have to be quantified at very low level following the qualitative detection in genomic DNA extracted from feed samples. As the results of quantitative analysis can imply severe legal and financial consequences for producers or distributors of feed, the quantification results need to be utterly reliable. We developed a statistical approach to investigate the experimental measurement variability within one 96-well PCR plate. This approach visualises the frequency distribution as zygosity-corrected relative content of genetically modified material resulting from different combinations of transgene and reference gene Cq values. One application of it is the simulation of the consequences of varying parameters on measurement results. Parameters could be, for example, replicate numbers or baseline and threshold settings; measurement results could be, for example, median (class) and relative standard deviation (RSD). All calculations can be done using the built-in functions of Excel without any need for programming. The developed Excel spreadsheets are available (see section 'Availability of supporting data' for details). In most cases, the combination of four PCR replicates for each of the two DNA isolations already resulted in a relative standard deviation of 15% or less. The aims of the study are scientifically based suggestions for minimisation of uncertainty of measurement especially in (but not limited to) the field of GMO quantification at low concentration levels.
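
    The idea of combining every transgene Cq value with every reference-gene Cq value and summarising the resulting distribution by its median and RSD can be sketched in a few lines. This is only an illustration, not the authors' spreadsheet: the delta-Cq model with 100% PCR efficiency, the Cq noise level `sd_cq`, the default `zygosity_factor` and all numeric values are assumptions made here.

```python
import random
import statistics


def relative_content(cq_transgene, cq_reference, zygosity_factor=1.0):
    """GM content in percent from one Cq pair.

    Assumes a simple delta-Cq model with 100 % PCR efficiency
    (an assumption of this sketch, not stated in the abstract).
    """
    return 100.0 * zygosity_factor * 2.0 ** (cq_reference - cq_transgene)


def simulate(n_replicates, sd_cq=0.2, true_delta=10.0, seed=1):
    """Median and RSD (%) over all transgene x reference Cq combinations."""
    rng = random.Random(seed)
    # Hypothetical replicate Cq values with Gaussian measurement noise.
    reference = [rng.gauss(25.0, sd_cq) for _ in range(n_replicates)]
    transgene = [rng.gauss(25.0 + true_delta, sd_cq) for _ in range(n_replicates)]
    contents = [relative_content(t, r) for t in transgene for r in reference]
    median = statistics.median(contents)
    rsd = 100.0 * statistics.stdev(contents) / statistics.mean(contents)
    return median, rsd


# Vary the number of PCR replicates and inspect the effect on the summary values.
for n in (2, 4, 8):
    median, rsd = simulate(n)
    print(n, round(median, 4), round(rsd, 1))
```

    A Cq difference of 10 cycles corresponds to 100/1024, or roughly 0.1%, under this model, which is why `true_delta=10.0` was chosen as the illustrative default.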

  9. Statistical analysis on extreme wave height

    Digital Repository Service at National Institute of Oceanography (India)

    Teena, N.V.; SanilKumar, V.; Sudheesh, K.; Sajeev, R.

    -294. • WAFO (2000) – A MATLAB toolbox for analysis of random waves and loads, Lund University, Sweden, homepage http://www.maths.lth.se/matstat/wafo/, 2000. Table 1: Statistical results of data and fitted distribution for cumulative distribution...

  10. Genetic and immunohistochemical analysis of HSPA5 in mouse and human retinas

    OpenAIRE

    Chintalapudi, Sumana R.; Wang, XiaoFei; Li, Huiling; Lau, Yin H. Chan; Williams, Robert W.; Jablonski, Monica M.

    2016-01-01

    Purpose: Photoreceptor degenerative diseases are among the leading causes of vision loss. Although the causative genetic mutations are often known, mechanisms leading to photoreceptor degeneration remain poorly defined. We have previously demonstrated that the photoreceptor membrane-associated protein XAP-1 antigen is a product of the HSPA5 gene. In this study, we used systems genetic methods, statistical modeling, and immunostaining to identify and analyze candidate genes that modulate Hspa5 ...

  11. Statistical Analysis of Zebrafish Locomotor Response.

    Science.gov (United States)

    Liu, Yiwen; Carmer, Robert; Zhang, Gaonan; Venkatraman, Prahatha; Brown, Skye Ashton; Pang, Chi-Pui; Zhang, Mingzhi; Ma, Ping; Leung, Yuk Fai

    2015-01-01

    Zebrafish larvae display rich locomotor behaviour upon external stimulation. The movement can be simultaneously tracked from many larvae arranged in multi-well plates. The resulting time-series locomotor data have been used to reveal new insights into neurobiology and pharmacology. However, the data are of large scale, and the corresponding locomotor behavior is affected by multiple factors. These issues pose a statistical challenge for comparing larval activities. To address this gap, this study has analyzed a visually-driven locomotor behaviour named the visual motor response (VMR) by the Hotelling's T-squared test. This test is congruent with comparing locomotor profiles from a time period. Different wild-type (WT) strains were compared using the test, which shows that they responded differently to light change at different developmental stages. The performance of this test was evaluated by a power analysis, which shows that the test was sensitive for detecting differences between experimental groups with sample numbers that were commonly used in various studies. In addition, this study investigated the effects of various factors that might affect the VMR by multivariate analysis of variance (MANOVA). The results indicate that the larval activity was generally affected by stage, light stimulus, their interaction, and location in the plate. Nonetheless, different factors affected larval activity differently over time, as indicated by a dynamical analysis of the activity at each second. Intriguingly, this analysis also shows that biological and technical repeats had negligible effect on larval activity. This finding is consistent with that from the Hotelling's T-squared test, and suggests that experimental repeats can be combined to enhance statistical power. Together, these investigations have established a statistical framework for analyzing VMR data, a framework that should be generally applicable to other locomotor data with similar structure.
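
    For readers unfamiliar with the two-sample Hotelling's T-squared test mentioned above, a minimal pure-Python sketch follows. It treats each larva's activity profile as a short vector and compares two groups; the data, the function names and the use of a pooled covariance matrix are assumptions of this sketch, and the p-value lookup from the F distribution is left out.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length profiles."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows, mean):
    """Sum of outer products of deviations from the mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def hotelling_t2(group1, group2):
    """Return (T2, F statistic, (df1, df2)) for two groups of profiles."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = mean_vector(group1), mean_vector(group2)
    p = len(m1)
    s1, s2 = scatter_matrix(group1, m1), scatter_matrix(group2, m2)
    # Pooled covariance matrix (assumes equal covariance in both groups).
    pooled = [[(s1[i][j] + s2[i][j]) / (n1 + n2 - 2) for j in range(p)]
              for i in range(p)]
    diff = [a - b for a, b in zip(m1, m2)]
    w = solve(pooled, diff)  # pooled^{-1} * diff without forming the inverse
    t2 = (n1 * n2) / (n1 + n2) * sum(d * wi for d, wi in zip(diff, w))
    # Under the null hypothesis this scaled statistic follows F(p, n1+n2-p-1).
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return t2, f_stat, (p, n1 + n2 - p - 1)


# Hypothetical activity profiles (two time bins per larva) for two strains.
strain_a = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
strain_b = [[2.0, 4.0], [3.0, 3.0], [4.0, 6.0], [5.0, 5.0]]
t2, f_stat, dof = hotelling_t2(strain_a, strain_b)
print(t2, f_stat, dof)
```

    The test needs more larvae than time bins per profile, otherwise the pooled covariance matrix is singular; real VMR profiles would typically be binned or reduced before this step.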

  12. Time Series Analysis Based on Running Mann Whitney Z Statistics

    Science.gov (United States)

    A sensitive and objective time series analysis method based on the calculation of Mann Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-...
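
    A minimal sketch of a running Mann-Whitney comparison is given below. Two points are assumptions of this sketch rather than details from the record: each window is compared with the window immediately preceding it, and the U statistic is converted to Z with the standard large-sample normal approximation instead of the normalization mentioned in the truncated abstract. The function names are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """U statistic for x versus y: each pair with x > y counts 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def running_z(series, window):
    """Z score of each window against the window immediately before it."""
    n1 = n2 = window
    mean_u = n1 * n2 / 2.0
    # Normal approximation without tie correction.
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    zs = []
    for start in range(window, len(series) - window + 1):
        before = series[start - window:start]
        after = series[start:start + window]
        zs.append((mann_whitney_u(after, before) - mean_u) / sd_u)
    return zs


# Hypothetical series with an upward shift halfway through.
series = [1.0, 2.0, 1.5, 2.5, 1.0, 2.0, 5.0, 6.0, 5.5, 6.5, 5.0, 6.0]
print(running_z(series, window=3))
```

    Large positive or negative Z values mark positions where the later window ranks consistently above or below the earlier one; for very short windows the normal approximation is rough.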

  13. Sensitivity analysis of ranked data: from order statistics to quantiles

    NARCIS (Netherlands)

    Heidergott, B.F.; Volk-Makarewicz, W.

    2015-01-01

    In this paper we provide the mathematical theory for sensitivity analysis of order statistics of continuous random variables, where the sensitivity is with respect to a distributional parameter. Sensitivity analysis of order statistics over a finite number of observations is discussed before

  14. Genetic and immunohistochemical analysis of HSPA5 in mouse and human retinas.

    Science.gov (United States)

    Chintalapudi, Sumana R; Wang, XiaoFei; Li, Huiling; Lau, Yin H Chan; Williams, Robert W; Jablonski, Monica M

    2016-01-01

    Photoreceptor degenerative diseases are among the leading causes of vision loss. Although the causative genetic mutations are often known, mechanisms leading to photoreceptor degeneration remain poorly defined. We have previously demonstrated that the photoreceptor membrane-associated protein XAP-1 antigen is a product of the HSPA5 gene. In this study, we used systems genetic methods, statistical modeling, and immunostaining to identify and analyze candidate genes that modulate Hspa5 expression in the retina. Quantitative trait locus (QTL) mapping was used to map the genomic region that regulates Hspa5 in the cross between C57BL/6J X DBA/2J mice (BXD) genetic reference panel. The stepwise refinement of candidate genes was based on expression QTL mapping, gene expression correlation analyses (direct and partial), and analysis of regional sequence variants. The subcellular localization of candidate proteins and HSPA5 in mouse and human retinas was evaluated by immunohistochemistry. Differences in the localization of extracellular HSPA5 were assessed between healthy human donor and atrophic age-related macular degeneration (AMD) donor eyes. In the eyes of healthy mice, extracellular HSPA5 was confined to the area around the cone photoreceptor outer segments. Mapping variation in Hspa5 mRNA expression levels in the retina revealed a statistically significant trans-acting expression QTL (eQTL) on Chromosome 2 (Chr 2) and a suggestive locus on Chr 15. Sulf2 on Chr 2 was the strongest candidate gene based on partial correlation analysis, Pearson correlation with Hspa5, expression levels in the retina, a missense variant in exon 14, and its reported function in the extracellular matrix and interphotoreceptor matrix. SULF2 is localized to the rod and cone photoreceptors in both human and mouse retinas. In human retinas with no pathology, extracellular HSPA5 was localized around many cones within the macular area. In contrast, fewer HSPA5-immunopositive cones were

  15. Statistical Methods for Studying Genetic Variation in Populations

    Science.gov (United States)

    2012-08-01

    iteration will converge to a local optimum, similar to what happens in an EM algorithm. Empirically, a near global optimal can be obtained by multiple...and E Matthysen. Genetic variability and gene flow in the globally, critically-endangered Taita thrush. Conservation Genetics, 1:45–55, 2000. 4.5.2...Libioulle, Edouard Louis, Sarah Hansoul, Cynthia Sandor, Frédéric Farnir, Denis Franchimont, Séverine Vermeire, Olivier Dewit, Martine de Vos, Anna

  16. Habitat fragmentation causes rapid genetic differentiation and ...

    African Journals Online (AJOL)

    ... city buildings. These results were supported by multiple statistical analyses including Mantel's test, PCOORDA and AMOVA. Genetic enrichment and epigenetic variation studies can be included in habitat fragmentation analysis and its implications in inducing homogenization and susceptibility in natural plant populations.

  17. Analysis of genetic relationships of mulberry (Morus L.) germplasm ...

    African Journals Online (AJOL)

    2009-06-03

    Jun 3, 2009 ... Full Length Research Paper. Analysis of genetic ... Key words: Mulberry, molecular marker, genetic diversity, SRAP. ... Europe, North and South America, and Africa, and it is cultivated ... Xingjiang autonomous region, China.

  18. Feature-Based Statistical Analysis of Combustion Simulation Data

    Energy Technology Data Exchange (ETDEWEB)

    Bennett, J; Krishnamoorthy, V; Liu, S; Grout, R; Hawkes, E; Chen, J; Pascucci, V; Bremer, P T

    2011-11-18

    We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. 
We highlight the utility of this new framework for combustion

  19. Statistical learning methods in high-energy and astrophysics analysis

    Energy Technology Data Exchange (ETDEWEB)

    Zimmermann, J. [Forschungszentrum Juelich GmbH, Zentrallabor fuer Elektronik, 52425 Juelich (Germany) and Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)]. E-mail: zimmerm@mppmu.mpg.de; Kiesling, C. [Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)

    2004-11-21

    We discuss several popular statistical learning methods used in high-energy- and astro-physics analysis. After a short motivation for statistical learning we present the most popular algorithms and discuss several examples from current research in particle- and astro-physics. The statistical learning methods are compared with each other and with standard methods for the respective application.

  20. Statistical learning methods in high-energy and astrophysics analysis

    International Nuclear Information System (INIS)

    Zimmermann, J.; Kiesling, C.

    2004-01-01

    We discuss several popular statistical learning methods used in high-energy- and astro-physics analysis. After a short motivation for statistical learning we present the most popular algorithms and discuss several examples from current research in particle- and astro-physics. The statistical learning methods are compared with each other and with standard methods for the respective application

  1. The fuzzy approach to statistical analysis

    NARCIS (Netherlands)

    Coppi, Renato; Gil, Maria A.; Kiers, Henk A. L.

    2006-01-01

    For the last decades, research studies have been developed in which a coalition of Fuzzy Sets Theory and Statistics has been established with different purposes. These namely are: (i) to introduce new data analysis problems in which the objective involves either fuzzy relationships or fuzzy terms;

  2. Statistical analysis applied to safety culture self-assessment

    International Nuclear Information System (INIS)

    Macedo Soares, P.P.

    2002-01-01

    Interviews and opinion surveys are instruments used to assess the safety culture in an organization as part of the Safety Culture Enhancement Programme. Specific statistical tools are used to analyse the survey results. This paper presents an example of an opinion survey with the corresponding application of the statistical analysis and the conclusions obtained. Survey validation, Frequency statistics, Kolmogorov-Smirnov non-parametric test, Student (T-test) and ANOVA means comparison tests and LSD post-hoc multiple comparison test, are discussed. (author)

  3. Inference and Analysis of Population Structure Using Genetic Data and Network Theory.

    Science.gov (United States)

    Greenbaum, Gili; Templeton, Alan R; Bar-David, Shirli

    2016-04-01

    Clustering individuals to subpopulations based on genetic data has become commonplace in many genetic studies. Inference about population structure is most often done by applying model-based approaches, aided by visualization using distance-based approaches such as multidimensional scaling. While existing distance-based approaches suffer from a lack of statistical rigor, model-based approaches entail assumptions of prior conditions such as that the subpopulations are at Hardy-Weinberg equilibria. Here we present a distance-based approach for inference about population structure using genetic data by defining population structure using network theory terminology and methods. A network is constructed from a pairwise genetic-similarity matrix of all sampled individuals. The community partition, a partition of a network to dense subgraphs, is equated with population structure, a partition of the population to genetically related groups. Community-detection algorithms are used to partition the network into communities, interpreted as a partition of the population to subpopulations. The statistical significance of the structure can be estimated by using permutation tests to evaluate the significance of the partition's modularity, a network theory measure indicating the quality of community partitions. To further characterize population structure, a new measure of the strength of association (SA) for an individual to its assigned community is presented. The strength of association distribution (SAD) of the communities is analyzed to provide additional population structure characteristics, such as the relative amount of gene flow experienced by the different subpopulations and identification of hybrid individuals. Human genetic data and simulations are used to demonstrate the applicability of the analyses. The approach presented here provides a novel, computationally efficient model-free method for inference about population structure that does not entail assumption of
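
    The modularity measure and a permutation test of its significance can be sketched as follows. The network here is a hypothetical toy example, and the null model (shuffling community labels among individuals) is a simple stand-in chosen for this sketch; the record does not say which permutation scheme the authors use.

```python
import random


def modularity(adj, labels):
    """Newman modularity Q of a partition of a weighted, undirected network.

    adj is a symmetric matrix of edge weights with a zero diagonal.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


def permutation_p_value(adj, labels, n_perm=999, seed=0):
    """Share of label shuffles whose modularity reaches the observed value."""
    rng = random.Random(seed)
    observed = modularity(adj, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if modularity(adj, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy similarity network: two triangles joined by a single edge (2-3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1.0
partition = [0, 0, 0, 1, 1, 1]
q, p_value = permutation_p_value(adj, partition)
print(q, p_value)
```

    In practice the partition would come from a community-detection algorithm run on the pairwise genetic-similarity matrix, and a six-node network is far too small to reach conventional significance levels.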

  4. Foundation of statistical energy analysis in vibroacoustics

    CERN Document Server

    Le Bot, A

    2015-01-01

    This title deals with the statistical theory of sound and vibration. The foundation of statistical energy analysis is presented in great detail. In the modal approach, an introduction to random vibration with application to complex systems having a large number of modes is provided. For the wave approach, the phenomena of propagation, group speed, and energy transport are extensively discussed. Particular emphasis is given to the emergence of diffuse field, the central concept of the theory.

  5. HistFitter software framework for statistical data analysis

    Energy Technology Data Exchange (ETDEWEB)

    Baak, M. [CERN, Geneva (Switzerland); Besjes, G.J. [Radboud University Nijmegen, Nijmegen (Netherlands); Nikhef, Amsterdam (Netherlands); Cote, D. [University of Texas, Arlington (United States); Koutsman, A. [TRIUMF, Vancouver (Canada); Lorenz, J. [Ludwig-Maximilians-Universitaet Muenchen, Munich (Germany); Excellence Cluster Universe, Garching (Germany); Short, D. [University of Oxford, Oxford (United Kingdom)

    2015-04-15

    We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fit to data and interpreted with statistical tests. Internally HistFitter uses the statistics packages RooStats and HistFactory. A key innovation of HistFitter is its design, which is rooted in analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple models at once that describe the data, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication quality style through a simple command-line interface. (orig.)

  6. HistFitter software framework for statistical data analysis

    International Nuclear Information System (INIS)

    Baak, M.; Besjes, G.J.; Cote, D.; Koutsman, A.; Lorenz, J.; Short, D.

    2015-01-01

    We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fit to data and interpreted with statistical tests. Internally HistFitter uses the statistics packages RooStats and HistFactory. A key innovation of HistFitter is its design, which is rooted in analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple models at once that describe the data, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication quality style through a simple command-line interface. (orig.)

  7. Robust statistics and geochemical data analysis

    International Nuclear Information System (INIS)

    Di, Z.

    1987-01-01

    Advantages of robust procedures over ordinary least-squares procedures in geochemical data analysis are demonstrated using NURE data from the Hot Springs Quadrangle, South Dakota, USA. Robust principal components analysis with 5% multivariate trimming successfully guarded the analysis against perturbations by outliers and increased the number of interpretable factors. Regression with SINE estimates significantly increased the goodness-of-fit of the regression and improved the correspondence of delineated anomalies with known uranium prospects. Because of the ubiquitous existence of outliers in geochemical data, robust statistical procedures are suggested as routine procedures to replace ordinary least-squares procedures.

  8. Genetic analysis of repeated, biparental, diploid, hydatidiform moles

    DEFF Research Database (Denmark)

    Sunde, Lone; Vejerslev, Lars O.; Jensen, Mie Poulsen

    1993-01-01

    for the abnormal development can be envisaged, environmental as well as genetic. To conform to current ideas of molar pathogenesis, it is suggested that the present conceptuses might have arisen from imbalances in imprinted genomic regions. This could be a consequence of uniparental disomy in critical regions......A woman presented with five consecutive pregnancies displaying molar morphology. In the fifth pregnancy, a non-malformed, liveborn infant was delivered. Genetic analyses (RFLP analysis, cytogenetics, flow cytometry) were performed in pregnancies II-V. It was demonstrated that these pregnancies...... originated in separate conceptions, all conceptuses were diploid, and all had maternally as well as paternally derived genetic markers. By cytogenetic analysis, aberrant heteromorphisms were noted; no other abnormalities were observed in chromosome structure or in DNA sequence. Many different causes...

  9. Review: domestic animal forensic genetics - biological evidence, genetic markers, analytical approaches and challenges.

    Science.gov (United States)

    Kanthaswamy, S

    2015-10-01

    handling, evidence testing, statistical analysis and reporting that meet the rules of scientific acceptance, reliability and human forensic identification standards. © 2015 Stichting International Foundation for Animal Genetics.

  10. Using Pre-Statistical Analysis to Streamline Monitoring Assessments

    International Nuclear Information System (INIS)

    Reed, J.K.

    1999-01-01

    A variety of statistical methods exist to aid evaluation of groundwater quality and subsequent decision making in regulatory programs. These methods are applied because of large temporal and spatial extrapolations commonly applied to these data. In short, statistical conclusions often serve as a surrogate for knowledge. However, facilities with mature monitoring programs that have generated abundant data have inherently less uncertainty because of the sheer quantity of analytical results. In these cases, statistical tests can be less important, and "expert" data analysis should assume an important screening role. The WSRC Environmental Protection Department, working with the General Separations Area BSRI Environmental Restoration project team, has developed a method for an Integrated Hydrogeological Analysis (IHA) of historical water quality data from the F and H Seepage Basins groundwater remediation project. The IHA combines common sense analytical techniques and a GIS presentation that force direct interactive evaluation of the data. The IHA can perform multiple data analysis tasks required by the RCRA permit. These include: (1) Development of a groundwater quality baseline prior to remediation startup, (2) Targeting of constituents for removal from RCRA GWPS, (3) Targeting of constituents for removal from UIC permit, (4) Targeting of constituents for reduced, (5) Targeting of monitoring wells not producing representative samples, (6) Reduction in statistical evaluation, and (7) Identification of contamination from other facilities.

  11. Genetic diversity of popcorn genotypes using molecular analysis.

    Science.gov (United States)

    Resh, F S; Scapim, C A; Mangolin, C A; Machado, M F P S; do Amaral, A T; Ramos, H C C; Vivas, M

    2015-08-19

    In this study, we analyzed dominant molecular markers to estimate the genetic divergence of 26 popcorn genotypes and evaluate whether using various dissimilarity coefficients with these dominant markers influences the results of cluster analysis. Fifteen random amplification of polymorphic DNA primers produced 157 amplified fragments, of which 65 were monomorphic and 92 were polymorphic. To calculate the genetic distances among the 26 genotypes, the complements of the Jaccard, Dice, and Rogers and Tanimoto similarity coefficients were used. A matrix of Dij values (dissimilarity matrix) was constructed, from which the genetic distances among genotypes were represented in a more simplified manner as a dendrogram generated using the unweighted pair-group method with arithmetic average. Clusters determined by molecular analysis generally did not group material from the same parental origin together. The largest genetic distance was between varieties 17 (UNB-2) and 18 (PA-091). In the identification of genotypes with the smallest genetic distance, the 3 coefficients showed no agreement. The 3 dissimilarity coefficients showed no major differences among their grouping patterns because agreement in determining the genotypes with large, medium, and small genetic distances was high. The largest genetic distances were observed for the Rogers and Tanimoto dissimilarity coefficient (0.74), followed by the Jaccard coefficient (0.65) and the Dice coefficient (0.48). The 3 coefficients showed similar estimations for the cophenetic correlation coefficient. Correlations among the matrices generated using the 3 coefficients were positive and had high magnitudes, reflecting strong agreement among the results obtained using the 3 evaluated dissimilarity coefficients.
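
    The three dissimilarity measures compared above are the complements of standard similarity coefficients for binary (band present/absent) marker data. A minimal sketch, using two hypothetical marker profiles and function names chosen here for illustration:

```python
def counts(x, y):
    """Contingency counts for two binary marker profiles.

    a: band present in both, b: only in x, c: only in y, d: absent in both.
    """
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)
    return a, b, c, d


def jaccard_distance(x, y):
    a, b, c, _ = counts(x, y)
    return 1.0 - a / (a + b + c)


def dice_distance(x, y):
    a, b, c, _ = counts(x, y)
    return 1.0 - 2 * a / (2 * a + b + c)


def rogers_tanimoto_distance(x, y):
    # Unlike Jaccard and Dice, shared absences (d) count as agreement here.
    a, b, c, d = counts(x, y)
    return 1.0 - (a + d) / (a + d + 2 * (b + c))


# Hypothetical RAPD band profiles for two genotypes (1 = band present).
genotype_1 = [1, 1, 0, 1, 0, 0, 1, 0]
genotype_2 = [1, 0, 0, 1, 1, 0, 0, 0]
print(jaccard_distance(genotype_1, genotype_2))
print(dice_distance(genotype_1, genotype_2))
print(rogers_tanimoto_distance(genotype_1, genotype_2))
```

    Jaccard and Dice distances are monotonically related (Dice = J / (2 - J) for a Jaccard distance J), so they always rank pairs identically; a Jaccard distance of 0.65 maps to a Dice distance of about 0.48 under this identity, which matches the maxima reported above. The Jaccard and Dice functions are undefined when neither profile shows any band.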

  12. Conjunction analysis and propositional logic in fMRI data analysis using Bayesian statistics.

    Science.gov (United States)

    Rudert, Thomas; Lohmann, Gabriele

    2008-12-01

    To evaluate logical expressions over different effects in data analyses using the general linear model (GLM) and to evaluate logical expressions over different posterior probability maps (PPMs). In functional magnetic resonance imaging (fMRI) data analysis, the GLM was applied to estimate unknown regression parameters. Based on the GLM, Bayesian statistics can be used to determine the probability of conjunction, disjunction, implication, or any other arbitrary logical expression over different effects or contrast. For second-level inferences, PPMs from individual sessions or subjects are utilized. These PPMs can be combined to a logical expression and its probability can be computed. The methods proposed in this article are applied to data from a STROOP experiment and the methods are compared to conjunction analysis approaches for test-statistics. The combination of Bayesian statistics with propositional logic provides a new approach for data analyses in fMRI. Two different methods are introduced for propositional logic: the first for analyses using the GLM and the second for common inferences about different probability maps. The methods introduced extend the idea of conjunction analysis to a full propositional logic and adapt it from test-statistics to Bayesian statistics. The new approaches allow inferences that are not possible with known standard methods in fMRI. (c) 2008 Wiley-Liss, Inc.

  13. Conceptual Incongruence between Prion Disease and Genetic Diversity in Ovine Species within European Union defined by Informational Statistics Terms

    Directory of Open Access Journals (Sweden)

    Gheorghe Hrinca

    2016-11-01

    Biodiversity and the studies of spongiform encephalopathies in the farm animals are highly topical concerns of the contemporary scientific world. Both themes are very interesting for the life sciences and very important for the application field of animal breeding. The implementation of these two concepts creates an antithetical paradigm: the achievement of genetic prophylaxis joins with the decrease of genetic diversity. The paper examines the genetic diversity and its evolution in sheep livestock from the European space in the context in which the European Community has developed very laborious and costly programs targeted both for conservation and enhancement of biodiversity and to eradicate the scrapie in small ruminants. This paper utilises a precise method to quantify the genetic biodiversity in all sheep populations in Europe by a modern concept derived from informational statistics - informational energy. In addition, the paper proposes concrete and viable solutions to achieve these two desiderata at optimal levels in connection with a perfect perspicacity of sheep breeder which consists in accuracy of the reproduction process and correct application of the selection criteria.
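
    Assuming that "informational energy" here refers to Onicescu's measure (the sum of squared relative frequencies), which the record itself does not confirm, the quantity can be computed as in the sketch below. The allele counts and the function name are hypothetical.

```python
def informational_energy(counts):
    """Onicescu informational energy: sum of squared relative frequencies.

    Ranges from 1/k (k equally frequent classes, maximal diversity)
    to 1 (a single class, no diversity).
    """
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


# Hypothetical allele counts at one locus in two sheep populations.
diverse_population = [25, 25, 25, 25]
selected_population = [85, 5, 5, 5]
print(informational_energy(diverse_population))
print(informational_energy(selected_population))
```

    Under this reading, selection that drives one allele toward fixation raises the informational energy, which is one way to express the tension between genetic prophylaxis and diversity described above.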

  14. RESEARCH NOTE Molecular genetic analysis of consanguineous ...

    Indian Academy of Sciences (India)

    Navya

    Molecular genetic analysis of consanguineous families with primary microcephaly ... Translational Research Institute, Academic Health System, Hamad Medical ..... bridging the gap between homozygosity mapping and deep sequencing.

  15. A genetic analysis of Trichuris trichiura and Trichuris suis from Ecuador

    DEFF Research Database (Denmark)

    Meekums, Hayley; Hawash, Mohamed B F; Sparks, Alexandra M

    2015-01-01

    BACKGROUND: Since the nematodes Trichuris trichiura and T. suis are morphologically indistinguishable, genetic analysis is required to assess epidemiological cross-over between people and pigs. This study aimed to clarify the transmission biology of trichuriasis in Ecuador. FINDINGS: Adult...... Trichuris worms were collected during a parasitological survey of 132 people and 46 pigs in Esmeraldas Province, Ecuador. Morphometric analysis of 49 pig worms and 64 human worms revealed significant variation. In discriminant analysis morphometric characteristics correctly classified male worms according...... to genetically analyse Trichuris parasites. Although T. trichiura does not appear to be zoonotic in Ecuador, there is evidence of genetic exchange between T. trichiura and T. suis warranting more detailed genetic sampling....

  16. Inferring Demographic History Using Two-Locus Statistics.

    Science.gov (United States)

    Ragsdale, Aaron P; Gutenkunst, Ryan N

    2017-06-01

    Population demographic history may be learned from contemporary genetic variation data. Methods based on aggregating the statistics of many single loci into an allele frequency spectrum (AFS) have proven powerful, but such methods ignore potentially informative patterns of linkage disequilibrium (LD) between neighboring loci. To leverage such patterns, we developed a composite-likelihood framework for inferring demographic history from aggregated statistics of pairs of loci. Using this framework, we show that two-locus statistics are more sensitive to demographic history than single-locus statistics such as the AFS. In particular, two-locus statistics escape the notorious confounding of depth and duration of a bottleneck, and they provide a means to estimate effective population size based on the recombination rather than mutation rate. We applied our approach to a Zambian population of Drosophila melanogaster. Notably, using both single- and two-locus statistics, we inferred a substantially lower ancestral effective population size than previous works and did not infer a bottleneck history. Together, our results demonstrate the broad potential for two-locus statistics to enable powerful population genetic inference. Copyright © 2017 by the Genetics Society of America.
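
    For readers unfamiliar with two-locus summaries, the sketch below computes the two most common pairwise LD quantities, D and r², from phased haplotypes. It only illustrates the kind of quantity involved; the composite-likelihood framework described in the abstract is not reproduced, and the function name and data are hypothetical.

```python
def ld_stats(haplotypes):
    """Return (D, r2) for two biallelic loci.

    `haplotypes` is a list of (a, b) pairs with alleles coded 0/1,
    one pair per sampled chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n            # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")  # undefined if a locus is monomorphic
    return d, r2


# Perfectly associated loci versus loci in linkage equilibrium.
d_linked, r2_linked = ld_stats([(1, 1), (1, 1), (0, 0), (0, 0)])
d_free, r2_free = ld_stats([(1, 1), (1, 0), (0, 1), (0, 0)])
```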

  17. Sensitivity analysis and optimization of system dynamics models : Regression analysis and statistical design of experiments

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    1995-01-01

    This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for

  18. Population Structure, Genetic Diversity and Molecular Marker-Trait Association Analysis for High Temperature Stress Tolerance in Rice.

    Directory of Open Access Journals (Sweden)

    Sharat Kumar Pradhan

    Full Text Available Rice exhibits enormous genetic diversity, population structure and molecular marker-traits associated with abiotic stress tolerance to high temperature stress. A set of breeding lines and landraces representing 240 germplasm lines were studied. Based on spikelet fertility percent under high temperature, tolerant genotypes were broadly classified into four classes. Genetic diversity indicated a moderate level of genetic base of the population for the trait studied. Wright's F statistic estimates showed a deviation from Hardy-Weinberg expectation in the population. The analysis of molecular variance revealed 25 percent variation between populations, 61 percent among individuals and 14 percent within individuals in the set. The STRUCTURE analysis categorized the entire population into three sub-populations and suggested that most of the landraces in each sub-population had a common primary ancestor with few admixed individuals. The composition of materials in the panel showed the presence of many QTLs representing the entire genome for the expression of tolerance. The strongly associated marker RM547 tagged with spikelet fertility under stress and the markers like RM228, RM205, RM247, RM242, INDEL3 and RM314 indirectly controlling the high temperature stress tolerance were detected through both mixed linear model and general linear model TASSEL analysis. These markers can be deployed as a resource for marker-assisted breeding programs for high temperature stress tolerance.

  19. Population Structure, Genetic Diversity and Molecular Marker-Trait Association Analysis for High Temperature Stress Tolerance in Rice.

    Science.gov (United States)

    Pradhan, Sharat Kumar; Barik, Saumya Ranjan; Sahoo, Ambika; Mohapatra, Sudipti; Nayak, Deepak Kumar; Mahender, Anumalla; Meher, Jitandriya; Anandan, Annamalai; Pandit, Elssa

    2016-01-01

    Rice exhibits enormous genetic diversity, population structure and molecular marker-traits associated with abiotic stress tolerance to high temperature stress. A set of breeding lines and landraces representing 240 germplasm lines were studied. Based on spikelet fertility percent under high temperature, tolerant genotypes were broadly classified into four classes. Genetic diversity indicated a moderate level of genetic base of the population for the trait studied. Wright's F statistic estimates showed a deviation from Hardy-Weinberg expectation in the population. The analysis of molecular variance revealed 25 percent variation between populations, 61 percent among individuals and 14 percent within individuals in the set. The STRUCTURE analysis categorized the entire population into three sub-populations and suggested that most of the landraces in each sub-population had a common primary ancestor with few admixed individuals. The composition of materials in the panel showed the presence of many QTLs representing the entire genome for the expression of tolerance. The strongly associated marker RM547 tagged with spikelet fertility under stress and the markers like RM228, RM205, RM247, RM242, INDEL3 and RM314 indirectly controlling the high temperature stress tolerance were detected through both mixed linear model and general linear model TASSEL analysis. These markers can be deployed as a resource for marker-assisted breeding programs for high temperature stress tolerance.
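
    The deviation from Hardy-Weinberg expectation mentioned above is commonly summarized by a fixation index of the form F = 1 - Ho/He. The sketch below computes it for a single locus; it assumes this textbook definition (the abstract does not say which estimator was used), and the function name and genotype counts are hypothetical.

```python
from collections import Counter


def fixation_index(genotype_counts):
    """Single-locus F = 1 - Ho/He from diploid genotype counts.

    `genotype_counts` maps (allele1, allele2) tuples to the number of
    individuals carrying that genotype. F near 0 matches Hardy-Weinberg
    proportions; F > 0 indicates a deficit of heterozygotes.
    """
    n = sum(genotype_counts.values())
    allele_counts = Counter()
    heterozygotes = 0
    for (a1, a2), count in genotype_counts.items():
        allele_counts[a1] += count
        allele_counts[a2] += count
        if a1 != a2:
            heterozygotes += count
    h_obs = heterozygotes / n                                        # observed heterozygosity
    h_exp = 1 - sum((c / (2 * n)) ** 2 for c in allele_counts.values())  # expected under HWE
    if h_exp == 0:
        raise ValueError("locus is monomorphic; F is undefined")
    return 1 - h_obs / h_exp


f_hwe = fixation_index({("A", "A"): 25, ("A", "B"): 50, ("B", "B"): 25})
f_inbred = fixation_index({("A", "A"): 50, ("B", "B"): 50})
```

    A largely self-pollinating crop such as rice is expected to show a strong heterozygote deficit, that is, F well above zero.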

  20. Multivariate Statistical Methods as a Tool of Financial Analysis of Farm Business

    Czech Academy of Sciences Publication Activity Database

    Novák, J.; Sůvová, H.; Vondráček, Jiří

    2002-01-01

    Vol. 48, No. 1 (2002), pp. 9-12 ISSN 0139-570X Institutional research plan: AV0Z1030915 Keywords: financial analysis * financial ratios * multivariate statistical methods * correlation analysis * discriminant analysis * cluster analysis Subject RIV: BB - Applied Statistics, Operational Research

  1. Statistical analysis and interpretation of prenatal diagnostic imaging studies, Part 2: descriptive and inferential statistical methods.

    Science.gov (United States)

    Tuuli, Methodius G; Odibo, Anthony O

    2011-08-01

    The objective of this article is to discuss the rationale for common statistical tests used for the analysis and interpretation of prenatal diagnostic imaging studies. Examples from the literature are used to illustrate descriptive and inferential statistics. The uses and limitations of linear and logistic regression analyses are discussed in detail.

  2. Analysis of genetic structure in Melia volkensii (Gurke.) populations ...

    African Journals Online (AJOL)

    Administrator

    2Farm Forestry Programme, Kenya Forestry Research Institute, P. O. Box 20412, Nairobi, Kenya. Accepted 5 ... were used to estimate genetic distances between populations and for construction of neighbour-joining phenograms. Analysis of Molecular Variance (AMOVA) indicated significant genetic differentiation between ...

  3. Statistical analysis of environmental data

    International Nuclear Information System (INIS)

    Beauchamp, J.J.; Bowman, K.O.; Miller, F.L. Jr.

    1975-10-01

    This report summarizes the analyses of data obtained by the Radiological Hygiene Branch of the Tennessee Valley Authority from samples taken around the Browns Ferry Nuclear Plant located in Northern Alabama. The data collection was begun in 1968 and a wide variety of types of samples have been gathered on a regular basis. The statistical analysis of environmental data involving very low-levels of radioactivity is discussed. Applications of computer calculations for data processing are described

  4. Highly Robust Statistical Methods in Medical Image Analysis

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2012-01-01

    Vol. 32, No. 2 (2012), pp. 3-16 ISSN 0208-5216 R&D Projects: GA MŠk(CZ) 1M06014 Institutional research plan: CEZ:AV0Z10300504 Keywords: robust statistics * classification * faces * robust image analysis * forensic science Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.208, year: 2012 http://www.ibib.waw.pl/bbe/bbefulltext/BBE_32_2_003_FT.pdf

  5. Statistical Power Analysis with Missing Data A Structural Equation Modeling Approach

    CERN Document Server

    Davey, Adam

    2009-01-01

    Statistical power analysis has revolutionized the ways in which we conduct and evaluate research. Similar developments in the statistical analysis of incomplete (missing) data are gaining more widespread applications. This volume brings statistical power and incomplete data together under a common framework, in a way that is readily accessible to those with only an introductory familiarity with structural equation modeling. It answers many practical questions such as: How missing data affects the statistical power in a study; How much power is likely with different amounts and types

  6. Statistical Analysis of Data for Timber Strengths

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard

    2003-01-01

    Statistical analyses are performed for material strength parameters from a large number of specimens of structural timber. Non-parametric statistical analysis and fits have been investigated for the following distribution types: Normal, Lognormal, 2-parameter Weibull and 3-parameter Weibull...... fits to the data available, especially if tail fits are used whereas the Lognormal distribution generally gives a poor fit and larger coefficients of variation, especially if tail fits are used. The implications on the reliability level of typical structural elements and on partial safety factors...... for timber are investigated....

  7. Numeric computation and statistical data analysis on the Java platform

    CERN Document Server

    Chekanov, Sergei V

    2016-01-01

    Numerical computation, knowledge discovery and statistical data analysis integrated with powerful 2D and 3D graphics for visualization are the key topics of this book. The Python code examples powered by the Java platform can easily be transformed to other programming languages, such as Java, Groovy, Ruby and BeanShell. This book equips the reader with a computational platform which, unlike other statistical programs, is not limited by a single programming language. The author focuses on practical programming aspects and covers a broad range of topics, from basic introduction to the Python language on the Java platform (Jython), to descriptive statistics, symbolic calculations, neural networks, non-linear regression analysis and many other data-mining topics. He discusses how to find regularities in real-world data, how to classify data, and how to process data for knowledge discoveries. The code snippets are so short that they easily fit into single pages. Numeric Computation and Statistical Data Analysis ...

  8. A Divergence Statistics Extension to VTK for Performance Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Pebay, Philippe Pierre [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bennett, Janine Camille [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-02-01

    This report follows the series of previous documents [PT08, BPRT09b, PT09, BPT09, PT10, PB13], where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit (VTK) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a statistical manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new divergence statistics engine is illustrated by means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
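
    The abstract does not name the divergence that the engine computes, and the report's own snippets are in C++. Purely as an illustration of the idea, the Python sketch below uses the Kullback-Leibler divergence between an empirical histogram and a theoretical distribution; the choice of divergence and the function name are assumptions, and nothing here reflects the VTK API.

```python
import math


def kl_divergence(observed_counts, theoretical_probs):
    """Kullback-Leibler divergence D(P || Q) in nats.

    P is the empirical distribution obtained from `observed_counts`;
    Q is given directly by `theoretical_probs` over the same bins.
    """
    n = sum(observed_counts)
    total = 0.0
    for count, q in zip(observed_counts, theoretical_probs):
        if count == 0:
            continue            # the term 0 * log(0 / q) is taken as 0
        if q <= 0:
            return math.inf     # observed mass where the model allows none
        p = count / n
        total += p * math.log(p / q)
    return total


# Hypothetical example: per-process workload compared with a perfectly balanced ideal.
balanced = kl_divergence([5, 5, 5, 5], [0.25] * 4)
skewed = kl_divergence([20, 0, 0, 0], [0.25] * 4)
```

    A value of zero means the observed distribution matches the ideal one exactly; larger values indicate a larger discrepancy.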

  9. The power and robustness of maximum LOD score statistics.

    Science.gov (United States)

    Yoo, Y J; Mendell, N R

    2008-07-01

    The maximum LOD score statistic is extremely powerful for gene mapping when calculated using the correct genetic parameter value. When the mode of genetic transmission is unknown, the maximum of the LOD scores obtained using several genetic parameter values is reported. This latter statistic requires a higher critical value than the maximum LOD score statistic calculated from a single genetic parameter value. In this paper, we compare the power of maximum LOD scores based on three fixed sets of genetic parameter values with the power of the LOD score obtained after maximizing over the entire range of genetic parameter values. We simulate family data under nine generating models. For generating models with non-zero phenocopy rates, LOD scores maximized over the entire range of genetic parameters yielded greater power than maximum LOD scores for fixed sets of parameter values with zero phenocopy rates. No maximum LOD score was consistently more powerful than the others for generating models with a zero phenocopy rate. The power loss of the LOD score maximized over the entire range of genetic parameters, relative to the maximum LOD score calculated using the correct genetic parameter value, appeared to be robust to the generating models.
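
    To make "maximizing a LOD score over a parameter" concrete, the sketch below treats the simplest textbook case: phase-known meioses, with the recombination fraction as the only parameter. The paper maximizes over genetic model parameters such as penetrance and phenocopy rate, which this sketch does not attempt; the function names, the grid, and the counts are hypothetical.

```python
import math


def lod(theta, recombinants, meioses):
    """LOD score log10[L(theta) / L(0.5)] for phase-known meioses."""
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1 - theta))
    log_l_null = meioses * math.log10(0.5)   # no linkage: theta = 0.5
    return log_l_theta - log_l_null


def max_lod(recombinants, meioses, grid=None):
    """Maximize the LOD score over a grid of recombination fractions."""
    if grid is None:
        grid = [i / 100 for i in range(1, 51)]   # 0.01 .. 0.50
    best_theta = max(grid, key=lambda t: lod(t, recombinants, meioses))
    return best_theta, lod(best_theta, recombinants, meioses)


# Hypothetical data: 2 recombinants observed in 20 informative meioses.
best_theta, best_lod = max_lod(recombinants=2, meioses=20)
```

    Because the reported score is the maximum over the grid rather than a value at one pre-specified parameter, its null distribution is shifted upward, which is why the abstract notes that a higher critical value is needed.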

  10. On the Statistical Validation of Technical Analysis

    Directory of Open Access Journals (Sweden)

    Rosane Riera Freire

    2007-06-01

    Full Text Available Technical analysis, or charting, aims at visually identifying geometrical patterns in price charts in order to anticipate price "trends". In this paper we revisit the issue of technical analysis validation, which has been tackled in the literature without taking care of (i) the presence of heterogeneity and (ii) statistical dependence in the analyzed data - various agglutinated return time series from distinct financial securities. The main purpose here is to address the first cited problem by suggesting a validation methodology that also "homogenizes" the securities according to the finite dimensional probability distribution of their return series. The general steps go through the identification of the stochastic processes for the securities returns, the clustering of similar securities and, finally, the identification of presence, or absence, of informational content obtained from those price patterns. We illustrate the proposed methodology with a real data exercise including several securities of the global market. Our investigation shows that there is a statistically significant informational content in two out of three common patterns usually found through technical analysis, namely: triangle, rectangle and head and shoulders.

  11. Genetic divergence analysis in pumpkin

    International Nuclear Information System (INIS)

    Quamruzzaman, A.M.; Moniruzzaman, M.

    2013-01-01

    Genetic divergence among 18 pumpkin genotypes was estimated using Mahalanobis's D² statistic. Altogether four clusters were formed, where cluster I contained the highest number of genotypes (8) and cluster II contained the lowest (1). The highest intra-cluster distance was observed for cluster I (0.831) and the lowest for cluster IV (0.651). The highest inter-cluster distance was observed between clusters I and II (24.346). Cluster II recorded the highest mean for fruit number/plant, TSS, fruit yield and the minimum in cavity length and cavity diameter. Cluster III had the second highest mean for fruit diameter, fruit number/plant, individual unit weight, fruit yield and the fewest number of days to first female flowering, earliness being a desirable trait. These crosses may produce new recombinants with desirable traits. (author)

  12. Data management and statistical analysis for environmental assessment

    International Nuclear Information System (INIS)

    Wendelberger, J.R.; McVittie, T.I.

    1995-01-01

    Data management and statistical analysis for environmental assessment are important issues on the interface of computer science and statistics. Data collection for environmental decision making can generate large quantities of various types of data. A database/GIS system developed is described which provides efficient data storage as well as visualization tools which may be integrated into the data analysis process. FIMAD is a living database and GIS system. The system has changed and developed over time to meet the needs of the Los Alamos National Laboratory Restoration Program. The system provides a repository for data which may be accessed by different individuals for different purposes. The database structure is driven by the large amount and varied types of data required for environmental assessment. The integration of the database with the GIS system provides the foundation for powerful visualization and analysis capabilities

  13. Comparison of statistical tests for association between rare variants and binary traits.

    OpenAIRE

    Bacanu, SA; Nelson, MR; Whittaker, JC

    2012-01-01

    Genome-wide association studies have found thousands of common genetic variants associated with a wide variety of diseases and other complex traits. However, a large portion of the predicted genetic contribution to many traits remains unknown. One plausible explanation is that some of the missing variation is due to the effects of rare variants. Nonetheless, the statistical analysis of rare variants is challenging. A commonly used method is to contrast, within the same region (gene), the fr...

  14. Guidelines for collecting and maintaining archives for genetic monitoring

    Science.gov (United States)

    Jackson, Jennifer A.; Laikre, Linda; Baker, C. Scott; Kendall, Katherine C.

    2012-01-01

    Rapid advances in molecular genetic techniques and the statistical analysis of genetic data have revolutionized the way that populations of animals, plants and microorganisms can be monitored. Genetic monitoring is the practice of using molecular genetic markers to track changes in the abundance, diversity or distribution of populations, species or ecosystems over time, and to follow adaptive and non-adaptive genetic responses to changing external conditions. In recent years, genetic monitoring has become a valuable tool in conservation management of biological diversity and ecological analysis, helping to illuminate and define cryptic and poorly understood species and populations. Many of the detected biodiversity declines, changes in distribution and hybridization events have helped to drive changes in policy and management. Because a time series of samples is necessary to detect trends of change in genetic diversity and species composition, archiving is a critical component of genetic monitoring. Here we discuss the collection, development, maintenance, and use of archives for genetic monitoring. This includes an overview of the genetic markers that facilitate effective monitoring, a description of how tissue and DNA can be stored, and guidelines for proper practice.

  15. Genetic and immunohistochemical analysis of HSPA5 in mouse and human retinas

    Science.gov (United States)

    Chintalapudi, Sumana R.; Wang, XiaoFei; Li, Huiling; Lau, Yin H. Chan; Williams, Robert W.; Jablonski, Monica M.

    2016-01-01

    Purpose: Photoreceptor degenerative diseases are among the leading causes of vision loss. Although the causative genetic mutations are often known, mechanisms leading to photoreceptor degeneration remain poorly defined. We have previously demonstrated that the photoreceptor membrane-associated protein XAP-1 antigen is a product of the HSPA5 gene. In this study, we used systems genetic methods, statistical modeling, and immunostaining to identify and analyze candidate genes that modulate Hspa5 expression in the retina. Methods: Quantitative trait locus (QTL) mapping was used to map the genomic region that regulates Hspa5 in the cross between C57BL/6J X DBA/2J mice (BXD) genetic reference panel. The stepwise refinement of candidate genes was based on expression QTL mapping, gene expression correlation analyses (direct and partial), and analysis of regional sequence variants. The subcellular localization of candidate proteins and HSPA5 in mouse and human retinas was evaluated by immunohistochemistry. Differences in the localization of extracellular HSPA5 were assessed between healthy human donor and atrophic age-related macular degeneration (AMD) donor eyes. Results: In the eyes of healthy mice, extracellular HSPA5 was confined to the area around the cone photoreceptor outer segments. Mapping variation in Hspa5 mRNA expression levels in the retina revealed a statistically significant trans-acting expression QTL (eQTL) on Chromosome 2 (Chr 2) and a suggestive locus on Chr 15. Sulf2 on Chr 2 was the strongest candidate gene based on partial correlation analysis, Pearson correlation with Hspa5, expression levels in the retina, a missense variant in exon 14, and its reported function in the extracellular matrix and interphotoreceptor matrix. SULF2 is localized to the rod and cone photoreceptors in both human and mouse retinas. In human retinas with no pathology, extracellular HSPA5 was localized around many cones within the macular area. In contrast, fewer HSPA5

  16. Compliance strategy for statistically based neutron overpower protection safety analysis methodology

    International Nuclear Information System (INIS)

    Holliday, E.; Phan, B.; Nainer, O.

    2009-01-01

    The methodology employed in the safety analysis of the slow Loss of Regulation (LOR) event in the OPG and Bruce Power CANDU reactors, referred to as Neutron Overpower Protection (NOP) analysis, is a statistically based methodology. Further enhancement to this methodology includes the use of Extreme Value Statistics (EVS) for the explicit treatment of aleatory and epistemic uncertainties, and probabilistic weighting of the initial core states. A key aspect of this enhanced NOP methodology is to demonstrate adherence, or compliance, with the analysis basis. This paper outlines a compliance strategy capable of accounting for the statistical nature of the enhanced NOP methodology. (author)

  17. Development of an unbiased statistical method for the analysis of unigenic evolution

    Directory of Open Access Journals (Sweden)

    Shilton Brian H

    2006-03-01

    Full Text Available Abstract. Background: Unigenic evolution is a powerful genetic strategy involving random mutagenesis of a single gene product to delineate functionally important domains of a protein. This method involves selection of variants of the protein which retain function, followed by statistical analysis comparing expected and observed mutation frequencies of each residue. Resultant mutability indices for each residue are averaged across a specified window of codons to identify hypomutable regions of the protein. As originally described, the effect of changes to the length of this averaging window was not fully elucidated. In addition, it was unclear when sufficient functional variants had been examined to conclude that residues conserved in all variants have important functional roles. Results: We demonstrate that the length of averaging window dramatically affects identification of individual hypomutable regions and delineation of region boundaries. Accordingly, we devised a region-independent chi-square analysis that eliminates loss of information incurred during window averaging and removes the arbitrary assignment of window length. We also present a method to estimate the probability that conserved residues have not been mutated simply by chance. In addition, we describe an improved estimation of the expected mutation frequency. Conclusion: Overall, these methods significantly extend the analysis of unigenic evolution data over existing methods to allow comprehensive, unbiased identification of domains and possibly even individual residues that are essential for protein function.
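
    The question of when enough functional variants have been examined can be illustrated with a deliberately simplified model. The sketch below assumes that, without selection, a given residue is mutated independently in each variant with a fixed probability; this assumption, the function names, and the numbers are hypothetical and are not taken from the paper's method.

```python
import math


def prob_unmutated_by_chance(per_variant_mutation_prob, n_variants):
    """Probability that a residue is unmutated in all n variants by chance alone.

    Assumes independent variants and the same mutation probability for the
    residue in each variant (a simplification made for illustration).
    """
    return (1 - per_variant_mutation_prob) ** n_variants


def variants_needed(per_variant_mutation_prob, alpha=0.05):
    """Smallest number of variants for which chance conservation has probability <= alpha."""
    return math.ceil(math.log(alpha) / math.log(1 - per_variant_mutation_prob))


# Hypothetical numbers: a residue is hit in 10% of unselected variants.
p_chance = prob_unmutated_by_chance(0.10, 20)
n_required = variants_needed(0.10, alpha=0.05)
```

    With these hypothetical numbers, a residue seen unmutated in 20 variants would still be conserved by chance more than one time in ten, so more variants would be needed before calling it essential.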

  18. Microsatellite analysis of chloroquine resistance associated alleles and neutral loci reveal genetic structure of Indian Plasmodium falciparum.

    Science.gov (United States)

    Mallick, Prashant K; Sutton, Patrick L; Singh, Ruchi; Singh, Om P; Dash, Aditya P; Singh, Ashok K; Carlton, Jane M; Bhasin, Virendra K

    2013-10-01

    Efforts to control malignant malaria caused by Plasmodium falciparum are hampered by the parasite's acquisition of resistance to antimalarial drugs, e.g., chloroquine. This necessitates evaluating the spread of chloroquine resistance in any malaria-endemic area. India displays highly variable malaria epidemiology and also shares porous international borders with malaria-endemic Southeast Asian countries having multi-drug resistant malaria. Malaria epidemiology in India is believed to be affected by two major factors: high genetic diversity and evolving drug resistance in P. falciparum. How transmission intensity of malaria can influence the genetic structure of chloroquine-resistant P. falciparum population in India is unknown. Here, genetic diversity within and among P. falciparum populations is analyzed with respect to their prevalence and chloroquine resistance observed in 13 different locations in India. Microsatellites developed for P. falciparum, including three putatively neutral and seven microsatellites thought to be under a hitchhiking effect due to chloroquine selection were used. Genetic hitchhiking is observed in five of seven microsatellites flanking the gene responsible for chloroquine resistance. Genetic admixture analysis and F-statistics detected genetically distinct groups in accordance with transmission intensity of different locations and the probable use of chloroquine. A large genetic break between the chloroquine-resistant parasite of the Northeast-East-Island group and Southwest group (FST=0.253, Pstructure for Indian P. falciparum population. Overall, the study suggests that transmission intensity can be an efficient driver for genetic differentiation at both neutral and adaptive loci across India. Copyright © 2013 Elsevier B.V. All rights reserved.

  19. QTL identification of grain protein concentration and its genetic ...

    Indian Academy of Sciences (India)

    culated using the statistical software package SPSS 12.0. (SPSS, Chicago, USA). ... Joint QTL analysis for PC and GWP/SC was carried out according to the multiple interval ..... testing for epistasis: an application in maize. Theor. Appl. Genet.

  20. Diagnosis checking of statistical analysis in RCTs indexed in PubMed.

    Science.gov (United States)

    Lee, Paul H; Tse, Andy C Y

    2017-11-01

    Statistical analysis is essential for reporting of the results of randomized controlled trials (RCTs), as well as evaluating their effectiveness. However, the validity of a statistical analysis also depends on whether the assumptions of that analysis are valid. To review all RCTs published in journals indexed in PubMed during December 2014 to provide a complete picture of how RCTs handle assumptions of statistical analysis. We reviewed all RCTs published in December 2014 that appeared in journals indexed in PubMed using the Cochrane highly sensitive search strategy. The 2014 impact factors of the journals were used as proxies for their quality. The type of statistical analysis used and whether the assumptions of the analysis were tested were reviewed. In total, 451 papers were included. Of the 278 papers that reported a crude analysis for the primary outcomes, 31 (27·2%) reported whether the outcome was normally distributed. Of the 172 papers that reported an adjusted analysis for the primary outcomes, diagnosis checking was rarely conducted, with only 20%, 8·6% and 7% checked for generalized linear model, Cox proportional hazard model and multilevel model, respectively. Study characteristics (study type, drug trial, funding sources, journal type and endorsement of CONSORT guidelines) were not associated with the reporting of diagnosis checking. The diagnosis of statistical analyses in RCTs published in PubMed-indexed journals was usually absent. Journals should provide guidelines about the reporting of a diagnosis of assumptions. © 2017 Stichting European Society for Clinical Investigation Journal Foundation.

  1. Genetic polymorphisms, forensic efficiency and phylogenetic analysis of 15 autosomal STR loci in the Kazak population of Ili Kazak Autonomous Prefecture, northwestern China.

    Science.gov (United States)

    Feng, Chunmei; Wang, Xin; Wang, Xiaolong; Yu, Hao; Zhang, Guohua

    2018-03-01

    We investigated the frequencies of 15 autosomal STR loci in the Kazak population of the Ili Kazak Autonomous Prefecture with the aim of expanding the available population information in human genetic databases and for forensic DNA analysis. Genetic polymorphisms of 15 autosomal short tandem repeat (STR) loci were analysed in 456 individuals of the Kazak population from Ili Kazak Autonomous Prefecture, northwestern China. A total of 173 alleles at 15 autosomal STR loci were found; the allele frequencies ranged from 0.0011 to 0.5022. The combined power of discrimination and exclusion statistics for the 15 STR loci were 0.999 999 999 85 and 0.999 998 800 65, respectively. In addition, phylogenetic analysis involving the Ili Uygur population and other relevant populations was carried out. A neighbour-joining tree and multidimensional scaling plot were generated based on Nei's standard genetic distance. Results of the population comparison indicated that the Ili Uygur population was most closely related genetically to the Uygur populations from other regions in China. These findings are consistent with the historical and geographic backgrounds of these populations.
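
    Combined forensic statistics such as those quoted above are usually obtained by multiplying per-locus complements, assuming the loci are independent. The sketch below shows that arithmetic; the per-locus definition used (one minus the sum of squared genotype frequencies) is the standard one but is not stated in the abstract, and the function names and input values are hypothetical.

```python
def power_of_discrimination(genotype_freqs):
    """Per-locus PD: probability that two random individuals differ in genotype."""
    return 1 - sum(g * g for g in genotype_freqs)


def combined_power(per_locus_values):
    """Combine per-locus PD (or power of exclusion) values, assuming independent loci."""
    prob_all_fail = 1.0
    for value in per_locus_values:
        prob_all_fail *= (1 - value)   # chance that this locus fails to discriminate
    return 1 - prob_all_fail


# Hypothetical genotype frequencies at one locus, and two hypothetical loci combined.
pd_one_locus = power_of_discrimination([0.25, 0.50, 0.25])
pd_two_loci = combined_power([0.9, 0.9])
```

    The combined value approaches 1 quickly as loci are added, which is why figures for 15 loci are reported with so many nines.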

  2. A κ-generalized statistical mechanics approach to income analysis

    Science.gov (United States)

    Clementi, F.; Gallegati, M.; Kaniadakis, G.

    2009-02-01

    This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low-middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful.
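
    The abstract does not give the distribution explicitly. A minimal sketch, assuming the form commonly used in the κ-generalized literature (a survival function equal to the κ-deformed exponential of -βx^α), is shown below; the parameter names `alpha`, `beta`, `kappa` and the example values are assumptions.

```python
import math


def exp_kappa(x, kappa):
    """Kaniadakis kappa-deformed exponential; reduces to exp(x) as kappa -> 0."""
    if kappa == 0:
        return math.exp(x)
    return (math.sqrt(1 + kappa ** 2 * x ** 2) + kappa * x) ** (1 / kappa)


def survival(x, alpha, beta, kappa):
    """Assumed survival function P(X > x) = exp_kappa(-beta * x**alpha).

    For small x it behaves like a stretched exponential; for large x it
    decays as a power law, matching the Pareto regime mentioned in the abstract.
    """
    return exp_kappa(-beta * x ** alpha, kappa)


# Hypothetical parameter values, with income x in units of the mean income.
share_above_mean = survival(1.0, alpha=2.0, beta=1.0, kappa=0.75)
share_above_5x = survival(5.0, alpha=2.0, beta=1.0, kappa=0.75)
```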

  3. A κ-generalized statistical mechanics approach to income analysis

    International Nuclear Information System (INIS)

    Clementi, F; Gallegati, M; Kaniadakis, G

    2009-01-01

    This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low–middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful.

  4. Molecular Diversity Analysis and Genetic Mapping of Pod Shatter Resistance Loci in Brassica carinata L.

    Directory of Open Access Journals (Sweden)

    Rosy Raman

    2017-11-01

    Seed lost due to easy pod dehiscence at maturity (pod shatter) is a major problem in several members of the Brassicaceae family. We investigated the level of pod shatter resistance in Ethiopian mustard (Brassica carinata) and identified quantitative trait loci (QTL) for targeted introgression of this trait in Ethiopian mustard and its close relatives of the genus Brassica. A set of 83 accessions of B. carinata, collected from the Australian Grains Genebank, was evaluated for pod shatter resistance based on pod rupture energy (RE). In comparison to B. napus (RE = 2.16 mJ), B. carinata accessions had higher RE values (2.53 to 20.82 mJ). A genetic linkage map of an F2 population from two contrasting B. carinata selections, BC73526 (shatter resistant with high RE) and BC73524 (shatter prone with low RE), comprising 300 individuals, was constructed using a set of 6,464 high quality DArTseq markers and subsequently used for QTL analysis. Genetic analysis of the F2 and F2:3 derived lines revealed five statistically significant QTL (LOD ≥ 3) that are linked with pod shatter resistance on chromosomes B1, B3, B8, and C5. Herein, we report for the first time the identification of genetic loci associated with pod shatter resistance in B. carinata. These characterized accessions would be useful in Brassica breeding programs for introgression of pod shatter resistance alleles into elite breeding lines. Molecular markers would assist marker-assisted selection for tracing the introgression of resistant alleles. Our results suggest that the value of the germplasm collections can be harnessed through genetic and genomics tools.

  5. Normality Tests for Statistical Analysis: A Guide for Non-Statisticians

    Science.gov (United States)

    Ghasemi, Asghar; Zahediasl, Saleh

    2012-01-01

    Statistical errors are common in scientific literature and about 50% of the published articles have at least one error. The assumption of normality needs to be checked for many statistical procedures, namely parametric tests, because their validity depends on it. The aim of this commentary is to overview checking for normality in statistical analysis using SPSS. PMID:23843808
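
    The commentary works in SPSS. As a rough illustration of one kind of check such guides describe, the sketch below computes skewness and excess-kurtosis z-scores in Python using the bias-corrected sample formulas and standard errors that SPSS-style output is generally based on. The sample values are hypothetical, and the |z| > 1.96 cutoff mentioned in the comment is a conventional choice, not something the abstract specifies.

```python
import math


def skew_kurtosis_z(values):
    """Return (z_skewness, z_kurtosis) for a sample of at least 4 values."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    z = [(v - mean) / s for v in values]
    # Bias-corrected sample skewness and excess kurtosis.
    skew = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    # Standard errors of the two statistics.
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew / se_skew, kurt / se_kurt


# Hypothetical measurements; |z| > 1.96 is a common flag for non-normality.
sample = [4.1, 4.8, 5.0, 5.2, 5.3, 5.9, 6.4, 9.7]
print(skew_kurtosis_z(sample))
```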

  6. Development of computer-assisted instruction application for statistical data analysis android platform as learning resource

    Science.gov (United States)

    Hendikawati, P.; Arifudin, R.; Zahid, M. Z.

    2018-03-01

    This study aims to design an Android statistical data analysis application that users can access easily through mobile devices. The application covers various topics in basic statistics along with parametric statistical data analysis. Its output is a parametric statistical analysis intended for students, lecturers, and other users who need the results of statistical calculations to be quick to obtain and easy to understand. The Android application was developed in the Java programming language. The server side uses PHP with the CodeIgniter framework, and the database is MySQL. System development followed the Waterfall methodology, with the stages of analysis, design, coding, testing, implementation, and system maintenance. The application is expected to support statistics lectures and make it easier for students to understand statistical analysis on mobile devices.

  7. Screening of spontaneous castor bean accesses for genetic ...

    African Journals Online (AJOL)

    ... discriminant power between the castor bean accesses, being the multivariate analysis efficient in this process. The castor bean accesses ACS-001 CRSP and ACS-001-MASP are promising for introduction in genetic improvement programs of this culture. Keywords: Ricinus communis L., genotype, multivariate statistics, ...

  8. A genetic epidemiological mega analysis of smoking initiation in adolescents

    NARCIS (Netherlands)

    Maes, H.H.; Prom-Wormley, E.; Eaves, L.J.; Rhee, S.H.; Hewitt, J.K.; Young, S.; Corley, R.; McGue, M.K.; Iacono, W.G.; Legrand, L.; Samek, D.; Murrelle, E.L.; Silberg, J.L.; Miles, D.; Schieken, R.M.; Beunen, G.P.; Thomis, M.; Rose, R.J.; Dick, D.M.; Boomsma, D.I.; Bartels, M.; Vink, J.M.; Lichtenstein, P.; White, V.; Kaprio, J.; Neale, M.C.

    2017-01-01

    Introduction. Previous studies in adolescents were not adequately powered to accurately disentangle genetic and environmental influences on smoking initiation across adolescence. Methods. Mega-analysis of pooled genetically informative data on smoking initiation was performed, with structural

  9. Statistical analysis of metallicity in spiral galaxies

    Energy Technology Data Exchange (ETDEWEB)

    Galeotti, P [Consiglio Nazionale delle Ricerche, Turin (Italy). Lab. di Cosmo-Geofisica; Turin Univ. (Italy). Ist. di Fisica Generale]

    1981-04-01

    A principal component analysis of metallicity and other integral properties of 33 spiral galaxies is presented; the parameters involved are morphological type, diameter, luminosity and metallicity. From the statistical analysis it is concluded that the sample has only two significant dimensions, and additional tests involving different parameters show similar results. Thus it seems that only type and luminosity are independent variables, the other integral properties of spiral galaxies being correlated with them.

  10. The power and statistical behaviour of allele-sharing statistics when ...

    Indian Academy of Sciences (India)

    Unknown

    3Human Genetics Division, School of Medicine, University of Southampton, Southampton SO16 6YD, UK. Abstract ... that the statistic S-#alleles gives good performance for recessive ... (H50) of the families are linked to the single marker. The.

  11. Parameter determination for quantitative PIXE analysis using genetic algorithms

    International Nuclear Information System (INIS)

    Aspiazu, J.; Belmont-Moreno, E.

    1996-01-01

    For biological and environmental samples, the PIXE technique is particularly advantageous for elemental analysis, but quantitative analysis requires complex calculations that depend on more than a dozen parameters. The authors describe a procedure that uses a genetic algorithm to obtain the best values of the parameters needed to fit the efficiency of an X-ray detector. The values of some variables involved in quantitative PIXE analysis were manipulated in a way similar to how genetic information is treated in a biological process. The authors ran the algorithm until it reproduced, within the confidence interval, the elemental concentrations of a reference material.
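
    The abstract gives no details of the algorithm, so the following is only a generic sketch of parameter fitting with a genetic algorithm (selection, uniform crossover, Gaussian mutation, elitism). The two-parameter exponential model, the synthetic data, and all names and settings are hypothetical stand-ins, not the detector-efficiency model used by the authors.

```python
import math
import random


def model(x, a, b):
    """Hypothetical two-parameter curve standing in for the real model."""
    return a * math.exp(-b * x)


def sse(params, data):
    """Sum of squared errors between the model and the data points."""
    a, b = params
    return sum((y - model(x, a, b)) ** 2 for x, y in data)


def genetic_fit(data, bounds, pop_size=40, generations=60,
                mutation_sd=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: sse(p, data))
        history.append(sse(pop[0], data))
        parents = pop[: pop_size // 2]       # selection: keep the fitter half
        children = [pop[0][:]]               # elitism: best survives unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Gaussian mutation, clipped to the allowed parameter range.
            child = [min(hi, max(lo, g + rng.gauss(0, mutation_sd * (hi - lo))))
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = children
    best = min(pop, key=lambda p: sse(p, data))
    history.append(sse(best, data))
    return best, history


# Synthetic data generated from a = 2.0, b = 0.5 (illustrative only).
data = [(x, 2.0 * math.exp(-0.5 * x)) for x in range(10)]
bounds = [(0.0, 5.0), (0.0, 2.0)]
best, history = genetic_fit(data, bounds)
print(best, history[-1])
```

    Because the best individual is always carried over, the recorded error never increases from one generation to the next; a stopping rule based on agreement with a reference material, as the abstract describes, would replace the fixed generation count.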

  12. Statistical Analysis of Protein Ensembles

    Science.gov (United States)

    Máté, Gabriell; Heermann, Dieter

    2014-04-01

    As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially since the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset composed of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.

  13. State analysis of BOP using statistical and heuristic methods

    International Nuclear Information System (INIS)

    Heo, Gyun Young; Chang, Soon Heung

    2003-01-01

    In the deregulated environment, enhancing the performance of the BOP in nuclear power plants is receiving increased attention. To analyze the performance level of the BOP, we use performance test procedures provided by an authorized institution such as ASME. However, plant investigation showed that the requirements of these procedures regarding the reliability and quantity of sensors are difficult to satisfy. As a solution, a state analysis method, which is an expanded concept of signal validation, was proposed on the basis of statistical and heuristic approaches. The authors recommend a statistical linear regression model, built by analyzing correlations among BOP parameters, as the reference state analysis method. Its advantages are that its derivation is not heuristic, that model uncertainty can be calculated, and that it is easy to apply to an actual plant. The error of the statistical linear regression model is below 3% under normal as well as abnormal system states. Additionally, a neural network model is recommended, since the statistical model cannot be applied to the validation of all sensors and is sensitive to outliers, i.e. signals located outside the statistical distribution. Because many sensors need to be validated in the BOP, wavelet analysis (WA) was applied as a pre-processor to reduce the input dimension and improve training accuracy. The outlier localization capability of WA enhanced the robustness of the neural network. The trained neural network restored degraded signals to values within ±3% of the true signals.
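
    The regression-based validation idea can be sketched as follows: fit a line relating one plant parameter to a correlated one, then flag readings that deviate from the prediction. This is an illustrative sketch only; the sensor values are made up, and reusing 3% as a flagging tolerance is an assumption of this example rather than a rule stated in the abstract.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def is_suspect(reading, predictor, intercept, slope, tolerance=0.03):
    """Flag a reading whose relative deviation from the fit exceeds tolerance."""
    expected = intercept + slope * predictor
    return abs(reading - expected) > tolerance * abs(expected)


# Hypothetical reference data for two correlated parameters.
reference_x = [1.0, 2.0, 3.0, 4.0]
reference_y = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(reference_x, reference_y)
print(is_suspect(21.2, 10.0, intercept, slope))  # close to the fitted line
print(is_suspect(25.0, 10.0, intercept, slope))  # far from the fitted line
```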

  14. Quantitative analysis of fetal facial morphology using 3D ultrasound and statistical shape modeling: a feasibility study.

    Science.gov (United States)

    Dall'Asta, Andrea; Schievano, Silvia; Bruse, Jan L; Paramasivam, Gowrishankar; Kaihura, Christine Tita; Dunaway, David; Lees, Christoph C

    2017-07-01

    The antenatal detection of facial dysmorphism using 3-dimensional ultrasound may raise the suspicion of an underlying genetic condition but infrequently leads to a definitive antenatal diagnosis. Despite advances in array and noninvasive prenatal testing, not all genetic conditions can be ascertained from such testing. The aim of this study was to investigate the feasibility of quantitative assessment of fetal face features using prenatal 3-dimensional ultrasound volumes and statistical shape modeling. STUDY DESIGN: Thirteen normal and 7 abnormal stored 3-dimensional ultrasound fetal face volumes were analyzed, at a median gestation of 29+4 weeks (25+0 to 36+1). The 20 3-dimensional surface meshes generated were aligned and served as input for a statistical shape model, which computed the mean 3-dimensional face shape and 3-dimensional shape variations using principal component analysis. Ten shape modes explained more than 90% of the total shape variability in the population. While the first mode accounted for overall size differences, the second highlighted shape feature changes from an overall proportionate toward a more asymmetric face shape with a wide prominent forehead and an undersized, posteriorly positioned chin. Analysis of the Mahalanobis distance in principal component analysis shape space suggested differences between normal and abnormal fetuses (median and interquartile range distance values, 7.31 ± 5.54 for the normal group vs 13.27 ± 9.82 for the abnormal group) (P = .056). This feasibility study demonstrates that objective characterization and quantification of fetal facial morphology is possible from 3-dimensional ultrasound. This technique has the potential to assist in utero diagnosis, particularly of rare conditions in which facial dysmorphology is a feature. Copyright © 2017 Elsevier Inc. All rights reserved.
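
    In a principal component basis the covariance matrix is diagonal, so the Mahalanobis distance of a shape from the mean reduces to a sum of squared mode scores, each divided by that mode's variance. The sketch below shows only this reduction; the scores and variances are hypothetical and the function name is this sketch's own.

```python
import math


def mahalanobis_in_pca_space(scores, variances):
    """Distance from the mean shape, given PCA mode scores and mode variances."""
    return math.sqrt(sum(s * s / v for s, v in zip(scores, variances)))


# Hypothetical scores of one face on two shape modes, with the mode variances.
print(mahalanobis_in_pca_space([3.0, 4.0], [9.0, 16.0]))
```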

  15. Precision Statistical Analysis of Images Based on Brightness Distribution

    Directory of Open Access Journals (Sweden)

    Muzhir Shaban Al-Ani

    2017-07-01

    Studying the content of images is an important topic that calls for reasonable and accurate image analysis. Image analysis has recently become a vital field because of the huge number of images transferred via transmission media in daily life, and this crowding of media with images has drawn attention to image analysis as a research area. In this paper, the implemented system passes through several steps to compute the statistical measures of standard deviation and mean for both color and grey images. The last step of the proposed method compares the results obtained in the different cases of the test phase. Statistical parameters are used to characterize the content of an image and its texture: standard deviation, mean and correlation values are used to study the intensity distribution of the tested images. Reasonable results are obtained for both standard deviation and mean via the implemented system. The work concentrates on brightness distribution, assessed via statistical measures under different types of lighting.
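
    The brightness measures described here can be sketched in a few lines. This example is an assumption-laden illustration, not the paper's system: pixels are plain tuples rather than a real image file, and the colour-to-grey weights are the common Rec. 601 luma coefficients, which the abstract does not specify.

```python
from statistics import mean, pstdev


def to_grey(rgb_pixels):
    """Convert (r, g, b) pixels to grey levels using Rec. 601 luma weights."""
    return [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in rgb_pixels]


def brightness_stats(grey_pixels):
    """Mean and population standard deviation of the intensity values."""
    return mean(grey_pixels), pstdev(grey_pixels)


# Hypothetical 4-pixel colour image.
pixels = [(255, 255, 255), (0, 0, 0), (200, 30, 30), (30, 200, 30)]
print(brightness_stats(to_grey(pixels)))
```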

  16. Genetic analysis of Myanmar Vigna species in responses to salt ...

    African Journals Online (AJOL)

    Genetic analysis of Myanmar Vigna species in responses to salt stress at the ... of reduction was highly dependent on different genotypes and salinity levels. ... the mechanism of salt tolerance and for the provision of genetic resources for ...

  17. Fisher statistics for analysis of diffusion tensor directional information.

    Science.gov (United States)

    Hutchinson, Elizabeth B; Rutecki, Paul A; Alexander, Andrew L; Sutula, Thomas P

    2012-04-30

    A statistical approach is presented for the quantitative analysis of diffusion tensor imaging (DTI) directional information using Fisher statistics, which were originally developed for the analysis of vectors in the field of paleomagnetism. In this framework, descriptive and inferential statistics have been formulated based on the Fisher probability density function, a spherical analogue of the normal distribution. The Fisher approach was evaluated for investigation of rat brain DTI maps to characterize tissue orientation in the corpus callosum, fornix, and hilus of the dorsal hippocampal dentate gyrus, and to compare directional properties in these regions following status epilepticus (SE) or traumatic brain injury (TBI) with values in healthy brains. Direction vectors were determined for each region of interest (ROI) for each brain sample and Fisher statistics were applied to calculate the mean direction vector and variance parameters in the corpus callosum, fornix, and dentate gyrus of normal rats and rats that experienced TBI or SE. Hypothesis testing was performed by calculation of Watson's F-statistic and associated p-value giving the likelihood that grouped observations were from the same directional distribution. In the fornix and midline corpus callosum, no directional differences were detected between groups, however in the hilus, significant (pstatistical comparison of tissue structural orientation. Copyright © 2012 Elsevier B.V. All rights reserved.
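
    The descriptive part of Fisher statistics can be sketched as follows: the mean direction is the normalized vector sum, and the precision parameter is commonly estimated from the resultant length R as (N - 1) / (N - R). This is a minimal sketch, assuming unit vectors with a consistent sign convention; the vectors are hypothetical and the function name is this sketch's own.

```python
import math


def fisher_summary(unit_vectors):
    """Return (mean_direction, resultant_length, precision_estimate)."""
    n = len(unit_vectors)
    sx = sum(v[0] for v in unit_vectors)
    sy = sum(v[1] for v in unit_vectors)
    sz = sum(v[2] for v in unit_vectors)
    r = math.sqrt(sx * sx + sy * sy + sz * sz)
    mean_direction = (sx / r, sy / r, sz / r)
    # Large-precision approximation; infinite when all vectors coincide.
    precision = (n - 1) / (n - r) if r < n else math.inf
    return mean_direction, r, precision


# Two hypothetical direction vectors along the x and y axes.
print(fisher_summary([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]))
```

    A tightly clustered set of directions gives R close to N and therefore a large precision estimate, while scattered directions give a small one.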

  18. Statistical analysis of RHIC beam position monitors performance

    Science.gov (United States)

    Calaga, R.; Tomás, R.

    2004-04-01

    A detailed statistical analysis of beam position monitors (BPM) performance at RHIC is a critical factor in improving regular operations and future runs. Robust identification of malfunctioning BPMs plays an important role in any orbit or turn-by-turn analysis. Singular value decomposition and Fourier transform methods, which have evolved as powerful numerical techniques in signal processing, will aid in such identification from BPM data. This is the first attempt at RHIC to use a large set of data to statistically enhance the capability of these two techniques and determine BPM performance. A comparison from run 2003 data shows striking agreement between the two methods and hence can be used to improve BPM functioning at RHIC and possibly other accelerators.

  19. Statistical analysis of RHIC beam position monitors performance

    Directory of Open Access Journals (Sweden)

    R. Calaga

    2004-04-01

    A detailed statistical analysis of beam position monitors (BPM) performance at RHIC is a critical factor in improving regular operations and future runs. Robust identification of malfunctioning BPMs plays an important role in any orbit or turn-by-turn analysis. Singular value decomposition and Fourier transform methods, which have evolved as powerful numerical techniques in signal processing, will aid in such identification from BPM data. This is the first attempt at RHIC to use a large set of data to statistically enhance the capability of these two techniques and determine BPM performance. A comparison from run 2003 data shows striking agreement between the two methods and hence can be used to improve BPM functioning at RHIC and possibly other accelerators.

  20. A fielded wiki for personality genetics

    DEFF Research Database (Denmark)

    Nielsen, Finn Årup

    2010-01-01

    I describe a fielded wiki, where a Web form interface allows the entry, analysis and visualization of results from scientific papers in the personality genetics domain. Papers in this domain typically report the mean and standard deviation of multiple personality trait scores from statistics...... on human subjects grouped based on genotype. The wiki organizes the basic data in a single table with fixed columns, each row recording statistical values with respect to a specific personality trait reported in a specific paper with a specific genotype group. From this basic data hard-coded meta...

  1. Genetic analysis for grain quality traits in Pakistani wheat varieties

    International Nuclear Information System (INIS)

    Minhas, N.M.; Ajmal, S.U.; Iqbal, Z.; Munir, M.

    2014-01-01

    An eight-parent diallel involving seven commercial wheat cultivars and one breeding line was made to investigate the nature of gene action determining the inheritance pattern of grain quality characters. Highly significant differences were observed among the genotypes for 1000-grain weight, protein content, wet gluten and lysine content. Adequacy tests were employed to assess the fit of the data sets to the additive-dominance model. Both tests, i.e. the analysis of uniformity of Wr, Vr and the joint regression analysis, validated the data of these traits for genetic analysis. Gene actions for grain quality traits were ascertained following Hayman's analysis of variance. Results of the genetic analysis revealed that both additive and dominance genetic components were involved in the manifestation of the characters under study. However, additive gene effects were more pronounced in the genetic control of these traits. Non-significance of the b1, b2 and b3 values revealed the absence of directional dominance, symmetrical distribution of genes among the parental lines and absence of specific gene action, respectively, in all the traits. Maternal effects were also noted in 1000-grain weight, protein content and wet gluten percentage. It is concluded that additive effects are crucial in the expression of grain quality characters of wheat in the germplasm under study, and single plant selection may be recommended in segregating generations for effective improvement in these characters. (author)

  2. Statistics Education Research in Malaysia and the Philippines: A Comparative Analysis

    Science.gov (United States)

    Reston, Enriqueta; Krishnan, Saras; Idris, Noraini

    2014-01-01

    This paper presents a comparative analysis of statistics education research in Malaysia and the Philippines by modes of dissemination, research areas, and trends. An electronic search for published research papers in the area of statistics education from 2000-2012 yielded 20 for Malaysia and 19 for the Philippines. Analysis of these papers showed…

  3. Statistical analysis of next generation sequencing data

    CERN Document Server

    Nettleton, Dan

    2014-01-01

    Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...

  4. Selected papers on analysis, probability, and statistics

    CERN Document Server

    Nomizu, Katsumi

    1994-01-01

    This book presents papers that originally appeared in the Japanese journal Sugaku. The papers fall into the general area of mathematical analysis as it pertains to probability and statistics, dynamical systems, differential equations and analytic function theory. Among the topics discussed are: stochastic differential equations, spectra of the Laplacian and Schrödinger operators, nonlinear partial differential equations which generate dissipative dynamical systems, fractal analysis on self-similar sets and the global structure of analytic functions.

  5. Loss of genetic variation in Greek hatchery populations of the European sea bass (Dicentrarchus labrax L.) as revealed by microsatellite DNA analysis

    Directory of Open Access Journals (Sweden)

    D. LOUKOVITIS

    2014-10-01

    Genetic variation in four reared stocks of European sea bass Dicentrarchus labrax L., originating from Greek commercial farms, was assessed using five polymorphic microsatellite markers and was compared with that of three natural populations from Greece and France. The total number of alleles per marker ranged from 8 to 22, and hatchery samples showed the same levels of observed heterozygosity as samples from the wild but substantially lower allelic richness and expected heterozygosity. The genetic differentiation of the cultivated samples, both among themselves and from the wild-origin fish, was significant as indicated by Fst analysis. All pairwise population comparisons were statistically significant, except for the pair of the two natural Greek populations. Results of the microsatellite DNA analysis showed a 37% reduction of the mean allele number in the hatchery samples compared to the wild ones, suggesting random genetic drift and inbreeding events operating in the hatcheries. Knowledge of the genetic variation in D. labrax cultured populations compared with that in the wild ones is essential for setting up appropriate guidelines for proper monitoring and management of the stocks, either under traditional practices or for the implementation of selective breeding programmes.
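
    Two of the quantities mentioned above are simple to compute. The sketch below shows expected heterozygosity at one locus (1 minus the sum of squared allele frequencies, without small-sample correction) and the percentage reduction in mean allele number. All numbers are hypothetical; the allele counts were chosen only so that the reduction comes out near the 37% reported.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: 1 - sum of p_i squared."""
    return 1.0 - sum(p * p for p in allele_freqs)


def percent_reduction(wild_mean_alleles, hatchery_mean_alleles):
    """Percentage drop in mean allele number from wild to hatchery samples."""
    return 100.0 * (wild_mean_alleles - hatchery_mean_alleles) / wild_mean_alleles


# Hypothetical allele frequencies and mean allele numbers (not study data).
print(expected_heterozygosity([0.4, 0.3, 0.2, 0.1]))
print(percent_reduction(10.0, 6.3))
```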

  6. Quantitative Evaluation of Hybrid Aspen Xylem and Immunolabeling Patterns Using Image Analysis and Multivariate Statistics

    Directory of Open Access Journals (Sweden)

    David Sandquist

    2015-06-01

    A new method is presented for quantitative evaluation of hybrid aspen genotype xylem morphology and immunolabeling micro-distribution. This method can be used as an aid in assessing differences in genotypes from classic tree breeding studies, as well as genetically engineered plants. The method is based on image analysis, multivariate statistical evaluation of light, and immunofluorescence microscopy images of wood xylem cross sections. The selected immunolabeling antibodies targeted five different epitopes present in aspen xylem cell walls. Twelve down-regulated hybrid aspen genotypes were included in the method development. The 12 knock-down genotypes were selected based on pre-screening by pyrolysis-IR of global chemical content. The multivariate statistical evaluations successfully identified comparative trends for modifications in the down-regulated genotypes compared to the unmodified control, even when no definitive conclusions could be drawn from individual studied variables alone. Of the 12 genotypes analyzed, three genotypes showed significant trends for modifications in both morphology and immunolabeling. Six genotypes showed significant trends for modifications in either morphology or immunocoverage. The remaining three genotypes did not show any significant trends for modification.

  7. Analysis of statistical misconception in terms of statistical reasoning

    Science.gov (United States)

    Maryati, I.; Priatna, N.

    2018-05-01

    Reasoning skills are needed by everyone in the era of globalization, because every person has to be able to manage and use information from all over the world, which can now be obtained easily. Statistical reasoning skill is the ability to collect, group, process, interpret, and draw conclusions from information. This skill can be developed through various levels of education. However, the skill is often weak because many people, students included, assume that statistics is just the ability to count and use formulas. Students also still have negative attitudes toward courses related to research. The purpose of this research is to analyze students’ misconceptions in a descriptive statistics course in relation to statistical reasoning skill. The observation was done by analyzing the results of a misconception test and a statistical reasoning skill test, and by observing the effect of students’ misconceptions on statistical reasoning skill. The sample was 32 students of a mathematics education department who had taken the descriptive statistics course. The mean score on the misconception test was 49.7 with a standard deviation of 10.6, whereas the mean score on the statistical reasoning skill test was 51.8 with a standard deviation of 8.5. If a minimum score of 65 is taken as the standard for course competence, the students’ mean scores fall below the standard. The results of the misconception study highlight which subtopics need attention. Based on the assessment, students’ misconceptions occur in: 1) writing mathematical sentences and symbols well, 2) understanding basic definitions, 3) determining the concept to be used in solving a problem. For statistical reasoning skill, the assessment measured reasoning about: 1) data, 2) representation, 3) statistical format, 4) probability, 5) samples, and 6) association.

  8. Genetic data analysis for plant and animal breeding

    Science.gov (United States)

    This book is an advanced textbook covering the application of quantitative genetics theory to analysis of actual data (both trait and DNA marker information) for breeding populations of crops, trees, and animals. Chapter 1 is an introduction to basic software used for trait data analysis. Chapter 2 ...

  9. Comparative analysis of positive and negative attitudes toward statistics

    Science.gov (United States)

    Ghulami, Hassan Rahnaward; Ab Hamid, Mohd Rashid; Zakaria, Roslinazairimah

    2015-02-01

    Many statistics lecturers and statistics education researchers are interested in their students' attitudes toward statistics during the statistics course. A positive attitude toward statistics is vital because it encourages students to take an interest in the course and to master the core content of the subject matter. Students who have negative attitudes toward statistics, by contrast, may feel depressed, especially in group assignments, are at risk of failure, are often highly emotional, and cannot move forward. Therefore, this study investigates students' attitudes towards learning statistics. Six latent constructs were used to measure students' attitudes toward learning statistics: affect, cognitive competence, value, difficulty, interest, and effort. The questionnaire was adopted and adapted from a reliable and validated instrument, the Survey of Attitudes towards Statistics (SATS). The study was conducted among undergraduate engineering students at Universiti Malaysia Pahang (UMP). The respondents were students from different faculties who were taking the applied statistics course. The analysis found the questionnaire to be acceptable, and relationships among the constructs were proposed and investigated. In this case, students show full effort to master the statistics course, find the course enjoyable, are confident in their intellectual capacity, and have more positive than negative attitudes towards learning statistics. In conclusion, positive attitudes towards statistics were mostly exhibited in the affect, cognitive competence, value, interest, and effort constructs, while negative attitudes were mostly exhibited in the difficulty construct.

  10. Variable-number-of-tandem-repeats analysis of genetic diversity in Pasteuria ramosa.

    Science.gov (United States)

    Mouton, L; Ebert, D

    2008-05-01

    Variable-number-of-tandem-repeats (VNTR) markers are increasingly being used in population genetic studies of bacteria. They were recently developed for Pasteuria ramosa, an endobacterium that infects Daphnia species. In the present study, we genotyped P. ramosa in 18 infected hosts from the United Kingdom, Belgium, and two lakes in the United States using seven VNTR markers. Two Daphnia species were collected: D. magna and D. dentifera. Six loci showed length polymorphism, with as many as five alleles identified for a single locus. Similarity coefficient calculations showed that the extent of genetic variation between pairs of isolates within populations differed according to the population, but it was always less than the genetic distances among populations. Analysis of the genetic distances performed using principal component analysis revealed strong clustering by location of origin, but not by host Daphnia species. Our study demonstrated that the VNTR markers available for P. ramosa are informative in revealing genetic differences within and among populations and may therefore become an important tool for providing detailed analysis of population genetics and epidemiology.

  11. A roadmap for the genetic analysis of renal aging.

    Science.gov (United States)

    Noordmans, Gerda A; Hillebrands, Jan-Luuk; van Goor, Harry; Korstanje, Ron

    2015-10-01

    Several studies show evidence for the genetic basis of renal disease, which renders some individuals more prone than others to accelerated renal aging. Studying the genetics of renal aging can help us to identify genes involved in this process and to unravel the underlying pathways. First, this opinion article will give an overview of the phenotypes that can be observed in age-related kidney disease. Accurate phenotyping is essential in performing genetic analysis. For kidney aging, this could include both functional and structural changes. Subsequently, this article reviews the studies that report on candidate genes associated with renal aging in humans and mice. Several loci or candidate genes have been found associated with kidney disease, but identification of the specific genetic variants involved has proven to be difficult. CUBN, UMOD, and SHROOM3 were identified by human GWAS as being associated with albuminuria, kidney function, and chronic kidney disease (CKD). These are promising examples of genes that could be involved in renal aging, and were further mechanistically evaluated in animal models. Eventually, we will provide approaches for performing genetic analysis. We should leverage the power of mouse models, as testing in humans is limited. Mouse and other animal models can be used to explain the underlying biological mechanisms of genes and loci identified by human GWAS. Furthermore, mouse models can be used to identify genetic variants associated with age-associated histological changes, of which Far2, Wisp2, and Esrrg are examples. A new outbred mouse population with high genetic diversity will facilitate the identification of genes associated with renal aging by enabling high-resolution genetic mapping while also allowing the control of environmental factors, and by enabling access to renal tissues at specific time points for histology, proteomics, and gene expression. © 2015 The Authors. Aging Cell published by the Anatomical Society and John

  12. A roadmap for the genetic analysis of renal aging

    Science.gov (United States)

    Noordmans, Gerda A; Hillebrands, Jan-Luuk; van Goor, Harry; Korstanje, Ron

    2015-01-01

    Several studies show evidence for the genetic basis of renal disease, which renders some individuals more prone than others to accelerated renal aging. Studying the genetics of renal aging can help us to identify genes involved in this process and to unravel the underlying pathways. First, this opinion article will give an overview of the phenotypes that can be observed in age-related kidney disease. Accurate phenotyping is essential in performing genetic analysis. For kidney aging, this could include both functional and structural changes. Subsequently, this article reviews the studies that report on candidate genes associated with renal aging in humans and mice. Several loci or candidate genes have been found associated with kidney disease, but identification of the specific genetic variants involved has proven to be difficult. CUBN, UMOD, and SHROOM3 were identified by human GWAS as being associated with albuminuria, kidney function, and chronic kidney disease (CKD). These are promising examples of genes that could be involved in renal aging, and were further mechanistically evaluated in animal models. Eventually, we will provide approaches for performing genetic analysis. We should leverage the power of mouse models, as testing in humans is limited. Mouse and other animal models can be used to explain the underlying biological mechanisms of genes and loci identified by human GWAS. Furthermore, mouse models can be used to identify genetic variants associated with age-associated histological changes, of which Far2, Wisp2, and Esrrg are examples. A new outbred mouse population with high genetic diversity will facilitate the identification of genes associated with renal aging by enabling high-resolution genetic mapping while also allowing the control of environmental factors, and by enabling access to renal tissues at specific time points for histology, proteomics, and gene expression. PMID:26219736

  13. Vapor Pressure Data Analysis and Statistics

    Science.gov (United States)

    2016-12-01

    near 8, 2000, and 200, respectively. The A (or a) value is directly related to vapor pressure and will be greater for high vapor pressure materials...1, (10) where n is the number of data points, Yi is the natural logarithm of the i th experimental vapor pressure value, and Xi is the...VAPOR PRESSURE DATA ANALYSIS AND STATISTICS ECBC-TR-1422 Ann Brozena RESEARCH AND TECHNOLOGY DIRECTORATE

  14. Statistical analysis of planktic foraminifera of the surface Continental ...

    African Journals Online (AJOL)

    Planktic foraminiferal assemblage recorded from selected samples obtained from shallow continental shelf sediments off southwestern Nigeria were subjected to statistical analysis. The Principal Component Analysis (PCA) was used to determine variants of planktic parameters. Values obtained for these parameters were ...

  15. Imaging mass spectrometry statistical analysis.

    Science.gov (United States)

    Jones, Emrys A; Deininger, Sören-Oliver; Hogendoorn, Pancras C W; Deelder, André M; McDonnell, Liam A

    2012-08-30

    Imaging mass spectrometry is increasingly used to identify new candidate biomarkers. This clinical application of imaging mass spectrometry is highly multidisciplinary: expertise in mass spectrometry is necessary to acquire high quality data, histology is required to accurately label the origin of each pixel's mass spectrum, disease biology is necessary to understand the potential meaning of the imaging mass spectrometry results, and statistics to assess the confidence of any findings. Imaging mass spectrometry data analysis is further complicated because of the unique nature of the data (within the mass spectrometry field); several of the assumptions implicit in the analysis of LC-MS/profiling datasets are not applicable to imaging. The very large size of imaging datasets and the reporting of many data analysis routines, combined with inadequate training and accessible reviews, have exacerbated this problem. In this paper we provide an accessible review of the nature of imaging data and the different strategies by which the data may be analyzed. Particular attention is paid to the assumptions of the data analysis routines to ensure that the reader is apprised of their correct usage in imaging mass spectrometry research. Copyright © 2012 Elsevier B.V. All rights reserved.

  16. Comparative results of RAPD and ISSR markers for genetic diversity ...

    African Journals Online (AJOL)

    PRECIOUS

    the mean level of genetic similarity with populations of M. baccifera by using RAPD .... The statistical data for 9 RAPD and 17 ISSR primers used for analyzing 12 accessions of M. .... (Numerical Taxonomy and Multivariate Analysis System, Bio-.

  17. Polyglot programming in applications used for genetic data analysis.

    Science.gov (United States)

    Nowak, Robert M

    2014-01-01

    Applications used for the analysis of genetic data process large volumes of data with complex algorithms. High performance, flexibility, and a user interface with a web browser are required by these solutions, which can be achieved by using multiple programming languages. In this study, I developed a freely available framework for building software to analyze genetic data, which uses C++, Python, JavaScript, and several libraries. This system was used to build a number of genetic data processing applications and it reduced the time and costs of development.

  18. Applied Behavior Analysis and Statistical Process Control?

    Science.gov (United States)

    Hopkins, B. L.

    1995-01-01

    Incorporating statistical process control (SPC) methods into applied behavior analysis is discussed. It is claimed that SPC methods would likely reduce applied behavior analysts' intimate contacts with problems and would likely yield poor treatment and research decisions. Cases and data presented by Pfadt and Wheeler (1995) are cited as examples.…

  19. Gene ontology analysis of pairwise genetic associations in two genome-wide studies of sporadic ALS

    Directory of Open Access Journals (Sweden)

    Kim Nora

    2012-07-01

    Background: It is increasingly clear that common human diseases have a complex genetic architecture characterized by both additive and nonadditive genetic effects. The goal of the present study was to determine whether patterns of both additive and nonadditive genetic associations aggregate in specific functional groups as defined by the Gene Ontology (GO). Results: We first estimated all pairwise additive and nonadditive genetic effects using the multifactor dimensionality reduction (MDR) method, which makes few assumptions about the underlying genetic model. Statistical significance was evaluated using permutation testing in two genome-wide association studies of ALS. The detection data consisted of 276 subjects with ALS and 271 healthy controls, while the replication data consisted of 221 subjects with ALS and 211 healthy controls. Both studies included genotypes from approximately 550,000 single-nucleotide polymorphisms (SNPs). Each SNP was mapped to a gene if it was within 500 kb of the start or end. Each SNP was assigned a p-value based on its strongest joint effect with the other SNPs. We then used the Exploratory Visual Analysis (EVA) method and software to assign a p-value to each gene based on the overabundance of significant SNPs at the α = 0.05 level in the gene. We also used EVA to assign p-values to each GO group based on the overabundance of significant genes at the α = 0.05 level. A GO category was determined to replicate if that category was significant at the α = 0.05 level in both studies. We found two GO categories that replicated in both studies. The first, ‘Regulation of Cellular Component Organization and Biogenesis’, a GO Biological Process, had p-values of 0.010 and 0.014 in the detection and replication studies, respectively. The second, ‘Actin Cytoskeleton’, a GO Cellular Component, had p-values of 0.040 and 0.046 in the detection and replication studies, respectively. Conclusions: Pathway
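
    The permutation testing mentioned in this abstract can be illustrated with a minimal sketch for a single SNP in a case-control design. This is not the MDR procedure used by the authors: the test statistic, the function names and the toy data layout (minor-allele counts 0/1/2, labels 1 for cases and 0 for controls) are assumptions chosen only to show how shuffling the labels builds a null distribution.

```python
import random


def allele_count_difference(genotypes, labels):
    """Absolute difference in mean minor-allele count, cases (1) vs controls (0)."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def permutation_p_value(genotypes, labels, n_permutations=2000, seed=0):
    """Estimate a p-value by shuffling case/control labels (hypothetical helper)."""
    rng = random.Random(seed)
    observed = allele_count_difference(genotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        if allele_count_difference(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)
```

    Calling `permutation_p_value(genotypes, labels)` returns the fraction of label shuffles whose statistic is at least as large as the observed one; a genome-wide analysis would still need to account for the number of SNPs tested.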

  20. A Unifying Model for the Analysis of Phenotypic, Genetic and Geographic Data

    DEFF Research Database (Denmark)

    Guillot, Gilles; Rena, Sabrina; Ledevin, Ronan

    2012-01-01

    Recognition of evolutionary units (species, populations) requires integrating several kinds of data such as genetic or phenotypic markers or spatial information, in order to get a comprehensive view concerning the differentiation of the units. We propose a statistical model with a double original...... advantage: (i) it incorporates information about the spatial distribution of the samples, with the aim to increase inference power and to relate more explicitly observed patterns to geography; and (ii) it allows one to analyze genetic and phenotypic data within a unified model and inference framework, thus...... an intricate case of inter- and intra-species differentiation based on an original data-set of georeferenced genetic and morphometric markers obtained on Myodes voles from Sweden. A computer program is made available as an extension of the R package Geneland....

  1. Microsatellite analysis of chloroquine resistance associated alleles and neutral loci reveal genetic structure of Indian Plasmodium falciparum

    Science.gov (United States)

    Mallick, Prashant K.; Sutton, Patrick L.; Singh, Ruchi; Singh, Om P.; Dash, Aditya P.; Singh, Ashok K.; Carlton, Jane M.; Bhasin, Virendra K.

    2013-01-01

    Efforts to control malignant malaria caused by Plasmodium falciparum are hampered by the parasite’s acquisition of resistance to antimalarial drugs, e.g., chloroquine. This necessitates evaluating the spread of chloroquine resistance in any malaria-endemic area. India displays highly variable malaria epidemiology and also shares porous international borders with malaria-endemic Southeast Asian countries having multi-drug resistant malaria. Malaria epidemiology in India is believed to be affected by two major factors: high genetic diversity and evolving drug resistance in P. falciparum. How transmission intensity of malaria can influence the genetic structure of chloroquine-resistant P. falciparum population in India is unknown. Here, genetic diversity within and among P. falciparum populations is analyzed with respect to their prevalence and chloroquine resistance observed in 13 different locations in India. Microsatellites developed for P. falciparum, including three putatively neutral and seven microsatellites thought to be under a hitchhiking effect due to chloroquine selection, were used. Genetic hitchhiking is observed in five of seven microsatellites flanking the gene responsible for chloroquine resistance. Genetic admixture analysis and F-statistics detected genetically distinct groups in accordance with transmission intensity of different locations and the probable use of chloroquine. A large genetic break between the chloroquine-resistant parasite of the Northeast-East-Island group and Southwest group (FST = 0.253, P<0.001) suggests a long period of isolation or a possibility of different origin between them. A pattern of significant isolation by distance was observed in low transmission areas (r = 0.49, P=0.003, N = 83, Mantel test). An unanticipated pattern of spread of hitchhiking suggests genetic structure for Indian P. falciparum population. Overall, the study suggests that transmission intensity can be an efficient driver for genetic differentiation
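
    The Mantel test cited above for isolation by distance correlates two distance matrices and assesses significance by permuting the rows and columns of one of them. The following is a minimal sketch of that idea, assuming square symmetric matrices given as nested lists; the function names and the one-sided alternative are illustrative choices, not the settings used in the study.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mantel_test(genetic, geographic, n_permutations=999, seed=0):
    """Return (r, one-sided p) for two distance matrices (hypothetical helper)."""
    n = len(genetic)
    pairs = list(combinations(range(n), 2))  # upper triangle only
    geo = [geographic[i][j] for i, j in pairs]
    observed = pearson([genetic[i][j] for i, j in pairs], geo)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(order)  # relabel populations in the genetic matrix
        permuted = [genetic[order[i]][order[j]] for i, j in pairs]
        if pearson(permuted, geo) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_permutations + 1)
```

    Whole populations are relabelled, rather than individual matrix cells, because the pairwise distances within a matrix are not independent of one another.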

  2. Pitfalls in genetic analysis of pheochromocytomas/paragangliomas-case report.

    Science.gov (United States)

    Canu, Letizia; Rapizzi, Elena; Zampetti, Benedetta; Fucci, Rossella; Nesi, Gabriella; Richter, Susan; Qin, Nan; Giachè, Valentino; Bergamini, Carlo; Parenti, Gabriele; Valeri, Andrea; Ercolino, Tonino; Eisenhofer, Graeme; Mannelli, Massimo

    2014-07-01

    About 35% of patients with pheochromocytoma/paraganglioma carry a germline mutation in one of the 10 main susceptibility genes. The recent introduction of next-generation sequencing will allow the analysis of all these genes in one run. When positive, the analysis is generally unequivocal due to the association between a germline mutation and a concordant clinical presentation or positive family history. When genetic analysis reveals a novel mutation with no clinical correlates, particularly in the presence of a missense variant, the question arises whether the mutation is pathogenic or a rare polymorphism. We report the case of a 35-year-old patient operated for a pheochromocytoma who turned out to be a carrier of a novel SDHD (succinate dehydrogenase subunit D) missense mutation. With no positive family history or clinical correlates, we decided to perform additional analyses to test the clinical significance of the mutation. We performed in silico analysis, tissue loss of heterozygosity analysis, immunohistochemistry, Western blot analysis, SDH enzymatic assay, and measurement of the succinate/fumarate concentration ratio in the tumor tissue by tandem mass spectrometry. Although the in silico analysis gave contradictory results according to the different methods, all the other tests demonstrated that the SDH complex was conserved and normally active. We therefore came to the conclusion that the variant was a nonpathogenic polymorphism. Advancements in technology facilitate genetic analysis of patients with pheochromocytoma but also offer new challenges to the clinician who, in some cases, needs clinical correlates and/or functional tests to give significance to the results of the genetic assay.

  3. Determinism and mass-media portrayals of genetics.

    Science.gov (United States)

    Condit, C M; Ofulue, N; Sheedy, K M

    1998-01-01

    Scholars have expressed concern that the introduction of substantial coverage of "medical genetics" in the mass media during the past 2 decades represents an increase in biological determinism in public discourse. To test this contention, we analyzed the contents of a randomly selected, structured sample of American public newspapers (n=250) and magazines (n=722) published during 1919-95. Three coders, using three measures, all with intercoder reliability >85%, were employed. Results indicate that the introduction of the discourse of medical genetics is correlated with both a statistically significant decrease in the degree to which articles attribute human characteristics to genetic causes (P<.001) and a statistically significant increase in the differentiation of attributions to genetic and other causes among various conditions or outcomes (P<.016). There has been no statistically significant change in the relative proportions of physical phenomena attributed to genetic causes, but there has been a statistically significant decrease in the number of articles assigning genetic causes to mental (P<.002) and behavioral (P<.000) characteristics. These results suggest that the current discourse of medical genetics is not accurately described as more biologically deterministic than its antecedents. PMID:9529342

  4. Statistical analysis of JET disruptions

    International Nuclear Information System (INIS)

    Tanga, A.; Johnson, M.F.

    1991-07-01

    In the operation of JET and of any tokamak many discharges are terminated by a major disruption. The disruptive termination of a discharge is usually an unwanted event which may cause damage to the structure of the vessel. In a reactor disruptions are potentially a very serious problem, hence the importance of studying them and devising methods to avoid disruptions. Statistical information has been collected about the disruptions which have occurred at JET over a long span of operations. The analysis is focused on the operational aspects of the disruptions rather than on the underlying physics. (Author)

  5. Simulation Experiments in Practice : Statistical Design and Regression Analysis

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    2007-01-01

    In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. Statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic
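
    The contrast drawn above between one-factor-at-a-time changes and Design Of Experiments can be sketched with a two-level full factorial design and a linear regression that includes an interaction term. Everything below is illustrative: the coded levels -1/+1, the function names and the toy simulation model `simulate` are assumptions, not taken from the cited work.

```python
from itertools import product


def full_factorial(n_factors):
    """All combinations of coded levels -1 and +1 for n_factors factors."""
    return [list(levels) for levels in product((-1, 1), repeat=n_factors)]


def simulate(x1, x2):
    """Hypothetical simulation response with an interaction between the factors."""
    return 10 + 2 * x1 + 3 * x2 + 1.5 * x1 * x2


def fit_two_factor_model(design, responses):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + b12*x1*x2.

    For an orthogonal coded design each estimate reduces to
    sum(column * response) / number_of_runs.
    """
    n = len(design)
    columns = {
        "b0": [1 for _ in design],
        "b1": [x1 for x1, _ in design],
        "b2": [x2 for _, x2 in design],
        "b12": [x1 * x2 for x1, x2 in design],
    }
    return {
        name: sum(c * y for c, y in zip(column, responses)) / n
        for name, column in columns.items()
    }
```

    With four runs the factorial design estimates both main effects and the interaction `b12`; changing one factor at a time from a base point gives three runs, from which the interaction cannot be estimated.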

  6. Genetic diversity analysis of common beans based on molecular markers

    Directory of Open Access Journals (Sweden)

    Homar R. Gill-Langarica

    2011-01-01

    A core collection of the common bean (Phaseolus vulgaris L.), representing genetic diversity in the entire Mexican holding, is kept at the INIFAP (Instituto Nacional de Investigaciones Forestales, Agricolas y Pecuarias, Mexico) Germplasm Bank. After evaluation, the genetic structure of this collection (200 accessions) was compared with that of landraces from the states of Oaxaca, Chiapas and Veracruz (10 genotypes from each), as well as a further 10 cultivars, by means of four amplified fragment length polymorphisms (AFLP) +3/+3 primer combinations and seven simple sequence repeats (SSR) loci, in order to define genetic diversity, variability and mutual relationships. Data underwent cluster (UPGMA) and molecular variance (AMOVA) analyses. AFLP analysis produced 530 bands (88.5% polymorphic) while SSR primers amplified 174 alleles, all polymorphic (8.2 alleles per locus). AFLP indicated that the highest genetic diversity was to be found in ten commercial-seed classes from two major groups of accessions from Central Mexico and Chiapas, which seems to be an important center of diversity in the south. A third group included genotypes from Nueva Granada, Mesoamerica, Jalisco and Durango races. Here, SSR analysis indicated a reduced number of shared haplotypes among accessions, whereas the highest genetic components of AMOVA variation were found within accessions. Genetic diversity observed in the common-bean core collection represents an important sample of the total Phaseolus genetic variability at the main Germplasm Bank of INIFAP. Molecular marker strategies could contribute to a better understanding of the genetic structure of the core collection as well as to its improvement and validation.

  8. Genetic diversity analysis of common beans based on molecular markers.

    Science.gov (United States)

    Gill-Langarica, Homar R; Muruaga-Martínez, José S; Vargas-Vázquez, M L Patricia; Rosales-Serna, Rigoberto; Mayek-Pérez, Netzahualcoyotl

    2011-10-01

    A core collection of the common bean (Phaseolus vulgaris L.), representing genetic diversity in the entire Mexican holding, is kept at the INIFAP (Instituto Nacional de Investigaciones Forestales, Agricolas y Pecuarias, Mexico) Germplasm Bank. After evaluation, the genetic structure of this collection (200 accessions) was compared with that of landraces from the states of Oaxaca, Chiapas and Veracruz (10 genotypes from each), as well as a further 10 cultivars, by means of four amplified fragment length polymorphisms (AFLP) +3/+3 primer combinations and seven simple sequence repeats (SSR) loci, in order to define genetic diversity, variability and mutual relationships. Data underwent cluster (UPGMA) and molecular variance (AMOVA) analyses. AFLP analysis produced 530 bands (88.5% polymorphic) while SSR primers amplified 174 alleles, all polymorphic (8.2 alleles per locus). AFLP indicated that the highest genetic diversity was to be found in ten commercial-seed classes from two major groups of accessions from Central Mexico and Chiapas, which seems to be an important center of diversity in the south. A third group included genotypes from Nueva Granada, Mesoamerica, Jalisco and Durango races. Here, SSR analysis indicated a reduced number of shared haplotypes among accessions, whereas the highest genetic components of AMOVA variation were found within accessions. Genetic diversity observed in the common-bean core collection represents an important sample of the total Phaseolus genetic variability at the main Germplasm Bank of INIFAP. Molecular marker strategies could contribute to a better understanding of the genetic structure of the core collection as well as to its improvement and validation.

  9. Statistical and population genetics issues of two Hungarian datasets from the aspect of DNA evidence interpretation.

    Science.gov (United States)

    Szabolcsi, Zoltán; Farkas, Zsuzsa; Borbély, Andrea; Bárány, Gusztáv; Varga, Dániel; Heinrich, Attila; Völgyi, Antónia; Pamjav, Horolma

    2015-11-01

    When the DNA profile from a crime scene matches that of a suspect, the weight of DNA evidence depends on the unbiased estimation of the match probability of the profiles. For this reason, it is necessary to establish and expand databases that reflect the actual allele frequencies in the relevant population. We used 21,473 complete DNA profiles from Databank samples to establish an allele frequency database representing the population of Hungarian suspects. We used fifteen STR loci (PowerPlex ESI16), including five new ESS loci. The aim was to calculate the statistical and forensic efficiency parameters for the Databank samples and to compare the newly detected data with the earlier report. Population substructure caused by relatedness may influence the estimated frequency of profiles. As our Databank profiles were considered non-random samples, possible relationships between the suspects can be assumed. Therefore, the population inbreeding effect was estimated using the FIS calculation. The overall inbreeding parameter was found to be 0.0106. Furthermore, we tested the impact of the two allele frequency datasets on 101 randomly chosen STR profiles, including full and partial profiles. The 95% confidence interval estimates for the profile frequencies (pM) resulted in a tighter range when we used the new dataset compared to the previously published ones. We found that the FIS had less effect on frequency values in the 21,473 samples than the application of minimum allele frequency. No genetic substructure was detected by STRUCTURE analysis. Due to the low level of inbreeding effect and the high number of samples, the new dataset provides unbiased and precise estimates of LR for statistical interpretation of forensic casework and allows us to use lower allele frequencies. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
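
    The inbreeding parameter FIS referred to above is commonly defined per locus as 1 - Ho/He, where Ho is the observed and He the expected heterozygosity. The sketch below applies that textbook definition to a list of genotypes at one STR locus. The function name and the data layout (pairs of allele labels) are assumptions, and the abstract does not state which estimator the authors used.

```python
from collections import Counter


def inbreeding_coefficient(genotypes):
    """F_IS = 1 - H_obs / H_exp for one locus (simple, uncorrected estimator).

    genotypes: list of (allele_a, allele_b) pairs, one per individual.
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    observed_het = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under Hardy-Weinberg: 1 - sum of squared allele frequencies.
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    expected_het = 1 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    return 1 - observed_het / expected_het
```

    A value near zero, like the 0.0106 reported, means observed heterozygosity is close to the Hardy-Weinberg expectation; positive values indicate a deficit of heterozygotes.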

  10. Statistical analysis of the Ft. Calhoun reactor coolant pump system

    International Nuclear Information System (INIS)

    Patel, Bimal; Heising, C.D.

    1997-01-01

    In engineering science, statistical quality control techniques have traditionally been applied to control manufacturing processes. An application to commercial nuclear power plant maintenance and control is presented that can greatly improve plant safety. As a demonstration of such an approach, a specific system is analyzed: the reactor coolant pumps (RCPs) of the Ft. Calhoun nuclear power plant. This research uses capability analysis, Shewhart X-bar, R charts, canonical correlation methods, and design of experiments to analyze the process for the state of statistical control. The results obtained show that six out of ten parameters are under control specification limits and four parameters are not in the state of statistical control. The analysis shows that statistical process control methods can be applied as an early warning system capable of identifying significant equipment problems well in advance of traditional control room alarm indicators. Such a system would provide operators with ample time to respond to possible emergency situations and thus improve plant safety and reliability. (Author)
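
    The Shewhart X-bar and R charts named in this abstract set control limits from subgroup means and ranges. The sketch below uses the standard tabulated constants for subgroups of five measurements (A2 = 0.577, D3 = 0, D4 = 2.114). The function names and any pump readings passed in are hypothetical and do not come from the Ft. Calhoun analysis.

```python
# Standard control-chart constants for subgroups of size 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return (lower, centre, upper) limits for the X-bar and R charts."""
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)
    mean_range = sum(ranges) / len(ranges)
    return {
        "xbar": (grand_mean - A2 * mean_range, grand_mean, grand_mean + A2 * mean_range),
        "range": (D3 * mean_range, mean_range, D4 * mean_range),
    }


def out_of_control(subgroups, limits):
    """Indices of subgroups whose mean falls outside the X-bar limits."""
    lower, _, upper = limits["xbar"]
    return [
        i for i, g in enumerate(subgroups)
        if not lower <= sum(g) / len(g) <= upper
    ]
```

    Limits would normally be computed from a baseline period believed to be in control and then applied to new subgroups, which is how such a chart can act as the early warning described in the abstract.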

  11. Research and Development of Statistical Analysis Software System of Maize Seedling Experiment

    OpenAIRE

    Hui Cao

    2014-01-01

    In this study, software engineering methods were used to develop a software system for the statistics and analysis work of maize seedling experiments. During development, a B/S (browser/server) software design was used and a set of statistical indicators for maize seedling evaluation was established. The experimental results indicated that the software system could carry out quality statistics and analysis for maize seedlings very well. The development of this software system explored a...

  12. A genetic analysis of Trichuris trichiura and Trichuris suis from Ecuador.

    Science.gov (United States)

    Meekums, Hayley; Hawash, Mohamed B F; Sparks, Alexandra M; Oviedo, Yisela; Sandoval, Carlos; Chico, Martha E; Stothard, J Russell; Cooper, Philip J; Nejsum, Peter; Betson, Martha

    2015-03-19

    Since the nematodes Trichuris trichiura and T. suis are morphologically indistinguishable, genetic analysis is required to assess epidemiological cross-over between people and pigs. This study aimed to clarify the transmission biology of trichuriasis in Ecuador. Adult Trichuris worms were collected during a parasitological survey of 132 people and 46 pigs in Esmeraldas Province, Ecuador. Morphometric analysis of 49 pig worms and 64 human worms revealed significant variation. In discriminant analysis morphometric characteristics correctly classified male worms according to host species. In PCR-RFLP analysis of the ribosomal Internal Transcribed Spacer (ITS-2) and 18S DNA (59 pig worms and 82 human worms), nearly all Trichuris exhibited expected restriction patterns. However, two pig-derived worms showed a "heterozygous-type" ITS-2 pattern, with one also having a "heterozygous-type" 18S pattern. Phylogenetic analysis of the mitochondrial large ribosomal subunit partitioned worms by host species. Notably, some Ecuadorian T. suis clustered with porcine Trichuris from USA and Denmark and some with Chinese T. suis. This is the first study in Latin America to genetically analyse Trichuris parasites. Although T. trichiura does not appear to be zoonotic in Ecuador, there is evidence of genetic exchange between T. trichiura and T. suis warranting more detailed genetic sampling.

  13. Statistical trend analysis methods for temporal phenomena

    Energy Technology Data Exchange (ETDEWEB)

    Lehtinen, E.; Pulkkinen, U. [VTT Automation, (Finland); Poern, K. [Poern Consulting, Nykoeping (Sweden)

    1997-04-01

    We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of the hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly, if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: The Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered the Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted, that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods. 14 refs, 10 figs.
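
    Of the non-parametric tests listed above, the Cox-Stuart test is the simplest to sketch: observations in the first half of a series are paired with those in the second half, and the signs of the differences are assessed with a two-sided binomial (sign) test. The function name below is hypothetical, and applying the test to a series such as times between successive events is an assumption made for illustration.

```python
from math import comb


def cox_stuart_test(values):
    """Cox-Stuart trend test; returns (positive differences, pairs used, two-sided p)."""
    n = len(values)
    half = n // 2
    first = values[:half]
    second = values[n - half:]  # the middle observation is dropped when n is odd
    # Tied pairs carry no sign information and are discarded.
    diffs = [later - earlier for earlier, later in zip(first, second) if later != earlier]
    m = len(diffs)
    positives = sum(1 for d in diffs if d > 0)
    k = min(positives, m - positives)
    # Binomial(m, 1/2) probability of a result at least as extreme, doubled.
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m
    return positives, m, min(1.0, 2 * tail)
```

    The test makes no Poisson-process assumption, which is the situation in which the abstract recommends falling back on non-parametric procedures.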

  14. Statistical trend analysis methods for temporal phenomena

    International Nuclear Information System (INIS)

    Lehtinen, E.; Pulkkinen, U.; Poern, K.

    1997-04-01

    We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: the Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods.

  15. A markerless protocol for genetic analysis of Aggregatibacter actinomycetemcomitans

    Science.gov (United States)

    Cheng, Ya-An; Jee, Jason; Hsu, Genie; Huang, Yanyan; Chen, Casey; Lin, Chun-Pin

    2015-01-01

    Background/Purpose: The genomes of different Aggregatibacter actinomycetemcomitans strains contain many strain-specific genes and genomic islands (defined as DNA found in some but not all strains) of unknown functions. Genetic analysis for the functions of these islands will be constrained by the limited availability of genetic markers and vectors for A. actinomycetemcomitans. In this study we tested a novel genetic approach of gene deletion and restoration in a naturally competent A. actinomycetemcomitans strain D7S-1. Methods: Deletion mutants of specific genes and mutants restored with the deleted genes were constructed by a markerless loxP/Cre system. In mutants with sequential deletion of multiple genes, loxP sites with different spacer regions were used to avoid unwanted recombinations between loxP sites. Results: Eight single-gene deletion mutants, four multiple-gene deletion mutants, and two mutants with restored genes were constructed. No unintended non-specific deletion mutants were generated by this protocol. The protocol did not negatively affect the growth and biofilm formation of A. actinomycetemcomitans. Conclusion: The protocol described in this study is efficient and specific for genetic manipulation of A. actinomycetemcomitans, and will be amenable for functional analysis of multiple genes in A. actinomycetemcomitans. PMID:24530245

  16. StOCNET : Software for the statistical analysis of social networks

    NARCIS (Netherlands)

    Huisman, M.; van Duijn, M.A.J.

    2003-01-01

    StOCNET3 is an open software system in a Windows environment for the advanced statistical analysis of social networks. It provides a platform to make a number of recently developed and therefore not (yet) standard statistical methods available to a wider audience. A flexible user interface utilizing

  17. The Inappropriate Symmetries of Multivariate Statistical Analysis in Geometric Morphometrics.

    Science.gov (United States)

    Bookstein, Fred L

    In today's geometric morphometrics the commonest multivariate statistical procedures, such as principal component analysis or regressions of Procrustes shape coordinates on Centroid Size, embody a tacit roster of symmetries (axioms concerning the homogeneity of the multiple spatial domains or descriptor vectors involved) that do not correspond to actual biological fact. These techniques are hence inappropriate for any application regarding which we have a priori biological knowledge to the contrary (e.g., genetic/morphogenetic processes common to multiple landmarks, the range of normal in anatomy atlases, the consequences of growth or function for form). But nearly every morphometric investigation is motivated by prior insights of this sort. We therefore need new tools that explicitly incorporate these elements of knowledge, should they be quantitative, to break the symmetries of the classic morphometric approaches. Some of these are already available in our literature but deserve to be known more widely: deflated (spatially adaptive) reference distributions of Procrustes coordinates, Sewall Wright's century-old variant of factor analysis, the geometric algebra of importing explicit biomechanical formulas into Procrustes space. Other methods, not yet fully formulated, might involve parameterized models for strain in idealized forms under load, principled approaches to the separation of functional from Brownian aspects of shape variation over time, and, in general, a better understanding of how the formalism of landmarks interacts with the many other approaches to quantification of anatomy. To more powerfully organize inferences from the high-dimensional measurements that characterize so much of today's organismal biology, tomorrow's toolkit must rely neither on principal component analysis nor on the Procrustes distance formula, but instead on sound prior biological knowledge as expressed in formulas whose coefficients are not all the same.
I describe the problems

  18. AutoBayes: A System for Generating Data Analysis Programs from Statistical Models

    OpenAIRE

    Fischer, Bernd; Schumann, Johann

    2003-01-01

    Data analysis is an important scientific task which is required whenever information needs to be extracted from raw data. Statistical approaches to data analysis, which use methods from probability theory and numerical analysis, are well-founded but difficult to implement: the development of a statistical data analysis program for any given application is time-consuming and requires substantial knowledge and experience in several areas. In this paper, we describe AutoBayes, a program synthesis...

  19. Expression quantitative trait loci and genetic regulatory network analysis reveals that Gabra2 is involved in stress responses in the mouse.

    Science.gov (United States)

    Dai, Jiajuan; Wang, Xusheng; Chen, Ying; Wang, Xiaodong; Zhu, Jun; Lu, Lu

    2009-11-01

    Previous studies have revealed that the subunit alpha 2 (Gabra2) of the gamma-aminobutyric acid receptor plays a critical role in the stress response. However, little is known about the genetic regulatory network for Gabra2 and the stress response. We combined gene expression microarray analysis and quantitative trait loci (QTL) mapping to characterize the genetic regulatory network for Gabra2 expression in the hippocampus of BXD recombinant inbred (RI) mice. Our analysis found that the expression level of Gabra2 exhibited much variation in the hippocampus across the BXD RI strains and between the parental strains, C57BL/6J and DBA/2J. Expression QTL (eQTL) mapping showed three microarray probe sets of Gabra2 to have highly significant linkage likelihood ratio statistic (LRS) scores. Gene co-regulatory network analysis showed that 10 genes, including Gria3, Chka, Drd3, Homer1, Grik2, Odz4, Prkag2, Grm5, Gabrb1, and Nlgn1, are directly or indirectly associated with stress responses. Eleven genes were implicated as Gabra2 downstream genes through mapping joint modulation. The genetical genomics approach demonstrates the importance and the potential power of eQTL studies in identifying genetic regulatory networks that contribute to complex traits, such as stress responses.

  20. Network similarity and statistical analysis of earthquake seismic data

    OpenAIRE

    Deyasi, Krishanu; Chakraborty, Abhijit; Banerjee, Anirban

    2016-01-01

    We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We cal...
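
    The Jensen-Shannon divergence mentioned in the abstract can be sketched for two discrete distributions as follows. This is only an illustration under stated assumptions: the abstract does not say how the graph spectra are binned, so the inputs here are hypothetical histograms (lists of probabilities of equal length that each sum to one), and the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions.

    It is symmetric and bounded between 0 (identical) and 1 (disjoint
    support), which makes it usable as a dissimilarity for clustering.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical normalised eigenvalue histograms of two networks.
spectrum_a = [0.1, 0.4, 0.4, 0.1]
spectrum_b = [0.3, 0.2, 0.2, 0.3]
print(jensen_shannon(spectrum_a, spectrum_b))
```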

  1. Statistical analysis and interpolation of compositional data in materials science.

    Science.gov (United States)

    Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M

    2015-02-09

    Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
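
    A minimal sketch of two standard compositional-data operations, closure and the centered log-ratio (clr) transform, may make the simplex constraint concrete. The abstract does not state which transforms the authors use, so this is an assumption chosen for illustration; the function names and the example concentrations are hypothetical, and all parts are assumed strictly positive.

```python
import math


def closure(parts, total=1.0):
    """Rescale non-negative parts so that they sum to a constant."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(parts):
    """Centered log-ratio transform of a strictly positive composition.

    Each part is expressed as the log of its ratio to the geometric mean,
    which maps the simplex into ordinary Euclidean space.
    """
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical atomic concentrations of a three-component film.
raw = [2.0, 3.0, 5.0]
composition = closure(raw)
print(composition)
print(clr(composition))
```

    The clr coordinates sum to zero and do not change when the raw measurements are rescaled, so means and standard deviations computed on them are not distorted by the constant-sum constraint.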

  2. An Application of Multivariate Statistical Analysis for Query-Driven Visualization

    Energy Technology Data Exchange (ETDEWEB)

    Gosink, Luke J. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)]; Garth, Christoph [Univ. of California, Davis, CA (United States)]; Anderson, John C. [Univ. of California, Davis, CA (United States)]; Bethel, E. Wes [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)]; Joy, Kenneth I. [Univ. of California, Davis, CA (United States)]

    2011-03-01

    Driven by the ability to generate ever-larger, increasingly complex data, there is an urgent need in the scientific community for scalable analysis methods that can rapidly identify salient trends in scientific data. Query-Driven Visualization (QDV) strategies are among the small subset of techniques that can address both large and highly complex datasets. This paper extends the utility of QDV strategies with a statistics-based framework that integrates non-parametric distribution estimation techniques with a new segmentation strategy to visually identify statistically significant trends and features within the solution space of a query. In this framework, query distribution estimates help users to interactively explore their query's solution and visually identify the regions where the combined behavior of constrained variables is most important, statistically, to their inquiry. Our new segmentation strategy extends the distribution estimation analysis by visually conveying the individual importance of each variable to these regions of high statistical significance. We demonstrate the analysis benefits these two strategies provide and show how they may be used to facilitate the refinement of constraints over variables expressed in a user's query. We apply our method to datasets from two different scientific domains to demonstrate its broad applicability.

  3. Genetic analysis of PAX3 for diagnosis of Waardenburg syndrome type I.

    Science.gov (United States)

    Matsunaga, Tatsuo; Mutai, Hideki; Namba, Kazunori; Morita, Noriko; Masuda, Sawako

    2013-04-01

    PAX3 genetic analysis increased the diagnostic accuracy for Waardenburg syndrome type I (WS1). Analysis of the three-dimensional (3D) structure of PAX3 helped verify the pathogenicity of a missense mutation, and multiplex ligation-dependent probe amplification (MLPA) analysis of PAX3 increased the sensitivity of genetic diagnosis in patients with WS1. Clinical diagnosis of WS1 is often difficult in individual patients with isolated, mild, or non-specific symptoms. The objective of the present study was to facilitate the accurate diagnosis of WS1 through genetic analysis of PAX3 and to expand the spectrum of known PAX3 mutations. In two Japanese families with WS1, we conducted a clinical evaluation of symptoms and genetic analysis, which involved direct sequencing, MLPA analysis, quantitative PCR of PAX3, and analysis of the predicted 3D structure of PAX3. The normal-hearing control group comprised 92 subjects who had normal hearing according to pure tone audiometry. In one family, direct sequencing of PAX3 identified a heterozygous mutation, p.I59F. Analysis of PAX3 3D structures indicated that this mutation distorted the DNA-binding site of PAX3. In the other family, MLPA analysis and subsequent quantitative PCR detected a large, heterozygous deletion spanning 1759-2554 kb that eliminated 12-18 genes including the whole PAX3 gene.

  4. DMPD: The Toll-like receptors: analysis by forward genetic methods. [Dynamic Macrophage Pathway CSML Database]

    Lifescience Database Archive (English)

    The Toll-like receptors: analysis by forward genetic methods. Beutler B. Immunogenetics. 2005 Jul;57(6):385-92. PubmedID 16001129

  5. Genetic Alterations in Pesticide Exposed Bolivian Farmers

    DEFF Research Database (Denmark)

    Jørs, Erik; González, Ana Rosa; Ascarrunz, Maria Eugenia

    2007-01-01

    : Questionnaires were applied and blood tests taken from 81 volunteers from La Paz County, of whom 48 were pesticide exposed farmers and 33 non-exposed controls. Sixty males and 21 females participated with a mean age of 37.3 years (range 17-76). Data of exposure and possible genetic damage were collected...... and evaluated by well known statistical methods, controlling for relevant confounders. To measure genetic damage chromosomal aberrations and the comet assay analysis were performed. Results: Pesticide exposed farmers had a higher degree of genetic damage compared to the control group. The number of chromosomal......, probably related to exposure to pesticides. Due to the potentially negative long term health effects of genetic damage on reproduction and the development of cancer, preventive measures are recommended. Effective control with imports and sales, banning of the most toxic pesticides, education...

  6. EMBO Course “Formal Analysis of Genetic Regulation”

    CERN Document Server

    1979-01-01

    The EMBO course on "Formal Analysis of Genetic Regulation". A course entitled "Formal analysis of Genetic Regulation" was held at the University of Brussels from 6 to 16 September 1977 under the auspices of EMBO (European Molecular Biology Organization). As indicated by the title of the book (but not explicitly enough by the title of the course), the main emphasis was put on a dynamic analysis of systems using logical methods, that is, methods in which functions and variables take only a limited number of values - typically two. In this respect, this course was complementary to an EMBO course using continuous methods which was held some months later in Israel by Prof. Segel. People from four very different laboratories took an active part in teaching our course in Brussels: Drs Anne LEUSSLER and Philippe VAN HAM, from the Laboratory of Prof. Jean FLORINE (Laboratoire des Systemes logiques et numeriques, Faculte des Sciences appliquees, Universite Libre de Bruxelles). Dr Stuart KAUFFMAN (Dept. of Biochemist...

  7. Explorations in Statistics: The Analysis of Ratios and Normalized Data

    Science.gov (United States)

    Curran-Everett, Douglas

    2013-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of "Explorations in Statistics" explores the analysis of ratios and normalized (or standardized) data. As researchers, we compute a ratio (a numerator divided by a denominator) to compute a…

  8. The nature of statistics

    CERN Document Server

    Wallis, W Allen

    2014-01-01

    Focusing on everyday applications as well as those of scientific research, this classic of modern statistical methods requires little to no mathematical background. Readers develop basic skills for evaluating and using statistical data. Lively, relevant examples include applications to business, government, social and physical sciences, genetics, medicine, and public health. "W. Allen Wallis and Harry V. Roberts have made statistics fascinating." - The New York Times "The authors have set out with considerable success, to write a text which would be of interest and value to the student who,

  9. Statistical Energy Analysis (SEA) and Energy Finite Element Analysis (EFEA) Predictions for a Floor-Equipped Composite Cylinder

    Science.gov (United States)

    Grosveld, Ferdinand W.; Schiller, Noah H.; Cabell, Randolph H.

    2011-01-01

    Comet Enflow is a commercially available, high frequency vibroacoustic analysis software founded on Energy Finite Element Analysis (EFEA) and Energy Boundary Element Analysis (EBEA). Energy Finite Element Analysis (EFEA) was validated on a floor-equipped composite cylinder by comparing EFEA vibroacoustic response predictions with Statistical Energy Analysis (SEA) and experimental results. Statistical Energy Analysis (SEA) predictions were made using the commercial software program VA One 2009 from ESI Group. The frequency region of interest for this study covers the one-third octave bands with center frequencies from 100 Hz to 4000 Hz.

  10. Simulation Experiments in Practice : Statistical Design and Regression Analysis

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    2007-01-01

    In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is

  11. Phenotypic and molecular genetic analysis of Pyruvate Kinase ...

    African Journals Online (AJOL)

    Phenotypic and molecular genetic analysis of Pyruvate Kinase deficiency in a Tunisian family. Jaouani Mouna, Hamdi Nadia, Chaouch Leila, Kalai Miniar, Mellouli Fethi, Darragi Imen, Boudriga Imen, Chaouachi Dorra, Bejaoui Mohamed, Abbes Salem ...

  12. Theory and Practice in Quantitative Genetics

    DEFF Research Database (Denmark)

    Posthuma, Daniëlle; Beem, A Leo; de Geus, Eco J C

    2003-01-01

    With the rapid advances in molecular biology, the near completion of the human genome, the development of appropriate statistical genetic methods and the availability of the necessary computing power, the identification of quantitative trait loci has now become a realistic prospect for quantitative...... geneticists. We briefly describe the theoretical biometrical foundations underlying quantitative genetics. These theoretical underpinnings are translated into mathematical equations that allow the assessment of the contribution of observed (using DNA samples) and unobserved (using known genetic relationships......) genetic variation to population variance in quantitative traits. Several statistical models for quantitative genetic analyses are described, such as models for the classical twin design, multivariate and longitudinal genetic analyses, extended twin analyses, and linkage and association analyses. For each...

  13. Statistical trend analysis methodology for rare failures in changing technical systems

    International Nuclear Information System (INIS)

    Ott, K.O.; Hoffmann, H.J.

    1983-07-01

    A methodology for a statistical trend analysis (STA) in failure rates is presented. It applies primarily to relatively rare events in changing technologies or components. The formulation is more general and the assumptions are less restrictive than in a previously published version. Relations of the statistical analysis and probabilistic assessment (PRA) are discussed in terms of categorization of decisions for action following particular failure events. The significance of tentatively identified trends is explored. In addition to statistical tests for trend significance, a combination of STA and PRA results quantifying the trend complement is proposed. The STA approach is compared with other concepts for trend characterization. (orig.)

  14. Physics-Based Image Segmentation Using First Order Statistical Properties and Genetic Algorithm for Inductive Thermography Imaging.

    Science.gov (United States)

    Gao, Bin; Li, Xiaoqing; Woo, Wai Lok; Tian, Gui Yun

    2018-05-01

    Thermographic inspection has been widely applied to non-destructive testing and evaluation with the capabilities of rapid, contactless, and large surface area detection. Image segmentation is considered essential for identifying and sizing defects. To attain a high level of performance, specific physics-based models that describe defect generation and enable the precise extraction of the target region are of crucial importance. In this paper, an effective genetic first-order statistical image segmentation algorithm is proposed for quantitative crack detection. The proposed method automatically extracts valuable spatial-temporal patterns using an unsupervised feature extraction algorithm and avoids a range of issues associated with human intervention in the laborious manual selection of specific thermal video frames for processing. An internal genetic functionality is built into the proposed algorithm to automatically control the segmentation threshold to render enhanced accuracy in sizing the cracks. Eddy current pulsed thermography is used as a platform to demonstrate surface crack detection. Experimental tests and comparisons have been conducted to verify the efficacy of the proposed method. In addition, a global quantitative assessment index, the F-score, has been adopted to objectively evaluate the performance of different segmentation algorithms.

  15. Adaptive divergence despite strong genetic drift: genomic analysis of the evolutionary mechanisms causing genetic differentiation in the island fox (Urocyon littoralis)

    Science.gov (United States)

    FUNK, W. CHRIS; LOVICH, ROBERT E.; HOHENLOHE, PAUL A.; HOFMAN, COURTNEY A.; MORRISON, SCOTT A.; SILLETT, T. SCOTT; GHALAMBOR, CAMERON K.; MALDONADO, JESUS E.; RICK, TORBEN C.; DAY, MITCH D.; POLATO, NICHOLAS R.; FITZPATRICK, SARAH W.; COONAN, TIMOTHY J.; CROOKS, KEVIN R.; DILLON, ADAM; GARCELON, DAVID K.; KING, JULIE L.; BOSER, CHRISTINA L.; GOULD, NICHOLAS; ANDELT, WILLIAM F.

    2016-01-01

    The evolutionary mechanisms generating the tremendous biodiversity of islands have long fascinated evolutionary biologists. Genetic drift and divergent selection are predicted to be strong on islands and both could drive population divergence and speciation. Alternatively, strong genetic drift may preclude adaptation. We conducted a genomic analysis to test the roles of genetic drift and divergent selection in causing genetic differentiation among populations of the island fox (Urocyon littoralis). This species consists of 6 subspecies, each of which occupies a different California Channel Island. Analysis of 5293 SNP loci generated using Restriction-site Associated DNA (RAD) sequencing found support for genetic drift as the dominant evolutionary mechanism driving population divergence among island fox populations. In particular, populations had exceptionally low genetic variation, small Ne (range = 2.1–89.7; median = 19.4), and significant genetic signatures of bottlenecks. Moreover, islands with the lowest genetic variation (and, by inference, the strongest historical genetic drift) were most genetically differentiated from mainland gray foxes, and vice versa, indicating genetic drift drives genome-wide divergence. Nonetheless, outlier tests identified 3.6–6.6% of loci as high FST outliers, suggesting that despite strong genetic drift, divergent selection contributes to population divergence. Patterns of similarity among populations based on high FST outliers mirrored patterns based on morphology, providing additional evidence that outliers reflect adaptive divergence. Extremely low genetic variation and small Ne in some island fox populations, particularly on San Nicolas Island, suggest that they may be vulnerable to fixation of deleterious alleles, decreased fitness, and reduced adaptive potential. PMID:26992010

  16. Adaptive divergence despite strong genetic drift: genomic analysis of the evolutionary mechanisms causing genetic differentiation in the island fox (Urocyon littoralis).

    Science.gov (United States)

    Funk, W Chris; Lovich, Robert E; Hohenlohe, Paul A; Hofman, Courtney A; Morrison, Scott A; Sillett, T Scott; Ghalambor, Cameron K; Maldonado, Jesus E; Rick, Torben C; Day, Mitch D; Polato, Nicholas R; Fitzpatrick, Sarah W; Coonan, Timothy J; Crooks, Kevin R; Dillon, Adam; Garcelon, David K; King, Julie L; Boser, Christina L; Gould, Nicholas; Andelt, William F

    2016-05-01

    The evolutionary mechanisms generating the tremendous biodiversity of islands have long fascinated evolutionary biologists. Genetic drift and divergent selection are predicted to be strong on islands and both could drive population divergence and speciation. Alternatively, strong genetic drift may preclude adaptation. We conducted a genomic analysis to test the roles of genetic drift and divergent selection in causing genetic differentiation among populations of the island fox (Urocyon littoralis). This species consists of six subspecies, each of which occupies a different California Channel Island. Analysis of 5293 SNP loci generated using Restriction-site Associated DNA (RAD) sequencing found support for genetic drift as the dominant evolutionary mechanism driving population divergence among island fox populations. In particular, populations had exceptionally low genetic variation, small Ne (range = 2.1-89.7; median = 19.4), and significant genetic signatures of bottlenecks. Moreover, islands with the lowest genetic variation (and, by inference, the strongest historical genetic drift) were most genetically differentiated from mainland grey foxes, and vice versa, indicating genetic drift drives genome-wide divergence. Nonetheless, outlier tests identified 3.6-6.6% of loci as high FST outliers, suggesting that despite strong genetic drift, divergent selection contributes to population divergence. Patterns of similarity among populations based on high FST outliers mirrored patterns based on morphology, providing additional evidence that outliers reflect adaptive divergence. Extremely low genetic variation and small Ne in some island fox populations, particularly on San Nicolas Island, suggest that they may be vulnerable to fixation of deleterious alleles, decreased fitness and reduced adaptive potential. © 2016 John Wiley & Sons Ltd.
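
    To make the FST statistic used in the outlier tests concrete, here is a minimal sketch of Wright's FST for a single biallelic locus, computed as (HT - HS) / HT from subpopulation allele frequencies. This is a textbook definition with equal population weights, not the estimator or outlier method used in the study; the function name and the example frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """Wright's FST for one biallelic locus.

    freqs holds the frequency of one allele in each subpopulation
    (equal weights assumed). HS is the mean expected heterozygosity
    within subpopulations and HT the expected heterozygosity of the
    pooled population.
    """
    p_bar = sum(freqs) / len(freqs)
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    if h_total == 0.0:
        return 0.0  # locus is monomorphic everywhere; no differentiation
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies on three islands.
print(fst_biallelic([0.05, 0.50, 0.95]))
```

    Values near 0 indicate similar allele frequencies across populations, while values near 1 indicate that populations are close to fixed for different alleles, the pattern expected under strong drift or divergent selection.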

  17. Analysis of thrips distribution: application of spatial statistics and Kriging

    Science.gov (United States)

    John Aleong; Bruce L. Parker; Margaret Skinner; Diantha Howard

    1991-01-01

    Kriging is a statistical technique that provides predictions for spatially and temporally correlated data. Observations of thrips distribution and density in Vermont soils are made in both space and time. Traditional statistical analysis of such data assumes that the counts taken over space and time are independent, which is not necessarily true. Therefore, to analyze...

  18. Statistical wind analysis for near-space applications

    Science.gov (United States)

    Roney, Jason A.

    2007-09-01

    Statistical wind models were developed based on the existing observational wind data for near-space altitudes between 60 000 and 100 000 ft (18–30 km) above ground level (AGL) at two locations, Akron, OH, USA, and White Sands, NM, USA. These two sites are envisioned as playing a crucial role in the first flights of high-altitude airships. The analysis shown in this paper has not been previously applied to this region of the stratosphere for such an application. Standard statistics were compiled for these data such as mean, median, maximum wind speed, and standard deviation, and the data were modeled with Weibull distributions. These statistics indicated, on a yearly average, there is a lull or a “knee” in the wind between 65 000 and 72 000 ft AGL (20–22 km). From the standard statistics, trends at both locations indicated substantial seasonal variation in the mean wind speed at these heights. The yearly and monthly statistical modeling indicated that Weibull distributions were a reasonable model for the data. Forecasts and hindcasts were done by using a Weibull model based on 2004 data and comparing the model with the 2003 and 2005 data. The 2004 distribution was also a reasonable model for these years. Lastly, the Weibull distribution and cumulative function were used to predict the 50%, 95%, and 99% winds, which are directly related to the expected power requirements of a near-space station-keeping airship. These values indicated that using only the standard deviation of the mean may underestimate the operational conditions.
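
    The step from a fitted Weibull distribution to the 50%, 95%, and 99% winds can be sketched with the closed-form quantile function. The shape and scale values below are hypothetical placeholders, not the fitted parameters from the paper, and the function names are illustrative.

```python
import math


def weibull_cdf(speed, shape, scale):
    """Probability that the wind speed does not exceed `speed`."""
    return 1.0 - math.exp(-((speed / scale) ** shape))


def weibull_quantile(prob, shape, scale):
    """Wind speed that is not exceeded with probability `prob`."""
    if not 0.0 <= prob < 1.0:
        raise ValueError("prob must lie in [0, 1)")
    return scale * (-math.log(1.0 - prob)) ** (1.0 / shape)


# Hypothetical Weibull parameters: shape k = 2.0, scale c = 15.0 m/s.
SHAPE, SCALE = 2.0, 15.0
for prob in (0.50, 0.95, 0.99):
    print(prob, round(weibull_quantile(prob, SHAPE, SCALE), 2))
```

    Because the upper quantiles come from the tail of the distribution, they can sit well above the mean plus one standard deviation, which is consistent with the abstract's closing caution.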

  19. Analysis of photon statistics with Silicon Photomultiplier

    International Nuclear Information System (INIS)

    D'Ascenzo, N.; Saveliev, V.; Wang, L.; Xie, Q.

    2015-01-01

    The Silicon Photomultiplier (SiPM) is a novel silicon-based photodetector, which represents the modern perspective of low photon flux detection. The aim of this paper is to provide an introduction to the statistical analysis methods needed to correctly describe and quantitatively estimate the response of the SiPM to a coherent source of light.

  20. Development of statistical analysis code for meteorological data (W-View)

    International Nuclear Information System (INIS)

    Tachibana, Haruo; Sekita, Tsutomu; Yamaguchi, Takenori

    2003-03-01

    A computer code (W-View: Weather View) was developed to analyze meteorological data statistically based on 'the guideline of meteorological statistics for the safety analysis of nuclear power reactor' (Nuclear Safety Commission on January 28, 1982; revised on March 29, 2001). The code provides the statistical meteorological data needed to assess the public dose under normal operation and severe accident conditions when applying for a nuclear reactor operating license. The code was revised from the original code, which ran on a large office computer, to enable a personal computer user to analyze the meteorological data simply and conveniently and to produce statistical tables and figures of meteorology. (author)

  1. Statistical analysis of the Ft. Calhoun reactor coolant pump system

    International Nuclear Information System (INIS)

    Heising, Carolyn D.

    1998-01-01

    In engineering science, statistical quality control techniques have traditionally been applied to control manufacturing processes. An application to commercial nuclear power plant maintenance and control is presented that can greatly improve plant safety. As a demonstration of such an approach to plant maintenance and control, a specific system is analyzed: the reactor coolant pumps (RCPs) of the Ft. Calhoun nuclear power plant. This research uses capability analysis, Shewhart X-bar, R-charts, canonical correlation methods, and design of experiments to analyze the process for the state of statistical control. The results obtained show that six out of ten parameters are under control specification limits and four parameters are not in the state of statistical control. The analysis shows that statistical process control methods can be applied as an early warning system capable of identifying significant equipment problems well in advance of traditional control room alarm indicators. Such a system would provide operators with ample time to respond to possible emergency situations and thus improve plant safety and reliability. (author)
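
    For readers unfamiliar with the Shewhart X-bar and R charts named in this abstract, the control limits can be sketched as follows. The subgroup readings are invented for illustration and are not plant measurements; A2, D3 and D4 are the standard tabulated chart constants for subgroups of size 5. Computing the limits from all subgroups, including the shifted one, is a simplification of this sketch.

```python
from statistics import mean

# Standard Shewhart constants for subgroups of size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return (LCL, center, UCL) for the X-bar chart and for the R chart."""
    xbars = [mean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean, mean_range = mean(xbars), mean(ranges)
    xbar_limits = (grand_mean - A2 * mean_range, grand_mean, grand_mean + A2 * mean_range)
    r_limits = (D3 * mean_range, mean_range, D4 * mean_range)
    return xbar_limits, r_limits


def out_of_control(subgroups):
    """Indices of subgroups whose mean falls outside the X-bar limits."""
    (lcl, _, ucl), _ = xbar_r_limits(subgroups)
    return [i for i, g in enumerate(subgroups) if not lcl <= mean(g) <= ucl]


# Hypothetical readings: four stable subgroups and one shifted subgroup.
data = [
    [10.0, 10.2, 9.8, 10.1, 9.9],
    [10.1, 9.9, 10.0, 10.2, 9.8],
    [9.9, 10.0, 10.1, 9.8, 10.2],
    [10.0, 10.1, 9.9, 10.2, 9.8],
    [11.0, 11.2, 10.8, 11.1, 10.9],
]
print(xbar_r_limits(data))
print(out_of_control(data))
```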

  2. Genetics Home Reference: homocystinuria

    Science.gov (United States)

    ... an increased risk of abnormal blood clotting, and brittle bones that are prone to fracture ( osteoporosis ) or other ... information about a genetic condition can statistics provide? Why are some genetic conditions more common in particular ...

  3. Genetic polymorphisms of pharmacogenomic VIP variants in the Yi population from China.

    Science.gov (United States)

    Yan, Mengdan; Li, Dianzhen; Zhao, Guige; Li, Jing; Niu, Fanglin; Li, Bin; Chen, Peng; Jin, Tianbo

    2018-03-30

    Drug response and target therapeutic dosage are different among individuals. The variability is largely genetically determined. With the development of pharmacogenetics and pharmacogenomics, widespread research has provided us with a wealth of information on drug-related genetic polymorphisms, and the very important pharmacogenetic (VIP) variants have been identified for the major populations around the world, whereas less is known regarding minorities in China, including the Yi ethnic group. Our research aims to screen the potential pharmacogenomic variants in the Yi population and provide a theoretical basis for future medication guidance. In the present study, 80 VIP variants (selected from the PharmGKB database) were genotyped in 100 unrelated and healthy Yi adults recruited for our research. Through statistical analysis, we compared the Yi with 11 other populations listed in the HapMap database to detect significant SNPs. Two specific SNPs were subsequently examined for their global allele distribution, with the frequencies downloaded from the ALlele FREquency Database. Moreover, F-statistics (Fst), genetic structure and phylogenetic tree analyses were conducted to determine the genetic similarity between the 12 ethnic groups. Using χ² tests, rs1128503 (ABCB1), rs7294 (VKORC1), rs9934438 (VKORC1), rs1540339 (VDR) and rs689466 (PTGS2) were identified as the significantly different loci for further analysis. The global allele distribution revealed that the allele "A" of rs1540339 and rs9934438 was more frequent in Yi people, which was consistent with most populations in East Asia. F-statistics (Fst), genetic structure and phylogenetic tree analyses demonstrated that the Yi and CHD shared the closest relationship in their genetic backgrounds. Additionally, Yi was considered similar to the Han people from Shaanxi province among the domestic ethnic populations in China. Our results demonstrated significant differences on

  4. Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers

    Science.gov (United States)

    Keiffer, Greggory L.; Lane, Forrest C.

    2016-01-01

    Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…

  5. Simulation Experiments in Practice: Statistical Design and Regression Analysis

    OpenAIRE

    Kleijnen, J.P.C.

    2007-01-01

    In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic DOE and regression analysis assume a single simulation response that is normally and independen...

  6. Statistical analysis of thermal conductivity of nanofluid containing ...

    Indian Academy of Sciences (India)

    Thermal conductivity measurements of nanofluids were analysed via two-factor completely randomized design and comparison of data means is carried out with Duncan's multiple-range test. Statistical analysis of experimental data shows that temperature and weight fraction have a reasonable impact on the thermal ...

  7. A novel genetic tool for clonal analysis of fourth chromosome mutations

    OpenAIRE

    Sousa-Neves, Rui; Schinaman, Joseph M.

    2012-01-01

    The fourth chromosome of Drosophila remains one of the most intractable regions of the fly genome to genetic analysis. The main difficulty posed to the genetic analyses of mutations on this chromosome arises from the fact that it does not undergo meiotic recombination, which makes recombination mapping impossible, and also prevents clonal analysis of mutations, a technique which relies on recombination to introduce the prerequisite recessive markers and FLP-recombinase recognition targets (FR...

  8. Longitudinal data analysis a handbook of modern statistical methods

    CERN Document Server

    Fitzmaurice, Garrett; Verbeke, Geert; Molenberghs, Geert

    2008-01-01

    Although many books currently available describe statistical models and methods for analyzing longitudinal data, they do not highlight connections between various research threads in the statistical literature. Responding to this void, Longitudinal Data Analysis provides a clear, comprehensive, and unified overview of state-of-the-art theory and applications. It also focuses on the assorted challenges that arise in analyzing longitudinal data. After discussing historical aspects, leading researchers explore four broad themes: parametric modeling, nonparametric and semiparametric methods, joint

  9. Mathematical statistics

    CERN Document Server

    Pestman, Wiebe R

    2009-01-01

    This textbook provides a broad and solid introduction to mathematical statistics, including the classical subjects hypothesis testing, normal regression analysis, and normal analysis of variance. In addition, non-parametric statistics and vectorial statistics are considered, as well as applications of stochastic analysis in modern statistics, e.g., Kolmogorov-Smirnov testing, smoothing techniques, robustness and density estimation. For students with some elementary mathematical background. With many exercises. Prerequisites from measure theory and linear algebra are presented.

  10. Analysis of genetic polymorphism of nine short tandem repeat loci in ...

    African Journals Online (AJOL)

    Yomi

    2012-03-15

    Mar 15, 2012 ... Key words: short tandem repeat, repeat motif, genetic polymorphism, Han population, forensic genetics. INTRODUCTION. Short tandem repeat (STR) is widely .... Data analysis. The exact test of Hardy-Weinberg equilibrium was conducted with. Arlequin version 3.5 software (Computational and Molecular.

  11. Bayesian Sensitivity Analysis of Statistical Models with Missing Data.

    Science.gov (United States)

    Zhu, Hongtu; Ibrahim, Joseph G; Tang, Niansheng

    2014-04-01

    Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.

  12. Advanced data analysis in neuroscience integrating statistical and computational models

    CERN Document Server

    Durstewitz, Daniel

    2017-01-01

    This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanatory frameworks, but become powerfu...

  13. Quantitative analysis and IBM SPSS statistics a guide for business and finance

    CERN Document Server

    Aljandali, Abdulkader

    2016-01-01

    This guide is for practicing statisticians and data scientists who use IBM SPSS for statistical analysis of big data in business and finance. This is the first of a two-part guide to SPSS for Windows, introducing data entry into SPSS, along with elementary statistical and graphical methods for summarizing and presenting data. Part I also covers the rudiments of hypothesis testing and business forecasting while Part II will present multivariate statistical methods, more advanced forecasting methods, and multivariate methods. IBM SPSS Statistics offers a powerful set of statistical and information analysis systems that run on a wide variety of personal computers. The software is built around routines that have been developed, tested, and widely used for more than 20 years. As such, IBM SPSS Statistics is extensively used in industry, commerce, banking, local and national governments, and education. Just a small subset of users of the package include the major clearing banks, the BBC, British Gas, British Airway...

  14. What type of statistical model to choose for the analysis of radioimmunoassays

    International Nuclear Information System (INIS)

    Huet, S.

    1984-01-01

    The current techniques used for statistical analysis of radioimmunoassays are not very satisfactory for either the statistician or the biologist. They are based on an attempt to make the response curve linear to avoid complicated computations. The present article shows that this practice has considerable effects (often neglected) on the statistical assumptions which must be formulated. A more strict analysis is proposed by applying the four-parameter logistic model. The advantages of this method are: the statistical assumptions formulated are based on observed data, and the model can be applied to almost all radioimmunoassays.
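
    The four-parameter logistic model referred to in this abstract is commonly written as y = d + (a - d) / (1 + (x/c)^b), where a is the response at zero dose, d the response at infinite dose, c the mid-point dose and b a slope factor. The sketch below uses this common parameterization, which may differ from the notation in the article; the parameter values are hypothetical.

```python
def four_pl(x, a, b, c, d):
    """Expected response at dose x under the four-parameter logistic model."""
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_pl(y, a, b, c, d):
    """Dose that gives expected response y (y strictly between a and d)."""
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)


# Hypothetical calibration parameters, for illustration only.
params = dict(a=10000.0, b=1.2, c=50.0, d=500.0)

response = four_pl(20.0, **params)
print(response, inverse_four_pl(response, **params))
```

    Working with the curve directly, rather than linearizing it first, is the approach the abstract argues for; in practice the four parameters would be estimated from calibration standards by nonlinear least squares.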

  15. Coherent spectroscopic methods for monitoring pathogens, genetically modified products and nanostructured materials in colloidal solution

    International Nuclear Information System (INIS)

    Moguilnaya, T.; Suminov, Y.; Botikov, A.; Ignatov, S.; Kononenko, A.; Agibalov, A.

    2017-01-01

    We developed a new automatic method that combines the method of forced luminescence and stimulated Brillouin scattering. This method is used for monitoring pathogens, genetically modified products and nanostructured materials in colloidal solution. We carried out the statistical spectral analysis of pathogens, genetically modified soy and nano-particles of silver in water from different regions in order to determine the statistical errors of the method. We studied spectral characteristics of these objects in water to perform the initial identification with 95% probability. These results were used to create a model of a device for monitoring pathogenic organisms and a working model of a device for detecting genetically modified soy in meat.

  16. Isolation and genetic analysis of pure cells from forensic biological mixtures: The precision of a digital approach.

    Science.gov (United States)

    Fontana, F; Rapone, C; Bregola, G; Aversa, R; de Meo, A; Signorini, G; Sergio, M; Ferrarini, A; Lanzellotto, R; Medoro, G; Giorgini, G; Manaresi, N; Berti, A

    2017-07-01

    The latest genotyping technologies make it possible to obtain a reliable genetic profile for offender identification even from extremely minute biological evidence. The ultimate challenge occurs when genetic profiles need to be retrieved from a mixture, which is composed of biological material from two or more individuals. In this case, DNA profiling will often result in a complex genetic profile, which is then subject matter for statistical analysis. In principle, when more individuals contribute to a mixture with different biological fluids, their single genetic profiles can be obtained by separating the distinct cell types (e.g. epithelial cells, blood cells, sperm), prior to genotyping. Different approaches have been investigated for this purpose, such as fluorescence-activated cell sorting (FACS) or laser capture microdissection (LCM), but currently none of these methods can guarantee the complete separation of the different types of cells present in a mixture. In other fields of application, such as oncology, DEPArray™ technology, an image-based, microfluidic digital sorter, has been widely proven to enable the separation of pure cells, with single-cell precision. This study investigates the applicability of DEPArray™ technology to forensic samples analysis, focusing on the resolution of the forensic mixture problem. For the first time, we report here the development of an application-specific DEPArray™ workflow enabling the detection and recovery of pure homogeneous cell pools from simulated blood/saliva and semen/saliva mixtures, providing full genetic match with genetic profiles of corresponding donors. In addition, we assess the performance of standard forensic methods for DNA quantitation and genotyping on low-count, DEPArray™-isolated cells, showing that pure, almost complete profiles can be obtained from as few as ten haploid cells. Finally, we explore the applicability in real casework samples, demonstrating that the described approach provides complete

  17. Computerized statistical analysis with bootstrap method in nuclear medicine

    International Nuclear Information System (INIS)

    Zoccarato, O.; Sardina, M.; Zatta, G.; De Agostini, A.; Barbesti, S.; Mana, O.; Tarolo, G.L.

    1988-01-01

    Statistical analysis of data samples involves some hypotheses about the features of the data themselves. The accuracy of these hypotheses can influence the results of statistical inference. Among the new methods of computer-aided statistical analysis, the bootstrap method appears to be one of the most powerful, thanks to its ability to generate many artificial samples from a single original sample and because it works without hypotheses about the data distribution. The authors applied the bootstrap method to two typical situations in a nuclear medicine department. The determination of the normal range of serum ferritin, as assessed by radioimmunoassay and defined by the mean value ±2 standard deviations, starting from a small experimental sample, gives an unacceptable lower limit (plasma ferritin levels below zero). By contrast, the results obtained from 5000 bootstrap samples give an interval of values (10.95 ng/ml - 72.87 ng/ml) corresponding to the normal ranges commonly reported. Moreover, the authors applied the bootstrap method to evaluate the possible error associated with the correlation coefficient between left ventricular ejection fraction (LVEF) values obtained by first-pass radionuclide angiocardiography with 99mTc and 195mAu. The results obtained indicate a high degree of statistical correlation and give the range of r² values to be considered acceptable for this type of study.
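
    The contrast between the mean ± 2 SD range and a bootstrap range can be illustrated with a short Python sketch. The abstract does not say how the authors derived their interval from the 5000 resamples, so averaging the 2.5th and 97.5th percentiles of each resample is an assumption of this sketch, and the ferritin values are invented rather than taken from the paper.

```python
import random
from statistics import mean, stdev


def percentile(sorted_values, p):
    """Linearly interpolated percentile (0 <= p <= 1) of a sorted list."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def bootstrap_reference_range(sample, n_boot=5000, seed=0):
    """Average the 2.5th and 97.5th percentiles over bootstrap resamples."""
    rng = random.Random(seed)
    lows, highs = [], []
    for _ in range(n_boot):
        # Resample with replacement, keeping the original sample size.
        resample = sorted(rng.choices(sample, k=len(sample)))
        lows.append(percentile(resample, 0.025))
        highs.append(percentile(resample, 0.975))
    return mean(lows), mean(highs)


# Hypothetical right-skewed ferritin values in ng/ml (not the paper's data).
ferritin = [12, 15, 18, 20, 22, 25, 30, 35, 40, 110]

parametric = (mean(ferritin) - 2 * stdev(ferritin), mean(ferritin) + 2 * stdev(ferritin))
bootstrap = bootstrap_reference_range(ferritin)
print("mean +/- 2 SD:", parametric)
print("bootstrap:    ", bootstrap)
```

    With skewed data such as these, the parametric lower limit is negative, whereas percentile-based bootstrap limits cannot fall outside the observed values.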

  18. Software for statistical data analysis used in Higgs searches

    International Nuclear Information System (INIS)

    Gumpert, Christian; Moneta, Lorenzo; Cranmer, Kyle; Kreiss, Sven; Verkerke, Wouter

    2014-01-01

    The analysis and interpretation of data collected by the Large Hadron Collider (LHC) requires advanced statistical tools in order to quantify the agreement between observation and theoretical models. RooStats is a project providing a statistical framework for data analysis with the focus on discoveries, confidence intervals and combination of different measurements in both Bayesian and frequentist approaches. It employs the RooFit data modelling language where mathematical concepts such as variables, (probability density) functions and integrals are represented as C++ objects. RooStats and RooFit rely on the persistency technology of the ROOT framework. The usage of a common data format enables the concept of digital publishing of complicated likelihood functions. The statistical tools have been developed in close collaboration with the LHC experiments to ensure their applicability to real-life use cases. Numerous physics results have been produced using the RooStats tools, with the discovery of the Higgs boson by the ATLAS and CMS experiments being certainly the most popular among them. We will discuss tools currently used by LHC experiments to set exclusion limits, to derive confidence intervals and to estimate discovery significances based on frequentist statistics and the asymptotic behaviour of likelihood functions. Furthermore, new developments in RooStats and performance optimisation necessary to cope with complex models depending on more than 1000 variables will be reviewed

  19. PRECISE - pregabalin in addition to usual care: Statistical analysis plan

    NARCIS (Netherlands)

    S. Mathieson (Stephanie); L. Billot (Laurent); C. Maher (Chris); A.J. McLachlan (Andrew J.); J. Latimer (Jane); B.W. Koes (Bart); M.J. Hancock (Mark J.); I. Harris (Ian); R.O. Day (Richard O.); J. Pik (Justin); S. Jan (Stephen); C.-W.C. Lin (Chung-Wei Christine)

    2016-01-01

    Background: Sciatica is a severe, disabling condition that lacks high quality evidence for effective treatment strategies. This a priori statistical analysis plan describes the methodology of analysis for the PRECISE study. Methods/design: PRECISE is a prospectively registered, double

  20. Statistical margin to DNB safety analysis approach for LOFT

    International Nuclear Information System (INIS)

    Atkinson, S.A.

    1982-01-01

    A method was developed and used for LOFT thermal safety analysis to estimate the statistical margin to DNB for the hot rod, and to base safety analysis on desired DNB probability limits. This method is an advanced approach using response surface analysis methods, a very efficient experimental design, and a 2nd-order response surface equation with a 2nd-order error propagation analysis to define the MDNBR probability density function. Calculations for limiting transients were used in the response surface analysis thereby including transient interactions and trip uncertainties in the MDNBR probability density

  1. Multivariate statistical analysis of atom probe tomography data

    International Nuclear Information System (INIS)

    Parish, Chad M.; Miller, Michael K.

    2010-01-01

    The application of spectrum imaging multivariate statistical analysis methods, specifically principal component analysis (PCA), to atom probe tomography (APT) data has been investigated. The mathematical method of analysis is described and the results for two example datasets are analyzed and presented. The first dataset is from the analysis of a PM 2000 Fe-Cr-Al-Ti steel containing two different ultrafine precipitate populations. PCA properly describes the matrix and precipitate phases in a simple and intuitive manner. A second APT example is from the analysis of an irradiated reactor pressure vessel steel. Fine, nm-scale Cu-enriched precipitates having a core-shell structure were identified and qualitatively described by PCA. Advantages, disadvantages, and future prospects for implementing these data analysis methodologies for APT datasets, particularly with regard to quantitative analysis, are also discussed.

  2. Development of statistical analysis code for meteorological data (W-View)

    Energy Technology Data Exchange (ETDEWEB)

    Tachibana, Haruo; Sekita, Tsutomu; Yamaguchi, Takenori [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan). Tokai Research Establishment

    2003-03-01

    A computer code (W-View: Weather View) was developed to analyze the meteorological data statistically based on 'the guideline of meteorological statistics for the safety analysis of nuclear power reactor' (Nuclear Safety Commission on January 28, 1982; revised on March 29, 2001). The code gives statistical meteorological data to assess the public dose in case of normal operation and severe accident to get the license of nuclear reactor operation. This code was revised from the original code used in a large office computer code to enable a personal computer user to analyze the meteorological data simply and conveniently and to make the statistical data tables and figures of meteorology. (author)

  3. An Entropy-Based Statistic for Genomewide Association Studies

    OpenAIRE

    Zhao, Jinying; Boerwinkle, Eric; Xiong, Momiao

    2005-01-01

    Efficient genotyping methods and the availability of a large collection of single-nucleotide polymorphisms provide valuable tools for genetic studies of human disease. The standard χ² statistic for case-control studies, which uses a linear function of allele frequencies, has limited power when the number of marker loci is large. We introduce a novel test statistic for genetic association studies that uses Shannon entropy and a nonlinear function of allele frequencies to amplify the difference...

  4. Logic analysis and verification of n-input genetic logic circuits

    DEFF Research Database (Denmark)

    Baig, Hasan; Madsen, Jan

    2017-01-01

    . In this paper, we present an approach to analyze and verify the Boolean logic of a genetic circuit from the data obtained through stochastic analog circuit simulations. The usefulness of this analysis is demonstrated through different case studies illustrating how our approach can be used to verify the expected......Nature is using genetic logic circuits to regulate the fundamental processes of life. These genetic logic circuits are triggered by a combination of external signals, such as chemicals, proteins, light and temperature, to emit signals to control other gene expressions or metabolic pathways...... accordingly. As compared to electronic circuits, genetic circuits exhibit stochastic behavior and do not always behave as intended. Therefore, there is a growing interest in being able to analyze and verify the logical behavior of a genetic circuit model, prior to its physical implementation in a laboratory...

  5. The extended statistical analysis of toxicity tests using standardised effect sizes (SESs): a comparison of nine published papers.

    Science.gov (United States)

    Festing, Michael F W

    2014-01-01

    The safety of chemicals, drugs, novel foods and genetically modified crops is often tested using repeat-dose sub-acute toxicity tests in rats or mice. It is important to avoid misinterpretations of the results as these tests are used to help determine safe exposure levels in humans. Treated and control groups are compared for a range of haematological, biochemical and other biomarkers which may indicate tissue damage or other adverse effects. However, the statistical analysis and presentation of such data pose problems due to the large number of statistical tests which are involved. Often, it is not clear whether a "statistically significant" effect is real or a false positive (type I error) due to sampling variation. The authors' conclusions appear to be reached somewhat subjectively by the pattern of statistical significances, discounting those which they judge to be type I errors and ignoring any biomarker where the p-value is greater than p = 0.05. However, by using standardised effect sizes (SESs) a range of graphical methods and an over-all assessment of the mean absolute response can be made. The approach is an extension, not a replacement of existing methods. It is intended to assist toxicologists and regulators in the interpretation of the results. Here, the SES analysis has been applied to data from nine published sub-acute toxicity tests in order to compare the findings with those of the authors. Line plots, box plots and bar plots show the pattern of response. Dose-response relationships are easily seen. A "bootstrap" test compares the mean absolute differences across dose groups. In four out of seven papers where the no observed adverse effect level (NOAEL) was estimated by the authors, it was set too high according to the bootstrap test, suggesting that possible toxicity is under-estimated.
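
    A standardised effect size puts every biomarker on a common scale so that responses can be plotted and averaged together. The sketch below assumes the SES is the difference between the treated and control means divided by the pooled standard deviation; the abstract does not spell out the definition, and the biomarker names and readings are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardised_effect_size(treated, control):
    """Difference in means divided by the pooled standard deviation."""
    n_t, n_c = len(treated), len(control)
    pooled_var = ((n_t - 1) * variance(treated) + (n_c - 1) * variance(control)) / (n_t + n_c - 2)
    return (mean(treated) - mean(control)) / sqrt(pooled_var)


def mean_absolute_ses(biomarkers):
    """Average |SES| over all biomarkers for one dose group."""
    return mean(abs(standardised_effect_size(t, c)) for t, c in biomarkers.values())


# Hypothetical biomarker readings as (treated, control) pairs.
biomarkers = {
    "biomarker_A": ([5.0, 6.0, 7.0], [3.0, 4.0, 5.0]),
    "biomarker_B": ([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]),
}
for name, (treated, control) in biomarkers.items():
    print(name, standardised_effect_size(treated, control))
print("mean |SES|:", mean_absolute_ses(biomarkers))
```

    A mean absolute SES computed this way for each dose group is the kind of summary that a bootstrap test could then compare across groups.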

  6. The extended statistical analysis of toxicity tests using standardised effect sizes (SESs): a comparison of nine published papers.

    Directory of Open Access Journals (Sweden)

    Michael F W Festing

    The safety of chemicals, drugs, novel foods and genetically modified crops is often tested using repeat-dose sub-acute toxicity tests in rats or mice. It is important to avoid misinterpretations of the results as these tests are used to help determine safe exposure levels in humans. Treated and control groups are compared for a range of haematological, biochemical and other biomarkers which may indicate tissue damage or other adverse effects. However, the statistical analysis and presentation of such data pose problems due to the large number of statistical tests which are involved. Often, it is not clear whether a "statistically significant" effect is real or a false positive (type I error) due to sampling variation. The authors' conclusions appear to be reached somewhat subjectively by the pattern of statistical significances, discounting those which they judge to be type I errors and ignoring any biomarker where the p-value is greater than p = 0.05. However, by using standardised effect sizes (SESs) a range of graphical methods and an over-all assessment of the mean absolute response can be made. The approach is an extension, not a replacement of existing methods. It is intended to assist toxicologists and regulators in the interpretation of the results. Here, the SES analysis has been applied to data from nine published sub-acute toxicity tests in order to compare the findings with those of the authors. Line plots, box plots and bar plots show the pattern of response. Dose-response relationships are easily seen. A "bootstrap" test compares the mean absolute differences across dose groups. In four out of seven papers where the no observed adverse effect level (NOAEL) was estimated by the authors, it was set too high according to the bootstrap test, suggesting that possible toxicity is under-estimated.

  7. Analysis and design of a genetic circuit for dynamic metabolic engineering.

    Science.gov (United States)

    Anesiadis, Nikolaos; Kobayashi, Hideki; Cluett, William R; Mahadevan, Radhakrishnan

    2013-08-16

    Recent advances in synthetic biology have equipped us with new tools for bioprocess optimization at the genetic level. Previously, we have presented an integrated in silico design for the dynamic control of gene expression based on a density-sensing unit and a genetic toggle switch. In the present paper, analysis of a serine-producing Escherichia coli mutant shows that an instantaneous ON-OFF switch leads to a maximum theoretical productivity improvement of 29.6% compared to the mutant. To further the design, global sensitivity analysis is applied here to a mathematical model of serine production in E. coli coupled with a genetic circuit. The model of the quorum sensing and the toggle switch involves 13 parameters of which 3 are identified as having a significant effect on serine concentration. Simulations conducted in this reduced parameter space further identified the optimal ranges for these 3 key parameters to achieve productivity values close to the maximum theoretical values. This analysis can now be used to guide the experimental implementation of a dynamic metabolic engineering strategy and reduce the time required to design the genetic circuit components.

  8. Proper joint analysis of summary association statistics requires the adjustment of heterogeneity in SNP coverage pattern.

    Science.gov (United States)

    Zhang, Han; Wheeler, William; Song, Lei; Yu, Kai

    2017-07-07

    As meta-analysis results published by consortia of genome-wide association studies (GWASs) become increasingly available, many association summary statistics-based multi-locus tests have been developed to jointly evaluate multiple single-nucleotide polymorphisms (SNPs) to reveal novel genetic architectures of various complex traits. The validity of these approaches relies on the accurate estimate of z-score correlations at considered SNPs, which in turn requires knowledge on the set of SNPs assessed by each study participating in the meta-analysis. However, this exact SNP coverage information is usually unavailable from the meta-analysis results published by GWAS consortia. In the absence of the coverage information, researchers typically estimate the z-score correlations by making oversimplified coverage assumptions. We show through real studies that such a practice can generate highly inflated type I errors, and we demonstrate the proper way to incorporate correct coverage information into multi-locus analyses. We advocate that consortia should make SNP coverage information available when posting their meta-analysis results, and that investigators who develop analytic tools for joint analyses based on summary data should pay attention to the variation in SNP coverage and adjust for it appropriately. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.

  9. CORSSA: Community Online Resource for Statistical Seismicity Analysis

    Science.gov (United States)

    Zechar, J. D.; Hardebeck, J. L.; Michael, A. J.; Naylor, M.; Steacy, S.; Wiemer, S.; Zhuang, J.

    2011-12-01

    Statistical seismology is critical to the understanding of seismicity, the evaluation of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology (especially to those aspects with great impact on public policy), statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA, www.corssa.org). We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each of which will contain between four and eight articles. CORSSA now includes seven articles with an additional six in draft form along with forums for discussion, a glossary, and news about upcoming meetings, special issues, and recent papers. Each article is peer-reviewed and presents a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles include: introductions to both CORSSA and statistical seismology; basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. We have also begun curating a collection of statistical seismology software packages.

  10. Recent advances in statistical energy analysis

    Science.gov (United States)

    Heron, K. H.

    1992-01-01

    Statistical Energy Analysis (SEA) has traditionally been developed using a modal summation and averaging approach, which has led to the need for many restrictive SEA assumptions. The assumption of 'weak coupling' is particularly unacceptable when attempts are made to apply SEA to structural coupling. It is now believed that this assumption is more a function of the modal formulation than a necessary feature of SEA. The present analysis ignores this restriction and describes a wave approach to the calculation of plate-plate coupling loss factors. Predictions based on this method are compared with results obtained from experiments using point excitation on one side of an irregular six-sided box structure. Conclusions show that the use and calculation of infinite transmission coefficients is the way forward for the development of a purely predictive SEA code.

  11. Statistical analysis of tourism destination competitiveness

    Directory of Open Access Journals (Sweden)

    Attilio Gardini

    2013-05-01

    The growing relevance of the tourism industry for modern advanced economies has increased the interest among researchers and policy makers in the statistical analysis of destination competitiveness. In this paper we outline a new model of destination competitiveness based on sound theoretical grounds and we develop a statistical test of the model on sample data based on Italian tourist destination decisions and choices. Our model focuses on the tourism decision process, which starts from the demand schedule for holidays and ends with the choice of a specific holiday destination. The demand schedule is a function of individual preferences and of destination positioning, while the final decision is a function of the initial demand schedule and the information concerning services for accommodation and recreation in the selected destinations. Moreover, we extend previous studies that focused on image or attributes (such as climate and scenery) by paying more attention to the services for accommodation and recreation in the holiday destinations. We test the proposed model using empirical data collected from a sample of 1,200 Italian tourists interviewed in 2007 (October to December). Data analysis shows that the selection probability for the destination included in the consideration set is not proportional to the share of inclusion, because the share of inclusion is determined by the brand image, while the selection of the effective holiday destination is influenced by the real supply conditions. The analysis of Italian tourists' preferences underlines the existence of a latent demand for foreign holidays, which points to a risk of market share reduction for the Italian tourism system in the global market. We also find a snowball effect which helps the most popular destinations, mainly in the northern Italian regions.

  12. Visual and statistical analysis of 18F-FDG PET in primary progressive aphasia

    International Nuclear Information System (INIS)

    Matias-Guiu, Jordi A.; Moreno-Ramos, Teresa; Garcia-Ramos, Rocio; Fernandez-Matarrubia, Marta; Oreja-Guevara, Celia; Matias-Guiu, Jorge; Cabrera-Martin, Maria Nieves; Perez-Castejon, Maria Jesus; Rodriguez-Rey, Cristina; Ortega-Candil, Aida; Carreras, Jose Luis

    2015-01-01

    Diagnosing primary progressive aphasia (PPA) and its variants is of great clinical importance, and fluorodeoxyglucose (FDG) positron emission tomography (PET) may be a useful diagnostic technique. The purpose of this study was to evaluate interobserver variability in the interpretation of FDG PET images in PPA as well as the diagnostic sensitivity and specificity of the technique. We also aimed to compare visual and statistical analyses of these images. There were 10 raters who analysed 44 FDG PET scans from 33 PPA patients and 11 controls. Five raters analysed the images visually, while the other five used maps created using Statistical Parametric Mapping software. Two spatial normalization procedures were performed: global mean normalization and cerebellar normalization. Clinical diagnosis was considered the gold standard. Inter-rater concordance was moderate for visual analysis (Fleiss' kappa 0.568) and substantial for statistical analysis (kappa 0.756-0.881). Agreement was good for all three variants of PPA except for the nonfluent/agrammatic variant studied with visual analysis. The sensitivity and specificity of each rater's diagnosis of PPA was high, averaging 87.8 and 89.9 % for visual analysis and 96.9 and 90.9 % for statistical analysis using global mean normalization, respectively. In cerebellar normalization, sensitivity was 88.9 % and specificity 100 %. FDG PET demonstrated high diagnostic accuracy for the diagnosis of PPA and its variants. Inter-rater concordance was higher for statistical analysis, especially for the nonfluent/agrammatic variant. These data support the use of FDG PET to evaluate patients with PPA and show that statistical analysis methods are particularly useful for identifying the nonfluent/agrammatic variant of PPA. (orig.)
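    The agreement and accuracy measures named in this abstract can be sketched as follows. This is a generic illustration of Fleiss' kappa, sensitivity, and specificity; the function names and the example counts are hypothetical and are not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned subject i to category j (same number of raters per subject)."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must be rated by the same number of raters")
    n_categories = len(counts[0])
    # Share of all assignments that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each subject, then averaged.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity and specificity against a gold-standard diagnosis."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical ratings: 4 scans, 5 raters, categories (PPA, control).
ratings = [[5, 0], [4, 1], [0, 5], [1, 4]]
kappa = fleiss_kappa(ratings)
# Hypothetical counts for one rater: 33 patients and 11 controls.
sens, spec = sensitivity_specificity(tp=30, fn=3, tn=10, fp=1)
print(kappa, sens, spec)
```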

  13. Genetic analysis of Aedes albopictus (Diptera, Culicidae) reveals a deep divergence in the original regions.

    Science.gov (United States)

    Ruiling, Zhang; Tongkai, Liu; Zhendong, Huang; Guifen, Zhuang; Dezhen, Ma; Zhong, Zhang

    2018-05-02

    Aedes albopictus has been described as one of the 100 worst invasive species in the world. This mosquito originated from southeastern Asia and currently has a widespread presence in every continent except Antarctica. The rapid global expansion of Ae. albopictus has increased public health concerns about arbovirus-related disease threats. Adaptation to novel areas is a biological challenge for invasive species, and the underlying processes can be studied at the molecular level. In this study, genetic analysis was performed using the mitochondrial gene NADH dehydrogenase subunit 5 (ND5), based on both native and invasive populations. Altogether, 38 haplotypes were detected, with H1 being dominant and widely distributed in 21 countries. Both phylogenetic and network analyses supported the existence of five clades, with only clade I being involved in the subsequent global spread of the Asian tiger mosquito. The other four clades (II, III, IV and V) were restricted to their original regions, and could be ancestral populations that had diverged from clade I in the early stages of evolution. Neutrality tests suggested that most of the populations had experienced recent expansion. Analysis of molecular variance and the population-pair statistic F_ST revealed that most populations lacked genetic structure, while high variability was detected within populations. Multiple and independent human-mediated introductions may explain the present results. Copyright © 2018 Elsevier B.V. All rights reserved.

  14. A quadratically regularized functional canonical correlation analysis for identifying the global structure of pleiotropy with NGS data.

    Science.gov (United States)

    Lin, Nan; Zhu, Yun; Fan, Ruzong; Xiong, Momiao

    2017-10-01

    Investigating the pleiotropic effects of genetic variants can increase statistical power, provide important information to achieve deep understanding of the complex genetic structures of disease, and offer powerful tools for designing effective treatments with fewer side effects. However, the current multiple phenotype association analysis paradigm lacks breadth (number of phenotypes and genetic variants jointly analyzed at the same time) and depth (hierarchical structure of phenotype and genotypes). A key issue for high dimensional pleiotropic analysis is to effectively extract informative internal representation and features from high dimensional genotype and phenotype data. To explore correlation information of genetic variants, effectively reduce data dimensions, and overcome critical barriers in advancing the development of novel statistical methods and computational algorithms for genetic pleiotropic analysis, we proposed a new statistical method referred to as a quadratically regularized functional CCA (QRFCCA) for association analysis which combines three approaches: (1) quadratically regularized matrix factorization, (2) functional data analysis and (3) canonical correlation analysis (CCA). Large-scale simulations show that the QRFCCA has a much higher power than that of the ten competing statistics while retaining the appropriate type 1 errors. To further evaluate performance, the QRFCCA and ten other statistics are applied to the whole genome sequencing dataset from the TwinsUK study. We identify a total of 79 genes with rare variants and 67 genes with common variants significantly associated with the 46 traits using QRFCCA. The results show that the QRFCCA substantially outperforms the ten other statistics.

  15. Population genetic analysis and trichothecene profiling of Fusarium graminearum from wheat in Uruguay.

    Science.gov (United States)

    Pan, D; Mionetto, A; Calero, N; Reynoso, M M; Torres, A; Bettucci, L

    2016-03-11

    Fusarium graminearum sensu stricto (F. graminearum s.s.) is the major causal agent of Fusarium head blight of wheat worldwide, and contaminates grains with trichothecene mycotoxins that cause serious threats to food safety and animal health. An important aspect of managing this pathogen and reducing mycotoxin contamination of wheat is knowledge regarding its population genetics. Therefore, isolates of F. graminearum s.s. from the major wheat-growing region of Uruguay were analyzed by amplified fragment length polymorphism assays, PCR genotyping, and chemical analysis of trichothecene production. Of the 102 isolates identified as having the 15-ADON genotype via PCR genotyping, all were DON producers, but only 41 strains were also 15-ADON producers, as determined by chemical analysis. The populations were genotypically diverse but genetically similar, with significant genetic exchange occurring between them. Analysis of molecular variance indicated that most of the genetic variability resulted from differences between isolates within populations. Multilocus linkage disequilibrium analysis suggested that the isolates had a panmictic population genetic structure and that significant recombination occurs in F. graminearum s.s. In conclusion, our findings provide the first detailed description of the genetic structure and trichothecene production of populations of F. graminearum s.s. from Uruguay, and expand our understanding of the agroecology of F. graminearum and of the correlation between genotypes and trichothecene chemotypes.

  16. Detection of superior genotype of fatty acid synthase in Korean native cattle by an environment-adjusted statistical model

    Directory of Open Access Journals (Sweden)

    Jea-Young Lee

    2017-06-01

    Objective: This study examines the genetic factors influencing the phenotypes (four economic traits: oleic acid [C18:1], monounsaturated fatty acids, carcass weight, and marbling score) of Hanwoo. Methods: To enhance the accuracy of the genetic analysis, the study proposes a new statistical model that excludes environmental factors. A statistically adjusted analysis of covariance model of environmental and genetic factors was developed, and estimated environmental effects (covariate effects of age and effects of calving farms) were excluded from the model. Results: The accuracy was compared before and after adjustment. The accuracy of the best single nucleotide polymorphism (SNP) in C18:1 increased from 60.16% to 74.26%, and that of the two-factor interaction increased from 58.69% to 87.19%. Also, superior SNPs and SNP interactions were identified using the multifactor dimensionality reduction method (Tables 1 to 4). Finally, high- and low-risk genotypes were compared based on their mean scores for each trait. Conclusion: The proposed method significantly improved the analysis accuracy and identified superior gene-gene interactions and genotypes for each of the four economic traits of Hanwoo.

  17. Australasian Resuscitation In Sepsis Evaluation trial statistical analysis plan.

    Science.gov (United States)

    Delaney, Anthony; Peake, Sandra L; Bellomo, Rinaldo; Cameron, Peter; Holdgate, Anna; Howe, Belinda; Higgins, Alisa; Presneill, Jeffrey; Webb, Steve

    2013-10-01

    The Australasian Resuscitation In Sepsis Evaluation (ARISE) study is an international, multicentre, randomised, controlled trial designed to evaluate the effectiveness of early goal-directed therapy compared with standard care for patients presenting to the ED with severe sepsis. In keeping with current practice, and taking into consideration aspects of trial design and reporting specific to non-pharmacologic interventions, this document outlines the principles and methods for analysing and reporting the trial results. The document is prepared prior to completion of recruitment into the ARISE study, without knowledge of the results of the interim analysis conducted by the data safety and monitoring committee and prior to completion of the two related international studies. The statistical analysis plan was designed by the ARISE chief investigators, and reviewed and approved by the ARISE steering committee. The data collected by the research team, as specified in the study protocol and detailed in the study case report form, were reviewed. Information related to baseline characteristics, characteristics of delivery of the trial interventions, details of resuscitation and other related therapies, and other relevant data are described with appropriate comparisons between groups. The primary, secondary and tertiary outcomes for the study are defined, with description of the planned statistical analyses. A statistical analysis plan was developed, along with a trial profile, mock-up tables and figures. A plan for presenting baseline characteristics, microbiological and antibiotic therapy, details of the interventions, processes of care and concomitant therapies, along with adverse events, is described. The primary, secondary and tertiary outcomes are described along with identification of subgroups to be analysed. A statistical analysis plan for the ARISE study has been developed, and is available in the public domain, prior to the completion of recruitment into the

  18. Hemophilia Data and Statistics

    Science.gov (United States)

    ... genetic testing is done to diagnose hemophilia before birth. For the one-third ... rates and hospitalization rates for bleeding complications from hemophilia ...

  19. Genetic Diversity of Rose germplasm based on RAPD analysis

    African Journals Online (AJOL)

    AHSAN IQBAL

    2012-06-12

    Jun 12, 2012 ... identification and analysis of genetic variation within a collection of 4 species and 30 accessions of rose using RAPD analysis technique. The results showed the molecular distinctions among the ... that range in colour from white and yellow to many shades of pink and red have been developed. Since.

  20. Genetic Diversity of Acacia mangium Seed Orchard in Wonogiri Indonesia Using Microsatellite Markers

    Directory of Open Access Journals (Sweden)

    VIVI YUSKIANTI

    2012-09-01

    Genetic diversity is important in tree improvement programs. To evaluate levels of genetic diversity of a first generation Acacia mangium seedling seed orchard in Wonogiri, Central Java, Indonesia, three populations from each region of Papua New Guinea (PNG) and Queensland, Australia (QLD) were selected and analyzed using 25 microsatellite markers. Statistical analysis showed that PNG populations have a higher number of detected alleles and a higher level of genetic diversity than QLD populations. This study provides basic information about the genetic background of the populations used in the development of an A. mangium seed orchard in Indonesia.

  1. Measuring the Success of an Academic Development Programme: A Statistical Analysis

    Science.gov (United States)

    Smith, L. C.

    2009-01-01

    This study uses statistical analysis to estimate the impact of first-year academic development courses in microeconomics, statistics, accountancy, and information systems, offered by the University of Cape Town's Commerce Academic Development Programme, on students' graduation performance relative to that achieved by mainstream students. The data…

  2. Genetic diversity analysis in Malaysian giant prawns using expressed sequence tag microsatellite markers for stock improvement program.

    Science.gov (United States)

    Atin, K H; Christianus, A; Fatin, N; Lutas, A C; Shabanimofrad, M; Subha, B

    2017-08-17

    The Malaysian giant prawn is among the most commonly cultured species of the genus Macrobrachium. Stocks of giant prawns from four rivers in Peninsular Malaysia have been used for aquaculture over the past 25 years, which has led to repeated harvesting, restocking, and transplantation between rivers. Consequently, a stock improvement program is now important to avoid the depletion of wild stocks and the loss of genetic diversity. However, the success of such an improvement program depends on our knowledge of the genetic variation of these base populations. The aim of the current study was to estimate genetic variation and differentiation of these riverine sources using novel expressed sequence tag-microsatellite (EST-SSR) markers, which not only are informative on genetic diversity but also provide information on immune and metabolic traits. Our findings indicated that the tested stocks have inbreeding depression due to a significant deficiency in heterozygotes, and F_IS estimates ranged from 0.15538 to 0.31938. An F-statistics analysis suggested that the stocks are composed of one large panmictic population. Among the four locations, stocks from Johor, in the southern region of the peninsula, showed higher allelic and genetic diversity than the other stocks. To overcome inbreeding problems, the Johor population could be used as a base population in a stock improvement program by crossing to the other populations. The study demonstrated that EST-SSR markers can be incorporated in future marker assisted breeding to aid the proper management of the stocks by breeders and stakeholders in Malaysia.
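    The heterozygote deficiency reported here is commonly summarized by the inbreeding coefficient, computed as one minus the ratio of observed to expected heterozygosity. The sketch below is a minimal single-locus illustration under that textbook definition, without small-sample correction; the function name and the genotype data are hypothetical and unrelated to the study's markers.

```python
from collections import Counter


def inbreeding_coefficient(genotypes):
    """Single-locus F_IS = 1 - H_obs / H_exp for diploid genotypes.

    genotypes : list of (allele, allele) pairs, one per individual.
    H_exp is the Hardy-Weinberg expectation 1 - sum(p_k^2), uncorrected
    for sample size (an assumption made to keep the sketch short).
    """
    n = len(genotypes)
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    h_exp = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    if h_exp == 0.0:
        raise ValueError("locus is monomorphic; F_IS is undefined")
    return 1.0 - h_obs / h_exp


# Hypothetical sample with too few heterozygotes: 4 AA, 2 AB, 4 BB.
deficient = [("A", "A")] * 4 + [("A", "B")] * 2 + [("B", "B")] * 4
# Hypothetical sample in Hardy-Weinberg proportions: 1 AA, 2 AB, 1 BB.
balanced = [("A", "A"), ("A", "B"), ("A", "B"), ("B", "B")]
print(inbreeding_coefficient(deficient), inbreeding_coefficient(balanced))
```

    A positive value indicates fewer heterozygotes than expected under random mating, which is the pattern the abstract describes.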

  3. Analysis of large brain MRI databases for investigating the relationships between brain, cognitive, and genetic polymorphisms

    International Nuclear Information System (INIS)

    Mazoyer, B.

    2006-01-01

    A major challenge for the years to come is the understanding of the brain-behaviour relationships, and in particular the investigation and quantification of the impact of genetic polymorphism on these relationships. In this framework, a promising experimental approach, which we will refer to as neuro-epidemiologic imaging, consists in acquiring multimodal data (brain images, psychometric and sociological data, genotypes) in large cohorts (several hundreds or thousands) of subjects. Processing of such large databases requires in the first place the conception and implementation of automated 'pipelines', including image registration, spatial normalisation, tissue segmentation, and multivariate statistical analysis. Given the number of images and data to be processed, such pipelines must be both fully automated and robust enough to be able to handle multi-center MRI data, e.g. having inhomogeneous characteristics in terms of resolution and contrast. This approach will be illustrated using two databases collected in aged healthy subjects, searching for the impact of genetic and environmental factors on two markers of brain aging, namely white matter hyper-signals and grey matter atrophy. (author)

  4. Evaluation and application of summary statistic imputation to discover new height-associated loci.

    Science.gov (United States)

    Rüeger, Sina; McDaid, Aaron; Kutalik, Zoltán

    2018-05-01

    As most of the heritability of complex traits is attributed to common and low frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-increasing reference panels is very cumbersome as it requires reimputation of the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to directly impute the summary statistics, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and practical utility has not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation boasts a 3- to 5-fold lower root-mean-square error, and better distinguishes true associations from null ones. We observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01, 0.05, using summary statistics imputation yielded a decrease in statistical power by 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian
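    The basic idea of imputing summary statistics directly can be sketched with the widely used conditional-expectation form: the z-score at an untyped variant is estimated as a weighted combination of z-scores at typed variants, with weights derived from reference-panel LD. The sketch below assumes that generic form with an optional ridge term; it does not implement the variable-sample-size extension mentioned in the abstract, and all names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("LD matrix is singular; try a larger ridge term")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def impute_z(ld_typed, ld_untyped_typed, z_typed, ridge=0.0):
    """Impute the z-score of one untyped variant.

    ld_typed         : LD correlation matrix among typed variants (symmetric)
    ld_untyped_typed : LD correlations between the untyped and typed variants
    z_typed          : observed z-scores at the typed variants
    Returns (imputed z-score, imputation quality c' C^-1 c).
    """
    n = len(z_typed)
    regularized = [
        [ld_typed[i][j] + (ridge if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    # The matrix is symmetric, so the weights are C^-1 c.
    weights = solve_linear(regularized, ld_untyped_typed)
    z_hat = sum(w * z for w, z in zip(weights, z_typed))
    quality = sum(w * c for w, c in zip(weights, ld_untyped_typed))
    return z_hat, quality


# Hypothetical example: two typed variants, one untyped variant.
ld = [[1.0, 0.2], [0.2, 1.0]]
z_hat, quality = impute_z(ld, [1.0, 0.2], [3.0, 1.5])
print(z_hat, quality)
```

    In the example the untyped variant is in perfect LD with the first typed variant, so its imputed z-score simply reproduces that variant's z-score.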

  5. Analysis of Variance in Statistical Image Processing

    Science.gov (United States)

    Kurz, Ludwik; Hafed Benteftifa, M.

    1997-04-01

    A key problem in practical image processing is the detection of specific features in a noisy image. Analysis of variance (ANOVA) techniques can be very effective in such situations, and this book gives a detailed account of the use of ANOVA in statistical image processing. The book begins by describing the statistical representation of images in the various ANOVA models. The authors present a number of computationally efficient algorithms and techniques to deal with such problems as line, edge, and object detection, as well as image restoration and enhancement. By describing the basic principles of these techniques, and showing their use in specific situations, the book will facilitate the design of new algorithms for particular applications. It will be of great interest to graduate students and engineers in the field of image processing and pattern recognition.

  6. Study of relationship between MUF correlation and detection sensitivity of statistical analysis

    International Nuclear Information System (INIS)

    Tamura, Toshiaki; Ihara, Hitoshi; Yamamoto, Yoichi; Ikawa, Koji

    1989-11-01

    Various kinds of statistical analysis have been proposed for NRTA (Near Real Time Materials Accountancy), which was devised to satisfy the timeliness goal, one of the detection goals of the IAEA. The results of statistical analysis are expected to differ between the case of rigorous error propagation (with MUF correlation) and the case of simplified error propagation (without MUF correlation). Therefore, measurement simulation and decision analysis were carried out using a flow simulation of an 800 MTHM/Y model reprocessing plant, and the relationship between MUF correlation and the detection sensitivity and false alarms of the statistical analysis was studied. The specific character of material accountancy for the 800 MTHM/Y model reprocessing plant was grasped by this simulation. It also became clear that MUF correlation decreases not only false alarms but also the detection probability for protracted loss when the CUMUF test and Page's test are applied to NRTA. (author)
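    Page's test, one of the sequential tests named in this abstract, can be sketched as a one-sided cumulative sum on standardized MUF values. The sketch treats successive MUF values as independent, which is exactly the simplified case (without MUF correlation) that the study contrasts with rigorous error propagation; the reference value k, threshold h, and the data are hypothetical.

```python
def page_test(muf_sequence, sigma, k=0.5, h=4.0):
    """One-sided Page (CUSUM) test on a sequence of MUF values.

    muf_sequence : material unaccounted for in each balance period
    sigma        : assumed standard deviation of a single MUF value
    k, h         : reference value and alarm threshold in units of sigma
    Returns the index of the first period that raises an alarm, or None.
    """
    score = 0.0
    for period, muf in enumerate(muf_sequence):
        # Accumulate evidence of loss, resetting to zero when it goes negative.
        score = max(0.0, score + muf / sigma - k)
        if score > h:
            return period
    return None


# Hypothetical sequences: measurement noise only, and a protracted loss.
no_loss = [0.1, -0.3, 0.2, -0.1]
protracted_loss = [1.5] * 10
print(page_test(no_loss, sigma=1.0), page_test(protracted_loss, sigma=1.0))
```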

  7. 75 FR 24718 - Guidance for Industry on Documenting Statistical Analysis Programs and Data Files; Availability

    Science.gov (United States)

    2010-05-05

    ...] Guidance for Industry on Documenting Statistical Analysis Programs and Data Files; Availability AGENCY... documenting statistical analyses and data files submitted to the Center for Veterinary Medicine (CVM) for the... on Documenting Statistical Analysis Programs and Data Files; Availability'' giving interested persons...

  8. Genetic analysis of GRIA2 and GRIA4 genes in migraine.

    Science.gov (United States)

    Gasparini, Claudia F; Sutherland, Heidi G; Haupt, Larisa M; Griffiths, Lyn R

    2014-02-01

    Migraine is a brain disorder affecting ∼12% of the Caucasian population. Genes involved in neurological, vascular, and hormonal pathways have all been implicated in predisposing individuals to developing migraine. The migraineur presents with disabling head pain and varying symptoms of nausea, emesis, photophobia, phonophobia, and occasionally visual sensory disturbances. Biochemical and genetic studies have demonstrated dysfunction of neurotransmitters: serotonin, dopamine, and glutamate in migraine susceptibility. Glutamate mediates the transmission of excitatory signals in the mammalian central nervous system that affect normal brain function including cognition, memory and learning. The aim of this study was to investigate polymorphisms in the GRIA2 and GRIA4 genes, which encode subunits of the ionotropic AMPA receptor for association in an Australian Caucasian population. Genotypes for each polymorphism were determined using high resolution melt analysis and the RFLP method. Statistical analysis showed no association between migraine and the GRIA2 and GRIA4 polymorphisms investigated. Although the results of this study showed no significant association between the tested GRIA gene variants and migraine in our Australian Caucasian population further investigation of other components of the glutamatergic system may help to elucidate if there is a relationship between glutamatergic dysfunction and migraine. © 2013 American Headache Society.

  9. Math anxiety and exposure to statistics in messages about genetically modified foods: effects of numeracy, math self-efficacy, and form of presentation.

    Science.gov (United States)

    Silk, Kami J; Parrott, Roxanne L

    2014-01-01

    Health risks are often communicated to the lay public in statistical formats even though low math skills, or innumeracy, have been found to be prevalent among lay individuals. Although numeracy has been a topic of much research investigation, the role of math self-efficacy and math anxiety on health and risk communication processing has received scant attention from health communication researchers. To advance theoretical and applied understanding regarding health message processing, the authors consider the role of math anxiety, including the effects of math self-efficacy, numeracy, and form of presenting statistics on math anxiety, and the potential effects for comprehension, yielding, and behavioral intentions. The authors also examine math anxiety in a health risk context through an evaluation of the effects of exposure to a message about genetically modified foods on levels of math anxiety. Participants (N = 323) were randomly assigned to read a message that varied the presentation of statistical evidence about potential risks associated with genetically modified foods. Findings reveal that exposure increased levels of math anxiety, with increases in math anxiety limiting yielding. Moreover, math anxiety impaired comprehension but was mediated by perceivers' math confidence and skills. Last, math anxiety facilitated behavioral intentions. Participants who received a text-based message with percentages were more likely to yield than participants who received either a bar graph with percentages or a combined form. Implications are discussed as they relate to math competence and its role in processing health and risk messages.

  10. Spurious correlations and inference in landscape genetics

    Science.gov (United States)

    Samuel A. Cushman; Erin L. Landguth

    2010-01-01

    Reliable interpretation of landscape genetic analyses depends on statistical methods that have high power to identify the correct process driving gene flow while rejecting incorrect alternative hypotheses. Little is known about statistical power and inference in individual-based landscape genetics. Our objective was to evaluate the power of causal modelling with partial...

  11. Point defect characterization in HAADF-STEM images using multivariate statistical analysis

    International Nuclear Information System (INIS)

    Sarahan, Michael C.; Chi, Miaofang; Masiel, Daniel J.; Browning, Nigel D.

    2011-01-01

    Quantitative analysis of point defects is demonstrated through the use of multivariate statistical analysis. This analysis consists of principal component analysis for dimensional estimation and reduction, followed by independent component analysis to obtain physically meaningful, statistically independent factor images. Results from these analyses are presented in the form of factor images and scores. Factor images show characteristic intensity variations corresponding to physical structure changes, while scores relate how much those variations are present in the original data. The application of this technique is demonstrated on a set of experimental images of dislocation cores along a low-angle tilt grain boundary in strontium titanate. A relationship between chemical composition and lattice strain is highlighted in the analysis results, with picometer-scale shifts in several columns measurable from compositional changes in a separate column. -- Research Highlights: → Multivariate analysis of HAADF-STEM images. → Distinct structural variations among SrTiO3 dislocation cores. → Picometer atomic column shifts correlated with atomic column population changes.

  12. Full likelihood analysis of genetic risk with variable age at onset disease--combining population-based registry data and demographic information.

    Directory of Open Access Journals (Sweden)

    Janne Pitkäniemi

    BACKGROUND: In genetic studies of rare complex diseases it is common to ascertain familial data from population-based registries through all incident cases diagnosed during a pre-defined enrollment period. Such an ascertainment procedure is typically taken into account in the statistical analysis of the familial data by constructing either a retrospective or prospective likelihood expression, which conditions on the ascertainment event. Both of these approaches lead to a substantial loss of valuable data. METHODOLOGY AND FINDINGS: Here we consider instead the possibilities provided by a Bayesian approach to risk analysis, which also incorporates the ascertainment procedure and reference information concerning the genetic composition of the target population into the considered statistical model. Furthermore, the proposed Bayesian hierarchical survival model does not require the considered genotype or haplotype effects to be expressed as functions of corresponding allelic effects. Our modeling strategy is illustrated by a risk analysis of type 1 diabetes mellitus (T1D) in the Finnish population based on the HLA-A, HLA-B and DRB1 human leucocyte antigen (HLA) information available for both ascertained sibships and a large number of unrelated individuals from the Finnish bone marrow donor registry. The heterozygous genotype DR3/DR4 at the DRB1 locus was associated with the lowest predictive probability of T1D free survival to the age of 15, the estimate being 0.936 (95% credible interval 0.926; 0.945) compared to the average population T1D free survival probability of 0.995. SIGNIFICANCE: The proposed statistical method can be modified to other population-based family data ascertained from a disease registry provided that the ascertainment process is well documented, and that external information concerning the sizes of birth cohorts and a suitable reference sample are available. We confirm the earlier findings from the same data concerning the HLA-DR3

  13. [Genetic variation and differentiation in striped field mouse Apodemus agrarius inferred from RAPD-PCR analysis].

    Science.gov (United States)

    Atopkin, D M; Bogdanov, A S; Chelomina, G N

    2007-06-01

    Genetic variation and differentiation of the trans-Palearctic species Apodemus agrarius (striped field mouse), whose range consists of two large isolates, European-Siberian and Far Eastern-Chinese, were examined using RAPD-PCR analysis. Material from both parts of the range was examined (41 individuals of A. agrarius from 18 localities of Russia, Ukraine, Moldova, and Kazakhstan); the Far Eastern part was represented by samples from the Amur region, Khabarovsk krai, and Primorye (Russia). Differences in frequencies of polymorphic RAPD loci were found between the European-Siberian and the Far Eastern population groups of striped field mouse. No "fixed" differences between them in RAPD spectra were found, and none of the statistical methods used could distinguish with absolute certainty animals from the two parts of the range. Thus, genetic isolation of the European-Siberian and the Far Eastern population groups of A. agrarius is not strict. These results support the hypothesis of a recent dispersal of striped field mouse from the East to the West Palearctic (during the Holocene climatic optimum, 7000 to 4500 years ago) and subsequent disjunction of the species range (not earlier than 4000-4500 years ago). The Far Eastern population group is more polymorphic than the European-Siberian one, while genetic heterogeneity is more uniformly distributed within it. This is probably explained by both historical events that happened during the species dispersal in the past, and different environmental conditions for the species in different parts of its range. The Far Eastern population group inhabits the area close to the distribution center of A. agrarius. It is likely that this group preserved genetic variation of the formerly integral ancestral form, while some amount of genetic polymorphism could be lost during the species colonization of the Siberian and European areas. To date, the settlement density and population number in general are higher than within the European

  14. Near-exact distributions for the block equicorrelation and equivariance likelihood ratio test statistic

    Science.gov (United States)

    Coelho, Carlos A.; Marques, Filipe J.

    2013-09-01

    In this paper the authors combine the equicorrelation and equivariance test introduced by Wilks [13] with the likelihood ratio test (l.r.t.) for independence of groups of variables to obtain the l.r.t. of block equicorrelation and equivariance. This test or its single block version may find applications in many areas as in psychology, education, medicine, genetics and they are important "in many tests of multivariate analysis, e.g. in MANOVA, Profile Analysis, Growth Curve analysis, etc" [12, 9]. By decomposing the overall hypothesis into the hypotheses of independence of groups of variables and the hypothesis of equicorrelation and equivariance we are able to obtain the expressions for the overall l.r.t. statistic and its moments. From these we obtain a suitable factorization of the characteristic function (c.f.) of the logarithm of the l.r.t. statistic, which enables us to develop highly manageable and precise near-exact distributions for the test statistic.

  15. STATCAT, Statistical Analysis of Parametric and Non-Parametric Data

    International Nuclear Information System (INIS)

    David, Hugh

    1990-01-01

    1 - Description of program or function: A suite of 26 programs designed to facilitate the appropriate statistical analysis and data handling of parametric and non-parametric data, using classical and modern univariate and multivariate methods. 2 - Method of solution: Data is read entry by entry, using a choice of input formats, and the resultant data bank is checked for out-of-range, rare, extreme or missing data. The completed STATCAT data bank can be treated by a variety of descriptive and inferential statistical methods, and modified, using other standard programs as required

  16. PeachVar-DB: A Curated Collection of Genetic Variations for the Interactive Analysis of Peach Genome Data.

    Science.gov (United States)

    Cirilli, Marco; Flati, Tiziano; Gioiosa, Silvia; Tagliaferri, Ilario; Ciacciulli, Angelo; Gao, Zhongshan; Gattolin, Stefano; Geuna, Filippo; Maggi, Francesco; Bottoni, Paolo; Rossini, Laura; Bassi, Daniele; Castrignanò, Tiziana; Chillemi, Giovanni

    2018-01-01

    Applying next-generation sequencing (NGS) technologies to species of agricultural interest has the potential to accelerate the understanding and exploration of genetic resources. The storage, availability and maintenance of huge quantities of NGS-generated data remains a major challenge. The PeachVar-DB portal, available at http://hpc-bioinformatics.cineca.it/peach, is an open-source catalog of genetic variants present in peach (Prunus persica L. Batsch) and wild-related species of Prunus genera, annotated from 146 samples publicly released on the Sequence Read Archive (SRA). We designed a user-friendly web-based interface of the database, providing search tools to retrieve single nucleotide polymorphism (SNP) and InDel variants, along with useful statistics and information. PeachVar-DB results are linked to the Genome Database for Rosaceae (GDR) and the Phytozome database to allow easy access to other external useful plant-oriented resources. In order to extend the genetic diversity covered by the PeachVar-DB further, and to allow increasingly powerful comparative analysis, we will progressively integrate newly released data.

  17. Genetics Home Reference: Kniest dysplasia

    Science.gov (United States)

    ... may include a rounded upper back that also curves to the side ( kyphoscoliosis ), severely flattened bones of ... Information What information about a genetic condition can statistics provide? Why are some genetic conditions more common ...

  18. Genetics Home Reference: Carpenter syndrome

    Science.gov (United States)

    ... deformed hips, a rounded upper back that also curves to the side ( kyphoscoliosis ), and knees that are ... Information What information about a genetic condition can statistics provide? Why are some genetic conditions more common ...

  19. Genetics Home Reference: Czech dysplasia

    Science.gov (United States)

    ... such as a rounded upper back that also curves to the side ( kyphoscoliosis ). Some people with Czech ... Information What information about a genetic condition can statistics provide? Why are some genetic conditions more common ...

  20. Microsatellite DNA analysis of northern pike ( Esox lucius L.) populations: insights into the genetic structure and demographic history of a genetically depauperate species

    DEFF Research Database (Denmark)

    Jacobsen, B. H.; Hansen, Michael Møller; Loeschcke, V.

    2005-01-01

    The northern pike Esox lucius L. is a freshwater fish exhibiting pronounced population subdivision and low genetic variability. However, there is limited knowledge on phylogeographical patterns within the species, and it is not known whether the low genetic variability reflects primarily current low effective population sizes or historical bottlenecks. We analysed six microsatellite loci in ten populations from Europe and North America. Genetic variation was low, with the average number of alleles within populations ranging from 2.3 to 4.0 per locus. Genetic differentiation among populations was high (overall theta(ST) = 0.51; overall rho(ST) = 0.50). Multidimensional scaling analysis of genetic distances between populations and spatial analysis of molecular variance suggested a single phylogeographical race within the sampled populations from northern Europe, whereas North American...

  1. Statistical process control methods allow the analysis and improvement of anesthesia care.

    Science.gov (United States)

    Fasting, Sigurd; Gisvold, Sven E

    2003-10-01

    Quality aspects of the anesthetic process are reflected in the rate of intraoperative adverse events. The purpose of this report is to illustrate how the quality of the anesthesia process can be analyzed using statistical process control methods, and exemplify how this analysis can be used for quality improvement. We prospectively recorded anesthesia-related data from all anesthetics for five years. The data included intraoperative adverse events, which were graded into four levels, according to severity. We selected four adverse events, representing important quality and safety aspects, for statistical process control analysis. These were: inadequate regional anesthesia, difficult emergence from general anesthesia, intubation difficulties and drug errors. We analyzed the underlying process using 'p-charts' for statistical process control. In 65,170 anesthetics we recorded adverse events in 18.3%; mostly of lesser severity. Control charts were used to define statistically the predictable normal variation in problem rate, and then used as a basis for analysis of the selected problems with the following results: Inadequate plexus anesthesia: stable process, but unacceptably high failure rate; Difficult emergence: unstable process, because of quality improvement efforts; Intubation difficulties: stable process, rate acceptable; Medication errors: methodology not suited because of low rate of errors. By applying statistical process control methods to the analysis of adverse events, we have exemplified how this allows us to determine if a process is stable, whether an intervention is required, and if quality improvement efforts have the desired effect.
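
    The p-chart mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the conventional three-sigma limits around the pooled event proportion; the function names and the monthly counts in the usage note are hypothetical and are not taken from the study.

```python
import math

def p_chart_limits(events, sizes, z=3.0):
    """Centre line and per-period (lower, upper) limits for a p-chart.

    events[i] is the number of adverse events in period i,
    sizes[i] is the number of cases (e.g. anesthetics) in period i.
    The z=3.0 default is the usual three-sigma convention (an assumption here).
    """
    p_bar = sum(events) / sum(sizes)  # pooled proportion = centre line
    limits = []
    for n in sizes:
        half_width = z * math.sqrt(p_bar * (1.0 - p_bar) / n)
        limits.append((max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)))
    return p_bar, limits

def out_of_control(events, sizes, z=3.0):
    """Indices of periods whose observed proportion falls outside the limits."""
    _, limits = p_chart_limits(events, sizes, z)
    flagged = []
    for i, (x, n) in enumerate(zip(events, sizes)):
        lo, hi = limits[i]
        if not lo <= x / n <= hi:
            flagged.append(i)
    return flagged
```

    With hypothetical counts such as out_of_control([10, 12, 9, 30], [100, 100, 100, 100]), only the last period exceeds the upper limit. A process with no flagged periods is "stable" in the sense used above, although a stable process can still have an unacceptably high rate.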

  2. Effect of the absolute statistic on gene-sampling gene-set analysis methods.

    Science.gov (United States)

    Nam, Dougu

    2017-06-01

    Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
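
    As a rough illustration of the idea in the abstract above, the following sketch computes a gene-sampling p-value for a gene set, optionally using the absolute gene statistic. It assumes the set score is the mean gene statistic and that the null distribution comes from random gene sets of the same size; the function name, parameters, and scoring choice are assumptions for illustration, not the paper's exact procedure.

```python
import random

def gene_sampling_pvalue(stats, gene_set, n_perm=2000, absolute=True, seed=0):
    """One-sided gene-sampling p-value for a gene set.

    stats maps gene name -> per-gene statistic (e.g. a t-statistic).
    With absolute=True the statistic |s| is used, so up- and
    down-regulated members both add to the set score.
    """
    rng = random.Random(seed)
    values = {g: (abs(s) if absolute else s) for g, s in stats.items()}
    members = [g for g in gene_set if g in values]
    k = len(members)
    observed = sum(values[g] for g in members) / k
    pool = list(values.values())
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(pool, k)  # random gene set of the same size
        if sum(sample) / k >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)
```

    For a set whose members change strongly but in opposite directions, the signed mean cancels out while the absolute version does not. The sketch shows only the mechanics; it does not reproduce the false-positive-rate comparison reported in the paper.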

  3. An improved method for statistical analysis of raw accelerator mass spectrometry data

    International Nuclear Information System (INIS)

    Gutjahr, A.; Phillips, F.; Kubik, P.W.; Elmore, D.

    1987-01-01

    Hierarchical statistical analysis is an appropriate method for statistical treatment of raw accelerator mass spectrometry (AMS) data. Using Monte Carlo simulations we show that this method yields more accurate estimates of isotope ratios and analytical uncertainty than the generally used propagation of errors approach. The hierarchical analysis is also useful in design of experiments because it can be used to identify sources of variability. 8 refs., 2 figs

  4. Statistical Image Analysis of Tomograms with Application to Fibre Geometry Characterisation

    DEFF Research Database (Denmark)

    Emerson, Monica Jane

    The goal of this thesis is to develop statistical image analysis tools to characterise the micro-structure of complex materials used in energy technologies, with a strong focus on fibre composites. These quantification tools are based on extracting geometrical parameters defining structures from 2D...... with high resolution both in space and time to observe fast micro-structural changes. This thesis demonstrates that statistical image analysis combined with X-ray CT opens up numerous possibilities for understanding the behaviour of fibre composites under real life conditions. Besides enabling...

  5. Multivariate Survival Mixed Models for Genetic Analysis of Longevity Traits

    DEFF Research Database (Denmark)

    Pimentel Maia, Rafael; Madsen, Per; Labouriau, Rodrigo

    2014-01-01

    A class of multivariate mixed survival models for continuous and discrete time with a complex covariance structure is introduced in a context of quantitative genetic applications. The methods introduced can be used in many applications in quantitative genetics although the discussion presented concentrates on longevity studies. The framework presented allows to combine models based on continuous time with models based on discrete time in a joint analysis. The continuous time models are approximations of the frailty model in which the hazard function will be assumed to be piece-wise constant...... applications. The methods presented are implemented in such a way that large and complex quantitative genetic data can be analyzed

  6. Multivariate Survival Mixed Models for Genetic Analysis of Longevity Traits

    DEFF Research Database (Denmark)

    Pimentel Maia, Rafael; Madsen, Per; Labouriau, Rodrigo

    2013-01-01

    A class of multivariate mixed survival models for continuous and discrete time with a complex covariance structure is introduced in a context of quantitative genetic applications. The methods introduced can be used in many applications in quantitative genetics although the discussion presented concentrates on longevity studies. The framework presented allows to combine models based on continuous time with models based on discrete time in a joint analysis. The continuous time models are approximations of the frailty model in which the hazard function will be assumed to be piece-wise constant...... applications. The methods presented are implemented in such a way that large and complex quantitative genetic data can be analyzed

  7. Multivariate Meta-Analysis of Genetic Association Studies: A Simulation Study.

    Directory of Open Access Journals (Sweden)

    Binod Neupane

    In a meta-analysis with multiple end points of interest that are correlated between or within studies, a multivariate approach to meta-analysis has the potential to produce more precise estimates of effects by exploiting the correlation structure between end points. However, under the random-effects assumption the multivariate estimation is more complex (as it involves estimation of more parameters simultaneously) than univariate estimation, and sometimes can produce unrealistic parameter estimates. The usefulness of the multivariate approach to meta-analysis of the effects of a genetic variant on two or more correlated traits is not well understood in the area of genetic association studies. In such studies, genetic variants are expected to roughly maintain Hardy-Weinberg equilibrium within studies, and also their effects on complex traits are generally very small to modest and could be heterogeneous across studies for genuine reasons. We carried out extensive simulation to explore the comparative performance of the multivariate approach with the most commonly used univariate inverse-variance weighted approach under the random-effects assumption in various realistic meta-analytic scenarios of genetic association studies of correlated end points. We evaluated the performance with respect to relative mean bias percentage, and root mean square error (RMSE) of the estimate and coverage probability of the corresponding 95% confidence interval of the effect for each end point. Our simulation results suggest that the multivariate approach performs similarly to or better than the univariate method when correlations between end points within or between studies are at least moderate and between-study variation is similar to or larger than average within-study variation for meta-analyses of 10 or more genetic studies. The multivariate approach produces estimates with smaller bias and RMSE especially for the end point that has randomly or informatively missing summary data in some individual studies, when
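
    The univariate inverse-variance weighted random-effects approach used as the comparator above can be sketched as follows. The abstract does not say which between-study variance estimator was used; this sketch assumes the common DerSimonian-Laird moment estimator, and the function name is hypothetical.

```python
import math

def random_effects_meta(effects, variances):
    """Univariate inverse-variance random-effects pooling.

    effects[i] and variances[i] are the per-study effect estimate and its
    within-study variance. Returns (pooled effect, standard error, tau^2),
    where tau^2 is the DerSimonian-Laird between-study variance
    (an assumed choice of estimator).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

    Each end point is pooled separately here, which is exactly what the multivariate approach avoids: it would instead model the end points jointly through their within- and between-study correlations.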

  8. PedGenie: meta genetic association testing in mixed family and case-control designs

    Directory of Open Access Journals (Sweden)

    Allen-Brady Kristina

    2007-11-01

    Background: PedGenie software, introduced in 2006, includes genetic association testing of cases and controls that may be independent or related (nuclear families or extended pedigrees) or mixtures thereof using Monte Carlo significance testing. Our aim is to demonstrate that PedGenie, a unique and flexible analysis tool freely available in Genie 2.4 software, is significantly enhanced by incorporating meta statistics for detecting genetic association with disease using data across multiple study groups. Methods: Meta statistics (chi-squared tests, odds ratios, and confidence intervals) were calculated using formal Cochran-Mantel-Haenszel techniques. Simulated data from unrelated individuals and individuals in families were used to illustrate that meta tests and their empirically-derived p-values and confidence intervals are accurate, precise, and for independent designs match those provided by standard statistical software. Results: PedGenie yields accurate Monte Carlo p-values for meta analysis of data across multiple studies, based on validation testing using pedigree, nuclear family, and case-control data simulated under both the null and alternative hypotheses of a genotype-phenotype association. Conclusion: PedGenie allows valid combined analysis of data from mixtures of pedigree-based and case-control resources. Added meta capabilities provide new avenues for association analysis, including pedigree resources from large consortia and multi-center studies.
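
    For readers unfamiliar with the Cochran-Mantel-Haenszel statistics named above, here is a minimal sketch for a set of 2x2 tables, one per study group. It covers only the classical formulas for independent subjects (the case where the abstract says results match standard software); it is not PedGenie code, it omits the Monte Carlo step needed for related individuals, and the function name and table layout are assumptions.

```python
def cmh_statistics(tables):
    """Mantel-Haenszel pooled odds ratio and CMH chi-square (1 df).

    Each table is (a, b, c, d):
      a = exposed cases,   b = exposed controls,
      c = unexposed cases, d = unexposed controls.
    No continuity correction is applied (an assumption of this sketch).
    """
    num_or = den_or = 0.0
    diff = var = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        num_or += a * d / n
        den_or += b * c / n
        expected_a = (a + b) * (a + c) / n
        diff += a - expected_a
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    return num_or / den_or, diff * diff / var
```

    For a single hypothetical table (10, 5, 5, 10) this gives an odds ratio of 4. The chi-square statistic would normally be compared with a chi-square distribution on one degree of freedom, whereas the abstract describes empirically derived p-values.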

  9. Genetic diversity analysis of cyanogenic potential (CNp) of root among improved genotypes of cassava using simple sequence repeat markers.

    Science.gov (United States)

    Moyib, O K; Mkumbira, J; Odunola, O A; Dixon, A G

    2012-12-01

    Cyanogenic potential (CNp) of cassava constitutes a serious problem for over 500 million people who rely on the crop as their main source of calories. Genetic diversity is a key to successful crop improvement for breeding new improved variability for target traits. Forty-three improved genotypes of cassava developed by the International Institute of Tropical Agriculture (IITA), Ibadan, were characterized for CNp trait using 35 Simple Sequence Repeat (SSR) markers. Essential colorimetry picric test was used for evaluation of CNp on a color scale of 1 to 14. The CNp scores obtained ranged from 3 to 9, with a mean score of 5.48 (+/- 0.09) based on Statistical Analysis System (SAS) package. TMS M98/0068 (4.0 +/- 0.25) was identified as the best genotype with low CNp while TMS M98/0028 (7.75 +/- 0.25) was the worst. The 43 genotypes were assigned into 7 phenotypic groups based on rank-sum analysis in SAS. Dissimilarity Analysis and Representation for Windows (DARwin) generated a phylogenetic tree with 5 clusters which represented hybridizing groups. Each of the clusters (except 4) contained low CNp genotypes that could be used for improving the high CNp genotypes in the same or near cluster. The scatter plot of the genotypes showed that there was little or no demarcation for phenotypic CNp groupings in the molecular groupings. The result of this study demonstrated that SSR markers are powerful tools for the assessment of genetic variability, and proper identification and selection of parents for genetic improvement of low CNp trait among the IITA cassava collection.

  10. Genetics Home Reference: Winchester syndrome

    Science.gov (United States)

    ... bones ( osteoporosis ) throughout the skeleton. These abnormalities make bones brittle and more prone to fracture. The bone abnormalities ... information about a genetic condition can statistics provide? Why are some genetic conditions more common in particular ...

  11. The art of data analysis how to answer almost any question using basic statistics

    CERN Document Server

    Jarman, Kristin H

    2013-01-01

    A friendly and accessible approach to applying statistics in the real world. With an emphasis on critical thinking, The Art of Data Analysis: How to Answer Almost Any Question Using Basic Statistics presents fun and unique examples, guides readers through the entire data collection and analysis process, and introduces basic statistical concepts along the way. Leaving proofs and complicated mathematics behind, the author portrays the more engaging side of statistics and emphasizes its role as a problem-solving tool. In addition, light-hearted case studies

  12. Identification of novel genetic markers of breast cancer survival

    DEFF Research Database (Denmark)

    Guo, Qi; Schmidt, Marjanka K; Kraft, Peter

    2015-01-01

    BACKGROUND: Survival after a diagnosis of breast cancer varies considerably between patients, and some of this variation may be because of germline genetic variation. We aimed to identify genetic markers associated with breast cancer-specific survival. METHODS: We conducted a large meta-analysis of studies in populations of European ancestry, including 37954 patients with 2900 deaths from breast cancer. Each study had been genotyped for between 200000 and 900000 single nucleotide polymorphisms (SNPs) across the genome; genotypes for nine million common variants were imputed using a common reference panel from the 1000 Genomes Project. We also carried out subtype-specific analyses based on 6881 estrogen receptor (ER)-negative patients (920 events) and 23059 ER-positive patients (1333 events). All statistical tests were two-sided. RESULTS: We identified one new locus (rs2059614 at 11q24...

  13. Statistics in experimental design, preprocessing, and analysis of proteomics data.

    Science.gov (United States)

    Jung, Klaus

    2011-01-01

    High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), usually yield high-dimensional data sets of expression values for hundreds or thousands of proteins which are, however, observed on only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to obtain tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated. In particular, focus is put on studies for the detection of differentially regulated proteins. Furthermore, issues of sample size planning, statistical analysis of expression levels as well as methods for data preprocessing are covered.

  14. Application of Multivariable Statistical Techniques in Plant-wide WWTP Control Strategies Analysis

    DEFF Research Database (Denmark)

    Flores Alsina, Xavier; Comas, J.; Rodríguez-Roda, I.

    2007-01-01

    The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques allow i) to determine natural groups or clusters of control strategies with a similar behaviour, ii) to find and interpret hidden, complex and casual relation features in the data set and iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation...

  15. Genetics Home Reference: X-linked acrogigantism

    Science.gov (United States)

    ... that is caused by pituitary gland abnormalities (pituitary gigantism). Related Information What information about a genetic condition can statistics provide? Why are some genetic conditions more common ...

  16. Analysis of genetic diversity in Bolivian llama populations using microsatellites.

    Science.gov (United States)

    Barreta, J; Gutiérrez-Gil, B; Iñiguez, V; Romero, F; Saavedra, V; Chiri, R; Rodríguez, T; Arranz, J J

    2013-08-01

    South American camelids (SACs) have a major role in the maintenance and potential future of rural Andean human populations. More than 60% of the 3.7 million llamas living worldwide are found in Bolivia. Due to the lack of studies focusing on genetic diversity in Bolivian llamas, this analysis investigates both the genetic diversity and structure of 12 regional groups of llamas that span the greater part of the range of distribution for this species in Bolivia. The analysis of 42 microsatellite markers in the considered regional groups showed that, in general, there were high levels of polymorphism (a total of 506 detected alleles; average PIC per marker: 0.66), which are comparable with those reported for other populations of domestic SACs. The estimated diversity parameters indicated that there was high intrapopulational genetic variation (average number of alleles and average expected heterozygosity per marker: 12.04 and 0.68, respectively) and weak genetic differentiation among populations (FST range: 0.003-0.052). In agreement with these estimates, Bolivian llamas showed a weak genetic structure and an intense gene flow between all the studied regional groups, which is due to the exchange of reproductive males between the different flocks. Interestingly, the groups for which the largest pairwise FST estimates were observed, Sud Lípez and Nor Lípez, showed a certain level of genetic differentiation that is probably due to the pattern of geographic isolation and limited communication infrastructures of these southern localities. Overall, the population parameters reported here may serve as a reference when establishing conservation policies that address Bolivian llama populations.
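
    The two per-marker diversity measures quoted above, expected heterozygosity and polymorphism information content (PIC), can be computed from allele frequencies as in the sketch below. It assumes the standard definitions (expected heterozygosity as one minus the sum of squared frequencies, and the PIC formula of Botstein and colleagues); the abstract does not state which software or small-sample corrections were used, and the function names are hypothetical.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one marker."""
    return 1.0 - sum(p * p for p in freqs)

def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    sum_sq = sum(p * p for p in freqs)
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - sum_sq - cross
```

    For a hypothetical marker with two equally frequent alleles, expected heterozygosity is 0.5 and PIC is 0.375. PIC is never larger than expected heterozygosity, and both rise as the number of alleles grows and their frequencies even out.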

  17. A genetic analysis of segregation distortion revealed by molecular ...

    Indian Academy of Sciences (India)

    Journal of Genetics, Vol. 90, No. ... Segregation analysis was based on 64 molecular markers, including 26 .... FHB of RIL populations was controlled by quantitative trait ... The authors acknowledge financial support by the National Basic.

  18. The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments

    International Nuclear Information System (INIS)

    Pham, Bihn T.; Einerson, Jeffrey J.

    2010-01-01

    This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory's Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.

  19. The statistical analysis techniques to support the NGNP fuel performance experiments

    Energy Technology Data Exchange (ETDEWEB)

    Pham, Binh T., E-mail: Binh.Pham@inl.gov; Einerson, Jeffrey J.

    2013-10-15

    This paper describes the development and application of statistical analysis techniques to support the Advanced Gas Reactor (AGR) experimental program on Next Generation Nuclear Plant (NGNP) fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel temperature) is regulated by the He–Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the NGNP Data Management and Analysis System for automated processing and qualification of the AGR measured data. The neutronic and thermal code simulation results are used for comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the fuel temperature within a given range.

  20. Statistical Challenges of Big Data Analysis in Medicine

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2015-01-01

    Roč. 3, č. 1 (2015), s. 24-27 ISSN 1805-8698 R&D Projects: GA ČR GA13-23940S Grant - others:CESNET Development Fund(CZ) 494/2013 Institutional support: RVO:67985807 Keywords : big data * variable selection * classification * cluster analysis Subject RIV: BB - Applied Statistics, Operational Research http://www.ijbh.org/ijbh2015-1.pdf

  1. Statistical Analysis of Hypercalcaemia Data related to Transferability

    DEFF Research Database (Denmark)

    Frølich, Anne; Nielsen, Bo Friis

    2005-01-01

    In this report we describe statistical analysis related to a study of hypercalcaemia carried out in the Copenhagen area in the ten year period from 1984 to 1994. Results from the study have previously been published in a number of papers [3, 4, 5, 6, 7, 8, 9] and in various abstracts and posters...... at conferences during the late eighties and early nineties. In this report we give a more detailed description of many of the analyses and provide some new results primarily by simultaneous studies of several databases....

  2. Statistical analysis of questionnaires a unified approach based on R and Stata

    CERN Document Server

    Bartolucci, Francesco; Gnaldi, Michela

    2015-01-01

    Statistical Analysis of Questionnaires: A Unified Approach Based on R and Stata presents special statistical methods for analyzing data collected by questionnaires. The book takes an applied approach to testing and measurement tasks, mirroring the growing use of statistical methods and software in education, psychology, sociology, and other fields. It is suitable for graduate students in applied statistics and psychometrics and practitioners in education, health, and marketing. The book covers the foundations of classical test theory (CTT), test reliability, va

  3. Reducing bias in the analysis of counting statistics data

    International Nuclear Information System (INIS)

    Hammersley, A.P.; Antoniadis, A.

    1997-01-01

    In the analysis of counting statistics data it is common practice to estimate the variance of the measured data points as the data points themselves. This practice introduces a bias into the results of further analysis which may be significant, and under certain circumstances lead to false conclusions. In the case of normal weighted least squares fitting this bias is quantified and methods to avoid it are proposed. (orig.)
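
    The bias described above is easy to reproduce for the simplest case, a weighted least-squares fit of a constant. With weights 1/y_i (variance estimated as the data point itself) the fitted constant becomes the harmonic mean of the counts, which never exceeds their arithmetic mean. The sketch below assumes hypothetical counts; the remedy shown (a common variance for all points, taken from the mean) is one illustrative choice and is not necessarily the method proposed in the paper.

```python
def weighted_constant_fit(values, variances):
    """Weighted least-squares estimate of a constant: sum(w*y) / sum(w), w = 1/var."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


# Hypothetical counts scattered around a true level of 100.
counts = [90, 95, 100, 105, 110]

# Common practice: variance of each point estimated as the point itself.
naive = weighted_constant_fit(counts, counts)

# Illustrative alternative: one common variance for all points.
arithmetic_mean = sum(counts) / len(counts)
common = weighted_constant_fit(counts, [arithmetic_mean] * len(counts))

print(naive, common)
```

    Low counts receive small variance estimates and therefore large weights, which pulls the naive fit downward.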

  4. Morphological analysis of Drosophila larval peripheral sensory neuron dendrites and axons using genetic mosaics.

    Science.gov (United States)

    Karim, M Rezaul; Moore, Adrian W

    2011-11-07

    Nervous system development requires the correct specification of neuron position and identity, followed by accurate neuron class-specific dendritic development and axonal wiring. Recently the dendritic arborization (DA) sensory neurons of the Drosophila larval peripheral nervous system (PNS) have become powerful genetic models in which to elucidate both general and class-specific mechanisms of neuron differentiation. There are four main DA neuron classes (I-IV)(1). They are named in order of increasing dendrite arbor complexity, and have class-specific differences in the genetic control of their differentiation(2-10). The DA sensory system is a practical model to investigate the molecular mechanisms behind the control of dendritic morphology(11-13) because: 1) it can take advantage of the powerful genetic tools available in the fruit fly, 2) the DA neuron dendrite arbor spreads out in only 2 dimensions beneath an optically clear larval cuticle making it easy to visualize with high resolution in vivo, 3) the class-specific diversity in dendritic morphology facilitates a comparative analysis to find key elements controlling the formation of simple vs. highly branched dendritic trees, and 4) dendritic arbor stereotypical shapes of different DA neurons facilitate morphometric statistical analyses. DA neuron activity modifies the output of a larval locomotion central pattern generator(14-16). The different DA neuron classes have distinct sensory modalities, and their activation elicits different behavioral responses(14,16-20). Furthermore different classes send axonal projections stereotypically into the Drosophila larval central nervous system in the ventral nerve cord (VNC)(21). These projections terminate with topographic representations of both DA neuron sensory modality and the position in the body wall of the dendritic field(7,22,23). 
    Hence examination of DA axonal projections can be used to elucidate mechanisms underlying topographic mapping(7,22,23), as well as

  5. Genetic structure of Europeans: a view from the North-East.

    Directory of Open Access Journals (Sweden)

    Mari Nelis

    Using principal component (PC) analysis, we studied the genetic constitution of 3,112 individuals from Europe as portrayed by more than 270,000 single nucleotide polymorphisms (SNPs) genotyped with the Illumina Infinium platform. In cohorts where the sample size was >100, one hundred randomly chosen samples were used for analysis to minimize the sample size effect, resulting in a total of 1,564 samples. This analysis revealed that the genetic structure of the European population correlates closely with geography. The first two PCs highlight the genetic diversity corresponding to the northwest to southeast gradient and position the populations according to their approximate geographic origin. The resulting genetic map forms a triangular structure with (a) Finland, (b) the Baltic region, Poland and Western Russia, and (c) Italy as its vertexes, and with (d) Central and Western Europe in its centre. Inter- and intra-population genetic differences were quantified by the inflation factor lambda (ranging from 1.00 to 4.21), the fixation index F(st) (ranging from 0.000 to 0.023), and by the number of markers exhibiting significant allele frequency differences in pair-wise population comparisons. The estimated lambda was used to assess the real diminishing impact to association statistics when two distinct populations are merged directly in an analysis. When the PC analysis was confined to the 1,019 Estonian individuals (0.1% of the Estonian population), a fine structure emerged that correlated with the geography of individual counties. With at least two cohorts available from several countries, genetic substructures were investigated in Czech, Finnish, German, Estonian and Italian populations. Together with previously published data, our results allow the creation of a comprehensive European genetic map that will greatly facilitate inter-population genetic studies including genome wide association studies (GWAS).
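
    The inflation factor lambda mentioned in this abstract is commonly computed as the median of the observed 1-degree-of-freedom chi-square association statistics divided by the median of the chi-square distribution with 1 degree of freedom (about 0.455). The sketch below assumes that standard genomic-control definition; the abstract does not spell out the exact estimator used, and the statistics in the example are hypothetical.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic-control style lambda: observed median over the expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical per-marker chi-square statistics (illustrative values only).
well_matched = [0.1, 0.3, 0.4549364, 0.9, 2.0]
stratified = [2.0 * x for x in well_matched]  # uniformly inflated statistics

lambda_matched = inflation_factor(well_matched)
lambda_stratified = inflation_factor(stratified)
print(lambda_matched, lambda_stratified)
```

    A value near 1 indicates no inflation; values well above 1 suggest that the two groups being compared differ systematically in allele frequencies.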

  6. Multilocus genotypic data reveal high genetic diversity and low population genetic structure of Iranian indigenous sheep

    International Nuclear Information System (INIS)

    Vahidi, S.M.F.; Faruque, M.O.; Falahati Anbaran, M.; Afraz, F.; Mousavi, S.M.; Boettcher, P.; Joost, S.; Han, J.L.; Colli, L.; Periasamy, K.; Negrini, R.; Ajmone-Marsan, P.

    2016-01-01

    Full text: Iranian livestock diversity is still largely unexplored, in spite of the interest in the populations historically reared in this country located near the Fertile Crescent, a major livestock domestication centre. In this investigation, the genetic diversity and differentiation of 10 Iranian indigenous fat-tailed sheep breeds were investigated using 18 microsatellite markers. Iranian breeds were found to host a high level of diversity. This conclusion is substantiated by the large number of alleles observed across loci (average 13.83, range 7–22) and by the high within-breed expected heterozygosity (average 0.75, range 0.72–0.76). Iranian sheep have a low level of genetic differentiation, as indicated by the analysis of molecular variance, which allocated a very small proportion (1.67%) of total variation to the between-population component, and by the small fixation index (FST = 0.02). Both Bayesian clustering and principal coordinates analysis revealed the absence of a detectable genetic structure. Also, no isolation by distance was observed through comparison of genetic and geographical distances. In spite of high within-breed variation, signatures of inbreeding were detected by the FIS indices, which were positive in all and statistically significant in three breeds. Possible factors explaining the patterns observed, such as considerable gene flow and inbreeding probably due to anthropogenic activities in the light of population management and conservation programmes are discussed. (author)
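
    Two of the quantities reported above, expected heterozygosity and the fixation index, have simple textbook forms that can be sketched directly. The code assumes Nei-style definitions with equally weighted populations (He = 1 - sum of squared allele frequencies; FST = (HT - HS) / HT); the study itself may have used a different estimator, and the allele frequencies in the example are hypothetical.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def fixation_index(pop_freqs):
    """FST = (HT - HS) / HT for one locus, populations weighted equally."""
    n_pops = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # HS: mean expected heterozygosity within populations.
    h_s = sum(expected_heterozygosity(f) for f in pop_freqs) / n_pops
    # HT: expected heterozygosity of the pooled (mean) allele frequencies.
    pooled = [sum(f[a] for f in pop_freqs) / n_pops for a in range(n_alleles)]
    h_t = expected_heterozygosity(pooled)
    return (h_t - h_s) / h_t


# Hypothetical two-allele locus in two breeds (illustrative values only).
breeds = [[0.6, 0.4], [0.4, 0.6]]
he_first = expected_heterozygosity(breeds[0])
fst = fixation_index(breeds)
print(he_first, fst)
```

    Identical allele frequencies in every population give FST = 0, while populations fixed for different alleles give FST = 1.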

  7. Benchmark validation of statistical models: Application to mediation analysis of imagery and memory.

    Science.gov (United States)

    MacKinnon, David P; Valente, Matthew J; Wurpts, Ingrid C

    2018-03-29

    This article describes benchmark validation, an approach to validating a statistical model. According to benchmark validation, a valid model generates estimates and research conclusions consistent with a known substantive effect. Three types of benchmark validation-(a) benchmark value, (b) benchmark estimate, and (c) benchmark effect-are described and illustrated with examples. Benchmark validation methods are especially useful for statistical models with assumptions that are untestable or very difficult to test. Benchmark effect validation methods were applied to evaluate statistical mediation analysis in eight studies using the established effect that increasing mental imagery improves recall of words. Statistical mediation analysis led to conclusions about mediation that were consistent with established theory that increased imagery leads to increased word recall. Benchmark validation based on established substantive theory is discussed as a general way to investigate characteristics of statistical models and a complement to mathematical proof and statistical simulation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  8. Genetic Analysis of Oncorhynchus Nerka : Life History and Genetic Analysis of Redfish Lake Oncorhynchus Nerka, 1993-1994 Completion Report.

    Energy Technology Data Exchange (ETDEWEB)

    Brannon, E.L.; Thorgaard, G.H.; Cummings, S.A.

    1994-10-01

    The study has shown through life history examination and DNA analysis that three forms of O. nerka are present in Redfish Lake. The three forms are closely related, but may be sufficiently different to be considered three separate stocks. Fishhook Creek kokanee are temporally isolated from the beach spawners, and may represent the gene pool most similar to the historic sockeye population that once spawned there. Fishhook Creek offers the best spawning area available in the lake system, and should be considered for use in reestablishing an anadromous Fishhook Creek sockeye strain. The resident beach spawning strain of O. nerka is likewise the most similar genetic form of the companion anadromous beach spawning O. nerka, and needs to be considered the most appropriate genetic source to help minimize reduced fitness of the sockeye from inbreeding.

  9. Evaluation of genetic diversity of Panicum turgidum Forssk from Saudi Arabia.

    Science.gov (United States)

    Assaeed, Abdulaziz M; Al-Faifi, Sulieman A; Migdadi, Hussein M; El-Bana, Magdy I; Al Qarawi, Abdulaziz A; Khan, Mohammad Altaf

    2018-01-01

    The genetic diversity of 177 accessions of Panicum turgidum Forssk, representing ten populations collected from four geographical regions in Saudi Arabia, was analyzed using amplified fragment length polymorphism (AFLP) markers. A set of four primer-pairs with two/three selective nucleotides scored 836 AFLP amplified fragments (putative loci/genome landmarks), all of which were polymorphic. Populations collected from the southern region of the country showed the highest genetic diversity parameters, whereas those collected from the central regions showed the lowest values. Analysis of molecular variance (AMOVA) revealed that 78% of the genetic variability was attributable to differences within populations. Pairwise values for population differentiation and genetic structure were statistically significant for all variances. The UPGMA dendrogram, validated by principal coordinate analysis-grouped accessions, corresponded to the geographical origin of the accessions. Mantel's test showed that there was a significant correlation between the genetic and geographical distances (r = 0.35, P < 0.04). In summary, the AFLP assay demonstrated the existence of substantial genetic variation in P. turgidum. The relationship between the genetic diversity and geographical source of P. turgidum populations of Saudi Arabia, as revealed through this comprehensive study, will enable effective resource management and restoration of new areas without compromising adaptation and genetic diversity.
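
    Mantel's test, cited in this abstract, correlates the off-diagonal entries of two distance matrices and judges significance by permuting the rows and columns of one of them. The following is a minimal sketch under assumptions: a Pearson correlation, a one-sided permutation p-value, and hypothetical distance matrices. The function names and the default of 999 permutations are illustrative and are not taken from the study.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-sided permutation p-value) for two distance matrices."""
    rng = random.Random(seed)
    n = len(genetic)
    geo = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Relabel populations: permute rows and columns together.
        permuted = [[genetic[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), geo) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical populations at positions 0, 1, 2 and 5 along a transect.
positions = [0, 1, 2, 5]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.1 * d for d in row] for row in geographic]  # perfect isolation by distance

r, p = mantel(genetic, geographic)
print(r, p)
```

    Permuting whole rows and columns, rather than individual matrix entries, respects the fact that the pairwise distances are not independent observations.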

  10. Analysis of genetic diversity of Piper spp. in Hainan Island (China ...

    African Journals Online (AJOL)

    Inter-simple sequence repeat (ISSR) analysis was used to evaluate the genetic variation of Piper spp. from Hainan, China. 247 polymorphic bands out of a total of 248 (99.60%) were generated from 74 individual plants of Piper spp. The overall level of genetic diversity among Piper spp. in Hainan was high, with the mean ...

  11. Analysis and meta-analysis of single-case designs with a standardized mean difference statistic: a primer and applications.

    Science.gov (United States)

    Shadish, William R; Hedges, Larry V; Pustejovsky, James E

    2014-04-01

    This article presents a d-statistic for single-case designs that is in the same metric as the d-statistic used in between-subjects designs such as randomized experiments and offers some reasons why such a statistic would be useful in SCD research. The d has a formal statistical development, is accompanied by appropriate power analyses, and can be estimated using user-friendly SPSS macros. We discuss both advantages and disadvantages of d compared to other approaches such as previous d-statistics, overlap statistics, and multilevel modeling. It requires at least three cases for computation and assumes normally distributed outcomes and stationarity, assumptions that are discussed in some detail. We also show how to test these assumptions. The core of the article then demonstrates in depth how to compute d for one study, including estimation of the autocorrelation and the ratio of between case variance to total variance (between case plus within case variance), how to compute power using a macro, and how to use the d to conduct a meta-analysis of studies using single-case designs in the free program R, including syntax in an appendix. This syntax includes how to read data, compute fixed and random effect average effect sizes, prepare a forest plot and a cumulative meta-analysis, estimate various influence statistics to identify studies contributing to heterogeneity and effect size, and do various kinds of publication bias analyses. This d may prove useful for both the analysis and meta-analysis of data from SCDs. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  12. Integrative Analysis of Genetic, Genomic, and Phenotypic Data for Ethanol Behaviors: A Network-Based Pipeline for Identifying Mechanisms and Potential Drug Targets.

    Science.gov (United States)

    Bogenpohl, James W; Mignogna, Kristin M; Smith, Maren L; Miles, Michael F

    2017-01-01

    Complex behavioral traits, such as alcohol abuse, are caused by an interplay of genetic and environmental factors, producing deleterious functional adaptations in the central nervous system. The long-term behavioral consequences of such changes are of substantial cost to both the individual and society. Substantial progress has been made in the last two decades in understanding elements of brain mechanisms underlying responses to ethanol in animal models and risk factors for alcohol use disorder (AUD) in humans. However, treatments for AUD remain largely ineffective and few medications for this disease state have been licensed. Genome-wide genetic polymorphism analysis (GWAS) in humans, behavioral genetic studies in animal models and brain gene expression studies produced by microarrays or RNA-seq have the potential to produce nonbiased and novel insight into the underlying neurobiology of AUD. However, the complexity of such information, both statistical and informational, has slowed progress toward identifying new targets for intervention in AUD. This chapter describes one approach for integrating behavioral, genetic, and genomic information across animal model and human studies. The goal of this approach is to identify networks of genes functioning in the brain that are most relevant to the underlying mechanisms of a complex disease such as AUD. We illustrate an example of how genomic studies in animal models can be used to produce robust gene networks that have functional implications, and to integrate such animal model genomic data with human genetic studies such as GWAS for AUD. We describe several useful analysis tools for such studies: ComBAT, WGCNA, and EW_dmGWAS. The end result of this analysis is a ranking of gene networks and identification of their cognate hub genes, which might provide eventual targets for future therapeutic development. Furthermore, this combined approach may also improve our understanding of basic mechanisms underlying gene x

  13. Bayesian statistics applied to neutron activation data for reactor flux spectrum analysis

    International Nuclear Information System (INIS)

    Chiesa, Davide; Previtali, Ezio; Sisti, Monica

    2014-01-01

    Highlights: • Bayesian statistics to analyze the neutron flux spectrum from activation data. • Rigorous statistical approach for accurate evaluation of the neutron flux groups. • Cross section and activation data uncertainties included for the problem solution. • Flexible methodology applied to analyze different nuclear reactor flux spectra. • The results are in good agreement with the MCNP simulations of neutron fluxes. - Abstract: In this paper, we present a statistical method, based on Bayesian statistics, to analyze the neutron flux spectrum from the activation data of different isotopes. The experimental data were acquired during a neutron activation experiment performed at the TRIGA Mark II reactor of Pavia University (Italy) in four irradiation positions characterized by different neutron spectra. In order to evaluate the neutron flux spectrum, subdivided into energy groups, a system of linear equations, containing the group effective cross sections and the activation rate data, has to be solved. However, since the system’s coefficients are experimental data affected by uncertainties, a rigorous statistical approach is fundamental for an accurate evaluation of the neutron flux groups. For this purpose, we applied Bayesian statistical analysis, which makes it possible to include the uncertainties of the coefficients and the a priori information about the neutron flux. A program for the analysis of Bayesian hierarchical models, based on Markov Chain Monte Carlo (MCMC) simulations, was used to define the statistical model of the problem and solve it. The first analysis involved the determination of the thermal, resonance-intermediate and fast flux components, and the dependence of the results on the choice of prior distribution was investigated to confirm the reliability of the Bayesian analysis. After that, the main resonances of the activation cross sections were analyzed to implement multi-group models with finer energy subdivisions that would make it possible to determine the

  14. Genetic analysis of a consanguineous Pakistani family with Leber ...

    Indian Academy of Sciences (India)

    2014-08-01

    Aug 1, 2014 ... RESEARCH NOTE. Genetic analysis of a consanguineous Pakistani family with Leber .... representation of the deleterious mutation at genomic and protein level. ... In the last couple of years, numerous mutations in. GUCY2D ...

  15. Reactor noise analysis by statistical pattern recognition methods

    International Nuclear Information System (INIS)

    Howington, L.C.; Gonzalez, R.C.

    1976-01-01

    A multivariate statistical pattern recognition system for reactor noise analysis is presented. The basis of the system is a transformation for decoupling correlated variables and algorithms for inferring probability density functions. The system is adaptable to a variety of statistical properties of the data, and it has learning, tracking, updating, and data compacting capabilities. System design emphasizes control of the false-alarm rate. Its abilities to learn normal patterns, to recognize deviations from these patterns, and to reduce the dimensionality of data with minimum error were evaluated by experiments at the Oak Ridge National Laboratory (ORNL) High-Flux Isotope Reactor. Power perturbations of less than 0.1 percent of the mean value in selected frequency ranges were detected by the pattern recognition system.

  16. A theoretical analysis of population genetics of plants on restored habitats

    Energy Technology Data Exchange (ETDEWEB)

    Bogoliubov, A.G. [Botanical Institute, Russian Academy of Science, St. Petersburg (Russian Federation); Loehle, C. [Argonne National Lab., IL (United States)

    1995-02-01

    Seed and propagules used for habitat restoration are not likely to be closely adapted to local site conditions. Rapid changes of genotype frequencies on local microsites and/or microevolution would allow plants to become better adapted to a site. These same factors would help to maintain genetic diversity and ensure the survival of small endangered populations. We used population genetics models to examine the selection of genotypes during establishment on restored sites. Vegetative spread was shown to affect selection and significantly reduce genetic diversity. To study general microevolution, we linked a model of resource usage with a genetics model and analyzed competition between genotypes. A complex suite of feasible ecogenetic states was shown to result. The state actually resulting would depend strongly on initial conditions. This analysis indicated that genetic structure can vary locally and can produce overall genetic variability that is not simply the result of microsite adaptations. For restoration activities, the implication is that small differences in seed source could lead to large differences in local genetic structure after selection.

  17. A theoretical analysis of population genetics of plants on restored habitats

    Energy Technology Data Exchange (ETDEWEB)

    Bogoliubov, A.G. [Russian Academy of Science, St. Petersburg (Russian Federation). Botanical Inst.; Loehle, C. [Argonne National Lab., IL (United States). Environmental Research Div.

    1997-07-01

    Seed and propagules used for habitat restoration are not likely to be closely adapted to local site conditions. Rapid changes of genotype frequencies on local microsites and/or microevolution would allow plants to become better adapted to a site. These same factors would help to maintain genetic diversity and ensure the survival of small endangered populations. The authors used population genetics models to examine the selection of genotypes during establishment on restored sites. Vegetative spread was shown to affect selection and significantly reduce genetic diversity. To study general microevolution, the authors linked a model of resource usage with a genetics model and analyzed competition between genotypes. A complex suite of feasible ecogenetic states was shown to result. The state actually resulting would depend strongly on initial conditions. This analysis indicated that genetic structure can vary locally and can produce overall genetic variability that is not simply the result of microsite adaptations. For restoration activities, the implication is that small differences in seed source could lead to large differences in local genetic structure after selection.

  18. Genetic dissimilarity of putative gamma-ray-induced 'Preciosa-AAAB-Pome type' banana (Musa sp) mutants based on multivariate statistical analysis.

    Science.gov (United States)

    Pestana, R K N; Amorim, E P; Ferreira, C F; Amorim, V B O; Oliveira, L S; Ledo, C A S; Silva, S O

    2011-10-25

    Bananas are among the most important fruit crops worldwide, being cultivated in more than 120 countries, mainly by small-scale producers. However, short-stature high-yielding bananas presenting good agronomic characteristics are hard to find. Consequently, wind continues to damage a great number of plantations each year, leading to lodging of plants and bunch loss. Development of new cultivars through conventional genetic breeding methods is hindered by female sterility and the low number of seeds. Mutation induction seems to have great potential for the development of new cultivars. We evaluated genetic dissimilarity among putative 'Preciosa' banana mutants generated by gamma-ray irradiation, using morphoagronomic characteristics and ISSR markers. The genetic distances between the putative 'Preciosa' mutants varied from 0.21 to 0.66, with a cophenetic correlation coefficient of 0.8064. We found good variability after irradiation of 'Preciosa' bananas; this procedure could be useful for banana breeding programs aimed at developing short-stature varieties with good agronomic characteristics.

  19. Data analysis using the Gnu R system for statistical computation

    Energy Technology Data Exchange (ETDEWEB)

    Simone, James; /Fermilab

    2011-07-01

    R is a language system for statistical computation. It is widely used in statistics, bioinformatics, machine learning, data mining, quantitative finance, and the analysis of clinical drug trials. Among the advantages of R are: it has become the standard language for developing statistical techniques, it is being actively developed by a large and growing global user community, it is open source software, it is highly portable (Linux, OS-X and Windows), it has a built-in documentation system, it produces high quality graphics and it is easily extensible with over four thousand extension library packages available covering statistics and applications. This report gives a very brief introduction to R with some examples using lattice QCD simulation results. It then discusses the development of R packages designed for chi-square minimization fits for lattice n-pt correlation functions.

  20. Application of a statistical thermal design procedure to evaluate the PWR DNBR safety analysis limits

    International Nuclear Information System (INIS)

    Robeyns, J.; Parmentier, F.; Peeters, G.

    2001-01-01

    In the framework of safety analysis for the Belgian nuclear power plants and for the reload compatibility studies, Tractebel Energy Engineering (TEE) has developed, to define a 95/95 DNBR criterion, a statistical thermal design method based on the analytical full statistical approach: the Statistical Thermal Design Procedure (STDP). In that methodology, each DNBR value in the core assemblies is calculated with an adapted CHF (Critical Heat Flux) correlation implemented in the sub-channel code Cobra for core thermal hydraulic analysis. The uncertainties of the correlation are represented by the statistical parameters calculated from an experimental database. The main objective of a sub-channel analysis is to prove that in all class 1 and class 2 situations, the minimum DNBR (Departure from Nucleate Boiling Ratio) remains higher than the Safety Analysis Limit (SAL). The SAL value is calculated from the Statistical Design Limit (SDL) value adjusted with some penalties and deterministic factors. The search of a realistic value for the SDL is the objective of the statistical thermal design methods. In this report, we apply a full statistical approach to define the DNBR criterion or SDL (Statistical Design Limit) with the strict observance of the design criteria defined in the Standard Review Plan. The same statistical approach is used to define the expected number of rods experiencing DNB. (author)

  1. Analytical and statistical analysis of elemental composition of lichens

    International Nuclear Information System (INIS)

    Calvelo, S.; Baccala, N.; Bubach, D.; Arribere, M.A.; Riberio Guevara, S.

    1997-01-01

    The elemental composition of lichens from remote southern South America regions has been studied with analytical and statistical techniques to determine if the values obtained reflect species, growth forms or habitat characteristics. The enrichment factors are calculated separately by species and collection site and compared with data available in the literature. The elemental concentrations are standardized and compared for different species. The information was statistically processed: a cluster analysis was performed using the first three principal axes of the PCA, and the three groups formed are presented. Their relationships with the species, collection sites and lichen growth forms are interpreted. (author)

  2. New application of intelligent agents in sporadic amyotrophic lateral sclerosis identifies unexpected specific genetic background

    Directory of Open Access Journals (Sweden)

    Marocchi Alessandro

    2008-05-01

    Background: Few genetic factors predisposing to the sporadic form of amyotrophic lateral sclerosis (ALS) have been identified, but the pathology itself seems to be a true multifactorial disease in which complex interactions between environmental and genetic susceptibility factors take place. The purpose of this study was to approach genetic data with an innovative statistical method such as artificial neural networks to identify a possible genetic background predisposing to the disease. A DNA multiarray panel was applied to genotype more than 60 polymorphisms within 35 genes selected from pathways of lipid and homocysteine metabolism, regulation of blood pressure, coagulation, inflammation, cellular adhesion and matrix integrity, in 54 sporadic ALS patients and 208 controls. Advanced intelligent systems based on novel coupling of artificial neural networks and evolutionary algorithms have been applied. The results obtained have been compared with those derived from the use of standard neural networks and classical statistical analysis. Results: An unexpected discovery of a strong genetic background in sporadic ALS using a DNA multiarray panel and analytical processing of the data with advanced artificial neural networks was found. The predictive accuracy obtained with Linear Discriminant Analysis and Standard Artificial Neural Networks ranged from 70% to 79% (average 75.31%) and from 69.1 to 86.2% (average 76.6%) respectively. The corresponding value obtained with Advanced Intelligent Systems reached an average of 96.0% (range 94.4 to 97.6%). This latter approach allowed the identification of seven genetic variants essential to differentiate cases from controls: apolipoprotein E arg

  3. New application of intelligent agents in sporadic amyotrophic lateral sclerosis identifies unexpected specific genetic background.

    Science.gov (United States)

    Penco, Silvana; Buscema, Massimo; Patrosso, Maria Cristina; Marocchi, Alessandro; Grossi, Enzo

    2008-05-30

    Few genetic factors predisposing to the sporadic form of amyotrophic lateral sclerosis (ALS) have been identified, but the pathology itself seems to be a true multifactorial disease in which complex interactions between environmental and genetic susceptibility factors take place. The purpose of this study was to approach genetic data with an innovative statistical method such as artificial neural networks to identify a possible genetic background predisposing to the disease. A DNA multiarray panel was applied to genotype more than 60 polymorphisms within 35 genes selected from pathways of lipid and homocysteine metabolism, regulation of blood pressure, coagulation, inflammation, cellular adhesion and matrix integrity, in 54 sporadic ALS patients and 208 controls. Advanced intelligent systems based on novel coupling of artificial neural networks and evolutionary algorithms have been applied. The results obtained have been compared with those derived from the use of standard neural networks and classical statistical analysis. An unexpected discovery of a strong genetic background in sporadic ALS using a DNA multiarray panel and analytical processing of the data with advanced artificial neural networks was found. The predictive accuracy obtained with Linear Discriminant Analysis and Standard Artificial Neural Networks ranged from 70% to 79% (average 75.31%) and from 69.1 to 86.2% (average 76.6%) respectively. The corresponding value obtained with Advanced Intelligent Systems reached an average of 96.0% (range 94.4 to 97.6%). 
This latter approach allowed the identification of seven genetic variants essential to differentiate cases from controls: apolipoprotein E arg158cys; hepatic lipase -480 C/T; endothelial

  4. Integrating eQTL data with GWAS summary statistics in pathway-based analysis with application to schizophrenia.

    Science.gov (United States)

    Wu, Chong; Pan, Wei

    2018-04-01

    Many genetic variants affect complex traits through gene expression, which can be exploited to boost statistical power and enhance interpretation in genome-wide association studies (GWASs) as demonstrated by the transcriptome-wide association study (TWAS) approach. Furthermore, due to polygenic inheritance, a complex trait is often affected by multiple genes with similar functions as annotated in gene pathways. Here, we extend TWAS from gene-based analysis to pathway-based analysis: we integrate public pathway collections, expression quantitative trait locus (eQTL) data and GWAS summary association statistics (or GWAS individual-level data) to identify gene pathways associated with complex traits. The basic idea is to weight the SNPs of the genes in a pathway based on their estimated cis-effects on gene expression, then adaptively test for association of the pathway with a GWAS trait by effectively aggregating possibly weak association signals across the genes in the pathway. The P values can be calculated analytically and thus fast. We applied our proposed test with the KEGG and GO pathways to two schizophrenia (SCZ) GWAS summary association data sets, denoted by SCZ1 and SCZ2 with about 20,000 and 150,000 subjects, respectively. Most of the significant pathways identified by analyzing the SCZ1 data were reproduced by the SCZ2 data. Importantly, we identified 15 novel pathways associated with SCZ, such as GABA receptor complex (GO:1902710), which could not be uncovered by the standard single SNP-based analysis or gene-based TWAS. The newly identified pathways may help us gain insights into the biological mechanism underlying SCZ. Our results showcase the power of incorporating gene expression information and gene functional annotations into pathway-based association testing for GWAS. © 2018 WILEY PERIODICALS, INC.
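    The weighting step described in this abstract can be sketched as follows. This is a minimal illustration of the widely used summary-statistic TWAS gene score, Z = w'z / sqrt(w'Rw), and not the authors' implementation; the function name, the toy numbers, and the identity LD matrix are assumptions made for the example. The adaptive aggregation across the genes of a pathway is not reproduced here.

```python
import numpy as np


def weighted_gene_z(z_snps, weights, ld):
    """TWAS-style gene-level Z score from SNP summary statistics.

    z_snps  : GWAS Z scores of the SNPs in the gene region
    weights : estimated cis-eQTL effects of those SNPs on expression
    ld      : SNP-by-SNP correlation (LD) matrix from a reference panel
    """
    z = np.asarray(z_snps, dtype=float)
    w = np.asarray(weights, dtype=float)
    r = np.asarray(ld, dtype=float)
    variance = w @ r @ w  # variance of the weighted sum under the null
    if variance <= 0:
        raise ValueError("weights give a zero-variance score")
    return float(w @ z / np.sqrt(variance))


# Hypothetical toy inputs: three SNPs assumed to be in linkage equilibrium.
toy_z = [2.0, 1.0, -0.5]
toy_weights = [0.5, 0.3, 0.0]
toy_ld = np.eye(3)
gene_z = weighted_gene_z(toy_z, toy_weights, toy_ld)
```

    SNPs with a zero expression weight drop out of the score, which is how the eQTL data focus the test on variants that plausibly act through gene expression.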

  5. The score statistic of the LD-lod analysis: detecting linkage adaptive to linkage disequilibrium.

    Science.gov (United States)

    Huang, J; Jiang, Y

    2001-01-01

    We study the properties of a modified lod score method for testing linkage that incorporates linkage disequilibrium (LD-lod). By examination of its score statistic, we show that the LD-lod score method adaptively combines two sources of information: (a) the IBD sharing score which is informative for linkage regardless of the existence of LD and (b) the contrast between allele-specific IBD sharing scores which is informative for linkage only in the presence of LD. We also consider the connection between the LD-lod score method and the transmission-disequilibrium test (TDT) for triad data and the mean test for affected sib pair (ASP) data. We show that, for triad data, the recessive LD-lod test is asymptotically equivalent to the TDT; and for ASP data, it is an adaptive combination of the TDT and the ASP mean test. We demonstrate that the LD-lod score method has relatively good statistical efficiency in comparison with the ASP mean test and the TDT for a broad range of LD and the genetic models considered in this report. Therefore, the LD-lod score method is an interesting approach for detecting linkage when the extent of LD is unknown, such as in a genome-wide screen with a dense set of genetic markers. Copyright 2001 S. Karger AG, Basel
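    For readers unfamiliar with the transmission-disequilibrium test mentioned above, a minimal sketch of the standard TDT statistic for triad data is given below. It is the textbook McNemar-type form, not the LD-lod score statistic studied in the paper; the function name and the counts are hypothetical.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """Standard TDT for one allele, counted over heterozygous parents.

    transmitted   : times the allele was transmitted to an affected child
    untransmitted : times it was not transmitted
    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    b, c = transmitted, untransmitted
    if b < 0 or c < 0 or b + c == 0:
        raise ValueError("need non-negative counts with at least one transmission")
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts from 100 heterozygous parents.
toy_stat, toy_p = tdt_statistic(60, 40)
```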

  6. The Fusion of Financial Analysis and Seismology: Statistical Methods from Financial Market Analysis Applied to Earthquake Data

    Science.gov (United States)

    Ohyanagi, S.; Dileonardo, C.

    2013-12-01

    As a natural phenomenon, earthquake occurrence is difficult to predict. Statistical analysis of earthquake data was performed using candlestick chart and Bollinger Band methods. These statistical methods, commonly used in the financial world to analyze market trends, were tested against earthquake data. Earthquakes above Mw 4.0 located on shore of Sanriku (37.75°N ~ 41.00°N, 143.00°E ~ 144.50°E) from February 1973 to May 2013 were selected for analysis. Two specific patterns in earthquake occurrence were recognized through the analysis. One is a spread of candlestick prior to the occurrence of events greater than Mw 6.0. A second pattern shows convergence in the Bollinger Band, which implies a positive or negative change in the trend of earthquakes. Both patterns match general models for the buildup and release of strain through the earthquake cycle, and agree with both the characteristics of the candlestick chart and Bollinger Band analysis. These results show there is a high correlation between patterns in earthquake occurrence and trend analysis by these two statistical methods. The results of this study agree with the appropriateness of the application of these financial analysis methods to the analysis of earthquake occurrence.
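    The Bollinger Band construction referred to above can be sketched as a moving average with bands a fixed number of moving standard deviations above and below it. The abstract does not say which earthquake series, window length, or band width the authors used, so the monthly counts, `window`, and `k` below are assumptions for illustration only.

```python
import numpy as np


def bollinger_bands(series, window=20, k=2.0):
    """Return (lower, middle, upper) Bollinger Bands for a 1-D series.

    The middle band is the moving average over `window` points; the outer
    bands lie k moving (population) standard deviations away from it.
    Each output has len(series) - window + 1 points.
    """
    x = np.asarray(series, dtype=float)
    if window < 1 or window > x.size:
        raise ValueError("window must be between 1 and the series length")
    windows = np.lib.stride_tricks.sliding_window_view(x, window)
    middle = windows.mean(axis=1)
    spread = windows.std(axis=1)
    return middle - k * spread, middle, middle + k * spread


# Hypothetical monthly counts of events above a magnitude threshold.
monthly_counts = [3, 5, 4, 6, 9, 4, 3, 5, 12, 4, 3, 2]
lower, middle, upper = bollinger_bands(monthly_counts, window=6, k=2.0)
band_width = upper - lower  # a narrowing width is the "convergence" pattern
```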

  7. Parametric analysis of the statistical model of the stick-slip process

    Science.gov (United States)

    Lima, Roberta; Sampaio, Rubens

    2017-06-01

    This paper presents a parametric analysis of the statistical model of the response of a dry-friction oscillator. The oscillator is a spring-mass system which moves over a base with a rough surface. Due to this roughness, the mass is subject to a dry-frictional force modeled as Coulomb friction. The system is stochastically excited by an imposed bang-bang base motion. The base velocity is modeled by a Poisson process for which a probabilistic model is fully specified. The excitation induces stochastic stick-slip oscillations in the system. The system response is composed of a random sequence of alternating stick and slip modes. With realizations of the system, a statistical model is constructed for this sequence. In this statistical model, the variables of interest of the sequence are modeled as random variables, for example, the number of time intervals in which stick or slip occur, the instants at which they begin, and their duration. Samples of the system response are computed by integration of the dynamic equation of the system using independent samples of the base motion. Statistics and histograms of the random variables which characterize the stick-slip process are estimated for the generated samples. The objective of the paper is to analyze how these estimated statistics and histograms vary with the system parameters, i.e., to make a parametric analysis of the statistical model of the stick-slip process.

  8. Introduction to applied statistical signal analysis guide to biomedical and electrical engineering applications

    CERN Document Server

    Shiavi, Richard

    2007-01-01

    Introduction to Applied Statistical Signal Analysis is designed for the experienced individual with a basic background in mathematics, science, and computing. With this prerequisite knowledge, the reader will coast through the practical introduction and move on to signal analysis techniques commonly used in a broad range of engineering areas such as biomedical engineering, communications, geophysics, and speech. Introduction to Applied Statistical Signal Analysis intertwines theory and implementation with practical examples and exercises. Topics presented in detail include: mathematical

  9. Visual and statistical analysis of 18F-FDG PET in primary progressive aphasia

    Energy Technology Data Exchange (ETDEWEB)

    Matias-Guiu, Jordi A.; Moreno-Ramos, Teresa; Garcia-Ramos, Rocio; Fernandez-Matarrubia, Marta; Oreja-Guevara, Celia; Matias-Guiu, Jorge [Hospital Clinico San Carlos, Department of Neurology, Madrid (Spain); Cabrera-Martin, Maria Nieves; Perez-Castejon, Maria Jesus; Rodriguez-Rey, Cristina; Ortega-Candil, Aida; Carreras, Jose Luis [San Carlos Health Research Institute (IdISSC) Complutense University of Madrid, Department of Nuclear Medicine, Hospital Clinico San Carlos, Madrid (Spain)

    2015-05-01

    Diagnosing primary progressive aphasia (PPA) and its variants is of great clinical importance, and fluorodeoxyglucose (FDG) positron emission tomography (PET) may be a useful diagnostic technique. The purpose of this study was to evaluate interobserver variability in the interpretation of FDG PET images in PPA as well as the diagnostic sensitivity and specificity of the technique. We also aimed to compare visual and statistical analyses of these images. There were 10 raters who analysed 44 FDG PET scans from 33 PPA patients and 11 controls. Five raters analysed the images visually, while the other five used maps created using Statistical Parametric Mapping software. Two spatial normalization procedures were performed: global mean normalization and cerebellar normalization. Clinical diagnosis was considered the gold standard. Inter-rater concordance was moderate for visual analysis (Fleiss' kappa 0.568) and substantial for statistical analysis (kappa 0.756-0.881). Agreement was good for all three variants of PPA except for the nonfluent/agrammatic variant studied with visual analysis. The sensitivity and specificity of each rater's diagnosis of PPA were high, averaging 87.8% and 89.9% for visual analysis and 96.9% and 90.9% for statistical analysis using global mean normalization, respectively. In cerebellar normalization, sensitivity was 88.9% and specificity 100%. FDG PET demonstrated high diagnostic accuracy for the diagnosis of PPA and its variants. Inter-rater concordance was higher for statistical analysis, especially for the nonfluent/agrammatic variant. These data support the use of FDG PET to evaluate patients with PPA and show that statistical analysis methods are particularly useful for identifying the nonfluent/agrammatic variant of PPA.
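    The inter-rater concordance measure quoted above, Fleiss' kappa, can be computed from a table of rating counts as sketched below. This is the textbook formula, not the authors' analysis code; the toy table (4 scans, 5 raters, 3 categories) is invented for illustration and does not come from the study.

```python
import numpy as np


def fleiss_kappa(counts):
    """Fleiss' kappa for an (items x categories) table of rating counts.

    counts[i, j] is the number of raters who assigned item i to category j;
    every item must be rated by the same number of raters.
    """
    table = np.asarray(counts, dtype=float)
    n_items = table.shape[0]
    ratings_per_item = table.sum(axis=1)
    n = ratings_per_item[0]
    if n < 2 or not np.all(ratings_per_item == n):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    p_items = (np.sum(table ** 2, axis=1) - n) / (n * (n - 1))
    p_observed = p_items.mean()

    # Chance agreement from the overall category proportions.
    p_categories = table.sum(axis=0) / (n_items * n)
    p_chance = np.sum(p_categories ** 2)

    if np.isclose(p_chance, 1.0):
        raise ValueError("kappa is undefined when only one category is used")
    return float((p_observed - p_chance) / (1.0 - p_chance))


# Hypothetical toy table: 4 scans, 5 raters, 3 diagnostic categories.
toy_counts = [
    [5, 0, 0],
    [0, 5, 0],
    [1, 4, 0],
    [0, 1, 4],
]
toy_kappa = fleiss_kappa(toy_counts)
```

    A value of 1 means perfect agreement and 0 means agreement no better than chance, which is the scale on which the abstract's 0.568 and 0.756-0.881 are read.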

  10. PVeStA: A Parallel Statistical Model Checking and Quantitative Analysis Tool

    KAUST Repository

    AlTurki, Musab

    2011-01-01

    Statistical model checking is an attractive formal analysis method for probabilistic systems such as, for example, cyber-physical systems which are often probabilistic in nature. This paper is about drastically increasing the scalability of statistical model checking, and making such scalability of analysis available to tools like Maude, where probabilistic systems can be specified at a high level as probabilistic rewrite theories. It presents PVeStA, an extension and parallelization of the VeStA statistical model checking tool [10]. PVeStA supports statistical model checking of probabilistic real-time systems specified as either: (i) discrete or continuous Markov Chains; or (ii) probabilistic rewrite theories in Maude. Furthermore, the properties that it can model check can be expressed in either: (i) PCTL/CSL, or (ii) the QuaTEx quantitative temporal logic. As our experiments show, the performance gains obtained from parallelization can be very high. © 2011 Springer-Verlag.

  11. Machine learning patterns for neuroimaging-genetic studies in the cloud.

    Science.gov (United States)

    Da Mota, Benoit; Tudoran, Radu; Costan, Alexandru; Varoquaux, Gaël; Brasche, Goetz; Conrod, Patricia; Lemaitre, Herve; Paus, Tomas; Rietschel, Marcella; Frouin, Vincent; Poline, Jean-Baptiste; Antoniu, Gabriel; Thirion, Bertrand

    2014-01-01

    Brain imaging is a natural intermediate phenotype to understand the link between genetic information and behavior or risk factors for brain pathologies. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statistical analysis of such data is carried out with increasingly sophisticated techniques and represents a great computational challenge. Fortunately, increasing computational power in distributed architectures can be harnessed, if new neuroinformatics infrastructures are designed and training to use these new tools is provided. Combining a MapReduce framework (TomusBLOB) with machine learning algorithms (Scikit-learn library), we design a scalable analysis tool that can deal with non-parametric statistics on high-dimensional data. End-users describe the statistical procedure to perform and can then test the model on their own computers before running the very same code in the cloud at a larger scale. We illustrate the potential of our approach on real data with an experiment showing how the functional signal in subcortical brain regions can be significantly fit with genome-wide genotypes. This experiment demonstrates the scalability and the reliability of our framework in the cloud with a two-week deployment on hundreds of virtual machines.

  12. Statistical analysis of extreme values from insurance, finance, hydrology and other fields

    CERN Document Server

    Reiss, Rolf-Dieter

    1997-01-01

    The statistical analysis of extreme data is important for various disciplines, including hydrology, insurance, finance, engineering and environmental sciences. This book provides a self-contained introduction to the parametric modeling, exploratory analysis and statistical inference for extreme values. The entire text of this third edition has been thoroughly updated and rearranged to meet the new requirements. Additional sections and chapters, elaborated on more than 100 pages, are particularly concerned with topics like dependencies, the conditional analysis and the multivariate modeling of extreme data. Parts I–III about the basic extreme value methodology remain unchanged to some larger extent, yet notable are, e.g., the new sections about "An Overview of Reduced-Bias Estimation" (co-authored by M.I. Gomes), "The Spectral Decomposition Methodology", and "About Tail Independence" (co-authored by M. Frick), and the new chapter about "Extreme Value Statistics of Dependent Random Variables" (co-authored ...

  13. Power flow as a complement to statistical energy analysis and finite element analysis

    Science.gov (United States)

    Cuschieri, J. M.

    1987-01-01

    Present methods of analysis of the structural response and the structure-borne transmission of vibrational energy use either finite element (FE) techniques or statistical energy analysis (SEA) methods. The FE methods are a very useful tool at low frequencies, where the number of resonances involved in the analysis is rather small. On the other hand, SEA methods can predict with acceptable accuracy the response and energy transmission between coupled structures at relatively high frequencies, where the structural modal density is high and a statistical approach is the appropriate solution. In the mid-frequency range, a relatively large number of resonances exist, which makes finite element methods too costly, while SEA methods can only predict an average level. In this mid-frequency range a possible alternative is to use power flow techniques, where the input and flow of vibrational energy to excited and coupled structural components can be expressed in terms of input and transfer mobilities. This power flow technique can be extended from low to high frequencies and can be integrated with established FE models at low frequencies and SEA models at high frequencies to form a verification of the method. This method of structural analysis using power flow and mobility methods, and its integration with SEA and FE analysis, is applied to the case of two thin beams joined together at right angles.

  14. Genetic analysis of floating Enteromorpha prolifera in the Yellow Sea with AFLP marker

    Science.gov (United States)

    Liu, Cui; Zhang, Jing; Sun, Xiaoyu; Li, Jian; Zhang, Xi; Liu, Tao

    2011-09-01

    Extremely large accumulations of the green alga Enteromorpha prolifera have floated along China's coastal region of the Yellow Sea since the summer of 2008. Amplified Fragment Length Polymorphism (AFLP) analysis was applied to assess the genetic diversity and relationships among E. prolifera samples collected from 9 affected areas of the Yellow Sea. Two hundred reproducible fragments were generated with 8 AFLP primer combinations, of which 194 (97%) were polymorphic. The average Nei's genetic diversity, the coefficient of genetic differentiation (Gst), and the average gene flow estimated from Gst in the 9 populations were 0.4018, 0.6404 and 0.2807 respectively. Cluster analysis based on the unweighted pair group method with arithmetic averages (UPGMA) showed that the genetic relationships within one population or among different populations were all related to their collecting locations and sampling time. Large genetic differentiation was detected among the populations. The E. prolifera originated from different areas and were undergoing a course of mixing.
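    The gene flow figure quoted above is consistent with a commonly used approximation, Nm = 0.5 (1 - Gst) / Gst. The abstract does not state which formula the authors applied, so treating this as their method is an assumption; the short sketch below only checks the arithmetic against the reported Gst.

```python
def gene_flow_from_gst(gst):
    """Approximate gene flow Nm from Gst using Nm = 0.5 * (1 - Gst) / Gst."""
    if not 0.0 < gst <= 1.0:
        raise ValueError("Gst must lie in (0, 1]")
    return 0.5 * (1.0 - gst) / gst


# Gst as reported in the abstract; the result is about 0.281,
# in line with the reported gene flow of 0.2807.
nm = gene_flow_from_gst(0.6404)
```

    A value of Nm well below 1 is conventionally read as gene flow too weak to counteract differentiation, which matches the abstract's conclusion of large genetic differentiation among populations.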

  15. Latent spatial models and sampling design for landscape genetics

    Science.gov (United States)

    Hanks, Ephraim M.; Hooten, Mevin B.; Knick, Steven T.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Cross, Todd B.; Schwartz, Michael K.

    2016-01-01

    We propose a spatially-explicit approach for modeling genetic variation across space and illustrate how this approach can be used to optimize spatial prediction and sampling design for landscape genetic data. We propose a multinomial data model for categorical microsatellite allele data commonly used in landscape genetic studies, and introduce a latent spatial random effect to allow for spatial correlation between genetic observations. We illustrate how modern dimension reduction approaches to spatial statistics can allow for efficient computation in landscape genetic statistical models covering large spatial domains. We apply our approach to propose a retrospective spatial sampling design for greater sage-grouse (Centrocercus urophasianus) population genetics in the western United States.

  16. A Simple Test of Class-Level Genetic Association Can Reveal Novel Cardiometabolic Trait Loci.

    Directory of Open Access Journals (Sweden)

    Jing Qian

    Characterizing the genetic determinants of complex diseases can be further augmented by incorporating knowledge of underlying structure or classifications of the genome, such as newly developed mappings of protein-coding genes, epigenetic marks, enhancer elements and non-coding RNAs. We apply a simple class-level testing framework, termed Genetic Class Association Testing (GenCAT), to identify protein-coding gene association with 14 cardiometabolic (CMD) related traits across 6 publicly available genome-wide association (GWA) meta-analysis data resources. GenCAT uses SNP-level meta-analysis test statistics across all SNPs within a class of elements, as well as the size of the class and its unique correlation structure, to determine if the class is statistically meaningful. The novelty of findings is evaluated through investigation of regional signals. A subset of findings is validated using recently updated, larger meta-analysis resources. A simulation study is presented to characterize overall performance with respect to power, control of family-wise error and computational efficiency. All analysis is performed using the GenCAT package, R version 3.2.1. We demonstrate that class-level testing complements the common first stage minP approach that involves individual SNP-level testing followed by post-hoc ascribing of statistically significant SNPs to genes and loci. GenCAT suggests 54 protein-coding genes at 41 distinct loci for the 13 CMD traits investigated in the discovery analysis that are beyond the discoveries of minP alone. An additional application to biological pathways demonstrates flexibility in defining genetic classes. We conclude that it would be prudent to include class-level testing as standard practice in GWA analysis. 
GenCAT, for example, can be used as a simple, complementary and efficient strategy for class-level testing that leverages existing data resources, requires only summary level data in the form of test statistics, and

  17. A Genetic Analysis of Mortality in Pigs

    DEFF Research Database (Denmark)

    Varona, Luis; Sorensen, Daniel

    2010-01-01

    An analysis of mortality is undertaken in two breeds of pigs: Danish Landrace and Yorkshire. Zero-inflated and standard versions of hierarchical Poisson, binomial, and negative binomial Bayesian models were fitted using Markov chain Monte Carlo (MCMC). The objectives of the study were to investigate whether there is support for genetic variation for mortality and to study the quality of fit and predictive properties of the various models. In both breeds, the model that provided the best fit to the data was the standard binomial hierarchical model. The model that performed best in terms of the ability to predict the distribution of stillbirths was the hierarchical zero-inflated negative binomial model. The best fit of the binomial hierarchical model and of the zero-inflated hierarchical negative binomial model was obtained when genetic variation was included as a parameter. For the hierarchical...

  18. Bayesian Statistics and Uncertainty Quantification for Safety Boundary Analysis in Complex Systems

    Science.gov (United States)

    He, Yuning; Davies, Misty Dawn

    2014-01-01

    The analysis of a safety-critical system often requires detailed knowledge of safe regions and their high-dimensional non-linear boundaries. We present a statistical approach to iteratively detect and characterize the boundaries, which are provided as parameterized shape candidates. Using methods from uncertainty quantification and active learning, we incrementally construct a statistical model from only a few simulation runs and obtain statistically sound estimates of the shape parameters for safety boundaries.

  19. Validation of statistical models for creep rupture by parametric analysis

    Energy Technology Data Exchange (ETDEWEB)

    Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)

    2012-01-15

    Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. - Highlights: ► The paper discusses the validation of creep rupture models derived from statistical analysis. ► It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. ► The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. ► The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).

  20. EvolQG - An R package for evolutionary quantitative genetics [version 2; referees: 1 approved, 2 approved with reservations]

    Directory of Open Access Journals (Sweden)

    Diogo Melo

    2016-06-01

    We present an open source package for performing evolutionary quantitative genetics analyses in the R environment for statistical computing. Evolutionary theory shows that evolution depends critically on the available variation in a given population. When dealing with many quantitative traits this variation is expressed in the form of a covariance matrix, particularly the additive genetic covariance matrix or sometimes the phenotypic matrix, when the genetic matrix is unavailable and there is evidence the phenotypic matrix is sufficiently similar to the genetic matrix. Given this mathematical representation of available variation, the EvolQG package provides functions for calculation of relevant evolutionary statistics; estimation of sampling error; corrections for this error; matrix comparison via correlations, distances and matrix decomposition; analysis of modularity patterns; and functions for testing evolutionary hypotheses on taxa diversification.

  1. The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective.

    Science.gov (United States)

    Kruschke, John K; Liddell, Torrin M

    2018-02-01

    In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.

  2. Molecular genetic analysis of activation-tagged transcription factors thought to be involved in photomorphogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Neff, Michael M.

    2011-06-23

    This is a final report for Department of Energy Grant No. DE-FG02-08ER15927 entitled “Molecular Genetic Analysis of Activation-Tagged Transcription Factors Thought to be Involved in Photomorphogenesis”. Based on our preliminary photobiological and genetic analysis of the sob1-D mutant, we hypothesized that OBP3 is a transcription factor involved in both phytochrome and cryptochrome-mediated signal transduction. In addition, we hypothesized that OBP3 is involved in auxin signaling and root development. Based on our preliminary photobiological and genetic analysis of the sob2-D mutant, we also hypothesized that a related gene, LEP, is involved in hormone signaling and seedling development.

  3. Analysis of the Threat of Genetically Modified Organisms for Biological Warfare

    Science.gov (United States)

    2011-05-01

    biological warfare. The primary focus of the framework are those aspects of the technology directly affecting humans by inducing virulent infectious disease...applications. Simple organisms such as fruit flies have been used to study the effects of genetic changes across generations. Transgenic mice are...Analysis * Multi-cell pathogens * Toxins (Chemical products of living cells.) * Fungi (Robust organism; no genetic manipulation needed

  4. Statistical analysis of solar proton events

    Directory of Open Access Journals (Sweden)

    V. Kurt

    2004-06-01

    A new catalogue of 253 solar proton events (SPEs) with energy >10 MeV and peak intensity >10 protons/(cm² s sr) (pfu) at the Earth's orbit for three complete 11-year solar cycles (1970-2002) is given. A statistical analysis of this data set of SPEs and their associated flares that occurred during this time period is presented. It is outlined that 231 of these proton events are flare related and only 22 of them are not associated with Hα flares. It is also noteworthy that 42 of these events are registered as Ground Level Enhancements (GLEs) in neutron monitors. The longitudinal distribution of the associated flares shows that a great number of these events are connected with west flares. This analysis enables one to understand the long-term dependence of the SPEs and the related flare characteristics on the solar cycle, which are useful for space weather prediction.

  5. STATISTICAL ANALYSIS OF THE HEAVY NEUTRAL ATOMS MEASURED BY IBEX

    International Nuclear Information System (INIS)

    Park, Jeewoo; Kucharek, Harald; Möbius, Eberhard; Galli, André; Livadiotis, George; Fuselier, Steve A.; McComas, David J.

    2015-01-01

    We investigate the directional distribution of heavy neutral atoms in the heliosphere by using heavy neutral maps generated with the IBEX-Lo instrument over three years from 2009 to 2011. The interstellar neutral (ISN) O and Ne gas flow was found in the first-year heavy neutral map at 601 keV and its flow direction and temperature were studied. However, due to the low counting statistics, researchers have not treated the full sky maps in detail. The main goal of this study is to evaluate the statistical significance of each pixel in the heavy neutral maps to get a better understanding of the directional distribution of heavy neutral atoms in the heliosphere. Here, we examine three statistical analysis methods: the signal-to-noise filter, the confidence limit method, and the cluster analysis method. These methods allow us to exclude background from areas where the heavy neutral signal is statistically significant. These methods also allow the consistent detection of heavy neutral atom structures. The main emission feature expands toward lower longitude and higher latitude from the observational peak of the ISN O and Ne gas flow. We call this emission the extended tail. It may be an imprint of the secondary oxygen atoms generated by charge exchange between ISN hydrogen atoms and oxygen ions in the outer heliosheath.

  6. The Information Content of Discrete Functions and Their Application in Genetic Data Analysis.

    Science.gov (United States)

    Sakhanenko, Nikita A; Kunert-Graf, James; Galas, David J

    2017-12-01

    The complex of central problems in data analysis consists of three components: (1) detecting the dependence of variables using quantitative measures, (2) defining the significance of these dependence measures, and (3) inferring the functional relationships among dependent variables. We have argued previously that an information theory approach allows separation of the detection problem from the inference of functional form problem. We approach here the third component of inferring functional forms based on information encoded in the functions. We present here a direct method for classifying the functional forms of discrete functions of three variables represented in data sets. Discrete variables are frequently encountered in data analysis, both as the result of inherently categorical variables and from the binning of continuous numerical variables into discrete alphabets of values. The fundamental question of how much information is contained in a given function is answered for these discrete functions, and their surprisingly complex relationships are illustrated. The all-important effect of noise on the inference of function classes is found to be highly heterogeneous and reveals some unexpected patterns. We apply this classification approach to an important area of biological data analysis, that of inference of genetic interactions. Genetic analysis provides a rich source of real and complex biological data analysis problems, and our general methods provide an analytical basis and tools for characterizing genetic problems and for analyzing genetic data. We illustrate the functional description and the classes of a number of common genetic interaction modes and also show how different modes vary widely in their sensitivity to noise.

  7. Explorations in statistics: the analysis of ratios and normalized data.

    Science.gov (United States)

    Curran-Everett, Douglas

    2013-09-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of Explorations in Statistics explores the analysis of ratios and normalized (or standardized) data. As researchers, we compute a ratio (a numerator divided by a denominator) to compute a proportion for some biological response or to derive some standardized variable. In each situation, we want to control for differences in the denominator when the thing we really care about is the numerator. But there is peril lurking in a ratio: only if the relationship between numerator and denominator is a straight line through the origin will the ratio be meaningful. If not, the ratio will misrepresent the true relationship between numerator and denominator. In contrast, regression techniques (these include analysis of covariance) are versatile: they can accommodate an analysis of the relationship between numerator and denominator when a ratio is useless.
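
    As a minimal sketch of the point about ratios (the data, the intercept of 10 and the helper name `ols` are illustrative assumptions, not taken from the article), the Python below builds a response that follows a straight line with a non-zero intercept. The ratio then drifts with the denominator even though the underlying relationship is fixed, while a least-squares fit recovers both slope and intercept.

```python
def ols(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: a line that does NOT pass through the origin.
denominator = [1.0, 2.0, 4.0, 8.0, 16.0]
numerator = [10.0 + 2.0 * d for d in denominator]

# The ratio is not constant, so it misrepresents the relationship.
ratios = [n / d for n, d in zip(numerator, denominator)]

# Regression separates the slope from the non-zero intercept.
slope, intercept = ols(denominator, numerator)

print("ratios:", [round(r, 3) for r in ratios])
print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
```

    If the intercept were zero, every ratio would equal the slope and the ratio would be a fair summary.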

  8. Parametric statistical change point analysis

    CERN Document Server

    Chen, Jie

    2000-01-01

    This work is an in-depth study of the change point problem from a general point of view and a further examination of change point analysis of the most commonly used statistical models. Change point problems are encountered in such disciplines as economics, finance, medicine, psychology, signal processing, and geology, to mention only several. The exposition is clear and systematic, with a great deal of introductory material included. Different models are presented in each chapter, including gamma and exponential models, rarely examined thus far in the literature. Other models covered in detail are the multivariate normal, univariate normal, regression, and discrete models. Extensive examples throughout the text emphasize key concepts and different methodologies are used, namely the likelihood ratio criterion, and the Bayesian and information criterion approaches. A comprehensive bibliography and two indices complete the study.

  9. Perceptual and statistical analysis of cardiac phase and amplitude images

    International Nuclear Information System (INIS)

    Houston, A.; Craig, A.

    1991-01-01

    A perceptual experiment was conducted using cardiac phase and amplitude images. Estimates of statistical parameters were derived from the images and the diagnostic potential of human and statistical decisions compared. Five methods were used to generate the images from 75 gated cardiac studies, 39 of which were classified as pathological. The images were presented to 12 observers experienced in nuclear medicine. The observers rated the images using a five-category scale based on their confidence of an abnormality presenting. Circular and linear statistics were used to analyse phase and amplitude image data, respectively. Estimates of mean, standard deviation (SD), skewness, kurtosis and the first term of the spatial correlation function were evaluated in the region of the left ventricle. A receiver operating characteristic analysis was performed on both sets of data and the human and statistical decisions compared. For phase images, circular SD was shown to discriminate better between normal and abnormal than experienced observers, but no single statistic discriminated as well as the human observer for amplitude images. (orig.)
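
    The distinction between circular and linear statistics for phase data can be sketched as follows. This is only an illustration under stated assumptions: the phase values are made up, the function name `circular_mean_sd` is hypothetical, and the circular SD is taken as sqrt(-2 ln R) with R the mean resultant length, which is one common definition and not necessarily the one used in the study.

```python
import math

def circular_mean_sd(angles_deg):
    """Return (mean direction, circular SD), both in degrees."""
    radians = [math.radians(a) for a in angles_deg]
    c = sum(math.cos(r) for r in radians) / len(radians)
    s = sum(math.sin(r) for r in radians) / len(radians)
    # Mean resultant length R lies in [0, 1]; clamp to guard against rounding.
    resultant = min(math.hypot(c, s), 1.0)
    mean_dir = math.degrees(math.atan2(s, c)) % 360.0
    if resultant == 0.0:
        return mean_dir, float("inf")
    circ_sd = math.degrees(math.sqrt(-2.0 * math.log(resultant)))
    return mean_dir, circ_sd

# Hypothetical phase values clustered around 0 degrees (wrapping past 360).
phases = [350.0, 355.0, 5.0, 10.0]
mean_dir, circ_sd = circular_mean_sd(phases)
linear_mean = sum(phases) / len(phases)

print("circular mean:", round(mean_dir, 3), "circular SD:", round(circ_sd, 3))
print("naive linear mean:", linear_mean)
```

    The naive linear mean lands at 180 degrees, opposite to where the phases actually cluster, which is why wrapped quantities need circular estimates.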

  10. SecureMA: protecting participant privacy in genetic association meta-analysis.

    Science.gov (United States)

    Xie, Wei; Kantarcioglu, Murat; Bush, William S; Crawford, Dana; Denny, Joshua C; Heatherly, Raymond; Malin, Bradley A

    2014-12-01

    Sharing genomic data is crucial to support scientific investigation such as genome-wide association studies. However, recent investigations suggest the privacy of the individual participants in these studies can be compromised, leading to serious concerns and consequences, such as overly restricted access to data. We introduce a novel cryptographic strategy to securely perform meta-analysis for genetic association studies in large consortia. Our methodology is useful for supporting joint studies among disparate data sites, where privacy or confidentiality is of concern. We validate our method using three multisite association studies. Our research shows that genetic associations can be analyzed efficiently and accurately across substudy sites, without leaking information on individual participants and site-level association summaries. Our software for secure meta-analysis of genetic association studies, SecureMA, is publicly available at http://github.com/XieConnect/SecureMA. Our customized secure computation framework is also publicly available at http://github.com/XieConnect/CircuitService. © The Author 2014. Published by Oxford University Press.

  11. Distribution of lod scores in oligogenic linkage analysis.

    Science.gov (United States)

    Williams, J T; North, K E; Martin, L J; Comuzzie, A G; Göring, H H; Blangero, J

    2001-01-01

    In variance component oligogenic linkage analysis it can happen that the residual additive genetic variance bounds to zero when estimating the effect of the ith quantitative trait locus. Using quantitative trait Q1 from the Genetic Analysis Workshop 12 simulated general population data, we compare the observed lod scores from oligogenic linkage analysis with the empirical lod score distribution under a null model of no linkage. We find that zero residual additive genetic variance in the null model alters the usual distribution of the likelihood-ratio statistic.
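
    For orientation, the "usual distribution" mentioned above can be sketched in Python. The sketch assumes the standard asymptotic result for a single variance component tested on its boundary, namely a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom; the function name `lod_to_pvalue` is hypothetical. The abstract reports that this reference distribution no longer holds when the residual additive genetic variance is zero in the null model, so the code shows only the baseline case.

```python
import math

def lod_to_pvalue(lod):
    """Asymptotic p-value for a lod score under the usual 50:50 mixture
    of a point mass at zero and a chi-square(1) distribution."""
    if lod <= 0:
        return 1.0
    # Convert the lod score to the likelihood-ratio statistic.
    lrt = 2.0 * math.log(10.0) * lod
    # Upper tail of chi-square(1): P(X > lrt) = erfc(sqrt(lrt / 2)).
    chi2_tail = math.erfc(math.sqrt(lrt / 2.0))
    return 0.5 * chi2_tail

for lod in (1.0, 2.0, 3.0):
    print(lod, lod_to_pvalue(lod))
```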

  12. Analysis of genetic polymorphism and genetic distance among four ...

    African Journals Online (AJOL)

    use

    2011-11-21

    Nov 21, 2011 ... The genomes of 4 sheep populations {Yuanqu white Tan sheep (YWT), Baozhongchang white Tan sheep (BWT), black Tan sheep (BT) and small-tailed Han sheep (Han)} were screened using 10 microsatellite DNA markers to estimate the genetic diversities and genetic distances among these ...

  13. Analysis of the genetic basis of disease in the context of worldwide human relationships and migration.

    Directory of Open Access Journals (Sweden)

    Erik Corona

    2013-05-01

    Genetic diversity across different human populations can enhance understanding of the genetic basis of disease. We calculated the genetic risk of 102 diseases in 1,043 unrelated individuals across 51 populations of the Human Genome Diversity Panel. We found that genetic risk for type 2 diabetes and pancreatic cancer decreased as humans migrated toward East Asia. In addition, biliary liver cirrhosis, alopecia areata, bladder cancer, inflammatory bowel disease, membranous nephropathy, systemic lupus erythematosus, systemic sclerosis, ulcerative colitis, and vitiligo have undergone genetic risk differentiation. This analysis represents a large-scale attempt to characterize genetic risk differentiation in the context of migration. We anticipate that our findings will enable detailed analysis pertaining to the driving forces behind genetic risk differentiation.

  14. Statistical analysis of the count and profitability of air conditioners.

    Science.gov (United States)

    Rady, El Houssainy A; Mohamed, Salah M; Abd Elmegaly, Alaa A

    2018-08-01

    This article presents a statistical analysis of the number and profitability of air conditioners in an Egyptian company. The Kruskal-Wallis test was used to check whether the distribution is the same across the levels of each categorical variable.
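
    A minimal sketch of the Kruskal-Wallis H statistic follows. The three groups are made-up numbers, not the company data, and the function name `kruskal_wallis_h` is hypothetical. The sketch uses midranks for ties but omits the tie correction factor, and the p-value line relies on the chi-square approximation with two degrees of freedom, which applies only because the example has three groups.

```python
import math

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (midranks for ties, no tie correction)."""
    pooled = sorted((value, g) for g, values in enumerate(groups) for value in values)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j; each gets their average.
        midrank = (i + 1 + j) / 2.0
        for k in range(i, j):
            rank_sums[pooled[k][1]] += midrank
        i = j
    total = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)

# Hypothetical measurements for three categories.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_stat = kruskal_wallis_h(groups)
# With 3 groups, df = 2 and the chi-square upper tail is exp(-H / 2).
p_value = math.exp(-h_stat / 2.0)
print("H:", round(h_stat, 3), "approximate p:", round(p_value, 4))
```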

  15. Statistical analysis of subjective preferences for video enhancement

    Science.gov (United States)

    Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli

    2010-02-01

    Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.
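
    To make the idea of a perceptual scale concrete, here is a minimal sketch of Thurstone scaling in its simplest (Case V) form. The preference counts are invented, the function name `thurstone_case_v` is hypothetical, and clipping proportions to [0.01, 0.99] is an assumption made to avoid infinite z-scores. As the abstract notes, a scale of this kind carries no significance test, which is the gap the logistic regression approach is meant to fill.

```python
from statistics import NormalDist

def thurstone_case_v(wins):
    """Scale values from a matrix where wins[i][j] counts how often
    item i was preferred over item j (Thurstone Case V)."""
    n = len(wins)
    inv_cdf = NormalDist().inv_cdf
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue
            p = wins[i][j] / (wins[i][j] + wins[j][i])
            p = min(max(p, 0.01), 0.99)  # assumption: clip extreme proportions
            z_sum += inv_cdf(p)
        scale.append(z_sum / n)
    return scale

# Hypothetical counts: 3 enhancement levels, 20 comparisons per pair.
wins = [
    [0, 6, 3],
    [14, 0, 8],
    [17, 12, 0],
]
scale = thurstone_case_v(wins)
print([round(s, 3) for s in scale])
```

    The scale is arbitrary up to shift and stretch; only the ordering and relative spacing of the items carry meaning.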

  16. Genetic Diversity Analysis of Iranian Jujube Ecotypes (Ziziphus spp.) Using RAPD Molecular Marker

    Directory of Open Access Journals (Sweden)

    S Abbasi

    2012-12-01

    Jujube (Ziziphus jujuba Mill.) is a valuable medicinal plant that is important in Iranian traditional medicine. Although regional plants such as jujube play an important role in our economy, they are neglected in research and technology. Considering the economic and medicinal importance of jujube, the first step in breeding programs is to determine the genetic diversity among individuals. Thirty-four ecotypes of jujube, collected from eight provinces of Iran, were used in this study. The genetic relationships of Iranian jujube ecotypes were analyzed using Random Amplified Polymorphic DNA (RAPD) markers. Six of the 15 random decamer primers applied for RAPD analysis showed informative polymorphism. According to cluster analysis using the UPGMA method, the ecotypes were classified into two major groups at the 0.81 level of genetic similarity. The highest similarity coefficient (0.92) was detected between the Mazandaran and Golestan ecotypes, and the greatest genetic diversity was observed in the ecotypes of Khorasan-Jonoubi. The affinity of the Khorasan-Jonoubi and Esfahan ecotypes indicated a possible common origin for the variation in these areas. Results indicated that RAPD analysis could be successfully used for the estimation of genetic diversity among Ziziphus ecotypes and can be useful for further investigations.
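
    The similarity-then-cluster workflow described above can be sketched in a few lines. Everything specific here is an assumption for illustration: the band profiles are invented, the abstract does not say which similarity coefficient was used (Jaccard is chosen here), and `jaccard_similarity` and `upgma` are hypothetical helper names.

```python
def jaccard_similarity(a, b):
    """Shared bands divided by bands present in either profile."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 1.0

def upgma(names, dist):
    """Average-linkage clustering; dist maps frozenset({a, b}) to a distance.
    Returns the tree as nested tuples."""
    dist = dict(dist)
    sizes = {name: 1 for name in names}
    active = list(names)
    while len(active) > 1:
        # Find the closest pair of current clusters.
        a, b = min(
            ((u, v) for i, u in enumerate(active) for v in active[i + 1:]),
            key=lambda pair: dist[frozenset(pair)],
        )
        merged = (a, b)
        for c in active:
            if c == a or c == b:
                continue
            # Size-weighted average of the distances to the merged members.
            d_ac = dist[frozenset((a, c))]
            d_bc = dist[frozenset((b, c))]
            dist[frozenset((merged, c))] = (
                sizes[a] * d_ac + sizes[b] * d_bc
            ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes[a] + sizes[b]
        active = [c for c in active if c != a and c != b] + [merged]
    return active[0]

# Hypothetical presence/absence band profiles for four ecotypes.
profiles = {
    "A": [1, 1, 0, 1, 0, 1],
    "B": [1, 1, 0, 1, 1, 1],
    "C": [0, 1, 1, 0, 1, 0],
    "D": [0, 1, 1, 0, 0, 0],
}
names = list(profiles)
distances = {
    frozenset((a, b)): 1.0 - jaccard_similarity(profiles[a], profiles[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
tree = upgma(names, distances)
print(tree)
```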

  17. Statistical Analysis of the Exchange Rate of Bitcoin.

    Science.gov (United States)

    Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen

    2015-01-01

    Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
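
    For readers unfamiliar with the quantity being modelled, a log-return is the natural logarithm of the ratio of consecutive prices. The short sketch below uses made-up prices, not actual exchange-rate data, and the function name `log_returns` is hypothetical.

```python
import math

def log_returns(prices):
    """Log-return r_t = ln(P_t / P_{t-1}) for consecutive prices."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]

# Hypothetical daily exchange rates in USD.
prices = [100.0, 110.0, 99.0, 105.0]
returns = log_returns(prices)
print([round(r, 4) for r in returns])

# Log-returns add up: their sum equals the log of the overall price ratio.
print(round(sum(returns), 6), round(math.log(prices[-1] / prices[0]), 6))
```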

  18. Statistical analysis and Monte Carlo simulation of growing self-avoiding walks on percolation

    Energy Technology Data Exchange (ETDEWEB)

    Zhang Yuxia [Department of Physics, Wuhan University, Wuhan 430072 (China); Sang Jianping [Department of Physics, Wuhan University, Wuhan 430072 (China); Department of Physics, Jianghan University, Wuhan 430056 (China); Zou Xianwu [Department of Physics, Wuhan University, Wuhan 430072 (China)]. E-mail: xwzou@whu.edu.cn; Jin Zhunzhi [Department of Physics, Wuhan University, Wuhan 430072 (China)

    2005-09-26

    The two-dimensional growing self-avoiding walk on percolation was investigated by statistical analysis and Monte Carlo simulation. We obtained the expression of the mean square displacement and effective exponent as functions of time and percolation probability by statistical analysis and made a comparison with simulations. We got a reduced time to scale the motion of walkers in growing self-avoiding walks on regular and percolation lattices.
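
    A minimal Monte Carlo sketch of a growing self-avoiding walk on a site-percolation lattice is given below. It is an illustration under stated assumptions, not the authors' simulation: the lattice size, number of walks, step limit and occupation probability are arbitrary, the starting site is forced open, and the names `growing_saw` and `mean_square_displacement` are hypothetical.

```python
import random

def growing_saw(size, p, max_steps, rng):
    """Grow one self-avoiding walk on a size x size site-percolation lattice.
    Each site is open with probability p. At every step the walker moves to
    a uniformly chosen open, unvisited nearest neighbour, and stops when
    none exists or after max_steps moves."""
    open_site = [[rng.random() < p for _ in range(size)] for _ in range(size)]
    x = y = size // 2
    open_site[x][y] = True  # assumption: the start site is always open
    path = [(x, y)]
    visited = {(x, y)}
    for _ in range(max_steps):
        moves = []
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            inside = 0 <= nx < size and 0 <= ny < size
            if inside and open_site[nx][ny] and (nx, ny) not in visited:
                moves.append((nx, ny))
        if not moves:
            break  # the walker is trapped
        x, y = rng.choice(moves)
        visited.add((x, y))
        path.append((x, y))
    return path, open_site

def mean_square_displacement(paths, t):
    """Mean squared distance from the start after t steps, averaged over
    the walks that are still alive at step t."""
    alive = [p for p in paths if len(p) > t]
    if not alive:
        return float("nan")
    return sum(
        (p[t][0] - p[0][0]) ** 2 + (p[t][1] - p[0][1]) ** 2 for p in alive
    ) / len(alive)

rng = random.Random(0)
paths = [growing_saw(41, 0.8, 15, rng)[0] for _ in range(200)]
for t in (1, 5, 10, 15):
    print(t, mean_square_displacement(paths, t))
```

    Averaging only over surviving walks is one possible convention; how trapped walkers are counted changes the estimate at long times.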

  19. General specifications for the development of a USL NASA PC R and D statistical analysis support package

    Science.gov (United States)

    Dominick, Wayne D. (Editor); Bassari, Jinous; Triantafyllopoulos, Spiros

    1984-01-01

    The University of Southwestern Louisiana (USL) NASA PC R and D statistical analysis support package is designed to be a three-level package to allow statistical analysis for a variety of applications within the USL Data Base Management System (DBMS) contract work. The design addresses usage of the statistical facilities as a library package, as an interactive statistical analysis system, and as a batch processing package.

  20. Genetic analysis for two Italian siblings with Usher syndrome and schizophrenia.

    Science.gov (United States)

    Domanico, Daniela; Fragiotta, Serena; Trabucco, Paolo; Nebbioso, Marcella; Vingolo, Enzo Maria

    2012-01-01

    Usher syndrome is a group of autosomal recessive genetic disorders characterized by deafness, retinitis pigmentosa, and sometimes vestibular areflexia. The relationship between Usher syndrome and mental disorders, most commonly a "schizophrenia-like" psychosis, is sometimes described in the literature. The etiology of psychiatric expression of Usher syndrome is still unclear. We reported a case of two natural siblings with congenital hypoacusis, retinitis pigmentosa, and psychiatric symptoms. Clinical features and genetic analysis were also reported. We analyzed possible causes to explain the high prevalence of psychiatric manifestations in Usher syndrome: genetic factors, brain damage, and "stress-related" hypothesis.

  1. Genetic Analysis for Two Italian Siblings with Usher Syndrome and Schizophrenia

    Directory of Open Access Journals (Sweden)

    Daniela Domanico

    2012-01-01

    Usher syndrome is a group of autosomal recessive genetic disorders characterized by deafness, retinitis pigmentosa, and sometimes vestibular areflexia. The relationship between Usher syndrome and mental disorders, most commonly a “schizophrenia-like” psychosis, is sometimes described in the literature. The etiology of psychiatric expression of Usher syndrome is still unclear. We reported a case of two natural siblings with congenital hypoacusis, retinitis pigmentosa, and psychiatric symptoms. Clinical features and genetic analysis were also reported. We analyzed possible causes to explain the high prevalence of psychiatric manifestations in Usher syndrome: genetic factors, brain damage, and “stress-related” hypothesis.

  2. A method for statistical steady state thermal analysis of reactor cores

    International Nuclear Information System (INIS)

    Whetton, P.A.

    1981-01-01

    In a previous publication the author presented a method for undertaking statistical steady state thermal analyses of reactor cores. The present paper extends the technique to an assessment of confidence limits for the resulting probability functions which define the probability that a given thermal response value will be exceeded in a reactor core. Establishing such confidence limits is considered an integral part of any statistical thermal analysis and essential if such analyses are to be considered in any regulatory process. In certain applications the use of a best estimate probability function may be justifiable but it is recognised that a demonstrably conservative probability function is required for any regulatory considerations. (orig.)

  3. A statistical test for outlier identification in data envelopment analysis

    Directory of Open Access Journals (Sweden)

    Morteza Khodabin

    2010-09-01

    In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these “deterministic” frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the presented method, each observation is deleted from the sample once and the resulting linear program is solved, leading to a distribution of efficiency estimates. Based on the achieved distribution, a pared test is designed to identify the potential outlier(s). We illustrate the method through a real data set. The method could be used in a first step, as an exploratory data analysis, before using any frontier estimation.

  4. Longitudinal Analysis of Genetic Susceptibility and BMI Throughout Adult Life.

    Science.gov (United States)

    Song, Mingyang; Zheng, Yan; Qi, Lu; Hu, Frank B; Chan, Andrew T; Giovannucci, Edward L

    2018-02-01

    Little is known about the genetic influence on BMI trajectory throughout adulthood. We created a genetic risk score (GRS) comprising 97 adult BMI-associated variants among 9,971 women and 6,405 men of European ancestry. Serial measures of BMI were assessed from 18 (women) or 21 (men) years to 85 years of age. We also examined BMI change in early (from 18 or 21 to 45 years of age), middle (from 45 to 65 years of age), and late adulthood (from 65 to 80 years of age). GRS was positively associated with BMI across all ages, with stronger associations in women than in men. The associations increased from early to middle adulthood, peaked at 45 years of age in men and at 60 years of age in women (0.91 and 1.35 kg/m² per 10-allele increment, respectively) and subsequently declined in late adulthood. For women, each 10-allele increment in the GRS was associated with an average BMI gain of 0.54 kg/m² in early adulthood, whereas no statistically significant association was found for BMI change in middle or late adulthood or for BMI change in any life period in men. Our findings indicate that genetic predisposition exerts a persistent effect on adiposity throughout adult life and increases early adulthood weight gain in women. © 2017 by the American Diabetes Association.

  5. Radar Derived Spatial Statistics of Summer Rain. Volume 2; Data Reduction and Analysis

    Science.gov (United States)

    Konrad, T. G.; Kropfli, R. A.

    1975-01-01

    Data reduction and analysis procedures are discussed along with the physical and statistical descriptors used. The statistical modeling techniques are outlined and examples of the derived statistical characterization of rain cells in terms of the several physical descriptors are presented. Recommendations concerning analyses which can be pursued using the data base collected during the experiment are included.

  6. Genetics Home Reference: arterial tortuosity syndrome

    Science.gov (United States)

    ... arteries are fixed, the extra length twists and curves. Other blood vessel abnormalities that may occur in ... Information What information about a genetic condition can statistics provide? Why are some genetic conditions more common ...

  7. Genetics Home Reference: 3-M syndrome

    Science.gov (United States)

    ... such as a rounded upper back that also curves to the side (kyphoscoliosis) or exaggerated curvature of ... Information What information about a genetic condition can statistics provide? Why are some genetic conditions more common ...

  8. Genetics Home Reference: congenital contractural arachnodactyly

    Science.gov (United States)

    ... underdeveloped muscles, a rounded upper back that also curves to the side ( kyphoscoliosis ), permanently bent fingers and ... Information What information about a genetic condition can statistics provide? Why are some genetic conditions more common ...

  9. Genetics Home Reference: spondylocarpotarsal synostosis syndrome

    Science.gov (United States)

    ... curved lower back ( lordosis ) and a spine that curves to the side ( scoliosis ). People with spondylocarpotarsal synostosis ... Information What information about a genetic condition can statistics provide? Why are some genetic conditions more common ...

  10. Genetics Home Reference: Freeman-Sheldon syndrome

    Science.gov (United States)

    ... Affected individuals may also have a spine that curves to the side ( scoliosis ). People with Freeman-Sheldon ... Information What information about a genetic condition can statistics provide? Why are some genetic conditions more common ...

  11. Genetic diversity of different accessions of Thymus kotschyanus using RAPD marker

    Directory of Open Access Journals (Sweden)

    Ahmad Ismaili

    2014-11-01

    Analysis of genetic diversity is a major step for understanding evolution and breeding applications. Recent advances in the application of the polymerase chain reaction make it possible to score individuals at a large number of loci. The RAPD technique has been successfully used in a variety of taxonomic and genetic diversity studies. The genetic diversity of 18 accessions of Thymus kotschyanus collected from different districts of Iran is reported in this study, using 30 random amplified polymorphic DNA primers. Multivariate statistical analyses including principal coordinate analysis (PCoA) and cluster analysis were used to group the accessions. From 29 primers, 385 bands were scored, corresponding to an average of 13.27 bands per primer, with 298 bands showing polymorphism (77.40%). A dendrogram constructed based on the UPGMA clustering method revealed three major clusters. Grouping the 18 accessions of T. kotschyanus with the two methods showed that in most cases they produced similar results. This study revealed fairly rich genetic diversity among T. kotschyanus accessions from different regions of Iran. The results showed that the RAPD marker was useful for genetic diversity studies of T. kotschyanus and was indicative of geographical variations.

  12. Role of XPC, XPD, XRCC1, GSTP genetic polymorphisms and Barrett’s esophagus in a cohort of Italian subjects. A neural network analysis

    Directory of Open Access Journals (Sweden)

    Tarlarini C

    2012-08-01

    Claudia Tarlarini,1 Silvana Penco,1 Massimo Conio,2 Enzo Grossi3 On behalf of the Barrett Italian Study Group. 1Department of Laboratory Medicine, Medical Genetics, Niguarda Ca’ Granda Hospital, Milan, Italy; 2Department of Gastroenterology, General Hospital, San Remo, Italy; 3Medical Department, Bracco Imaging SpA, Milan, Italy. Background: Barrett’s esophagus (BE), a metaplastic premalignant disorder, represents the primary risk factor for the development of esophageal adenocarcinoma. Chronic gastroesophageal reflux disease and central obesity have been associated with BE and esophageal adenocarcinoma, but relatively little is known about the specific genes that confer susceptibility to BE carcinogenesis. Methods: A total of 74 patients with BE and 67 controls coming from six gastrointestinal Italian units were evaluated for six polymorphisms in four genes: XPC, XPD nucleotide excision repair (NER) genes, XRCC1 (BER) gene, and glutathione S-transferase P1. Smoking status was analyzed together with the genetic data. Statistical analysis was performed through Artificial Neural Networks. Results: Distributions of sex, smoking history, and polymorphisms among BE cases and controls did not show statistically significant differences. The r-value from linear correlation allowed us to identify possible protective factors as well as possible risk factors. The application of advanced intelligent systems allowed for the selection of a subgroup of nine variables. Artificial Neural Networks applied on the final data set reached mean global accuracy of 60%, reaching as high as 65.88%. Conclusion: We report here results from an exploratory study. Results from this study failed to find an association among the tested single nucleotide polymorphisms and BE phenotype through classical statistical methods. On the contrary, advanced intelligent systems are really able to handle the disease complexity, not treating the data with reductionist approaches unable to detect

  13. Population and genomic lessons from genetic analysis of two Indian populations.

    Science.gov (United States)

    Juyal, Garima; Mondal, Mayukh; Luisi, Pierre; Laayouni, Hafid; Sood, Ajit; Midha, Vandana; Heutink, Peter; Bertranpetit, Jaume; Thelma, B K; Casals, Ferran

    2014-10-01

    Indian demographic history includes special features such as founder effects, interpopulation segregation, complex social structure with a caste system and elevated frequency of consanguineous marriages. It also presents a higher frequency for some rare mendelian disorders and in the last two decades increased prevalence of some complex disorders. Despite the fact that India represents about one-sixth of the human population, deep genetic studies from this terrain have been scarce. In this study, we analyzed high-density genotyping and whole-exome sequencing data of a North and a South Indian population. Indian populations show higher differentiation levels than those reported between populations of other continents. In this work, we have analyzed its consequences, by specifically assessing the transferability of genetic markers from or to Indian populations. We show that there is limited genetic marker portability from available genetic resources such as HapMap or the 1,000 Genomes Project to Indian populations, which also present an excess of private rare variants. Conversely, tagSNPs show a high level of portability between the two Indian populations, in contrast to the common belief that North and South Indian populations are genetically very different. By estimating kinship from mates and consanguinity in our data from trios, we also describe different patterns of assortative mating and inbreeding in the two populations, in agreement with distinct mating preferences and social structures. In addition, this analysis has allowed us to describe genomic regions under recent adaptive selection, indicating differential adaptive histories for North and South Indian populations. Our findings highlight the importance of considering demography for design and analysis of genetic studies, as well as the need for extending human genetic variation catalogs to new populations and particularly to those with particular demographic histories.

  14. SSR Analysis of Genetic Diversity Among 192 Diploid Potato Cultivars

    Directory of Open Access Journals (Sweden)

    Xiaoyan Song

    2016-05-01

    In potato breeding, it is difficult to improve the traits of interest at the tetraploid level due to the tetrasomic inheritance. A promising alternative is diploid breeding. Thus it is necessary to assess the genetic diversity of diploid potato germplasm for efficient exploration and deployment of desirable traits. In this study, we used SSR markers to evaluate the genetic diversity of diploid potato cultivars. To screen polymorphic SSR markers, 55 pairs of SSR primers were employed to amplify 39 cultivars with relatively distant genetic relationships. Among them, 12 SSR markers with high polymorphism located at 12 chromosomes were chosen to evaluate the genetic diversity of 192 diploid potato cultivars. The primers produced 6 to 18 bands with an average of 8.2 bands per primer. In total, 98 bands were amplified from 192 cultivars, and 97 of them were polymorphic. Cluster analysis using UPGMA showed the genetic relationships of all accessions tested: 186 of the 192 accessions could be distinguished by only 12 pairs of SSR primers, and the 192 diploid cultivars were divided into 11 groups, and 83.3% constituted the first group. Clustering results showed relatively low genetic diversity among 192 diploid cultivars, with closer relationship at the molecular level. The results can provide molecular basis for diploid potato breeding.

  15. Diagnostic and therapeutic implications of genetic heterogeneity in myeloid neoplasms uncovered by comprehensive mutational analysis

    Directory of Open Access Journals (Sweden)

    Sarah M. Choi

    2017-01-01

    While growing use of comprehensive mutational analysis has led to the discovery of innumerable genetic alterations associated with various myeloid neoplasms, the under-recognized phenomenon of genetic heterogeneity within such neoplasms creates a potential for diagnostic confusion. Here, we describe two cases where expanded mutational testing led to amendment of an initial diagnosis of chronic myelogenous leukemia with subsequent altered treatment of each patient. We demonstrate the power of comprehensive testing in ensuring appropriate classification of genetically heterogeneous neoplasms, and emphasize thoughtful analysis of molecular and genetic data as an essential component of diagnosis and management.

  16. Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance

    Science.gov (United States)

    Glascock, M. D.; Neff, H.; Vaughn, K. J.

    2004-06-01

    The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.

  17. Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance

    International Nuclear Information System (INIS)

    Glascock, M. D.; Neff, H.; Vaughn, K. J.

    2004-01-01

    The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.

  18. Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance

    Energy Technology Data Exchange (ETDEWEB)

    Glascock, M. D.; Neff, H. [University of Missouri, Research Reactor Center (United States)]; Vaughn, K. J. [Pacific Lutheran University, Department of Anthropology (United States)]

    2004-06-15

    The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.

  19. Genetic divergence of rubber tree estimated by multivariate techniques and microsatellite markers

    Directory of Open Access Journals (Sweden)

    Lígia Regina Lima Gouvêa

    2010-01-01

    Genetic diversity of 60 Hevea genotypes, consisting of Asiatic, Amazonian, African and IAC clones, and pertaining to the genetic breeding program of the Agronomic Institute (IAC), Brazil, was estimated. Analyses were based on phenotypic multivariate parameters and microsatellites. Five agronomic descriptors were employed in multivariate procedures, such as Standard Euclidean Distance, Tocher clustering and principal component analysis. Genetic variability among the genotypes was estimated with 68 selected polymorphic SSRs, by way of Modified Rogers Genetic Distance and UPGMA clustering. Structure software in a Bayesian approach was used in discriminating among groups. Genetic diversity was estimated through Nei's statistics. The genotypes were clustered into 12 groups according to the Tocher method, while the molecular analysis identified six groups. In the phenotypic and microsatellite analyses, the Amazonian and IAC genotypes were distributed in several groups, whereas the Asiatic were in only a few. Observed heterozygosity ranged from 0.05 to 0.96. Both high total diversity (HT' = 0.58) and high gene differentiation (Gst' = 0.61) were observed, indicating high genetic variation among the 60 genotypes, which may be useful for breeding programs. The analyzed agronomic parameters and SSR markers were effective in assessing genetic diversity among Hevea genotypes, besides proving to be useful for characterizing genetic variability.
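
    The abstract reports Nei's diversity statistics without defining them. As a minimal sketch, assuming the basic (uncorrected) single-locus definitions, where H_S is the mean expected heterozygosity within groups, H_T is the expected heterozygosity of the averaged allele frequencies, and G_ST = (H_T - H_S) / H_T, the calculation might look like the following. The function names and the toy allele frequencies are illustrative only and are not taken from the study; the primed values quoted in the abstract may come from a corrected estimator that is not reproduced here.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one locus."""
    return 1.0 - sum(p * p for p in freqs)


def nei_gst(group_freqs):
    """Return (H_S, H_T, G_ST) for one locus.

    group_freqs: list of allele-frequency lists, one per group,
    all in the same allele order. Groups are weighted equally
    (an assumption of this sketch).
    """
    n_groups = len(group_freqs)
    n_alleles = len(group_freqs[0])
    # Mean within-group expected heterozygosity.
    h_s = sum(expected_heterozygosity(f) for f in group_freqs) / n_groups
    # Expected heterozygosity of the averaged allele frequencies.
    mean_freqs = [
        sum(f[i] for f in group_freqs) / n_groups for i in range(n_alleles)
    ]
    h_t = expected_heterozygosity(mean_freqs)
    return h_s, h_t, (h_t - h_s) / h_t


# Toy data: two groups fixed for different alleles (complete differentiation).
h_s_fixed, h_t_fixed, gst_fixed = nei_gst([[1.0, 0.0], [0.0, 1.0]])
# Toy data: two groups with identical frequencies (no differentiation).
h_s_same, h_t_same, gst_same = nei_gst([[0.5, 0.5], [0.5, 0.5]])
```

    In the first toy case all diversity lies between groups, so G_ST is 1; in the second the groups are identical, so G_ST is 0.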

  20. From sexless to sexy: Why it is time for human genetics to consider and report analyses of sex.

    Science.gov (United States)

    Powers, Matthew S; Smith, Phillip H; McKee, Sherry A; Ehringer, Marissa A

    2017-01-01

    Science has come a long way with regard to the consideration of sex differences in clinical and preclinical research, but one field remains behind the curve: human statistical genetics. The goal of this commentary is to raise awareness and discussion about how to best consider and evaluate possible sex effects in the context of large-scale human genetic studies. Over the course of this commentary, we reinforce the importance of interpreting genetic results in the context of biological sex, establish evidence that sex differences are not being considered in human statistical genetics, and discuss how best to conduct and report such analyses. Our recommendation is to run stratified analyses by sex no matter the sample size or the result and report the findings. Summary statistics from stratified analyses are helpful for meta-analyses, and patterns of sex-dependent associations may be hidden in a combined dataset. In the age of declining sequencing costs, large consortia efforts, and a number of useful control samples, it is now time for the field of human genetics to appropriately include sex in the design, analysis, and reporting of results.

  1. Statistical analysis and data management

    International Nuclear Information System (INIS)

    Anon.

    1981-01-01

    This report provides an overview of the history of the WIPP Biology Program. The recommendations of the American Institute of Biological Sciences (AIBS) for the WIPP biology program are summarized. The data sets available for statistical analyses and problems associated with these data sets are also summarized. Biological studies base maps are presented. A statistical model is presented to evaluate any correlation between climatological data and small mammal captures. No statistically significant relationship was found between variance in small mammal captures on Dr. Gennaro's 90m x 90m grid and precipitation records from the Duval Potash Mine.

  2. Detecting errors in micro and trace analysis by using statistics

    DEFF Research Database (Denmark)

    Heydorn, K.

    1993-01-01

    By assigning a standard deviation to each step in an analytical method it is possible to predict the standard deviation of each analytical result obtained by this method. If the actual variability of replicate analytical results agrees with the expected, the analytical method is said...... to be in statistical control. Significant deviations between analytical results from different laboratories reveal the presence of systematic errors, and agreement between different laboratories indicates the absence of systematic errors. This statistical approach, referred to as the analysis of precision, was applied...
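
    The idea of predicting a result's standard deviation from its steps and then checking replicates against that prediction can be sketched as follows. This is a hedged illustration, not the author's procedure: it assumes the steps contribute independent errors (so their variances add) and that duplicates share the same predicted standard deviation. The function names and numbers are hypothetical.

```python
import math


def predicted_sd(step_sds):
    """Combine per-step standard deviations in quadrature.

    Assumes the error contributions of the steps are independent.
    """
    return math.sqrt(sum(s * s for s in step_sds))


def precision_statistic(duplicate_pairs, sd):
    """Sum of squared duplicate differences scaled by their expected variance.

    For two results with the same standard deviation sd, the difference has
    variance 2 * sd**2. If the method is in statistical control, the returned
    value behaves roughly like a chi-square variable with one degree of
    freedom per pair, so it should be of the same order as the pair count.
    """
    return sum((a - b) ** 2 / (2.0 * sd * sd) for a, b in duplicate_pairs)


# Hypothetical steps with standard deviations 3 and 4 (same units).
sd_total = predicted_sd([3.0, 4.0])
# Hypothetical duplicate results checked against a predicted sd of 1.
t_value = precision_statistic([(10.0, 12.0)], 1.0)
```

    A statistic far above the number of pairs would suggest that the observed variability exceeds the prediction, which is the situation the abstract describes as a loss of statistical control.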

  3. Genetic architecture of wood properties based on association analysis and co-expression networks in white spruce.

    Science.gov (United States)

    Lamara, Mebarek; Raherison, Elie; Lenz, Patrick; Beaulieu, Jean; Bousquet, Jean; MacKay, John

    2016-04-01

    Association studies are widely utilized to analyze complex traits but their ability to disclose genetic architectures is often limited by statistical constraints, and functional insights are usually minimal in nonmodel organisms like forest trees. We developed an approach to integrate association mapping results with co-expression networks. We tested single nucleotide polymorphisms (SNPs) in 2652 candidate genes for statistical associations with wood density, stiffness, microfibril angle and ring width in a population of 1694 white spruce trees (Picea glauca). Associations mapping identified 229-292 genes per wood trait using a statistical significance level of P wood associated genes and several known MYB and NAC regulators were identified as network hubs. The network revealed a link between the gene PgNAC8, wood stiffness and microfibril angle, as well as considerable within-season variation for both genetic control of wood traits and gene expression. Trait associations were distributed throughout the network suggesting complex interactions and pleiotropic effects. Our findings indicate that integration of association mapping and co-expression networks enhances our understanding of complex wood traits. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  4. Genetics Home Reference: juvenile primary osteoporosis

    Science.gov (United States)

    ... bones (decreased bone mineral density), which makes the bones brittle and prone to fracture. Affected individuals often have ... information about a genetic condition can statistics provide? Why are some genetic conditions more common in particular ...

  5. Genetics Home Reference: lysinuric protein intolerance

    Science.gov (United States)

    ... stature, muscle weakness, impaired immune function, and progressively brittle bones that are prone to fracture (osteoporosis). A lung ... information about a genetic condition can statistics provide? Why are some genetic conditions more common in particular ...

  6. Genetics Home Reference: Gaucher disease

    Science.gov (United States)

    ... 500 to 1,000 people of Ashkenazi Jewish heritage. The other forms of Gaucher disease are uncommon and do not occur more frequently in people of Ashkenazi Jewish descent. Related Information What information about a genetic condition can statistics provide? Why are some genetic ...

  7. Statistical analysis of the BOIL program in RSYST-III

    International Nuclear Information System (INIS)

    Beck, W.; Hausch, H.J.

    1978-11-01

    The paper describes a statistical analysis in the RSYST-III program system. Using the example of the BOIL program, it is shown how the effects of inaccurate input data on the output data can be discovered. The existing possibilities of data generation, data handling, and data evaluation are outlined. (orig.) [de]

  8. Multivariate statistical analysis of precipitation chemistry in Northwestern Spain

    International Nuclear Information System (INIS)

    Prada-Sanchez, J.M.; Garcia-Jurado, I.; Gonzalez-Manteiga, W.; Fiestras-Janeiro, M.G.; Espada-Rios, M.I.; Lucas-Dominguez, T.

    1993-01-01

    149 samples of rainwater were collected in the proximity of a power station in northwestern Spain at three rainwater monitoring stations. The resulting data are analyzed using multivariate statistical techniques. Firstly, the Principal Component Analysis shows that there are three main sources of pollution in the area (a marine source, a rural source and an acid source). The impact from pollution from these sources on the immediate environment of the stations is studied using Factorial Discriminant Analysis. 8 refs., 7 figs., 11 tabs

  9. Multivariate statistical analysis of precipitation chemistry in Northwestern Spain

    Energy Technology Data Exchange (ETDEWEB)

    Prada-Sanchez, J.M.; Garcia-Jurado, I.; Gonzalez-Manteiga, W.; Fiestras-Janeiro, M.G.; Espada-Rios, M.I.; Lucas-Dominguez, T. (University of Santiago, Santiago (Spain). Faculty of Mathematics, Dept. of Statistics and Operations Research)

    1993-07-01

    149 samples of rainwater were collected in the proximity of a power station in northwestern Spain at three rainwater monitoring stations. The resulting data are analyzed using multivariate statistical techniques. Firstly, the Principal Component Analysis shows that there are three main sources of pollution in the area (a marine source, a rural source and an acid source). The impact from pollution from these sources on the immediate environment of the stations is studied using Factorial Discriminant Analysis. 8 refs., 7 figs., 11 tabs.

  10. SWToolbox: A surface-water tool-box for statistical analysis of streamflow time series

    Science.gov (United States)

    Kiang, Julie E.; Flynn, Kate; Zhai, Tong; Hummel, Paul; Granato, Gregory

    2018-03-07

    This report is a user guide for the low-flow analysis methods provided with version 1.0 of the Surface Water Toolbox (SWToolbox) computer program. The software combines functionality from two software programs—U.S. Geological Survey (USGS) SWSTAT and U.S. Environmental Protection Agency (EPA) DFLOW. Both of these programs have been used primarily for computation of critical low-flow statistics. The main analysis methods are the computation of hydrologic frequency statistics such as the 7-day minimum flow that occurs on average only once every 10 years (7Q10), computation of design flows including biologically based flows, and computation of flow-duration curves and duration hydrographs. Other annual, monthly, and seasonal statistics can also be computed. The interface facilitates retrieval of streamflow discharge data from the USGS National Water Information System and outputs text reports for a record of the analysis. Tools for graphing data and screening tests are available to assist the analyst in conducting the analysis.

  11. Statistics and Biomedical Informatics in Forensic Sciences

    Czech Academy of Sciences Publication Activity Database

    Zvárová, Jana

    2009-01-01

    Vol. 20, No. 6 (2009), pp. 743-750 ISSN 1180-4009. [TIES 2007. Annual Meeting of the International Environmental Society /18./. Mikulov, 16.08.2007-20.08.2007] Institutional research plan: CEZ:AV0Z10300504 Keywords: biomedical informatics * biomedical statistics * genetic information * forensic dentistry Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 1.000, year: 2009

  12. Genetic diversity analysis of Jatropha curcas L. (Euphorbiaceae) based on methylation-sensitive amplification polymorphism.

    Science.gov (United States)

    Kanchanaketu, T; Sangduen, N; Toojinda, T; Hongtrakul, V

    2012-04-13

    Genetic analysis of 56 samples of Jatropha curcas L. collected from Thailand and other countries was performed using the methylation-sensitive amplification polymorphism (MSAP) technique. Nine primer combinations were used to generate MSAP fingerprints. When the data were interpreted as amplified fragment length polymorphism (AFLP) markers, 471 markers were scored. All 56 samples were classified into three major groups: γ-irradiated, non-toxic and toxic accessions. Genetic similarity among the samples was extremely high, ranging from 0.95 to 1.00, which indicated very low genetic diversity in this species. The MSAP fingerprint was further analyzed for DNA methylation polymorphisms. The results revealed differences in the DNA methylation level among the samples. However, the samples collected from saline areas and some species hybrids showed specific DNA methylation patterns. AFLP data were used, together with methylation-sensitive AFLP (MS-AFLP) data, to construct a phylogenetic tree, resulting in higher efficiency to distinguish the samples. This combined analysis separated samples previously grouped in the AFLP analysis. This analysis also distinguished some hybrids. Principal component analysis was also performed; the results confirmed the separation in the phylogenetic tree. Some polymorphic bands, involving both nucleotide and DNA methylation polymorphism, that differed between toxic and non-toxic samples were identified, cloned and sequenced. BLAST analysis of these fragments revealed differences in DNA methylation in some known genes and nucleotide polymorphism in chloroplast DNA. We conclude that MSAP is a powerful technique for the study of genetic diversity for organisms that have a narrow genetic base.

  13. Characterization of genetic diversity of native 'Ancho' chili populations of Mexico using microsatellite markers

    Directory of Open Access Journals (Sweden)

    Rocío Toledo-Aguilar

    2016-03-01

    'Ancho' type chilis (Capsicum annuum L. var. annuum) are an important ingredient in the traditional cuisine of Mexico and so are in high demand. They include six native sub-types with morphological and fruit color differences. However, the genetic diversity of the set of these sub-types has not been determined. The objective of this study was to characterize the genetic diversity of native Mexican ancho chili populations using microsatellites and to determine the relationship among these populations. Twenty-four microsatellite loci were used to analyze 38 native populations of 'Ancho' chilis collected in seven states of Mexico; three populations different from the ancho type ('Piquin', 'Guajillo', and 'Chilaca') and three hybrids (Capulin, Abedul, and green pepper) were included as controls. The number of alleles per locus, number and percentage of polymorphic loci, polymorphic information content (PIC), expected heterozygosity, and Wright F statistics were obtained. Moreover, an analysis of principal components and a cluster analysis were carried out. We detected 220 alleles, with an average of 9.2 alleles per locus; PIC varied between 0.07 and 1, and expected heterozygosity was between 0.36 and 0.59. We also identified 59 unique alleles and eight alleles common to all of the populations. The F statistics revealed broad genetic differentiation among populations. Both the analysis of principal components and the cluster analysis were able to separate the populations by origin (southern, central, and northern Mexico). The broad genetic diversity detected in the native ancho chili populations of Mexico was found in greater proportion within the populations than between populations.

  14. Anomalous heat transfer modes of nanofluids: a review based on statistical analysis

    Science.gov (United States)

    2011-01-01

    This paper contains the results of a concise statistical review analysis of a large number of publications regarding the anomalous heat transfer modes of nanofluids. The application of nanofluids as coolants is a novel practice with no established physical foundations explaining the observed anomalous heat transfer. As a consequence, traditional methods of performing a literature review may not be adequate in presenting objectively the results representing the bulk of the available literature. The current literature review analysis aims to resolve the problems faced by researchers in the past by employing an unbiased statistical analysis to present and reveal the current trends and general belief of the scientific community regarding the anomalous heat transfer modes of nanofluids. The thermal performance analysis indicated that statistically there exists a variable enhancement for conduction, convection/mixed heat transfer, pool boiling heat transfer and critical heat flux modes. The most popular proposed mechanisms in the literature to explain heat transfer in nanofluids are revealed, as well as possible trends between nanofluid properties and thermal performance. The review also suggests future experimentation to provide more conclusive answers to the control mechanisms and influential parameters of heat transfer in nanofluids. PMID:21711932

  15. Anomalous heat transfer modes of nanofluids: a review based on statistical analysis

    Science.gov (United States)

    Sergis, Antonis; Hardalupas, Yannis

    2011-05-01

    This paper contains the results of a concise statistical review analysis of a large number of publications regarding the anomalous heat transfer modes of nanofluids. The application of nanofluids as coolants is a novel practice with no established physical foundations explaining the observed anomalous heat transfer. As a consequence, traditional methods of performing a literature review may not be adequate in presenting objectively the results representing the bulk of the available literature. The current literature review analysis aims to resolve the problems faced by researchers in the past by employing an unbiased statistical analysis to present and reveal the current trends and general belief of the scientific community regarding the anomalous heat transfer modes of nanofluids. The thermal performance analysis indicated that statistically there exists a variable enhancement for conduction, convection/mixed heat transfer, pool boiling heat transfer and critical heat flux modes. The most popular proposed mechanisms in the literature to explain heat transfer in nanofluids are revealed, as well as possible trends between nanofluid properties and thermal performance. The review also suggests future experimentation to provide more conclusive answers to the control mechanisms and influential parameters of heat transfer in nanofluids.

  16. Anomalous heat transfer modes of nanofluids: a review based on statistical analysis

    Directory of Open Access Journals (Sweden)

    Sergis Antonis

    2011-01-01

    This paper contains the results of a concise statistical review analysis of a large number of publications regarding the anomalous heat transfer modes of nanofluids. The application of nanofluids as coolants is a novel practice with no established physical foundations explaining the observed anomalous heat transfer. As a consequence, traditional methods of performing a literature review may not be adequate in presenting objectively the results representing the bulk of the available literature. The current literature review analysis aims to resolve the problems faced by researchers in the past by employing an unbiased statistical analysis to present and reveal the current trends and general belief of the scientific community regarding the anomalous heat transfer modes of nanofluids. The thermal performance analysis indicated that statistically there exists a variable enhancement for conduction, convection/mixed heat transfer, pool boiling heat transfer and critical heat flux modes. The most popular proposed mechanisms in the literature to explain heat transfer in nanofluids are revealed, as well as possible trends between nanofluid properties and thermal performance. The review also suggests future experimentation to provide more conclusive answers to the control mechanisms and influential parameters of heat transfer in nanofluids.

  17. Modeling of asphalt-rubber rotational viscosity by statistical analysis and neural networks

    Directory of Open Access Journals (Sweden)

    Luciano Pivoto Specht

    2007-03-01

    It is of great importance to know binder viscosity in order to carry out the handling, mixing, application and compaction of asphalt mixes in highway surfacing. This paper presents the results of viscosity measurements on asphalt-rubber binders prepared in the laboratory. The binders were prepared by varying the rubber content, rubber particle size, and duration and temperature of mixing, all following a statistical design plan. Statistical analysis and artificial neural networks were used to create mathematical models for prediction of binder viscosity. The comparison between experimental data and results simulated with the generated models showed that the neural network analysis performed better than the statistical models. The results indicated that the rubber content and duration of mixing have a major influence on the observed viscosity over the considered range of parameter variation.

  18. Genetic analysis on three South Indian sympatric hipposiderid bats (Chiroptera, Hipposideridae)

    Directory of Open Access Journals (Sweden)

    Kanagaraj, C

    2010-12-01

    In mitochondrial DNA, variations in the sequence of the 16S rRNA region were analyzed to infer the genetic relationship and population history of three sympatric hipposiderid bats, Hipposideros speoris, H. fulvus and H. ater. Based on the DNA sequence data, we observed relatively lower haplotype and higher nucleotide diversity in H. speoris than in the other two species. The pairwise comparisons of the genetic divergence inferred a genetic relationship between the three hipposiderid bats. We used haplotype sequences to construct a phylogenetic tree. Maximum parsimony and Bayesian inference analysis generated a tree with similar topology. H. fulvus and H. ater formed one cluster and H. speoris formed another cluster. Analysis of the demographic history of populations using Tajima's D test revealed past changes in populations. Comparison of the observed distribution of pairwise nucleotide differences with that expected under a sudden expansion model supported the model for H. fulvus and H. ater but not for H. speoris populations.

  19. Characterizing the genetic influences on risk aversion.

    Science.gov (United States)

    Harrati, Amal

    2014-01-01

    Risk aversion has long been cited as an important factor in retirement decisions, investment behavior, and health. Some of the heterogeneity in individual risk tolerance is well understood, reflecting age gradients, wealth gradients, and similar effects, but much remains unexplained. This study explores genetic contributions to heterogeneity in risk aversion among older Americans. Using over 2 million genetic markers per individual from the U.S. Health and Retirement Study, I report results from a genome-wide association study (GWAS) on risk preferences using a sample of 10,455 adults. None of the single-nucleotide polymorphisms (SNPs) are found to be statistically significant determinants of risk preferences at levels stricter than 5 × 10⁻⁸. These results suggest that risk aversion is a complex trait that is highly polygenic. The analysis leads to upper bounds on the number of genetic effects that could exceed certain thresholds of significance and still remain undetected at the current sample size. The findings suggest that the known heritability in risk aversion is likely to be driven by large numbers of genetic variants, each with a small effect size.
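
    The 5 × 10⁻⁸ level quoted in the abstract matches the conventional genome-wide significance threshold, which is commonly motivated as a Bonferroni correction of a 0.05 error rate over roughly one million independent tests. The sketch below illustrates only that arithmetic; the function names and the p-values are hypothetical and are not taken from the study.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise rate at alpha."""
    return alpha / n_tests


def significant_hits(p_values, threshold):
    """Return the p-values that fall strictly below the threshold."""
    return [p for p in p_values if p < threshold]


# Assumed convention: about one million independent common-variant tests.
gwas_threshold = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical p-values: none reaches genome-wide significance.
hits = significant_hits([3e-6, 4e-7, 9e-8], gwas_threshold)
```

    With these made-up values the list of hits is empty, which mirrors the kind of null result the abstract reports.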

  20. Common pitfalls in statistical analysis: Odds versus risk

    Science.gov (United States)

    Ranganathan, Priya; Aggarwal, Rakesh; Pramesh, C. S.

    2015-01-01

    In biomedical research, we are often interested in quantifying the relationship between an exposure and an outcome. “Odds” and “Risk” are the most common terms which are used as measures of association between variables. In this article, which is the fourth in the series of common pitfalls in statistical analysis, we explain the meaning of risk and odds and the difference between the two. PMID:26623395
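
    The distinction between risk and odds is easiest to see on a 2 x 2 table. The following is a minimal sketch using the standard definitions (risk = events / total, odds = events / non-events); the function names and the counts are invented for illustration and do not come from the article.

```python
def risk(events, total):
    """Probability of the outcome: events divided by everyone at risk."""
    return events / total


def odds(events, total):
    """Outcome events divided by non-events."""
    return events / (total - events)


# Hypothetical cohort: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
risk_exposed = risk(20, 100)
risk_unexposed = risk(10, 100)
risk_ratio = risk_exposed / risk_unexposed

odds_exposed = odds(20, 100)
odds_unexposed = odds(10, 100)
odds_ratio = odds_exposed / odds_unexposed
```

    Here the risk ratio is 2 while the odds ratio is 2.25. The two measures diverge as the outcome becomes more common and are close only when the outcome is rare.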