WorldWideScience

Sample records for haplotype inference problem

  1. Haplotyping Problem, A Clustering Approach

    International Nuclear Information System (INIS)

    Eslahchi, Changiz; Sadeghi, Mehdi; Pezeshk, Hamid; Kargar, Mehdi; Poormohammadi, Hadi

    2007-01-01

    Construction of two haplotypes from a set of Single Nucleotide Polymorphism (SNP) fragments is called the haplotype reconstruction problem. One of the most popular computational models for this problem is Minimum Error Correction (MEC). Since MEC is an NP-hard problem, here we propose a novel heuristic algorithm based on clustering analysis in data mining for the haplotype reconstruction problem. Based on Hamming distance and similarity between two fragments, our iterative algorithm produces two clusters of fragments; then, in each iteration, the algorithm assigns a fragment to one of the clusters. Our results suggest that the algorithm has a lower reconstruction error rate than other algorithms.
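
    The abstract above does not give the algorithm's details, so the following is only a minimal sketch of the general idea (two clusters, Hamming distance, iterative reassignment), not the authors' method. The fragment encoding ('0'/'1' alleles, '-' for uncovered sites), the seeding rule, and the function names are all assumptions made for illustration.

```python
from typing import List, Tuple

GAP = "-"  # assumed symbol for a site the fragment does not cover


def hamming(a: str, b: str) -> int:
    """Count mismatches over sites covered by both strings (gaps are skipped)."""
    return sum(1 for x, y in zip(a, b) if x != GAP and y != GAP and x != y)


def consensus(cluster: List[str], length: int) -> str:
    """Majority vote per site; ties and uncovered sites default to '0'."""
    out = []
    for j in range(length):
        ones = sum(1 for f in cluster if f[j] == "1")
        zeros = sum(1 for f in cluster if f[j] == "0")
        out.append("1" if ones > zeros else "0")
    return "".join(out)


def two_cluster(fragments: List[str], max_iter: int = 20) -> Tuple[str, str, int]:
    """Heuristic 2-clustering of fragments; returns both haplotypes and the MEC score."""
    length = len(fragments[0])
    # Assumed seeding: the first fragment and the fragment farthest from it.
    h1 = fragments[0]
    h2 = max(fragments, key=lambda f: hamming(h1, f))
    for _ in range(max_iter):
        # Assign each fragment to the closer of the two current centres.
        c1 = [f for f in fragments if hamming(f, h1) <= hamming(f, h2)]
        c2 = [f for f in fragments if hamming(f, h1) > hamming(f, h2)]
        n1, n2 = consensus(c1, length), consensus(c2, length)
        if (n1, n2) == (h1, h2):
            break  # centres stopped changing
        h1, h2 = n1, n2
    # MEC score: corrections needed to make every fragment fit its closer haplotype.
    mec = sum(min(hamming(f, h1), hamming(f, h2)) for f in fragments)
    return h1, h2, mec


frags = ["0101-", "-1011", "01111", "1010-", "-0100", "10100"]
print(two_cluster(frags))
```

    Like any local-search heuristic of this kind, the result depends on the seeds, so it gives no optimality guarantee for MEC.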

  2. Haplotype inference in general pedigrees with two sites

    Directory of Open Access Journals (Sweden)

    Doan Duong D

    2011-04-01

    Background: Genetic disease studies investigate relationships between changes in chromosomes and genetic diseases. Single haplotypes provide useful information for these studies but extracting single haplotypes directly by biochemical methods is expensive. A computational method to infer haplotypes from genotype data is therefore important. We investigate the problem of computing the minimum number of recombination events for general pedigrees with two sites for all members. Results: We show that this NP-hard problem can be parametrically reduced to the Bipartization by Edge Removal problem and therefore can be solved by an O(2^k · n^2) exact algorithm, where n is the number of members and k is the number of recombination events. Conclusions: Our work can therefore be useful for genetic disease studies to track down how changes in haplotypes such as recombinations relate to genetic disease.
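
    To make the target problem concrete, the sketch below states Bipartization by Edge Removal on a small graph: find the fewest edges whose deletion leaves a bipartite graph. It uses exhaustive search, which is exponential in the number of edges and is not the parameterized algorithm referred to in the abstract; the reduction from pedigrees to graphs is not shown, and all function names are illustrative.

```python
from collections import deque
from itertools import combinations
from typing import List, Tuple

Edge = Tuple[int, int]


def is_bipartite(n: int, edges: List[Edge]) -> bool:
    """2-colour the graph by breadth-first search; fail on a same-colour edge."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # odd cycle found
    return True


def min_edge_bipartization(n: int, edges: List[Edge]) -> Tuple[int, List[Edge]]:
    """Brute force: try removal sets of increasing size k until the rest is bipartite."""
    for k in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), k):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if is_bipartite(n, kept):
                return k, [edges[i] for i in removed]
    return len(edges), list(edges)  # unreachable: the empty graph is bipartite


# A triangle with one pendant vertex needs exactly one edge removed.
print(min_edge_bipartization(4, [(0, 1), (1, 2), (2, 0), (2, 3)]))
```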

  3. Grouping preprocess for haplotype inference from SNP and CNV data

    International Nuclear Information System (INIS)

    Shindo, Hiroyuki; Chigira, Hiroshi; Nagaoka, Tomoyo; Inoue, Masato; Kamatani, Naoyuki

    2009-01-01

    The method of statistical haplotype inference is an indispensable technique in the field of medical science. The authors previously reported Hardy-Weinberg equilibrium-based haplotype inference that could manage single nucleotide polymorphism (SNP) data. We recently extended the method to cover copy number variation (CNV) data. Haplotype inference from mixed data is important because SNPs and CNVs are occasionally in linkage disequilibrium. The idea underlying the proposed method is simple, but the algorithm for it needs to be quite elaborate to reduce the calculation cost. Consequently, we have focused on the details of the algorithm in this study. Although the main advantage of the method is accuracy, in that it does not use any approximation, its main disadvantage is still the calculation cost, which is sometimes intractable for large data sets with missing values.

  4. Grouping preprocess for haplotype inference from SNP and CNV data

    Energy Technology Data Exchange (ETDEWEB)

    Shindo, Hiroyuki; Chigira, Hiroshi; Nagaoka, Tomoyo; Inoue, Masato [Department of Electrical Engineering and Bioscience, School of Advanced Science and Engineering, Waseda University, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555 (Japan)]; Kamatani, Naoyuki, E-mail: masato.inoue@eb.waseda.ac.j [Institute of Rheumatology, Tokyo Women's Medical University, 10-22, Kawada-cho, Shinjuku-ku, Tokyo 162-0054 (Japan)]

    2009-12-01

    The method of statistical haplotype inference is an indispensable technique in the field of medical science. The authors previously reported Hardy-Weinberg equilibrium-based haplotype inference that could manage single nucleotide polymorphism (SNP) data. We recently extended the method to cover copy number variation (CNV) data. Haplotype inference from mixed data is important because SNPs and CNVs are occasionally in linkage disequilibrium. The idea underlying the proposed method is simple, but the algorithm for it needs to be quite elaborate to reduce the calculation cost. Consequently, we have focused on the details of the algorithm in this study. Although the main advantage of the method is accuracy, in that it does not use any approximation, its main disadvantage is still the calculation cost, which is sometimes intractable for large data sets with missing values.

  5. A unified framework for haplotype inference in nuclear families.

    Science.gov (United States)

    Iliadis, Alexandros; Anastassiou, Dimitris; Wang, Xiaodong

    2012-07-01

    Many large genome-wide association studies include nuclear families with more than one child (trio families), allowing for analysis of differences between siblings (sib pair analysis). Statistical power can be increased when haplotypes are used instead of genotypes. Currently, haplotype inference in families with more than one child can be performed either using the familial information or statistical information derived from the population samples but not both. Building on our recently proposed tree-based deterministic framework (TDS) for trio families, we augment its applicability to general nuclear families. We impose a minimum recombinant approach locally and independently on each multiple children family, while resorting to the population-derived information to solve the remaining ambiguities. Thus our framework incorporates all available information (familial and population) in a given study. We demonstrate that using all the constraints in our approach we can have gains in the accuracy as opposed to breaking the multiple children families to separate trios and resorting to a trio inference algorithm or phasing each family in isolation. We believe that our proposed framework could be the method of choice for haplotype inference in studies that include nuclear families with multiple children. Our software (tds2.0) is downloadable from www.ee.columbia.edu/∼anastas/tds. © 2012 The Authors Annals of Human Genetics © 2012 Blackwell Publishing Ltd/University College London.

  6. A new mathematical modeling for pure parsimony haplotyping problem.

    Science.gov (United States)

    Feizabadi, R; Bagherian, M; Vaziri, H R; Salahi, M

    2016-11-01

    The pure parsimony haplotyping (PPH) problem is important in bioinformatics because rational haplotype inference plays important roles in the analysis of genetic data and in mapping complex genetic diseases such as Alzheimer's disease and heart disorders. Haplotypes and genotypes are m-length sequences. Although several integer programming models have already been presented for the PPH problem, its NP-hardness has made those models ineffective on real instances, especially instances with many heterozygous sites. In this paper, we assign a corresponding number to each haplotype and genotype and, based on those numbers, we set up a mixed integer programming model. Using numbers instead of sequences makes the new model less complex than previous models, in that it contains neither constraints nor variables corresponding to heterozygous nucleotide sites. Experimental results confirm the efficiency of the new model in producing better solutions than two state-of-the-art haplotyping approaches. Copyright © 2016 Elsevier Inc. All rights reserved.
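
    As a rough illustration of the problem itself (not of the paper's mixed integer model, whose numeric encoding is not described here), the sketch below finds a smallest set of haplotypes that explains every genotype by exhaustive search. The encoding is an assumption: haplotypes are strings over '0'/'1', and genotypes use '0' or '1' for homozygous sites and '2' for heterozygous sites.

```python
from itertools import combinations, product
from typing import List


def resolves(h1: str, h2: str, g: str) -> bool:
    """True if the haplotype pair (h1, h2) explains genotype g site by site."""
    for a, b, s in zip(h1, h2, g):
        if s == "2":
            if a == b:  # heterozygous site: the two alleles must differ
                return False
        elif not (a == s and b == s):  # homozygous site: both must match
            return False
    return True


def pure_parsimony(genotypes: List[str]) -> List[str]:
    """Smallest haplotype set such that every genotype is resolved by some pair.

    Exhaustive search over all 2^m haplotypes, so usable only for tiny m.
    """
    m = len(genotypes[0])
    all_haps = ["".join(bits) for bits in product("01", repeat=m)]
    for k in range(1, len(all_haps) + 1):
        for subset in combinations(all_haps, k):
            if all(
                any(resolves(a, b, g) for a in subset for b in subset)
                for g in genotypes
            ):
                return list(subset)
    return all_haps


# Three genotypes that can be explained by three haplotypes, but not by two.
print(pure_parsimony(["202", "002", "200"]))
```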

  7. Modeling coverage gaps in haplotype frequencies via Bayesian inference to improve stem cell donor selection.

    Science.gov (United States)

    Louzoun, Yoram; Alter, Idan; Gragert, Loren; Albrecht, Mark; Maiers, Martin

    2018-05-01

    Regardless of sampling depth, accurate genotype imputation is limited in regions of high polymorphism which often have a heavy-tailed haplotype frequency distribution. Many rare haplotypes are thus unobserved. Statistical methods to improve imputation by extending reference haplotype distributions using linkage disequilibrium patterns that relate allele and haplotype frequencies have not yet been explored. In the field of unrelated stem cell transplantation, imputation of highly polymorphic human leukocyte antigen (HLA) genes has an important application in identifying the best-matched stem cell donor when searching large registries totaling over 28,000,000 donors worldwide. Despite these large registry sizes, a significant proportion of searched patients present novel HLA haplotypes. Supporting this observation, HLA population genetic models have indicated that many extant HLA haplotypes remain unobserved. The absent haplotypes are a significant cause of error in haplotype matching. We have applied a Bayesian inference methodology for extending haplotype frequency distributions, using a model where new haplotypes are created by recombination of observed alleles. Applications of this joint probability model offer significant improvement in frequency distribution estimates over the best existing alternative methods, as we illustrate using five-locus HLA frequency data from the National Marrow Donor Program registry. Transplant matching algorithms and disease association studies involving phasing and imputation of rare variants may benefit from this statistical inference framework.

  8. Inference rule and problem solving

    Energy Technology Data Exchange (ETDEWEB)

    Goto, S

    1982-04-01

    Intelligent information processing signifies an opportunity of having man's intellectual activity executed on the computer, in which inference, in place of ordinary calculation, is used as the basic operational mechanism. Many inference rules are derived from syllogisms in formal logic. The problem of programming this inference function is referred to as problem solving. Although inference and problem solving are logically closely related, the calculation ability of current computers is at a low level for inferring. To clarify the relation between inference and computers, nonmonotonic logic has been considered. The paper deals with the above topics. 16 references.

  9. Honey bee-inspired algorithms for SNP haplotype reconstruction problem

    Science.gov (United States)

    PourkamaliAnaraki, Maryam; Sadeghi, Mehdi

    2016-03-01

    Reconstructing haplotypes from SNP fragments is an important problem in computational biology. There has been considerable interest in this field because haplotypes have been shown to contain promising data for disease association research. Haplotype reconstruction in the Minimum Error Correction model has been proved to be an NP-hard problem. Therefore, several methods such as clustering techniques, evolutionary algorithms, neural networks and swarm intelligence approaches have been proposed in order to solve this problem in reasonable time. In this paper, we focus on various evolutionary clustering techniques and try to find an efficient technique for solving the haplotype reconstruction problem. It can be inferred from our experiments that the clustering methods relying on the behaviour of honey bee colonies in nature, specifically the bees algorithm and artificial bee colony methods, are expected to result in more efficient solutions. An application program of the methods is available at the following link: http://www.bioinf.cs.ipm.ir/software/haprs/

  10. Inference of haplotypic phase and missing genotypes in polyploid organisms and variable copy number genomic regions

    Directory of Open Access Journals (Sweden)

    Balding David J

    2008-12-01

    Background: The power of haplotype-based methods for association studies, identification of regions under selection, and ancestral inference, is well-established for diploid organisms. For polyploids, however, the difficulty of determining phase has limited such approaches. Polyploidy is common in plants and is also observed in animals. Partial polyploidy is sometimes observed in humans (e.g. trisomy 21; Down's syndrome), and it arises more frequently in some human tissues. Local changes in ploidy, known as copy number variations (CNV), arise throughout the genome. Here we present a method, implemented in the software polyHap, for the inference of haplotype phase and missing observations from polyploid genotypes. PolyHap allows each individual to have a different ploidy, but ploidy cannot vary over the genomic region analysed. It employs a hidden Markov model (HMM) and a sampling algorithm to infer haplotypes jointly in multiple individuals and to obtain a measure of uncertainty in its inferences. Results: In the simulation study, we combine real haplotype data to create artificial diploid, triploid, and tetraploid genotypes, and use these to demonstrate that polyHap performs well, in terms of both switch error rate in recovering phase and imputation error rate for missing genotypes. To our knowledge, there is no comparable software for phasing a large, densely genotyped region of chromosome from triploids and tetraploids, while for diploids we found polyHap to be more accurate than fastPhase. We also compare the results of polyHap to SATlotyper on an experimentally haplotyped tetraploid dataset of 12 SNPs, and show that polyHap is more accurate. Conclusion: With the availability of large SNP data in polyploids and CNV regions, we believe that polyHap, our proposed method for inferring haplotypic phase from genotype data, will be useful in enabling researchers analysing such data to exploit the power of haplotype-based analyses.

  11. A Dynamic Programming Algorithm for the k-Haplotyping Problem

    Institute of Scientific and Technical Information of China (English)

    Zhen-ping Li; Ling-yun Wu; Yu-ying Zhao; Xiang-sun Zhang

    2006-01-01

    The Minimum Fragments Removal (MFR) problem is one of the haplotyping problems: given a set of fragments, remove the minimum number of fragments so that the resulting fragments can be partitioned into k classes of non-conflicting subsets. In this paper, we formulate the k-MFR problem as an integer linear programming problem, and develop a dynamic programming approach to solve the k-MFR problem for both the gapless and gap cases.
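
    The k-MFR definition above can be made concrete with a small brute-force sketch. This is not the integer linear program or the dynamic programming approach of the paper; it only spells out the objective on toy input. The fragment encoding ('0'/'1' alleles, '-' for uncovered sites) and the function names are assumptions.

```python
from itertools import combinations, product
from typing import List, Tuple


def conflict(a: str, b: str) -> bool:
    """Two fragments conflict if they disagree at a site both of them cover."""
    return any(x != "-" and y != "-" and x != y for x, y in zip(a, b))


def k_partitionable(frags: List[str], k: int) -> bool:
    """Try every assignment of fragments to k classes (exponential, toy sizes only)."""
    pairs = list(combinations(range(len(frags)), 2))
    for labels in product(range(k), repeat=len(frags)):
        if all(
            labels[i] != labels[j] or not conflict(frags[i], frags[j])
            for i, j in pairs
        ):
            return True
    return False


def min_fragment_removal(frags: List[str], k: int) -> Tuple[int, List[int]]:
    """Smallest set of fragment indices whose removal allows a k-class partition."""
    n = len(frags)
    for r in range(n + 1):
        for removed in combinations(range(n), r):
            kept = [f for i, f in enumerate(frags) if i not in removed]
            if k_partitionable(kept, k):
                return r, list(removed)
    return n, list(range(n))  # unreachable: the empty set is always partitionable


frags = ["00--", "-01-", "11--", "-10-", "01--"]
print(min_fragment_removal(frags, 2))
print(min_fragment_removal(frags, 3))
```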

  12. Problem solving and inference mechanisms

    Energy Technology Data Exchange (ETDEWEB)

    Furukawa, K; Nakajima, R; Yonezawa, A; Goto, S; Aoyama, A

    1982-01-01

    The heart of the fifth generation computer will be powerful mechanisms for problem solving and inference. A deduction-oriented language is to be designed, which will form the core of the whole computing system. The language is based on predicate logic with the extended features of structuring facilities, meta structures and relational data base interfaces. Parallel computation mechanisms and specialized hardware architectures are being investigated to make possible efficient realization of the language features. The project includes research into an intelligent programming system, a knowledge representation language and system, and a meta inference system to be built on the core. 30 references.

  13. Shrinkage Estimators for Robust and Efficient Inference in Haplotype-Based Case-Control Studies

    KAUST Repository

    Chen, Yi-Hau

    2009-03-01

    Case-control association studies often aim to investigate the role of genes and gene-environment interactions in terms of the underlying haplotypes (i.e., the combinations of alleles at multiple genetic loci along chromosomal regions). The goal of this article is to develop robust but efficient approaches to the estimation of disease odds-ratio parameters associated with haplotypes and haplotype-environment interactions. We consider "shrinkage" estimation techniques that can adaptively relax the model assumptions of Hardy-Weinberg-Equilibrium and gene-environment independence required by recently proposed efficient "retrospective" methods. Our proposal involves first development of a novel retrospective approach to the analysis of case-control data, one that is robust to the nature of the gene-environment distribution in the underlying population. Next, it involves shrinkage of the robust retrospective estimator toward a more precise, but model-dependent, retrospective estimator using novel empirical Bayes and penalized regression techniques. Methods for variance estimation are proposed based on asymptotic theories. Simulations and two data examples illustrate both the robustness and efficiency of the proposed methods.

  14. Shrinkage Estimators for Robust and Efficient Inference in Haplotype-Based Case-Control Studies

    KAUST Repository

    Chen, Yi-Hau; Chatterjee, Nilanjan; Carroll, Raymond J.

    2009-01-01

    Case-control association studies often aim to investigate the role of genes and gene-environment interactions in terms of the underlying haplotypes (i.e., the combinations of alleles at multiple genetic loci along chromosomal regions). The goal of this article is to develop robust but efficient approaches to the estimation of disease odds-ratio parameters associated with haplotypes and haplotype-environment interactions. We consider "shrinkage" estimation techniques that can adaptively relax the model assumptions of Hardy-Weinberg-Equilibrium and gene-environment independence required by recently proposed efficient "retrospective" methods. Our proposal involves first development of a novel retrospective approach to the analysis of case-control data, one that is robust to the nature of the gene-environment distribution in the underlying population. Next, it involves shrinkage of the robust retrospective estimator toward a more precise, but model-dependent, retrospective estimator using novel empirical Bayes and penalized regression techniques. Methods for variance estimation are proposed based on asymptotic theories. Simulations and two data examples illustrate both the robustness and efficiency of the proposed methods.

  15. Fundamental problem of forensic mathematics--the evidential value of a rare haplotype.

    Science.gov (United States)

    Brenner, Charles H

    2010-10-01

    Y-chromosomal and mitochondrial haplotyping offer special advantages for criminal (and other) identification. For different reasons, each of them is sometimes detectable in a crime stain for which autosomal typing fails. But they also present special problems, including a fundamental mathematical one: When a rare haplotype is shared between suspect and crime scene, how strong is the evidence linking the two? Assume a reference population sample is available which contains n-1 haplotypes. The most interesting situation as well as the most common one is that the crime scene haplotype was never observed in the population sample. The traditional tools of product rule and sample frequency are not useful when there are no components to multiply and the sample frequency is zero. A useful statistic is the fraction κ of the population sample that consists of "singletons" - of once-observed types. A simple argument shows that the probability for a random innocent suspect to match a previously unobserved crime scene type is (1-κ)/n - distinctly less than 1/n, likely ten times less. The robust validity of this model is confirmed by testing it against a range of population models. This paper hinges above all on one key insight: probability is not frequency. The common but erroneous "frequency" approach adopts population frequency as a surrogate for matching probability and attempts the intractable problem of guessing how many instances exist of the specific haplotype at a certain crime. Probability, by contrast, depends by definition only on the available data. Hence if different haplotypes but with the same data occur in two different crimes, although the frequencies are different (and are hopelessly elusive), the matching probabilities are the same, and are not hard to find. Copyright © 2009 Elsevier Ireland Ltd. All rights reserved.
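
    The matching probability (1-κ)/n quoted above is simple to compute from a reference sample. The sketch below assumes the sample is a list of n-1 haplotype labels and computes κ as the singleton fraction of that list exactly as given; whether κ should instead be computed after adding the crime scene type is a detail the abstract does not settle, so treat that choice as an assumption.

```python
from collections import Counter
from typing import List, Tuple


def kappa_match_probability(sample: List[str]) -> Tuple[float, float]:
    """Return (kappa, (1 - kappa) / n) for a reference sample of n - 1 haplotypes."""
    counts = Counter(sample)
    # Singletons: haplotype types observed exactly once in the sample.
    singletons = sum(1 for c in counts.values() if c == 1)
    kappa = singletons / len(sample)
    n = len(sample) + 1  # the abstract's n, given a sample of size n - 1
    return kappa, (1 - kappa) / n


# Hypothetical sample of 9 haplotype labels, 4 of them seen only once.
sample = ["A", "A", "B", "C", "D", "D", "D", "E", "F"]
kappa, p_match = kappa_match_probability(sample)
print(kappa, p_match, 1 / (len(sample) + 1))
```

    The last printed value is the naive 1/n, which the abstract says the κ-based figure should fall below.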

  16. Analysis of Molecular Variance Inferred from Metric Distances among DNA Haplotypes: Application to Human Mitochondrial DNA Restriction Data

    OpenAIRE

    Excoffier, L.; Smouse, P. E.; Quattro, J. M.

    1992-01-01

    We present here a framework for the study of molecular variation within a single species. Information on DNA haplotype divergence is incorporated into an analysis of variance format, derived from a matrix of squared-distances among all pairs of haplotypes. This analysis of molecular variance (AMOVA) produces estimates of variance components and F-statistic analogs, designated here as φ-statistics, reflecting the correlation of haplotypic diversity at different levels of hierarchical subdivisi...

  17. A human genome-wide library of local phylogeny predictions for whole-genome inference problems

    Directory of Open Access Journals (Sweden)

    Schwartz Russell

    2008-08-01

    Background: Many common inference problems in computational genetics depend on inferring aspects of the evolutionary history of a data set given a set of observed modern sequences. Detailed predictions of the full phylogenies are therefore of value in improving our ability to make further inferences about population history and sources of genetic variation. Making phylogenetic predictions on the scale needed for whole-genome analysis is, however, extremely computationally demanding. Results: In order to facilitate phylogeny-based predictions on a genomic scale, we develop a library of maximum parsimony phylogenies within local regions spanning all autosomal human chromosomes based on Haplotype Map variation data. We demonstrate the utility of this library for population genetic inferences by examining a tree statistic we call 'imperfection,' which measures the reuse of variant sites within a phylogeny. This statistic is significantly predictive of recombination rate, shows additional regional and population-specific conservation, and allows us to identify outlier genes likely to have experienced unusual amounts of variation in recent human history. Conclusion: Recent theoretical advances in algorithms for phylogenetic tree reconstruction have made it possible to perform large-scale inferences of local maximum parsimony phylogenies from single nucleotide polymorphism (SNP) data. As results from the imperfection statistic demonstrate, phylogeny predictions encode substantial information useful for detecting genomic features and population history. This data set should serve as a platform for many kinds of inferences one may wish to make about human population history and genetic variation.

  18. Haplotype reconstruction error as a classical misclassification problem: introducing sensitivity and specificity as error measures.

    Directory of Open Access Journals (Sweden)

    Claudia Lamina

    BACKGROUND: Statistically reconstructing haplotypes from single nucleotide polymorphism (SNP) genotypes can lead to falsely classified haplotypes. This can be an issue when interpreting haplotype association results or when selecting subjects with certain haplotypes for subsequent functional studies. It was our aim to quantify haplotype reconstruction error and to provide tools for it. METHODS AND RESULTS: By numerous simulation scenarios, we systematically investigated several error measures, including discrepancy, error rate, and R^2, and introduced the sensitivity and specificity to this context. We exemplified several measures in the KORA study, a large population-based study from Southern Germany. We find that the specificity is slightly reduced only for common haplotypes, while the sensitivity was decreased for some, but not all rare haplotypes. The overall error rate was generally increasing with increasing number of loci, increasing minor allele frequency of SNPs, decreasing correlation between the alleles and increasing ambiguity. CONCLUSIONS: We conclude that, with the analytical approach presented here, haplotype-specific error measures can be computed to gain insight into the haplotype uncertainty. This method indicates whether a specific risk haplotype can be expected to be reconstructed with little or substantial misclassification, and thus the magnitude of expected bias in association estimates. We also illustrate that sensitivity and specificity separate two dimensions of the haplotype reconstruction error, which completely describe the misclassification matrix and thus provide the prerequisite for methods accounting for misclassification.
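
    Haplotype-specific sensitivity and specificity can be sketched as an ordinary binary misclassification count, treating "carries the target haplotype" as the positive class. This assumes true and reconstructed haplotypes are available as paired lists, one entry per chromosome, as in a simulation; the paper's exact definitions may differ, and the function name is illustrative.

```python
from typing import List, Tuple


def sens_spec(true_haps: List[str], inferred: List[str], target: str) -> Tuple[float, float]:
    """Sensitivity and specificity of reconstruction for one target haplotype."""
    tp = fn = fp = tn = 0
    for t, p in zip(true_haps, inferred):
        if t == target and p == target:
            tp += 1  # target correctly reconstructed
        elif t == target:
            fn += 1  # target present but reconstructed as something else
        elif p == target:
            fp += 1  # another haplotype wrongly reconstructed as the target
        else:
            tn += 1  # neither truth nor reconstruction is the target
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Hypothetical two-SNP haplotypes for six chromosomes.
truth = ["AC", "AC", "AT", "GT", "AC", "GT"]
recon = ["AC", "AT", "AT", "GT", "AC", "AC"]
print(sens_spec(truth, recon, "AC"))
```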

  19. Inferring mechanisms of copy number change from haplotype structures at the human DEFA1A3 locus.

    Science.gov (United States)

    Black, Holly A; Khan, Fayeza F; Tyson, Jess; Armour, John AL

    2014-07-21

    The determination of structural haplotypes at copy number variable regions can indicate the mechanisms responsible for changes in copy number, as well as explain the relationship between gene copy number and expression. However, obtaining spatial information at regions displaying extensive copy number variation, such as the DEFA1A3 locus, is complex, because of the difficulty in the phasing and assembly of these regions. The DEFA1A3 locus is intriguing in that it falls within a region of high linkage disequilibrium, despite its high variability in copy number (n = 3-16); hence, the mechanisms responsible for changes in copy number at this locus are unclear. In this study, a region flanking the DEFA1A3 locus was sequenced across 120 independent haplotypes with European ancestry, identifying five common classes of DEFA1A3 haplotype. Assigning DEFA1A3 class to haplotypes within the 1000 Genomes project highlights a significant difference in DEFA1A3 class frequencies between populations with different ancestry. The features of each DEFA1A3 class, for example, the associated DEFA1A3 copy numbers, were initially assessed in a European cohort (n = 599) and replicated in the 1000 Genomes samples, showing within-class similarity, but between-class and between-population differences in the features of the DEFA1A3 locus. Emulsion haplotype fusion-PCR was used to generate 61 structural haplotypes at the DEFA1A3 locus, showing a high within-class similarity in structure. Structural haplotypes across the DEFA1A3 locus indicate that intra-allelic rearrangement is the predominant mechanism responsible for changes in DEFA1A3 copy number, explaining the conservation of linkage disequilibrium across the locus. The identification of common structural haplotypes at the DEFA1A3 locus could aid studies into how DEFA1A3 copy number influences expression, which is currently unclear.

  20. Historical biogeography of the land snail Cornu aspersum: a new scenario inferred from haplotype distribution in the Western Mediterranean basin

    Directory of Open Access Journals (Sweden)

    Madec Luc

    2010-01-01

    Background: Despite its key location between the rest of the continent and Europe, research on the phylogeography of north African species remains very limited compared to European and North American taxa. The Mediterranean land mollusc Cornu aspersum (= Helix aspersa) is one of the few species widely sampled in north Africa for biogeographical analysis. It thus provides an excellent biological model to understand phylogeographical patterns across the Mediterranean basin, and to evaluate hypotheses of population differentiation. We investigated here the phylogeography of this land snail to reassess the evolutionary scenario we previously considered for explaining its scattered distribution in the western Mediterranean, and to help to resolve the question of the direction of its range expansion (from north Africa to Europe or vice versa). By analysing simultaneously individuals from 73 sites sampled in its putative native range, the present work provides the first broad-scale screening of mitochondrial variation (cyt b and 16S rRNA genes) of C. aspersum. Results: Phylogeographical structure mirrored previous patterns inferred from anatomy and nuclear data, since all haplotypes could be ascribed to a B (West) or a C (East) lineage. Alternative migration models tested confirmed that C. aspersum most likely spread from north Africa to Europe. In addition to Kabylia in Algeria, which would have been successively a centre of dispersal and a zone of secondary contacts, we identified an area in Galicia where genetically distinct west and east type populations would have regained contact. Conclusions: Vicariant and dispersal processes are reviewed and discussed in the light of signatures left in the geographical distribution of the genetic variation. In referring to Mediterranean taxa which show similar phylogeographical patterns, we proposed a parsimonious scenario to account for the "east-west" genetic splitting and the northward expansion of the

  1. "HOOF-Print" Genotyping and Haplotype Inference Discriminates among Brucella spp Isolates From a Small Spatial Scale

    Science.gov (United States)

    We demonstrate that the “HOOF-Print” assay provides high power to discriminate among Brucella isolates collected on a small spatial scale (within Portugal). Additionally, we illustrate how haplotype identification using non-random association among markers allows resolution of B. melitensis biovars ...

  2. Inferring mechanisms of copy number change from haplotype structures at the human DEFA1A3 locus

    OpenAIRE

    Black, Holly A; Khan, Fayeza F; Tyson, Jess; Armour, John AL

    2014-01-01

    Background The determination of structural haplotypes at copy number variable regions can indicate the mechanisms responsible for changes in copy number, as well as explain the relationship between gene copy number and expression. However, obtaining spatial information at regions displaying extensive copy number variation, such as the DEFA1A3 locus, is complex, because of the difficulty in the phasing and assembly of these regions. The DEFA1A3 locus is intriguing in that it falls within a reg...

  3. A combinatorial perspective of the protein inference problem.

    Science.gov (United States)

    Yang, Chao; He, Zengyou; Yu, Weichuan

    2013-01-01

    In a shotgun proteomics experiment, proteins are the most biologically meaningful output. The success of proteomics studies depends on the ability to accurately and efficiently identify proteins. Many methods have been proposed to facilitate the identification of proteins from peptide identification results. However, the relationship between protein identification and peptide identification has not been thoroughly explained before. In this paper, we devote ourselves to a combinatorial perspective of the protein inference problem. We employ combinatorial mathematics to calculate the conditional protein probabilities (protein probability means the probability that a protein is correctly identified) under three assumptions, which lead to a lower bound, an upper bound, and an empirical estimation of protein probabilities, respectively. The combinatorial perspective enables us to obtain an analytical expression for protein inference. Our method achieves comparable results with ProteinProphet in a more efficient manner in experiments on two data sets of standard protein mixtures and two data sets of real samples. Based on our model, we study the impact of unique peptides and degenerate peptides (degenerate peptides are peptides shared by at least two proteins) on protein probabilities. Meanwhile, we also study the relationship between our model and ProteinProphet. We name our program ProteinInfer. Its Java source code, our supplementary document and experimental results are available at: http://bioinformatics.ust.hk/proteininfer.

  4. Plausible inference: A multi-valued logic for problem solving

    Science.gov (United States)

    Friedman, L.

    1979-01-01

    A new logic is developed which permits continuously variable strength of belief in the truth of assertions. Four inference rules result, with formal logic as a limiting case. Quantification of belief is defined. Propagation of belief to linked assertions results from dependency-based techniques of truth maintenance so that local consistency is achieved or contradiction discovered in problem solving. Rules for combining, confirming, or disconfirming beliefs are given, and several heuristics are suggested that apply to revising already formed beliefs in the light of new evidence. The strength of belief that results in such revisions based on conflicting evidence are a highly subjective phenomenon. Certain quantification rules appear to reflect an orderliness in the subjectivity. Several examples of reasoning by plausible inference are given, including a legal example and one from robot learning. Propagation of belief takes place in directions forbidden in formal logic and this results in conclusions becoming possible for a given set of assertions that are not reachable by formal logic.

  5. Detecting structure of haplotypes and local ancestry

    Science.gov (United States)

    We present a two-layer hidden Markov model to detect the structure of haplotypes for unrelated individuals. This allows us to model two scales of linkage disequilibrium (one within a group of haplotypes and one between groups), thereby taking advantage of rich haplotype information to infer local an...

  6. Assessment of network inference methods: how to cope with an underdetermined problem.

    Directory of Open Access Journals (Sweden)

    Caroline Siegenthaler

    The inference of biological networks is an active research area in the field of systems biology. The number of network inference algorithms has grown tremendously in the last decade, underlining the importance of a fair assessment and comparison among these methods. Current assessments of the performance of an inference method typically involve the application of the algorithm to benchmark datasets and the comparison of the network predictions against the gold standard or reference networks. While the network inference problem is often deemed underdetermined, implying that the inference problem does not have a (unique) solution, the consequences of such an attribute have not been rigorously taken into consideration. Here, we propose a new procedure for assessing the performance of gene regulatory network (GRN) inference methods. The procedure takes into account the underdetermined nature of the inference problem, in which gene regulatory interactions that are inferable or non-inferable are determined based on causal inference. The assessment relies on a new definition of the confusion matrix, which excludes errors associated with non-inferable gene regulations. For demonstration purposes, the proposed assessment procedure is applied to the DREAM 4 In Silico Network Challenge. The results show a marked change in the ranking of participating methods when taking network inferability into account.

  7. Inference

    DEFF Research Database (Denmark)

    Møller, Jesper

    2010-01-01

    Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation-free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with Markov chain Monte Carlo...... (MCMC) techniques. Due to space limitations the focus is on spatial point processes....

  8. Inference

    DEFF Research Database (Denmark)

    Møller, Jesper

    (This text written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives’, edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by Clarendon Press, Oxford, and planned to appear as Section 4.1 with the title ‘Inference’.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...

  9. On a full Bayesian inference for force reconstruction problems

    Science.gov (United States)

    Aucejo, M.; De Smet, O.

    2018-05-01

    In a previous paper, the authors introduced a flexible methodology for reconstructing mechanical sources in the frequency domain from prior local information on both their nature and location over a linear and time invariant structure. The proposed approach was derived from Bayesian statistics, because of its ability in mathematically accounting for experimenter's prior knowledge. However, since only the Maximum a Posteriori estimate was computed, the posterior uncertainty about the regularized solution given the measured vibration field, the mechanical model and the regularization parameter was not assessed. To answer this legitimate question, this paper fully exploits the Bayesian framework to provide, from a Markov Chain Monte Carlo algorithm, credible intervals and other statistical measures (mean, median, mode) for all the parameters of the force reconstruction problem.

  10. A linear programming model for protein inference problem in shotgun proteomics.

    Science.gov (United States)

    Huang, Ting; He, Zengyou

    2012-11-15

    Assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is an important issue in shotgun proteomics. The objective of protein inference is to find a subset of proteins that are truly present in the sample. Although many methods have been proposed for protein inference, several issues such as peptide degeneracy still remain unsolved. In this article, we present a linear programming model for protein inference. In this model, we use a transformation of the joint probability that each peptide/protein pair is present in the sample as the variable. Then, both the peptide probability and protein probability can be expressed as a formula in terms of the linear combination of these variables. Based on this simple fact, the protein inference problem is formulated as an optimization problem: minimize the number of proteins with non-zero probabilities under the constraint that the difference between the calculated peptide probability and the peptide probability generated from peptide identification algorithms should be less than some threshold. This model addresses the peptide degeneracy issue by forcing some joint probability variables involving degenerate peptides to be zero in a rigorous manner. The corresponding inference algorithm is named ProteinLP. We test the performance of ProteinLP on six datasets. Experimental results show that our method is competitive with the state-of-the-art protein inference algorithms. The source code of our algorithm is available at: https://sourceforge.net/projects/prolp/. Contact: zyhe@dlut.edu.cn. Supplementary data are available at Bioinformatics Online.

  11. The Improvement of Communication and Inference Skills in Colloid System Material by Problem Solving Learning Model

    OpenAIRE

    maisarera, yunita; diawati, chansyanah; fadiawati, noor

    2012-01-01

    The aim of this research is to describe the effectiveness of problem solving learning in improving communication and inference skills in colloid system material. Subjects in this research were students of the XI IPA1 and XI IPA2 classrooms at Persada Junior High School in Bandar Lampung in the 2011-2012 academic year, where students of both classrooms had the same characteristics. This research used a quasi-experimental method and a pretest-posttest control group design. Effectiveness of problem solving le...

  12. Divide et impera: subgoaling reduces the complexity of probabilistic inference and problem solving.

    Science.gov (United States)

    Maisto, Domenico; Donnarumma, Francesco; Pezzulo, Giovanni

    2015-03-06

    It has long been recognized that humans (and possibly other animals) usually break problems down into smaller and more manageable problems using subgoals. Despite a general consensus that subgoaling helps problem solving, it is still unclear what the mechanisms guiding online subgoal selection are during the solution of novel problems for which predefined solutions are not available. Under which conditions does subgoaling lead to optimal behaviour? When is subgoaling better than solving a problem from start to finish? Which is the best number and sequence of subgoals to solve a given problem? How are these subgoals selected during online inference? Here, we present a computational account of subgoaling in problem solving. Following Occam's razor, we propose that good subgoals are those that permit planning solutions and controlling behaviour using fewer information resources, thus yielding parsimony in inference and control. We implement this principle using approximate probabilistic inference: subgoals are selected using a sampling method that considers the descriptive complexity of the resulting sub-problems. We validate the proposed method using a standard reinforcement learning benchmark (four-rooms scenario) and show that the proposed method requires fewer inferential steps and permits selecting more compact control programs compared to an equivalent procedure without subgoaling. Furthermore, we show that the proposed method offers a mechanistic explanation of the neuronal dynamics found in the prefrontal cortex of monkeys that solve planning problems. Our computational framework provides a novel integrative perspective on subgoaling and its adaptive advantages for planning, control and learning, such as lowering cognitive effort and working memory load. © 2015 The Author(s) Published by the Royal Society. All rights reserved.

  13. Effects analysis fuzzy inference system in nuclear problems using approximate reasoning

    International Nuclear Information System (INIS)

    Guimaraes, Antonio C.F.; Franklin Lapa, Celso Marcelo

    2004-01-01

    In this paper a fuzzy inference system modeling technique applied to failure mode and effects analysis (FMEA) is introduced for nuclear reactor problems. This method uses the concept of a pure fuzzy logic system to treat the traditional FMEA parameters: probabilities of occurrence, severity and detection. The auxiliary feed-water system of a typical two-loop pressurized water reactor (PWR) was used as a practical example in this analysis. The central result is the conceptual comparison between the traditional risk priority number (RPN) and the fuzzy risk priority number (FRPN) obtained from expert opinion. The set of results demonstrated the great potential of the inference system and the advantage of the gray approach in this class of problems.
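
    For orientation, the traditional RPN that the abstract contrasts with the fuzzy variant is conventionally the product of the occurrence, severity and detection scores. The sketch below shows only that conventional baseline; the 1 to 10 scoring scale, the class name and the failure-mode names are assumptions made for illustration, and the fuzzy inference step of the paper is not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One FMEA row. The 1..10 scale is an assumed, commonly used convention."""
    name: str
    occurrence: int  # 1 = rare, 10 = frequent
    severity: int    # 1 = negligible, 10 = catastrophic
    detection: int   # 1 = almost certainly detected, 10 = undetectable


def rpn(mode: FailureMode) -> int:
    """Traditional risk priority number: occurrence x severity x detection."""
    for score in (mode.occurrence, mode.severity, mode.detection):
        if not 1 <= score <= 10:
            raise ValueError(f"score {score} outside the assumed 1..10 scale")
    return mode.occurrence * mode.severity * mode.detection


def rank_by_rpn(modes):
    """Return the failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=rpn, reverse=True)


# Hypothetical failure modes, invented for this example only.
modes = [
    FailureMode("pump fails to start", occurrence=3, severity=8, detection=4),
    FailureMode("valve leaks", occurrence=5, severity=4, detection=2),
]
ranked = [m.name for m in rank_by_rpn(modes)]
```

    Because the product collapses three scores into one integer, quite different combinations can share the same RPN, which is one common motivation for rule-based alternatives such as the fuzzy system described above.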

  14. HapCol : Accurate and memory-efficient haplotype assembly from long reads

    NARCIS (Netherlands)

    Pirola, Yuri; Zaccaria, Simone; Dondi, Riccardo; Klau, Gunnar W.; Pisanti, Nadia; Bonizzoni, Paola

    2016-01-01

    Motivation: Haplotype assembly is the computational problem of reconstructing haplotypes in diploid organisms and is of fundamental importance for characterizing the effects of single-nucleotide polymorphisms on the expression of phenotypic traits. Haplotype assembly highly benefits from the advent

  15. HapCol: Accurate and Memory-efficient Haplotype Assembly from Long Reads

    NARCIS (Netherlands)

    Y. Pirola (Yuri); S. Zaccaria (Simone); R. Dondi (Riccardo); G.W. Klau (Gunnar); N. Pisanti (Nadia); P. Bonizzoni (Paola)

    2015-01-01

    Motivation: Haplotype assembly is the computational problem of reconstructing haplotypes in diploid organisms and is of fundamental importance for characterizing the effects of single-nucleotide polymorphisms on the expression of phenotypic traits. Haplotype assembly highly benefits from

  16. Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems

    KAUST Repository

    Contreras, Andres A.

    2016-09-19

    A method is presented for inferring the presence of an inclusion inside a domain; the proposed approach is suitable to be used in a diagnostic device with low computational power. Specifically, we use the Bayesian framework for the inference of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacements field with the random elastic moduli of the materials, and is computed by means of the stochastic Galerkin (SG) projection method. Moreover, the inclusion's geometry is assumed to be unknown, and this is addressed by using a dictionary consisting of several geometrical models with different configurations. A model selection approach based on the evidence provided by the data (Bayes factors) is used to discriminate among the different geometrical models and select the most suitable one. The idea of using a dictionary of pre-computed geometrical models helps to maintain the computational cost of the inference process very low, as most of the computational burden is carried out off-line for the resolution of the SG problems. Numerical tests are used to validate the methodology, assess its performance, and analyze the robustness to model errors. © 2016 Elsevier Ltd

  17. Approximation properties of haplotype tagging

    Directory of Open Access Journals (Sweden)

    Dreiseitl Stephan

    2006-01-01

    Background: Single nucleotide polymorphisms (SNPs) are locations at which the genomic sequences of population members differ. Since these differences are known to follow patterns, disease association studies are facilitated by identifying SNPs that allow the unique identification of such patterns. This process, known as haplotype tagging, is formulated as a combinatorial optimization problem and analyzed in terms of complexity and approximation properties. Results: It is shown that the tagging problem is NP-hard but approximable within 1 + ln((n^2 - n)/2) for n haplotypes but not approximable within (1 - ε) ln(n/2) for any ε > 0 unless NP ⊂ DTIME(n^(log log n)). A simple, very easily implementable algorithm that exhibits the above upper bound on solution quality is presented. This algorithm has running time O((np/2)(2m - p + 1)) ≤ O(m(n^2 - n)/2), where p ≤ min(n, m), for n haplotypes of size m. As we show that the approximation bound is asymptotically tight, the algorithm presented is optimal with respect to this asymptotic bound. Conclusion: The haplotype tagging problem is hard, but approachable with a fast, practical, and surprisingly simple algorithm that cannot be significantly improved upon on a single processor machine. Hence, significant improvement in computational efforts expended can only be expected if the computational effort is distributed and done in parallel.
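
    A logarithmic bound over the (n^2 - n)/2 haplotype pairs is what a greedy set-cover strategy typically gives, so the idea can be illustrated with a minimal sketch: repeatedly pick the SNP that separates the most haplotype pairs not yet told apart. This is a generic greedy heuristic written under that assumption, not necessarily the algorithm of the paper; the function name and the encoding of haplotypes as equal-length strings are hypothetical.

```python
from itertools import combinations


def greedy_tag_snps(haplotypes):
    """Greedily choose SNP positions that distinguish all distinct haplotypes.

    haplotypes: list of equal-length strings, one character per SNP.
    Returns a sorted list of chosen positions.
    """
    n = len(haplotypes)
    m = len(haplotypes[0]) if haplotypes else 0
    if any(len(h) != m for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")

    # Pairs of haplotypes that differ somewhere and still need separating.
    unresolved = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if haplotypes[i] != haplotypes[j]
    }
    chosen = []
    while unresolved:
        def gain(site):
            # Number of unresolved pairs that this SNP would separate.
            return sum(
                1 for i, j in unresolved
                if haplotypes[i][site] != haplotypes[j][site]
            )

        # Every unresolved pair differs at some unchosen site, so the best
        # remaining site always separates at least one pair.
        best = max((s for s in range(m) if s not in chosen), key=gain)
        chosen.append(best)
        unresolved = {
            (i, j) for i, j in unresolved
            if haplotypes[i][best] == haplotypes[j][best]
        }
    return sorted(chosen)


tags = greedy_tag_snps(["000", "011", "101"])
```

    Ties are broken in favour of the lowest SNP index, which is an arbitrary choice made here to keep the output deterministic.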

  18. Problem Solving as Probabilistic Inference with Subgoaling: Explaining Human Successes and Pitfalls in the Tower of Hanoi.

    Science.gov (United States)

    Donnarumma, Francesco; Maisto, Domenico; Pezzulo, Giovanni

    2016-04-01

    How do humans and other animals face novel problems for which predefined solutions are not available? Human problem solving links to flexible reasoning and inference rather than to slow trial-and-error learning. It has received considerable attention since the early days of cognitive science, giving rise to well known cognitive architectures such as SOAR and ACT-R, but its computational and brain mechanisms remain incompletely known. Furthermore, it is still unclear whether problem solving is a "specialized" domain or module of cognition, in the sense that it requires computations that are fundamentally different from those supporting perception and action systems. Here we advance a novel view of human problem solving as probabilistic inference with subgoaling. In this perspective, key insights from cognitive architectures are retained such as the importance of using subgoals to split problems into subproblems. However, here the underlying computations use probabilistic inference methods analogous to those that are increasingly popular in the study of perception and action systems. To test our model we focus on the widely used Tower of Hanoi (ToH) task, and show that our proposed method can reproduce characteristic idiosyncrasies of human problem solvers: their sensitivity to the "community structure" of the ToH and their difficulties in executing so-called "counterintuitive" movements. Our analysis reveals that subgoals have two key roles in probabilistic inference and problem solving. First, prior beliefs on (likely) useful subgoals carve the problem space and define an implicit metric for the problem at hand, a metric to which humans are sensitive. Second, subgoals are used as waypoints in the probabilistic problem solving inference and permit finding effective solutions that, when unavailable, lead to problem solving deficits. Our study thus suggests that a probabilistic inference scheme enhanced with subgoals provides a comprehensive framework to study problem

  19. The Network Completion Problem: Inferring Missing Nodes and Edges in Networks

    Energy Technology Data Exchange (ETDEWEB)

    Kim, M; Leskovec, J

    2011-11-14

    Network structures, such as social networks, web graphs and networks from systems biology, play important roles in many areas of science and our everyday lives. In order to study the networks one needs to first collect reliable large scale network data. While the social and information networks have become ubiquitous, the challenge of collecting complete network data still persists. Many times the collected network data is incomplete with nodes and edges missing. Commonly, only a part of the network can be observed and we would like to infer the unobserved part of the network. We address this issue by studying the Network Completion Problem: Given a network with missing nodes and edges, can we complete the missing part? We cast the problem in the Expectation Maximization (EM) framework where we use the observed part of the network to fit a model of network structure, and then we estimate the missing part of the network using the model, re-estimate the parameters and so on. We combine the EM with the Kronecker graphs model and design a scalable Metropolized Gibbs sampling approach that allows for the estimation of the model parameters as well as the inference about missing nodes and edges of the network. Experiments on synthetic and several real-world networks show that our approach can effectively recover the network even when about half of the nodes in the network are missing. Our algorithm outperforms not only classical link-prediction approaches but also the state of the art Stochastic block modeling approach. Furthermore, our algorithm easily scales to networks with tens of thousands of nodes.

  20. Approximate Bayesian computation for modular inference problems with many parameters: the example of migration rates.

    Science.gov (United States)

    Aeschbacher, S; Futschik, A; Beaumont, M A

    2013-02-01

    We propose a two-step procedure for estimating multiple migration rates in an approximate Bayesian computation (ABC) framework, accounting for global nuisance parameters. The approach is not limited to migration, but generally of interest for inference problems with multiple parameters and a modular structure (e.g. independent sets of demes or loci). We condition on a known, but complex demographic model of a spatially subdivided population, motivated by the reintroduction of Alpine ibex (Capra ibex) into Switzerland. In the first step, the global parameters ancestral mutation rate and male mating skew have been estimated for the whole population in Aeschbacher et al. (Genetics 2012; 192: 1027). In the second step, we estimate in this study the migration rates independently for clusters of demes putatively connected by migration. For large clusters (many migration rates), ABC faces the problem of too many summary statistics. We therefore assess by simulation if estimation per pair of demes is a valid alternative. We find that the trade-off between reduced dimensionality for the pairwise estimation on the one hand and lower accuracy due to the assumption of pairwise independence on the other depends on the number of migration rates to be inferred: the accuracy of the pairwise approach increases with the number of parameters, relative to the joint estimation approach. To distinguish between low and zero migration, we perform ABC-type model comparison between a model with migration and one without. Applying the approach to microsatellite data from Alpine ibex, we find no evidence for substantial gene flow via migration, except for one pair of demes in one direction. © 2013 Blackwell Publishing Ltd.
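
    For readers unfamiliar with ABC, the basic rejection step that such procedures build on can be sketched as follows: draw a parameter from the prior, simulate a summary statistic, and keep the draw if the simulated summary is close to the observed one. This is a generic illustration only; it does not implement the two-step or pairwise scheme of the paper, and the function names, the single scalar summary and the toy binomial simulator are all assumptions.

```python
import random


def abc_rejection(observed_summary, sample_prior, simulate_summary,
                  tolerance, n_draws, rng):
    """Basic ABC rejection sampler for one scalar summary statistic.

    sample_prior(rng) draws a parameter value from the prior;
    simulate_summary(theta, rng) simulates data and returns its summary.
    Returns the accepted parameter draws (an approximate posterior sample).
    """
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        simulated = simulate_summary(theta, rng)
        if abs(simulated - observed_summary) <= tolerance:
            accepted.append(theta)
    return accepted


def toy_migrant_fraction(rate, rng, n_individuals=100):
    """Hypothetical simulator: fraction of migrants among n individuals."""
    migrants = sum(1 for _ in range(n_individuals) if rng.random() < rate)
    return migrants / n_individuals


rng = random.Random(1)
posterior_sample = abc_rejection(
    observed_summary=0.05,
    sample_prior=lambda r: r.uniform(0.0, 0.5),
    simulate_summary=toy_migrant_fraction,
    tolerance=0.02,
    n_draws=2000,
    rng=rng,
)
```

    With many parameters the number of summary statistics grows and the acceptance rate collapses, which is the dimensionality problem the abstract addresses by estimating per pair of demes.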

  1. An efficient Bayesian inference approach to inverse problems based on an adaptive sparse grid collocation method

    International Nuclear Information System (INIS)

    Ma Xiang; Zabaras, Nicholas

    2009-01-01

    A new approach to modeling inverse problems using a Bayesian inference method is introduced. The Bayesian approach considers the unknown parameters as random variables and seeks the probabilistic distribution of the unknowns. By introducing the concept of the stochastic prior state space to the Bayesian formulation, we reformulate the deterministic forward problem as a stochastic one. The adaptive hierarchical sparse grid collocation (ASGC) method is used for constructing an interpolant to the solution of the forward model in this prior space which is large enough to capture all the variability/uncertainty in the posterior distribution of the unknown parameters. This solution can be considered as a function of the random unknowns and serves as a stochastic surrogate model for the likelihood calculation. Hierarchical Bayesian formulation is used to derive the posterior probability density function (PPDF). The spatial model is represented as a convolution of a smooth kernel and a Markov random field. The state space of the PPDF is explored using Markov chain Monte Carlo algorithms to obtain statistics of the unknowns. The likelihood calculation is performed by directly sampling the approximate stochastic solution obtained through the ASGC method. The technique is assessed on two nonlinear inverse problems: source inversion and permeability estimation in flow through porous media

  2. A class representative model for Pure Parsimony Haplotyping under uncertain data.

    Directory of Open Access Journals (Sweden)

    Daniele Catanzaro

    The Pure Parsimony Haplotyping (PPH) problem is an NP-hard combinatorial optimization problem that consists of finding the minimum number of haplotypes necessary to explain a given set of genotypes. PPH has attracted more and more attention in recent years due to its importance in analysis of many fine-scale genetic data. Its application fields range from mapping complex disease genes to inferring population histories, passing through designing drugs, functional genomics and pharmacogenetics. In this article we investigate, for the first time, a recent version of PPH called the Pure Parsimony Haplotype problem under Uncertain Data (PPH-UD). This version mainly arises when the input genotypes are not accurate, i.e., when some single nucleotide polymorphisms are missing or affected by errors. We propose an exact approach to solution of PPH-UD based on an extended version of Catanzaro et al.'s [1] class representative model for PPH, currently the state-of-the-art integer programming model for PPH. The model is efficient, accurate, compact, polynomial-sized, easy to implement, solvable with any solver for mixed integer programming, and usable in all those cases for which the parsimony criterion is well suited for haplotype estimation.
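
    To make the PPH objective concrete, the following sketch solves tiny instances by exhaustive search. It is not the class representative integer programming model described above, only a brute-force illustration that is exponential in the number of sites. The genotype encoding (0 and 1 for the two homozygous states, 2 for a heterozygous site) and the function names are assumptions chosen for the example.

```python
from itertools import combinations, combinations_with_replacement, product


def explains(h1, h2, genotype):
    """True if the haplotype pair (h1, h2) is consistent with the genotype.

    Assumed encoding: genotype site 0 or 1 means both haplotypes carry that
    allele; genotype site 2 means the two haplotypes differ at that site.
    """
    return all(
        (a == b == g) if g in (0, 1) else (a != b)
        for a, b, g in zip(h1, h2, genotype)
    )


def pure_parsimony(genotypes):
    """Smallest haplotype set explaining every genotype (exhaustive search)."""
    if not genotypes:
        return []
    m = len(genotypes[0])
    candidates = list(product((0, 1), repeat=m))
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            pairs = list(combinations_with_replacement(subset, 2))
            if all(any(explains(h1, h2, g) for h1, h2 in pairs)
                   for g in genotypes):
                return list(subset)
    return []


# Three genotypes over two sites; each needs a different haplotype pair.
solution = pure_parsimony([(0, 2), (2, 0), (2, 2)])
```

    Here three haplotypes are needed, because two haplotypes can form only one heterozygous pair while the input contains three distinct heterozygous genotypes.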

  3. Are molecular haplotypes worth the time and expense? A cost-effective method for applying molecular haplotypes.

    Directory of Open Access Journals (Sweden)

    Mark A Levenstien

    2006-08-01

    Because current molecular haplotyping methods are expensive and not amenable to automation, many researchers rely on statistical methods to infer haplotype pairs from multilocus genotypes, and subsequently treat these inferred haplotype pairs as observations. These procedures are prone to haplotype misclassification. We examine the effect of these misclassification errors on the false-positive rate and power for two association tests. These tests include the standard likelihood ratio test (LRTstd) and a likelihood ratio test that employs a double-sampling approach to allow for the misclassification inherent in the haplotype inference procedure (LRTae). We aim to determine the cost-benefit relationship of increasing the proportion of individuals with molecular haplotype measurements in addition to genotypes to raise the power gain of the LRTae over the LRTstd. This analysis should provide a guideline for determining the minimum number of molecular haplotypes required for desired power. Our simulations under the null hypothesis of equal haplotype frequencies in cases and controls indicate that (1) for each statistic, permutation methods maintain the correct type I error; (2) specific multilocus genotypes that are misclassified as the incorrect haplotype pair are consistently misclassified throughout each entire dataset; and (3) our simulations under the alternative hypothesis showed a significant power gain for the LRTae over the LRTstd for a subset of the parameter settings. Permutation methods should be used exclusively to determine significance for each statistic. For fixed cost, the power gain of the LRTae over the LRTstd varied depending on the relative costs of genotyping, molecular haplotyping, and phenotyping. The LRTae showed the greatest benefit over the LRTstd when the cost of phenotyping was very high relative to the cost of genotyping. This situation is likely to occur in a replication study as opposed to a whole-genome association study.

  4. An alternative empirical likelihood method in missing response problems and causal inference.

    Science.gov (United States)

    Ren, Kaili; Drummond, Christopher A; Brewster, Pamela S; Haller, Steven T; Tian, Jiang; Cooper, Christopher J; Zhang, Biao

    2016-11-30

    Missing responses are common problems in medical, social, and economic studies. When responses are missing at random, a complete case data analysis may result in biases. A popular debias method is inverse probability weighting proposed by Horvitz and Thompson. To improve efficiency, Robins et al. proposed an augmented inverse probability weighting method. The augmented inverse probability weighting estimator has a double-robustness property and achieves the semiparametric efficiency lower bound when the regression model and propensity score model are both correctly specified. In this paper, we introduce an empirical likelihood-based estimator as an alternative to Qin and Zhang (2007). Our proposed estimator is also doubly robust and locally efficient. Simulation results show that the proposed estimator has better performance when the propensity score is correctly modeled. Moreover, the proposed method can be applied in the estimation of average treatment effect in observational causal inferences. Finally, we apply our method to an observational study of smoking, using data from the Cardiovascular Outcomes in Renal Atherosclerotic Lesions clinical trial. Copyright © 2016 John Wiley & Sons, Ltd.
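
    The two baseline estimators named in the abstract can be sketched for the simple case of estimating a mean response. The code below shows the standard inverse probability weighting and augmented inverse probability weighting forms, assuming the propensity scores and regression predictions are already available; it does not implement the empirical likelihood estimator proposed in the paper, and the function and argument names are hypothetical.

```python
def ipw_mean(y, observed, propensity):
    """Inverse probability weighted mean: (1/n) * sum(r_i * y_i / p_i).

    y[i] may be None when observed[i] is False.
    """
    n = len(y)
    total = sum(
        yi / pi for yi, ri, pi in zip(y, observed, propensity) if ri
    )
    return total / n


def aipw_mean(y, observed, propensity, predicted):
    """Augmented IPW mean.

    (1/n) * sum(r_i * y_i / p_i - (r_i - p_i) / p_i * m_i), where m_i is a
    regression prediction of the response for unit i.
    """
    n = len(y)
    total = 0.0
    for yi, ri, pi, mi in zip(y, observed, propensity, predicted):
        r = 1.0 if ri else 0.0
        weighted = yi / pi if ri else 0.0
        total += weighted - (r - pi) / pi * mi
    return total / n


# Toy data invented for the example: two of four responses are missing.
y = [2.0, None, 4.0, None]
observed = [True, False, True, False]
propensity = [0.5, 0.5, 0.5, 0.5]
predicted = [2.0, 2.0, 4.0, 4.0]
ipw_estimate = ipw_mean(y, observed, propensity)
aipw_estimate = aipw_mean(y, observed, propensity, predicted)
```

    The augmentation term uses the regression predictions for every unit, observed or not, which is what gives the estimator its double-robustness property when either model is correctly specified.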

  5. Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems

    KAUST Repository

    Contreras, Andres A.; Le Maître, Olivier P.; Aquino, Wilkins; Knio, Omar

    2016-01-01

    of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacements field with the random elastic

  6. Haplotype association analysis of human disease traits using genotype data of unrelated individuals

    DEFF Research Database (Denmark)

    Tan, Qihua; Christiansen, Lene; Christensen, Kaare

    2005-01-01

    unphased multi-locus genotype data, ranging from the early approach by the simple gene-counting method to the recent work using the generalized linear model. However, these methods are either confined to case-control design or unable to yield unbiased point and interval estimates of haplotype effects....... Based on the popular logistic regression model, we present a new approach for haplotype association analysis of human disease traits. Using haplotype-based parameterization, our model infers the effects of specific haplotypes (point estimation) and constructs confidence intervals for the risks...... on the well-known logistic regression model is a useful tool for haplotype association analysis of human disease traits....

  7. Decision aid by fuzzy inference: a case study related to the problem of radioactive waste management

    International Nuclear Information System (INIS)

    Krunsch, P.; Fiordalisa, A.; Fortemps, Ph.

    1999-01-01

    This paper illustrates a fuzzy inference system (FIS) developed to assist the economic calculus in radioactive waste management (RWM). The extended time horizons and, in addition, the first-of-a-kind nature of many RWM systems induce large cost uncertainties in project funding. The traditional approach in economic calculus is to include contingency factors in basic cost estimates. A distinction is made between T-factors, used for technological uncertainties, and P-factors, used for project contingencies. In the particular case of nuclear projects, the Electric Power Research Institute (EPRI) has developed specific recommendations for defining both contingency factors. As a generalisation of the EPRI results, a new methodology using fuzzy inference rules is proposed. The inputs to the FIS are derived from the answers of experts regarding both the degrees of technological maturity and project advancement. Inferred T- and P-factors proposed by the FIS are given either as single estimates or as possibility intervals. (authors)

  8. iHAP – integrated haplotype analysis pipeline for characterizing the haplotype structure of genes

    Directory of Open Access Journals (Sweden)

    Lim Yun Ping

    2006-12-01

    Background: The advent of genotype data from large-scale efforts that catalog the genetic variants of different populations has given rise to new avenues for multifactorial disease association studies. Recent work shows that genotype data from the International HapMap Project have a high degree of transferability to the wider population. This implies that the design of genotyping studies on local populations may be facilitated through inferences drawn from information contained in HapMap populations. Results: To facilitate analysis of HapMap data for characterizing the haplotype structure of genes or any chromosomal regions, we have developed an integrated web-based resource, iHAP. In addition to incorporating genotype and haplotype data from the International HapMap Project and gene information from the UCSC Genome Browser Database, iHAP also provides capabilities for inferring haplotype blocks and selecting tag SNPs that are representative of haplotype patterns. These include block partitioning algorithms, block definitions, tag SNP definitions, as well as SNPs to be "force included" as tags. Based on the parameters defined at the input stage, iHAP performs on-the-fly analysis and displays the result graphically as a webpage. To facilitate analysis, intermediate and final result files can be downloaded. Conclusion: The iHAP resource, available at http://ihap.bii.a-star.edu.sg, provides a convenient yet flexible approach for the user community to analyze HapMap data and identify candidate targets for genotyping studies.

  9. A spatial haplotype copying model with applications to genotype imputation.

    Science.gov (United States)

    Yang, Wen-Yun; Hormozdiari, Farhad; Eskin, Eleazar; Pasaniuc, Bogdan

    2015-05-01

    Ever since its introduction, the haplotype copy model has proven to be one of the most successful approaches for modeling genetic variation in human populations, with applications ranging from ancestry inference to genotype phasing and imputation. Motivated by coalescent theory, this approach assumes that any chromosome (haplotype) can be modeled as a mosaic of segments copied from a set of chromosomes sampled from the same population. At the core of the model is the assumption that any chromosome from the sample is equally likely to contribute a priori to the copying process. Motivated by recent works that model genetic variation in a geographic continuum, we propose a new spatial-aware haplotype copy model that jointly models geography and the haplotype copying process. We extend hidden Markov models of haplotype diversity such that at any given location, haplotypes that are closest in the genetic-geographic continuum map are a priori more likely to contribute to the copying process than distant ones. Through simulations starting from the 1000 Genomes data, we show that our model achieves superior accuracy in genotype imputation over the standard spatial-unaware haplotype copy model. In addition, we show the utility of our model in selecting a small personalized reference panel for imputation that leads to both improved accuracy as well as to a lower computational runtime than the standard approach. Finally, we show our proposed model can be used to localize individuals on the genetic-geographical map on the basis of their genotype data.
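    The copying process described in this abstract can be illustrated with a minimal sketch of a generic haplotype-copying hidden Markov model. This is not the authors' implementation: the function name `copying_likelihood`, the parameters `switch_prob` and `error_prob`, and the use of a plain `prior` weight vector as a stand-in for geographic proximity are all assumptions made for illustration. A uniform prior corresponds to the standard model, while a non-uniform prior makes some panel haplotypes a priori more likely to be copied.

```python
import numpy as np


def copying_likelihood(target, panel, prior=None, switch_prob=0.05, error_prob=0.01):
    """Forward-algorithm likelihood of `target` as a mosaic of rows of `panel`.

    All parameter names and default values are hypothetical.
    """
    panel = np.asarray(panel)
    target = np.asarray(target)
    n_haps, n_sites = panel.shape

    # Uniform prior = standard copy model; a non-uniform prior favours
    # some reference haplotypes (e.g. geographically close ones).
    if prior is None:
        prior = np.full(n_haps, 1.0 / n_haps)
    else:
        prior = np.asarray(prior, dtype=float)
    prior = prior / prior.sum()

    def emission(site):
        # Probability of the observed allele given each copied haplotype.
        match = panel[:, site] == target[site]
        return np.where(match, 1.0 - error_prob, error_prob)

    forward = prior * emission(0)
    for site in range(1, n_sites):
        # Either keep copying the same haplotype, or switch to a
        # haplotype drawn from the prior.
        stay = (1.0 - switch_prob) * forward
        switch = switch_prob * forward.sum() * prior
        forward = (stay + switch) * emission(site)
    return float(forward.sum())
```

    In this toy setting, passing a prior that up-weights the panel haplotype identical to the target increases the likelihood relative to the uniform prior, which is the intuition behind weighting nearby haplotypes more heavily.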

  10. Determination of haplotypes at structurally complex regions using emulsion haplotype fusion PCR.

    Science.gov (United States)

    Tyson, Jess; Armour, John A L

    2012-12-11

    Genotyping and massively-parallel sequencing projects result in a vast amount of diploid data that is only rarely resolved into its constituent haplotypes. It is nevertheless this phased information that is transmitted from one generation to the next and is most directly associated with biological function and the genetic causes of biological effects. Despite progress made in genome-wide sequencing and phasing algorithms and methods, problems assembling (and reconstructing linear haplotypes in) regions of repetitive DNA and structural variation remain. These dynamic and structurally complex regions are often poorly understood from a sequence point of view. Regions such as these that are highly similar in their sequence tend to be collapsed onto the genome assembly. This in turn means downstream determination of the true sequence haplotype in these regions poses a particular challenge. For structurally complex regions, a more focussed approach to assembling haplotypes may be required. In order to investigate reconstruction of spatial information at structurally complex regions, we have used an emulsion haplotype fusion PCR approach to reproducibly link sequences of up to 1kb in length to allow phasing of multiple variants from neighbouring loci, using allele-specific PCR and sequencing to detect the phase. By using emulsion systems linking flanking regions to amplicons within the CNV, this led to the reconstruction of a 59kb haplotype across the DEFA1A3 CNV in HapMap individuals. This study has demonstrated a novel use for emulsion haplotype fusion PCR in addressing the issue of reconstructing structural haplotypes at multiallelic copy variable regions, using the DEFA1A3 locus as an example.

  11. Determination of haplotypes at structurally complex regions using emulsion haplotype fusion PCR

    Directory of Open Access Journals (Sweden)

    Tyson Jess

    2012-12-01

    Full Text Available Abstract Background Genotyping and massively-parallel sequencing projects result in a vast amount of diploid data that is only rarely resolved into its constituent haplotypes. It is nevertheless this phased information that is transmitted from one generation to the next and is most directly associated with biological function and the genetic causes of biological effects. Despite progress made in genome-wide sequencing and phasing algorithms and methods, problems assembling (and reconstructing linear haplotypes in) regions of repetitive DNA and structural variation remain. These dynamic and structurally complex regions are often poorly understood from a sequence point of view. Regions such as these that are highly similar in their sequence tend to be collapsed onto the genome assembly. This in turn means downstream determination of the true sequence haplotype in these regions poses a particular challenge. For structurally complex regions, a more focussed approach to assembling haplotypes may be required. Results In order to investigate reconstruction of spatial information at structurally complex regions, we have used an emulsion haplotype fusion PCR approach to reproducibly link sequences of up to 1kb in length to allow phasing of multiple variants from neighbouring loci, using allele-specific PCR and sequencing to detect the phase. By using emulsion systems linking flanking regions to amplicons within the CNV, this led to the reconstruction of a 59kb haplotype across the DEFA1A3 CNV in HapMap individuals. Conclusion This study has demonstrated a novel use for emulsion haplotype fusion PCR in addressing the issue of reconstructing structural haplotypes at multiallelic copy variable regions, using the DEFA1A3 locus as an example.

  12. Entropy, Information Theory, Information Geometry and Bayesian Inference in Data, Signal and Image Processing and Inverse Problems

    Directory of Open Access Journals (Sweden)

    Ali Mohammad-Djafari

    2015-06-01

    Full Text Available The main content of this review article is first to review the main inference tools using Bayes rule, the maximum entropy principle (MEP), information theory, relative entropy and the Kullback–Leibler (KL) divergence, Fisher information and its corresponding geometries. For each of these tools, the precise context of their use is described. The second part of the paper is focused on the ways these tools have been used in data, signal and image processing and in the inverse problems, which arise in different physical sciences and engineering applications. A few examples of the applications are described: entropy in independent components analysis (ICA) and in blind source separation, Fisher information in data model selection, different maximum entropy-based methods in time series spectral estimation and in linear inverse problems and, finally, the Bayesian inference for general inverse problems. Some original materials concerning the approximate Bayesian computation (ABC) and, in particular, the variational Bayesian approximation (VBA) methods are also presented. VBA is used for proposing an alternative Bayesian computational tool to the classical Markov chain Monte Carlo (MCMC) methods. We will also see that VBA encompasses joint maximum a posteriori (MAP), as well as the different expectation-maximization (EM) algorithms as particular cases.

  13. Entropic Inference

    Science.gov (United States)

    Caticha, Ariel

    2011-03-01

    In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops, the Maximum Entropy and the Bayesian methods, into a single general inference scheme.

  14. Entropic Inference

    OpenAIRE

    Caticha, Ariel

    2010-01-01

    In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEn...

  15. Bayes procedures for adaptive inference in inverse problems for the white noise model

    NARCIS (Netherlands)

    Knapik, B.T.; Szabó, B.T.; van der Vaart, A.W.; van Zanten, J.H.

    2016-01-01

    We study empirical and hierarchical Bayes approaches to the problem of estimating an infinite-dimensional parameter in mildly ill-posed inverse problems. We consider a class of prior distributions indexed by a hyperparameter that quantifies regularity. We prove that both methods we consider succeed

  16. HLA-inferred extended haplotype disparity level is more relevant than the level of HLA mismatch alone for the patients survival and GvHD in T cell-replate hematopoietic stem cell transplantation from unrelated donor.

    Science.gov (United States)

    Nowak, Jacek; Nestorowicz, Klaudia; Graczyk-Pol, Elzbieta; Mika-Witkowska, Renata; Rogatko-Koros, Marta; Jaskula, Emilia; Koscinska, Katarzyna; Madej, Sylwia; Tomaszewska, Agnieszka; Nasilowska-Adamska, Barbara; Szczepinski, Andrzej; Halaburda, Kazimierz; Dybko, Jaroslaw; Kuliczkowski, Kazimierz; Czerw, Tomasz; Giebel, Sebastian; Holowiecki, Jerzy; Baranska, Malgorzata; Pieczonka, Anna; Wachowiak, Jacek; Czyz, Anna; Gil, Lidia; Lojko-Dankowska, Anna; Komarnicki, Mieczyslaw; Bieniaszewska, Maria; Kucharska, Agnieszka; Hellmann, Andrzej; Gronkowska, Anna; Jedrzejczak, Wieslaw W; Markiewicz, Miroslaw; Koclega, Anna; Kyrcz-Krzemien, Slawomira; Mielcarek, Monika; Kalwak, Krzysztof; Styczynski, Jan; Wysocki, Mariusz; Drabko, Katarzyna; Wojcik, Beata; Kowalczyk, Jerzy; Gozdzik, Jolanta; Pawliczak, Daria; Gwozdowicz, Slawomir; Dziopa, Joanna; Szlendak, Urszula; Witkowska, Agnieszka; Zubala, Marta; Gawron, Agnieszka; Warzocha, Krzysztof; Lange, Andrzej

    2018-06-01

    Serious risks in unrelated hematopoietic stem cell transplantation (HSCT) including graft versus host disease (GvHD) and mortality are associated with HLA disparity between donor and recipient. The increased risks might be dependent on disparity in not-routinely-tested multiple polymorphisms in the genetically dense MHC region, being organized in combinations of two extended MHC haplotypes (Ehp). We assessed the clinical role of donor-recipient Ehp disparity levels in N = 889 patients by the population-based detection of HLA allele phase mismatch. We found increased GvHD incidences and mortality rates with increasing Ehp mismatch level even with the same HLA mismatch level. In multivariate analysis HLA mismatch levels were excluded from models and Ehp disparity level remained an independent prognostic factor for high grade acute GvHD (p = 0.000037, HR = 10.68, 95% CI 5.50-32.5) and extended chronic GvHD (p < 0.000001, HR = 15.51, 95% CI 5.36-44.8). In the group with a single HLA mismatch, patients with double Ehp disparity had worse 5-year overall survival (45% vs. 56%, p = 0.00065, HR = 4.05, 95% CI 1.69-9.71) and non-relapse mortality (40% vs. 31%, p = 0.00037, HR = 5.63, 95% CI 2.04-15.5) than patients with single Ehp disparity. We conclude that Ehp-linked factors contribute to the high morbidity and mortality in recipients given an HLA-mismatched unrelated transplant and Ehp matching should be considered in clinical HSCT. Copyright © 2018. Published by Elsevier Inc.

  17. Exact algorithms for haplotype assembly from whole-genome sequence data.

    Science.gov (United States)

    Chen, Zhi-Zhong; Deng, Fei; Wang, Lusheng

    2013-08-15

    Haplotypes play a crucial role in genetic analysis and have many applications such as gene disease diagnoses, association studies, ancestry inference and so forth. The development of DNA sequencing technologies makes it possible to obtain haplotypes from a set of aligned reads originating from both copies of a chromosome of a single individual. This approach is often known as haplotype assembly. Exact algorithms that can give optimal solutions to the haplotype assembly problem are in high demand. Unfortunately, previous algorithms for this problem either fail to output optimal solutions or take too long even when executed on a PC cluster. We develop an approach to finding optimal solutions for the haplotype assembly problem under the minimum-error-correction (MEC) model. Most of the previous approaches assume that the columns in the input matrix correspond to (putative) heterozygous sites. This all-heterozygous assumption is correct for most columns, but it may be incorrect for a small number of columns. In this article, we consider the MEC model with or without the all-heterozygous assumption. In our approach, we first use new methods to decompose the input read matrix into small independent blocks and then model the problem for each block as an integer linear programming problem, which is then solved by an integer linear programming solver. We have tested our program on a single PC [a Linux (x64) desktop PC with i7-3960X CPU], using the filtered HuRef and the NA 12878 datasets (after applying some variant calling methods). With the all-heterozygous assumption, our approach can optimally solve the whole HuRef data set within a total time of 31 h (26 h for the most difficult block of the 15th chromosome and only 5 h for the other blocks). To our knowledge, this is the first time that MEC optimal solutions are completely obtained for the filtered HuRef dataset. Moreover, in the general case (without the all-heterozygous assumption), for the HuRef dataset our
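    The MEC objective named in this abstract can be made concrete with a small sketch under the all-heterozygous assumption. It is not the authors' integer linear programming formulation: it scores a candidate haplotype and then finds the optimum by brute-force enumeration, which is only feasible for tiny blocks. The encoding (alleles 0/1, `None` for sites a read does not cover) and all function names are assumptions made for illustration.

```python
from itertools import product

GAP = None  # assumed marker for a site not covered by a read


def mismatches(read, hap):
    """Number of covered sites where the read disagrees with the haplotype."""
    return sum(1 for r, h in zip(read, hap) if r is not GAP and r != h)


def mec_score(reads, hap):
    """Minimum number of corrections so every read fits hap or its complement."""
    complement = [1 - allele for allele in hap]
    return sum(
        min(mismatches(read, hap), mismatches(read, complement))
        for read in reads
    )


def brute_force_mec(reads):
    """Exhaustively search one block; exponential in the number of sites."""
    n_sites = len(reads[0])
    # A haplotype and its complement score the same, so fix the first allele.
    candidates = ((0,) + rest for rest in product((0, 1), repeat=n_sites - 1))
    best = min(candidates, key=lambda hap: mec_score(reads, hap))
    return list(best), mec_score(reads, best)
```

    For example, with the five reads `[0, 0, None]`, `[0, 0, 1]`, `[1, 1, 0]`, `[None, 1, 0]` and `[0, 1, 1]`, the optimal haplotype pair is `[0, 0, 1]` and its complement `[1, 1, 0]`, with a single correction needed in the last read.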

  18. Land cover and water yield: inference problems when comparing catchments with mixed land cover

    Directory of Open Access Journals (Sweden)

    A. I. J. M. van Dijk

    2012-09-01

    Full Text Available Controlled experiments provide strong evidence that changing land cover (e.g. deforestation or afforestation) can affect mean catchment streamflow (Q). By contrast, a similarly strong influence has not been found in studies that interpret Q from multiple catchments with mixed land cover. One possible reason is that there are methodological issues with the way in which the Budyko framework was used in the latter type of studies. We examined this using Q data observed in 278 Australian catchments and by making inferences from synthetic Q data simulated by a hydrological process model (the Australian Water Resources Assessment system Landscape model). The previous contrasting findings could be reproduced. In the synthetic experiment, the land cover influence was still present but not accurately detected with the Budyko framework. Likely sources of interpretation bias demonstrated include: (i) noise in land cover, precipitation and Q data; (ii) additional catchment climate characteristics more important than land cover; and (iii) covariance between Q and catchment attributes. These methodological issues caution against the use of a Budyko framework to quantify a land cover influence in Q data from mixed land-cover catchments. Importantly, however, our findings do not rule out that there may also be physical processes that modify the influence of land cover in mixed land-cover catchments. Process model simulations suggested that lateral water redistribution between vegetation types and recirculation of intercepted rainfall may be important.

  19. Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics

    Science.gov (United States)

    Pohorille, Andrew

    2006-01-01

    The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerable progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described
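    The parallel tempering scheme mentioned in this abstract (several random walks at different temperatures, occasionally exchanged using the Metropolis criterion) can be sketched as follows. This is a generic textbook-style illustration, not code from the cited work: the double-well `energy` function, the inverse temperatures `betas`, the step size and all function names are assumptions chosen for the example.

```python
import math
import random


def energy(x):
    # Illustrative double-well potential with minima near x = -1 and x = +1.
    return 4.0 * (x * x - 1.0) ** 2


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two replicas."""
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


def parallel_tempering(betas, n_steps=2000, step_size=0.5, seed=0):
    """Return the trajectory of the first replica (inverse temperature betas[0])."""
    rng = random.Random(seed)
    states = [1.0 for _ in betas]  # every walker starts in the right-hand well
    samples = []
    for _ in range(n_steps):
        # Local Metropolis move within each replica.
        for k, beta in enumerate(betas):
            proposal = states[k] + rng.uniform(-step_size, step_size)
            delta = energy(proposal) - energy(states[k])
            if delta <= 0 or rng.random() < math.exp(-beta * delta):
                states[k] = proposal
        # Attempt to exchange one randomly chosen pair of neighbouring replicas.
        k = rng.randrange(len(betas) - 1)
        accept = swap_probability(
            betas[k], betas[k + 1], energy(states[k]), energy(states[k + 1])
        )
        if rng.random() < accept:
            states[k], states[k + 1] = states[k + 1], states[k]
        samples.append(states[0])
    return samples
```

    A call such as `parallel_tempering([4.0, 1.0, 0.25])` runs three walkers; the hotter ones cross the barrier more easily, and accepted exchanges pass those configurations down to the coldest replica.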

  20. Mapping Haplotype-haplotype Interactions with Adaptive LASSO

    Directory of Open Access Journals (Sweden)

    Li Ming

    2010-08-01

    Full Text Available Abstract Background The genetic etiology of complex diseases in humans has been commonly viewed as a complex process involving both genetic and environmental factors functioning in a complicated manner. Quite often the interactions among genetic variants play major roles in determining the susceptibility of an individual to a particular disease. Statistical methods for modeling interactions underlying complex diseases between single genetic variants (e.g. single nucleotide polymorphisms or SNPs) have been extensively studied. Recently, haplotype-based analysis has gained popularity among genetic association studies. When multiple sequence or haplotype interactions are involved in determining an individual's susceptibility to a disease, it presents daunting challenges in statistical modeling and testing of the interaction effects, largely due to the complicated higher order epistatic complexity. Results In this article, we propose a new strategy in modeling haplotype-haplotype interactions under the penalized logistic regression framework with adaptive L1-penalty. We consider interactions of sequence variants between haplotype blocks. The adaptive L1-penalty allows simultaneous effect estimation and variable selection in a single model. We propose a new parameter estimation method which estimates and selects parameters by the modified Gauss-Seidel method nested within the EM algorithm. Simulation studies show that it has low false positive rate and reasonable power in detecting haplotype interactions. The method is applied to test haplotype interactions involved in mother and offspring genome in a small for gestational age (SGA) neonates data set, and significant interactions between different genomes are detected. Conclusions As demonstrated by the simulation studies and real data analysis, the approach developed provides an efficient tool for the modeling and testing of haplotype interactions. The implementation of the method in R codes can be

  1. Inference in "poor" languages

    Energy Technology Data Exchange (ETDEWEB)

    Petrov, S.

    1996-10-01

    Languages with a solvable implication problem but without complete and consistent systems of inference rules ("poor" languages) are considered. The problem of existence of a finite complete and consistent inference rule system for a "poor" language is stated independently of the language or rules syntax. Several properties of the problem are proved. An application of results to the language of join dependencies is given.

  2. Haplotype phasing and inheritance of copy number variants in nuclear families.

    Science.gov (United States)

    Palta, Priit; Kaplinski, Lauris; Nagirnaja, Liina; Veidenberg, Andres; Möls, Märt; Nelis, Mari; Esko, Tõnu; Metspalu, Andres; Laan, Maris; Remm, Maido

    2015-01-01

    DNA copy number variants (CNVs) that alter the copy number of a particular DNA segment in the genome play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk to a variety of human diseases thus highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of different haplotypes present within the CNV regions. We have developed a novel computational method, called PiCNV, which makes it possible to resolve the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm allows one to i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV made it possible to unambiguously phase normal and CNV-carrying haplotypes and follow their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified an emergence of several de novo deletions and duplications in the offspring.

  3. Haplotype phasing and inheritance of copy number variants in nuclear families.

    Directory of Open Access Journals (Sweden)

    Priit Palta

    Full Text Available DNA copy number variants (CNVs) that alter the copy number of a particular DNA segment in the genome play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk to a variety of human diseases thus highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of different haplotypes present within the CNV regions. We have developed a novel computational method, called PiCNV, which makes it possible to resolve the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm allows one to i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV made it possible to unambiguously phase normal and CNV-carrying haplotypes and follow their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified an emergence of several de novo deletions and duplications in the offspring.

  4. Inferring demographic history from a spectrum of shared haplotype lengths

    DEFF Research Database (Denmark)

    Harris, Kelley; Nielsen, Rasmus

    2013-01-01

    There has been much recent excitement about the use of genetics to elucidate ancestral history and demography. Whole genome data from humans and other species are revealing complex stories of divergence and admixture that were left undiscovered by previous smaller data sets. A central challenge...... is to estimate the timing of past admixture and divergence events, for example the time at which Neanderthals exchanged genetic material with humans and the time at which modern humans left Africa. Here, we present a method for using sequence data to jointly estimate the timing and magnitude of past admixture...

  5. Genetics of chloroquine-resistant malaria: a haplotypic view

    Directory of Open Access Journals (Sweden)

    Gauri Awasthi

    2013-12-01

    Full Text Available The development and rapid spread of chloroquine resistance (CQR) in Plasmodium falciparum have triggered the identification of several genetic target(s) in the P. falciparum genome. In particular, mutations in the Pfcrt gene, specifically, K76T and mutations in three other amino acids in the region adjoining K76 (residues 72, 74, 75 and 76), are considered to be highly related to CQR. These various mutations form several different haplotypes and Pfcrt gene polymorphisms and the global distribution of the different CQR-Pfcrt haplotypes in endemic and non-endemic regions of P. falciparum malaria have been the subject of extensive study. Despite the fact that the Pfcrt gene is considered to be the primary CQR gene in P. falciparum, several studies have suggested that this may not be the case. Furthermore, there is a poor correlation between the evolutionary implications of the Pfcrt haplotypes and the inferred migration of CQR P. falciparum based on CQR epidemiological surveillance data. The present paper aims to clarify the existing knowledge on the genetic basis of the different CQR-Pfcrt haplotypes that are prevalent in worldwide populations based on the published literature and to analyse the data to generate hypotheses on the genetics and evolution of CQR malaria.

  6. Influence of promoter/enhancer region haplotypes on MGMT transcriptional regulation: a potential biomarker for human sensitivity to alkylating agents.

    Science.gov (United States)

    Xu, Meixiang; Nekhayeva, Ilona; Cross, Courtney E; Rondelli, Catherine M; Wickliffe, Jeffrey K; Abdel-Rahman, Sherif Z

    2014-03-01

    The O6-methylguanine-DNA methyltransferase gene (MGMT) encodes the direct reversal DNA repair protein that removes alkyl adducts from the O6 position of guanine. Several single-nucleotide polymorphisms (SNPs) exist in the MGMT promoter/enhancer (P/E) region. However, the haplotype structure encompassing these SNPs and their functional/biological significance are currently unknown. We hypothesized that MGMT P/E haplotypes, rather than individual SNPs, alter MGMT transcription and can thus alter human sensitivity to alkylating agents. To identify the haplotype structure encompassing the MGMT P/E region SNPs, we sequenced 104 DNA samples from healthy individuals and inferred the haplotypes using the data generated. We identified eight SNPs in this region, namely T7C (rs180989103), T135G (rs1711646), G290A (rs61859810), C485A (rs1625649), C575A (rs113813075), G666A (rs34180180), C777A (rs34138162) and C1099T (rs16906252). Phylogenetics and Sequence Evolution analysis predicted 21 potential haplotypes that encompass these SNPs ranging in frequencies from 0.000048 to 0.39. Of these, 10 were identified in our study population as 20 paired haplotype combinations. To determine the functional significance of these haplotypes, luciferase reporter constructs representing these haplotypes were transfected into glioblastoma cells and their effect on MGMT promoter activity was determined. Compared with the most common (reference) haplotype 1, seven haplotypes significantly upregulated MGMT promoter activity (18-119% increase; P alkylating agents.

  7. Geometric statistical inference

    International Nuclear Information System (INIS)

    Periwal, Vipul

    1999-01-01

    A reparametrization-covariant formulation of the inverse problem of probability is explicitly solved for finite sample sizes. The inferred distribution is explicitly continuous for finite sample size. A geometric solution of the statistical inference problem in higher dimensions is outlined

  8. Two haplotype clusters of Echinococcus granulosus sensu stricto in northern Iraq (Kurdistan region) support the hypothesis of a parasite cradle in the Middle East.

    Science.gov (United States)

    Hassan, Zuber Ismael; Meerkhan, Azad Abdullah; Boufana, Belgees; Hama, Abdullah A; Ahmed, Bayram Dawod; Mero, Wijdan Mohammed Salih; Orsten, Serra; Interisano, Maria; Pozio, Edoardo; Casulli, Adriano

    2017-08-01

    Human cystic echinococcosis (CE) caused by Echinococcus granulosus s.s. is a major public health problem in Iraqi Kurdistan with a reported surgical incidence of 6.3 per 100,000 Arbil inhabitants. A total of 125 Echinococcus isolates retrieved from sheep, goats and cattle were used in this study. Our aim was to determine species/genotypes infecting livestock in Iraqi Kurdistan and examine intraspecific variation and population structure of Echinococcus granulosus s.s. in this region and relate it to that of other regions worldwide. Using nucleotide sequences of the mitochondrial cytochrome c oxidase subunit 1 (cox 1) we identified E. granulosus s.s. as the cause of hydatidosis in all examined animals. The haplotype network displayed a double-clustered topology with two main E. granulosus s.s. haplotypes, (KU05) and (KU33). The 'founder' haplotype (KU05) confirmed the presence of a common lineage of non-genetically differentiated populations as inferred by the low non-significant fixation index values. Overall diversity and neutrality indices indicated demographic expansion. We used E. granulosus s.s. nucleotide sequences from GenBank to draw haplotype networks for the Middle East (Iran, Jordan and Turkey), Europe (Albania, Greece, Italy, Romania and Spain), China, Mongolia, Russia, South America (Argentina, Brazil, Chile and Mexico) and Tunisia. Networks with two haplotype clusters like that reported here for Iraqi Kurdistan were seen for the Middle East, Europe, Mongolia, Russia and Tunisia using both 827bp and 1609bp cox1 nucleotide sequences, whereas a star-like network was observed for China and South America. We hypothesize that the double clustering seen at what is generally assumed to be the cradle of domestication may have emerged independently and dispersed from the Middle East to other regions and that haplotype (KU33) may be the main haplotype within a second cluster in the Middle East from where it has spread into Europe, Mongolia, Russia and North

  9. Combinatorial aspects of genome rearrangements and haplotype networks

    OpenAIRE

    Labarre, Anthony

    2008-01-01

    The dissertation covers two problems motivated by computational biology: genome rearrangements, and haplotype networks. Genome rearrangement problems are a particular case of edit distance problems, where one seeks to transform two given objects into one another using as few operations as possible, with the additional constraint that the set of allowed operations is fixed beforehand; we are also interested in computing the corresponding distances between those objects, i.e. merely computing t...

  10. Statistical inference

    CERN Document Server

    Rohatgi, Vijay K

    2003-01-01

    Unified treatment of probability and statistics examines and analyzes the relationship between the two fields, exploring inferential issues. Numerous problems, examples, and diagrams--some with solutions--plus clear-cut, highlighted summaries of results. Advanced undergraduate to graduate level. Contents: 1. Introduction. 2. Probability Model. 3. Probability Distributions. 4. Introduction to Statistical Inference. 5. More on Mathematical Expectation. 6. Some Discrete Models. 7. Some Continuous Models. 8. Functions of Random Variables and Random Vectors. 9. Large-Sample Theory. 10. General Meth

  11. SEMANTIC PATCH INFERENCE

    DEFF Research Database (Denmark)

    Andersen, Jesper

    2009-01-01

    Collateral evolution is the problem of updating several library-using programs in response to API changes in the used library. In this dissertation we address the issue of understanding collateral evolutions by automatically inferring a high-level specification of the changes evident in a given set ...... specifications inferred by spdiff in Linux are shown. We find that the inferred specifications concisely capture the actual collateral evolution performed in the examples....

  12. Knowledge and inference

    CERN Document Server

    Nagao, Makoto

    1990-01-01

    Knowledge and Inference discusses an important problem for software systems: How do we treat knowledge and ideas on a computer and how do we use inference to solve problems on a computer? The book talks about the problems of knowledge and inference for the purpose of merging artificial intelligence and library science. The book begins by clarifying the concept of "knowledge" from many points of view, followed by a chapter on the current state of library science and the place of artificial intelligence in library science. Subsequent chapters cover central topics in the artificial intellig

  13. Practical Bayesian Inference

    Science.gov (United States)

    Bailer-Jones, Coryn A. L.

    2017-04-01

    Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.

  14. On quantum statistical inference

    NARCIS (Netherlands)

    Barndorff-Nielsen, O.E.; Gill, R.D.; Jupp, P.E.

    2003-01-01

    Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, developments in the theory of quantum measurements have

  15. Fast Markov chain Monte Carlo sampling for sparse Bayesian inference in high-dimensional inverse problems using L1-type priors

    International Nuclear Information System (INIS)

    Lucka, Felix

    2012-01-01

    Sparsity has become a key concept for solving of high-dimensional inverse problems using variational regularization techniques. Recently, using similar sparsity-constraints in the Bayesian framework for inverse problems by encoding them in the prior distribution has attracted attention. Important questions about the relation between regularization theory and Bayesian inference still need to be addressed when using sparsity promoting inversion. A practical obstacle for these examinations is the lack of fast posterior sampling algorithms for sparse, high-dimensional Bayesian inversion. Accessing the full range of Bayesian inference methods requires being able to draw samples from the posterior probability distribution in a fast and efficient way. This is usually done using Markov chain Monte Carlo (MCMC) sampling algorithms. In this paper, we develop and examine a new implementation of a single component Gibbs MCMC sampler for sparse priors relying on L1-norms. We demonstrate that the efficiency of our Gibbs sampler increases when the level of sparsity or the dimension of the unknowns is increased. This property is contrary to the properties of the most commonly applied Metropolis–Hastings (MH) sampling schemes. We demonstrate that the efficiency of MH schemes for L1-type priors dramatically decreases when the level of sparsity or the dimension of the unknowns is increased. Practically, Bayesian inversion for L1-type priors using MH samplers is not feasible at all. As this is commonly believed to be an intrinsic feature of MCMC sampling, the performance of our Gibbs sampler also challenges common beliefs about the applicability of sample based Bayesian inference. (paper)
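    As a reading aid, the kind of posterior the abstract refers to can be written out under some assumed notation. None of the symbols below come from the abstract: a linear forward operator A, data f, Gaussian noise with standard deviation \sigma, and a prior weight \lambda are all illustrative assumptions. With an L1-type prior, the conditional density of a single component u_i given the others is an exponential of a quadratic plus an absolute-value term, which is the type of one-dimensional density a single component Gibbs sampler draws from.

```latex
% Assumed model (not stated in the abstract): f = A u + noise, noise ~ N(0, \sigma^2 I)
p(u \mid f) \;\propto\; \exp\!\Big( -\tfrac{1}{2\sigma^2}\,\lVert f - A u \rVert_2^2 \;-\; \lambda \lVert u \rVert_1 \Big)

% Conditional for one component u_i with all others (u_{-i}) held fixed;
% A_i is the i-th column of A and A_{-i} u_{-i} the contribution of the rest.
p(u_i \mid u_{-i}, f) \;\propto\; \exp\!\big( -a\,u_i^2 + b\,u_i - \lambda\,\lvert u_i \rvert \big),
\qquad
a = \frac{\lVert A_i \rVert_2^2}{2\sigma^2},
\quad
b = \frac{A_i^{\top}\big(f - A_{-i} u_{-i}\big)}{\sigma^2}
```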

  16. Estimating haplotype effects for survival data

    DEFF Research Database (Denmark)

    Scheike, Thomas; Martinussen, Torben; Silver, J

    2010-01-01

    Genetic association studies often investigate the effect of haplotypes on an outcome of interest. Haplotypes are not observed directly, and this complicates the inclusion of such effects in survival models. We describe a new estimating equations approach for Cox's regression model to assess haplo...

  17. Haplotype-Based Genotyping in Polyploids

    Directory of Open Access Journals (Sweden)

    Josh P. Clevenger

    2018-04-01

    Accurate identification of polymorphisms from sequence data is crucial to unlocking the potential of high throughput sequencing for genomics. Single nucleotide polymorphisms (SNPs) are difficult to accurately identify in polyploid crops due to the duplicative nature of polyploid genomes leading to low confidence in the true alignment of short reads. Implementing a haplotype-based method in contrasting subgenome-specific sequences leads to higher accuracy of SNP identification in polyploids. To test this method, a large-scale 48K SNP array (Axiom Arachis2) was developed for Arachis hypogaea (peanut), an allotetraploid, in which 1,674 haplotype-based SNPs were included. Results of the array show that 74% of the haplotype-based SNP markers could be validated, which is considerably higher than previous methods used for peanut. The haplotype method has been implemented in a standalone program, HAPLOSWEEP, which takes as input bam files and a vcf file and identifies haplotype-based markers. Haplotype discovery can be made within single reads or span paired reads, and can leverage long read technology by targeting any length of haplotype. Haplotype-based genotyping is applicable in all allopolyploid genomes and provides confidence in marker identification and in silico-based genotyping for polyploid genomics.

  18. Haplotype assembly in polyploid genomes and identical by descent shared tracts.

    Science.gov (United States)

    Aguiar, Derek; Istrail, Sorin

    2013-07-01

    Genome-wide haplotype reconstruction from sequence data, or haplotype assembly, is at the center of major challenges in molecular biology and life sciences. For complex eukaryotic organisms like humans, the genome is vast and the population samples are growing so rapidly that algorithms processing high-throughput sequencing data must scale favorably in terms of both accuracy and computational efficiency. Furthermore, current models and methodologies for haplotype assembly (i) do not consider individuals sharing haplotypes jointly, which reduces the size and accuracy of assembled haplotypes, and (ii) are unable to model genomes having more than two sets of homologous chromosomes (polyploidy). Polyploid organisms are increasingly becoming the target of many research groups interested in the genomics of disease, phylogenetics, botany and evolution but there is an absence of theory and methods for polyploid haplotype reconstruction. In this work, we present a number of results, extensions and generalizations of compass graphs and our HapCompass framework. We prove the theoretical complexity of two haplotype assembly optimizations, thereby motivating the use of heuristics. Furthermore, we present graph theory-based algorithms for the problem of haplotype assembly using our previously developed HapCompass framework for (i) novel implementations of haplotype assembly optimizations (minimum error correction), (ii) assembly of a pair of individuals sharing a haplotype tract identical by descent and (iii) assembly of polyploid genomes. We evaluate our methods on 1000 Genomes Project, Pacific Biosciences and simulated sequence data. HapCompass is available for download at http://www.brown.edu/Research/Istrail_Lab/. Supplementary data are available at Bioinformatics online.
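    The minimum error correction (MEC) objective mentioned above can be illustrated with a small sketch. This is not HapCompass code; the function names, the string encoding of alleles, and the use of "-" for positions a fragment does not cover are all assumptions made for illustration. The score assigns each fragment to its closest candidate haplotype and sums the mismatches, and it accepts any number of haplotypes, so the same function covers the diploid and polyploid cases.

```python
from typing import Sequence


def mismatches(fragment: str, haplotype: str) -> int:
    """Count allele mismatches, skipping positions the fragment does not cover ('-')."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def mec_score(fragments: Sequence[str], haplotypes: Sequence[str]) -> int:
    """Sum, over all fragments, of the distance to the nearest candidate haplotype.

    Each fragment is implicitly assigned to the haplotype it disagrees with
    least; the total is the number of allele calls that would have to be
    corrected for every fragment to match its assigned haplotype.
    """
    return sum(min(mismatches(f, h) for h in haplotypes) for f in fragments)


# Hypothetical toy data: five fragments over four SNP sites, two haplotypes.
fragments = ["01-0", "0100", "10-1", "1-11", "0110"]
score = mec_score(fragments, ["0100", "1011"])
print(score)
```

    A haplotype assembly method would search over candidate haplotype sets to minimise this score; the sketch only evaluates one given candidate.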

  19. Factor IX gene haplotypes in Amerindians.

    Science.gov (United States)

    Franco, R F; Araújo, A G; Zago, M A; Guerreiro, J F; Figueiredo, M S

    1997-02-01

    We have determined the haplotypes of the factor IX gene for 95 Indians from 5 Brazilian Amazon tribes: Wayampí, Wayana-Apalaí, Kayapó, Arára, and Yanomámi. Eight polymorphisms linked to the factor IX gene were investigated: MseI (at 5', nt -698), BamHI (at 5', nt -561), DdeI (intron 1), BamHI (intron 2), XmnI (intron 3), TaqI (intron 4), MspI (intron 4), and HhaI (at 3', approximately 8 kb). The results of the haplotype distribution and the allele frequencies for each of the factor IX gene polymorphisms in Amerindians were similar to the results reported for Asian populations but differed from results for other ethnic groups. Only five haplotypes were identified within the entire Amerindian study population, and the haplotype distribution was significantly different among the five tribes, with one (Arára) to four (Wayampí) haplotypes being found per tribe. These findings indicate a significant heterogeneity among the Indian tribes and contrast with the homogeneous distribution of the beta-globin gene cluster haplotypes but agree with our recent findings on the distribution of alpha-globin gene cluster haplotypes and the allele frequencies for six VNTRs in the same Amerindian tribes. Our data represent the first study of factor IX-associated polymorphisms in Amerindian populations and emphasize the applicability of these genetic markers for population and human evolution studies.

  20. Haplotypes of CYP3A4 and their close linkage with CYP3A5 haplotypes in a Japanese population.

    Science.gov (United States)

    Fukushima-Uesaka, Hiromi; Saito, Yoshiro; Watanabe, Hidemi; Shiseki, Kisho; Saeki, Mayumi; Nakamura, Takahiro; Kurose, Kouichi; Sai, Kimie; Komamura, Kazuo; Ueno, Kazuyuki; Kamakura, Shiro; Kitakaze, Masafumi; Hanai, Sotaro; Nakajima, Toshiharu; Matsumoto, Kenji; Saito, Hirohisa; Goto, Yu-ichi; Kimura, Hideo; Katoh, Masaaki; Sugai, Kenji; Minami, Narihiro; Shirao, Kuniaki; Tamura, Tomohide; Yamamoto, Noboru; Minami, Hironobu; Ohtsu, Atsushi; Yoshida, Teruhiko; Saijo, Nagahiro; Kitamura, Yutaka; Kamatani, Naoyuki; Ozawa, Shogo; Sawada, Jun-ichi

    2004-01-01

    In order to identify single nucleotide polymorphisms (SNPs) and haplotype frequencies of CYP3A4 in a Japanese population, the distal enhancer and proximal promoter regions, all exons, and the surrounding introns were sequenced from genomic DNA of 416 Japanese subjects. We found 24 SNPs, including 17 novel ones: two in the distal enhancer, four in the proximal promoter, one in the 5'-untranslated region (UTR), seven in the introns, and three in the 3'-UTR. The most common SNP was c.1026+12G>A (IVS10+12G>A), with a 0.249 frequency. Four non-synonymous SNPs, c.554C>G (p.T185S, CYP3A4(*)16), c.830_831insA (p.E277fsX8, (*)6), c.878T>C (p.L293P, (*)18), and c.1088 C>T (p.T363M, (*)11) were found with frequencies of 0.014, 0.001, 0.028, and 0.002, respectively. No SNP was found in the known nuclear transcriptional factor-binding sites in the enhancer and promoter regions. Using these 24 SNPs, 16 haplotypes were unambiguously identified, and nine haplotypes were inferred by aid of an expectation-maximization-based program. In addition, using data from 186 subjects enabled a close linkage to be found between CYP3A4 and CYP3A5 SNPs, especially among the SNPs at c.1026+12 in CYP3A4 and c.219-237 (IVS3-237, a key SNP site for CYP3A5(*)3), c.865+77 (IVS9+77) and c.1523 in CYP3A5. This result suggested that CYP3A4 and CYP3A5 are within the same gene block. Haplotype analysis between CYP3A4 and CYP3A5 revealed several major haplotype combinations in the CYP3A4-CYP3A5 block. Our findings provide fundamental and useful information for genotyping CYP3A4 (and CYP3A5) in the Japanese, and probably Asian populations. Copyright 2003 Wiley-Liss, Inc.

  1. Spatial and temporal distribution of the neutral polymorphisms in the last ZFX intron: analysis of the haplotype structure and genealogy.

    Science.gov (United States)

    Jaruzelska, J; Zietkiewicz, E; Batzer, M; Cole, D E; Moisan, J P; Scozzari, R; Tavaré, S; Labuda, D

    1999-07-01

    With 10 segregating sites (simple nucleotide polymorphisms) in the last intron (1089 bp) of the ZFX gene we have observed 11 haplotypes in 336 chromosomes representing a worldwide array of 15 human populations. Two haplotypes representing 77% of all chromosomes were distributed almost evenly among four continents. Five of the remaining haplotypes were detected in Africa and 4 others were restricted to Eurasia and the Americas. Using the information about the ancestral state of the segregating positions (inferred from human-great ape comparisons), we applied coalescent analysis to estimate the age of the polymorphisms and the resulting haplotypes. The oldest haplotype, with the ancestral alleles at all the sites, was observed at low frequency only in two groups of African origin. Its estimated age of 740 to 1100 kyr corresponded to the time to the most recent common ancestor. The two most frequent worldwide distributed haplotypes were estimated at 550 to 840 and 260 to 400 kyr, respectively, while the age of the continentally restricted polymorphisms was 120 to 180 kyr and smaller. Comparison of spatial and temporal distribution of the ZFX haplotypes suggests that modern humans diverged from the common ancestral stock in the Middle Paleolithic era. Subsequent range expansion prevented substantial gene flow among continents, separating African groups from populations that colonized Eurasia and the New World.

  2. Genetic differences in the two main groups of the Japanese population based on autosomal SNPs and haplotypes.

    Science.gov (United States)

    Yamaguchi-Kabata, Yumi; Tsunoda, Tatsuhiko; Kumasaka, Natsuhiko; Takahashi, Atsushi; Hosono, Naoya; Kubo, Michiaki; Nakamura, Yusuke; Kamatani, Naoyuki

    2012-05-01

    Although the Japanese population has a rather low genetic diversity, we recently confirmed the presence of two main clusters (the Hondo and Ryukyu clusters) through principal component analysis of genome-wide single-nucleotide polymorphism (SNP) genotypes. Understanding the genetic differences between the two main clusters requires further genome-wide analyses based on a dense SNP set and comparison of haplotype frequencies. In the present study, we determined haplotypes for the Hondo cluster of the Japanese population by detecting SNP homozygotes with 388,591 autosomal SNPs from 18,379 individuals and estimated the haplotype frequencies. Haplotypes for the Ryukyu cluster were inferred by a statistical approach using the genotype data from 504 individuals. We then compared the haplotype frequencies between the Hondo and Ryukyu clusters. In most genomic regions, the haplotype frequencies in the Hondo and Ryukyu clusters were very similar. However, in addition to the human leukocyte antigen region on chromosome 6, other genomic regions (chromosomes 3, 4, 5, 7, 10 and 12) showed dissimilarities in haplotype frequency. These regions were enriched for genes involved in the immune system, cell-cell adhesion and the intracellular signaling cascade. These differentiated genomic regions between the Hondo and Ryukyu clusters are of interest because they (1) should be examined carefully in association studies and (2) likely contain genes responsible for morphological or physiological differences between the two groups.

  3. Y-chromosome STR haplotypes in Somalis

    DEFF Research Database (Denmark)

    Hallenberg, Charlotte; Simonsen, Bo; Sanchez Sanchez, Juan Jose

    2005-01-01

    A total of 201 males from Somalia were typed for the Y-chromosome STRs DYS19, DYS385a/b, DYS389-I, DYS389-II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438 and DYS439 with the PowerPlex Y kit (Promega). A total of 96 different haplotypes were observed and the haplotype diversity was 0.9715. The ...
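    The abstract does not say which estimator produced the haplotype diversity value. A common choice, assumed here, is Nei's unbiased estimator H = n/(n-1) * (1 - sum of squared haplotype frequencies), where n is the sample size. The sketch below computes it from a list of per-haplotype counts; the function name and the toy counts are hypothetical, and the study's own counts are not given in the abstract, so its reported value is not reproduced here.

```python
from typing import Sequence


def haplotype_diversity(counts: Sequence[int]) -> float:
    """Nei's unbiased haplotype diversity from per-haplotype sample counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical toy sample: three distinct haplotypes seen 3, 2 and 1 times.
print(haplotype_diversity([3, 2, 1]))
```

    When every sampled individual carries a distinct haplotype the estimator equals 1, and it falls toward 0 as one haplotype dominates the sample.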

  4. Three Novel Haplotypes of Theileria bicornis in Black and White Rhinoceros in Kenya.

    Science.gov (United States)

    Otiende, M Y; Kivata, M W; Jowers, M J; Makumi, J N; Runo, S; Obanda, V; Gakuya, F; Mutinda, M; Kariuki, L; Alasaad, S

    2016-02-01

    Piroplasms, especially those in the genera Babesia and Theileria, have been found to naturally infect rhinoceros. Due to natural or human-induced stress factors such as capture and translocations, animals often develop fatal clinical piroplasmosis, which causes death if not treated. This study examines the genetic diversity and occurrence of novel Theileria species infecting both black and white rhinoceros in Kenya. Samples collected opportunistically during routine translocations and clinical interventions from 15 rhinoceros were analysed by polymerase chain reaction (PCR) using a nested amplification of the small subunit ribosomal RNA (18S rRNA) gene fragments of Babesia and Theileria. Our study revealed for the first time in Kenya the presence of Theileria bicornis in white (Ceratotherium simum simum) and black (Diceros bicornis michaeli) rhinoceros and the existence of three new haplotypes: haplotypes H1 and H3 were present in white rhinoceros, while H2 was present in black rhinoceros. No specific haplotype was correlated to any specific geographical location. The Bayesian inference 50% consensus phylogram recovered the three haplotypes monophyletically, and Theileria bicornis had very high support (BPP: 0.98). Furthermore, the genetic p-uncorrected distances and substitutions between T. bicornis and the three haplotypes were the same in all three haplotypes, indicating a very close genetic affinity. This is the first report of the occurrence of Theileria species in white and black rhinoceros from Kenya. The three new haplotypes reported here for the first time have important ecological and conservational implications, especially for population management and translocation programs and as a means of avoiding the transport of infected animals into non-affected areas. © 2014 Blackwell Verlag GmbH.

  5. Distributional Inference

    NARCIS (Netherlands)

    Kroese, A.H.; van der Meulen, E.A.; Poortema, Klaas; Schaafsma, W.

    1995-01-01

    The making of statistical inferences in distributional form is conceptually complicated because the epistemic 'probabilities' assigned are mixtures of fact and fiction. In this respect they are essentially different from 'physical' or 'frequency-theoretic' probabilities. The distributional form is

  6. A general approach for haplotype phasing across the full spectrum of relatedness.

    Directory of Open Access Journals (Sweden)

    Jared O'Connell

    2014-04-01

    Many existing cohorts contain a range of relatedness between genotyped individuals, either by design or by chance. Haplotype estimation in such cohorts is a central step in many downstream analyses. Using genotypes from six cohorts from isolated populations and two cohorts from non-isolated populations, we have investigated the performance of different phasing methods designed for nominally 'unrelated' individuals. We find that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations. In particular, when large amounts of IBD sharing are present, SHAPEIT2 infers close to perfect haplotypes. Based on these results we have developed a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals. First SHAPEIT2 is run ignoring all explicit family information. We then apply a novel HMM method (duoHMM) to combine the SHAPEIT2 haplotypes with any family information to infer the inheritance pattern of each meiosis at all sites across each chromosome. This allows the correction of switch errors, detection of recombination events and genotyping errors. We show that the method detects numbers of recombination events that align very well with expectations based on genetic maps, and that it infers far fewer spurious recombination events than Merlin. The method can also detect genotyping errors and infer recombination events in otherwise uninformative families, such as trios and duos. The detected recombination events can be used in association scans for recombination phenotypes. The method provides a simple and unified approach to haplotype estimation, that will be of interest to researchers in the fields of human, animal and plant genetics.
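    The switch error rate used above as the accuracy measure can be sketched as follows. This is a generic illustration, not code from SHAPEIT2 or duoHMM; the function name and the encoding are assumptions. Each haplotype is given as a string of 0/1 alleles at the individual's heterozygous sites only, so the second haplotype is the complement of the first, and a switch is counted wherever the inferred phase flips relative to the truth between two consecutive sites.

```python
def switch_error_rate(truth: str, inferred: str) -> float:
    """Fraction of adjacent heterozygous-site pairs whose relative phase is wrong.

    Both arguments describe one haplotype of the same individual at the same
    heterozygous sites, encoded as '0'/'1'. Which of the two haplotypes is
    listed does not matter: a fully complemented haplotype has zero switches.
    """
    if len(truth) != len(inferred):
        raise ValueError("haplotypes must cover the same sites")
    if len(truth) < 2:
        return 0.0
    # True where the inferred allele agrees with the true haplotype.
    agree = [t == i for t, i in zip(truth, inferred)]
    # A switch occurs where agreement changes between neighbouring sites.
    switches = sum(1 for prev, cur in zip(agree, agree[1:]) if prev != cur)
    return switches / (len(agree) - 1)


# Hypothetical example: the phase flips once, between the third and fourth site.
print(switch_error_rate("010101", "010010"))
```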

  7. Variations on Bayesian Prediction and Inference

    Science.gov (United States)

    2016-05-09

    ... There are a number of statistical inference problems that are not generally formulated via a full probability model ... the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle

  8. Multimodel inference and adaptive management

    Science.gov (United States)

    Rehme, S.E.; Powell, L.A.; Allen, Craig R.

    2011-01-01

    Ecology is an inherently complex science coping with correlated variables, nonlinear interactions and multiple scales of pattern and process, making it difficult for experiments to result in clear, strong inference. Natural resource managers, policy makers, and stakeholders rely on science to provide timely and accurate management recommendations. However, the time necessary to untangle the complexities of interactions within ecosystems is often far greater than the time available to make management decisions. One method of coping with this problem is multimodel inference. Multimodel inference assesses uncertainty by calculating likelihoods among multiple competing hypotheses, but multimodel inference results are often equivocal. Despite this, there may be pressure for ecologists to provide management recommendations regardless of the strength of their study’s inference. We reviewed papers in the Journal of Wildlife Management (JWM) and the journal Conservation Biology (CB) to quantify the prevalence of multimodel inference approaches, the resulting inference (weak versus strong), and how authors dealt with the uncertainty. Thirty-eight percent and 14%, respectively, of articles in the JWM and CB used multimodel inference approaches. Strong inference was rarely observed, with only 7% of JWM and 20% of CB articles resulting in strong inference. We found the majority of weak inference papers in both journals (59%) gave specific management recommendations. Model selection uncertainty was ignored in most recommendations for management. We suggest that adaptive management is an ideal method to resolve uncertainty when research results in weak inference.
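    One widely used way to express model selection uncertainty in multimodel inference is through Akaike weights. The abstract does not say which criterion the reviewed papers used, so treating the inputs as AIC values is an assumption, and the function name and example numbers are hypothetical. Each weight is the relative likelihood exp(-delta/2) of a model, where delta is its AIC minus the smallest AIC in the set, normalised so the weights sum to one.

```python
import math
from typing import List, Sequence


def akaike_weights(aic_values: Sequence[float]) -> List[float]:
    """Convert AIC scores for a candidate model set into Akaike weights."""
    if not aic_values:
        raise ValueError("need at least one model")
    best = min(aic_values)
    # Relative likelihood of each model given the data, scaled to the best model.
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical AIC scores for three competing models.
weights = akaike_weights([100.0, 102.0, 110.0])
print([round(w, 3) for w in weights])
```

    Weights spread across several models, with none close to one, correspond to the equivocal outcome the abstract calls weak inference.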

  9. Haplotype-based stratification of Huntington's disease.

    Science.gov (United States)

    Chao, Michael J; Gillis, Tammy; Atwal, Ranjit S; Mysore, Jayalakshmi Srinidhi; Arjomand, Jamshid; Harold, Denise; Holmans, Peter; Jones, Lesley; Orth, Michael; Myers, Richard H; Kwak, Seung; Wheeler, Vanessa C; MacDonald, Marcy E; Gusella, James F; Lee, Jong-Min

    2017-11-01

    Huntington's disease (HD) is an autosomal dominant neurodegenerative disease caused by expansion of a CAG trinucleotide repeat in HTT, resulting in an extended polyglutamine tract in huntingtin. We and others have previously determined that the HD-causing expansion occurs on multiple different haplotype backbones, reflecting more than one ancestral origin of the same type of mutation. In view of the therapeutic potential of mutant allele-specific gene silencing, we have compared and integrated two major systems of HTT haplotype definition, combining data from 74 sequence variants to identify the most frequent disease-associated and control chromosome backbones and revealing that there is potential for additional resolution of HD haplotypes. We have used the large collection of 4078 heterozygous HD subjects analyzed in our recent genome-wide association study of HD age at onset to estimate the frequency of these haplotypes in European subjects, finding that common genetic variation at HTT can distinguish the normal and CAG-expanded chromosomes for more than 95% of European HD individuals. As a resource for the HD research community, we have also determined the haplotypes present in a series of publicly available HD subject-derived fibroblasts, induced pluripotent cells, and embryonic stem cells in order to facilitate efforts to develop inclusive methods of allele-specific HTT silencing applicable to most HD patients. Our data providing genetic guidance for therapeutic gene-based targeting will significantly contribute to the developments of rational treatments and implementation of precision medicine in HD.

  10. Estimating haplotype effects for survival data.

    Science.gov (United States)

    Scheike, Thomas H; Martinussen, Torben; Silver, Jeremy D

    2010-09-01

    Genetic association studies often investigate the effect of haplotypes on an outcome of interest. Haplotypes are not observed directly, and this complicates the inclusion of such effects in survival models. We describe a new estimating equations approach for Cox's regression model to assess haplotype effects for survival data. These estimating equations are simple to implement and avoid the use of the EM algorithm, which may be slow in the context of the semiparametric Cox model with incomplete covariate information. These estimating equations also lead to easily computable, direct estimators of standard errors, and thus overcome some of the difficulty in obtaining variance estimators based on the EM algorithm in this setting. We also develop an easily implemented goodness-of-fit procedure for Cox's regression model including haplotype effects. Finally, we apply the procedures presented in this article to investigate possible haplotype effects of the PAF-receptor on cardiovascular events in patients with coronary artery disease, and compare our results to those based on the EM algorithm. © 2009, The International Biometric Society.

  11. Perceptual inference.

    Science.gov (United States)

    Aggelopoulos, Nikolaos C

    2015-08-01

    Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Inference problems in structural biology

    DEFF Research Database (Denmark)

    Olsson, Simon

    The structure and dynamics of biological molecules are essential for their function. Consequently, a wealth of experimental techniques have been developed to study these features. However, while experiments yield detailed information about geometrical features of molecules, this information is of...

  13. A Bayesian Network Schema for Lessening Database Inference

    National Research Council Canada - National Science Library

    Chang, LiWu; Moskowitz, Ira S

    2001-01-01

    .... The authors introduce a formal schema for database inference analysis, based upon a Bayesian network structure, which identifies critical parameters involved in the inference problem and represents...

  14. Optimization methods for logical inference

    CERN Document Server

    Chandru, Vijay

    2011-01-01

    Merging logic and mathematics in deductive inference: an innovative, cutting-edge approach. Optimization methods for logical inference? Absolutely, say Vijay Chandru and John Hooker, two major contributors to this rapidly expanding field. And even though "solving logical inference problems with optimization methods may seem a bit like eating sauerkraut with chopsticks. . . it is the mathematical structure of a problem that determines whether an optimization model can help solve it, not the context in which the problem occurs." Presenting powerful, proven optimization techniques for logic in

  15. Variational inference & deep learning : A new synthesis

    NARCIS (Netherlands)

    Kingma, D.P.

    2017-01-01

    In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.

  16. Variational inference & deep learning: A new synthesis

    OpenAIRE

    Kingma, D.P.

    2017-01-01

    In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.

  17. The effect of genealogy-based haplotypes on genomic prediction

    DEFF Research Database (Denmark)

    Edriss, Vahid; Fernando, Rohan L.; Su, Guosheng

    2013-01-01

    on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information. Methods A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using...... local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (pi) of the haplotype covariates had zero effect......, i.e. a Bayesian mixture method. Results About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some...

  18. On principles of inductive inference

    OpenAIRE

    Kostecki, Ryszard Paweł

    2011-01-01

    We propose an intersubjective epistemic approach to foundations of probability theory and statistical inference, based on relative entropy and category theory, and aimed to bypass the mathematical and conceptual problems of existing foundational approaches.

  19. The performance of phylogenetic algorithms in estimating haplotype genealogies with migration.

    Science.gov (United States)

    Salzburger, Walter; Ewing, Greg B; Von Haeseler, Arndt

    2011-05-01

    Genealogies estimated from haplotypic genetic data play a prominent role in various biological disciplines in general and in phylogenetics, population genetics and phylogeography in particular. Several software packages have specifically been developed for the purpose of reconstructing genealogies from closely related, and hence, highly similar haplotype sequence data. Here, we use simulated data sets to test the performance of traditional phylogenetic algorithms, neighbour-joining, maximum parsimony and maximum likelihood in estimating genealogies from nonrecombining haplotypic genetic data. We demonstrate that these methods are suitable for constructing genealogies from sets of closely related DNA sequences with or without migration. As genealogies based on phylogenetic reconstructions are fully resolved, but not necessarily bifurcating, and without reticulations, these approaches outperform widespread 'network' constructing methods. In our simulations of coalescent scenarios involving panmictic, symmetric and asymmetric migration, we found that phylogenetic reconstruction methods performed well, while the statistical parsimony approach as implemented in TCS performed poorly. Overall, parsimony as implemented in the PHYLIP package performed slightly better than other methods. We further point out that we are not making the case that widespread 'network' constructing methods are bad, but that traditional phylogenetic tree finding methods are applicable to haplotypic data and exhibit reasonable performance with respect to accuracy and robustness. We also discuss some of the problems of converting a tree to a haplotype genealogy, in particular that it is nonunique. © 2011 Blackwell Publishing Ltd.

  20. Amphibole as an archivist of magmatic crystallization conditions: problems, potential, and implications for inferring magma storage prior to the paroxysmal 2010 eruption of Mount Merapi, Indonesia

    Science.gov (United States)

    Erdmann, Saskia; Martel, Caroline; Pichavant, Michel; Kushnir, Alexandra

    2014-06-01

    Amphibole is widely employed to calculate crystallization temperature and pressure, although its potential as a geobarometer has always been debated. Recently, Ridolfi et al. (Contrib Mineral Petrol 160:45-66, 2010) and Ridolfi and Renzulli (Contrib Mineral Petrol 163:877-895, 2012) have presented calibrations for calculating temperature, pressure, fO2, melt H2O, and melt major and minor oxide composition from amphibole with a large compositional range. Using their calibrations, we have (i) calculated crystallization conditions for amphibole from eleven published experimental studies to examine the problems and the potential of the new calibrations; and (ii) calculated crystallization conditions for amphibole from basaltic-andesitic pyroclasts erupted during the paroxysmal 2010 eruption of Mount Merapi in Java, Indonesia, to infer pre-eruptive conditions. Our comparison of experimental and calculated values shows that calculated crystallization temperatures are reasonable estimates. Calculated fO2 and melt SiO2 content yields potentially useful estimates at moderately reduced to moderately oxidized conditions and intermediate to felsic melt compositions. However, calculated crystallization pressure and melt H2O content are untenable estimates that largely reflect compositional variation in the crystallizing magmas and crystallization temperature and not the calculated parameters. Amphibole from Merapi's pyroclasts yields calculated conditions of ~200-800 MPa, ~900-1,050 °C, ~NNO + 0.3-NNO + 1.1, ~3.7-7.2 wt% melt H2O, and ~58-71 wt% melt SiO2. We interpret the variations in calculated temperature, fO2, and melt SiO2 content as reasonable estimates, but conclude that the large calculated pressure variation for amphibole from Merapi and many other arc volcanoes is evidence for thorough mixing of mafic to felsic magmas and not necessarily evidence for crystallization over a large depth range. In contrast, bimodal pressure estimates obtained for other arc magmas

  1. Differentiation analysis for estimating individual ancestry from the Tibetan Plateau by an archaic altitude adaptation EPAS1 haplotype among East Asian populations.

    Science.gov (United States)

    Jiang, Li; Peng, Jianxiong; Huang, Meisha; Liu, Jing; Wang, Ling; Ma, Quan; Zhao, Hui; Yang, Xin; Ji, Anquan; Li, Caixia

    2018-02-10

    Tibetans have adapted to the extreme environment of high altitude for hundreds of generations. A highly differentiated 5-SNP (Single Nucleotide Polymorphism) haplotype motif (AGGAA) on a hypoxic pathway gene, EPAS1, is observed in Tibetans and lowlanders. To evaluate the potential usage of the 5-SNP haplotype in ancestry inference for Tibetan or Tibetan-related populations, we analyzed this haplotype in 1053 individuals of 12 Chinese populations residing on the Tibetan Plateau, peripheral regions of Tibet, and plain regions. These data were integrated with the genotypes from the 1000 Genome populations and populations in a previously reported paper for population structure analyses. We found that populations representing highland and lowland groups have different dominant ancestry components. The core Denisovan haplotype (AGGAA) was observed at a frequency of 72.32% in the Tibetan Plateau, with a frequency range from 9.48 to 21.05% in the peripheral regions and Tibetan Plateau carried the archaic haplotype, while < 5% of the Chinese Han people carried the haplotype. Our findings indicate that the 5-SNP haplotype has a special distribution pattern in populations of Tibet and peripheral regions and could be integrated into AISNP (Ancestry Informative Single Nucleotide Polymorphism) panels to enhance ancestry resolution.

  2. Interactive Instruction in Bayesian Inference

    DEFF Research Database (Denmark)

    Khan, Azam; Breslav, Simon; Hornbæk, Kasper

    2018-01-01

    An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction. These principles concern coherence, personalization, signaling, segmenting, multimedia, spatial contiguity, and pretraining. Principles of self-explanation and interactivity are also applied. Four experiments on the Mammography Problem showed that these principles help participants answer the questions...... that an instructional approach to improving human performance in Bayesian inference is a promising direction....
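
    The Mammography Problem is an exercise in Bayes' rule. The record above does not state its numbers, so the sketch below assumes the commonly quoted textbook figures (1% prevalence, 80% sensitivity, 9.6% false-positive rate); the function name `posterior` is likewise only illustrative.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(condition | positive test) using Bayes' rule."""
    true_positive = prior * sensitivity
    false_positive = (1.0 - prior) * false_positive_rate
    return true_positive / (true_positive + false_positive)


# Assumed textbook values, not taken from the record above.
p = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"P(cancer | positive mammogram) = {p:.3f}")
```

    Under these assumed inputs the posterior is far below the test's sensitivity, which is the counterintuitive point the problem is usually used to teach.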

  3. Adaptive Inference on General Graphical Models

    OpenAIRE

    Acar, Umut A.; Ihler, Alexander T.; Mettu, Ramgopal; Sumer, Ozgur

    2012-01-01

    Many algorithms and applications involve repeatedly solving variations of the same inference problem; for example we may want to introduce new evidence to the model or perform updates to conditional dependencies. The goal of adaptive inference is to take advantage of what is preserved in the model and perform inference more rapidly than from scratch. In this paper, we describe techniques for adaptive inference on general graphs that support marginal computation and updates to the conditional ...

  4. SLC22A1-ABCB1 haplotype profiles predict imatinib pharmacokinetics in Asian patients with chronic myeloid leukemia.

    Directory of Open Access Journals (Sweden)

    Onkar Singh

    OBJECTIVE: This study aimed to explore the influence of SLC22A1, PXR, ABCG2, ABCB1 and CYP3A5*3 genetic polymorphisms on imatinib mesylate (IM) pharmacokinetics in Asian patients with chronic myeloid leukemia (CML). PATIENTS AND METHODS: Healthy subjects belonging to three Asian populations (Chinese, Malay, Indian; n = 70 each) and CML patients (n = 38) were enrolled in a prospective pharmacogenetics study. Imatinib trough (C(0h)) and clearance (CL) were determined in the patients at steady state. Haplowalk method was applied to infer the haplotypes and generalized linear model (GLM) to estimate haplotypic effects on IM pharmacokinetics. Association of haplotype copy numbers with IM pharmacokinetics was defined by Mann-Whitney U test. RESULTS: Global haplotype score statistics revealed a SLC22A1 sub-haplotypic region encompassing three polymorphisms (rs3798168, rs628031 and IVS7+850C>T) to be significantly associated with IM clearance (p = 0.013). Haplotype-specific GLM estimated that the haplotypes AGT and CGC were both associated with 22% decrease in clearance compared to CAC [CL (10(-2) L/hr/mg): CAC vs AGT: 4.03 vs 3.16, p = 0.017; CAC vs CGC: 4.03 vs 3.15, p = 0.017]. Patients harboring 2 copies of AGT or CGC haplotypes had 33.4% lower clearance and 50% higher C(0h) than patients carrying 0 or 1 copy [CL (10(-2) L/hr/mg): 2.19 vs 3.29, p = 0.026; C(0h) (10(-6) 1/ml): 4.76 vs 3.17, p = 0.013, respectively]. Further subgroup analysis revealed SLC22A1 and ABCB1 haplotypic combinations to be significantly associated with clearance and C(0h) (p = 0.002 and 0.009, respectively). CONCLUSION: This exploratory study suggests that SLC22A1-ABCB1 haplotypes may influence IM pharmacokinetics in Asian CML patients.
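
    The percentage differences quoted in this abstract can be checked against the raw clearance and trough values it reports. The sketch below is only an arithmetic check; the helper name `percent_change` is an assumption, and the inputs are the figures given in the abstract.

```python
def percent_change(value: float, reference: float) -> float:
    """Relative difference of `value` from `reference`, in percent."""
    return 100.0 * (value - reference) / reference


# Clearance for AGT and CGC haplotypes versus the CAC reference (4.03).
print(round(percent_change(3.16, 4.03), 1))  # AGT vs CAC
print(round(percent_change(3.15, 4.03), 1))  # CGC vs CAC

# Two copies of AGT/CGC versus zero or one copy.
print(round(percent_change(2.19, 3.29), 1))  # clearance
print(round(percent_change(4.76, 3.17), 1))  # trough concentration
```

    The computed values (about -21.6%, -21.8%, -33.4% and +50.2%) agree with the rounded figures of 22%, 33.4% and 50% stated in the abstract.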

  5. [Construction of haplotype and haplotype block based on tag single nucleotide polymorphisms and their applications in association studies].

    Science.gov (United States)

    Gu, Ming-liang; Chu, Jia-you

    2007-12-01

    The human genome has haplotype and haplotype block structures, which provide valuable information on human evolutionary history and may lead to the development of more efficient strategies to identify genetic variants that increase susceptibility to complex diseases. Haplotypes can be divided into discrete blocks of limited haplotype diversity. In each block, a small fraction of "tag SNPs" can be used to distinguish a large fraction of the haplotypes. These tag SNPs are potentially useful for constructing haplotypes and haplotype blocks, and for association studies of complex diseases. There are two general classes of methods for constructing haplotypes and haplotype blocks, based on genotypes from large pedigrees and on statistical algorithms, respectively. The authors evaluate several construction methods to assess the power of different association tests with a variety of disease models and block-partitioning criteria. The advantages, limitations and applications of each method, and their application in association studies, are discussed. With the completion of the HapMap and the development of statistical algorithms for haplotype reconstruction, approaches to haplotype construction that combine mathematics, physics and computer science will have a profound impact on population genetics, on the location and cloning of susceptibility genes in complex diseases, and on related fields of the life sciences.

  6. Human cytochrome P450 2B6 genetic variability in Botswana: a case of haplotype diversity and convergent phenotypes

    KAUST Repository

    Tawe, Leabaneng

    2018-03-14

    Identification of inter-individual variability for drug metabolism through cytochrome P450 2B6 (CYP2B6) enzyme is important for understanding the differences in clinical responses to malaria and HIV. This study evaluates the distribution of CYP2B6 alleles, haplotypes and inferred metabolic phenotypes among subjects of different ethnicity in Botswana. A total of 570 subjects were analyzed for CYP2B6 polymorphisms at positions 516 G > T (rs3745274), 785 A > G (rs2279343) and 983 T > C (rs28399499). Samples were collected in three districts of Botswana where the population belongs to Bantu (Serowe/Palapye and Chobe) and San-related (Ghanzi) ethnicity. The three districts showed different haplotype composition according to the ethnic background but similar inferred metabolic phenotypes, with 59.12%, 34.56%, 2.10% and 4.21% of the subjects having, respectively, an extensive, intermediate, slow and rapid metabolic profile. The results hint at the possibility of a convergent adaptation of detoxifying metabolic phenotypes despite a different haplotype structure due to the different genetic background. The main implication is that, while inferred metabolic phenotypes are substantially homogeneous across the country, the response to drugs metabolized via CYP2B6 could, at the individual level, be associated with an increased risk of treatment failure and toxicity. These findings matter because Botswana is pursuing malaria elimination and has a very high HIV prevalence.

  7. Human cytochrome P450 2B6 genetic variability in Botswana: a case of haplotype diversity and convergent phenotypes

    KAUST Repository

    Tawe, Leabaneng; Motshoge, Thato; Ramatlho, Pleasure; Mutukwa, Naledi; Muthoga, Charles Waithaka; Dongho, Ghyslaine Bruna Djeunang; Martinelli, Axel; Peloewetse, Elias; Russo, Gianluca; Quaye, Isaac Kweku; Paganotti, Giacomo Maria

    2018-01-01

    Identification of inter-individual variability for drug metabolism through cytochrome P450 2B6 (CYP2B6) enzyme is important for understanding the differences in clinical responses to malaria and HIV. This study evaluates the distribution of CYP2B6 alleles, haplotypes and inferred metabolic phenotypes among subjects of different ethnicity in Botswana. A total of 570 subjects were analyzed for CYP2B6 polymorphisms at positions 516 G > T (rs3745274), 785 A > G (rs2279343) and 983 T > C (rs28399499). Samples were collected in three districts of Botswana where the population belongs to Bantu (Serowe/Palapye and Chobe) and San-related (Ghanzi) ethnicity. The three districts showed different haplotype composition according to the ethnic background but similar inferred metabolic phenotypes, with 59.12%, 34.56%, 2.10% and 4.21% of the subjects having, respectively, an extensive, intermediate, slow and rapid metabolic profile. The results hint at the possibility of a convergent adaptation of detoxifying metabolic phenotypes despite a different haplotype structure due to the different genetic background. The main implication is that, while inferred metabolic phenotypes are substantially homogeneous across the country, the response to drugs metabolized via CYP2B6 could, at the individual level, be associated with an increased risk of treatment failure and toxicity. These findings matter because Botswana is pursuing malaria elimination and has a very high HIV prevalence.

  8. Russell and Humean Inferences

    Directory of Open Access Journals (Sweden)

    João Paulo Monteiro

    2001-12-01

    Russell's The Problems of Philosophy tries to establish a new theory of induction, while at the same time accusing Hume of "an irrational scepticism about induction". But a careful analysis of the theory of knowledge explicitly acknowledged by Hume reveals that, contrary to the standard interpretation in the 20th century, possibly influenced by Russell, Hume deals exclusively with causal inference (which he never classifies as "causal induction", although now we are entitled to do so), never with inductive inference in general, mainly generalizations about sensible qualities of objects (whether, e.g., "all crows are black" or not is not among Hume's concerns). Russell's theories are thus only false alternatives to Hume's, whether in his (1912) or in his (1948).

  9. Genomic sequence of 'Candidatus Liberibacter solanacearum' haplotype C and its comparison with haplotype A and B genomes.

    Directory of Open Access Journals (Sweden)

    Jinhui Wang

    Haplotypes A and B of 'Candidatus Liberibacter solanacearum' (CLso) are associated with diseases of solanaceous plants, especially Zebra chip disease of potato, and haplotypes C, D and E are associated with symptoms on apiaceous plants. To date, one complete genome of haplotype B and two high quality draft genomes of haplotype A have been obtained for these unculturable bacteria using metagenomics from the psyllid vector Bactericera cockerelli. Here, we present the first genomic sequences obtained for the carrot-associated CLso. These two genomic sequences of haplotype C, FIN114 (1.24 Mbp) and FIN111 (1.20 Mbp), were obtained from carrot psyllids (Trioza apicalis) harboring CLso. Genomic comparisons between the haplotypes A, B and C revealed that the genome organization differs between these haplotypes, due to large inversions and other recombinations. Comparison of protein-coding genes indicated that the core genome of CLso consists of 885 ortholog groups, with the pan-genome consisting of 1327 ortholog groups. Twenty-seven ortholog groups are unique to CLso haplotype C, whilst 11 ortholog groups shared by haplotypes A and B are not found in haplotype C. Some of these ortholog groups that are not part of the core genome may encode functions related to interactions with the different host plant and psyllid species.

  10. Haplotype structure in Ashkenazi Jewish BRCA1 and BRCA2 mutation carriers

    DEFF Research Database (Denmark)

    Im, Kate M; Kirchhoff, Tomas; Wang, Xianshu

    2011-01-01

    Three founder mutations in BRCA1 and BRCA2 contribute to the risk of hereditary breast and ovarian cancer in Ashkenazi Jews (AJ). They are observed at increased frequency in the AJ compared to other BRCA mutations in Caucasian non-Jews (CNJ). Several authors have proposed that elevated allele...... the tools of statistical genomics to examine the likelihood of long-range LD at a deleterious locus in a population that faced a genetic bottleneck. We studied the genotypes of hundreds of women from a large international consortium of BRCA1 and BRCA2 mutation carriers and found that AJ women exhibited long......-range haplotypes compared to CNJ women. More than 50% of the AJ chromosomes with the BRCA1 185delAG mutation share an identical 2.1 Mb haplotype and nearly 16% of AJ chromosomes carrying the BRCA2 6174delT mutation share a 1.4 Mb haplotype. Simulations based on the best inference of Ashkenazi population demography...

  11. Genomic identification of founding haplotypes reveals the history of the selfing species Capsella rubella.

    Directory of Open Access Journals (Sweden)

    Yaniv Brandvain

    The shift from outcrossing to self-fertilization is among the most common evolutionary transitions in flowering plants. Until recently, however, a genome-wide view of this transition has been obscured by both a dearth of appropriate data and the lack of appropriate population genomic methods to interpret such data. Here, we present a novel population genomic analysis detailing the origin of the selfing species, Capsella rubella, which recently split from its outcrossing sister, Capsella grandiflora. Due to the recency of the split, much of the variation within C. rubella is also found within C. grandiflora. We can therefore identify genomic regions where two C. rubella individuals have inherited the same or different segments of ancestral diversity (i.e. founding haplotypes) present in C. rubella's founder(s). Based on this analysis, we show that C. rubella was founded by multiple individuals drawn from a diverse ancestral population closely related to extant C. grandiflora, that drift and selection have rapidly homogenized most of this ancestral variation since C. rubella's founding, and that little novel variation has accumulated within this time. Despite the extensive loss of ancestral variation, the approximately 25% of the genome for which two C. rubella individuals have inherited different founding haplotypes makes up roughly 90% of the genetic variation between them. To extend these findings, we develop a coalescent model that utilizes the inferred frequency of founding haplotypes and variation within founding haplotypes to estimate that C. rubella was founded by a potentially large number of individuals between 50 and 100 kya, and has subsequently experienced a twenty-fold reduction in its effective population size. As population genomic data from an increasing number of outcrossing/selfing pairs are generated, analyses like the one developed here will facilitate a fine-scaled view of the evolutionary and demographic impact of the

  12. Novel full-length major histocompatibility complex class I allele discovery and haplotype definition in pig-tailed macaques.

    Science.gov (United States)

    Semler, Matthew R; Wiseman, Roger W; Karl, Julie A; Graham, Michael E; Gieger, Samantha M; O'Connor, David H

    2017-11-13

    Pig-tailed macaques (Macaca nemestrina, Mane) are important models for human immunodeficiency virus (HIV) studies. Their infectability with minimally modified HIV makes them a uniquely valuable animal model to mimic human infection with HIV and progression to acquired immunodeficiency syndrome (AIDS). However, variation in the pig-tailed macaque major histocompatibility complex (MHC) and the impact of individual transcripts on the pathogenesis of HIV and other infectious diseases is understudied compared to that of rhesus and cynomolgus macaques. In this study, we used Pacific Biosciences single-molecule real-time circular consensus sequencing to describe full-length MHC class I (MHC-I) transcripts for 194 pig-tailed macaques from three breeding centers. We then used the full-length sequences to infer Mane-A and Mane-B haplotypes containing groups of MHC-I transcripts that co-segregate due to physical linkage. In total, we characterized full-length open reading frames (ORFs) for 313 Mane-A, Mane-B, and Mane-I sequences that defined 86 Mane-A and 106 Mane-B MHC-I haplotypes. Pacific Biosciences technology allows us to resolve these Mane-A and Mane-B haplotypes to the level of synonymous allelic variants. The newly defined haplotypes and transcript sequences containing full-length ORFs provide an important resource for infectious disease researchers as certain MHC haplotypes have been shown to provide exceptional control of simian immunodeficiency virus (SIV) replication and prevention of AIDS-like disease in nonhuman primates. The increased allelic resolution provided by Pacific Biosciences sequencing also benefits transplant research by allowing researchers to more specifically match haplotypes between donors and recipients to the level of nonsynonymous allelic variation, thus reducing the risk of graft-versus-host disease.

  13. Active inference, communication and hermeneutics.

    Science.gov (United States)

    Friston, Karl J; Frith, Christopher D

    2015-07-01

    Hermeneutics refers to interpretation and translation of text (typically ancient scriptures) but also applies to verbal and non-verbal communication. In a psychological setting it nicely frames the problem of inferring the intended content of a communication. In this paper, we offer a solution to the problem of neural hermeneutics based upon active inference. In active inference, action fulfils predictions about how we will behave (e.g., predicting we will speak). Crucially, these predictions can be used to predict both self and others--during speaking and listening respectively. Active inference mandates the suppression of prediction errors by updating an internal model that generates predictions--both at fast timescales (through perceptual inference) and slower timescales (through perceptual learning). If two agents adopt the same model, then--in principle--they can predict each other and minimise their mutual prediction errors. Heuristically, this ensures they are singing from the same hymn sheet. This paper builds upon recent work on active inference and communication to illustrate perceptual learning using simulated birdsongs. Our focus here is the neural hermeneutics implicit in learning, where communication facilitates long-term changes in generative models that are trying to predict each other. In other words, communication induces perceptual learning and enables others to (literally) change our minds and vice versa. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  14. Inference Attacks and Control on Database Structures

    Directory of Open Access Journals (Sweden)

    Muhamed Turkanovic

    2015-02-01

    Today’s databases store information with sensitivity levels that range from public to highly sensitive; ensuring confidentiality can therefore be highly important, but it also requires costly control. This paper focuses on the inference problem on different database structures. It presents possible threats to privacy related to inference, and control methods for mitigating these threats. The paper shows that using only access control, without any inference control, is inadequate, since access control models are unable to protect against indirect data access. Furthermore, it covers new inference problems that arise from new technologies such as XML, semantics, etc.
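
    Indirect data access of the kind described here can be illustrated with a differencing attack on aggregate queries. This is a toy sketch, not a method from the paper: the table, the names, and the rule that aggregates over at least two rows are permitted are all hypothetical.

```python
# Hypothetical table; individual salaries are meant to stay confidential.
salaries = {"alice": 5200, "bob": 4100, "carol": 4700}


def total_salary(names: list[str]) -> int:
    """Aggregate query allowed by access control (at least two rows)."""
    if len(names) < 2:
        raise PermissionError("queries over a single row are not allowed")
    return sum(salaries[name] for name in names)


# Both queries pass the access-control check on their own...
everyone = total_salary(["alice", "bob", "carol"])
without_alice = total_salary(["bob", "carol"])

# ...but their difference reveals one protected value.
inferred_alice = everyone - without_alice
print(inferred_alice)
```

    Each query is individually permitted, yet together they disclose a single row, which is why inference control has to consider combinations of queries rather than each request in isolation.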

  15. Context Analysis of Customer Requests using a Hybrid Adaptive Neuro Fuzzy Inference System and Hidden Markov Models in the Natural Language Call Routing Problem

    Science.gov (United States)

    Rustamov, Samir; Mustafayev, Elshan; Clements, Mark A.

    2018-04-01

    The paper investigates context analysis of customer requests in a natural language call routing problem. One of the most significant problems in natural language call routing is comprehension of the client's request. To address this issue, hybrid HMM and ANFIS models are examined. Combining different types of models (ANFIS and HMM) can prevent the dialogue system from misidentifying the user's intention. Because these models use no lexical or syntactic analysis in the classification process, the hybrid system may be employed in various language and call routing domains.

  16. Context Analysis of Customer Requests using a Hybrid Adaptive Neuro Fuzzy Inference System and Hidden Markov Models in the Natural Language Call Routing Problem

    Directory of Open Access Journals (Sweden)

    Rustamov Samir

    2018-04-01

    The paper investigates context analysis of customer requests in a natural language call routing problem. One of the most significant problems in natural language call routing is comprehension of the client's request. To address this issue, hybrid HMM and ANFIS models are examined. Combining different types of models (ANFIS and HMM) can prevent the dialogue system from misidentifying the user's intention. Because these models use no lexical or syntactic analysis in the classification process, the hybrid system may be employed in various language and call routing domains.

  17. Aspects combinatoires des réarrangements génomiques et des réseaux d'haplotypes

    OpenAIRE

    Labarre, Anthony

    2008-01-01

    The dissertation covers two problems motivated by computational biology: genome rearrangements, and haplotype networks. Genome rearrangement problems are a particular case of edit distance problems, where one seeks to transform two given objects into one another using as few operations as possible, with the additional constraint that the set of allowed operations is fixed beforehand; we are also interested in computing the corresponding distances between those objects, i.e. merely computing th...

  18. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2000-01-01

    New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities based on

  19. Fitchi: haplotype genealogy graphs based on the Fitch algorithm.

    Science.gov (United States)

    Matschiner, Michael

    2016-04-15

    In population genetics and phylogeography, haplotype genealogy graphs are important tools for the visualization of population structure based on sequence data. In this type of graph, node sizes are often drawn in proportion to haplotype frequencies and edge lengths represent the minimum number of mutations separating adjacent nodes. I here present Fitchi, a new program that produces publication-ready haplotype genealogy graphs based on the Fitch algorithm. Availability: http://www.evoinformatics.eu/fitchi.htm Contact: michaelmatschiner@mac.com Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
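
    For orientation, the bottom-up pass of Fitch's small-parsimony algorithm for a single site can be sketched as follows. This is a generic textbook illustration, not Fitchi's implementation; the nested-tuple tree encoding and the function name `fitch` are assumptions made for the example.

```python
from typing import Set, Tuple, Union

# A tree is either a leaf state (e.g. "A") or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree) -> Tuple[Set[str], int]:
    """Return (candidate states at this node, minimum number of mutations)."""
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_cost = fitch(left)
    right_states, right_cost = fitch(right)
    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra mutation needed.
        return shared, left_cost + right_cost
    # The children disagree: keep the union and count one mutation.
    return left_states | right_states, left_cost + right_cost + 1


states, mutations = fitch((("A", "C"), ("A", "G")))
print(states, mutations)
```

    For this four-leaf example the root's candidate set is {"A"} with two mutations, one on each side of the tree.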

  20. Conflation of Short Identity-by-Descent Segments Bias Their Inferred Length Distribution

    Directory of Open Access Journals (Sweden)

    Charleston W. K. Chiang

    2016-05-01

    Identity-by-descent (IBD) is a fundamental concept in genetics with many applications. In a common definition, two haplotypes are said to share an IBD segment if that segment is inherited from a recent shared common ancestor without intervening recombination. Segments several cM long can be efficiently detected by a number of algorithms using high-density SNP array data from a population sample, and there are currently efforts to detect shorter segments from sequencing. Here, we study a problem of identifiability: because existing approaches detect IBD based on contiguous segments of identity-by-state, inferred long segments of IBD may arise from the conflation of smaller, nearby IBD segments. We quantified this effect using coalescent simulations, finding that significant proportions of inferred segments 1–2 cM long are results of conflations of two or more shorter segments, each 0.2 cM or longer, under demographic scenarios typical for modern humans for all programs tested. The impact of such conflation is much smaller for longer (> 2 cM) segments. This biases the inferred IBD segment length distribution, and so can affect downstream inferences that depend on the assumption that each segment of IBD derives from a single common ancestor. As an example, we present and analyze an estimator of the de novo mutation rate using IBD segments, and demonstrate that unmodeled conflation leads to underestimates of the ages of the common ancestors on these segments, and hence a significant overestimate of the mutation rate. Understanding the conflation effect in detail will make its correction in future methods more tractable.
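
    The conflation effect can be pictured with a toy model in which a detector reports nearby segments as one whenever the gap between them is too short to resolve. This is only an illustration under that assumption: the gap threshold, the segment coordinates and the function name `conflate` are hypothetical, and real detectors work on identity-by-state rather than on known true segments.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) positions in cM


def conflate(segments: List[Segment], max_gap: float) -> List[Segment]:
    """Merge segments separated by a gap of at most `max_gap` cM."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap:
            # Too close to resolve: extend the previous reported segment.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical true IBD segments: two short neighbours and one isolated segment.
true_segments = [(0.0, 0.6), (0.7, 1.5), (5.0, 5.4)]
print(conflate(true_segments, max_gap=0.2))
```

    Here two true segments of 0.6 and 0.8 cM are reported as a single 1.5 cM segment, which shifts the inferred length distribution toward longer segments.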

  1. Haplotype of platelet receptor P2RY12 gene is associated with residual clopidogrel on-treatment platelet reactivity.

    Science.gov (United States)

    Nie, Xiao-Yan; Li, Jun-Lei; Zhang, Yong; Xu, Yang; Yang, Xue-Li; Fu, Yu; Liang, Guang-Kai; Lu, Yun; Liu, Jian; Shi, Lu-Wen

    This study investigated a possible association between common variations of P2RY12 and residual on-clopidogrel platelet reactivity, measured by thromboelastography (TEG), after adjusting for the influence of CYP2C19. One hundred and eighty patients with acute coronary syndrome (ACS) treated with clopidogrel and aspirin were included and platelet function was assessed by TEG. Five selected P2RY12 single nucleotide polymorphisms (SNPs; rs6798347, rs6787801, rs6801273, rs6785930, and rs2046934), which cover the common variations in the P2RY12 gene and its regulatory regions, and three CYP2C19 SNPs (*2, *3, *17) were genotyped and possible haplotypes were analyzed. High on-treatment platelet reactivity (HTPR), defined as a platelet inhibition rate <30% in the adenosine diphosphate (ADP) channel of TEG, was found in 69 patients (38.33%). Six common haplotypes (denoted H0 to H5) were inferred from four of the selected P2RY12 SNPs according to the linkage disequilibrium R square (rs2046934 was excluded). Haplotype H1 showed a significantly lower incidence of HTPR than the reference haplotype (H0) in the total study population, while haplotypes H1 and H2 showed significantly lower incidences of HTPR than H0 in the nonsmoker subgroup after adjusting for CYP2C19 effects and demographic characteristics. rs2046934 (T744C) did not show any significant association with HTPR. In patients with ACS, the combination of common P2RY12 variations, including those in regulatory regions, rather than rs2046934 (T744C), was independently associated with residual on-clopidogrel platelet reactivity. This association is separate from the established association with CYP2C19 and seemed more important in the subgroup defined by smoking.

  2. A DRD1 haplotype is associated with risk for autism spectrum disorders in male-only affected sib-pair families.

    Science.gov (United States)

    Hettinger, Joe A; Liu, Xudong; Schwartz, Charles E; Michaelis, Ron C; Holden, Jeanette J A

    2008-07-05

    Individuals with autism spectrum disorders (ASDs) have impairments in executive function and social cognition, with males generally being more severely affected in these areas than females. Because the dopamine D1 receptor (encoded by DRD1) is integral to the neural circuitry mediating these processes, we examined the DRD1 gene for its role in susceptibility to ASDs by performing single marker and haplotype case-control comparisons, family-based association tests, and genotype-phenotype assessments (quantitative transmission disequilibrium tests: QTDT) using three DRD1 polymorphisms, rs265981C/T, rs4532A/G, and rs686T/C. Our previous findings suggested that the dopaminergic system may be more integrally involved in families with affected males only than in other families. We therefore restricted our study to families with two or more affected males (N = 112). There was over-transmission of rs265981-C and rs4532-A in these families (P = 0.040, P = 0.038), with haplotype TDT analysis showing over-transmission of the C-A-T haplotype (P = 0.022) from mothers to affected sons (P = 0.013). In addition, haplotype case-control comparisons revealed an increase of this putative risk haplotype in affected individuals relative to a comparison group (P = 0.004). QTDT analyses showed associations of the rs265981-C, rs4532-A, rs686-T alleles, and the C-A-T haplotype with more severe problems in social interaction, greater difficulties with nonverbal communication and increased stereotypies compared to individuals with other haplotypes. Preferential haplotype transmission of markers at the DRD1 locus and an increased frequency of a specific haplotype support the DRD1 gene as a risk gene for core symptoms of ASD in families having only affected males. Copyright 2008 Wiley-Liss, Inc.

  3. On Maximum Entropy and Inference

    Directory of Open Access Journals (Sweden)

    Luigi Gresele

    2017-11-01

    Full Text Available Maximum entropy is a powerful concept that entails a sharp separation between relevant and irrelevant variables. It is typically invoked in inference, once an assumption is made on what the relevant variables are, in order to estimate a model from data, that affords predictions on all other (dependent) variables. Conversely, maximum entropy can be invoked to retrieve the relevant variables (sufficient statistics) directly from the data, once a model is identified by Bayesian model selection. We explore this approach in the case of spin models with interactions of arbitrary order, and we discuss how relevant interactions can be inferred. In this perspective, the dimensionality of the inference problem is not set by the number of parameters in the model, but by the frequency distribution of the data. We illustrate the method showing its ability to recover the correct model in a few prototype cases and discuss its application on a real dataset.

  4. HERC1 polymorphisms: population-specific variations in haplotype composition.

    Science.gov (United States)

    Yuasa, Isao; Umetsu, Kazuo; Nishimukai, Hiroaki; Fukumori, Yasuo; Harihara, Shinji; Saitou, Naruya; Jin, Feng; Chattopadhyay, Prasanta K; Henke, Lotte; Henke, Jürgen

    2009-08-01

    Human HERC1 is one of six HERC proteins and may play an important role in intracellular membrane trafficking. The human HERC1 gene is suggested to have been affected by local positive selection. To assess the global frequency distributions of coding and non-coding single nucleotide polymorphisms (SNPs) in the HERC1 gene, we developed a new simultaneous genotyping method for four SNPs, and applied this method to investigate 1213 individuals from 12 global populations. The results confirmed marked differences in the allele and haplotype frequencies between East Asian and non-East Asian populations. One of the three common haplotypes observed was found to be characteristic of East Asians, who showed a relatively uniform distribution of haplotypes. Information on haplotypes would be useful for testing the function of polymorphisms in the HERC1 gene. This is the first study to investigate the distribution of HERC1 polymorphisms in various populations. (c) 2009 John Wiley & Sons, Ltd.

  5. De novo assembly of a haplotype-resolved human genome.

    Science.gov (United States)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang; Huang, Shujia; Sun, Yuhui; Tong, Xin; Xie, Yinlong; Liu, Binghang; Yang, Hailong; Zheng, Hancheng; Li, Jian; Li, Bo; Wang, Yu; Yang, Fang; Sun, Peng; Liu, Siyang; Gao, Peng; Huang, Haodong; Sun, Jing; Chen, Dan; He, Guangzhu; Huang, Weihua; Huang, Zheng; Li, Yue; Tellier, Laurent C A M; Liu, Xiao; Feng, Qiang; Xu, Xun; Zhang, Xiuqing; Bolund, Lars; Krogh, Anders; Kristiansen, Karsten; Drmanac, Radoje; Drmanac, Snezana; Nielsen, Rasmus; Li, Songgang; Wang, Jian; Yang, Huanming; Li, Yingrui; Wong, Gane Ka-Shu; Wang, Jun

    2015-06-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.

  6. Identification of Tribolium castaneum (Herbst) haplotypes, the pest of ...

    African Journals Online (AJOL)

    SARAH

    2016-07-31

    Jul 31, 2016 ... haplotypes of T. castaneum and their distribution in Senegal. Methodology ... very strong marketing of cereals and vegetables in that area. The mutations ..... for each channel by sampling the various parameters every 1000 ...

  7. Analysis of DR4 haplotypes in insulin dependent diabetes (IDD)

    International Nuclear Information System (INIS)

    Monos, D.S.; Radka, S.F.; Zmijewski, C.M.; Kamoun, M.

    1986-01-01

    Population studies indicate that HLA-DR4 is implicated in the susceptibility of IDD. However, biochemical characterization of the serologically defined DR4 haplotype from normal individuals revealed five DR4 and three DQW3 molecular forms. Hence, in this study, they investigated the heterogeneity of the DR4 haplotype, using B-lymphoblastoid cell lines (B-LCL) generated from patients with IDD and seropositive for DR4. Class II molecules, metabolically labeled with 35S-methionine, were immunoprecipitated with monoclonal antibodies specific for DR(L243), DQ(N297), DQW3(IVD12) or DR and DQ(SG465) and analyzed by two-dimensional polyacrylamide gel electrophoresis (2D-PAGE). The isoelectrofocusing (IEF) conditions employed in this study allow representation only of the DR4 haplotype from either DR3/4 or DR4/4 cell lines. The analysis of six different DR4 haplotypes from seven IDD patients revealed the presence of two DR4 β and two DQW3 β chains. Three of the six DR4 haplotypes analyzed shared the same DR4 β chain and three others shared a different one. Additionally, five of the six haplotypes shared the same DQW3 β chain and only one was carrying a different one. Different combinations of the two DR4 and two DQW3 β chains constitute three distinct patterns of DR4 haplotypes. These results suggest the prevalence of a DQW3 β chain in the small sample of IDD patients studied. Studies of a large number of patients should clarify whether IDD is associated with unique variants of DR4 or DQW3 β chains

  8. MHC Class II haplotypes of Colombian Amerindian tribes

    Science.gov (United States)

    Yunis, Juan J.; Yunis, Edmond J.; Yunis, Emilio

    2013-01-01

    We analyzed 1041 individuals belonging to 17 Amerindian tribes of Colombia: Chimila, Bari and Tunebo (Chibcha linguistic family), Embera, Waunana (Choco linguistic family), Puinave and Nukak (Maku-Puinave linguistic families), Cubeo, Guanano, Tucano, Desano and Piratapuyo (Tukano linguistic family), Guahibo and Guayabero (Guayabero linguistic family), Curripaco and Piapoco (Arawak linguistic family) and Yucpa (Karib linguistic family), for MHC class II haplotypes (HLA-DRB1, DQA1, DQB1). Approximately 90% of the MHC class II haplotypes found among these tribes are haplotypes frequently encountered in other Amerindian tribes. Nonetheless, striking differences were observed among Chibcha and non-Chibcha speaking tribes. The DRB1*04:04, DRB1*04:11, DRB1*09:01 carrying haplotypes were frequently found among non-Chibcha speaking tribes, while the DRB1*04:07 haplotype showed significant frequencies among Chibcha speaking tribes, and only marginal frequencies among non-Chibcha speaking tribes. Our results suggest that the differences in MHC class II haplotype frequency found among Chibcha and non-Chibcha speaking tribes could be due to genetic differentiation in Mesoamerica of the ancestral Amerindian population into Chibcha and non-Chibcha speaking populations before they entered into South America. PMID:23885196

  9. Strategies for haplotype-based association mapping in complex pedigreed populations

    DEFF Research Database (Denmark)

    Boleckova, J; Christensen, Ole Fredslund; Sørensen, Peter

    2012-01-01

    In association mapping, haplotype-based methods are generally regarded to provide higher power and increased precision than methods based on single markers. For haplotype-based association mapping most studies use a fixed haplotype effect in the model. However, an increase in haplotype length inc...

  10. Dimensional Anxiety Mediates Linkage of GABRA2 Haplotypes With Alcoholism

    Science.gov (United States)

    Enoch, Mary-Anne; Schwartz, Lori; Albaugh, Bernard; Virkkunen, Matti; Goldman, David

    2015-01-01

    The GABA(A) α2 receptor gene (GABRA2) modulates anxiety and stress response. Three recent association studies implicate GABRA2 in alcoholism; however, in these papers both common, opposite-configuration haplotypes in the region distal to intron 3 predict risk. We have now replicated the GABRA2 association with alcoholism in 331 Plains Indian men and women and 461 Finnish Caucasian men. Using a dimensional measure of anxiety, harm avoidance (HA), we also found that the association with alcoholism is mediated, or moderated, by anxiety. Nine SNPs were genotyped revealing two haplotype blocks. Within the previously implicated block 2 region, we identified the two common, opposite-configuration risk haplotypes, A and B. Their frequencies differed markedly in Finns and Plains Indians. In both populations, most block 2 SNPs were significantly associated with alcoholism. The associations were due to increased frequencies of both homozygotes in alcoholics, indicating the possibility of alcoholic subtypes with opposite genotypes. Congruently, there was no significant haplotype association. Using HA as an indicator variable for anxiety, we found haplotype linkage to alcoholism with high and low dimensional anxiety, and to HA itself, in both populations. High HA alcoholics had the highest frequency of the more abundant haplotype (A in Finns, B in Plains Indians); low HA alcoholics had the highest frequency of the less abundant haplotype (B in Finns, A in Plains Indians) (Finns: P = 0.007, OR = 2.1; Plains Indians: P = 0.040, OR = 1.9). Non-alcoholics had intermediate frequencies. Our results suggest that within the distal GABRA2 region is a functional locus or loci that may differ between populations but that alters risk for alcoholism via the mediating action of anxiety. PMID:16874763

  11. Type Inference for Session Types in the Pi-Calculus

    DEFF Research Database (Denmark)

    Graversen, Eva Fajstrup; Harbo, Jacob Buchreitz; Huttel, Hans

    2014-01-01

    In this paper we present a direct algorithm for session type inference for the π-calculus. Type inference for session types has previously been achieved by either imposing limitations and restriction on the π-calculus, or by reducing the type inference problem to that for linear types. Our approach...

  12. H-PoP and H-PoPG: heuristic partitioning algorithms for single individual haplotyping of polyploids.

    Science.gov (United States)

    Xie, Minzhu; Wu, Qiong; Wang, Jianxin; Jiang, Tao

    2016-12-15

    Some economically important plants including wheat and cotton have more than two copies of each chromosome. With the decreasing cost and increasing read length of next-generation sequencing technologies, reconstructing the multiple haplotypes of a polyploid genome from its sequence reads becomes practical. However, the computational challenge in polyploid haplotyping is much greater than that in diploid haplotyping, and there are few related methods. This article models the polyploid haplotyping problem as an optimal poly-partition problem of the reads, called the Polyploid Balanced Optimal Partition model. For the reads sequenced from a k-ploid genome, the model tries to divide the reads into k groups such that the difference between the reads of the same group is minimized while the difference between the reads of different groups is maximized. When the genotype information is available, the model is extended to the Polyploid Balanced Optimal Partition with Genotype constraint problem. These models are all NP-hard. We propose two heuristic algorithms, H-PoP and H-PoPG, based on dynamic programming and a strategy of limiting the number of intermediate solutions at each iteration, to solve the two models, respectively. Extensive experimental results on simulated and real data show that our algorithms can solve the models effectively, and are much faster and more accurate than the recent state-of-the-art polyploid haplotyping algorithms. The experiments also show that our algorithms can deal with long reads and deep read coverage effectively and accurately. Furthermore, H-PoP might be applied to help determine the ploidy of an organism. https://github.com/MinzhuXie/H-PoPG. Contact: xieminzhu@hotmail.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
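
    The partition objective described in this abstract can be illustrated with a minimal sketch. It is not the H-PoP or H-PoPG algorithm: it only scores a given grouping of reads by counting disagreements with each group's majority-vote consensus, which is one simple way to measure "difference between the reads of the same group". The binary allele coding, the `-` gap symbol, and all function names are assumptions made for illustration.

```python
from collections import Counter

GAP = "-"  # assumed symbol for a SNP position the read does not cover


def consensus(reads):
    """Majority allele at each position of a group, ignoring gaps."""
    result = []
    for pos in range(len(reads[0])):
        counts = Counter(r[pos] for r in reads if r[pos] != GAP)
        result.append(counts.most_common(1)[0][0] if counts else GAP)
    return "".join(result)


def mismatches(read, haplotype):
    """Number of covered positions where the read disagrees with the haplotype."""
    return sum(
        1 for a, b in zip(read, haplotype) if a != GAP and b != GAP and a != b
    )


def partition_cost(groups):
    """Total within-group disagreement; lower means a more coherent partition."""
    total = 0
    for reads in groups:
        if reads:
            hap = consensus(reads)
            total += sum(mismatches(r, hap) for r in reads)
    return total


# Hypothetical toy reads over four SNP sites.
reads_a = ["0101", "01-1", "0111"]
reads_b = ["1010", "10-0"]

two_groups = partition_cost([reads_a, reads_b])
one_group = partition_cost([reads_a + reads_b])
print(two_groups, one_group)
```

    Separating the two read families gives a much lower cost than pooling them, which is the signal a partition-based haplotyping heuristic searches for; the hard part, which this sketch leaves out, is finding a good partition among exponentially many.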

  13. On Quantum Statistical Inference, II

    OpenAIRE

    Barndorff-Nielsen, O. E.; Gill, R. D.; Jupp, P. E.

    2003-01-01

    Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, theoretical developments in the theory of quantum measurements have brought the basic mathematical framework for the probability calculations much closer to that of classical probability theory. The present paper reviews this field and proposes and inte...

  14. Effects of the number of markers per haplotype and clustering of haplotypes on the accuracy of QTL mapping and prediction of genomic breeding values

    NARCIS (Netherlands)

    Calus, M.P.L.; Meuwissen, T.H.E.; Windig, J.J.; Knol, E.F.; Schrooten, C.; Vereijken, A.L.J.; Veerkamp, R.F.

    2009-01-01

    The aim of this paper was to compare the effect of haplotype definition on the precision of QTL-mapping and on the accuracy of predicted genomic breeding values. In a multiple QTL model using identity-by-descent (IBD) probabilities between haplotypes, various haplotype definitions were tested i.e.

  15. Mineralocorticoid receptor haplotype, oral contraceptives and emotional information processing.

    Science.gov (United States)

    Hamstra, D A; de Kloet, E R; van Hemert, A M; de Rijk, R H; Van der Does, A J W

    2015-02-12

    Oral contraceptives (OCs) affect mood in some women and may have more subtle effects on emotional information processing in many more users. Female carriers of mineralocorticoid receptor (MR) haplotype 2 have been shown to be more optimistic and less vulnerable to depression. To investigate the effects of oral contraceptives on emotional information processing and a possible moderating effect of MR haplotype. Cross-sectional study in 85 healthy premenopausal women of West-European descent. We found significant main effects of oral contraceptives on facial expression recognition, emotional memory and decision-making. Furthermore, carriers of MR haplotype 1 or 3 were sensitive to the impact of OCs on the recognition of sad and fearful faces and on emotional memory, whereas MR haplotype 2 carriers were not. Different compounds of OCs were included. No hormonal measures were taken. Most naturally cycling participants were assessed in the luteal phase of their menstrual cycle. Carriers of MR haplotype 2 may be less sensitive to depressogenic side-effects of OCs. Copyright © 2015 IBRO. Published by Elsevier Ltd. All rights reserved.

  16. In Vivo Characterization of Human APOA5 Haplotypes

    Energy Technology Data Exchange (ETDEWEB)

    Ahituv, Nadav; Akiyama, Jennifer; Chapman-Helleboid, Audrey; Fruchart, Jamila; Pennacchio, Len A.

    2006-10-01

    Increased plasma triglyceride concentrations are an independent risk factor for cardiovascular disease. Numerous studies support a reproducible genetic association between two minor haplotypes in the human apolipoprotein A5 gene (APOA5) and increased plasma triglyceride concentrations. We thus sought to investigate the effect of these minor haplotypes (APOA5*2 and APOA5*3) on ApoAV plasma levels through the precise insertion of single-copy intact APOA5 haplotypes at a targeted location in the mouse genome. While we found no difference in the amount of human plasma ApoAV in mice containing the common APOA5*1 and minor APOA5*2 haplotype, the introduction of the single APOA5*3 defining allele (19W) resulted in 3-fold lower ApoAV plasma levels, consistent with existing genetic association studies. These results indicate that the S19W polymorphism is likely to be functional and explain the strong association of this variant with plasma triglycerides, supporting the value of sensitive in vivo assays to define the functional nature of human haplotypes.

  17. Founder haplotype analysis of Fanconi anemia in the Korean population finds common ancestral haplotypes for a FANCG variant.

    Science.gov (United States)

    Park, Joonhong; Kim, Myungshin; Jang, Woori; Chae, Hyojin; Kim, Yonggoo; Chung, Nack-Gyun; Lee, Jae-Wook; Cho, Bin; Jeong, Dae-Chul; Park, In Yang; Park, Mi Sun

    2015-05-01

    A common ancestral haplotype is strongly suggested in the Korean and Japanese patients with Fanconi anemia (FA), because common mutations have been frequently found: c.2546delC and c.3720_3724delAAACA of FANCA; c.307+1G>C, c.1066C>T, and c.1589_1591delATA of FANCG. Our aim in this study was to investigate the origin of these common mutations of FANCA and FANCG. We genotyped 13 FA patients consisting of five FA-A patients and eight FA-G patients from the Korean FA population. Microsatellite markers used for haplotype analysis included four CA repeat markers which are closely linked with FANCA and eight CA repeat markers which are contiguous with FANCG. As a result, Korean FA-A patients carrying c.2546delC or c.3720_3724delAAACA did not share the same haplotypes. However, three unique haplotypes carrying c.307+1G>C, c.1066C>T, or c.1589_1591delATA, that consisted of eight polymorphic loci covering a flanking region were strongly associated with Korean FA-G, consistent with founder haplotypes reported previously in the Japanese FA-G population. Our finding confirmed the common ancestral haplotypes on the origins of the East Asian FA-G patients, which will improve our understanding of the molecular population genetics of FA-G. To the best of our knowledge, this is the first report on the association between disease-linked mutations and common ancestral haplotypes in the Korean FA population. © 2015 John Wiley & Sons Ltd/University College London.

  18. Bayesian Inference Methods for Sparse Channel Estimation

    DEFF Research Database (Denmark)

    Pedersen, Niels Lovmand

    2013-01-01

    This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...

  19. EI: A Program for Ecological Inference

    Directory of Open Access Journals (Sweden)

    Gary King

    2004-09-01

    Full Text Available The program EI provides a method of inferring individual behavior from aggregate data. It implements the statistical procedures, diagnostics, and graphics from the book A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data (King 1997). Ecological inference, as traditionally defined, is the process of using aggregate (i.e., "ecological") data to infer discrete individual-level relationships of interest when individual-level data are not available. Ecological inferences are required in political science research when individual-level surveys are unavailable (e.g., local or comparative electoral politics), unreliable (racial politics), insufficient (political geography), or infeasible (political history). They are also required in numerous areas of major significance in public policy (e.g., for applying the Voting Rights Act) and other academic disciplines ranging from epidemiology and marketing to sociology and quantitative history.

  20. HLA-G Haplotypes Are Differentially Associated with Asthmatic Features

    Directory of Open Access Journals (Sweden)

    Camille Ribeyre

    2018-02-01

    Full Text Available Human leukocyte antigen (HLA)-G, an HLA class Ib molecule, interacts with receptors on lymphocytes such as T cells, B cells, and natural killer cells to influence immune responses. Unlike classical HLA molecules, HLA-G expression is not found on all somatic cells, but restricted to tissue sites, including human bronchial epithelium cells (HBEC). Individual variation in HLA-G expression is linked to its genetic polymorphism and has been associated with many pathological situations such as asthma, which is characterized by epithelium abnormalities and inflammatory cell activation. Studies reported both higher and equivalent soluble HLA-G (sHLA-G) expression in different cohorts of asthmatic patients. In particular, we recently described impaired local expression of HLA-G and abnormal profiles for alternatively spliced isoforms in HBEC from asthmatic patients. sHLA-G dosage is challenging because of its many levels of polymorphism (dimerization, association with β2-microglobulin, and alternative splicing), thus many clinical studies focused on HLA-G single-nucleotide polymorphisms as predictive biomarkers, but few analyzed HLA-G haplotypes. Here, we aimed to characterize HLA-G haplotypes and describe their association with asthmatic clinical features and sHLA-G peripheral expression and to describe variations in transcription factor (TF) binding sites and alternative splicing sites. HLA-G haplotypes were differentially distributed in 330 healthy and 580 asthmatic individuals. Furthermore, HLA-G haplotypes were associated with asthmatic clinical features. However, we did not confirm an association between sHLA-G and genetic, biological, or clinical parameters. HLA-G haplotypes were phylogenetically split into distinct groups, with each group displaying particular variations in TF binding or RNA splicing sites that could reflect differential HLA-G qualitative or quantitative expression, with tissue-dependent specificities. Our results, based on a

  1. Prion gene haplotypes of U.S. cattle

    Directory of Open Access Journals (Sweden)

    Harhay Gregory P

    2006-11-01

    Full Text Available Abstract Background Bovine spongiform encephalopathy (BSE) is a fatal neurological disorder characterized by abnormal deposits of a protease-resistant isoform of the prion protein. Characterizing linkage disequilibrium (LD) and haplotype networks within the bovine prion gene (PRNP) is important for 1) testing rare or common PRNP variation for an association with BSE and 2) interpreting any association of PRNP alleles with BSE susceptibility. The objective of this study was to identify polymorphisms and haplotypes within PRNP from the promoter region through the 3'UTR in a diverse sample of U.S. cattle genomes. Results A 25.2-kb genomic region containing PRNP was sequenced from 192 diverse U.S. beef and dairy cattle. Sequence analyses identified 388 total polymorphisms, of which 287 have not previously been reported. The polymorphism alleles define PRNP by regions of high and low LD. High LD is present between alleles in the promoter region through exon 2 (6.7 kb). PRNP alleles within the majority of intron 2, the entire coding sequence and the untranslated region of exon 3 are in low LD (18.0 kb). Two haplotype networks, one representing the region of high LD and the other the region of low LD, yielded nineteen different combinations that represent haplotypes spanning PRNP. The haplotype combinations are tagged by 19 polymorphisms (htSNPs) which characterize variation within and across PRNP. Conclusion The number of polymorphisms in the prion gene region of U.S. cattle is nearly four times greater than previously described. These polymorphisms define PRNP haplotypes that may influence BSE susceptibility in cattle.

  2. Haplotype analysis and linkage disequilibrium for DGAT1

    OpenAIRE

    Strucken, Eva M.; Rahmatalla, Siham; De Koning, Dirk-Jan; Brockmann, Gudrun A.

    2010-01-01

    This study focused on haplotype effects and linkage disequilibrium (LD) for the K232A locus and the promoter VNTR in the DGAT1 gene. Analyses were carried out in three German Holstein Friesian populations (including 492, 305, and 518 animals) for milk yield, milk fat and protein yield, and milk fat and protein content. We found that effects of the promoter VNTR were not significant and explain only a small amount of the variation of the QTL on BTA14. Haplotype effects were less significant tha...

  3. Mineralocorticoid receptor haplotype, estradiol, progesterone and emotional information processing.

    Science.gov (United States)

    Hamstra, Danielle A; de Kloet, E Ronald; Quataert, Ina; Jansen, Myrthe; Van der Does, Willem

    2017-02-01

    Carriers of MR-haplotype 1 and 3 (GA/CG; rs5522 and rs2070951) are more sensitive to the influence of oral contraceptives (OC) and menstrual cycle phase on emotional information processing than MR-haplotype 2 (CA) carriers. We investigated whether this effect is associated with estradiol (E2) and/or progesterone (P4) levels. Healthy MR-genotyped premenopausal women were tested twice in a counterbalanced design. Naturally cycling (NC) women were tested in the early-follicular and mid-luteal phase and OC-users during OC-intake and in the pill-free week. At both sessions E2 and P4 were assessed in saliva. Tests included implicit and explicit positive and negative affect, attentional blink accuracy, emotional memory, emotion recognition, and risky decision-making (gambling). MR-haplotype 2 homozygotes had higher implicit happiness scores than MR-haplotype 2 heterozygotes (p=0.031) and MR-haplotype 1/3 carriers (p ... emotion recognition test than MR-haplotype 1/3 (p=0.001). Practice effects were observed for most measures. The pattern of correlations between information processing and P4 or E2 differed between sessions, as well as the moderating effects of the MR genotype. In the first session the MR-genotype moderated the influence of P4 on implicit anxiety (sr=-0.30; p=0.005): higher P4 was associated with reduction in implicit anxiety, but only in MR-haplotype 2 homozygotes (sr=-0.61; p=0.012). In the second session the MR-genotype moderated the influence of E2 on the recognition of facial expressions of happiness (sr=-0.21; p=0.035): only in MR-haplotype 1/3 higher E2 was correlated with happiness recognition (sr=0.29; p=0.005). In the second session higher E2 and P4 were negatively correlated with accuracy in lag2 trials of the attentional blink task (p ... emotional information processing. This moderating effect may depend on the novelty of the situation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Using metacognitive cues to infer others' thinking

    OpenAIRE

    André Mata; Tiago Almeida

    2014-01-01

    Three studies tested whether people use cues about the way other people think (for example, whether others respond fast vs. slow) to infer what responses other people might give to reasoning problems. People who solve reasoning problems using deliberative thinking have better insight than intuitive problem-solvers into the responses that other people might give to the same problems. Presumably because deliberative responders think of intuitive responses before they think o...

  5. Statistical inference: an integrated Bayesian/likelihood approach

    CERN Document Server

    Aitkin, Murray

    2010-01-01

    Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing. After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre

  6. Bayesian statistical inference

    Directory of Open Access Journals (Sweden)

    Bruno De Finetti

    2017-04-01

    Full Text Available This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993. Bayesian Statistical Inference is one of the last fundamental philosophical papers in which we can find the essentials of De Finetti's approach to statistical inference.

  7. Statistical learning and selective inference.

    Science.gov (United States)

    Taylor, Jonathan; Tibshirani, Robert J

    2015-06-23

    We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked" (searched for the strongest associations) means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
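
    Why cherry-picking calls for "a higher bar" can be shown with a small worked calculation. This sketch is not one of the selective-inference procedures the abstract refers to; it assumes independent tests with no real effect and uses the classical Bonferroni adjustment only as a familiar point of comparison. The function names are hypothetical.

```python
def chance_of_false_hit(num_tests, alpha=0.05):
    """Probability that the smallest of num_tests independent null p-values
    falls below alpha, i.e. that the "best" association looks significant
    even though nothing real is present."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test threshold that keeps the overall false-hit chance at or below alpha."""
    return alpha / num_tests


for m in (1, 10, 100):
    naive = chance_of_false_hit(m)
    adjusted = chance_of_false_hit(m, bonferroni_threshold(m))
    print(f"{m:>3} tests: naive {naive:.3f}, adjusted {adjusted:.3f}")
```

    With a single pre-specified test the false-hit chance equals the nominal 5%, but after searching 100 candidates it is close to certainty unless the threshold is tightened.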

  8. Statistical inference an integrated approach

    CERN Document Server

    Migon, Helio S; Louzada, Francisco

    2014-01-01

    Introduction Information The concept of probability Assessing subjective probabilities An example Linear algebra and probability Notation Outline of the book Elements of Inference Common statistical models Likelihood-based functions Bayes theorem Exchangeability Sufficiency and exponential family Parameter elimination Prior Distribution Entirely subjective specification Specification through functional forms Conjugacy with the exponential family Non-informative priors Hierarchical priors Estimation Introduction to decision theory Bayesian point estimation Classical point estimation Empirical Bayes estimation Comparison of estimators Interval estimation Estimation in the Normal model Approximating Methods The general problem of inference Optimization techniques Asymptotic theory Other analytical approximations Numerical integration methods Simulation methods Hypothesis Testing Introduction Classical hypothesis testing Bayesian hypothesis testing Hypothesis testing and confidence intervals Asymptotic tests Prediction...

  9. Causal inference based on counterfactuals

    Directory of Open Access Journals (Sweden)

    Höfler M

    2005-09-01

    Full Text Available Abstract Background The counterfactual or potential outcome model has become increasingly standard for causal inference in epidemiological and medical studies. Discussion This paper provides an overview of the counterfactual and related approaches. A variety of conceptual as well as practical issues when estimating causal effects are reviewed. These include causal interactions, imperfect experiments, adjustment for confounding, time-varying exposures, competing risks and the probability of causation. It is argued that the counterfactual model of causal effects captures the main aspects of causality in health sciences and relates to many statistical procedures. Summary Counterfactuals are the basis of causal inference in medicine and epidemiology. Nevertheless, the estimation of counterfactual differences poses several difficulties, primarily in observational studies. These problems, however, reflect fundamental barriers only when learning from observations, and this does not invalidate the counterfactual concept.

  10. Probability biases as Bayesian inference

    Directory of Open Access Journals (Sweden)

    Andre C. R. Martins

    2006-11-01

    Full Text Available In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated with them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors, they can be understood as adaptations to the solution of real-life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. In that sense, it should be no surprise that humans reason with probability as it has been observed.

  11. Nonparametric Bayesian inference in biostatistics

    CERN Document Server

    Müller, Peter

    2015-01-01

    As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...

  12. Bayesian genomic selection: the effect of haplotype lengths and priors

    DEFF Research Database (Denmark)

    Villumsen, Trine Michelle; Janss, Luc

    2009-01-01

    Breeding values for animals with marker data are estimated using a genomic selection approach where data is analyzed using Bayesian multi-marker association models. Fourteen model scenarios with varying haplotype lengths, hyper parameter and prior distributions were compared to find the scenario ...

  13. Direct chromosome-length haplotyping by single-cell sequencing

    NARCIS (Netherlands)

    Porubský, David; Sanders, Ashley D; van Wietmarschen, Niek; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Bevova, Marianna R; Guryev, Victor; Lansdorp, Peter Michael

    Haplotypes are fundamental to fully characterize the diploid genome of an individual, yet methods to directly chart the unique genetic makeup of each parental chromosome are lacking. Here we introduce single-cell DNA template strand sequencing (Strand-seq) as a novel approach to phasing diploid

  14. Dense and accurate whole-chromosome haplotyping of individual genomes

    NARCIS (Netherlands)

    Porubsky, David; Garg, Shilpa; Sanders, Ashley D.; Korbel, Jan O.; Guryev, Victor; Lansdorp, Peter M.; Marschall, Tobias

    2017-01-01

    The diploid nature of the human genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. This lack of haplotype-level analyses can be explained by a lack of methods that can produce dense and accurate

  15. Geographical distribution of a specific mitochondrial haplotype of Zymoseptoria tritici

    Directory of Open Access Journals (Sweden)

    Sameh BOUKEF

    2014-01-01

    Full Text Available Severity of disease caused by the fungus Zymoseptoria tritici throughout world cereal growing regions has elicited much debate on the potential evolutionary mechanism conferring high adaptability of the pathogen to diverse climate conditions and different wheat hosts (Triticum durum and T. aestivum). Specific mitochondrial DNA sequence was used to investigate geographic distribution of the type 4 haplotype (mtRFLP4) within 1363 isolates of Z. tritici originating from 21 countries. The mtRFLP4 haplotype was detected from both durum and bread wheat hosts with greater frequency on durum wheat. The distribution of mtRFLP4 was limited to populations sampled from the Mediterranean and the Red Sea region. Greater frequencies of mtRFLP4 were found in Tunisia (87%) and Algeria (60%). The haplotype was absent within European, Australian, North and South American populations except Argentina. While alternative hypotheses such as climatic adaptation could not be ruled out, it is postulated that mtRFLP4 originated in North Africa (e.g. Tunisia or Algeria) as an adaptation to durum wheat as the prevailing cereal crop. The specialized haplotype has subsequently spread as indicated by lower frequency of occurrence in the surrounding Mediterranean countries and on bread wheat hosts.

  16. Association of specific haplotype of TNFα with Helicobacter pylori ...

    Indian Academy of Sciences (India)

    Journal of Genetics, Volume 87, Issue 3. Association of specific haplotype of TNFα with Helicobacter pylori-mediated duodenal ulcer in eastern Indian population. Meenakshi Chakravorty, Dipanjana Datta De, Abhijit Choudhury, Amal Santra, Susanta Roychoudhury. Research Note ...

  17. General Purpose Probabilistic Programming Platform with Effective Stochastic Inference

    Science.gov (United States)

    2018-04-01

    Figure 1. The problem of inferring curves from data while simultaneously choosing the... bottom path) as the inverse problem to computer graphics (top path). ... Figure 18. An illustration of generative probabilistic graphics for 3D... Building these systems involves simultaneously developing mathematical models, inference algorithms and optimized software implementations. Small changes

  18. Beta-globin gene cluster haplotypes of Amerindian populations from the Brazilian Amazon region.

    Science.gov (United States)

    Guerreiro, J F; Figueiredo, M S; Zago, M A

    1994-01-01

    We have determined the beta-globin cluster haplotypes for 80 Indians from four Brazilian Amazon tribes: Kayapó, Wayampí, Wayana-Apalaí, and Arára. The results are analyzed together with 20 Yanomámi previously studied. From 2 to 4 different haplotypes were identified for each tribe, and 7 of the possible 32 haplotypes were found in a sample of 172 chromosomes for which the beta haplotypes were directly determined or derived from family studies. The haplotype distribution does not differ significantly among the five populations. The two most common haplotypes in all tribes were haplotypes 2 and 6, with average frequencies of 0.843 and 0.122, respectively. The genetic affinities between Brazilian Indians and other human populations were evaluated by estimates of genetic distance based on haplotype data. The lowest values were observed in relation to Asians, especially Chinese, Polynesians, and Micronesians.

  19. Logical inference and evaluation

    International Nuclear Information System (INIS)

    Perey, F.G.

    1981-01-01

    Most methodologies of evaluation currently used are based upon the theory of statistical inference. It is generally perceived that this theory is not capable of dealing satisfactorily with what are called systematic errors. Theories of logical inference should be capable of treating all of the information available, including that not involving frequency data. A theory of logical inference is presented as an extension of deductive logic via the concept of plausibility and the application of group theory. Some conclusions, based upon the application of this theory to evaluation of data, are also given

  20. A comparison of different algorithms for phasing haplotypes using Holstein cattle genotypes and pedigree data.

    Science.gov (United States)

    Miar, Younes; Sargolzaei, Mehdi; Schenkel, Flavio S

    2017-04-01

    Phasing genotypes to haplotypes is becoming increasingly important due to its applications in the study of diseases, population and evolutionary genetics, imputation, and so on. Several studies have focused on the development of computational methods that infer haplotype phase from population genotype data. The aim of this study was to compare phasing algorithms implemented in Beagle, Findhap, FImpute, Impute2, and ShapeIt2 software using 50k and 777k (HD) genotyping data. Six scenarios were considered: no-parents, sire-progeny pairs, sire-dam-progeny trios, each with and without pedigree information in Holstein cattle. Algorithms were compared with respect to their phasing accuracy and computational efficiency. In the studied population, Beagle and FImpute were more accurate than other phasing algorithms. Across scenarios, phasing accuracies for Beagle and FImpute were 99.49-99.90% and 99.44-99.99% for 50k, respectively, and 99.90-99.99% and 99.87-99.99% for HD, respectively. Generally, FImpute resulted in higher accuracy when genotypic information of at least one parent was available. In the absence of parental genotypes and pedigree information, Beagle and Impute2 (with double the default number of states) were slightly more accurate than FImpute. Findhap gave high phasing accuracy when parents' genotypes and pedigree information were available. In terms of computing time, Findhap was the fastest algorithm followed by FImpute. FImpute was 30 to 131, 87 to 786, and 353 to 1,400 times faster across scenarios than Beagle, ShapeIt2, and Impute2, respectively. In summary, FImpute and Beagle were the most accurate phasing algorithms. Moreover, the low computational requirement of FImpute makes it an attractive algorithm for phasing genotypes of large livestock populations. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  1. Mice, humans and haplotypes--the hunt for disease genes in SLE.

    Science.gov (United States)

    Rigby, R J; Fernando, M M A; Vyse, T J

    2006-09-01

    Defining the polymorphisms that contribute to the development of complex genetic disease traits is a challenging, although increasingly tractable problem. Historically, the technical difficulties in conducting association studies across the entire human genome are such that murine models have been used to generate candidate genes for analysis in human complex diseases, such as SLE. In this article we discuss the advantages and disadvantages of this approach and specifically address some assumptions made in the transition from studying one species to another, using lupus as an example. These issues include differences in genetic structure and genetic organisation which are a reflection on the population history. Clearly there are major differences in the histories of the human population and inbred laboratory strains of mice. Both human and murine genomes do exhibit structure at the genetic level. That is to say, they comprise haplotypes which are genomic regions that carry runs of polymorphisms that are not independently inherited. Haplotypes therefore reduce the number of combinations of the polymorphisms in the DNA in that region and facilitate the identification of disease susceptibility genes in both mice and humans. There are now novel means of generating candidate genes in SLE using mutagenesis (with ENU) in mice and identifying mice that generate antinuclear autoimmunity. In addition, murine models still provide a valuable means of exploring the functional consequences of genetic variation. However, advances in technology are such that human geneticists can now screen large fractions of the human genome for disease associations using microchip technologies that provide information on upwards of 100,000 different polymorphisms. These approaches are aimed at identifying haplotypes that carry disease susceptibility mutations and rely less on the generation of candidate genes.

  2. Probability and Statistical Inference

    OpenAIRE

    Prosper, Harrison B.

    2006-01-01

    These lectures introduce key concepts in probability and statistical inference at a level suitable for graduate students in particle physics. Our goal is to paint as vivid a picture as possible of the concepts covered.

  3. Polynomial Chaos Surrogates for Bayesian Inference

    KAUST Repository

    Le Maitre, Olivier

    2016-01-06

    Bayesian inference is a popular probabilistic method for solving inverse problems, such as the identification of a field parameter in a PDE model. The inference relies on Bayes' rule to update the prior density of the sought field from observations and derive its posterior distribution. In most cases the posterior distribution has no explicit form and has to be sampled, for instance using a Markov chain Monte Carlo method. In practice the prior field parameter is decomposed and truncated (e.g. by means of a Karhunen-Loève decomposition) to recast the inference problem into the inference of a finite number of coordinates. Although proven effective in many situations, Bayesian inference as sketched above faces several difficulties requiring improvements. First, sampling the posterior can be an extremely costly task, as it requires multiple resolutions of the PDE model for different values of the field parameter. Second, when the observations are not very informative, the inferred parameter field can depend heavily on its prior, which can be somewhat arbitrary. These issues have motivated the introduction of reduced modeling or surrogates for the (approximate) determination of the parametrized PDE solution and hyperparameters in the description of the prior field. Our contribution focuses on recent developments in these two directions: the acceleration of the posterior sampling by means of Polynomial Chaos expansions and the efficient treatment of parametrized covariance functions for the prior field. We also discuss the possibility of making such an approach adaptive to further improve its efficiency.
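
    The sampling step mentioned in this abstract can be sketched with a random-walk Metropolis sampler. This is a minimal illustration under stated assumptions, not the authors' implementation: `forward_model` is a hypothetical cheap stand-in for a PDE solve, the parameter is a single scalar rather than a truncated field expansion, and the Gaussian prior and noise model are assumed.

```python
import math
import random

def forward_model(theta):
    """Hypothetical stand-in for a PDE solve: parameter -> observable."""
    return theta ** 3 + theta

def log_posterior(theta, data, noise_sd=0.1, prior_sd=1.0):
    """Unnormalized log posterior (assumed Gaussian prior and noise)."""
    log_prior = -0.5 * (theta / prior_sd) ** 2
    prediction = forward_model(theta)
    log_like = -0.5 * sum(((d - prediction) / noise_sd) ** 2 for d in data)
    return log_prior + log_like

def metropolis(data, n_steps=5000, step_sd=0.05, seed=0):
    """Random-walk Metropolis; returns the list of visited states."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_posterior(theta, data)
    chain = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0, step_sd)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        chain.append(theta)
    return chain

# Noise-free observations generated at theta = 0.5 (illustrative data).
observations = [forward_model(0.5)] * 5
samples = metropolis(observations)
posterior_mean = sum(samples[1000:]) / len(samples[1000:])
```

    Each iteration calls `forward_model` once, which is why replacing an expensive solver with a cheaper surrogate, such as the Polynomial Chaos expansions the abstract mentions, can reduce the cost of posterior sampling.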

  4. Spatial Inference Based on Geometric Proportional Analogies

    OpenAIRE

    Mullally, Emma-Claire; O'Donoghue, Diarmuid P.

    2006-01-01

    We describe an instance-based reasoning solution to a variety of spatial reasoning problems. The solution centers on identifying an isomorphic mapping between labelled graphs that represent some problem data and a known solution instance. We describe a number of spatial reasoning problems that are solved by generating non-deductive inferences, integrating topology with area (and other) features. We report the accuracy of our algorithm on different categories of spatial reasoning tasks from th...

  5. Bayesian inference for Hawkes processes

    DEFF Research Database (Denmark)

    Rasmussen, Jakob Gulddahl

    The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....

  6. Bayesian inference for Hawkes processes

    DEFF Research Database (Denmark)

    Rasmussen, Jakob Gulddahl

    2013-01-01

    The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
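
    The conditional intensity function that the first approach builds on can be sketched as follows. The abstract does not state which kernel is used, so the exponential kernel and the parameter names `mu`, `alpha` and `beta` are assumptions made for illustration: the intensity at time t is taken to be mu plus the sum of alpha * exp(-beta * (t - s)) over earlier events s. The log-likelihood shown is the standard point-process form (sum of log intensities at the events minus the integrated intensity), which is the quantity a Bayesian sampler would combine with a prior.

```python
import math

def intensity(t, events, mu, alpha, beta):
    """Conditional intensity at time t given the earlier event times.

    Assumes an exponential kernel: each past event s adds
    alpha * exp(-beta * (t - s)) to the baseline rate mu.
    """
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events if s < t)

def log_likelihood(events, horizon, mu, alpha, beta):
    """Log-likelihood of event times observed on [0, horizon]."""
    log_term = sum(
        math.log(intensity(t, events, mu, alpha, beta)) for t in events
    )
    # Integral of the intensity over [0, horizon], in closed form
    # for the exponential kernel.
    compensator = mu * horizon + sum(
        (alpha / beta) * (1.0 - math.exp(-beta * (horizon - s))) for s in events
    )
    return log_term - compensator
```

    With `alpha = 0` the expressions reduce to a homogeneous Poisson process with rate `mu`, which gives a quick sanity check on the formulas.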

  7. INFERENCE BUILDING BLOCKS

    Science.gov (United States)

    2018-02-15

    expressed a variety of inference techniques on discrete and continuous distributions: exact inference, importance sampling, Metropolis-Hastings (MH...without redoing any math or rewriting any code. And although our main goal is composable reuse, our performance is also good because we can use...control paths. • The Hakaru language can express mixtures of discrete and continuous distributions, but the current disintegration transformation

  8. Introductory statistical inference

    CERN Document Server

    Mukhopadhyay, Nitis

    2014-01-01

    This gracefully organized text reveals the rigorous theory of probability and statistical inference in the style of a tutorial, using worked examples, exercises, figures, tables, and computer simulations to develop and illustrate concepts. Drills and boxed summaries emphasize and reinforce important ideas and special techniques. Beginning with a review of the basic concepts and methods in probability theory, moments, and moment generating functions, the author moves to more intricate topics. Introductory Statistical Inference studies multivariate random variables, exponential families of dist

  9. Analysis of HLA class II haplotypes in the Cayapa Indians of Ecuador: A novel DRB1 allele reveals evidence for convergent evolution and balancing selection at position 86

    Energy Technology Data Exchange (ETDEWEB)

    Titus-Trachtenberg, E.A.; Erlich, H. (Roche Molecular Systems, Alameda, CA (United States)); Rickards, O.; De Stefano, G.F. (Universita di Roma, Rome (Italy))

    1994-07-01

    PCR amplification, oligonucleotide probe typing, and sequencing were used to analyze the HLA class II loci (DRB1, DQA1, DQB1, and DPB1) of an isolated South Amerindian tribe. Here the authors report HLA class II variation, including the identification of a new DRB1 allele, several novel DR/DQ haplotypes, and an unusual distribution of DPB1 alleles, among the Cayapa Indians (N=100) of Ecuador. A general reduction of HLA class II allelic variation in the Cayapa is consistent with a population bottleneck during the colonization of the Americas. The new Cayapa DRB1 allele, DRB1*08042, which arose by a G→T point mutation in the parental DRB1*0802, contains a novel Val codon (GTT) at position 86. The generation of DRB1*08042 (Val-86) from DRB1*0802 (Gly-86) in the Cayapa, by a different mechanism than the (GT→TG) change in the creation of DRB1*08041 (Val-86) from DRB1*0802 in Africa, implicates selection in the convergent evolution of position 86 DRβ variants. The DRB1*08042 allele has not been found in >1,800 Amerindian haplotypes and thus presumably arose after the Cayapa separated from other South American Amerindians. Selection pressure for increased haplotype diversity can be inferred in the generation and maintenance of three new DRB1*08042 haplotypes and several novel DR/DQ haplotypes in this population. The DPB1 allelic distribution in the Cayapa is also extraordinary, with two alleles, DPB1*1401, a very rare allele in North American Amerindian populations, and DPB1*0402, the most common Amerindian DPB1 allele, constituting 89% of the Cayapa DPB1. These data are consistent with the postulated rapid rate of evolution as noted for the class I HLA-B locus of other South American Indians. 34 refs., 2 figs., 2 tabs.

  10. RFLP's for the human pepsinogen A haplotypes (PGA)

    Energy Technology Data Exchange (ETDEWEB)

    Taggart, R T; Boudi, F B; Bell, G I

    1988-10-11

    PGA 101 is a 1340 bp cDNA clone containing exons 1-9 of the predicted human pepsinogen A coding sequence. Two distinct polymorphisms are detected with EcoRI and Bgl II. Analysis with these enzymes provides for discrimination of the PGA haplotypes A, B, and C containing three, two and one PGA genes respectively. The PGA complex is located at 11q13. Mendelian inheritance was demonstrated in 20 families.

  11. Extended HLA-D region haplotype associated with celiac disease

    International Nuclear Information System (INIS)

    Howell, M.D.; Smith, J.R.; Austin, R.K.; Kelleher, D.; Nepom, G.T.; Volk, B.; Kagnoff, M.F.

    1988-01-01

    Celiac disease has one of the strongest associations with HLA (human leukocyte antigen) class II markers of the known HLA-linked diseases. This association is primarily with the class II serologic specificities HLA-DR3 and -DQw2. The authors previously described a restriction fragment length polymorphism (RFLP) characterized by the presence of a 4.0-kilobase Rsa I fragment derived from an HLA class II β-chain gene, which distinguishes the class II HLA haplotype of celiac disease patients from those of many serologically matched controls. They now report the isolation of this β-chain gene from a bacteriophage genomic library constructed from the DNA of a celiac disease patient. Based on restriction mapping and differential hybridization with class II cDNA and oligonucleotide probes, this gene was identified as one encoding an HLA-DP β-chain. This celiac disease-associated HLA-DP β-chain gene was flanked by HLA-DP α-chain genes and, therefore, was probably in its normal chromosomal location. The HLA-DP α-chain genes of celiac disease patients also were studied by RFLP analysis. Celiac disease is associated with a subset of HLA-DR3, -DQw2 haplotypes characterized by HLA-DP α- and β-chain gene RFLPs. Within the celiac-disease patient population, the joint segregation of these HLA-DP genes with those encoding the serologic specificities HLA-DR3 and -DQw2 indicates: (i) that the class II HLA haplotype associated with celiac disease is extended throughout the entire HLA-D region, and (ii) that celiac-disease susceptibility genes may reside as far centromeric on this haplotype as the HLA-DP subregion.

  12. Extended HLA-D region haplotype associated with celiac disease

    Energy Technology Data Exchange (ETDEWEB)

    Howell, M.D.; Smith, J.R.; Austin, R.K.; Kelleher, D.; Nepom, G.T.; Volk, B.; Kagnoff, M.F.

    1988-01-01

    Celiac disease has one of the strongest associations with HLA (human leukocyte antigen) class II markers of the known HLA-linked diseases. This association is primarily with the class II serologic specificities HLA-DR3 and -DQw2. The authors previously described a restriction fragment length polymorphism (RFLP) characterized by the presence of a 4.0-kilobase Rsa I fragment derived from an HLA class II β-chain gene, which distinguishes the class II HLA haplotype of celiac disease patients from those of many serologically matched controls. They now report the isolation of this β-chain gene from a bacteriophage genomic library constructed from the DNA of a celiac disease patient. Based on restriction mapping and differential hybridization with class II cDNA and oligonucleotide probes, this gene was identified as one encoding an HLA-DP β-chain. This celiac disease-associated HLA-DP β-chain gene was flanked by HLA-DP α-chain genes and, therefore, was probably in its normal chromosomal location. The HLA-DP α-chain genes of celiac disease patients also were studied by RFLP analysis. Celiac disease is associated with a subset of HLA-DR3, -DQw2 haplotypes characterized by HLA-DP α- and β-chain gene RFLPs. Within the celiac-disease patient population, the joint segregation of these HLA-DP genes with those encoding the serologic specificities HLA-DR3 and -DQw2 indicates: (i) that the class II HLA haplotype associated with celiac disease is extended throughout the entire HLA-D region, and (ii) that celiac-disease susceptibility genes may reside as far centromeric on this haplotype as the HLA-DP subregion.

  13. How to deal with Haplotype data: An Extension to the Conceptual Schema of the Human Genome

    Directory of Open Access Journals (Sweden)

    José Fabián Reyes Román

    2016-12-01

    Full Text Available The goal of this work is to describe the advantages of the application of Conceptual Modeling (CM) in complex domains, such as genomics. Nowadays, the study and comprehension of the human genome is a major challenge due to its high level of complexity. The constant evolution in the genomic domain contributes to the generation of ever larger amounts of new data, which means that if we do not manage it correctly data quality could be compromised (i.e., problems related to heterogeneity and inconsistent data). In this paper, we propose the use of a Conceptual Schema of the Human Genome (CSHG), designed to understand and improve our ontological commitment to the domain, and also extend (enrich) this schema with the integration of a novel concept: Haplotypes. Our focus is on improving the understanding of the relationship between genotype and phenotype, since new findings show that this question is more complex than was originally thought. Here we present the first steps in our data management approach with haplotypes (variations, frequencies and populations) and discuss the database evolution to support this data. Each new version in our conceptual schema (CS) introduces changes to the underlying database structure that have essential and practical implications for better understanding and managing the relevant information. A solution based on conceptual models gives a clear definition of the domain with direct implications in the medical field (Precision Medicine), in which Genomic Information Systems (GeIS) play a very important role.

  14. The effect of using genealogy-based haplotypes for genomic prediction.

    Science.gov (United States)

    Edriss, Vahid; Fernando, Rohan L; Su, Guosheng; Lund, Mogens S; Guldbrandtsen, Bernt

    2013-03-06

    Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information. A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (π) of the haplotype covariates had zero effect, i.e. a Bayesian mixture method. About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some cases, decreased the bias of prediction. With the Bayesian method, accuracy of prediction was less sensitive to parameter π when fitting haplotypes compared to fitting markers. Use of haplotypes based on genealogy can slightly increase the accuracy of genomic prediction. Improved methods to cluster the haplotypes constructed from local genealogy could lead to additional gains in accuracy.

  15. Statistical inference based on divergence measures

    CERN Document Server

    Pardo, Leandro

    2005-01-01

    The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. However, in spite of the fact that divergence statistics have become a very good alternative to the classical likelihood ratio test and the Pearson-type statistic in discrete models, many statisticians remain unaware of this powerful approach. Statistical Inference Based on Divergence Measures explores classical problems of statistical inference, such as estimation and hypothesis testing, on the basis of measures of entropy and divergence. The first two chapters form an overview, from a statistical perspective, of the most important measures of entropy and divergence and study their properties. The author then examines the statistical analysis of discrete multivariate data with emphasis on problems in contingency tables and loglinear models using phi-divergence test statistics as well as minimum phi-divergence estimators. The final chapter looks at testing in general populations, prese...

  16. MCP1 haplotypes associated with protection from pulmonary tuberculosis

    Directory of Open Access Journals (Sweden)

    Owusu-Dabo Ellis

    2011-04-01

    Full Text Available Abstract Background The monocyte chemoattractant protein 1 (MCP-1) is involved in the recruitment of lymphocytes and monocytes and their migration to sites of injury and cellular immune reactions. In a Ghanaian tuberculosis (TB) case-control study group, associations of the MCP1 -362C and the MCP1 -2581G alleles with resistance to TB were recently described. The latter association was in contrast to genetic effects previously described in study groups originating from Mexico, Korea, Peru and Zambia. This inconsistency prompted us to further investigate the MCP1 gene in order to determine causal variants or haplotypes genetically and functionally. Results A 14 base-pair deletion in the first MCP1 intron, int1del554-567, was strongly associated with protection against pulmonary TB (OR = 0.84, CI 0.77-0.92, Pcorrected = 0.00098). Compared to the wildtype combination, a haplotype comprising the -2581G and -362C promoter variants and the intronic deletion conferred an even stronger protection than did the -362C variant alone (OR = 0.78, CI 0.69-0.87, Pnominal = 0.00002; adjusted Pglobal = 0.0028). In a luciferase reporter gene assay, a significant reduction of luciferase gene expression was observed in the two constructs carrying the MCP1 mutations -2581 A or G plus the combination -362C and int1del554-567 compared to the wildtype haplotype (P = 0.02 and P = 0.006). The associated variants, in particular the haplotypes composed of these latter variants, result in decreased MCP-1 expression and a decreased risk of pulmonary TB. Conclusions In addition to the results of the previous study of the Ghanaian TB case-control sample, we have now identified the haplotype combination -2581G/-362C/int1del554-567 that mediates considerably stronger protection than does the MCP1 -362C allele alone (OR = 0.78, CI 0.69-0.87 vs OR = 0.83, CI 0.76-0.91). Our findings in both the genetic analysis and the reporter gene study further indicate a largely negligible role of the

  17. State-Space Inference and Learning with Gaussian Processes

    OpenAIRE

    Turner, R; Deisenroth, MP; Rasmussen, CE

    2010-01-01

    State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model. C...

  18. Inference in "poor" languages

    Energy Technology Data Exchange (ETDEWEB)

    Petrov, S. [Oak Ridge National Lab., TN (United States)]

    1996-12-31

    Languages with a solvable implication problem but without complete and consistent systems of inference rules ('poor' languages) are considered. The problem of existence of a finite, complete, and consistent inference rule system for a "poor" language is stated independently of the language or the rule syntax. Several properties of the problem are proved. An application of the results to the language of join dependencies is given.

  19. Type Inference with Inequalities

    DEFF Research Database (Denmark)

    Schwartzbach, Michael Ignatieff

    1991-01-01

    Type inference can be phrased as constraint-solving over types. We consider an implicitly typed language equipped with recursive types, multiple inheritance, 1st order parametric polymorphism, and assignments. Type correctness is expressed as satisfiability of a possibly infinite collection of (monotonic) inequalities on the types of variables and expressions. A general result about systems of inequalities over semilattices yields a solvable form. We distinguish between deciding typability (the existence of solutions) and type inference (the computation of a minimal solution). In our case, both...

  20. Inference and the Introductory Statistics Course

    Science.gov (United States)

    Pfannkuch, Maxine; Regan, Matt; Wild, Chris; Budgett, Stephanie; Forbes, Sharleen; Harraway, John; Parsonage, Ross

    2011-01-01

    This article sets out some of the rationale and arguments for making major changes to the teaching and learning of statistical inference in introductory courses at our universities by changing from a norm-based, mathematical approach to more conceptually accessible computer-based approaches. The core problem of the inferential argument with its…

  1. Deep Learning for Population Genetic Inference.

    Science.gov (United States)

    Sheehan, Sara; Song, Yun S

    2016-03-01

    Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.

  2. Deep Learning for Population Genetic Inference.

    Directory of Open Access Journals (Sweden)

    Sara Sheehan

    2016-03-01

    Full Text Available Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.

  3. Deep Learning for Population Genetic Inference

    Science.gov (United States)

    Sheehan, Sara; Song, Yun S.

    2016-01-01

    Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme. PMID:27018908

  4. Inference as Prediction

    Science.gov (United States)

    Watson, Jane

    2007-01-01

    Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…

  5. Hybrid Optical Inference Machines

    Science.gov (United States)

    1991-09-27

    with labels. Now, events. A set of facts can be generated in the dyadic form "u, R 1,2" Eichmann and Caulfield [19] consider the same type of and can...these encoding schemes. These architectures are based primarily on optical inner... 19. G. Eichmann and H. J. Caulfield, "Optical Learning (Inference)

  6. Haplotype diversity and linkage disequilibrium at DRD2 locus--a study on four population groups of Andhra Pradesh, India.

    Science.gov (United States)

    Saraswathy, Kallur Nava; Mukhopadhyay, Rupak; Shukla, Deepti; Kaur, Harpreet; Sachdeva, Mohinder Pal; Rao, A P; Saksena, Deepti; Kalla, Aloke Kumar

    2009-02-01

    Dopamine receptor D2 (DRD2) is expressed in the central nervous system and has a high affinity for many antipsychotic drugs. Besides several epidemiological investigations on association of DRD2 locus polymorphism(s) with neuropsychiatric problems and addictive behavior, a few polymorphisms in this locus have also been used to understand genomic diversity and population migratory histories globally. The present study attempts to understand the genomic diversity/affinity among four endogamous groups of Andhra Pradesh (India) against the backdrop of diversity studies from other parts of India and the rest of the world, with special reference to DRD2 locus. The four population groups from Adilabad District of Andhra Pradesh, namely, Brahmin (n=50), Nayakpod (n=49), Thoti (n=52), and Kolam (n=53), were included in the study. The DRD2 markers typed for the present study are three biallelic restriction fragments, that is, TaqI A (rs1800497), TaqI B (rs1079597), and TaqI D (rs1800498). Scoring of DRD2 haplotypes with respect to the three TaqI sites shows that five out of eight possible haplotypes are shared by the four populations. Ancestral haplotype B2D2A1 is most frequent among Thotis (0.359). The results of the present study indicate a differential gene flow into South India followed by certain important demographic events resulting in diversified peopling of India.
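
    The count of "eight possible haplotypes" in the abstract above follows from three biallelic sites (2 x 2 x 2 = 8). A minimal sketch of that enumeration is given below. The labelling scheme (site letters B, D, A with alleles numbered 1 and 2) is an assumption modelled on the "B2D2A1" label quoted in the abstract, and the function name is hypothetical.

```python
from itertools import product

# Assumed labelling: site order B, D, A and alleles "1"/"2",
# modelled on the "B2D2A1" haplotype named in the abstract.
SITES = ("B", "D", "A")
ALLELES = ("1", "2")


def enumerate_haplotypes(sites=SITES, alleles=ALLELES):
    """Return every possible haplotype label for a set of biallelic sites."""
    return [
        "".join(site + allele for site, allele in zip(sites, combo))
        for combo in product(alleles, repeat=len(sites))
    ]


haplotypes = enumerate_haplotypes()
print(len(haplotypes))   # 2 ** 3 = 8 possible haplotypes
print(haplotypes)
```

    With more sites the same call scales as 2 to the power of the number of sites, which is why observed haplotypes (five of eight here) are usually far fewer than the possible ones.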

  7. Generative inference for cultural evolution.

    Science.gov (United States)

    Kandler, Anne; Powell, Adam

    2018-04-05

    One of the major challenges in cultural evolution is to understand why and how various forms of social learning are used in human populations, both now and in the past. To date, much of the theoretical work on social learning has been done in isolation of data, and consequently many insights focus on revealing the learning processes or the distributions of cultural variants that are expected to have evolved in human populations. In population genetics, recent methodological advances have allowed a greater understanding of the explicit demographic and/or selection mechanisms that underlie observed allele frequency distributions across the globe, and their change through time. In particular, generative frameworks, often using coalescent-based simulation coupled with approximate Bayesian computation (ABC), have provided robust inferences on the human past, with no reliance on a priori assumptions of equilibrium. Here, we demonstrate the applicability and utility of generative inference approaches to the field of cultural evolution. The framework advocated here uses observed population-level frequency data directly to establish the likely presence or absence of particular hypothesized learning strategies. In this context, we discuss the problem of equifinality and argue that, in the light of sparse cultural data and the multiplicity of possible social learning processes, the exclusion of those processes inconsistent with the observed data might be the most instructive outcome. Finally, we summarize the findings of generative inference approaches applied to a number of case studies. This article is part of the theme issue 'Bridging cultural gaps: interdisciplinary studies in human cultural evolution'. © 2018 The Author(s).

  8. A combined evidence Bayesian method for human ancestry inference applied to Afro-Colombians.

    Science.gov (United States)

    Rishishwar, Lavanya; Conley, Andrew B; Vidakovic, Brani; Jordan, I King

    2015-12-15

    Uniparental genetic markers, mitochondrial DNA (mtDNA) and Y chromosomal DNA, are widely used for the inference of human ancestry. However, the resolution of ancestral origins based on mtDNA haplotypes is limited by the fact that such haplotypes are often found to be distributed across wide geographical regions. We have addressed this issue here by combining two sources of ancestry information that have typically been considered separately: historical records regarding population origins and genetic information on mtDNA haplotypes. To combine these distinct data sources, we applied a Bayesian approach that considers historical records, in the form of prior probabilities, together with data on the geographical distribution of mtDNA haplotypes, formulated as likelihoods, to yield ancestry assignments from posterior probabilities. This combined evidence Bayesian approach to ancestry assignment was evaluated for its ability to accurately assign sub-continental African ancestral origins to Afro-Colombians based on their mtDNA haplotypes. We demonstrate that the incorporation of historical prior probabilities via this analytical framework can provide for substantially increased resolution in sub-continental African ancestry assignment for members of this population. In addition, a personalized approach to ancestry assignment that involves the tuning of priors to individual mtDNA haplotypes yields even greater resolution for individual ancestry assignment. Despite the fact that Colombia has a large population of Afro-descendants, the ancestry of this community has been understudied relative to populations with primarily European and Native American ancestry. Thus, the application of the kind of combined evidence approach developed here to the study of ancestry in the Afro-Colombian population has the potential to be impactful. The formal Bayesian analytical framework we propose for combining historical and genetic information also has the potential to be widely applied
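
    The combination of priors and likelihoods described in the abstract above is an application of Bayes' rule. The sketch below only illustrates that arithmetic: the region labels and all numbers are hypothetical placeholders, not values from the study, and the function name is an assumption.

```python
def posterior(priors, likelihoods):
    """Combine prior probabilities and likelihoods with Bayes' rule.

    priors:      region -> prior probability (e.g. from historical records)
    likelihoods: region -> P(observed haplotype | region)
    """
    unnormalized = {r: priors[r] * likelihoods[r] for r in priors}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("haplotype has zero likelihood in every region")
    return {r: value / total for r, value in unnormalized.items()}


# Hypothetical inputs, for illustration only.
priors = {"region A": 0.6, "region B": 0.3, "region C": 0.1}
likelihoods = {"region A": 0.02, "region B": 0.05, "region C": 0.01}

post = posterior(priors, likelihoods)
best_region = max(post, key=post.get)
print(post)
print(best_region)
```

    In this toy case the haplotype is more common in region B, which outweighs the larger prior on region A, so the posterior assignment differs from what the prior alone would suggest.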

  9. Causal Effect Inference with Deep Latent-Variable Models

    NARCIS (Netherlands)

    Louizos, C; Shalit, U.; Mooij, J.; Sontag, D.; Zemel, R.; Welling, M.

    2017-01-01

    Learning individual-level causal effects from observational data, such as inferring the most effective medication for a specific patient, is a problem of growing importance for policy makers. The most important aspect of inferring causal effects from observational data is the handling of

  10. Improved Inference of Heteroscedastic Fixed Effects Models

    Directory of Open Access Journals (Sweden)

    Afshan Saeed

    2016-12-01

    Full Text Available Heteroscedasticity is a serious problem that distorts estimation and testing of panel data models (PDM). Arellano (1987) proposed the White (1980) estimator for PDM with heteroscedastic errors, but it provides erroneous inference for data sets that include high leverage points. In this paper, we attempt to improve the heteroscedasticity-consistent covariance matrix estimator (HCCME) for panel data sets with high leverage points. To draw robust inference for the PDM, we focus on improving the kernel bootstrap estimators proposed by Racine and MacKinnon (2007). A Monte Carlo scheme is used to assess the results.

  11. How to solve mathematical problems

    CERN Document Server

    Wickelgren, Wayne A

    1995-01-01

    Seven problem-solving techniques include inference, classification of action sequences, subgoals, contradiction, working backward, relations between problems, and mathematical representation. Also, problems from mathematics, science, and engineering with complete solutions.

  12. Diversity and population structure of Plasmodium falciparum in Thailand based on the spatial and temporal haplotype patterns of the C-terminal 19-kDa domain of merozoite surface protein-1.

    Science.gov (United States)

    Simpalipan, Phumin; Pattaradilokrat, Sittiporn; Siripoon, Napaporn; Seugorn, Aree; Kaewthamasorn, Morakot; Butcher, Robert D J; Harnyuttanakorn, Pongchai

    2014-02-12

    The 19-kDa C-terminal region of the merozoite surface protein-1 of the human malaria parasite Plasmodium falciparum (PfMSP-119) constitutes the major component on the surface of merozoites and is considered as one of the leading candidates for asexual blood stage vaccines. Because the protein exhibits a level of sequence variation that may compromise the effectiveness of a vaccine, the global sequence diversity of PfMSP-119 has been subjected to extensive research, especially in malaria endemic areas. In Thailand, PfMSP-119 sequences have been derived from a single parasite population in Tak province, located along the Thailand-Myanmar border, since 1995. However, the extent of sequence variation and the spatiotemporal patterns of the MSP-119 haplotypes along the Thai borders with Laos and Cambodia are unknown. Sixty-three isolates of P. falciparum from five geographically isolated populations along the Thai borders with Myanmar, Laos and Cambodia in three transmission seasons between 2002 and 2008 were collected and culture-adapted. The msp-1 gene block 17 was sequenced and analysed for the allelic diversity, frequency and distribution patterns of PfMSP-119 haplotypes in individual populations. The PfMSP-119 haplotype patterns were then compared between parasite populations to infer the population structure and genetic differentiation of the malaria parasite. Five conserved polymorphic positions, which accounted for five distinct haplotypes, of PfMSP-119 were identified. Differences in the prevalence of PfMSP-119 haplotypes were detected in different geographical regions, with the highest levels of genetic diversity being found in the Kanchanaburi and Ranong provinces along the Thailand-Myanmar border and Trat province located at the Thailand-Cambodia border. 
Despite this variability, the distribution patterns of individual PfMSP-119 haplotypes seemed to be very similar across the country and over the three malarial transmission seasons, suggesting that gene flow

  13. Subjective randomness as statistical inference.

    Science.gov (United States)

    Griffiths, Thomas L; Daniels, Dylan; Austerweil, Joseph L; Tenenbaum, Joshua B

    2018-06-01

    Some events seem more random than others. For example, when tossing a coin, a sequence of eight heads in a row does not seem very random. Where do these intuitions about randomness come from? We argue that subjective randomness can be understood as the result of a statistical inference assessing the evidence that an event provides for having been produced by a random generating process. We show how this account provides a link to previous work relating randomness to algorithmic complexity, in which random events are those that cannot be described by short computer programs. Algorithmic complexity is both incomputable and too general to capture the regularities that people can recognize, but viewing randomness as statistical inference provides two paths to addressing these problems: considering regularities generated by simpler computing machines, and restricting the set of probability distributions that characterize regularity. Building on previous work exploring these different routes to a more restricted notion of randomness, we define strong quantitative models of human randomness judgments that apply not just to binary sequences - which have been the focus of much of the previous work on subjective randomness - but also to binary matrices and spatial clustering. Copyright © 2018 Elsevier Inc. All rights reserved.
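
    One way to read the "statistical inference" framing in the abstract above is as a log likelihood ratio between a random generating process and a regular one. The sketch below is a toy version under strong assumptions: the "regular" alternative is a heads-biased coin with an arbitrary bias of 0.9, which is not the model defined by the authors, and the function name is hypothetical.

```python
import math


def randomness_score(sequence, p_heads_regular=0.9):
    """Log2 likelihood ratio: fair coin versus an assumed biased 'regular' coin.

    Positive scores favour the fair (random) process; negative scores favour
    the biased alternative. The bias value is an illustrative assumption.
    """
    heads = sequence.count("H")
    tails = sequence.count("T")
    if heads + tails != len(sequence):
        raise ValueError("sequence must contain only 'H' and 'T'")
    log_p_random = -float(heads + tails)  # log2 of (1/2) ** n
    log_p_regular = (heads * math.log2(p_heads_regular)
                     + tails * math.log2(1.0 - p_heads_regular))
    return log_p_random - log_p_regular


all_heads = randomness_score("HHHHHHHH")
mixed = randomness_score("HTHHTHTT")
print(all_heads, mixed)
```

    Under these assumptions eight heads in a row receives a negative score (it is better explained by the biased coin), while a mixed sequence of the same length receives a positive one, in line with the intuition quoted in the abstract.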

  14. Evolutionary inference via the Poisson Indel Process.

    Science.gov (United States)

    Bouchard-Côté, Alexandre; Jordan, Michael I

    2013-01-22

    We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.

  15. Quantitative trait loci and the relevance of phased haplotypes

    DEFF Research Database (Denmark)

    Gregersen, Vivi Raundahl

    Genetic control of different production traits and diseases within livestock has been of great interest since domestication. SNPs have greatly facilitated the use of QTL studies in the search of genomic regions affecting different phenotypes. The studies have been conducted to identify regions underlying genetic control both as traditional linkage studies relying on genetic maps and as GWAS where an approach of phasing haplotypes within the QTL have been conducted to validate the regions. Overall, regions of interest have been identified for chronic pleuritis and osteochondrosis in addition to meat quality and boar taint in pigs, and for improved cheese production within cows...

  16. A haplotype specific to North European wheat (Triticum aestivum L.)

    Czech Academy of Sciences Publication Activity Database

    Tsombalova, J.; Karafiátová, Miroslava; Vrána, Jan; Kubaláková, Marie; Peusa, H.; Jakobson, I.; Jarve, M.; Valárik, Miroslav; Doležel, Jaroslav; Jarve, K.

    2017-01-01

    Vol. 64, No. 4 (2017), pp. 653-664 ISSN 0925-9864 R&D Projects: GA MŠk(CZ) LO1204; GA ČR(CZ) GA14-07164S Institutional support: RVO:61389030 Keywords: bread wheat * genetic diversity * polyploid wheat * introgression lines * molecular analysis * tetraploid wheat * hexaploid wheat * powdery mildew * spelta l. * map * Common wheat * Triticum aestivum L * Spelt * Triticum spelta L * Chromosome 4A * Zero alleles * Haplotype * Linkage disequilibrium Subject RIV: EB - Genetics ; Molecular Biology OECD field: Plant sciences, botany Impact factor: 1.294, year: 2016

  17. A Learning Algorithm for Multimodal Grammar Inference.

    Science.gov (United States)

    D'Ulizia, A; Ferri, F; Grifoni, P

    2011-12-01

    The high costs of development and maintenance of multimodal grammars in integrating and understanding input in multimodal interfaces lead to the investigation of novel algorithmic solutions in automating grammar generation and in updating processes. Many algorithms for context-free grammar inference have been developed in the natural language processing literature. An extension of these algorithms toward the inference of multimodal grammars is necessary for multimodal input processing. In this paper, we propose a novel grammar inference mechanism that allows us to learn a multimodal grammar from its positive samples of multimodal sentences. The algorithm first generates the multimodal grammar that is able to parse the positive samples of sentences and, afterward, makes use of two learning operators and the minimum description length metrics in improving the grammar description and in avoiding the over-generalization problem. The experimental results highlight the acceptable performances of the algorithm proposed in this paper since it has a very high probability of parsing valid sentences.

  18. Grammatical inference algorithms, routines and applications

    CERN Document Server

    Wieczorek, Wojciech

    2017-01-01

    This book focuses on grammatical inference, presenting classic and modern methods of grammatical inference from the perspective of practitioners. To do so, it employs the Python programming language to present all of the methods discussed. Grammatical inference is a field that lies at the intersection of multiple disciplines, with contributions from computational linguistics, pattern recognition, machine learning, computational biology, formal learning theory and many others. Though the book is largely practical, it also includes elements of learning theory, combinatorics on words, the theory of automata and formal languages, plus references to real-world problems. The listings presented here can be directly copied and pasted into other programs, thus making the book a valuable source of ready recipes for students, academic researchers, and programmers alike, as well as an inspiration for their further development.

  19. Stochastic processes inference theory

    CERN Document Server

    Rao, Malempati M

    2014-01-01

    This is the revised and enlarged 2nd edition of the authors’ original text, which was intended to be a modest complement to Grenander's fundamental memoir on stochastic processes and related inference theory. The present volume gives a substantial account of regression analysis, both for stochastic processes and measures, and includes recent material on Ridge regression with some unexpected applications, for example in econometrics. The first three chapters can be used for a quarter or semester graduate course on inference on stochastic processes. The remaining chapters provide more advanced material on stochastic analysis suitable for graduate seminars and discussions, leading to dissertation or research work. In general, the book will be of interest to researchers in probability theory, mathematical statistics and electrical and information theory.

  20. Making Type Inference Practical

    DEFF Research Database (Denmark)

    Schwartzbach, Michael Ignatieff; Oxhøj, Nicholas; Palsberg, Jens

    1992-01-01

    We present the implementation of a type inference algorithm for untyped object-oriented programs with inheritance, assignments, and late binding. The algorithm significantly improves our previous one, presented at OOPSLA'91, since it can handle collection classes, such as List, in a useful way. Also, the complexity has been dramatically improved, from exponential time to low polynomial time. The implementation uses the techniques of incremental graph construction and constraint template instantiation to avoid representing intermediate results, doing superfluous work, and recomputing type information. Experiments indicate that the implementation type checks as much as 100 lines per second. This results in a mature product, on which a number of tools can be based, for example a safety tool, an image compression tool, a code optimization tool, and an annotation tool. This may make type inference for object...

  1. Causal inference in econometrics

    CERN Document Server

    Kreinovich, Vladik; Sriboonchitta, Songsak

    2016-01-01

    This book is devoted to the analysis of causal inference which is one of the most difficult tasks in data analysis: when two phenomena are observed to be related, it is often difficult to decide whether one of them causally influences the other one, or whether these two phenomena have a common cause. This analysis is the main focus of this volume. To get a good understanding of the causal inference, it is important to have models of economic phenomena which are as accurate as possible. Because of this need, this volume also contains papers that use non-traditional economic models, such as fuzzy models and models obtained by using neural networks and data mining techniques. It also contains papers that apply different econometric models to analyze real-life economic dependencies.

  2. The ocean circulation inverse problem

    National Research Council Canada - National Science Library

    Wunsch, C

    1996-01-01

    .... This book addresses the problem of inferring the state of the ocean circulation, understanding it dynamically, and even forecasting it through a quantitative combination of theory and observation...

  3. Active inference and learning.

    Science.gov (United States)

    Friston, Karl; FitzGerald, Thomas; Rigoli, Francesco; Schwartenbeck, Philipp; O'Doherty, John; Pezzulo, Giovanni

    2016-09-01

    This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  4. Common ataxia telangiectasia mutated haplotypes and risk of breast cancer: a nested case–control study

    International Nuclear Information System (INIS)

    Tamimi, Rulla M; Hankinson, Susan E; Spiegelman, Donna; Kraft, Peter; Colditz, Graham A; Hunter, David J

    2004-01-01

    The ataxia telangiectasia mutated (ATM) gene is a tumor suppressor gene with functions in cell cycle arrest, apoptosis, and repair of DNA double-strand breaks. Based on family studies, women heterozygous for mutations in the ATM gene are reported to have a fourfold to fivefold increased risk of breast cancer compared with noncarriers of the mutations, although not all studies have confirmed this association. Haplotype analysis has been suggested as an efficient method for investigating the role of common variation in the ATM gene and breast cancer. Five biallelic haplotype tagging single nucleotide polymorphisms are estimated to capture 99% of the haplotype diversity in Caucasian populations. We conducted a nested case–control study of breast cancer within the Nurses' Health Study cohort to address the role of common ATM haplotypes and breast cancer. Cases and controls were genotyped for five haplotype tagging single nucleotide polymorphisms. Haplotypes were predicted for 1309 cases and 1761 controls for which genotype information was available. Six unique haplotypes were predicted in this study, five of which occur at a frequency of 5% or greater. The overall distribution of haplotypes was not significantly different between cases and controls (χ² = 3.43, five degrees of freedom, P = 0.63). There was no evidence that common haplotypes of ATM are associated with breast cancer risk. Extensive single nucleotide polymorphism detection using the entire genomic sequence of ATM will be necessary to rule out less common variation in ATM and sporadic breast cancer risk.
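
    The P value reported in the abstract above can be checked from the test statistic and degrees of freedom alone. The sketch below uses the standard closed-form upper-tail probability of the chi-square distribution for odd degrees of freedom, so it needs only the Python standard library; the function name is hypothetical, and this is an arithmetic check rather than the authors' analysis.

```python
import math


def chi2_sf_odd_df(x, df):
    """Upper-tail probability P(X >= x) for chi-square with odd df.

    Closed form: erfc(sqrt(x/2))
                 + sqrt(2/pi) * exp(-x/2) * sum_j x**(j - 1/2) / (2j - 1)!!
    with j running from 1 to (df - 1) / 2.
    """
    if df < 1 or df % 2 == 0:
        raise ValueError("df must be a positive odd integer")
    if x < 0:
        raise ValueError("x must be non-negative")
    tail = math.erfc(math.sqrt(x / 2.0))
    series = 0.0
    double_factorial = 1.0
    for j in range(1, (df - 1) // 2 + 1):
        double_factorial *= 2 * j - 1          # (2j - 1)!!
        series += x ** (j - 0.5) / double_factorial
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * series


p_value = chi2_sf_odd_df(3.43, 5)
print(round(p_value, 2))  # about 0.63, consistent with the reported P value
```

    A P value this large means the observed haplotype distribution in cases is well within what chance alone would produce relative to controls.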

  5. Updated listing of haplotypes at the human phenylalanine hydroxylase (PAH) locus

    Energy Technology Data Exchange (ETDEWEB)

    Eisensmith, R.C.; Woo, S.L.C. (Baylor College of Medicine, Houston, TX (United States))

    1992-12-01

    Analysis of mutant PAH chromosomes has identified approximately 60 different single-base substitutions and deletions within the PAH locus. Nearly all of these molecular lesions are in strong linkage disequilibrium with specific RFLP haplotypes in different ethnic populations. Thus, haplotype analysis is not only useful for diagnostic purposes but is proving to be a valuable tool in population genetic studies of the origin and spread of phenylketonuria alleles in human populations. PCR-based methods have been developed to detect six of the eight polymorphic restriction sites used for determination of RFLP haplotypes at the PAH locus. A table of the proposed expanded haplotypes is given.

  6. Learning Convex Inference of Marginals

    OpenAIRE

    Domke, Justin

    2012-01-01

    Graphical models trained using maximum likelihood are a common tool for probabilistic inference of marginal distributions. However, this approach suffers difficulties when either the inference process or the model is approximate. In this paper, the inference process is first defined to be the minimization of a convex function, inspired by free energy approximations. Learning is then done directly in terms of the performance of the inference process at univariate marginal prediction. The main ...

  7. The evolutionary history of the DMRT3 'Gait keeper' haplotype.

    Science.gov (United States)

    Staiger, E A; Almén, M S; Promerová, M; Brooks, S; Cothran, E G; Imsland, F; Jäderkvist Fegraeus, K; Lindgren, G; Mehrabani Yeganeh, H; Mikko, S; Vega-Pla, J L; Tozaki, T; Rubin, C J; Andersson, L

    2017-10-01

    A previous study revealed a strong association between the DMRT3:Ser301STOP mutation in horses and alternate gaits as well as performance in harness racing. Several follow-up studies have confirmed a high frequency of the mutation in gaited horse breeds and an effect on gait quality. The aim of this study was to determine when and where the mutation arose, to identify additional potential causal mutations and to determine the coalescence time for contemporary haplotypes carrying the stop mutation. We utilized sequences from 89 horses representing 26 breeds to identify 102 SNPs encompassing the DMRT3 gene that are in strong linkage disequilibrium with the stop mutation. These 102 SNPs were genotyped in an additional 382 horses representing 72 breeds, and we identified 14 unique haplotypes. The results provided conclusive evidence that DMRT3:Ser301STOP is causal, as no other sequence polymorphisms showed an equally strong association to locomotion traits. The low sequence diversity among mutant chromosomes demonstrated that they must have diverged from a common ancestral sequence within the last 10 000 years. Thus, the mutation occurred either just before domestication or more likely some time after domestication and then spread across the world as a result of selection on locomotion traits. © 2017 Stichting International Foundation for Animal Genetics.

  8. Haplotypes and Sequence Variation in the Ovine Adiponectin Gene (ADIPOQ)

    Directory of Open Access Journals (Sweden)

    Qing-Ming An

    2015-11-01

    Full Text Available The adiponectin gene (ADIPOQ) plays an important role in energy homeostasis. In this study five separate regions (regions 1 to 5) of ovine ADIPOQ were analysed using PCR-SSCP. Four different PCR-SSCP patterns (A1-D1, A2-D2) were detected in region-1 and region-2, respectively, with seven and six SNPs being revealed. In region-3, three different patterns (A3-C3) and three SNPs were observed. Two patterns (A4-B4, A5-B5) and two and one SNPs were observed in region-4 and region-5, respectively. In total, nineteen SNPs were detected, with five of them in the coding region and two (c.46T/C and c.515G/A) putatively resulting in amino acid changes (p.Tyr16His and p.Lys172Arg). In region-1, -2 and -3 of 316 sheep from eight New Zealand breeds, variants A1, A2 and A3 were the most common, although variant frequencies differed in the eight breeds. Across region-1 and region-3, nine haplotypes were identified and haplotypes A1-A3, A1-C3, B1-A3 and B1-C3 were most common. These results indicate that the ADIPOQ gene is polymorphic and suggest that further analysis is required to see if the variation in the gene is associated with animal production traits.

  9. Quantum-Like Representation of Non-Bayesian Inference

    Science.gov (United States)

    Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.

    2013-01-01

    This research is related to the problem of "irrational decision making or inference" that has been discussed in cognitive psychology. There are some experimental studies, and their statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to classical Bayesian inference. For this problem, a number of quantum-like cognitive models of decision making were proposed. Our previous work represented classical Bayesian inference in a natural way in the framework of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like quantum interference. Further, we describe the "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.

  10. Probabilistic inductive inference: a survey

    OpenAIRE

    Ambainis, Andris

    2001-01-01

    Inductive inference is a recursion-theoretic theory of learning, first developed by E. M. Gold (1967). This paper surveys developments in probabilistic inductive inference. We mainly focus on finite inference of recursive functions, since this simple paradigm has produced the most interesting (and most complex) results.

  11. A mixed integer linear programming model to reconstruct phylogenies from single nucleotide polymorphism haplotypes under the maximum parsimony criterion

    Science.gov (United States)

    2013-01-01

    Background Phylogeny estimation from aligned haplotype sequences has attracted more and more attention in recent years due to its importance in the analysis of many fine-scale genetic data. Its application fields range from medical research, to drug discovery, to epidemiology, to population dynamics. The literature on molecular phylogenetics proposes a number of criteria for selecting a phylogeny from among plausible alternatives. Usually, such criteria can be expressed by means of objective functions, and the phylogenies that optimize them are referred to as optimal. One of the most important estimation criteria is parsimony, which states that the optimal phylogeny T∗ for a set H of n haplotype sequences over a common set of variable loci is the one that satisfies the following requirements: (i) it has the shortest length and (ii) it is such that, for each pair of distinct haplotypes h_i, h_j ∈ H, the sum of the edge weights belonging to the path from h_i to h_j in T∗ is not smaller than the observed number of changes between h_i and h_j. Finding the most parsimonious phylogeny for H involves solving an optimization problem, called the Most Parsimonious Phylogeny Estimation Problem (MPPEP), which is NP-hard in many of its versions. Results In this article we investigate a recent version of the MPPEP that arises when input data consist of single nucleotide polymorphism haplotypes extracted from a population of individuals on a common genomic region. Specifically, we explore the prospects for improving on the implicit enumeration strategy used in previous work using a novel problem formulation and a series of strengthening valid inequalities and preliminary symmetry breaking constraints to more precisely bound the solution space and accelerate implicit enumeration of possible optimal phylogenies. We present the basic formulation and then introduce a series of provable valid constraints to reduce the solution space. We then prove

  12. A Near-Complete Haplotype-Phased Genome of the Dikaryotic Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici Reveals High Interhaplotype Diversity

    Directory of Open Access Journals (Sweden)

    Benjamin Schwessinger

    2018-02-01

    Full Text Available A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and identify virulence genes involved in plant infection referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales. In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies.

  13. Haplotypes in CCR5-CCR2, CCL3 and CCL5 are associated with natural resistance to HIV-1 infection in a Colombian cohort.

    Science.gov (United States)

    Vega, Jorge A; Villegas-Ospina, Simón; Aguilar-Jiménez, Wbeimar; Rugeles, María T; Bedoya, Gabriel; Zapata, Wildeman

    2017-06-01

    Variants in genes encoding HIV-1 co-receptors and their natural ligands have been individually associated with natural resistance to HIV-1 infection. However, the simultaneous presence of these variants has been poorly studied. The aim was to evaluate the association of single and multilocus haplotypes in genes coding for the viral co-receptors CCR5 and CCR2, and their ligands CCL3 and CCL5, with resistance or susceptibility to HIV-1 infection. Nine variants in CCR5-CCR2, two SNPs in CCL3 and two in CCL5 were genotyped by PCR-RFLP in 35 seropositive (cases) and 49 HIV-1-exposed seronegative Colombian individuals (controls). Haplotypes were inferred using the Arlequin software, and their frequency in individual or combined loci was compared between cases and controls by the chi-square test. A p' value <0.05 after Bonferroni correction was considered significant. Homozygosis of the human haplogroup (HH) E was absent in controls and frequent in cases, showing a tendency to susceptibility. The haplotypes C-C and T-T in CCL3 were associated with susceptibility (p'=0.016) and resistance (p'<0.0001) to HIV-1 infection, respectively. Finally, in multilocus analysis, the haplotype combinations formed by HHC in CCR5-CCR2, T-T in CCL3 and G-C in CCL5 were associated with resistance (p'=0.006). Our results suggest that specific combinations of variants in genes from the same signaling pathway can define an HIV-1 resistant phenotype. Despite our small sample size, our statistically significant associations suggest strong effects; however, these results should be further validated in larger cohorts.

  14. Chronic inflammatory state in sickle cell anemia patients is associated with HBB(*)S haplotype.

    Science.gov (United States)

    Bandeira, Izabel C J; Rocha, Lillianne B S; Barbosa, Maritza C; Elias, Darcielle B D; Querioz, José A N; Freitas, Max Vitor Carioca; Gonçalves, Romélia P

    2014-02-01

    The chronic inflammatory state in sickle cell anemia (SCA) is associated with several factors such as the following: endothelial damage; increased production of reactive oxygen species; hemolysis; increased expression of adhesion molecules by leukocytes, erythrocytes, and platelets; and increased production of proinflammatory cytokines. Genetic characteristics affecting the clinical severity of SCA include variations in the hemoglobin F (HbF) level, coexistence of alpha-thalassemia, and the haplotype associated with the HbS gene. The different haplotypes of SCA are Bantu, Benin, Senegal, Cameroon, and Arab-Indian. These haplotypes are associated with ethnic groups and are also based on geographical origin. Studies have shown that the Bantu haplotype is associated with a higher incidence of clinical complications than the other haplotypes and is therefore considered to have the worst prognosis. This study aimed to evaluate the profile of the proinflammatory cytokines interleukin-6, tumor necrosis factor-α, and interleukin-17 in patients with SCA and also to assess the haplotypes associated with beta globin cluster S (HBB(*)S). We analyzed a total of 62 patients who had SCA and had been treated with hydroxyurea; they had received a dose ranging between 15 and 25 (20.0 ± 0.6) mg/kg/day for 6-60 (18 ± 3.4) months; their data were compared with those for 30 normal individuals. The presence of HbS was detected and the haplotypes of the beta S gene cluster were analyzed by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). Our study demonstrated that SCA patients have an increased inflammatory profile when compared to the healthy individuals. Further, analysis of the association between the haplotypes and inflammatory profile showed that the levels of IL-6 and TNF-α were greater in subjects with the Bantu/Bantu haplotype than in subjects with the Benin/Benin haplotype. The Bantu/Benin haplotype individuals had lower levels of cytokines than those with

  15. A haplotype regression approach for genetic evaluation using sequences from the 1000 bull genomes Project

    International Nuclear Information System (INIS)

    Lakhssassi, K.; González-Recio, O.

    2017-01-01

    Haplotypes from sequencing data may improve the prediction accuracy in genomic evaluations as haplotypes are in stronger linkage disequilibrium with quantitative trait loci than markers from SNP chips. This study focuses first, on the creation of haplotypes in a population sample of 450 Holstein animals, with full-sequence data from the 1000 bull genomes project; and second, on incorporating them into the whole genome prediction model. In total, 38,319,258 SNPs (and indels) from Next Generation Sequencing were included in the analysis. After filtering variants by minor allele frequency (MAF < 0.025), 13,912,326 SNPs were available for the haplotype extraction with findhap.f90. The number of SNPs in the haploblocks was on average 924 SNPs (166,552 bp). Unique haplotypes were around 97% in all chromosomes and were ignored, leaving 153,428 haplotypes. Estimated haplotypes had a large contribution to the total variance of genomic estimated breeding values for kilogram of protein, Global Type Index, Somatic Cell Score and Days Open (between 32 and 99.9%). Haploblocks containing haplotypes with large effects were selected by filtering, for each trait, haplotypes whose effect was larger/lower than the mean plus/minus 3 times the standard deviation (SD) and 1 SD above the mean of the haplotypes effect distribution. Results showed that filtering by 3 SD would not be enough to capture a large proportion of genetic variance, whereas filtering by 1 SD could be useful but model convergence should be considered. Additionally, sequence haplotypes were able to capture additional genetic variance to the polygenic effect for traits undergoing lower selection intensity like fertility and health traits.
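
    The SD-based filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name select_large_effects and the effect values are hypothetical, the sample standard deviation is assumed, and a two-sided threshold is applied for both cut-offs even though the abstract describes the 1 SD filter as one-sided.

```python
import statistics


def select_large_effects(effects, k):
    """Return indices of effects further than k sample SDs from the mean.

    Hypothetical helper: the abstract does not specify the exact rule
    beyond "mean plus/minus 3 SD" and "1 SD above the mean".
    """
    mean = statistics.fmean(effects)
    sd = statistics.stdev(effects)
    return [i for i, e in enumerate(effects) if abs(e - mean) > k * sd]


# Made-up haplotype effects for one trait; index 5 is a clear outlier.
effects = [0.1, -0.2, 0.05, 0.3, -0.1, 2.0, -0.15, 0.0, 0.2, -0.25]

strict = select_large_effects(effects, 3)   # 3 SD cut-off
lenient = select_large_effects(effects, 1)  # 1 SD cut-off
print(strict, lenient)
```

    With these made-up values the 3 SD cut-off keeps nothing while the 1 SD cut-off keeps the single outlier, which mirrors in miniature why a stricter threshold may capture too little variance.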

  16. Genome-wide haplotype analysis of cis expression quantitative trait loci in monocytes.

    Directory of Open Access Journals (Sweden)

    Sophie Garnier

    Full Text Available In order to assess whether gene expression variability could be influenced by several SNPs acting in cis, either through additive or more complex haplotype effects, a systematic genome-wide search for cis haplotype expression quantitative trait loci (eQTL) was conducted in a sample of 758 individuals, part of the Cardiogenics Transcriptomic Study, for which genome-wide monocyte expression and GWAS data were available. 19,805 RNA probes were assessed for cis haplotypic regulation through investigation of ~2.1 × 10⁹ haplotypic combinations. 2,650 probes demonstrated haplotypic p-values >10⁴-fold smaller than the best single SNP p-value. Replication of significant haplotype effects was tested for 412 probes for which SNPs (or proxies) that defined the detected haplotypes were available in the Gutenberg Health Study composed of 1,374 individuals. At the Bonferroni correction level of 1.2 × 10⁻⁴ (~0.05/412), 193 haplotypic signals replicated. 1000 G imputation was then conducted, and 105 haplotypic signals still remained more informative than imputed SNPs. In-depth analysis of these 105 cis eQTL revealed that at 76 loci genetic associations were compatible with additive effects of several SNPs, while for the 29 remaining regions data could be compatible with a more complex haplotypic pattern. As 24 of the 105 cis eQTL have previously been reported to be disease-associated loci, this work highlights the need for conducting haplotype-based and 1000 G imputed cis eQTL analysis before commencing functional studies at disease-associated loci.
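
    The Bonferroni level quoted above follows from dividing the family-wise error rate by the number of tests. A minimal sketch of that arithmetic is given below; the helper name replicates and the example p-values passed to it are hypothetical, and only the figures 0.05 and 412 come from the abstract.

```python
# Figures from the abstract: family-wise alpha of 0.05 over 412 tested probes.
ALPHA = 0.05
N_TESTS = 412

threshold = ALPHA / N_TESTS
print(f"Bonferroni threshold: {threshold:.1e}")


def replicates(p_value: float) -> bool:
    """Hypothetical helper: True if a replication p-value passes the threshold."""
    return p_value < threshold
```

    For instance, replicates(5e-5) would be true and replicates(1e-3) false under this threshold.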

  17. A haplotype regression approach for genetic evaluation using sequences from the 1000 bull genomes Project

    Energy Technology Data Exchange (ETDEWEB)

    Lakhssassi, K.; González-Recio, O.

    2017-07-01

    Haplotypes from sequencing data may improve the prediction accuracy in genomic evaluations as haplotypes are in stronger linkage disequilibrium with quantitative trait loci than markers from SNP chips. This study focuses first, on the creation of haplotypes in a population sample of 450 Holstein animals, with full-sequence data from the 1000 bull genomes project; and second, on incorporating them into the whole genome prediction model. In total, 38,319,258 SNPs (and indels) from Next Generation Sequencing were included in the analysis. After filtering variants by minor allele frequency (MAF < 0.025), 13,912,326 SNPs were available for the haplotype extraction with findhap.f90. The number of SNPs in the haploblocks was on average 924 SNPs (166,552 bp). Unique haplotypes were around 97% in all chromosomes and were ignored, leaving 153,428 haplotypes. Estimated haplotypes had a large contribution to the total variance of genomic estimated breeding values for kilogram of protein, Global Type Index, Somatic Cell Score and Days Open (between 32 and 99.9%). Haploblocks containing haplotypes with large effects were selected by filtering, for each trait, haplotypes whose effect was larger/lower than the mean plus/minus 3 times the standard deviation (SD) and 1 SD above the mean of the haplotypes effect distribution. Results showed that filtering by 3 SD would not be enough to capture a large proportion of genetic variance, whereas filtering by 1 SD could be useful but model convergence should be considered. Additionally, sequence haplotypes were able to capture additional genetic variance to the polygenic effect for traits undergoing lower selection intensity like fertility and health traits.

  18. Genetic relationships among native americans based on beta-globin gene cluster haplotype frequencies

    Directory of Open Access Journals (Sweden)

    Rita de Cassia Mousinho-Ribeiro

    2003-01-01

    Full Text Available The distribution of β-globin gene haplotypes was studied in 209 Amerindians from eight tribes of the Brazilian Amazon: Asurini from Xingú, Awá-Guajá, Parakanã, Urubú-Kaapór, Zoé, Kayapó (Xikrin from the Bacajá village), Katuena, and Tiriyó. Nine different haplotypes were found, two of which (n. 11 and 13) had not been previously identified in Brazilian indigenous populations. Haplotype 2 (+ - - - -) was the most common in all groups studied, with frequencies varying from 70% to 100%, followed by haplotype 6 (- + + - +), with frequencies between 7% and 18%. The frequency distribution of the β-globin gene haplotypes in the eighteen Brazilian Amerindian populations studied to date is characterized by a reduced number of haplotypes (average of 3.5) and low levels of heterozygosity and intrapopulational differentiation, with a single clearly predominant haplotype in most tribes (haplotype 2). The Parakanã, Urubú-Kaapór, Tiriyó and Xavante tribes constitute exceptions, presenting at least four haplotypes with relatively high frequencies. The closest genetic relationships were observed between the Brazilian and the Colombian Amerindians (Wayuu, Kamsa and Inga), and, to a lesser extent, with the Huichol of Mexico. North-American Amerindians are more differentiated and clearly separated from all other tribes, except the Xavante, from Brazil, and the Mapuche, from Argentina. A restricted pool of ancestral haplotypes may explain the low diversity observed among most present-day Brazilian and Colombian Amerindian groups, while interethnic admixture could be the most important factor to explain the high number of haplotypes and high levels of diversity observed in some South-American and most North-American tribes.

  19. Estimating uncertainty of inference for validation

    Energy Technology Data Exchange (ETDEWEB)

    Booker, Jane M [Los Alamos National Laboratory; Langenbrunner, James R [Los Alamos National Laboratory; Hemez, Francois M [Los Alamos National Laboratory; Ross, Timothy J [UNM

    2010-09-30

    We present a validation process based upon the concept that validation is an inference-making activity. This has always been true, but the association has not been as important before as it is now. Previously, theory had been confirmed by more data, and predictions were possible based on data. The process today is to infer from theory to code and from code to prediction, making the role of prediction somewhat automatic, and a machine function. Validation is defined as determining the degree to which a model and code is an accurate representation of experimental test data. Imbedded in validation is the intention to use the computer code to predict. To predict is to accept the conclusion that an observable final state will manifest; therefore, prediction is an inference whose goodness relies on the validity of the code. Quantifying the uncertainty of a prediction amounts to quantifying the uncertainty of validation, and this involves the characterization of uncertainties inherent in theory/models/codes and the corresponding data. An introduction to inference making and its associated uncertainty is provided as a foundation for the validation problem. A mathematical construction for estimating the uncertainty in the validation inference is then presented, including a possibility distribution constructed to represent the inference uncertainty for validation under uncertainty. The estimation of inference uncertainty for validation is illustrated using data and calculations from Inertial Confinement Fusion (ICF). The ICF measurements of neutron yield and ion temperature were obtained for direct-drive inertial fusion capsules at the Omega laser facility. The glass capsules, containing the fusion gas, were systematically selected with the intent of establishing a reproducible baseline of high-yield 10¹³-10¹⁴ neutron output. The deuterium-tritium ratio in these experiments was varied to study its influence upon yield. This paper on validation inference is the

  20. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2010-01-01

    Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference. -Eugenia Stoimenova, Journal of Applied Statistics, June 2012 … one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students. -Biometrics, 67, September 2011 This excellently presente

  1. Emotional inferences by pragmatics

    OpenAIRE

    Iza-Miqueleiz, Mauricio

    2017-01-01

    It has long been taken for granted that, in the course of reading a text, world knowledge is often required in order to establish coherent links between sentences (McKoon & Ratcliff 1992, Iza & Ezquerro 2000). The content grasped from a text turns out to be strongly dependent upon the reader’s additional knowledge that allows a coherent interpretation of the text as a whole. The world knowledge directing the inference may be of distinctive nature. Gygax et al. (2007) showed that m...

  2. Generic patch inference

    DEFF Research Database (Denmark)

    Andersen, Jesper; Lawall, Julia

    2010-01-01

    A key issue in maintaining Linux device drivers is the need to keep them up to date with respect to evolutions in Linux internal libraries. Currently, there is little tool support for performing and documenting such changes. In this paper we present a tool, spdiff, that identifies common changes...... developers can use it to extract an abstract representation of the set of changes that others have made. Our experiments on recent changes in Linux show that the inferred generic patches are more concise than the corresponding patches found in commits to the Linux source tree while being safe with respect...

  3. Musical aptitude is associated with AVPR1A-haplotypes.

    Directory of Open Access Journals (Sweden)

    Liisa T Ukkola

    Full Text Available Artistic creativity forms the basis of music culture and music industry. Composing, improvising and arranging music are complex creative functions of the human brain, whose biological value remains unknown. We hypothesized that practicing music is social communication that needs musical aptitude and even creativity in music. In order to understand the neurobiological basis of music in human evolution and communication we analyzed polymorphisms of the arginine vasopressin receptor 1A (AVPR1A), serotonin transporter (SLC6A4), catechol-O-methyltransferase (COMT), dopamine receptor D2 (DRD2) and tyrosine hydroxylase 1 (TPH1), genes associated with social bonding and cognitive functions, in 19 Finnish families (n = 343 members) with professional musicians and/or active amateurs. All family members were tested for musical aptitude using the auditory structuring ability test (Karma Music test; KMT) and Carl Seashore's tests for pitch (SP) and for time (ST). Data on creativity in music (composing, improvising and/or arranging music) were surveyed using a web-based questionnaire. Here we show for the first time that creative functions in music have a strong genetic component (h² = .84; composing h² = .40; arranging h² = .46; improvising h² = .62) in Finnish multigenerational families. We also show that high music test scores are significantly associated with creative functions in music (p < .0001). We discovered an overall haplotype association with AVPR1A gene (markers RS1 and RS3) and KMT (p = 0.0008; corrected p = 0.00002), SP (p = 0.0261; corrected p = 0.0072) and combined music test scores (COMB) (p = 0.0056; corrected p = 0.0006). AVPR1A haplotype AVR+RS1 further suggested a positive association with ST (p = 0.0038; corrected p = 0.00184) and COMB (p = 0.0083; corrected p = 0.0040) using haplotype-based association test HBAT. The results suggest that the neurobiology of music perception and production is likely to be related to the pathways affecting intrinsic attachment

  4. Parametric inference for biological sequence analysis.

    Science.gov (United States)

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
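
    The sum-product idea mentioned above can be illustrated on a toy hidden Markov model. The sketch below is an assumption-laden example, not code from the article: the two-state model, its probabilities and all function names are made up, and the polytope propagation algorithm itself is not implemented. It shows only that the sum-product (forward) recursion gives the same likelihood as summing over every hidden path explicitly.

```python
from itertools import product

# Hypothetical two-state HMM; all numbers are illustrative only.
STATES = ("A", "B")
START = {"A": 0.6, "B": 0.4}
TRANS = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
EMIT = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}


def forward_likelihood(obs):
    """Sum-product (forward) recursion: P(obs), summed over hidden paths."""
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    for symbol in obs[1:]:
        alpha = {
            s: EMIT[s][symbol] * sum(alpha[r] * TRANS[r][s] for r in STATES)
            for s in STATES
        }
    return sum(alpha.values())


def brute_force_likelihood(obs):
    """Same quantity by explicitly enumerating every hidden state path."""
    total = 0.0
    for path in product(STATES, repeat=len(obs)):
        p = START[path[0]] * EMIT[path[0]][obs[0]]
        for prev, cur, symbol in zip(path, path[1:], obs[1:]):
            p *= TRANS[prev][cur] * EMIT[cur][symbol]
        total += p
    return total


obs = ("x", "y", "x")
fast = forward_likelihood(obs)
slow = brute_force_likelihood(obs)
print(fast, slow)
```

    The recursion needs work linear in the sequence length, whereas the enumeration grows exponentially with it, which is the practical reason a single sum-product routine can serve many of the models listed in the abstract.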

  5. Effects of the number of markers per haplotype and clustering of haplotypes on the accuracy of QTL mapping and prediction of genomic breeding values

    Directory of Open Access Journals (Sweden)

    Schrooten Chris

    2009-01-01

    Full Text Available Abstract The aim of this paper was to compare the effect of haplotype definition on the precision of QTL-mapping and on the accuracy of predicted genomic breeding values. In a multiple QTL model using identity-by-descent (IBD) probabilities between haplotypes, various haplotype definitions were tested, i.e. including 2, 6, 12 or 20 marker alleles and clustering base haplotypes related with an IBD probability of > 0.55, 0.75 or 0.95. Simulated data contained 1100 animals with known genotypes and phenotypes and 1000 animals with known genotypes and unknown phenotypes. Genomes comprising 3 Morgan were simulated and contained 74 polymorphic QTL and 383 polymorphic SNP markers with an average r² value of 0.14 between adjacent markers. The total number of haplotypes decreased up to 50% when the window size was increased from two to 20 markers and decreased by at least 50% when haplotypes related with an IBD probability of > 0.55 instead of > 0.95 were clustered. An intermediate window size led to more precise QTL mapping. Window size and clustering had a limited effect on the accuracy of predicted total breeding values, ranging from 0.79 to 0.81. Our conclusion is that different optimal window sizes should be used in QTL-mapping versus genome-wide breeding value prediction.

  6. Discovery, evaluation and distribution of haplotypes of the wheat Ppd-D1 gene.

    Science.gov (United States)

    Guo, Zhiai; Song, Yanxia; Zhou, Ronghua; Ren, Zhenglong; Jia, Jizeng

    2010-02-01

    Ppd-D1 is one of the most potent genes affecting the photoperiod response of wheat (Triticum aestivum). Only two alleles, insensitive Ppd-D1a and sensitive Ppd-D1b, were known previously, and these did not adequately explain the broad adaptation of wheat to photoperiod variation. In this study, five diagnostic molecular markers were employed to identify Ppd-D1 haplotypes in 492 wheat varieties from diverse geographic locations and 55 accessions of Aegilops tauschii, the D genome donor species of wheat. Six Ppd-D1 haplotypes, designated I-VI, were identified. Types II, V and VI were considered to be more ancient and types I, III and IV were considered to be derived from type II. The transcript abundances of the Ppd-D1 haplotypes showed continuous variation, being highest for haplotype I, lowest for haplotype III, and correlating negatively with varietal differences in heading time. These haplotypes also significantly affected other agronomic traits. The distribution frequency of Ppd-D1 haplotypes showed partial correlations with both latitudes and altitudes of wheat cultivation regions. The evolution, expression and distribution of Ppd-D1 haplotypes were consistent evidentially with each other. What was regarded as a pair of alleles in the past can now be considered a series of alleles leading to continuous variation.

  7. The putative oncogene Pim-1 in the mouse: its linkage and variation among t haplotypes.

    Science.gov (United States)

    Nadeau, J H; Phillips, S J

    1987-11-01

    Pim-1, a putative oncogene involved in T-cell lymphomagenesis, was mapped between the pseudo-alpha globin gene Hba-4ps and the alpha-crystallin gene Crya-1 on mouse chromosome 17 and therefore within the t complex. Pim-1 restriction fragment variants were identified among t haplotypes. Analysis of restriction fragment sizes obtained with 12 endonucleases demonstrated that the Pim-1 genes in some t haplotypes were indistinguishable from the sizes for the Pim-1b allele in BALB/c inbred mice. There are now three genes, Pim-1, Crya-1 and H-2 I-E, that vary among independently derived t haplotypes and that have indistinguishable alleles in t haplotypes and inbred strains. These genes are closely linked within the distal inversion of the t complex. Because it is unlikely that these variants arose independently in t haplotypes and their wild-type homologues, we propose that an exchange of chromosomal segments, probably through double crossingover, was responsible for indistinguishable Pim-1 genes shared by certain t haplotypes and their wild-type homologues. There was, however, no apparent association between variant alleles of these three genes among t haplotypes as would be expected if a single exchange introduced these alleles into t haplotypes. If these variant alleles can be shown to be identical to the wild-type allele, then lack of association suggests that multiple exchanges have occurred during the evolution of the t complex.

  8. Analysis of Multiallelic CNVs by Emulsion Haplotype Fusion PCR.

    Science.gov (United States)

    Tyson, Jess; Armour, John A L

    2017-01-01

    Emulsion-fusion PCR recovers long-range sequence information by combining products in cis from individual genomic DNA molecules. Emulsion droplets act as very numerous small reaction chambers in which different PCR products from a single genomic DNA molecule are condensed into short joint products, to unite sequences in cis from widely separated genomic sites. These products can therefore provide information about the arrangement of sequences and variants at a larger scale than established long-read sequencing methods. The method has been useful in defining the phase of variants in haplotypes, the typing of inversions, and determining the configuration of sequence variants in multiallelic CNVs. In this description we outline the rationale for the application of emulsion-fusion PCR methods to the analysis of multiallelic CNVs, and give practical details for our own implementation of the method in that context.

  9. Reinforcement learning or active inference?

    Science.gov (United States)

    Friston, Karl J; Daunizeau, Jean; Kiebel, Stefan J

    2009-07-29

    This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.

  10. Reinforcement learning or active inference?

    Directory of Open Access Journals (Sweden)

    Karl J Friston

    2009-07-01

    Full Text Available This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.

  11. Constrained bayesian inference of project performance models

    OpenAIRE

    Sunmola, Funlade

    2013-01-01

    Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability such as indications of possible delivery problems. Approaches for monitoring project performance rely on available project information including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...

  12. Musical Aptitude Is Associated with AVPR1A-Haplotypes

    Science.gov (United States)

    Ukkola, Liisa T.; Onkamo, Päivi; Raijas, Pirre; Karma, Kai; Järvelä, Irma

    2009-01-01

    Artistic creativity forms the basis of music culture and music industry. Composing, improvising and arranging music are complex creative functions of the human brain, whose biological value remains unknown. We hypothesized that practicing music is social communication that needs musical aptitude and even creativity in music. In order to understand the neurobiological basis of music in human evolution and communication we analyzed polymorphisms of the arginine vasopressin receptor 1A (AVPR1A), serotonin transporter (SLC6A4), catechol-O-methyltransferase (COMT), dopamine receptor D2 (DRD2) and tryptophan hydroxylase 1 (TPH1), genes associated with social bonding and cognitive functions in 19 Finnish families (n = 343 members) with professional musicians and/or active amateurs. All family members were tested for musical aptitude using the auditory structuring ability test (Karma Music test; KMT) and Carl Seashore's tests for pitch (SP) and for time (ST). Data on creativity in music (composing, improvising and/or arranging music) was surveyed using a web-based questionnaire. Here we show for the first time that creative functions in music have a strong genetic component (h² = .84; composing h² = .40; arranging h² = .46; improvising h² = .62) in Finnish multigenerational families. We also show that high music test scores are significantly associated with creative functions in music (pmusic test scores (COMB) (p = 0.0056; corrected p = 0.0006). AVPR1A haplotype AVR+RS1 further suggested a positive association with ST (p = 0.0038; corrected p = 0.00184) and COMB (p = 0.0083; corrected p = 0.0040) using haplotype-based association test HBAT. The results suggest that the neurobiology of music perception and production is likely to be related to the pathways affecting intrinsic attachment behavior. PMID:19461995

  13. Musical aptitude is associated with AVPR1A-haplotypes.

    Science.gov (United States)

    Ukkola, Liisa T; Onkamo, Päivi; Raijas, Pirre; Karma, Kai; Järvelä, Irma

    2009-05-20

    Artistic creativity forms the basis of music culture and music industry. Composing, improvising and arranging music are complex creative functions of the human brain, whose biological value remains unknown. We hypothesized that practicing music is social communication that needs musical aptitude and even creativity in music. In order to understand the neurobiological basis of music in human evolution and communication we analyzed polymorphisms of the arginine vasopressin receptor 1A (AVPR1A), serotonin transporter (SLC6A4), catechol-O-methyltransferase (COMT), dopamine receptor D2 (DRD2) and tryptophan hydroxylase 1 (TPH1), genes associated with social bonding and cognitive functions in 19 Finnish families (n = 343 members) with professional musicians and/or active amateurs. All family members were tested for musical aptitude using the auditory structuring ability test (Karma Music test; KMT) and Carl Seashore's tests for pitch (SP) and for time (ST). Data on creativity in music (composing, improvising and/or arranging music) was surveyed using a web-based questionnaire. Here we show for the first time that creative functions in music have a strong genetic component (h² = .84; composing h² = .40; arranging h² = .46; improvising h² = .62) in Finnish multigenerational families. We also show that high music test scores are significantly associated with creative functions in music (pmusic test scores (COMB) (p = 0.0056; corrected p = 0.0006). AVPR1A haplotype AVR+RS1 further suggested a positive association with ST (p = 0.0038; corrected p = 0.00184) and COMB (p = 0.0083; corrected p = 0.0040) using haplotype-based association test HBAT. The results suggest that the neurobiology of music perception and production is likely to be related to the pathways affecting intrinsic attachment behavior.

  14. Organelle DNA haplotypes reflect crop-use characteristics and geographic origins of Cannabis sativa.

    Science.gov (United States)

    Gilmore, Simon; Peakall, Rod; Robertson, James

    2007-10-25

    Comparative sequencing of cannabis individuals across 12 chloroplast and mitochondrial DNA loci revealed 7 polymorphic sites, including 5 length variable regions and 2 single nucleotide polymorphisms. Simple PCR assays were developed to assay these polymorphisms, and organelle DNA haplotypes were obtained for 188 cannabis individuals from 76 separate populations, including drug-type, fibre-type and wild populations. The haplotype data were analysed using parsimony, UPGMA and neighbour joining methods. Three haplotype groups were recovered by each analysis method, and these groups are suggestive of the crop-use characteristics and geographical origin of the populations, although not strictly diagnostic. We discuss the relationship between our haplotype data and taxonomic opinions of cannabis, and the implications of organelle DNA haplotyping to forensic investigations of cannabis.

  15. Polymorphism at Expressed DQ and DR Loci in Five Common Equine MHC Haplotypes

    Science.gov (United States)

    Miller, Donald; Tallmadge, Rebecca L.; Binns, Matthew; Zhu, Baoli; Mohamoud, Yasmin Ali; Ahmed, Ayeda; Brooks, Samantha A.; Antczak, Douglas F.

    2016-01-01

    The polymorphism of Major Histocompatibility Complex (MHC) class II DQ and DR genes in five common Equine Leukocyte Antigen (ELA) haplotypes was determined through sequencing of mRNA transcripts isolated from lymphocytes of eight ELA homozygous horses. Ten expressed MHC class II genes were detected in horses of the ELA-A3 haplotype carried by the donor horses of the equine Bacterial Artificial Chromosome (BAC) library and the reference genome sequence: four DR genes and six DQ genes. The other four ELA haplotypes contained at least eight expressed polymorphic MHC class II loci. Next Generation Sequencing (NGS) of genomic DNA of these four MHC haplotypes revealed stop codons in the DQA3 gene in the ELA-A2, ELA-A5, and ELA-A9 haplotypes. Few NGS reads were obtained for the other MHC class II genes that were not amplified in these horses. The amino acid sequences across haplotypes contained locus-specific residues, and the locus clusters produced by phylogenetic analysis were well supported. The MHC class II alleles within the five tested haplotypes were largely non-overlapping between haplotypes. The complement of equine MHC class II DQ and DR genes appears to be well conserved between haplotypes, in contrast to the recently described variation in class I gene loci between equine MHC haplotypes. The identification of allelic series of equine MHC class II loci will aid comparative studies of mammalian MHC conservation and evolution and may also help to interpret associations between the equine MHC class II region and diseases of the horse. PMID:27889800

  16. MGMT DNA repair gene promoter/enhancer haplotypes alter transcription factor binding and gene expression.

    Science.gov (United States)

    Xu, Meixiang; Cross, Courtney E; Speidel, Jordan T; Abdel-Rahman, Sherif Z

    2016-10-01

    The O6-methylguanine-DNA methyltransferase (MGMT) protein removes O6-alkyl-guanine adducts from DNA. MGMT expression can thus alter the sensitivity of cells and tissues to environmental and chemotherapeutic alkylating agents. Previously, we defined the haplotype structure encompassing single nucleotide polymorphisms (SNPs) in the MGMT promoter/enhancer (P/E) region and found that haplotypes, rather than individual SNPs, alter MGMT promoter activity. The exact mechanism(s) by which these haplotypes exert their effect on MGMT promoter activity is currently unknown, but we noted that many of the SNPs comprising the MGMT P/E haplotypes are located within or in close proximity to putative transcription factor binding sites. Thus, these haplotypes could potentially affect transcription factor binding and, subsequently, alter MGMT promoter activity. In this study, we test the hypothesis that MGMT P/E haplotypes affect MGMT promoter activity by altering transcription factor (TF) binding to the P/E region. We used a promoter binding TF profiling array and a reporter assay to evaluate the effect of different P/E haplotypes on TF binding and MGMT expression, respectively. Our data revealed a significant difference in TF binding profiles between the different haplotypes evaluated. We identified TFs that consistently showed significant haplotype-dependent binding alterations (p ≤ 0.01) and revealed their role in regulating MGMT expression using siRNAs and a dual-luciferase reporter assay system. The data generated support our hypothesis that promoter haplotypes alter the binding of TFs to the MGMT P/E and, subsequently, affect their regulatory function on MGMT promoter activity and expression level.

  17. Plasmodium falciparum isolates from Angola show the StctVMNT haplotype in the pfcrt gene

    Science.gov (United States)

    2010-01-01

    Background Effective treatment remains a mainstay of malaria control, but it is unfortunately strongly compromised by drug resistance, particularly in Plasmodium falciparum, the most important human malaria parasite. Although P. falciparum chemoresistance is well recognized all over the world, limited data are available on the distribution and prevalence of pfcrt and pfmdr1 haplotypes that mediate resistance to commonly used drugs and that show distinct geographic differences. Methods Plasmodium falciparum-infected blood samples collected in 2007 at four municipalities of Luanda, Angola, were genotyped using PCR and direct DNA sequencing. Single nucleotide polymorphisms in the P. falciparum pfcrt and pfmdr1 genes were assessed and haplotype prevalences were determined. Results and Discussion The most prevalent pfcrt haplotype was StctVMNT (representing amino acids at codons 72-76). This result was unexpected, since the StctVMNT haplotype has previously been seen mainly in parasites from South America and India. The CVIET, CVMNT and CVINT drug-resistance haplotypes were also found, and one previously undescribed haplotype (CVMDT) was detected. Regarding pfmdr1, the most prevalent haplotype was YEYSNVD (representing amino acids at codons 86, 130, 184, 1034, 1042, 1109 and 1246). Wild-type haplotypes for pfcrt and pfmdr1 were uncommon; 3% of field isolates harbored wild-type pfcrt (CVMNK), whereas 21% had wild-type pfmdr1 (NEYSNVD). The observed predominance of the StctVMNT haplotype in Angola could be a result of frequent travel of citizens between Brazil and Angola in the context of selective pressure of heavy CQ use. Conclusions The high prevalence of the pfcrt SVMNT haplotype and the pfmdr1 86Y mutation confirms high-level chloroquine resistance and might suggest reduced efficacy of amodiaquine in Angola. Further studies must be encouraged to examine the in vitro sensitivity of pfcrt SVMNT parasites to artesunate and amodiaquine to obtain more conclusive data. PMID:20565881

  18. Genetic variations and haplotype diversity of the UGT1 gene cluster in the Chinese population.

    Directory of Open Access Journals (Sweden)

    Jing Yang

    Full Text Available Vertebrates require tremendous molecular diversity to defend against numerous small hydrophobic chemicals. UDP-glucuronosyltransferases (UGTs) are a large family of detoxification enzymes that glucuronidate xenobiotics and endobiotics, facilitating their excretion from the body. The UGT1 gene cluster contains a tandem array of variable first exons, each preceded by a specific promoter, and a common set of downstream constant exons, similar to the genomic organization of the protocadherin (Pcdh), immunoglobulin, and T-cell receptor gene clusters. To assist pharmacogenomics studies in Chinese, we sequenced nine first exons, promoter and intronic regions, and five common exons of the UGT1 gene cluster in a population sample of 253 unrelated Chinese individuals. We identified 101 polymorphisms and found 15 novel SNPs. We then computed allele frequencies for each polymorphism and reconstructed their linkage disequilibrium (LD) map. The UGT1 cluster can be divided into five linkage blocks: Block 9 (UGT1A9), Block 9/7/6 (UGT1A9, UGT1A7, and UGT1A6), Block 5 (UGT1A5), Block 4/3 (UGT1A4 and UGT1A3), and Block 3' UTR. Furthermore, we inferred haplotypes and selected their tagSNPs. Finally, comparing our data with those of three other populations of the HapMap project revealed ethnic specificity of the UGT1 genetic diversity in Chinese. These findings have important implications for future molecular genetic studies of the UGT1 gene cluster as well as for personalized medical therapies in Chinese.

  19. Model averaging, optimal inference and habit formation

    Directory of Open Access Journals (Sweden)

    Thomas H B FitzGerald

    2014-06-01

    Full Text Available Postulating that the brain performs approximate Bayesian inference generates principled and empirically testable models of neuronal function – the subject of much current interest in neuroscience and related disciplines. Current formulations address inference and learning under some assumed and particular model. In reality, organisms are often faced with an additional challenge – that of determining which model or models of their environment are the best for guiding behaviour. Bayesian model averaging – which says that an agent should weight the predictions of different models according to their evidence – provides a principled way to solve this problem. Importantly, because model evidence is determined by both the accuracy and complexity of the model, optimal inference requires that these be traded off against one another. This means an agent’s behaviour should show an equivalent balance. We hypothesise that Bayesian model averaging plays an important role in cognition, given that it is both optimal and realisable within a plausible neuronal architecture. We outline model averaging and how it might be implemented, and then explore a number of implications for brain and behaviour. In particular, we propose that model averaging can explain a number of apparently suboptimal phenomena within the framework of approximate (bounded) Bayesian inference, focussing particularly upon the relationship between goal-directed and habitual behaviour.
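
    The weighting rule described in this abstract can be illustrated with a minimal sketch. It assumes a uniform prior over models, so each model's weight is proportional to the exponential of its log evidence; the function name and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def model_average(log_evidences, predictions):
    """Combine model predictions, weighting each by its posterior probability.

    Assumption: a uniform prior over models, so the posterior weight of a
    model is proportional to exp(log evidence).
    """
    # Subtract the largest log evidence before exponentiating so that very
    # negative values do not underflow to zero.
    peak = max(log_evidences)
    unnormalised = [math.exp(value - peak) for value in log_evidences]
    total = sum(unnormalised)
    weights = [value / total for value in unnormalised]
    averaged = sum(w * p for w, p in zip(weights, predictions))
    return weights, averaged


# Hypothetical example: the second model has three times the evidence of the first.
weights, averaged = model_average([0.0, math.log(3.0)], [0.2, 0.8])
```

    Because only differences in log evidence matter, adding the same constant to every entry leaves the weights unchanged.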

  20. HIERARCHICAL PROBABILISTIC INFERENCE OF COSMIC SHEAR

    International Nuclear Information System (INIS)

    Schneider, Michael D.; Dawson, William A.; Hogg, David W.; Marshall, Philip J.; Bard, Deborah J.; Meyers, Joshua; Lang, Dustin

    2015-01-01

    Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probability distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.

  1. Inverse Ising inference with correlated samples

    International Nuclear Information System (INIS)

    Obermayer, Benedikt; Levine, Erel

    2014-01-01

    Correlations between two variables of a high-dimensional system can be indicative of an underlying interaction, but can also result from indirect effects. Inverse Ising inference is a method to distinguish one from the other. Essentially, the parameters of the least constrained statistical model are learned from the observed correlations such that direct interactions can be separated from indirect correlations. Among many other applications, this approach has been helpful for protein structure prediction, because residues which interact in the 3D structure often show correlated substitutions in a multiple sequence alignment. In this context, samples used for inference are not independent but share an evolutionary history on a phylogenetic tree. Here, we discuss the effects of correlations between samples on global inference. Such correlations could arise due to phylogeny but also via other slow dynamical processes. We present a simple analytical model to address the resulting inference biases, and develop an exact method accounting for background correlations in alignment data by combining phylogenetic modeling with an adaptive cluster expansion algorithm. We find that popular reweighting schemes are only marginally effective at removing phylogenetic bias, suggest a rescaling strategy that yields better results, and provide evidence that our conclusions carry over to the frequently used mean-field approach to the inverse Ising problem. (paper)

  2. The importance of learning when making inferences

    Directory of Open Access Journals (Sweden)

    Jorg Rieskamp

    2008-03-01

    Full Text Available The assumption that people possess a repertoire of strategies to solve the inference problems they face has been made repeatedly. The experimental findings of two previous studies on strategy selection are reexamined from a learning perspective, which argues that people learn to select strategies for making probabilistic inferences. This learning process is modeled with the strategy selection learning (SSL) theory, which assumes that people develop subjective expectancies for the strategies they have. They select strategies proportional to their expectancies, which are updated on the basis of experience. For the study by Newell, Weston, and Shanks (2003) it can be shown that people did not anticipate the success of a strategy from the beginning of the experiment. Instead, the behavior observed at the end of the experiment was the result of a learning process that can be described by the SSL theory. For the second study, by Bröder and Schiffer (2006), the SSL theory is able to provide an explanation for why participants only slowly adapted to new environments in a dynamic inference situation. The reanalysis of the previous studies illustrates the importance of learning for probabilistic inferences.
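
    The two mechanisms named in this abstract, choosing a strategy with probability proportional to its expectancy and updating expectancies from experience, might be sketched as follows. This is only an illustration under stated assumptions: the additive payoff update, the small positive floor, and the strategy names are hypothetical choices and not the paper's exact specification.

```python
import random


def choose_strategy(expectancies, rng):
    """Pick a strategy with probability proportional to its expectancy."""
    total = sum(expectancies.values())
    draw = rng.random() * total
    cumulative = 0.0
    for name, value in expectancies.items():
        cumulative += value
        if draw < cumulative:
            return name
    return name  # guard against floating-point round-off on the last item


def update_expectancy(expectancies, name, payoff, floor=0.0001):
    """Add the experienced payoff to the chosen strategy's expectancy.

    Assumption: a simple additive update, kept above a small positive floor
    so that every strategy retains a non-zero selection probability.
    """
    expectancies[name] = max(floor, expectancies[name] + payoff)


# Hypothetical run: two strategies start equal; only "strategy_a" pays off.
rng = random.Random(0)
expectancies = {"strategy_a": 1.0, "strategy_b": 1.0}
for _ in range(50):
    chosen = choose_strategy(expectancies, rng)
    update_expectancy(expectancies, chosen, 1.0 if chosen == "strategy_a" else 0.0)
```

    In this run the expectancy of the rewarded strategy can only grow while the other stays fixed, so selection drifts gradually toward the successful strategy, consistent with the slow adaptation the abstract describes.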

  3. Common Genetic Variation and Haplotypes of the Anion Exchanger SLC4A2 in Primary Biliary Cirrhosis

    Science.gov (United States)

    Juran, Brian D.; Atkinson, Elizabeth J.; Larson, Joseph J.; Schlicht, Erik M.; Lazaridis, Konstantinos N.

    2010-01-01

    Objectives Deficiencies of the anion exchanger SLC4A2 are thought to play a pathogenic role in primary biliary cirrhosis (PBC), evidenced by decreased expression and activity in PBC patients and development of disease features in SLC4A2 knockout mice. We hypothesized that genetic variation in SLC4A2 might influence this pathogenic contribution. Thus, we aimed to perform a comprehensive assessment of SLC4A2 genetic variation in PBC using a linkage disequilibrium (LD)-based haplotype-tagging approach. Methods Twelve single nucleotide polymorphisms (SNPs) across SLC4A2 were genotyped in 409 PBC patients and 300 controls and evaluated for association with disease, as well as with prior orthotopic liver transplant and antimitochondrial antibody (AMA) status among the PBC patients, both individually and as inferred haplotypes, using logistic regression. Results All SNPs were in Hardy–Weinberg equilibrium. No associations with disease or liver transplantation were detected, but two variants, rs2303929 and rs3793336, were associated with negativity for antimitochondrial antibodies among the PBC patients. Conclusions The common genetic variation of SLC4A2 does not directly affect the risk of PBC or its clinical outcome. Whether the deficiency of SLC4A2 expression and activity observed earlier in PBC patients is an acquired epiphenomenon of underlying disease or is because of heritable factors in unappreciated regulatory regions remains uncertain. Of note, two SLC4A2 variants appear to influence AMA status among PBC patients. The mechanisms behind this finding are unclear. PMID:19491853
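
    The abstract reports that all SNPs were in Hardy–Weinberg equilibrium without saying which test was used. As a generic illustration only, the sketch below computes expected genotype counts from the observed allele frequency and a Pearson chi-square statistic (one degree of freedom for a biallelic SNP); the function name and the counts are hypothetical.

```python
def hardy_weinberg_chi_square(n_aa, n_ab, n_bb):
    """Return expected genotype counts and the chi-square statistic.

    n_aa, n_ab, n_bb are observed counts of the two homozygotes and the
    heterozygote for a biallelic marker.
    """
    n = n_aa + n_ab + n_bb
    # Frequency of allele A: each AA carrier contributes two copies, AB one.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi_square = sum(
        (obs - exp) ** 2 / exp for obs, exp in zip(observed, expected) if exp > 0
    )
    return expected, chi_square


# Hypothetical genotype counts that match the Hardy-Weinberg proportions exactly.
expected, chi_square = hardy_weinberg_chi_square(25, 50, 25)
```

    A statistic near zero indicates genotype counts close to the equilibrium proportions; a large value would prompt a check for genotyping error or population structure.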

  4. HLA-G regulatory haplotypes and implantation outcome in couples who underwent assisted reproduction treatment.

    Science.gov (United States)

    Costa, Cynthia Hernandes; Gelmini, Georgia Fernanda; Wowk, Pryscilla Fanini; Mattar, Sibelle Botogosque; Vargas, Rafael Gustavo; Roxo, Valéria Maria Munhoz Sperandio; Schuffner, Alessandro; Bicalho, Maria da Graça

    2012-09-01

    The role of HLA-G in several clinical conditions related to reproduction has been investigated. Important polymorphisms have been found within the 5'URR and 3'UTR regions of the HLA-G promoter. The aim of the present study was to investigate 16 SNPs in the 5'URR and the 14-bp insertion/deletion (ins/del) polymorphism located in the 3'UTR region of the HLA-G gene and their possible association with the implantation outcome in couples who underwent assisted reproduction treatments (ART). The case group was composed of 25 ART couples. Ninety-four couples with two or more term pregnancies composed the control group. Polymorphism haplotype frequencies of HLA-G were determined for both groups. Haplotype 5, Haplotype 8 and Haplotype 11 were entirely absent in ART couples. The HLA-G*01:01:02a, HLA-G*01:01:02b alleles and the 14-bp ins polymorphism, Haplotype 2, showed an increased frequency in case women and similar distribution between case and control men. However, this susceptibility haplotype is significantly represented in case women and in couples with implantation failure after treatment, which led us to suggest a maternal effect associated with this haplotype, since its presence in women is related to a higher number of couples who underwent ART. Copyright © 2012. Published by Elsevier Inc.

  5. Mutation Analysis in Classical Phenylketonuria Patients Followed by Detecting Haplotypes Linked to Some PAH Mutations.

    Science.gov (United States)

    Dehghanian, Fatemeh; Silawi, Mohammad; Tabei, Seyed M B

    2017-02-01

    Deficiency of phenylalanine hydroxylase (PAH) enzyme and elevation of phenylalanine in body fluids cause phenylketonuria (PKU). The gold standard for confirming PKU and PAH deficiency is detecting causal mutations by direct sequencing of the coding exons and splicing involved sequences of the PAH gene. Furthermore, haplotype analysis could be considered as an auxiliary approach for detecting PKU causative mutations before direct sequencing of the PAH gene by making comparisons between prior detected mutation linked-haplotypes and new PKU case haplotypes with undetermined mutations. In this study, 13 unrelated classical PKU patients took part in the study detecting causative mutations. Mutations were identified by polymerase chain reaction (PCR) and direct sequencing in all patients. After that, haplotype analysis was performed by studying VNTR and PAHSTR markers (linked genetic markers of the PAH gene) through application of PCR and capillary electrophoresis (CE). Mutation analysis was performed successfully and the detected mutations were as follows: c.782G>A, c.754C>T, c.842C>G, c.113-115delTCT, c.688G>A, and c.696A>G. Additionally, PAHSTR/VNTR haplotypes were detected to discover haplotypes linked to each mutation. Mutation detection is the best approach for confirming PAH enzyme deficiency in PKU patients. Due to the relatively large size of the PAH gene and high cost of the direct sequencing in developing countries, haplotype analysis could be used before DNA sequencing and mutation detection for a faster and cheaper way via identifying probable mutated exons.

  6. Two families from New England with Usher syndrome type IC with distinct haplotypes.

    Science.gov (United States)

    DeAngelis, M M; McGee, T L; Keats, B J; Slim, R; Berson, E L; Dryja, T P

    2001-03-01

    To search for patients with Usher syndrome type IC among those with Usher syndrome type I who reside in New England. Genotype analysis of microsatellite markers closely linked to the USH1C locus was done using the polymerase chain reaction. We compared the haplotype of our patients who were homozygous in the USH1C region with the haplotypes found in previously reported USH1C Acadian families who reside in southwestern Louisiana and from a single family residing in Lebanon. Of 46 unrelated cases of Usher syndrome type I residing in New England, two were homozygous at genetic markers in the USH1C region. Of these, one carried the Acadian USH1C haplotype and had Acadian ancestors (that is, from Nova Scotia) who did not participate in the 1755 migration of Acadians to Louisiana. The second family had a haplotype that proved to be the same as that of a family with USH1C residing in Lebanon. Each of the two families had haplotypes distinct from the other. This is the first report that some patients residing in New England have Usher syndrome type IC. Patients with Usher syndrome type IC can have the Acadian haplotype or the Lebanese haplotype compatible with the idea that at least two independently arising pathogenic mutations have occurred in the yet-to-be identified USH1C gene.

  7. Mineralocorticoid receptor haplotype moderates the effects of oral contraceptives and menstrual cycle on emotional information processing.

    Science.gov (United States)

    Hamstra, Danielle A; de Kloet, E Ronald; Tollenaar, Marieke; Verkuil, Bart; Manai, Meriem; Putman, Peter; Van der Does, Willem

    2016-10-01

    The processing of emotional information is affected by menstrual cycle phase and by the use of oral contraceptives (OCs). The stress hormone cortisol is known to affect emotional information processing via the limbic mineralocorticoid receptor (MR). We investigated in an exploratory study whether the MR-genotype moderates the effect of both OC-use and menstrual cycle phase on emotional cognition. Healthy premenopausal volunteers (n=93) of West-European descent completed a battery of emotional cognition tests. Forty-nine participants were OC users and 44 naturally cycling, 21 of whom were tested in the early follicular (EF) and 23 in the mid-luteal (ML) phase of the menstrual cycle. In MR-haplotype 1/3 carriers, ML women gambled more than EF women when their risk to lose was relatively small. In MR-haplotype 2, ML women gambled more than EF women, regardless of their odds of winning. OC-users with MR-haplotype 1/3 recognised fewer facial expressions than ML women with MR-haplotype 1/3. MR-haplotype 1/3 carriers may be more sensitive to the influence of their female hormonal status. MR-haplotype 2 carriers showed more risky decision-making. As this may reflect optimistic expectations, this finding may support previous observations in female carriers of MR-haplotype 2 in a naturalistic cohort study. © The Author(s) 2016.

  8. Fetal hemoglobin in sickle cell anemia: The Arab-Indian haplotype and new therapeutic agents.

    Science.gov (United States)

    Habara, Alawi H; Shaikho, Elmutaz M; Steinberg, Martin H

    2017-11-01

    Fetal hemoglobin (HbF) has well-known tempering effects on the symptoms of sickle cell disease and its levels vary among patients with different haplotypes of the sickle hemoglobin gene. Compared with sickle cell anemia haplotypes found in patients of African descent, HbF levels in Saudi and Indian patients with the Arab-Indian (AI) haplotype exceed that in any other haplotype by nearly twofold. Genetic association studies have identified some loci associated with high HbF in the AI haplotype but these observations require functional confirmation. Saudi patients with the Benin haplotype have HbF levels almost twice as high as African patients with this haplotype but this difference is unexplained. Hydroxyurea is still the only FDA approved drug for HbF induction in sickle cell disease. While most patients treated with hydroxyurea have an increase in HbF and some clinical improvement, 10 to 20% of adults show little response to this agent. We review the genetic basis of HbF regulation focusing on sickle cell anemia in Saudi Arabia and discuss new drugs that can induce increased levels of HbF. © 2017 Wiley Periodicals, Inc.

  9. A Near-Complete Haplotype-Phased Genome of the Dikaryotic Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici Reveals High Interhaplotype Diversity.

    Science.gov (United States)

    Schwessinger, Benjamin; Sperschneider, Jana; Cuddy, William S; Garnica, Diana P; Miller, Marisa E; Taylor, Jennifer M; Dodds, Peter N; Figueroa, Melania; Park, Robert F; Rathjen, John P

    2018-02-20

    A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and identify virulence genes involved in plant infection referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales. In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies. IMPORTANCE Current representations of eukaryotic microbial genomes are haploid, hiding the genomic diversity intrinsic to diploid and polyploid life forms. This hidden diversity contributes to the organism's evolutionary potential and ability to adapt to stress conditions. Yet, it is
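    The N50 contiguity statistic quoted in this record can be illustrated with a minimal sketch. It assumes the common definition (the contig length at which the cumulative length of contigs, sorted longest first, reaches half of the assembly); the function name and the toy lengths are hypothetical and are not taken from the assembly described above.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk the contigs from longest to shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy contig lengths, for illustration only.
toy_lengths = [10, 8, 6, 4, 2]
print(n50(toy_lengths))
```

    A larger N50 for the same total length indicates a less fragmented assembly, which is why it is reported alongside the contig count.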

  10. Analysis of SNPs and haplotypes in vitamin D pathway genes and renal cancer risk.

    Directory of Open Access Journals (Sweden)

    Sara Karami

    2009-09-01

    In the kidney, vitamin D is converted to its active form. Since vitamin D exerts its activity through binding to the nuclear vitamin D receptor (VDR), most genetic studies have primarily focused on variation within this gene. Therefore, analysis of genetic variation in VDR and other vitamin D pathway genes may provide insight into the role of vitamin D in renal cell carcinoma (RCC) etiology. RCC cases (N = 777) and controls (N = 1,035) were genotyped to investigate the relationship between RCC risk and variation in eight target genes. Minimum-p-value permutation (Min-P) tests were used to identify genes associated with risk. A three single nucleotide polymorphism (SNP) sliding window was used to identify chromosomal regions with a False Discovery Rate of <10%, where subsequently, haplotype relative risks were computed in Haplostats. Min-P values showed that VDR (p-value = 0.02) and retinoid-X-receptor-alpha (RXRA) (p-value = 0.10) were associated with RCC risk. Within VDR, three haplotypes across two chromosomal regions of interest were identified. The first region, located within intron 2, contained two haplotypes that increased RCC risk by approximately 25%. The second region included a haplotype (rs2239179, rs12717991) across intron 4 that increased risk among participants with the TC (OR = 1.31, 95% CI = 1.09-1.57) haplotype compared to participants with the common haplotype, TT. Across RXRA, one haplotype located 3' of the coding sequence (rs748964, rs3118523), increased RCC risk 35% among individuals with the variant haplotype compared to those with the most common haplotype. This study comprehensively evaluated genetic variation across eight vitamin D pathway genes in relation to RCC risk. We found increased risk associated with VDR and RXRA. Replication studies are warranted to confirm these findings.
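    To show how an odds ratio with a 95% confidence interval of the kind reported in this record is read, here is a minimal sketch of the unadjusted odds ratio from a 2x2 table with a Wald interval on the log scale. This is an assumption for illustration: the study itself computed haplotype relative risks in Haplostats, which this sketch does not reproduce, and the function name and counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval.

    a: carriers among cases      b: carriers among controls
    c: non-carriers among cases  d: non-carriers among controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
print(odds_ratio_ci(20, 10, 10, 20))
```

    An interval that excludes 1, as the reported 1.09-1.57 does, is what marks the association as nominally significant at the 5% level.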

  11. Genetic and molecular characterization of three novel S-haplotypes in sour cherry (Prunus cerasus L.).

    Science.gov (United States)

    Tsukamoto, Tatsuya; Potter, Daniel; Tao, Ryutaro; Vieira, Cristina P; Vieira, Jorge; Iezzoni, Amy F

    2008-01-01

    Tetraploid sour cherry (Prunus cerasus L.) exhibits gametophytic self-incompatibility (GSI) whereby the specificity of self-pollen rejection is controlled by alleles of the stylar and pollen specificity genes, S-RNase and SFB (S haplotype-specific F-box protein gene), respectively. As sour cherry selections can be either self-compatible (SC) or self-incompatible (SI), polyploidy per se does not result in SC. Instead the genotype-dependent loss of SI in sour cherry is due to the accumulation of non-functional S-haplotypes. The presence of two or more non-functional S-haplotypes within sour cherry 2x pollen renders that pollen SC. Two new S-haplotypes from sour cherry, S(33) and S(34), that are presumed to be contributed by the P. fruticosa species parent, the complete S-RNase and SFB sequences of a third S-haplotype, S(35), plus the presence of two previously identified sweet cherry S-haplotypes, S(14) and S(16) are described here. Genetic segregation data demonstrated that the S(16)-, S(33)-, S(34)-, and S(35)-haplotypes present in sour cherry are fully functional. This result is consistent with our previous finding that 'hetero-allelic' pollen is incompatible in sour cherry. Phylogenetic analyses of the SFB and S-RNase sequences from available Prunus species reveal that the relationships among S-haplotypes show no correspondence to known organismal relationships at any taxonomic level within Prunus, indicating that polymorphisms at the S-locus have been maintained throughout the evolution of the genus. Furthermore, the phylogenetic relationships among SFB sequences are generally incongruent with those among S-RNase sequences for the same S-haplotypes. Hypotheses compatible with these results are discussed.

  12. β-globin haplotypes in normal and hemoglobinopathic individuals from Reconcavo Baiano, State of Bahia, Brazil

    Directory of Open Access Journals (Sweden)

    Wellington dos Santos Silva

    2010-01-01

    Five restriction site polymorphisms in the β-globin gene cluster (HincII-5'ε, HindIII-Gγ, HindIII-Aγ, HincII-ψβ1 and HincII-3'ψβ1) were analyzed in three populations (n = 114) from Reconcavo Baiano, State of Bahia, Brazil. The groups included two urban populations from the towns of Cachoeira and Maragojipe and one rural Afro-descendant population, known as the "quilombo community", from Cachoeira municipality. The number of haplotypes found in the populations ranged from 10 to 13, which indicated higher diversity than in the parental populations. The haplotypes 2 (+----), 3 (----+), 4 (-+--+) and 6 (-++-+) on the βA chromosomes were the most common, and two haplotypes, 9 (-++++) and 14 (++--+), were found exclusively in the Maragojipe population. The other haplotypes (1, 5, 9, 11, 12, 13, 14 and 16) had lower frequencies. Restriction site analysis and the derived haplotypes indicated homogeneity among the populations. Thirty-two individuals with hemoglobinopathies (17 sickle cell disease, 12 HbSC disease and 3 HbCC disease) were also analyzed. The haplotype frequencies of these patients differed significantly from those of the general population. In the sickle cell disease subgroup, the predominant haplotypes were BEN (Benin) and CAR (Central African Republic), with frequencies of 52.9% and 32.4%, respectively. The high frequency of the BEN haplotype agreed with the historical origin of the Afro-descendant population in the state of Bahia. However, this frequency differed from that of Salvador, the state capital, where the CAR and BEN haplotypes have similar frequencies, probably as a consequence of domestic slave trade and subsequent internal migrations to other regions of Brazil.

  13. Effects of Single Nucleotide Polymorphism Marker Density on Haplotype Block Partition

    Directory of Open Access Journals (Sweden)

    Sun Ah Kim

    2016-12-01

    Many researchers have found that one of the most important characteristics of the structure of linkage disequilibrium is that the human genome can be divided into non-overlapping block partitions in which only a small number of haplotypes are observed. The location and distribution of haplotype blocks can be seen as a population property influenced by population genetic events such as selection, mutation, recombination and population structure. In this study, we investigate the effects of the density of markers relative to the full set of all polymorphisms in the region on the results of haplotype partitioning for five popular haplotype block partition methods: three methods in Haploview (confidence interval, four gamete test, and solid spine), MIG++ implemented in PLINK 1.9, and S-MIG++. We used several experimental datasets obtained by sampling subsets of single nucleotide polymorphism (SNP) markers of chromosome 22 region in the 1000 Genomes Project data and also the HapMap phase 3 data to compare the results of haplotype block partitions by five methods. With decreasing sampling ratio down to 20% of the original SNP markers, the total number of haplotype blocks decreases and the length of haplotype blocks increases for all algorithms. When we examined the marker-independence of the haplotype block locations constructed from the datasets of different density, the results using below 50% of the entire SNP markers were very different from the results using the entire SNP markers. We conclude that the haplotype block construction results should be used and interpreted carefully depending on the selection of markers and the purpose of the study.
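    The four gamete test named in this record can be sketched as follows. This is a simplified illustration, assuming phased biallelic haplotypes coded as strings of 0 and 1 and a greedy left-to-right partition; it is not Haploview's implementation (which, among other things, applies a frequency threshold before counting a gamete as observed), and both function names are hypothetical.

```python
def shows_four_gametes(haplotypes, i, j):
    """True if all four allele combinations occur at sites i and j.

    Seeing 00, 01, 10 and 11 at a pair of biallelic sites implies
    recombination (or recurrent mutation) between them.
    """
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4


def four_gamete_blocks(haplotypes):
    """Greedily partition sites into blocks, returned as (start, end) pairs.

    A block is extended one site at a time and closed as soon as the new
    site shows all four gametes with any site already in the block.
    """
    n_sites = len(haplotypes[0])
    blocks = []
    start = 0
    for site in range(1, n_sites):
        if any(shows_four_gametes(haplotypes, k, site)
               for k in range(start, site)):
            blocks.append((start, site - 1))
            start = site
    blocks.append((start, n_sites - 1))
    return blocks


# Hypothetical phased haplotypes over four SNPs.
toy = ["0000", "0011", "1100", "1111"]
print(four_gamete_blocks(toy))
```

    Because a block boundary is only detected when both sites of a recombinant pair are genotyped, thinning the marker set tends to merge blocks, which is consistent with the fewer, longer blocks the abstract reports at lower marker density.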

  14. Extended likelihood inference in reliability

    International Nuclear Information System (INIS)

    Martz, H.F. Jr.; Beckman, R.J.; Waller, R.A.

    1978-10-01

    Extended likelihood methods of inference are developed in which subjective information in the form of a prior distribution is combined with sampling results by means of an extended likelihood function. The extended likelihood function is standardized for use in obtaining extended likelihood intervals. Extended likelihood intervals are derived for the mean of a normal distribution with known variance, the failure-rate of an exponential distribution, and the parameter of a binomial distribution. Extended second-order likelihood methods are developed and used to solve several prediction problems associated with the exponential and binomial distributions. In particular, such quantities as the next failure-time, the number of failures in a given time period, and the time required to observe a given number of failures are predicted for the exponential model with a gamma prior distribution on the failure-rate. In addition, six types of life testing experiments are considered. For the binomial model with a beta prior distribution on the probability of nonsurvival, methods are obtained for predicting the number of nonsurvivors in a given sample size and for predicting the required sample size for observing a specified number of nonsurvivors. Examples illustrate each of the methods developed. Finally, comparisons are made with Bayesian intervals in those cases where these are known to exist.
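    For orientation, the standard Bayesian counterpart of the exponential model with a gamma prior mentioned in this record can be written out as below. This is the textbook conjugate calculation, not the report's extended likelihood function; the symbols (prior shape \alpha and rate \beta, n observed failures, total time on test T) are assumptions introduced for the illustration.

```latex
% Prior on the failure rate, and likelihood for n failures in total test time T
\lambda \sim \mathrm{Gamma}(\alpha, \beta), \qquad
L(\lambda) \propto \lambda^{n} e^{-\lambda T}

% Conjugate update
\lambda \mid \text{data} \sim \mathrm{Gamma}(\alpha + n,\; \beta + T)

% Predictive density of the next failure time t, obtained by integrating
% \lambda e^{-\lambda t} against the posterior
p(t \mid \text{data})
  = \frac{(\alpha + n)\,(\beta + T)^{\alpha + n}}{(\beta + T + t)^{\alpha + n + 1}},
  \qquad t > 0
```

    The predictive density has heavier tails than an exponential with a plug-in rate, because it carries the remaining uncertainty about the failure rate.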

  15. Active inference and epistemic value.

    Science.gov (United States)

    Friston, Karl; Rigoli, Francesco; Ognibene, Dimitri; Mathys, Christoph; Fitzgerald, Thomas; Pezzulo, Giovanni

    2015-01-01

    We offer a formal treatment of choice behavior based on the premise that agents minimize the expected free energy of future outcomes. Crucially, the negative free energy or quality of a policy can be decomposed into extrinsic and epistemic (or intrinsic) value. Minimizing expected free energy is therefore equivalent to maximizing extrinsic value or expected utility (defined in terms of prior preferences or goals), while maximizing information gain or intrinsic value (or reducing uncertainty about the causes of valuable outcomes). The resulting scheme resolves the exploration-exploitation dilemma: Epistemic value is maximized until there is no further information gain, after which exploitation is assured through maximization of extrinsic value. This is formally consistent with the Infomax principle, generalizing formulations of active vision based upon salience (Bayesian surprise) and optimal decisions based on expected utility and risk-sensitive (Kullback-Leibler) control. Furthermore, as with previous active inference formulations of discrete (Markovian) problems, ad hoc softmax parameters become the expected (Bayes-optimal) precision of beliefs about, or confidence in, policies. This article focuses on the basic theory, illustrating the ideas with simulations. A key aspect of these simulations is the similarity between precision updates and dopaminergic discharges observed in conditioning paradigms.

  16. Feature Inference Learning and Eyetracking

    Science.gov (United States)

    Rehder, Bob; Colner, Robert M.; Hoffman, Aaron B.

    2009-01-01

    Besides traditional supervised classification learning, people can learn categories by inferring the missing features of category members. It has been proposed that feature inference learning promotes learning a category's internal structure (e.g., its typical features and interfeature correlations) whereas classification promotes the learning of…

  17. An Inference Language for Imaging

    DEFF Research Database (Denmark)

    Pedemonte, Stefano; Catana, Ciprian; Van Leemput, Koen

    2014-01-01

    We introduce iLang, a language and software framework for probabilistic inference. The iLang framework enables the definition of directed and undirected probabilistic graphical models and the automated synthesis of high performance inference algorithms for imaging applications. The iLang framewor...

  18. Mapping of HLA- DQ haplotypes in a group of Danish patients with celiac disease

    DEFF Research Database (Denmark)

    Lund, Flemming; Hermansen, Mette N; Pedersen, Merete F

    2015-01-01

    BACKGROUND: A cost-effective identification of HLA- DQ risk haplotypes using the single nucleotide polymorphism (SNP) technique has recently been applied in the diagnosis of celiac disease (CD) in four European populations. The objective of the study was to map risk HLA- DQ haplotypes in a group...... of Danish CD patients using the SNP technique. METHODS: Cohort A: Among 65 patients with gastrointestinal symptoms we compared the HLA- DQ2 and HLA- DQ8 risk haplotypes obtained by the SNP technique (method 1) with results based on a sequence specific primer amplification technique (method 2...

  19. Genetic relationships among Native Americans based on β-globin gene cluster haplotype frequencies

    OpenAIRE

    Mousinho-Ribeiro Rita de Cassia; Pante-de-Sousa Gabriella; Santos Eduardo José Melo dos; Guerreiro João Farias

    2003-01-01

    The distribution of β-globin gene haplotypes was studied in 209 Amerindians from eight tribes of the Brazilian Amazon: Asurini from Xingú, Awá-Guajá, Parakanã, Urubú-Kaapór, Zoé, Kayapó (Xikrin from the Bacajá village), Katuena, and Tiriyó. Nine different haplotypes were found, two of which (n. 11 and 13) had not been previously identified in Brazilian indigenous populations. Haplotype 2 (+ - - - -) was the most common in all groups studied, with frequencies varying from 70% to 100%, followed...

  20. The Prognostic Value of Haplotypes in the Vascular Endothelial Growth Factor

    DEFF Research Database (Denmark)

    Hansen, Torben Frøstrup; Spindler, Karen-Lise Garm; Andersen, Rikke Fredslund

    2010-01-01

    Abstract: New prognostic markers in patients with colorectal cancer (CRC) are a prerequisite for individualized treatment. Prognostic importance of single nucleotide polymorphisms (SNPs) in the vascular endothelial growth factor A (VEGF-A) gene has been proposed. The objective of the present study...... using the PHASE program. The prognostic influence was evaluated using Kaplan-Meier plots and log rank tests. Cox regression method was used to analyze the independent prognostic importance of different markers. All three SNPs were significantly related to survival. A haplotype combination, responsible...... findings in a second and independent cohort. Haplotype combinations call for further investigation. Keywords: colorectal neoplasm; single nucleotide polymorphisms; haplotypes; vascular endothelial growth factor A; survival...

  1. Gauging Variational Inference

    Energy Technology Data Exchange (ETDEWEB)

    Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Ahn, Sungsoo [Korea Advanced Inst. Science and Technology (KAIST), Daejeon (Korea, Republic of); Shin, Jinwoo [Korea Advanced Inst. Science and Technology (KAIST), Daejeon (Korea, Republic of)

    2017-05-25

    Computing the partition function is the most important statistical inference task arising in applications of Graphical Models (GM). Since it is computationally intractable, approximate methods have been used to resolve the issue in practice, where mean-field (MF) and belief propagation (BP) are arguably the most popular and successful approaches of a variational type. In this paper, we propose two new variational schemes, coined Gauged-MF (G-MF) and Gauged-BP (G-BP), improving MF and BP, respectively. Both provide lower bounds for the partition function by utilizing the so-called gauge transformation, which modifies factors of the GM while keeping the partition function invariant. Moreover, we prove that both G-MF and G-BP are exact for GMs with a single loop of a special structure, even though the bare MF and BP perform badly in this case. Our extensive experiments, on complete GMs of relatively small size and on large GMs (up to 300 variables), confirm that the newly proposed algorithms outperform and generalize MF and BP.
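    The idea of a variational lower bound on the partition function can be illustrated with a minimal sketch on a small pairwise binary model. It computes log Z exactly by enumeration and compares it with the bound given by a fully factorized distribution, i.e. plain ("bare") mean field. The gauge transformation of G-MF and G-BP is not implemented here; the model parameters and all function names are hypothetical.

```python
import itertools
import math


def log_weight(x, fields, couplings):
    """Unnormalised log-probability of a spin configuration x (entries -1/+1)."""
    total = sum(h * xi for h, xi in zip(fields, x))
    total += sum(J * x[i] * x[j] for (i, j), J in couplings.items())
    return total


def log_partition(fields, couplings):
    """Exact log Z by enumerating all 2^n configurations (small n only)."""
    n = len(fields)
    return math.log(sum(math.exp(log_weight(x, fields, couplings))
                        for x in itertools.product((-1, 1), repeat=n)))


def mean_field_bound(means, fields, couplings):
    """Lower bound on log Z from a product distribution with means in (-1, 1).

    The bound is E_q[log weight] + H(q); under a product distribution
    E_q[x_i x_j] = m_i * m_j for i != j.
    """
    energy = sum(h * m for h, m in zip(fields, means))
    energy += sum(J * means[i] * means[j] for (i, j), J in couplings.items())
    entropy = 0.0
    for m in means:
        p = (1 + m) / 2  # probability that the spin equals +1
        entropy -= p * math.log(p) + (1 - p) * math.log(1 - p)
    return energy + entropy


# Hypothetical three-variable model with one loop.
fields = [0.1, -0.2, 0.3]
couplings = {(0, 1): 0.5, (1, 2): -0.4, (0, 2): 0.2}
print(log_partition(fields, couplings))
print(mean_field_bound([0.2, -0.1, 0.3], fields, couplings))
```

    Enumeration costs 2^n evaluations, which is the intractability the abstract refers to; the bound costs only a sum over variables and edges, at the price of a gap that depends on how well a product distribution fits the model.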

  2. Social Inference Through Technology

    Science.gov (United States)

    Oulasvirta, Antti

    Awareness cues are computer-mediated, real-time indicators of people’s undertakings, whereabouts, and intentions. Already in the mid-1970s, UNIX users could use commands such as “finger” and “talk” to find out who was online and to chat. The small icons in instant messaging (IM) applications that indicate coconversants’ presence in the discussion space are the successors of “finger” output. Similar indicators can be found in online communities, media-sharing services, Internet relay chat (IRC), and location-based messaging applications. But presence and availability indicators are only the tip of the iceberg. Technological progress has enabled richer, more accurate, and more intimate indicators. For example, there are mobile services that allow friends to query and follow each other’s locations. Remote monitoring systems developed for health care allow relatives and doctors to assess the wellbeing of homebound patients (see, e.g., Tang and Venables 2000). But users also utilize cues that have not been deliberately designed for this purpose. For example, online gamers pay attention to other characters’ behavior to infer what the other players are like “in real life.” There is a common denominator underlying these examples: shared activities rely on the technology’s representation of the remote person. The other human being is not physically present but present only through a narrow technological channel.

  3. An Intuitive Dashboard for Bayesian Network Inference

    International Nuclear Information System (INIS)

    Reddy, Vikas; Farr, Anna Charisse; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K D V

    2014-01-01

    Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.

  4. The NIFTY way of Bayesian signal inference

    International Nuclear Information System (INIS)

    Selig, Marco

    2014-01-01

    We introduce NIFTY, 'Numerical Information Field Theory', a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTY can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTY as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.

  5. The NIFTy way of Bayesian signal inference

    Science.gov (United States)

    Selig, Marco

    2014-12-01

    We introduce NIFTy, "Numerical Information Field Theory", a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTy can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTy as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.

  6. An Intuitive Dashboard for Bayesian Network Inference

    Science.gov (United States)

    Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.

    2014-03-01

    Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.

  7. Refined candidate region specified by haplotype sharing for Escherichia coli F4ab/F4ac susceptibility alleles in pigs

    DEFF Research Database (Denmark)

    Jacobsen, Mette Juul; Kracht, Steffen Skaarup; Esteso, G.

    2009-01-01

    Infection of the small intestine by enterotoxigenic Escherichia coli F4ab/ac is a major welfare problem and financial burden for the pig industry. Natural resistance to this infection is inherited as a Mendelian recessive trait, and a polymorphism in the MUC4 gene segregating for susceptibility....../resistance is presently used in a selection programme by the Danish pig breeding industry. To elucidate the genetic background involved in E. coli F4ab/ac susceptibility in pigs, a detailed haplotype map of the porcine candidate region was established. This region covers approximately 3.7 Mb. The material used...... for the study is a three generation family, where the founders are two Wild boars and eight Large White sows. All pigs have been phenotyped for susceptibility to F4ab/ac using an adhesion assay. Their haplotypes are known from segregation analysis using flanking markers. By a targeted approach, the candidate...

  8. On detecting incomplete soft or hard selective sweeps using haplotype structure

    DEFF Research Database (Denmark)

    Ferrer-Admetlla, Anna; Liang, Mason; Korneliussen, Thorfinn Sand

    2014-01-01

    We present a new haplotype-based statistic (nSL) for detecting both soft and hard sweeps in population genomic data from a single population. We compare our new method with classic single-population haplotype and site frequency spectrum (SFS)-based methods and show that it is more robust, particularly to recombination rate variation. However, all statistics show some sensitivity to the assumptions of the demographic model. Additionally, we show that nSL has at least as much power as other methods under a number of different selection scenarios, most notably in the cases of sweeps from standing...

  9. Y-STR haplotypes of Native American populations from the Brazilian Amazon region.

    Science.gov (United States)

    Palha, Teresinha Jesus Brabo Ferreira; Rodrigues, Elzemar Martins Ribeiro; dos Santos, Sidney Emanuel Batista

    2010-10-01

    The allele and haplotype frequencies of nine Y-STRs (DYS19, DYS389 I, DYS389 II, DYS390, DYS391, DYS392, DYS393, DYS385 I/II) were determined in a sample of six native tribes from the Brazilian Amazon (Tiriyó, Awa-Guajá, Waiãpi, Urubu-Kaapor, Zoé and Parakanã). Forty-eight different haplotypes were identified, 28 of which were unique. Five haplotypes were very frequent and were shared by over 10 individuals. The estimated haplotype diversity (0.9114) was very low compared to other geographic groups, including Africans, Europeans and Asians. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
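    A haplotype diversity value such as the one reported in this record is commonly computed with Nei's unbiased estimator, H = n/(n-1) * (1 - sum of squared haplotype frequencies). The sketch below assumes that estimator, since the record does not say which one was used; the function name and the toy sample are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype (gene) diversity for a sample.

    haplotypes: one hashable haplotype per sampled individual, for
    example a tuple of repeat counts across the typed Y-STR loci.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq_freq)


# Hypothetical sample of four individuals typed at two loci.
sample = [(14, 13), (14, 13), (15, 12), (15, 12)]
print(haplotype_diversity(sample))
```

    The estimator is 0 when every individual carries the same haplotype and 1 when every haplotype in the sample is unique, so a few haplotypes shared by many individuals pull the value down.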

  10. Temporal fluctuation of multidrug resistant Salmonella Typhi haplotypes in the Mekong River delta region of Vietnam.

    Directory of Open Access Journals (Sweden)

    Kathryn E Holt

    2011-01-01

    Typhoid fever remains a public health problem in Vietnam, with a significant burden in the Mekong River delta region. Typhoid fever is caused by the bacterial pathogen Salmonella enterica serovar Typhi (S. Typhi), which is frequently multidrug resistant with reduced susceptibility to fluoroquinolone-based drugs, the first choice for the treatment of typhoid fever. We used a GoldenGate (Illumina) assay to type 1,500 single nucleotide polymorphisms (SNPs) and analyse the genetic variation of S. Typhi isolated from 267 typhoid fever patients in the Mekong delta region participating in a randomized trial conducted between 2004 and 2005. The population of S. Typhi circulating during the study was highly clonal, with 91% of isolates belonging to a single clonal complex of the S. Typhi H58 haplogroup. The patterns of disease were consistent with the presence of an endemic haplotype H58-C and a localised outbreak of S. Typhi haplotype H58-E2 in 2004. H58-E2-associated typhoid fever cases exhibited evidence of significant geo-spatial clustering along the Sông Hậu branch of the Mekong River. Multidrug resistance was common in the established clone H58-C but not in the outbreak clone H58-E2; however, all H58 S. Typhi were nalidixic acid resistant and carried a Ser83Phe amino acid substitution in the gyrA gene. The H58 haplogroup dominates S. Typhi populations in other endemic areas, but the population described here was more homogeneous than previously examined populations, and the dominant clonal complex (H58-C, -E1, -E2) observed in this study has not been detected outside Vietnam. IncHI1 plasmid-bearing S. Typhi H58-C was endemic during the study period whilst H58-E2, which rarely carried the plasmid, was only transient, suggesting a selective advantage for the plasmid. These data add insight into the outbreak dynamics and local molecular epidemiology of S. Typhi in southern Vietnam.

  11. Mitochondrial Haplotype Diversity in Zambian Lions: Bridging a Gap in the Biogeography of an Iconic Species.

    Science.gov (United States)

    Curry, Caitlin J; White, Paula A; Derr, James N

    2015-01-01

    Analysis of DNA sequence diversity at the 12S to 16S mitochondrial genes of 165 African lions (Panthera leo) from five main areas in Zambia has uncovered haplotypes which link Southern Africa with East Africa. Phylogenetic analysis suggests Zambia may serve as a bridge connecting the lion populations in southern Africa to eastern Africa, supporting earlier hypotheses that eastern-southern Africa may represent the evolutionary cradle for the species. Overall gene diversity throughout the Zambian lion population was 0.7319 +/- 0.0174 with eight haplotypes found; three haplotypes previously described and the remaining five novel. The addition of these five novel haplotypes, so far only found within Zambia, nearly doubles the number of haplotypes previously reported for any given geographic location of wild lions. However, based on an AMOVA analysis of these haplotypes, there is little to no matrilineal gene flow (Fst = 0.47) when the eastern and western regions of Zambia are considered as two regional sub-populations. Crossover haplotypes (H9, H11, and Z1) appear in both populations as rare in one but common in the other. This pattern is a possible result of the lion mating system in which predominately males disperse, as all individuals with crossover haplotypes were male. The determination and characterization of lion sub-populations, such as done in this study for Zambia, represent a higher-resolution of knowledge regarding both the genetic health and connectivity of lion populations, which can serve to inform conservation and management of this iconic species.

  12. Mitochondrial Haplotype Diversity in Zambian Lions: Bridging a Gap in the Biogeography of an Iconic Species.

    Directory of Open Access Journals (Sweden)

    Caitlin J Curry

    Full Text Available Analysis of DNA sequence diversity at the 12S to 16S mitochondrial genes of 165 African lions (Panthera leo) from five main areas in Zambia has uncovered haplotypes which link Southern Africa with East Africa. Phylogenetic analysis suggests Zambia may serve as a bridge connecting the lion populations in southern Africa to eastern Africa, supporting earlier hypotheses that eastern-southern Africa may represent the evolutionary cradle for the species. Overall gene diversity throughout the Zambian lion population was 0.7319 +/- 0.0174 with eight haplotypes found; three haplotypes previously described and the remaining five novel. The addition of these five novel haplotypes, so far only found within Zambia, nearly doubles the number of haplotypes previously reported for any given geographic location of wild lions. However, based on an AMOVA analysis of these haplotypes, there is little to no matrilineal gene flow (Fst = 0.47) when the eastern and western regions of Zambia are considered as two regional sub-populations. Crossover haplotypes (H9, H11, and Z1) appear in both populations as rare in one but common in the other. This pattern is a possible result of the lion mating system in which predominately males disperse, as all individuals with crossover haplotypes were male. The determination and characterization of lion sub-populations, such as done in this study for Zambia, represent a higher-resolution of knowledge regarding both the genetic health and connectivity of lion populations, which can serve to inform conservation and management of this iconic species.

  13. Haplotypes of the porcine peroxisome proliferator-activated receptor delta gene are associated with backfat thickness

    Directory of Open Access Journals (Sweden)

    Blöcker Helmut

    2009-11-01

    Full Text Available Abstract Background Peroxisome proliferator-activated receptor delta belongs to the nuclear receptor superfamily of ligand-inducible transcription factors. It is a key regulator of lipid metabolism. The peroxisome proliferator-activated receptor delta gene (PPARD) has been assigned to a region on porcine chromosome 7, which harbours a quantitative trait locus for backfat. Thus, PPARD is considered a functional and positional candidate gene for backfat thickness. The purpose of this study was to test this candidate gene hypothesis in a cross of breeds that were highly divergent in lipid deposition characteristics. Results Screening for genetic variation in porcine PPARD revealed only silent mutations. Nevertheless, significant associations between PPARD haplotypes and backfat thickness were observed in the F2 generation of the Mangalitsa × Piétrain cross as well as a commercial German Landrace population. Haplotype 5 is associated with increased backfat in F2 Mangalitsa × Piétrain pigs, whereas haplotype 4 is associated with lower backfat thickness in the German Landrace population. Haplotypes 4 and 5 carry the same alleles at all but one SNP. Interestingly, the opposite effects of PPARD haplotypes 4 and 5 on backfat thickness are reflected by opposite effects of these two haplotypes on PPAR-δ mRNA levels. Haplotype 4 significantly increases PPAR-δ mRNA levels, whereas haplotype 5 decreases mRNA levels of PPAR-δ. Conclusion This study provides evidence for an association between PPARD and backfat thickness. The association is substantiated by mRNA quantification. Further studies are required to clarify whether the observed associations are caused by PPARD or are the result of linkage disequilibrium with a causal variant in a neighbouring gene.

  14. Prognostic importance of VEGF-A haplotype combinations in a stage II colon cancer population

    DEFF Research Database (Denmark)

    Kjaer-Frifeldt, Sanne; Fredslund, Rikke; Lindebjerg, Jan

    2012-01-01

    To investigate the prognostic effect of three VEGF-A SNPs, -2578, -460 and 405, as well as the corresponding haplotype combinations, in a unique population of stage II colon cancer patients....

  15. Interrelationships between Amerindian tribes of lower Amazonia as manifest by HLA haplotype disequilibria.

    OpenAIRE

    Black, F L

    1984-01-01

    HLA B-C haplotypes exhibit common disequilibria in populations drawn from four continents, indicating that they are subject to broadly active selective forces. However, the A-B and A-C associations we have examined show no consistent disequilibrium pattern, leaving open the possibility that these disequilibria are due to descent from common progenitors. By examining HLA haplotype distributions, I have explored the implications that would follow from the hypothesis that biological selection pl...

  16. Mitochondrial and Y chromosome haplotype motifs as diagnostic markers of Jewish ancestry: a reconsideration.

    Directory of Open Access Journals (Sweden)

    Sergio Tofanelli

    2014-11-01

    Full Text Available Several authors have proposed haplotype motifs based on site variants at the mitochondrial genome (mtDNA) and the non-recombining portion of the Y chromosome (NRY) to trace the genealogies of Jewish people. Here, we analyzed their main approaches and tested the feasibility of adopting motifs as ancestry markers through construction of a large database of mtDNA and NRY haplotypes from public genetic genealogical repositories. We verified the reliability of Jewish ancestry prediction based on the Cohen and Levite Modal Haplotypes in their classical 6 STR marker format or in the extended 12 STR format, as well as four founder mtDNA lineages (HVS-I segments) accounting for about 40% of the current population of Ashkenazi Jews. For this purpose we compared haplotype composition in individuals of self-reported Jewish ancestry with the rest of European, African or Middle Eastern samples, to test for non-random association of ethno-geographic groups and haplotypes. Overall, NRY and mtDNA based motifs, previously reported to differentiate between groups, were found to be more represented in Jewish compared to non-Jewish groups. However, this seems to stem from common ancestors of Jewish lineages being rather recent with respect to ancestors of non-Jewish lineages with the same haplotype signatures. Moreover, the polyphyly of haplotypes which contain the proposed motifs and the misuse of constant mutation rates heavily affected previous attempts to correctly date the origin of common ancestries. Accordingly, our results stress the limitations of using the above haplotype motifs as reliable Jewish ancestry predictors and show their inadequacy for forensic or genealogical purposes.

  17. Haplotypes in the Dystrophin DNA Segment Point to a Mosaic Origin of Modern Human Diversity

    OpenAIRE

    Ziętkiewicz, Ewa; Yotova, Vania; Gehl, Dominik; Wambach, Tina; Arrieta, Isabel; Batzer, Mark; Cole, David E.C.; Hechtman, Peter; Kaplan, Feige; Modiano, David; Moisan, Jean-Paul; Michalski, Roman; Labuda, Damian

    2003-01-01

    Although Africa has played a central role in human evolutionary history, certain studies have suggested that not all contemporary human genetic diversity is of recent African origin. We investigated 35 simple polymorphic sites and one Tn microsatellite in an 8-kb segment of the dystrophin gene. We found 86 haplotypes in 1,343 chromosomes from around the world. Although a classical out-of-Africa topology was observed in trees based on the variant frequencies, the tree of haplotype sequences re...

  18. Genetic variability and haplotypes of Echinococcus isolates from Tunisia.

    Science.gov (United States)

    Boufana, Belgees; Lahmar, Samia; Rebaï, Waël; Ben Safta, Zoubeir; Jebabli, Leïla; Ammar, Adel; Kachti, Mahmoud; Aouadi, Soufia; Craig, Philip S

    2014-11-01

    The species/genotypes of Echinococcus infecting a range of intermediate, canid and human hosts were examined as well as the intraspecific variation and population structure of Echinococcus granulosus sensu lato (s.l.) within these hosts. A total of 174 Echinococcus isolates from humans and ungulate intermediate hosts and adult tapeworms from dogs and jackals were used. Genomic DNA was used to amplify a fragment within a mitochondrial gene and a nuclear gene, coding for cytochrome c oxidase subunit 1 (cox1; 828 bp) and elongation factor 1-alpha (ef1a; 656 bp), respectively. E. granulosus sensu stricto was identified from all host species examined, E. canadensis (G6) in a camel and, for the first time, fertile cysts of E. granulosus (s.s.) and E. equinus in equids (donkeys) and E. granulosus (s.s.) from wild boars and goats. Considerable genetic variation was seen only for the cox1 sequences of E. granulosus (s.s.). The pairwise fixation index (Fst) for cox1 E. granulosus (s.s.) sequences from donkeys was high and was statistically significant compared with that of E. granulosus populations from other intermediate hosts. A single haplotype (EqTu01) was identified for the cox1 nucleotide sequences of E. equinus. The role of donkeys in the epidemiology of echinococcosis in Tunisia requires further investigation. © The Author 2014. Published by Oxford University Press on behalf of Royal Society of Tropical Medicine and Hygiene. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. Unique haplotypes of cacao trees as revealed by trnH-psbA chloroplast DNA

    Directory of Open Access Journals (Sweden)

    Nidia Gutiérrez-López

    2016-04-01

    Full Text Available Cacao trees have been cultivated in Mesoamerica for at least 4,000 years. In this study, we analyzed sequence variation in the chloroplast DNA trnH-psbA intergenic spacer from 28 cacao trees from different farms in the Soconusco region in southern Mexico. Genetic relationships were established by two analysis approaches based on geographic origin (five populations) and genetic origin (based on a previous study). We identified six polymorphic sites, including five insertion/deletion (indel) types and one transversion. The overall nucleotide diversity was low for both approaches (geographic = 0.0032 and genetic = 0.0038). Conversely, we obtained moderate to high haplotype diversity (0.66 and 0.80) with 10 and 12 haplotypes, respectively. The common haplotype (H1) for both networks included cacao trees from all geographic locations (geographic approach) and four genetic groups (genetic approach). This common haplotype (ancient) derived a set of intermediate haplotypes and singletons interconnected by one or two mutational steps, which suggested directional selection and event purification from the expansion of narrow populations. Cacao trees from Soconusco region were grouped into one cluster without any evidence of subclustering based on AMOVA (FST = 0) and SAMOVA (FST = 0.04393) results. One population (Mazatán) showed a high haplotype frequency; thus, this population could be considered an important reservoir of genetic material. The indels located in the trnH-psbA intergenic spacer of cacao trees could be useful as markers for the development of DNA barcoding.
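
    The haplotype diversity values quoted in this record (0.66 and 0.80) are summary statistics of haplotype frequencies. The record does not say which estimator was used; the sketch below assumes Nei's unbiased estimator, h = n/(n-1) · (1 - Σ p_i²), and the sample labels are invented for illustration rather than taken from the study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies).

    `haplotypes` is a list with one haplotype label per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample of 10 trees; labels and counts are made up.
sample = ["H1"] * 6 + ["H2"] * 2 + ["H3", "H4"]
h = haplotype_diversity(sample)
print(round(h, 4))
```

    The estimator is 0 when every individual carries the same haplotype and 1 when every individual carries a different one, which is why a sample dominated by one common haplotype plus several singletons lands in the moderate range.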

  20. Cluster analysis of European Y-chromosomal STR haplotypes using the discrete Laplace method

    DEFF Research Database (Denmark)

    Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels

    2014-01-01

    The European Y-chromosomal short tandem repeat (STR) haplotype distribution has previously been analysed in various ways. Here, we introduce a new way of analysing population substructure using a new method based on clustering within the discrete Laplace exponential family that models the probability distribution of the Y-STR haplotypes. Creating a consistent statistical model of the haplotypes enables us to perform a wide range of analyses. Previously, haplotype frequency estimation using the discrete Laplace method has been validated. In this paper we investigate how the discrete Laplace method can be used for cluster analysis to further validate the discrete Laplace method. A very important practical fact is that the calculations can be performed on a normal computer. We identified two sub-clusters of the Eastern and Western European Y-STR haplotypes similar to results of previous...

  1. Vitamin D Receptor Gene Polymorphisms and Haplotypes in Hungarian Patients with Idiopathic Inflammatory Myopathy

    Directory of Open Access Journals (Sweden)

    Levente Bodoki

    2015-01-01

    Full Text Available Idiopathic inflammatory myopathies are autoimmune diseases characterized by symmetrical proximal muscle weakness. Our aim was to identify a correlation between VDR polymorphisms or haplotypes and myositis. We studied VDR-BsmI, VDR-ApaI, VDR-TaqI, and VDR-FokI polymorphisms and haplotypes in 89 Hungarian poly-/dermatomyositis patients (69 females) and 93 controls (52 females). We did not obtain any significant differences for VDR-FokI, BsmI, ApaI, and TaqI genotypes and allele frequencies between patients with myositis and healthy individuals. There was no association of VDR polymorphisms with clinical manifestations and laboratory profiles in myositis patients. Men with myositis had a significantly different distribution of BB, Bb, and bb genotypes than female patients, control male individuals, and the entire control group. Distribution of TT, Tt, and tt genotypes was significantly different in males than in females in the patient group. According to four-marker haplotype prevalence, frequencies of sixteen possible haplotypes showed significant differences between patient and control groups. The three most frequent haplotypes in patients were the fbAt, FBaT, and fbAT. Our findings may reveal that there is a significant association: Bb and Tt genotypes can be associated with myositis in the Hungarian population we studied. We underline the importance of our result in the estimated prevalence of four-marker haplotypes.

  2. Endothelial Nitric Oxide Synthase Haplotypes Are Associated with Preeclampsia in Maya Mestizo Women

    Science.gov (United States)

    Díaz-Olguín, Lizbeth; Coral-Vázquez, Ramón Mauricio; Canto-Cetina, Thelma; Canizales-Quinteros, Samuel; Ramírez Regalado, Belem; Fernández, Genny; Canto, Patricia

    2011-01-01

    Preeclampsia is a specific disease of pregnancy and believed to have a genetic component. The aim of this study was to investigate if three polymorphisms in eNOS or their haplotypes are associated with preeclampsia in Maya mestizo women. A case-control study was performed where 127 preeclamptic patients and 263 controls were included. Genotypes and haplotypes for the -786T→C, intron 4 variants, Glu298Asp of eNOS were determined by PCR and real-time PCR allelic discrimination. Logistic regression analysis with adjustment for age and body mass index (BMI) was used to test for associations between genotype and preeclampsia under recessive, codominant and dominant models. Pairwise linkage disequilibrium between single nucleotide polymorphisms was calculated by direct correlation r², and haplotype analysis was conducted. Women homozygous for the Asp298 allele showed an association with preeclampsia. In addition, analysis of the haplotype frequencies revealed that the -786C-4b-Asp298 haplotype was significantly more frequent in preeclamptic patients than in controls (0.143 vs. 0.041, respectively; OR = 3.01; 95% CI = 1.74–5.23; P = 2.9 × 10⁻⁴). Despite the Asp298 genotype in a recessive model associated with the presence of preeclampsia in Maya mestizo women, we believe that in this population the -786C-4b-Asp298 haplotype is a better genetic marker. PMID:21897002
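
    The pairwise linkage disequilibrium measure named in this record, the squared correlation between two biallelic loci, can be written as r² = D² / (p_A(1-p_A) p_B(1-p_B)) with D = p_AB - p_A·p_B. The sketch below assumes phased haplotype frequencies are already available; the function name and the example frequencies are illustrative and are not the study's data or software.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Illustrative frequencies only.
print(ld_r2(0.30, 0.30, 0.30))  # alleles always travel together
print(ld_r2(0.06, 0.20, 0.30))  # alleles combine independently
```

    Values near 1 mean that one SNP almost fully predicts the other, and values near 0 mean the alleles combine close to independently.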

  3. The JAK2 GGCC (46/1) Haplotype in Myeloproliferative Neoplasms: Causal or Random?

    Directory of Open Access Journals (Sweden)

    Luisa Anelli

    2018-04-01

    Full Text Available The germline JAK2 haplotype known as “GGCC or 46/1 haplotype” (haplotypeGGCC_46/1) consists of a combination of single nucleotide polymorphisms (SNPs) mapping in a region of about 250 kb, extending from the JAK2 intron 10 to the Insulin-like 4 (INSL4) gene. Four main SNPs (rs3780367, rs10974944, rs12343867, and rs1159782) generating a “GGCC” combination are more frequently indicated to represent the JAK2 haplotype. These SNPs are inherited together and are frequently associated with the onset of myeloproliferative neoplasms (MPN) positive for both JAK2 V617F and exon 12 mutations. The association between the JAK2 haplotypeGGCC_46/1 and mutations in other genes, such as thrombopoietin receptor (MPL) and calreticulin (CALR), or the association with triple negative MPN, is still controversial. This review provides an overview of the frequency and the role of the JAK2 haplotypeGGCC_46/1 in the pathogenesis of different myeloid neoplasms and describes the hypothetical mechanisms at the basis of the association with JAK2 gene mutations. Moreover, possible clinical implications are discussed, as different papers reported contrasting data about the correlation between the JAK2 haplotypeGGCC_46/1 and blood cell count, survival, or disease progression.

  4. VNTR alleles associated with the α-globin locus are haplotype and population related

    Energy Technology Data Exchange (ETDEWEB)

    Martinson, J.J.; Clegg, J.B.; Boyce, A.J. [Univ. of Oxford (United Kingdom)]

    1994-09-01

    The human α-globin complex contains several polymorphic restriction-enzyme sites (i.e., RFLPs) linked to form haplotypes and is flanked by two hypervariable VNTR loci, the 5′ hypervariable region (HVR) and the more highly polymorphic 3′HVR. Using a combination of RFLP analysis and PCR, the authors have characterized the 5′HVR and 3′HVR alleles associated with the α-globin haplotypes of 133 chromosomes, and they here show that specific α-globin haplotypes are each associated with discrete subsets of the alleles observed at these two VNTR loci. This statistically highly significant association is observed over a region spanning approximately 100 kb. With the exception of closely related haplotypes, different haplotypes do not share identically sized 3′HVR alleles. Earlier studies have shown that α-globin haplotype distributions differ between populations; the current findings also reveal extensive population substructure in the repertoire of α-globin VNTRs. If similar features are characteristic of other VNTR loci, this will have important implications for forensic and anthropological studies. 42 refs., 5 figs., 5 tabs.

  5. Interrelationships between Amerindian tribes of lower Amazonia as manifest by HLA haplotype disequilibria.

    Science.gov (United States)

    Black, F L

    1984-11-01

    HLA B-C haplotypes exhibit common disequilibria in populations drawn from four continents, indicating that they are subject to broadly active selective forces. However, the A-B and A-C associations we have examined show no consistent disequilibrium pattern, leaving open the possibility that these disequilibria are due to descent from common progenitors. By examining HLA haplotype distributions, I have explored the implications that would follow from the hypothesis that biological selection played no role in determining A-C disequilibria in 10 diverse tribes of the lower Amazon Basin. Certain haplotypes are in strong positive disequilibria across a broad geographic area, suggesting that members of diverse tribes descend from common ancestors. On the basis of the extent of diffusion of the components of these haplotypes, one can estimate that the progenitors lived less than 6,000 years ago. One widely encountered lineage entered the area within the last 1,200 years. When haplotype frequencies are used in genetic distance measurements, they give a pattern of relationships very similar to that obtained by conventional chord measurements based on several genetic markers; but more than that, when individual haplotype disequilibria in the several tribes are compared, multiple origins of a single tribe are discernible and relationships are revealed that correlate more closely to geographic and linguistic patterns than do the genetic distance measurements.

  6. Association between β2-adrenoceptor (ADRB2) haplotypes and insulin resistance in PCOS.

    Science.gov (United States)

    Tellechea, Mariana L; Muzzio, Damián O; Iglesias Molli, Andrea E; Belli, Susana H; Graffigna, Mabel N; Levalle, Oscar A; Frechtel, Gustavo D; Cerrone, Gloria E

    2013-04-01

    The aim of this study was to explore β2-adrenoceptor (ADRB2) haplotype associations with phenotypes and quantitative traits related to insulin resistance (IR) and the metabolic syndrome (MS) in a polycystic ovary syndrome (PCOS) population. A secondary purpose was to assess the association between ADRB2 haplotype and PCOS. Genetic polymorphism analysis. Cross-sectional case-control association study. Medical University Hospital and research laboratory. One hundred and sixty-five unrelated women with PCOS and 116 unrelated women without PCOS (control sample). Clinical and biochemical measurements, and ADRB2 genotyping in PCOS patients and control subjects. ADRB2 haplotypes (comprising rs1042711, rs1801704, rs1042713 and rs1042714 in that order), genotyping and statistical analysis to evaluate associations with continuous variables and traits related to IR and MS in a PCOS population. Associations between ADRB2 haplotypes and PCOS were also assessed. We observed an age-adjusted association between ADRB2 haplotype CCGG and lower insulin (P = 0·018) and HOMA (P = 0·008) in the PCOS sample. Interestingly, the expected differences in surrogate measures of IR between cases and controls were not significant in CCGG/CCGG carriers. In the case-control study, genotype CCGG/CCGG was associated with a 14% decrease in PCOS risk (P = 0·043), taking into account confounding variables. Haplotype I (CCGG) has a protective role for IR and MS in PCOS. © 2012 Blackwell Publishing Ltd.

  7. An accurate clone-based haplotyping method by overlapping pool sequencing.

    Science.gov (United States)

    Li, Cheng; Cao, Changchang; Tu, Jing; Sun, Xiao

    2016-07-08

    Chromosome-long haplotyping of human genomes is important to identify genetic variants with differing gene expression, in human evolution studies, clinical diagnosis, and other biological and medical fields. Although several methods have realized haplotyping based on sequencing technologies or population statistics, accuracy and cost are factors that prohibit their wide use. Borrowing ideas from group testing theories, we proposed a clone-based haplotyping method by overlapping pool sequencing. The clones from a single individual were pooled combinatorially and then sequenced. According to the distinct pooling pattern for each clone in the overlapping pool sequencing, alleles for the recovered variants could be assigned to their original clones precisely. Subsequently, the clone sequences could be reconstructed by linking these alleles accordingly and assembling them into haplotypes with high accuracy. To verify the utility of our method, we constructed 130 110 clones in silico for the individual NA12878 and simulated the pooling and sequencing process. Ultimately, 99.9% of variants on chromosome 1 that were covered by clones from both parental chromosomes were recovered correctly, and 112 haplotype contigs were assembled with an N50 length of 3.4 Mb and no switch errors. A comparison with current clone-based haplotyping methods indicated our method was more accurate. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
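
    The assignment step described in this record, mapping an allele back to its clone from the set of pools in which it was seen, can be illustrated with a toy lookup. This is only a sketch under strong assumptions that the record does not state: each clone is placed in a distinct combination of pools, each allele comes from exactly one clone, and sequencing is error-free. The function names, clone labels and pool counts are hypothetical.

```python
from itertools import combinations


def assign_signatures(clone_ids, n_pools, pools_per_clone):
    """Give each clone a distinct set of pools (its pooling pattern)."""
    patterns = combinations(range(n_pools), pools_per_clone)
    signatures = {}
    for clone, pattern in zip(clone_ids, patterns):
        signatures[clone] = frozenset(pattern)
    if len(signatures) < len(clone_ids):
        raise ValueError("not enough distinct pooling patterns")
    return signatures


def decode(observed_pools, signatures):
    """Return the clone whose pattern equals the set of pools showing the allele."""
    lookup = {sig: clone for clone, sig in signatures.items()}
    return lookup.get(frozenset(observed_pools))


# Hypothetical layout: four clones spread over four pools, two pools per clone.
sigs = assign_signatures(["c1", "c2", "c3", "c4"], n_pools=4, pools_per_clone=2)
print(decode({0, 3}, sigs))  # clone placed in pools 0 and 3
print(decode({2, 3}, sigs))  # no clone uses this pattern
```

    With m pools and k pools per clone there are C(m, k) distinct patterns, so the number of pools grows much more slowly than the number of clones, which is the group-testing idea the record refers to.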

  8. Phylogeography of Thlaspi arvense (Brassicaceae) in China Inferred from Chloroplast and Nuclear DNA Sequences and Ecological Niche Modeling

    Directory of Open Access Journals (Sweden)

    Miao An

    2015-06-01

    Full Text Available Thlaspi arvense is a well-known annual farmland weed with worldwide distribution, which can be found from sea level to above 4000 m high on the Qinghai-Tibetan Plateau (QTP). In this paper, a phylogeographic history of T. arvense including 19 populations from China was inferred by using three chloroplast (cp) DNA segments (trnL-trnF, rpl32-trnL and rps16) and one nuclear (n) DNA segment (Fe-regulated transporter-like protein, ZIP). A total of 11 chloroplast haplotypes and six nuclear alleles were identified, and haplotypes unique to the QTP were recognized (C4, C5, C7 and N4). On the basis of molecular dating, haplotypes C4, C5 and C7 separated from the others around 1.58 Ma for cpDNA, which corresponds to the QTP uplift. In addition, this article suggests that the T. arvense populations in China are a mixture of diverged subpopulations as inferred by the hT/vT test (hT ≤ vT, cpDNA) and positive Tajima’s D values (1.87, 0.05 < p < 0.10 for cpDNA and 3.37, p < 0.01 for nDNA). Multimodal mismatch distribution curves and a relatively large shared area of suitable environmental conditions between the Last Glacial Maximum (LGM) and the present time, as recognized by MaxEnt software, reject the sudden expansion population model.

  9. Causal inference of asynchronous audiovisual speech

    Directory of Open Access Journals (Sweden)

    John F Magnotti

    2013-11-01

    Full Text Available During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.

  10. Statistical inference via fiducial methods

    OpenAIRE

    Salomé, Diemer

    1998-01-01

    In this thesis the attention is restricted to inductive reasoning using a mathematical probability model. A statistical procedure prescribes, for every theoretically possible set of data, the inference about the unknown of interest. ... See: Summary

  11. Statistical inference for stochastic processes

    National Research Council Canada - National Science Library

    Basawa, Ishwar V; Prakasa Rao, B. L. S

    1980-01-01

    The aim of this monograph is to attempt to reduce the gap between theory and applications in the area of stochastic modelling, by directing the interest of future researchers to the inference aspects...

  12. Inverse Ising Inference Using All the Data

    Science.gov (United States)

    Aurell, Erik; Ekeberg, Magnus

    2012-03-01

    We show that a method based on logistic regression, using all the data, solves the inverse Ising problem far better than mean-field calculations relying only on sample pairwise correlation functions, while still computationally feasible for hundreds of nodes. The largest improvement in reconstruction occurs for strong interactions. Using two examples, a diluted Sherrington-Kirkpatrick model and a two-dimensional lattice, we also show that interaction topologies can be recovered from few samples with good accuracy and that the use of l1 regularization is beneficial in this process, pushing inference abilities further into low-temperature regimes.
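
    The logistic-regression idea in this record rests on the fact that, in an Ising model with weights proportional to exp(Σ J_ij s_i s_j), the conditional probability of one spin given the others is a logistic function of the other spins, so each row of couplings can be fitted as a separate regression. The sketch below is a toy illustration and not the authors' implementation: it assumes zero external fields, omits the l1 penalty, and replaces sampled data with the exact distribution of a three-spin system so that the result is deterministic. All names, the step size and the iteration count are choices made for this example.

```python
import math
from itertools import product


def exact_distribution(J, n):
    """All 2^n states with probabilities p(s) ~ exp(sum_{i<j} J[i][j] * s_i * s_j)."""
    states = list(product((-1, 1), repeat=n))
    weights = [
        math.exp(sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n)))
        for s in states
    ]
    z = sum(weights)
    return states, [w / z for w in weights]


def fit_spin(i, states, probs, n, lr=0.5, steps=2000):
    """Logistic regression of spin i on the other spins by gradient ascent."""
    w = [0.0] * n  # w[j] estimates J_ij; w[i] is never updated and stays 0
    for _ in range(steps):
        grad = [0.0] * n
        for s, p in zip(states, probs):
            theta = sum(w[j] * s[j] for j in range(n) if j != i)
            # P(s_i | others) = sigmoid(2 * s_i * theta); resid is 1 - sigmoid
            resid = 1.0 - 1.0 / (1.0 + math.exp(-2.0 * s[i] * theta))
            for j in range(n):
                if j != i:
                    grad[j] += p * 2.0 * s[i] * s[j] * resid
        w = [wj + lr * gj for wj, gj in zip(w, grad)]
    return w


def infer_couplings(states, probs, n):
    """Fit every spin separately, then average the two estimates of each J_ij."""
    rows = [fit_spin(i, states, probs, n) for i in range(n)]
    return [[0.5 * (rows[i][j] + rows[j][i]) for j in range(n)] for i in range(n)]


# Toy chain: spins 0-1 coupled at 0.5, spins 1-2 at -0.3, spins 0-2 uncoupled.
J_true = [[0.0, 0.5, 0.0], [0.5, 0.0, -0.3], [0.0, -0.3, 0.0]]
states, probs = exact_distribution(J_true, 3)
J_est = infer_couplings(states, probs, 3)
for row in J_est:
    print([round(x, 3) for x in row])
```

    With real data the sum over states would be replaced by an average over observed samples, and an l1 penalty on the weights would encourage sparse interaction topologies as the record describes.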

  13. BagReg: Protein inference through machine learning.

    Science.gov (United States)

    Zhao, Can; Liu, Dao; Teng, Ben; He, Zengyou

    2015-08-01

    Protein inference from the identified peptides is of primary importance in shotgun proteomics. The target of protein inference is to identify whether each candidate protein is truly present in the sample. To date, many computational methods have been proposed to solve this problem. However, there is still no method that can fully utilize the information hidden in the input data. In this article, we propose a learning-based method named BagReg for protein inference. The method first artificially extracts five features from the input data, and then chooses each feature in turn as the class feature to build separate models that predict the presence probabilities of proteins. Finally, the weak results from the five prediction models are aggregated to obtain the final result. We test our method on six publicly available data sets. The experimental results show that our method is superior to the state-of-the-art protein inference algorithms. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. SDG multiple fault diagnosis by real-time inverse inference

    International Nuclear Information System (INIS)

    Zhang Zhaoqian; Wu Chongguang; Zhang Beike; Xia Tao; Li Anfeng

    2005-01-01

    In the past 20 years, the signed directed graph (SDG), one of the qualitative simulation technologies, has been widely applied in the field of chemical fault diagnosis. However, many earlier researchers assumed a single fault origin. As a result, combinatorial explosion arises, which has limited the application of SDG to real processes. This is mainly because most earlier researchers used the forward inference engine of commercial expert system software to carry out inverse diagnostic inference on the SDG model, which violates the internal principle of the diagnosis mechanism. In this paper, we present a new SDG multiple-fault diagnosis method based on real-time inverse inference. It is a genuine multiple-fault diagnosis method, and its inference engine uses an inverse mechanism. Finally, we give an example of a diagnosis system for a 65 t/h furnace to demonstrate its applicability and efficiency.

  15. SDG multiple fault diagnosis by real-time inverse inference

    Energy Technology Data Exchange (ETDEWEB)

    Zhang Zhaoqian; Wu Chongguang; Zhang Beike; Xia Tao; Li Anfeng

    2005-02-01

    In the past 20 years, the signed directed graph (SDG), one of the qualitative simulation technologies, has been widely applied in the field of chemical fault diagnosis. However, many earlier researchers assumed a single fault origin. As a result, combinatorial explosion arises, which has limited the application of SDG to real processes. This is mainly because most earlier researchers used the forward inference engine of commercial expert system software to carry out inverse diagnostic inference on the SDG model, which violates the internal principle of the diagnosis mechanism. In this paper, we present a new SDG multiple-fault diagnosis method based on real-time inverse inference. It is a genuine multiple-fault diagnosis method, and its inference engine uses an inverse mechanism. Finally, we give an example of a diagnosis system for a 65 t/h furnace to demonstrate its applicability and efficiency.

  16. Genetic population structure of the desert shrub species Lycium ruthenicum inferred from chloroplast DNA

    International Nuclear Information System (INIS)

    Chen, H.; Yonezawa, T.

    2014-01-01

    Lycium ruthenicum (Solanaceae), a spiny shrub mostly distributed in the desert regions of north and northwest China, has been shown to exhibit high tolerance to extreme environments. In this study, the phylogeography and evolutionary history of L. ruthenicum were examined on the basis of 80 individuals from eight populations. Using the sequence variations of two spacer regions of chloroplast DNA (trnH-psbA and rps16-trnK), the absence of a geographic component in the chloroplast DNA genetic structure was identified (GST = 0.351, NST = 0.304, NST < GST), which was consistent with the result of SAMOVA, suggesting weak phylogeographic structure in this species. Phylogenetic and network analyses showed that the 10 haplotypes identified in the present study clustered into two clades, in which clade I harbored the ancestral haplotypes, from which two independent glacial refugia, in the middle of the Qaidam Basin and in western Inner Mongolia, were inferred. The existence of regional evolutionary differences was supported by GENETREE, which revealed that one of the populations in the Qaidam Basin and the two populations in the Tarim Basin had experienced rapid expansion, while the other populations retained a relatively stable population size during the Pleistocene. Given the results of long-term gene flow and pairwise differences, strong gene flow was insufficient to reduce the genetic differentiation among populations or within populations, probably due to the genetic composition containing a common haplotype and the high number of private haplotypes fixed for most of the populations. The divergence times of different lineages were consistent with the rapid uplift phases of the Qinghai-Tibetan Plateau and the initiation and expansion of deserts in northern China, suggesting that the origin and evolution of L. ruthenicum were strongly influenced by Quaternary environmental changes. (author)

  17. Inverse problem of solar oscillations

    International Nuclear Information System (INIS)

    Sekii, T.; Shibahashi, H.

    1987-01-01

    The authors present some preliminary results of numerical simulation to infer the sound velocity distribution in the solar interior from the oscillation data of the Sun as the inverse problem. They analyze the acoustic potential itself by taking account of some factors other than the sound velocity, and infer the sound velocity distribution in the deep interior of the Sun

  18. Optimal inference with suboptimal models: Addiction and active Bayesian inference

    Science.gov (United States)

    Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl

    2015-01-01

    When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321

  19. Minimum description length block finder, a method to identify haplotype blocks and to compare the strength of block boundaries.

    Science.gov (United States)

    Mannila, H; Koivisto, M; Perola, M; Varilo, T; Hennah, W; Ekelund, J; Lukk, M; Peltonen, L; Ukkonen, E

    2003-07-01

    We describe a new probabilistic method for finding haplotype blocks that is based on the use of the minimum description length (MDL) principle. We give a rigorous definition of the quality of a segmentation of a genomic region into blocks and describe a dynamic programming algorithm for finding the optimal segmentation with respect to this measure. We also describe a method for finding the probability of a block boundary for each pair of adjacent markers: this gives a tool for evaluating the significance of each block boundary. We have applied the method to the published data of Daly and colleagues. The results expose some problems that exist in the current methods for the evaluation of the significance of predicted block boundaries. Our method, MDL block finder, can be used to compare block borders in different sample sets, and we demonstrate this by applying the MDL-based method to define the block structure in chromosomes from population isolates.
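
    The dynamic programming step mentioned in this abstract can be sketched generically: if the quality of a segmentation is a sum of per-block costs, the optimal segmentation of the first `end` markers is obtained by trying every possible start of the last block. The cost function below is a toy stand-in chosen only for illustration (a fixed per-block penalty plus the number of distinct haplotype patterns times the block length); it is not the MDL measure defined in the paper, and the function names and example haplotypes are hypothetical.

```python
def block_cost(haplotypes, start, end, penalty):
    """Toy cost of treating markers start..end-1 as one block (not the paper's MDL score)."""
    distinct = len({h[start:end] for h in haplotypes})
    return penalty + distinct * (end - start)

def best_segmentation(haplotypes, penalty=1.0):
    """Minimum-cost segmentation of the markers into contiguous blocks."""
    m = len(haplotypes[0])
    best = [0.0] + [float("inf")] * m   # best[k]: optimal cost of the first k markers
    back = [0] * (m + 1)                # back[k]: start index of the last block
    for end in range(1, m + 1):
        for start in range(end):
            total = best[start] + block_cost(haplotypes, start, end, penalty)
            if total < best[end]:
                best[end], back[end] = total, start
    # Walk the back-pointers to recover the block boundaries.
    blocks, end = [], m
    while end > 0:
        blocks.append((back[end], end))
        end = back[end]
    return best[m], blocks[::-1]

# Hypothetical sample: markers 0-1 and markers 2-3 each show only two patterns.
haps = ["0000", "0011", "1100", "1111"]
cost, blocks = best_segmentation(haps)
```

    Here the search returns two blocks, `(0, 2)` and `(2, 4)`, because each pair of markers shows only two distinct patterns whereas all four markers together show four. The double loop makes the sketch quadratic in the number of markers, before counting the cost evaluation itself.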

  20. Genetic polymorphisms and haplotypes of the organic cation transporter 1 gene (SLC22A1) in the Xhosa population of South Africa

    Directory of Open Access Journals (Sweden)

    Clifford Jacobs

    2014-06-01

    Full Text Available Human organic cation transporter 1 is primarily expressed in hepatocytes and mediates the electrogenic transport of various endogenous and exogenous compounds, including clinically important drugs. Genetic polymorphisms in the gene coding for human organic cation transporter 1, SLC22A1, are increasingly being recognized as a possible mechanism explaining the variable response to clinical drugs, which are substrates for this transporter. The genotypic and allelic distributions of 19 nonsynonymous and one intronic SLC22A1 single nucleotide polymorphisms were determined in 148 healthy Xhosa participants from South Africa, using a SNaPshot® multiplex assay. In addition, haplotype structure for SLC22A1 was inferred from the genotypic data. The minor allele frequencies for S14F (rs34447885), P341L (rs2282143), V519F (rs78899680), and the intronic variant rs622342 were 1.7%, 8.4%, 3.0%, and 21.6%, respectively. None of the participants carried the variant allele for R61C (rs12208357), C88R (rs55918055), S189L (rs34104736), G220V (rs36103319), P283L (rs4646277), R287G (rs4646278), G401S (rs34130495), M440I (rs35956182), or G465R (rs34059508). In addition, no variant alleles were observed for A306T (COSM164365), A413V (rs144322387), M420V (rs142448543), I421F (rs139512541), C436F (rs139512541), V501E (rs143175763), or I542V (rs137928512) in the population. Eight haplotypes were inferred from the genotypic data. This study reports important genetic data that could be useful for future pharmacogenetic studies of drug transporters in the indigenous Sub-Saharan African populations.
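
    For readers unfamiliar with the term, a minor allele frequency for a biallelic variant follows directly from genotype counts: each individual carries two alleles, so a heterozygote contributes one copy of the variant allele and a variant homozygote contributes two. The sketch below shows that arithmetic. The function names and the genotype counts in the example are hypothetical and are not taken from the study.

```python
def allele_frequencies(n_ref_hom, n_het, n_alt_hom):
    """Return (reference, variant) allele frequencies from diploid genotype counts."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    alt = (n_het + 2 * n_alt_hom) / total_alleles
    return 1.0 - alt, alt

def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Frequency of whichever allele is rarer in the sample."""
    return min(allele_frequencies(n_ref_hom, n_het, n_alt_hom))

# Hypothetical counts: 70 reference homozygotes, 25 heterozygotes, 5 variant homozygotes.
maf = minor_allele_frequency(70, 25, 5)  # (25 + 2*5) / 200 = 0.175
```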

  1. Statistical theory and inference

    CERN Document Server

    Olive, David J

    2014-01-01

    This text is for a one-semester graduate course in statistical theory and covers minimal and complete sufficient statistics, maximum likelihood estimators, method of moments, bias and mean square error, uniform minimum variance estimators and the Cramér-Rao lower bound, an introduction to large sample theory, likelihood ratio tests and uniformly most powerful tests and the Neyman-Pearson lemma. A major goal of this text is to make these topics much more accessible to students by using the theory of exponential families. Exponential families, indicator functions and the support of the distribution are used throughout the text to simplify the theory. More than 50 "brand name" distributions are used to illustrate the theory with many examples of exponential families, maximum likelihood estimators and uniformly minimum variance unbiased estimators. There are many homework problems with over 30 pages of solutions.

  2. Identification of the Mislabeled Breast Cancer Samples by Mitochondrial DNA Haplotyping

    Directory of Open Access Journals (Sweden)

    Xiaogang Chen

    2015-01-01

    Full Text Available Identifying whether an archival malignant tumor specimen has been mislabeled or interchanged is a challenging task for forensic genetics. Nuclear DNA (nDNA) markers are affected by the aberrations of tumor cells, so they are not suitable for personal identification when tumor tissues are tested. In this study, we focused on a new solution: mitochondrial single nucleotide polymorphism (mtSNP) haplotyping by a multiplex SNaPshot assay. To validate our strategy of haplotyping with 25 mtSNPs, we analyzed 15 pairs of cancerous/healthy tissues taken from patients with ductal breast carcinoma. The haplotypes of all fifteen breast cancer tissues matched those of their paired breast tissues. Heteroplasmy at 2 sites, 14783A/G and 16519C/T, was observed in one breast tissue, which indicated a mixture of related mitochondrial haplotypes. However, only one haplotype was retained in the paired breast cancer tissue, which could be considered the result of proliferation of a tumor subclone. Allele drop-out and allele drop-in were observed when 39 STRs and 20 tri-allelic SNPs of nDNA were applied. Compared to the nDNA markers applied, the 25 mtSNPs were more stable, without interference from the aberrations of breast cancer. Also, two cases were presented where the investigation of the haplotype with 25 mtSNPs was used to prove the origin of a biopsy specimen with breast cancer. The mislabeling of the biopsy specimen with breast cancer could be certified in one case but could not be supported in the other case. We highlight the importance of the stability of the mtSNP haplotype in breast cancer. Our results imply that the multiplex SNaPshot assay with 25 mtSNPs is a useful strategy for identifying mislabeled breast cancer specimens.

  3. A Haplotype Information Theory Method Reveals Genes of Evolutionary Interest in European vs. Asian Pigs.

    Science.gov (United States)

    Hudson, Nicholas J; Naval-Sánchez, Marina; Porto-Neto, Laercio; Pérez-Enciso, Miguel; Reverter, Antonio

    2018-06-05

    Asian and European wild boars were independently domesticated ca. 10,000 years ago. Since the 17th century, Chinese breeds have been imported to Europe to improve the genetics of European animals by introgression of favourable alleles, resulting in a complex mosaic of haplotypes. To interrogate the structure of these haplotypes further, we have run a new haplotype segregation analysis based on information theory, namely compression efficiency (CE). We applied the approach to sequence data from individuals from each phylogeographic region (n = 23 from Asia and Europe) including a number of major pig breeds. Our genome-wide CE is able to discriminate the breeds in a manner reflecting phylogeography. Furthermore, 24,956 non-overlapping sliding windows (each comprising 1,000 consecutive SNPs) were quantified for extent of haplotype sharing within and between Asia and Europe. The genome-wide distribution of the extent of haplotype sharing was quite different between groups. Unlike in European pigs, haplotype sharing in Asian pigs approximates a normal distribution. In line with this, we found that the European breeds possessed a number of genomic windows of dramatically higher haplotype sharing than the Asian breeds. Our CE analysis of sliding windows captures some of the genomic regions reported to contain signatures of selection in domestic pigs. Prominent among these regions, we highlight the role of a gene encoding the mitochondrial enzyme LACTB, which has been associated with obesity, and the gene encoding MYOG, a fundamental transcriptional regulator of myogenesis. The origin of these regions likely reflects either a population bottleneck in European animals, or selective targets on commercial phenotypes reducing allelic diversity in particular genes and/or regulatory regions.

  4. F8 haplotype and inhibitor risk: results from the Hemophilia Inhibitor Genetics Study (HIGS) Combined Cohort

    Science.gov (United States)

    Schwarz, John; Astermark, Jan; Menius, Erika D.; Carrington, Mary; Donfield, Sharyne M.; Gomperts, Edward D.; Nelson, George W.; Oldenburg, Johannes; Pavlova, Anna; Shapiro, Amy D.; Winkler, Cheryl A.; Berntorp, Erik

    2012-01-01

    Background Ancestral background, specifically African descent, confers higher risk for development of inhibitory antibodies to factor VIII (FVIII) in hemophilia A. It has been suggested that differences in the distribution of factor VIII gene (F8) haplotypes, and mismatch between endogenous F8 haplotypes and those comprising products used for treatment could contribute to risk. Design and Methods Data from the HIGS Combined Cohort were used to determine the association between F8 haplotype 3 (H3) vs. haplotypes 1 and 2 (H1+H2) and inhibitor risk among individuals of genetically-determined African descent. Other variables known to affect inhibitor risk including type of F8 mutation and HLA were included in the analysis. A second research question regarding risk related to mismatch in endogenous F8 haplotype and recombinant FVIII products used for treatment was addressed. Results H3 was associated with higher inhibitor risk among those genetically-identified (N=49) as of African ancestry, but the association did not remain significant after adjustment for F8 mutation type and the HLA variables. Among subjects of all racial ancestries enrolled in HIGS who reported early use of recombinant products (N=223), mismatch in endogenous haplotype and the FVIII proteins constituting the products used did not confer greater risk for inhibitor development. Conclusion H3 was not an independent predictor of inhibitor risk. Further, our findings did not support a higher risk of inhibitors in the presence of a haplotype mismatch between the FVIII molecule infused and that of the individual. PMID:22958194

  5. Bayesian Estimation and Inference using Stochastic Hardware

    Directory of Open Access Journals (Sweden)

    Chetan Singh Thakur

    2016-03-01

    Full Text Available In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using a single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes on the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
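
    The robustness claim at the end of this abstract can be made concrete with a software simulation of a textbook stochastic-computing construction: a probability is encoded as the fraction of 1s in a random bit stream, and a bitwise AND of two independent streams yields a stream whose fraction of 1s approximates the product of the two probabilities. This is a sketch of the general idea only. Whether the authors' circuits use exactly this encoding is an assumption, and the function names, stream length, probabilities and 1% error rate are illustrative choices.

```python
import random

def bitstream(p, length, rng):
    """Encode probability p as a stream of independent Bernoulli(p) bits."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def decode(bits):
    """Recover the encoded value as the fraction of 1s in the stream."""
    return sum(bits) / len(bits)

rng = random.Random(0)   # fixed seed so the simulation is repeatable
n = 100_000
a = bitstream(0.5, n, rng)
b = bitstream(0.6, n, rng)

# A single AND gate per bit pair multiplies the two encoded probabilities.
product = [x & y for x, y in zip(a, b)]
estimate = decode(product)          # close to 0.5 * 0.6 = 0.3

# Flip 1% of the bits at random positions to mimic transient hardware faults.
noisy = list(product)
for idx in rng.sample(range(n), n // 100):
    noisy[idx] ^= 1
noisy_estimate = decode(noisy)      # shifts by at most 0.01
```

    Because every bit carries the same small weight, corrupting k bits out of n changes the decoded value by at most k/n, in contrast to a positional binary encoding where a single flipped high-order bit can change the value drastically.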

  6. Bayesian Estimation and Inference Using Stochastic Electronics.

    Science.gov (United States)

    Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André

    2016-01-01

    In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using a single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes on the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
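
    The "Bayesian recursive equation" used for tracking with a Hidden Markov Model is the standard predict-then-update filter: the current belief over positions is pushed through the transition model, multiplied by the likelihood of the new observation, and renormalised. The following software sketch shows one such recursion on a three-cell, one-dimensional grid. The transition matrix, the sensor accuracy and the function names are hypothetical values chosen for illustration; the paper's hardware realisation and its learned models are not reproduced here.

```python
def filter_step(belief, transition, likelihood):
    """One recursion: predict with the transition model, then update with the observation."""
    n = len(belief)
    # Prediction: P(x_t = j) = sum_i P(x_{t-1} = i) * P(x_t = j | x_{t-1} = i)
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Update: weight by P(z_t | x_t = j) and renormalise.
    unnormalised = [likelihood[j] * predicted[j] for j in range(n)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

def observation_likelihood(z, n, hit=0.8):
    """Assumed sensor model: correct cell with probability `hit`, otherwise uniform error."""
    return [hit if j == z else (1.0 - hit) / (n - 1) for j in range(n)]

# Hypothetical transition model: the target mostly stays put or moves to a neighbour.
TRANSITION = [[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]]

belief = [1 / 3, 1 / 3, 1 / 3]       # uniform prior over three cells
first = filter_step(belief, TRANSITION, observation_likelihood(2, 3))
second = filter_step(first, TRANSITION, observation_likelihood(2, 3))
```

    After the first observation of cell 2, the belief in that cell rises from 1/3 to 0.24/0.31 (about 0.77), and a second consistent observation sharpens it further.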

  7. Inferring network topology from complex dynamics

    International Nuclear Information System (INIS)

    Shandilya, Srinivas Gorur; Timme, Marc

    2011-01-01

    Inferring the network topology from dynamical observations is a fundamental problem pervading research on complex systems. Here, we present a simple, direct method for inferring the structural connection topology of a network, given an observation of one collective dynamical trajectory. The general theoretical framework is applicable to arbitrary network dynamical systems described by ordinary differential equations. No interference (external driving) is required and the type of dynamics is hardly restricted in any way. In particular, the observed dynamics may be arbitrarily complex; stationary, invariant or transient; synchronous or asynchronous and chaotic or periodic. Presupposing a knowledge of the functional form of the dynamical units and of the coupling functions between them, we present an analytical solution to the inverse problem of finding the network topology from observing a time series of state variables only. Robust reconstruction is achieved in any sufficiently long generic observation of the system. We extend our method to simultaneously reconstructing both the entire network topology and all parameters appearing linear in the system's equations of motion. Reconstruction of network topology and system parameters is viable even in the presence of external noise that distorts the original dynamics substantially. The method provides a conceptually new step towards reconstructing a variety of real-world networks, including gene and protein interaction networks and neuronal circuits.

  8. Models for inference in dynamic metacommunity systems

    Science.gov (United States)

    Dorazio, Robert M.; Kery, Marc; Royle, J. Andrew; Plattner, Matthias

    2010-01-01

    A variety of processes are thought to be involved in the formation and dynamics of species assemblages. For example, various metacommunity theories are based on differences in the relative contributions of dispersal of species among local communities and interactions of species within local communities. Interestingly, metacommunity theories continue to be advanced without much empirical validation. Part of the problem is that statistical models used to analyze typical survey data either fail to specify ecological processes with sufficient complexity or they fail to account for errors in detection of species during sampling. In this paper, we describe a statistical modeling framework for the analysis of metacommunity dynamics that is based on the idea of adopting a unified approach, multispecies occupancy modeling, for computing inferences about individual species, local communities of species, or the entire metacommunity of species. This approach accounts for errors in detection of species during sampling and also allows different metacommunity paradigms to be specified in terms of species- and location-specific probabilities of occurrence, extinction, and colonization: all of which are estimable. In addition, this approach can be used to address inference problems that arise in conservation ecology, such as predicting temporal and spatial changes in biodiversity for use in making conservation decisions. To illustrate, we estimate changes in species composition associated with the species-specific phenologies of flight patterns of butterflies in Switzerland for the purpose of estimating regional differences in biodiversity.

  9. A comprehensive literature review of haplotyping software and methods for use with unrelated individuals

    Directory of Open Access Journals (Sweden)

    Salem Rany M

    2005-03-01

    Full Text Available Abstract Interest in the assignment and frequency analysis of haplotypes in samples of unrelated individuals has increased immeasurably as a result of the emphasis placed on haplotype analyses by, for example, the International HapMap Project and related initiatives. Although there are many available computer programs for haplotype analysis applicable to samples of unrelated individuals, many of these programs have limitations and/or very specific uses. In this paper, the key features of available haplotype analysis software for use with unrelated individuals, as well as pooled DNA samples from unrelated individuals, are summarised. Programs for haplotype analysis were identified through keyword searches on PUBMED and various internet search engines, a review of citations from retrieved papers and personal communications, up to June 2004. Priority was given to functioning computer programs, rather than theoretical models and methods. The available software was considered in light of a number of factors: the algorithm(s) used, algorithm accuracy, assumptions, the accommodation of genotyping error, implementation of hypothesis testing, handling of missing data, software characteristics and web-based implementations. Review papers comparing specific methods and programs are also summarised. Forty-six haplotyping programs were identified and reviewed. The programs were divided into two groups: those designed for individual genotype data (a total of 43 programs) and those designed for use with pooled DNA samples (a total of three programs). The accuracy of the programs under various criteria is assessed, and the programs are categorised and discussed in light of: algorithm and method, accuracy, assumptions, genotyping error, hypothesis testing, missing data, software characteristics and web implementation. Many available programs have limitations (eg some cannot accommodate missing data) and/or are designed with specific tasks in mind (eg estimating

  10. Bayesian inference data evaluation and decisions

    CERN Document Server

    Harney, Hanns Ludwig

    2016-01-01

    This new edition offers a comprehensive introduction to the analysis of data using Bayes rule. It generalizes Gaussian error intervals to situations in which the data follow distributions other than Gaussian. This is particularly useful when the observed parameter is barely above the background or the histogram of multiparametric data contains many empty bins, so that the determination of the validity of a theory cannot be based on the chi-squared-criterion. In addition to the solutions of practical problems, this approach provides an epistemic insight: the logic of quantum mechanics is obtained as the logic of unbiased inference from counting data. New sections feature factorizing parameters, commuting parameters, observables in quantum mechanics, the art of fitting with coherent and with incoherent alternatives and fitting with multinomial distribution. Additional problems and examples help deepen the knowledge. Requiring no knowledge of quantum mechanics, the book is written on introductory level, with man...

  11. Bootstrapping phylogenies inferred from rearrangement data

    Directory of Open Access Journals (Sweden)

    Lin Yu

    2012-08-01

    Full Text Available Abstract Background Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. Results We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Conclusions Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its

  12. Bootstrapping phylogenies inferred from rearrangement data.

    Science.gov (United States)

    Lin, Yu; Rajan, Vaibhav; Moret, Bernard Me

    2012-08-29

    Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its support values follow a similar scale and its receiver

  13. Mitochondrial control region haplotypes of the South American sea lion Otaria flavescens (Shaw, 1800).

    Science.gov (United States)

    Artico, L O; Bianchini, A; Grubel, K S; Monteiro, D S; Estima, S C; Oliveira, L R de; Bonatto, S L; Marins, L F

    2010-09-01

    The South American sea lion, Otaria flavescens, is widely distributed along the Pacific and Atlantic coasts of South America. However, along the Brazilian coast, there are only two nonbreeding sites for the species (Refúgio de Vida Silvestre da Ilha dos Lobos and Refúgio de Vida Silvestre do Molhe Leste da Barra do Rio Grande), both in Southern Brazil. In this region, the species is continuously under the effect of anthropic activities, mainly those related to environmental contamination with organic and inorganic chemicals and fishery interactions. This paper reports, for the first time, the genetic diversity of O. flavescens found along the Southern Brazilian coast. A 287-bp fragment of the mitochondrial DNA control region (D-loop) was analyzed. Seven novel haplotypes were found in 56 individuals (OFA1-OFA7), with OFA1 being the most frequent (47.54%). Nucleotide diversity was moderate (π = 0.62%) and haplotype diversity was relatively low (67%). Furthermore, the median joining network analysis indicated that Brazilian haplotypes formed a reciprocal monophyletic clade when compared to the haplotypes from the Peruvian population on the Pacific coast. These two populations do not share haplotypes and may have become isolated some time back. Further genetic studies covering the entire species distribution are necessary to better understand the biological implications of the results reported here for the management and conservation of South American sea lions.
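
    The haplotype diversity figure quoted above is commonly computed with Nei's unbiased estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies). The record does not say which estimator the authors used, so the following is only a minimal sketch under that assumption; the function name and the example counts are hypothetical and are not the study's data.

```python
def haplotype_diversity(counts):
    """Nei's unbiased haplotype (gene) diversity from per-haplotype counts.

    counts: number of sampled individuals carrying each distinct haplotype.
    Assumption: this is the estimator behind the reported value; the
    abstract itself does not specify it.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    # Sum of squared haplotype frequencies (probability two draws match).
    sum_sq = sum((c / n) ** 2 for c in counts)
    # n / (n - 1) corrects for sampling without replacement.
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical counts, for illustration only.
example = haplotype_diversity([2, 2])
```

    A sample dominated by one haplotype yields a low value, which is the pattern the abstract describes for OFA1.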

  14. Mitochondrial control region haplotypes of the South American sea lion Otaria flavescens (Shaw, 1800)

    Directory of Open Access Journals (Sweden)

    L.O. Artico

    2010-09-01

    Full Text Available The South American sea lion, Otaria flavescens, is widely distributed along the Pacific and Atlantic coasts of South America. However, along the Brazilian coast, there are only two nonbreeding sites for the species (Refúgio de Vida Silvestre da Ilha dos Lobos and Refúgio de Vida Silvestre do Molhe Leste da Barra do Rio Grande), both in Southern Brazil. In this region, the species is continuously under the effect of anthropic activities, mainly those related to environmental contamination with organic and inorganic chemicals and fishery interactions. This paper reports, for the first time, the genetic diversity of O. flavescens found along the Southern Brazilian coast. A 287-bp fragment of the mitochondrial DNA control region (D-loop) was analyzed. Seven novel haplotypes were found in 56 individuals (OFA1-OFA7), with OFA1 being the most frequent (47.54%). Nucleotide diversity was moderate (π = 0.62%) and haplotype diversity was relatively low (67%). Furthermore, the median joining network analysis indicated that Brazilian haplotypes formed a reciprocal monophyletic clade when compared to the haplotypes from the Peruvian population on the Pacific coast. These two populations do not share haplotypes and may have become isolated some time back. Further genetic studies covering the entire species distribution are necessary to better understand the biological implications of the results reported here for the management and conservation of South American sea lions.

  15. PADI4 Haplotypes in Association with RA Mexican Patients, a New Prospect for Antigen Modulation

    Directory of Open Access Journals (Sweden)

    Maria Guadalupe Zavala-Cerna

    2013-01-01

    Full Text Available Peptidyl arginine deiminase IV (PAD 4) is the responsible enzyme for a posttranslational modification called citrullination, originating the antigenic determinant recognized by anti-cyclic citrullinated peptide antibodies (ACPA). Four SNPs (single nucleotide polymorphisms) have been described in PADI4 gene to form a susceptibility haplotype for rheumatoid arthritis (RA); nevertheless, results in association studies appear contradictory in different populations. The aim of the study was to analyze if the presence of three SNPs in PADI4 gene susceptibility haplotype (GTG) is associated with ACPA positivity in patients with RA. This was a cross-sectional study that included 86 RA patients and 98 healthy controls. Polymorphisms PADI4_89, PADI4_90, and PADI4_92 in the PADI4 gene were genotyped. The susceptibility haplotype (GTG) was more frequent in RA patients; interestingly, we found a new haplotype associated with RA with a higher frequency (GTC). There were no associations between polymorphisms and high scores in Spanish HAQ-DI and DAS-28, but we did find an association between RARBIS index and PADI4_89, PADI4_90 polymorphisms. We could not confirm an association between susceptibility haplotype presence and ACPA positivity. Further evidence about proteomic expression of this gene will determine its participation in antigenic generation and autoimmunity.

  16. The impact of sample size and marker selection on the study of haplotype structures

    Directory of Open Access Journals (Sweden)

    Sun Xiao

    2004-03-01

    Full Text Available Abstract Several studies of haplotype structures in the human genome in various populations have found that the human chromosomes are structured such that each chromosome can be divided into many blocks, within which there is limited haplotype diversity. In addition, only a few genetic markers in a putative block are needed to capture most of the diversity within a block. There has been no systematic empirical study of the effects of sample size and marker set on the identified block structures and representative marker sets, however. The purpose of this study was to conduct a detailed empirical study to examine such impacts. Towards this goal, we have analysed three representative autosomal regions from a large genome-wide study of haplotypes with samples consisting of African-Americans and samples consisting of Japanese and Chinese individuals. For both populations, we have found that the sample size and marker set have significant impact on the number of blocks and the total number of representative markers identified. The marker set in particular has very strong impacts, and our results indicate that the marker density in the original datasets may not be adequate to allow a meaningful characterisation of haplotype structures. In general, we conclude that we need a relatively large sample size and a very dense marker panel in the study of haplotype structures in human populations.

  17. Haplotype analysis of common variants in the BRCA1 gene and risk of sporadic breast cancer

    International Nuclear Information System (INIS)

    Cox, David G; Kraft, Peter; Hankinson, Susan E; Hunter, David J

    2005-01-01

    Truncation mutations in the BRCA1 gene cause a substantial increase in risk of breast cancer. However, these mutations are rare in the general population and account for little of the overall incidence of sporadic breast cancer. We used whole-gene resequencing data to select haplotype tagging single nucleotide polymorphisms, and examined the association between common haplotypes of BRCA1 and breast cancer in a nested case-control study in the Nurses' Health Study (1323 cases and 1910 controls). One haplotype was associated with a slight increase in risk (odds ratio 1.18, 95% confidence interval 1.02–1.37). A significant interaction (P = 0.05) was seen between this haplotype, positive family history of breast cancer, and breast cancer risk. Although not statistically significant, similar interactions were observed with age at diagnosis and with menopausal status at diagnosis; risk tended to be higher among younger, pre-menopausal women. We have described a haplotype in the BRCA1 gene that was associated with an approximately 20% increase in risk of sporadic breast cancer in the general population. However, the functional variant(s) responsible for the association are unclear

  18. Haplotypes in the dystrophin DNA segment point to a mosaic origin of modern human diversity.

    Science.gov (United States)

    Zietkiewicz, Ewa; Yotova, Vania; Gehl, Dominik; Wambach, Tina; Arrieta, Isabel; Batzer, Mark; Cole, David E C; Hechtman, Peter; Kaplan, Feige; Modiano, David; Moisan, Jean-Paul; Michalski, Roman; Labuda, Damian

    2003-11-01

    Although Africa has played a central role in human evolutionary history, certain studies have suggested that not all contemporary human genetic diversity is of recent African origin. We investigated 35 simple polymorphic sites and one T(n) microsatellite in an 8-kb segment of the dystrophin gene. We found 86 haplotypes in 1,343 chromosomes from around the world. Although a classical out-of-Africa topology was observed in trees based on the variant frequencies, the tree of haplotype sequences reveals three lineages accounting for present-day diversity. The proportion of new recombinants and the diversity of the T(n) microsatellite were used to estimate the age of haplotype lineages and the time of colonization events. The lineage that underwent the great expansion originated in Africa prior to the Upper Paleolithic (27,000-56,000 years ago). A second group, of structurally distinct haplotypes that occupy a central position on the tree, has never left Africa. The third lineage is represented by the haplotype that lies closest to the root, is virtually absent in Africa, and appears older than the recent out-of-Africa expansion. We propose that this lineage could have left Africa before the expansion (as early as 160,000 years ago) and admixed, outside of Africa, with the expanding lineage. Contemporary human diversity, although dominated by the recently expanded African lineage, thus represents a mosaic of different contributions.

  19. Fused Regression for Multi-source Gene Regulatory Network Inference.

    Directory of Open Access Journals (Sweden)

    Kari Y Lam

    2016-12-01

    Full Text Available Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms and single cell types). We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships such as orthology incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross species network inference. Last, we demonstrate our method's utility in learning from data collected on different experimental platforms.

  20. Eight challenges in phylodynamic inference

    Directory of Open Access Journals (Sweden)

    Simon D.W. Frost

    2015-03-01

    Full Text Available The field of phylodynamics, which attempts to enhance our understanding of infectious disease dynamics using pathogen phylogenies, has made great strides in the past decade. Basic epidemiological and evolutionary models are now well characterized with inferential frameworks in place. However, significant challenges remain in extending phylodynamic inference to more complex systems. These challenges include accounting for evolutionary complexities such as changing mutation rates, selection, reassortment, and recombination, as well as epidemiological complexities such as stochastic population dynamics, host population structure, and different patterns at the within-host and between-host scales. An additional challenge exists in making efficient inferences from an ever increasing corpus of sequence data.

  1. Asymptotic inference for jump diffusions with state-dependent intensity

    NARCIS (Netherlands)

    Becheri, Gaia; Drost, Feico; Werker, Bas

    2016-01-01

    We establish the local asymptotic normality property for a class of ergodic parametric jump-diffusion processes with state-dependent intensity and known volatility function sampled at high frequency. We prove that the inference problem about the drift and jump parameters is adaptive with respect to

  2. The Impact of Transitive Inference Operations on Mathematics ...

    African Journals Online (AJOL)

    This study examined the extent to which operations of transitive inference tasks have affected the mathematics problem solving abilities of pre-primary school children. Four research hypotheses were tested at 0.05 level of significance using 400 nursery school children whose ages ranged between 4.5 and 5.5 years ...

  3. Cortical hierarchies perform Bayesian causal inference in multisensory perception.

    Directory of Open Access Journals (Sweden)

    Tim Rohe

    2015-02-01

    Full Text Available To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the "causal inference problem." Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI), and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation). At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion). Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world.

  4. Deletion analysis of male sterility effects of t-haplotypes in the mouse.

    Science.gov (United States)

    Bennett, D; Artzt, K

    1990-01-01

    We present data on the effects of three chromosome 17 deletions on transmission ratio distortion (TRD) and sterility of several t-haplotypes. All three deletions have similar effects on male TRD: that is, Tdel/tcomplete genotypes all transmit their t-haplotype in very high proportion. However, each deletion has different effects on sterility of heterozygous males, with TOr/t being fertile, Thp/t less fertile, and TOrl/t still less fertile. These data suggest that wild-type genes on chromosomes homologous to t-haplotypes can be important regulators of both TRD and fertility in males, and that the wild-type genes concerned with TRD and fertility are at least to some extent different. The data also provide a rough map of the positions of these genes.

  5. HLA alleles and haplotypes in Burmese (Myanmarese) and Karen in Thailand.

    Science.gov (United States)

    Kongmaroeng, C; Romphruk, A; Puapairoj, C; Leelayuwat, C; Kulski, J K; Inoko, H; Dunn, D S; Romphruk, A V

    2015-09-01

    This is the first report on human leukocyte antigen (HLA) allele and haplotype frequencies at three class I loci and two class II loci in unrelated healthy individuals from two ethnic groups, 170 Burmese and 200 Karen, originally from Burma (Myanmar), but sampled while residing in Thailand. Overall, the HLA allele and haplotype frequencies detected by polymerase chain reaction sequence-specific primer (PCR-SSP) at five loci (A, B, C, DRB1 and DQB1) at low resolution showed distinct differences between the Burmese and Karen. In Burmese, five HLA-B*15 haplotypes with different HLA-A and HLA-DR/DQ combinations were detected with three of these not previously reported in other Asian populations. The data are important in the fields of anthropology, transplantation and disease-association studies. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  6. Casein haplotypes and their association with milk production traits in Norwegian Red cattle

    Directory of Open Access Journals (Sweden)

    Nome Torfinn

    2009-02-01

    Full Text Available Abstract A high resolution SNP map was constructed for the bovine casein region to identify haplotype structures and study associations with milk traits in Norwegian Red cattle. Our analyses suggest separation of the casein cluster into two haplotype blocks, one consisting of the CSN1S1, CSN2 and CSN1S2 genes and another one consisting of the CSN3 gene. Highly significant associations with both protein and milk yield were found for both single SNPs and haplotypes within the CSN1S1-CSN2-CSN1S2 haplotype block. In contrast, no significant association was found for single SNPs or haplotypes within the CSN3 block. Our results point towards CSN2 and CSN1S2 as the most likely loci harbouring the underlying causative DNA variation. In our study, the most significant results were found for the SNP CSN2_67 with the C allele consistently associated with both higher protein and milk yields. CSN2_67 calls a C to an A substitution at codon 67 in β-casein gene resulting in histidine replacing proline in the amino acid sequence. This polymorphism determines the protein variants A1/B (CSN2_67 A allele) versus A2/A3 (CSN2_67 C allele). Other studies have suggested that a high consumption of A1/B milk may affect human health by increasing the risk of diabetes and heart diseases. Altogether these results argue for an increase in the frequency of the CSN2_67 C allele or haplotypes containing this allele in the Norwegian Red cattle population by selective breeding.

  7. HLA haplotype map of river valley populations with hemochromatosis traced through five centuries in Central Sweden.

    Science.gov (United States)

    Olsson, K Sigvard; Ritter, Bernd; Hansson, Norbeth; Chowdhury, Ruma R

    2008-07-01

    The hemochromatosis mutation, C282Y of the HFE gene, seems to have originated from a single event which once occurred in a person living in the north west of Europe carrying human leukocyte antigen (HLA)-A3-B7. In descendants of this ancestor also other haplotypes appear probably caused by local recombinations and founder effects. The background of these associations is unknown. Isolated river valley populations may be fruitful for the mapping of genetic disorders such as hemochromatosis. In this study, we try to test this hypothesis in a study from central Sweden where the haplotype A1-B8 was common. HLA haplotypes and HFE mutations were studied in hemochromatosis patients with present or past parental origin in a sparsely populated (1/km(2)) rural district (n = 8366 in the year of 2005), in central Sweden. Pedigrees were constructed from the Swedish church book registry. Extended haplotypes were studied to evaluate origin of recombinations. There were 87 original probands, 36 females and 51 males identified during 30 yr, of whom 86% carried C282Y/C282Y and 14% C282Y/H63D. Of 32 different HLA haplotypes A1-B8 was the most common (34%), followed by A3-B7 (16%), both in strong linkage disequilibrium with controls, (P females. River valley populations may contain HLA haplotypes reflecting their demographic history. This study has demonstrated that the resistance against recombinations between HLA-A and HFE make HLA haplotypes excellent markers for population movements. Founder effects and genetic drift from bottleneck populations (surviving the plague?) may explain the commonness of the mutation in central Scandinavia. The intergenerational time difference >30 yr was greater than expected and means that the age of the original mutation may be underestimated.

  8. An unusual haplotype structure on human chromosome 8p23 derived from the inversion polymorphism.

    Science.gov (United States)

    Deng, Libin; Zhang, Yuezheng; Kang, Jian; Liu, Tao; Zhao, Hongbin; Gao, Yang; Li, Chaohua; Pan, Hao; Tang, Xiaoli; Wang, Dunmei; Niu, Tianhua; Yang, Huanming; Zeng, Changqing

    2008-10-01

    Chromosomal inversion is an important type of genomic variation involved in both evolution and disease pathogenesis. Here, we describe the refined genetic structure of a 3.8-Mb inversion polymorphism at chromosome 8p23. Using HapMap data of 1,073 SNPs generated from 209 unrelated samples from CEPH-Utah residents with ancestry from northern and western Europe (CEU); Yoruba in Ibadan, Nigeria (YRI); and Asian (ASN) samples, which were comprised of Han Chinese from Beijing, China (CHB) and Japanese from Tokyo, Japan (JPT), we successfully deduced the inversion orientations of all their 418 haplotypes. In particular, distinct haplotype subgroups were identified based on principal component analysis (PCA). Such genetic substructures were consistent with clustering patterns based on neighbor-joining tree reconstruction, which revealed a total of four haplotype clades across all samples. Metaphase fluorescence in situ hybridization (FISH) in a subset of 10 HapMap samples verified their inversion orientations predicted by PCA or phylogenetic tree reconstruction. Positioning of the outgroup haplotype within one of YRI clades suggested that Human NCBI Build 36-inverted order is most likely the ancestral orientation. Furthermore, the population differentiation test and the relative extended haplotype homozygosity (REHH) analysis in this region discovered multiple selection signals, also in a population-specific manner. A positive selection signal was detected at XKR6 in the ASN population. These results revealed the correlation of inversion polymorphisms to population-specific genetic structures, and various selection patterns as possible mechanisms for the maintenance of a large chromosomal rearrangement at 8p23 region during evolution. In addition, our study also showed that haplotype-based clustering methods, such as PCA, can be applied in scanning for cryptic inversion polymorphisms at a genome-wide scale.

  9. The clinical application of single-sperm-based SNP haplotyping for PGD of osteogenesis imperfecta.

    Science.gov (United States)

    Chen, Linjun; Diao, Zhenyu; Xu, Zhipeng; Zhou, Jianjun; Yan, Guijun; Sun, Haixiang

    2018-05-15

    Osteogenesis imperfecta (OI) is a genetically heterogeneous disorder, presenting either autosomal dominant, autosomal recessive or X-linked inheritance patterns. The majority of OI cases are autosomal dominant and are caused by heterozygous mutations in either the COL1A1 or COL1A2 gene. In these dominant disorders, allele dropout (ADO) can lead to misdiagnosis in preimplantation genetic diagnosis (PGD). Polymorphic markers linked to the mutated genes have been used to establish haplotypes for identifying ADO and ensuring the accuracy of PGD. However, the haplotype of male patients cannot be determined without data from affected relatives. Here, we developed a method for single-sperm-based single-nucleotide polymorphism (SNP) haplotyping via next-generation sequencing (NGS) for the PGD of OI. After NGS, 10 informative polymorphic SNP markers located upstream and downstream of the COL1A1 gene and its pathogenic mutation site were linked to individual alleles in a single sperm from an affected male. After haplotyping, a normal blastocyst was transferred to the uterus for a subsequent frozen embryo transfer cycle. The accuracy of PGD was confirmed by amniocentesis at 19 weeks of gestation. A healthy infant weighing 4,250 g was born via vaginal delivery at the 40th week of gestation. Single-sperm-based SNP haplotyping can be applied for PGD of any monogenic disorders or de novo mutations in males in whom the haplotype of paternal mutations cannot be determined due to a lack of affected relatives. ADO: allele dropout; DI: dentinogenesis imperfecta; ESHRE: European Society of Human Reproduction and Embryology; FET: frozen embryo transfer; gDNA: genomic DNA; ICSI: intracytoplasmic sperm injection; IVF: in vitro fertilization; MDA: multiple displacement amplification; NGS: next-generation sequencing; OI: osteogenesis imperfecta; PBS: phosphate buffer saline; PCR: polymerase chain reaction; PGD: preimplantation genetic diagnosis; SNP: single-nucleotide polymorphism; STR

  10. Patterns of linkage disequilibrium and haplotype distribution in disease candidate genes.

    Science.gov (United States)

    Long, Ji-Rong; Zhao, Lan-Juan; Liu, Peng-Yuan; Lu, Yan; Dvornyk, Volodymyr; Shen, Hui; Liu, Yong-Jun; Zhang, Yuan-Yuan; Xiong, Dong-Hai; Xiao, Peng; Deng, Hong-Wen

    2004-05-24

    The adequacy of association studies for complex diseases depends critically on the existence of linkage disequilibrium (LD) between functional alleles and surrounding SNP markers. We examined the patterns of LD and haplotype distribution in eight candidate genes for osteoporosis and/or obesity using 31 SNPs in 1,873 subjects. These eight genes are apolipoprotein E (APOE), type I collagen alpha1 (COL1A1), estrogen receptor-alpha (ER-alpha), leptin receptor (LEPR), parathyroid hormone (PTH)/PTH-related peptide receptor type 1 (PTHR1), transforming growth factor-beta1 (TGF-beta1), uncoupling protein 3 (UCP3), and vitamin D (1,25-dihydroxyvitamin D3) receptor (VDR). Yin yang haplotypes, two high-frequency haplotypes composed of completely mismatching SNP alleles, were examined. To quantify LD patterns, two common measures of LD, D' and r², were calculated for the SNPs within the genes. The haplotype distribution varied in the different genes. Yin yang haplotypes were observed only in PTHR1 and UCP3. D' ranged from 0.020 to 1.000 with an average of 0.475, whereas the average r² was 0.158 (ranging from 0.000 to 0.883). A decay of LD was observed as the intermarker distance increased; however, LD characteristics differed greatly between genes and even between regions within a gene. The differences in haplotype distributions and LD patterns among the genes underscore the importance of characterizing genomic regions of interest prior to association studies.
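
    The two LD measures named in this abstract have standard textbook definitions for a pair of biallelic SNPs, and a short worked sketch shows why D' can reach 1 while r² stays well below it. The function and parameter names below are hypothetical, the input frequencies are made up for illustration, and the code assumes haplotype frequencies are already known (phased or estimated); it is not the authors' pipeline.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and
          allele B at SNP 2 (assumed known, e.g. from phased data)
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    # Raw disequilibrium: observed minus expected under independence.
    d = p_ab - p_a * p_b
    # D' rescales |D| by its maximum possible value given the allele
    # frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the two allele indicators.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Illustrative frequencies only: allele A never occurs without allele B.
d, d_prime, r2 = ld_measures(p_ab=0.3, p_a=0.3, p_b=0.5)
```

    In this made-up case D' is 1 because one of the four possible haplotypes is absent, while r² is about 0.43 because the allele frequencies differ, which mirrors the gap between the average D' and average r² reported above.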

  11. Vitamin K epoxide reductase complex subunit 1 (Vkorc1) haplotype diversity in mouse priority strains

    Directory of Open Access Journals (Sweden)

    Kohn Michael H

    2008-12-01

    Full Text Available Abstract Background Polymorphisms in the vitamin K-epoxide reductase complex subunit 1 gene, Vkorc1, could affect blood coagulation and other vitamin K-dependent proteins, such as osteocalcin (bone Gla protein, BGP). Here we sequenced the Vkorc1 gene in 40 mouse priority strains. We analyzed Vkorc1 haplotypes with respect to prothrombin time (PT) and bone mineral density and composition (BMD and BMC); phenotypes expected to be vitamin K-dependent and represented by data in the Mouse Phenome Database (MPD). Findings In the commonly used laboratory strains of Mus musculus domesticus we identified only four haplotypes differing in the intron or 5' region sequence of the Vkorc1. Six haplotypes differing by coding and non-coding polymorphisms were identified in the other subspecies of Mus. We detected no significant association of Vkorc1 haplotypes with PT, BMD and BMC within each subspecies of Mus. Vkorc1 haplotype sequences divergence between subspecies was associated with PT, BMD and BMC. Conclusion Phenotypic variation in PT, BMD and BMC within subspecies of Mus, while substantial, appears to be dominated by genetic variation in genes other than the Vkorc1. This was particularly evident for M. m. domesticus, where a single haplotype was observed in conjunction with virtually the entire range of PT, BMD and BMC values of all 5 subspecies of Mus included in this study. Differences in these phenotypes between subspecies also should not be attributed to Vkorc1 variants, but should be viewed as a result of genome wide genetic divergence.

  12. Object-Oriented Type Inference

    DEFF Research Database (Denmark)

    Schwartzbach, Michael Ignatieff; Palsberg, Jens

    1991-01-01

    We present a new approach to inferring types in untyped object-oriented programs with inheritance, assignments, and late binding. It guarantees that all messages are understood, annotates the program with type information, allows polymorphic methods, and can be used as the basis of an op...

  13. Inference in hybrid Bayesian networks

    DEFF Research Database (Denmark)

    Langseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael

    2009-01-01

    Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....

  14. Mixed normal inference on multicointegration

    NARCIS (Netherlands)

    Boswijk, H.P.

    2009-01-01

    Asymptotic likelihood analysis of cointegration in I(2) models, see Johansen (1997, 2006), Boswijk (2000) and Paruolo (2000), has shown that inference on most parameters is mixed normal, implying hypothesis test statistics with an asymptotic χ² null distribution. The asymptotic distribution of the

  15. Statistical inference and Aristotle's Rhetoric.

    Science.gov (United States)

    Macdonald, Ranald R

    2004-11-01

    Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.

  16. Class I gene regulation of haplotype preference may influence antiviral immunity in vivo

    DEFF Research Database (Denmark)

    Thomsen, Allan Randrup; Marker, O

    1989-01-01

    targets. In regard to the in vivo significance of haplotype preference it was found that (C X C3) F1 mice expressed an earlier and stronger virus-specific delayed type hypersensitivity response and exerted a more efficient virus control than did (C-H-2dm2 X C3) F1. Taken together these findings suggest...... that haplotype preference reflects a selection process favoring the restriction element associated with the most efficient immune response in vivo. The implications of this are discussed....

  17. Novel harmful recessive haplotypes identified for fertility traits in Nordic Holstein cattle

    DEFF Research Database (Denmark)

    Sahana, Goutam; Nielsen, Ulrik Sander; Aamand, Gert Pedersen

    2013-01-01

    harboring possible recessive lethal alleles. Effects of the identified haplotypes were estimated on two fertility traits: non-return rates and calving interval. Out of the eight identified genomic regions, six regions were confirmed as having an effect on fertility. The information can be used to avoid......Using genomic data, lethal recessives may be discovered from haplotypes that are common in the population but never occur in the homozygote state in live animals. This approach only requires genotype data from phenotypically normal (i.e. live) individuals and not from the affected embryos that die...

  18. Inheritance of the 8.1 ancestral haplotype in recurrent pregnancy loss

    DEFF Research Database (Denmark)

    Kolte, Astrid M; Nielsen, Henriette S; Steffensen, Rudi

    2015-01-01

    . The objective was to test the gestational drive theory for the 8.1AH in women with RPL and their live born children. METHODOLOGY: We investigated the inheritance of the 8.1AH from 82 heterozygous RPL women to 110 live born children. All participants were genotyped for HLA-A, -B and -DRB1 in DNA from EDTA...... pleiotropy. It has also been proposed that the survival of long, conserved haplotypes may be due to gestational drive, i.e. selective miscarriage of fetuses who have not inherited the haplotype from a heterozygous mother. Recurrent pregnancy loss (RPL) is defined as three or more consecutive pregnancy losses...

  19. Congruence as a measurement of extended haplotype structure across the genome

    Science.gov (United States)

    2012-01-01

    Background Historically, extended haplotypes have been defined using only a few data points, such as alleles for several HLA genes in the MHC. High-density SNP data, and the increasing affordability of whole genome SNP typing, create the opportunity to define higher resolution extended haplotypes. This drives the need for new tools that support quantification and visualization of extended haplotypes as defined by as many as 2000 SNPs. Confronted with high-density SNP data across the major histocompatibility complex (MHC) for 2,300 complete families, compiled by the Type 1 Diabetes Genetics Consortium (T1DGC), we developed software for studying extended haplotypes. Methods The software, called ExHap (Extended Haplotype), uses a similarity measurement we term congruence to identify and quantify long-range allele identity. Using ExHap, we analyzed congruence in both the T1DGC data and family-phased data from the International HapMap Project. Results Congruent chromosomes from the T1DGC data have between 96.5% and 99.9% allele identity over 1,818 SNPs spanning 2.64 megabases of the MHC (HLA-DRB1 to HLA-A). Thirty-three of 132 DQ-DR-B-A defined haplotype groups have > 50% congruent chromosomes in this region. For example, 92% of chromosomes within the DR3-B8-A1 haplotype are congruent from HLA-DRB1 to HLA-A (99.8% allele identity). We also applied ExHap to all 22 autosomes for both CEU and YRI cohorts from the International HapMap Project, identifying multiple candidate extended haplotypes. Conclusions Long-range congruence is not unique to the MHC region. Patterns of allele identity on phased chromosomes provide a simple, straightforward approach to visually and quantitatively inspect complex long-range structural patterns in the genome. Such patterns aid the biologist in appreciating genetic similarities and differences across cohorts, and can lead to hypothesis generation for subsequent studies. PMID:22369243
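
    The abstract reports allele identity between phased chromosomes as a percentage. A minimal sketch of such a pairwise identity calculation follows. It does not reproduce ExHap's congruence measure, whose exact definition is not given here; the function name and the convention of skipping untyped positions (None) are assumptions made for illustration.

```python
def allele_identity(chrom_a, chrom_b):
    """Fraction of typed SNP positions at which two phased chromosomes
    carry the same allele. Positions where either allele is None
    (untyped) are skipped; this convention is an assumption."""
    if len(chrom_a) != len(chrom_b):
        raise ValueError("chromosomes must cover the same SNP positions")
    compared = 0
    matches = 0
    for a, b in zip(chrom_a, chrom_b):
        if a is None or b is None:
            continue  # ignore untyped positions
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no typed positions to compare")
    return matches / compared


# Hypothetical four-SNP example: three of four alleles agree.
identity = allele_identity("ACGT", "ACGA")
```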

  20. Distribution pattern of Plasmodium falciparum chloroquine transporter (pfcrt) gene haplotypes in Sri Lanka 1996-2006

    DEFF Research Database (Denmark)

    Zhang, Jenny J; Senaratne, Tharanga N; Daniels, Rachel

    2011-01-01

    Abstract. Widespread antimalarial resistance has been a barrier to malaria elimination efforts in Sri Lanka. Analysis of genetic markers in historic parasites may uncover trends in the spread of resistance. We examined the frequency of Plasmodium falciparum chloroquine transporter (pfcrt; codons 72......-76) haplotypes in Sri Lanka in 1996-1998 and 2004-2006 using a high-resolution melting assay. Among 59 samples from 1996 to 1998, we detected the SVMNT (86%), CVMNK (10%), and CVIET (2%) haplotypes, with a positive trend in SVMNT and a negative trend in CVMNK frequency (P = 0.004) over time. Among 24 samples...

  1. Inference in partially identified models with many moment inequalities using Lasso

    DEFF Research Database (Denmark)

    Bugni, Federico A.; Caner, Mehmet; Kock, Anders Bredahl

    This paper considers the problem of inference in a partially identified moment (in)equality model with possibly many moment inequalities. Our contribution is to propose a novel two-step inference method based on the combination of two ideas. On the one hand, our test statistic and critical...

  2. An efficient forward–reverse expectation-maximization algorithm for statistical inference in stochastic reaction networks

    KAUST Repository

    Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro

    2016-01-01

    then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve

  3. Occurrence of the Southeast Asian/South American SVMNT haplotype of the chloroquine-resistance transporter gene in Plasmodium falciparum in Tanzania

    DEFF Research Database (Denmark)

    Alifrangis, Michael; Dalgaard, Michael B; Lusingu, John P

    2006-01-01

    Two main haplotypes, CVIET and SVMNT, of the Plasmodium falciparum chloroquine-resistance transporter gene (Pfcrt) are linked to 4-aminoquinoline resistance. The CVIET haplotype has been reported in most malaria-endemic regions, whereas the SVMNT haplotype has only been found outside Africa. We...... investigated Pfcrt haplotype frequencies in Korogwe District, Tanzania, in 2003 and 2004. The SVMNT haplotype was not detected in 2003 but was found in 19% of infected individuals in 2004. Amodiaquine use has increased in the region. The introduction and high prevalence of the SVMNT haplotype may reflect...... this and may raise concern regarding the use of amodiaquine in artemisinin-based combination therapies in Africa....

  4. The Probabilistic Convolution Tree: Efficient Exact Bayesian Inference for Faster LC-MS/MS Protein Inference

    Science.gov (United States)

    Serang, Oliver

    2014-01-01

    Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called “causal independence”). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to O(k log(k)^2) and the space to O(k log(k)), where k is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions. PMID:24626234
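
    The core idea of combining adder-node inputs through a tree of convolutions can be sketched as follows. This is only an illustration of the forward pass (the distribution of a sum of independent discrete variables), assuming each variable is given as a list of probabilities indexed by value. It uses naive convolution rather than a fast transform, and it omits the backward pass that full inference would need; all function names are hypothetical.

```python
def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q,
    where index i of each list holds P(value == i)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def sum_distribution(dists):
    """Combine the input distributions pairwise, layer by layer,
    as in a balanced binary tree. Requires at least one input."""
    layer = [list(d) for d in dists]
    while len(layer) > 1:
        nxt = [convolve(layer[i], layer[i + 1])
               for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2 == 1:
            nxt.append(layer[-1])  # odd element is carried up unchanged
        layer = nxt
    return layer[0]


# Three fair binary variables: the sum follows a Binomial(3, 0.5) law.
total = sum_distribution([[0.5, 0.5]] * 3)
```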

  5. Facility Activity Inference Using Radiation Networks

    Energy Technology Data Exchange (ETDEWEB)

    Rao, Nageswara S. [ORNL; Ramirez Aviles, Camila A. [ORNL

    2017-11-01

    We consider the problem of inferring the operational status of a reactor facility using measurements from a radiation sensor network deployed around the facility’s ventilation off-gas stack. The intensity of stack emissions decays with distance, and the sensor counts or measurements are inherently random with parameters determined by the intensity at the sensor’s location. We utilize the measurements to estimate the intensity at the stack, and use it in a one-sided Sequential Probability Ratio Test (SPRT) to infer on/off status of the reactor. We demonstrate the superior performance of this method over conventional majority fusers and individual sensors using (i) test measurements from a network of 21 NaI detectors, and (ii) effluence measurements collected at the stack of a reactor facility. We also analytically establish the superior detection performance of the network over individual sensors with fixed and adaptive thresholds by utilizing the Poisson distribution of the counts. We quantify the performance improvements of the network detection over individual sensors using the packing number of the intensity space.
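
    A one-sided sequential test on Poisson counts can be sketched as below. This is a simplified illustration, not the authors' network method: it assumes a single stream of counts, a known background rate lam0 ("off") and a known elevated rate lam1 ("on"), and it tests only the upper Wald threshold. The rates, error levels, and function names are hypothetical.

```python
import math


def poisson_llr(count, lam0, lam1):
    """Log-likelihood ratio of one Poisson count under rate lam1
    versus rate lam0 (the count! terms cancel)."""
    return count * math.log(lam1 / lam0) - (lam1 - lam0)


def one_sided_sprt(counts, lam0, lam1, alpha=0.01, beta=0.01):
    """Return the index of the first count at which 'on' is declared,
    or None if the upper threshold is never crossed."""
    upper = math.log((1.0 - beta) / alpha)  # Wald's upper threshold
    llr = 0.0
    for i, c in enumerate(counts):
        llr += poisson_llr(c, lam0, lam1)
        if llr >= upper:
            return i
    return None


# Hypothetical counts: elevated readings trigger a decision,
# background-level readings do not.
decision_index = one_sided_sprt([10, 10, 10], lam0=5.0, lam1=10.0)
```

    The test only ever declares "on"; how the intensity at the stack is estimated from several sensors before it enters the test is outside this sketch.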

  6. Inferring relevance in a changing world

    Directory of Open Access Journals (Sweden)

    Robert C Wilson

    2012-01-01

    Full Text Available Reinforcement learning models of human and animal learning usually concentrate on how we learn the relationship between different stimuli or actions and rewards. However, in real world situations stimuli are ill-defined. On the one hand, our immediate environment is extremely multi-dimensional. On the other hand, in every decision-making scenario only a few aspects of the environment are relevant for obtaining reward, while most are irrelevant. Thus a key question is how do we learn these relevant dimensions, that is, how do we learn what to learn about? We investigated this process of representation learning experimentally, using a task in which one stimulus dimension was relevant for determining reward at each point in time. As in real life situations, in our task the relevant dimension can change without warning, adding ever-present uncertainty engendered by a constantly changing environment. We show that human performance on this task is better described by a suboptimal strategy based on selective attention and serial hypothesis testing rather than a normative strategy based on probabilistic inference. From this, we conjecture that the problem of inferring relevance in general scenarios is too computationally demanding for the brain to solve optimally. As a result the brain utilizes approximations, employing these even in simplified scenarios in which optimal representation learning is tractable, such as the one in our experiment.

  7. Graphical models for inferring single molecule dynamics

    Directory of Open Access Journals (Sweden)

    Gonzalez Ruben L

    2010-10-01

    Full Text Available Abstract Background The recent explosion of experimental techniques in single molecule biophysics has generated a variety of novel time series data requiring equally novel computational tools for analysis and inference. This article describes in general terms how graphical modeling may be used to learn from biophysical time series data using the variational Bayesian expectation maximization algorithm (VBEM). The discussion is illustrated by the example of single-molecule fluorescence resonance energy transfer (smFRET) versus time data, where the smFRET time series is modeled as a hidden Markov model (HMM) with Gaussian observables. A detailed description of smFRET is provided as well. Results The VBEM algorithm returns the model’s evidence and an approximating posterior parameter distribution given the data. The former provides a metric for model selection via maximum evidence (ME), and the latter a description of the model’s parameters learned from the data. ME/VBEM provide several advantages over the more commonly used approach of maximum likelihood (ML) optimized by the expectation maximization (EM) algorithm, the most important being a natural form of model selection and a well-posed (non-divergent) optimization problem. Conclusions The results demonstrate the utility of graphical modeling for inference of dynamic processes in single molecule biophysics.

  8. Congested Link Inference Algorithms in Dynamic Routing IP Network

    Directory of Open Access Journals (Sweden)

    Yu Chen

    2017-01-01

    Full Text Available The performance of current congested link inference algorithms, such as the classical CLINK algorithm, degrades noticeably in dynamic routing IP networks. To overcome this problem, based on the assumptions of the Markov property and time homogeneity, we build a simplified Variable Structure Discrete Dynamic Bayesian (VSDDB) network model of a dynamic routing IP network. Under the simplified VSDDB model, based on the Bayesian Maximum A Posteriori (BMAP) and the Rest Bayesian Network Model (RBNM), we propose an Improved CLINK (ICLINK) algorithm. Because multiple links are often congested concurrently, we also propose the CLILRS algorithm (Congested Link Inference algorithm based on Lagrangian Relaxation Subgradient) to infer the set of congested links. We validated our results by analogy, simulation, and actual Internet experiments.

  9. Baselines and test data for cross-lingual inference

    DEFF Research Database (Denmark)

    Agic, Zeljko; Schluter, Natalie

    2018-01-01

    The recent years have seen a revival of interest in textual entailment, sparked by i) the emergence of powerful deep neural network learners for natural language processing and ii) the timely development of large-scale evaluation datasets such as SNLI. Recast as natural language inference......, the problem now amounts to detecting the relation between pairs of statements: they either contradict or entail one another, or they are mutually neutral. Current research in natural language inference is effectively exclusive to English. In this paper, we propose to advance the research in SNLI-style natural...... language inference toward multilingual evaluation. To that end, we provide test data for four major languages: Arabic, French, Spanish, and Russian. We experiment with a set of baselines. Our systems are based on cross-lingual word embeddings and machine translation. While our best system scores an average...

  10. Sign Inference for Dynamic Signed Networks via Dictionary Learning

    Directory of Open Access Journals (Sweden)

    Yi Cen

    2013-01-01

    Full Text Available Mobile online social network (mOSN) is a burgeoning research area. However, most existing works referring to mOSNs deal with static network structures and simply encode whether relationships among entities exist or not. In contrast, relationships in signed mOSNs can be positive or negative and may change with time and location. Applying certain global characteristics of social balance, in this paper, we aim to infer the unknown relationships in dynamic signed mOSNs and formulate this sign inference problem as a low-rank matrix estimation problem. Specifically, motivated by the Singular Value Thresholding (SVT) algorithm, a compact dictionary is selected from the observed dataset. Based on this compact dictionary, the relationships in the dynamic signed mOSNs are estimated via solving the formulated problem. Furthermore, the estimation accuracy is improved by employing a dictionary self-updating mechanism.

  11. Echinococcus equinus and Echinococcus granulosus sensu stricto from the United Kingdom: genetic diversity and haplotypic variation.

    Science.gov (United States)

    Boufana, Belgees; Lett, Wai San; Lahmar, Samia; Buishi, Imad; Bodell, Anthony J; Varcasia, Antonio; Casulli, Adriano; Beeching, Nicholas J; Campbell, Fiona; Terlizzo, Monica; McManus, Donald P; Craig, Philip S

    2015-02-01

    Cystic echinococcosis is endemic in Europe including the United Kingdom. However, information on the molecular epidemiology of Echinococcus spp. from the United Kingdom is limited. Echinococcus isolates from intermediate and definitive animal hosts as well as from human cystic echinococcosis cases were analysed to determine species and genotypes within these hosts. Echinococcus equinus was identified from horse hydatid isolates, cysts retrieved from captive UK mammals and copro-DNA of foxhounds and farm dogs. Echinococcus granulosus sensu stricto (s.s.) was identified from hydatid cysts of sheep and cattle as well as in DNA extracted from farm dog and foxhound faecal samples, and from four human cystic echinococcosis isolates, including the first known molecular confirmation of E. granulosus s.s. infection in a Welsh sheep farmer. Low genetic variability for E. equinus from various hosts and from different geographical locations was detected using the mitochondrial cytochrome c oxidase subunit 1 gene (cox1), indicating the presence of a dominant haplotype (EQUK01). In contrast, greater haplotypic variation was observed for E. granulosus s.s. cox1 sequences. The haplotype network showed a star-shaped network with a centrally placed main haplotype (EgUK01) that had been reported from other world regions. Copyright © 2014 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.

  12. MtDNA haplotype identification of aurochs remains originating from the Czech Republic (Central Europe)

    Czech Academy of Sciences Publication Activity Database

    Kyselý, René; Hájek, M.

    2012-01-01

    Vol. 17, No. 2 (2012), pp. 118-125 ISSN 1461-4103 Institutional research plan: CEZ:AV0Z80020508 Institutional support: RVO:67985912 Keywords: wild cattle (Bos primigenius) * aDNA * haplotype P * domestication Subject RIV: AC - Archeology, Anthropology, Ethnology

  13. Haplotype mapping of a diploid non-meiotic organism using existing and induced aneuploidies.

    Directory of Open Access Journals (Sweden)

    Melanie Legrand

    2008-01-01

    Full Text Available Haplotype maps (HapMaps) reveal underlying sequence variation and facilitate the study of recombination and genetic diversity. In general, HapMaps are produced by analysis of Single-Nucleotide Polymorphism (SNP) segregation in large numbers of meiotic progeny. Candida albicans, the most common human fungal pathogen, is an obligate diploid that does not appear to undergo meiosis. Thus, standard methods for haplotype mapping cannot be used. We exploited naturally occurring aneuploid strains to determine the haplotypes of the eight chromosome pairs in the C. albicans laboratory strain SC5314 and in a clinical isolate. Comparison of the maps revealed that the clinical strain had undergone a significant amount of genome rearrangement, consisting primarily of crossover or gene conversion recombination events. SNP map haplotyping revealed that insertion and activation of the UAU1 cassette in essential and non-essential genes can result in whole chromosome aneuploidy. UAU1 is often used to construct homozygous deletions of targeted genes in C. albicans; the exact mechanism (trisomy followed by chromosome loss versus gene conversion) has not been determined. UAU1 insertion into the essential ORC1 gene resulted in a large proportion of trisomic strains, while gene conversion events predominated when UAU1 was inserted into the non-essential LRO1 gene. Therefore, induced aneuploidies can be used to generate HapMaps, which are essential for analyzing genome alterations and mitotic recombination events in this clonal organism.

  14. SNP and haplotype analysis reveal IGF2 variants associated with growth traits in Chinese Qinchuan cattle.

    Science.gov (United States)

    Huang, Yong-Zhen; Zhan, Zhao-Yang; Li, Xin-Yi; Wu, Sheng-Ru; Sun, Yu-Jia; Xue, Jing; Lan, Xian-Yong; Lei, Chu-Zhao; Zhang, Chun-Lei; Jia, Yu-Tang; Chen, Hong

    2014-02-01

    Insulin-like growth factor 2 (IGF2) is a potent cell growth and differentiation factor and is implicated in mammals' growth and development. The objective of this study was to evaluate the association of mutations in the bovine IGF2 gene with growth traits in Chinese Qinchuan cattle. Four single nucleotide polymorphisms (SNPs) were detected in the bovine IGF2 gene by DNA pool sequencing and forced polymerase chain reaction-restriction fragment length polymorphism (forced PCR-RFLP) methods. We also investigated haplotype structure and linkage disequilibrium (LD) coefficients for the four SNPs in 817 individuals representing two main cattle breeds from China. The haplotype analysis showed eight different haplotypes and 27 combined genotypes within the study population. The statistical analyses indicated that the four SNPs, combined genotypes and haplotypes are associated with the withers height, body length, chest breadth, chest depth and body weight in the Qinchuan cattle population (P growth traits; the heterozygote diplotype was associated with higher growth traits compared to wild-type homozygote. Our results provide evidence that polymorphisms in the IGF2 gene are associated with growth traits, and may be used for marker-assisted selection in beef cattle breeding programs.

  15. The Prognostic Value of Haplotypes in the Vascular Endothelial Growth Factor A Gene in Colorectal Cancer

    International Nuclear Information System (INIS)

    Hansen, Torben F.; Spindler, Karen-Lise G.; Andersen, Rikke F.; Lindebjerg, Jan; Kølvraa, Steen; Brandslund, Ivan; Jakobsen, Anders

    2010-01-01

    New prognostic markers in patients with colorectal cancer (CRC) are a prerequisite for individualized treatment. Prognostic importance of single nucleotide polymorphisms (SNPs) in the vascular endothelial growth factor A (VEGF-A) gene has been proposed. The objective of the present study was to investigate the prognostic importance of haplotypes in the VEGF-A gene in patients with CRC. The study included 486 patients surgically resected for stage II and III CRC, divided into two independent cohorts. Three SNPs in the VEGF-A gene were analyzed by polymerase chain reaction. Haplotypes were estimated using the PHASE program. The prognostic influence was evaluated using Kaplan-Meier plots and log-rank tests. The Cox regression method was used to analyze the independent prognostic importance of different markers. All three SNPs were significantly related to survival. A haplotype combination, responsible for this effect, was present in approximately 30% of the patients and demonstrated a significant relationship with poor survival, and it remained an independent prognostic marker after multivariate analysis, hazard ratio 2.46 (95% confidence interval 1.49–4.06), p < 0.001. Validation was provided by consistent findings in a second and independent cohort. Haplotype combinations call for further investigation.

  16. Divergence at the casein haplotypes in dairy and meat goat breeds.

    Science.gov (United States)

    Küpper, Julia; Chessa, Stefania; Rignanese, Daniela; Caroli, Anna; Erhardt, Georg

    2010-02-01

    Casein genes have been proved to have an influence on milk properties, and are in addition appropriate for phylogeny studies. A large number of casein polymorphisms exist in goats, making their analysis quite complex. The four casein loci were analyzed by molecular techniques for genetic polymorphism detection in the two dairy goat breeds Bunte Deutsche Edelziege (BDE; n=96), Weisse Deutsche Edelziege (WDE; n=91), and the meat goat breed Buren (n=75). Of the 35 analyzed alleles, 18 were found in BDE, and 17 in Buren goats and WDE. In addition, a new allele was identified at the CSN1S1 locus in the BDE, showing a frequency of 0.05. This variant, named CSN1S1*A', is characterized by a t-->c transversion in intron 9. Linkage disequilibrium was found at the casein haplotype in all three breeds. A total of 30 haplotypes showed frequencies higher than 0.01. In the Buren breed only one haplotype showed a frequency higher than 0.1. The ancestral haplotype B-A-A-B (in the order: CSN1S1-CSN2-CSN1S2-CSN3) occurred in all three breeds, showing a very high frequency (>0.8) in the Buren.

  17. Using haplotypes to unravel the inheritance of Holstein coat color for a larger audience

    Science.gov (United States)

    Haplotype testing identifies single-nucleotide polymorphisms that bracket a group of alleles from several different genes located on a specific chromosomal section of DNA. For a trait with a limited number of genotypes and phenotypes, the rules of inheritance can be determined by matching up certain...

  18. Population Structure of Pseudocercospora fijiensis in Costa Rica Reveals Shared Haplotype Diversity with Southeast Asian Populations.

    Science.gov (United States)

    Saville, Amanda; Charles, Melodi; Chavan, Suchitra; Muñoz, Miguel; Gómez-Alpizar, Luis; Ristaino, Jean Beagle

    2017-12-01

    Pseudocercospora fijiensis is the causal pathogen of black Sigatoka, a devastating disease of banana that can cause 20 to 80% yield loss in the absence of fungicides in banana crops. The genetic structure of populations of P. fijiensis in Costa Rica was examined and compared with Honduran and global populations to better understand migration patterns and inform management strategies. In total, 118 isolates of P. fijiensis collected from Costa Rica and Honduras from 2010 to 2014 were analyzed using multilocus genotyping of six loci and compared with a previously published global dataset of populations of P. fijiensis. The Costa Rican and Honduran populations shared haplotype diversity with haplotypes from Southeast Asia, Oceania, and the Americas but not Africa for all but one of the six loci studied. Gene flow and shared haplotype diversity were found in Honduran and Costa Rican populations of the pathogen. The data indicate that the haplotypic diversity observed in Costa Rican populations of P. fijiensis is derived from dispersal from initial outbreak sources in Honduras and admixtures between genetically differentiated sources from Southeast Asia, Oceania, and the Americas.

  19. Evidence of a Native Northwest Atlantic COI Haplotype Clade in the Cryptogenic Colonial Ascidian Botryllus schlosseri.

    Science.gov (United States)

    Yund, Philip O; Collins, Catherine; Johnson, Sheri L

    2015-06-01

    The colonial ascidian Botryllus schlosseri should be considered cryptogenic (i.e., not definitively classified as either native or introduced) in the Northwest Atlantic. Although all the evidence is quite circumstantial, over the last 15 years most research groups have accepted the scenario of human-mediated dispersal and classified B. schlosseri as introduced; others have continued to consider it native or cryptogenic. We address the invasion status of this species by adding 174 sequences to the growing worldwide database for the mitochondrial gene cytochrome c oxidase subunit I (COI) and analyzing 1077 sequences to compare genetic diversity of one clade of haplotypes in the Northwest Atlantic with two hypothesized source regions (the Northeast Atlantic and Mediterranean). Our results lead us to reject the prevailing view of the directionality of transport across the Atlantic. We argue that the genetic diversity patterns at COI are far more consistent with the existence of at least one haplotype clade in the Northwest Atlantic (and possibly a second) that substantially pre-dates human colonization from Europe, with this native North American clade subsequently introduced to three sites in Northeast Atlantic and Mediterranean waters. However, we agree with past researchers that some sites in the Northwest Atlantic have more recently been invaded by alien haplotypes, so that some populations are currently composed of a mixture of native and invader haplotypes. © 2015 Marine Biological Laboratory.

  20. WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads

    NARCIS (Netherlands)

    M.D. Patterson (Murray); T. Marschall (Tobias); N. Pisanti (Nadia); L.J.J. van Iersel (Leo); L. Stougie (Leen); G.W. Klau (Gunnar); A. Schönhuth (Alexander)

    2014-01-01

    The human genome is diploid, that is, each of its chromosomes comes in two copies. This requires phasing the single nucleotide polymorphisms (SNPs), that is, assigning them to the two copies, beyond just detecting them. The resulting haplotypes, lists of SNPs belonging to each copy, are

  1. WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads

    NARCIS (Netherlands)

    Patterson, M.; Marschall, T.; Pisanti, N.; van Iersel, L.J.J.; Stougie, L.; Klau, G.W.; Schoenhuth, A.

    2014-01-01

    The human genome is diploid, that is, each of its chromosomes comes in two copies. This requires phasing the single nucleotide polymorphisms (SNPs), that is, assigning them to the two copies, beyond just detecting them. The resulting haplotypes, lists of SNPs belonging to each copy, are crucial for

  2. De novo assembly of a haplotype-resolved human genome

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang

    2015-01-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-...

  3. Endothelial Nitric Oxide Synthase Haplotypes Are Associated with Preeclampsia in Maya Mestizo Women

    Directory of Open Access Journals (Sweden)

    Lizbeth Díaz-Olguín

    2011-01-01

    Full Text Available Preeclampsia is a specific disease of pregnancy and believed to have a genetic component. The aim of this study was to investigate if three polymorphisms in eNOS or their haplotypes are associated with preeclampsia in Maya mestizo women.

  4. HLA class II linkage disequilibrium and haplotype evolution in the Cayapa Indians of Ecuador

    Energy Technology Data Exchange (ETDEWEB)

    Trachtenberg, E.A.; Erlich, H.A. [Roche Molecular Systems, Alameda, CA (United States); Klitz, W. [Univ. of California, Berkeley, CA (United States)] [and others

    1995-08-01

    DNA-based typing of the HLA class II loci in a sample of the Cayapa Indians of Ecuador reveals several lines of evidence that selection has operated to maintain and to diversify the existing level of polymorphism in the class II region. As has been noticed for other Native American groups, the overall level of polymorphism at the DRB1, DQA1, DQB1, and DPB1 loci is reduced relative to that found in other human populations. Nonetheless, the relative evenness in the distribution of allele frequencies at each of the four loci points to the role of balancing selection in the maintenance of the polymorphism. The DQA1 and DQB1 loci, in particular, have near-maximum departures from the neutrality model, which suggests that balancing selection has been especially strong in these cases. Several novel DQA1-DQB1 haplotypes and the discovery of a new DRB1 allele demonstrate an evolutionary tendency favoring the diversification of class II alleles and haplotypes. The recombination interval between the centromeric DPB1 locus and the other class II loci will, in the absence of other forces such as selection, reduce disequilibrium across this region. However, nearly all common alleles were found to be part of DR-DP haplotypes in strong disequilibrium, consistent with recent selection acting on these haplotypes in the Cayapa. 50 refs., 3 figs., 3 tabs.

  5. Novel Harmful Recessive Haplotypes Identified for Fertility Traits in Nordic Holstein Cattle

    Science.gov (United States)

    Sahana, Goutam; Nielsen, Ulrik Sander; Aamand, Gert Pedersen; Lund, Mogens Sandø; Guldbrandtsen, Bernt

    2013-01-01

    Using genomic data, lethal recessives may be discovered from haplotypes that are common in the population but never occur in the homozygous state in live animals. This approach requires genotype data only from phenotypically normal (i.e. live) individuals and not from the affected embryos that die. A total of 7,937 Nordic Holstein animals were genotyped with the BovineSNP50 BeadChip, and haplotypes of 25 consecutive markers were constructed and tested for the absence of homozygous states. We identified 17 homozygote-deficient haplotypes, which could be loosely clustered into eight genomic regions harboring possible recessive lethal alleles. Effects of the identified haplotypes were estimated on two fertility traits: non-return rate and calving interval. Of the eight identified genomic regions, six were confirmed as having an effect on fertility. The information can be used to avoid carrier-by-carrier matings in practical animal breeding. Further, identification of the causative genes/polymorphisms responsible for lethal effects will lead to accurate testing of individuals carrying a lethal allele. PMID:24376603
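
    The homozygote-deficiency idea in this record can be illustrated with a minimal sketch. It assumes random mating (Hardy-Weinberg proportions), which is a simplification and not necessarily the test the authors used; the function names and the 5% haplotype frequency are hypothetical and chosen only for illustration.

```python
def expected_homozygotes(n_animals, hap_freq):
    """Expected count of homozygous carriers under random mating (assumption)."""
    return n_animals * hap_freq ** 2


def prob_zero_homozygotes(n_animals, hap_freq):
    """Probability of seeing no homozygotes at all if the haplotype were harmless."""
    return (1.0 - hap_freq ** 2) ** n_animals


# 7,937 genotyped animals comes from the record; the 5% frequency is hypothetical.
n = 7937
freq = 0.05
print(expected_homozygotes(n, freq))   # expected homozygotes if viable
print(prob_zero_homozygotes(n, freq))  # chance of observing none by luck
```

    A haplotype with a sizeable expected homozygote count but none observed is a candidate carrier of a recessive lethal; a real analysis would also account for pedigree structure and multiple testing.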

  6. Haplotype frequencies at the DRD2 locus in populations of the East European Plain

    Directory of Open Access Journals (Sweden)

    Mikulich Alexey I

    2009-09-01

    Background: It was demonstrated previously that the three-locus RFLP haplotype, TaqI B-TaqI D-TaqI A (B-D-A), at the DRD2 locus constitutes a powerful genetic marker and probably reflects the most ancient dispersal of anatomically modern humans. Results: We investigated TaqI B, BclI, MboI, TaqI D, and TaqI A RFLPs in 17 contemporary populations of the East European Plain and Siberia. Most of these populations belong to the Indo-European or Uralic language families. We identified three common haplotypes, which occurred in more than 90% of chromosomes investigated. The frequencies of the haplotypes differed according to linguistic and geographical affiliation. Conclusion: Populations in the northwestern (Byelorussians from Mjadel'), northern (Russians from Mezen' and Oshevensk), and eastern (Russians from Puchezh) parts of the East European Plain had relatively high frequencies of haplotype B2-D2-A2, which may reflect admixture with Uralic-speaking populations that inhabited all of these regions in the Early Middle Ages.

  7. An algebra-based method for inferring gene regulatory networks.

    Science.gov (United States)

    Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

    2014-03-26

    The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating prior knowledge about the network into the inference process, along with an effective description of the search space of dynamic models. Finally, it is also important to understand how a given inference method is affected by experimental and other noise in the data used. This paper presents a novel inference algorithm that uses the algebraic framework of Boolean polynomial dynamical systems (BPDS) and meets all these requirements. The algorithm takes as input time series data, including those from network perturbations such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the

  8. Bayesian inference with ecological applications

    CERN Document Server

    Link, William A

    2009-01-01

    This text is written to provide a mathematically sound but accessible and engaging introduction to Bayesian inference specifically for environmental scientists, ecologists and wildlife biologists. It emphasizes the power and usefulness of Bayesian methods in an ecological context. The advent of fast personal computers and easily available software has simplified the use of Bayesian and hierarchical models. One obstacle remains for ecologists and wildlife biologists, namely the near absence of Bayesian texts written specifically for them. The book includes many relevant examples, is supported by software and examples on a companion website and will become an essential grounding in this approach for students and research ecologists. Key features: engagingly written text specifically designed to demystify a complex subject; examples drawn from ecology and wildlife research; an essential grounding for graduate and research ecologists in the increasingly prevalent Bayesian approach to inference; companion website with analyt...

  9. Bayesian inference on proportional elections.

    Directory of Open Access Journals (Sweden)

    Gabriel Hideki Vatanabe Brunello

    Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional voting systems do not necessarily guarantee that the candidate with the highest percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied to proportional elections. In this context, the purpose of this paper was to perform Bayesian inference on proportional elections considering the Brazilian system of seat distribution. More specifically, a methodology was developed to estimate the probability that a given party will have representation in the Chamber of Deputies. Inferences were made in a Bayesian setting using Monte Carlo simulation, and the methodology was applied to data from the 2010 Brazilian elections for Members of the Legislative Assembly and the Federal Chamber of Deputies. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software.
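
    The kind of Monte Carlo calculation this record describes might look like the following minimal sketch. It is not the authors' method: it assumes a Dirichlet posterior on vote shares with a uniform prior, uses a plain D'Hondt highest-averages rule as a stand-in for the Brazilian seat-distribution rules, and is written in Python rather than R. All function names and poll counts are hypothetical.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one vote-share vector from a Dirichlet distribution via gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def dhondt(shares, n_seats):
    """Allocate seats by the D'Hondt highest-averages rule (stand-in assumption)."""
    seats = [0] * len(shares)
    for _ in range(n_seats):
        quotients = [s / (k + 1) for s, k in zip(shares, seats)]
        winner = max(range(len(shares)), key=quotients.__getitem__)
        seats[winner] += 1
    return seats


def prob_representation(poll_counts, n_seats, n_sims=2000, seed=0):
    """Estimate, per party, the posterior probability of winning at least one seat."""
    rng = random.Random(seed)
    alphas = [c + 1 for c in poll_counts]  # uniform Dirichlet prior (assumption)
    hits = [0] * len(poll_counts)
    for _ in range(n_sims):
        seats = dhondt(sample_dirichlet(alphas, rng), n_seats)
        for i, s in enumerate(seats):
            if s > 0:
                hits[i] += 1
    return [h / n_sims for h in hits]


# Hypothetical poll of 1,000 respondents across four parties, five seats at stake.
print(prob_representation([480, 300, 150, 70], n_seats=5))
```

    Each simulation draws plausible vote shares given the poll and allocates seats; the fraction of simulations in which a party wins a seat estimates its probability of representation.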

  10. System Support for Forensic Inference

    Science.gov (United States)

    Gehani, Ashish; Kirchner, Florent; Shankar, Natarajan

    Digital evidence is playing an increasingly important role in prosecuting crimes. The reasons are manifold: financially lucrative targets are now connected online, systems are so complex that vulnerabilities abound and strong digital identities are being adopted, making audit trails more useful. If the discoveries of forensic analysts are to hold up to scrutiny in court, they must meet the standard for scientific evidence. Software systems are currently developed without consideration of this fact. This paper argues for the development of a formal framework for constructing “digital artifacts” that can serve as proxies for physical evidence; a system so imbued would facilitate sound digital forensic inference. A case study involving a filesystem augmentation that provides transparent support for forensic inference is described.

  11. Statistical inference on residual life

    CERN Document Server

    Jeong, Jong-Hyeon

    2014-01-01

    This is a monograph on the concept of residual life, which is an alternative summary measure of time-to-event data, or survival data. The mean residual life has been used for many years under the name of life expectancy, so it is a natural concept for summarizing survival or reliability data. It is also more interpretable than the popular hazard function, especially for communications between patients and physicians regarding the efficacy of a new drug in the medical field. This book reviews existing statistical methods to infer the residual life distribution. The review and comparison includes existing inference methods for mean and median, or quantile, residual life analysis through medical data examples. The concept of the residual life is also extended to competing risks analysis. The targeted audience includes biostatisticians, graduate students, and PhD (bio)statisticians. Knowledge in survival analysis at an introductory graduate level is advisable prior to reading this book.
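
    For reference, the mean residual life mentioned in this record has a standard definition, written here with symbols that are assumptions of this note rather than the book's notation: T is the time to event and S(t) = P(T > t) is the survival function.

```latex
m(t) \;=\; \mathbb{E}\left[\,T - t \mid T > t\,\right]
      \;=\; \frac{1}{S(t)} \int_{t}^{\infty} S(u)\,\mathrm{d}u ,
\qquad m(0) = \mathbb{E}[T].
```

    For an exponential lifetime with rate \lambda, S(t) = e^{-\lambda t} and m(t) = 1/\lambda for every t, which reflects the memoryless property.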

  12. Statistical inference a short course

    CERN Document Server

    Panik, Michael J

    2012-01-01

    A concise, easily accessible introduction to descriptive and inferential techniques. Statistical Inference: A Short Course offers a concise presentation of the essentials of basic statistics for readers seeking to acquire a working knowledge of statistical concepts, measures, and procedures. The author conducts tests on the assumptions of randomness and normality and provides nonparametric methods for when parametric approaches might not work. The book also explores how to determine a confidence interval for a population median while also providing coverage of ratio estimation, randomness, and causal

  13. Nonparametric predictive inference in reliability

    International Nuclear Information System (INIS)

    Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.

    2002-01-01

    We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function for a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI for replacement decisions. The emphasis is on introduction and illustration of NPI in reliability contexts; detailed mathematical justifications are presented elsewhere.
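
    The survival-function bounds can be sketched for the simplest case. The code assumes complete data (no right-censoring, no ties) and Hill's A(n) assumption, under which the next observation falls in each of the n + 1 intervals formed by the ordered data with probability 1/(n + 1). The function name and data are hypothetical, and the right-censored case discussed in the record is not covered.

```python
from bisect import bisect_right


def npi_survival_bounds(observations, t):
    """Lower and upper probability that the next observation exceeds t.

    Assumes complete data without ties and that t is not equal to an observed value.
    """
    xs = sorted(observations)
    n = len(xs)
    j = bisect_right(xs, t)          # number of observations at or below t
    lower = (n - j) / (n + 1)        # intervals lying entirely above t
    upper = (n - j + 1) / (n + 1)    # also counts the interval containing t
    return lower, upper


# Hypothetical failure times; bounds for surviving past t = 6.
print(npi_survival_bounds([2, 5, 7, 11], 6))
```

    The gap between the two bounds is always 1/(n + 1), so the imprecision shrinks as more data are observed.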

  14. Oestrogen receptor α gene haplotype and postmenopausal breast cancer risk: a case control study

    International Nuclear Information System (INIS)

    Wedrén, Sara; Stiger, Fredrik; Persson, Ingemar; Baron, John; Weiderpass, Elisabete; Lovmar, Lovisa; Humphreys, Keith; Magnusson, Cecilia; Melhus, Håkan; Syvänen, Ann-Christine; Kindmark, Andreas; Landegren, Ulf; Fermér, Maria Lagerström

    2004-01-01

    Oestrogen receptor α, which mediates the effect of oestrogen in target tissues, is genetically polymorphic. Because breast cancer development is dependent on oestrogenic influence, we have investigated whether polymorphisms in the oestrogen receptor α gene (ESR1) are associated with breast cancer risk. We genotyped breast cancer cases and age-matched population controls for one microsatellite marker and four single-nucleotide polymorphisms (SNPs) in ESR1. The numbers of genotyped cases and controls for each marker were as follows: (TA)n, 1514 cases and 1514 controls; c.454-397C → T, 1557 cases and 1512 controls; c.454-351A → G, 1556 cases and 1512 controls; c.729C → T, 1562 cases and 1513 controls; c.975C → G, 1562 cases and 1513 controls. Using logistic regression models, we calculated odds ratios (ORs) and 95% confidence intervals (CIs). Haplotype effects were estimated in an exploratory analysis, using expectation-maximisation algorithms for case-control study data. There were no compelling associations between single polymorphic loci and breast cancer risk. In haplotype analyses, a common haplotype of the c.454-351A → G or c.454-397C → T and c.975C → G SNPs appeared to be associated with an increased risk for ductal breast cancer: one copy of the c.454-351A → G and c.975C → G haplotype entailed an OR of 1.19 (95% CI 1.06–1.33) and two copies an OR of 1.42 (95% CI 1.15–1.77), compared with no copies, under a model of multiplicative penetrance. The association with the c.454-397C → T and c.975C → G haplotypes was similar. Our data indicated that these haplotypes were more influential in women with a high body mass index. Adjustment for multiple comparisons rendered the associations statistically non-significant. We found suggestions of an association between common haplotypes in ESR1 and the risk for ductal breast cancer that is stronger in heavy women.
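
    The two reported odds ratios are consistent with the multiplicative penetrance model, as a short worked check shows. The notation is an assumption of this note: g is the number of haplotype copies (0, 1 or 2) and \beta_1 is the per-copy log odds ratio in a logistic model.

```latex
\log\frac{P(\text{case}\mid g)}{1 - P(\text{case}\mid g)} = \beta_0 + \beta_1 g,
\qquad
\mathrm{OR}_g = e^{\beta_1 g}
\;\Longrightarrow\;
\mathrm{OR}_2 = \mathrm{OR}_1^{\,2} = 1.19^{2} \approx 1.42 .
```

    Under this model the two-copy odds ratio is fully determined by the one-copy estimate, so the two figures are not independent pieces of evidence.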

  15. An ancestral haplotype of the human PERIOD2 gene associates with reduced sensitivity to light-induced melatonin suppression.

    Directory of Open Access Journals (Sweden)

    Tokiho Akiyama

    Humans show individual-level variation in their responses to environmental stimuli, termed "physiological variations." However, it has been unclear whether these are caused by genetic variation. In this study, we examined the association between physiological variation in the response to a light stimulus and genetic polymorphisms. We collected physiological data from 43 subjects, including light-induced melatonin suppression, and performed haplotype analyses on the clock genes PER2 and PER3, which exhibit geographical differentiation of allele frequencies. Among the haplotypes of PER3, no significant difference in light sensitivity was found. However, three common haplotypes of PER2 accounted for more than 96% of the chromosomes in subjects, and one of the three showed a significantly less sensitive response to the light stimulus (P < 0.05). Homozygotes for the low-sensitivity PER2 haplotype showed significantly lower percentages of melatonin suppression (P < 0.05), and heterozygotes varied in their ratios, indicating that the physiological variation in light sensitivity is evidently related to the PER2 polymorphism. Compared with global haplotype frequencies, the haplotype with a low-sensitivity response was more frequent in Africans than in non-Africans and fell at the root of the phylogenetic tree, suggesting that the low light-sensitivity haplotype is the ancestral type, whereas the other haplotypes with high sensitivity to light are the derived types. Hence, we speculate that the high light-sensitivity haplotypes spread throughout the world after the Out-of-Africa migration of modern humans.

  16. Self-compatible peach (Prunus persica) has mutant versions of the S haplotypes found in self-incompatible Prunus species.

    Science.gov (United States)

    Tao, Ryutaro; Watari, Akiko; Hanada, Toshio; Habu, Tsuyoshi; Yaegaki, Hideaki; Yamaguchi, Masami; Yamane, Hisayo

    2007-01-01

    This study demonstrates that self-compatible (SC) peach has mutant versions of S haplotypes that are present in self-incompatible (SI) Prunus species. All three peach S haplotypes, S (1), S (2), and S (2m), found in this study encode mutated pollen determinants, SFB, while only S (2m) has a mutation that affects the function of the pistil determinant S-RNase. A cysteine residue in the C5 domain of the S (2m)-RNase is substituted by a tyrosine residue, thereby reducing RNase stability. The peach SFB mutations are similar to the SFB mutations found in SC haplotypes of sweet cherry (P. avium) and Japanese apricot (P. mume). SFB (1) of the S (1) haplotype, a mutant version of almond (P. dulcis) S (k) haplotype, encodes truncated SFB due to a 155 bp insertion. SFB (2) of the S (2) and S (2m) haplotypes, both of which are mutant versions of the S (a) haplotype in Japanese plum (P. salicina), encodes a truncated SFB due to a 5 bp insertion. Thus, regardless of the functionality of the pistil determinant, all three peach S haplotypes are SC haplotypes. Our finding that peach has mutant versions of S haplotypes that function in almond and Japanese plum, which are phylogenetically close and remote species, respectively, to peach in the subfamily Prunoideae of the Rosaceae, provides insight into the SC/SI evolution in Prunus. We discuss the significance of SC pollen part mutation in peach with special reference to possible differences in the SI mechanisms between Prunus and Solanaceae.

  17. Continuous Integrated Invariant Inference, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — The proposed project will develop a new technique for invariant inference and embed this and other current invariant inference and checking techniques in an...

  18. Statistical perspectives on inverse problems

    DEFF Research Database (Denmark)

    Andersen, Kim Emil

    Inverse problems arise in many scientific disciplines and pertain to situations where inference is to be made about a particular phenomenon from indirect measurements. A typical example, arising in diffusion tomography, is the inverse boundary value problem for non-invasive reconstruction of the interior of an object from electrical boundary measurements. One part of this thesis concerns statistical approaches for solving possibly non-linear inverse problems. Thus inverse problems are recast in a form suitable for statistical inference. In particular, a Bayesian approach for regularisation... problem is given in terms of probability distributions. Posterior inference is obtained by Markov chain Monte Carlo methods, and new, powerful simulation techniques based on, e.g., coupled Markov chains and simulated tempering are developed to improve the computational efficiency of the overall simulation...

  19. An Efficient Forward-Reverse EM Algorithm for Statistical Inference in Stochastic Reaction Networks

    KAUST Repository

    Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro

    2016-01-01

    In this work [1], we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem

  20. The anatomy of choice: active inference and agency

    Directory of Open Access Journals (Sweden)

    Karl Friston

    2013-09-01

    This paper considers agency in the setting of embodied or active inference. In brief, we associate a sense of agency with prior beliefs about action and ask what sorts of beliefs underlie optimal behaviour. In particular, we consider prior beliefs that action minimises the Kullback-Leibler divergence between desired states and attainable states in the future. This allows one to formulate bounded rationality as approximate Bayesian inference that optimises a free energy bound on model evidence. We show that constructs like expected utility, exploration bonuses, softmax choice rules and optimism bias emerge as natural consequences of this formulation. Previous accounts of active inference have focused on predictive coding and Bayesian filtering schemes for minimising free energy. Here, we consider variational Bayes as an alternative scheme that provides formal constraints on the computational anatomy of inference and action – constraints that are remarkably consistent with neuroanatomy. Furthermore, this scheme contextualises optimal decision theory and economic (utilitarian) formulations as pure inference problems. For example, expected utility theory emerges as a special case of free energy minimisation, where the sensitivity or inverse temperature (of softmax functions and quantal response equilibria) has a unique and Bayes-optimal solution – that minimises free energy. This sensitivity corresponds to the precision of beliefs about behaviour, such that attainable goals are afforded a higher precision or confidence. In turn, this means that optimal behaviour entails a representation of confidence about outcomes that are under an agent's control.

  1. The anatomy of choice: active inference and agency.

    Science.gov (United States)

    Friston, Karl; Schwartenbeck, Philipp; Fitzgerald, Thomas; Moutoussis, Michael; Behrens, Timothy; Dolan, Raymond J

    2013-01-01

    This paper considers agency in the setting of embodied or active inference. In brief, we associate a sense of agency with prior beliefs about action and ask what sorts of beliefs underlie optimal behavior. In particular, we consider prior beliefs that action minimizes the Kullback-Leibler (KL) divergence between desired states and attainable states in the future. This allows one to formulate bounded rationality as approximate Bayesian inference that optimizes a free energy bound on model evidence. We show that constructs like expected utility, exploration bonuses, softmax choice rules and optimism bias emerge as natural consequences of this formulation. Previous accounts of active inference have focused on predictive coding and Bayesian filtering schemes for minimizing free energy. Here, we consider variational Bayes as an alternative scheme that provides formal constraints on the computational anatomy of inference and action, constraints that are remarkably consistent with neuroanatomy. Furthermore, this scheme contextualizes optimal decision theory and economic (utilitarian) formulations as pure inference problems. For example, expected utility theory emerges as a special case of free energy minimization, where the sensitivity or inverse temperature (of softmax functions and quantal response equilibria) has a unique and Bayes-optimal solution that minimizes free energy. This sensitivity corresponds to the precision of beliefs about behavior, such that attainable goals are afforded a higher precision or confidence. In turn, this means that optimal behavior entails a representation of confidence about outcomes that are under an agent's control.
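
    Two of the ingredients named in this abstract, the KL divergence and a softmax choice rule with a sensitivity (inverse temperature) parameter, can be sketched generically. This is a textbook illustration under assumed names and toy numbers, not the paper's variational scheme.

```python
import math


def softmax(values, precision):
    """Softmax choice rule; `precision` plays the role of inverse temperature."""
    scaled = [precision * v for v in values]
    top = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL divergence D(p || q) for discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical action values: higher precision concentrates choice on the best action.
values = [1.0, 2.0, 3.0]
print(softmax(values, precision=0.0))  # uniform: no confidence in the values
print(softmax(values, precision=5.0))  # nearly deterministic choice

# Hypothetical desired vs attainable state distributions.
print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

    With precision zero every action is equally likely, and as precision grows the rule approaches a deterministic choice of the highest-valued action.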

  2. Evidence and Consequence of a Highly Adapted Clonal Haplotype within the Australian Ascochyta rabiei Population

    Directory of Open Access Journals (Sweden)

    Yasir Mehmood

    2017-06-01

    The Australian Ascochyta rabiei (Pass.) Labr. (syn. Phoma rabiei) population has low genotypic diversity with only one mating type detected to date, potentially precluding substantial evolution through recombination. However, a large diversity in aggressiveness exists. In an effort to better understand the risk from selective adaptation to currently used resistance sources and chemical control strategies, the population was examined in detail. For this, a total of 598 isolates were quasi-hierarchically sampled between 2013 and 2015 across all major Australian chickpea growing regions and commonly grown host genotypes. Although a large number of haplotypes were identified (66) through short sequence repeat (SSR) genotyping, overall low gene diversity (Hexp = 0.066) and genotypic diversity (D = 0.57) was detected. Almost 70% of the isolates assessed were of a single dominant haplotype (ARH01). Disease screening on a differential host set, including three commonly deployed resistance sources, revealed distinct aggressiveness among the isolates, with 17% of all isolates identified as highly aggressive. Almost 75% of these were of the ARH01 haplotype. A similar pattern was observed at the host level, with 46% of all isolates collected from the commonly grown host genotype Genesis090 (classified as “resistant” during the term of collection) identified as highly aggressive. Of these, 63% belonged to the ARH01 haplotype. In conclusion, the ARH01 haplotype represents a significant risk to the Australian chickpea industry, being not only widely adapted to the diverse agro-geographical environments of the Australian chickpea growing regions, but also containing a disproportionately large number of aggressive isolates, indicating fitness to survive and replicate on the best resistance sources in the Australian germplasm.
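
    A genotypic diversity index of the kind reported here can be sketched from haplotype counts. The record does not say which estimator the authors used; the code assumes the common Simpson-type definition, one minus the sum of squared haplotype frequencies, and the counts are hypothetical.

```python
def simpson_diversity(counts):
    """Simpson-type diversity: chance that two random isolates differ in haplotype."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical isolate counts per haplotype; one haplotype dominates the sample.
print(simpson_diversity([70, 20, 10]))
```

    A population dominated by a single clonal haplotype scores low on this index even when many rare haplotypes are present.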

  3. Contrasted patterns of molecular evolution in dominant and recessive self-incompatibility haplotypes in Arabidopsis.

    Directory of Open Access Journals (Sweden)

    Pauline M Goubet

    Self-incompatibility has been considered by geneticists a model system for reproductive biology and balancing selection, but our understanding of the genetic basis and evolution of this molecular lock-and-key system has remained limited by the extreme level of sequence divergence among haplotypes, resulting in a lack of appropriate genomic sequences. In this study, we report and analyze the full sequence of eleven distinct haplotypes of the self-incompatibility locus (S-locus) in two closely related Arabidopsis species, obtained from individual BAC libraries. We use this extensive dataset to highlight sharply contrasted patterns of molecular evolution of each of the two genes controlling self-incompatibility themselves, as well as of the genomic region surrounding them. We find strong collinearity of the flanking regions among haplotypes on each side of the S-locus together with high levels of sequence similarity. In contrast, the S-locus region itself shows spectacularly deep gene genealogies, high variability in size and gene organization, as well as complete absence of sequence similarity in intergenic sequences and striking accumulation of transposable elements. Of particular interest, we demonstrate that dominant and recessive S-haplotypes experience sharply contrasted patterns of molecular evolution. Indeed, dominant haplotypes exhibit larger size and a much higher density of transposable elements, being matched only by that in the centromere. Overall, these properties highlight that the S-locus presents many striking similarities with other regions involved in the determination of mating-types, such as sex chromosomes in animals or in plants, or the mating-type locus in fungi and green algae.

  4. Enhancing the mathematical properties of new haplotype homozygosity statistics for the detection of selective sweeps.

    Science.gov (United States)

    Garud, Nandita R; Rosenberg, Noah A

    2015-06-01

    Soft selective sweeps represent an important form of adaptation in which multiple haplotypes bearing adaptive alleles rise to high frequency. Most statistical methods for detecting selective sweeps from genetic polymorphism data, however, have focused on identifying hard selective sweeps in which a favored allele appears on a single haplotypic background; these methods might be underpowered to detect soft sweeps. Among exceptions is the set of haplotype homozygosity statistics introduced for the detection of soft sweeps by Garud et al. (2015). These statistics, examining frequencies of multiple haplotypes in relation to each other, include H12, a statistic designed to identify both hard and soft selective sweeps, and H2/H1, a statistic that conditional on high H12 values seeks to distinguish between hard and soft sweeps. A challenge in the use of H2/H1 is that its range depends on the associated value of H12, so that equal H2/H1 values might provide different levels of support for a soft sweep model at different values of H12. Here, we enhance the H12 and H2/H1 haplotype homozygosity statistics for selective sweep detection by deriving the upper bound on H2/H1 as a function of H12, thereby generating a statistic that normalizes H2/H1 to lie between 0 and 1. Through a reanalysis of resequencing data from inbred lines of Drosophila, we show that the enhanced statistic both strengthens interpretations obtained with the unnormalized statistic and leads to empirical insights that are less readily apparent without the normalization. Copyright © 2015 Elsevier Inc. All rights reserved.
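
    The statistics named in this abstract can be computed from a list of haplotype frequencies in a genomic window. The sketch below uses the standard definitions (H1 is the sum of squared frequencies, H12 pools the two most frequent haplotypes, H2 drops the most frequent one); it does not implement the normalization derived in the paper, and the function name and example frequencies are hypothetical.

```python
def haplotype_homozygosity(freqs):
    """Return H1, H12 and H2/H1 for a list of haplotype frequencies summing to 1."""
    p = sorted(freqs, reverse=True)
    p1 = p[0]
    p2 = p[1] if len(p) > 1 else 0.0
    h1 = sum(x * x for x in p)       # expected haplotype homozygosity
    h12 = h1 + 2.0 * p1 * p2         # equals (p1 + p2)^2 + sum of the other p_i^2
    h2 = h1 - p1 * p1                # homozygosity without the top haplotype
    return {"H1": h1, "H12": h12, "H2/H1": h2 / h1}


# Hypothetical windows: one dominant haplotype vs two co-dominant haplotypes.
print(haplotype_homozygosity([0.90, 0.05, 0.05]))  # hard-sweep-like pattern
print(haplotype_homozygosity([0.45, 0.45, 0.10]))  # soft-sweep-like pattern
```

    Both windows have high H12, but H2/H1 is near zero in the first and much larger in the second, which is the contrast the statistic is meant to capture.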

  5. Distribution of QPY and RAH haplotypes of granzyme B gene in distinct Brazilian populations

    Directory of Open Access Journals (Sweden)

    Fernanda Bernadelli Garcia

    2012-08-01

    INTRODUCTION: The cytolysis mediated by granules is one of the most important effector functions of cytotoxic T lymphocytes and natural killer cells. Recently, three single nucleotide polymorphisms (SNPs) were identified at exons 2, 3, and 5 of the granzyme B gene, resulting in a haplotype in which three amino acids of mature protein Q48P88Y245 are changed to R48A88H245, which leads to loss of cytotoxic activity of the protein. In this study, we evaluated the frequency of these polymorphisms in Brazilian populations. METHODS: We evaluated the frequency of these polymorphisms in Brazilian ethnic groups (white, Afro-Brazilian, and Asian) by sequencing these regions. RESULTS: The allelic and genotypic frequencies of SNP 2364A/G at exon 2 in Afro-Brazilian individuals (42.3% and 17.3%) were significantly higher when compared with those in whites and Asians (p < 0.0001 and p = 0.0007, respectively). The polymorphisms 2933C/G and 4243C/T also were more frequent in Afro-Brazilians but without any significant difference regarding the other groups. The Afro-Brazilian group presented greater diversity of haplotypes, and the RAH haplotype seemed to be more frequent in this group (25%), followed by the whites (20.7%) and by the Asians (11.9%), similar to the frequency presented in the literature. CONCLUSIONS: There is a higher frequency of polymorphisms in Afro-Brazilians, and the RAH haplotype was more frequent in these individuals. We believe that further studies should aim to investigate the correlation of this haplotype with diseases related to immunity mediated by cytotoxic lymphocytes, and if this correlation is confirmed, novel treatment strategies might be elaborated.

  6. Introgression of a Rare Haplotype from Southeastern Africa to Breed California Blackeyes with Larger Seeds

    Directory of Open Access Journals (Sweden)

    Mitchell R Lucas

    2015-03-01

    Seed size distinguishes most crops from their wild relatives and is an important quality trait for the grain legume cowpea. In order to breed cowpea varieties with larger seeds, we introgressed a rare haplotype associated with large seeds at the Css-1 locus from an African buff seed type cultivar, IT82E-18 (18.5 g/100 seeds), into a blackeye seed type cultivar, CB27 (22 g/100 seeds). Four RILs derived from these two parents were chosen for marker-assisted breeding based on SNP genotyping with a goal of stacking large seed haplotypes into a CB27 background. Foreground and background selection were performed during two cycles of backcrossing based on genome-wide SNP markers. The average seed size of introgression lines homozygous for haplotypes associated with large seeds was 28.7 g/100 seeds and 24.8 g/100 seeds for cycles 1 and 2, respectively. One cycle 1 introgression line with desirable seed quality was selfed for two generations to make families with very large seeds (28-35 g/100 seeds). Field-based performance trials helped identify breeding lines that not only have large seeds but are also desirable in terms of yield, maturity, and plant architecture when compared to industry standards. A principal component analysis was used to explore the relationships between the parents relative to a core set of landraces and improved varieties based on high-density SNP data. The geographic distribution of haplotypes at the Css-1 locus suggests the haplotype associated with large seeds is unique to accessions collected from Southeastern Africa. Therefore this QTL has strong potential for developing larger-seeded varieties for other growing regions, which is demonstrated in this work using a California pedigree.

  7. The IGF1 small dog haplotype is derived from Middle Eastern grey wolves

    Directory of Open Access Journals (Sweden)

    Ostrander Elaine A

    2010-02-01

    Full Text Available Abstract Background A selective sweep containing the insulin-like growth factor 1 (IGF1) gene is associated with size variation in domestic dogs. Intron 2 of IGF1 contains a SINE element and single nucleotide polymorphism (SNP) found in all small dog breeds that is almost entirely absent from large breeds. In this study, we surveyed a large sample of grey wolf populations to better understand the ancestral pattern of variation at IGF1 with a particular focus on the distribution of the small dog haplotype and its relationship to the origin of the dog. Results We present DNA sequence data that confirms the absence of the derived small SNP allele in the intron 2 region of IGF1 in a large sample of grey wolves and further establishes the absence of a small dog associated SINE element in all wild canids and most large dog breeds. Grey wolf haplotypes from the Middle East have higher nucleotide diversity suggesting an origin there. Additionally, PCA and phylogenetic analyses suggest a closer kinship of the small domestic dog IGF1 haplotype with those from Middle Eastern grey wolves. Conclusions The absence of both the SINE element and SNP allele in grey wolves suggests that the mutation for small body size post-dates the domestication of dogs. However, because all small dogs possess these diagnostic mutations, the mutations likely arose early in the history of domestic dogs. Our results show that the small dog haplotype is closely related to those in Middle Eastern wolves and is consistent with an ancient origin of the small dog haplotype there. Thus, in concordance with past archeological studies, our molecular analysis is consistent with the early evolution of small size in dogs from the Middle East. See associated opinion by Driscoll and Macdonald: http://jbiol.com/content/9/2/10

  8. Five novel glucose-6-phosphate dehydrogenase deficiency haplotypes correlating with disease severity

    Directory of Open Access Journals (Sweden)

    Dallol Ashraf

    2012-09-01

    Full Text Available Abstract Background Glucose-6-phosphate dehydrogenase (G6PD, EC 1.1.1.49) deficiency is caused by one or more mutations in the G6PD gene on chromosome X. An association between enzyme levels and gene haplotypes remains to be established. Methods In this study, we determined G6PD enzyme levels and sequenced the coding region, including the intron-exon boundaries, in a group of individuals (163 males and 86 females) who were referred to the clinic with suspected G6PD deficiency. The sequence data were analysed by physical linkage analysis and PHASE haplotype reconstruction. Results All previously reported G6PD missense changes, including AURES, MEDITERRANEAN, A-, SIBARI, VIANGCHAN and ANANT, were identified in our cohort. The AURES mutation (p.Ile48Thr) was the most common variant in the cohort (30% in male patients), followed by the Mediterranean variant (p.Ser188Phe), detectable in 17.79% of male patients. Variant forms of the A- mutation (p.Val68Met, p.Asn126Asp or a combination of both) were detectable in 15.33% of the male patients. However, unique to this study, several of such mutations co-existed in the same patient as shown by physical linkage in males or PHASE haplotype reconstruction in females. Based on 6 non-synonymous variants of G6PD, 13 different haplotypes (13 in males, 8 in females) were identified. Five of these were previously unreported (Jeddah A, B, C, D and E) and were defined by previously unreported combinations of extant mutations where patients harbouring these haplotypes exhibited severe G6PD deficiency. Conclusions Our findings will help design a focused population screening approach and provide better management for G6PD deficiency patients.

  9. Multi-Agent Inference in Social Networks: A Finite Population Learning Approach.

    Science.gov (United States)

    Fan, Jianqing; Tong, Xin; Zeng, Yao

    When people in a society want to make inference about some parameter, each person may want to use data collected by other people. Information (data) exchange in social networks is usually costly, so to make reliable statistical decisions, people need to trade off the benefits and costs of information acquisition. Conflicts of interests and coordination problems will arise in the process. Classical statistics does not consider people's incentives and interactions in the data collection process. To address this imperfection, this work explores multi-agent Bayesian inference problems with a game theoretic social network model. Motivated by our interest in aggregate inference at the societal level, we propose a new concept, finite population learning, to address whether with high probability, a large fraction of people in a given finite population network can make "good" inference. Serving as a foundation, this concept enables us to study the long run trend of aggregate inference quality as population grows.

  10. Critical examination of logical formulations in quantum theory. Statistical inference and Hilbertian distance between quantum states

    International Nuclear Information System (INIS)

    Hadjisawas, Nicolas.

    1982-01-01

    After a critical study of the logical quantum mechanics formulations of Jauch and Piron, classical and quantum versions of statistical inference are studied. In order to do this, the significance of the Jaynes and Kullback principles (maximum likelihood, least squares principles) is revealed from the theorems established. In the quantum mechanics inference problem, a "distance" between states is defined. This concept is used to solve the quantum equivalent of the classical problem studied by Kullback. The "projection postulate" proposition is subsequently deduced [fr

  11. Genetic variation of the greenhouse whitefly, Trialeurodes vaporariorum (Hemiptera: Aleyrodidae), among populations from Serbia and neighbouring countries, as inferred from COI sequence variability.

    Science.gov (United States)

    Prijović, M; Skaljac, M; Drobnjaković, T; Zanić, K; Perić, P; Marčić, D; Puizina, J

    2014-06-01

    The greenhouse whitefly Trialeurodes vaporariorum Westwood, 1856 (Hemiptera: Aleyrodidae) is an invasive and highly polyphagous phloem-feeding pest of vegetables and ornamentals. Trialeurodes vaporariorum causes serious damage due to direct feeding and transmits several important plant viruses. Excessive use of insecticides has resulted in significantly reduced levels of susceptibility of various T. vaporariorum populations. To determine the genetic variability within and among populations of T. vaporariorum from Serbia and to explore their genetic relatedness with other T. vaporariorum populations, we analysed the mitochondrial cytochrome c oxidase I (COI) sequences of 16 populations from Serbia and six from neighbouring countries: Montenegro (three populations), Macedonia (one population) and Croatia (two populations), for a total of 198 analysed specimens. A low overall level of sequence divergence and only five variable nucleotides and six haplotypes were found. The most frequent haplotype, H1, was identified in all Serbian populations and in all specimens from distant localities in Croatia and Macedonia. The COI sequence data that was retrieved from GenBank and the data from our study indicated that H1 is the most globally widespread T. vaporariorum haplotype. A lack of spatial genetic structure among the studied T. vaporariorum populations, as well as two demographic tests that we performed (Tajima's D value and Fu's Fs statistics), indicate a recent colonisation event and population growth. Phylogenetic analyses of the COI haplotypes in this study and other T. vaporariorum haplotypes that were retrieved from GenBank were performed using Bayesian inference and median-joining (MJ) network analysis. Two major haplogroups with only a single unique nucleotide difference were found: haplogroup 1 (containing the five Serbian haplotypes and those previously identified in India, China, the Netherlands, the United Kingdom, Morocco, Reunion and the USA) and haplogroup 3

  12. Common ADRB2 haplotypes derived from 26 polymorphic sites direct beta2-adrenergic receptor expression and regulation phenotypes.

    Directory of Open Access Journals (Sweden)

    Alfredo Panebra

    2010-07-01

    Full Text Available The beta2-adrenergic receptor (beta2AR) is expressed on numerous cell-types including airway smooth muscle cells and cardiomyocytes. Drugs (agonists or antagonists) acting at these receptors for treatment of asthma, chronic obstructive pulmonary disease, and heart failure show substantial interindividual variability in response. The ADRB2 gene is polymorphic in noncoding and coding regions, but virtually all ADRB2 association studies have utilized the two common nonsynonymous coding SNPs, often reaching discrepant conclusions. We constructed the 8 common ADRB2 haplotypes derived from 26 polymorphisms in the promoter, 5'UTR, coding, and 3'UTR of the intronless ADRB2 gene. These were cloned into an expression construct lacking a vector-based promoter, so that beta2AR expression was driven by its promoter, and steady state expression could be modified by polymorphisms throughout ADRB2 within a haplotype. "Whole-gene" transfections were performed with COS-7 cells and revealed 4 haplotypes with increased cell surface beta2AR protein expression compared to the others. Agonist-promoted downregulation of beta2AR protein expression was also haplotype-dependent, and was found to be increased for 2 haplotypes. A phylogenetic tree of the haplotypes was derived and annotated by cellular phenotypes, revealing a pattern potentially driven by expression. Thus for obstructive lung disease, the initial bronchodilator response from intermittent administration of beta-agonist may be influenced by certain beta2AR haplotypes (expression phenotypes), while other haplotypes may influence tachyphylaxis during the response to chronic therapy (downregulation phenotypes). An ideal clinical outcome of high expression and less downregulation was found for two haplotypes. Haplotypes may also affect heart failure antagonist therapy, where beta2AR increase inotropy and are anti-apoptotic. The haplotype-specific expression and regulation phenotypes found in this transfection

  13. Inferring the conservative causal core of gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Emmert-Streib Frank

    2010-09-01

    Full Text Available Abstract Background Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. Results In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. Conclusions For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.

  14. Inferring the conservative causal core of gene regulatory networks.

    Science.gov (United States)

    Altay, Gökmen; Emmert-Streib, Frank

    2010-09-28

    Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
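
    The abstract above describes C3NET only as combining mutual information estimates with a maximization step. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a precomputed symmetric mutual information matrix, and the function name `max_mi_network` and the `threshold` parameter (a stand-in for whatever significance test is applied) are hypothetical.

```python
from typing import List, Tuple


def max_mi_network(mi: List[List[float]], threshold: float = 0.0) -> List[Tuple[int, int]]:
    """Keep, for each gene, only the edge to its highest-MI partner.

    mi        -- symmetric matrix of mutual information estimates (assumed given)
    threshold -- hypothetical cut-off; pairs at or below it are ignored
    """
    n = len(mi)
    edges = set()
    for i in range(n):
        best_j, best_val = None, threshold
        for j in range(n):
            # Maximization step: track the partner with the largest MI value.
            if j != i and mi[i][j] > best_val:
                best_j, best_val = j, mi[i][j]
        if best_j is not None:
            # Store undirected edges with the smaller index first.
            edges.add((min(i, best_j), max(i, best_j)))
    return sorted(edges)


if __name__ == "__main__":
    toy_mi = [
        [0.0, 0.9, 0.1],
        [0.9, 0.0, 0.4],
        [0.1, 0.4, 0.0],
    ]
    print(max_mi_network(toy_mi))
```

    Because each gene contributes at most one edge, the resulting network has at most as many edges as genes, which is one way to read the "conservative core" in the title.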

  15. Bayesian inference for Markov jump processes with informative observations.

    Science.gov (United States)

    Golightly, Andrew; Wilkinson, Darren J

    2015-04-01

    In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis.
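
    The reweighting idea mentioned above (more weight for trajectories that are consistent with the next observation) can be illustrated with a generic importance-weight update. This is a sketch under stated assumptions rather than the scheme from the paper: the Gaussian observation model, the scalar trajectory states, and the names `reweight` and `effective_sample_size` are all illustrative choices.

```python
import math
from typing import List


def reweight(weights: List[float], states: List[float],
             observation: float, obs_sd: float) -> List[float]:
    """Multiply each (strictly positive) weight by an assumed Gaussian
    likelihood of the next observation given the trajectory's current
    state, then normalise so the weights sum to one."""
    log_w = [
        math.log(w) - 0.5 * ((observation - x) / obs_sd) ** 2
        for w, x in zip(weights, states)
    ]
    shift = max(log_w)  # subtract the maximum for numerical stability
    unnormalised = [math.exp(lw - shift) for lw in log_w]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def effective_sample_size(weights: List[float]) -> float:
    """Standard diagnostic 1 / sum(w^2) for normalised weights."""
    return 1.0 / sum(w * w for w in weights)


if __name__ == "__main__":
    new_w = reweight([0.5, 0.5], [0.0, 2.0], observation=0.0, obs_sd=1.0)
    print(new_w, effective_sample_size(new_w))
```

    A low effective sample size after reweighting signals that few trajectories remain consistent with the data, which is the situation in which resampling is typically triggered.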

  16. Reconciling taxonomy and phylogenetic inference: formalism and algorithms for describing discord and inferring taxonomic roots

    Directory of Open Access Journals (Sweden)

    Matsen Frederick A

    2012-05-01

    Full Text Available Abstract Background Although taxonomy is often used informally to evaluate the results of phylogenetic inference and the root of phylogenetic trees, algorithmic methods to do so are lacking. Results In this paper we formalize these procedures and develop algorithms to solve the relevant problems. In particular, we introduce a new algorithm that solves a "subcoloring" problem to express the difference between a taxonomy and a phylogeny at a given rank. This algorithm improves upon the current best algorithm in terms of asymptotic complexity for the parameter regime of interest; we also describe a branch-and-bound algorithm that saves orders of magnitude in computation on real data sets. We also develop a formalism and an algorithm for rooting phylogenetic trees according to a taxonomy. Conclusions The algorithms in this paper, and the associated freely-available software, will help biologists better use and understand taxonomically labeled phylogenetic trees.

  17. More than one kind of inference: re-examining what's learned in feature inference and classification.

    Science.gov (United States)

    Sweller, Naomi; Hayes, Brett K

    2010-08-01

    Three studies examined how task demands that impact on attention to typical or atypical category features shape the category representations formed through classification learning and inference learning. During training categories were learned via exemplar classification or by inferring missing exemplar features. In the latter condition inferences were made about missing typical features alone (typical feature inference) or about both missing typical and atypical features (mixed feature inference). Classification and mixed feature inference led to the incorporation of typical and atypical features into category representations, with both kinds of features influencing inferences about familiar (Experiments 1 and 2) and novel (Experiment 3) test items. Those in the typical inference condition focused primarily on typical features. Together with formal modelling, these results challenge previous accounts that have characterized inference learning as producing a focus on typical category features. The results show that two different kinds of inference learning are possible and that these are subserved by different kinds of category representations.

  18. Reliable reconstruction of HIV-1 whole genome haplotypes reveals clonal interference and genetic hitchhiking among immune escape variants

    Science.gov (United States)

    2014-01-01

    Background Following transmission, HIV-1 evolves into a diverse population, and next generation sequencing enables us to detect variants occurring at low frequencies. Studying viral evolution at the level of whole genomes was hitherto not possible because next generation sequencing delivers relatively short reads. Results We here provide a proof of principle that whole HIV-1 genomes can be reliably reconstructed from short reads, and use this to study the selection of immune escape mutations at the level of whole genome haplotypes. Using realistically simulated HIV-1 populations, we demonstrate that reconstruction of complete genome haplotypes is feasible with high fidelity. We do not reconstruct all genetically distinct genomes, but each reconstructed haplotype represents one or more of the quasispecies in the HIV-1 population. We then reconstruct 30 whole genome haplotypes from published short sequence reads sampled longitudinally from a single HIV-1 infected patient. We confirm the reliability of the reconstruction by validating our predicted haplotype genes with single genome amplification sequences, and by comparing haplotype frequencies with observed epitope escape frequencies. Conclusions Phylogenetic analysis shows that the HIV-1 population undergoes selection driven evolution, with successive replacement of the viral population by novel dominant strains. We demonstrate that immune escape mutants evolve in a dependent manner with various mutations hitchhiking along with others. As a consequence of this clonal interference, selection coefficients have to be estimated for complete haplotypes and not for individual immune escapes. PMID:24996694

  19. sick: The Spectroscopic Inference Crank

    Science.gov (United States)

    Casey, Andrew R.

    2016-03-01

    There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal

  20. Inferring network structure from cascades

    Science.gov (United States)

    Ghonge, Sushrut; Vural, Dervis Can

    2017-07-01

    Many physical, biological, and social phenomena can be described by cascades taking place on a network. Often, the activity can be empirically observed, but not the underlying network of interactions. In this paper we offer three topological methods to infer the structure of any directed network given a set of cascade arrival times. Our formulas hold for a very general class of models where the activation probability of a node is a generic function of its degree and the number of its active neighbors. We report high success rates for synthetic and real networks, for several different cascade models.

  1. SICK: THE SPECTROSCOPIC INFERENCE CRANK

    Energy Technology Data Exchange (ETDEWEB)

    Casey, Andrew R., E-mail: arc@ast.cam.ac.uk [Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge, CB3 0HA (United Kingdom)

    2016-03-15

    There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal

  2. Inference in hybrid Bayesian networks

    International Nuclear Information System (INIS)

    Langseth, Helge; Nielsen, Thomas D.; Rumi, Rafael; Salmeron, Antonio

    2009-01-01

    Since the 1980s, Bayesian networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability techniques (like fault trees and reliability block diagrams). However, limitations in the BNs' calculation engine have prevented BNs from becoming equally popular for domains containing mixtures of both discrete and continuous variables (the so-called hybrid domains). In this paper we focus on these difficulties, and summarize some of the last decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability.

  3. SICK: THE SPECTROSCOPIC INFERENCE CRANK

    International Nuclear Information System (INIS)

    Casey, Andrew R.

    2016-01-01

    There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal

  4. The analysis of APOL1 genetic variation and haplotype diversity provided by 1000 Genomes project.

    Science.gov (United States)

    Peng, Ting; Wang, Li; Li, Guisen

    2017-08-11

    APOL1 gene variants have been shown to be associated with an increased risk of multiple kinds of diseases, particularly in African Americans, but not in Caucasians and Asians. In this study, we explored the single nucleotide polymorphism (SNP) and haplotype diversity of the APOL1 gene in different races using data provided by the 1000 Genomes Project. Variants of the APOL1 gene in the 1000 Genomes Project were obtained, and SNPs located in the regulatory region or coding region were selected for genetic variation analysis. A total of 2504 individuals from 26 populations were classified into four groups: African, European, Asian and Admixed populations. Tag SNPs were selected to evaluate the haplotype diversity in the four populations with HaploStats software. The APOL1 gene was surrounded by some of the most polymorphic genes in the human genome. Variation in the APOL1 gene was common, with up to 613 SNPs reported by the 1000 Genomes Project, 99 of them (16.2%) with MAF ≥ 1%. There were 79 SNPs in the URR and 92 SNPs in the 3'UTR. In total, 12 SNPs in the URR and 24 SNPs in the 3'UTR were considered common variants with MAF ≥ 1%. It is worth noting that URR-1 showed lower frequencies in European populations, while the other three haplotypes showed the opposite pattern. The 3'UTR presents several high-frequency variation sites in a short segment, and the differences in its haplotypes among populations were significant (P < 0.01): UTR-1 and UTR-5 presented much higher frequencies in the African population, while UTR-2, UTR-3 and UTR-4 were much lower. In the APOL1 coding region, the two higher-frequency SNPs of G1 actually pull down the frequency of haplotype H-1 when all populations are pooled together, and the diversity among the four populations is widened by the two G1 mutations (P1 = 3.33E-4 vs P2 = 3.61E-30). The distributions of APOL1 gene variants and haplotypes were significantly different among the different populations, in both regulatory and coding regions. It could provide
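
    The "common variant" criterion used in the record above (MAF ≥ 1%) can be made concrete with a short sketch. The SNP identifiers and allele counts below are hypothetical illustration data, not values from the study, and the function names are assumptions; only the 1% cut-off and the cohort size of 2504 individuals (5008 chromosomes) come from the abstract.

```python
from typing import Dict, List, Tuple


def minor_allele_frequency(ref_count: int, alt_count: int) -> float:
    """Frequency of the rarer of two alleles among all observed chromosomes."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no alleles observed")
    return min(ref_count, alt_count) / total


def common_variants(allele_counts: Dict[str, Tuple[int, int]],
                    min_maf: float = 0.01) -> List[str]:
    """Return the SNP ids whose minor allele frequency is at least min_maf."""
    return [
        snp for snp, (ref, alt) in allele_counts.items()
        if minor_allele_frequency(ref, alt) >= min_maf
    ]


if __name__ == "__main__":
    # Hypothetical counts over 5008 chromosomes (2504 diploid individuals).
    counts = {
        "snp_a": (5000, 8),     # rare variant
        "snp_b": (4500, 508),   # common variant
        "snp_c": (100, 4908),   # reference allele is the minor one here
    }
    print(common_variants(counts))
```

    Taking the minimum of the two counts matters for sites like the third one, where the reference allele is the rarer allele in the sample.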

  5. HLA-E regulatory and coding region variability and haplotypes in a Brazilian population sample.

    Science.gov (United States)

    Ramalho, Jaqueline; Veiga-Castelli, Luciana C; Donadi, Eduardo A; Mendes-Junior, Celso T; Castelli, Erick C

    2017-11-01

    The HLA-E gene is characterized by low but wide expression on different tissues. HLA-E is considered a conserved gene, being one of the least polymorphic class I HLA genes. The HLA-E molecule interacts with Natural Killer cell receptors and T lymphocyte receptors, and might activate or inhibit immune responses depending on the peptide associated with HLA-E and on which receptors HLA-E interacts with. Variable sites within the HLA-E regulatory and coding segments may influence the gene function by modifying its expression pattern or encoded molecule, thus influencing its interaction with receptors and the peptide. Here we propose an approach to evaluate the gene structure, haplotype pattern and the complete HLA-E variability, including regulatory (promoter and 3'UTR) and coding segments (with introns), by using massively parallel sequencing. We investigated the variability of 420 samples from a very admixed population such as Brazilians by using this approach. Considering a segment of about 7kb, 63 variable sites were detected, arranged into 75 extended haplotypes. We detected 37 different promoter sequences (but few frequent ones), 27 different coding sequences (15 representing new HLA-E alleles) and 12 haplotypes at the 3'UTR segment, two of them presenting a summed frequency of 90%. Despite the number of coding alleles, they encode mainly two different full-length molecules, known as E*01:01 and E*01:03, which correspond to about 90% of all. In addition, differently from what has been previously observed for other non classical HLA genes, the relationship among the HLA-E promoter, coding and 3'UTR haplotypes is not straightforward because the same promoter and 3'UTR haplotypes were many times associated with different HLA-E coding haplotypes. This data reinforces the presence of only two main full-length HLA-E molecules encoded by the many HLA-E alleles detected in our population sample. In addition, this data does indicate that the distal HLA-E promoter is by

  6. Surfing among species, populations and morphotypes: Inferring boundaries between two species of new world silversides (Atherinopsidae).

    Science.gov (United States)

    González-Castro, Mariano; Rosso, Juan José; Mabragaña, Ezequiel; Díaz de Astarloa, Juan Martín

    2016-01-01

    Atherinopsidae are widespread freshwater and shallow marine fish with singular economic importance. Morphological, genetic and life-cycle differences between marine and estuarine populations have already been reported in this family, suggesting ongoing speciation. Coexistence and interbreeding between closely related species have also been documented. The aim of this study was to infer boundaries among: (A) Odontesthes bonariensis and O. argentinensis at species level, and intermediate morphs; (B) the population of O. argentinensis of Mar Chiquita Lagoon and its marine conspecifics. To achieve this, we integrated meristic, geometric morphometric and DNA barcode approaches. Four groups were discriminated and subsequently characterized according to their morphological traits, shape and meristic characters. No shared haplotypes between O. bonariensis and O. argentinensis were found. Significant meristic and body shape differences between the Mar Chiquita and marine individuals of O. argentinensis were found, suggesting they behave as well-differentiated populations, or even incipient ecological species. The fact that the Odontesthes morphotypes shared haplotypes with both O. argentinensis and O. bonariensis, but also possess distinctive meristic and morphometric traits, opens new questions related to the origin of this morphogroup. Copyright © 2015 Académie des sciences. Published by Elsevier SAS. All rights reserved.

  7. Canis mtDNA HV1 database: a web-based tool for collecting and surveying Canis mtDNA HV1 haplotype in public database.

    Science.gov (United States)

    Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung

    2017-06-26

    Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. A total of 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from GenBank to screen and correct the inconsistencies. It also supports users in detecting novel mutation profiles and assigning new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. Of these, 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or no haplotype information and were corrected or modified before being stored in the CHD. The CHD can be accessed at http://chd.vnbiology.com. It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detecting inconsistencies in GenBank and helps identify new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and reconciling existing annotation of HV1 582 bp sequences.
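
    The grouping step described in this abstract (collapsing entries into distinct sequences, then separating previously assigned haplotypes from new candidates) might be sketched as follows. This is a minimal illustration, not the CHD implementation: the function name, the accession labels, the toy sequences, and the haplotype name "A11" are all hypothetical.

```python
from collections import defaultdict


def collapse_entries(entries, known_haplotypes):
    """Group entries by identical sequence and label each distinct sequence.

    entries: iterable of (accession, sequence) pairs.
    known_haplotypes: dict mapping an upper-case sequence to its assigned name.
    Returns a dict: sequence -> {"label": str, "accessions": [str, ...]}.
    """
    groups = defaultdict(list)
    for accession, sequence in entries:
        # Normalise case so that identical sequences collapse together.
        groups[sequence.upper()].append(accession)

    result = {}
    candidate_count = 0
    for sequence in sorted(groups):
        if sequence in known_haplotypes:
            label = known_haplotypes[sequence]
        else:
            # Unknown mutation profile: mark as a candidate for later review.
            candidate_count += 1
            label = f"candidate_{candidate_count}"
        result[sequence] = {"label": label, "accessions": groups[sequence]}
    return result


# Toy data (hypothetical, far shorter than the 582 bp HV1 fragment).
entries = [("ACC1", "ACGT"), ("ACC2", "acgt"), ("ACC3", "ACGA"), ("ACC4", "TCGA")]
known = {"ACGT": "A11"}
collapsed = collapse_entries(entries, known)
```

    Exact string matching is the simplest possible criterion; a real pipeline would presumably also have to align sequences and handle ambiguous bases, which this sketch does not attempt.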

  8. On the Hardness of Topology Inference

    Science.gov (United States)

    Acharya, H. B.; Gouda, M. G.

    Many systems require information about the topology of networks on the Internet, for purposes like management, efficiency, testing of new protocols and so on. However, ISPs usually do not share the actual topology maps with outsiders; thus, in order to obtain the topology of a network on the Internet, a system must reconstruct it from publicly observable data. The standard method employs traceroute to obtain paths between nodes; next, a topology is generated such that the observed paths occur in the graph. However, traceroute has the problem that some routers refuse to reveal their addresses, and appear as anonymous nodes in traces. Previous research on the problem of topology inference with anonymous nodes has demonstrated that it is at best NP-complete. In this paper, we improve upon this result. In our previous research, we showed that in the special case where nodes may be anonymous in some traces but not in all traces (so all node identifiers are known), there exist trace sets that are generable from multiple topologies. This paper extends our theory of network tracing to the general case (with strictly anonymous nodes), and shows that the problem of computing the network that generated a trace set, given the trace set, has no general solution. The weak version of the problem, which allows an algorithm to output a "small" set of networks- any one of which is the correct one- is also not solvable. Any algorithm guaranteed to output the correct topology outputs at least an exponential number of networks. Our results are surprisingly robust: they hold even when the network is known to have exactly two anonymous nodes, and every node as well as every edge in the network is guaranteed to occur in some trace. On the basis of this result, we suggest that exact reconstruction of network topology requires more powerful tools than traceroute.

  9. Inferring the Clonal Structure of Viral Populations from Time Series Sequencing.

    Directory of Open Access Journals (Sweden)

    Donatien F Chedom

    2015-11-01

    Full Text Available RNA virus populations will undergo processes of mutation and selection resulting in a mixed population of viral particles. High throughput sequencing of a viral population subsequently contains a mixed signal of the underlying clones. We would like to identify the underlying evolutionary structures. We utilize two sources of information to attempt this: within-segment linkage information and mutation prevalence. We demonstrate that clone haplotypes, their prevalence, and maximum parsimony reticulate evolutionary structures can be identified, although the solutions may not be unique, even for complete sets of information. This is applied to a chain of influenza infection, where we infer evolutionary structures, including reassortment, and demonstrate some of the difficulties of interpretation that arise from deep sequencing due to artifacts such as template switching during PCR amplification.

  10. Precursors to language: Social cognition and pragmatic inference in primates.

    Science.gov (United States)

    Seyfarth, Robert M; Cheney, Dorothy L

    2017-02-01

    Despite their differences, human language and the vocal communication of nonhuman primates share many features. Both constitute forms of coordinated activity, rely on many shared neural mechanisms, and involve discrete, combinatorial cognition that includes rich pragmatic inference. These common features suggest that during evolution the ancestors of all modern primates faced similar social problems and responded with similar systems of communication and cognition. When language later evolved from this common foundation, many of its distinctive features were already present.

  11. Haplotype Diversity of COI Gene of Hylarana chalconota Species Found at State University of Malang

    Directory of Open Access Journals (Sweden)

    Dian Ratri Wulandari

    2014-01-01

    Full Text Available Hylarana chalconota is a cryptic species of frog endemic to Java Island [1]. This species is small with long legs and brown skin. The Snout-Vent Length (SVL) ranges between 30-40 mm for males and 45-65 mm for females. Reference [4] reports the existence of this species at the State University of Malang, where it was not found in 1995 [5]. Sample #1 displays spots on its skin, which are absent in sample #2. To reveal the haplotype diversity of the COI gene in this species, we analyzed Cytochrome-c oxidase subunit-1 (COI) sequences of both samples. Using a primer pair according to [6], the two samples yielded fragments of 604 bp and 574 bp, respectively. These fragments were polymorphic, with mutations at sites 104, 105, and 124. Based on this result, we suggest that the two samples carry different haplotypes, proposed as UM1 and UM2.
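
    Reporting mutation positions between two samples, as in this abstract, amounts to comparing aligned sequences site by site. The following is a minimal sketch under the assumption that both sequences have already been aligned and trimmed to a common length (the reported fragments differ in length, so some alignment step would be needed first). The function name and the toy sequences are hypothetical.

```python
def polymorphic_sites(seq_a, seq_b):
    """Return 1-based positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Toy aligned sequences (hypothetical, not the COI data from the study).
sample_1 = "ACGTACGTAC"
sample_2 = "ACCTACGAAC"
sites = polymorphic_sites(sample_1, sample_2)
```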

  12. New Cases of Thelazia callipaeda Haplotype 1 in Dogs Suggest a Wider Distribution in Romania.

    Science.gov (United States)

    Ioniţă, Mariana; Mitrea, Ioan Liviu; Ionică, Angela Monica; Morariu, Sorin; Mihalca, Andrei Daniel

    2016-03-01

    Thelazia callipaeda is an emerging vector-borne zoonotic helminth parasitizing the conjunctival sac of a broad spectrum of definitive hosts, such as dogs, cats, rabbits, wild carnivores, and humans. Its presence is associated with mild to severe ocular disease. Here, we report two new clinical cases in dogs originating from western and southern Romania, with no travel history. On clinical examination, the nematodes were retrieved from the conjunctival sac and identified using morphological keys and molecular tools. Twenty-two adult nematodes (8 males, 14 females) were collected and were identified as T. callipaeda by morphology. The molecular analysis revealed a 100% identity with haplotype h1 of T. callipaeda. This study describes the occurrence of new autochthonous cases of thelaziosis in Romania, reinforcing the spreading trend of this zoonotic eyeworm and highlighting the need for increased awareness among medical and veterinary practitioners. Moreover, we provide additional molecular evidence for the exclusive distribution of haplotype 1 of T. callipaeda in Europe.

  13. Identification of a type 1 diabetes-associated CD4 promoter haplotype with high constitutive activity

    DEFF Research Database (Denmark)

    Kristiansen, O P; Karlsen, A E; Larsen, Z M

    2004-01-01

    screened the human CD4 promoter for mutations and identified three frequent single nucleotide polymorphisms (SNPs): CD4-181C/G, CD4-521C/G and CD4-1050T/C. The SNPs are in strong linkage disequilibrium (LD) and association with the CD4-1188(TTTTC)(5-14) alleles, and we observed nine CD4 promoter haplotypes...... promoter activity and (2) the CD4-181G variant encodes higher stimulated promoter activity than the CD4-181C variant. This difference is in part neutralized in the frequently occurring CD4 promoter haplotypes by the more upstream genetic variants. Thus, we report functional impact of a novel CD4-181C/G SNP...

  14. Major soybean maturity gene haplotypes revealed by SNPViz analysis of 72 sequenced soybean genomes.

    Directory of Open Access Journals (Sweden)

    Tiffany Langewisch

    Full Text Available In this Genomics Era, vast amounts of next-generation sequencing data have become publicly available for multiple genomes across hundreds of species. Analyses of these large-scale datasets can become cumbersome, especially when comparing nucleotide polymorphisms across many samples within a dataset and among different datasets or organisms. To facilitate the exploration of allelic variation and diversity, we have developed and deployed in-house software to categorize and visualize these haplotypes. The SNPViz software enables users to analyze region-specific haplotypes from single nucleotide polymorphism (SNP) datasets for different sequenced genomes. The examination of allelic variation and diversity of important soybean [Glycine max (L.) Merr.] flowering time and maturity genes may provide additional insight into flowering time regulation and enhance researchers' ability to target soybean breeding for particular environments. For this study, we utilized two available soybean genomic datasets for a total of 72 soybean genotypes encompassing cultivars, landraces, and the wild species Glycine soja. The major soybean maturity genes E1, E2, E3, and E4 along with the Dt1 gene for plant growth architecture were analyzed in an effort to determine the number of major haplotypes for each gene, to evaluate the consistency of the haplotypes with characterized variant alleles, and to identify evidence of artificial selection. The results indicated classification of a small number of predominant haplogroups for each gene and important insights into possible allelic diversity for each gene within the context of known causative mutations. The software has both a stand-alone and a web-based version and can be used to analyze other genes, examine additional soybean datasets, and view similar genome sequence and SNP datasets from other species.

  15. BMP4 and FGF3 haplotypes increase the risk of tendinopathy in volleyball athletes.

    Science.gov (United States)

    Salles, José Inácio; Amaral, Marcus Vinícius; Aguiar, Diego Pinheiro; Lira, Daisy Anne; Quinelato, Valquiria; Bonato, Letícia Ladeira; Duarte, Maria Eugenia Leite; Vieira, Alexandre Rezende; Casado, Priscila Ladeira

    2015-03-01

    To investigate whether genetic variants can be correlated with tendinopathy in elite male volleyball athletes. Case-control study. Fifteen single nucleotide polymorphisms within the BMP4, FGF3, FGF10 and FGFR1 genes were investigated in 138 elite volleyball athletes, aged between 18 and 35 years, who undergo 4-5 h of training per day: 52 with tendinopathy and 86 with no history of pain suggestive of tendinopathy in the patellar, Achilles, shoulder, and hip abductor tendons. The clinical diagnostic criterion was progressive pain during training, confirmed by magnetic resonance imaging. Genomic DNA was obtained from saliva samples. Genetic markers were genotyped using TaqMan real-time PCR. The chi-square test compared genotype and haplotype differences between groups. Multivariate logistic regression analyzed the significance of covariates and incidence of tendinopathy. Statistical analysis revealed that participant age (p=0.005) and years of practice (p=0.004) were risk factors for tendinopathy. A significant association between BMP4 rs2761884 (p=0.03) and tendinopathy was observed. Athletes with a polymorphic genotype have 2.4 times more susceptibility to tendinopathy (OR=2.39; 95%CI=1.10-5.19). An association between disease and haplotype TTGGA in BMP4 (p=0.01) was also observed. The FGF3 TGGTA haplotype showed a tendency of association with tendinopathy (p=0.05), and so did FGF10 rs900379. FGFR1 showed no association with disease. These findings indicate that haplotypes in the BMP4 and FGF3 genes may contribute to the tendon disease process in elite volleyball athletes. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
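
    For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from a case-control table, a minimal sketch follows. It uses the standard Wald (Woolf) approximation on the log scale, which may differ from the method the authors used (the abstract mentions multivariate logistic regression). Only the group sizes (52 cases, 86 controls) come from the abstract; the split of carriers within each group is invented for illustration and is not the study's data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald (Woolf) confidence interval for a 2x2 table.

    a: cases with the variant      b: cases without the variant
    c: controls with the variant   d: controls without the variant
    z: normal quantile (1.96 gives an approximate 95% interval).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for this approximation")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# Hypothetical counts: 20 of 52 cases and 18 of 86 controls carry the variant.
or_value, ci_low, ci_high = odds_ratio_ci(20, 32, 18, 68)
```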

  16. Mitochondrial haplotypes are not associated with mice selectively bred for high voluntary wheel running.

    Science.gov (United States)

    Wone, Bernard W M; Yim, Won C; Schutz, Heidi; Meek, Thomas H; Garland, Theodore

    2018-04-04

    Mitochondrial haplotypes have been associated with human and rodent phenotypes, including nonshivering thermogenesis capacity, learning capability, and disease risk. Although the mammalian mitochondrial D-loop is highly polymorphic, D-loops in laboratory mice are identical, and variation occurs elsewhere, mainly between nucleotides 9820 and 9830. Part of this region codes for the tRNA Arg gene and is associated with mitochondrial densities and number of mtDNA copies. We hypothesized that the capacity for high levels of voluntary wheel-running behavior would be associated with mitochondrial haplotype. Here, we analyzed the mtDNA polymorphic region in mice from each of four replicate lines selectively bred for 54 generations for high voluntary wheel running (HR) and from four control lines (Control) randomly bred for 54 generations. Sequencing the polymorphic region revealed a variable number of adenine repeats. Single nucleotide polymorphisms (SNPs) varied from 2 to 3 adenine insertions, resulting in three haplotypes. We found significant genetic differentiation between the HR and Control groups (F_ST = 0.779, p ≤ 0.0001), as well as among the replicate lines of mice within groups (F_SC = 0.757, p ≤ 0.0001). Haplotypes, however, were not strongly associated with voluntary wheel running (revolutions run per day), nor with either body mass or litter size. This system provides a useful experimental model to dissect the physiological processes linking mitochondrial, genomic SNPs, epigenetics, or nuclear-mitochondrial cross-talk to exercise activity. Copyright © 2018. Published by Elsevier B.V.

  17. Exploring genetic variation in haplotypes of the filariasis vector Culex quinquefasciatus (Diptera: Culicidae) through DNA barcoding.

    Science.gov (United States)

    Vadivalagan, Chithravel; Karthika, Pushparaj; Murugan, Kadarkarai; Panneerselvam, Chellasamy; Del Serrone, Paola; Benelli, Giovanni

    2017-05-01

    Culex quinquefasciatus (Diptera: Culicidae) is a vector of many pathogens and parasites of humans, as well as domestic and wild animals. In urban and semi-urban areas of Asian countries, Cx. quinquefasciatus is a main vector of nematodes causing lymphatic filariasis. In the African region, it vectors the Rift Valley fever virus, while in the USA it transmits West Nile, St. Louis encephalitis and Western equine encephalitis viruses. In this study, DNA barcoding was used to explore the genetic variation of Cx. quinquefasciatus populations from 88 geographical regions. We present a comprehensive approach analyzing the effectiveness of two gene markers, i.e. CO1 and 16S rRNA. With its high threshold genetic divergence (0.47%), the CO1 gene was found to be an ideal marker for molecular identification of this mosquito vector. Furthermore, null substitutions were fewer in CO1 than in 16S rRNA, which influenced its differentiating potential among Indian haplotypes. The NJ tree was better supported, with higher branch values, for the CO1 gene than for 16S rRNA, indicating ideal genetic differentiation among haplotypes. The TCS haplotype network revealed 14 distinct clusters. Intra- and inter-population polymorphism was calculated among the global and Indian Cx. quinquefasciatus lineages. The genetic diversity index Tajima's D showed negative values for all the 4 intra-population clusters (G2-4, G10). Fu's FS showed a negative value for the G10 cluster, which was significant and indicated recent population expansion. However, G2-G4 (i.e. Indian lineages) had positive values, suggesting a bottleneck effect. Overall, our research is the first to shed light on the genetic differences among the haplotypes of the Cx. quinquefasciatus species complex, adding basic knowledge to the molecular ecology of this important mosquito vector. Copyright © 2017 Elsevier B.V. All rights reserved.
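
    Tajima's D, mentioned in this abstract, compares the mean number of pairwise differences with the number of segregating sites. The sketch below follows the standard formula (Tajima 1989) and assumes aligned, gap-free sequences of equal length; the function name and toy sequences are hypothetical and unrelated to the mosquito data.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Return (pi, S, D) for a list of aligned, equal-length sequences.

    pi: mean number of pairwise differences
    S:  number of segregating (variable) sites
    D:  Tajima's D; NaN when S is zero. Requires at least 4 sequences,
        because the variance term vanishes for smaller samples.
    """
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must have equal length")

    # Segregating sites: alignment columns with more than one state.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)

    # Mean pairwise differences over all n*(n-1)/2 pairs.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    if seg_sites == 0:
        return pi, 0, float("nan")

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    d = (pi - seg_sites / a1) / math.sqrt(variance)
    return pi, seg_sites, d


# Toy alignment in which every variable site is a singleton.
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
pi_value, s_value, d_value = tajimas_d(toy)
```

    An excess of rare variants, as in the toy alignment, pulls D below zero, which is the pattern usually read as consistent with population expansion or purifying selection.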

  18. Cystic fibrosis transmembrane regulator haplotypes in households of patients with cystic fibrosis.

    Science.gov (United States)

    Furgeri, Daniela Tenório; Marson, Fernando Augusto Lima; Correia, Cyntia Arivabeni Araújo; Ribeiro, José Dirceu; Bertuzzo, Carmen Sílvia

    2018-01-30

    Nearly 2000 mutations in the cystic fibrosis transmembrane regulator (CFTR) gene have been reported. The F508del mutation occurs in approximately 50-65% of patients with cystic fibrosis (CF). However, molecular diagnosis is not always possible. Therefore, silent polymorphisms can be used to label the mutant allele in households of patients with CF. To verify the haplotypes of four polymorphisms at the CFTR locus in households of patients with CF for pre-fertilization, pre-implantation, and prenatal indirect mutation diagnosis to provide better genetic counseling for families and patients with CF and to associate the genotypes/haplotypes with the F508del mutation screening. GATT polymorphism analysis was performed using direct polymerase chain reaction amplification, and the MP6-D9, TUB09 and TUB18 polymorphism analyses were performed using restriction fragment length polymorphism. Nine haplotypes were found in 37 CFTR alleles, and of those, 24 were linked with the F508del mutation and 13 with other CFTR mutations. The 6 (GATT), C (MP6-D9), G (TUB09), and C (TUB18) haplotypes showed the highest prevalence (48%) of the mutant CFTR allele and were linked to the F508del mutation (64%). In 43% of households analyzed, at least one informative polymorphism can be used for the indirect diagnostic test. CFTR polymorphisms are genetic markers that are useful for identifying the mutant CFTR alleles in households of patients with CF when it is not possible to establish the complete CFTR genotype. Moreover, the polymorphisms can be used for indirect CFTR mutation identification in cases of pre-fertilization, pre-implantation and prenatal analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Mitochondrial haplotypes influence metabolic traits across bovine inter- and intra-species cybrids

    OpenAIRE

    Wang, Jikun; Xiang, Hai; Liu, Langqing; Kong, Minghua; Yin, Tao; Zhao, Xingbo

    2017-01-01

    In bovine species, mitochondrial DNA polymorphisms and their correlation to productive or reproductive performances have been widely reported across breeds and individuals. However, experimental evidence of this correlation has never been provided. In order to identify differences among bovine mtDNA haplotypes, transmitochondrial cybrids were generated, with the nucleus from MAC-T cell line, derived from a Holstein dairy cow (Bos taurus) and mitochondria from either primary cell line derived ...

  20. Improving preimplantation genetic diagnosis (PGD) reliability by selection of sperm donor with the most informative haplotype.

    Science.gov (United States)

    Malcov, Mira; Gold, Veronica; Peleg, Sagit; Frumkin, Tsvia; Azem, Foad; Amit, Ami; Ben-Yosef, Dalit; Yaron, Yuval; Reches, Adi; Barda, Shimi; Kleiman, Sandra E; Yogev, Leah; Hauser, Ron

    2017-04-26

    This study aimed to describe a novel strategy that increases the accuracy and reliability of PGD in patients using sperm donation by pre-selecting a donor whose haplotype does not overlap the carrier's. A panel of 4-9 informative polymorphic markers, flanking the mutation in carriers of autosomal dominant/X-linked disorders, was tested in the DNA of sperm donors before PGD. Whenever the lengths of the donors' repeats overlapped those of the women, additional donors' DNA samples were analyzed. The donor that showed the least overlap with the patient was selected for IVF. In 8 out of 17 carriers, the markers of the initially chosen donors overlapped the patients' alleles, and 2-8 additional sperm donors for each patient were haplotyped. The selection of additional sperm donors increased the number of informative markers and reduced misdiagnosis risk from 6.00% ± 7.48 to 0.48% ± 0.68. The PGD results were confirmed and no misdiagnosis was detected. Our study demonstrates that pre-selecting a sperm donor whose haplotype has minimal overlap with the female's haplotype is critical for reducing the misdiagnosis risk and ensuring reliable PGD. This strategy may help prevent the transmission of affected IVF-PGD embryos using a simple and economical procedure. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. DNA testing of donors was approved by the institutional Helsinki committee (registration number 319-08TLV, 2008). The present study was approved by the institutional Helsinki committee (registration number 0385-13TLV, 2013).
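
    The donor pre-selection idea (choose the donor whose marker alleles overlap least with the patient's) can be illustrated with a small sketch. It assumes each marker is typed as a pair of repeat lengths and that a marker counts as overlapping when any donor allele equals any patient allele. The marker names, allele lengths, and donor labels are hypothetical, and the scoring rule is a simplification rather than the authors' protocol.

```python
def overlap_count(patient, donor):
    """Count markers at which the donor shares at least one allele with the patient."""
    return sum(
        1 for marker, alleles in patient.items()
        if set(alleles) & set(donor[marker])
    )


def pick_donor(patient, donors):
    """Return the name of the donor with the fewest overlapping markers."""
    return min(donors, key=lambda name: overlap_count(patient, donors[name]))


# Hypothetical repeat lengths (bp) at three markers flanking the mutation.
patient = {"M1": (150, 154), "M2": (201, 205), "M3": (98, 102)}
donors = {
    "donor_A": {"M1": (150, 158), "M2": (205, 209), "M3": (100, 104)},
    "donor_B": {"M1": (152, 156), "M2": (203, 207), "M3": (98, 106)},
    "donor_C": {"M1": (146, 158), "M2": (199, 211), "M3": (94, 106)},
}
best = pick_donor(patient, donors)
```

    Here donor_C shares no allele with the patient at any marker, so every marker would remain informative for distinguishing maternal from paternal alleles in the embryo.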

  1. Insights into HLA-G genetics provided by worldwide haplotype diversity

    Directory of Open Access Journals (Sweden)

    Erick C Castelli

    2014-10-01

    Full Text Available Human Leucocyte Antigen G (HLA-G) belongs to the family of nonclassical HLA class I genes, located within the major histocompatibility complex (MHC). HLA-G has been the target of most recent research regarding the function of class I nonclassical genes. The main features that distinguish HLA-G from classical class I genes are: (a) limited protein variability; (b) alternative splicing generating several membrane-bound and soluble isoforms; (c) a short cytoplasmic tail; (d) modulation of immune response (immune tolerance); (e) expression restricted to certain tissues. In the present work, we describe the HLA-G gene structure and address the HLA-G variability and haplotype diversity among several populations around the world, considering each of its major segments (promoter, coding and 3' untranslated regions). For this purpose, we developed a pipeline to reevaluate the 1000Genomes data and recover miscalled or missing genotypes and haplotypes. It became clear that the overall structure of the HLA-G molecule has been maintained during the evolutionary process and that most of the variable sites found in the HLA-G coding region are either coding synonymous or intronic mutations. In addition, only a few frequent and divergent extended haplotypes are found when the promoter, coding and 3' untranslated regions are evaluated together. The divergence is particularly evident for the regulatory regions. The population comparisons confirmed that most of the HLA-G variability originated before human dispersion from Africa and that the allele and haplotype frequencies have probably been shaped by strong selective pressures.

  2. Fine haplotype structure of a chromosome 17 region in the laboratory and wild mouse

    Czech Academy of Sciences Publication Activity Database

    Trachtulec, Zdeněk; Vlček, Čestmír; Mihola, Ondřej; Gregorová, Soňa; Fotopulosová, Vladana; Forejt, Jiří

    2008-01-01

    Vol. 178, No. 3 (2008), pp. 1777-1784 ISSN 0016-6731 R&D Projects: GA AV ČR IAA5052406; GA ČR GA301/05/0738; GA MŠk(CZ) 1M0520 Institutional research plan: CEZ:AV0Z50520514 Keywords: haplotype * hybrid sterility Subject RIV: EB - Genetics; Molecular Biology Impact factor: 4.002, year: 2008

  3. Y chromosome haplotype diversity of domestic sheep (Ovis aries) in northern Eurasia.

    Science.gov (United States)

    Zhang, Min; Peng, Wei-Feng; Yang, Guang-Li; Lv, Feng-Hua; Liu, Ming-Jun; Li, Wen-Rong; Liu, Yong-Gang; Li, Jin-Quan; Wang, Feng; Shen, Zhi-Qiang; Zhao, Sheng-Guo; Hehua, Eer; Marzanov, Nurbiy; Murawski, Maziek; Kantanen, Juha; Li, Meng-Hua

    2014-12-01

    Variation in two SNPs and one microsatellite on the Y chromosome was analyzed in a total of 663 rams representing 59 breeds from a large geographic range in northern Eurasia. SNPA-oY1 showed the highest allele frequency (91.55%) across the breeds, whereas SNPG-oY1 was present in only 56 samples. Combined genotypes established seven haplotypes (H4, H5, H6, H7, H8, H12 and H19). H6 dominated in northern Eurasia, and H8 showed the second-highest frequency. H4, which had been earlier reported to be absent in European breeds, was detected in one European breed (Swiniarka), whereas H7, which had been previously identified to be unique to European breeds, was present in two Chinese breeds (Ninglang Black and Large-tailed Han), one Buryatian (Transbaikal Finewool) and two Russian breeds (North Caucasus Mutton-Wool and Kuibyshev). H12, which had been detected only in Turkish breeds, was also found in Chinese breeds in this work. An overall low level of haplotype diversity (median h = 0.1288) was observed across the breeds with relatively higher median values in breeds from the regions neighboring the Near Eastern domestication center of sheep. H6 is the dominant haplotype in northwestern and eastern China, in which the haplotype distribution could be explained by the historical translocations of the H4 and H8 Y chromosomes to China via the Mongol invasions followed by expansions to northwestern and eastern China. Our findings extend previous results of sheep Y chromosomal genetic variability and indicate probably recent paternal gene flows between sheep breeds from distinct major geographic regions. © 2014 Stichting International Foundation for Animal Genetics.
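
    The haplotype diversity h quoted in this abstract is commonly computed with Nei's unbiased estimator, h = n/(n-1) · (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sampled chromosomes. The sketch below assumes that estimator (the abstract does not state which one was used) and runs on a made-up breed sample; only the haplotype labels H6 and H8 are taken from the abstract.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_sq_freq)


# Hypothetical breed sample: 8 rams carrying H6 and 2 carrying H8.
sample = ["H6"] * 8 + ["H8"] * 2
h = haplotype_diversity(sample)
```

    For this toy sample h is about 0.356; a breed fixed for a single haplotype gives h = 0.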

  4. Do polymorphisms and haplotypes of mismatch repair genes modulate risk of sporadic colorectal cancer?

    Czech Academy of Sciences Publication Activity Database

    Tulupová, Elena; Kumar, R.; Hánová, Monika; Slyšková, Jana; Pardini, Barbara; Poláková, Veronika; Naccarati, Alessio; Vodičková, Ludmila; Novotný, J.; Halamková, J.; Hemminki, K.; Vodička, Pavel

    2008-01-01

    Vol. 648, No. 1-2 (2008), pp. 40-45 ISSN 0027-5107 R&D Projects: GA ČR GA310/07/1430 Institutional research plan: CEZ:AV0Z50390512; CEZ:AV0Z50390703 Keywords: DNA mismatch repair * Genetic polymorphism * Haplotype analysis Subject RIV: EB - Genetics; Molecular Biology Impact factor: 3.198, year: 2008

  5. Association of ATG16L1 gene haplotype with inflammatory bowel disease in Indians.

    Directory of Open Access Journals (Sweden)

    Srinivasan Pugazhendhi

    Full Text Available Inflammatory bowel disease (IBD) is characterized by multigenic inheritance. Defects in autophagy related genes are considered to show genetic heterogeneity between populations. We evaluated the association of several single nucleotide polymorphisms (SNPs) in the autophagy related 16 like 1 (ATG16L1) gene with IBD in Indians. The ATG16L1 gene was genotyped for ten different SNPs using DNA extracted from peripheral blood of 234 patients with Crohn's disease (CD), 249 patients with ulcerative colitis (UC) and 393 healthy controls. The SNPs rs2241880, rs4663396, rs3792106, rs10210302, rs3792109, rs2241877, rs6737398, rs11682898, rs4663402 and rs4663421 were genotyped using the Sequenom MassArray platform. PLINK was used for the association analysis and pairwise linkage disequilibrium (LD) values. Haplotype analysis was done using Haploview. All SNPs were in Hardy-Weinberg equilibrium in cases and controls. The G allele at rs6737398 exhibited a protective association with both CD and UC. The T allele at rs4663402 and C allele at rs4663421 were positively associated with CD and UC. The T allele at rs2241877 exhibited protective association with UC only. The AA genotype at rs4663402 and the GG genotype at rs4663421 were protectively associated with both CD and UC. Haplotype analysis revealed that all the SNPs were in tight LD (D' = 0.76-1.0) and organized in a single haplotype block. Haplotype D was positively associated with IBD (P = 5.8 × 10^-6 for CD and 0.002 for UC). SNPs in ATG16L1 were associated with IBD in Indian patients. The relevance to management of individual patients requires further study.
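
    The pairwise LD measure D' reported in this abstract normalises the disequilibrium coefficient D by its maximum attainable value given the allele frequencies. A minimal sketch follows, assuming the two-locus haplotype frequency is already known; tools such as Haploview estimate it from unphased genotypes, which this sketch does not attempt. The function name and the example frequencies are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Lewontin's D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return abs(d) / d_max


complete_ld = d_prime(0.5, 0.5, 0.5)    # alleles A and B always travel together
no_ld = d_prime(0.25, 0.5, 0.5)         # haplotype frequency equals the product
partial_ld = d_prime(0.30, 0.4, 0.6)    # intermediate case
```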

  6. SNP frequency, haplotype structure and linkage disequilibrium in elite maize inbred lines

    Directory of Open Access Journals (Sweden)

    Smith Oscar

    2002-10-01

    Full Text Available Abstract Background Recent studies of ancestral maize populations indicate that linkage disequilibrium tends to dissipate rapidly, sometimes within 100 bp. We set out to examine the linkage disequilibrium and diversity in maize elite inbred lines, which have been subject to population bottlenecks and intense selection by breeders. Such population events are expected to increase the amount of linkage disequilibrium, but reduce diversity. The results of this study will inform the design of genetic association studies. Results We examined the frequency and distribution of DNA polymorphisms at 18 maize genes in 36 maize inbreds, chosen to represent most of the genetic diversity in the U.S. elite maize breeding pool. The frequency of nucleotide changes is high, on average one polymorphism per 31 bp in non-coding regions and one polymorphism per 124 bp in coding regions. Insertions and deletions are frequent in non-coding regions (1 per 85 bp), but rare in coding regions. A small number (2–8) of distinct and highly diverse haplotypes can be distinguished at all loci examined. Within genes, SNP loci comprising the haplotypes are in linkage disequilibrium with each other. Conclusions No decline of linkage disequilibrium within a few hundred base pairs was found in the elite maize germplasm. This finding, as well as the small number of haplotypes, relative to neutral expectation, is consistent with the effects of breeding-induced bottlenecks and selection on the elite germplasm pool. The genetic distance between haplotypes is large, indicative of an ancient gene pool and of possible interspecific hybridization events in maize ancestry.

  7. Inferring Social Influence of Anti-Tobacco Mass Media Campaign.

    Science.gov (United States)

    Zhan, Qianyi; Zhang, Jiawei; Yu, Philip S; Emery, Sherry; Xie, Junyuan

    2017-07-01

    Anti-tobacco mass media campaigns are designed to influence tobacco users. Such campaigns have been shown to change the audience's awareness, knowledge, and attitudes, and to produce meaningful behavior change. Anti-smoking television advertising is the most important part of these campaigns. Meanwhile, successful online social networks are creating a new media environment; however, little is known about the relation between social conversations and anti-tobacco campaigns. This paper aims to infer the social influence of these campaigns, and the problem is formally referred to as the Social Influence inference of anti-Tobacco mass mEdia campaigns (Site) problem. To address the Site problem, a novel influence inference framework, TV advertising social influence estimation (Asie), is proposed based on our analysis of two real anti-tobacco campaigns. Asie divides audience attitudes toward TV ads into three distinct stages: 1) cognitive; 2) affective; and 3) conative. Audience online reactions at each of these three stages are depicted by Asie with specific probabilistic models based on the synergistic influences from both online social friends and offline TV ads. Extensive experiments demonstrate the effectiveness of Asie.

  8. Bootstrap-based Support of HGT Inferred by Maximum Parsimony

    Directory of Open Access Journals (Sweden)

    Nakhleh Luay

    2010-05-01

    Background: Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. Results: In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. Conclusions: We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/), and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution.

  9. Bootstrap-based support of HGT inferred by maximum parsimony.

    Science.gov (United States)

    Park, Hyun Jung; Jin, Guohua; Nakhleh, Luay

    2010-05-05

    Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/), and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution.
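
    The bootstrap procedure described in the abstract above (resample the alignment, infer events on each sample, tally support) can be sketched generically as follows. This is a minimal illustration, not the NEPAL implementation: the inference step is passed in as a caller-supplied function, and `infer_events`, `bootstrap_alignment` and `bootstrap_support` are hypothetical names introduced here. The alignment is assumed to be a list of equal-length strings, and resampling is done over columns with replacement, as in the standard nonparametric bootstrap for sequence data.

```python
import random
from collections import Counter


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the same width."""
    n_cols = len(alignment[0])
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


def bootstrap_support(alignment, infer_events, n_samples=100, seed=0):
    """Fraction of bootstrap samples in which each event is inferred.

    infer_events is a placeholder for whatever method maps an alignment to a
    collection of hashable event labels (an assumption of this sketch).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        sample = bootstrap_alignment(alignment, rng)
        counts.update(set(infer_events(sample)))  # count each event once per sample
    return {event: count / n_samples for event, count in counts.items()}


if __name__ == "__main__":
    toy_alignment = ["ACGTACGT", "ACGTACGA", "TCGTACGA"]

    # Toy stand-in for an inference method: report an "event" between the
    # first two sequences whenever they are identical in the resampled data.
    def toy_infer(sample):
        return {"seq0~seq1"} if sample[0] == sample[1] else set()

    print(bootstrap_support(toy_alignment, toy_infer, n_samples=200))
```

    Events with support below a chosen cutoff would then be discarded; the cutoff itself is a modeling decision that the abstract does not specify.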

  10. Association of galanin haplotypes with alcoholism and anxiety in two ethnically distinct populations

    Science.gov (United States)

    Belfer, I; Hipp, H; McKnight, C; Evans, C; Buzas, B; Bollettino, A; Albaugh, B; Virkkunen, M; Yuan, Q; Max, MB; Goldman, D; Enoch, MA

    2009-01-01

    The neuropeptide galanin (GAL) is widely expressed in the central nervous system. Animal studies have implicated GAL in alcohol abuse and anxiety: chronic ethanol intake increases hypothalamic GAL mRNA; high levels of stress increase GAL release in the central amygdala. The coding sequence of the galanin gene, GAL, is highly conserved and a functional polymorphism has not yet been found. The aim of our study was, for the first time, to identify GAL haplotypes and investigate associations with alcoholism and anxiety. Seven single-nucleotide polymorphisms (SNPs) spanning GAL were genotyped in 65 controls from five populations: US and Finnish Caucasians, African Americans, Plains and Southwestern Indians. A single haplotype block with little evidence of historical recombination was observed for each population. Four tag SNPs were then genotyped in DSM-III-R lifetime alcoholics and nonalcoholics from two population isolates: 514 Finnish Caucasian men and 331 Plains Indian men and women. Tridimensional Personality Questionnaire harm avoidance (HA) scores, a dimensional measure of anxiety, were obtained. There was a haplotype association with alcoholism in both the Finnish (P=0.001) and Plains Indian (P=0.004) men. The SNPs were also significantly associated. Alcoholics were divided into high and low HA groups (≥ and alcoholics, low HA alcoholics and nonalcoholics. Our results from two independent populations suggest that GAL may contribute to vulnerability to alcoholism, perhaps mediated by dimensional anxiety. PMID:16314872

  11. Mapping the genetic diversity of HLA haplotypes in the Japanese populations

    Science.gov (United States)

    Saw, Woei-Yuh; Liu, Xuanyao; Khor, Chiea-Chuen; Takeuchi, Fumihiko; Katsuya, Tomohiro; Kimura, Ryosuke; Nabika, Toru; Ohkubo, Takayoshi; Tabara, Yasuharu; Yamamoto, Ken; Yokota, Mitsuhiro; Akiyama, Koichi; Asano, Hiroyuki; Asayama, Kei; Haga, Toshikazu; Hara, Azusa; Hirose, Takuo; Hosaka, Miki; Ichihara, Sahoko; Imai, Yutaka; Inoue, Ryusuke; Ishiguro, Aya; Isomura, Minoru; Isono, Masato; Kamide, Kei; Kato, Norihiro; Katsuya, Tomohiro; Kikuya, Masahiro; Kohara, Katsuhiko; Matsubara, Tatsuaki; Matsuda, Ayako; Metoki, Hirohito; Miki, Tetsuro; Murakami, Keiko; Nabika, Toru; Nakatochi, Masahiro; Ogihara, Toshio; Ohnaka, Keizo; Ohkubo, Takayoshi; Rakugi, Hiromi; Satoh, Michihiro; Shiwaku, Kunihiro; Sugimoto, Ken; Tabara, Yasuharu; Takami, Yoichi; Takayanagi, Ryoichi; Takeuchi, Fumihiko; Tsubota-Utsugi, Megumi; Yamamoto, Ken; Yamamoto, Koichi; Yamasaki, Masayuki; Yasui, Daisaku; Yokota, Mitsuhiro; Teo, Yik-Ying; Kato, Norihiro

    2015-01-01

    Japan has often been viewed as an Asian country that possesses a genetically homogenous community. The basis for partitioning the country into prefectures has largely been geographical, although cultural and linguistic differences still exist between some of the districts/prefectures, especially between Okinawa and the mainland prefectures. The Major Histocompatibility Complex (MHC) region has consistently emerged as the most polymorphic region in the human genome, harbouring numerous biologically important variants; nevertheless, the presence of population-specific long haplotypes hinders the imputation of SNPs and classical HLA alleles. Here, we examined the extent of genetic variation at the MHC between eight Japanese populations sampled from Okinawa and six other prefectures located in or close to the mainland of Japan, focusing specifically on the haplotypes observed within each population and on the impact of any variation on imputation. Our results indicated that Okinawa was genetically more distant from the mainland Japanese than Gujarati Indians were from Tamil Indians, while the mainland Japanese from six prefectures were more homogeneous than northern and southern Han Chinese. The distribution of haplotypes across Japan was similar, although imputation was most accurate for Okinawa and several mainland prefectures when population-specific panels were used as reference. PMID:26648100

  12. Analysis of Case-Control Association Studies: SNPs, Imputation and Haplotypes

    KAUST Repository

    Chatterjee, Nilanjan

    2009-11-01

    Although prospective logistic regression is the standard method of analysis for case-control data, it has been recently noted that in genetic epidemiologic studies one can use the "retrospective" likelihood to gain major power by incorporating various population genetics model assumptions such as Hardy-Weinberg-Equilibrium (HWE), gene-gene and gene-environment independence. In this article we review these modern methods and contrast them with the more classical approaches through two types of applications (i) association tests for typed and untyped single nucleotide polymorphisms (SNPs) and (ii) estimation of haplotype effects and haplotype-environment interactions in the presence of haplotype-phase ambiguity. We provide novel insights to existing methods by construction of various score-tests and pseudo-likelihoods. In addition, we describe a novel two-stage method for analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation followed by a powerful association test based on the retrospective likelihood. We illustrate applications of the methods using simulated and real data. © Institute of Mathematical Statistics, 2009.

  13. Analysis of Case-Control Association Studies: SNPs, Imputation and Haplotypes

    KAUST Repository

    Chatterjee, Nilanjan; Chen, Yi-Hau; Luo, Sheng; Carroll, Raymond J.

    2009-01-01

    Although prospective logistic regression is the standard method of analysis for case-control data, it has been recently noted that in genetic epidemiologic studies one can use the "retrospective" likelihood to gain major power by incorporating various population genetics model assumptions such as Hardy-Weinberg-Equilibrium (HWE), gene-gene and gene-environment independence. In this article we review these modern methods and contrast them with the more classical approaches through two types of applications (i) association tests for typed and untyped single nucleotide polymorphisms (SNPs) and (ii) estimation of haplotype effects and haplotype-environment interactions in the presence of haplotype-phase ambiguity. We provide novel insights to existing methods by construction of various score-tests and pseudo-likelihoods. In addition, we describe a novel two-stage method for analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation followed by a powerful association test based on the retrospective likelihood. We illustrate applications of the methods using simulated and real data. © Institute of Mathematical Statistics, 2009.

  14. More powerful haplotype sharing by accounting for the mode of inheritance.

    Science.gov (United States)

    Ziegler, Andreas; Ewhida, Adel; Brendel, Michael; Kleensang, André

    2009-04-01

    The concept of haplotype sharing (HS) has received considerable attention recently, and several haplotype association methods have been proposed. Here, we extend the work of Beckmann and colleagues [2005 Hum. Hered. 59:67-78] who derived an HS statistic (BHS) as special case of Mantel's space-time clustering approach. The Mantel-type HS statistic correlates genetic similarity with phenotypic similarity across pairs of individuals. While phenotypic similarity is measured as the mean-corrected cross product of phenotypes, we propose to incorporate information of the underlying genetic model in the measurement of the genetic similarity. Specifically, for the recessive and dominant modes of inheritance we suggest the use of the minimum and maximum of shared length of haplotypes around a marker locus for pairs of individuals. If the underlying genetic model is unknown, we propose a model-free HS Mantel statistic using the max-test approach. We compare our novel HS statistics to BHS using simulated case-control data and illustrate its use by re-analyzing data from a candidate region of chromosome 18q from the Rheumatoid Arthritis (RA) Consortium. We demonstrate that our approach is point-wise valid and superior to BHS. In the re-analysis of the RA data, we identified three regions with point-wise P-values<0.005 containing six known genes (PMIP1, MC4R, PIGN, KIAA1468, TNFRSF11A and ZCCHC2) which might be worth follow-up.
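
    The Mantel-type statistic described in the abstract above pairs a genetic similarity with a phenotypic similarity for every pair of individuals. The sketch below is one plausible reading of that description, written under stated assumptions rather than taken from the paper: haplotypes are equal-length strings of marker alleles, the shared length is counted in contiguous identical markers around the locus (not in genetic or physical distance), and the minimum or maximum is taken over the four haplotype comparisons between two diploid individuals. All function names are hypothetical.

```python
from itertools import combinations


def shared_length(h1, h2, locus):
    """Number of contiguous identical markers around `locus` (0 if it differs)."""
    if h1[locus] != h2[locus]:
        return 0
    left = locus
    while left > 0 and h1[left - 1] == h2[left - 1]:
        left -= 1
    right = locus
    while right < len(h1) - 1 and h1[right + 1] == h2[right + 1]:
        right += 1
    return right - left + 1


def genetic_similarity(ind_a, ind_b, locus, mode):
    """Min (recessive) or max (dominant) shared length over haplotype pairs."""
    lengths = [shared_length(ha, hb, locus) for ha in ind_a for hb in ind_b]
    return min(lengths) if mode == "recessive" else max(lengths)


def mantel_hs(individuals, phenotypes, locus, mode="dominant"):
    """Sum over pairs of genetic similarity times mean-corrected cross product."""
    mean_y = sum(phenotypes) / len(phenotypes)
    total = 0.0
    for i, j in combinations(range(len(individuals)), 2):
        g = genetic_similarity(individuals[i], individuals[j], locus, mode)
        total += g * (phenotypes[i] - mean_y) * (phenotypes[j] - mean_y)
    return total


if __name__ == "__main__":
    # Each individual is a pair of haplotypes; phenotypes are 1 = case, 0 = control.
    people = [("AAAA", "AAAA"), ("AAAA", "AABA"), ("BBBB", "BBBB")]
    status = [1, 1, 0]
    print(mantel_hs(people, status, locus=1, mode="dominant"))
    print(mantel_hs(people, status, locus=1, mode="recessive"))
```

    Significance of such a statistic is usually assessed by permuting phenotypes; the max-test variant mentioned in the abstract, which combines several genetic models, is not covered by this sketch.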

  15. Application of the Linux cluster for exhaustive window haplotype analysis using the FBAT and Unphased programs.

    Science.gov (United States)

    Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun

    2008-05-28

    Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and subsequently high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing commonly-used statistic packages for genetic studies are non-parallel versions. Alternatively, one may use the cutting-edge technology of grid computing and its packages to conduct non-parallel genetic statistical packages on a centralized HPC system or distributed computing systems. In this paper, we report the utilization of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Analysis of both consecutive and combinational window haplotypes was conducted by the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute-nodes, FBAT jobs performed about 14.4-15.9 times faster, while Unphased jobs performed 1.1-18.6 times faster compared to the accumulated computation duration. Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance.
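
    The speedup figures in the abstract above compare wall-clock time on the cluster with the accumulated duration of the individual jobs. A small sketch of that arithmetic follows, assuming speedup is the ratio of the two and that parallel efficiency is speedup divided by the number of compute nodes; the function names are hypothetical. With 22 nodes, the reported FBAT speedups of 14.4 and 15.9 correspond to efficiencies of roughly 0.65 and 0.72.

```python
def speedup(accumulated_seconds, wall_clock_seconds):
    """Ratio of summed per-job run time to elapsed time on the cluster."""
    if wall_clock_seconds <= 0:
        raise ValueError("wall_clock_seconds must be positive")
    return accumulated_seconds / wall_clock_seconds


def parallel_efficiency(speedup_value, n_nodes):
    """Speedup per compute node; 1.0 would be ideal linear scaling."""
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    return speedup_value / n_nodes


if __name__ == "__main__":
    # Speedups and node count taken from the abstract above.
    for s in (14.4, 15.9):
        print(s, round(parallel_efficiency(s, 22), 2))
```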

  16. Effect of genetic type and casein haplotype on antioxidant activity of yogurts during storage.

    Science.gov (United States)

    Perna, A; Intaglietta, I; Simonetti, A; Gambacorta, E

    2013-06-01

    The aim of this work was to investigate the antioxidant activity of yogurt made from the milk of 2 breeds-Italian Brown and Italian Holstein-characterized by different casein haplotypes (αS1-, β-, and κ-caseins) during storage up to 15 d. The casein haplotype was determined by isoelectric focusing; antioxidant activity of yogurt was measured using 2,2'-azino-bis-(3-ethylbenzothiazoline-6-sulfonic acid). The statistical analysis showed a significant effect of the studied factors. Antioxidant activity increased during storage of both yogurt types, but yogurt produced with Italian Brown milk showed higher antioxidant activity than those produced with Italian Holstein milk. A high scavenging activity was present in yogurts with the allelic combination of BB-A(2)A(2)-BB. The results of this study suggest that the genetic type and the haplotype make a significant contribution in the production of yogurts with high nutraceutical value. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  17. Different origin and dispersal of sulfadoxine-resistant Plasmodium falciparum haplotypes between Eastern Africa and Democratic Republic of Congo

    DEFF Research Database (Denmark)

    Baraka, Vito; Delgado-Ratto, Christopher; Nag, Sidsel

    2017-01-01

    Sulfadoxine/pyrimethamine (SP) is still used for malaria control in sub-Saharan Africa; however, widespread resistance is a major concern. This study aimed to determine the dispersal and origin of sulfadoxine resistance lineages in the Democratic Republic of the Congo compared with East African.......3 and 7.7 kb) flanking the Pfdhps gene were assayed. Evolutionary analysis revealed a shared origin of Pfdhps haplotypes in East Africa, with a distinct population clustering in DR Congo. Furthermore, in Tanzania there was an independent distinct origin of Pfdhps SGEGA resistant haplotype. In Uganda...... and Tanzania, gene flow patterns contribute to the dispersal and shared origin of parasites carrying double- and triple-mutant Pfdhps haplotypes associated with poor outcomes of intermittent preventive treatment during pregnancy using SP (IPTp-SP). However, the origins of the Pfdhps haplotypes in DR Congo...

  18. Lower complexity bounds for lifted inference

    DEFF Research Database (Denmark)

    Jaeger, Manfred

    2015-01-01

    instances of the model. Numerous approaches for such “lifted inference” techniques have been proposed. While it has been demonstrated that these techniques will lead to significantly more efficient inference on some specific models, there are only very recent and still quite restricted results that show...... the feasibility of lifted inference on certain syntactically defined classes of models. Lower complexity bounds that imply some limitations for the feasibility of lifted inference on more expressive model classes were established earlier in Jaeger (2000; Jaeger, M. 2000. On the complexity of inference about...... that under the assumption that NETIME≠ETIME, there is no polynomial lifted inference algorithm for knowledge bases of weighted, quantifier-, and function-free formulas. Further strengthening earlier results, this is also shown to hold for approximate inference and for knowledge bases not containing...

  19. The interaction between coagulation factor 2 receptor and interleukin 6 haplotypes increases the risk of myocardial infarction in men.

    Directory of Open Access Journals (Sweden)

    Bruna Gigante

    The aim of the study was to investigate if the interaction between the coagulation factor 2 receptor (F2R) and the interleukin 6 (IL6) haplotypes modulates the risk of myocardial infarction (MI) in the Stockholm Heart Epidemiology Program (SHEEP). Seven SNPs at the F2R locus and three SNPs at the IL6 locus were genotyped. Haplotypes and haplotype pairs (IL6*F2R) were generated. A logistic regression analysis was performed to analyze the association of the haplotypes and haplotype pairs with the MI risk. Presence of an interaction between the two haplotypes in each haplotype pair was calculated using two different methods: the statistical, on a multiplicative scale, which includes the cross product of the two factors in the logistic regression model; and the biological, on an additive scale, which evaluates the relative risk associated with the joint presence of both factors. The ratio between the observed and the predicted effect of the joint exposure, the synergy index (S), indicates the presence of a synergy (S>1) or of an antagonism (S<1). None of the haplotypes within the two loci was associated with the risk of MI. Out of 22 different haplotype pairs, the haplotype pair 17 GGG*ADGTCCT was associated with an increased risk of MI with an OR (95%CI) of 1.58 (1.05-2.41) (p = 0.02) in the crude and an OR of 1.72 (1.11-2.67) (p = 0.01) in the adjusted analysis. We observed the presence of an interaction on a multiplicative scale with an OR (95%CI) of 2.24 (1.27-3.95) (p = 0.005) and a slight interactive effect between the two haplotypes on an additive scale with an OR (95%CI) of 1.56 (1.02-2.37) (p = 0.03) and S of 1.66 (0.89-31). In conclusion, our results support the hypothesis that the interaction between these two functionally related genes may influence the risk of MI and suggest new mechanisms involved in the genetic susceptibility to MI.
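
    The two interaction scales contrasted in the abstract above can be illustrated with a short calculation. The sketch below uses the usual definition of the synergy index, S = (RR11 - 1) / ((RR10 - 1) + (RR01 - 1)), where RR11 is the relative risk for joint exposure and RR10 and RR01 are the relative risks for each factor alone, together with the multiplicative ratio RR11 / (RR10 * RR01). The function names and the example numbers are hypothetical; they are not taken from the study, and odds ratios are assumed to stand in for relative risks.

```python
def synergy_index(rr_joint, rr_first_only, rr_second_only):
    """Additive-scale interaction: observed vs. predicted excess risk.

    S > 1 suggests synergy, S < 1 antagonism, S = 1 no additive interaction.
    """
    predicted_excess = (rr_first_only - 1.0) + (rr_second_only - 1.0)
    if predicted_excess == 0:
        raise ValueError("S is undefined when neither factor alone changes risk")
    return (rr_joint - 1.0) / predicted_excess


def multiplicative_interaction(rr_joint, rr_first_only, rr_second_only):
    """Multiplicative-scale interaction: joint effect over product of single effects."""
    return rr_joint / (rr_first_only * rr_second_only)


if __name__ == "__main__":
    # Made-up relative risks, for illustration only.
    print(synergy_index(3.0, 1.5, 1.5))
    print(multiplicative_interaction(3.0, 1.5, 1.5))
```

    The two measures can disagree: a joint effect can exceed the sum of the individual excess risks while still falling short of, or going beyond, their product, which is why the abstract reports both.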

  20. Statistical inference for financial engineering

    CERN Document Server

    Taniguchi, Masanobu; Ogata, Hiroaki; Taniai, Hiroyuki

    2014-01-01

    This monograph provides the fundamentals of statistical inference for financial engineering and covers some selected methods suitable for analyzing financial time series data. In order to describe the actual financial data, various stochastic processes, e.g. non-Gaussian linear processes, non-linear processes, long-memory processes, locally stationary processes etc. are introduced and their optimal estimation is considered as well. This book also includes several statistical approaches, e.g., discriminant analysis, the empirical likelihood method, control variate method, quantile regression, realized volatility etc., which have been recently developed and are considered to be powerful tools for analyzing the financial data, establishing a new bridge between time series and financial engineering. This book is well suited as a professional reference book on finance, statistics and statistical financial engineering. Readers are expected to have an undergraduate-level knowledge of statistics.

  1. Type inference for correspondence types

    DEFF Research Database (Denmark)

    Hüttel, Hans; Gordon, Andy; Hansen, Rene Rydhof

    2009-01-01

    We present a correspondence type/effect system for authenticity in a π-calculus with polarized channels, dependent pair types and effect terms and show how one may, given a process P and an a priori type environment E, generate constraints that are formulae in the Alternating Least Fixed......-Point (ALFP) logic. We then show how a reasonable model of the generated constraints yields a type/effect assignment such that P becomes well-typed with respect to E if and only if this is possible. The formulae generated satisfy a finite model property; a system of constraints is satisfiable if and only...... if it has a finite model. As a consequence, we obtain the result that type/effect inference in our system is polynomial-time decidable....

  2. Causal inference in public health.

    Science.gov (United States)

    Glass, Thomas A; Goodman, Steven N; Hernán, Miguel A; Samet, Jonathan M

    2013-01-01

    Causal inference has a central role in public health; the determination that an association is causal indicates the possibility for intervention. We review and comment on the long-used guidelines for interpreting evidence as supporting a causal association and contrast them with the potential outcomes framework that encourages thinking in terms of causes that are interventions. We argue that in public health this framework is more suitable, providing an estimate of an action's consequences rather than the less precise notion of a risk factor's causal effect. A variety of modern statistical methods adopt this approach. When an intervention cannot be specified, causal relations can still exist, but how to intervene to change the outcome will be unclear. In application, the often-complex structure of causal processes needs to be acknowledged and appropriate data collected to study them. These newer approaches need to be brought to bear on the increasingly complex public health challenges of our globalized world.

  3. Bayesian inference for hybrid discrete-continuous stochastic kinetic models

    International Nuclear Information System (INIS)

    Sherlock, Chris; Golightly, Andrew; Gillespie, Colin S

    2014-01-01

    We consider the problem of efficiently performing simulation and inference for stochastic kinetic models. Whilst it is possible to work directly with the resulting Markov jump process (MJP), computational cost can be prohibitive for networks of realistic size and complexity. In this paper, we consider an inference scheme based on a novel hybrid simulator that classifies reactions as either ‘fast’ or ‘slow’ with fast reactions evolving as a continuous Markov process whilst the remaining slow reaction occurrences are modelled through a MJP with time-dependent hazards. A linear noise approximation (LNA) of fast reaction dynamics is employed and slow reaction events are captured by exploiting the ability to solve the stochastic differential equation driving the LNA. This simulation procedure is used as a proposal mechanism inside a particle MCMC scheme, thus allowing Bayesian inference for the model parameters. We apply the scheme to a simple application and compare the output with an existing hybrid approach and also a scheme for performing inference for the underlying discrete stochastic model. (paper)

  4. Evidence that the ancestral haplotype in Australian hemochromatosis patients may be associated with a common mutation in the gene.

    OpenAIRE

    Crawford, D H; Powell, L W; Leggett, B A; Francis, J S; Fletcher, L M; Webb, S I; Halliday, J W; Jazwinska, E C

    1995-01-01

    Hemochromatosis (HC) is a common inherited disorder of iron metabolism for which neither the gene nor biochemical defect have yet been identified. The aim of this study was to look for clinical evidence that the predominant ancestral haplotype in Australian patients is associated with a common mutation in the gene. We compared indices of iron metabolism and storage in three groups of HC patients categorized according to the presence of the ancestral haplotype (i.e., patients with two copies, ...

  5. Mitochondrial DNA haplotype distribution patterns in Pinus ponderosa (Pinaceae): range-wide evolutionary history and implications for conservation.

    Science.gov (United States)

    Potter, Kevin M; Hipkins, Valerie D; Mahalovich, Mary F; Means, Robert E

    2013-08-01

    Ponderosa pine (Pinus ponderosa Douglas ex P. Lawson & C. Lawson) exhibits complicated patterns of morphological and genetic variation across its range in western North America. This study aims to clarify P. ponderosa evolutionary history and phylogeography using a highly polymorphic mitochondrial DNA marker, with results offering insights into how geographical and climatological processes drove the modern evolutionary structure of tree species in the region. We amplified the mtDNA nad1 second intron minisatellite region for 3,100 trees representing 104 populations, and sequenced all length variants. We estimated population-level haplotypic diversity and determined diversity partitioning among varieties, races and populations. After aligning sequences of minisatellite repeat motifs, we evaluated evolutionary relationships among haplotypes. The geographical structuring of the 10 haplotypes corresponded with division between Pacific and Rocky Mountain varieties. Pacific haplotypes clustered with high bootstrap support, and appear to have descended from Rocky Mountain haplotypes. A greater proportion of diversity was partitioned between Rocky Mountain races than between Pacific races. Areas of highest haplotypic diversity were the southern Sierra Nevada mountain range in California, northwestern California, and southern Nevada. Pinus ponderosa haplotype distribution patterns suggest a complex phylogeographic history not revealed by other genetic and morphological data, or by the sparse paleoecological record. The results appear consistent with long-term divergence between the Pacific and Rocky Mountain varieties, along with more recent divergences not well-associated with race. Pleistocene refugia may have existed in areas of high haplotypic diversity, as well as the Great Basin, Southwestern United States/northern Mexico, and the High Plains.

  6. Polymorphic haplotypes on R408W PKU and normal PAH chromosomes in Quebec and European populations

    Energy Technology Data Exchange (ETDEWEB)

    Byck, S.; Morgan, K.; Scriver, C.R. [McGill Univ., Montreal (Canada)] [and others

    1994-09-01

    The R408W mutation in the phenylalanine hydroxylase gene (PAH) is associated with haplotype 2.3 (RFLP haplotype 2, VNTR 3 of the HindIII system) in most European populations. Another chromosome, first observed in Quebec and then in northwest Europe, carries R408W on haplotype 1.8. The occurrence of the R408W mutation on two different PKU chromosomes could be the result of intragenic recombination, recurrent mutation or gene conversion. In this study, we analyzed both normal and R408W chromosomes carrying 1.8 and 2.3 haplotypes in Quebec and European populations; we used the TCTA(n) short tandem repeat sequence (STR) at the 5' end of the PAH gene and the HindIII VNTR system at the 3' end of the PAH gene to characterize chromosomes. Fourteen of sixteen R408W chromosomes from "Celtic" families in Quebec and the United Kingdom (UK) harbor a 244 bp STR allele; the remaining two chromosomes carry a 240 bp or 248 bp STR allele. Normal chromosomes (n=18) carry the 240 bp STR allele. R408W chromosomes are different from mutant H1.8 chromosomes; mutant H2.3 carries the 240 bp STR allele (14 of 16 chromosomes) or the 236 allele (2 of 16 chromosomes). The HindIII VNTR comprises variable numbers of 30 bp repeats (cassettes); the repeats also vary in nucleotide sequence. Variation clusters toward the 3' end of cassettes and VNTRs. VNTR 3 alleles on normal H2 (n=9) and mutant R408W H2 (n=19) chromosomes were identical. VNTR 8 alleles on normal H1 chromosomes (n=9) and on R408W H1 chromosomes (n=15) differ by 1 bp substitution near the 3' end of the 6th cassette. In summary, the mutant H1.8 chromosome harboring the R408W mutation has unique features at both the 5' and 3' end of the gene that distinguish it from the mutant H2.3 and normal H1.8 and H2.3 counterparts. The explanation for the occurrence of R408W on two different PAH haplotypes is recurrent mutation affecting the CpG dinucleotide in PAH codon 408.

  7. Evidence that the ancestral haplotype in Australian hemochromatosis patients may be associated with a common mutation in the gene

    Energy Technology Data Exchange (ETDEWEB)

    Crawford, D.H.G.; Powell, L.W.; Leggett, B.A. [Univ. of Queensland (Australia)] [and others

    1995-08-01

    Hemochromatosis (HC) is a common inherited disorder of iron metabolism for which neither the gene nor biochemical defect have yet been identified. The aim of this study was to look for clinical evidence that the predominant ancestral haplotype in Australian patients is associated with a common mutation in the gene. We compared indices of iron metabolism and storage in three groups of HC patients categorized according to the presence of the ancestral haplotype (i.e., patients with two copies, one copy, and no copies of the ancestral haplotype). We also examined iron indices in two groups of HC heterozygotes (those with the ancestral haplotype and those without) and in age-matched controls. These analyses indicate that (i) HC patients with two copies of the ancestral haplotype show significantly more severe expression of the disorder than those with one copy or those without, (ii) HC heterozygotes have partial clinical expression, which may be influenced by the presence of the ancestral haplotype in females but not in males, and (iii) the high population frequency of the HC gene may be the result of the selective advantage conferred by protecting heterozygotes against iron deficiency. 18 refs., 3 tabs.

  8. Association of Xmn I Polymorphism and Hemoglobin E Haplotypes on Postnatal Gamma Globin Gene Expression in Homozygous Hemoglobin E

    Directory of Open Access Journals (Sweden)

    Supachai Ekwattanakit

    2012-01-01

    Background and Objectives. To explore the role of cis-regulatory sequences within the β globin gene cluster at chromosome 11 on human γ globin gene expression related to the Hb E allele, we analyzed baseline hematological data and Hb F values together with β globin haplotypes in homozygous Hb E. Patients and Methods. 80 individuals with molecularly confirmed homozygous Hb E were analyzed for the β globin haplotypes and Xmn I polymorphism using PCR-RFLPs. 74 individuals with complete laboratory data were further studied for association analyses. Results. Eight different β globin haplotypes were found linked to Hb E alleles; three major haplotypes were (a) III, (b) V, and (c) IV, accounting for 94% of Hb E chromosomes. A new haplotype (Th-1) was identified and was most likely converted from the major ones. The majority of individuals had Hb F < 5%; only 10.8% of homozygous Hb E had high Hb F (average 10.5%, range 5.8–14.3%). No association was found with a specific haplotype or Xmn I in these individuals with high Hb F, measured by alkaline denaturation. Conclusion. The cis-regulation of γ globin gene expression might not be apparent under a milder condition with lesser globin imbalance such as homozygous Hb E.

  9. ABCB1 haplotype and OPRM1 118A > G genotype interaction in methadone maintenance treatment pharmacogenetics

    Directory of Open Access Journals (Sweden)

    Barratt DT

    2012-04-01

    Daniel T Barratt [1], Janet K Coller [1], Richard Hallinan [2], Andrew Byrne [2], Jason M White [1], David JR Foster [3], Andrew A Somogyi [1,4]. [1] Discipline of Pharmacology, School of Medical Sciences, University of Adelaide, Adelaide, South Australia; [2] The Byrne Surgery, Specialist Drug and Alcohol Practice, Redfern, New South Wales; [3] Division of Health Sciences, Sansom Institute, School of Pharmacy and Medical Sciences, University of South Australia, Adelaide, South Australia; [4] Department of Clinical Pharmacology, Royal Adelaide Hospital, Adelaide, South Australia, Australia. Background: Genetic variability in ABCB1, encoding the P-glycoprotein efflux transporter, has been linked to altered methadone maintenance treatment dose requirements. However, subsequent studies have indicated that additional environmental or genetic factors may confound ABCB1 pharmacogenetics in different methadone maintenance treatment settings. There is evidence that genetic variability in OPRM1, encoding the mu opioid receptor, and ABCB1 may interact to affect morphine response in opposite ways. This study aimed to examine whether a similar gene-gene interaction occurs for methadone in methadone maintenance treatment. Methods: Opioid-dependent subjects (n = 119) maintained on methadone (15–300 mg/day) were genotyped for five single nucleotide polymorphisms of ABCB1 (61A > G; 1199G > A; 1236C > T; 2677G > T; 3435C > T), as well as for the OPRM1 118A > G single nucleotide polymorphism. Subjects' methadone doses and trough plasma (R)-methadone concentrations (Ctrough) were compared between ABCB1 haplotypes (with and without controlling for OPRM1 genotype), and between OPRM1 genotypes (with and without controlling for ABCB1 haplotype). Results: Among wild-type OPRM1 subjects, an ABCB1 variant haplotype group (subjects with a wild-type and 61A:1199G:1236C:2677T:3435T haplotype combination, or homozygous for the 61A:1199G:1236C:2677T:3435T haplotype) had significantly lower doses (median ± standard

  10. The Association of DRD2 with Insight Problem Solving.

    Science.gov (United States)

    Zhang, Shun; Zhang, Jinghuan

    2016-01-01

    Although the insight phenomenon has attracted great attention from psychologists, it is still largely unknown whether its variation in well-functioning human adults has a genetic basis. Several lines of evidence suggest that genes involved in dopamine (DA) transmission might be potential candidates. The present study explored for the first time the association of the dopamine D2 receptor gene (DRD2) with insight problem solving. Fifteen single-nucleotide polymorphisms (SNPs) covering DRD2 were genotyped in 425 unrelated healthy Chinese undergraduates, and were further tested for association with insight problem solving. Both single SNP and haplotype analysis revealed several associations of DRD2 SNPs and haplotypes with insight problem solving. In conclusion, the present study provides the first evidence for the involvement of DRD2 in insight problem solving; future studies are necessary to validate these findings.

  11. Maximum Likelihood Method for Predicting Environmental Conditions from Assemblage Composition: The R Package bio.infer

    Directory of Open Access Journals (Sweden)

    Lester L. Yuan

    2007-06-01

    This paper provides a brief introduction to the R package bio.infer, a set of scripts that facilitates the use of maximum likelihood (ML) methods for predicting environmental conditions from assemblage composition. Environmental conditions can often be inferred from only biological data, and these inferences are useful when other sources of data are unavailable. ML prediction methods are statistically rigorous and applicable to a broader set of problems than more commonly used weighted averaging techniques. However, ML methods require a substantially greater investment of time to program algorithms and to perform computations. This package is designed to reduce the effort required to apply ML prediction methods.

  12. Bayesian inference in probabilistic risk assessment-The current state of the art

    International Nuclear Information System (INIS)

    Kelly, Dana L.; Smith, Curtis L.

    2009-01-01

    Markov chain Monte Carlo (MCMC) approaches to sampling directly from the joint posterior distribution of aleatory model parameters have led to tremendous advances in Bayesian inference capability in a wide variety of fields, including probabilistic risk analysis. The advent of freely available software coupled with inexpensive computing power has catalyzed this advance. This paper examines where the risk assessment community is with respect to implementing modern computation-based Bayesian approaches to inference. Through a series of examples in different topical areas, it introduces salient concepts and illustrates the practical application of Bayesian inference via MCMC sampling to a variety of important problems.
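
    As a rough illustration of the kind of MCMC sampling this record refers to, the sketch below runs a random-walk Metropolis sampler for a single failure rate. It is not taken from the paper: the Poisson likelihood, the gamma prior, the event count, the exposure time, the step size and the names `log_posterior` and `metropolis` are all assumptions chosen so that the result can be checked against the known conjugate posterior mean.

```python
import math
import random


def log_posterior(rate, events, exposure, shape, inv_scale):
    """Unnormalised log posterior for a failure rate.

    Assumed model (not from the record): Poisson count of `events` over
    `exposure` time units, with a Gamma(shape, inv_scale) prior on the rate.
    """
    if rate <= 0.0:
        return -math.inf
    log_prior = (shape - 1.0) * math.log(rate) - inv_scale * rate
    log_likelihood = events * math.log(rate) - rate * exposure
    return log_prior + log_likelihood


def metropolis(events, exposure, shape=0.5, inv_scale=1.0,
               n_samples=20000, step=0.002, seed=0):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    # Start at the analytical posterior mean so no burn-in is needed here.
    rate = (shape + events) / (inv_scale + exposure)
    current = log_posterior(rate, events, exposure, shape, inv_scale)
    samples = []
    for _ in range(n_samples):
        proposal = rate + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, events, exposure, shape, inv_scale)
        # Accept with probability min(1, posterior ratio); non-positive
        # proposals have log posterior -inf and are always rejected.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            rate, current = proposal, candidate
        samples.append(rate)
    return samples


# Hypothetical data: 3 failures observed in 1000 hours of operation.
samples = metropolis(events=3, exposure=1000.0)
mcmc_mean = sum(samples) / len(samples)
exact_mean = (0.5 + 3) / (1.0 + 1000.0)  # conjugate gamma posterior mean
```

    Because this toy model is conjugate, `mcmc_mean` can be compared directly with `exact_mean`; in realistic risk models no closed form is available, which is where sampling becomes useful.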

  13. Sparse linear models: Variational approximate inference and Bayesian experimental design

    International Nuclear Information System (INIS)

    Seeger, Matthias W

    2009-01-01

    A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attention has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.

  14. Inference algorithms and learning theory for Bayesian sparse factor analysis

    International Nuclear Information System (INIS)

    Rattray, Magnus; Sharp, Kevin; Stegle, Oliver; Winn, John

    2009-01-01

    Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.

  15. Sparse linear models: Variational approximate inference and Bayesian experimental design

    Energy Technology Data Exchange (ETDEWEB)

    Seeger, Matthias W [Saarland University and Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbruecken (Germany)

    2009-12-01

    A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attention has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.

  16. Inference algorithms and learning theory for Bayesian sparse factor analysis

    Energy Technology Data Exchange (ETDEWEB)

    Rattray, Magnus; Sharp, Kevin [School of Computer Science, University of Manchester, Manchester M13 9PL (United Kingdom); Stegle, Oliver [Max-Planck-Institute for Biological Cybernetics, Tuebingen (Germany); Winn, John, E-mail: magnus.rattray@manchester.ac.u [Microsoft Research Cambridge, Roger Needham Building, Cambridge, CB3 0FB (United Kingdom)

    2009-12-01

    Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.

  17. Cis-acting mutation and duplication: History of molecular evolution in a P450 haplotype responsible for insecticide resistance in Culex quinquefasciatus.

    Science.gov (United States)

    Itokawa, Kentaro; Komagata, Osamu; Kasai, Shinji; Masada, Masahiro; Tomita, Takashi

    2011-07-01

    A cytochrome P450 gene, Cyp9m10, is more than 200-fold overexpressed in a pyrethroid-resistant strain of Culex quinquefasciatus, JPal-per. The haplotype of this strain contains two copies of Cyp9m10 resulting from recent tandem duplication. In this study, we discovered and isolated a Cyp9m10 haplotype closely related to this duplicated Cyp9m10 haplotype from JHB, a strain used for the recent genome project for this mosquito species. The isolated haplotype (JHB-NIID-B haplotype) shared the same insertion of a transposable element upstream of the coding region with the JPal-per strain but was not duplicated. The JHB-NIID-B haplotype was considered to have diverged from the JPal-per lineage just before the duplication event. Cyp9m10 was moderately overexpressed in larvae with the JHB-NIID-B haplotype. The overexpression in the JHB-NIID-B and JPal-per haplotypes was developmentally regulated in a similar pattern, indicating that both haplotypes share a common cis-acting mutation responsible for the overexpression. The isolated, moderately overexpressed haplotype conferred resistance; however, its efficacy was relatively small. We hypothesized that the first cis-acting mutation modified the consequence of the subsequent duplication in the JPal-per lineage to confer a stronger phenotypic effect than if it had occurred before the first cis-acting mutation.

  18. KIR And HLA Haplotype Analysis in a Family Lacking The KIR 2DL1-2DP1 Genes

    Directory of Open Access Journals (Sweden)

    Vojvodić Svetlana

    2015-06-01

    The killer cell immunoglobulin-like receptor (KIR) gene cluster exhibits extensive allelic and haplotypic diversity that is observed both as presence/absence of genes, resulting in expansion and contraction of KIR haplotypes, and as allelic variation of individual KIR genes. We report a case of KIR pseudogene 2DP1 and 2DL1 gene absence in members of one family with the children suffering from acute myelogenous leukemia (AML). Killer cell immunoglobulin-like receptor low resolution genotyping was performed by the polymerase chain reaction (PCR)-sequence-specific primers (SSP)/sequence-specific oligonucleotide (SSO) method and haplotype assignment was done by gene content analysis. Both parents and the maternal grandfather shared the same Cen-B2 KIR haplotype, containing KIR 3DL3, -2DS2, -2DL2 and -3DP1 genes. The second haplotype in the KIR genotype of the mother and grandfather was Tel-A1 with KIR 2DL4 (normal and deleted variant), -3DL1, the 22 bp deletion variant of the 2DS4 gene and -3DL2, while the second haplotype in the KIR genotype of the father was Tel-B1 with 2DL4 (normal variant), -3DS1, -2DL5, -2DS5, -2DS1 and 3DL2 genes. Haplotype analysis in all three offspring revealed that the children inherited the Cen-B2 haplotype with the same gene content but two of the children inherited a deleted variant of the 2DL4 gene, while the third child inherited a normal one. The second haplotype of all three offspring contained KIR 2DL4, -2DL5, -2DS1, -2DS4 (del 22 bp variant), -2DS5, -3DL1 and -3DL2 genes, which was the basis of the assumption that there is a hybrid haplotype and that the present 3DL1 gene is a variant of the 3DS1 gene. Due to consanguinity among the ancestors, the results of KIR segregation analysis showed the existence of a very rare KIR genotype in the offspring. The family who is the subject of this case is even more interesting because the father was 10/10 human leukocyte antigen (HLA)-matched to his daughter, all members of the family have

  19. Bayesian inference of radiation belt loss timescales.

    Science.gov (United States)

    Camporeale, E.; Chandorkar, M.

    2017-12-01

    Electron fluxes in the Earth's radiation belts are routinely studied using the classical quasi-linear radial diffusion model. Although this simplified linear equation has proven to be an indispensable tool in understanding the dynamics of the radiation belt, it requires specification of quantities such as the diffusion coefficient and electron loss timescales that are never directly measured. Researchers have so far assumed a-priori parameterisations for radiation belt quantities and derived the best fit using satellite data. The state of the art in this domain lacks a coherent formulation of this problem in a probabilistic framework. We present some recent progress that we have made in performing Bayesian inference of radial diffusion parameters. We achieve this by making extensive use of the theory connecting Gaussian Processes and linear partial differential equations, and performing Markov Chain Monte Carlo sampling of radial diffusion parameters. These results are important for understanding the role and the propagation of uncertainties in radiation belt simulations and, eventually, for providing a probabilistic forecast of energetic electron fluxes in a Space Weather context.

  20. Inferring human mobility using communication patterns

    Science.gov (United States)

    Palchykov, Vasyl; Mitrović, Marija; Jo, Hang-Hyun; Saramäki, Jari; Pan, Raj Kumar

    2014-08-01

    Understanding the patterns of mobility of individuals is crucial for a number of reasons, from city planning to disaster management. There are two common ways of quantifying the amount of travel between locations: by direct observations that often involve privacy issues, e.g., tracking mobile phone locations, or by estimations from models. Typically, such models build on accurate knowledge of the population size at each location. However, when this information is not readily available, their applicability is rather limited. As mobile phones are ubiquitous, our aim is to investigate if mobility patterns can be inferred from aggregated mobile phone call data alone. Using data released by Orange for Ivory Coast, we show that human mobility is well predicted by a simple model based on the frequency of mobile phone calls between two locations and their geographical distance. We argue that the strength of the model comes from directly incorporating the social dimension of mobility. Furthermore, as only aggregated call data is required, the model helps to avoid potential privacy problems.

  1. Different patterns of evolution in the centromeric and telomeric regions of group A and B haplotypes of the human killer cell Ig-like receptor locus.

    Directory of Open Access Journals (Sweden)

    Chul-Woo Pyo

    The fast-evolving human KIR gene family encodes variable lymphocyte receptors specific for polymorphic HLA class I determinants. Nucleotide sequences for 24 representative human KIR haplotypes were determined. With three previously defined haplotypes, this gave a set of 12 group A and 15 group B haplotypes for assessment of KIR variation. The seven gene-content haplotypes are all combinations of four centromeric and two telomeric motifs. 2DL5, 2DS5 and 2DS3 can be present in centromeric and telomeric locations. With one exception, haplotypes having identical gene content differed in their combinations of KIR alleles. Sequence diversity varied between haplotype groups and between centromeric and telomeric halves of the KIR locus. The most variable A haplotype genes are in the telomeric half, whereas the most variable genes characterizing B haplotypes are in the centromeric half. Of the highly polymorphic genes, only the 3DL3 framework gene exhibits a similar diversity when carried by A and B haplotypes. Phylogenetic analysis and divergence time estimates point to the centromeric gene-content motifs that distinguish A and B haplotypes having emerged ~6 million years ago, contemporaneously with the separation of human and chimpanzee ancestors. In contrast, the telomeric motifs that distinguish A and B haplotypes emerged more recently, ~1.7 million years ago, before the emergence of Homo sapiens. Thus the centromeric and telomeric motifs that typify A and B haplotypes have likely been present throughout human evolution. The results suggest the common ancestor of A and B haplotypes combined a B-like centromeric region with an A-like telomeric region.

  2. LAIT: a local ancestry inference toolkit.

    Science.gov (United States)

    Hui, Daniel; Fang, Zhou; Lin, Jerome; Duan, Qing; Li, Yun; Hu, Ming; Chen, Wei

    2017-09-06

    Inferring local ancestry in individuals of mixed ancestry has many applications, most notably in identifying disease-susceptible loci that vary among different ethnic groups. Many software packages are available for inferring local ancestry in admixed individuals. However, most of these existing software packages require specifically formatted input files and generate output files of various types, which is inconvenient in practice. We developed a tool set, Local Ancestry Inference Toolkit (LAIT), which can convert standardized files into software-specific input file formats as well as standardize and summarize inference results for four popular local ancestry inference software packages: HAPMIX, LAMP, LAMP-LD, and ELAI. We tested LAIT using both simulated and real data sets and demonstrated that LAIT makes it convenient to run multiple local ancestry inference software packages. In addition, we compared the performance of the supported software packages, mainly focusing on inference accuracy and computational resources used. We provided a toolkit to facilitate the use of local ancestry inference software, especially for users with limited bioinformatics background.

  3. Forward and backward inference in spatial cognition.

    Directory of Open Access Journals (Sweden)

    Will D Penny

    This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, we propose that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.
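
    The forward-inference step described in this record, combining a motion-based prediction with sensory evidence, can be sketched as one step of discrete Bayesian filtering. This is only an illustrative toy, not the authors' model: the three-location ring, the transition probabilities, the sensory likelihoods and the function name `forward_step` are all hypothetical.

```python
def forward_step(belief, transition, likelihood):
    """One forward-inference (filtering) step over discrete locations.

    belief[i]        -- current probability of being at location i
    transition[i][j] -- probability of moving from i to j (path integration)
    likelihood[j]    -- probability of the current sensory input at j
    """
    n = len(belief)
    # Predict: push the current belief through the motion model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Update: weight the prediction by the sensory likelihood, then normalise.
    unnormalised = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Hypothetical ring of three locations: stay with probability 0.2,
# move one step clockwise with probability 0.8.
transition = [
    [0.2, 0.8, 0.0],
    [0.0, 0.2, 0.8],
    [0.8, 0.0, 0.2],
]
belief = [1.0 / 3.0] * 3        # no idea where we are initially
likelihood = [0.1, 0.7, 0.2]    # sensory input favours location 1
posterior = forward_step(belief, transition, likelihood)
```

    Repeating `forward_step` over a sequence of observations gives the running location estimate; backward inference would add a second pass over the same quantities in reverse time.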

  4. Generative Inferences Based on Learned Relations

    Science.gov (United States)

    Chen, Dawn; Lu, Hongjing; Holyoak, Keith J.

    2017-01-01

    A key property of relational representations is their "generativity": From partial descriptions of relations between entities, additional inferences can be drawn about other entities. A major theoretical challenge is to demonstrate how the capacity to make generative inferences could arise as a result of learning relations from…

  5. Inference in models with adaptive learning

    NARCIS (Netherlands)

    Chevillon, G.; Massmann, M.; Mavroeidis, S.

    2010-01-01

    Identification of structural parameters in models with adaptive learning can be weak, causing standard inference procedures to become unreliable. Learning also induces persistent dynamics, and this makes the distribution of estimators and test statistics non-standard. Valid inference can be

  6. Fiducial inference - A Neyman-Pearson interpretation

    NARCIS (Netherlands)

    Salome, D; VonderLinden, W; Dose,; Fischer, R; Preuss, R

    1999-01-01

    Fisher's fiducial argument is a tool for deriving inferences in the form of a probability distribution on the parameter space, not based on Bayes's Theorem. Lindley established that in exceptional situations fiducial inferences coincide with posterior distributions; in the other situations fiducial

  7. Uncertainty in prediction and in inference

    NARCIS (Netherlands)

    Hilgevoord, J.; Uffink, J.

    1991-01-01

    The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close relationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in

  8. Causal inference in economics and marketing.

    Science.gov (United States)

    Varian, Hal R

    2016-07-05

    This is an elementary introduction to causal inference in economics written for readers familiar with machine learning methods. The critical step in any causal analysis is estimating the counterfactual: a prediction of what would have happened in the absence of the treatment. The powerful techniques used in machine learning may be useful for developing better estimates of the counterfactual, potentially improving causal inference.

  9. The Impact of Disablers on Predictive Inference

    Science.gov (United States)

    Cummins, Denise Dellarosa

    2014-01-01

    People consider alternative causes when deciding whether a cause is responsible for an effect (diagnostic inference) but appear to neglect them when deciding whether an effect will occur (predictive inference). Five experiments were conducted to test a 2-part explanation of this phenomenon: namely, (a) that people interpret standard predictive…

  10. Compiling Relational Bayesian Networks for Exact Inference

    DEFF Research Database (Denmark)

    Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark

    2006-01-01

    We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference...

  11. Compiling Relational Bayesian Networks for Exact Inference

    DEFF Research Database (Denmark)

    Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan

    2004-01-01

    We describe a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...

  12. Ancient Biomolecules and Evolutionary Inference.

    Science.gov (United States)

    Cappellini, Enrico; Prohaska, Ana; Racimo, Fernando; Welker, Frido; Pedersen, Mikkel Winther; Allentoft, Morten E; de Barros Damgaard, Peter; Gutenbrunner, Petra; Dunne, Julie; Hammann, Simon; Roffet-Salque, Mélanie; Ilardo, Melissa; Moreno-Mayar, J Víctor; Wang, Yucheng; Sikora, Martin; Vinner, Lasse; Cox, Jürgen; Evershed, Richard P; Willerslev, Eske

    2018-04-25

    Over the last decade, studies of ancient biomolecules (particularly ancient DNA, proteins, and lipids) have revolutionized our understanding of evolutionary history. Though initially fraught with many challenges, the field now stands on firm foundations. Researchers now successfully retrieve nucleotide and amino acid sequences, as well as lipid signatures, from progressively older samples, originating from geographic areas and depositional environments that, until recently, were regarded as hostile to long-term preservation of biomolecules. Sampling frequencies and the spatial and temporal scope of studies have also increased markedly, and with them the size and quality of the data sets generated. This progress has been made possible by continuous technical innovations in analytical methods, enhanced criteria for the selection of ancient samples, integrated experimental methods, and advanced computational approaches. Here, we discuss the history and current state of ancient biomolecule research, its applications to evolutionary inference, and future directions for this young and exciting field. Expected final online publication date for the Annual Review of Biochemistry Volume 87 is June 20, 2018. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

  13. EEG Based Inference of Spatio-Temporal Brain Dynamics

    DEFF Research Database (Denmark)

    Hansen, Sofie Therese

    Electroencephalography (EEG) provides a measure of brain activity and has improved our understanding of the brain immensely. However, there is still much to be learned and the full potential of EEG is yet to be realized. In this thesis we suggest to improve the information gain of EEG using three...... different approaches; 1) by recovery of the EEG sources, 2) by representing and inferring the propagation path of EEG sources, and 3) by combining EEG with functional magnetic resonance imaging (fMRI). The common goal of the methods, and thus of this thesis, is to improve the spatial dimension of EEG...... recovery ability. The forward problem describes the propagation of neuronal activity in the brain to the EEG electrodes on the scalp. The geometry and conductivity of the head layers are normally required to model this path. We propose a framework for inferring forward models which is based on the EEG...

  14. Inference with constrained hidden Markov models in PRISM

    DEFF Research Database (Denmark)

    Christiansen, Henning; Have, Christian Theil; Lassen, Ole Torp

    2010-01-01

    A Hidden Markov Model (HMM) is a common statistical model which is widely used for analysis of biological sequence data and other sequential phenomena. In the present paper we show how HMMs can be extended with side-constraints and present constraint solving techniques for efficient inference. De......_different are integrated. We experimentally validate our approach on the biologically motivated problem of global pairwise alignment.

  15. Bootstrap-Based Inference for Cube Root Consistent Estimators

    DEFF Research Database (Denmark)

    Cattaneo, Matias D.; Jansson, Michael; Nagasawa, Kenichi

    This note proposes a consistent bootstrap-based distributional approximation for cube root consistent estimators such as the maximum score estimator of Manski (1975) and the isotonic density estimator of Grenander (1956). In both cases, the standard nonparametric bootstrap is known to be inconsistent. Our method restores consistency of the nonparametric bootstrap by altering the shape of the criterion function defining the estimator whose distribution we seek to approximate. This modification leads to a generic and easy-to-implement resampling method for inference that is conceptually distinct from other available distributional approximations based on some form of modified bootstrap. We offer simulation evidence showcasing the performance of our inference method in finite samples. An extension of our methodology to general M-estimation problems is also discussed.

  16. Problem Solving and Learning

    Science.gov (United States)

    Singh, Chandralekha

    2009-07-01

    One finding of cognitive research is that people do not automatically acquire usable knowledge by spending lots of time on task. Because students' knowledge hierarchy is more fragmented, "knowledge chunks" are smaller than those of experts. The limited capacity of short term memory makes the cognitive load high during problem solving tasks, leaving few cognitive resources available for meta-cognition. The abstract nature of the laws of physics and the chain of reasoning required to draw meaningful inferences makes these issues critical. In order to help students, it is crucial to consider the difficulty of a problem from the perspective of students. We are developing and evaluating interactive problem-solving tutorials to help students in the introductory physics courses learn effective problem-solving strategies while solidifying physics concepts. The self-paced tutorials can provide guidance and support for a variety of problem solving techniques, and opportunity for knowledge and skill acquisition.

  17. The association between individual SNPs or haplotypes of matrix metalloproteinase 1 and gastric cancer susceptibility, progression and prognosis.

    Directory of Open Access Journals (Sweden)

    Yong-Xi Song

    BACKGROUND: The single nucleotide polymorphisms (SNPs) in matrix metalloproteinase 1 (MMP-1) play important roles in some cancers. This study examined the associations between individual SNPs or haplotypes in MMP-1 and susceptibility, clinicopathological parameters and prognosis of gastric cancer in a large sample of the Han population in northern China. METHODS: In this case-controlled study, there were 404 patients with gastric cancer and 404 healthy controls. Seven SNPs were genotyped using the MALDI-TOF MS system. Then, SPSS software, Haploview 4.2 software, Haplo.states software and THEsias software were used to estimate the association between individual SNPs or haplotypes of MMP-1 and gastric cancer susceptibility, progression and prognosis. RESULTS: Among the seven SNPs, no individual SNP was correlated with gastric cancer risk. Moreover, only the rs470206 genotype was correlated with histologic grade, and the patients with GA/AA had good cell differentiation compared to the patients with genotype GG (OR=0.573; 95%CI: 0.353-0.929; P=0.023). Then, we constructed a four-marker haplotype block that contained 4 common haplotypes: TCCG, GCCG, TTCG and TTTA. However, none of the four common haplotypes was correlated with gastric cancer risk, and we did not find any relationship between these haplotypes and clinicopathological parameters in gastric cancer. Furthermore, neither individual SNPs nor haplotypes had an association with the survival of patients with gastric cancer. CONCLUSIONS: This study evaluated polymorphisms of the MMP-1 gene in gastric cancer with a MALDI-TOF MS method in a large northern Chinese case-controlled cohort. Our results indicated that these seven SNPs of MMP-1 might not be useful as significant markers to predict gastric cancer susceptibility, progression or prognosis, at least in the Han population in northern China.

  18. Kernel learning at the first level of inference.

    Science.gov (United States)

    Cawley, Gavin C; Talbot, Nicola L C

    2014-05-01

    Kernel learning methods, whether Bayesian or frequentist, typically involve multiple levels of inference, with the coefficients of the kernel expansion being determined at the first level and the kernel and regularisation parameters carefully tuned at the second level, a process known as model selection. Model selection for kernel machines is commonly performed via optimisation of a suitable model selection criterion, often based on cross-validation or theoretical performance bounds. However, if there are a large number of kernel parameters, as for instance in the case of automatic relevance determination (ARD), there is a substantial risk of over-fitting the model selection criterion, resulting in poor generalisation performance. In this paper we investigate the possibility of learning the kernel, for the Least-Squares Support Vector Machine (LS-SVM) classifier, at the first level of inference, i.e. parameter optimisation. The kernel parameters and the coefficients of the kernel expansion are jointly optimised at the first level of inference, minimising a training criterion with an additional regularisation term acting on the kernel parameters. The key advantage of this approach is that the values of only two regularisation parameters need be determined in model selection, substantially alleviating the problem of over-fitting the model selection criterion. The benefits of this approach are demonstrated using a suite of synthetic and real-world binary classification benchmark problems, where kernel learning at the first level of inference is shown to be statistically superior to the conventional approach, improves on our previous work (Cawley and Talbot, 2007) and is competitive with Multiple Kernel Learning approaches, but with reduced computational expense. Copyright © 2014 Elsevier Ltd. All rights reserved.

  19. Serpin peptidase inhibitor (SERPINB5) haplotypes are associated with susceptibility to hepatocellular carcinoma

    Science.gov (United States)

    Yang, Shun-Fa; Yeh, Chao-Bin; Chou, Ying-Erh; Lee, Hsiang-Lin; Liu, Yu-Fan

    2016-05-01

    Hepatocellular carcinoma (HCC) represents the second leading cause of cancer-related death worldwide. The serpin peptidase inhibitor SERPINB5 is a tumour-suppressor gene that promotes the development of various cancers in humans. However, whether SERPINB5 gene variants play a role in HCC susceptibility remains unknown. In this study, we genotyped 6 SNPs of the SERPINB5 gene in an independent cohort from a replicate population comprising 302 cases and 590 controls. Additionally, patients who had at least one rs2289520 C allele in SERPINB5 tended to exhibit better liver function than patients with genotype GG (Child-Pugh grade A vs. B or C; P = 0.047). Next, haplotype blocks were reconstructed according to the linkage disequilibrium structure of the SERPINB5 gene. A haplotype “C-C-C” (rs17071138 + rs3744941 + rs8089204) in SERPINB5-correlated promoter showed a significant association with an increased HCC risk (AOR = 1.450 P = 0.031). Haplotypes “T-C-A” and “C-C-C” (rs2289519 + rs2289520 + rs1455555) located in the SERPINB5 coding region had a decreased (AOR = 0.744 P = 0.031) and increased (AOR = 1.981 P = 0.001) HCC risk, respectively. Finally, an additional integrated in silico analysis confirmed that these SNPs affected SERPINB5 expression and protein stability, which significantly correlated with tumour expression and subsequently with tumour development and aggressiveness. Taken together, our findings regarding these biomarkers provide a prediction model for risk assessment.

  20. Worldwide genealogy of Entamoeba histolytica: an overview to understand haplotype distribution and infection outcome.

    Science.gov (United States)

    Zermeño, Valeria; Ximénez, Cecilia; Morán, Patricia; Valadez, Alicia; Valenzuela, Olivia; Rascón, Edgar; Diaz, Daniel; Cerritos, René

    2013-07-01

    Although Entamoeba histolytica is one of the most prevalent intestinal parasites, how the different strains of this species are distributed all over the world and how different genotypes are associated with the infection outcome are yet to be fully understood. Recently, the use of a number of molecular markers has made the characterization of several genotypes in those regions with high incidence of amoebiasis possible. This work proposes the first genealogy of E. histolytica, with a haplotype network based on two tRNA gene-linked arrays of Short Tandem Repeats (STRs) reported until today, and 47 sequences from 39 new isolates of Mexican Amoebic Liver Abscesses (ALA) samples. One hundred and three sequences were obtained from the D-A locus; information about the geographic region of isolation as well as clinical diagnosis was also collected. One hundred and five sequences from the N-K2 locus were also obtained as well as the region of isolation, but the information about clinical diagnosis was not available in all cases. The most abundant and widely distributed haplotype in the world is that of the E. histolytica HM1:IMSS strain. This was found in Mexico, Bangladesh, Japan, China and USA and is associated with symptomatic patients as well as asymptomatic cyst passers. Many other haplotypes were found only in a single country. Both genealogies suggest that there are no lineages within the networks that may be related to a particular geographic region or infection outcome. A concatenated analysis of the two molecular markers revealed 12 different combinations, which suggests the possibility of genetic recombination events. The present study is the first to propose a global genealogy of this species and suggests that there are still many genotypes to be discovered. The genotyping of new isolates will help to understand the great diversity and genetic structure of this parasite. Copyright © 2013 Elsevier B.V. All rights reserved.

  1. Role of MAPT mutations and haplotype in frontotemporal lobar degeneration in Northern Finland

    Directory of Open Access Journals (Sweden)

    Tuominen Hannu

    2008-12-01

    Background: Frontotemporal lobar degeneration (FTLD) consists of a clinically and neuropathologically heterogeneous group of syndromes affecting the frontal and temporal lobes of the brain. Mutations in microtubule-associated protein tau (MAPT), progranulin (PGRN) and charged multi-vesicular body protein 2B (CHMP2B) are associated with familial forms of the disease. The prevalence of these mutations varies between populations. The H1 haplotype of MAPT has been found to be closely associated with tauopathies and with sporadic FTLD. Our aim was to investigate MAPT mutations and haplotype frequencies in a clinical series of patients with FTLD in Northern Finland. Methods: MAPT exons 1, 2 and 9–13 were sequenced in 59 patients with FTLD, and MAPT haplotypes were analysed in these patients, 122 patients with early onset Alzheimer's disease (eoAD) and 198 healthy controls. Results: No pathogenic mutations were found. The H2 allele frequency was 11.0% (P = 0.028) in the FTLD patients, 9.8% (P = 0.029) in the eoAD patients and 5.3% in the controls. The H2 allele was especially clustered in patients with a positive family history (P = 0.011) but did not lower the age at onset of the disease. The ApoE4 allele frequency was significantly increased in the patients with eoAD and in those with FTLD. Conclusion: We conclude that although pathogenic MAPT mutations are rare in Northern Finland, the MAPT H2 allele may be associated with increased risks of FTLD and eoAD in the Finnish population.
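
    The allele-frequency comparison reported above can be illustrated with a 2x2 chi-square test. This is a sketch under stated assumptions: the allele counts are back-calculated from the reported percentages and sample sizes (two alleles per person, everyone genotyped), and the abstract does not say which test the authors used. The function name `allele_chi_square` is hypothetical.

```python
import math

def allele_chi_square(case_alt, case_total, ctrl_alt, ctrl_total):
    """Pearson chi-square (1 df, no continuity correction) on a 2x2 allele table."""
    a, b = case_alt, case_total - case_alt
    c, d = ctrl_alt, ctrl_total - ctrl_alt
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio

# Assumed counts: 59 FTLD patients -> 118 alleles, 11.0% H2 -> about 13 alleles;
# 198 controls -> 396 alleles, 5.3% H2 -> about 21 alleles.
chi2, p_value, odds_ratio = allele_chi_square(13, 118, 21, 396)
print(f"chi2={chi2:.2f}  P={p_value:.3f}  OR={odds_ratio:.2f}")
```

    Under these assumed counts the uncorrected P value comes out at roughly 0.028, in line with the figure reported for the FTLD group, although rounding in the published percentages means the true counts may differ.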

  2. The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

    Science.gov (United States)

    Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

    2016-10-11

    Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced. A stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

  3. Exploring and Harnessing Haplotype Diversity to Improve Yield Stability in Crops

    Directory of Open Access Journals (Sweden)

    Lunwen Qian

    2017-09-01

    In order to meet future food, feed, fiber, and bioenergy demands, global yields of all major crops need to be increased significantly. At the same time, the increasing frequency of extreme weather events such as heat and drought necessitates improvements in the environmental resilience of modern crop cultivars. Achieving sustainably increased yields implies rapid improvement of quantitative traits with a very complex genetic architecture and strong environmental interaction. Latest advances in genome analysis technologies today provide molecular information at an ultrahigh resolution, revolutionizing crop genomic research, and paving the way for advanced quantitative genetic approaches. These include highly detailed assessment of population structure and genotypic diversity, facilitating the identification of selective sweeps and signatures of directional selection, dissection of genetic variants that underlie important agronomic traits, and genomic selection (GS) strategies that not only consider major-effect genes. Single-nucleotide polymorphism (SNP) markers today represent the genotyping system of choice for crop genetic studies because they occur abundantly in plant genomes and are easy to detect. SNPs are typically biallelic, however, hence their information content compared to multiallelic markers is low, limiting the resolution at which SNP–trait relationships can be delineated. An efficient way to overcome this limitation is to construct haplotypes based on linkage disequilibrium, one of the most important features influencing genetic analyses of crop genomes. Here, we give an overview of the latest advances in genomics-based haplotype analyses in crops, highlighting their importance in the context of polyploidy and genome evolution, linkage drag, and co-selection. We provide examples of how haplotype analyses can complement well-established quantitative genetics frameworks, such as quantitative trait analysis and GS, ultimately

  4. Exploring and Harnessing Haplotype Diversity to Improve Yield Stability in Crops.

    Science.gov (United States)

    Qian, Lunwen; Hickey, Lee T; Stahl, Andreas; Werner, Christian R; Hayes, Ben; Snowdon, Rod J; Voss-Fels, Kai P

    2017-01-01

    In order to meet future food, feed, fiber, and bioenergy demands, global yields of all major crops need to be increased significantly. At the same time, the increasing frequency of extreme weather events such as heat and drought necessitates improvements in the environmental resilience of modern crop cultivars. Achieving sustainably increased yields implies rapid improvement of quantitative traits with a very complex genetic architecture and strong environmental interaction. Latest advances in genome analysis technologies today provide molecular information at an ultrahigh resolution, revolutionizing crop genomic research, and paving the way for advanced quantitative genetic approaches. These include highly detailed assessment of population structure and genotypic diversity, facilitating the identification of selective sweeps and signatures of directional selection, dissection of genetic variants that underlie important agronomic traits, and genomic selection (GS) strategies that not only consider major-effect genes. Single-nucleotide polymorphism (SNP) markers today represent the genotyping system of choice for crop genetic studies because they occur abundantly in plant genomes and are easy to detect. SNPs are typically biallelic, however, hence their information content compared to multiallelic markers is low, limiting the resolution at which SNP-trait relationships can be delineated. An efficient way to overcome this limitation is to construct haplotypes based on linkage disequilibrium, one of the most important features influencing genetic analyses of crop genomes. Here, we give an overview of the latest advances in genomics-based haplotype analyses in crops, highlighting their importance in the context of polyploidy and genome evolution, linkage drag, and co-selection. We provide examples of how haplotype analyses can complement well-established quantitative genetics frameworks, such as quantitative trait analysis and GS, ultimately providing an effective tool

  5. Haplotype analysis indicates an association between the DOPA decarboxylase (DDC) gene and nicotine dependence.

    Science.gov (United States)

    Ma, Jennie Z; Beuten, Joke; Payne, Thomas J; Dupont, Randolph T; Elston, Robert C; Li, Ming D

    2005-06-15

    DOPA decarboxylase (DDC; also known as L-amino acid decarboxylase; AADC) is involved in the synthesis of dopamine, norepinephrine and serotonin. Because the mesolimbic dopaminergic system is implicated in the reinforcing effects of many drugs, including nicotine, the DDC gene is considered a plausible candidate for involvement in the development of vulnerability to nicotine dependence (ND). Further, this gene is located within the 7p11 region that showed a 'suggestive linkage' to ND in our previous genome-wide scan in the Framingham Heart Study population. In the present study, we tested eight single nucleotide polymorphisms (SNPs) within DDC for association with ND, which was assessed by smoking quantity (SQ), the heaviness of smoking index (HSI) and the Fagerstrom test for ND (FTND) score, in a total of 2037 smokers and non-smokers from 602 nuclear families of African- or European-American (AA or EA, respectively) ancestry. Association analysis for individual SNPs using the PBAT-GEE program indicated that SNP rs921451 was significantly associated with two of the three adjusted ND measures in the EA sample (P=0.01-0.04). Haplotype-based association analysis revealed a protective T-G-T-G haplotype for rs921451-rs3735273-rs1451371-rs2060762 in the AA sample, which was significantly associated with all three adjusted ND measures after correction for multiple testing (min Z=-2.78, P=0.006 for HSI). In contrast, we found a high-risk T-G-T-G haplotype for a different SNP combination in the EA sample, rs921451-rs3735273-rs1451371-rs3757472, which showed a significant association after Bonferroni correction with the SQ and FTND score (max Z=2.73, P=0.005 for FTND). In summary, our findings provide the first evidence for the involvement of DDC in the susceptibility to ND and, further, reveal the racial specificity of its impact.

  6. PTCH1 gene haplotype association with basal cell carcinoma after transplantation.

    Science.gov (United States)

    Begnini, A; Tessari, G; Turco, A; Malerba, G; Naldi, L; Gotti, E; Boschiero, L; Forni, A; Rugiu, C; Piaserico, S; Fortina, A B; Brunello, A; Cascone, C; Girolomoni, G; Gomez Lira, M

    2010-08-01

    Basal cell carcinoma (BCC) is 10 times more frequent in organ transplant recipients (OTRs) than in the general population. Factors in OTRs conferring increased susceptibility to BCC include ultraviolet radiation exposure, immunosuppression, viral infections such as human papillomavirus, phototype and genetic predisposition. The PTCH1 gene is a negative regulator of the hedgehog pathway, that provides mitogenic signals to basal cells in skin. PTCH1 gene mutations cause naevoid BCC syndrome, and contribute to the development of sporadic BCC and other types of cancers. Associations have been reported between PTCH1 polymorphisms and BCC susceptibility in nontransplanted individuals. To search for novel common polymorphisms in the proximal 5' regulatory region upstream of PTCH1 gene exon 1B, and to investigate the possible association of PTCH1 polymorphisms and haplotypes with BCC risk after organ transplantation. Three PTCH1 single nucleotide polymorphisms (rs2297086, rs2066836 and rs357564) were analysed by restriction fragment length polymorphism analysis in 161 northern Italian OTRs (56 BCC cases and 105 controls). Two regions of the PTCH1 gene promoter were screened by heteroduplex analysis in 30 cases and 30 controls. Single locus analysis showed no significant association. Haplotype T(1686)-T(3944) appeared to confer a significantly higher risk for BCC development (odds ratio 2.98, 95% confidence interval 2.55-3.48; P = 0.001). Two novel rare polymorphisms were identified at positions 176 and 179 of the 5'UTR. Two novel alleles of the -4 (CGG)(n) microsatellite were identified. No association of this microsatellite with BCC was observed. Haplotypes containing T(1686)-T(3944) alleles were shown to be associated with an increased BCC risk in our study population. These data appear to be of great interest for further investigations in a larger group of transplant individuals. Our results do not support the hypothesis that common polymorphisms in the proximal 5

  7. HLA alleles and haplotypes distribution in Dai population in Yunnan province, Southwest China.

    Science.gov (United States)

    Shi, L; Yao, Y F; Shi, L; Matsushita, M; Yu, L; Lin, Q K; Tao, Y F; Oka, T; Chu, J Y; Tokunaga, K

    2010-02-01

    Human leukocyte antigen (HLA) analysis would be a useful tool to trace the origin of modern humans. In this study, we provided the first four-digit HLA-A, -B, -C and -DRB1 allele and haplotype data in the Dai ethnic population, which is a unique and representative Kam-Tai-speaking ethnic minority living in the Yunnan province of Southwestern China. Our results showed that the Dai population has unique HLA characteristics that are most closely related to the Southeastern Asia group and similar to the Kam-Tai-speaking populations in China and Thailand.

  8. The Systemic Lupus Erythematosus IRF5 Risk Haplotype Is Associated with Systemic Sclerosis

    Science.gov (United States)

    Beretta, Lorenzo; Simeón, Carmen P.; Carreira, Patricia E.; Callejas, José Luis; Fernández-Castro, Mónica; Sáez-Comet, Luis; Beltrán, Emma; Camps, María Teresa; Egurbide, María Victoria; Airó, Paolo; Scorza, Raffaella; Lunardi, Claudio; Hunzelmann, Nicolas; Riemekasten, Gabriela; Witte, Torsten; Kreuter, Alexander; Distler, Jörg H. W.; Madhok, Rajan; Shiels, Paul; van Laar, Jacob M.; Fonseca, Carmen; Denton, Christopher; Herrick, Ariane; Worthington, Jane; Schuerwegh, Annemie J.; Vonk, Madelon C.; Voskuyl, Alexandre E.; Radstake, Timothy R. D. J.; Martín, Javier

    2013-01-01

    Systemic sclerosis (SSc) is a fibrotic autoimmune disease in which the genetic component plays an important role. One of the strongest SSc association signals outside the human leukocyte antigen (HLA) region corresponds to interferon (IFN) regulatory factor 5 (IRF5), a major regulator of the type I IFN pathway. In this study we aimed to evaluate whether three different haplotypic blocks within this locus, which have been shown to alter the protein function influencing systemic lupus erythematosus (SLE) susceptibility, are involved in SSc susceptibility and clinical phenotypes. For that purpose, we genotyped one representative single-nucleotide polymorphism (SNP) of each block (rs10488631, rs2004640, and rs4728142) in a total of 3,361 SSc patients and 4,012 unaffected controls of Caucasian origin from Spain, Germany, The Netherlands, Italy and United Kingdom. A meta-analysis of the allele frequencies was performed to analyse the overall effect of these IRF5 genetic variants on SSc. Allelic combination and dependency tests were also carried out. The three SNPs showed strong associations with the global disease (rs4728142: P = 1.34×10^-8, OR = 1.22, CI 95% = 1.14–1.30; rs2004640: P = 4.60×10^-7, OR = 0.84, CI 95% = 0.78–0.90; rs10488631: P = 7.53×10^-20, OR = 1.63, CI 95% = 1.47–1.81). However, the association of rs2004640 with SSc was not independent of rs4728142 (conditioned P = 0.598). The haplotype containing the risk alleles (rs4728142*A-rs2004640*T-rs10488631*C: P = 9.04×10^-22, OR = 1.75, CI 95% = 1.56–1.97) better explained the observed association (likelihood P-value = 1.48×10^-4), suggesting an additive effect of the three haplotypic blocks. No statistical significance was observed in the comparisons amongst SSc patients with and without the main clinical characteristics. Our data clearly indicate that the SLE risk haplotype also influences SSc predisposition, and that
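
    The odds ratios and confidence intervals reported above can be related to their P values with a short calculation. The sketch below assumes a normal approximation on the log odds ratio scale and recovers the standard error from the published 95% interval; the function name `z_and_p_from_or_ci` is hypothetical, and the original meta-analysis may have used a different procedure.

```python
import math

def z_and_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE of log(OR) from a 95% CI, then a two-sided normal P value."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return se, z, p_value

# Odds ratios and 95% CIs as printed in the abstract.
reported = {
    "rs4728142": (1.22, 1.14, 1.30),
    "rs2004640": (0.84, 0.78, 0.90),
    "rs10488631": (1.63, 1.47, 1.81),
}
results = {snp: z_and_p_from_or_ci(*vals) for snp, vals in reported.items()}
for snp, (se, z, p) in results.items():
    print(f"{snp}: SE={se:.4f}  z={z:+.2f}  P={p:.2e}")
```

    Because the published intervals are rounded to two decimals, the recomputed P values should be expected to agree with the reported ones only in rough order of magnitude.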

  9. Risk of Pediatric Celiac Disease According to HLA Haplotype and Country

    Science.gov (United States)

    Liu, Edwin; Lee, Hye-Seung; Aronsson, Carin A.; Hagopian, William A.; Koletzko, Sibylle; Rewers, Marian J.; Eisenbarth, George S.; Bingley, Polly J.; Bonifacio, Ezio; Simell, Ville; Agardh, Daniel

    2014-01-01

    BACKGROUND The presence of HLA haplotype DR3–DQ2 or DR4–DQ8 is associated with an increased risk of celiac disease. In addition, nearly all children with celiac disease have serum antibodies against tissue transglutaminase (tTG). METHODS We studied 6403 children with HLA haplotype DR3–DQ2 or DR4–DQ8 prospectively from birth in the United States, Finland, Germany, and Sweden. The primary end point was the development of celiac disease autoimmunity, which was defined as the presence of tTG antibodies on two consecutive tests at least 3 months apart. The secondary end point was the development of celiac disease, which was defined for the purpose of this study as either a diagnosis on biopsy or persistently high levels of tTG antibodies. RESULTS The median follow-up was 60 months (interquartile range, 46 to 77). Celiac disease autoimmunity developed in 786 children (12%). Of the 350 children who underwent biopsy, 291 had confirmed celiac disease; an additional 21 children who did not undergo biopsy had persistently high levels of tTG antibodies. The risks of celiac disease autoimmunity and celiac disease by the age of 5 years were 11% and 3%, respectively, among children with a single DR3–DQ2 haplotype, and 26% and 11%, respectively, among those with two copies (DR3–DQ2 homozygosity). In the adjusted model, the hazard ratios for celiac disease autoimmunity were 2.09 (95% confidence interval [CI], 1.70 to 2.56) among heterozygotes and 5.70 (95% CI, 4.66 to 6.97) among homozygotes, as compared with children who had the lowest-risk genotypes (DR4–DQ8 heterozygotes or homozygotes). Residence in Sweden was also independently associated with an increased risk of celiac disease autoimmunity (hazard ratio, 1.90; 95% CI, 1.61 to 2.25). CONCLUSIONS Children with the HLA haplotype DR3–DQ2, especially homozygotes, were found to be at high risk for celiac disease autoimmunity and celiac disease early in childhood. The higher risk in Sweden than in other countries

  10. Inflammation, insulin resistance, and diabetes--Mendelian randomization using CRP haplotypes points upstream.

    Directory of Open Access Journals (Sweden)

    Eric J Brunner

    2008-08-01

    Raised C-reactive protein (CRP) is a risk factor for type 2 diabetes. According to the Mendelian randomization method, the association is likely to be causal if genetic variants that affect CRP level are associated with markers of diabetes development and diabetes. Our objective was to examine the nature of the association between CRP phenotype and diabetes development using CRP haplotypes as instrumental variables. We genotyped three tagging SNPs (CRP + 2302G > A; CRP + 1444T > C; CRP + 4899T > G) in the CRP gene and measured serum CRP in 5,274 men and women at mean ages 49 and 61 y (Whitehall II Study). Homeostasis model assessment-insulin resistance (HOMA-IR) and hemoglobin A1c (HbA1c) were measured at age 61 y. Diabetes was ascertained by glucose tolerance test and self-report. Common major haplotypes were strongly associated with serum CRP levels, but unrelated to obesity, blood pressure, and socioeconomic position, which may confound the association between CRP and diabetes risk. Serum CRP was associated with these potential confounding factors. After adjustment for age and sex, baseline serum CRP was associated with incident diabetes (hazard ratio = 1.39 [95% confidence interval 1.29-1.51]), HOMA-IR, and HbA1c, but the associations were considerably attenuated on adjustment for potential confounding factors. In contrast, CRP haplotypes were not associated with HOMA-IR or HbA1c (p = 0.52-0.92). The associations of CRP with HOMA-IR and HbA1c were all null when examined using instrumental variables analysis, with genetic variants as the instrument for serum CRP. Instrumental variables estimates differed from the directly observed associations (p = 0.007-0.11). Pooled analysis of CRP haplotypes and diabetes in Whitehall II and Northwick Park Heart Study II produced null findings (p = 0.25-0.88). Analyses based on the Wellcome Trust Case Control Consortium (1,923 diabetes cases, 2,932 controls) using three SNPs in tight linkage disequilibrium with our
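
    The instrumental-variables logic of Mendelian randomization can be sketched with the Wald ratio, one simple estimator of this kind: the effect of the genetic instrument on the outcome divided by its effect on the exposure. The abstract does not state which estimator the authors used, and the effect sizes below are hypothetical values chosen only to illustrate a null causal estimate.

```python
import math

def wald_ratio(beta_gx, se_gx, beta_gy, se_gy):
    """Instrumental-variable estimate of the exposure -> outcome effect.

    beta_gx: effect of the genetic instrument on the exposure (e.g. serum CRP)
    beta_gy: effect of the same instrument on the outcome (e.g. HbA1c)
    """
    estimate = beta_gy / beta_gx
    # First-order delta-method standard error, assuming independent estimates.
    se = math.sqrt(se_gy**2 / beta_gx**2 + beta_gy**2 * se_gx**2 / beta_gx**4)
    return estimate, se

# Hypothetical per-haplotype effects (not taken from the study).
estimate, se = wald_ratio(beta_gx=0.20, se_gx=0.02, beta_gy=0.004, se_gy=0.010)
low, high = estimate - 1.96 * se, estimate + 1.96 * se
print(f"IV estimate = {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

    An interval that spans zero, as with these illustrative inputs, corresponds to the kind of null instrumental-variables finding the abstract describes, even when the directly observed association between exposure and outcome is non-zero.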

  11. RTEL1 tagging SNPs and haplotypes were associated with glioma development.

    Science.gov (United States)

    Li, Gang; Jin, Tianbo; Liang, Hongjuan; Zhang, Zhiguo; He, Shiming; Tu, Yanyang; Yang, Haixia; Geng, Tingting; Cui, Guangbin; Chen, Chao; Gao, Guodong

    2013-05-17

    Glioma is the most prevalent primary solid tumor of the central nervous system, and certain single-nucleotide polymorphisms (SNPs) may be related to increased glioma risk and have implications in carcinogenesis. The present case-control study was carried out to elucidate how common variants contribute to glioma susceptibility. Ten candidate tagging SNPs (tSNPs) were selected from seven genes whose polymorphisms have been reported in the literature and in reliable databases as tending to be related to gliomas, and with a minor allele frequency (MAF) >5% in the HapMap Asian population. The selected tSNPs were genotyped in 629 glioma patients and 645 controls from a Han Chinese population using the multiplexed SNP MassEXTEND assay calibrated. Two significant tSNPs in the RTEL1 gene were observed to be associated with glioma risk (rs6010620, P=0.0016, OR: 1.32, 95% CI: 1.11-1.56; rs2297440, P=0.001, OR: 1.33, 95% CI: 1.12-1.58) by χ² test. The genotype "GG" of rs6010620 was identified as a protective genotype for glioma (OR, 0.46; 95% CI, 0.31-0.7; P=0.0002), as was the genotype "CC" of rs2297440 (OR, 0.47; 95% CI, 0.31-0.71; P=0.0003). Furthermore, haplotype "GCT" in the RTEL1 gene was found to be associated with risk of glioma (OR, 0.7; 95% CI, 0.57-0.86; Fisher's P=0.0005; Pearson's P=0.0005), and haplotype "ATT" was also found to be associated with risk of glioma (OR, 1.32; 95% CI, 1.12-1.57; Fisher's P=0.0013; Pearson's P=0.0013). Two single variants, the genotypes "GG" of rs6010620 and "CC" of rs2297440 in the RTEL1 gene, together with the two haplotypes GCT and ATT, were identified as associated with glioma development. Screening for these RTEL1 tagging SNPs and haplotypes might be used to evaluate the risk of glioma development. The virtual slides for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1993021136961998.

  12. Power laws for heavy-tailed distributions: modeling allele and haplotype diversity for the national marrow donor program.

    Directory of Open Access Journals (Sweden)

    Noa Slater

    2015-04-01

    Full Text Available Measures of allele and haplotype diversity, which are fundamental properties in population genetics, often follow heavy-tailed distributions. These measures are of particular interest in the field of hematopoietic stem cell transplant (HSCT). Donor/recipient suitability for HSCT is determined by Human Leukocyte Antigen (HLA) similarity. Match predictions rely upon a precise description of HLA diversity, yet classical estimates are inaccurate given the heavy-tailed nature of the distribution. This directly affects HSCT matching and diversity measures in broader fields such as species richness. We, therefore, have developed a power-law based estimator to measure allele and haplotype diversity that accommodates heavy tails using the concepts of regular variation and occupancy distributions. Application of our estimator to 6.59 million donors in the Be The Match Registry revealed that haplotypes follow a heavy-tailed distribution across all ethnicities: for example, 44.65% of the European American haplotypes are represented by only 1 individual. Indeed, our discovery rate of all U.S. European American haplotypes is estimated at 23.45% based upon sampling 3.97% of the population, leaving a large number of unobserved haplotypes. Population coverage, however, is much higher at 99.4% given that 90% of European Americans carry one of the 4.5% most frequent haplotypes. Alleles were found to be less diverse, suggesting the current registry represents most alleles in the population. Thus, for HSCT registries, haplotype discovery will remain high with continued recruitment to a very deep level of sampling, but population coverage will not. Finally, we compared the convergence of our power-law versus classical diversity estimators such as capture-recapture, Chao, ACE and Jackknife methods. When fit to the haplotype data, our estimator displayed favorable properties in terms of convergence (with respect to sampling depth) and accuracy (with respect to diversity
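
    For orientation, one of the classical estimators named in this abstract can be sketched in a few lines. The code below assumes the standard Chao1 richness estimator (observed types plus a correction from singleton and doubleton counts); it is not the authors' power-law estimator, and the toy sample and function names are hypothetical.

```python
from collections import Counter


def chao1(sample):
    """Classical Chao1 estimate of the total number of distinct types.

    sample: iterable of observed type labels (e.g. haplotype names).
    """
    counts = Counter(sample)
    s_obs = len(counts)                                # distinct types seen
    f1 = sum(1 for c in counts.values() if c == 1)     # seen exactly once
    f2 = sum(1 for c in counts.values() if c == 2)     # seen exactly twice
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form, used here when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / 2


def singleton_fraction(sample):
    """Share of observed types represented by a single individual."""
    counts = Counter(sample)
    return sum(1 for c in counts.values() if c == 1) / len(counts)


# Toy haplotype sample for illustration only (NOT registry data).
toy = ["h1"] * 5 + ["h2"] * 2 + ["h3"] * 2 + ["h4", "h5", "h6", "h7"]
print(chao1(toy), singleton_fraction(toy))
```

    A large singleton fraction, like the 44.65% reported for European American haplotypes, is the situation in which estimators of this kind are most sensitive to sampling depth.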

  13. Novel Nucleotide Variations, Haplotypes Structure and Associations with Growth Related Traits of Goat AT Motif-Binding Factor (ATBF1) Gene

    Directory of Open Access Journals (Sweden)

    Xiaoyan Zhang

    2015-10-01

    Full Text Available The AT motif-binding factor (ATBF1) not only interacts with protein inhibitor of activated signal transducer and activator of transcription 3 (STAT3) (PIAS3) to suppress STAT3 signaling, which regulates early embryo development and cell differentiation, but is also required for early activation of the pituitary specific transcription factor 1 (Pit1) gene (also known as POU1F1), which critically affects mammalian growth and development. The goal of this study was to detect novel nucleotide variations and haplotype structure of the ATBF1 gene, as well as to test their associations with growth-related traits in goats. Herein, a total of seven novel single nucleotide polymorphisms (SNPs) (SNP 1-7) within this gene were found in two well-known Chinese native goat breeds. Haplotype structure analysis demonstrated that there were four haplotypes in Hainan black goat and seventeen haplotypes in Xinong Saanen dairy goat, and the two breeds shared only one haplotype (hap1). Association testing revealed that the SNP2, SNP5, SNP6, and SNP7 loci were each significantly associated with growth-related traits in goats. Moreover, one diplotype in Xinong Saanen dairy goats was significantly linked to growth-related traits. These preliminary findings would not only extend the spectrum of genetic variations of the goat ATBF1 gene, but would also contribute to implementing marker-assisted selection in goat genetics and breeding.

  14. Minimal sharing of Y-chromosome STR haplotypes among five endogamous population groups from western and southwestern India.

    Science.gov (United States)

    Das, Birajalaxmi; Chauhan, P S; Seshadri, M

    2004-10-01

    We attempt to address the issue of genetic variation and the pattern of male gene flow among and between five Indian population groups of two different geographic and linguistic affiliations using Y-chromosome markers. We studied 221 males at three Y-chromosome biallelic loci and 184 males for the five Y-chromosome STRs. We observed 111 Y-chromosome STR haplotypes. An analysis of molecular variance (AMOVA) based on Y-chromosome STRs showed that the variation observed between the population groups belonging to two major regions (western and southwestern India) was 0.17%, which was significantly lower than the level of genetic variance among the five populations (0.59%) considered as a single group. Combined haplotype analysis of the five STRs and the biallelic locus 92R7 revealed minimal sharing of haplotypes among these five ethnic groups, irrespective of the similar origin of the linguistic and geographic affiliations; this minimal sharing indicates restricted male gene flow. As a consequence, most of the haplotypes were population specific. Network analysis showed that the haplotypes, which were shared between the populations, seem to have originated from different mutational pathways at different loci. Biallelic markers showed that all five ethnic groups have a similar ancestral origin despite their geographic and linguistic diversity.

  15. Influence of βS-Globin Haplotypes and Hydroxyurea on Arginase I Levels in Sickle Cell Disease

    Directory of Open Access Journals (Sweden)

    J. A. Moreira

    2016-01-01

    Full Text Available Introduction. Sickle cell disease (SCD) is characterized by hemoglobin S homozygosity, leading to hemolysis and vasoocclusion. The hemolysis releases arginase I, an enzyme that decreases the bioavailability of nitric oxide, worsening the symptoms. The different SCD haplotypes are related to clinical symptoms and varied hemoglobin F (HbF) concentration. The aim of this study was to evaluate the impact of the βS gene haplotypes and HbF concentration on arginase I levels in SCD patients. Methods. Fifty adult SCD patients were enrolled in the study and 20 blood donors composed the control group. Arginase I was measured by ELISA. The βS haplotypes were identified by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). Statistical analyses were performed with the GraphPad Prism program and the significance level was p<0.05. Results. A significant increase in arginase I levels was observed in SCD patients compared to the control group (p<0.0001). Comparison of arginase I levels among the three haplotype groups showed differences for Bantu/Bantu × Bantu/Benin and for Bantu/Bantu × Benin/Benin, independent of hydroxyurea (HU) dosage. An inverse correlation between arginase I levels and HbF concentration was observed. Conclusion. The results support the hypothesis that arginase I is associated with HbF concentration, also measured indirectly by the association with haplotypes.

  16. Inheritance of the 8.1 ancestral haplotype in recurrent pregnancy loss

    DEFF Research Database (Denmark)

    Kolte, Astrid M; Nielsen, Henriette S; Steffensen, Rudi

    2015-01-01

    pleiotropy. It has also been proposed that the survival of long, conserved haplotypes may be due to gestational drive, i.e. selective miscarriage of fetuses who have not inherited the haplotype from a heterozygous mother. Recurrent pregnancy loss (RPL) is defined as three or more consecutive pregnancy losses....... The objective was to test the gestational drive theory for the 8.1AH in women with RPL and their live born children. METHODOLOGY: We investigated the inheritance of the 8.1AH from 82 heterozygous RPL women to 110 live born children. All participants were genotyped for HLA-A, -B and -DRB1 in DNA from EDTA......-treated blood or buccal swabs. Inheritance was compared with a Mendelian inheritance of 50% using a two-sided exact binomial test. RESULTS: We found that 55% of the live born children had inherited the 8.1AH, which was not significantly higher than the expected 50% (P = 0.29). Interestingly, we found a non...
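
    The two-sided exact binomial test mentioned above can be sketched with the standard library alone. The abstract gives 55% of 110 children but not the raw count, so the count of 61 used below is an assumption; the two-sided rule (summing all outcomes no more probable than the observed one) is one common convention and may differ from the software the authors used.

```python
from math import comb


def binom_two_sided(k, n, p=0.5):
    """Two-sided exact binomial p-value.

    Sums the probabilities of every outcome that is no more likely
    than the observed count k under Binomial(n, p).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so ties are not lost to rounding.
    threshold = pmf[k] * (1 + 1e-9)
    return min(1.0, sum(q for q in pmf if q <= threshold))


# ASSUMPTION: 61 of 110 children inherited the haplotype (about 55%);
# the abstract does not report the exact count.
p_value = binom_two_sided(61, 110, 0.5)
print(f"two-sided exact binomial p = {p_value:.3f}")
```

    Under that assumed count, the p-value lands in the neighborhood of the reported P = 0.29, consistent with no significant departure from Mendelian 50% transmission.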

  17. Nucleotide polymorphisms and haplotype diversity of RTCS gene in China elite maize inbred lines.

    Directory of Open Access Journals (Sweden)

    Enying Zhang

    Full Text Available The maize RTCS gene, encoding a LOB domain transcription factor, plays important roles in the initiation of embryonic seminal and postembryonic shoot-borne roots. In this study, the genomic sequences of this gene in 73 Chinese elite inbred lines, including 63 lines from 5 temperate heterotic groups and 10 tropical germplasms, were obtained, and the nucleotide polymorphisms and haplotype diversity were analyzed. A total of 63 sequence variants, including 44 SNPs and 19 indels, were identified at this locus, and most of them were located in the UTR and intron regions. The coding region of this gene across all tested inbred lines carried 14 haplotypes, which encoded 7 different RTCS proteins. Analysis of the polymorphic sites revealed that at least 6 recombination events have occurred. Among all 6 groups tested, only the P heterotic group had a much lower nucleotide diversity than the whole set, and selection analysis also revealed that only this group was under strong negative selection. However, the set of Huangzaosi and its derived lines possessed a higher nucleotide diversity than the whole set, and no selection signal was identified.
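
    Nucleotide diversity and haplotype counts of the kind reported here can be illustrated on a toy alignment. The sketch below assumes the standard definition of nucleotide diversity (average pairwise differences per site) over equal-length, gap-free sequences; the abstract does not state which software or indel handling the authors used, and the sequences are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi).

    seqs: list of aligned sequences of equal length, without gaps.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


def count_haplotypes(seqs):
    """Number of distinct sequences (haplotypes) in the sample."""
    return len(set(seqs))


# Toy aligned sequences for illustration only (NOT RTCS data).
toy_alignment = ["ACGT", "ACGA", "TCGA", "ACGT"]
print(nucleotide_diversity(toy_alignment), count_haplotypes(toy_alignment))
```

    Comparing this value between a subgroup and the whole set is the kind of contrast the abstract draws for the P heterotic group and the Huangzaosi-derived lines.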

  18. Characterizing Metastatic HER2-Positive Gastric Cancer at the CDH1 Haplotype

    Science.gov (United States)

    Caggiari, Laura; Miolo, Gianmaria; Buonadonna, Angela; Basile, Debora; Santeufemia, Davide A.; De Zorzi, Mariangela; Fornasarig, Mara; Alessandrini, Lara; Lo Re, Giovanni; Puglisi, Fabio; Steffan, Agostino

    2017-01-01

    The CDH1 gene, coding for the E-cadherin protein, is linked to gastric cancer (GC) susceptibility and tumor invasion. The human epidermal growth factor receptor 2 (HER2) is amplified and overexpressed in a portion of GC. HER2 is an established therapeutic target in metastatic GC (mGC). Trastuzumab, in combination with various chemotherapeutic agents, is a standard treatment for these tumors, leading to improved outcomes. Unfortunately, the survival benefit is limited to a fraction of patients. The aim of this study was to improve knowledge of the HER2 and E-cadherin alterations in the context of GC in order to characterize subtypes of patients that could better benefit from targeted therapy. An association was observed between the P7-CDH1 haplotype, which includes two polymorphisms (rs16260A-rs1801552T), and a subset of HER2-positive mGC with better prognosis. The results point to the potential of evaluating CDH1 haplotypes in mGC to stratify patients who will benefit from trastuzumab-based treatments. Moreover, the data may have implications for understanding the HER2 and E-cadherin interactions in vivo and in response to treatments. PMID:29295527

  19. Association analysis of calpain 10 gene variants/haplotypes with gestational diabetes mellitus among Mexican women.

    Science.gov (United States)

    Castro-Martínez, Anna Gabriela; Sánchez-Corona, José; Vázquez-Vargas, Adriana Patricia; García-Zapién, Alejandra Guadalupe; López-Quintero, Andres; Villalpando-Velazco, Héctor Javier; Flores-Martínez, Silvia Esperanza

    2018-02-28

    Gestational diabetes mellitus (GDM) is a metabolically complex disease with major genetic determinants. GDM has been associated with insulin resistance and dysfunction of pancreatic beta cells, so the GDM candidate genes are those that encode proteins modulating the function and secretion of insulin, such as that for calpain 10 (CAPN10). This study aimed to assess whether single nucleotide polymorphism (SNP)-43, SNP-44, SNP-63, and the indel-19 variant, and specific haplotypes of the CAPN10 gene were associated with gestational diabetes mellitus. We studied 116 patients with gestational diabetes mellitus and 83 women with normal glucose tolerance. Measurements of anthropometric and biochemical parameters were performed. SNP-43, SNP-44, and SNP-63 were identified by polymerase chain reaction (PCR)-restriction fragment length polymorphisms, while the indel-19 variant was detected by TaqMan qPCR assays. The allele, genotype, and haplotype frequencies of the four variants did not differ significantly between women with gestational diabetes mellitus and controls. However, among women with gestational diabetes mellitus, glucose levels were significantly higher in those bearing the 3R/3R genotype than in carriers of the 3R/2R genotype of the indel-19 variant (p = 0.006). In conclusion, the 3R/3R genotype of the indel-19 variant of the CAPN10 gene was associated with increased glucose levels in these Mexican women with gestational diabetes mellitus.

  20. Compound haplotypes at Xp11.23 and human population growth in Eurasia.

    Science.gov (United States)

    Alonso, S; Armour, J A L

    2004-09-01

    To investigate patterns of diversity and the evolutionary history of Eurasians, we have sequenced a 2.8 kb region at Xp11.23 in a sample of African and Eurasian chromosomes. This region is in a long intron of CLCN5 and is immediately flanked by a highly variable minisatellite, DXS255, and a human-specific Ta0 LINE. Compared to Africans, Eurasians showed a marked reduction in sequence diversity. The main Euro-Asiatic haplotype seems to be the ancestral haplotype for the whole sample. Coalescent simulations, including recombination and exponential growth, indicate a median length of strong linkage disequilibrium, up to approximately 9 kb for this area. The Ka/Ks ratio between the coding sequence of human CLCN5 and its mouse orthologue is much less than 1. This implies that the region sequenced is unlikely to be under the strong influence of positive selective processes on CLCN5, mutations in which have been associated with disorders such as Dent's disease. In contrast, a scenario based on a population bottleneck and exponential growth seems a more likely explanation for the reduced diversity observed in Eurasians. Coalescent analysis and linked minisatellite diversity (which reaches a gene diversity value greater than 98% in Eurasians) suggest an estimated age of origin of the Euro-Asiatic diversity compatible with a recent out-of-Africa model for colonization of Eurasia by modern Homo sapiens.