WorldWideScience

Sample records for haplotype inference problem

  1. Haplotype inference in general pedigrees with two sites

    Directory of Open Access Journals (Sweden)

    Doan Duong D

    2011-04-01

    Background: Genetic disease studies investigate relationships between changes in chromosomes and genetic diseases. Single haplotypes provide useful information for these studies, but extracting single haplotypes directly by biochemical methods is expensive. A computational method to infer haplotypes from genotype data is therefore important. We investigate the problem of computing the minimum number of recombination events for general pedigrees with two sites for all members. Results: We show that this NP-hard problem can be parametrically reduced to the Bipartization by Edge Removal problem and can therefore be solved by an O(2^k · n^2) exact algorithm, where n is the number of members and k is the number of recombination events. Conclusions: Our work can therefore be useful for genetic disease studies to track down how changes in haplotypes such as recombinations relate to genetic disease.

  2. A new mathematical modeling for pure parsimony haplotyping problem.

    Science.gov (United States)

    Feizabadi, R; Bagherian, M; Vaziri, H R; Salahi, M

    2016-11-01

    The pure parsimony haplotyping (PPH) problem is important in bioinformatics because rational haplotype inference plays an important role in the analysis of genetic data and in mapping complex genetic diseases such as Alzheimer's disease and heart disorders. Haplotypes and genotypes are m-length sequences. Although several integer programming models have already been presented for the PPH problem, its NP-hardness makes those models ineffective on real instances, especially instances with many heterozygous sites. In this paper, we assign a corresponding number to each haplotype and genotype and, based on those numbers, set up a mixed integer programming model. Using numbers instead of sequences makes the new model less complex than previous models, in that it contains neither constraints nor variables corresponding to heterozygous nucleotide sites. Experimental results confirm the efficiency of the new model in producing better solutions in comparison to two state-of-the-art haplotyping approaches. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. Grouping preprocess for haplotype inference from SNP and CNV data

    International Nuclear Information System (INIS)

    Shindo, Hiroyuki; Chigira, Hiroshi; Nagaoka, Tomoyo; Inoue, Masato; Kamatani, Naoyuki

    2009-01-01

    The method of statistical haplotype inference is an indispensable technique in the field of medical science. The authors previously reported Hardy-Weinberg equilibrium-based haplotype inference that could manage single nucleotide polymorphism (SNP) data. We recently extended the method to cover copy number variation (CNV) data. Haplotype inference from mixed data is important because SNPs and CNVs are occasionally in linkage disequilibrium. The idea underlying the proposed method is simple, but the algorithm for it needs to be quite elaborate to reduce the calculation cost. Consequently, we have focused on the details of the algorithm in this study. Although the main advantage of the method is accuracy, in that it does not use any approximation, its main disadvantage is still the calculation cost, which is sometimes intractable for large data sets with missing values.

  4. Grouping preprocess for haplotype inference from SNP and CNV data

    Energy Technology Data Exchange (ETDEWEB)

    Shindo, Hiroyuki; Chigira, Hiroshi; Nagaoka, Tomoyo; Inoue, Masato [Department of Electrical Engineering and Bioscience, School of Advanced Science and Engineering, Waseda University, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555 (Japan)]; Kamatani, Naoyuki, E-mail: masato.inoue@eb.waseda.ac.j [Institute of Rheumatology, Tokyo Women's Medical University, 10-22, Kawada-cho, Shinjuku-ku, Tokyo 162-0054 (Japan)]

    2009-12-01

    The method of statistical haplotype inference is an indispensable technique in the field of medical science. The authors previously reported Hardy-Weinberg equilibrium-based haplotype inference that could manage single nucleotide polymorphism (SNP) data. We recently extended the method to cover copy number variation (CNV) data. Haplotype inference from mixed data is important because SNPs and CNVs are occasionally in linkage disequilibrium. The idea underlying the proposed method is simple, but the algorithm for it needs to be quite elaborate to reduce the calculation cost. Consequently, we have focused on the details of the algorithm in this study. Although the main advantage of the method is accuracy, in that it does not use any approximation, its main disadvantage is still the calculation cost, which is sometimes intractable for large data sets with missing values.

  5. A unified framework for haplotype inference in nuclear families.

    Science.gov (United States)

    Iliadis, Alexandros; Anastassiou, Dimitris; Wang, Xiaodong

    2012-07-01

    Many large genome-wide association studies include nuclear families with more than one child (trio families), allowing for analysis of differences between siblings (sib pair analysis). Statistical power can be increased when haplotypes are used instead of genotypes. Currently, haplotype inference in families with more than one child can be performed either using the familial information or statistical information derived from the population samples but not both. Building on our recently proposed tree-based deterministic framework (TDS) for trio families, we augment its applicability to general nuclear families. We impose a minimum recombinant approach locally and independently on each multiple children family, while resorting to the population-derived information to solve the remaining ambiguities. Thus our framework incorporates all available information (familial and population) in a given study. We demonstrate that using all the constraints in our approach we can have gains in the accuracy as opposed to breaking the multiple children families to separate trios and resorting to a trio inference algorithm or phasing each family in isolation. We believe that our proposed framework could be the method of choice for haplotype inference in studies that include nuclear families with multiple children. Our software (tds2.0) is downloadable from www.ee.columbia.edu/∼anastas/tds. © 2012 The Authors Annals of Human Genetics © 2012 Blackwell Publishing Ltd/University College London.

  6. Haplotyping Problem, A Clustering Approach

    International Nuclear Information System (INIS)

    Eslahchi, Changiz; Sadeghi, Mehdi; Pezeshk, Hamid; Kargar, Mehdi; Poormohammadi, Hadi

    2007-01-01

    Construction of two haplotypes from a set of Single Nucleotide Polymorphism (SNP) fragments is called the haplotype reconstruction problem. One of the most popular computational models for this problem is Minimum Error Correction (MEC). Since MEC is an NP-hard problem, here we propose a novel heuristic algorithm based on clustering analysis in data mining for the haplotype reconstruction problem. Based on Hamming distance and similarity between two fragments, our iterative algorithm produces two clusters of fragments; then, in each iteration, the algorithm assigns a fragment to one of the clusters. Our results suggest that the algorithm has a lower reconstruction error rate than other algorithms.
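
    To make the MEC objective mentioned above concrete, the following is a minimal Python sketch that scores a given two-cluster partition of fragments. It is not the authors' clustering heuristic; the encoding ('0'/'1' alleles, '-' for an uncovered site) and the function names `hamming`, `consensus` and `mec_cost` are assumptions made for illustration.

```python
from collections import Counter

def hamming(a, b):
    """Count sites covered by both fragments ('-' = gap) where they disagree."""
    return sum(1 for x, y in zip(a, b) if x != '-' and y != '-' and x != y)

def consensus(fragments):
    """Majority allele per SNP column; '-' where no fragment covers the column."""
    out = []
    for column in zip(*fragments):
        counts = Counter(c for c in column if c != '-')
        out.append(counts.most_common(1)[0][0] if counts else '-')
    return ''.join(out)

def mec_cost(cluster_a, cluster_b):
    """Number of allele corrections needed so that every fragment agrees
    with the consensus haplotype of its own cluster."""
    total = 0
    for cluster in (cluster_a, cluster_b):
        if cluster:
            h = consensus(cluster)
            total += sum(hamming(f, h) for f in cluster)
    return total

# Hypothetical toy data: five SNP sites, two clusters of aligned fragments.
fragments_a = ["0101-", "01-10", "0111-", "-1010"]
fragments_b = ["1010-", "-0101"]
print(consensus(fragments_a), consensus(fragments_b), mec_cost(fragments_a, fragments_b))
```

    A clustering heuristic of the kind described would search over partitions to make this cost small; the sketch only evaluates one fixed partition.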

  7. Inference of haplotypic phase and missing genotypes in polyploid organisms and variable copy number genomic regions

    Directory of Open Access Journals (Sweden)

    Balding David J

    2008-12-01

    Background: The power of haplotype-based methods for association studies, identification of regions under selection, and ancestral inference is well-established for diploid organisms. For polyploids, however, the difficulty of determining phase has limited such approaches. Polyploidy is common in plants and is also observed in animals. Partial polyploidy is sometimes observed in humans (e.g. trisomy 21; Down's syndrome), and it arises more frequently in some human tissues. Local changes in ploidy, known as copy number variations (CNV), arise throughout the genome. Here we present a method, implemented in the software polyHap, for the inference of haplotype phase and missing observations from polyploid genotypes. PolyHap allows each individual to have a different ploidy, but ploidy cannot vary over the genomic region analysed. It employs a hidden Markov model (HMM) and a sampling algorithm to infer haplotypes jointly in multiple individuals and to obtain a measure of uncertainty in its inferences. Results: In the simulation study, we combine real haplotype data to create artificial diploid, triploid, and tetraploid genotypes, and use these to demonstrate that polyHap performs well, in terms of both switch error rate in recovering phase and imputation error rate for missing genotypes. To our knowledge, there is no comparable software for phasing a large, densely genotyped region of chromosome from triploids and tetraploids, while for diploids we found polyHap to be more accurate than fastPhase. We also compare the results of polyHap to SATlotyper on an experimentally haplotyped tetraploid dataset of 12 SNPs, and show that polyHap is more accurate. Conclusion: With the availability of large SNP data in polyploids and CNV regions, we believe that polyHap, our proposed method for inferring haplotypic phase from genotype data, will be useful in enabling researchers analysing such data to exploit the power of haplotype-based analyses.

  8. Modeling coverage gaps in haplotype frequencies via Bayesian inference to improve stem cell donor selection.

    Science.gov (United States)

    Louzoun, Yoram; Alter, Idan; Gragert, Loren; Albrecht, Mark; Maiers, Martin

    2018-05-01

    Regardless of sampling depth, accurate genotype imputation is limited in regions of high polymorphism which often have a heavy-tailed haplotype frequency distribution. Many rare haplotypes are thus unobserved. Statistical methods to improve imputation by extending reference haplotype distributions using linkage disequilibrium patterns that relate allele and haplotype frequencies have not yet been explored. In the field of unrelated stem cell transplantation, imputation of highly polymorphic human leukocyte antigen (HLA) genes has an important application in identifying the best-matched stem cell donor when searching large registries totaling over 28,000,000 donors worldwide. Despite these large registry sizes, a significant proportion of searched patients present novel HLA haplotypes. Supporting this observation, HLA population genetic models have indicated that many extant HLA haplotypes remain unobserved. The absent haplotypes are a significant cause of error in haplotype matching. We have applied a Bayesian inference methodology for extending haplotype frequency distributions, using a model where new haplotypes are created by recombination of observed alleles. Applications of this joint probability model offer significant improvement in frequency distribution estimates over the best existing alternative methods, as we illustrate using five-locus HLA frequency data from the National Marrow Donor Program registry. Transplant matching algorithms and disease association studies involving phasing and imputation of rare variants may benefit from this statistical inference framework.

  9. Honey bee-inspired algorithms for SNP haplotype reconstruction problem

    Science.gov (United States)

    PourkamaliAnaraki, Maryam; Sadeghi, Mehdi

    2016-03-01

    Reconstructing haplotypes from SNP fragments is an important problem in computational biology. There has been a lot of interest in this field because haplotypes have been shown to contain promising data for disease association research. Haplotype reconstruction in the Minimum Error Correction model has been proved to be an NP-hard problem. Therefore, several methods such as clustering techniques, evolutionary algorithms, neural networks and swarm intelligence approaches have been proposed in order to solve this problem in reasonable time. In this paper, we focus on various evolutionary clustering techniques and try to find an efficient technique for solving the haplotype reconstruction problem. It can be inferred from our experiments that the clustering methods relying on the behaviour of honey bee colonies in nature, specifically the bees algorithm and artificial bee colony methods, are expected to result in more efficient solutions. An application program of the methods is available at the following link. http://www.bioinf.cs.ipm.ir/software/haprs/

  10. Are molecular haplotypes worth the time and expense? A cost-effective method for applying molecular haplotypes.

    Directory of Open Access Journals (Sweden)

    Mark A Levenstien

    2006-08-01

    Because current molecular haplotyping methods are expensive and not amenable to automation, many researchers rely on statistical methods to infer haplotype pairs from multilocus genotypes, and subsequently treat these inferred haplotype pairs as observations. These procedures are prone to haplotype misclassification. We examine the effect of these misclassification errors on the false-positive rate and power for two association tests. These tests include the standard likelihood ratio test (LRTstd) and a likelihood ratio test that employs a double-sampling approach to allow for the misclassification inherent in the haplotype inference procedure (LRTae). We aim to determine the cost-benefit relationship of increasing the proportion of individuals with molecular haplotype measurements in addition to genotypes to raise the power gain of the LRTae over the LRTstd. This analysis should provide a guideline for determining the minimum number of molecular haplotypes required for desired power. Our simulations under the null hypothesis of equal haplotype frequencies in cases and controls indicate that (1) for each statistic, permutation methods maintain the correct type I error; (2) specific multilocus genotypes that are misclassified as the incorrect haplotype pair are consistently misclassified throughout each entire dataset; and (3) our simulations under the alternative hypothesis showed a significant power gain for the LRTae over the LRTstd for a subset of the parameter settings. Permutation methods should be used exclusively to determine significance for each statistic. For fixed cost, the power gain of the LRTae over the LRTstd varied depending on the relative costs of genotyping, molecular haplotyping, and phenotyping. The LRTae showed the greatest benefit over the LRTstd when the cost of phenotyping was very high relative to the cost of genotyping. This situation is likely to occur in a replication study as opposed to a whole-genome association study.

  11. A Dynamic Programming Algorithm for the k-Haplotyping Problem

    Institute of Scientific and Technical Information of China (English)

    Zhen-ping Li; Ling-yun Wu; Yu-ying Zhao; Xiang-sun Zhang

    2006-01-01

    The Minimum Fragments Removal (MFR) problem is one of the haplotyping problems: given a set of fragments, remove the minimum number of fragments so that the resulting fragments can be partitioned into k classes of non-conflicting subsets. In this paper, we formulate the k-MFR problem as an integer linear programming problem, and develop a dynamic programming approach to solve the k-MFR problem for both the gapless and gap cases.
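
    The k-MFR definition above can be illustrated with a small brute-force Python sketch. This is exhaustive search for toy inputs only, not the integer linear programming formulation or the dynamic programming algorithm of the paper; the fragment encoding ('0'/'1' alleles, '-' for a gap) and all function names are assumptions made for illustration.

```python
from itertools import combinations, product

def conflict(f1, f2):
    """Two fragments conflict if they disagree at a site both of them cover."""
    return any(a != '-' and b != '-' and a != b for a, b in zip(f1, f2))

def partitionable(fragments, k):
    """True if the fragments can be split into k classes with no conflict
    inside any class (tries every assignment of fragments to classes)."""
    n = len(fragments)
    for labels in product(range(k), repeat=n):
        if all(labels[i] != labels[j] or not conflict(fragments[i], fragments[j])
               for i, j in combinations(range(n), 2)):
            return True
    return False

def min_fragments_removed(fragments, k):
    """Smallest number of fragments to remove, plus one such removal set."""
    n = len(fragments)
    for r in range(n + 1):
        for removed in combinations(range(n), r):
            kept = [f for i, f in enumerate(fragments) if i not in removed]
            if partitionable(kept, k):
                return r, [fragments[i] for i in removed]

# Hypothetical toy instance with k = 2 classes (two haplotypes).
toy = ["000", "111", "0-0", "1-1", "010"]
print(min_fragments_removed(toy, 2))
```

    The search is exponential in the number of fragments, which is why practical approaches such as the dynamic programming method described above are needed for real data.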

  12. Inference rule and problem solving

    Energy Technology Data Exchange (ETDEWEB)

    Goto, S

    1982-04-01

    Intelligent information processing means having human intellectual activity carried out on a computer, with inference, rather than ordinary calculation, as the basic operational mechanism. Many inference rules are derived from syllogisms in formal logic. The problem of programming this inference function is referred to as problem solving. Although inference and problem solving are logically closely related, the computational ability of current computers is low when it comes to inference. To clarify the relation between inference and computers, nonmonotonic logic has been considered. The paper deals with the above topics. 16 references.

  13. Detecting structure of haplotypes and local ancestry

    Science.gov (United States)

    We present a two-layer hidden Markov model to detect the structure of haplotypes for unrelated individuals. This allows us to model two scales of linkage disequilibrium (one within a group of haplotypes and one between groups), thereby taking advantage of rich haplotype information to infer local an...

  14. A class representative model for Pure Parsimony Haplotyping under uncertain data.

    Directory of Open Access Journals (Sweden)

    Daniele Catanzaro

    The Pure Parsimony Haplotyping (PPH) problem is an NP-hard combinatorial optimization problem that consists of finding the minimum number of haplotypes necessary to explain a given set of genotypes. PPH has attracted more and more attention in recent years due to its importance in the analysis of many fine-scale genetic data. Its application fields range from mapping complex disease genes to inferring population histories, passing through designing drugs, functional genomics and pharmacogenetics. In this article we investigate, for the first time, a recent version of PPH called the Pure Parsimony Haplotype problem under Uncertain Data (PPH-UD). This version mainly arises when the input genotypes are not accurate, i.e., when some single nucleotide polymorphisms are missing or affected by errors. We propose an exact approach to the solution of PPH-UD based on an extended version of the class representative model for PPH of Catanzaro et al. [1], currently the state-of-the-art integer programming model for PPH. The model is efficient, accurate, compact, polynomial-sized, easy to implement, solvable with any solver for mixed integer programming, and usable in all those cases for which the parsimony criterion is well suited for haplotype estimation.
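
    The PPH definition above (find the fewest haplotypes that explain all genotypes) can be sketched for tiny inputs by exhaustive search in Python. This is not the class representative model or any integer programming formulation; the encoding (haplotype sites '0'/'1', genotype sites '0'/'1' for homozygous and '2' for heterozygous) and the function names are assumptions made for illustration, and the sketch does not handle the missing or erroneous sites of PPH-UD.

```python
from itertools import combinations, combinations_with_replacement, product

def combine(h1, h2):
    """Genotype produced by a haplotype pair: '2' marks a heterozygous site."""
    return ''.join(a if a == b else '2' for a, b in zip(h1, h2))

def explains(haplotypes, genotype):
    """True if some pair from the set (a haplotype may pair with itself)
    produces the genotype."""
    return any(combine(h1, h2) == genotype
               for h1, h2 in combinations_with_replacement(haplotypes, 2))

def pure_parsimony(genotypes):
    """Smallest haplotype set explaining every genotype (exponential search)."""
    m = len(genotypes[0])
    candidates = [''.join(bits) for bits in product('01', repeat=m)]
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            if all(explains(subset, g) for g in genotypes):
                return list(subset)
    return []

# Hypothetical toy instance with three sites.
print(pure_parsimony(["022", "002", "020"]))
```

    The number of candidate haplotypes doubles with each site, so this search is only usable as an illustration of the objective.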

  15. Fundamental problem of forensic mathematics--the evidential value of a rare haplotype.

    Science.gov (United States)

    Brenner, Charles H

    2010-10-01

    Y-chromosomal and mitochondrial haplotyping offer special advantages for criminal (and other) identification. For different reasons, each of them is sometimes detectable in a crime stain for which autosomal typing fails. But they also present special problems, including a fundamental mathematical one: When a rare haplotype is shared between suspect and crime scene, how strong is the evidence linking the two? Assume a reference population sample is available which contains n-1 haplotypes. The most interesting situation as well as the most common one is that the crime scene haplotype was never observed in the population sample. The traditional tools of product rule and sample frequency are not useful when there are no components to multiply and the sample frequency is zero. A useful statistic is the fraction κ of the population sample that consists of "singletons" - of once-observed types. A simple argument shows that the probability for a random innocent suspect to match a previously unobserved crime scene type is (1-κ)/n - distinctly less than 1/n, likely ten times less. The robust validity of this model is confirmed by testing it against a range of population models. This paper hinges above all on one key insight: probability is not frequency. The common but erroneous "frequency" approach adopts population frequency as a surrogate for matching probability and attempts the intractable problem of guessing how many instances exist of the specific haplotype at a certain crime. Probability, by contrast, depends by definition only on the available data. Hence if different haplotypes but with the same data occur in two different crimes, although the frequencies are different (and are hopelessly elusive), the matching probabilities are the same, and are not hard to find. Copyright © 2009 Elsevier Ireland Ltd. All rights reserved.
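
    The arithmetic behind the (1-κ)/n statement above can be worked through in a few lines of Python. This is a sketch under stated assumptions: κ is computed here from the reference sample of n-1 haplotypes exactly as the abstract words it (whether the crime scene type is also counted is a detail the abstract does not settle), and the sample data and function names are hypothetical.

```python
from collections import Counter

def singleton_fraction(sample):
    """Fraction kappa of the sample made up of types observed exactly once."""
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)

def match_probability(kappa, n):
    """Probability (1 - kappa) / n that a random innocent suspect matches a
    previously unobserved crime scene type, as stated in the abstract."""
    return (1 - kappa) / n

# Hypothetical reference sample with n - 1 = 9 haplotypes, so n = 10.
sample = ["A", "A", "A", "B", "B", "C", "D", "E", "F"]
n = len(sample) + 1
kappa = singleton_fraction(sample)
print(kappa, match_probability(kappa, n), 1 / n)
```

    In this toy sample four of the nine haplotypes are singletons, so the matching probability is (5/9)/10, a little more than half of the naive 1/n; the larger κ is, the further the estimate drops below 1/n.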

  16. A human genome-wide library of local phylogeny predictions for whole-genome inference problems

    Directory of Open Access Journals (Sweden)

    Schwartz Russell

    2008-08-01

    Background: Many common inference problems in computational genetics depend on inferring aspects of the evolutionary history of a data set given a set of observed modern sequences. Detailed predictions of the full phylogenies are therefore of value in improving our ability to make further inferences about population history and sources of genetic variation. Making phylogenetic predictions on the scale needed for whole-genome analysis is, however, extremely computationally demanding. Results: In order to facilitate phylogeny-based predictions on a genomic scale, we develop a library of maximum parsimony phylogenies within local regions spanning all autosomal human chromosomes based on Haplotype Map variation data. We demonstrate the utility of this library for population genetic inferences by examining a tree statistic we call 'imperfection,' which measures the reuse of variant sites within a phylogeny. This statistic is significantly predictive of recombination rate, shows additional regional and population-specific conservation, and allows us to identify outlier genes likely to have experienced unusual amounts of variation in recent human history. Conclusion: Recent theoretical advances in algorithms for phylogenetic tree reconstruction have made it possible to perform large-scale inferences of local maximum parsimony phylogenies from single nucleotide polymorphism (SNP) data. As results from the imperfection statistic demonstrate, phylogeny predictions encode substantial information useful for detecting genomic features and population history. This data set should serve as a platform for many kinds of inferences one may wish to make about human population history and genetic variation.

  17. iHAP – integrated haplotype analysis pipeline for characterizing the haplotype structure of genes

    Directory of Open Access Journals (Sweden)

    Lim Yun Ping

    2006-12-01

    Background: The advent of genotype data from large-scale efforts that catalog the genetic variants of different populations has given rise to new avenues for multifactorial disease association studies. Recent work shows that genotype data from the International HapMap Project have a high degree of transferability to the wider population. This implies that the design of genotyping studies on local populations may be facilitated through inferences drawn from information contained in HapMap populations. Results: To facilitate analysis of HapMap data for characterizing the haplotype structure of genes or any chromosomal regions, we have developed an integrated web-based resource, iHAP. In addition to incorporating genotype and haplotype data from the International HapMap Project and gene information from the UCSC Genome Browser Database, iHAP also provides capabilities for inferring haplotype blocks and selecting tag SNPs that are representative of haplotype patterns. These include block partitioning algorithms, block definitions, tag SNP definitions, as well as SNPs to be "force included" as tags. Based on the parameters defined at the input stage, iHAP performs on-the-fly analysis and displays the result graphically as a webpage. To facilitate analysis, intermediate and final result files can be downloaded. Conclusion: The iHAP resource, available at http://ihap.bii.a-star.edu.sg, provides a convenient yet flexible approach for the user community to analyze HapMap data and identify candidate targets for genotyping studies.

  18. Haplotype phasing and inheritance of copy number variants in nuclear families.

    Science.gov (United States)

    Palta, Priit; Kaplinski, Lauris; Nagirnaja, Liina; Veidenberg, Andres; Möls, Märt; Nelis, Mari; Esko, Tõnu; Metspalu, Andres; Laan, Maris; Remm, Maido

    2015-01-01

    DNA copy number variants (CNVs) that alter the copy number of a particular DNA segment in the genome play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk to a variety of human diseases, thus highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of different haplotypes present within the CNV regions. We have developed a novel computational method, called PiCNV, which resolves the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm makes it possible to i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV made it possible to unambiguously phase normal and CNV-carrying haplotypes and follow their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified an emergence of several de novo deletions and duplications in the offspring.

  19. Haplotype phasing and inheritance of copy number variants in nuclear families.

    Directory of Open Access Journals (Sweden)

    Priit Palta

    DNA copy number variants (CNVs) that alter the copy number of a particular DNA segment in the genome play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk to a variety of human diseases, thus highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of different haplotypes present within the CNV regions. We have developed a novel computational method, called PiCNV, which resolves the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm makes it possible to i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV made it possible to unambiguously phase normal and CNV-carrying haplotypes and follow their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified an emergence of several de novo deletions and duplications in the offspring.

  20. Haplotype association analysis of human disease traits using genotype data of unrelated individuals

    DEFF Research Database (Denmark)

    Tan, Qihua; Christiansen, Lene; Christensen, Kaare

    2005-01-01

    unphased multi-locus genotype data, ranging from the early approach by the simple gene-counting method to the recent work using the generalized linear model. However, these methods are either confined to case – control design or unable to yield unbiased point and interval estimates of haplotype effects....... Based on the popular logistic regression model, we present a new approach for haplotype association analysis of human disease traits. Using haplotype-based parameterization, our model infers the effects of specific haplotypes (point estimation) and constructs confidence interval for the risks...... on the well-known logistic regression model is a useful tool for haplotype association analysis of human disease traits....

  1. Problem solving and inference mechanisms

    Energy Technology Data Exchange (ETDEWEB)

    Furukawa, K; Nakajima, R; Yonezawa, A; Goto, S; Aoyama, A

    1982-01-01

    The heart of the fifth generation computer will be powerful mechanisms for problem solving and inference. A deduction-oriented language is to be designed, which will form the core of the whole computing system. The language is based on predicate logic with the extended features of structuring facilities, meta structures and relational data base interfaces. Parallel computation mechanisms and specialized hardware architectures are being investigated to make possible efficient realization of the language features. The project includes research into an intelligent programming system, a knowledge representation language and system, and a meta inference system to be built on the core. 30 references.

  2. A general approach for haplotype phasing across the full spectrum of relatedness.

    Directory of Open Access Journals (Sweden)

    Jared O'Connell

    2014-04-01

    Many existing cohorts contain a range of relatedness between genotyped individuals, either by design or by chance. Haplotype estimation in such cohorts is a central step in many downstream analyses. Using genotypes from six cohorts from isolated populations and two cohorts from non-isolated populations, we have investigated the performance of different phasing methods designed for nominally 'unrelated' individuals. We find that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations. In particular, when large amounts of IBD sharing are present, SHAPEIT2 infers close to perfect haplotypes. Based on these results we have developed a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals. First, SHAPEIT2 is run ignoring all explicit family information. We then apply a novel HMM method (duoHMM) to combine the SHAPEIT2 haplotypes with any family information to infer the inheritance pattern of each meiosis at all sites across each chromosome. This allows the correction of switch errors and the detection of recombination events and genotyping errors. We show that the method detects numbers of recombination events that align very well with expectations based on genetic maps, and that it infers far fewer spurious recombination events than Merlin. The method can also detect genotyping errors and infer recombination events in otherwise uninformative families, such as trios and duos. The detected recombination events can be used in association scans for recombination phenotypes. The method provides a simple and unified approach to haplotype estimation that will be of interest to researchers in the fields of human, animal and plant genetics.

  3. Exact algorithms for haplotype assembly from whole-genome sequence data.

    Science.gov (United States)

    Chen, Zhi-Zhong; Deng, Fei; Wang, Lusheng

    2013-08-15

    Haplotypes play a crucial role in genetic analysis and have many applications such as gene disease diagnoses, association studies, ancestry inference and so forth. The development of DNA sequencing technologies makes it possible to obtain haplotypes from a set of aligned reads originated from both copies of a chromosome of a single individual. This approach is often known as haplotype assembly. Exact algorithms that can give optimal solutions to the haplotype assembly problem are highly demanded. Unfortunately, previous algorithms for this problem either fail to output optimal solutions or take too long time even executed on a PC cluster. We develop an approach to finding optimal solutions for the haplotype assembly problem under the minimum-error-correction (MEC) model. Most of the previous approaches assume that the columns in the input matrix correspond to (putative) heterozygous sites. This all-heterozygous assumption is correct for most columns, but it may be incorrect for a small number of columns. In this article, we consider the MEC model with or without the all-heterozygous assumption. In our approach, we first use new methods to decompose the input read matrix into small independent blocks and then model the problem for each block as an integer linear programming problem, which is then solved by an integer linear programming solver. We have tested our program on a single PC [a Linux (x64) desktop PC with i7-3960X CPU], using the filtered HuRef and the NA 12878 datasets (after applying some variant calling methods). With the all-heterozygous assumption, our approach can optimally solve the whole HuRef data set within a total time of 31 h (26 h for the most difficult block of the 15th chromosome and only 5 h for the other blocks). To our knowledge, this is the first time that MEC optimal solutions are completely obtained for the filtered HuRef dataset. Moreover, in the general case (without the all-heterozygous assumption), for the HuRef dataset our
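
    To make the minimum-error-correction (MEC) objective concrete, here is a minimal brute-force sketch under the all-heterozygous assumption. It is not the block decomposition plus integer linear programming approach the abstract describes; the read encoding (0/1 alleles, None for uncovered sites) and all function names are assumptions for illustration, and the search is only feasible for a handful of sites.

```python
from itertools import product


def distance(read, hap):
    """Number of covered sites where the read disagrees with the haplotype."""
    return sum(1 for r, h in zip(read, hap) if r is not None and r != h)


def mec_cost(reads, hap):
    """MEC cost of a haplotype and its complement (all-heterozygous case).

    Each read is assigned to whichever of the two haplotypes it fits best;
    the cost is the total number of read entries that must be corrected.
    """
    comp = [1 - a for a in hap]
    return sum(min(distance(r, hap), distance(r, comp)) for r in reads)


def brute_force_mec(reads):
    """Try every haplotype over the sites and keep one with minimal cost."""
    m = len(reads[0])
    best = min(product((0, 1), repeat=m), key=lambda h: mec_cost(reads, h))
    return list(best), mec_cost(reads, best)


# Toy read matrix: rows are reads, columns are heterozygous sites.
reads = [
    [0, 0, None],
    [0, 0, 1],
    [1, 1, 0],
    [None, 1, 0],
    [0, 1, 1],
]
best_hap, best_cost = brute_force_mec(reads)
```

    The exhaustive search grows as 2^m in the number of sites m, which is why exact methods for real data rely on decomposition and dedicated solvers.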

  4. Determination of haplotypes at structurally complex regions using emulsion haplotype fusion PCR.

    Science.gov (United States)

    Tyson, Jess; Armour, John A L

    2012-12-11

    Genotyping and massively-parallel sequencing projects result in a vast amount of diploid data that is only rarely resolved into its constituent haplotypes. It is nevertheless this phased information that is transmitted from one generation to the next and is most directly associated with biological function and the genetic causes of biological effects. Despite progress made in genome-wide sequencing and phasing algorithms and methods, problems assembling (and reconstructing linear haplotypes in) regions of repetitive DNA and structural variation remain. These dynamic and structurally complex regions are often poorly understood from a sequence point of view. Regions such as these that are highly similar in their sequence tend to be collapsed onto the genome assembly. This in turn means downstream determination of the true sequence haplotype in these regions poses a particular challenge. For structurally complex regions, a more focussed approach to assembling haplotypes may be required. In order to investigate reconstruction of spatial information at structurally complex regions, we have used an emulsion haplotype fusion PCR approach to reproducibly link sequences of up to 1kb in length to allow phasing of multiple variants from neighbouring loci, using allele-specific PCR and sequencing to detect the phase. By using emulsion systems linking flanking regions to amplicons within the CNV, this led to the reconstruction of a 59kb haplotype across the DEFA1A3 CNV in HapMap individuals. This study has demonstrated a novel use for emulsion haplotype fusion PCR in addressing the issue of reconstructing structural haplotypes at multiallelic copy variable regions, using the DEFA1A3 locus as an example.

  5. Determination of haplotypes at structurally complex regions using emulsion haplotype fusion PCR

    Directory of Open Access Journals (Sweden)

    Tyson Jess

    2012-12-01

    Full Text Available Background: Genotyping and massively-parallel sequencing projects result in a vast amount of diploid data that is only rarely resolved into its constituent haplotypes. It is nevertheless this phased information that is transmitted from one generation to the next and is most directly associated with biological function and the genetic causes of biological effects. Despite progress made in genome-wide sequencing and phasing algorithms and methods, problems assembling (and reconstructing linear haplotypes in) regions of repetitive DNA and structural variation remain. These dynamic and structurally complex regions are often poorly understood from a sequence point of view. Regions such as these that are highly similar in their sequence tend to be collapsed onto the genome assembly. This in turn means downstream determination of the true sequence haplotype in these regions poses a particular challenge. For structurally complex regions, a more focussed approach to assembling haplotypes may be required. Results: In order to investigate reconstruction of spatial information at structurally complex regions, we have used an emulsion haplotype fusion PCR approach to reproducibly link sequences of up to 1kb in length to allow phasing of multiple variants from neighbouring loci, using allele-specific PCR and sequencing to detect the phase. By using emulsion systems linking flanking regions to amplicons within the CNV, this led to the reconstruction of a 59kb haplotype across the DEFA1A3 CNV in HapMap individuals. Conclusion: This study has demonstrated a novel use for emulsion haplotype fusion PCR in addressing the issue of reconstructing structural haplotypes at multiallelic copy variable regions, using the DEFA1A3 locus as an example.

  6. Approximation properties of haplotype tagging

    Directory of Open Access Journals (Sweden)

    Dreiseitl Stephan

    2006-01-01

    Full Text Available Background: Single nucleotide polymorphisms (SNPs) are locations at which the genomic sequences of population members differ. Since these differences are known to follow patterns, disease association studies are facilitated by identifying SNPs that allow the unique identification of such patterns. This process, known as haplotype tagging, is formulated as a combinatorial optimization problem and analyzed in terms of complexity and approximation properties. Results: It is shown that the tagging problem is NP-hard but approximable within 1 + ln((n^2 - n)/2) for n haplotypes but not approximable within (1 - ε) ln(n/2) for any ε > 0 unless NP ⊂ DTIME(n^(log log n)). A simple, very easily implementable algorithm that exhibits the above upper bound on solution quality is presented. This algorithm has running time O((2m - p + 1)) ≤ O(m(n^2 - n)/2), where p ≤ min(n, m), for n haplotypes of size m. As we show that the approximation bound is asymptotically tight, the algorithm presented is optimal with respect to this asymptotic bound. Conclusion: The haplotype tagging problem is hard, but approachable with a fast, practical, and surprisingly simple algorithm that cannot be significantly improved upon on a single processor machine. Hence, significant improvement in computational efforts expended can only be expected if the computational effort is distributed and done in parallel.
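
    A bound of the form 1 + ln((n^2 - n)/2) is what a greedy set-cover heuristic gives when the items to cover are the (n^2 - n)/2 pairs of haplotypes. The sketch below shows such a greedy selection of tag SNPs. Whether it matches the authors' algorithm step for step is an assumption, and the function name and string encoding of haplotypes are illustrative only.

```python
from itertools import combinations


def greedy_tag_snps(haplotypes):
    """Greedily pick SNP positions until every pair of distinct haplotypes
    differs at one or more of the chosen positions.

    haplotypes: equal-length sequences (here strings of '0'/'1').
    Returns the chosen positions in the order they were selected.
    """
    n = len(haplotypes)
    m = len(haplotypes[0])
    # Pairs still to be told apart; identical haplotypes can never be separated.
    uncovered = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if haplotypes[i] != haplotypes[j]
    }
    chosen = []
    while uncovered:
        # Number of still-uncovered pairs that SNP s would separate.
        def gain(s):
            return sum(
                1 for i, j in uncovered if haplotypes[i][s] != haplotypes[j][s]
            )

        best = max(range(m), key=gain)
        chosen.append(best)
        # Keep only the pairs that the new SNP does not separate.
        uncovered = {
            (i, j)
            for i, j in uncovered
            if haplotypes[i][best] == haplotypes[j][best]
        }
    return chosen


haps = ["0000", "0011", "1100", "1111"]
tags = greedy_tag_snps(haps)
projected = {tuple(h[s] for s in tags) for h in haps}
```

    In this toy input, positions 0 and 1 carry the same information, as do positions 2 and 3, so two tag SNPs suffice to tell all four haplotypes apart.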

  7. Shrinkage Estimators for Robust and Efficient Inference in Haplotype-Based Case-Control Studies

    KAUST Repository

    Chen, Yi-Hau; Chatterjee, Nilanjan; Carroll, Raymond J.

    2009-01-01

    Case-control association studies often aim to investigate the role of genes and gene-environment interactions in terms of the underlying haplotypes (i.e., the combinations of alleles at multiple genetic loci along chromosomal regions). The goal of this article is to develop robust but efficient approaches to the estimation of disease odds-ratio parameters associated with haplotypes and haplotype-environment interactions. We consider "shrinkage" estimation techniques that can adaptively relax the model assumptions of Hardy-Weinberg-Equilibrium and gene-environment independence required by recently proposed efficient "retrospective" methods. Our proposal involves first development of a novel retrospective approach to the analysis of case-control data, one that is robust to the nature of the gene-environment distribution in the underlying population. Next, it involves shrinkage of the robust retrospective estimator toward a more precise, but model-dependent, retrospective estimator using novel empirical Bayes and penalized regression techniques. Methods for variance estimation are proposed based on asymptotic theories. Simulations and two data examples illustrate both the robustness and efficiency of the proposed methods.

  8. Shrinkage Estimators for Robust and Efficient Inference in Haplotype-Based Case-Control Studies

    KAUST Repository

    Chen, Yi-Hau

    2009-03-01

    Case-control association studies often aim to investigate the role of genes and gene-environment interactions in terms of the underlying haplotypes (i.e., the combinations of alleles at multiple genetic loci along chromosomal regions). The goal of this article is to develop robust but efficient approaches to the estimation of disease odds-ratio parameters associated with haplotypes and haplotype-environment interactions. We consider "shrinkage" estimation techniques that can adaptively relax the model assumptions of Hardy-Weinberg-Equilibrium and gene-environment independence required by recently proposed efficient "retrospective" methods. Our proposal involves first development of a novel retrospective approach to the analysis of case-control data, one that is robust to the nature of the gene-environment distribution in the underlying population. Next, it involves shrinkage of the robust retrospective estimator toward a more precise, but model-dependent, retrospective estimator using novel empirical Bayes and penalized regression techniques. Methods for variance estimation are proposed based on asymptotic theories. Simulations and two data examples illustrate both the robustness and efficiency of the proposed methods.

  9. Assessment of network inference methods: how to cope with an underdetermined problem.

    Directory of Open Access Journals (Sweden)

    Caroline Siegenthaler

    Full Text Available The inference of biological networks is an active research area in the field of systems biology. The number of network inference algorithms has grown tremendously in the last decade, underlining the importance of a fair assessment and comparison among these methods. Current assessments of the performance of an inference method typically involve the application of the algorithm to benchmark datasets and the comparison of the network predictions against the gold standard or reference networks. While the network inference problem is often deemed underdetermined, implying that the inference problem does not have a (unique) solution, the consequences of such an attribute have not been rigorously taken into consideration. Here, we propose a new procedure for assessing the performance of gene regulatory network (GRN) inference methods. The procedure takes into account the underdetermined nature of the inference problem, in which gene regulatory interactions that are inferable or non-inferable are determined based on causal inference. The assessment relies on a new definition of the confusion matrix, which excludes errors associated with non-inferable gene regulations. For demonstration purposes, the proposed assessment procedure is applied to the DREAM 4 In Silico Network Challenge. The results show a marked change in the ranking of participating methods when taking network inferability into account.

  10. Haplotype reconstruction error as a classical misclassification problem: introducing sensitivity and specificity as error measures.

    Directory of Open Access Journals (Sweden)

    Claudia Lamina

    Full Text Available BACKGROUND: Statistically reconstructing haplotypes from single nucleotide polymorphism (SNP) genotypes can lead to falsely classified haplotypes. This can be an issue when interpreting haplotype association results or when selecting subjects with certain haplotypes for subsequent functional studies. It was our aim to quantify haplotype reconstruction error and to provide tools for it. METHODS AND RESULTS: By numerous simulation scenarios, we systematically investigated several error measures, including discrepancy, error rate, and R^2, and introduced the sensitivity and specificity to this context. We exemplified several measures in the KORA study, a large population-based study from Southern Germany. We find that the specificity is slightly reduced only for common haplotypes, while the sensitivity was decreased for some, but not all rare haplotypes. The overall error rate was generally increasing with increasing number of loci, increasing minor allele frequency of SNPs, decreasing correlation between the alleles and increasing ambiguity. CONCLUSIONS: We conclude that, with the analytical approach presented here, haplotype-specific error measures can be computed to gain insight into the haplotype uncertainty. This method indicates whether a specific risk haplotype can be expected to be reconstructed with little or high misclassification, and thus the magnitude of expected bias in association estimates. We also illustrate that sensitivity and specificity separate two dimensions of the haplotype reconstruction error, which completely describe the misclassification matrix and thus provide the prerequisite for methods accounting for misclassification.
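
    The haplotype-specific sensitivity and specificity described here can be sketched as follows, assuming the true and reconstructed haplotypes are available chromosome by chromosome (as in a simulation). The function name and the example haplotype labels are hypothetical and are not taken from the KORA analysis.

```python
def sensitivity_specificity(true_haps, reconstructed_haps, target):
    """Sensitivity and specificity of reconstruction for one target haplotype.

    sensitivity: share of chromosomes truly carrying the target that were
                 reconstructed as the target
    specificity: share of chromosomes not carrying the target that were
                 reconstructed as something other than the target
    Returns None for a measure whose denominator is zero.
    """
    tp = fn = fp = tn = 0
    for truth, recon in zip(true_haps, reconstructed_haps):
        if truth == target:
            if recon == target:
                tp += 1
            else:
                fn += 1
        else:
            if recon == target:
                fp += 1
            else:
                tn += 1
    sens = tp / (tp + fn) if (tp + fn) else None
    spec = tn / (tn + fp) if (tn + fp) else None
    return sens, spec


true_haps = ["AC", "AC", "GT", "GT", "AT", "AC"]
recon_haps = ["AC", "GT", "GT", "GT", "AC", "AC"]
sens, spec = sensitivity_specificity(true_haps, recon_haps, "AC")
```

    Here two of the three true AC chromosomes are recovered and one of the three non-AC chromosomes is wrongly called AC, so both measures equal 2/3.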

  11. HapCol: Accurate and Memory-efficient Haplotype Assembly from Long Reads

    NARCIS (Netherlands)

    Y. Pirola (Yuri); S. Zaccaria (Simone); R. Dondi (Riccardo); G.W. Klau (Gunnar); N. Pisanti (Nadia); P. Bonizzoni (Paola)

    2015-01-01

    Motivation: Haplotype assembly is the computational problem of reconstructing haplotypes in diploid organisms and is of fundamental importance for characterizing the effects of single-nucleotide polymorphisms on the expression of phenotypic traits. Haplotype assembly highly benefits from

  12. A spatial haplotype copying model with applications to genotype imputation.

    Science.gov (United States)

    Yang, Wen-Yun; Hormozdiari, Farhad; Eskin, Eleazar; Pasaniuc, Bogdan

    2015-05-01

    Ever since its introduction, the haplotype copy model has proven to be one of the most successful approaches for modeling genetic variation in human populations, with applications ranging from ancestry inference to genotype phasing and imputation. Motivated by coalescent theory, this approach assumes that any chromosome (haplotype) can be modeled as a mosaic of segments copied from a set of chromosomes sampled from the same population. At the core of the model is the assumption that any chromosome from the sample is equally likely to contribute a priori to the copying process. Motivated by recent works that model genetic variation in a geographic continuum, we propose a new spatial-aware haplotype copy model that jointly models geography and the haplotype copying process. We extend hidden Markov models of haplotype diversity such that at any given location, haplotypes that are closest in the genetic-geographic continuum map are a priori more likely to contribute to the copying process than distant ones. Through simulations starting from the 1000 Genomes data, we show that our model achieves superior accuracy in genotype imputation over the standard spatial-unaware haplotype copy model. In addition, we show the utility of our model in selecting a small personalized reference panel for imputation that leads to both improved accuracy as well as to a lower computational runtime than the standard approach. Finally, we show our proposed model can be used to localize individuals on the genetic-geographical map on the basis of their genotype data.

  13. HapCol : Accurate and memory-efficient haplotype assembly from long reads

    NARCIS (Netherlands)

    Pirola, Yuri; Zaccaria, Simone; Dondi, Riccardo; Klau, Gunnar W.; Pisanti, Nadia; Bonizzoni, Paola

    2016-01-01

    Motivation: Haplotype assembly is the computational problem of reconstructing haplotypes in diploid organisms and is of fundamental importance for characterizing the effects of single-nucleotide polymorphisms on the expression of phenotypic traits. Haplotype assembly highly benefits from the advent

  14. Inferring mechanisms of copy number change from haplotype structures at the human DEFA1A3 locus.

    Science.gov (United States)

    Black, Holly A; Khan, Fayeza F; Tyson, Jess; Al Armour, John

    2014-07-21

    The determination of structural haplotypes at copy number variable regions can indicate the mechanisms responsible for changes in copy number, as well as explain the relationship between gene copy number and expression. However, obtaining spatial information at regions displaying extensive copy number variation, such as the DEFA1A3 locus, is complex, because of the difficulty in the phasing and assembly of these regions. The DEFA1A3 locus is intriguing in that it falls within a region of high linkage disequilibrium, despite its high variability in copy number (n = 3-16); hence, the mechanisms responsible for changes in copy number at this locus are unclear. In this study, a region flanking the DEFA1A3 locus was sequenced across 120 independent haplotypes with European ancestry, identifying five common classes of DEFA1A3 haplotype. Assigning DEFA1A3 class to haplotypes within the 1000 Genomes project highlights a significant difference in DEFA1A3 class frequencies between populations with different ancestry. The features of each DEFA1A3 class, for example, the associated DEFA1A3 copy numbers, were initially assessed in a European cohort (n = 599) and replicated in the 1000 Genomes samples, showing within-class similarity, but between-class and between-population differences in the features of the DEFA1A3 locus. Emulsion haplotype fusion-PCR was used to generate 61 structural haplotypes at the DEFA1A3 locus, showing a high within-class similarity in structure. Structural haplotypes across the DEFA1A3 locus indicate that intra-allelic rearrangement is the predominant mechanism responsible for changes in DEFA1A3 copy number, explaining the conservation of linkage disequilibrium across the locus. The identification of common structural haplotypes at the DEFA1A3 locus could aid studies into how DEFA1A3 copy number influences expression, which is currently unclear.

  15. Genetics of chloroquine-resistant malaria: a haplotypic view

    Directory of Open Access Journals (Sweden)

    Gauri Awasthi

    2013-12-01

    Full Text Available The development and rapid spread of chloroquine resistance (CQR) in Plasmodium falciparum have triggered the identification of several genetic target(s) in the P. falciparum genome. In particular, mutations in the Pfcrt gene, specifically, K76T and mutations in three other amino acids in the region adjoining K76 (residues 72, 74, 75 and 76), are considered to be highly related to CQR. These various mutations form several different haplotypes, and Pfcrt gene polymorphisms and the global distribution of the different CQR-Pfcrt haplotypes in endemic and non-endemic regions of P. falciparum malaria have been the subject of extensive study. Despite the fact that the Pfcrt gene is considered to be the primary CQR gene in P. falciparum, several studies have suggested that this may not be the case. Furthermore, there is a poor correlation between the evolutionary implications of the Pfcrt haplotypes and the inferred migration of CQR P. falciparum based on CQR epidemiological surveillance data. The present paper aims to clarify the existing knowledge on the genetic basis of the different CQR-Pfcrt haplotypes that are prevalent in worldwide populations based on the published literature and to analyse the data to generate hypotheses on the genetics and evolution of CQR malaria.

  16. Analysis of Molecular Variance Inferred from Metric Distances among DNA Haplotypes: Application to Human Mitochondrial DNA Restriction Data

    OpenAIRE

    Excoffier, L.; Smouse, P. E.; Quattro, J. M.

    1992-01-01

    We present here a framework for the study of molecular variation within a single species. Information on DNA haplotype divergence is incorporated into an analysis of variance format, derived from a matrix of squared-distances among all pairs of haplotypes. This analysis of molecular variance (AMOVA) produces estimates of variance components and F-statistic analogs, designated here as φ-statistics, reflecting the correlation of haplotypic diversity at different levels of hierarchical subdivisi...

  17. Mapping Haplotype-haplotype Interactions with Adaptive LASSO

    Directory of Open Access Journals (Sweden)

    Li Ming

    2010-08-01

    Full Text Available Background: The genetic etiology of complex diseases in humans has been commonly viewed as a complex process involving both genetic and environmental factors functioning in a complicated manner. Quite often the interactions among genetic variants play major roles in determining the susceptibility of an individual to a particular disease. Statistical methods for modeling interactions underlying complex diseases between single genetic variants (e.g. single nucleotide polymorphisms or SNPs) have been extensively studied. Recently, haplotype-based analysis has gained popularity among genetic association studies. When multiple sequence or haplotype interactions are involved in determining an individual's susceptibility to a disease, it presents daunting challenges in statistical modeling and testing of the interaction effects, largely due to the complicated higher order epistatic complexity. Results: In this article, we propose a new strategy in modeling haplotype-haplotype interactions under the penalized logistic regression framework with adaptive L1-penalty. We consider interactions of sequence variants between haplotype blocks. The adaptive L1-penalty allows simultaneous effect estimation and variable selection in a single model. We propose a new parameter estimation method which estimates and selects parameters by the modified Gauss-Seidel method nested within the EM algorithm. Simulation studies show that it has low false positive rate and reasonable power in detecting haplotype interactions. The method is applied to test haplotype interactions involved in mother and offspring genomes in a small for gestational age (SGA) neonates data set, and significant interactions between different genomes are detected. Conclusions: As demonstrated by the simulation studies and real data analysis, the approach developed provides an efficient tool for the modeling and testing of haplotype interactions. The implementation of the method in R codes can be

  18. A linear programming model for protein inference problem in shotgun proteomics.

    Science.gov (United States)

    Huang, Ting; He, Zengyou

    2012-11-15

    Assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is an important issue in shotgun proteomics. The objective of protein inference is to find a subset of proteins that are truly present in the sample. Although many methods have been proposed for protein inference, several issues such as peptide degeneracy still remain unsolved. In this article, we present a linear programming model for protein inference. In this model, we use a transformation of the joint probability that each peptide/protein pair is present in the sample as the variable. Then, both the peptide probability and protein probability can be expressed as a formula in terms of the linear combination of these variables. Based on this simple fact, the protein inference problem is formulated as an optimization problem: minimize the number of proteins with non-zero probabilities under the constraint that the difference between the calculated peptide probability and the peptide probability generated from peptide identification algorithms should be less than some threshold. This model addresses the peptide degeneracy issue by forcing some joint probability variables involving degenerate peptides to be zero in a rigorous manner. The corresponding inference algorithm is named ProteinLP. We test the performance of ProteinLP on six datasets. Experimental results show that our method is competitive with the state-of-the-art protein inference algorithms. The source code of our algorithm is available at: https://sourceforge.net/projects/prolp/. Contact: zyhe@dlut.edu.cn. Supplementary data are available at Bioinformatics Online.

  19. Haplotype structure in Ashkenazi Jewish BRCA1 and BRCA2 mutation carriers

    DEFF Research Database (Denmark)

    Im, Kate M; Kirchhoff, Tomas; Wang, Xianshu

    2011-01-01

    Three founder mutations in BRCA1 and BRCA2 contribute to the risk of hereditary breast and ovarian cancer in Ashkenazi Jews (AJ). They are observed at increased frequency in the AJ compared to other BRCA mutations in Caucasian non-Jews (CNJ). Several authors have proposed that elevated allele...... the tools of statistical genomics to examine the likelihood of long-range LD at a deleterious locus in a population that faced a genetic bottleneck. We studied the genotypes of hundreds of women from a large international consortium of BRCA1 and BRCA2 mutation carriers and found that AJ women exhibited long......-range haplotypes compared to CNJ women. More than 50% of the AJ chromosomes with the BRCA1 185delAG mutation share an identical 2.1 Mb haplotype and nearly 16% of AJ chromosomes carrying the BRCA2 6174delT mutation share a 1.4 Mb haplotype. Simulations based on the best inference of Ashkenazi population demography...

  20. Three Novel Haplotypes of Theileria bicornis in Black and White Rhinoceros in Kenya.

    Science.gov (United States)

    Otiende, M Y; Kivata, M W; Jowers, M J; Makumi, J N; Runo, S; Obanda, V; Gakuya, F; Mutinda, M; Kariuki, L; Alasaad, S

    2016-02-01

    Piroplasms, especially those in the genera Babesia and Theileria, have been found to naturally infect rhinoceros. Due to natural or human-induced stress factors such as capture and translocations, animals often develop fatal clinical piroplasmosis, which causes death if not treated. This study examines the genetic diversity and occurrence of novel Theileria species infecting both black and white rhinoceros in Kenya. Samples collected opportunistically during routine translocations and clinical interventions from 15 rhinoceros were analysed by polymerase chain reaction (PCR) using a nested amplification of the small subunit ribosomal RNA (18S rRNA) gene fragments of Babesia and Theileria. Our study revealed for the first time in Kenya the presence of Theileria bicornis in white (Ceratotherium simum simum) and black (Diceros bicornis michaeli) rhinoceros and the existence of three new haplotypes: haplotypes H1 and H3 were present in white rhinoceros, while H2 was present in black rhinoceros. No specific haplotype was correlated to any specific geographical location. The Bayesian inference 50% consensus phylogram recovered the three haplotypes monophyletically, and Theileria bicornis had very high support (BPP: 0.98). Furthermore, the genetic p-uncorrected distances and substitutions between T. bicornis and the three haplotypes were the same in all three haplotypes, indicating a very close genetic affinity. This is the first report of the occurrence of Theileria species in white and black rhinoceros from Kenya. The three new haplotypes reported here for the first time have important ecological and conservational implications, especially for population management and translocation programs and as a means of avoiding the transport of infected animals into non-affected areas. © 2014 Blackwell Verlag GmbH.

  1. Human cytochrome P450 2B6 genetic variability in Botswana: a case of haplotype diversity and convergent phenotypes

    KAUST Repository

    Tawe, Leabaneng

    2018-03-14

    Identification of inter-individual variability for drug metabolism through cytochrome P450 2B6 (CYP2B6) enzyme is important for understanding the differences in clinical responses to malaria and HIV. This study evaluates the distribution of CYP2B6 alleles, haplotypes and inferred metabolic phenotypes among subjects with different ethnicity in Botswana. A total of 570 subjects were analyzed for CYP2B6 polymorphisms at position 516 G > T (rs3745274), 785 A > G (rs2279343) and 983 T > C (rs28399499). Samples were collected in three districts of Botswana where the population belongs to Bantu (Serowe/Palapye and Chobe) and San-related (Ghanzi) ethnicity. The three districts showed different haplotype composition according to the ethnic background but similar metabolic inferred phenotypes, with 59.12%, 34.56%, 2.10% and 4.21% of the subjects having, respectively, an extensive, intermediate, slow and rapid metabolic profile. The results hint at the possibility of a convergent adaptation of detoxifying metabolic phenotypes despite a different haplotype structure due to the different genetic background. The main implication is that, while there is substantial homogeneity of metabolic inferred phenotypes among the country, the response to drugs metabolized via CYP2B6 could be individually associated to an increased risk of treatment failure and toxicity. These are important facts since Botswana is facing malaria elimination and a very high HIV prevalence.

  2. Human cytochrome P450 2B6 genetic variability in Botswana: a case of haplotype diversity and convergent phenotypes

    KAUST Repository

    Tawe, Leabaneng; Motshoge, Thato; Ramatlho, Pleasure; Mutukwa, Naledi; Muthoga, Charles Waithaka; Dongho, Ghyslaine Bruna Djeunang; Martinelli, Axel; Peloewetse, Elias; Russo, Gianluca; Quaye, Isaac Kweku; Paganotti, Giacomo Maria

    2018-01-01

    Identification of inter-individual variability for drug metabolism through cytochrome P450 2B6 (CYP2B6) enzyme is important for understanding the differences in clinical responses to malaria and HIV. This study evaluates the distribution of CYP2B6 alleles, haplotypes and inferred metabolic phenotypes among subjects with different ethnicity in Botswana. A total of 570 subjects were analyzed for CYP2B6 polymorphisms at position 516 G > T (rs3745274), 785 A > G (rs2279343) and 983 T > C (rs28399499). Samples were collected in three districts of Botswana where the population belongs to Bantu (Serowe/Palapye and Chobe) and San-related (Ghanzi) ethnicity. The three districts showed different haplotype composition according to the ethnic background but similar metabolic inferred phenotypes, with 59.12%, 34.56%, 2.10% and 4.21% of the subjects having, respectively, an extensive, intermediate, slow and rapid metabolic profile. The results hint at the possibility of a convergent adaptation of detoxifying metabolic phenotypes despite a different haplotype structure due to the different genetic background. The main implication is that, while there is substantial homogeneity of metabolic inferred phenotypes among the country, the response to drugs metabolized via CYP2B6 could be individually associated to an increased risk of treatment failure and toxicity. These are important facts since Botswana is facing malaria elimination and a very high HIV prevalence.

  3. Effects analysis fuzzy inference system in nuclear problems using approximate reasoning

    International Nuclear Information System (INIS)

    Guimaraes, Antonio C.F.; Franklin Lapa, Celso Marcelo

    2004-01-01

    In this paper, a fuzzy inference system modeling technique applied to failure mode and effects analysis (FMEA) is introduced for nuclear reactor problems. This method uses the concept of a pure fuzzy logic system to treat the traditional FMEA parameters: probabilities of occurrence, severity and detection. The auxiliary feed-water system of a typical two-loop pressurized water reactor (PWR) was used as a practical example in this analysis. The key result is a conceptual comparison between the traditional risk priority number (RPN) and the fuzzy risk priority number (FRPN) obtained from expert opinion. The set of results demonstrated the great potential of the inference system and the advantage of the gray approach in this class of problems.
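    For orientation, the traditional RPN that the abstract compares against is conventionally the product of the occurrence, severity and detection ratings. The sketch below illustrates only that conventional calculation, not the paper's fuzzy system. The 1-10 rating scale and the failure modes with their ratings are assumptions made for illustration and do not come from the paper.

```python
def risk_priority_number(occurrence: int, severity: int, detection: int) -> int:
    """Conventional FMEA risk priority number: the product of the three ratings.

    Assumes each rating is an integer on a 1-10 scale (a common convention,
    not something stated in the abstract).
    """
    ratings = (("occurrence", occurrence), ("severity", severity), ("detection", detection))
    for name, value in ratings:
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be on the 1-10 scale, got {value}")
    return occurrence * severity * detection


# Hypothetical failure modes with illustrative ratings (occurrence, severity, detection).
failure_modes = {
    "pump fails to start": (3, 8, 4),
    "valve stuck closed": (2, 9, 6),
    "sensor drift": (6, 4, 7),
}

# Rank failure modes from highest to lowest RPN.
ranked = sorted(
    failure_modes,
    key=lambda mode: risk_priority_number(*failure_modes[mode]),
    reverse=True,
)

for mode in ranked:
    print(mode, risk_priority_number(*failure_modes[mode]))
```

    Because the RPN is a plain product, very different rating combinations can yield the same number. That is one commonly cited motivation for fuzzy alternatives such as the FRPN discussed above.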

  4. Problem Solving as Probabilistic Inference with Subgoaling: Explaining Human Successes and Pitfalls in the Tower of Hanoi.

    Science.gov (United States)

    Donnarumma, Francesco; Maisto, Domenico; Pezzulo, Giovanni

    2016-04-01

    How do humans and other animals face novel problems for which predefined solutions are not available? Human problem solving links to flexible reasoning and inference rather than to slow trial-and-error learning. It has received considerable attention since the early days of cognitive science, giving rise to well known cognitive architectures such as SOAR and ACT-R, but its computational and brain mechanisms remain incompletely known. Furthermore, it is still unclear whether problem solving is a "specialized" domain or module of cognition, in the sense that it requires computations that are fundamentally different from those supporting perception and action systems. Here we advance a novel view of human problem solving as probabilistic inference with subgoaling. In this perspective, key insights from cognitive architectures are retained such as the importance of using subgoals to split problems into subproblems. However, here the underlying computations use probabilistic inference methods analogous to those that are increasingly popular in the study of perception and action systems. To test our model we focus on the widely used Tower of Hanoi (ToH) task, and show that our proposed method can reproduce characteristic idiosyncrasies of human problem solvers: their sensitivity to the "community structure" of the ToH and their difficulties in executing so-called "counterintuitive" movements. Our analysis reveals that subgoals have two key roles in probabilistic inference and problem solving. First, prior beliefs on (likely) useful subgoals carve the problem space and define an implicit metric for the problem at hand, a metric to which humans are sensitive. Second, subgoals are used as waypoints in the probabilistic problem solving inference and make it possible to find effective solutions that, when unavailable, lead to problem solving deficits. Our study thus suggests that a probabilistic inference scheme enhanced with subgoals provides a comprehensive framework to study problem

  5. Influence of promoter/enhancer region haplotypes on MGMT transcriptional regulation: a potential biomarker for human sensitivity to alkylating agents.

    Science.gov (United States)

    Xu, Meixiang; Nekhayeva, Ilona; Cross, Courtney E; Rondelli, Catherine M; Wickliffe, Jeffrey K; Abdel-Rahman, Sherif Z

    2014-03-01

    The O6-methylguanine-DNA methyltransferase gene (MGMT) encodes the direct reversal DNA repair protein that removes alkyl adducts from the O6 position of guanine. Several single-nucleotide polymorphisms (SNPs) exist in the MGMT promoter/enhancer (P/E) region. However, the haplotype structure encompassing these SNPs and their functional/biological significance are currently unknown. We hypothesized that MGMT P/E haplotypes, rather than individual SNPs, alter MGMT transcription and can thus alter human sensitivity to alkylating agents. To identify the haplotype structure encompassing the MGMT P/E region SNPs, we sequenced 104 DNA samples from healthy individuals and inferred the haplotypes using the data generated. We identified eight SNPs in this region, namely T7C (rs180989103), T135G (rs1711646), G290A (rs61859810), C485A (rs1625649), C575A (rs113813075), G666A (rs34180180), C777A (rs34138162) and C1099T (rs16906252). Phylogenetics and Sequence Evolution analysis predicted 21 potential haplotypes that encompass these SNPs ranging in frequencies from 0.000048 to 0.39. Of these, 10 were identified in our study population as 20 paired haplotype combinations. To determine the functional significance of these haplotypes, luciferase reporter constructs representing these haplotypes were transfected into glioblastoma cells and their effect on MGMT promoter activity was determined. Compared with the most common (reference) haplotype 1, seven haplotypes significantly upregulated MGMT promoter activity (18-119% increase; P alkylating agents.

  6. Genetic differences in the two main groups of the Japanese population based on autosomal SNPs and haplotypes.

    Science.gov (United States)

    Yamaguchi-Kabata, Yumi; Tsunoda, Tatsuhiko; Kumasaka, Natsuhiko; Takahashi, Atsushi; Hosono, Naoya; Kubo, Michiaki; Nakamura, Yusuke; Kamatani, Naoyuki

    2012-05-01

    Although the Japanese population has a rather low genetic diversity, we recently confirmed the presence of two main clusters (the Hondo and Ryukyu clusters) through principal component analysis of genome-wide single-nucleotide polymorphism (SNP) genotypes. Understanding the genetic differences between the two main clusters requires further genome-wide analyses based on a dense SNP set and comparison of haplotype frequencies. In the present study, we determined haplotypes for the Hondo cluster of the Japanese population by detecting SNP homozygotes with 388,591 autosomal SNPs from 18,379 individuals and estimated the haplotype frequencies. Haplotypes for the Ryukyu cluster were inferred by a statistical approach using the genotype data from 504 individuals. We then compared the haplotype frequencies between the Hondo and Ryukyu clusters. In most genomic regions, the haplotype frequencies in the Hondo and Ryukyu clusters were very similar. However, in addition to the human leukocyte antigen region on chromosome 6, other genomic regions (chromosomes 3, 4, 5, 7, 10 and 12) showed dissimilarities in haplotype frequency. These regions were enriched for genes involved in the immune system, cell-cell adhesion and the intracellular signaling cascade. These differentiated genomic regions between the Hondo and Ryukyu clusters are of interest because they (1) should be examined carefully in association studies and (2) likely contain genes responsible for morphological or physiological differences between the two groups.

  7. Haplotype assembly in polyploid genomes and identical by descent shared tracts.

    Science.gov (United States)

    Aguiar, Derek; Istrail, Sorin

    2013-07-01

    Genome-wide haplotype reconstruction from sequence data, or haplotype assembly, is at the center of major challenges in molecular biology and life sciences. For complex eukaryotic organisms like humans, the genome is vast and the population samples are growing so rapidly that algorithms processing high-throughput sequencing data must scale favorably in terms of both accuracy and computational efficiency. Furthermore, current models and methodologies for haplotype assembly (i) do not consider individuals sharing haplotypes jointly, which reduces the size and accuracy of assembled haplotypes, and (ii) are unable to model genomes having more than two sets of homologous chromosomes (polyploidy). Polyploid organisms are increasingly becoming the target of many research groups interested in the genomics of disease, phylogenetics, botany and evolution but there is an absence of theory and methods for polyploid haplotype reconstruction. In this work, we present a number of results, extensions and generalizations of compass graphs and our HapCompass framework. We prove the theoretical complexity of two haplotype assembly optimizations, thereby motivating the use of heuristics. Furthermore, we present graph theory-based algorithms for the problem of haplotype assembly using our previously developed HapCompass framework for (i) novel implementations of haplotype assembly optimizations (minimum error correction), (ii) assembly of a pair of individuals sharing a haplotype tract identical by descent and (iii) assembly of polyploid genomes. We evaluate our methods on 1000 Genomes Project, Pacific Biosciences and simulated sequence data. HapCompass is available for download at http://www.brown.edu/Research/Istrail_Lab/. Supplementary data are available at Bioinformatics online.
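    The minimum error correction (MEC) objective mentioned above can be illustrated with a small sketch. It is not the HapCompass implementation. It assumes a diploid genome, heterozygous biallelic sites only, and reads encoded as strings over '0', '1' and '-' (site not covered). The fragment data are hypothetical. The exhaustive search is exponential in the number of sites and is included only to make the objective concrete.

```python
from itertools import product


def mismatches(fragment: str, haplotype: str) -> int:
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def complement(haplotype: str) -> str:
    """The other haplotype of a diploid at all-heterozygous sites."""
    return "".join("1" if allele == "0" else "0" for allele in haplotype)


def mec_score(fragments, haplotype: str) -> int:
    """MEC: each fragment is charged its mismatches to the closer haplotype."""
    other = complement(haplotype)
    return sum(min(mismatches(f, haplotype), mismatches(f, other)) for f in fragments)


# Hypothetical read fragments over four heterozygous sites.
fragments = ["01--", "-10-", "10-1", "--01", "0110"]

# Brute force over all 2^n candidate haplotypes (illustration only).
n_sites = len(fragments[0])
candidates = ("".join(bits) for bits in product("01", repeat=n_sites))
best = min(candidates, key=lambda h: mec_score(fragments, h))

print(best, complement(best), mec_score(fragments, best))
```

    A haplotype and its complement always receive the same score, so the search space could be halved by fixing the first allele. Practical assemblers rely on heuristics rather than enumeration.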

  8. Haplotypes of CYP3A4 and their close linkage with CYP3A5 haplotypes in a Japanese population.

    Science.gov (United States)

    Fukushima-Uesaka, Hiromi; Saito, Yoshiro; Watanabe, Hidemi; Shiseki, Kisho; Saeki, Mayumi; Nakamura, Takahiro; Kurose, Kouichi; Sai, Kimie; Komamura, Kazuo; Ueno, Kazuyuki; Kamakura, Shiro; Kitakaze, Masafumi; Hanai, Sotaro; Nakajima, Toshiharu; Matsumoto, Kenji; Saito, Hirohisa; Goto, Yu-ichi; Kimura, Hideo; Katoh, Masaaki; Sugai, Kenji; Minami, Narihiro; Shirao, Kuniaki; Tamura, Tomohide; Yamamoto, Noboru; Minami, Hironobu; Ohtsu, Atsushi; Yoshida, Teruhiko; Saijo, Nagahiro; Kitamura, Yutaka; Kamatani, Naoyuki; Ozawa, Shogo; Sawada, Jun-ichi

    2004-01-01

    In order to identify single nucleotide polymorphisms (SNPs) and haplotype frequencies of CYP3A4 in a Japanese population, the distal enhancer and proximal promoter regions, all exons, and the surrounding introns were sequenced from genomic DNA of 416 Japanese subjects. We found 24 SNPs, including 17 novel ones: two in the distal enhancer, four in the proximal promoter, one in the 5'-untranslated region (UTR), seven in the introns, and three in the 3'-UTR. The most common SNP was c.1026+12G>A (IVS10+12G>A), with a 0.249 frequency. Four non-synonymous SNPs, c.554C>G (p.T185S, CYP3A4(*)16), c.830_831insA (p.E277fsX8, (*)6), c.878T>C (p.L293P, (*)18), and c.1088 C>T (p.T363M, (*)11) were found with frequencies of 0.014, 0.001, 0.028, and 0.002, respectively. No SNP was found in the known nuclear transcriptional factor-binding sites in the enhancer and promoter regions. Using these 24 SNPs, 16 haplotypes were unambiguously identified, and nine haplotypes were inferred by aid of an expectation-maximization-based program. In addition, using data from 186 subjects enabled a close linkage to be found between CYP3A4 and CYP3A5 SNPs, especially among the SNPs at c.1026+12 in CYP3A4 and c.219-237 (IVS3-237, a key SNP site for CYP3A5(*)3), c.865+77 (IVS9+77) and c.1523 in CYP3A5. This result suggested that CYP3A4 and CYP3A5 are within the same gene block. Haplotype analysis between CYP3A4 and CYP3A5 revealed several major haplotype combinations in the CYP3A4-CYP3A5 block. Our findings provide fundamental and useful information for genotyping CYP3A4 (and CYP3A5) in the Japanese, and probably Asian populations. Copyright 2003 Wiley-Liss, Inc.
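    The abstract does not name the expectation-maximization program that was used, so the following is only a generic sketch of EM haplotype-frequency estimation from unphased genotypes. It assumes biallelic sites, genotypes coded as alternate-allele counts (0, 1 or 2), random mating (Hardy-Weinberg proportions) and a small number of sites. The function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import product


def compatible_pairs(genotype):
    """All unordered haplotype pairs consistent with a 0/1/2-coded genotype."""
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous sites fixed; het sites filled below
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, bit in zip(het_sites, bits):
            h1[site], h2[site] = bit, 1 - bit
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=50):
    """Estimate haplotype frequencies by EM under Hardy-Weinberg proportions."""
    pair_lists = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted({h for pairs in pair_lists for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    n_chromosomes = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = Counter()
        for pairs in pair_lists:
            # E-step: posterior probability of each compatible pair.
            weights = [(1 if a == b else 2) * freq[a] * freq[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts become the new frequencies.
        freq = {h: counts[h] / n_chromosomes for h in haplotypes}
    return freq


# Toy data at two SNPs: only the double heterozygote (1, 1) is phase-ambiguous.
genotypes = [(0, 0)] * 4 + [(2, 2)] * 3 + [(1, 1)] * 3
freq = em_haplotype_frequencies(genotypes)
for haplotype, f in freq.items():
    print(haplotype, round(f, 4))
```

    In this toy data the unambiguous homozygotes pull the estimate toward the (0, 0) and (1, 1) haplotypes, so the double heterozygotes are resolved almost entirely as that pair. This mirrors the distinction the abstract draws between unambiguously identified and inferred haplotypes.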

  9. Divide et impera: subgoaling reduces the complexity of probabilistic inference and problem solving.

    Science.gov (United States)

    Maisto, Domenico; Donnarumma, Francesco; Pezzulo, Giovanni

    2015-03-06

    It has long been recognized that humans (and possibly other animals) usually break problems down into smaller and more manageable problems using subgoals. Despite a general consensus that subgoaling helps problem solving, it is still unclear what the mechanisms guiding online subgoal selection are during the solution of novel problems for which predefined solutions are not available. Under which conditions does subgoaling lead to optimal behaviour? When is subgoaling better than solving a problem from start to finish? Which is the best number and sequence of subgoals to solve a given problem? How are these subgoals selected during online inference? Here, we present a computational account of subgoaling in problem solving. Following Occam's razor, we propose that good subgoals are those that permit planning solutions and controlling behaviour using less information resources, thus yielding parsimony in inference and control. We implement this principle using approximate probabilistic inference: subgoals are selected using a sampling method that considers the descriptive complexity of the resulting sub-problems. We validate the proposed method using a standard reinforcement learning benchmark (four-rooms scenario) and show that the proposed method requires less inferential steps and permits selecting more compact control programs compared to an equivalent procedure without subgoaling. Furthermore, we show that the proposed method offers a mechanistic explanation of the neuronal dynamics found in the prefrontal cortex of monkeys that solve planning problems. Our computational framework provides a novel integrative perspective on subgoaling and its adaptive advantages for planning, control and learning, such as for example lowering cognitive effort and working memory load. © 2015 The Author(s) Published by the Royal Society. All rights reserved.

  10. "HOOF-Print" Genotyping and Haplotype Inference Discriminates among Brucella spp Isolates From a Small Spatial Scale

    Science.gov (United States)

    We demonstrate that the “HOOF-Print” assay provides high power to discriminate among Brucella isolates collected on a small spatial scale (within Portugal). Additionally, we illustrate how haplotype identification using non-random association among markers allows resolution of B. melitensis biovars ...

  11. Combinatorial aspects of genome rearrangements and haplotype networks

    OpenAIRE

    Labarre, Anthony

    2008-01-01

    The dissertation covers two problems motivated by computational biology: genome rearrangements, and haplotype networks. Genome rearrangement problems are a particular case of edit distance problems, where one seeks to transform two given objects into one another using as few operations as possible, with the additional constraint that the set of allowed operations is fixed beforehand; we are also interested in computing the corresponding distances between those objects, i.e. merely computing t...

  12. Haplotype-Based Genotyping in Polyploids

    Directory of Open Access Journals (Sweden)

    Josh P. Clevenger

    2018-04-01

    Full Text Available Accurate identification of polymorphisms from sequence data is crucial to unlocking the potential of high throughput sequencing for genomics. Single nucleotide polymorphisms (SNPs) are difficult to accurately identify in polyploid crops due to the duplicative nature of polyploid genomes leading to low confidence in the true alignment of short reads. Implementing a haplotype-based method in contrasting subgenome-specific sequences leads to higher accuracy of SNP identification in polyploids. To test this method, a large-scale 48K SNP array (Axiom Arachis2) was developed for Arachis hypogaea (peanut), an allotetraploid, in which 1,674 haplotype-based SNPs were included. Results of the array show that 74% of the haplotype-based SNP markers could be validated, which is considerably higher than previous methods used for peanut. The haplotype method has been implemented in a standalone program, HAPLOSWEEP, which takes as input bam files and a vcf file and identifies haplotype-based markers. Haplotype discovery can be made within single reads or span paired reads, and can leverage long read technology by targeting any length of haplotype. Haplotype-based genotyping is applicable in all allopolyploid genomes and provides confidence in marker identification and in silico-based genotyping for polyploid genomics.

  13. A combinatorial perspective of the protein inference problem.

    Science.gov (United States)

    Yang, Chao; He, Zengyou; Yu, Weichuan

    2013-01-01

    In a shotgun proteomics experiment, proteins are the most biologically meaningful output. The success of proteomics studies depends on the ability to accurately and efficiently identify proteins. Many methods have been proposed to facilitate the identification of proteins from peptide identification results. However, the relationship between protein identification and peptide identification has not been thoroughly explained before. In this paper, we devote ourselves to a combinatorial perspective of the protein inference problem. We employ combinatorial mathematics to calculate the conditional protein probabilities (protein probability means the probability that a protein is correctly identified) under three assumptions, which lead to a lower bound, an upper bound, and an empirical estimation of protein probabilities, respectively. The combinatorial perspective enables us to obtain an analytical expression for protein inference. Our method achieves comparable results with ProteinProphet in a more efficient manner in experiments on two data sets of standard protein mixtures and two data sets of real samples. Based on our model, we study the impact of unique peptides and degenerate peptides (degenerate peptides are peptides shared by at least two proteins) on protein probabilities. Meanwhile, we also study the relationship between our model and ProteinProphet. We name our program ProteinInfer. Its Java source code, our supplementary document and experimental results are available at: http://bioinformatics.ust.hk/proteininfer.

  14. [Construction of haplotype and haplotype block based on tag single nucleotide polymorphisms and their applications in association studies].

    Science.gov (United States)

    Gu, Ming-liang; Chu, Jia-you

    2007-12-01

    The human genome has haplotype and haplotype block structures, which provide valuable information on human evolutionary history and may lead to the development of more efficient strategies to identify genetic variants that increase susceptibility to complex diseases. The genome can be divided into discrete blocks of limited haplotype diversity. In each block, a small fraction of "tag SNPs" can be used to distinguish a large fraction of the haplotypes. These tag SNPs are potentially useful for the construction of haplotypes and haplotype blocks, and for association studies in complex diseases. There are two general classes of methods to construct haplotypes and haplotype blocks, based on genotypes from large pedigrees and on statistical algorithms, respectively. The authors evaluate several construction methods to assess the power of different association tests with a variety of disease models and block-partitioning criteria. The advantages, limitations and applications of each method, and their application in association studies, are discussed. With the completion of the HapMap and the development of statistical algorithms for haplotype reconstruction, approaches to haplotype construction that combine mathematics, physics and computer science will have a profound impact on population genetics, on the location and cloning of susceptibility genes in complex diseases, and on related domains of the life sciences.

  15. Genomic sequence of 'Candidatus Liberibacter solanacearum' haplotype C and its comparison with haplotype A and B genomes.

    Directory of Open Access Journals (Sweden)

    Jinhui Wang

    Full Text Available Haplotypes A and B of 'Candidatus Liberibacter solanacearum' (CLso) are associated with diseases of solanaceous plants, especially Zebra chip disease of potato, and haplotypes C, D and E are associated with symptoms on apiaceous plants. To date, one complete genome of haplotype B and two high quality draft genomes of haplotype A have been obtained for these unculturable bacteria using metagenomics from the psyllid vector Bactericera cockerelli. Here, we present the first genomic sequences obtained for the carrot-associated CLso. These two genomic sequences of haplotype C, FIN114 (1.24 Mbp) and FIN111 (1.20 Mbp), were obtained from carrot psyllids (Trioza apicalis) harboring CLso. Genomic comparisons between the haplotypes A, B and C revealed that the genome organization differs between these haplotypes, due to large inversions and other recombinations. Comparison of protein-coding genes indicated that the core genome of CLso consists of 885 ortholog groups, with the pan-genome consisting of 1327 ortholog groups. Twenty-seven ortholog groups are unique to CLso haplotype C, whilst 11 ortholog groups shared by the haplotypes A and B are not found in the haplotype C. Some of these ortholog groups that are not part of the core genome may encode functions related to interactions with the different host plant and psyllid species.

  16. Spatial and temporal distribution of the neutral polymorphisms in the last ZFX intron: analysis of the haplotype structure and genealogy.

    Science.gov (United States)

    Jaruzelska, J; Zietkiewicz, E; Batzer, M; Cole, D E; Moisan, J P; Scozzari, R; Tavaré, S; Labuda, D

    1999-07-01

    With 10 segregating sites (simple nucleotide polymorphisms) in the last intron (1089 bp) of the ZFX gene we have observed 11 haplotypes in 336 chromosomes representing a worldwide array of 15 human populations. Two haplotypes representing 77% of all chromosomes were distributed almost evenly among four continents. Five of the remaining haplotypes were detected in Africa and 4 others were restricted to Eurasia and the Americas. Using the information about the ancestral state of the segregating positions (inferred from human-great ape comparisons), we applied coalescent analysis to estimate the age of the polymorphisms and the resulting haplotypes. The oldest haplotype, with the ancestral alleles at all the sites, was observed at low frequency only in two groups of African origin. Its estimated age of 740 to 1100 kyr corresponded to the time to the most recent common ancestor. The two most frequent worldwide distributed haplotypes were estimated at 550 to 840 and 260 to 400 kyr, respectively, while the age of the continentally restricted polymorphisms was 120 to 180 kyr and smaller. Comparison of spatial and temporal distribution of the ZFX haplotypes suggests that modern humans diverged from the common ancestral stock in the Middle Paleolithic era. Subsequent range expansion prevented substantial gene flow among continents, separating African groups from populations that colonized Eurasia and the New World.

  17. Differentiation analysis for estimating individual ancestry from the Tibetan Plateau by an archaic altitude adaptation EPAS1 haplotype among East Asian populations.

    Science.gov (United States)

    Jiang, Li; Peng, Jianxiong; Huang, Meisha; Liu, Jing; Wang, Ling; Ma, Quan; Zhao, Hui; Yang, Xin; Ji, Anquan; Li, Caixia

    2018-02-10

    Tibetans have adapted to the extreme environment of high altitude for hundreds of generations. A highly differentiated 5-SNP (Single Nucleotide Polymorphism) haplotype motif (AGGAA) on a hypoxic pathway gene, EPAS1, is observed in Tibetans and lowlanders. To evaluate the potential usage of the 5-SNP haplotype in ancestry inference for Tibetan or Tibetan-related populations, we analyzed this haplotype in 1053 individuals of 12 Chinese populations residing on the Tibetan Plateau, peripheral regions of Tibet, and plain regions. These data were integrated with the genotypes from the 1000 Genome populations and populations in a previously reported paper for population structure analyses. We found that populations representing highland and lowland groups have different dominant ancestry components. The core Denisovan haplotype (AGGAA) was observed at a frequency of 72.32% in the Tibetan Plateau, with a frequency range from 9.48 to 21.05% in the peripheral regions and Tibetan Plateau carried the archaic haplotype, while < 5% of the Chinese Han people carried the haplotype. Our findings indicate that the 5-SNP haplotype has a special distribution pattern in populations of Tibet and peripheral regions and could be integrated into AISNP (Ancestry Informative Single Nucleotide Polymorphism) panels to enhance ancestry resolution.

  18. Two haplotype clusters of Echinococcus granulosus sensu stricto in northern Iraq (Kurdistan region) support the hypothesis of a parasite cradle in the Middle East.

    Science.gov (United States)

    Hassan, Zuber Ismael; Meerkhan, Azad Abdullah; Boufana, Belgees; Hama, Abdullah A; Ahmed, Bayram Dawod; Mero, Wijdan Mohammed Salih; Orsten, Serra; Interisano, Maria; Pozio, Edoardo; Casulli, Adriano

    2017-08-01

    Human cystic echinococcosis (CE) caused by Echinococcus granulosus s.s. is a major public health problem in Iraqi Kurdistan with a reported surgical incidence of 6.3 per 100,000 Arbil inhabitants. A total of 125 Echinococcus isolates retrieved from sheep, goats and cattle were used in this study. Our aim was to determine species/genotypes infecting livestock in Iraqi Kurdistan and examine intraspecific variation and population structure of Echinococcus granulosus s.s. in this region and relate it to that of other regions worldwide. Using nucleotide sequences of the mitochondrial cytochrome c oxidase subunit 1 (cox 1) we identified E. granulosus s.s. as the cause of hydatidosis in all examined animals. The haplotype network displayed a double-clustered topology with two main E. granulosus s.s. haplotypes, (KU05) and (KU33). The 'founder' haplotype (KU05) confirmed the presence of a common lineage of non-genetically differentiated populations as inferred by the low non-significant fixation index values. Overall diversity and neutrality indices indicated demographic expansion. We used E. granulosus s.s. nucleotide sequences from GenBank to draw haplotype networks for the Middle East (Iran, Jordan and Turkey), Europe (Albania, Greece, Italy, Romania and Spain), China, Mongolia, Russia, South America (Argentina, Brazil, Chile and Mexico) and Tunisia. Networks with two haplotype clusters like that reported here for Iraqi Kurdistan were seen for the Middle East, Europe, Mongolia, Russia and Tunisia using both 827bp and 1609bp cox1 nucleotide sequences, whereas a star-like network was observed for China and South America. We hypothesize that the double clustering seen at what is generally assumed to be the cradle of domestication may have emerged independently and dispersed from the Middle East to other regions and that haplotype (KU33) may be the main haplotype within a second cluster in the Middle East from where it has spread into Europe, Mongolia, Russia and North

  19. Plausible inference: A multi-valued logic for problem solving

    Science.gov (United States)

    Friedman, L.

    1979-01-01

    A new logic is developed which permits continuously variable strength of belief in the truth of assertions. Four inference rules result, with formal logic as a limiting case. Quantification of belief is defined. Propagation of belief to linked assertions results from dependency-based techniques of truth maintenance so that local consistency is achieved or contradiction discovered in problem solving. Rules for combining, confirming, or disconfirming beliefs are given, and several heuristics are suggested that apply to revising already formed beliefs in the light of new evidence. The strength of belief that results from such revisions based on conflicting evidence is a highly subjective phenomenon. Certain quantification rules appear to reflect an orderliness in the subjectivity. Several examples of reasoning by plausible inference are given, including a legal example and one from robot learning. Propagation of belief takes place in directions forbidden in formal logic, and this results in conclusions becoming possible for a given set of assertions that are not reachable by formal logic.

  20. SLC22A1-ABCB1 haplotype profiles predict imatinib pharmacokinetics in Asian patients with chronic myeloid leukemia.

    Directory of Open Access Journals (Sweden)

    Onkar Singh

    Full Text Available OBJECTIVE: This study aimed to explore the influence of SLC22A1, PXR, ABCG2, ABCB1 and CYP3A5*3 genetic polymorphisms on imatinib mesylate (IM) pharmacokinetics in Asian patients with chronic myeloid leukemia (CML). PATIENTS AND METHODS: Healthy subjects belonging to three Asian populations (Chinese, Malay, Indian; n = 70 each) and CML patients (n = 38) were enrolled in a prospective pharmacogenetics study. Imatinib trough (C0h) and clearance (CL) were determined in the patients at steady state. The Haplowalk method was applied to infer the haplotypes and a generalized linear model (GLM) to estimate haplotypic effects on IM pharmacokinetics. Association of haplotype copy numbers with IM pharmacokinetics was defined by Mann-Whitney U test. RESULTS: Global haplotype score statistics revealed a SLC22A1 sub-haplotypic region encompassing three polymorphisms (rs3798168, rs628031 and IVS7+850C>T) to be significantly associated with IM clearance (p = 0.013). Haplotype-specific GLM estimated that the haplotypes AGT and CGC were both associated with 22% decrease in clearance compared to CAC [CL (10^-2 L/hr/mg): CAC vs AGT: 4.03 vs 3.16, p = 0.017; CAC vs CGC: 4.03 vs 3.15, p = 0.017]. Patients harboring 2 copies of AGT or CGC haplotypes had 33.4% lower clearance and 50% higher C0h than patients carrying 0 or 1 copy [CL (10^-2 L/hr/mg): 2.19 vs 3.29, p = 0.026; C0h (10^-6 1/ml): 4.76 vs 3.17, p = 0.013, respectively]. Further subgroup analysis revealed SLC22A1 and ABCB1 haplotypic combinations to be significantly associated with clearance and C0h (p = 0.002 and 0.009, respectively). CONCLUSION: This exploratory study suggests that SLC22A1-ABCB1 haplotypes may influence IM pharmacokinetics in Asian CML patients.

  1. Novel full-length major histocompatibility complex class I allele discovery and haplotype definition in pig-tailed macaques.

    Science.gov (United States)

    Semler, Matthew R; Wiseman, Roger W; Karl, Julie A; Graham, Michael E; Gieger, Samantha M; O'Connor, David H

    2017-11-13

    Pig-tailed macaques (Macaca nemestrina, Mane) are important models for human immunodeficiency virus (HIV) studies. Their infectability with minimally modified HIV makes them a uniquely valuable animal model to mimic human infection with HIV and progression to acquired immunodeficiency syndrome (AIDS). However, variation in the pig-tailed macaque major histocompatibility complex (MHC) and the impact of individual transcripts on the pathogenesis of HIV and other infectious diseases is understudied compared to that of rhesus and cynomolgus macaques. In this study, we used Pacific Biosciences single-molecule real-time circular consensus sequencing to describe full-length MHC class I (MHC-I) transcripts for 194 pig-tailed macaques from three breeding centers. We then used the full-length sequences to infer Mane-A and Mane-B haplotypes containing groups of MHC-I transcripts that co-segregate due to physical linkage. In total, we characterized full-length open reading frames (ORFs) for 313 Mane-A, Mane-B, and Mane-I sequences that defined 86 Mane-A and 106 Mane-B MHC-I haplotypes. Pacific Biosciences technology allows us to resolve these Mane-A and Mane-B haplotypes to the level of synonymous allelic variants. The newly defined haplotypes and transcript sequences containing full-length ORFs provide an important resource for infectious disease researchers as certain MHC haplotypes have been shown to provide exceptional control of simian immunodeficiency virus (SIV) replication and prevention of AIDS-like disease in nonhuman primates. The increased allelic resolution provided by Pacific Biosciences sequencing also benefits transplant research by allowing researchers to more specifically match haplotypes between donors and recipients to the level of nonsynonymous allelic variation, thus reducing the risk of graft-versus-host disease.

  2. Knowledge and inference

    CERN Document Server

    Nagao, Makoto

    1990-01-01

    Knowledge and Inference discusses an important problem for software systems: How do we treat knowledge and ideas on a computer and how do we use inference to solve problems on a computer? The book talks about the problems of knowledge and inference for the purpose of merging artificial intelligence and library science. The book begins by clarifying the concept of "knowledge" from many points of view, followed by a chapter on the current state of library science and the place of artificial intelligence in library science. Subsequent chapters cover central topics in the artificial intellig

  3. Geometric statistical inference

    International Nuclear Information System (INIS)

    Periwal, Vipul

    1999-01-01

    A reparametrization-covariant formulation of the inverse problem of probability is explicitly solved for finite sample sizes. The inferred distribution is explicitly continuous for finite sample size. A geometric solution of the statistical inference problem in higher dimensions is outlined

  4. Factor IX gene haplotypes in Amerindians.

    Science.gov (United States)

    Franco, R F; Araújo, A G; Zago, M A; Guerreiro, J F; Figueiredo, M S

    1997-02-01

    We have determined the haplotypes of the factor IX gene for 95 Indians from 5 Brazilian Amazon tribes: Wayampí, Wayana-Apalaí, Kayapó, Arára, and Yanomámi. Eight polymorphisms linked to the factor IX gene were investigated: MseI (at 5', nt -698), BamHI (at 5', nt -561), DdeI (intron 1), BamHI (intron 2), XmnI (intron 3), TaqI (intron 4), MspI (intron 4), and HhaI (at 3', approximately 8 kb). The results of the haplotype distribution and the allele frequencies for each of the factor IX gene polymorphisms in Amerindians were similar to the results reported for Asian populations but differed from results for other ethnic groups. Only five haplotypes were identified within the entire Amerindian study population, and the haplotype distribution was significantly different among the five tribes, with one (Arára) to four (Wayampí) haplotypes being found per tribe. These findings indicate a significant heterogeneity among the Indian tribes and contrast with the homogeneous distribution of the beta-globin gene cluster haplotypes but agree with our recent findings on the distribution of alpha-globin gene cluster haplotypes and the allele frequencies for six VNTRs in the same Amerindian tribes. Our data represent the first study of factor IX-associated polymorphisms in Amerindian populations and emphasize the applicability of these genetic markers for population and human evolution studies.

  5. Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems

    KAUST Repository

    Contreras, Andres A.

    2016-09-19

    A method is presented for inferring the presence of an inclusion inside a domain; the proposed approach is suitable to be used in a diagnostic device with low computational power. Specifically, we use the Bayesian framework for the inference of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacement field on the random elastic moduli of the materials and is computed by means of the stochastic Galerkin (SG) projection method. Moreover, the inclusion's geometry is assumed to be unknown, and this is addressed by using a dictionary consisting of several geometrical models with different configurations. A model selection approach based on the evidence provided by the data (Bayes factors) is used to discriminate among the different geometrical models and select the most suitable one. The idea of using a dictionary of pre-computed geometrical models helps to keep the computational cost of the inference process very low, as most of the computational burden is carried out off-line for the resolution of the SG problems. Numerical tests are used to validate the methodology, assess its performance, and analyze the robustness to model errors. © 2016 Elsevier Ltd

  6. The performance of phylogenetic algorithms in estimating haplotype genealogies with migration.

    Science.gov (United States)

    Salzburger, Walter; Ewing, Greg B; Von Haeseler, Arndt

    2011-05-01

    Genealogies estimated from haplotypic genetic data play a prominent role in various biological disciplines in general and in phylogenetics, population genetics and phylogeography in particular. Several software packages have specifically been developed for the purpose of reconstructing genealogies from closely related, and hence, highly similar haplotype sequence data. Here, we use simulated data sets to test the performance of traditional phylogenetic algorithms, neighbour-joining, maximum parsimony and maximum likelihood in estimating genealogies from nonrecombining haplotypic genetic data. We demonstrate that these methods are suitable for constructing genealogies from sets of closely related DNA sequences with or without migration. As genealogies based on phylogenetic reconstructions are fully resolved, but not necessarily bifurcating, and without reticulations, these approaches outperform widespread 'network' constructing methods. In our simulations of coalescent scenarios involving panmictic, symmetric and asymmetric migration, we found that phylogenetic reconstruction methods performed well, while the statistical parsimony approach as implemented in TCS performed poorly. Overall, parsimony as implemented in the PHYLIP package performed slightly better than other methods. We further point out that we are not making the case that widespread 'network' constructing methods are bad, but that traditional phylogenetic tree finding methods are applicable to haplotypic data and exhibit reasonable performance with respect to accuracy and robustness. We also discuss some of the problems of converting a tree to a haplotype genealogy, in particular that it is nonunique. © 2011 Blackwell Publishing Ltd.

  7. Y-chromosome STR haplotypes in Somalis

    DEFF Research Database (Denmark)

    Hallenberg, Charlotte; Simonsen, Bo; Sanchez Sanchez, Juan Jose

    2005-01-01

    A total of 201 males from Somalia were typed for the Y-chromosome STRs DYS19, DYS385a/b, DYS389-I, DYS389-II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438 and DYS439 with the PowerPlex Y kit (Promega). A total of 96 different haplotypes were observed and the haplotype diversity was 0.9715. The ...
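
    The haplotype diversity quoted above can be illustrated with a short sketch. It assumes Nei's unbiased estimator, H = n/(n-1) * (1 - sum of squared haplotype frequencies); the abstract does not say which estimator was used, and the function name and toy input are hypothetical.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a sample of haplotype labels.

    Assumption: this is the estimator behind figures such as the one in the
    record above; the abstract itself does not name the estimator.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Toy sample: each string stands for one individual's multi-locus STR haplotype.
sample = ["14-11-13", "14-11-13", "15-12-13", "16-11-14"]
print(round(haplotype_diversity(sample), 4))
```

    A value close to 1 means that two randomly drawn individuals almost always carry different haplotypes.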

  8. Estimating haplotype effects for survival data.

    Science.gov (United States)

    Scheike, Thomas H; Martinussen, Torben; Silver, Jeremy D

    2010-09-01

    Genetic association studies often investigate the effect of haplotypes on an outcome of interest. Haplotypes are not observed directly, and this complicates the inclusion of such effects in survival models. We describe a new estimating equations approach for Cox's regression model to assess haplotype effects for survival data. These estimating equations are simple to implement and avoid the use of the EM algorithm, which may be slow in the context of the semiparametric Cox model with incomplete covariate information. These estimating equations also lead to easily computable, direct estimators of standard errors, and thus overcome some of the difficulty in obtaining variance estimators based on the EM algorithm in this setting. We also develop an easily implemented goodness-of-fit procedure for Cox's regression model including haplotype effects. Finally, we apply the procedures presented in this article to investigate possible haplotype effects of the PAF-receptor on cardiovascular events in patients with coronary artery disease, and compare our results to those based on the EM algorithm. © 2009, The International Biometric Society.

  9. The effect of genealogy-based haplotypes on genomic prediction

    DEFF Research Database (Denmark)

    Edriss, Vahid; Fernando, Rohan L.; Su, Guosheng

    2013-01-01

    on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information. Methods A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using...... local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (pi) of the haplotype covariates had zero effect......, i.e. a Bayesian mixture method. Results About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some...

  10. Variations on Bayesian Prediction and Inference

    Science.gov (United States)

    2016-05-09

    inference 2.2.1 Background There are a number of statistical inference problems that are not generally formulated via a full probability model ... the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle

  11. H-PoP and H-PoPG: heuristic partitioning algorithms for single individual haplotyping of polyploids.

    Science.gov (United States)

    Xie, Minzhu; Wu, Qiong; Wang, Jianxin; Jiang, Tao

    2016-12-15

    Some economically important plants including wheat and cotton have more than two copies of each chromosome. With the decreasing cost and increasing read length of next-generation sequencing technologies, reconstructing the multiple haplotypes of a polyploid genome from its sequence reads becomes practical. However, the computational challenge in polyploid haplotyping is much greater than that in diploid haplotyping, and there are few related methods. This article models the polyploid haplotyping problem as an optimal poly-partition problem of the reads, called the Polyploid Balanced Optimal Partition model. For the reads sequenced from a k-ploid genome, the model tries to divide the reads into k groups such that the difference between the reads of the same group is minimized while the difference between the reads of different groups is maximized. When the genotype information is available, the model is extended to the Polyploid Balanced Optimal Partition with Genotype constraint problem. These models are all NP-hard. We propose two heuristic algorithms, H-PoP and H-PoPG, based on dynamic programming and a strategy of limiting the number of intermediate solutions at each iteration, to solve the two models, respectively. Extensive experimental results on simulated and real data show that our algorithms can solve the models effectively, and are much faster and more accurate than the recent state-of-the-art polyploid haplotyping algorithms. The experiments also show that our algorithms can deal with long reads and deep read coverage effectively and accurately. Furthermore, H-PoP might be applied to help determine the ploidy of an organism. https://github.com/MinzhuXie/H-PoPG. Contact: xieminzhu@hotmail.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
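
    To make the partition idea above concrete, the following sketch scores one candidate assignment of reads to k groups by counting alleles that disagree with their group's consensus. It covers only the within-group half of the objective and is not the H-PoP algorithm; the function name, the read encoding (equal-length strings with "-" for uncovered sites) and the toy data are assumptions made for illustration.

```python
from collections import Counter

def within_group_cost(reads, groups, k):
    """Number of called alleles that differ from their group's consensus.

    reads  : equal-length strings over allele symbols, "-" = site not covered
    groups : groups[i] is the group index (0..k-1) assigned to reads[i]
    """
    cost = 0
    for g in range(k):
        members = [r for r, a in zip(reads, groups) if a == g]
        if not members:
            continue
        # Walk the sites column by column within this group.
        for column in zip(*members):
            called = [c for c in column if c != "-"]
            if called:
                majority = Counter(called).most_common(1)[0][1]
                cost += len(called) - majority
    return cost

reads = ["0101", "0101", "1010", "10-0"]
print(within_group_cost(reads, [0, 0, 1, 1], k=2))  # consistent grouping
print(within_group_cost(reads, [0, 1, 0, 1], k=2))  # mixed grouping
```

    A heuristic search would compare many such assignments; the scoring function alone says nothing about how the search is organised.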

  12. Founder haplotype analysis of Fanconi anemia in the Korean population finds common ancestral haplotypes for a FANCG variant.

    Science.gov (United States)

    Park, Joonhong; Kim, Myungshin; Jang, Woori; Chae, Hyojin; Kim, Yonggoo; Chung, Nack-Gyun; Lee, Jae-Wook; Cho, Bin; Jeong, Dae-Chul; Park, In Yang; Park, Mi Sun

    2015-05-01

    A common ancestral haplotype is strongly suggested in the Korean and Japanese patients with Fanconi anemia (FA), because common mutations have been frequently found: c.2546delC and c.3720_3724delAAACA of FANCA; c.307+1G>C, c.1066C>T, and c.1589_1591delATA of FANCG. Our aim in this study was to investigate the origin of these common mutations of FANCA and FANCG. We genotyped 13 FA patients consisting of five FA-A patients and eight FA-G patients from the Korean FA population. Microsatellite markers used for haplotype analysis included four CA repeat markers which are closely linked with FANCA and eight CA repeat markers which are contiguous with FANCG. As a result, Korean FA-A patients carrying c.2546delC or c.3720_3724delAAACA did not share the same haplotypes. However, three unique haplotypes carrying c.307+1G>C, c.1066C>T, or c.1589_1591delATA, that consisted of eight polymorphic loci covering a flanking region were strongly associated with Korean FA-G, consistent with founder haplotypes reported previously in the Japanese FA-G population. Our finding confirmed the common ancestral haplotypes on the origins of the East Asian FA-G patients, which will improve our understanding of the molecular population genetics of FA-G. To the best of our knowledge, this is the first report on the association between disease-linked mutations and common ancestral haplotypes in the Korean FA population. © 2015 John Wiley & Sons Ltd/University College London.

  13. Inference in "poor" languages

    Energy Technology Data Exchange (ETDEWEB)

    Petrov, S.

    1996-10-01

    Languages with a solvable implication problem but without complete and consistent systems of inference rules ("poor" languages) are considered. The problem of the existence of a finite complete and consistent inference rule system for a "poor" language is stated independently of the language or rule syntax. Several properties of the problem are proved. An application of the results to the language of join dependencies is given.

  14. Estimating haplotype effects for survival data

    DEFF Research Database (Denmark)

    Scheike, Thomas; Martinussen, Torben; Silver, J

    2010-01-01

    Genetic association studies often investigate the effect of haplotypes on an outcome of interest. Haplotypes are not observed directly, and this complicates the inclusion of such effects in survival models. We describe a new estimating equations approach for Cox's regression model to assess haplo...

  15. Inferring mechanisms of copy number change from haplotype structures at the human DEFA1A3 locus

    OpenAIRE

    Black, Holly A; Khan, Fayeza F; Tyson, Jess; Armour, John AL

    2014-01-01

    Background The determination of structural haplotypes at copy number variable regions can indicate the mechanisms responsible for changes in copy number, as well as explain the relationship between gene copy number and expression. However, obtaining spatial information at regions displaying extensive copy number variation, such as the DEFA1A3 locus, is complex, because of the difficulty in the phasing and assembly of these regions. The DEFA1A3 locus is intriguing in that it falls within a reg...

  16. Diversity and population structure of Plasmodium falciparum in Thailand based on the spatial and temporal haplotype patterns of the C-terminal 19-kDa domain of merozoite surface protein-1.

    Science.gov (United States)

    Simpalipan, Phumin; Pattaradilokrat, Sittiporn; Siripoon, Napaporn; Seugorn, Aree; Kaewthamasorn, Morakot; Butcher, Robert D J; Harnyuttanakorn, Pongchai

    2014-02-12

    The 19-kDa C-terminal region of the merozoite surface protein-1 of the human malaria parasite Plasmodium falciparum (PfMSP-119) constitutes the major component on the surface of merozoites and is considered as one of the leading candidates for asexual blood stage vaccines. Because the protein exhibits a level of sequence variation that may compromise the effectiveness of a vaccine, the global sequence diversity of PfMSP-119 has been subjected to extensive research, especially in malaria endemic areas. In Thailand, PfMSP-119 sequences have been derived from a single parasite population in Tak province, located along the Thailand-Myanmar border, since 1995. However, the extent of sequence variation and the spatiotemporal patterns of the MSP-119 haplotypes along the Thai borders with Laos and Cambodia are unknown. Sixty-three isolates of P. falciparum from five geographically isolated populations along the Thai borders with Myanmar, Laos and Cambodia in three transmission seasons between 2002 and 2008 were collected and culture-adapted. The msp-1 gene block 17 was sequenced and analysed for the allelic diversity, frequency and distribution patterns of PfMSP-119 haplotypes in individual populations. The PfMSP-119 haplotype patterns were then compared between parasite populations to infer the population structure and genetic differentiation of the malaria parasite. Five conserved polymorphic positions, which accounted for five distinct haplotypes, of PfMSP-119 were identified. Differences in the prevalence of PfMSP-119 haplotypes were detected in different geographical regions, with the highest levels of genetic diversity being found in the Kanchanaburi and Ranong provinces along the Thailand-Myanmar border and Trat province located at the Thailand-Cambodia border. Despite this variability, the distribution patterns of individual PfMSP-119 haplotypes seemed to be very similar across the country and over the three malarial transmission seasons, suggesting that gene flow

  17. Genomic identification of founding haplotypes reveals the history of the selfing species Capsella rubella.

    Directory of Open Access Journals (Sweden)

    Yaniv Brandvain

    The shift from outcrossing to self-fertilization is among the most common evolutionary transitions in flowering plants. Until recently, however, a genome-wide view of this transition has been obscured by both a dearth of appropriate data and the lack of appropriate population genomic methods to interpret such data. Here, we present a novel population genomic analysis detailing the origin of the selfing species, Capsella rubella, which recently split from its outcrossing sister, Capsella grandiflora. Due to the recency of the split, much of the variation within C. rubella is also found within C. grandiflora. We can therefore identify genomic regions where two C. rubella individuals have inherited the same or different segments of ancestral diversity (i.e. founding haplotypes) present in C. rubella's founder(s). Based on this analysis, we show that C. rubella was founded by multiple individuals drawn from a diverse ancestral population closely related to extant C. grandiflora, that drift and selection have rapidly homogenized most of this ancestral variation since C. rubella's founding, and that little novel variation has accumulated within this time. Despite the extensive loss of ancestral variation, the approximately 25% of the genome for which two C. rubella individuals have inherited different founding haplotypes makes up roughly 90% of the genetic variation between them. To extend these findings, we develop a coalescent model that utilizes the inferred frequency of founding haplotypes and variation within founding haplotypes to estimate that C. rubella was founded by a potentially large number of individuals between 50 and 100 kya, and has subsequently experienced a twenty-fold reduction in its effective population size. As population genomic data from an increasing number of outcrossing/selfing pairs are generated, analyses like the one developed here will facilitate a fine-scaled view of the evolutionary and demographic impact of the

  18. The effect of using genealogy-based haplotypes for genomic prediction.

    Science.gov (United States)

    Edriss, Vahid; Fernando, Rohan L; Su, Guosheng; Lund, Mogens S; Guldbrandtsen, Bernt

    2013-03-06

    Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information. A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (π) of the haplotype covariates had zero effect, i.e. a Bayesian mixture method. About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some cases, decreased the bias of prediction. With the Bayesian method, accuracy of prediction was less sensitive to parameter π when fitting haplotypes compared to fitting markers. Use of haplotypes based on genealogy can slightly increase the accuracy of genomic prediction. Improved methods to cluster the haplotypes constructed from local genealogy could lead to additional gains in accuracy.

  19. Optimization methods for logical inference

    CERN Document Server

    Chandru, Vijay

    2011-01-01

    Merging logic and mathematics in deductive inference: an innovative, cutting-edge approach. Optimization methods for logical inference? Absolutely, say Vijay Chandru and John Hooker, two major contributors to this rapidly expanding field. And even though "solving logical inference problems with optimization methods may seem a bit like eating sauerkraut with chopsticks. . . it is the mathematical structure of a problem that determines whether an optimization model can help solve it, not the context in which the problem occurs." Presenting powerful, proven optimization techniques for logic in

  20. Strategies for haplotype-based association mapping in complex pedigreed populations

    DEFF Research Database (Denmark)

    Boleckova, J; Christensen, Ole Fredslund; Sørensen, Peter

    2012-01-01

    In association mapping, haplotype-based methods are generally regarded to provide higher power and increased precision than methods based on single markers. For haplotype-based association mapping most studies use a fixed haplotype effect in the model. However, an increase in haplotype length inc...

  1. Conflation of Short Identity-by-Descent Segments Bias Their Inferred Length Distribution

    Directory of Open Access Journals (Sweden)

    Charleston W. K. Chiang

    2016-05-01

    Identity-by-descent (IBD) is a fundamental concept in genetics with many applications. In a common definition, two haplotypes are said to share an IBD segment if that segment is inherited from a recent shared common ancestor without intervening recombination. Segments several cM long can be efficiently detected by a number of algorithms using high-density SNP array data from a population sample, and there are currently efforts to detect shorter segments from sequencing. Here, we study a problem of identifiability: because existing approaches detect IBD based on contiguous segments of identity-by-state, inferred long segments of IBD may arise from the conflation of smaller, nearby IBD segments. We quantified this effect using coalescent simulations, finding that significant proportions of inferred segments 1–2 cM long are results of conflations of two or more shorter segments, each at least 0.2 cM or longer, under demographic scenarios typical for modern humans for all programs tested. The impact of such conflation is much smaller for longer (> 2 cM) segments. This biases the inferred IBD segment length distribution, and so can affect downstream inferences that depend on the assumption that each segment of IBD derives from a single common ancestor. As an example, we present and analyze an estimator of the de novo mutation rate using IBD segments, and demonstrate that unmodeled conflation leads to underestimates of the ages of the common ancestors on these segments, and hence a significant overestimate of the mutation rate. Understanding the conflation effect in detail will make its correction in future methods more tractable.

  2. MHC Class II haplotypes of Colombian Amerindian tribes

    Science.gov (United States)

    Yunis, Juan J.; Yunis, Edmond J.; Yunis, Emilio

    2013-01-01

    We analyzed 1041 individuals belonging to 17 Amerindian tribes of Colombia, Chimila, Bari and Tunebo (Chibcha linguistic family), Embera, Waunana (Choco linguistic family), Puinave and Nukak (Maku-Puinave linguistic families), Cubeo, Guanano, Tucano, Desano and Piratapuyo (Tukano linguistic family), Guahibo and Guayabero (Guayabero linguistic family), Curripaco and Piapoco (Arawak linguistic family) and Yucpa (Karib linguistic family), for MHC class II haplotypes (HLA-DRB1, DQA1, DQB1). Approximately 90% of the MHC class II haplotypes found among these tribes are haplotypes frequently encountered in other Amerindian tribes. Nonetheless, striking differences were observed among Chibcha and non-Chibcha speaking tribes. The DRB1*04:04, DRB1*04:11, DRB1*09:01 carrying haplotypes were frequently found among non-Chibcha speaking tribes, while the DRB1*04:07 haplotype showed significant frequencies among Chibcha speaking tribes, and only marginal frequencies among non-Chibcha speaking tribes. Our results suggest that the differences in MHC class II haplotype frequency found among Chibcha and non-Chibcha speaking tribes could be due to genetic differentiation in Mesoamerica of the ancestral Amerindian population into Chibcha and non-Chibcha speaking populations before they entered into South America. PMID:23885196

  3. Haplotype-based stratification of Huntington's disease.

    Science.gov (United States)

    Chao, Michael J; Gillis, Tammy; Atwal, Ranjit S; Mysore, Jayalakshmi Srinidhi; Arjomand, Jamshid; Harold, Denise; Holmans, Peter; Jones, Lesley; Orth, Michael; Myers, Richard H; Kwak, Seung; Wheeler, Vanessa C; MacDonald, Marcy E; Gusella, James F; Lee, Jong-Min

    2017-11-01

    Huntington's disease (HD) is an autosomal dominant neurodegenerative disease caused by expansion of a CAG trinucleotide repeat in HTT, resulting in an extended polyglutamine tract in huntingtin. We and others have previously determined that the HD-causing expansion occurs on multiple different haplotype backbones, reflecting more than one ancestral origin of the same type of mutation. In view of the therapeutic potential of mutant allele-specific gene silencing, we have compared and integrated two major systems of HTT haplotype definition, combining data from 74 sequence variants to identify the most frequent disease-associated and control chromosome backbones and revealing that there is potential for additional resolution of HD haplotypes. We have used the large collection of 4078 heterozygous HD subjects analyzed in our recent genome-wide association study of HD age at onset to estimate the frequency of these haplotypes in European subjects, finding that common genetic variation at HTT can distinguish the normal and CAG-expanded chromosomes for more than 95% of European HD individuals. As a resource for the HD research community, we have also determined the haplotypes present in a series of publicly available HD subject-derived fibroblasts, induced pluripotent cells, and embryonic stem cells in order to facilitate efforts to develop inclusive methods of allele-specific HTT silencing applicable to most HD patients. Our data providing genetic guidance for therapeutic gene-based targeting will significantly contribute to the developments of rational treatments and implementation of precision medicine in HD.

  4. Haplotype of platelet receptor P2RY12 gene is associated with residual clopidogrel on-treatment platelet reactivity.

    Science.gov (United States)

    Nie, Xiao-Yan; Li, Jun-Lei; Zhang, Yong; Xu, Yang; Yang, Xue-Li; Fu, Yu; Liang, Guang-Kai; Lu, Yun; Liu, Jian; Shi, Lu-Wen

    To investigate a possible association between common variations of the P2RY12 and the residual clopidogrel on-treatment platelet reactivity after adjusting for the influence of CYP2C19 tested by thromboelastography (TEG). One hundred and eighty patients with acute coronary syndrome (ACS) treated with clopidogrel and aspirin were included and platelet function was assessed by TEG. Five selected P2RY12 single nucleotide polymorphisms (SNPs; rs6798347, rs6787801, rs6801273, rs6785930, and rs2046934), which cover the common variations in the P2RY12 gene and its regulatory regions, and three CYP2C19 SNPs (*2, *3, *17) were genotyped and possible haplotypes were analyzed. The high on-treatment platelet reactivity (HTPR) prevalence defined by a platelet inhibition rate <30% by TEG in adenosine diphosphate (ADP)-channel was 69 (38.33%). Six common haplotypes were inferred from four of the selected P2RY12 SNPs (denoted H0 to H5) according to the linkage disequilibrium R square (except for rs2046934). Haplotype H1 showed a significantly lower incidence of HTPR than the reference haplotype (H0) in the total study population while haplotypes H1 and H2 showed significantly lower incidences of HTPR than H0 in the nonsmoker subgroup after adjusting for CYP2C19 effects and demographic characteristics. rs2046934 (T744C) did not show any significant association with HTPR. The combination of common P2RY12 variations including regulatory regions rather than rs2046934 (T744C) that related to pharmacodynamics of clopidogrel in patients with ACS was independently associated with residual on-clopidogrel platelet reactivity. This is apart from the established association of the CYP2C19. This association seemed more important in the subgroup defined by smoking.

  5. SEMANTIC PATCH INFERENCE

    DEFF Research Database (Denmark)

    Andersen, Jesper

    2009-01-01

    Collateral evolution is the problem of updating several library-using programs in response to API changes in the used library. In this dissertation we address the issue of understanding collateral evolutions by automatically inferring a high-level specification of the changes evident in a given set ...... specifications inferred by spdiff in Linux are shown. We find that the inferred specifications concisely capture the actual collateral evolution performed in the examples....

  6. Interactive Instruction in Bayesian Inference

    DEFF Research Database (Denmark)

    Khan, Azam; Breslav, Simon; Hornbæk, Kasper

    2018-01-01

    An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction. These principles concern coherence, personalization, signaling, segmenting, multimedia, spatial contiguity, and pretraining. Principles of self-explanation and interactivity are also applied. Four experiments on the Mammography Problem showed that these principles help participants answer the questions ... that an instructional approach to improving human performance in Bayesian inference is a promising direction....
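
    For readers unfamiliar with the Mammography Problem, the calculation it asks for is a single application of Bayes' rule. The sketch below uses figures commonly quoted for the classic version of the problem (1% prevalence, 80% sensitivity, 9.6% false-positive rate); these numbers are an assumption, since the abstract does not state which values the study used.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' rule."""
    true_pos = prior * sensitivity                    # has condition, tests positive
    false_pos = (1.0 - prior) * false_positive_rate   # no condition, tests positive
    return true_pos / (true_pos + false_pos)

# Illustrative inputs only (assumed, not taken from the abstract).
p = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"{p:.3f}")  # prints 0.078
```

    The low posterior, despite a fairly sensitive test, is what makes the problem a standard example of base-rate neglect.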

  7. Prion gene haplotypes of U.S. cattle

    Directory of Open Access Journals (Sweden)

    Harhay Gregory P

    2006-11-01

    Full Text Available Abstract Background Bovine spongiform encephalopathy (BSE) is a fatal neurological disorder characterized by abnormal deposits of a protease-resistant isoform of the prion protein. Characterizing linkage disequilibrium (LD) and haplotype networks within the bovine prion gene (PRNP) is important for (1) testing rare or common PRNP variation for an association with BSE and (2) interpreting any association of PRNP alleles with BSE susceptibility. The objective of this study was to identify polymorphisms and haplotypes within PRNP from the promoter region through the 3'UTR in a diverse sample of U.S. cattle genomes. Results A 25.2-kb genomic region containing PRNP was sequenced from 192 diverse U.S. beef and dairy cattle. Sequence analyses identified 388 total polymorphisms, of which 287 have not previously been reported. The polymorphism alleles define PRNP by regions of high and low LD. High LD is present between alleles in the promoter region through exon 2 (6.7 kb). PRNP alleles within the majority of intron 2, the entire coding sequence and the untranslated region of exon 3 are in low LD (18.0 kb). Two haplotype networks, one representing the region of high LD and the other the region of low LD, yielded nineteen different combinations that represent haplotypes spanning PRNP. The haplotype combinations are tagged by 19 polymorphisms (htSNPs), which characterize variation within and across PRNP. Conclusion The number of polymorphisms in the prion gene region of U.S. cattle is nearly four times greater than previously described. These polymorphisms define PRNP haplotypes that may influence BSE susceptibility in cattle.

  8. In Vivo Characterization of Human APOA5 Haplotypes

    Energy Technology Data Exchange (ETDEWEB)

    Ahituv, Nadav; Akiyama, Jennifer; Chapman-Helleboid, Audrey; Fruchart, Jamila; Pennacchio, Len A.

    2006-10-01

    Increased plasma triglyceride concentrations are an independent risk factor for cardiovascular disease. Numerous studies support a reproducible genetic association between two minor haplotypes in the human apolipoprotein A5 gene (APOA5) and increased plasma triglyceride concentrations. We thus sought to investigate the effect of these minor haplotypes (APOA5*2 and APOA5*3) on ApoAV plasma levels through the precise insertion of single-copy intact APOA5 haplotypes at a targeted location in the mouse genome. While we found no difference in the amount of human plasma ApoAV in mice containing the common APOA5*1 and minor APOA5*2 haplotypes, the introduction of the single APOA5*3-defining allele (19W) resulted in 3-fold lower ApoAV plasma levels, consistent with existing genetic association studies. These results indicate that the S19W polymorphism is likely to be functional and explain the strong association of this variant with plasma triglycerides, supporting the value of sensitive in vivo assays to define the functional nature of human haplotypes.

  9. Entropic Inference

    Science.gov (United States)

    Caticha, Ariel

    2011-03-01

    In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops—the Maximum Entropy and the Bayesian methods—into a single general inference scheme.

  10. A combined evidence Bayesian method for human ancestry inference applied to Afro-Colombians.

    Science.gov (United States)

    Rishishwar, Lavanya; Conley, Andrew B; Vidakovic, Brani; Jordan, I King

    2015-12-15

    Uniparental genetic markers, mitochondrial DNA (mtDNA) and Y chromosomal DNA, are widely used for the inference of human ancestry. However, the resolution of ancestral origins based on mtDNA haplotypes is limited by the fact that such haplotypes are often found to be distributed across wide geographical regions. We have addressed this issue here by combining two sources of ancestry information that have typically been considered separately: historical records regarding population origins and genetic information on mtDNA haplotypes. To combine these distinct data sources, we applied a Bayesian approach that considers historical records, in the form of prior probabilities, together with data on the geographical distribution of mtDNA haplotypes, formulated as likelihoods, to yield ancestry assignments from posterior probabilities. This combined evidence Bayesian approach to ancestry assignment was evaluated for its ability to accurately assign sub-continental African ancestral origins to Afro-Colombians based on their mtDNA haplotypes. We demonstrate that the incorporation of historical prior probabilities via this analytical framework can provide for substantially increased resolution in sub-continental African ancestry assignment for members of this population. In addition, a personalized approach to ancestry assignment that involves the tuning of priors to individual mtDNA haplotypes yields even greater resolution for individual ancestry assignment. Despite the fact that Colombia has a large population of Afro-descendants, the ancestry of this community has been understudied relative to populations with primarily European and Native American ancestry. Thus, the application of the kind of combined evidence approach developed here to the study of ancestry in the Afro-Colombian population has the potential to be impactful. The formal Bayesian analytical framework we propose for combining historical and genetic information also has the potential to be widely applied
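
The combination of priors and likelihoods described in this abstract can be illustrated with a minimal sketch of Bayes' rule over a discrete set of candidate origins. The region labels and all numbers below are hypothetical placeholders, not values from the study, and the function name `posterior` is an assumption made for illustration.

```python
def posterior(prior, likelihood):
    """Combine prior probabilities with likelihoods via Bayes' rule.

    prior:      dict mapping candidate origin -> prior probability
                (e.g. derived from historical records)
    likelihood: dict mapping candidate origin -> P(haplotype | origin)
                (e.g. derived from the haplotype's geographic distribution)
    """
    unnormalized = {k: prior[k] * likelihood.get(k, 0.0) for k in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("haplotype has zero likelihood under every origin")
    return {k: v / total for k, v in unnormalized.items()}


# Hypothetical inputs: three candidate regions, one observed haplotype.
prior = {"region_A": 0.6, "region_B": 0.3, "region_C": 0.1}
likelihood = {"region_A": 0.02, "region_B": 0.08, "region_C": 0.01}

post = posterior(prior, likelihood)
best = max(post, key=post.get)  # origin with the highest posterior probability
```

In this toy case the likelihood outweighs the prior, so the assignment moves to `region_B` even though `region_A` has the larger prior.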

  11. Effects of the number of markers per haplotype and clustering of haplotypes on the accuracy of QTL mapping and prediction of genomic breeding values

    NARCIS (Netherlands)

    Calus, M.P.L.; Meuwissen, T.H.E.; Windig, J.J.; Knol, E.F.; Schrooten, C.; Vereijken, A.L.J.; Veerkamp, R.F.

    2009-01-01

    The aim of this paper was to compare the effect of haplotype definition on the precision of QTL-mapping and on the accuracy of predicted genomic breeding values. In a multiple QTL model using identity-by-descent (IBD) probabilities between haplotypes, various haplotype definitions were tested i.e.

  12. The Network Completion Problem: Inferring Missing Nodes and Edges in Networks

    Energy Technology Data Exchange (ETDEWEB)

    Kim, M; Leskovec, J

    2011-11-14

    Network structures, such as social networks, web graphs and networks from systems biology, play important roles in many areas of science and our everyday lives. In order to study the networks one needs to first collect reliable large scale network data. While the social and information networks have become ubiquitous, the challenge of collecting complete network data still persists. Many times the collected network data is incomplete with nodes and edges missing. Commonly, only a part of the network can be observed and we would like to infer the unobserved part of the network. We address this issue by studying the Network Completion Problem: Given a network with missing nodes and edges, can we complete the missing part? We cast the problem in the Expectation Maximization (EM) framework where we use the observed part of the network to fit a model of network structure, and then we estimate the missing part of the network using the model, re-estimate the parameters and so on. We combine the EM with the Kronecker graphs model and design a scalable Metropolized Gibbs sampling approach that allows for the estimation of the model parameters as well as the inference about missing nodes and edges of the network. Experiments on synthetic and several real-world networks show that our approach can effectively recover the network even when about half of the nodes in the network are missing. Our algorithm outperforms not only classical link-prediction approaches but also the state of the art Stochastic block modeling approach. Furthermore, our algorithm easily scales to networks with tens of thousands of nodes.

  13. The Improvement of Communication and Inference Skills in Colloid System Material by Problem Solving Learning Model

    OpenAIRE

    maisarera, yunita; diawati, chansyanah; fadiawati, noor

    2012-01-01

    The aim of this research is to describe the effectiveness of problem solving learning in improving communication and inference skills in colloid system material. Subjects in this research were students of the XI IPA1 and XI IPA2 classrooms in Persada Junior High School in Bandar Lampung in academic year 2011-2012, where students of both classrooms had the same characteristics. This research used a quasi-experimental method and a pretest-posttest control group design. Effectiveness of problem solving le...

  14. A DRD1 haplotype is associated with risk for autism spectrum disorders in male-only affected sib-pair families.

    Science.gov (United States)

    Hettinger, Joe A; Liu, Xudong; Schwartz, Charles E; Michaelis, Ron C; Holden, Jeanette J A

    2008-07-05

    Individuals with autism spectrum disorders (ASDs) have impairments in executive function and social cognition, with males generally being more severely affected in these areas than females. Because the dopamine D1 receptor (encoded by DRD1) is integral to the neural circuitry mediating these processes, we examined the DRD1 gene for its role in susceptibility to ASDs by performing single marker and haplotype case-control comparisons, family-based association tests, and genotype-phenotype assessments (quantitative transmission disequilibrium tests: QTDT) using three DRD1 polymorphisms, rs265981C/T, rs4532A/G, and rs686T/C. Our previous findings suggested that the dopaminergic system may be more integrally involved in families with affected males only than in other families. We therefore restricted our study to families with two or more affected males (N = 112). There was over-transmission of rs265981-C and rs4532-A in these families (P = 0.040, P = 0.038), with haplotype TDT analysis showing over-transmission of the C-A-T haplotype (P = 0.022) from mothers to affected sons (P = 0.013). In addition, haplotype case-control comparisons revealed an increase of this putative risk haplotype in affected individuals relative to a comparison group (P = 0.004). QTDT analyses showed associations of the rs265981-C, rs4532-A, rs686-T alleles, and the C-A-T haplotype with more severe problems in social interaction, greater difficulties with nonverbal communication and increased stereotypies compared to individuals with other haplotypes. Preferential haplotype transmission of markers at the DRD1 locus and an increased frequency of a specific haplotype support the DRD1 gene as a risk gene for core symptoms of ASD in families having only affected males. Copyright 2008 Wiley-Liss, Inc.

  15. Entropic Inference

    OpenAIRE

    Caticha, Ariel

    2010-01-01

    In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEn...

  16. Phylogeography of Thlaspi arvense (Brassicaceae) in China Inferred from Chloroplast and Nuclear DNA Sequences and Ecological Niche Modeling

    Directory of Open Access Journals (Sweden)

    Miao An

    2015-06-01

    Full Text Available Thlaspi arvense is a well-known annual farmland weed with worldwide distribution, which can be found from sea level to above 4000 m high on the Qinghai-Tibetan Plateau (QTP). In this paper, a phylogeographic history of T. arvense including 19 populations from China was inferred by using three chloroplast (cp) DNA segments (trnL-trnF, rpl32-trnL and rps16) and one nuclear (n) DNA segment (Fe-regulated transporter-like protein, ZIP). A total of 11 chloroplast haplotypes and six nuclear alleles were identified, and haplotypes unique to the QTP were recognized (C4, C5, C7 and N4). On the basis of molecular dating, haplotypes C4, C5 and C7 have separated from others around 1.58 Ma for cpDNA, which corresponds to the QTP uplift. In addition, this article suggests that the T. arvense populations in China are a mixture of diverged subpopulations as inferred by hT/vT test (hT ≤ vT, cpDNA) and positive Tajima’s D values (1.87, 0.05 < p < 0.10 for cpDNA and 3.37, p < 0.01 for nDNA). Multimodality mismatch distribution curves and a relatively large shared area of suitable environmental conditions between the Last Glacial Maximum (LGM) and the present time, as recognized by MaxEnt software, reject the sudden expansion population model.

  17. Aspects combinatoires des réarrangements génomiques et des réseaux d'haplotypes

    OpenAIRE

    Labarre, Anthony

    2008-01-01

    The dissertation covers two problems motivated by computational biology: genome rearrangements, and haplotype networks. Genome rearrangement problems are a particular case of edit distance problems, where one seeks to transform two given objects into one another using as few operations as possible, with the additional constraint that the set of allowed operations is fixed beforehand; we are also interested in computing the corresponding distances between those objects, i.e. merely computing th...

  18. Practical Bayesian Inference

    Science.gov (United States)

    Bailer-Jones, Coryn A. L.

    2017-04-01

    Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.

  19. Mineralocorticoid receptor haplotype, oral contraceptives and emotional information processing.

    Science.gov (United States)

    Hamstra, D A; de Kloet, E R; van Hemert, A M; de Rijk, R H; Van der Does, A J W

    2015-02-12

    Oral contraceptives (OCs) affect mood in some women and may have more subtle effects on emotional information processing in many more users. Female carriers of mineralocorticoid receptor (MR) haplotype 2 have been shown to be more optimistic and less vulnerable to depression. To investigate the effects of oral contraceptives on emotional information processing and a possible moderating effect of MR haplotype. Cross-sectional study in 85 healthy premenopausal women of West-European descent. We found significant main effects of oral contraceptives on facial expression recognition, emotional memory and decision-making. Furthermore, carriers of MR haplotype 1 or 3 were sensitive to the impact of OCs on the recognition of sad and fearful faces and on emotional memory, whereas MR haplotype 2 carriers were not. Different compounds of OCs were included. No hormonal measures were taken. Most naturally cycling participants were assessed in the luteal phase of their menstrual cycle. Carriers of MR haplotype 2 may be less sensitive to depressogenic side-effects of OCs. Copyright © 2015 IBRO. Published by Elsevier Ltd. All rights reserved.

  20. Genome-wide haplotype analysis of cis expression quantitative trait loci in monocytes.

    Directory of Open Access Journals (Sweden)

    Sophie Garnier

    Full Text Available In order to assess whether gene expression variability could be influenced by several SNPs acting in cis, either through additive or more complex haplotype effects, a systematic genome-wide search for cis haplotype expression quantitative trait loci (eQTL) was conducted in a sample of 758 individuals, part of the Cardiogenics Transcriptomic Study, for which genome-wide monocyte expression and GWAS data were available. 19,805 RNA probes were assessed for cis haplotypic regulation through investigation of ~2.1 × 10^9 haplotypic combinations. 2,650 probes demonstrated haplotypic p-values >10^4-fold smaller than the best single SNP p-value. Replication of significant haplotype effects was tested for 412 probes for which SNPs (or proxies) that defined the detected haplotypes were available in the Gutenberg Health Study composed of 1,374 individuals. At the Bonferroni correction level of 1.2 × 10^-4 (~0.05/412), 193 haplotypic signals replicated. 1000 G imputation was then conducted, and 105 haplotypic signals still remained more informative than imputed SNPs. In-depth analysis of these 105 cis eQTL revealed that at 76 loci genetic associations were compatible with additive effects of several SNPs, while for the 29 remaining regions data could be compatible with a more complex haplotypic pattern. As 24 of the 105 cis eQTL have previously been reported to be disease-associated loci, this work highlights the need for conducting haplotype-based and 1000 G imputed cis eQTL analysis before commencing functional studies at disease-associated loci.
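
The Bonferroni level quoted in this abstract (~0.05/412) is simple arithmetic, sketched below. The family-wise level 0.05 and the 412 tests come from the abstract; the function names and the example p-values are hypothetical and only illustrate how such a threshold would be applied.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that controls the family-wise error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def replicated(p_values, alpha, n_tests):
    """Flag which p-values pass the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p < threshold for p in p_values]


threshold = bonferroni_threshold(0.05, 412)  # roughly 1.2e-4

# Hypothetical replication p-values for three probes.
flags = replicated([3.0e-6, 1.5e-4, 0.02], alpha=0.05, n_tests=412)
```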

  1. Mineralocorticoid receptor haplotype, estradiol, progesterone and emotional information processing.

    Science.gov (United States)

    Hamstra, Danielle A; de Kloet, E Ronald; Quataert, Ina; Jansen, Myrthe; Van der Does, Willem

    2017-02-01

    Carriers of MR-haplotype 1 and 3 (GA/CG; rs5522 and rs2070951) are more sensitive to the influence of oral contraceptives (OC) and menstrual cycle phase on emotional information processing than MR-haplotype 2 (CA) carriers. We investigated whether this effect is associated with estradiol (E2) and/or progesterone (P4) levels. Healthy MR-genotyped premenopausal women were tested twice in a counterbalanced design. Naturally cycling (NC) women were tested in the early-follicular and mid-luteal phase and OC-users during OC-intake and in the pill-free week. At both sessions E2 and P4 were assessed in saliva. Tests included implicit and explicit positive and negative affect, attentional blink accuracy, emotional memory, emotion recognition, and risky decision-making (gambling). MR-haplotype 2 homozygotes had higher implicit happiness scores than MR-haplotype 2 heterozygotes (p=0.031) and MR-haplotype 1/3 carriers (pemotion recognition test than MR-haplotype 1/3 (p=0.001). Practice effects were observed for most measures. The pattern of correlations between information processing and P4 or E2 differed between sessions, as well as the moderating effects of the MR genotype. In the first session the MR-genotype moderated the influence of P4 on implicit anxiety (sr=-0.30; p=0.005): higher P4 was associated with reduction in implicit anxiety, but only in MR-haplotype 2 homozygotes (sr=-0.61; p=0.012). In the second session the MR-genotype moderated the influence of E2 on the recognition of facial expressions of happiness (sr=-0.21; p=0.035): only in MR-haplotype 1/3 higher E2 was correlated with happiness recognition (sr=0.29; p=0.005). In the second session higher E2 and P4 were negatively correlated with accuracy in lag2 trials of the attentional blink task (pemotional information processing. This moderating effect may depend on the novelty of the situation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. Beta-globin gene cluster haplotypes of Amerindian populations from the Brazilian Amazon region.

    Science.gov (United States)

    Guerreiro, J F; Figueiredo, M S; Zago, M A

    1994-01-01

    We have determined the beta-globin cluster haplotypes for 80 Indians from four Brazilian Amazon tribes: Kayapó, Wayampí, Wayana-Apalaí, and Arára. The results are analyzed together with 20 Yanomámi previously studied. From 2 to 4 different haplotypes were identified for each tribe, and 7 of the possible 32 haplotypes were found in a sample of 172 chromosomes for which the beta haplotypes were directly determined or derived from family studies. The haplotype distribution does not differ significantly among the five populations. The two most common haplotypes in all tribes were haplotypes 2 and 6, with average frequencies of 0.843 and 0.122, respectively. The genetic affinities between Brazilian Indians and other human populations were evaluated by estimates of genetic distance based on haplotype data. The lowest values were observed in relation to Asians, especially Chinese, Polynesians, and Micronesians.

  3. Inference Attacks and Control on Database Structures

    Directory of Open Access Journals (Sweden)

    Muhamed Turkanovic

    2015-02-01

    Full Text Available Today’s databases store information with sensitivity levels that range from public to highly sensitive, hence ensuring confidentiality can be highly important, but also requires costly control. This paper focuses on the inference problem on different database structures. It presents possible threats to privacy related to inference, and control methods for mitigating these threats. The paper shows that using only access control, without any inference control, is inadequate, since these models are unable to protect against indirect data access. Furthermore, it covers new inference problems which arise from the dimensions of new technologies like XML, semantics, etc.
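
To make "indirect data access" concrete, the sketch below shows a classic differencing attack: a user who may only run aggregate queries still recovers one individual's value by subtracting two permitted sums. The records, names and the `guarded_sum` control (a minimum query-set size) are hypothetical illustrations and are not taken from the paper.

```python
# Hypothetical table; direct reads of individual salaries are assumed forbidden.
records = [
    {"name": "alice", "dept": "sales", "salary": 52000},
    {"name": "bob", "dept": "sales", "salary": 48000},
    {"name": "carol", "dept": "sales", "salary": 61000},
]


def sum_salary(rows, predicate):
    """Aggregate query permitted by plain access control: SUM over matching rows."""
    return sum(r["salary"] for r in rows if predicate(r))


def in_sales(r):
    return r["dept"] == "sales"


def in_sales_not_carol(r):
    return r["dept"] == "sales" and r["name"] != "carol"


# Two individually permitted aggregates reveal one person's salary.
inferred_salary = sum_salary(records, in_sales) - sum_salary(records, in_sales_not_carol)


def guarded_sum(rows, predicate, min_size=3):
    """One simple inference control: refuse queries that match too few rows."""
    matched = [r for r in rows if predicate(r)]
    if len(matched) < min_size:
        return None  # query refused
    return sum(r["salary"] for r in matched)


refused = guarded_sum(records, in_sales_not_carol)  # only 2 rows match
```

A size restriction alone blocks this particular pair of queries; it is a sketch of the idea of inference control rather than a complete defense.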

  4. Effects of Single Nucleotide Polymorphism Marker Density on Haplotype Block Partition

    Directory of Open Access Journals (Sweden)

    Sun Ah Kim

    2016-12-01

    Full Text Available Many researchers have found that one of the most important characteristics of the structure of linkage disequilibrium is that the human genome can be divided into non-overlapping block partitions in which only a small number of haplotypes are observed. The location and distribution of haplotype blocks can be seen as a population property influenced by population genetic events such as selection, mutation, recombination and population structure. In this study, we investigate the effects of the density of markers relative to the full set of all polymorphisms in the region on the results of haplotype partitioning for five popular haplotype block partition methods: three methods in Haploview (confidence interval, four gamete test, and solid spine), MIG++ implemented in PLINK 1.9 and S-MIG++. We used several experimental datasets obtained by sampling subsets of single nucleotide polymorphism (SNP) markers of chromosome 22 region in the 1000 Genomes Project data and also the HapMap phase 3 data to compare the results of haplotype block partitions by five methods. With decreasing sampling ratio down to 20% of the original SNP markers, the total number of haplotype blocks decreases and the length of haplotype blocks increases for all algorithms. When we examined the marker-independence of the haplotype block locations constructed from the datasets of different density, the results using below 50% of the entire SNP markers were very different from the results using the entire SNP markers. We conclude that the haplotype block construction results should be used and interpreted carefully depending on the selection of markers and the purpose of the study.
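
Of the methods named above, the four gamete test is the easiest to sketch: for two biallelic sites, observing all four allele combinations is taken as evidence of historical recombination, and a block is closed when that happens. The code below is a simplified greedy illustration on phased 0/1 haplotype strings; it is not Haploview's implementation, it omits the frequency threshold that production tools typically apply, and the function names and data are hypothetical.

```python
def four_gametes_present(haplotypes, i, j):
    """True if all four allele pairs (00, 01, 10, 11) occur at sites i and j."""
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4


def fgt_blocks(haplotypes):
    """Greedily partition sites into blocks with no four-gamete pair inside a block.

    Returns a list of (first_site, last_site) index pairs, inclusive.
    """
    n_sites = len(haplotypes[0])
    blocks = []
    start = 0
    for j in range(1, n_sites):
        # Close the current block if site j shows four gametes with any site in it.
        if any(four_gametes_present(haplotypes, i, j) for i in range(start, j)):
            blocks.append((start, j - 1))
            start = j
    blocks.append((start, n_sites - 1))
    return blocks


# Hypothetical phased haplotypes over four biallelic sites.
haps = ["0000", "0011", "1100", "1111"]
blocks = fgt_blocks(haps)
```

Sites 0 and 1 are perfectly correlated, as are sites 2 and 3, but sites 0 and 2 show all four gametes, so the sketch splits the region into two blocks.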

  5. Analysis of DR4 haplotypes in insulin dependent diabetes (IDD)

    International Nuclear Information System (INIS)

    Monos, D.S.; Radka, S.F.; Zmijewski, C.M.; Kamoun, M.

    1986-01-01

    Population studies indicate that HLA-DR4 is implicated in the susceptibility of IDD. However, biochemical characterization of the serologically defined DR4 haplotype from normal individuals revealed five DR4 and three DQW3 molecular forms. Hence, in this study, they investigated the heterogeneity of the DR4 haplotype, using B-lymphoblastoid cell lines (B-LCL) generated from patients with IDD and seropositive for DR4. Class II molecules, metabolically labeled with 35S-methionine, were immunoprecipitated with monoclonal antibodies specific for DR(L243), DQ(N297), DQW3(IVD12) or DR and DQ(SG465) and analyzed by two-dimensional polyacrylamide gel electrophoresis (2D-PAGE). The isoelectrofocusing (IEF) conditions employed in this study allow representation only of the DR4 haplotype from either DR3/4 or DR4/4 cell lines. The analysis of six different DR4 haplotypes from seven IDD patients revealed the presence of two DR4 β and two DQW3 β chains. Three of the six DR4 β haplotypes analyzed shared the same DR4 β chain and three others shared a different one. Additionally, five of the six haplotypes shared the same DQW3 β chain and only one was carrying a different one. Different combinations of the two DR4 and two DQW3 β chains constitute three distinct patterns of DR4 haplotypes. These results suggest the prevalence of a DQW3 β chain in the small sample of IDD patients studied. Studies of a large number of patients should clarify whether IDD is associated with unique variants of DR4 or DQW3 β chains.

  6. Effects of the number of markers per haplotype and clustering of haplotypes on the accuracy of QTL mapping and prediction of genomic breeding values

    Directory of Open Access Journals (Sweden)

    Schrooten Chris

    2009-01-01

    Full Text Available Abstract The aim of this paper was to compare the effect of haplotype definition on the precision of QTL-mapping and on the accuracy of predicted genomic breeding values. In a multiple QTL model using identity-by-descent (IBD) probabilities between haplotypes, various haplotype definitions were tested i.e. including 2, 6, 12 or 20 marker alleles and clustering base haplotypes related with an IBD probability of > 0.55, 0.75 or 0.95. Simulated data contained 1100 animals with known genotypes and phenotypes and 1000 animals with known genotypes and unknown phenotypes. Genomes comprising 3 Morgan were simulated and contained 74 polymorphic QTL and 383 polymorphic SNP markers with an average r² value of 0.14 between adjacent markers. The total number of haplotypes decreased up to 50% when the window size was increased from two to 20 markers and decreased by at least 50% when haplotypes related with an IBD probability of > 0.55 instead of > 0.95 were clustered. An intermediate window size led to more precise QTL mapping. Window size and clustering had a limited effect on the accuracy of predicted total breeding values, ranging from 0.79 to 0.81. Our conclusion is that different optimal window sizes should be used in QTL-mapping versus genome-wide breeding value prediction.

  7. A Near-Complete Haplotype-Phased Genome of the Dikaryotic Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici Reveals High Interhaplotype Diversity

    Directory of Open Access Journals (Sweden)

    Benjamin Schwessinger

    2018-02-01

    Full Text Available A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and identify virulence genes involved in plant infection referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales. In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies.
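
The N50 statistic quoted for this assembly is the contig length at which the contigs of that length or longer together cover at least half of the assembled bases. A minimal sketch follows; the contig lengths are made-up numbers for illustration and the function name `n50` is an assumption.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical contig lengths (arbitrary units), total 300.
example_n50 = n50([80, 70, 50, 40, 30, 20, 10])
```

Here the two longest contigs (80 + 70) already cover half of the 300 total, so the N50 is 70.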

  8. Entropy, Information Theory, Information Geometry and Bayesian Inference in Data, Signal and Image Processing and Inverse Problems

    Directory of Open Access Journals (Sweden)

    Ali Mohammad-Djafari

    2015-06-01

    Full Text Available The main content of this review article is first to review the main inference tools using Bayes rule, the maximum entropy principle (MEP), information theory, relative entropy and the Kullback–Leibler (KL) divergence, Fisher information and its corresponding geometries. For each of these tools, the precise context of their use is described. The second part of the paper is focused on the ways these tools have been used in data, signal and image processing and in the inverse problems, which arise in different physical sciences and engineering applications. A few examples of the applications are described: entropy in independent components analysis (ICA) and in blind source separation, Fisher information in data model selection, different maximum entropy-based methods in time series spectral estimation and in linear inverse problems and, finally, the Bayesian inference for general inverse problems. Some original materials concerning the approximate Bayesian computation (ABC) and, in particular, the variational Bayesian approximation (VBA) methods are also presented. VBA is used for proposing an alternative Bayesian computational tool to the classical Markov chain Monte Carlo (MCMC) methods. We will also see that VBA encompasses joint maximum a posteriori (MAP), as well as the different expectation-maximization (EM) algorithms as particular cases.
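
As a small illustration of one tool listed in this abstract, the sketch below computes the Kullback–Leibler divergence between two discrete distributions. The function name and the example distributions are hypothetical; the formula is the standard discrete definition, D(p || q) = sum over i of p_i log(p_i / q_i), in nats.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) in nats for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # terms with p_i = 0 contribute nothing
        if q_i == 0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical two-outcome distributions.
p = [0.5, 0.5]
q = [0.9, 0.1]

forward = kl_divergence(p, q)
reverse = kl_divergence(q, p)
```

The two directions differ, which is why the KL divergence is described as a divergence rather than a distance.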

  9. Quantum-Like Representation of Non-Bayesian Inference

    Science.gov (United States)

    Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.

    2013-01-01

    This research is related to the problem of "irrational decision making or inference" that has been discussed in cognitive psychology. There are some experimental studies whose statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to the classical Bayesian inference. For this problem, a number of quantum-like cognitive models of decision making have been proposed. Our previous work represented in a natural way the classical Bayesian inference in the framework of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like the quantum interference. Further, we describe "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.

  10. HERC1 polymorphisms: population-specific variations in haplotype composition.

    Science.gov (United States)

    Yuasa, Isao; Umetsu, Kazuo; Nishimukai, Hiroaki; Fukumori, Yasuo; Harihara, Shinji; Saitou, Naruya; Jin, Feng; Chattopadhyay, Prasanta K; Henke, Lotte; Henke, Jürgen

    2009-08-01

    Human HERC1 is one of six HERC proteins and may play an important role in intracellular membrane trafficking. The human HERC1 gene is suggested to have been affected by local positive selection. To assess the global frequency distributions of coding and non-coding single nucleotide polymorphisms (SNPs) in the HERC1 gene, we developed a new simultaneous genotyping method for four SNPs, and applied this method to investigate 1213 individuals from 12 global populations. The results confirmed marked differences in the allele and haplotype frequencies between East Asian and non-East Asian populations. One of the three common haplotypes observed was found to be characteristic of East Asians, who showed a relatively uniform distribution of haplotypes. Information on haplotypes would be useful for testing the function of polymorphisms in the HERC1 gene. This is the first study to investigate the distribution of HERC1 polymorphisms in various populations. (c) 2009 John Wiley & Sons, Ltd.

  11. De novo assembly of a haplotype-resolved human genome.

    Science.gov (United States)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang; Huang, Shujia; Sun, Yuhui; Tong, Xin; Xie, Yinlong; Liu, Binghang; Yang, Hailong; Zheng, Hancheng; Li, Jian; Li, Bo; Wang, Yu; Yang, Fang; Sun, Peng; Liu, Siyang; Gao, Peng; Huang, Haodong; Sun, Jing; Chen, Dan; He, Guangzhu; Huang, Weihua; Huang, Zheng; Li, Yue; Tellier, Laurent C A M; Liu, Xiao; Feng, Qiang; Xu, Xun; Zhang, Xiuqing; Bolund, Lars; Krogh, Anders; Kristiansen, Karsten; Drmanac, Radoje; Drmanac, Snezana; Nielsen, Rasmus; Li, Songgang; Wang, Jian; Yang, Huanming; Li, Yingrui; Wong, Gane Ka-Shu; Wang, Jun

    2015-06-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.

  12. An accurate clone-based haplotyping method by overlapping pool sequencing.

    Science.gov (United States)

    Li, Cheng; Cao, Changchang; Tu, Jing; Sun, Xiao

    2016-07-08

    Chromosome-long haplotyping of human genomes is important to identify genetic variants with differing gene expression, in human evolution studies, clinical diagnosis, and other biological and medical fields. Although several methods have realized haplotyping based on sequencing technologies or population statistics, accuracy and cost are factors that prohibit their wide use. Borrowing ideas from group testing theories, we proposed a clone-based haplotyping method by overlapping pool sequencing. The clones from a single individual were pooled combinatorially and then sequenced. According to the distinct pooling pattern for each clone in the overlapping pool sequencing, alleles for the recovered variants could be assigned to their original clones precisely. Subsequently, the clone sequences could be reconstructed by linking these alleles accordingly and assembling them into haplotypes with high accuracy. To verify the utility of our method, we constructed 130 110 clones in silico for the individual NA12878 and simulated the pooling and sequencing process. Ultimately, 99.9% of variants on chromosome 1 that were covered by clones from both parental chromosomes were recovered correctly, and 112 haplotype contigs were assembled with an N50 length of 3.4 Mb and no switch errors. A comparison with current clone-based haplotyping methods indicated our method was more accurate. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Genetic relationships among native americans based on beta-globin gene cluster haplotype frequencies

    Directory of Open Access Journals (Sweden)

    Rita de Cassia Mousinho-Ribeiro

    2003-01-01

    Full Text Available The distribution of β-globin gene haplotypes was studied in 209 Amerindians from eight tribes of the Brazilian Amazon: Asurini from Xingú, Awá-Guajá, Parakanã, Urubú-Kaapór, Zoé, Kayapó (Xikrin from the Bacajá village), Katuena, and Tiriyó. Nine different haplotypes were found, two of which (n. 11 and 13) had not been previously identified in Brazilian indigenous populations. Haplotype 2 (+ - - - -) was the most common in all groups studied, with frequencies varying from 70% to 100%, followed by haplotype 6 (- + + - +), with frequencies between 7% and 18%. The frequency distribution of the β-globin gene haplotypes in the eighteen Brazilian Amerindian populations studied to date is characterized by a reduced number of haplotypes (average of 3.5) and low levels of heterozygosity and intrapopulational differentiation, with a single clearly predominant haplotype in most tribes (haplotype 2). The Parakanã, Urubú-Kaapór, Tiriyó and Xavante tribes constitute exceptions, presenting at least four haplotypes with relatively high frequencies. The closest genetic relationships were observed between the Brazilian and the Colombian Amerindians (Wayuu, Kamsa and Inga), and, to a lesser extent, with the Huichol of Mexico. North-American Amerindians are more differentiated and clearly separated from all other tribes, except the Xavante, from Brazil, and the Mapuche, from Argentina. A restricted pool of ancestral haplotypes may explain the low diversity observed among most present-day Brazilian and Colombian Amerindian groups, while interethnic admixture could be the most important factor to explain the high number of haplotypes and high levels of diversity observed in some South-American and most North-American tribes.

  14. Adaptive Inference on General Graphical Models

    OpenAIRE

    Acar, Umut A.; Ihler, Alexander T.; Mettu, Ramgopal; Sumer, Ozgur

    2012-01-01

    Many algorithms and applications involve repeatedly solving variations of the same inference problem; for example we may want to introduce new evidence to the model or perform updates to conditional dependencies. The goal of adaptive inference is to take advantage of what is preserved in the model and perform inference more rapidly than from scratch. In this paper, we describe techniques for adaptive inference on general graphs that support marginal computation and updates to the conditional ...

  15. Updated listing of haplotypes at the human phenylalanine hydroxylase (PAH) locus

    Energy Technology Data Exchange (ETDEWEB)

    Eisensmith, R.C.; Woo, S.L.C. (Baylor College of Medicine, Houston, TX (United States))

    1992-12-01

    Analysis of mutant PAH chromosomes has identified approximately 60 different single-base substitutions and deletions within the PAH locus. Nearly all of these molecular lesions are in strong linkage disequilibrium with specific RFLP haplotypes in different ethnic populations. Thus, haplotype analysis is not only useful for diagnostic purposes but is proving to be a valuable tool in population genetic studies of the origin and spread of phenylketonuria alleles in human populations. PCR-based methods have been developed to detect six of the eight polymorphic restriction sites used for determination of RFLP haplotypes at the PAH locus. A table of the proposed expanded haplotypes is given.

  16. Direct maximum parsimony phylogeny reconstruction from genotype data.

    Science.gov (United States)

    Sridhar, Srinath; Lam, Fumei; Blelloch, Guy E; Ravi, R; Schwartz, Russell

    2007-12-05

    Maximum parsimony phylogenetic tree reconstruction from genetic variation data is a fundamental problem in computational genetics with many practical applications in population genetics, whole genome analysis, and the search for genetic predictors of disease. Efficient methods are available for reconstruction of maximum parsimony trees from haplotype data, but such data are difficult to determine directly for autosomal DNA. Data are more commonly available in the form of genotypes, which consist of conflated combinations of pairs of haplotypes from homologous chromosomes. Currently, there are no general algorithms for the direct reconstruction of maximum parsimony phylogenies from genotype data. Hence, phylogenetic applications for autosomal data must rely on other methods for first computationally inferring haplotypes from genotypes. In this work, we develop the first practical method for computing maximum parsimony phylogenies directly from genotype data. We show that the standard practice of first inferring haplotypes from genotypes and then reconstructing a phylogeny on the haplotypes often substantially overestimates phylogeny size. As an immediate application, our method can be used to determine the minimum number of mutations required to explain a given set of observed genotypes. Phylogeny reconstruction directly from unphased data is computationally feasible for moderate-sized problem instances and can lead to substantially more accurate tree size inferences than the standard practice of treating phasing and phylogeny construction as two separate analysis stages. The difference between the approaches is particularly important for downstream applications that require a lower bound on the number of mutations that the genetic region has undergone.
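
    The statement that a genotype conflates a pair of haplotypes can be illustrated with a minimal sketch. It is not the authors' method; it only enumerates the haplotype pairs consistent with one genotype. The encoding (0 and 1 for the two homozygous states, 2 for a heterozygous site) and the function name are assumptions made for illustration.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs that conflate to `genotype`.

    Assumed encoding (illustrative, not from the record above):
    0 = homozygous 0/0, 1 = homozygous 1/1, 2 = heterozygous 0/1.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 2]
    pairs = set()
    # Each heterozygous site can place allele 0 on either haplotype.
    for choice in product((0, 1), repeat=len(het_sites)):
        h1 = list(genotype)
        h2 = list(genotype)
        for site, allele in zip(het_sites, choice):
            h1[site] = allele
            h2[site] = 1 - allele
        # frozenset makes the pair unordered, so mirror images collapse.
        pairs.add(frozenset((tuple(h1), tuple(h2))))
    return pairs


pairs = compatible_haplotype_pairs([2, 0, 2, 1])
for pair in pairs:
    print(sorted(pair))
```

    A genotype with k heterozygous sites (k ≥ 1) has 2^(k-1) such pairs, which is why phasing is ambiguous as soon as two sites are heterozygous.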

  17. Direct maximum parsimony phylogeny reconstruction from genotype data

    Directory of Open Access Journals (Sweden)

    Ravi R

    2007-12-01

    Full Text Available Abstract Background Maximum parsimony phylogenetic tree reconstruction from genetic variation data is a fundamental problem in computational genetics with many practical applications in population genetics, whole genome analysis, and the search for genetic predictors of disease. Efficient methods are available for reconstruction of maximum parsimony trees from haplotype data, but such data are difficult to determine directly for autosomal DNA. Data are more commonly available in the form of genotypes, which consist of conflated combinations of pairs of haplotypes from homologous chromosomes. Currently, there are no general algorithms for the direct reconstruction of maximum parsimony phylogenies from genotype data. Hence, phylogenetic applications for autosomal data must rely on other methods for first computationally inferring haplotypes from genotypes. Results In this work, we develop the first practical method for computing maximum parsimony phylogenies directly from genotype data. We show that the standard practice of first inferring haplotypes from genotypes and then reconstructing a phylogeny on the haplotypes often substantially overestimates phylogeny size. As an immediate application, our method can be used to determine the minimum number of mutations required to explain a given set of observed genotypes. Conclusion Phylogeny reconstruction directly from unphased data is computationally feasible for moderate-sized problem instances and can lead to substantially more accurate tree size inferences than the standard practice of treating phasing and phylogeny construction as two separate analysis stages. The difference between the approaches is particularly important for downstream applications that require a lower bound on the number of mutations that the genetic region has undergone.

  18. Fitchi: haplotype genealogy graphs based on the Fitch algorithm.

    Science.gov (United States)

    Matschiner, Michael

    2016-04-15

    In population genetics and phylogeography, haplotype genealogy graphs are important tools for the visualization of population structure based on sequence data. In this type of graph, node sizes are often drawn in proportion to haplotype frequencies and edge lengths represent the minimum number of mutations separating adjacent nodes. I here present Fitchi, a new program that produces publication-ready haplotype genealogy graphs based on the Fitch algorithm. Availability: http://www.evoinformatics.eu/fitchi.htm Contact: michaelmatschiner@mac.com Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
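
    For readers unfamiliar with the Fitch algorithm named in this record, the following is a minimal sketch of its bottom-up pass for a single site on a rooted binary tree. It is not Fitchi's implementation, and it does not build a genealogy graph; the tree representation (nested 2-tuples with leaf names as strings) and the function name are assumptions made for illustration.

```python
def fitch_score(tree, leaf_states):
    """Return (candidate_states, mutation_count) for one site.

    `tree` is either a leaf name (str) or a 2-tuple of subtrees;
    `leaf_states` maps each leaf name to its observed state.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], leaf_states)
    right_set, right_cost = fitch_score(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra mutation needed.
        return common, left_cost + right_cost
    # Children disagree: keep both options and count one mutation.
    return left_set | right_set, left_cost + right_cost + 1


tree = (("a", "b"), ("c", "d"))
states = {"a": "A", "b": "C", "c": "A", "d": "A"}
root_states, mutations = fitch_score(tree, states)
print(root_states, mutations)
```

    The returned count is the minimum number of state changes needed on that tree for the site, the same quantity that edge lengths in a haplotype genealogy graph are said to represent.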

  19. Polymorphism at Expressed DQ and DR Loci in Five Common Equine MHC Haplotypes

    Science.gov (United States)

    Miller, Donald; Tallmadge, Rebecca L.; Binns, Matthew; Zhu, Baoli; Mohamoud, Yasmin Ali; Ahmed, Ayeda; Brooks, Samantha A.; Antczak, Douglas F.

    2016-01-01

    The polymorphism of Major Histocompatibility Complex (MHC) class II DQ and DR genes in five common Equine Leukocyte Antigen (ELA) haplotypes was determined through sequencing of mRNA transcripts isolated from lymphocytes of eight ELA homozygous horses. Ten expressed MHC class II genes were detected in horses of the ELA-A3 haplotype carried by the donor horses of the equine Bacterial Artificial Chromosome (BAC) library and the reference genome sequence: four DR genes and six DQ genes. The other four ELA haplotypes contained at least eight expressed polymorphic MHC class II loci. Next Generation Sequencing (NGS) of genomic DNA of these four MHC haplotypes revealed stop codons in the DQA3 gene in the ELA-A2, ELA-A5, and ELA-A9 haplotypes. Few NGS reads were obtained for the other MHC class II genes that were not amplified in these horses. The amino acid sequences across haplotypes contained locus-specific residues, and the locus clusters produced by phylogenetic analysis were well supported. The MHC class II alleles within the five tested haplotypes were largely non-overlapping between haplotypes. The complement of equine MHC class II DQ and DR genes appears to be well conserved between haplotypes, in contrast to the recently described variation in class I gene loci between equine MHC haplotypes. The identification of allelic series of equine MHC class II loci will aid comparative studies of mammalian MHC conservation and evolution and may also help to interpret associations between the equine MHC class II region and diseases of the horse. PMID:27889800

  20. Dimensional Anxiety Mediates Linkage of GABRA2 Haplotypes With Alcoholism

    Science.gov (United States)

    Enoch, Mary-Anne; Schwartz, Lori; Albaugh, Bernard; Virkkunen, Matti; Goldman, David

    2015-01-01

    The GABAAα2 receptor gene (GABRA2) modulates anxiety and stress response. Three recent association studies implicate GABRA2 in alcoholism; however, in these papers both common, opposite-configuration haplotypes in the region distal to intron 3 predict risk. We have now replicated the GABRA2 association with alcoholism in 331 Plains Indian men and women and 461 Finnish Caucasian men. Using a dimensional measure of anxiety, harm avoidance (HA), we also found that the association with alcoholism is mediated, or moderated, by anxiety. Nine SNPs were genotyped, revealing two haplotype blocks. Within the previously implicated block 2 region, we identified the two common, opposite-configuration risk haplotypes, A and B. Their frequencies differed markedly in Finns and Plains Indians. In both populations, most block 2 SNPs were significantly associated with alcoholism. The associations were due to increased frequencies of both homozygotes in alcoholics, indicating the possibility of alcoholic subtypes with opposite genotypes. Congruently, there was no significant haplotype association. Using HA as an indicator variable for anxiety, we found haplotype linkage to alcoholism with high and low dimensional anxiety, and to HA itself, in both populations. High HA alcoholics had the highest frequency of the more abundant haplotype (A in Finns, B in Plains Indians); low HA alcoholics had the highest frequency of the less abundant haplotype (B in Finns, A in Plains Indians) (Finns: P = 0.007, OR = 2.1; Plains Indians: P = 0.040, OR = 1.9). Non-alcoholics had intermediate frequencies. Our results suggest that within the distal GABRA2 region is a functional locus or loci that may differ between populations but that alters risk for alcoholism via the mediating action of anxiety. PMID:16874763

  1. Historical biogeography of the land snail Cornu aspersum: a new scenario inferred from haplotype distribution in the Western Mediterranean basin

    Directory of Open Access Journals (Sweden)

    Madec Luc

    2010-01-01

    Full Text Available Abstract Background Despite its key location between the rest of the continent and Europe, research on the phylogeography of north African species remains very limited compared to European and North American taxa. The Mediterranean land mollusc Cornu aspersum (= Helix aspersa) is one of the few species widely sampled in north Africa for biogeographical analysis. It thus provides an excellent biological model to understand phylogeographical patterns across the Mediterranean basin, and to evaluate hypotheses of population differentiation. We investigated here the phylogeography of this land snail to reassess the evolutionary scenario we previously considered for explaining its scattered distribution in the western Mediterranean, and to help to resolve the question of the direction of its range expansion (from north Africa to Europe or vice versa). By analysing simultaneously individuals from 73 sites sampled in its putative native range, the present work provides the first broad-scale screening of mitochondrial variation (cyt b and 16S rRNA genes) of C. aspersum. Results Phylogeographical structure mirrored previous patterns inferred from anatomy and nuclear data, since all haplotypes could be ascribed to a B (West) or a C (East) lineage. Alternative migration models tested confirmed that C. aspersum most likely spread from north Africa to Europe. In addition to Kabylia in Algeria, which would have been successively a centre of dispersal and a zone of secondary contacts, we identified an area in Galicia where genetically distinct west and east type populations would have regained contact. Conclusions Vicariant and dispersal processes are reviewed and discussed in the light of signatures left in the geographical distribution of the genetic variation. In referring to Mediterranean taxa which show similar phylogeographical patterns, we proposed a parsimonious scenario to account for the "east-west" genetic splitting and the northward expansion of the

  2. HLA-G Haplotypes Are Differentially Associated with Asthmatic Features

    Directory of Open Access Journals (Sweden)

    Camille Ribeyre

    2018-02-01

    Full Text Available Human leukocyte antigen (HLA)-G, an HLA class Ib molecule, interacts with receptors on lymphocytes such as T cells, B cells, and natural killer cells to influence immune responses. Unlike classical HLA molecules, HLA-G expression is not found on all somatic cells, but restricted to tissue sites, including human bronchial epithelium cells (HBEC). Individual variation in HLA-G expression is linked to its genetic polymorphism and has been associated with many pathological situations such as asthma, which is characterized by epithelium abnormalities and inflammatory cell activation. Studies reported both higher and equivalent soluble HLA-G (sHLA-G) expression in different cohorts of asthmatic patients. In particular, we recently described impaired local expression of HLA-G and abnormal profiles for alternatively spliced isoforms in HBEC from asthmatic patients. sHLA-G dosage is challenging because of its many levels of polymorphism (dimerization, association with β2-microglobulin, and alternative splicing); thus many clinical studies focused on HLA-G single-nucleotide polymorphisms as predictive biomarkers, but few analyzed HLA-G haplotypes. Here, we aimed to characterize HLA-G haplotypes and describe their association with asthmatic clinical features and sHLA-G peripheral expression and to describe variations in transcription factor (TF) binding sites and alternative splicing sites. HLA-G haplotypes were differentially distributed in 330 healthy and 580 asthmatic individuals. Furthermore, HLA-G haplotypes were associated with asthmatic clinical features. However, we did not confirm an association between sHLA-G and genetic, biological, or clinical parameters. HLA-G haplotypes were phylogenetically split into distinct groups, with each group displaying particular variations in TF binding or RNA splicing sites that could reflect differential HLA-G qualitative or quantitative expression, with tissue-dependent specificities. Our results, based on a

  3. Active inference, communication and hermeneutics.

    Science.gov (United States)

    Friston, Karl J; Frith, Christopher D

    2015-07-01

    Hermeneutics refers to interpretation and translation of text (typically ancient scriptures) but also applies to verbal and non-verbal communication. In a psychological setting it nicely frames the problem of inferring the intended content of a communication. In this paper, we offer a solution to the problem of neural hermeneutics based upon active inference. In active inference, action fulfils predictions about how we will behave (e.g., predicting we will speak). Crucially, these predictions can be used to predict both self and others--during speaking and listening respectively. Active inference mandates the suppression of prediction errors by updating an internal model that generates predictions--both at fast timescales (through perceptual inference) and slower timescales (through perceptual learning). If two agents adopt the same model, then--in principle--they can predict each other and minimise their mutual prediction errors. Heuristically, this ensures they are singing from the same hymn sheet. This paper builds upon recent work on active inference and communication to illustrate perceptual learning using simulated birdsongs. Our focus here is the neural hermeneutics implicit in learning, where communication facilitates long-term changes in generative models that are trying to predict each other. In other words, communication induces perceptual learning and enables others to (literally) change our minds and vice versa. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  4. Chronic inflammatory state in sickle cell anemia patients is associated with HBB(*)S haplotype.

    Science.gov (United States)

    Bandeira, Izabel C J; Rocha, Lillianne B S; Barbosa, Maritza C; Elias, Darcielle B D; Querioz, José A N; Freitas, Max Vitor Carioca; Gonçalves, Romélia P

    2014-02-01

    The chronic inflammatory state in sickle cell anemia (SCA) is associated with several factors such as the following: endothelial damage; increased production of reactive oxygen species; hemolysis; increased expression of adhesion molecules by leukocytes, erythrocytes, and platelets; and increased production of proinflammatory cytokines. Genetic characteristics affecting the clinical severity of SCA include variations in the hemoglobin F (HbF) level, coexistence of alpha-thalassemia, and the haplotype associated with the HbS gene. The different haplotypes of SCA are Bantu, Benin, Senegal, Cameroon, and Arab-Indian. These haplotypes are associated with ethnic groups and also based on the geographical origin. Studies have shown that the Bantu haplotype is associated with higher incidence of clinical complications than the other haplotypes and is therefore considered to have the worst prognosis. This study aimed to evaluate the profile of the proinflammatory cytokines interleukin-6, tumor necrosis factor-α, and interleukin-17 in patients with SCA and also to assess the haplotypes associated with beta globin cluster S (HBB(*)S). We analyzed a total of 62 patients who had SCA and had been treated with hydroxyurea; they had received a dose ranging between 15 and 25 (20.0 ± 0.6) mg/kg/day for 6-60 (18 ± 3.4) months; their data were compared with those for 30 normal individuals. The presence of HbS was detected and the haplotypes of the beta S gene cluster were analyzed by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). Our study demonstrated that SCA patients have an increased inflammatory profile when compared to the healthy individuals. Further, analysis of the association between the haplotypes and inflammatory profile showed that the levels of IL-6 and TNF-α were greater in subjects with the Bantu/Bantu haplotype than in subjects with the Benin/Benin haplotype. The Bantu/Benin haplotype individuals had lower levels of cytokines than those with

  5. Variational inference & deep learning: A new synthesis

    OpenAIRE

    Kingma, D.P.

    2017-01-01

    In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.

  6. Variational inference & deep learning : A new synthesis

    NARCIS (Netherlands)

    Kingma, D.P.

    2017-01-01

    In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.

  7. Discovery, evaluation and distribution of haplotypes of the wheat Ppd-D1 gene.

    Science.gov (United States)

    Guo, Zhiai; Song, Yanxia; Zhou, Ronghua; Ren, Zhenglong; Jia, Jizeng

    2010-02-01

    Ppd-D1 is one of the most potent genes affecting the photoperiod response of wheat (Triticum aestivum). Only two alleles, insensitive Ppd-D1a and sensitive Ppd-D1b, were known previously, and these did not adequately explain the broad adaptation of wheat to photoperiod variation. In this study, five diagnostic molecular markers were employed to identify Ppd-D1 haplotypes in 492 wheat varieties from diverse geographic locations and 55 accessions of Aegilops tauschii, the D genome donor species of wheat. Six Ppd-D1 haplotypes, designated I-VI, were identified. Types II, V and VI were considered to be more ancient and types I, III and IV were considered to be derived from type II. The transcript abundances of the Ppd-D1 haplotypes showed continuous variation, being highest for haplotype I, lowest for haplotype III, and correlating negatively with varietal differences in heading time. These haplotypes also significantly affected other agronomic traits. The distribution frequency of Ppd-D1 haplotypes showed partial correlations with both latitudes and altitudes of wheat cultivation regions. The evidence on the evolution, expression and distribution of Ppd-D1 haplotypes was mutually consistent. What was regarded as a pair of alleles in the past can now be considered a series of alleles leading to continuous variation.

  8. Congruence as a measurement of extended haplotype structure across the genome

    Science.gov (United States)

    2012-01-01

    Background Historically, extended haplotypes have been defined using only a few data points, such as alleles for several HLA genes in the MHC. High-density SNP data, and the increasing affordability of whole genome SNP typing, creates the opportunity to define higher resolution extended haplotypes. This drives the need for new tools that support quantification and visualization of extended haplotypes as defined by as many as 2000 SNPs. Confronted with high-density SNP data across the major histocompatibility complex (MHC) for 2,300 complete families, compiled by the Type 1 Diabetes Genetics Consortium (T1DGC), we developed software for studying extended haplotypes. Methods The software, called ExHap (Extended Haplotype), uses a similarity measurement we term congruence to identify and quantify long-range allele identity. Using ExHap, we analyzed congruence in both the T1DGC data and family-phased data from the International HapMap Project. Results Congruent chromosomes from the T1DGC data have between 96.5% and 99.9% allele identity over 1,818 SNPs spanning 2.64 megabases of the MHC (HLA-DRB1 to HLA-A). Thirty-three of 132 DQ-DR-B-A defined haplotype groups have > 50% congruent chromosomes in this region. For example, 92% of chromosomes within the DR3-B8-A1 haplotype are congruent from HLA-DRB1 to HLA-A (99.8% allele identity). We also applied ExHap to all 22 autosomes for both CEU and YRI cohorts from the International HapMap Project, identifying multiple candidate extended haplotypes. Conclusions Long-range congruence is not unique to the MHC region. Patterns of allele identity on phased chromosomes provide a simple, straightforward approach to visually and quantitatively inspect complex long-range structural patterns in the genome. Such patterns aid the biologist in appreciating genetic similarities and differences across cohorts, and can lead to hypothesis generation for subsequent studies. PMID:22369243
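
    The allele-identity percentages quoted in this record can be illustrated with a minimal sketch that compares two phased chromosomes site by site. It is not the ExHap software and does not reproduce its congruence measure, whose exact definition is not given in the abstract; the function name and the toy sequences are assumptions made for illustration.

```python
def allele_identity(chrom_a, chrom_b):
    """Fraction of SNP positions at which two phased chromosomes carry the same allele."""
    if len(chrom_a) != len(chrom_b):
        raise ValueError("chromosomes must cover the same SNP positions")
    if len(chrom_a) == 0:
        raise ValueError("at least one SNP position is required")
    matches = sum(a == b for a, b in zip(chrom_a, chrom_b))
    return matches / len(chrom_a)


# Toy phased alleles at five SNPs (hypothetical data).
identity = allele_identity("ACGTA", "ACGTT")
print(f"{identity:.1%}")
```

    Applied to real data, each string would hold the phased alleles of one chromosome over the SNP panel of interest, for example the 1,818 MHC SNPs mentioned above.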

  9. MGMT DNA repair gene promoter/enhancer haplotypes alter transcription factor binding and gene expression.

    Science.gov (United States)

    Xu, Meixiang; Cross, Courtney E; Speidel, Jordan T; Abdel-Rahman, Sherif Z

    2016-10-01

    The O6-methylguanine-DNA methyltransferase (MGMT) protein removes O6-alkyl-guanine adducts from DNA. MGMT expression can thus alter the sensitivity of cells and tissues to environmental and chemotherapeutic alkylating agents. Previously, we defined the haplotype structure encompassing single nucleotide polymorphisms (SNPs) in the MGMT promoter/enhancer (P/E) region and found that haplotypes, rather than individual SNPs, alter MGMT promoter activity. The exact mechanism(s) by which these haplotypes exert their effect on MGMT promoter activity is currently unknown, but we noted that many of the SNPs comprising the MGMT P/E haplotypes are located within or in close proximity to putative transcription factor binding sites. Thus, these haplotypes could potentially affect transcription factor binding and, subsequently, alter MGMT promoter activity. In this study, we test the hypothesis that MGMT P/E haplotypes affect MGMT promoter activity by altering transcription factor (TF) binding to the P/E region. We used a promoter binding TF profiling array and a reporter assay to evaluate the effect of different P/E haplotypes on TF binding and MGMT expression, respectively. Our data revealed a significant difference in TF binding profiles between the different haplotypes evaluated. We identified TFs that consistently showed significant haplotype-dependent binding alterations (p ≤ 0.01) and revealed their role in regulating MGMT expression using siRNAs and a dual-luciferase reporter assay system. The data generated support our hypothesis that promoter haplotypes alter the binding of TFs to the MGMT P/E and, subsequently, affect their regulatory function on MGMT promoter activity and expression level.

  10. Y-STR haplotypes of Native American populations from the Brazilian Amazon region.

    Science.gov (United States)

    Palha, Teresinha Jesus Brabo Ferreira; Rodrigues, Elzemar Martins Ribeiro; dos Santos, Sidney Emanuel Batista

    2010-10-01

    The allele and haplotype frequencies of nine Y-STRs (DYS19, DYS389 I, DYS389 II, DYS390, DYS391, DYS392, DYS393, DYS385 I/II) were determined in a sample of six native tribes from the Brazilian Amazon (Tiriyó, Awa-Guajá, Waiãpi, Urubu-Kaapor, Zoé and Parakanã). Forty-eight different haplotypes were identified, 28 of which were unique. Five haplotypes were very frequent and were shared by over 10 individuals. The estimated haplotype diversity (0.9114) was very low compared to other geographic groups, including Africans, Europeans and Asians. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
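
    A haplotype diversity figure such as the one reported here is commonly computed with Nei's unbiased estimator, H = n/(n - 1) · (1 - Σ p_i²), where n is the sample size and p_i the frequency of haplotype i. The abstract does not say which estimator the authors used, so the sketch below is an assumption, and the sample labels are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of observed haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies (expected homozygosity).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical sample: four chromosomes carrying three distinct haplotypes.
sample = ["h1", "h1", "h2", "h3"]
print(round(haplotype_diversity(sample), 4))
```

    The value is 0 when every sampled chromosome carries the same haplotype and 1 when all sampled haplotypes are distinct.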

  11. Organelle DNA haplotypes reflect crop-use characteristics and geographic origins of Cannabis sativa.

    Science.gov (United States)

    Gilmore, Simon; Peakall, Rod; Robertson, James

    2007-10-25

    Comparative sequencing of cannabis individuals across 12 chloroplast and mitochondrial DNA loci revealed 7 polymorphic sites, including 5 length variable regions and 2 single nucleotide polymorphisms. Simple PCR assays were developed to assay these polymorphisms, and organelle DNA haplotypes were obtained for 188 cannabis individuals from 76 separate populations, including drug-type, fibre-type and wild populations. The haplotype data were analysed using parsimony, UPGMA and neighbour joining methods. Three haplotype groups were recovered by each analysis method, and these groups are suggestive of the crop-use characteristics and geographical origin of the populations, although not strictly diagnostic. We discuss the relationship between our haplotype data and taxonomic opinions of cannabis, and the implications of organelle DNA haplotyping to forensic investigations of cannabis.

  12. Inference in "poor" languages

    Energy Technology Data Exchange (ETDEWEB)

    Petrov, S. [Oak Ridge National Lab., TN (United States)]

    1996-12-31

    Languages with a solvable implication problem but without complete and consistent systems of inference rules ("poor" languages) are considered. The problem of existence of a finite, complete, and consistent inference rule system for a "poor" language is stated independently of the language or the rule syntax. Several properties of the problem are proved. An application of the results to the language of join dependencies is given.

  13. Evolutionary history of the European whitefish Coregonus lavaretus (L.) species complex as inferred from mtDNA phylogeography and gill-raker numbers.

    Science.gov (United States)

    Østbye, K; Bernatchez, L; Naesje, T F; Himberg, K-J M; Hindar, K

    2005-12-01

    We compared mitochondrial DNA and gill-raker number variation in populations of the European whitefish Coregonus lavaretus (L.) species complex to illuminate their evolutionary history, and discuss mechanisms behind diversification. Using single-strand conformation polymorphism (SSCP) and sequencing 528 bp of combined parts of the cytochrome oxidase b (cyt b) and NADH dehydrogenase subunit 3 (ND3) mitochondrial DNA (mtDNA) regions, we documented phylogeographic relationships among populations and phylogeny of mtDNA haplotypes. Demographic events behind geographical distribution of haplotypes were inferred using nested clade analysis (NCA) and mismatch distribution. Concordance between operational taxonomical groups, based on gill-raker numbers, and mtDNA patterns was tested. Three major mtDNA clades were resolved in Europe: a North European clade from northwest Russia to Denmark, a Siberian clade from the Arctic Sea to southwest Norway, and a South European clade from Denmark to the European Alps, reflecting occupation in different glacial refugia. Demographic events inferred from NCA were isolation by distance, range expansion, and fragmentation. Mismatch analysis suggested that clades which colonized Fennoscandia and the Alps expanded in population size 24 500-5800 years before present, with minute female effective population sizes, implying small founder populations during colonization. Gill-raker counts were not commensurate with hierarchical mtDNA clades, and poorly with haplotypes, suggesting recent origin of gill-raker variation. Whitefish designations based on gill-raker numbers were not associated with ancient clades. Lack of congruence in morphology and evolutionary lineages implies that the taxonomy of this species complex should be reconsidered.

  14. A Bayesian Network Schema for Lessening Database Inference

    National Research Council Canada - National Science Library

    Chang, LiWu; Moskowitz, Ira S

    2001-01-01

    .... The authors introduce a formal schema for database inference analysis, based upon a Bayesian network structure, which identifies critical parameters involved in the inference problem and represents...

  15. Identification of the Mislabeled Breast Cancer Samples by Mitochondrial DNA Haplotyping

    Directory of Open Access Journals (Sweden)

    Xiaogang Chen

    2015-01-01

    The task to identify whether an archival malignant tumor specimen had been mislabeled or interchanged is a challenging one for forensic genetics. The nuclear DNA (nDNA) markers were affected by the aberration of tumor cells, so they were not suitable for personal identification when the tumor tissues were tested. In this study, we focused on a new solution - mitochondrial single nucleotide polymorphism (mtSNP) haplotyping by a multiplex SNaPshot assay. To validate our strategy of haplotyping with 25 mtSNPs, we analyzed 15 pairs of cancerous/healthy tissues taken from patients with ductal breast carcinoma. The haplotypes of all the fifteen breast cancer tissues were matched with their paired breast tissues. The heteroplasmy at 2 sites, 14783A/G and 16519C/T, was observed in one breast tissue, which indicated a mixture of related mitochondrial haplotypes. However, only one haplotype was retained in the paired breast cancer tissue, which could be considered the result of proliferation of tumor subclone. The allele drop-out and allele drop-in were observed when 39 STRs and 20 tri-allelic SNPs of nDNA were applied. Compared to nDNA markers applied, 25 mtSNPs were more stable without interference from aberrance of breast cancer. Also, two cases were presented where the investigation of haplotype with 25 mtSNPs was used to prove the origin of biopsy specimen with breast cancer. The mislabeling of biopsy specimen with breast cancer could be certified in one case but could not be supported in the other case. We highlight the importance of stability of mtSNP haplotype in breast cancer. It was implied that our multiplex SNaPshot assay with 25 mtSNPs was a useful strategy to identify mislabeled breast cancer specimen.

  16. Patterns of linkage disequilibrium and haplotype distribution in disease candidate genes.

    Science.gov (United States)

    Long, Ji-Rong; Zhao, Lan-Juan; Liu, Peng-Yuan; Lu, Yan; Dvornyk, Volodymyr; Shen, Hui; Liu, Yong-Jun; Zhang, Yuan-Yuan; Xiong, Dong-Hai; Xiao, Peng; Deng, Hong-Wen

    2004-05-24

    The adequacy of association studies for complex diseases depends critically on the existence of linkage disequilibrium (LD) between functional alleles and surrounding SNP markers. We examined the patterns of LD and haplotype distribution in eight candidate genes for osteoporosis and/or obesity using 31 SNPs in 1,873 subjects. These eight genes are apolipoprotein E (APOE), type I collagen alpha1 (COL1A1), estrogen receptor-alpha (ER-alpha), leptin receptor (LEPR), parathyroid hormone (PTH)/PTH-related peptide receptor type 1 (PTHR1), transforming growth factor-beta1 (TGF-beta1), uncoupling protein 3 (UCP3), and vitamin D (1,25-dihydroxyvitamin D3) receptor (VDR). Yin yang haplotypes, two high-frequency haplotypes composed of completely mismatching SNP alleles, were examined. To quantify LD patterns, two common measures of LD, D' and r², were calculated for the SNPs within the genes. The haplotype distribution varied in the different genes. Yin yang haplotypes were observed only in PTHR1 and UCP3. D' ranged from 0.020 to 1.000 with an average of 0.475, whereas the average r² was 0.158 (ranging from 0.000 to 0.883). A decay of LD was observed as the intermarker distance increased; however, there was a great difference in the LD characteristics of different genes or even of different regions within a gene. The differences in haplotype distributions and LD patterns among the genes underscore the importance of characterizing genomic regions of interest prior to association studies.
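
    The two LD measures named above have standard textbook definitions for a pair of biallelic SNPs. The sketch below assumes the inputs are the frequency of the A-B haplotype and the two allele frequencies; the function and argument names are hypothetical, and the abstract does not describe how the authors actually computed the values.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    # D is normalised by its largest possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Hypothetical frequencies: allele A only ever occurs together with allele B.
toy_d_prime, toy_r2 = ld_measures(p_ab=0.3, p_a=0.3, p_b=0.5)
```

    In this toy case D' is 1 while r² is well below 1, because r² also depends on how closely the allele frequencies match. This is one reason an average D' can exceed an average r² for the same SNP pairs.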

  17. Haplotype analysis and linkage disequilibrium for DGAT1

    OpenAIRE

    Strucken, Eva M.; Rahmatalla, Siham; De Koning, Dirk-Jan; Brockmann, Gudrun A.

    2010-01-01

    This study focused on haplotype effects and linkage disequilibrium (LD) for the K232A locus and the promoter VNTR in the DGAT1 gene. Analyses were carried out in three German Holstein Frisian populations (including 492, 305, and 518 animals) for milk yield, milk fat and protein yield, and milk fat and protein content. We found that effects of the promoter VNTR were not significant and explain only a small amount of the variation of the QTL on BTA14. Haplotype effects were less significant tha...

  18. Cluster analysis of European Y-chromosomal STR haplotypes using the discrete Laplace method

    DEFF Research Database (Denmark)

    Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels

    2014-01-01

    The European Y-chromosomal short tandem repeat (STR) haplotype distribution has previously been analysed in various ways. Here, we introduce a new way of analysing population substructure using a new method based on clustering within the discrete Laplace exponential family that models the probability distribution of the Y-STR haplotypes. Creating a consistent statistical model of the haplotypes enables us to perform a wide range of analyses. Previously, haplotype frequency estimation using the discrete Laplace method has been validated. In this paper we investigate how the discrete Laplace method can be used for cluster analysis to further validate the discrete Laplace method. A very important practical fact is that the calculations can be performed on a normal computer. We identified two sub-clusters of the Eastern and Western European Y-STR haplotypes similar to results of previous...

  19. VNTR alleles associated with the α-globin locus are haplotype and population related

    Energy Technology Data Exchange (ETDEWEB)

    Martinson, J.J.; Clegg, J.B.; Boyce, A.J. [Univ. of Oxford (United Kingdom)]

    1994-09-01

    The human α-globin complex contains several polymorphic restriction-enzyme sites (i.e., RFLPs) linked to form haplotypes and is flanked by two hypervariable VNTR loci, the 5′ hypervariable region (HVR) and the more highly polymorphic 3′HVR. Using a combination of RFLP analysis and PCR, the authors have characterized the 5′HVR and 3′HVR alleles associated with the α-globin haplotypes of 133 chromosomes, and they here show that specific α-globin haplotypes are each associated with discrete subsets of the alleles observed at these two VNTR loci. This statistically highly significant association is observed over a region spanning approximately 100 kb. With the exception of closely related haplotypes, different haplotypes do not share identically sized 3′HVR alleles. Earlier studies have shown that α-globin haplotype distributions differ between populations; the current findings also reveal extensive population substructure in the repertoire of α-globin VNTRs. If similar features are characteristic of other VNTR loci, this will have important implications for forensic and anthropological studies. 42 refs., 5 figs., 5 tabs.

  20. Multimodel inference and adaptive management

    Science.gov (United States)

    Rehme, S.E.; Powell, L.A.; Allen, Craig R.

    2011-01-01

    Ecology is an inherently complex science coping with correlated variables, nonlinear interactions and multiple scales of pattern and process, making it difficult for experiments to result in clear, strong inference. Natural resource managers, policy makers, and stakeholders rely on science to provide timely and accurate management recommendations. However, the time necessary to untangle the complexities of interactions within ecosystems is often far greater than the time available to make management decisions. One method of coping with this problem is multimodel inference. Multimodel inference assesses uncertainty by calculating likelihoods among multiple competing hypotheses, but multimodel inference results are often equivocal. Despite this, there may be pressure for ecologists to provide management recommendations regardless of the strength of their study’s inference. We reviewed papers in the Journal of Wildlife Management (JWM) and the journal Conservation Biology (CB) to quantify the prevalence of multimodel inference approaches, the resulting inference (weak versus strong), and how authors dealt with the uncertainty. Thirty-eight percent and 14%, respectively, of articles in the JWM and CB used multimodel inference approaches. Strong inference was rarely observed, with only 7% of JWM and 20% of CB articles resulting in strong inference. We found the majority of weak inference papers in both journals (59%) gave specific management recommendations. Model selection uncertainty was ignored in most recommendations for management. We suggest that adaptive management is an ideal method to resolve uncertainty when research results in weak inference.

  1. A haplotype regression approach for genetic evaluation using sequences from the 1000 bull genomes Project

    International Nuclear Information System (INIS)

    Lakhssassi, K.; González-Recio, O.

    2017-01-01

    Haplotypes from sequencing data may improve the prediction accuracy in genomic evaluations as haplotypes are in stronger linkage disequilibrium with quantitative trait loci than markers from SNP chips. This study focuses first, on the creation of haplotypes in a population sample of 450 Holstein animals, with full-sequence data from the 1000 bull genomes project; and second, on incorporating them into the whole genome prediction model. In total, 38,319,258 SNPs (and indels) from Next Generation Sequencing were included in the analysis. After filtering variants with minor allele frequency (MAF< 0.025) 13,912,326 SNPs were available for the haplotypes extraction with findhap.f90. The number of SNPs in the haploblocks was on average 924 SNP (166,552 bp). Unique haplotypes were around 97% in all chromosomes and were ignored leaving 153,428 haplotypes. Estimated haplotypes had a large contribution to the total variance of genomic estimated breeding values for kilogram of protein, Global Type Index, Somatic Cell Score and Days Open (between 32 and 99.9%). Haploblocks containing haplotypes with large effects were selected by filtering for each trait, haplotypes whose effect was larger/lower than the mean plus/minus 3 times the standard deviation (SD) and 1 SD above the mean of the haplotypes effect distribution. Results showed that filtering by 3 SD would not be enough to capture a large proportion of genetic variance, whereas filtering by 1 SD could be useful but model convergence should be considered. Additionally, sequence haplotypes were able to capture additional genetic variance to the polygenic effect for traits undergoing lower selection intensity like fertility and health traits.
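
    The selection step described above keeps haplotypes whose estimated effect lies more than a chosen number of standard deviations from the mean effect. A minimal sketch of that kind of threshold filter follows, assuming a symmetric two-sided cut and a population standard deviation; the function name, the dictionary layout, and the toy effect values are hypothetical and are not taken from the study.

```python
import statistics


def select_large_effects(effects, k):
    """Keep haplotypes whose effect is outside mean +/- k standard deviations.

    effects: dict mapping a haplotype label to its estimated effect
    k: number of standard deviations used as the threshold
    """
    values = list(effects.values())
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumption: population SD of all effects
    low, high = mean - k * sd, mean + k * sd
    return {name: e for name, e in effects.items() if e < low or e > high}


# Hypothetical effects for six haplotypes; h6 is the only clear outlier.
toy_effects = {"h1": 0.0, "h2": 0.1, "h3": -0.1, "h4": 0.05, "h5": -0.05, "h6": 2.0}
kept_1sd = select_large_effects(toy_effects, 1)
kept_3sd = select_large_effects(toy_effects, 3)
```

    With these toy values the 1 SD threshold keeps only h6, and the 3 SD threshold keeps nothing. A stricter cut retaining very few haplotypes is in line with the abstract's remark that filtering by 3 SD would not capture a large proportion of genetic variance.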

  2. A haplotype regression approach for genetic evaluation using sequences from the 1000 bull genomes Project

    Energy Technology Data Exchange (ETDEWEB)

    Lakhssassi, K.; González-Recio, O.

    2017-07-01

    Haplotypes from sequencing data may improve the prediction accuracy in genomic evaluations as haplotypes are in stronger linkage disequilibrium with quantitative trait loci than markers from SNP chips. This study focuses first, on the creation of haplotypes in a population sample of 450 Holstein animals, with full-sequence data from the 1000 bull genomes project; and second, on incorporating them into the whole genome prediction model. In total, 38,319,258 SNPs (and indels) from Next Generation Sequencing were included in the analysis. After filtering variants with minor allele frequency (MAF< 0.025) 13,912,326 SNPs were available for the haplotypes extraction with findhap.f90. The number of SNPs in the haploblocks was on average 924 SNP (166,552 bp). Unique haplotypes were around 97% in all chromosomes and were ignored leaving 153,428 haplotypes. Estimated haplotypes had a large contribution to the total variance of genomic estimated breeding values for kilogram of protein, Global Type Index, Somatic Cell Score and Days Open (between 32 and 99.9%). Haploblocks containing haplotypes with large effects were selected by filtering for each trait, haplotypes whose effect was larger/lower than the mean plus/minus 3 times the standard deviation (SD) and 1 SD above the mean of the haplotypes effect distribution. Results showed that filtering by 3 SD would not be enough to capture a large proportion of genetic variance, whereas filtering by 1 SD could be useful but model convergence should be considered. Additionally, sequence haplotypes were able to capture additional genetic variance to the polygenic effect for traits undergoing lower selection intensity like fertility and health traits.

  3. Plasmodium falciparum isolates from Angola show the StctVMNT haplotype in the pfcrt gene

    Science.gov (United States)

    2010-01-01

    Background Effective treatment remains a mainstay of malaria control, but it is unfortunately strongly compromised by drug resistance, particularly in Plasmodium falciparum, the most important human malaria parasite. Although P. falciparum chemoresistance is well recognized all over the world, limited data are available on the distribution and prevalence of pfcrt and pfmdr1 haplotypes that mediate resistance to commonly used drugs and that show distinct geographic differences. Methods Plasmodium falciparum-infected blood samples collected in 2007 at four municipalities of Luanda, Angola, were genotyped using PCR and direct DNA sequencing. Single nucleotide polymorphisms in the P. falciparum pfcrt and pfmdr1 genes were assessed and haplotype prevalences were determined. Results and Discussion The most prevalent pfcrt haplotype was StctVMNT (representing amino acids at codons 72-76). This result was unexpected, since the StctVMNT haplotype has previously been seen mainly in parasites from South America and India. The CVIET, CVMNT and CVINT drug-resistance haplotypes were also found, and one previously undescribed haplotype (CVMDT) was detected. Regarding pfmdr1, the most prevalent haplotype was YEYSNVD (representing amino acids at codons 86, 130, 184, 1034, 1042, 1109 and 1246). Wild haplotypes for pfcrt and pfmdr1 were uncommon; 3% of field isolates harbored wild type pfcrt (CVMNK), whereas 21% had wild type pfmdr1 (NEYSNVD). The observed predominance of the StctVMNT haplotype in Angola could be a result of frequent travel between Brazil and Angola citizens in the context of selective pressure of heavy CQ use. Conclusions The high prevalence of the pfcrt SVMNT haplotype and the pfmdr1 86Y mutation confirm high-level chloroquine resistance and might suggest reduced efficacy of amodiaquine in Angola. Further studies must be encouraged to examine the in vitro sensitivity of pfcrt SVMNT parasites to artesunate and amodiaquine for better conclusive data. PMID:20565881

  4. On detecting incomplete soft or hard selective sweeps using haplotype structure

    DEFF Research Database (Denmark)

    Ferrer-Admetlla, Anna; Liang, Mason; Korneliussen, Thorfinn Sand

    2014-01-01

    We present a new haplotype-based statistic (nSL) for detecting both soft and hard sweeps in population genomic data from a single population. We compare our new method with classic single-population haplotype and site frequency spectrum (SFS)-based methods and show that it is more robust, particularly to recombination rate variation. However, all statistics show some sensitivity to the assumptions of the demographic model. Additionally, we show that nSL has at least as much power as other methods under a number of different selection scenarios, most notably in the cases of sweeps from standing...

  5. Two families from New England with usher syndrome type IC with distinct haplotypes.

    Science.gov (United States)

    DeAngelis, M M; McGee, T L; Keats, B J; Slim, R; Berson, E L; Dryja, T P

    2001-03-01

    To search for patients with Usher syndrome type IC among those with Usher syndrome type I who reside in New England. Genotype analysis of microsatellite markers closely linked to the USH1C locus was done using the polymerase chain reaction. We compared the haplotype of our patients who were homozygous in the USH1C region with the haplotypes found in previously reported USH1C Acadian families who reside in southwestern Louisiana and from a single family residing in Lebanon. Of 46 unrelated cases of Usher syndrome type I residing in New England, two were homozygous at genetic markers in the USH1C region. Of these, one carried the Acadian USH1C haplotype and had Acadian ancestors (that is, from Nova Scotia) who did not participate in the 1755 migration of Acadians to Louisiana. The second family had a haplotype that proved to be the same as that of a family with USH1C residing in Lebanon. Each of the two families had haplotypes distinct from the other. This is the first report that some patients residing in New England have Usher syndrome type IC. Patients with Usher syndrome type IC can have the Acadian haplotype or the Lebanese haplotype compatible with the idea that at least two independently arising pathogenic mutations have occurred in the yet-to-be identified USH1C gene.

  6. Haplotypes in CCR5-CCR2, CCL3 and CCL5 are associated with natural resistance to HIV-1 infection in a Colombian cohort.

    Science.gov (United States)

    Vega, Jorge A; Villegas-Ospina, Simón; Aguilar-Jiménez, Wbeimar; Rugeles, María T; Bedoya, Gabriel; Zapata, Wildeman

    2017-06-01

    Variants in genes encoding for HIV-1 co-receptors and their natural ligands have been individually associated to natural resistance to HIV-1 infection. However, the simultaneous presence of these variants has been poorly studied. To evaluate the association of single and multilocus haplotypes in genes coding for the viral co-receptors CCR5 and CCR2, and their ligands CCL3 and CCL5, with resistance or susceptibility to HIV-1 infection. Nine variants in CCR5-CCR2, two SNPs in CCL3 and two in CCL5 were genotyped by PCR-RFLP in 35 seropositive (cases) and 49 HIV-1-exposed seronegative Colombian individuals (controls). Haplotypes were inferred using the Arlequin software, and their frequency in individual or combined loci was compared between cases and controls by the chi-square test. A p' value <0.05 after Bonferroni correction was considered significant. Homozygosis of the human haplogroup (HH) E was absent in controls and frequent in cases, showing a tendency to susceptibility. The haplotypes C-C and T-T in CCL3 were associated with susceptibility (p'=0.016) and resistance (p'<0.0001) to HIV-1 infection, respectively. Finally, in multilocus analysis, the haplotype combinations formed by HHC in CCR5-CCR2, T-T in CCL3 and G-C in CCL5 were associated with resistance (p'=0.006). Our results suggest that specific combinations of variants in genes from the same signaling pathway can define an HIV-1 resistant phenotype. Despite our small sample size, our statistically significant associations suggest strong effects; however, these results should be further validated in larger cohorts.

  7. Fetal hemoglobin in sickle cell anemia: The Arab-Indian haplotype and new therapeutic agents.

    Science.gov (United States)

    Habara, Alawi H; Shaikho, Elmutaz M; Steinberg, Martin H

    2017-11-01

    Fetal hemoglobin (HbF) has well-known tempering effects on the symptoms of sickle cell disease and its levels vary among patients with different haplotypes of the sickle hemoglobin gene. Compared with sickle cell anemia haplotypes found in patients of African descent, HbF levels in Saudi and Indian patients with the Arab-Indian (AI) haplotype exceed that in any other haplotype by nearly twofold. Genetic association studies have identified some loci associated with high HbF in the AI haplotype but these observations require functional confirmation. Saudi patients with the Benin haplotype have HbF levels almost twice as high as African patients with this haplotype but this difference is unexplained. Hydroxyurea is still the only FDA approved drug for HbF induction in sickle cell disease. While most patients treated with hydroxyurea have an increase in HbF and some clinical improvement, 10 to 20% of adults show little response to this agent. We review the genetic basis of HbF regulation focusing on sickle cell anemia in Saudi Arabia and discuss new drugs that can induce increased levels of HbF. © 2017 Wiley Periodicals, Inc.

  8. Statistical inference: an integrated Bayesian/likelihood approach

    CERN Document Server

    Aitkin, Murray

    2010-01-01

    Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing. After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre

  9. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2000-01-01

    New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities based on

  10. Mice, humans and haplotypes--the hunt for disease genes in SLE.

    Science.gov (United States)

    Rigby, R J; Fernando, M M A; Vyse, T J

    2006-09-01

    Defining the polymorphisms that contribute to the development of complex genetic disease traits is a challenging, although increasingly tractable problem. Historically, the technical difficulties in conducting association studies across the entire human genome are such that murine models have been used to generate candidate genes for analysis in human complex diseases, such as SLE. In this article we discuss the advantages and disadvantages of this approach and specifically address some assumptions made in the transition from studying one species to another, using lupus as an example. These issues include differences in genetic structure and genetic organisation which are a reflection on the population history. Clearly there are major differences in the histories of the human population and inbred laboratory strains of mice. Both human and murine genomes do exhibit structure at the genetic level. That is to say, they comprise haplotypes which are genomic regions that carry runs of polymorphisms that are not independently inherited. Haplotypes therefore reduce the number of combinations of the polymorphisms in the DNA in that region and facilitate the identification of disease susceptibility genes in both mice and humans. There are now novel means of generating candidate genes in SLE using mutagenesis (with ENU) in mice and identifying mice that generate antinuclear autoimmunity. In addition, murine models still provide a valuable means of exploring the functional consequences of genetic variation. However, advances in technology are such that human geneticists can now screen large fractions of the human genome for disease associations using microchip technologies that provide information on upwards of 100,000 different polymorphisms. These approaches are aimed at identifying haplotypes that carry disease susceptibility mutations and rely less on the generation of candidate genes.

  11. Common ataxia telangiectasia mutated haplotypes and risk of breast cancer: a nested case–control study

    International Nuclear Information System (INIS)

    Tamimi, Rulla M; Hankinson, Susan E; Spiegelman, Donna; Kraft, Peter; Colditz, Graham A; Hunter, David J

    2004-01-01

    The ataxia telangiectasia mutated (ATM) gene is a tumor suppressor gene with functions in cell cycle arrest, apoptosis, and repair of DNA double-strand breaks. Based on family studies, women heterozygous for mutations in the ATM gene are reported to have a fourfold to fivefold increased risk of breast cancer compared with noncarriers of the mutations, although not all studies have confirmed this association. Haplotype analysis has been suggested as an efficient method for investigating the role of common variation in the ATM gene and breast cancer. Five biallelic haplotype tagging single nucleotide polymorphisms are estimated to capture 99% of the haplotype diversity in Caucasian populations. We conducted a nested case–control study of breast cancer within the Nurses' Health Study cohort to address the role of common ATM haplotypes and breast cancer. Cases and controls were genotyped for five haplotype tagging single nucleotide polymorphisms. Haplotypes were predicted for 1309 cases and 1761 controls for which genotype information was available. Six unique haplotypes were predicted in this study, five of which occur at a frequency of 5% or greater. The overall distribution of haplotypes was not significantly different between cases and controls (χ² = 3.43, five degrees of freedom, P = 0.63). There was no evidence that common haplotypes of ATM are associated with breast cancer risk. Extensive single nucleotide polymorphism detection using the entire genomic sequence of ATM will be necessary to rule out less common variation in ATM and sporadic breast cancer risk.
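
    The reported test result (chi-square of 3.43 on five degrees of freedom, P = 0.63) can be checked from the statistic alone. The sketch below computes the upper-tail probability of a chi-square distribution for integer degrees of freedom with the standard recurrence for the regularised incomplete gamma function, using only the Python standard library. The function name is hypothetical, and this reproduces only the P value, not the authors' analysis.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("df must be >= 1 and x must be >= 0")
    z = x / 2.0
    # Closed-form starting points: Q(1/2, z) = erfc(sqrt(z)) and Q(1, z) = exp(-z).
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Statistic and degrees of freedom as reported in the abstract above.
p_value = chi2_sf(3.43, 5)
```

    Rounded to two decimals, the computed tail probability agrees with the P = 0.63 given in the abstract.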

  12. Genetic population structure of the desert shrub species lycium ruthenicum inferred from chloroplast dna

    International Nuclear Information System (INIS)

    Chen, H.; Yonezawa, T.

    2014-01-01

    Lycium ruthenicum (Solanaceae), a spiny shrub mostly distributed in the desert regions of north and northwest China, has been shown to exhibit high tolerance to the extreme environment. In this study, the phylogeography and evolutionary history of L. ruthenicum were examined, on the basis of 80 individuals from eight populations. Using the sequence variations of two spacer regions of chloroplast DNA (trnH-psbA and rps16-trnK), the absence of a geographic component in the chloroplast DNA genetic structure was identified (GST = 0.351, NST = 0.304, NST < GST), which was consistent with the result of SAMOVA, suggesting weak phylogeographic structure of this species. Phylogenetic and network analyses showed that a total of 10 haplotypes identified in the present study clustered into two clades, in which clade I harbored the ancestral haplotypes that inferred two independent glacial refugia in the middle of Qaidam Basin and the western Inner Mongolia. The existence of regional evolutionary differences was supported by GENETREE, which revealed that one of the populations in Qaidam Basin and the two populations in Tarim Basin had experienced rapid expansion, and the other populations retained relatively stable population size during the Pleistocene. Given the results of long-term gene flow and pairwise differences, strong gene flow was insufficient to reduce the genetic differentiation among populations or within populations, probably due to the genetic composition containing a common haplotype and the high number of private haplotypes fixed for most of the population. The divergence times of different lineages were consistent with the rapid uplift phases of the Qinghai-Tibetan Plateau and the initiation and expansion of deserts in northern China, suggesting that the origin and evolution of L. ruthenicum were strongly influenced by Quaternary environment changes. (author)

  13. Type Inference for Session Types in the Pi-Calculus

    DEFF Research Database (Denmark)

    Graversen, Eva Fajstrup; Harbo, Jacob Buchreitz; Huttel, Hans

    2014-01-01

    In this paper we present a direct algorithm for session type inference for the π-calculus. Type inference for session types has previously been achieved either by imposing limitations and restrictions on the π-calculus, or by reducing the type inference problem to that for linear types. Our approach...

  14. The Prognostic Value of Haplotypes in the Vascular Endothelial Growth Factor

    DEFF Research Database (Denmark)

    Hansen, Torben Frøstrup; Spindler, Karen-Lise Garm; Andersen, Rikke Fredslund

    2010-01-01

    Abstract: New prognostic markers in patients with colorectal cancer (CRC) are a prerequisite for individualized treatment. Prognostic importance of single nucleotide polymorphisms (SNPs) in the vascular endothelial growth factor A (VEGF-A) gene has been proposed. The objective of the present study...... using the PHASE program. The prognostic influence was evaluated using Kaplan-Meier plots and log rank tests. The Cox regression method was used to analyze the independent prognostic importance of different markers. All three SNPs were significantly related to survival. A haplotype combination, responsible...... findings in a second and independent cohort. Haplotype combinations call for further investigation. Keywords: colorectal neoplasm; single nucleotide polymorphisms; haplotypes; vascular endothelial growth factor A; survival...

  15. Analysis of SNPs and haplotypes in vitamin D pathway genes and renal cancer risk.

    Directory of Open Access Journals (Sweden)

    Sara Karami

    2009-09-01

    In the kidney, vitamin D is converted to its active form. Since vitamin D exerts its activity through binding to the nuclear vitamin D receptor (VDR), most genetic studies have primarily focused on variation within this gene. Therefore, analysis of genetic variation in VDR and other vitamin D pathway genes may provide insight into the role of vitamin D in renal cell carcinoma (RCC) etiology. RCC cases (N = 777) and controls (N = 1,035) were genotyped to investigate the relationship between RCC risk and variation in eight target genes. Minimum-p-value permutation (Min-P) tests were used to identify genes associated with risk. A three single nucleotide polymorphism (SNP) sliding window was used to identify chromosomal regions with a False Discovery Rate of <10%, where subsequently, haplotype relative risks were computed in Haplostats. Min-P values showed that VDR (p-value = 0.02) and retinoid-X-receptor-alpha (RXRA) (p-value = 0.10) were associated with RCC risk. Within VDR, three haplotypes across two chromosomal regions of interest were identified. The first region, located within intron 2, contained two haplotypes that increased RCC risk by approximately 25%. The second region included a haplotype (rs2239179, rs12717991) across intron 4 that increased risk among participants with the TC (OR = 1.31, 95% CI = 1.09-1.57) haplotype compared to participants with the common haplotype, TT. Across RXRA, one haplotype located 3' of the coding sequence (rs748964, rs3118523) increased RCC risk 35% among individuals with the variant haplotype compared to those with the most common haplotype. This study comprehensively evaluated genetic variation across eight vitamin D pathway genes in relation to RCC risk. We found increased risk associated with VDR and RXRA. Replication studies are warranted to confirm these findings.
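
    The three-SNP sliding window mentioned in this abstract can be pictured as follows. This is only a minimal sketch: the SNP identifiers are hypothetical placeholders, the helper name `sliding_windows` is an assumption, and the association test applied to each window in the study (Haplostats, FDR control) is not reproduced here.

```python
from typing import List, Sequence, Tuple


def sliding_windows(snps: Sequence[str], size: int = 3) -> List[Tuple[str, ...]]:
    """Return every run of `size` consecutive SNPs, in chromosomal order."""
    if size < 1:
        raise ValueError("size must be positive")
    # When fewer than `size` SNPs are supplied, the range is empty.
    return [tuple(snps[i:i + size]) for i in range(len(snps) - size + 1)]


# Hypothetical SNP identifiers, ordered by position along one gene.
ordered_snps = ["snp_a", "snp_b", "snp_c", "snp_d", "snp_e"]
windows = sliding_windows(ordered_snps)
for window in windows:
    print(window)
```

    Each window would then be passed to a haplotype association test; adjacent windows overlap by two SNPs, which is why multiple-testing control matters in this design.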

  16. The JAK2 GGCC (46/1) Haplotype in Myeloproliferative Neoplasms: Causal or Random?

    Directory of Open Access Journals (Sweden)

    Luisa Anelli

    2018-04-01

    The germline JAK2 haplotype known as “GGCC or 46/1 haplotype” (haplotypeGGCC_46/1) consists of a combination of single nucleotide polymorphisms (SNPs) mapping in a region of about 250 kb, extending from the JAK2 intron 10 to the Insulin-like 4 (INSL4) gene. Four main SNPs (rs3780367, rs10974944, rs12343867, and rs1159782) generating a “GGCC” combination are more frequently indicated to represent the JAK2 haplotype. These SNPs are inherited together and are frequently associated with the onset of myeloproliferative neoplasms (MPN) positive for both JAK2 V617F and exon 12 mutations. The association between the JAK2 haplotypeGGCC_46/1 and mutations in other genes, such as thrombopoietin receptor (MPL) and calreticulin (CALR), or the association with triple negative MPN, is still controversial. This review provides an overview of the frequency and the role of the JAK2 haplotypeGGCC_46/1 in the pathogenesis of different myeloid neoplasms and describes the hypothetical mechanisms at the basis of the association with JAK2 gene mutations. Moreover, possible clinical implications are discussed, as different papers reported contrasting data about the correlation between the JAK2 haplotypeGGCC_46/1 and blood cell count, survival, or disease progression.

  17. General Purpose Probabilistic Programming Platform with Effective Stochastic Inference

    Science.gov (United States)

    2018-04-01

    Figure 1. The problem of inferring curves from data while simultaneously choosing the...bottom path) as the inverse problem to computer graphics (top path)...Figure 18. An illustration of generative probabilistic graphics for 3D...Building these systems involves simultaneously developing mathematical models, inference algorithms and optimized software implementations. Small changes

  18. Statistical inference based on divergence measures

    CERN Document Server

    Pardo, Leandro

    2005-01-01

    The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. However, in spite of the fact that divergence statistics have become a very good alternative to the classical likelihood ratio test and the Pearson-type statistic in discrete models, many statisticians remain unaware of this powerful approach. Statistical Inference Based on Divergence Measures explores classical problems of statistical inference, such as estimation and hypothesis testing, on the basis of measures of entropy and divergence. The first two chapters form an overview, from a statistical perspective, of the most important measures of entropy and divergence and study their properties. The author then examines the statistical analysis of discrete multivariate data with emphasis on problems in contingency tables and loglinear models using phi-divergence test statistics as well as minimum phi-divergence estimators. The final chapter looks at testing in general populations, prese...

  19. HLA-G regulatory haplotypes and implantation outcome in couples who underwent assisted reproduction treatment.

    Science.gov (United States)

    Costa, Cynthia Hernandes; Gelmini, Georgia Fernanda; Wowk, Pryscilla Fanini; Mattar, Sibelle Botogosque; Vargas, Rafael Gustavo; Roxo, Valéria Maria Munhoz Sperandio; Schuffner, Alessandro; Bicalho, Maria da Graça

    2012-09-01

    The role of HLA-G in several clinical conditions related to reproduction has been investigated. Important polymorphisms have been found within the 5'URR and 3'UTR regions of the HLA-G promoter. The aim of the present study was to investigate 16 SNPs in the 5'URR and the 14-bp insertion/deletion (ins/del) polymorphism located in the 3'UTR region of the HLA-G gene and their possible association with the implantation outcome in couples who underwent assisted reproduction treatments (ART). The case group was composed of 25 ART couples. Ninety-four couples with two or more term pregnancies composed the control group. Polymorphism haplotype frequencies of HLA-G were determined for both groups. Haplotype 5, Haplotype 8 and Haplotype 11 were completely absent in ART couples. The HLA-G*01:01:02a and HLA-G*01:01:02b alleles and the 14-bp ins polymorphism, Haplotype 2, showed an increased frequency in case women and a similar distribution between case and control men. However, this susceptibility haplotype is significantly overrepresented in case women and in couples with implantation failure after treatment, which led us to suggest a maternal effect associated with this haplotype, since its presence in women is related to a higher number of couples who underwent ART. Copyright © 2012. Published by Elsevier Inc.

  20. EI: A Program for Ecological Inference

    Directory of Open Access Journals (Sweden)

    Gary King

    2004-09-01

    The program EI provides a method of inferring individual behavior from aggregate data. It implements the statistical procedures, diagnostics, and graphics from the book A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data (King 1997). Ecological inference, as traditionally defined, is the process of using aggregate (i.e., "ecological") data to infer discrete individual-level relationships of interest when individual-level data are not available. Ecological inferences are required in political science research when individual-level surveys are unavailable (e.g., local or comparative electoral politics), unreliable (racial politics), insufficient (political geography), or infeasible (political history). They are also required in numerous areas of major significance in public policy (e.g., for applying the Voting Rights Act) and other academic disciplines ranging from epidemiology and marketing to sociology and quantitative history.

  1. Evolutionary inference via the Poisson Indel Process.

    Science.gov (United States)

    Bouchard-Côté, Alexandre; Jordan, Michael I

    2013-01-22

    We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.

  2. On quantum statistical inference

    NARCIS (Netherlands)

    Barndorff-Nielsen, O.E.; Gill, R.D.; Jupp, P.E.

    2003-01-01

    Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, developments in the theory of quantum measurements have

  3. Approximate Bayesian computation for modular inference problems with many parameters: the example of migration rates.

    Science.gov (United States)

    Aeschbacher, S; Futschik, A; Beaumont, M A

    2013-02-01

    We propose a two-step procedure for estimating multiple migration rates in an approximate Bayesian computation (ABC) framework, accounting for global nuisance parameters. The approach is not limited to migration, but generally of interest for inference problems with multiple parameters and a modular structure (e.g. independent sets of demes or loci). We condition on a known, but complex demographic model of a spatially subdivided population, motivated by the reintroduction of Alpine ibex (Capra ibex) into Switzerland. In the first step, the global parameters ancestral mutation rate and male mating skew have been estimated for the whole population in Aeschbacher et al. (Genetics 2012; 192: 1027). In the second step, we estimate in this study the migration rates independently for clusters of demes putatively connected by migration. For large clusters (many migration rates), ABC faces the problem of too many summary statistics. We therefore assess by simulation if estimation per pair of demes is a valid alternative. We find that the trade-off between reduced dimensionality for the pairwise estimation on the one hand and lower accuracy due to the assumption of pairwise independence on the other depends on the number of migration rates to be inferred: the accuracy of the pairwise approach increases with the number of parameters, relative to the joint estimation approach. To distinguish between low and zero migration, we perform ABC-type model comparison between a model with migration and one without. Applying the approach to microsatellite data from Alpine ibex, we find no evidence for substantial gene flow via migration, except for one pair of demes in one direction. © 2013 Blackwell Publishing Ltd.
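
    For readers unfamiliar with ABC, the basic rejection step that such procedures build on can be sketched as below. This is a toy illustration under stated assumptions, not the authors' two-step method: the simulator is a simple Bernoulli count rather than a demographic model, the prior is Uniform(0, 1), a single summary statistic is used, and all function names and numbers are hypothetical.

```python
import random
from typing import List


def simulate(rate: float, n_trials: int, rng: random.Random) -> int:
    """Toy simulator: number of successes in n_trials Bernoulli(rate) draws."""
    return sum(rng.random() < rate for _ in range(n_trials))


def rejection_abc(observed: int, n_trials: int, n_draws: int,
                  tolerance: int, rng: random.Random) -> List[float]:
    """Keep prior draws whose simulated summary lies within `tolerance` of the data."""
    accepted = []
    for _ in range(n_draws):
        rate = rng.random()                      # draw from a Uniform(0, 1) prior
        summary = simulate(rate, n_trials, rng)  # one summary statistic
        if abs(summary - observed) <= tolerance:
            accepted.append(rate)
    return accepted


rng = random.Random(0)  # fixed seed so the run is repeatable
posterior = rejection_abc(observed=30, n_trials=100, n_draws=5000,
                          tolerance=2, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} of 5000 draws, mean = {posterior_mean:.3f}")
```

    With one parameter and one summary statistic the acceptance rate stays workable; as the abstract notes, the difficulty grows when many parameters require many summary statistics, which motivates estimating per pair of demes.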

  4. Geographical distribution of a specific mitochondrial haplotype of Zymoseptoria tritici

    Directory of Open Access Journals (Sweden)

    Sameh BOUKEF

    2014-01-01

    Severity of disease caused by the fungus Zymoseptoria tritici throughout world cereal growing regions has elicited much debate on the potential evolutionary mechanism conferring high adaptability of the pathogen to diverse climate conditions and different wheat hosts (Triticum durum and T. aestivum). A specific mitochondrial DNA sequence was used to investigate the geographic distribution of the type 4 haplotype (mtRFLP4) within 1363 isolates of Z. tritici originating from 21 countries. The mtRFLP4 haplotype was detected on both durum and bread wheat hosts, with greater frequency on durum wheat. The distribution of mtRFLP4 was limited to populations sampled from the Mediterranean and the Red Sea region. Greater frequencies of mtRFLP4 were found in Tunisia (87%) and Algeria (60%). The haplotype was absent within European, Australian, and North and South American populations, except Argentina. While alternative hypotheses such as climatic adaptation could not be ruled out, it is postulated that mtRFLP4 originated in North Africa (e.g. Tunisia or Algeria) as an adaptation to durum wheat as the prevailing cereal crop. The specialized haplotype has subsequently spread, as indicated by a lower frequency of occurrence in the surrounding Mediterranean countries and on bread wheat hosts.

  5. Mitochondrial and Y chromosome haplotype motifs as diagnostic markers of Jewish ancestry: a reconsideration.

    Directory of Open Access Journals (Sweden)

    Sergio Tofanelli

    2014-11-01

    Several authors have proposed haplotype motifs based on site variants at the mitochondrial genome (mtDNA) and the non-recombining portion of the Y chromosome (NRY) to trace the genealogies of Jewish people. Here, we analyzed their main approaches and tested the feasibility of adopting motifs as ancestry markers through construction of a large database of mtDNA and NRY haplotypes from public genetic genealogical repositories. We verified the reliability of Jewish ancestry prediction based on the Cohen and Levite Modal Haplotypes in their classical 6 STR marker format or in the extended 12 STR format, as well as four founder mtDNA lineages (HVS-I segments) accounting for about 40% of the current population of Ashkenazi Jews. For this purpose we compared haplotype composition in individuals of self-reported Jewish ancestry with the rest of European, African or Middle Eastern samples, to test for non-random association of ethno-geographic groups and haplotypes. Overall, NRY and mtDNA based motifs, previously reported to differentiate between groups, were found to be more represented in Jewish compared to non-Jewish groups. However, this seems to stem from common ancestors of Jewish lineages being rather recent with respect to ancestors of non-Jewish lineages with the same haplotype signatures. Moreover, the polyphyly of haplotypes which contain the proposed motifs and the misuse of constant mutation rates heavily affected previous attempts to correctly date the origin of common ancestries. Accordingly, our results stress the limitations of using the above haplotype motifs as reliable Jewish ancestry predictors and show their inadequacy for forensic or genealogical purposes.

  6. Genetic and molecular characterization of three novel S-haplotypes in sour cherry (Prunus cerasus L.).

    Science.gov (United States)

    Tsukamoto, Tatsuya; Potter, Daniel; Tao, Ryutaro; Vieira, Cristina P; Vieira, Jorge; Iezzoni, Amy F

    2008-01-01

    Tetraploid sour cherry (Prunus cerasus L.) exhibits gametophytic self-incompatibility (GSI) whereby the specificity of self-pollen rejection is controlled by alleles of the stylar and pollen specificity genes, S-RNase and SFB (S haplotype-specific F-box protein gene), respectively. As sour cherry selections can be either self-compatible (SC) or self-incompatible (SI), polyploidy per se does not result in SC. Instead the genotype-dependent loss of SI in sour cherry is due to the accumulation of non-functional S-haplotypes. The presence of two or more non-functional S-haplotypes within sour cherry 2x pollen renders that pollen SC. Two new S-haplotypes from sour cherry, S(33) and S(34), that are presumed to be contributed by the P. fruticosa species parent, the complete S-RNase and SFB sequences of a third S-haplotype, S(35), plus the presence of two previously identified sweet cherry S-haplotypes, S(14) and S(16) are described here. Genetic segregation data demonstrated that the S(16)-, S(33)-, S(34)-, and S(35)-haplotypes present in sour cherry are fully functional. This result is consistent with our previous finding that 'hetero-allelic' pollen is incompatible in sour cherry. Phylogenetic analyses of the SFB and S-RNase sequences from available Prunus species reveal that the relationships among S-haplotypes show no correspondence to known organismal relationships at any taxonomic level within Prunus, indicating that polymorphisms at the S-locus have been maintained throughout the evolution of the genus. Furthermore, the phylogenetic relationships among SFB sequences are generally incongruent with those among S-RNase sequences for the same S-haplotypes. Hypotheses compatible with these results are discussed.

  7. β-globin haplotypes in normal and hemoglobinopathic individuals from Reconcavo Baiano, State of Bahia, Brazil

    Directory of Open Access Journals (Sweden)

    Wellington dos Santos Silva

    2010-01-01

    Five restriction site polymorphisms in the β-globin gene cluster (HincII-5'ε, HindIII-Gγ, HindIII-Aγ, HincII-ψβ1 and HincII-3'ψβ1) were analyzed in three populations (n = 114) from Reconcavo Baiano, State of Bahia, Brazil. The groups included two urban populations from the towns of Cachoeira and Maragojipe and one rural Afro-descendant population, known as the "quilombo community", from Cachoeira municipality. The number of haplotypes found in the populations ranged from 10 to 13, which indicated higher diversity than in the parental populations. The haplotypes 2 (+----), 3 (----+), 4 (-+--+) and 6 (-++-+) on the βA chromosomes were the most common, and two haplotypes, 9 (-++++) and 14 (++--+), were found exclusively in the Maragojipe population. The other haplotypes (1, 5, 9, 11, 12, 13, 14 and 16) had lower frequencies. Restriction site analysis and the derived haplotypes indicated homogeneity among the populations. Thirty-two individuals with hemoglobinopathies (17 sickle cell disease, 12 HbSC disease and 3 HbCC disease) were also analyzed. The haplotype frequencies of these patients differed significantly from those of the general population. In the sickle cell disease subgroup, the predominant haplotypes were BEN (Benin) and CAR (Central African Republic), with frequencies of 52.9% and 32.4%, respectively. The high frequency of the BEN haplotype agreed with the historical origin of the Afro-descendant population in the state of Bahia. However, this frequency differed from that of Salvador, the state capital, where the CAR and BEN haplotypes have similar frequencies, probably as a consequence of domestic slave trade and subsequent internal migrations to other regions of Brazil.

  8. Causal Effect Inference with Deep Latent-Variable Models

    NARCIS (Netherlands)

    Louizos, C; Shalit, U.; Mooij, J.; Sontag, D.; Zemel, R.; Welling, M.

    2017-01-01

    Learning individual-level causal effects from observational data, such as inferring the most effective medication for a specific patient, is a problem of growing importance for policy makers. The most important aspect of inferring causal effects from observational data is the handling of

  9. Endothelial Nitric Oxide Synthase Haplotypes Are Associated with Preeclampsia in Maya Mestizo Women

    Science.gov (United States)

    Díaz-Olguín, Lizbeth; Coral-Vázquez, Ramón Mauricio; Canto-Cetina, Thelma; Canizales-Quinteros, Samuel; Ramírez Regalado, Belem; Fernández, Genny; Canto, Patricia

    2011-01-01

    Preeclampsia is a specific disease of pregnancy and is believed to have a genetic component. The aim of this study was to investigate whether three polymorphisms in eNOS or their haplotypes are associated with preeclampsia in Maya mestizo women. A case-control study was performed in which 127 preeclamptic patients and 263 controls were included. Genotypes and haplotypes for the -786T→C, intron 4 variants, and Glu298Asp of eNOS were determined by PCR and real-time PCR allelic discrimination. Logistic regression analysis with adjustment for age and body mass index (BMI) was used to test for associations between genotype and preeclampsia under recessive, codominant and dominant models. Pairwise linkage disequilibrium between single nucleotide polymorphisms was calculated by direct correlation r², and haplotype analysis was conducted. Women homozygous for the Asp298 allele showed an association with preeclampsia. In addition, analysis of the haplotype frequencies revealed that the -786C-4b-Asp298 haplotype was significantly more frequent in preeclamptic patients than in controls (0.143 vs. 0.041, respectively; OR = 3.01; 95% CI = 1.74–5.23; P = 2.9 × 10⁻⁴). Although the Asp298 genotype in a recessive model was associated with the presence of preeclampsia in Maya mestizo women, we believe that in this population the -786C-4b-Asp298 haplotype is a better genetic marker. PMID:21897002
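
    The pairwise linkage disequilibrium measure named in this abstract has a standard textbook form for two biallelic loci: with D = f(AB) - p(A)p(B), r² = D² / [p(A)(1 - p(A)) p(B)(1 - p(B))]. The sketch below computes it from phased haplotype frequencies; that input format, the function name `ld_stats`, and the example frequencies are assumptions for illustration, and the study itself may have estimated r² differently (for example from unphased genotypes).

```python
from typing import Tuple


def ld_stats(f_AB: float, f_Ab: float, f_aB: float, f_ab: float) -> Tuple[float, float]:
    """Return (D, r_squared) for two biallelic loci from haplotype frequencies."""
    total = f_AB + f_Ab + f_aB + f_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = f_AB + f_Ab           # frequency of allele A at locus 1
    p_B = f_AB + f_aB           # frequency of allele B at locus 2
    d = f_AB - p_A * p_B        # deviation from independence
    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    if denom == 0.0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom


# Hypothetical frequencies, not taken from the study.
d, r2 = ld_stats(0.4, 0.1, 0.1, 0.4)
print(f"D = {d:.3f}, r^2 = {r2:.3f}")
```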

  10. Interrelationships between Amerindian tribes of lower Amazonia as manifest by HLA haplotype disequilibria.

    Science.gov (United States)

    Black, F L

    1984-11-01

    HLA B-C haplotypes exhibit common disequilibria in populations drawn from four continents, indicating that they are subject to broadly active selective forces. However, the A-B and A-C associations we have examined show no consistent disequilibrium pattern, leaving open the possibility that these disequilibria are due to descent from common progenitors. By examining HLA haplotype distributions, I have explored the implications that would follow from the hypothesis that biological selection played no role in determining A-C disequilibria in 10 diverse tribes of the lower Amazon Basin. Certain haplotypes are in strong positive disequilibria across a broad geographic area, suggesting that members of diverse tribes descend from common ancestors. On the basis of the extent of diffusion of the components of these haplotypes, one can estimate that the progenitors lived less than 6,000 years ago. One widely encountered lineage entered the area within the last 1,200 years. When haplotype frequencies are used in genetic distance measurements, they give a pattern of relationships very similar to that obtained by conventional chord measurements based on several genetic markers; but more than that, when individual haplotype disequilibria in the several tribes are compared, multiple origins of a single tribe are discernible and relationships are revealed that correlate more closely to geographic and linguistic patterns than do the genetic distance measurements.

  11. Polynomial Chaos Surrogates for Bayesian Inference

    KAUST Repository

    Le Maitre, Olivier

    2016-01-06

    Bayesian inference is a popular probabilistic method for solving inverse problems, such as the identification of a field parameter in a PDE model. The inference relies on Bayes' rule to update the prior density of the sought field, from observations, and derive its posterior distribution. In most cases the posterior distribution has no explicit form and has to be sampled, for instance using a Markov chain Monte Carlo method. In practice the prior field parameter is decomposed and truncated (e.g. by means of a Karhunen-Loève decomposition) to recast the inference problem into the inference of a finite number of coordinates. Although proved effective in many situations, Bayesian inference as sketched above faces several difficulties requiring improvements. First, sampling the posterior can be an extremely costly task, as it requires multiple resolutions of the PDE model for different values of the field parameter. Second, when the observations are not very informative, the inferred parameter field can depend strongly on its prior, which can be somewhat arbitrary. These issues have motivated the introduction of reduced modeling or surrogates for the (approximate) determination of the parametrized PDE solution and hyperparameters in the description of the prior field. Our contribution focuses on recent developments in these two directions: the acceleration of the posterior sampling by means of Polynomial Chaos expansions and the efficient treatment of parametrized covariance functions for the prior field. We also discuss the possibility of making such an approach adaptive to further improve its efficiency.
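
    The surrogate idea can be illustrated in one dimension. The sketch below is a toy example under stated assumptions, not the authors' method: the "forward model" is simply exp(x) standing in for a costly PDE solve, the parameter is assumed uniform on [-1, 1] so that Legendre polynomials are the matching Polynomial Chaos basis, and the projection integrals use a plain midpoint rule with many nodes (a real application would use far fewer model runs, for example Gauss quadrature or regression). All names are hypothetical.

```python
import math
from typing import Callable, List


def legendre(k: int, x: float) -> float:
    """Evaluate the Legendre polynomial P_k(x) by the three-term recurrence."""
    p_prev, p = 1.0, x
    if k == 0:
        return p_prev
    for n in range(1, k):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def pc_coefficients(model: Callable[[float], float], degree: int,
                    n_quad: int = 2000) -> List[float]:
    """Project `model` onto P_0..P_degree: c_k = (2k+1)/2 * integral of model * P_k."""
    h = 2.0 / n_quad
    nodes = [-1.0 + (i + 0.5) * h for i in range(n_quad)]  # midpoint rule on [-1, 1]
    values = [model(x) for x in nodes]
    return [
        (2 * k + 1) / 2.0 * h * sum(v * legendre(k, x) for v, x in zip(values, nodes))
        for k in range(degree + 1)
    ]


def surrogate(coeffs: List[float], x: float) -> float:
    """Evaluate the truncated expansion at x, without calling the forward model."""
    return sum(c * legendre(k, x) for k, c in enumerate(coeffs))


coeffs = pc_coefficients(math.exp, degree=6)  # math.exp stands in for the PDE solve
worst = max(abs(surrogate(coeffs, x) - math.exp(x)) for x in (-1.0, -0.5, 0.0, 0.5, 1.0))
print(f"largest surrogate error at the test points: {worst:.2e}")
```

    Inside a posterior sampler, each likelihood evaluation would call `surrogate` instead of the forward model, which is where the acceleration described in the abstract would come from.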

  12. The putative oncogene Pim-1 in the mouse: its linkage and variation among t haplotypes.

    Science.gov (United States)

    Nadeau, J H; Phillips, S J

    1987-11-01

    Pim-1, a putative oncogene involved in T-cell lymphomagenesis, was mapped between the pseudo-alpha globin gene Hba-4ps and the alpha-crystallin gene Crya-1 on mouse chromosome 17 and therefore within the t complex. Pim-1 restriction fragment variants were identified among t haplotypes. Analysis of restriction fragment sizes obtained with 12 endonucleases demonstrated that the Pim-1 genes in some t haplotypes were indistinguishable from the sizes for the Pim-1b allele in BALB/c inbred mice. There are now three genes, Pim-1, Crya-1 and H-2 I-E, that vary among independently derived t haplotypes and that have indistinguishable alleles in t haplotypes and inbred strains. These genes are closely linked within the distal inversion of the t complex. Because it is unlikely that these variants arose independently in t haplotypes and their wild-type homologues, we propose that an exchange of chromosomal segments, probably through double crossing over, was responsible for indistinguishable Pim-1 genes shared by certain t haplotypes and their wild-type homologues. There was, however, no apparent association between variant alleles of these three genes among t haplotypes as would be expected if a single exchange introduced these alleles into t haplotypes. If these variant alleles can be shown to be identical to the wild-type allele, then lack of association suggests that multiple exchanges have occurred during the evolution of the t complex.

  13. A comprehensive literature review of haplotyping software and methods for use with unrelated individuals

    Directory of Open Access Journals (Sweden)

    Salem Rany M

    2005-03-01

    Interest in the assignment and frequency analysis of haplotypes in samples of unrelated individuals has increased immeasurably as a result of the emphasis placed on haplotype analyses by, for example, the International HapMap Project and related initiatives. Although there are many available computer programs for haplotype analysis applicable to samples of unrelated individuals, many of these programs have limitations and/or very specific uses. In this paper, the key features of available haplotype analysis software for use with unrelated individuals, as well as pooled DNA samples from unrelated individuals, are summarised. Programs for haplotype analysis were identified through keyword searches on PUBMED and various internet search engines, a review of citations from retrieved papers and personal communications, up to June 2004. Priority was given to functioning computer programs, rather than theoretical models and methods. The available software was considered in light of a number of factors: the algorithm(s) used, algorithm accuracy, assumptions, the accommodation of genotyping error, implementation of hypothesis testing, handling of missing data, software characteristics and web-based implementations. Review papers comparing specific methods and programs are also summarised. Forty-six haplotyping programs were identified and reviewed. The programs were divided into two groups: those designed for individual genotype data (a total of 43 programs) and those designed for use with pooled DNA samples (a total of three programs). The accuracy of programs using various criteria is assessed and the programs are categorised and discussed in light of: algorithm and method, accuracy, assumptions, genotyping error, hypothesis testing, missing data, software characteristics and web implementation. Many available programs have limitations (eg some cannot accommodate missing data) and/or are designed with specific tasks in mind (eg estimating

  14. The Association of DRD2 with Insight Problem Solving.

    Science.gov (United States)

    Zhang, Shun; Zhang, Jinghuan

    2016-01-01

    Although the insight phenomenon has attracted great attention from psychologists, it is still largely unknown whether its variation in well-functioning human adults has a genetic basis. Several lines of evidence suggest that genes involved in dopamine (DA) transmission might be potential candidates. The present study explored for the first time the association of the dopamine D2 receptor gene (DRD2) with insight problem solving. Fifteen single-nucleotide polymorphisms (SNPs) covering DRD2 were genotyped in 425 unrelated healthy Chinese undergraduates, and were further tested for association with insight problem solving. Both single SNP and haplotype analysis revealed several associations of DRD2 SNPs and haplotypes with insight problem solving. In conclusion, the present study provides the first evidence for the involvement of DRD2 in insight problem solving; future studies are necessary to validate these findings.

  15. Genetic relationships among Native Americans based on β-globin gene cluster haplotype frequencies

    OpenAIRE

    Mousinho-Ribeiro Rita de Cassia; Pante-de-Sousa Gabriella; Santos Eduardo José Melo dos; Guerreiro João Farias

    2003-01-01

    The distribution of β-globin gene haplotypes was studied in 209 Amerindians from eight tribes of the Brazilian Amazon: Asurini from Xingú, Awá-Guajá, Parakanã, Urubú-Kaapór, Zoé, Kayapó (Xikrin from the Bacajá village), Katuena, and Tiriyó. Nine different haplotypes were found, two of which (n. 11 and 13) had not been previously identified in Brazilian indigenous populations. Haplotype 2 (+ - - - -) was the most common in all groups studied, with frequencies varying from 70% to 100%, followed...

  16. Association between β2-adrenoceptor (ADRB2) haplotypes and insulin resistance in PCOS.

    Science.gov (United States)

    Tellechea, Mariana L; Muzzio, Damián O; Iglesias Molli, Andrea E; Belli, Susana H; Graffigna, Mabel N; Levalle, Oscar A; Frechtel, Gustavo D; Cerrone, Gloria E

    2013-04-01

    The aim of this study was to explore β2-adrenoceptor (ADRB2) haplotype associations with phenotypes and quantitative traits related to insulin resistance (IR) and the metabolic syndrome (MS) in a polycystic ovary syndrome (PCOS) population. A secondary purpose was to assess the association between ADRB2 haplotype and PCOS. Genetic polymorphism analysis. Cross-sectional case-control association study. Medical University Hospital and research laboratory. One hundred and sixty-five unrelated women with PCOS and 116 unrelated women without PCOS (control sample). Clinical and biochemical measurements, and ADRB2 genotyping in PCOS patients and control subjects. ADRB2 haplotypes (comprising rs1042711, rs1801704, rs1042713 and rs1042714 in that order), genotyping and statistical analysis to evaluate associations with continuous variables and traits related to IR and MS in a PCOS population. Associations between ADRB2 haplotypes and PCOS were also assessed. We observed an age-adjusted association between ADRB2 haplotype CCGG and lower insulin (P = 0·018) and HOMA (P = 0·008) in the PCOS sample. Interestingly, the expected differences in surrogate measures of IR between cases and controls were not significant in CCGG/CCGG carriers. In the case-control study, genotype CCGG/CCGG was associated with a 14% decrease in PCOS risk (P = 0·043), taking into account confounding variables. Haplotype I (CCGG) has a protective role for IR and MS in PCOS. © 2012 Blackwell Publishing Ltd.

  17. Bayesian Inference Methods for Sparse Channel Estimation

    DEFF Research Database (Denmark)

    Pedersen, Niels Lovmand

    2013-01-01

    This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...

  18. On principles of inductive inference

    OpenAIRE

    Kostecki, Ryszard Paweł

    2011-01-01

    We propose an intersubjective epistemic approach to foundations of probability theory and statistical inference, based on relative entropy and category theory, and aimed to bypass the mathematical and conceptual problems of existing foundational approaches.

  19. Deep Learning for Population Genetic Inference.

    Science.gov (United States)

    Sheehan, Sara; Song, Yun S

    2016-03-01

    Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.

  20. A mixed integer linear programming model to reconstruct phylogenies from single nucleotide polymorphism haplotypes under the maximum parsimony criterion

    Science.gov (United States)

    2013-01-01

    Background: Phylogeny estimation from aligned haplotype sequences has attracted more and more attention in recent years due to its importance in the analysis of many fine-scale genetic data. Its application fields range from medical research, to drug discovery, to epidemiology, to population dynamics. The literature on molecular phylogenetics proposes a number of criteria for selecting a phylogeny from among plausible alternatives. Usually, such criteria can be expressed by means of objective functions, and the phylogenies that optimize them are referred to as optimal. One of the most important estimation criteria is parsimony, which states that the optimal phylogeny T∗ for a set H of n haplotype sequences over a common set of variable loci is the one that satisfies the following requirements: (i) it has the shortest length and (ii) it is such that, for each pair of distinct haplotypes h_i, h_j ∈ H, the sum of the edge weights belonging to the path from h_i to h_j in T∗ is not smaller than the observed number of changes between h_i and h_j. Finding the most parsimonious phylogeny for H involves solving an optimization problem, called the Most Parsimonious Phylogeny Estimation Problem (MPPEP), which is NP-hard in many of its versions. Results: In this article we investigate a recent version of the MPPEP that arises when input data consist of single nucleotide polymorphism haplotypes extracted from a population of individuals on a common genomic region. Specifically, we explore the prospects for improving on the implicit enumeration strategy used in previous work using a novel problem formulation and a series of strengthening valid inequalities and preliminary symmetry breaking constraints to more precisely bound the solution space and accelerate implicit enumeration of possible optimal phylogenies. We present the basic formulation and then introduce a series of provable valid constraints to reduce the solution space. We then prove

  1. Fast Markov chain Monte Carlo sampling for sparse Bayesian inference in high-dimensional inverse problems using L1-type priors

    International Nuclear Information System (INIS)

    Lucka, Felix

    2012-01-01

    Sparsity has become a key concept for solving high-dimensional inverse problems using variational regularization techniques. Recently, using similar sparsity constraints in the Bayesian framework for inverse problems by encoding them in the prior distribution has attracted attention. Important questions about the relation between regularization theory and Bayesian inference still need to be addressed when using sparsity-promoting inversion. A practical obstacle for these examinations is the lack of fast posterior sampling algorithms for sparse, high-dimensional Bayesian inversion. Accessing the full range of Bayesian inference methods requires being able to draw samples from the posterior probability distribution in a fast and efficient way. This is usually done using Markov chain Monte Carlo (MCMC) sampling algorithms. In this paper, we develop and examine a new implementation of a single component Gibbs MCMC sampler for sparse priors relying on L1-norms. We demonstrate that the efficiency of our Gibbs sampler increases when the level of sparsity or the dimension of the unknowns is increased. This property is contrary to the properties of the most commonly applied Metropolis–Hastings (MH) sampling schemes. We demonstrate that the efficiency of MH schemes for L1-type priors dramatically decreases when the level of sparsity or the dimension of the unknowns is increased. Practically, Bayesian inversion for L1-type priors using MH samplers is not feasible at all. As this is commonly believed to be an intrinsic feature of MCMC sampling, the performance of our Gibbs sampler also challenges common beliefs about the applicability of sample-based Bayesian inference.

  2. Analysis of HLA class II haplotypes in the Cayapa Indians of Ecuador: A novel DRB1 allele reveals evidence for convergent evolution and balancing selection at position 86

    Energy Technology Data Exchange (ETDEWEB)

    Titus-Trachtenberg, E.A.; Erlich, H. (Roche Molecular Systems, Alameda, CA (United States)); Rickards, O.; De Stefano, G.F. (Universita di Roma, Rome (Italy))

    1994-07-01

    PCR amplification, oligonucleotide probe typing, and sequencing were used to analyze the HLA class II loci (DRB1, DQA1, DQB1, and DPB1) of an isolated South Amerindian tribe. Here the authors report HLA class II variation, including the identification of a new DRB1 allele, several novel DR/DQ haplotypes, and an unusual distribution of DPB1 alleles, among the Cayapa Indians (N=100) of Ecuador. A general reduction of HLA class II allelic variation in the Cayapa is consistent with a population bottleneck during the colonization of the Americas. The new Cayapa DRB1 allele, DRB1*08042, which arose by a G→T point mutation in the parental DRB1*0802, contains a novel Val codon (GTT) at position 86. The generation of DRB1*08042 (Val-86) from DRB1*0802 (Gly-86) in the Cayapa, by a different mechanism than the (GT→TG) change in the creation of DRB1*08041 (Val-86) from DRB1*0802 in Africa, implicates selection in the convergent evolution of position 86 DRβ variants. The DRB1*08042 allele has not been found in >1,800 Amerindian haplotypes and thus presumably arose after the Cayapa separated from other South American Amerindians. Selection pressure for increased haplotype diversity can be inferred in the generation and maintenance of three new DRB1*08042 haplotypes and several novel DR/DQ haplotypes in this population. The DPB1 allelic distribution in the Cayapa is also extraordinary, with two alleles, DPB1*1401, a very rare allele in North American Amerindian populations, and DPB1*0402, the most common Amerindian DPB1 allele, constituting 89% of the Cayapa DPB1. These data are consistent with the postulated rapid rate of evolution as noted for the class I HLA-B locus of other South American Indians. 34 refs., 2 figs., 2 tabs.

  3. Genetic polymorphisms and haplotypes of the organic cation transporter 1 gene (SLC22A1) in the Xhosa population of South Africa

    Directory of Open Access Journals (Sweden)

    Clifford Jacobs

    2014-06-01

    Human organic cation transporter 1 is primarily expressed in hepatocytes and mediates the electrogenic transport of various endogenous and exogenous compounds, including clinically important drugs. Genetic polymorphisms in the gene coding for human organic cation transporter 1, SLC22A1, are increasingly being recognized as a possible mechanism explaining the variable response to clinical drugs, which are substrates for this transporter. The genotypic and allelic distributions of 19 nonsynonymous and one intronic SLC22A1 single nucleotide polymorphisms were determined in 148 healthy Xhosa participants from South Africa, using a SNAPshot® multiplex assay. In addition, haplotype structure for SLC22A1 was inferred from the genotypic data. The minor allele frequencies for S14F (rs34447885), P341L (rs2282143), V519F (rs78899680), and the intronic variant rs622342 were 1.7%, 8.4%, 3.0%, and 21.6%, respectively. None of the participants carried the variant allele for R61C (rs12208357), C88R (rs55918055), S189L (rs34104736), G220V (rs36103319), P283L (rs4646277), R287G (rs4646278), G401S (rs34130495), M440I (rs35956182), or G465R (rs34059508). In addition, no variant alleles were observed for A306T (COSM164365), A413V (rs144322387), M420V (rs142448543), I421F (rs139512541), C436F (rs139512541), V501E (rs143175763), or I542V (rs137928512) in the population. Eight haplotypes were inferred from the genotypic data. This study reports important genetic data that could be useful for future pharmacogenetic studies of drug transporters in the indigenous Sub-Saharan African populations.

  4. State-Space Inference and Learning with Gaussian Processes

    OpenAIRE

    Turner, R; Deisenroth, MP; Rasmussen, CE

    2010-01-01

    State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model. C...

  5. Genetic variation of the greenhouse whitefly, Trialeurodes vaporariorum (Hemiptera: Aleyrodidae), among populations from Serbia and neighbouring countries, as inferred from COI sequence variability.

    Science.gov (United States)

    Prijović, M; Skaljac, M; Drobnjaković, T; Zanić, K; Perić, P; Marčić, D; Puizina, J

    2014-06-01

    The greenhouse whitefly Trialeurodes vaporariorum Westwood, 1856 (Hemiptera: Aleyrodidae) is an invasive and highly polyphagous phloem-feeding pest of vegetables and ornamentals. Trialeurodes vaporariorum causes serious damage due to direct feeding and transmits several important plant viruses. Excessive use of insecticides has resulted in significantly reduced levels of susceptibility of various T. vaporariorum populations. To determine the genetic variability within and among populations of T. vaporariorum from Serbia and to explore their genetic relatedness with other T. vaporariorum populations, we analysed the mitochondrial cytochrome c oxidase I (COI) sequences of 16 populations from Serbia and six populations from neighbouring countries: Montenegro (three populations), Macedonia (one population) and Croatia (two populations), for a total of 198 analysed specimens. A low overall level of sequence divergence and only five variable nucleotides and six haplotypes were found. The most frequent haplotype, H1, was identified in all Serbian populations and in all specimens from distant localities in Croatia and Macedonia. The COI sequence data retrieved from GenBank and the data from our study indicated that H1 is the most globally widespread T. vaporariorum haplotype. A lack of spatial genetic structure among the studied T. vaporariorum populations, as well as two demographic tests that we performed (Tajima's D value and Fu's Fs statistics), indicate a recent colonisation event and population growth. Phylogenetic analyses of the COI haplotypes in this study and other T. vaporariorum haplotypes that were retrieved from GenBank were performed using Bayesian inference and median-joining (MJ) network analysis. Two major haplogroups with only a single unique nucleotide difference were found: haplogroup 1 (containing the five Serbian haplotypes and those previously identified in India, China, the Netherlands, the United Kingdom, Morocco, Reunion and the USA) and haplogroup 3

  6. Haplotypes of the porcine peroxisome proliferator-activated receptor delta gene are associated with backfat thickness

    Directory of Open Access Journals (Sweden)

    Blöcker Helmut

    2009-11-01

    Background: Peroxisome proliferator-activated receptor delta belongs to the nuclear receptor superfamily of ligand-inducible transcription factors. It is a key regulator of lipid metabolism. The peroxisome proliferator-activated receptor delta gene (PPARD) has been assigned to a region on porcine chromosome 7, which harbours a quantitative trait locus for backfat. Thus, PPARD is considered a functional and positional candidate gene for backfat thickness. The purpose of this study was to test this candidate gene hypothesis in a cross of breeds that were highly divergent in lipid deposition characteristics. Results: Screening for genetic variation in porcine PPARD revealed only silent mutations. Nevertheless, significant associations between PPARD haplotypes and backfat thickness were observed in the F2 generation of the Mangalitsa × Piétrain cross as well as a commercial German Landrace population. Haplotype 5 is associated with increased backfat in F2 Mangalitsa × Piétrain pigs, whereas haplotype 4 is associated with lower backfat thickness in the German Landrace population. Haplotypes 4 and 5 carry the same alleles at all but one SNP. Interestingly, the opposite effects of PPARD haplotypes 4 and 5 on backfat thickness are reflected by opposite effects of these two haplotypes on PPAR-δ mRNA levels. Haplotype 4 significantly increases PPAR-δ mRNA levels, whereas haplotype 5 decreases mRNA levels of PPAR-δ. Conclusion: This study provides evidence for an association between PPARD and backfat thickness. The association is substantiated by mRNA quantification. Further studies are required to clarify whether the observed associations are caused by PPARD or are the result of linkage disequilibrium with a causal variant in a neighbouring gene.

  7. Novel Harmful Recessive Haplotypes Identified for Fertility Traits in Nordic Holstein Cattle

    Science.gov (United States)

    Sahana, Goutam; Nielsen, Ulrik Sander; Aamand, Gert Pedersen; Lund, Mogens Sandø; Guldbrandtsen, Bernt

    2013-01-01

    Using genomic data, lethal recessives may be discovered from haplotypes that are common in the population but never occur in the homozygous state in live animals. This approach only requires genotype data from phenotypically normal (i.e. live) individuals and not from the affected embryos that die. A total of 7,937 Nordic Holstein animals were genotyped with the BovineSNP50 BeadChip, and haplotypes including 25 consecutive markers were constructed and tested for the absence of homozygotes. We have identified 17 homozygote-deficient haplotypes, which could be loosely clustered into eight genomic regions harboring possible recessive lethal alleles. Effects of the identified haplotypes were estimated on two fertility traits: non-return rates and calving interval. Out of the eight identified genomic regions, six regions were confirmed as having an effect on fertility. The information can be used to avoid carrier-by-carrier matings in practical animal breeding. Further, identification of causative genes/polymorphisms responsible for lethal effects will lead to accurate testing of the individuals carrying a lethal allele. PMID:24376603
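
    The homozygote-deficiency idea in this record can be illustrated with a small calculation. The sketch below assumes random mating, so that a haplotype with population frequency p is expected to be homozygous in a fraction p² of live animals; this is a simplification for illustration and not necessarily the expectation used by the authors. The function names and the example frequency of 0.05 are hypothetical.

    ```python
    def expected_homozygotes(n_animals: int, hap_freq: float) -> float:
        """Expected number of homozygous animals under random mating (assumption)."""
        return n_animals * hap_freq ** 2


    def prob_zero_homozygotes(n_animals: int, hap_freq: float) -> float:
        """Probability of seeing no homozygotes if the haplotype were harmless.

        Assumes each animal is homozygous independently with probability p**2.
        """
        return (1.0 - hap_freq ** 2) ** n_animals


    if __name__ == "__main__":
        n = 7937   # number of genotyped animals reported in the abstract
        p = 0.05   # hypothetical haplotype frequency, for illustration only
        print("expected homozygotes:", expected_homozygotes(n, p))
        print("P(no homozygotes | harmless):", prob_zero_homozygotes(n, p))
    ```

    Under these assumptions, a haplotype that is reasonably common yet never observed in the homozygous state is very unlikely to be harmless, which is the signal the screen looks for.
    
    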

  8. Common ADRB2 haplotypes derived from 26 polymorphic sites direct beta2-adrenergic receptor expression and regulation phenotypes.

    Directory of Open Access Journals (Sweden)

    Alfredo Panebra

    2010-07-01

    The beta2-adrenergic receptor (beta2AR) is expressed on numerous cell-types including airway smooth muscle cells and cardiomyocytes. Drugs (agonists or antagonists) acting at these receptors for treatment of asthma, chronic obstructive pulmonary disease, and heart failure show substantial interindividual variability in response. The ADRB2 gene is polymorphic in noncoding and coding regions, but virtually all ADRB2 association studies have utilized the two common nonsynonymous coding SNPs, often reaching discrepant conclusions. We constructed the 8 common ADRB2 haplotypes derived from 26 polymorphisms in the promoter, 5'UTR, coding, and 3'UTR of the intronless ADRB2 gene. These were cloned into an expression construct lacking a vector-based promoter, so that beta2AR expression was driven by its promoter, and steady state expression could be modified by polymorphisms throughout ADRB2 within a haplotype. "Whole-gene" transfections were performed with COS-7 cells and revealed 4 haplotypes with increased cell surface beta2AR protein expression compared to the others. Agonist-promoted downregulation of beta2AR protein expression was also haplotype-dependent, and was found to be increased for 2 haplotypes. A phylogenetic tree of the haplotypes was derived and annotated by cellular phenotypes, revealing a pattern potentially driven by expression. Thus for obstructive lung disease, the initial bronchodilator response from intermittent administration of beta-agonist may be influenced by certain beta2AR haplotypes (expression phenotypes), while other haplotypes may influence tachyphylaxis during the response to chronic therapy (downregulation phenotypes). An ideal clinical outcome of high expression and less downregulation was found for two haplotypes. Haplotypes may also affect heart failure antagonist therapy, where beta2AR increase inotropy and are anti-apoptotic. The haplotype-specific expression and regulation phenotypes found in this transfection

  9. Vitamin D Receptor Gene Polymorphisms and Haplotypes in Hungarian Patients with Idiopathic Inflammatory Myopathy

    Directory of Open Access Journals (Sweden)

    Levente Bodoki

    2015-01-01

    Idiopathic inflammatory myopathies are autoimmune diseases characterized by symmetrical proximal muscle weakness. Our aim was to identify a correlation between VDR polymorphisms or haplotypes and myositis. We studied VDR-BsmI, VDR-ApaI, VDR-TaqI, and VDR-FokI polymorphisms and haplotypes in 89 Hungarian poly-/dermatomyositis patients (69 females) and 93 controls (52 females). We did not obtain any significant differences for VDR-FokI, BsmI, ApaI, and TaqI genotypes and allele frequencies between patients with myositis and healthy individuals. There was no association of VDR polymorphisms with clinical manifestations and laboratory profiles in myositis patients. Men with myositis had a significantly different distribution of BB, Bb, and bb genotypes than female patients, control male individuals, and the entire control group. The distribution of TT, Tt, and tt genotypes was significantly different in males than in females in the patient group. According to four-marker haplotype prevalence, frequencies of sixteen possible haplotypes showed significant differences between patient and control groups. The three most frequent haplotypes in patients were fbAt, FBaT, and fbAT. Our findings may reveal that there is a significant association: Bb and Tt genotypes can be associated with myositis in the Hungarian population we studied. We underline the importance of our result in the estimated prevalence of four-marker haplotypes.
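
    The count of sixteen possible four-marker haplotypes follows from two alleles at each of four sites (2⁴ = 16). A minimal sketch that enumerates them is given below; the marker order FokI, BsmI, ApaI, TaqI is an assumption inferred from the haplotype labels in the abstract (for example fbAt), and the names MARKERS and all_haplotypes are hypothetical.

    ```python
    from itertools import product

    # Two alleles per restriction site; order assumed to be FokI, BsmI, ApaI, TaqI.
    MARKERS = ("Ff", "Bb", "Aa", "Tt")


    def all_haplotypes(markers=MARKERS):
        """Return every combination of one allele per marker as a label string."""
        return ["".join(alleles) for alleles in product(*markers)]


    if __name__ == "__main__":
        haplotypes = all_haplotypes()
        print(len(haplotypes))
        print(haplotypes)
    ```
    
    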

  10. MCP1 haplotypes associated with protection from pulmonary tuberculosis

    Directory of Open Access Journals (Sweden)

    Owusu-Dabo Ellis

    2011-04-01

    Background: The monocyte chemoattractant protein 1 (MCP-1) is involved in the recruitment of lymphocytes and monocytes and their migration to sites of injury and cellular immune reactions. In a Ghanaian tuberculosis (TB) case-control study group, associations of the MCP1 -362C and the MCP1 -2581G alleles with resistance to TB were recently described. The latter association was in contrast to genetic effects previously described in study groups originating from Mexico, Korea, Peru and Zambia. This inconsistency prompted us to further investigate the MCP1 gene in order to determine causal variants or haplotypes genetically and functionally. Results: A 14 base-pair deletion in the first MCP1 intron, int1del554-567, was strongly associated with protection against pulmonary TB (OR = 0.84, CI 0.77-0.92, Pcorrected = 0.00098). Compared to the wildtype combination, a haplotype comprising the -2581G and -362C promoter variants and the intronic deletion conferred an even stronger protection than did the -362C variant alone (OR = 0.78, CI 0.69-0.87, Pnominal = 0.00002; adjusted Pglobal = 0.0028). In a luciferase reporter gene assay, a significant reduction of luciferase gene expression was observed in the two constructs carrying the MCP1 mutations -2581 A or G plus the combination -362C and int1del554-567 compared to the wildtype haplotype (P = 0.02 and P = 0.006). The associated variants, in particular the haplotypes composed of these latter variants, result in decreased MCP-1 expression and a decreased risk of pulmonary TB. Conclusions: In addition to the results of the previous study of the Ghanaian TB case-control sample, we have now identified the haplotype combination -2581G/-362C/int1del554-567 that mediates considerably stronger protection than does the MCP1 -362C allele alone (OR = 0.78, CI 0.69-0.87 vs OR = 0.83, CI 0.76-0.91). Our findings in both the genetic analysis and the reporter gene study further indicate a largely negligible role of the

  11. Extended HLA-D region haplotype associated with celiac disease

    Energy Technology Data Exchange (ETDEWEB)

    Howell, M.D.; Smith, J.R.; Austin, R.K.; Kelleher, D.; Nepom, G.T.; Volk, B.; Kagnoff, M.F.

    1988-01-01

    Celiac disease has one of the strongest associations with HLA (human leukocyte antigen) class II markers of the known HLA-linked diseases. This association is primarily with the class II serologic specificities HLA-DR3 and -DQw2. The authors previously described a restriction fragment length polymorphism (RFLP) characterized by the presence of a 4.0-kilobase Rsa I fragment derived from an HLA class II β-chain gene, which distinguishes the class II HLA haplotype of celiac disease patients from those of many serologically matched controls. They now report the isolation of this β-chain gene from a bacteriophage genomic library constructed from the DNA of a celiac disease patient. Based on restriction mapping and differential hybridization with class II cDNA and oligonucleotide probes, this gene was identified as one encoding an HLA-DP β-chain. This celiac disease-associated HLA-DP β-chain gene was flanked by HLA-DP α-chain genes and, therefore, was probably in its normal chromosomal location. The HLA-DP α-chain genes of celiac disease patients also were studied by RFLP analysis. Celiac disease is associated with a subset of HLA-DR3, -DQw2 haplotypes characterized by HLA-DP α- and β-chain gene RFLPs. Within the celiac-disease patient population, the joint segregation of these HLA-DP genes with those encoding the serologic specificities HLA-DR3 and -DQw2 indicates: (i) that the class II HLA haplotype associated with celiac disease is extended throughout the entire HLA-D region, and (ii) that celiac-disease susceptibility genes may reside as far centromeric on this haplotype as the HLA-DP subregion.

  12. Extended HLA-D region haplotype associated with celiac disease

    International Nuclear Information System (INIS)

    Howell, M.D.; Smith, J.R.; Austin, R.K.; Kelleher, D.; Nepom, G.T.; Volk, B.; Kagnoff, M.F.

    1988-01-01

    Celiac disease has one of the strongest associations with HLA (human leukocyte antigen) class II markers of the known HLA-linked diseases. This association is primarily with the class II serologic specificities HLA-DR3 and -DQw2. The authors previously described a restriction fragment length polymorphism (RFLP) characterized by the presence of a 4.0-kilobase Rsa I fragment derived from an HLA class II β-chain gene, which distinguishes the class II HLA haplotype of celiac disease patients from those of many serologically matched controls. They now report the isolation of this β-chain gene from a bacteriophage genomic library constructed from the DNA of a celiac disease patient. Based on restriction mapping and differential hybridization with class II cDNA and oligonucleotide probes, this gene was identified as one encoding an HLA-DP β-chain. This celiac disease-associated HLA-DP β-chain gene was flanked by HLA-DP α-chain genes and, therefore, was probably in its normal chromosomal location. The HLA-DPα-chain genes of celiac disease patients also were studied by RFLP analysis. Celiac disease is associated with a subset of HLA-DR3, -DQw2 haplotypes characterized by HLA-DP α- and β-chain gene RFLPs. Within the celiac-disease patient population, the joint segregation of these HLA-DP genes with those encoding the serologic specificities HLA-DR3 and -DQw2 indicates: (i) that the class II HLA haplotype associated with celiac disease is extended throughout the entire HLA-D region, and (ii) that celiac-disease susceptibility genes may reside as far centromeric on this haplotype as the HLA-DP subregion

  13. Mutation Analysis in Classical Phenylketonuria Patients Followed by Detecting Haplotypes Linked to Some PAH Mutations.

    Science.gov (United States)

    Dehghanian, Fatemeh; Silawi, Mohammad; Tabei, Seyed M B

    2017-02-01

    Deficiency of phenylalanine hydroxylase (PAH) enzyme and elevation of phenylalanine in body fluids cause phenylketonuria (PKU). The gold standard for confirming PKU and PAH deficiency is detecting causal mutations by direct sequencing of the coding exons and splicing involved sequences of the PAH gene. Furthermore, haplotype analysis could be considered as an auxiliary approach for detecting PKU causative mutations before direct sequencing of the PAH gene by making comparisons between prior detected mutation linked-haplotypes and new PKU case haplotypes with undetermined mutations. In this study, 13 unrelated classical PKU patients took part in the study detecting causative mutations. Mutations were identified by polymerase chain reaction (PCR) and direct sequencing in all patients. After that, haplotype analysis was performed by studying VNTR and PAHSTR markers (linked genetic markers of the PAH gene) through application of PCR and capillary electrophoresis (CE). Mutation analysis was performed successfully and the detected mutations were as follows: c.782G>A, c.754C>T, c.842C>G, c.113-115delTCT, c.688G>A, and c.696A>G. Additionally, PAHSTR/VNTR haplotypes were detected to discover haplotypes linked to each mutation. Mutation detection is the best approach for confirming PAH enzyme deficiency in PKU patients. Due to the relatively large size of the PAH gene and high cost of the direct sequencing in developing countries, haplotype analysis could be used before DNA sequencing and mutation detection for a faster and cheaper way via identifying probable mutated exons.

  14. Divergence at the casein haplotypes in dairy and meat goat breeds.

    Science.gov (United States)

    Küpper, Julia; Chessa, Stefania; Rignanese, Daniela; Caroli, Anna; Erhardt, Georg

    2010-02-01

    Casein genes have been proved to have an influence on milk properties, and are in addition appropriate for phylogeny studies. A large number of casein polymorphisms exist in goats, making their analysis quite complex. The four casein loci were analyzed by molecular techniques for genetic polymorphism detection in the two dairy goat breeds Bunte Deutsche Edelziege (BDE; n=96), Weisse Deutsche Edelziege (WDE; n=91), and the meat goat breed Buren (n=75). Of the 35 analyzed alleles, 18 were found in BDE, and 17 in Buren goats and WDE. In addition, a new allele was identified at the CSN1S1 locus in the BDE, showing a frequency of 0.05. This variant, named CSN1S1*A', is characterized by a t-->c transversion in intron 9. Linkage disequilibrium was found at the casein haplotype in all three breeds. A total of 30 haplotypes showed frequencies higher than 0.01. In the Buren breed only one haplotype showed a frequency higher than 0.1. The ancestral haplotype B-A-A-B (in the order: CSN1S1-CSN2-CSN1S2-CSN3) occurred in all three breeds, showing a very high frequency (>0.8) in the Buren.

  15. HLA alleles and haplotypes in Burmese (Myanmarese) and Karen in Thailand.

    Science.gov (United States)

    Kongmaroeng, C; Romphruk, A; Puapairoj, C; Leelayuwat, C; Kulski, J K; Inoko, H; Dunn, D S; Romphruk, A V

    2015-09-01

    This is the first report on human leukocyte antigen (HLA) allele and haplotype frequencies at three class I loci and two class II loci in unrelated healthy individuals from two ethnic groups, 170 Burmese and 200 Karen, originally from Burma (Myanmar), but sampled while residing in Thailand. Overall, the HLA allele and haplotype frequencies detected by polymerase chain reaction sequence-specific primer (PCR-SSP) at five loci (A, B, C, DRB1 and DQB1) at low resolution showed distinct differences between the Burmese and Karen. In Burmese, five HLA-B*15 haplotypes with different HLA-A and HLA-DR/DQ combinations were detected, three of which had not previously been reported in other Asian populations. The data are important in the fields of anthropology, transplantation and disease-association studies. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  16. An alternative empirical likelihood method in missing response problems and causal inference.

    Science.gov (United States)

    Ren, Kaili; Drummond, Christopher A; Brewster, Pamela S; Haller, Steven T; Tian, Jiang; Cooper, Christopher J; Zhang, Biao

    2016-11-30

    Missing responses are common problems in medical, social, and economic studies. When responses are missing at random, a complete case data analysis may result in biases. A popular debias method is inverse probability weighting proposed by Horvitz and Thompson. To improve efficiency, Robins et al. proposed an augmented inverse probability weighting method. The augmented inverse probability weighting estimator has a double-robustness property and achieves the semiparametric efficiency lower bound when the regression model and propensity score model are both correctly specified. In this paper, we introduce an empirical likelihood-based estimator as an alternative to Qin and Zhang (2007). Our proposed estimator is also doubly robust and locally efficient. Simulation results show that the proposed estimator has better performance when the propensity score is correctly modeled. Moreover, the proposed method can be applied in the estimation of average treatment effect in observational causal inferences. Finally, we apply our method to an observational study of smoking, using data from the Cardiovascular Outcomes in Renal Atherosclerotic Lesions clinical trial. Copyright © 2016 John Wiley & Sons, Ltd.
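
    The two weighting estimators named in this abstract can be sketched as follows. This is a minimal illustration only, not the empirical likelihood estimator the paper proposes; the function names and the arguments `propensity` (estimated probability that a response is observed) and `outcome_pred` (regression prediction of the response) are assumptions introduced here.

```python
import numpy as np

def ipw_mean(y, observed, propensity):
    """Inverse probability weighted estimate of the mean response.

    y          : responses (entries for missing cases are ignored, may be NaN)
    observed   : boolean array, True where the response was observed
    propensity : estimated P(observed | covariates), assumed > 0
    """
    observed = np.asarray(observed, dtype=bool)
    y = np.where(observed, np.asarray(y, dtype=float), 0.0)
    return float(np.mean(observed * y / np.asarray(propensity, dtype=float)))

def aipw_mean(y, observed, propensity, outcome_pred):
    """Augmented IPW estimate: outcome-model prediction plus a weighted residual.

    The estimate stays consistent if either the propensity model or the
    outcome model is correct, which is the double-robustness property.
    """
    observed = np.asarray(observed, dtype=bool)
    y = np.where(observed, np.asarray(y, dtype=float), 0.0)
    outcome_pred = np.asarray(outcome_pred, dtype=float)
    residual = observed * (y - outcome_pred) / np.asarray(propensity, dtype=float)
    return float(np.mean(outcome_pred + residual))
```

    In practice both `propensity` and `outcome_pred` would come from fitted models (for example a logistic and a linear regression); here they are passed in directly to keep the sketch short.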

  17. Deep Learning for Population Genetic Inference.

    Directory of Open Access Journals (Sweden)

    Sara Sheehan

    2016-03-01

    Full Text Available Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.

  18. Deep Learning for Population Genetic Inference

    Science.gov (United States)

    Sheehan, Sara; Song, Yun S.

    2016-01-01

    Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme. PMID:27018908

  19. Inferring the Clonal Structure of Viral Populations from Time Series Sequencing.

    Directory of Open Access Journals (Sweden)

    Donatien F Chedom

    2015-11-01

    Full Text Available RNA virus populations will undergo processes of mutation and selection resulting in a mixed population of viral particles. High throughput sequencing of a viral population subsequently contains a mixed signal of the underlying clones. We would like to identify the underlying evolutionary structures. We utilize two sources of information to attempt this; within segment linkage information, and mutation prevalence. We demonstrate that clone haplotypes, their prevalence, and maximum parsimony reticulate evolutionary structures can be identified, although the solutions may not be unique, even for complete sets of information. This is applied to a chain of influenza infection, where we infer evolutionary structures, including reassortment, and demonstrate some of the difficulties of interpretation that arise from deep sequencing due to artifacts such as template switching during PCR amplification.

  20. Parametric inference for biological sequence analysis.

    Science.gov (United States)

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
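
    As a concrete instance of the sum-product algorithm mentioned in this abstract, the forward recursion for a hidden Markov model can be sketched as below. This covers only the ordinary sum-product computation, not polytope propagation; the function name and argument layout are assumptions chosen for illustration.

```python
import numpy as np

def forward_likelihood(init, trans, emit, observations):
    """Probability of an observation sequence under a discrete HMM.

    init[i]     : P(first hidden state = i)
    trans[i, j] : P(next state = j | current state = i)
    emit[i, k]  : P(symbol k | state i)
    observations: non-empty sequence of integer symbol indices
    """
    init, trans, emit = (np.asarray(a, dtype=float) for a in (init, trans, emit))
    # Message at the first position: prior times emission probability.
    alpha = init * emit[:, observations[0]]
    for symbol in observations[1:]:
        # Sum over previous states (alpha @ trans), then multiply by the emission term.
        alpha = (alpha @ trans) * emit[:, symbol]
    # Summing the final message marginalizes out the last hidden state.
    return float(alpha.sum())
```

    Replacing the sum over previous states with a maximum gives the score of the single most probable state path, which is the maximum a posteriori calculation whose parametric behavior the article analyzes.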

  1. Mitochondrial Haplotype Diversity in Zambian Lions: Bridging a Gap in the Biogeography of an Iconic Species.

    Science.gov (United States)

    Curry, Caitlin J; White, Paula A; Derr, James N

    2015-01-01

    Analysis of DNA sequence diversity at the 12S to 16S mitochondrial genes of 165 African lions (Panthera leo) from five main areas in Zambia has uncovered haplotypes which link Southern Africa with East Africa. Phylogenetic analysis suggests Zambia may serve as a bridge connecting the lion populations in southern Africa to eastern Africa, supporting earlier hypotheses that eastern-southern Africa may represent the evolutionary cradle for the species. Overall gene diversity throughout the Zambian lion population was 0.7319 +/- 0.0174 with eight haplotypes found; three haplotypes previously described and the remaining five novel. The addition of these five novel haplotypes, so far only found within Zambia, nearly doubles the number of haplotypes previously reported for any given geographic location of wild lions. However, based on an AMOVA analysis of these haplotypes, there is little to no matrilineal gene flow (Fst = 0.47) when the eastern and western regions of Zambia are considered as two regional sub-populations. Crossover haplotypes (H9, H11, and Z1) appear in both populations as rare in one but common in the other. This pattern is a possible result of the lion mating system in which predominately males disperse, as all individuals with crossover haplotypes were male. The determination and characterization of lion sub-populations, such as done in this study for Zambia, represent a higher-resolution of knowledge regarding both the genetic health and connectivity of lion populations, which can serve to inform conservation and management of this iconic species.
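
    A gene (haplotype) diversity figure such as the one reported above is commonly computed with Nei's unbiased estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies). The sketch below assumes that estimator; the abstract does not state which formula or software the authors used, and the haplotype labels in the example are hypothetical.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype (gene) diversity for a sample of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotypes)
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    # n / (n - 1) corrects the bias of the plug-in estimate for small samples.
    return n / (n - 1) * (1.0 - sum_sq)

# Hypothetical sample: one label per individual.
sample = ["H1", "H1", "H2", "H2"]
print(round(haplotype_diversity(sample), 4))
```

    The value is 0 when every individual carries the same haplotype and 1 when every sampled individual carries a different one.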

  2. Mitochondrial Haplotype Diversity in Zambian Lions: Bridging a Gap in the Biogeography of an Iconic Species.

    Directory of Open Access Journals (Sweden)

    Caitlin J Curry

    Full Text Available Analysis of DNA sequence diversity at the 12S to 16S mitochondrial genes of 165 African lions (Panthera leo) from five main areas in Zambia has uncovered haplotypes which link Southern Africa with East Africa. Phylogenetic analysis suggests Zambia may serve as a bridge connecting the lion populations in southern Africa to eastern Africa, supporting earlier hypotheses that eastern-southern Africa may represent the evolutionary cradle for the species. Overall gene diversity throughout the Zambian lion population was 0.7319 +/- 0.0174 with eight haplotypes found; three haplotypes previously described and the remaining five novel. The addition of these five novel haplotypes, so far only found within Zambia, nearly doubles the number of haplotypes previously reported for any given geographic location of wild lions. However, based on an AMOVA analysis of these haplotypes, there is little to no matrilineal gene flow (Fst = 0.47) when the eastern and western regions of Zambia are considered as two regional sub-populations. Crossover haplotypes (H9, H11, and Z1) appear in both populations as rare in one but common in the other. This pattern is a possible result of the lion mating system in which predominately males disperse, as all individuals with crossover haplotypes were male. The determination and characterization of lion sub-populations, such as done in this study for Zambia, represent a higher-resolution of knowledge regarding both the genetic health and connectivity of lion populations, which can serve to inform conservation and management of this iconic species.

  3. Deletion analysis of male sterility effects of t-haplotypes in the mouse.

    Science.gov (United States)

    Bennett, D; Artzt, K

    1990-01-01

    We present data on the effects of three chromosome 17 deletions on transmission ratio distortion (TRD) and sterility of several t-haplotypes. All three deletions have similar effects on male TRD: that is, Tdel/tcomplete genotypes all transmit their t-haplotype in very high proportion. However, each deletion has different effects on sterility of heterozygous males, with TOr/t being fertile, Thp/t less fertile, and TOrl/t still less fertile. These data suggest that wild-type genes on chromosomes homologous to t-haplotypes can be important regulators of both TRD and fertility in males, and that the wild-type genes concerned with TRD and fertility are at least to some extent different. The data also provide a rough map of the positions of these genes.

  4. Self-compatible peach (Prunus persica) has mutant versions of the S haplotypes found in self-incompatible Prunus species.

    Science.gov (United States)

    Tao, Ryutaro; Watari, Akiko; Hanada, Toshio; Habu, Tsuyoshi; Yaegaki, Hideaki; Yamaguchi, Masami; Yamane, Hisayo

    2007-01-01

    This study demonstrates that self-compatible (SC) peach has mutant versions of S haplotypes that are present in self-incompatible (SI) Prunus species. All three peach S haplotypes, S (1), S (2), and S (2m), found in this study encode mutated pollen determinants, SFB, while only S (2m) has a mutation that affects the function of the pistil determinant S-RNase. A cysteine residue in the C5 domain of the S (2m)-RNase is substituted by a tyrosine residue, thereby reducing RNase stability. The peach SFB mutations are similar to the SFB mutations found in SC haplotypes of sweet cherry (P. avium) and Japanese apricot (P. mume). SFB (1) of the S (1) haplotype, a mutant version of almond (P. dulcis) S (k) haplotype, encodes truncated SFB due to a 155 bp insertion. SFB (2) of the S (2) and S (2m) haplotypes, both of which are mutant versions of the S (a) haplotype in Japanese plum (P. salicina), encodes a truncated SFB due to a 5 bp insertion. Thus, regardless of the functionality of the pistil determinant, all three peach S haplotypes are SC haplotypes. Our finding that peach has mutant versions of S haplotypes that function in almond and Japanese plum, which are phylogenetically close and remote species, respectively, to peach in the subfamily Prunoideae of the Rosaceae, provides insight into the SC/SI evolution in Prunus. We discuss the significance of SC pollen part mutation in peach with special reference to possible differences in the SI mechanisms between Prunus and Solanaceae.

  5. An ancestral haplotype of the human PERIOD2 gene associates with reduced sensitivity to light-induced melatonin suppression.

    Directory of Open Access Journals (Sweden)

    Tokiho Akiyama

    Full Text Available Humans show various responses to the environmental stimulus in individual levels as "physiological variations." However, it has been unclear if these are caused by genetic variations. In this study, we examined the association between the physiological variation of response to light-stimulus and genetic polymorphisms. We collected physiological data from 43 subjects, including light-induced melatonin suppression, and performed haplotype analyses on the clock genes, PER2 and PER3, exhibiting geographical differentiation of allele frequencies. Among the haplotypes of PER3, no significant difference in light sensitivity was found. However, three common haplotypes of PER2 accounted for more than 96% of the chromosomes in subjects, and 1 of those 3 had a significantly low-sensitive response to light-stimulus (P < 0.05). The homozygote of the low-sensitive PER2 haplotype showed significantly lower percentages of melatonin suppression (P < 0.05), and the heterozygotes of the haplotypes varied their ratios, indicating that the physiological variation for light-sensitivity is evidently related to the PER2 polymorphism. Compared with global haplotype frequencies, the haplotype with a low-sensitive response was more frequent in Africans than in non-Africans, and came to the root in the phylogenetic tree, suggesting that the low light-sensitive haplotype is the ancestral type, whereas the other haplotypes with high sensitivity to light are the derived types. Hence, we speculate that the high light-sensitive haplotypes have spread throughout the world after the Out-of-Africa migration of modern humans.

  6. Casein haplotypes and their association with milk production traits in Norwegian Red cattle

    Directory of Open Access Journals (Sweden)

    Nome Torfinn

    2009-02-01

    Full Text Available Abstract A high resolution SNP map was constructed for the bovine casein region to identify haplotype structures and study associations with milk traits in Norwegian Red cattle. Our analyses suggest separation of the casein cluster into two haplotype blocks, one consisting of the CSN1S1, CSN2 and CSN1S2 genes and another one consisting of the CSN3 gene. Highly significant associations with both protein and milk yield were found for both single SNPs and haplotypes within the CSN1S1-CSN2-CSN1S2 haplotype block. In contrast, no significant association was found for single SNPs or haplotypes within the CSN3 block. Our results point towards CSN2 and CSN1S2 as the most likely loci harbouring the underlying causative DNA variation. In our study, the most significant results were found for the SNP CSN2_67 with the C allele consistently associated with both higher protein and milk yields. CSN2_67 calls a C to an A substitution at codon 67 in β-casein gene resulting in histidine replacing proline in the amino acid sequence. This polymorphism determines the protein variants A1/B (CSN2_67 A allele) versus A2/A3 (CSN2_67 C allele). Other studies have suggested that a high consumption of A1/B milk may affect human health by increasing the risk of diabetes and heart diseases. Altogether these results argue for an increase in the frequency of the CSN2_67 C allele or haplotypes containing this allele in the Norwegian Red cattle population by selective breeding.

  7. Unique haplotypes of cacao trees as revealed by trnH-psbA chloroplast DNA

    Directory of Open Access Journals (Sweden)

    Nidia Gutiérrez-López

    2016-04-01

    Full Text Available Cacao trees have been cultivated in Mesoamerica for at least 4,000 years. In this study, we analyzed sequence variation in the chloroplast DNA trnH-psbA intergenic spacer from 28 cacao trees from different farms in the Soconusco region in southern Mexico. Genetic relationships were established by two analysis approaches based on geographic origin (five populations) and genetic origin (based on a previous study). We identified six polymorphic sites, including five insertion/deletion (indel) types and one transversion. The overall nucleotide diversity was low for both approaches (geographic = 0.0032 and genetic = 0.0038). Conversely, we obtained moderate to high haplotype diversity (0.66 and 0.80) with 10 and 12 haplotypes, respectively. The common haplotype (H1) for both networks included cacao trees from all geographic locations (geographic approach) and four genetic groups (genetic approach). This common haplotype (ancient) derived a set of intermediate haplotypes and singletons interconnected by one or two mutational steps, which suggested directional selection and event purification from the expansion of narrow populations. Cacao trees from Soconusco region were grouped into one cluster without any evidence of subclustering based on AMOVA (FST = 0) and SAMOVA (FST = 0.04393) results. One population (Mazatán) showed a high haplotype frequency; thus, this population could be considered an important reservoir of genetic material. The indels located in the trnH-psbA intergenic spacer of cacao trees could be useful as markers for the development of DNA barcoding.

  8. Mineralocorticoid receptor haplotype moderates the effects of oral contraceptives and menstrual cycle on emotional information processing.

    Science.gov (United States)

    Hamstra, Danielle A; de Kloet, E Ronald; Tollenaar, Marieke; Verkuil, Bart; Manai, Meriem; Putman, Peter; Van der Does, Willem

    2016-10-01

    The processing of emotional information is affected by menstrual cycle phase and by the use of oral contraceptives (OCs). The stress hormone cortisol is known to affect emotional information processing via the limbic mineralocorticoid receptor (MR). We investigated in an exploratory study whether the MR-genotype moderates the effect of both OC-use and menstrual cycle phase on emotional cognition. Healthy premenopausal volunteers (n=93) of West-European descent completed a battery of emotional cognition tests. Forty-nine participants were OC users and 44 naturally cycling, 21 of whom were tested in the early follicular (EF) and 23 in the mid-luteal (ML) phase of the menstrual cycle. In MR-haplotype 1/3 carriers, ML women gambled more than EF women when their risk to lose was relatively small. In MR-haplotype 2, ML women gambled more than EF women, regardless of their odds of winning. OC-users with MR-haplotype 1/3 recognised fewer facial expressions than ML women with MR-haplotype 1/3. MR-haplotype 1/3 carriers may be more sensitive to the influence of their female hormonal status. MR-haplotype 2 carriers showed more risky decision-making. As this may reflect optimistic expectations, this finding may support previous observations in female carriers of MR-haplotype 2 in a naturalistic cohort study. © The Author(s) 2016.

  9. Mapping of HLA- DQ haplotypes in a group of Danish patients with celiac disease

    DEFF Research Database (Denmark)

    Lund, Flemming; Hermansen, Mette N; Pedersen, Merete F

    2015-01-01

    BACKGROUND: A cost-effective identification of HLA- DQ risk haplotypes using the single nucleotide polymorphism (SNP) technique has recently been applied in the diagnosis of celiac disease (CD) in four European populations. The objective of the study was to map risk HLA- DQ haplotypes in a group...... of Danish CD patients using the SNP technique. METHODS: Cohort A: Among 65 patients with gastrointestinal symptoms we compared the HLA- DQ2 and HLA- DQ8 risk haplotypes obtained by the SNP technique (method 1) with results based on a sequence specific primer amplification technique (method 2...

  10. Analysis of Case-Control Association Studies: SNPs, Imputation and Haplotypes

    KAUST Repository

    Chatterjee, Nilanjan; Chen, Yi-Hau; Luo, Sheng; Carroll, Raymond J.

    2009-01-01

    Although prospective logistic regression is the standard method of analysis for case-control data, it has been recently noted that in genetic epidemiologic studies one can use the "retrospective" likelihood to gain major power by incorporating various population genetics model assumptions such as Hardy-Weinberg-Equilibrium (HWE), gene-gene and gene-environment independence. In this article we review these modern methods and contrast them with the more classical approaches through two types of applications (i) association tests for typed and untyped single nucleotide polymorphisms (SNPs) and (ii) estimation of haplotype effects and haplotype-environment interactions in the presence of haplotype-phase ambiguity. We provide novel insights to existing methods by construction of various score-tests and pseudo-likelihoods. In addition, we describe a novel two-stage method for analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation followed by a powerful association test based on the retrospective likelihood. We illustrate applications of the methods using simulated and real data. © Institute of Mathematical Statistics, 2009.

  11. Analysis of Case-Control Association Studies: SNPs, Imputation and Haplotypes

    KAUST Repository

    Chatterjee, Nilanjan

    2009-11-01

    Although prospective logistic regression is the standard method of analysis for case-control data, it has been recently noted that in genetic epidemiologic studies one can use the "retrospective" likelihood to gain major power by incorporating various population genetics model assumptions such as Hardy-Weinberg-Equilibrium (HWE), gene-gene and gene-environment independence. In this article we review these modern methods and contrast them with the more classical approaches through two types of applications (i) association tests for typed and untyped single nucleotide polymorphisms (SNPs) and (ii) estimation of haplotype effects and haplotype-environment interactions in the presence of haplotype-phase ambiguity. We provide novel insights to existing methods by construction of various score-tests and pseudo-likelihoods. In addition, we describe a novel two-stage method for analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation followed by a powerful association test based on the retrospective likelihood. We illustrate applications of the methods using simulated and real data. © Institute of Mathematical Statistics, 2009.

  12. Using metacognitive cues to infer others' thinking

    OpenAIRE

    André Mata; Tiago Almeida

    2014-01-01

    Three studies tested whether people use cues about the way other people think---for example, whether others respond fast vs. slow---to infer what responses other people might give to reasoning problems. People who solve reasoning problems using deliberative thinking have better insight than intuitive problem-solvers into the responses that other people might give to the same problems. Presumably because deliberative responders think of intuitive responses before they think o...

  13. Common Genetic Variation and Haplotypes of the Anion Exchanger SLC4A2 in Primary Biliary Cirrhosis

    Science.gov (United States)

    Juran, Brian D.; Atkinson, Elizabeth J.; Larson, Joseph J.; Schlicht, Erik M.; Lazaridis, Konstantinos N.

    2010-01-01

    Objectives Deficiencies of the anion exchanger SLC4A2 are thought to play a pathogenic role in primary biliary cirrhosis (PBC), evidenced by decreased expression and activity in PBC patients and development of disease features in SLC4A2 knockout mice. We hypothesized that genetic variation in SLC4A2 might influence this pathogenic contribution. Thus, we aimed to perform a comprehensive assessment of SLC4A2 genetic variation in PBC using a linkage disequilibrium (LD)-based haplotype-tagging approach. Methods Twelve single nucleotide polymorphisms (SNPs) across SLC4A2 were genotyped in 409 PBC patients and 300 controls and evaluated for association with disease, as well as with prior orthotopic liver transplant and antimitochondrial antibody (AMA) status among the PBC patients, both individually and as inferred haplotypes, using logistic regression. Results All SNPs were in Hardy–Weinberg equilibrium. No associations with disease or liver transplantation were detected, but two variants, rs2303929 and rs3793336, were associated with negativity for antimitochondrial antibodies among the PBC patients. Conclusions The common genetic variation of SLC4A2 does not directly affect the risk of PBC or its clinical outcome. Whether the deficiency of SLC4A2 expression and activity observed earlier in PBC patients is an acquired epiphenomenon of underlying disease or is because of heritable factors in unappreciated regulatory regions remains uncertain. Of note, two SLC4A2 variants appear to influence AMA status among PBC patients. The mechanisms behind this finding are unclear. PMID:19491853
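
    A check that a SNP is in Hardy-Weinberg equilibrium, as reported above, is often done with a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The sketch below assumes that test; the abstract does not say which test the authors applied, and the function name and example counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic SNP.

    n_aa, n_ab, n_bb : observed counts of the three genotypes
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        # Monomorphic site: the test is uninformative.
        return 0.0, 1.0
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts for one SNP in a control group.
print(hwe_chi_square(25, 50, 25))
```

    For SNPs with rare genotypes an exact test is usually preferred over the chi-square approximation.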

  14. Insights into HLA-G genetics provided by worldwide haplotype diversity

    Directory of Open Access Journals (Sweden)

    Erick C Castelli

    2014-10-01

    Full Text Available Human Leucocyte Antigen G (HLA-G) belongs to the family of nonclassical HLA class I genes, located within the major histocompatibility complex (MHC). HLA-G has been the target of most recent research regarding the function of class I nonclassical genes. The main features that distinguish HLA-G from classical class I genes are: (a) limited protein variability; (b) alternative splicing generating several membrane bound and soluble isoforms; (c) short cytoplasmic tail; (d) modulation of immune response (immune tolerance); (e) restricted expression to certain tissues. In the present work, we describe the HLA-G gene structure and address the HLA-G variability and haplotype diversity among several populations around the world, considering each of its major segments (promoter, coding and 3’ untranslated regions). For this purpose, we developed a pipeline to reevaluate the 1000Genomes data and recover miscalled or missing genotypes and haplotypes. It became clear that the overall structure of the HLA-G molecule has been maintained during the evolutionary process and that most of the variation sites found in the HLA-G coding region are either coding synonymous or intronic mutations. In addition, only a few frequent and divergent extended haplotypes are found when the promoter, coding and 3’ untranslated regions are evaluated together. The divergence is particularly evident for the regulatory regions. The population comparisons confirmed that most of the HLA-G variability has originated before human dispersion from Africa and that the allele and haplotype frequencies have probably been shaped by strong selective pressures.

  15. Statistical inference

    CERN Document Server

    Rohatgi, Vijay K

    2003-01-01

    Unified treatment of probability and statistics examines and analyzes the relationship between the two fields, exploring inferential issues. Numerous problems, examples, and diagrams--some with solutions--plus clear-cut, highlighted summaries of results. Advanced undergraduate to graduate level. Contents: 1. Introduction. 2. Probability Model. 3. Probability Distributions. 4. Introduction to Statistical Inference. 5. More on Mathematical Expectation. 6. Some Discrete Models. 7. Some Continuous Models. 8. Functions of Random Variables and Random Vectors. 9. Large-Sample Theory. 10. General Meth

  16. A Haplotype Information Theory Method Reveals Genes of Evolutionary Interest in European vs. Asian Pigs.

    Science.gov (United States)

    Hudson, Nicholas J; Naval-Sánchez, Marina; Porto-Neto, Laercio; Pérez-Enciso, Miguel; Reverter, Antonio

    2018-06-05

    Asian and European wild boars were independently domesticated ca. 10,000 years ago. Since the 17th century, Chinese breeds have been imported to Europe to improve the genetics of European animals by introgression of favourable alleles, resulting in a complex mosaic of haplotypes. To interrogate the structure of these haplotypes further, we have run a new haplotype segregation analysis based on information theory, namely compression efficiency (CE). We applied the approach to sequence data from individuals from each phylogeographic region (n = 23 from Asia and Europe) including a number of major pig breeds. Our genome-wide CE is able to discriminate the breeds in a manner reflecting phylogeography. Furthermore, 24,956 non-overlapping sliding windows (each comprising 1,000 consecutive SNPs) were quantified for extent of haplotype sharing within and between Asia and Europe. The genome-wide distribution of extent of haplotype sharing was quite different between groups. Unlike in European pigs, haplotype sharing in Asian pigs approximates a normal distribution. In line with this, we found the European breeds possessed a number of genomic windows of dramatically higher haplotype sharing than the Asian breeds. Our CE analysis of sliding windows captures some of the genomic regions reported to contain signatures of selection in domestic pigs. Prominent among these regions, we highlight the role of a gene encoding the mitochondrial enzyme LACTB, which has been associated with obesity, and the gene encoding MYOG, a fundamental transcriptional regulator of myogenesis. The origin of these regions likely reflects either a population bottleneck in European animals, or selective targets on commercial phenotypes reducing allelic diversity in particular genes and/or regulatory regions.
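
    The idea behind a compression-based measure of haplotype sharing can be illustrated with a general-purpose compressor: the more haplotype content a group shares within a window, the better the window compresses. The sketch below is one plausible reading under stated assumptions, not the authors' implementation; the definition CE = 1 - compressed size / raw size, the use of zlib, and the 0/1 string encoding of alleles are all assumptions made here.

```python
import zlib

def compression_efficiency(haplotypes):
    """Fraction of bytes saved when one window of haplotypes is compressed.

    haplotypes : list of equal-length strings, one per sampled chromosome,
                 each character coding the allele at one SNP (e.g. "0"/"1").
    Higher values indicate more repeated (shared) haplotype content.
    """
    raw = "\n".join(haplotypes).encode("ascii")
    if not raw:
        raise ValueError("empty window")
    compressed = zlib.compress(raw, 9)  # level 9: strongest zlib compression
    return 1.0 - len(compressed) / len(raw)

# Hypothetical window: eight chromosomes carrying the same haplotype.
shared = ["0101100111" * 20] * 8
print(round(compression_efficiency(shared), 3))
```

    Applying the function to each window of consecutive SNPs, separately per group, would give a per-window profile that can be compared between groups.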

  17. Surfing among species, populations and morphotypes: Inferring boundaries between two species of new world silversides (Atherinopsidae).

    Science.gov (United States)

    González-Castro, Mariano; Rosso, Juan José; Mabragaña, Ezequiel; Díaz de Astarloa, Juan Martín

    2016-01-01

    Atherinopsidae are widespread freshwater and shallow marine fish with singular economic importance. Morphological, genetic and life-cycle differences between marine and estuarine populations have already been reported in this family, suggesting ongoing speciation. Coexistence and interbreeding between closely related species have also been documented. The aim of this study was to infer boundaries among: (A) Odontesthes bonariensis and O. argentinensis at species level, and intermediate morphs; (B) the population of O. argentinensis of Mar Chiquita Lagoon and its marine conspecifics. To achieve this, we integrated meristic, geometric morphometric and DNA barcode approaches. Four groups were discriminated and subsequently characterized according to their morphological traits, shape and meristic characters. No shared haplotypes between O. bonariensis and O. argentinensis were found. Significant meristic and body shape differences between the Mar Chiquita and marine individuals of O. argentinensis were found, suggesting they behave as well-differentiated populations, or even incipient ecological species. The fact that the Odontesthes morphotypes shared haplotypes with both O. argentinensis and O. bonariensis, but also possess distinctive meristic and morphometric traits, opens new questions related to the origin of this morphogroup. Copyright © 2015 Académie des sciences. Published by Elsevier SAS. All rights reserved.

  18. Multi-Agent Inference in Social Networks: A Finite Population Learning Approach.

    Science.gov (United States)

    Fan, Jianqing; Tong, Xin; Zeng, Yao

    When people in a society want to make inferences about some parameter, each person may want to use data collected by other people. Information (data) exchange in social networks is usually costly, so to make reliable statistical decisions, people need to trade off the benefits and costs of information acquisition. Conflicts of interest and coordination problems will arise in the process. Classical statistics does not consider people's incentives and interactions in the data collection process. To address this imperfection, this work explores multi-agent Bayesian inference problems with a game-theoretic social network model. Motivated by our interest in aggregate inference at the societal level, we propose a new concept, finite population learning, to address whether, with high probability, a large fraction of people in a given finite population network can make "good" inferences. Serving as a foundation, this concept enables us to study the long-run trend of aggregate inference quality as the population grows.

  19. Occurrence of the Southeast Asian/South American SVMNT haplotype of the chloroquine-resistance transporter gene in Plasmodium falciparum in Tanzania

    DEFF Research Database (Denmark)

    Alifrangis, Michael; Dalgaard, Michael B; Lusingu, John P

    2006-01-01

    Two main haplotypes, CVIET and SVMNT, of the Plasmodium falciparum chloroquine-resistance transporter gene (Pfcrt) are linked to 4-aminoquinoline resistance. The CVIET haplotype has been reported in most malaria-endemic regions, whereas the SVMNT haplotype has only been found outside Africa. We investigated Pfcrt haplotype frequencies in Korogwe District, Tanzania, in 2003 and 2004. The SVMNT haplotype was not detected in 2003 but was found in 19% of infected individuals in 2004. Amodiaquine use has increased in the region. The introduction and high prevalence of the SVMNT haplotype may reflect this and may raise concern regarding the use of amodiaquine in artemisinin-based combination therapies in Africa.

  20. Different DRB1*03:01-DQB1*02:01 haplotypes confer different risk for celiac disease.

    Science.gov (United States)

    Alshiekh, S; Zhao, L P; Lernmark, Å; Geraghty, D E; Naluai, Å T; Agardh, D

    2017-08-01

    Celiac disease is associated with the HLA-DR3-DQA1*05:01-DQB1*02:01 and DR4-DQA1*03:01-DQB1*03:02 haplotypes. In addition, there are currently over 40 non-HLA loci associated with celiac disease. This study extends previous analyses on different HLA haplotypes in celiac disease using next generation targeted sequencing. Included were 143 patients with celiac disease and 135 non-celiac disease controls investigated at median 9.8 years (1.4-18.3 years). PCR-based amplification of HLA and sequencing with Illumina MiSeq technology were used for extended sequencing of the HLA class II haplotypes HLA-DRB1, DRB3, DRB4, DRB5, DQA1 and DQB1, respectively. Odds ratios were computed marginally for every allele and haplotype as the ratio of allelic frequency in patients and controls as ratio of exposure rates (RR), when comparing a null reference with equal exposure rates in cases and controls. Among the extended HLA haplotypes, the strongest risk haplotype for celiac disease was shown for DRB3*01:01:02 in linkage with DQA1*05:01-DQB1*02:01 (RR = 6.34; P-value celiac disease among non-Scandinavians (RR = 7.94; P = .011). The data also revealed 2 distinct celiac disease risk DR3-DQA1*05:01-DQB*02:01 haplotypes distinguished by either the DRB3*01:01:02 or DRB3*02:02:01 alleles, indicating that different DRB1*03:01-DQB1*02:01 haplotypes confer different risk for celiac disease. The associated risk of celiac disease for DR3-DRB3*01:01:02-DQA1*05:01-DQB1*02:01 is predominant among patients of Scandinavian ethnicity. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  1. A Candidate Trans-acting Modulator of Fetal Hemoglobin Gene Expression in the Arab-Indian Haplotype of Sickle Cell Anemia

    Science.gov (United States)

    Vathipadiekal, Vinod; Farrell, John J.; Wang, Shuai; Edward, Heather L.; Shappell, Heather; Al-Rubaish, A.M.; Al-Muhanna, Fahad; Naserullah, Z.; Alsuliman, A.; Qutub, Hatem Othman; Simkin, Irene; Farrer, Lindsay A.; Jiang, Zhihua; Luo, Hong-Yuan; Huang, Shengwen; Mostoslavsky, Gustavo; Murphy, George J.; Patra, Pradeep.K.; Chui, David H.K.; Alsultan, Abdulrahman; Al-Ali, Amein K.; Sebastiani, Paola.; Steinberg, Martin. H.

    2016-01-01

    Fetal hemoglobin (HbF) levels are higher in the Arab-Indian (AI) β-globin gene haplotype of sickle cell anemia compared with African-origin haplotypes. To study genetic elements that affect HbF expression in the AI haplotype, we completed whole genome sequencing in 14 Saudi AI haplotype sickle hemoglobin homozygotes: seven selected for low HbF (8.2±1.3%) and seven selected for high HbF (23.5±2.6%). An intronic single nucleotide polymorphism (SNP) in ANTXR1, an anthrax toxin receptor (chromosome 2p13), was associated with HbF. These results were replicated in two independent Saudi AI haplotype cohorts of 120 and 139 patients, but not in 76 Saudi Benin haplotype, 894 African origin haplotype and 44 Arab Indian haplotype patients of Indian descent, suggesting that this association is effective only in the Saudi AI haplotype background. ANTXR1 variants explained 10% of the HbF variability compared with 8% for BCL11A. These two genes had independent, additive effects on HbF and together explained about 15% of HbF variability in Saudi AI sickle cell anemia patients. ANTXR1 was expressed at mRNA and protein levels in erythroid progenitors derived from induced pluripotent stem cells (iPSCs) and CD34+ cells. As CD34+ cells matured and their HbF decreased, ANTXR1 expression increased; as iPSCs differentiated and their HbF increased, ANTXR1 expression decreased. Along with elements in cis to the HbF genes, ANTXR1 contributes to the variation in HbF in Saudi AI haplotype sickle cell anemia and is the first gene in trans to HBB that is associated with HbF only in carriers of the Saudi AI haplotype. PMID:27501013

  2. Mitochondrial DNA haplotype distribution patterns in Pinus ponderosa (Pinaceae): range-wide evolutionary history and implications for conservation.

    Science.gov (United States)

    Potter, Kevin M; Hipkins, Valerie D; Mahalovich, Mary F; Means, Robert E

    2013-08-01

    Ponderosa pine (Pinus ponderosa Douglas ex P. Lawson & C. Lawson) exhibits complicated patterns of morphological and genetic variation across its range in western North America. This study aims to clarify P. ponderosa evolutionary history and phylogeography using a highly polymorphic mitochondrial DNA marker, with results offering insights into how geographical and climatological processes drove the modern evolutionary structure of tree species in the region. We amplified the mtDNA nad1 second intron minisatellite region for 3,100 trees representing 104 populations, and sequenced all length variants. We estimated population-level haplotypic diversity and determined diversity partitioning among varieties, races and populations. After aligning sequences of minisatellite repeat motifs, we evaluated evolutionary relationships among haplotypes. The geographical structuring of the 10 haplotypes corresponded with division between Pacific and Rocky Mountain varieties. Pacific haplotypes clustered with high bootstrap support, and appear to have descended from Rocky Mountain haplotypes. A greater proportion of diversity was partitioned between Rocky Mountain races than between Pacific races. Areas of highest haplotypic diversity were the southern Sierra Nevada mountain range in California, northwestern California, and southern Nevada. Pinus ponderosa haplotype distribution patterns suggest a complex phylogeographic history not revealed by other genetic and morphological data, or by the sparse paleoecological record. The results appear consistent with long-term divergence between the Pacific and Rocky Mountain varieties, along with more recent divergences not well-associated with race. Pleistocene refugia may have existed in areas of high haplotypic diversity, as well as the Great Basin, Southwestern United States/northern Mexico, and the High Plains.

  3. Oestrogen receptor α gene haplotype and postmenopausal breast cancer risk: a case control study

    International Nuclear Information System (INIS)

    Wedrén, Sara; Stiger, Fredrik; Persson, Ingemar; Baron, John; Weiderpass, Elisabete; Lovmar, Lovisa; Humphreys, Keith; Magnusson, Cecilia; Melhus, Håkan; Syvänen, Ann-Christine; Kindmark, Andreas; Landegren, Ulf; Fermér, Maria Lagerström

    2004-01-01

    Oestrogen receptor α, which mediates the effect of oestrogen in target tissues, is genetically polymorphic. Because breast cancer development is dependent on oestrogenic influence, we have investigated whether polymorphisms in the oestrogen receptor α gene (ESR1) are associated with breast cancer risk. We genotyped breast cancer cases and age-matched population controls for one microsatellite marker and four single-nucleotide polymorphisms (SNPs) in ESR1. The numbers of genotyped cases and controls for each marker were as follows: (TA)n, 1514 cases and 1514 controls; c.454-397C → T, 1557 cases and 1512 controls; c.454-351A → G, 1556 cases and 1512 controls; c.729C → T, 1562 cases and 1513 controls; c.975C → G, 1562 cases and 1513 controls. Using logistic regression models, we calculated odds ratios (ORs) and 95% confidence intervals (CIs). Haplotype effects were estimated in an exploratory analysis, using expectation-maximisation algorithms for case-control study data. There were no compelling associations between single polymorphic loci and breast cancer risk. In haplotype analyses, a common haplotype of the c.454-351A → G or c.454-397C → T and c.975C → G SNPs appeared to be associated with an increased risk for ductal breast cancer: one copy of the c.454-351A → G and c.975C → G haplotype entailed an OR of 1.19 (95% CI 1.06–1.33) and two copies an OR of 1.42 (95% CI 1.15–1.77), compared with no copies, under a model of multiplicative penetrance. The association with the c.454-397C → T and c.975C → G haplotypes was similar. Our data indicated that these haplotypes were more influential in women with a high body mass index. Adjustment for multiple comparisons rendered the associations statistically non-significant. We found suggestions of an association between common haplotypes in ESR1 and the risk for ductal breast cancer that is stronger in heavy women.
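
    The two reported odds ratios are consistent with the multiplicative model named in the abstract, under which each additional haplotype copy multiplies the odds by the same factor. A minimal sketch of that arithmetic follows; the function name `odds_ratio_for_copies` is illustrative and not from the study.

```python
def odds_ratio_for_copies(per_copy_or: float, copies: int) -> float:
    """Odds ratio versus zero copies under a multiplicative model.

    Each haplotype copy multiplies the odds by `per_copy_or`.
    """
    if copies < 0:
        raise ValueError("copies must be non-negative")
    return per_copy_or ** copies


# Per-copy OR reported in the abstract for one copy of the haplotype.
one_copy_or = 1.19
two_copy_or = odds_ratio_for_copies(one_copy_or, 2)

print(round(two_copy_or, 2))  # 1.42, matching the reported two-copy OR
```

    Squaring the one-copy estimate (1.19 × 1.19 ≈ 1.416) reproduces the reported two-copy OR of 1.42 after rounding.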

  4. A comparison of different algorithms for phasing haplotypes using Holstein cattle genotypes and pedigree data.

    Science.gov (United States)

    Miar, Younes; Sargolzaei, Mehdi; Schenkel, Flavio S

    2017-04-01

    Phasing genotypes to haplotypes is becoming increasingly important due to its applications in the study of diseases, population and evolutionary genetics, imputation, and so on. Several studies have focused on the development of computational methods that infer haplotype phase from population genotype data. The aim of this study was to compare phasing algorithms implemented in Beagle, Findhap, FImpute, Impute2, and ShapeIt2 software using 50k and 777k (HD) genotyping data. Six scenarios were considered: no-parents, sire-progeny pairs, sire-dam-progeny trios, each with and without pedigree information in Holstein cattle. Algorithms were compared with respect to their phasing accuracy and computational efficiency. In the studied population, Beagle and FImpute were more accurate than other phasing algorithms. Across scenarios, phasing accuracies for Beagle and FImpute were 99.49-99.90% and 99.44-99.99% for 50k, respectively, and 99.90-99.99% and 99.87-99.99% for HD, respectively. Generally, FImpute resulted in higher accuracy when genotypic information of at least one parent was available. In the absence of parental genotypes and pedigree information, Beagle and Impute2 (with double the default number of states) were slightly more accurate than FImpute. Findhap gave high phasing accuracy when parents' genotypes and pedigree information were available. In terms of computing time, Findhap was the fastest algorithm followed by FImpute. FImpute was 30 to 131, 87 to 786, and 353 to 1,400 times faster across scenarios than Beagle, ShapeIt2, and Impute2, respectively. In summary, FImpute and Beagle were the most accurate phasing algorithms. Moreover, the low computational requirement of FImpute makes it an attractive algorithm for phasing genotypes of large livestock populations. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  5. An efficient Bayesian inference approach to inverse problems based on an adaptive sparse grid collocation method

    International Nuclear Information System (INIS)

    Ma Xiang; Zabaras, Nicholas

    2009-01-01

    A new approach to modeling inverse problems using a Bayesian inference method is introduced. The Bayesian approach considers the unknown parameters as random variables and seeks the probabilistic distribution of the unknowns. By introducing the concept of the stochastic prior state space to the Bayesian formulation, we reformulate the deterministic forward problem as a stochastic one. The adaptive hierarchical sparse grid collocation (ASGC) method is used for constructing an interpolant to the solution of the forward model in this prior space which is large enough to capture all the variability/uncertainty in the posterior distribution of the unknown parameters. This solution can be considered as a function of the random unknowns and serves as a stochastic surrogate model for the likelihood calculation. Hierarchical Bayesian formulation is used to derive the posterior probability density function (PPDF). The spatial model is represented as a convolution of a smooth kernel and a Markov random field. The state space of the PPDF is explored using Markov chain Monte Carlo algorithms to obtain statistics of the unknowns. The likelihood calculation is performed by directly sampling the approximate stochastic solution obtained through the ASGC method. The technique is assessed on two nonlinear inverse problems: source inversion and permeability estimation in flow through porous media

  6. F8 haplotype and inhibitor risk: results from the Hemophilia Inhibitor Genetics Study (HIGS) Combined Cohort

    Science.gov (United States)

    Schwarz, John; Astermark, Jan; Menius, Erika D.; Carrington, Mary; Donfield, Sharyne M.; Gomperts, Edward D.; Nelson, George W.; Oldenburg, Johannes; Pavlova, Anna; Shapiro, Amy D.; Winkler, Cheryl A.; Berntorp, Erik

    2012-01-01

    Background Ancestral background, specifically African descent, confers higher risk for development of inhibitory antibodies to factor VIII (FVIII) in hemophilia A. It has been suggested that differences in the distribution of factor VIII gene (F8) haplotypes, and mismatch between endogenous F8 haplotypes and those comprising products used for treatment could contribute to risk. Design and Methods Data from the HIGS Combined Cohort were used to determine the association between F8 haplotype 3 (H3) vs. haplotypes 1 and 2 (H1+H2) and inhibitor risk among individuals of genetically-determined African descent. Other variables known to affect inhibitor risk including type of F8 mutation and HLA were included in the analysis. A second research question regarding risk related to mismatch in endogenous F8 haplotype and recombinant FVIII products used for treatment was addressed. Results H3 was associated with higher inhibitor risk among those genetically-identified (N=49) as of African ancestry, but the association did not remain significant after adjustment for F8 mutation type and the HLA variables. Among subjects of all racial ancestries enrolled in HIGS who reported early use of recombinant products (N=223), mismatch in endogenous haplotype and the FVIII proteins constituting the products used did not confer greater risk for inhibitor development. Conclusion H3 was not an independent predictor of inhibitor risk. Further, our findings did not support a higher risk of inhibitors in the presence of a haplotype mismatch between the FVIII molecule infused and that of the individual. PMID:22958194

  7. Russell and Humean Inferences

    Directory of Open Access Journals (Sweden)

    João Paulo Monteiro

    2001-12-01

    Full Text Available Russell's The Problems of Philosophy tries to establish a new theory of induction, while Hume is there accused of "an irrational scepticism about induction". But a careful analysis of the theory of knowledge explicitly acknowledged by Hume reveals that, contrary to the standard interpretation in the 20th century, possibly influenced by Russell, Hume deals exclusively with causal inference (which he never classifies as "causal induction", although now we are entitled to do so), never with inductive inference in general, mainly generalizations about sensible qualities of objects (whether, e.g., "all crows are black" or not is not among Hume's concerns). Russell's theories are thus only false alternatives to Hume's, in his (1912) or in his (1948).

  8. Haplotype diversity and linkage disequilibrium at DRD2 locus--a study on four population groups of Andhra Pradesh, India.

    Science.gov (United States)

    Saraswathy, Kallur Nava; Mukhopadhyay, Rupak; Shukla, Deepti; Kaur, Harpreet; Sachdeva, Mohinder Pal; Rao, A P; Saksena, Deepti; Kalla, Aloke Kumar

    2009-02-01

    Dopamine receptor D2 (DRD2) is expressed in the central nervous system and has a high affinity for many antipsychotic drugs. Besides several epidemiological investigations on association of DRD2 locus polymorphism(s) with neuropsychiatric problems and addictive behavior, a few polymorphisms in this locus have also been used to understand genomic diversity and population migratory histories globally. The present study attempts to understand the genomic diversity/affinity among four endogamous groups of Andhra Pradesh (India) against the backdrop of diversity studies from other parts of India and the rest of the world, with special reference to DRD2 locus. The four population groups from Adilabad District of Andhra Pradesh, namely, Brahmin (n=50), Nayakpod (n=49), Thoti (n=52), and Kolam (n=53), were included in the study. The DRD2 markers typed for the present study are three biallelic restriction fragments, that is, TaqI A (rs1800497), TaqI B (rs1079597), and TaqI D (rs1800498). Scoring of DRD2 haplotypes with respect to the three TaqI sites shows that five out of eight possible haplotypes are shared by the four populations. Ancestral haplotype B2D2A1 is most frequent among Thotis (0.359). The results of the present study indicate a differential gene flow into South India followed by certain important demographic events resulting in diversified peopling of India.
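
    The count of eight possible haplotypes follows from three biallelic sites (2 × 2 × 2). A small sketch enumerating them is given below; the allele labels B1/B2, D1/D2 and A1/A2 and the B-D-A ordering are assumed from the "B2D2A1" name used in the abstract.

```python
from itertools import product

# Assumed allele labels for the three TaqI sites, in B-D-A order.
SITES = {
    "TaqI B": ("B1", "B2"),
    "TaqI D": ("D1", "D2"),
    "TaqI A": ("A1", "A2"),
}


def enumerate_haplotypes(sites):
    """Return every combination of one allele per site as a label string."""
    return ["".join(combo) for combo in product(*sites.values())]


haplotypes = enumerate_haplotypes(SITES)
print(len(haplotypes))         # 8
print("B2D2A1" in haplotypes)  # True
```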

  9. Statistical perspectives on inverse problems

    DEFF Research Database (Denmark)

    Andersen, Kim Emil

    of the interior of an object from electrical boundary measurements. One part of this thesis concerns statistical approaches for solving, possibly non-linear, inverse problems. Thus inverse problems are recasted in a form suitable for statistical inference. In particular, a Bayesian approach for regularisation...... problem is given in terms of probability distributions. Posterior inference is obtained by Markov chain Monte Carlo methods and new, powerful simulation techniques based on e.g. coupled Markov chains and simulated tempering is developed to improve the computational efficiency of the overall simulation......Inverse problems arise in many scientific disciplines and pertain to situations where inference is to be made about a particular phenomenon from indirect measurements. A typical example, arising in diffusion tomography, is the inverse boundary value problem for non-invasive reconstruction...

  10. Grammatical inference algorithms, routines and applications

    CERN Document Server

    Wieczorek, Wojciech

    2017-01-01

    This book focuses on grammatical inference, presenting classic and modern methods of grammatical inference from the perspective of practitioners. To do so, it employs the Python programming language to present all of the methods discussed. Grammatical inference is a field that lies at the intersection of multiple disciplines, with contributions from computational linguistics, pattern recognition, machine learning, computational biology, formal learning theory and many others. Though the book is largely practical, it also includes elements of learning theory, combinatorics on words, the theory of automata and formal languages, plus references to real-world problems. The listings presented here can be directly copied and pasted into other programs, thus making the book a valuable source of ready recipes for students, academic researchers, and programmers alike, as well as an inspiration for their further development.

  11. A Near-Complete Haplotype-Phased Genome of the Dikaryotic Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici Reveals High Interhaplotype Diversity.

    Science.gov (United States)

    Schwessinger, Benjamin; Sperschneider, Jana; Cuddy, William S; Garnica, Diana P; Miller, Marisa E; Taylor, Jennifer M; Dodds, Peter N; Figueroa, Melania; Park, Robert F; Rathjen, John P

    2018-02-20

    A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and identify virulence genes involved in plant infection referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales. In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies. IMPORTANCE Current representations of eukaryotic microbial genomes are haploid, hiding the genomic diversity intrinsic to diploid and polyploid life forms. This hidden diversity contributes to the organism's evolutionary potential and ability to adapt to stress conditions. Yet, it is
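
    The N50 statistic quoted for the assembly is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. A minimal sketch of that standard calculation follows; the contig lengths are made-up illustration values, not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative length reaches at least half
    of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("contig_lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units).
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))  # 8
```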

  12. Mitochondrial control region haplotypes of the South American sea lion Otaria flavescens (Shaw, 1800).

    Science.gov (United States)

    Artico, L O; Bianchini, A; Grubel, K S; Monteiro, D S; Estima, S C; Oliveira, L R de; Bonatto, S L; Marins, L F

    2010-09-01

    The South American sea lion, Otaria flavescens, is widely distributed along the Pacific and Atlantic coasts of South America. However, along the Brazilian coast, there are only two nonbreeding sites for the species (Refúgio de Vida Silvestre da Ilha dos Lobos and Refúgio de Vida Silvestre do Molhe Leste da Barra do Rio Grande), both in Southern Brazil. In this region, the species is continuously under the effect of anthropic activities, mainly those related to environmental contamination with organic and inorganic chemicals and fishery interactions. This paper reports, for the first time, the genetic diversity of O. flavescens found along the Southern Brazilian coast. A 287-bp fragment of the mitochondrial DNA control region (D-loop) was analyzed. Seven novel haplotypes were found in 56 individuals (OFA1-OFA7), with OFA1 being the most frequent (47.54%). Nucleotide diversity was moderate (π = 0.62%) and haplotype diversity was relatively low (67%). Furthermore, the median joining network analysis indicated that Brazilian haplotypes formed a reciprocal monophyletic clade when compared to the haplotypes from the Peruvian population on the Pacific coast. These two populations do not share haplotypes and may have become isolated some time back. Further genetic studies covering the entire species distribution are necessary to better understand the biological implications of the results reported here for the management and conservation of South American sea lions.
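
    Haplotype diversity of the kind reported here is often estimated with Nei's formula, h = n/(n-1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sampled individuals. The abstract does not say which estimator was used, so the sketch below is an assumption, and the sample is made up rather than taken from the study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled individuals are required")
    counts = Counter(haplotypes)
    sum_squared_freqs = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_squared_freqs)


# Hypothetical sample: one common haplotype and two rarer ones.
sample = ["OFA1"] * 5 + ["OFA2"] * 3 + ["OFA3"] * 2
print(haplotype_diversity(sample))
```

    The value is 0 when every individual carries the same haplotype and 1 when every individual carries a different one.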

  13. The impact of sample size and marker selection on the study of haplotype structures

    Directory of Open Access Journals (Sweden)

    Sun Xiao

    2004-03-01

    Full Text Available Abstract Several studies of haplotype structures in the human genome in various populations have found that the human chromosomes are structured such that each chromosome can be divided into many blocks, within which there is limited haplotype diversity. In addition, only a few genetic markers in a putative block are needed to capture most of the diversity within a block. There has been no systematic empirical study of the effects of sample size and marker set on the identified block structures and representative marker sets, however. The purpose of this study was to conduct a detailed empirical study to examine such impacts. Towards this goal, we have analysed three representative autosomal regions from a large genome-wide study of haplotypes with samples consisting of African-Americans and samples consisting of Japanese and Chinese individuals. For both populations, we have found that the sample size and marker set have significant impact on the number of blocks and the total number of representative markers identified. The marker set in particular has very strong impacts, and our results indicate that the marker density in the original datasets may not be adequate to allow a meaningful characterisation of haplotype structures. In general, we conclude that we need a relatively large sample size and a very dense marker panel in the study of haplotype structures in human populations.

  14. Direct chromosome-length haplotyping by single-cell sequencing

    NARCIS (Netherlands)

    Porubský, David; Sanders, Ashley D; van Wietmarschen, Niek; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Bevova, Marianna R; Guryev, Victor; Lansdorp, Peter Michael

    Haplotypes are fundamental to fully characterize the diploid genome of an individual, yet methods to directly chart the unique genetic makeup of each parental chromosome are lacking. Here we introduce single-cell DNA template strand sequencing (Strand-seq) as a novel approach to phasing diploid

  15. Five novel glucose-6-phosphate dehydrogenase deficiency haplotypes correlating with disease severity

    Directory of Open Access Journals (Sweden)

    Dallol Ashraf

    2012-09-01

    Full Text Available Abstract Background Glucose-6-phosphate dehydrogenase (G6PD, EC 1.1.1.49) deficiency is caused by one or more mutations in the G6PD gene on chromosome X. An association between enzyme levels and gene haplotypes remains to be established. Methods In this study, we determined G6PD enzyme levels and sequenced the coding region, including the intron-exon boundaries, in a group of individuals (163 males and 86 females) who were referred to the clinic with suspected G6PD deficiency. The sequence data were analysed by physical linkage analysis and PHASE haplotype reconstruction. Results All previously reported G6PD missense changes, including the AURES, MEDITERRANEAN, A-, SIBARI, VIANGCHAN and ANANT, were identified in our cohort. The AURES mutation (p.Ile48Thr) was the most common variant in the cohort (30% in male patients), followed by the Mediterranean variant (p.Ser188Phe), detectable in 17.79% of male patients. Variant forms of the A- mutation (p.Val68Met, p.Asn126Asp or a combination of both) were detectable in 15.33% of the male patients. However, unique to this study, several such mutations co-existed in the same patient, as shown by physical linkage in males or PHASE haplotype reconstruction in females. Based on 6 non-synonymous variants of G6PD, 13 different haplotypes (13 in males, 8 in females) were identified. Five of these were previously unreported (Jeddah A, B, C, D and E) and were defined by previously unreported combinations of extant mutations, where patients harbouring these haplotypes exhibited severe G6PD deficiency. Conclusions Our findings will help design a focused population screening approach and provide better management for G6PD deficiency patients.

  16. Refined candidate region specified by haplotype sharing for Escherichia coli F4ab/F4ac susceptibility alleles in pigs

    DEFF Research Database (Denmark)

    Jacobsen, Mette Juul; Kracht, Steffen Skaarup; Esteso, G.

    2009-01-01

    Infection of the small intestine by enterotoxigenic Escherichia coli F4ab/ac is a major welfare problem and financial burden for the pig industry. Natural resistance to this infection is inherited as a Mendelian recessive trait, and a polymorphism in the MUC4 gene segregating for susceptibility/resistance is presently used in a selection programme by the Danish pig breeding industry. To elucidate the genetic background involved in E. coli F4ab/ac susceptibility in pigs, a detailed haplotype map of the porcine candidate region was established. This region covers approximately 3.7 Mb. The material used for the study is a three-generation family, where the founders are two Wild boars and eight Large White sows. All pigs have been phenotyped for susceptibility to F4ab/ac using an adhesion assay. Their haplotypes are known from segregation analysis using flanking markers. By a targeted approach, the candidate...

  17. Power laws for heavy-tailed distributions: modeling allele and haplotype diversity for the national marrow donor program.

    Directory of Open Access Journals (Sweden)

    Noa Slater

    2015-04-01

    Full Text Available Measures of allele and haplotype diversity, which are fundamental properties in population genetics, often follow heavy-tailed distributions. These measures are of particular interest in the field of hematopoietic stem cell transplant (HSCT). Donor/recipient suitability for HSCT is determined by Human Leukocyte Antigen (HLA) similarity. Match predictions rely upon a precise description of HLA diversity, yet classical estimates are inaccurate given the heavy-tailed nature of the distribution. This directly affects HSCT matching and diversity measures in broader fields such as species richness. We, therefore, have developed a power-law based estimator to measure allele and haplotype diversity that accommodates heavy tails using the concepts of regular variation and occupancy distributions. Application of our estimator to 6.59 million donors in the Be The Match Registry revealed that haplotypes follow a heavy-tailed distribution across all ethnicities: for example, 44.65% of the European American haplotypes are represented by only 1 individual. Indeed, our discovery rate of all U.S. European American haplotypes is estimated at 23.45% based upon sampling 3.97% of the population, leaving a large number of unobserved haplotypes. Population coverage, however, is much higher at 99.4% given that 90% of European Americans carry one of the 4.5% most frequent haplotypes. Alleles were found to be less diverse, suggesting the current registry represents most alleles in the population. Thus, for HSCT registries, haplotype discovery will remain high with continued recruitment to a very deep level of sampling, but population coverage will not. Finally, we compared the convergence of our power-law versus classical diversity estimators such as Capture recapture, Chao, ACE and Jackknife methods. When fit to the haplotype data, our estimator displayed favorable properties in terms of convergence (with respect to sampling depth) and accuracy (with respect to diversity
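
    For orientation only, the following is a minimal sketch of one of the classical estimators the abstract names (Chao), in its bias-corrected Chao1 form, together with a Good-Turing style coverage estimate. It is not the authors' power-law estimator; the function names and the toy counts are assumptions made for illustration.

```python
def chao1(counts):
    """Bias-corrected Chao1 lower bound on the total number of types.

    counts: per-haplotype observation counts (hypothetical toy data).
    """
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)                      # distinct haplotypes observed
    f1 = sum(1 for c in counts if c == 1)    # singletons
    f2 = sum(1 for c in counts if c == 2)    # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def good_turing_coverage(counts):
    """Estimated share of the population carrying an already-observed type."""
    n = sum(counts)
    f1 = sum(1 for c in counts if c == 1)
    return 1.0 - f1 / n


# Hypothetical sample: a few common haplotypes and several singletons.
toy_counts = [50, 30, 10, 2, 2, 1, 1, 1, 1]
estimated_richness = chao1(toy_counts)
estimated_coverage = good_turing_coverage(toy_counts)
```

    Even in this toy sample the two quantities diverge in the way the abstract describes: many types remain unseen (richness estimate above the observed count) while coverage of individuals is already high.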

  18. On Maximum Entropy and Inference

    Directory of Open Access Journals (Sweden)

    Luigi Gresele

    2017-11-01

    Full Text Available Maximum entropy is a powerful concept that entails a sharp separation between relevant and irrelevant variables. It is typically invoked in inference, once an assumption is made on what the relevant variables are, in order to estimate a model from data that affords predictions on all other (dependent) variables. Conversely, maximum entropy can be invoked to retrieve the relevant variables (sufficient statistics) directly from the data, once a model is identified by Bayesian model selection. We explore this approach in the case of spin models with interactions of arbitrary order, and we discuss how relevant interactions can be inferred. In this perspective, the dimensionality of the inference problem is not set by the number of parameters in the model, but by the frequency distribution of the data. We illustrate the method showing its ability to recover the correct model in a few prototype cases and discuss its application on a real dataset.
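
    As a rough illustration of the forward use of maximum entropy mentioned above (fixing the relevant variables first, then estimating the model), the sketch below fits the simplest spin model, one with only first-order terms. It assumes spins take values -1/+1 and that the only constraints are the sample means; the function names are hypothetical, and the higher-order interactions and Bayesian model selection discussed in the abstract are not covered.

```python
import math
from itertools import product


def fit_fields(samples):
    """Maximum-entropy fields h_i for p(s) proportional to exp(sum_i h_i * s_i).

    With only mean constraints the solution is h_i = atanh(<s_i>),
    which requires every sample mean to lie strictly between -1 and 1.
    """
    n_spins = len(samples[0])
    means = [sum(s[i] for s in samples) / len(samples) for i in range(n_spins)]
    return [math.atanh(m) for m in means]


def model_probabilities(fields):
    """Enumerate all spin states and return their model probabilities."""
    states = list(product((-1, 1), repeat=len(fields)))
    weights = [math.exp(sum(h * s for h, s in zip(fields, state)))
               for state in states]
    z = sum(weights)  # partition function
    return {state: w / z for state, w in zip(states, weights)}


# Hypothetical data: four observations of two spins.
toy_samples = [(1, 1), (1, -1), (1, 1), (-1, 1)]
toy_fields = fit_fields(toy_samples)
toy_probs = model_probabilities(toy_fields)
```

    The fitted model reproduces the constrained averages exactly and is otherwise as unstructured as possible, which is the sense in which the means act as sufficient statistics here.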

  19. Cis-acting mutation and duplication: History of molecular evolution in a P450 haplotype responsible for insecticide resistance in Culex quinquefasciatus.

    Science.gov (United States)

    Itokawa, Kentaro; Komagata, Osamu; Kasai, Shinji; Masada, Masahiro; Tomita, Takashi

    2011-07-01

    A cytochrome P450 gene, Cyp9m10, is more than 200-fold overexpressed in a pyrethroid-resistant strain of Culex quinquefasciatus, JPal-per. The haplotype of this strain contains two copies of Cyp9m10 resulting from a recent tandem duplication. In this study, we discovered and isolated a Cyp9m10 haplotype closely related to this duplicated Cyp9m10 haplotype from JHB, a strain used for the recent genome project for this mosquito species. The isolated haplotype (JHB-NIID-B haplotype) shared the same insertion of a transposable element upstream of the coding region with the JPal-per strain but was not duplicated. The JHB-NIID-B haplotype was considered to have diverged from the JPal-per lineage just before the duplication event. Cyp9m10 was moderately overexpressed in larvae with the JHB-NIID-B haplotype. The overexpression in the JHB-NIID-B and JPal-per haplotypes was developmentally regulated in a similar pattern, indicating that both haplotypes share a common cis-acting mutation responsible for the overexpression. The isolated, moderately overexpressed haplotype conferred resistance; however, its efficacy was relatively small. We hypothesized that the first cis-acting mutation modified the consequence of the subsequent duplication in the JPal-per lineage, conferring a stronger phenotypic effect than the duplication would have had if it had occurred before the first cis-acting mutation. Copyright © 2011 Elsevier Ltd. All rights reserved.

  20. Vitamin K epoxide reductase complex subunit 1 (Vkorc1) haplotype diversity in mouse priority strains

    Directory of Open Access Journals (Sweden)

    Kohn Michael H

    2008-12-01

    Full Text Available Abstract Background Polymorphisms in the vitamin K-epoxide reductase complex subunit 1 gene, Vkorc1, could affect blood coagulation and other vitamin K-dependent proteins, such as osteocalcin (bone Gla protein, BGP). Here we sequenced the Vkorc1 gene in 40 mouse priority strains. We analyzed Vkorc1 haplotypes with respect to prothrombin time (PT) and bone mineral density and composition (BMD and BMC), phenotypes expected to be vitamin K-dependent and represented by data in the Mouse Phenome Database (MPD). Findings In the commonly used laboratory strains of Mus musculus domesticus we identified only four haplotypes differing in the intron or 5' region sequence of Vkorc1. Six haplotypes differing by coding and non-coding polymorphisms were identified in the other subspecies of Mus. We detected no significant association of Vkorc1 haplotypes with PT, BMD and BMC within each subspecies of Mus. Vkorc1 haplotype sequence divergence between subspecies was associated with PT, BMD and BMC. Conclusion Phenotypic variation in PT, BMD and BMC within subspecies of Mus, while substantial, appears to be dominated by genetic variation in genes other than Vkorc1. This was particularly evident for M. m. domesticus, where a single haplotype was observed in conjunction with virtually the entire range of PT, BMD and BMC values of all 5 subspecies of Mus included in this study. Differences in these phenotypes between subspecies also should not be attributed to Vkorc1 variants, but should be viewed as a result of genome-wide genetic divergence.

  1. Haplotype frequencies at the DRD2 locus in populations of the East European Plain

    Directory of Open Access Journals (Sweden)

    Mikulich Alexey I

    2009-09-01

    Full Text Available Abstract Background It was demonstrated previously that the three-locus RFLP haplotype, TaqI B-TaqI D-TaqI A (B-D-A), at the DRD2 locus constitutes a powerful genetic marker and probably reflects the most ancient dispersal of anatomically modern humans. Results We investigated TaqI B, BclI, MboI, TaqI D, and TaqI A RFLPs in 17 contemporary populations of the East European Plain and Siberia. Most of these populations belong to the Indo-European or Uralic language families. We identified three common haplotypes, which occurred in more than 90% of chromosomes investigated. The frequencies of the haplotypes differed according to linguistic and geographical affiliation. Conclusion Populations in the northwestern (Byelorussians from Mjadel'), northern (Russians from Mezen' and Oshevensk), and eastern (Russians from Puchezh) parts of the East European Plain had relatively high frequencies of haplotype B2-D2-A2, which may reflect admixture with Uralic-speaking populations that inhabited all of these regions in the Early Middle Ages.

  2. PADI4 Haplotypes in Association with RA Mexican Patients, a New Prospect for Antigen Modulation

    Directory of Open Access Journals (Sweden)

    Maria Guadalupe Zavala-Cerna

    2013-01-01

    Full Text Available Peptidyl arginine deiminase IV (PAD 4) is the enzyme responsible for a posttranslational modification called citrullination, originating the antigenic determinant recognized by anti-cyclic citrullinated peptide antibodies (ACPA). Four SNPs (single nucleotide polymorphisms) have been described in the PADI4 gene to form a susceptibility haplotype for rheumatoid arthritis (RA); nevertheless, results in association studies appear contradictory in different populations. The aim of the study was to analyze whether the presence of three SNPs in the PADI4 gene susceptibility haplotype (GTG) is associated with ACPA positivity in patients with RA. This was a cross-sectional study that included 86 RA patients and 98 healthy controls. Polymorphisms PADI4_89, PADI4_90, and PADI4_92 in the PADI4 gene were genotyped. The susceptibility haplotype (GTG) was more frequent in RA patients; interestingly, we found a new haplotype associated with RA with a higher frequency (GTC). There were no associations between polymorphisms and high scores in Spanish HAQ-DI and DAS-28, but we did find an association between the RARBIS index and the PADI4_89 and PADI4_90 polymorphisms. We could not confirm an association between susceptibility haplotype presence and ACPA positivity. Further evidence about proteomic expression of this gene will determine its participation in antigenic generation and autoimmunity.

  3. Spatial Inference Based on Geometric Proportional Analogies

    OpenAIRE

    Mullally, Emma-Claire; O'Donoghue, Diarmuid P.

    2006-01-01

    We describe an instance-based reasoning solution to a variety of spatial reasoning problems. The solution centers on identifying an isomorphic mapping between labelled graphs that represent some problem data and a known solution instance. We describe a number of spatial reasoning problems that are solved by generating non-deductive inferences, integrating topology with area (and other) features. We report the accuracy of our algorithm on different categories of spatial reasoning tasks from th...

  4. Identification of Tribolium castaneum (Herbst) haplotypes, the pest of ...

    African Journals Online (AJOL)

    SARAH

    2016-07-31

    Jul 31, 2016 ... haplotypes of T. castaneum and their distribution in Senegal. Methodology ... very strong marketing of cereals and vegetables in that area. The mutations ..... for each channel by sampling the various parameters every 1000 ...

  5. Mitochondrial control region haplotypes of the South American sea lion Otaria flavescens (Shaw, 1800)

    Directory of Open Access Journals (Sweden)

    L.O. Artico

    2010-09-01

    Full Text Available The South American sea lion, Otaria flavescens, is widely distributed along the Pacific and Atlantic coasts of South America. However, along the Brazilian coast, there are only two nonbreeding sites for the species (Refúgio de Vida Silvestre da Ilha dos Lobos and Refúgio de Vida Silvestre do Molhe Leste da Barra do Rio Grande), both in Southern Brazil. In this region, the species is continuously under the effect of anthropic activities, mainly those related to environmental contamination with organic and inorganic chemicals and fishery interactions. This paper reports, for the first time, the genetic diversity of O. flavescens found along the Southern Brazilian coast. A 287-bp fragment of the mitochondrial DNA control region (D-loop) was analyzed. Seven novel haplotypes were found in 56 individuals (OFA1-OFA7), with OFA1 being the most frequent (47.54%). Nucleotide diversity was moderate (π = 0.62%) and haplotype diversity was relatively low (67%). Furthermore, the median joining network analysis indicated that Brazilian haplotypes formed a reciprocal monophyletic clade when compared to the haplotypes from the Peruvian population on the Pacific coast. These two populations do not share haplotypes and may have become isolated some time back. Further genetic studies covering the entire species distribution are necessary to better understand the biological implications of the results reported here for the management and conservation of South American sea lions.
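
    The two summary statistics quoted above are commonly computed with Nei's formulas: haplotype diversity as n/(n-1) times one minus the sum of squared haplotype frequencies, and nucleotide diversity as the mean proportion of differing sites over all sequence pairs. The sketch below assumes aligned, equal-length sequences without gaps; the function names and toy inputs are illustrative and are not the study's data.

```python
from itertools import combinations


def haplotype_diversity(counts):
    """Nei's unbiased haplotype (gene) diversity from haplotype counts."""
    n = sum(counts)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts))


def nucleotide_diversity(seqs):
    """Mean per-site pairwise difference (pi) over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    length = len(seqs[0])
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical toy inputs: two haplotypes seen twice each,
# and three aligned 4-bp sequences.
toy_h = haplotype_diversity([2, 2])
toy_pi = nucleotide_diversity(["AAAA", "AAAT", "AAAA"])
```

    Counting each individual's sequence separately in `nucleotide_diversity` weights haplotypes by their sample frequency, which is the usual convention.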

  6. Statistical inference an integrated approach

    CERN Document Server

    Migon, Helio S; Louzada, Francisco

    2014-01-01

    Introduction Information The concept of probability Assessing subjective probabilities An example Linear algebra and probability Notation Outline of the book Elements of Inference Common statistical models Likelihood-based functions Bayes theorem Exchangeability Sufficiency and exponential family Parameter elimination Prior Distribution Entirely subjective specification Specification through functional forms Conjugacy with the exponential family Non-informative priors Hierarchical priors Estimation Introduction to decision theory Bayesian point estimation Classical point estimation Empirical Bayes estimation Comparison of estimators Interval estimation Estimation in the Normal model Approximating Methods The general problem of inference Optimization techniques Asymptotic theory Other analytical approximations Numerical integration methods Simulation methods Hypothesis Testing Introduction Classical hypothesis testing Bayesian hypothesis testing Hypothesis testing and confidence intervals Asymptotic tests Prediction...

  7. Statistical learning and selective inference.

    Science.gov (United States)

    Taylor, Jonathan; Tibshirani, Robert J

    2015-06-23

    We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
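
    The "higher bar" after cherry-picking can be illustrated with a small simulation. The sketch below assumes m independent null associations with uniform p-values and uses the Šidák adjustment 1 - (1 - p)^m for the smallest one. This is a textbook correction chosen for illustration, not the selective-inference machinery developed in the paper, and all names and parameters are hypothetical.

```python
import random


def sidak_adjust(p_min, m):
    """P-value for the smallest of m independent null p-values."""
    return 1.0 - (1.0 - p_min) ** m


def false_positive_rate(m, alpha=0.05, trials=2000, adjust=False, seed=0):
    """Share of all-null experiments in which the best association is 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Cherry-pick the strongest (smallest p-value) of m null associations.
        p_min = min(rng.random() for _ in range(m))
        p = sidak_adjust(p_min, m) if adjust else p_min
        hits += p <= alpha
    return hits / trials


naive_rate = false_positive_rate(m=20)                # far above alpha
adjusted_rate = false_positive_rate(m=20, adjust=True)  # close to alpha
```

    Under these assumptions the naive rate approaches 1 - 0.95^20, while the adjusted rate stays near the nominal level.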

  8. Haplotype analysis of common variants in the BRCA1 gene and risk of sporadic breast cancer

    International Nuclear Information System (INIS)

    Cox, David G; Kraft, Peter; Hankinson, Susan E; Hunter, David J

    2005-01-01

    Truncation mutations in the BRCA1 gene cause a substantial increase in risk of breast cancer. However, these mutations are rare in the general population and account for little of the overall incidence of sporadic breast cancer. We used whole-gene resequencing data to select haplotype tagging single nucleotide polymorphisms, and examined the association between common haplotypes of BRCA1 and breast cancer in a nested case-control study in the Nurses' Health Study (1323 cases and 1910 controls). One haplotype was associated with a slight increase in risk (odds ratio 1.18, 95% confidence interval 1.02–1.37). A significant interaction (P = 0.05) was seen between this haplotype, positive family history of breast cancer, and breast cancer risk. Although not statistically significant, similar interactions were observed with age at diagnosis and with menopausal status at diagnosis; risk tended to be higher among younger, pre-menopausal women. We have described a haplotype in the BRCA1 gene that was associated with an approximately 20% increase in risk of sporadic breast cancer in the general population. However, the functional variant(s) responsible for the association are unclear.
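
    For readers unfamiliar with how an odds ratio and its confidence interval are obtained, the sketch below computes an unadjusted odds ratio with a Wald 95% interval from a 2x2 table. The counts are hypothetical, and the study itself most likely used regression models with covariates, so this shows the quantity reported rather than the authors' analysis.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (log scale)."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical table: carriers vs non-carriers among cases and controls.
toy_or, toy_lower, toy_upper = odds_ratio_ci(10, 20, 5, 40)
```

    An interval that excludes 1, as in the haplotype result quoted above, is what makes a modest odds ratio statistically distinguishable from no association.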

  9. BagReg: Protein inference through machine learning.

    Science.gov (United States)

    Zhao, Can; Liu, Dao; Teng, Ben; He, Zengyou

    2015-08-01

    Protein inference from the identified peptides is of primary importance in shotgun proteomics. The goal of protein inference is to determine whether each candidate protein is truly present in the sample. To date, many computational methods have been proposed to solve this problem. However, there is still no method that can fully utilize the information hidden in the input data. In this article, we propose a learning-based method named BagReg for protein inference. The method first extracts five features from the input data, and then chooses each feature in turn as the class feature to build separate models that predict the presence probabilities of proteins. Finally, the weak results from the five prediction models are aggregated to obtain the final result. We test our method on six publicly available data sets. The experimental results show that our method is superior to the state-of-the-art protein inference algorithms. Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. The interaction between coagulation factor 2 receptor and interleukin 6 haplotypes increases the risk of myocardial infarction in men.

    Directory of Open Access Journals (Sweden)

    Bruna Gigante

    Full Text Available The aim of the study was to investigate if the interaction between the coagulation factor 2 receptor (F2R) and the interleukin 6 (IL6) haplotypes modulates the risk of myocardial infarction (MI) in the Stockholm Heart Epidemiology Program (SHEEP). Seven SNPs at the F2R locus and three SNPs at the IL6 locus were genotyped. Haplotypes and haplotype pairs (IL6*F2R) were generated. A logistic regression analysis was performed to analyze the association of the haplotypes and haplotype pairs with the MI risk. Presence of an interaction between the two haplotypes in each haplotype pair was calculated using two different methods: the statistical, on a multiplicative scale, which includes the cross product of the two factors in the logistic regression model; and the biological, on an additive scale, which evaluates the relative risk associated with the joint presence of both factors. The ratio between the observed and the predicted effect of the joint exposure, the synergy index (S), indicates the presence of a synergy (S>1) or of an antagonism (S<1). None of the haplotypes within the two loci was associated with the risk of MI. Out of 22 different haplotype pairs, the haplotype pair 17 GGG*ADGTCCT was associated with an increased risk of MI with an OR (95% CI) of 1.58 (1.05-2.41) (p = 0.02) in the crude and an OR of 1.72 (1.11-2.67) (p = 0.01) in the adjusted analysis. We observed the presence of an interaction on a multiplicative scale with an OR (95% CI) of 2.24 (1.27-3.95) (p = 0.005) and a slight interactive effect between the two haplotypes on an additive scale with an OR (95% CI) of 1.56 (1.02-2.37) (p = 0.03) and S of 1.66 (0.89-31). In conclusion, our results support the hypothesis that the interaction between these two functionally related genes may influence the risk of MI and suggest new mechanisms involved in the genetic susceptibility to MI.
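
    The two interaction scales described above can be written down compactly. The sketch below assumes Rothman's usual definition of the synergy index, S = (RR11 - 1) / ((RR10 - 1) + (RR01 - 1)), with odds ratios standing in for relative risks, plus the multiplicative ratio RR11 / (RR10 * RR01). The input values are hypothetical and are not taken from the study.

```python
def synergy_index(rr_both, rr_first_only, rr_second_only):
    """Additive-scale interaction: observed over predicted excess risk."""
    predicted_excess = (rr_first_only - 1.0) + (rr_second_only - 1.0)
    if predicted_excess == 0:
        raise ValueError("synergy index undefined when neither factor alone adds risk")
    return (rr_both - 1.0) / predicted_excess


def multiplicative_interaction(rr_both, rr_first_only, rr_second_only):
    """Multiplicative-scale interaction: joint effect over product of single effects."""
    return rr_both / (rr_first_only * rr_second_only)


# Hypothetical relative risks for carrying both haplotypes, or only one.
toy_s = synergy_index(3.0, 1.5, 1.5)                    # above 1: synergy
toy_ratio = multiplicative_interaction(3.0, 1.5, 1.5)   # above 1: supra-multiplicative
```

    The two measures need not agree: a joint effect can exceed the additive prediction (S > 1) while still falling short of the product of the single effects.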

  11. Interrelationships between Amerindian tribes of lower Amazonia as manifest by HLA haplotype disequilibria.

    OpenAIRE

    Black, F L

    1984-01-01

    HLA B-C haplotypes exhibit common disequilibria in populations drawn from four continents, indicating that they are subject to broadly active selective forces. However, the A-B and A-C associations we have examined show no consistent disequilibrium pattern, leaving open the possibility that these disequilibria are due to descent from common progenitors. By examining HLA haplotype distributions, I have explored the implications that would follow from the hypothesis that biological selection pl...

  12. Prognostic importance of VEGF-A haplotype combinations in a stage II colon cancer population

    DEFF Research Database (Denmark)

    Kjaer-Frifeldt, Sanne; Fredslund, Rikke; Lindebjerg, Jan

    2012-01-01

    To investigate the prognostic effect of three VEGF-A SNPs, -2578, -460 and 405, as well as the corresponding haplotype combinations, in a unique population of stage II colon cancer patients.

  13. Inferring Social Influence of Anti-Tobacco Mass Media Campaign.

    Science.gov (United States)

    Zhan, Qianyi; Zhang, Jiawei; Yu, Philip S; Emery, Sherry; Xie, Junyuan

    2017-07-01

    Anti-tobacco mass media campaigns are designed to influence tobacco users. Such campaigns have been shown to change users' awareness, knowledge, and attitudes, and to produce meaningful behavior change in the audience. Anti-smoking television advertising is the most important part of these campaigns. Meanwhile, successful online social networks are creating a new media environment; however, little is known about the relation between social conversations and anti-tobacco campaigns. This paper aims to infer the social influence of these campaigns, and the problem is formally referred to as the Social Influence inference of anti-Tobacco mass mEdia campaigns (Site) problem. To address the Site problem, a novel influence inference framework, TV advertising social influence estimation (Asie), is proposed based on our analysis of two real anti-tobacco campaigns. Asie divides audience attitudes toward TV ads into three distinct stages: 1) cognitive; 2) affective; and 3) conative. Audience online reactions at each of these three stages are depicted by Asie with specific probabilistic models based on the synergistic influences from both online social friends and offline TV ads. Extensive experiments demonstrate the effectiveness of Asie.

  14. Canis mtDNA HV1 database: a web-based tool for collecting and surveying Canis mtDNA HV1 haplotype in public database.

    Science.gov (United States)

    Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung

    2017-06-26

    Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from GenBank to screen and correct the inconsistencies. It also supports users in detecting novel mutation profiles and assigning new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or no haplotype information and were corrected or modified before being stored in the CHD. The CHD can be accessed at http://chd.vnbiology.com. It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detecting inconsistencies in GenBank and helps identify new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and in reconciling existing annotation of HV1 582 bp sequences.

  15. Haplotypes in the dystrophin DNA segment point to a mosaic origin of modern human diversity.

    Science.gov (United States)

    Zietkiewicz, Ewa; Yotova, Vania; Gehl, Dominik; Wambach, Tina; Arrieta, Isabel; Batzer, Mark; Cole, David E C; Hechtman, Peter; Kaplan, Feige; Modiano, David; Moisan, Jean-Paul; Michalski, Roman; Labuda, Damian

    2003-11-01

    Although Africa has played a central role in human evolutionary history, certain studies have suggested that not all contemporary human genetic diversity is of recent African origin. We investigated 35 simple polymorphic sites and one T(n) microsatellite in an 8-kb segment of the dystrophin gene. We found 86 haplotypes in 1,343 chromosomes from around the world. Although a classical out-of-Africa topology was observed in trees based on the variant frequencies, the tree of haplotype sequences reveals three lineages accounting for present-day diversity. The proportion of new recombinants and the diversity of the T(n) microsatellite were used to estimate the age of haplotype lineages and the time of colonization events. The lineage that underwent the great expansion originated in Africa prior to the Upper Paleolithic (27,000-56,000 years ago). A second group, of structurally distinct haplotypes that occupy a central position on the tree, has never left Africa. The third lineage is represented by the haplotype that lies closest to the root, is virtually absent in Africa, and appears older than the recent out-of-Africa expansion. We propose that this lineage could have left Africa before the expansion (as early as 160,000 years ago) and admixed, outside of Africa, with the expanding lineage. Contemporary human diversity, although dominated by the recently expanded African lineage, thus represents a mosaic of different contributions.

  16. Novel harmful recessive haplotypes identified for fertility traits in Nordic Holstein cattle

    DEFF Research Database (Denmark)

    Sahana, Goutam; Nielsen, Ulrik Sander; Aamand, Gert Pedersen

    2013-01-01

    harboring possible recessive lethal alleles. Effects of the identified haplotypes were estimated on two fertility traits: non-return rates and calving interval. Out of the eight identified genomic regions, six regions were confirmed as having an effect on fertility. The information can be used to avoid......Using genomic data, lethal recessives may be discovered from haplotypes that are common in the population but never occur in the homozygote state in live animals. This approach only requires genotype data from phenotypically normal (i.e. live) individuals and not from the affected embryos that die...
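
    The missing-homozygote idea mentioned in this record (haplotypes that are common but never seen in the homozygous state in live animals) can be illustrated with a back-of-the-envelope calculation. The sketch assumes random mating and Hardy-Weinberg proportions, which is a simplification; practical screens typically derive expectations from pedigree matings. The function names and numbers are hypothetical.

```python
def expected_homozygotes(n_animals, haplotype_freq):
    """Expected homozygous carriers among n animals under Hardy-Weinberg."""
    return n_animals * haplotype_freq ** 2


def prob_no_homozygotes(n_animals, haplotype_freq):
    """Chance of seeing zero homozygotes if the haplotype were harmless."""
    return (1.0 - haplotype_freq ** 2) ** n_animals


# Hypothetical screen: 10,000 genotyped live animals, haplotype frequency 5%.
toy_expected = expected_homozygotes(10_000, 0.05)
toy_p_zero = prob_no_homozygotes(10_000, 0.05)
```

    With about 25 homozygotes expected, observing none would be very unlikely by chance, which is the kind of deficit that flags a candidate lethal recessive.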

  17. Reliable reconstruction of HIV-1 whole genome haplotypes reveals clonal interference and genetic hitchhiking among immune escape variants

    Science.gov (United States)

    2014-01-01

    Background Following transmission, HIV-1 evolves into a diverse population, and next generation sequencing enables us to detect variants occurring at low frequencies. Studying viral evolution at the level of whole genomes was hitherto not possible because next generation sequencing delivers relatively short reads. Results We here provide a proof of principle that whole HIV-1 genomes can be reliably reconstructed from short reads, and use this to study the selection of immune escape mutations at the level of whole genome haplotypes. Using realistically simulated HIV-1 populations, we demonstrate that reconstruction of complete genome haplotypes is feasible with high fidelity. We do not reconstruct all genetically distinct genomes, but each reconstructed haplotype represents one or more of the quasispecies in the HIV-1 population. We then reconstruct 30 whole genome haplotypes from published short sequence reads sampled longitudinally from a single HIV-1 infected patient. We confirm the reliability of the reconstruction by validating our predicted haplotype genes with single genome amplification sequences, and by comparing haplotype frequencies with observed epitope escape frequencies. Conclusions Phylogenetic analysis shows that the HIV-1 population undergoes selection driven evolution, with successive replacement of the viral population by novel dominant strains. We demonstrate that immune escape mutants evolve in a dependent manner with various mutations hitchhiking along with others. As a consequence of this clonal interference, selection coefficients have to be estimated for complete haplotypes and not for individual immune escapes. PMID:24996694

  18. Inheritance of the 8.1 ancestral haplotype in recurrent pregnancy loss

    DEFF Research Database (Denmark)

    Kolte, Astrid M; Nielsen, Henriette S; Steffensen, Rudi

    2015-01-01

    . The objective was to test the gestational drive theory for the 8.1AH in women with RPL and their live born children. METHODOLOGY: We investigated the inheritance of the 8.1AH from 82 heterozygous RPL women to 110 live born children. All participants were genotyped for HLA-A, -B and -DRB1 in DNA from EDTA...... pleiotropy. It has also been proposed that the survival of long, conserved haplotypes may be due to gestational drive, i.e. selective miscarriage of fetuses who have not inherited the haplotype from a heterozygous mother. Recurrent pregnancy loss (RPL) is defined as three or more consecutive pregnancy losses...

  19. Inference in partially identified models with many moment inequalities using Lasso

    DEFF Research Database (Denmark)

    Bugni, Federico A.; Caner, Mehmet; Kock, Anders Bredahl

    This paper considers the problem of inference in a partially identified moment (in)equality model with possibly many moment inequalities. Our contribution is to propose a novel two-step inference method based on the combination of two ideas. On the one hand, our test statistic and critical...

  20. Contrasted patterns of molecular evolution in dominant and recessive self-incompatibility haplotypes in Arabidopsis.

    Directory of Open Access Journals (Sweden)

    Pauline M Goubet

    Full Text Available Self-incompatibility has been considered by geneticists a model system for reproductive biology and balancing selection, but our understanding of the genetic basis and evolution of this molecular lock-and-key system has remained limited by the extreme level of sequence divergence among haplotypes, resulting in a lack of appropriate genomic sequences. In this study, we report and analyze the full sequence of eleven distinct haplotypes of the self-incompatibility locus (S-locus) in two closely related Arabidopsis species, obtained from individual BAC libraries. We use this extensive dataset to highlight sharply contrasted patterns of molecular evolution of each of the two genes controlling self-incompatibility themselves, as well as of the genomic region surrounding them. We find strong collinearity of the flanking regions among haplotypes on each side of the S-locus together with high levels of sequence similarity. In contrast, the S-locus region itself shows spectacularly deep gene genealogies, high variability in size and gene organization, as well as complete absence of sequence similarity in intergenic sequences and striking accumulation of transposable elements. Of particular interest, we demonstrate that dominant and recessive S-haplotypes experience sharply contrasted patterns of molecular evolution. Indeed, dominant haplotypes exhibit larger size and a much higher density of transposable elements, being matched only by that in the centromere. Overall, these properties highlight that the S-locus presents many striking similarities with other regions involved in the determination of mating-types, such as sex chromosomes in animals or in plants, or the mating-type locus in fungi and green algae.

  1. The IGF1 small dog haplotype is derived from Middle Eastern grey wolves

    Directory of Open Access Journals (Sweden)

    Ostrander Elaine A

    2010-02-01

    Full Text Available Abstract Background A selective sweep containing the insulin-like growth factor 1 (IGF1) gene is associated with size variation in domestic dogs. Intron 2 of IGF1 contains a SINE element and single nucleotide polymorphism (SNP) found in all small dog breeds that is almost entirely absent from large breeds. In this study, we surveyed a large sample of grey wolf populations to better understand the ancestral pattern of variation at IGF1 with a particular focus on the distribution of the small dog haplotype and its relationship to the origin of the dog. Results We present DNA sequence data that confirms the absence of the derived small SNP allele in the intron 2 region of IGF1 in a large sample of grey wolves and further establishes the absence of a small dog associated SINE element in all wild canids and most large dog breeds. Grey wolf haplotypes from the Middle East have higher nucleotide diversity suggesting an origin there. Additionally, PCA and phylogenetic analyses suggest a closer kinship of the small domestic dog IGF1 haplotype with those from Middle Eastern grey wolves. Conclusions The absence of both the SINE element and SNP allele in grey wolves suggests that the mutation for small body size post-dates the domestication of dogs. However, because all small dogs possess these diagnostic mutations, the mutations likely arose early in the history of domestic dogs. Our results show that the small dog haplotype is closely related to those in Middle Eastern wolves and is consistent with an ancient origin of the small dog haplotype there. Thus, in concordance with past archeological studies, our molecular analysis is consistent with the early evolution of small size in dogs from the Middle East. See associated opinion by Driscoll and Macdonald: http://jbiol.com/content/9/2/10

  2. The clinical application of single-sperm-based SNP haplotyping for PGD of osteogenesis imperfecta.

    Science.gov (United States)

    Chen, Linjun; Diao, Zhenyu; Xu, Zhipeng; Zhou, Jianjun; Yan, Guijun; Sun, Haixiang

    2018-05-15

    Osteogenesis imperfecta (OI) is a genetically heterogeneous disorder, presenting either autosomal dominant, autosomal recessive or X-linked inheritance patterns. The majority of OI cases are autosomal dominant and are caused by heterozygous mutations in either the COL1A1 or COL1A2 gene. In these dominant disorders, allele dropout (ADO) can lead to misdiagnosis in preimplantation genetic diagnosis (PGD). Polymorphic markers linked to the mutated genes have been used to establish haplotypes for identifying ADO and ensuring the accuracy of PGD. However, the haplotype of male patients cannot be determined without data from affected relatives. Here, we developed a method for single-sperm-based single-nucleotide polymorphism (SNP) haplotyping via next-generation sequencing (NGS) for the PGD of OI. After NGS, 10 informative polymorphic SNP markers located upstream and downstream of the COL1A1 gene and its pathogenic mutation site were linked to individual alleles in a single sperm from an affected male. After haplotyping, a normal blastocyst was transferred to the uterus for a subsequent frozen embryo transfer cycle. The accuracy of PGD was confirmed by amniocentesis at 19 weeks of gestation. A healthy infant weighing 4,250 g was born via vaginal delivery at the 40th week of gestation. Single-sperm-based SNP haplotyping can be applied for PGD of any monogenic disorders or de novo mutations in males in whom the haplotype of paternal mutations cannot be determined due to a lack of affected relatives. ADO: allele dropout; DI: dentinogenesis imperfecta; ESHRE: European Society of Human Reproduction and Embryology; FET: frozen embryo transfer; gDNA: genomic DNA; ICSI: intracytoplasmic sperm injection; IVF: in vitro fertilization; MDA: multiple displacement amplification; NGS: next-generation sequencing; OI: osteogenesis imperfecta; PBS: phosphate-buffered saline; PCR: polymerase chain reaction; PGD: preimplantation genetic diagnosis; SNP: single-nucleotide polymorphism; STR

  3. Association of specific haplotype of TNFα with Helicobacter pylori ...

    Indian Academy of Sciences (India)

    Association of specific haplotype of TNFα with Helicobacter pylori-mediated duodenal ulcer in eastern Indian population. Meenakshi Chakravorty, Dipanjana Datta De, Abhijit Choudhury, Amal Santra, Susanta Roychoudhury. Research Note, Journal of Genetics, Volume 87, Issue 3 ...

  4. An unusual haplotype structure on human chromosome 8p23 derived from the inversion polymorphism.

    Science.gov (United States)

    Deng, Libin; Zhang, Yuezheng; Kang, Jian; Liu, Tao; Zhao, Hongbin; Gao, Yang; Li, Chaohua; Pan, Hao; Tang, Xiaoli; Wang, Dunmei; Niu, Tianhua; Yang, Huanming; Zeng, Changqing

    2008-10-01

    Chromosomal inversion is an important type of genomic variation involved in both evolution and disease pathogenesis. Here, we describe the refined genetic structure of a 3.8-Mb inversion polymorphism at chromosome 8p23. Using HapMap data of 1,073 SNPs generated from 209 unrelated samples from CEPH-Utah residents with ancestry from northern and western Europe (CEU); Yoruba in Ibadan, Nigeria (YRI); and Asian (ASN) samples, which were comprised of Han Chinese from Beijing, China (CHB) and Japanese from Tokyo, Japan (JPT), we successfully deduced the inversion orientations of all their 418 haplotypes. In particular, distinct haplotype subgroups were identified based on principal component analysis (PCA). Such genetic substructures were consistent with clustering patterns based on neighbor-joining tree reconstruction, which revealed a total of four haplotype clades across all samples. Metaphase fluorescence in situ hybridization (FISH) in a subset of 10 HapMap samples verified their inversion orientations predicted by PCA or phylogenetic tree reconstruction. Positioning of the outgroup haplotype within one of the YRI clades suggested that Human NCBI Build 36-inverted order is most likely the ancestral orientation. Furthermore, the population differentiation test and the relative extended haplotype homozygosity (REHH) analysis in this region discovered multiple selection signals, also in a population-specific manner. A positive selection signal was detected at XKR6 in the ASN population. These results revealed the correlation of inversion polymorphisms to population-specific genetic structures, and various selection patterns as possible mechanisms for the maintenance of a large chromosomal rearrangement at 8p23 region during evolution. In addition, our study also showed that haplotype-based clustering methods, such as PCA, can be applied in scanning for cryptic inversion polymorphisms at a genome-wide scale.
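
    The extended haplotype homozygosity (EHH) statistic that underlies the REHH analysis mentioned in this record can be illustrated with a minimal sketch. It is not the study's pipeline: the function name `ehh`, the string encoding of haplotypes, and the toy sample are assumptions made for illustration. EHH is taken here as the probability that two randomly chosen chromosomes carrying the same core haplotype are identical over the whole interval from the core out to a given marker; REHH is usually described as the ratio of this value to the EHH of the other core haplotypes combined.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, extent):
    """Extended haplotype homozygosity for one core haplotype.

    haplotypes: list of equal-length allele strings (hypothetical encoding),
                with the core markers occupying the leading positions.
    core:       the core haplotype, e.g. "01".
    extent:     number of leading markers (core included) to extend over.
    """
    # Keep only chromosomes carrying the core, truncated to the extent.
    carriers = [h[:extent] for h in haplotypes if h.startswith(core)]
    c = len(carriers)
    if c < 2:
        raise ValueError("need at least two chromosomes carrying the core")
    # Pairs of identical extended haplotypes over all pairs of carriers.
    same_pairs = sum(comb(k, 2) for k in Counter(carriers).values())
    return same_pairs / comb(c, 2)


# Toy sample: four chromosomes carry core "01", two carry core "11".
sample = ["0101", "0101", "0101", "0110", "1111", "1100"]
ehh_at_core = ehh(sample, "01", 2)  # all carriers identical at the core itself
ehh_extended = ehh(sample, "01", 4)  # one carrier diverges further out
```

    In this toy sample the statistic starts at 1 at the core and decays as the extent grows; a slow decay on a common core haplotype is the kind of pattern read as a possible selection signal.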

  5. Fused Regression for Multi-source Gene Regulatory Network Inference.

    Directory of Open Access Journals (Sweden)

    Kari Y Lam

    2016-12-01

    Full Text Available Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms) and single cell types. We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships such as orthology incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross species network inference. Last, we demonstrate our method's utility in learning from data collected on different experimental platforms.

  6. Cycle-Based Cluster Variational Method for Direct and Inverse Inference

    Science.gov (United States)

    Furtlehner, Cyril; Decelle, Aurélien

    2016-08-01

    Large scale inference problems of practical interest can often be addressed with the help of Markov random fields. This requires solving, in principle, two related problems: the first one is to find offline the parameters of the MRF from empirical data (inverse problem); the second one (direct problem) is to set up the inference algorithm to make it as precise, robust and efficient as possible. In this work we address both the direct and inverse problem with mean-field methods of statistical physics, going beyond the Bethe approximation and associated belief propagation algorithm. We elaborate on the idea that loop corrections to belief propagation can be dealt with in a systematic way on pairwise Markov random fields, by using the elements of a cycle basis to define regions in a generalized belief propagation setting. For the direct problem, the region graph is specified in such a way as to avoid feed-back loops as much as possible by selecting a minimal cycle basis. Following this line we are led to propose a two-level algorithm, where a belief propagation algorithm is run alternately at the level of each cycle and at the inter-region level. Next we observe that the inverse problem can be addressed region by region independently, with one small inverse problem per region to be solved. It turns out that each elementary inverse problem on the loop geometry can be solved efficiently. In particular in the random Ising context we propose two complementary methods based respectively on fixed point equations and on a one-parameter log likelihood function minimization. Numerical experiments confirm the effectiveness of this approach both for the direct and inverse MRF inference. Heterogeneous problems of size up to 10^5 are addressed in a reasonable computational time, notably with better convergence properties than ordinary belief propagation.

  7. Class I gene regulation of haplotype preference may influence antiviral immunity in vivo

    DEFF Research Database (Denmark)

    Thomsen, Allan Randrup; Marker, O

    1989-01-01

    targets. In regard to the in vivo significance of haplotype preference it was found that (C X C3) F1 mice expressed an earlier and stronger virus-specific delayed type hypersensitivity response and exerted a more efficient virus control than did (C-H-2dm2 X C3) F1. Taken together these findings suggest...... that haplotype preference reflects a selection process favoring the restriction element associated with the most efficient immune response in vivo. The implications of this are discussed....

  8. Cortical hierarchies perform Bayesian causal inference in multisensory perception.

    Directory of Open Access Journals (Sweden)

    Tim Rohe

    2015-02-01

    Full Text Available To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the "causal inference problem." Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI), and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation). At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion). Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world.

  9. Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems

    KAUST Repository

    Contreras, Andres A.; Le Maître, Olivier P.; Aquino, Wilkins; Knio, Omar

    2016-01-01

    of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacements field with the random elastic

  10. Sign Inference for Dynamic Signed Networks via Dictionary Learning

    Directory of Open Access Journals (Sweden)

    Yi Cen

    2013-01-01

    Full Text Available Mobile online social network (mOSN) is a burgeoning research area. However, most existing works referring to mOSNs deal with static network structures and simply encode whether relationships among entities exist or not. In contrast, relationships in signed mOSNs can be positive or negative and may change with time and location. Applying certain global characteristics of social balance, in this paper, we aim to infer the unknown relationships in dynamic signed mOSNs and formulate this sign inference problem as a low-rank matrix estimation problem. Specifically, motivated by the Singular Value Thresholding (SVT) algorithm, a compact dictionary is selected from the observed dataset. Based on this compact dictionary, the relationships in the dynamic signed mOSNs are estimated via solving the formulated problem. Furthermore, the estimation accuracy is improved by employing a dictionary self-updating mechanism.

  11. SNP frequency, haplotype structure and linkage disequilibrium in elite maize inbred lines

    Directory of Open Access Journals (Sweden)

    Smith Oscar

    2002-10-01

    Full Text Available Abstract Background Recent studies of ancestral maize populations indicate that linkage disequilibrium tends to dissipate rapidly, sometimes within 100 bp. We set out to examine the linkage disequilibrium and diversity in maize elite inbred lines, which have been subject to population bottlenecks and intense selection by breeders. Such population events are expected to increase the amount of linkage disequilibrium, but reduce diversity. The results of this study will inform the design of genetic association studies. Results We examined the frequency and distribution of DNA polymorphisms at 18 maize genes in 36 maize inbreds, chosen to represent most of the genetic diversity in the U.S. elite maize breeding pool. The frequency of nucleotide changes is high, on average one polymorphism per 31 bp in non-coding regions and 1 polymorphism per 124 bp in coding regions. Insertions and deletions are frequent in non-coding regions (1 per 85 bp), but rare in coding regions. A small number (2–8) of distinct and highly diverse haplotypes can be distinguished at all loci examined. Within genes, SNP loci comprising the haplotypes are in linkage disequilibrium with each other. Conclusions No decline of linkage disequilibrium within a few hundred base pairs was found in the elite maize germplasm. This finding, as well as the small number of haplotypes, relative to neutral expectation, is consistent with the effects of breeding-induced bottlenecks and selection on the elite germplasm pool. The genetic distance between haplotypes is large, indicative of an ancient gene pool and of possible interspecific hybridization events in maize ancestry.
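
    Pairwise linkage disequilibrium of the kind reported in this record is commonly summarised by the coefficient D and its normalised form r². The following is a minimal sketch under stated assumptions, not the authors' analysis: the function name `linkage_disequilibrium`, the '0'/'1' allele coding, and the two toy samples of 36 haplotypes are hypothetical. Because inbred lines are essentially homozygous, each line is treated as contributing one directly observed two-SNP haplotype.

```python
from collections import Counter


def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for a pair of biallelic SNPs.

    haplotypes: list of 2-character strings, one per inbred line; each
                character is the allele ('0' or '1') at one of the two SNPs.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Frequencies of allele '1' at the first and second SNP.
    p_a = sum(c for h, c in counts.items() if h[0] == "1") / n
    p_b = sum(c for h, c in counts.items() if h[1] == "1") / n
    # Frequency of the haplotype carrying allele '1' at both SNPs.
    p_ab = counts["11"] / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d, d * d / denom


# Toy sample 1: only two haplotypes segregate, so the SNPs are fully linked.
complete = ["11"] * 18 + ["00"] * 18
d_full, r2_full = linkage_disequilibrium(complete)

# Toy sample 2: all four haplotypes are equally frequent, so there is no LD.
independent = ["11"] * 9 + ["10"] * 9 + ["01"] * 9 + ["00"] * 9
d_none, r2_none = linkage_disequilibrium(independent)
```

    In the first toy sample r² equals 1, and in the second it equals 0. A study like the one above would track how such a measure changes with the physical distance between SNP pairs.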

  12. Molecular characterization of a long range haplotype affecting protein yield and mastitis susceptibility in Norwegian Red cattle

    Directory of Open Access Journals (Sweden)

    Hayes Ben J

    2011-08-01

    Full Text Available Abstract Background Previous fine mapping studies in Norwegian Red cattle (NRC) in the region 86-90.4 Mb on Bos taurus chromosome 6 (BTA6) have revealed a quantitative trait locus (QTL) for protein yield (PY) around 88 Mb and a QTL for clinical mastitis (CM) around 90 Mb. The close proximity of these QTLs may partly explain the unfavorable genetic correlation between these two traits in NRC. A long range haplotype covering this region was introduced into the NRC population through the importation of a Holstein-Friesian bull (1606 Frasse) from Sweden in the 1970s. It has been suggested that this haplotype has a favorable effect on milk protein content but an unfavorable effect on mastitis susceptibility. Selective breeding for milk production traits is likely to have increased the frequency of this haplotype in the NRC population. Results Association mapping for PY and CM in NRC was performed using genotypes from 556 SNPs throughout the region 86-97 Mb on BTA6 and daughter-yield-deviations (DYDs) from 2601 bulls made available from the Norwegian dairy herd recording system. Highest test scores for PY were found for single-nucleotide polymorphisms (SNPs) within and surrounding the genes CSN2 and CSN1S2, coding for the β-casein and αS2-casein proteins. High coverage re-sequencing by high throughput sequencing technology enabled molecular characterization of a long range haplotype from 1606 Frasse encompassing these two genes. Haplotype analysis of a large number of descendants from this bull indicated that the haplotype was not markedly disrupted by recombination in this region. The haplotype was associated with both increased milk protein content and increased susceptibility to mastitis, which might explain parts of the observed genetic correlation between PY and CM in NRC. Plausible causal polymorphisms affecting PY were detected in the promoter region and in the 5'-flanking UTR of CSN1S2. These polymorphisms could affect transcription or translation of

  13. Evidence that the ancestral haplotype in Australian hemochromatosis patients may be associated with a common mutation in the gene

    Energy Technology Data Exchange (ETDEWEB)

    Crawford, D.H.G.; Powell, L.W.; Leggett, B.A. [Univ. of Queensland (Australia)] [and others]

    1995-08-01

    Hemochromatosis (HC) is a common inherited disorder of iron metabolism for which neither the gene nor biochemical defect have yet been identified. The aim of this study was to look for clinical evidence that the predominant ancestral haplotype in Australian patients is associated with a common mutation in the gene. We compared indices of iron metabolism and storage in three groups of HC patients categorized according to the presence of the ancestral haplotype (i.e., patients with two copies, one copy, and no copies of the ancestral haplotype). We also examined iron indices in two groups of HC heterozygotes (those with the ancestral haplotype and those without) and in age-matched controls. These analyses indicate that (i) HC patients with two copies of the ancestral haplotype show significantly more severe expression of the disorder than those with one copy or those without, (ii) HC heterozygotes have partial clinical expression, which may be influenced by the presence of the ancestral haplotype in females but not in males, and (iii) the high population frequency of the HC gene may be the result of the selective advantage conferred by protecting heterozygotes against iron deficiency. 18 refs., 3 tabs.

  14. Global spread and genetic variants of the two CYP9M10 haplotype forms associated with insecticide resistance in Culex quinquefasciatus Say.

    Science.gov (United States)

    Itokawa, K; Komagata, O; Kasai, S; Kawada, H; Mwatele, C; Dida, G O; Njenga, S M; Mwandawiro, C; Tomita, T

    2013-09-01

    Insecticide resistance develops as a genetic factor (allele) conferring lower susceptibility to insecticides proliferates within a target insect population under strong positive selection. Intriguingly, a resistance allele pre-existing in a population often bears a series of further adaptive allelic variants through new mutations. This phenomenon occasionally results in replacement of the predominating resistance allele by fitter new derivatives, and consequently, development of greater resistance at the population level. The overexpression of the cytochrome P450 gene CYP9M10 is associated with pyrethroid resistance in the southern house mosquito Culex quinquefasciatus. Previously, we have found two genealogically related overexpressing CYP9M10 haplotypes, which differ in gene copy number (duplicated and non-duplicated). The duplicated haplotype was derived from the non-duplicated overproducer probably recently. In the present study, we investigated allelic series of CYP9M10 involved in three C. quinquefasciatus laboratory colonies recently collected from three different localities. Duplicated and non-duplicated overproducing haplotypes coexisted in African and Asian colonies indicating a global distribution of both haplotype lineages. The duplicated haplotypes both in the Asian and African colonies were associated with higher expression levels and stronger resistance than non-duplicated overproducing haplotypes. There was slight variation in expression level among the non-duplicated overproducing haplotypes. The nucleotide sequences in coding and upstream regions among members of this group also showed a little diversity. Non-duplicated overproducing haplotypes with relatively higher expression were genealogically closer to the duplicated haplotypes than the other non-duplicated overproducing haplotypes, suggesting multiple cis-acting mutations before duplication.

  15. Mapping the genetic diversity of HLA haplotypes in the Japanese populations

    Science.gov (United States)

    Saw, Woei-Yuh; Liu, Xuanyao; Khor, Chiea-Chuen; Takeuchi, Fumihiko; Katsuya, Tomohiro; Kimura, Ryosuke; Nabika, Toru; Ohkubo, Takayoshi; Tabara, Yasuharu; Yamamoto, Ken; Yokota, Mitsuhiro; Akiyama, Koichi; Asano, Hiroyuki; Asayama, Kei; Haga, Toshikazu; Hara, Azusa; Hirose, Takuo; Hosaka, Miki; Ichihara, Sahoko; Imai, Yutaka; Inoue, Ryusuke; Ishiguro, Aya; Isomura, Minoru; Isono, Masato; Kamide, Kei; Kato, Norihiro; Katsuya, Tomohiro; Kikuya, Masahiro; Kohara, Katsuhiko; Matsubara, Tatsuaki; Matsuda, Ayako; Metoki, Hirohito; Miki, Tetsuro; Murakami, Keiko; Nabika, Toru; Nakatochi, Masahiro; Ogihara, Toshio; Ohnaka, Keizo; Ohkubo, Takayoshi; Rakugi, Hiromi; Satoh, Michihiro; Shiwaku, Kunihiro; Sugimoto, Ken; Tabara, Yasuharu; Takami, Yoichi; Takayanagi, Ryoichi; Takeuchi, Fumihiko; Tsubota-Utsugi, Megumi; Yamamoto, Ken; Yamamoto, Koichi; Yamasaki, Masayuki; Yasui, Daisaku; Yokota, Mitsuhiro; Teo, Yik-Ying; Kato, Norihiro

    2015-01-01

    Japan has often been viewed as an Asian country that possesses a genetically homogenous community. The basis for partitioning the country into prefectures has largely been geographical, although cultural and linguistic differences still exist between some of the districts/prefectures, especially between Okinawa and the mainland prefectures. The Major Histocompatibility Complex (MHC) region has consistently emerged as the most polymorphic region in the human genome, harbouring numerous biologically important variants; nevertheless the presence of population-specific long haplotypes hinders the imputation of SNPs and classical HLA alleles. Here, we examined the extent of genetic variation at the MHC between eight Japanese populations sampled from Okinawa, and six other prefectures located in or close to the mainland of Japan, specifically focusing on the haplotypes observed within each population, and what the impact of any variation has on imputation. Our results indicated that Okinawa was genetically farther from the mainland Japanese than were Gujarati Indians from Tamil Indians, while the mainland Japanese from six prefectures were more homogeneous than between northern and southern Han Chinese. The distribution of haplotypes across Japan was similar, although imputation was most accurate for Okinawa and several mainland prefectures when population-specific panels were used as reference. PMID:26648100

  16. Improved Inference of Heteroscedastic Fixed Effects Models

    Directory of Open Access Journals (Sweden)

    Afshan Saeed

    2016-12-01

    Full Text Available Heteroscedasticity is a serious problem that distorts estimation and testing of the panel data model (PDM). Arellano (1987) proposed the White (1980) estimator for PDM with heteroscedastic errors but it provides erroneous inference for the data sets including high leverage points. In this paper, our attempt is to improve the heteroscedasticity-consistent covariance matrix estimator (HCCME) for panel datasets with high leverage points. To draw robust inference for the PDM, our focus is to improve kernel bootstrap estimators, proposed by Racine and MacKinnon (2007). The Monte Carlo scheme is used for assertion of the results.

  17. Intercoalescence time distribution of incomplete gene genealogies in temporally varying populations, and applications in population genetic inference.

    Science.gov (United States)

    Chen, Hua

    2013-03-01

    Tracing back to a specific time T in the past, the genealogy of a sample of haplotypes may not have reached their common ancestor and may leave m lineages extant. For such an incomplete genealogy truncated at a specific time T in the past, the distribution and expectation of the intercoalescence times conditional on T are derived in an exact form in this paper for populations of deterministically time-varying sizes, specifically, for populations growing exponentially. The derived intercoalescence time distribution can be integrated into the coalescent-based joint allele frequency spectrum (JAFS) theory, and is useful for population genetic inference from large-scale genomic data, without relying on computationally intensive approaches, such as importance sampling and Markov Chain Monte Carlo (MCMC) methods. The inference of several important parameters relying on this derived conditional distribution is demonstrated: quantifying population growth rate and onset time, and estimating the number of ancestral lineages at a specific ancient time. Simulation studies confirm the validity of the derivation and the statistical efficiency of the methods using the derived intercoalescence time distribution. Two examples of real data are given to show the inference of the population growth rate of a European sample from the NIEHS Environmental Genome Project, and the number of ancient lineages of 31 mitochondrial genomes from Tibetan populations. © 2013 Blackwell Publishing Ltd/University College London.

  18. Inference and the Introductory Statistics Course

    Science.gov (United States)

    Pfannkuch, Maxine; Regan, Matt; Wild, Chris; Budgett, Stephanie; Forbes, Sharleen; Harraway, John; Parsonage, Ross

    2011-01-01

    This article sets out some of the rationale and arguments for making major changes to the teaching and learning of statistical inference in introductory courses at our universities by changing from a norm-based, mathematical approach to more conceptually accessible computer-based approaches. The core problem of the inferential argument with its…

  19. Influence of βS-Globin Haplotypes and Hydroxyurea on Arginase I Levels in Sickle Cell Disease

    Directory of Open Access Journals (Sweden)

    J. A. Moreira

    2016-01-01

    Full Text Available Introduction. Sickle cell disease (SCD) is characterized by hemoglobin S homozygosity, leading to hemolysis and vasoocclusion. The hemolysis releases arginase I, an enzyme that decreases the bioavailability of nitric oxide, worsening the symptoms. The different SCD haplotypes are related to clinical symptoms and varied hemoglobin F (HbF) concentration. The aim of this study was to evaluate the impact of the βS gene haplotypes and HbF concentration on arginase I levels in SCD patients. Methods. Fifty SCD adult patients were enrolled in the study and 20 blood donors composed the control group. Arginase I was measured by ELISA. The βS haplotypes were identified by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). Statistical analyses were performed with GraphPad Prism program and the significance level was p<0.05. Results. A significant increase was observed in the arginase I levels in SCD patients compared to the control group (p<0.0001). The comparison between the levels of arginase I in three haplotype groups showed a difference between the Bantu/Bantu and Bantu/Benin groups and between the Bantu/Bantu and Benin/Benin groups, independent of HU dosage. An inverse correlation between the arginase I levels and HbF concentration was observed. Conclusion. The results support the hypothesis that arginase I is associated with HbF concentration, also measured indirectly by the association with haplotypes.

  20. WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads

    NARCIS (Netherlands)

    M.D. Patterson (Murray); T. Marschall (Tobias); N. Pisanti (Nadia); L.J.J. van Iersel (Leo); L. Stougie (Leen); G.W. Klau (Gunnar); A. Schönhuth (Alexander)

    2014-01-01

    The human genome is diploid, that is, each of its chromosomes comes in two copies. This requires phasing the single nucleotide polymorphisms (SNPs), that is, assigning them to the two copies, beyond just detecting them. The resulting haplotypes, lists of SNPs belonging to each copy, are

  1. Haplotype mapping of a diploid non-meiotic organism using existing and induced aneuploidies.

    Directory of Open Access Journals (Sweden)

    Melanie Legrand

    2008-01-01

    Full Text Available Haplotype maps (HapMaps) reveal underlying sequence variation and facilitate the study of recombination and genetic diversity. In general, HapMaps are produced by analysis of Single-Nucleotide Polymorphism (SNP) segregation in large numbers of meiotic progeny. Candida albicans, the most common human fungal pathogen, is an obligate diploid that does not appear to undergo meiosis. Thus, standard methods for haplotype mapping cannot be used. We exploited naturally occurring aneuploid strains to determine the haplotypes of the eight chromosome pairs in the C. albicans laboratory strain SC5314 and in a clinical isolate. Comparison of the maps revealed that the clinical strain had undergone a significant amount of genome rearrangement, consisting primarily of crossover or gene conversion recombination events. SNP map haplotyping revealed that insertion and activation of the UAU1 cassette in essential and non-essential genes can result in whole chromosome aneuploidy. UAU1 is often used to construct homozygous deletions of targeted genes in C. albicans; the exact mechanism (trisomy followed by chromosome loss versus gene conversion) has not been determined. UAU1 insertion into the essential ORC1 gene resulted in a large proportion of trisomic strains, while gene conversion events predominated when UAU1 was inserted into the non-essential LRO1 gene. Therefore, induced aneuploidies can be used to generate HapMaps, which are essential for analyzing genome alterations and mitotic recombination events in this clonal organism.

  2. KIR And HLA Haplotype Analysis in a Family Lacking The KIR 2DL1-2DP1 Genes

    Directory of Open Access Journals (Sweden)

    Vojvodić Svetlana

    2015-06-01

    Full Text Available The killer cell immunoglobulin-like receptor (KIR) gene cluster exhibits extensive allelic and haplotypic diversity that is observed as presence/absence of genes, resulting in expansion and contraction of KIR haplotypes and by allelic variation of individual KIR genes. We report a case of KIR pseudogene 2DP1 and 2DL1 gene absence in members of one family with the children suffering from acute myelogenous leukemia (AML). Killer cell immunoglobulin-like receptor low resolution genotyping was performed by the polymerase chain reaction (PCR)-sequence-specific primers (SSP)/sequence-specific oligonucleotide (SSO) method and haplotype assignment was done by gene content analysis. Both parents and the maternal grandfather shared the same Cen-B2 KIR haplotype, containing KIR 3DL3, -2DS2, -2DL2 and -3DP1 genes. The second haplotype in the KIR genotype of the mother and grandfather was Tel-A1 with KIR 2DL4 (normal and deleted variant), -3DL1, -22 bp deletion variant of the 2DS4 gene and -3DL2, while the second haplotype in the KIR genotype of the father was Tel-B1 with 2DL4 (normal variant), -3DS1, -2DL5, -2DS5, -2DS1 and 3DL2 genes. Haplotype analysis in all three offspring revealed that the children inherited the Cen-B2 haplotype with the same gene content but two of the children inherited a deleted variant of the 2DL4 gene, while the third child inherited a normal one. The second haplotype of all three offspring contained KIR 2DL4, -2DL5, -2DS1, -2DS4 (del 22bp variant), -2DS5, -3DL1 and -3DL2 genes, which was the basis of the assumption that there is a hybrid haplotype and that the present 3DL1 gene is a variant of the 3DS1 gene. Due to consanguinity among the ancestors, the results of KIR segregation analysis showed the existence of a very rare KIR genotype in the offspring. The family who is the subject of this case is even more interesting because the father was 10/10 human leukocyte antigen (HLA)-matched to his daughter, all members of the family have

  3. On Quantum Statistical Inference, II

    OpenAIRE

    Barndorff-Nielsen, O. E.; Gill, R. D.; Jupp, P. E.

    2003-01-01

    Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, theoretical developments in the theory of quantum measurements have brought the basic mathematical framework for the probability calculations much closer to that of classical probability theory. The present paper reviews this field and proposes and inte...

  4. Cystic fibrosis transmembrane regulator haplotypes in households of patients with cystic fibrosis.

    Science.gov (United States)

    Furgeri, Daniela Tenório; Marson, Fernando Augusto Lima; Correia, Cyntia Arivabeni Araújo; Ribeiro, José Dirceu; Bertuzzo, Carmen Sílvia

    2018-01-30

    Nearly 2000 mutations in the cystic fibrosis transmembrane regulator (CFTR) gene have been reported. The F508del mutation occurs in approximately 50-65% of patients with cystic fibrosis (CF). However, molecular diagnosis is not always possible. Therefore, silent polymorphisms can be used to label the mutant allele in households of patients with CF. To verify the haplotypes of four polymorphisms at the CFTR locus in households of patients with CF for pre-fertilization, pre-implantation, and prenatal indirect mutation diagnosis to provide better genetic counseling for families and patients with CF and to associate the genotypes/haplotypes with the F508del mutation screening. GATT polymorphism analysis was performed using direct polymerase chain reaction amplification, and the MP6-D9, TUB09 and TUB18 polymorphism analyses were performed using restriction fragment length polymorphism. Nine haplotypes were found in 37 CFTR alleles, and of those, 24 were linked with the F508del mutation and 13 with other CFTR mutations. The 6 (GATT), C (MP6-D9), G (TUB09), and C (TUB18) haplotypes showed the highest prevalence (48%) of the mutant CFTR allele and were linked to the F508del mutation (64%). In 43% of households analyzed, at least one informative polymorphism can be used for the indirect diagnostic test. CFTR polymorphisms are genetic markers that are useful for identifying the mutant CFTR alleles in households of patients with CF when it is not possible to establish the complete CFTR genotype. Moreover, the polymorphisms can be used for indirect CFTR mutation identification in cases of pre-fertilization, pre-implantation and prenatal analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Association of Xmn I Polymorphism and Hemoglobin E Haplotypes on Postnatal Gamma Globin Gene Expression in Homozygous Hemoglobin E

    Directory of Open Access Journals (Sweden)

    Supachai Ekwattanakit

    2012-01-01

    Full Text Available Background and Objectives. To explore the role of cis-regulatory sequences within the β globin gene cluster at chromosome 11 on human γ globin gene expression related to Hb E allele, we analyze baseline hematological data and Hb F values together with β globin haplotypes in homozygous Hb E. Patients and Methods. 80 individuals with molecularly confirmed homozygous Hb E were analyzed for the β globin haplotypes and Xmn I polymorphism using PCR-RFLPs. 74 individuals with complete laboratory data were further studied for association analyses. Results. Eight different β globin haplotypes were found linked to Hb E alleles; three major haplotypes were (a) III, (b) V, and (c) IV, accounting for 94% of Hb E chromosomes. A new haplotype (Th-1) was identified and most likely converted from the major ones. The majority of individuals had Hb F < 5%; only 10.8% of homozygous Hb E had high Hb F (average 10.5%, range 5.8–14.3%). No association was found on a specific haplotype or Xmn I in these individuals with high Hb F, measured by alkaline denaturation. Conclusion. The cis-regulation of γ globin gene expression might not be apparent under a milder condition with lesser globin imbalance such as homozygous Hb E.

  6. Estimating uncertainty of inference for validation

    Energy Technology Data Exchange (ETDEWEB)

    Booker, Jane M [Los Alamos National Laboratory; Langenbrunner, James R [Los Alamos National Laboratory; Hemez, Francois M [Los Alamos National Laboratory; Ross, Timothy J [UNM

    2010-09-30

    We present a validation process based upon the concept that validation is an inference-making activity. This has always been true, but the association has not been as important before as it is now. Previously, theory had been confirmed by more data, and predictions were possible based on data. The process today is to infer from theory to code and from code to prediction, making the role of prediction somewhat automatic, and a machine function. Validation is defined as determining the degree to which a model and code is an accurate representation of experimental test data. Embedded in validation is the intention to use the computer code to predict. To predict is to accept the conclusion that an observable final state will manifest; therefore, prediction is an inference whose goodness relies on the validity of the code. Quantifying the uncertainty of a prediction amounts to quantifying the uncertainty of validation, and this involves the characterization of uncertainties inherent in theory/models/codes and the corresponding data. An introduction to inference making and its associated uncertainty is provided as a foundation for the validation problem. A mathematical construction for estimating the uncertainty in the validation inference is then presented, including a possibility distribution constructed to represent the inference uncertainty for validation under uncertainty. The estimation of inference uncertainty for validation is illustrated using data and calculations from Inertial Confinement Fusion (ICF). The ICF measurements of neutron yield and ion temperature were obtained for direct-drive inertial fusion capsules at the Omega laser facility. The glass capsules, containing the fusion gas, were systematically selected with the intent of establishing a reproducible baseline of high-yield 10^13-10^14 neutron output. The deuterium-tritium ratio in these experiments was varied to study its influence upon yield. This paper on validation inference is the

  7. Inference algorithms and learning theory for Bayesian sparse factor analysis

    International Nuclear Information System (INIS)

    Rattray, Magnus; Sharp, Kevin; Stegle, Oliver; Winn, John

    2009-01-01

    Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
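    The slab and spike mixture prior named in this abstract can be illustrated with a minimal sketch. It assumes the common form in which each factor loading is exactly zero with some probability (the spike) and otherwise Gaussian (the slab). The function name, the inclusion probability and the slab width are illustrative assumptions, not values from the paper.

```python
import random


def sample_spike_and_slab(n, inclusion_prob, slab_std, rng):
    """Draw n loadings from a spike-and-slab prior (illustrative sketch).

    With probability inclusion_prob a loading comes from the Gaussian slab
    N(0, slab_std^2); otherwise it is set exactly to zero (the spike).
    """
    loadings = []
    for _ in range(n):
        if rng.random() < inclusion_prob:
            loadings.append(rng.gauss(0.0, slab_std))  # slab: active loading
        else:
            loadings.append(0.0)                       # spike: exact zero
    return loadings


# Hypothetical settings chosen only for illustration.
rng = random.Random(0)
loadings = sample_spike_and_slab(1000, inclusion_prob=0.1, slab_std=1.0, rng=rng)
sparsity = sum(w == 0.0 for w in loadings) / len(loadings)
```

    Most sampled loadings are exactly zero, which is the sense in which the prior encodes a sparse network.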

  8. Inference algorithms and learning theory for Bayesian sparse factor analysis

    Energy Technology Data Exchange (ETDEWEB)

    Rattray, Magnus; Sharp, Kevin [School of Computer Science, University of Manchester, Manchester M13 9PL (United Kingdom); Stegle, Oliver [Max-Planck-Institute for Biological Cybernetics, Tuebingen (Germany); Winn, John, E-mail: magnus.rattray@manchester.ac.u [Microsoft Research Cambridge, Roger Needham Building, Cambridge, CB3 0FB (United Kingdom)

    2009-12-01

    Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.

  9. Different patterns of evolution in the centromeric and telomeric regions of group A and B haplotypes of the human killer cell Ig-like receptor locus.

    Directory of Open Access Journals (Sweden)

    Chul-Woo Pyo

    Full Text Available The fast evolving human KIR gene family encodes variable lymphocyte receptors specific for polymorphic HLA class I determinants. Nucleotide sequences for 24 representative human KIR haplotypes were determined. With three previously defined haplotypes, this gave a set of 12 group A and 15 group B haplotypes for assessment of KIR variation. The seven gene-content haplotypes are all combinations of four centromeric and two telomeric motifs. 2DL5, 2DS5 and 2DS3 can be present in centromeric and telomeric locations. With one exception, haplotypes having identical gene content differed in their combinations of KIR alleles. Sequence diversity varied between haplotype groups and between centromeric and telomeric halves of the KIR locus. The most variable A haplotype genes are in the telomeric half, whereas the most variable genes characterizing B haplotypes are in the centromeric half. Of the highly polymorphic genes, only the 3DL3 framework gene exhibits a similar diversity when carried by A and B haplotypes. Phylogenetic analysis and divergence time estimates point to the centromeric gene-content motifs that distinguish A and B haplotypes having emerged ~6 million years ago, contemporaneously with the separation of human and chimpanzee ancestors. In contrast, the telomeric motifs that distinguish A and B haplotypes emerged more recently, ~1.7 million years ago, before the emergence of Homo sapiens. Thus the centromeric and telomeric motifs that typify A and B haplotypes have likely been present throughout human evolution. The results suggest the common ancestor of A and B haplotypes combined a B-like centromeric region with an A-like telomeric region.

  10. HLA haplotype map of river valley populations with hemochromatosis traced through five centuries in Central Sweden.

    Science.gov (United States)

    Olsson, K Sigvard; Ritter, Bernd; Hansson, Norbeth; Chowdhury, Ruma R

    2008-07-01

    The hemochromatosis mutation, C282Y of the HFE gene, seems to have originated from a single event which once occurred in a person living in the north west of Europe carrying human leukocyte antigen (HLA)-A3-B7. In descendants of this ancestor other haplotypes also appear, probably caused by local recombinations and founder effects. The background of these associations is unknown. Isolated river valley populations may be fruitful for the mapping of genetic disorders such as hemochromatosis. In this study, we try to test this hypothesis in a study from central Sweden where the haplotype A1-B8 was common. HLA haplotypes and HFE mutations were studied in hemochromatosis patients with present or past parental origin in a sparsely populated (1/km^2) rural district (n = 8366 in the year of 2005), in central Sweden. Pedigrees were constructed from the Swedish church book registry. Extended haplotypes were studied to evaluate origin of recombinations. There were 87 original probands, 36 females and 51 males identified during 30 yr, of whom 86% carried C282Y/C282Y and 14% C282Y/H63D. Of 32 different HLA haplotypes A1-B8 was the most common (34%), followed by A3-B7 (16%), both in strong linkage disequilibrium with controls, (P females. River valley populations may contain HLA haplotypes reflecting their demographic history. This study has demonstrated that the resistance against recombinations between HLA-A and HFE makes HLA haplotypes excellent markers for population movements. Founder effects and genetic drift from bottleneck populations (surviving the plague?) may explain the commonness of the mutation in central Scandinavia. The intergenerational time difference >30 yr was greater than expected and means that the age of the original mutation may be underestimated.

  11. Minimal sharing of Y-chromosome STR haplotypes among five endogamous population groups from western and southwestern India.

    Science.gov (United States)

    Das, Birajalaxmi; Chauhan, P S; Seshadri, M

    2004-10-01

    We attempt to address the issue of genetic variation and the pattern of male gene flow among and between five Indian population groups of two different geographic and linguistic affiliations using Y-chromosome markers. We studied 221 males at three Y-chromosome biallelic loci and 184 males for the five Y-chromosome STRs. We observed 111 Y-chromosome STR haplotypes. An analysis of molecular variance (AMOVA) based on Y-chromosome STRs showed that the variation observed between the population groups belonging to two major regions (western and southwestern India) was 0.17%, which was significantly lower than the level of genetic variance among the five populations (0.59%) considered as a single group. Combined haplotype analysis of the five STRs and the biallelic locus 92R7 revealed minimal sharing of haplotypes among these five ethnic groups, irrespective of the similar origin of the linguistic and geographic affiliations; this minimal sharing indicates restricted male gene flow. As a consequence, most of the haplotypes were population specific. Network analysis showed that the haplotypes, which were shared between the populations, seem to have originated from different mutational pathways at different loci. Biallelic markers showed that all five ethnic groups have a similar ancestral origin despite their geographic and linguistic diversity.

  12. How to solve mathematical problems

    CERN Document Server

    Wickelgren, Wayne A

    1995-01-01

    Seven problem-solving techniques include inference, classification of action sequences, subgoals, contradiction, working backward, relations between problems, and mathematical representation. Also, problems from mathematics, science, and engineering with complete solutions.

  13. Distribution of QPY and RAH haplotypes of granzyme B gene in distinct Brazilian populations

    Directory of Open Access Journals (Sweden)

    Fernanda Bernadelli Garcia

    2012-08-01

    Full Text Available INTRODUCTION: The cytolysis mediated by granules is one of the most important effector functions of cytotoxic T lymphocytes and natural killer cells. Recently, three single nucleotide polymorphisms (SNPs) were identified at exons 2, 3, and 5 of the granzyme B gene, resulting in a haplotype in which three amino acids of mature protein Q48P88Y245 are changed to R48A88H245, which leads to loss of cytotoxic activity of the protein. In this study, we evaluated the frequency of these polymorphisms in Brazilian populations. METHODS: We evaluated the frequency of these polymorphisms in Brazilian ethnic groups (white, Afro-Brazilian, and Asian) by sequencing these regions. RESULTS: The allelic and genotypic frequencies of SNP 2364A/G at exon 2 in Afro-Brazilian individuals (42.3% and 17.3%) were significantly higher when compared with those in whites and Asians (p < 0.0001 and p = 0.0007, respectively). The polymorphisms 2933C/G and 4243C/T also were more frequent in Afro-Brazilians but without any significant difference regarding the other groups. The Afro-Brazilian group presented greater diversity of haplotypes, and the RAH haplotype seemed to be more frequent in this group (25%), followed by the whites (20.7%) and by the Asians (11.9%), similar to the frequency presented in the literature. CONCLUSIONS: There is a higher frequency of polymorphisms in Afro-Brazilians, and the RAH haplotype was more frequent in these individuals. We believe that further studies should aim to investigate the correlation of this haplotype with diseases related to immunity mediated by cytotoxic lymphocytes, and if this correlation is confirmed, novel treatment strategies might be elaborated.

  14. HIERARCHICAL PROBABILISTIC INFERENCE OF COSMIC SHEAR

    International Nuclear Information System (INIS)

    Schneider, Michael D.; Dawson, William A.; Hogg, David W.; Marshall, Philip J.; Bard, Deborah J.; Meyers, Joshua; Lang, Dustin

    2015-01-01

    Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probability distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.

  15. Inverse Ising inference with correlated samples

    International Nuclear Information System (INIS)

    Obermayer, Benedikt; Levine, Erel

    2014-01-01

    Correlations between two variables of a high-dimensional system can be indicative of an underlying interaction, but can also result from indirect effects. Inverse Ising inference is a method to distinguish one from the other. Essentially, the parameters of the least constrained statistical model are learned from the observed correlations such that direct interactions can be separated from indirect correlations. Among many other applications, this approach has been helpful for protein structure prediction, because residues which interact in the 3D structure often show correlated substitutions in a multiple sequence alignment. In this context, samples used for inference are not independent but share an evolutionary history on a phylogenetic tree. Here, we discuss the effects of correlations between samples on global inference. Such correlations could arise due to phylogeny but also via other slow dynamical processes. We present a simple analytical model to address the resulting inference biases, and develop an exact method accounting for background correlations in alignment data by combining phylogenetic modeling with an adaptive cluster expansion algorithm. We find that popular reweighting schemes are only marginally effective at removing phylogenetic bias, suggest a rescaling strategy that yields better results, and provide evidence that our conclusions carry over to the frequently used mean-field approach to the inverse Ising problem.

  16. Population Structure of Pseudocercospora fijiensis in Costa Rica Reveals Shared Haplotype Diversity with Southeast Asian Populations.

    Science.gov (United States)

    Saville, Amanda; Charles, Melodi; Chavan, Suchitra; Muñoz, Miguel; Gómez-Alpizar, Luis; Ristaino, Jean Beagle

    2017-12-01

    Pseudocercospora fijiensis is the causal pathogen of black Sigatoka, a devastating disease of banana that can cause 20 to 80% yield loss in the absence of fungicides in banana crops. The genetic structure of populations of P. fijiensis in Costa Rica was examined and compared with Honduran and global populations to better understand migration patterns and inform management strategies. In total, 118 isolates of P. fijiensis collected from Costa Rica and Honduras from 2010 to 2014 were analyzed using multilocus genotyping of six loci and compared with a previously published global dataset of populations of P. fijiensis. The Costa Rican and Honduran populations shared haplotype diversity with haplotypes from Southeast Asia, Oceania, and the Americas but not Africa for all but one of the six loci studied. Gene flow and shared haplotype diversity was found in Honduran and Costa Rican populations of the pathogen. The data indicate that the haplotypic diversity observed in Costa Rican populations of P. fijiensis is derived from dispersal from initial outbreak sources in Honduras and admixtures between genetically differentiated sources from Southeast Asia, Oceania, and the Americas.

  17. WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads

    NARCIS (Netherlands)

    Patterson, M.; Marschall, T.; Pisanti, N.; van Iersel, L.J.J.; Stougie, L.; Klau, G.W.; Schoenhuth, A.

    2014-01-01

    The human genome is diploid, that is each of its chromosomes comes in two copies. This requires to phase the single nucleotide polymorphisms (SNPs), that is, to assign them to the two copies, beyond just detecting them. The resulting haplotypes, lists of SNPs belonging to each copy, are crucial for

  18. Dense and accurate whole-chromosome haplotyping of individual genomes

    NARCIS (Netherlands)

    Porubsky, David; Garg, Shilpa; Sanders, Ashley D.; Korbel, Jan O.; Guryev, Victor; Lansdorp, Peter M.; Marschall, Tobias

    2017-01-01

    The diploid nature of the human genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. This lack of haplotype-level analyses can be explained by a lack of methods that can produce dense and accurate

  19. Inference for Ecological Dynamical Systems: A Case Study of Two Endemic Diseases

    Directory of Open Access Journals (Sweden)

    Daniel A. Vasco

    2012-01-01

    Full Text Available A Bayesian Markov chain Monte Carlo method is used to infer parameters for an open stochastic epidemiological model: the Markovian susceptible-infected-recovered (SIR) model, which is suitable for modeling and simulating recurrent epidemics. This allows exploring two major problems of inference appearing in many mechanistic population models. First, trajectories of these processes are often only partly observed. For example, during an epidemic the transmission process is only partly observable: one cannot record infection times. Therefore, one only records cases (infections) as the observations. As a result some means of imputing or reconstructing individuals in the susceptible cases class must be accomplished. Second, the official reporting of observations (cases) in epidemiology is typically done not as they are actually recorded but at some temporal interval over which they have been aggregated. To address these issues, this paper investigates the following problems. Parameter inference for a perfectly sampled open Markovian SIR is first considered. Next inference for an imperfectly observed sample path of the system is studied. Although this second problem has been solved for the case of closed epidemics, it has proven quite difficult for the case of open recurrent epidemics. Lastly, application of the statistical theory is made to measles and pertussis epidemic time series data from 60 UK cities.
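    The two observation problems described in this abstract (unrecorded infection times and cases reported only as interval totals) can be illustrated with a minimal sketch. It assumes a standard Gillespie simulation of an open SIR model in which births enter the susceptible class at a constant rate and every individual dies at the same per-capita rate. The rate names (beta, gamma, mu), the parameter values and the 7-day reporting interval are illustrative assumptions, not taken from the paper.

```python
import math
import random


def simulate_open_sir(s, i, r, beta, gamma, mu, t_end, rng):
    """Gillespie simulation of an open Markovian SIR model (sketch).

    Returns the final (S, I, R) state and the exact infection times, which
    in practice are not observable.
    """
    n0 = s + i + r                 # births arrive at constant rate mu * n0
    t = 0.0
    infection_times = []
    while True:
        n = s + i + r
        rates = [
            mu * n0,                               # 0: birth into S
            beta * s * i / n if n > 0 else 0.0,    # 1: infection S -> I
            gamma * i,                             # 2: recovery I -> R
            mu * s,                                # 3: death of a susceptible
            mu * i,                                # 4: death of an infective
            mu * r,                                # 5: death of a recovered
        ]
        total = sum(rates)
        if total <= 0.0:
            break
        t += rng.expovariate(total)                # waiting time to next event
        if t > t_end:
            break
        event = rng.choices(range(6), weights=rates)[0]
        if event == 0:
            s += 1
        elif event == 1:
            s -= 1
            i += 1
            infection_times.append(t)
        elif event == 2:
            i -= 1
            r += 1
        elif event == 3:
            s -= 1
        elif event == 4:
            i -= 1
        else:
            r -= 1
    return (s, i, r), infection_times


def aggregate_cases(infection_times, t_end, interval):
    """Collapse exact infection times into counts per reporting interval."""
    n_bins = int(math.ceil(t_end / interval))
    counts = [0] * n_bins
    for t in infection_times:
        counts[min(int(t // interval), n_bins - 1)] += 1
    return counts


# Hypothetical parameter values chosen only for illustration.
rng = random.Random(1)
state, times = simulate_open_sir(990, 10, 0, beta=0.5, gamma=0.2, mu=0.001,
                                 t_end=50.0, rng=rng)
reported = aggregate_cases(times, t_end=50.0, interval=7.0)
```

    Only `reported` resembles what a surveillance system would publish; an inference method has to impute the information that `times` and the susceptible counts would have supplied.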

  20. The association between individual SNPs or haplotypes of matrix metalloproteinase 1 and gastric cancer susceptibility, progression and prognosis.

    Directory of Open Access Journals (Sweden)

    Yong-Xi Song

    Full Text Available BACKGROUND: The single nucleotide polymorphisms (SNPs) in matrix metalloproteinase 1 (MMP-1) play important roles in some cancers. This study examined the associations between individual SNPs or haplotypes in MMP-1 and susceptibility, clinicopathological parameters and prognosis of gastric cancer in a large sample of the Han population in northern China. METHODS: In this case-controlled study, there were 404 patients with gastric cancer and 404 healthy controls. Seven SNPs were genotyped using the MALDI-TOF MS system. Then, SPSS software, Haploview 4.2 software, Haplo.states software and THEsias software were used to estimate the association between individual SNPs or haplotypes of MMP-1 and gastric cancer susceptibility, progression and prognosis. RESULTS: Among seven SNPs, there were no individual SNPs correlated to gastric cancer risk. Moreover, only the rs470206 genotype had a correlation with histologic grades, and the patients with GA/AA had well cell differentiation compared to the patients with genotype GG (OR=0.573; 95%CI: 0.353-0.929; P=0.023). Then, we constructed a four-marker haplotype block that contained 4 common haplotypes: TCCG, GCCG, TTCG and TTTA. However, all four common haplotypes had no correlation with gastric cancer risk and we did not find any relationship between these haplotypes and clinicopathological parameters in gastric cancer. Furthermore, neither individual SNPs nor haplotypes had an association with the survival of patients with gastric cancer. CONCLUSIONS: This study evaluated polymorphisms of the MMP-1 gene in gastric cancer with a MALDI-TOF MS method in a large northern Chinese case-controlled cohort. Our results indicated that these seven SNPs of MMP-1 might not be useful as significant markers to predict gastric cancer susceptibility, progression or prognosis, at least in the Han population in northern China.

  1. Different origin and dispersal of sulfadoxine-resistant Plasmodium falciparum haplotypes between Eastern Africa and Democratic Republic of Congo

    DEFF Research Database (Denmark)

    Baraka, Vito; Delgado-Ratto, Christopher; Nag, Sidsel

    2017-01-01

    Sulfadoxine/pyrimethamine (SP) is still used for malaria control in sub-Saharan Africa; however, widespread resistance is a major concern. This study aimed to determine the dispersal and origin of sulfadoxine resistance lineages in the Democratic Republic of the Congo compared with East African.......3 and 7.7 kb) flanking the Pfdhps gene were assayed. Evolutionary analysis revealed a shared origin of Pfdhps haplotypes in East Africa, with a distinct population clustering in DR Congo. Furthermore, in Tanzania there was an independent distinct origin of Pfdhps SGEGA resistant haplotype. In Uganda...... and Tanzania, gene flow patterns contribute to the dispersal and shared origin of parasites carrying double- and triple-mutant Pfdhps haplotypes associated with poor outcomes of intermittent preventive treatment during pregnancy using SP (IPTp-SP). However, the origins of the Pfdhps haplotypes in DR Congo...

  2. Bayesian inference for Markov jump processes with informative observations.

    Science.gov (United States)

    Golightly, Andrew; Wilkinson, Darren J

    2015-04-01

    In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis.

  3. Bayesian genomic selection: the effect of haplotype lengths and priors

    DEFF Research Database (Denmark)

    Villumsen, Trine Michelle; Janss, Luc

    2009-01-01

    Breeding values for animals with marker data are estimated using a genomic selection approach where data is analyzed using Bayesian multi-marker association models. Fourteen model scenarios with varying haplotype lengths, hyper parameter and prior distributions were compared to find the scenario ...

  4. Haplotype-based case-control study between human apurinic/apyrimidinic endonuclease 1/redox effector factor-1 gene and cerebral infarction.

    Science.gov (United States)

    Naganuma, Takahiro; Nakayama, Tomohiro; Sato, Naoyuki; Fu, Zhenyan; Yamaguchi, Mai; Soma, Masayoshi; Aoi, Noriko; Usami, Ron; Doba, Nobutaka; Hinohara, Shigeaki

    2009-10-01

    The aim of this study was to investigate the relationship between cerebral infarction (CI) and the human apurinic/apyrimidinic endonuclease 1/redox effector factor-1 (APE1/REF-1) gene using single-nucleotide polymorphisms (SNPs) and a haplotype-based case-control study. We selected 5 SNPs in the human APE1/REF-1 gene (rs1760944, rs3136814, rs17111967, rs3136817 and rs1130409), and performed case-control studies in 177 CI patients and 309 control subjects. rs17111967 was found to have no heterogeneity in Japanese. The overall distribution of the haplotype-based case-control study constructed by rs1760944, rs3136814 and rs1130409 showed a significant difference. The frequency of the G-C-T haplotype was significantly higher in the CI group than in the control group (2.5% vs. 0.0%, p<0.001). Based on the results of the haplotype-based case-control study, the G-C-T haplotype may be a genetic marker of CI, and the APE1/REF-1 gene may be a CI susceptibility gene.

  5. Causal inference based on counterfactuals

    Directory of Open Access Journals (Sweden)

    Höfler M

    2005-09-01

    Full Text Available Abstract Background The counterfactual or potential outcome model has become increasingly standard for causal inference in epidemiological and medical studies. Discussion This paper provides an overview on the counterfactual and related approaches. A variety of conceptual as well as practical issues when estimating causal effects are reviewed. These include causal interactions, imperfect experiments, adjustment for confounding, time-varying exposures, competing risks and the probability of causation. It is argued that the counterfactual model of causal effects captures the main aspects of causality in health sciences and relates to many statistical procedures. Summary Counterfactuals are the basis of causal inference in medicine and epidemiology. Nevertheless, the estimation of counterfactual differences poses several difficulties, primarily in observational studies. These problems, however, reflect fundamental barriers only when learning from observations, and this does not invalidate the counterfactual concept.

  6. An efficient forward–reverse expectation-maximization algorithm for statistical inference in stochastic reaction networks

    KAUST Repository

    Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro

    2016-01-01

    then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve

  7. The importance of learning when making inferences

    Directory of Open Access Journals (Sweden)

    Jorg Rieskamp

    2008-03-01

    Full Text Available The assumption that people possess a repertoire of strategies to solve the inference problems they face has been made repeatedly. The experimental findings of two previous studies on strategy selection are reexamined from a learning perspective, which argues that people learn to select strategies for making probabilistic inferences. This learning process is modeled with the strategy selection learning (SSL) theory, which assumes that people develop subjective expectancies for the strategies they have. They select strategies proportional to their expectancies, which are updated on the basis of experience. For the study by Newell, Weston, and Shanks (2003), it can be shown that people did not anticipate the success of a strategy from the beginning of the experiment. Instead, the behavior observed at the end of the experiment was the result of a learning process that can be described by the SSL theory. For the second study, by Bröder and Schiffer (2006), the SSL theory is able to provide an explanation for why participants only slowly adapted to new environments in a dynamic inference situation. The reanalysis of the previous studies illustrates the importance of learning for probabilistic inferences.

  8. Effect of genetic type and casein haplotype on antioxidant activity of yogurts during storage.

    Science.gov (United States)

    Perna, A; Intaglietta, I; Simonetti, A; Gambacorta, E

    2013-06-01

    The aim of this work was to investigate the antioxidant activity of yogurt made from the milk of 2 breeds-Italian Brown and Italian Holstein-characterized by different casein haplotypes (αS1-, β-, and κ-caseins) during storage up to 15 d. The casein haplotype was determined by isoelectric focusing; antioxidant activity of yogurt was measured using 2,2'-azino-bis-(3-ethylbenzothiazoline-6-sulfonic acid). The statistical analysis showed a significant effect of the studied factors. Antioxidant activity increased during storage of both yogurt types, but yogurt produced with Italian Brown milk showed higher antioxidant activity than those produced with Italian Holstein milk. A high scavenging activity was present in yogurts with the allelic combination of BB-A(2)A(2)-BB. The results of this study suggest that the genetic type and the haplotype make a significant contribution in the production of yogurts with high nutraceutical value. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  9. Evidence and Consequence of a Highly Adapted Clonal Haplotype within the Australian Ascochyta rabiei Population

    Directory of Open Access Journals (Sweden)

    Yasir Mehmood

    2017-06-01

    Full Text Available The Australian Ascochyta rabiei (Pass.) Labr. (syn. Phoma rabiei) population has low genotypic diversity with only one mating type detected to date, potentially precluding substantial evolution through recombination. However, a large diversity in aggressiveness exists. In an effort to better understand the risk from selective adaptation to currently used resistance sources and chemical control strategies, the population was examined in detail. For this, a total of 598 isolates were quasi-hierarchically sampled between 2013 and 2015 across all major Australian chickpea growing regions and commonly grown host genotypes. Although a large number of haplotypes were identified (66) through short sequence repeat (SSR) genotyping, overall low gene diversity (Hexp = 0.066) and genotypic diversity (D = 0.57) were detected. Almost 70% of the isolates assessed were of a single dominant haplotype (ARH01). Disease screening on a differential host set, including three commonly deployed resistance sources, revealed distinct aggressiveness among the isolates, with 17% of all isolates identified as highly aggressive. Almost 75% of these were of the ARH01 haplotype. A similar pattern was observed at the host level, with 46% of all isolates collected from the commonly grown host genotype Genesis090 (classified as “resistant” during the term of collection) identified as highly aggressive. Of these, 63% belonged to the ARH01 haplotype. In conclusion, the ARH01 haplotype represents a significant risk to the Australian chickpea industry, being not only widely adapted to the diverse agro-geographical environments of the Australian chickpea growing regions, but also containing a disproportionately large number of aggressive isolates, indicating fitness to survive and replicate on the best resistance sources in the Australian germplasm.

  10. Enhancing the mathematical properties of new haplotype homozygosity statistics for the detection of selective sweeps.

    Science.gov (United States)

    Garud, Nandita R; Rosenberg, Noah A

    2015-06-01

    Soft selective sweeps represent an important form of adaptation in which multiple haplotypes bearing adaptive alleles rise to high frequency. Most statistical methods for detecting selective sweeps from genetic polymorphism data, however, have focused on identifying hard selective sweeps in which a favored allele appears on a single haplotypic background; these methods might be underpowered to detect soft sweeps. Among exceptions is the set of haplotype homozygosity statistics introduced for the detection of soft sweeps by Garud et al. (2015). These statistics, examining frequencies of multiple haplotypes in relation to each other, include H12, a statistic designed to identify both hard and soft selective sweeps, and H2/H1, a statistic that conditional on high H12 values seeks to distinguish between hard and soft sweeps. A challenge in the use of H2/H1 is that its range depends on the associated value of H12, so that equal H2/H1 values might provide different levels of support for a soft sweep model at different values of H12. Here, we enhance the H12 and H2/H1 haplotype homozygosity statistics for selective sweep detection by deriving the upper bound on H2/H1 as a function of H12, thereby generating a statistic that normalizes H2/H1 to lie between 0 and 1. Through a reanalysis of resequencing data from inbred lines of Drosophila, we show that the enhanced statistic both strengthens interpretations obtained with the unnormalized statistic and leads to empirical insights that are less readily apparent without the normalization. Copyright © 2015 Elsevier Inc. All rights reserved.
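
    The unnormalized statistics named in this abstract can be illustrated with a short sketch. It assumes the commonly stated definitions: with haplotype frequencies sorted so that p1 >= p2 >= ..., H1 is the sum of squared frequencies, H12 pools the two most frequent haplotypes before squaring, and H2 is H1 without the p1 squared term. The function name and input format are hypothetical, and the normalized statistic derived in the paper is not reproduced here because the abstract does not give its upper bound.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return (H1, H12, H2/H1) for a sample of haplotype labels.

    Hypothetical helper for illustration; `haplotypes` is any
    non-empty sequence of hashable haplotype identifiers.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    # Sample frequencies, most frequent first.
    freqs = sorted((c / n for c in Counter(haplotypes).values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)                        # expected homozygosity
    h12 = (p1 + p2) ** 2 + sum(p * p for p in freqs[2:])  # top two pooled
    h2 = h1 - p1 * p1                                     # H1 without the top haplotype
    return h1, h12, h2 / h1


if __name__ == "__main__":
    sample = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
    print(haplotype_homozygosity(sample))
```

    Because H2/H1 is only interpreted conditional on a high H12, a caller would normally inspect H12 first and only then compare H2/H1 values.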

  11. Y chromosome haplotype diversity of domestic sheep (Ovis aries) in northern Eurasia.

    Science.gov (United States)

    Zhang, Min; Peng, Wei-Feng; Yang, Guang-Li; Lv, Feng-Hua; Liu, Ming-Jun; Li, Wen-Rong; Liu, Yong-Gang; Li, Jin-Quan; Wang, Feng; Shen, Zhi-Qiang; Zhao, Sheng-Guo; Hehua, Eer; Marzanov, Nurbiy; Murawski, Maziek; Kantanen, Juha; Li, Meng-Hua

    2014-12-01

    Variation in two SNPs and one microsatellite on the Y chromosome was analyzed in a total of 663 rams representing 59 breeds from a large geographic range in northern Eurasia. SNPA-oY1 showed the highest allele frequency (91.55%) across the breeds, whereas SNPG-oY1 was present in only 56 samples. Combined genotypes established seven haplotypes (H4, H5, H6, H7, H8, H12 and H19). H6 dominated in northern Eurasia, and H8 showed the second-highest frequency. H4, which had been earlier reported to be absent in European breeds, was detected in one European breed (Swiniarka), whereas H7, which had been previously identified to be unique to European breeds, was present in two Chinese breeds (Ninglang Black and Large-tailed Han), one Buryatian (Transbaikal Finewool) and two Russian breeds (North Caucasus Mutton-Wool and Kuibyshev). H12, which had been detected only in Turkish breeds, was also found in Chinese breeds in this work. An overall low level of haplotype diversity (median h = 0.1288) was observed across the breeds with relatively higher median values in breeds from the regions neighboring the Near Eastern domestication center of sheep. H6 is the dominant haplotype in northwestern and eastern China, in which the haplotype distribution could be explained by the historical translocations of the H4 and H8 Y chromosomes to China via the Mongol invasions followed by expansions to northwestern and eastern China. Our findings extend previous results of sheep Y chromosomal genetic variability and indicate probably recent paternal gene flows between sheep breeds from distinct major geographic regions. © 2014 Stichting International Foundation for Animal Genetics.
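
    The abstract reports haplotype diversity (h) per breed without stating the estimator. As a minimal sketch, assuming Nei's unbiased estimator h = n/(n-1) * (1 - sum of squared haplotype frequencies), which the abstract does not confirm, the per-breed value could be computed as follows. The function name and the example sample are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for one sample (assumed estimator)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((count / n) ** 2 for count in Counter(haplotypes).values())
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    # Invented example: nine rams carrying H6 and one carrying H8.
    print(haplotype_diversity(["H6"] * 9 + ["H8"]))
```

    A breed fixed for a single haplotype gives h = 0, which is consistent with the low median reported when one haplotype dominates.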

  12. SNP and haplotype analysis reveal IGF2 variants associated with growth traits in Chinese Qinchuan cattle.

    Science.gov (United States)

    Huang, Yong-Zhen; Zhan, Zhao-Yang; Li, Xin-Yi; Wu, Sheng-Ru; Sun, Yu-Jia; Xue, Jing; Lan, Xian-Yong; Lei, Chu-Zhao; Zhang, Chun-Lei; Jia, Yu-Tang; Chen, Hong

    2014-02-01

    Insulin-like growth factor 2 (IGF2) is a potent cell growth and differentiation factor and is implicated in mammals' growth and development. The objective of this study was to evaluate the association of mutations in the bovine IGF2 gene with growth traits in Chinese Qinchuan cattle. Four single nucleotide polymorphisms (SNPs) were detected in the bovine IGF2 gene by DNA pool sequencing and forced polymerase chain reaction-restriction fragment length polymorphism (forced PCR-RFLP) methods. We also investigated haplotype structure and linkage disequilibrium (LD) coefficients for four SNPs in 817 individuals representing two main cattle breeds from China. The result of haplotype analysis showed eight different haplotypes and 27 combined genotypes within the study population. The statistical analyses indicated that the four SNPs, combined genotypes and haplotypes are associated with the withers height, body length, chest breadth, chest depth and body weight in Qinchuan cattle population (P growth traits; the heterozygote diplotype was associated with higher growth traits compared to wild-type homozygote. Our results provide evidence that polymorphisms in the IGF2 gene are associated with growth traits, and may be used for marker-assisted selection in beef cattle breeding programs.

  13. The Prognostic Value of Haplotypes in the Vascular Endothelial Growth Factor A Gene in Colorectal Cancer

    International Nuclear Information System (INIS)

    Hansen, Torben F.; Spindler, Karen-Lise G.; Andersen, Rikke F.; Lindebjerg, Jan; Kølvraa, Steen; Brandslund, Ivan; Jakobsen, Anders

    2010-01-01

    New prognostic markers in patients with colorectal cancer (CRC) are a prerequisite for individualized treatment. Prognostic importance of single nucleotide polymorphisms (SNPs) in the vascular endothelial growth factor A (VEGF-A) gene has been proposed. The objective of the present study was to investigate the prognostic importance of haplotypes in the VEGF-A gene in patients with CRC. The study included 486 patients surgically resected for stage II and III CRC, divided into two independent cohorts. Three SNPs in the VEGF-A gene were analyzed by polymerase chain reaction. Haplotypes were estimated using the PHASE program. The prognostic influence was evaluated using Kaplan-Meier plots and log rank tests. Cox regression method was used to analyze the independent prognostic importance of different markers. All three SNPs were significantly related to survival. A haplotype combination, responsible for this effect, was present in approximately 30% of the patients and demonstrated a significant relationship with poor survival, and it remained an independent prognostic marker after multivariate analysis, hazard ratio 2.46 (95% confidence interval 1.49–4.06), p < 0.001. Validation was provided by consistent findings in a second and independent cohort. Haplotype combinations call for further investigation.

  14. A Learning Algorithm for Multimodal Grammar Inference.

    Science.gov (United States)

    D'Ulizia, A; Ferri, F; Grifoni, P

    2011-12-01

    The high costs of development and maintenance of multimodal grammars in integrating and understanding input in multimodal interfaces lead to the investigation of novel algorithmic solutions in automating grammar generation and in updating processes. Many algorithms for context-free grammar inference have been developed in the natural language processing literature. An extension of these algorithms toward the inference of multimodal grammars is necessary for multimodal input processing. In this paper, we propose a novel grammar inference mechanism that allows us to learn a multimodal grammar from its positive samples of multimodal sentences. The algorithm first generates the multimodal grammar that is able to parse the positive samples of sentences and, afterward, makes use of two learning operators and the minimum description length metrics in improving the grammar description and in avoiding the over-generalization problem. The experimental results highlight the acceptable performances of the algorithm proposed in this paper since it has a very high probability of parsing valid sentences.

  15. HLA-E regulatory and coding region variability and haplotypes in a Brazilian population sample.

    Science.gov (United States)

    Ramalho, Jaqueline; Veiga-Castelli, Luciana C; Donadi, Eduardo A; Mendes-Junior, Celso T; Castelli, Erick C

    2017-11-01

    The HLA-E gene is characterized by low but wide expression on different tissues. HLA-E is considered a conserved gene, being one of the least polymorphic class I HLA genes. The HLA-E molecule interacts with Natural Killer cell receptors and T lymphocyte receptors, and might activate or inhibit immune responses depending on the peptide associated with HLA-E and on the receptors with which HLA-E interacts. Variable sites within the HLA-E regulatory and coding segments may influence the gene function by modifying its expression pattern or encoded molecule, thus influencing its interaction with receptors and the peptide. Here we propose an approach to evaluate the gene structure, haplotype pattern and the complete HLA-E variability, including regulatory (promoter and 3'UTR) and coding segments (with introns), by using massively parallel sequencing. We investigated the variability of 420 samples from a very admixed population such as Brazilians by using this approach. Considering a segment of about 7kb, 63 variable sites were detected, arranged into 75 extended haplotypes. We detected 37 different promoter sequences (but few frequent ones), 27 different coding sequences (15 representing new HLA-E alleles) and 12 haplotypes at the 3'UTR segment, two of them presenting a summed frequency of 90%. Despite the number of coding alleles, they encode mainly two different full-length molecules, known as E*01:01 and E*01:03, which correspond to about 90% of all. In addition, unlike what has been previously observed for other non-classical HLA genes, the relationship among the HLA-E promoter, coding and 3'UTR haplotypes is not straightforward because the same promoter and 3'UTR haplotypes were many times associated with different HLA-E coding haplotypes. This data reinforces the presence of only two main full-length HLA-E molecules encoded by the many HLA-E alleles detected in our population sample. In addition, this data does indicate that the distal HLA-E promoter is by

  16. Baselines and test data for cross-lingual inference

    DEFF Research Database (Denmark)

    Agic, Zeljko; Schluter, Natalie

    2018-01-01

    Recent years have seen a revival of interest in textual entailment, sparked by i) the emergence of powerful deep neural network learners for natural language processing and ii) the timely development of large-scale evaluation datasets such as SNLI. Recast as natural language inference, the problem now amounts to detecting the relation between pairs of statements: they either contradict or entail one another, or they are mutually neutral. Current research in natural language inference is effectively exclusive to English. In this paper, we propose to advance the research in SNLI-style natural language inference toward multilingual evaluation. To that end, we provide test data for four major languages: Arabic, French, Spanish, and Russian. We experiment with a set of baselines. Our systems are based on cross-lingual word embeddings and machine translation. While our best system scores an average...

  17. HLA class II linkage disequilibrium and haplotype evolution in the Cayapa Indians of Ecuador

    Energy Technology Data Exchange (ETDEWEB)

    Trachtenberg, E.A.; Erlich, H.A. [Roche Molecular Systems, Alameda, CA (United States)]; Klitz, W. [Univ. of California, Berkeley, CA (United States)] [and others]

    1995-08-01

    DNA-based typing of the HLA class II loci in a sample of the Cayapa Indians of Ecuador reveals several lines of evidence that selection has operated to maintain and to diversify the existing level of polymorphism in the class II region. As has been noticed for other Native American groups, the overall level of polymorphism at the DRB1, DQA1, DQB1, and DPB1 loci is reduced relative to that found in other human populations. Nonetheless, the relative evenness in the distribution of allele frequencies at each of the four loci points to the role of balancing selection in the maintenance of the polymorphism. The DQA1 and DQB1 loci, in particular, have near-maximum departures from the neutrality model, which suggests that balancing selection has been especially strong in these cases. Several novel DQA1-DQB1 haplotypes and the discovery of a new DRB1 allele demonstrate an evolutionary tendency favoring the diversification of class II alleles and haplotypes. The recombination interval between the centromeric DPB1 locus and the other class II loci will, in the absence of other forces such as selection, reduce disequilibrium across this region. However, nearly all common alleles were found to be part of DR-DP haplotypes in strong disequilibrium, consistent with the recent action of selection acting on these haplotypes in the Cayapa. 50 refs., 3 figs., 3 tabs.

  18. SDG multiple fault diagnosis by real-time inverse inference

    International Nuclear Information System (INIS)

    Zhang Zhaoqian; Wu Chongguang; Zhang Beike; Xia Tao; Li Anfeng

    2005-01-01

    In the past 20 years, the signed directed graph (SDG), one of the qualitative simulation technologies, has been widely applied in the field of chemical fault diagnosis. However, many former researchers assumed a single fault origin. This assumption leads to the problem of combinatorial explosion and has limited the realistic application of SDG to real processes. This is mainly because most of the former researchers used the forward inference engine in commercial expert system software to carry out the inverse diagnosis inference on the SDG model, which violates the internal principle of the diagnosis mechanism. In this paper, we present a new SDG multiple-fault diagnosis method based on real-time inverse inference. This is a multiple-fault diagnosis method in the genuine sense, and its inference engine uses an inverse mechanism. Finally, we give an example of a 65 t/h furnace diagnosis system to demonstrate its applicability and efficiency.

  19. SDG multiple fault diagnosis by real-time inverse inference

    Energy Technology Data Exchange (ETDEWEB)

    Zhang Zhaoqian; Wu Chongguang; Zhang Beike; Xia Tao; Li Anfeng

    2005-02-01

    In the past 20 years, the signed directed graph (SDG), one of the qualitative simulation technologies, has been widely applied in the field of chemical fault diagnosis. However, many former researchers assumed a single fault origin. This assumption leads to the problem of combinatorial explosion and has limited the realistic application of SDG to real processes. This is mainly because most of the former researchers used the forward inference engine in commercial expert system software to carry out the inverse diagnosis inference on the SDG model, which violates the internal principle of the diagnosis mechanism. In this paper, we present a new SDG multiple-fault diagnosis method based on real-time inverse inference. This is a multiple-fault diagnosis method in the genuine sense, and its inference engine uses an inverse mechanism. Finally, we give an example of a 65 t/h furnace diagnosis system to demonstrate its applicability and efficiency.

  20. Hierarchical modeling and inference in ecology: The analysis of data from populations, metapopulations and communities

    Science.gov (United States)

    Royle, J. Andrew; Dorazio, Robert M.

    2008-01-01

    A guide to data collection, modeling and inference strategies for biological survey data using Bayesian and classical statistical methods. This book describes a general and flexible framework for modeling and inference in ecological systems based on hierarchical models, with a strict focus on the use of probability models and parametric inference. Hierarchical models represent a paradigm shift in the application of statistics to ecological inference problems because they combine explicit models of ecological system structure or dynamics with models of how ecological systems are observed. The principles of hierarchical modeling are developed and applied to problems in population, metapopulation, community, and metacommunity systems. The book provides the first synthetic treatment of many recent methodological advances in ecological modeling and unifies disparate methods and procedures. The authors apply principles of hierarchical modeling to ecological problems, including * occurrence or occupancy models for estimating species distribution * abundance models based on many sampling protocols, including distance sampling * capture-recapture models with individual effects * spatial capture-recapture models based on camera trapping and related methods * population and metapopulation dynamic models * models of biodiversity, community structure and dynamics.

  1. Integrative inference of population history in the Ibero-Maghrebian endemic Pleurodeles waltl (Salamandridae).

    Science.gov (United States)

    Gutiérrez-Rodríguez, Jorge; Barbosa, A Márcia; Martínez-Solano, Íñigo

    2017-07-01

    Inference of population histories from the molecular signatures of past demographic processes is challenging, but recent methodological advances in species distribution models and their integration in time-calibrated phylogeographic studies allow detailed reconstruction of complex biogeographic scenarios. We apply an integrative approach to infer the evolutionary history of the Iberian ribbed newt (Pleurodeles waltl), an Ibero-Maghrebian endemic with populations north and south of the Strait of Gibraltar. We analyzed an extensive multilocus dataset (mitochondrial and nuclear DNA sequences and ten polymorphic microsatellite loci) and found a deep east-west phylogeographic break in Iberian populations dating back to the Plio-Pleistocene. This break is inferred to result from vicariance associated with the formation of the Guadalquivir river basin. In contrast with previous studies, North African populations showed exclusive mtDNA haplotypes, and formed a monophyletic clade within the Eastern Iberian lineage in the mtDNA genealogy. On the other hand, microsatellites failed to recover Moroccan populations as a differentiated genetic cluster. This is interpreted to result from post-divergence gene flow based on the results of IMA2 and Migrate analyses. Thus, Moroccan populations would have originated after overseas dispersal from the Iberian Peninsula in the Pleistocene, with subsequent gene flow in more recent times, implying at least two trans-marine dispersal events. We modeled the distribution of the species and of each lineage, and projected these models back in time to infer climatically favourable areas during the mid-Holocene, the last glacial maximum (LGM) and the last interglacial (LIG), to reconstruct more recent population dynamics. We found minor differences in climatic favourability across lineages, suggesting intraspecific niche conservatism. 
    Genetic diversity was significantly correlated with the intersection of environmental favourability in the LIG and

  2. The analysis of APOL1 genetic variation and haplotype diversity provided by 1000 Genomes project.

    Science.gov (United States)

    Peng, Ting; Wang, Li; Li, Guisen

    2017-08-11

    Variants of the APOL1 gene have been shown to be associated with an increased risk of multiple kinds of diseases, particularly in African Americans, but not in Caucasians and Asians. In this study, we explored the single nucleotide polymorphism (SNP) and haplotype diversity of the APOL1 gene in different races using data from the 1000 Genomes Project. Variants of the APOL1 gene in the 1000 Genomes Project were obtained, and SNPs located in the regulatory region or coding region were selected for genetic variation analysis. A total of 2504 individuals from 26 populations were classified into four groups: African, European, Asian and admixed populations. Tag SNPs were selected to evaluate the haplotype diversity in the four populations using HaploStats software. The APOL1 gene is surrounded by some of the most polymorphic genes in the human genome, and variation in the APOL1 gene was common, with up to 613 SNPs reported by the 1000 Genomes Project, 99 of them (16.2%) with MAF ≥ 1%. There were 79 SNPs in the URR and 92 SNPs in the 3'UTR. A total of 12 SNPs in the URR and 24 SNPs in the 3'UTR were considered common variants with MAF ≥ 1%. Notably, URR-1 was present at lower frequencies in European populations, while the other three haplotypes showed the opposite pattern; the 3'UTR presented several high-frequency variation sites in a short segment, and the differences in its haplotypes among populations were significant (P < 0.01): UTR-1 and UTR-5 were present at much higher frequencies in the African population, while UTR-2, UTR-3 and UTR-4 were much lower. Analysis of the APOL1 coding region showed that the two higher-frequency G1 SNPs actually pull down the frequency of haplotype H-1 when all populations are pooled, and that the diversity among the four populations is widened by the two G1 mutations (P1 = 3.33E-4 vs. P2 = 3.61E-30). The distributions of APOL1 gene variants and haplotypes were significantly different among the populations, in both regulatory and coding regions. It could provide

  3. Mitochondrial haplotype distribution and phylogenetic relationship of an endangered species, Reeve's turtle (Mauremys reevesii), in East Asia

    Directory of Open Access Journals (Sweden)

    Hong-Shik Oh

    2017-03-01

    Full Text Available This study examined the haplotype distribution and phylogenetic relationships of Reeve’s turtle (Mauremys reevesii) in East Asia using mitochondrial DNA CYTB gene sequences. CYTB sequences of Reeve’s turtles were divided into 6 haplotypes (Hap01–Hap06). Chinese turtles were found in Hap01, Hap02, Hap04, and Hap05, and Hap01 had the highest frequency (85.0%). Korean turtles were found in Hap01, Hap03, Hap04, and Hap05, and Hap03 had the highest frequency (52.1%). Although no haplotype contained only CYTB sequences exclusive to Reeve’s turtles of Korea, no CYTB sequence from China was found in Hap03, so it is possible that the Hap03 turtles of Korea are separated from those of China. The haplotypes of Reeve’s turtles of East Asia were monophyletic, which indicates that they evolved from a single maternal lineage but underwent local evolution after geographical migration and isolation in East Asia.

  4. Introgression of a Rare Haplotype from Southeastern Africa to Breed California Blackeyes with Larger Seeds

    Directory of Open Access Journals (Sweden)

    Mitchell R Lucas

    2015-03-01

    Full Text Available Seed size distinguishes most crops from their wild relatives and is an important quality trait for the grain legume cowpea. In order to breed cowpea varieties with larger seeds we introgressed a rare haplotype associated with large seeds at the Css-1 locus from an African buff seed type cultivar, IT82E-18 (18.5g/100 seeds), into a blackeye seed type cultivar, CB27 (22g/100 seed). Four RILs derived from these two parents were chosen for marker-assisted breeding based on SNP genotyping with a goal of stacking large seed haplotypes into a CB27 background. Foreground and background selection were performed during two cycles of backcrossing based on genome-wide SNP markers. The average seed size of introgression lines homozygous for haplotypes associated with large seeds was 28.7g/100 seed and 24.8g/100 seed for cycles 1 and 2, respectively. One cycle 1 introgression line with desirable seed quality was selfed for two generations to make families with very large seeds (28-35g/100 seeds). Field-based performance trials helped identify breeding lines that not only have large seeds but are also desirable in terms of yield, maturity, and plant architecture when compared to industry standards. A principal component analysis was used to explore the relationships between the parents relative to a core set of landraces and improved varieties based on high-density SNP data. The geographic distribution of haplotypes at the Css-1 locus suggests the haplotype associated with large seeds is unique to accessions collected from Southeastern Africa. Therefore this QTL has a strong potential to develop larger seeded varieties for other growing regions, which is demonstrated in this work using a California pedigree.

  5. Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana).

    Science.gov (United States)

    Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila

    2010-07-16

    Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species, Musa acuminata (A genome) and Musa balbisiana (B genome), contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease, but little is known about the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions and in duplicated non-coding intergenic sequences including low complexity regions, and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes, pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. A...

  6. More powerful haplotype sharing by accounting for the mode of inheritance.

    Science.gov (United States)

    Ziegler, Andreas; Ewhida, Adel; Brendel, Michael; Kleensang, André

    2009-04-01

    The concept of haplotype sharing (HS) has received considerable attention recently, and several haplotype association methods have been proposed. Here, we extend the work of Beckmann and colleagues [2005 Hum. Hered. 59:67-78] who derived an HS statistic (BHS) as special case of Mantel's space-time clustering approach. The Mantel-type HS statistic correlates genetic similarity with phenotypic similarity across pairs of individuals. While phenotypic similarity is measured as the mean-corrected cross product of phenotypes, we propose to incorporate information of the underlying genetic model in the measurement of the genetic similarity. Specifically, for the recessive and dominant modes of inheritance we suggest the use of the minimum and maximum of shared length of haplotypes around a marker locus for pairs of individuals. If the underlying genetic model is unknown, we propose a model-free HS Mantel statistic using the max-test approach. We compare our novel HS statistics to BHS using simulated case-control data and illustrate its use by re-analyzing data from a candidate region of chromosome 18q from the Rheumatoid Arthritis (RA) Consortium. We demonstrate that our approach is point-wise valid and superior to BHS. In the re-analysis of the RA data, we identified three regions with point-wise P-values<0.005 containing six known genes (PMIP1, MC4R, PIGN, KIAA1468, TNFRSF11A and ZCCHC2) which might be worth follow-up.
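    The Mantel-type statistic described in this abstract can be illustrated with a minimal sketch. It assumes that each pair of individuals comes with a list of shared haplotype lengths around the marker (one value per haplotype pairing), that the recessive and dominant modes map to the minimum and maximum respectively, and that the statistic is the plain sum over pairs of genetic similarity times phenotypic similarity. The function name `mantel_hs` and the input layout are hypothetical and not taken from the paper.

```python
from itertools import combinations
from statistics import mean


def mantel_hs(shared_lengths, phenotypes, mode):
    """Sum over pairs of (genetic similarity) x (phenotypic similarity).

    shared_lengths[(i, j)] lists the shared haplotype lengths around the
    marker for individuals i < j (hypothetical input layout).
    mode selects how those lengths are reduced to one similarity value.
    """
    # Assumption: recessive -> minimum, dominant -> maximum.
    reducers = {"recessive": min, "dominant": max}
    reduce_fn = reducers[mode]
    mu = mean(phenotypes)
    total = 0.0
    for i, j in combinations(range(len(phenotypes)), 2):
        genetic = reduce_fn(shared_lengths[(i, j)])
        # Phenotypic similarity: mean-corrected cross product.
        phenotypic = (phenotypes[i] - mu) * (phenotypes[j] - mu)
        total += genetic * phenotypic
    return total


if __name__ == "__main__":
    phenotypes = [1, 1, 0]  # two cases, one control
    shared = {
        (0, 1): [4, 2, 3, 1],
        (0, 2): [1, 0, 0, 0],
        (1, 2): [0, 0, 1, 0],
    }
    print(mantel_hs(shared, phenotypes, "recessive"))
    print(mantel_hs(shared, phenotypes, "dominant"))
```

    A significance assessment (for example by permuting phenotypes) and the max-test combination over modes are left out of this sketch.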

  7. Inferring the conservative causal core of gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Emmert-Streib Frank

    2010-09-01

    Full Text Available Abstract Background: Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. Results: In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. Conclusions: For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.

  8. Inferring the conservative causal core of gene regulatory networks.

    Science.gov (United States)

    Altay, Gökmen; Emmert-Streib, Frank

    2010-09-28

    Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
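    The abstract characterises C3NET only as mutual information estimates combined with a maximization step. One possible reading is sketched below: a precomputed, symmetric mutual information matrix is given, and each gene keeps at most one edge, to the partner with the highest mutual information, provided that value exceeds a cutoff. The fixed cutoff stands in for a proper significance test, and the function name `max_mi_network` is hypothetical; this is an illustration, not the authors' implementation.

```python
def max_mi_network(mi, threshold):
    """Keep, for each gene, only its strongest mutual-information edge.

    mi is a symmetric matrix (list of lists) of mutual information values.
    threshold is a hypothetical cutoff replacing a significance test.
    Returns a set of undirected edges (i, j) with i < j.
    """
    n = len(mi)
    edges = set()
    for i in range(n):
        best_j, best_value = None, threshold
        for j in range(n):
            # Maximization step: strongest partner above the cutoff.
            if j != i and mi[i][j] > best_value:
                best_j, best_value = j, mi[i][j]
        if best_j is not None:
            edges.add((min(i, best_j), max(i, best_j)))
    return edges


if __name__ == "__main__":
    mi = [
        [0.0, 0.9, 0.2],
        [0.9, 0.0, 0.5],
        [0.2, 0.5, 0.0],
    ]
    print(sorted(max_mi_network(mi, threshold=0.3)))
```

    Because every gene contributes at most one edge, the resulting network has at most as many edges as genes, which is consistent with the "conservative core" in the title.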

  9. Nonparametric Bayesian inference in biostatistics

    CERN Document Server

    Müller, Peter

    2015-01-01

    As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...

  10. RFLP's for the human pepsinogen A haplotypes (PGA)

    Energy Technology Data Exchange (ETDEWEB)

    Taggart, R T; Boudi, F B; Bell, G I

    1988-10-11

    PGA 101 is a 1340 bp cDNA clone containing exons 1-9 of the predicted human pepsinogen A coding sequence. Two distinct polymorphisms are detected with EcoRI and Bgl II. Analysis with these enzymes provides for discrimination of the PGA haplotypes A, B, and C containing three, two and one PGA genes respectively. The PGA complex is located at 11q13. Mendelian inheritance was demonstrated in 20 families.

  11. Haplotype-based approach for noninvasive prenatal tests of Duchenne muscular dystrophy using cell-free fetal DNA in maternal plasma

    DEFF Research Database (Denmark)

    Xu, Yan; Li, Xuchao; Ge, Hui-Juan

    2015-01-01

    Purpose: This study demonstrates noninvasive prenatal testing (NIPT) for Duchenne muscular dystrophy (DMD) using a newly developed haplotype-based approach. Methods: Eight families at risk for DMD were recruited for this study. Parental haplotypes were constructed using target-region sequencing data...

  12. [The haplomatch program for comparing Y-chromosome STR-haplotypes and its application to the analysis of the origin of Don Cossacks].

    Science.gov (United States)

    Chukhryaeva, M I; Ivanov, I O; Frolova, S A; Koshel, S M; Utevska, O M; Skhalyakho, R A; Agdzhoyan, A T; Bogunov, Yu V; Balanovska, E V; Balanovsky, O P

    2016-05-01

    STR haplotypes of the Y chromosome are widely used as effective genetic markers in studies of human populations and in forensic DNA analysis. The task often arises to compare the spectrum of haplotypes in individuals or entire populations. Performing this task manually is too laborious and thus unrealistic. We propose an algorithm for quantifying similarity between STR haplotypes. This algorithm is suitable for massive analyses of samples. It is implemented in the computer program Haplomatch, which makes it possible to find haplotypes that differ from the target haplotype by 0, 1, 2, 3, or more mutational steps. The program may operate in two modes: comparison of individuals and comparison of populations. Flexibility of the program (the possibility of using any external database), its usability (MS Excel spreadsheets are used), and the capability of being applied to other chromosomes and other species could make this software a new useful tool in population genetics and forensic and genealogical studies. The Haplomatch software is freely available on our website www.genofond.ru. The program is applied to studying the gene pool of Cossacks. Experimental analysis of Y-chromosomal diversity in a representative set (N = 131) of Upper Don Cossacks is performed. Analysis of the STR haplotypes detects genetic proximity of Cossacks to East Slavic populations (in particular, to Southern and Central Russians, as well as to Ukrainians), which confirms the hypothesis of the origin of the Cossacks mainly due to immigration from Russia and Ukraine. Also, a small genetic influence of Turkic-speaking Nogais is found, probably caused by their occurrence in the Don Voisko as part of the Tatar layer. No similarities between haplotype spectra of Cossacks and Caucasus populations are found. This case study demonstrates the effectiveness of the Haplomatch software in analyzing large sets of STR haplotypes.
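    The idea of finding haplotypes within a given number of mutational steps of a target can be sketched as follows. The abstract does not say how Haplomatch counts steps, so this sketch assumes a stepwise model in which the distance is the sum of absolute differences in repeat counts across loci, and that all haplotypes are typed at the same loci. The function names and the sample data are hypothetical.

```python
def mutational_steps(hap_a, hap_b):
    """Sum of absolute repeat-count differences over the loci of hap_a.

    Assumption: both haplotypes are dicts mapping locus name -> repeat
    count and are typed at the same loci.
    """
    return sum(abs(hap_a[locus] - hap_b[locus]) for locus in hap_a)


def find_matches(target, database, max_steps):
    """Return (name, steps) for database entries within max_steps of target,
    ordered by distance and then by name."""
    hits = []
    for name, haplotype in database.items():
        steps = mutational_steps(target, haplotype)
        if steps <= max_steps:
            hits.append((name, steps))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Hypothetical three-locus haplotypes, for illustration only.
    target = {"DYS19": 14, "DYS390": 24, "DYS391": 11}
    database = {
        "sample_A": {"DYS19": 14, "DYS390": 24, "DYS391": 11},
        "sample_B": {"DYS19": 15, "DYS390": 24, "DYS391": 11},
        "sample_C": {"DYS19": 16, "DYS390": 22, "DYS391": 11},
    }
    print(find_matches(target, database, max_steps=1))
```

    Comparing populations rather than individuals would amount to running the same search for every haplotype of one population against the haplotypes of another.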

  13. The Probabilistic Convolution Tree: Efficient Exact Bayesian Inference for Faster LC-MS/MS Protein Inference

    Science.gov (United States)

    Serang, Oliver

    2014-01-01

    Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called “causal independence”). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to O(k log(k)^2) and the space to O(k log(k)), where k is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions. PMID:24626234
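    The core idea of combining adder inputs pairwise in a tree can be sketched for the simplest case: computing the exact distribution of a sum of independent, non-negative integer variables. This covers only the bottom-up pass and uses a naive quadratic convolution for clarity; the posterior (top-down) pass of the published method is omitted, and a faster convolution routine could be swapped in. The function names are hypothetical.

```python
def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q.

    p[i] is P(X = i), q[j] is P(Y = j). Naive O(len(p) * len(q)) version.
    """
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def sum_distribution(pmfs):
    """Combine the input distributions pairwise, layer by layer, as in a
    balanced binary tree, and return the distribution of the total."""
    layer = [list(p) for p in pmfs]
    while len(layer) > 1:
        next_layer = []
        for i in range(0, len(layer) - 1, 2):
            next_layer.append(convolve(layer[i], layer[i + 1]))
        if len(layer) % 2 == 1:
            # An unpaired node is carried up to the next layer unchanged.
            next_layer.append(layer[-1])
        layer = next_layer
    return layer[0]


if __name__ == "__main__":
    # Three independent fair coin flips; result is the count of heads.
    coins = [[0.5, 0.5]] * 3
    print(sum_distribution(coins))
```

    Pairing inputs in a balanced tree keeps the intermediate distributions as small as possible for as long as possible, which is what makes fast convolution pay off.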

  14. Bootstrap-based Support of HGT Inferred by Maximum Parsimony

    Directory of Open Access Journals (Sweden)

    Nakhleh Luay

    2010-05-01

    Full Text Available Abstract Background: Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. Results: In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. Conclusions: We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/), and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution.

  15. Bootstrap-based support of HGT inferred by maximum parsimony.

    Science.gov (United States)

    Park, Hyun Jung; Jin, Guohua; Nakhleh, Luay

    2010-05-05

    Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/), and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution.
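    The bootstrap procedure outlined in this abstract (resample the alignment, infer events on each sample, quantify support over all samples) can be sketched generically. The sketch assumes the alignment is a list of equal-length strings, that samples are built by drawing columns with replacement, and that support is the fraction of samples in which an event is inferred. The inference step itself is passed in as a callable; `infer_events` and `bootstrap_support` are hypothetical names and do not reflect the NEPAL interface.

```python
import random
from collections import Counter


def bootstrap_support(alignment, infer_events, n_samples=100, seed=0):
    """Fraction of bootstrap samples in which each event is inferred.

    alignment: list of equal-length sequences (strings).
    infer_events: hypothetical callable mapping an alignment to an
        iterable of hashable event identifiers.
    """
    rng = random.Random(seed)
    n_cols = len(alignment[0])
    counts = Counter()
    for _ in range(n_samples):
        # Nonparametric bootstrap: draw columns with replacement.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        sample = ["".join(row[c] for c in cols) for row in alignment]
        # Count each event at most once per sample.
        counts.update(set(infer_events(sample)))
    return {event: n / n_samples for event, n in counts.items()}


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "TCGA"]

    def toy_inference(sample):
        # Placeholder: reports an event when the first two rows differ.
        return ["rows_0_1_differ"] if sample[0] != sample[1] else []

    print(bootstrap_support(toy_alignment, toy_inference, n_samples=50))
```

    Events with support below a chosen cutoff would then be discarded, which is how a measure of this kind can be used to decide how many events to keep.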

  16. Haplotypes in the Dystrophin DNA Segment Point to a Mosaic Origin of Modern Human Diversity

    OpenAIRE

    Ziętkiewicz, Ewa; Yotova, Vania; Gehl, Dominik; Wambach, Tina; Arrieta, Isabel; Batzer, Mark; Cole, David E.C.; Hechtman, Peter; Kaplan, Feige; Modiano, David; Moisan, Jean-Paul; Michalski, Roman; Labuda, Damian

    2003-01-01

    Although Africa has played a central role in human evolutionary history, certain studies have suggested that not all contemporary human genetic diversity is of recent African origin. We investigated 35 simple polymorphic sites and one Tn microsatellite in an 8-kb segment of the dystrophin gene. We found 86 haplotypes in 1,343 chromosomes from around the world. Although a classical out-of-Africa topology was observed in trees based on the variant frequencies, the tree of haplotype sequences re...

  17. Covariance Between Genotypic Effects and its Use for Genomic Inference in Half-Sib Families

    Science.gov (United States)

    Wittenburg, Dörte; Teuscher, Friedrich; Klosa, Jan; Reinsch, Norbert

    2016-01-01

    In livestock, current statistical approaches utilize extensive molecular data, e.g., single nucleotide polymorphisms (SNPs), to improve the genetic evaluation of individuals. The number of model parameters increases with the number of SNPs, so the multicollinearity between covariates can affect the results obtained using whole genome regression methods. In this study, dependencies between SNPs due to linkage and linkage disequilibrium among the chromosome segments were explicitly considered in methods used to estimate the effects of SNPs. The population structure affects the extent of such dependencies, so the covariance among SNP genotypes was derived for half-sib families, which are typical in livestock populations. Conditional on the SNP haplotypes of the common parent (sire), the theoretical covariance was determined using the haplotype frequencies of the population from which the individual parent (dam) was derived. The resulting covariance matrix was included in a statistical model for a trait of interest, and this covariance matrix was then used to specify prior assumptions for SNP effects in a Bayesian framework. The approach was applied to one family in simulated scenarios (few and many quantitative trait loci) and using semireal data obtained from dairy cattle to identify genome segments that affect performance traits, as well as to investigate the impact on predictive ability. Compared with a method that does not explicitly consider any of the relationship among predictor variables, the accuracy of genetic value prediction was improved by 10–22%. The results show that the inclusion of dependence is particularly important for genomic inference based on small sample sizes. PMID:27402363

  18. Congested Link Inference Algorithms in Dynamic Routing IP Network

    Directory of Open Access Journals (Sweden)

    Yu Chen

    2017-01-01

    Full Text Available The performance of current congested link inference algorithms, such as the classical CLINK algorithm, degrades noticeably in dynamic routing IP networks. To overcome this problem, based on the assumptions of the Markov property and time homogeneity, we build a simplified Variable Structure Discrete Dynamic Bayesian (VSDDB) network model of a dynamic routing IP network. Under the simplified VSDDB model, based on Bayesian Maximum A Posteriori (BMAP) estimation and the Rest Bayesian Network Model (RBNM), we propose an Improved CLINK (ICLINK) algorithm. Because multiple links are often congested concurrently, we also propose the CLILRS algorithm (Congested Link Inference algorithm based on Lagrangian Relaxation Subgradient) to infer the set of congested links. We validate our results through experiments based on analogy, simulation, and the actual Internet.

  19. Assessing children's inference generation: what do tests of reading comprehension measure?

    Science.gov (United States)

    Bowyer-Crane, Claudine; Snowling, Margaret J

    2005-06-01

    Previous research suggests that children with specific comprehension difficulties have problems with the generation of inferences. This raises important questions as to whether poor comprehenders have poor comprehension skills generally, or whether their problems are confined to specific inference types. The main aims of the study were (a) to classify the questions requiring the generation of inferences in two commonly used tests of reading comprehension, and (b) to investigate the relative performance of skilled and less-skilled comprehenders on questions tapping different inference types. The performance of 10 poor comprehenders (mean age 110.06 months) was compared with the performance of 10 normal readers (mean age 112.78 months) on two tests of reading comprehension. A qualitative analysis of the NARA II (form 1) and the WORD comprehension subtest was carried out. Participants were then administered the NARA II, WORD comprehension subtest and a test of non-word reading. The NARA II was heavily reliant on the generation of knowledge-based inferences, while the WORD comprehension subtest was biased towards the retention of literal information. Children identified by the NARA II as having comprehension difficulties performed in the normal range on the WORD comprehension subtests. Further, children with comprehension difficulties performed poorly on questions requiring the generation of knowledge-based and elaborative inferences. However, they were able to answer questions requiring attention to literal information or use of cohesive devices at a level comparable to normal readers. Different reading tests tap different types of inferencing skills. Less-skilled comprehenders have particular difficulty applying real-world knowledge to a text during reading, and this has implications for the formulation of effective intervention strategies.

  20. Eurasian otters, Lutra lutra, have a dominant mtDNA haplotype from the Iberian Peninsula to Scandinavia.

    Science.gov (United States)

    Ferrando, Ainhoa; Ponsà, Montserrat; Marmi, Josep; Domingo-Roura, Xavier

    2004-01-01

    The Eurasian otter, Lutra lutra, has a Palaearctic distribution and has suffered a severe decline throughout Europe during the last century. Previous studies in this and other mustelids have shown reduced levels of variability in mitochondrial DNA, although otter phylogeographic studies were restricted to central-western Europe. In this work we have sequenced 361 bp of the mtDNA control region in 73 individuals from eight countries and added our results to eight sequences available from GenBank and the literature. The range of distribution has been expanded in relation to previous works north towards Scandinavia, east to Russia and Belarus, and south to the Iberian Peninsula. We found a single dominant haplotype in 91.78% of the samples, and six more haplotypes deviating a maximum of two mutations from the dominant haplotype restricted to a single country. Variability was extremely low in western Europe but higher in eastern countries. This, together with the lack of phylogeographical structuring, supports the postglacial recolonization of Europe from a single refugium. The Eurasian otter mtDNA control region has a 220-bp variable minisatellite in Domain III that we sequenced in 29 otters. We found a total of 19 minisatellite haplotypes, but they showed no phylogenetic information.

  1. Mitochondrial haplotypes are not associated with mice selectively bred for high voluntary wheel running.

    Science.gov (United States)

    Wone, Bernard W M; Yim, Won C; Schutz, Heidi; Meek, Thomas H; Garland, Theodore

    2018-04-04

    Mitochondrial haplotypes have been associated with human and rodent phenotypes, including nonshivering thermogenesis capacity, learning capability, and disease risk. Although the mammalian mitochondrial D-loop is highly polymorphic, D-loops in laboratory mice are identical, and variation occurs elsewhere mainly between nucleotides 9820 and 9830. Part of this region codes for the tRNA Arg gene and is associated with mitochondrial densities and number of mtDNA copies. We hypothesized that the capacity for high levels of voluntary wheel-running behavior would be associated with mitochondrial haplotype. Here, we analyzed the mtDNA polymorphic region in mice from each of four replicate lines selectively bred for 54 generations for high voluntary wheel running (HR) and from four control lines (Control) randomly bred for 54 generations. Sequencing the polymorphic region revealed a variable number of adenine repeats. Single nucleotide polymorphisms (SNPs) varied from 2 to 3 adenine insertions, resulting in three haplotypes. We found significant genetic differentiations between the HR and Control groups (F_ST = 0.779, p ≤ 0.0001), as well as among the replicate lines of mice within groups (F_SC = 0.757, p ≤ 0.0001). Haplotypes, however, were not strongly associated with voluntary wheel running (revolutions run per day), nor with either body mass or litter size. This system provides a useful experimental model to dissect the physiological processes linking mitochondrial, genomic SNPs, epigenetics, or nuclear-mitochondrial cross-talk to exercise activity. Copyright © 2018. Published by Elsevier B.V.

  2. Kernel learning at the first level of inference.

    Science.gov (United States)

    Cawley, Gavin C; Talbot, Nicola L C

    2014-05-01

    Kernel learning methods, whether Bayesian or frequentist, typically involve multiple levels of inference, with the coefficients of the kernel expansion being determined at the first level and the kernel and regularisation parameters carefully tuned at the second level, a process known as model selection. Model selection for kernel machines is commonly performed via optimisation of a suitable model selection criterion, often based on cross-validation or theoretical performance bounds. However, if there are a large number of kernel parameters, as for instance in the case of automatic relevance determination (ARD), there is a substantial risk of over-fitting the model selection criterion, resulting in poor generalisation performance. In this paper we investigate the possibility of learning the kernel, for the Least-Squares Support Vector Machine (LS-SVM) classifier, at the first level of inference, i.e. parameter optimisation. The kernel parameters and the coefficients of the kernel expansion are jointly optimised at the first level of inference, minimising a training criterion with an additional regularisation term acting on the kernel parameters. The key advantage of this approach is that the values of only two regularisation parameters need be determined in model selection, substantially alleviating the problem of over-fitting the model selection criterion. The benefits of this approach are demonstrated using a suite of synthetic and real-world binary classification benchmark problems, where kernel learning at the first level of inference is shown to be statistically superior to the conventional approach, improves on our previous work (Cawley and Talbot, 2007) and is competitive with Multiple Kernel Learning approaches, but with reduced computational expense. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. RTEL1 tagging SNPs and haplotypes were associated with glioma development.

    Science.gov (United States)

    Li, Gang; Jin, Tianbo; Liang, Hongjuan; Zhang, Zhiguo; He, Shiming; Tu, Yanyang; Yang, Haixia; Geng, Tingting; Cui, Guangbin; Chen, Chao; Gao, Guodong

    2013-05-17

    Glioma is the most prevalent primary solid tumor of the central nervous system, and certain single-nucleotide polymorphisms (SNPs) may be related to increased glioma risk and have implications in carcinogenesis. The present case-control study was carried out to elucidate how common variants contribute to glioma susceptibility. Ten candidate tagging SNPs (tSNPs) with a minor allele frequency (MAF) > 5% in the HapMap Asian population were selected from seven genes whose polymorphisms have been reported in the literature and in reliable databases to be related to gliomas. The selected tSNPs were genotyped in 629 glioma patients and 645 controls from a Han Chinese population using a calibrated multiplexed SNP MassEXTEND assay. Two tSNPs in the RTEL1 gene were significantly associated with glioma risk (rs6010620, P=0.0016, OR: 1.32, 95% CI: 1.11-1.56; rs2297440, P=0.001, OR: 1.33, 95% CI: 1.12-1.58) by χ2 test. The genotype "GG" of rs6010620 was identified as protective against glioma (OR, 0.46; 95% CI, 0.31-0.7; P=0.0002), as was the genotype "CC" of rs2297440 (OR, 0.47; 95% CI, 0.31-0.71; P=0.0003). Furthermore, haplotype "GCT" in the RTEL1 gene was found to be associated with risk of glioma (OR, 0.7; 95% CI, 0.57-0.86; Fisher's P=0.0005; Pearson's P=0.0005), and haplotype "ATT" was also associated with risk of glioma (OR, 1.32; 95% CI, 1.12-1.57; Fisher's P=0.0013; Pearson's P=0.0013). Two single variants in the RTEL1 gene, the genotypes "GG" of rs6010620 and "CC" of rs2297440, together with the two haplotypes GCT and ATT, were identified as associated with glioma development. Screening for these RTEL1 tagging SNPs and haplotypes might be used to evaluate the risk of glioma development. The virtual slides for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1993021136961998.

  4. Inference with constrained hidden Markov models in PRISM

    DEFF Research Database (Denmark)

    Christiansen, Henning; Have, Christian Theil; Lassen, Ole Torp

    2010-01-01

    A Hidden Markov Model (HMM) is a common statistical model which is widely used for analysis of biological sequence data and other sequential phenomena. In the present paper we show how HMMs can be extended with side-constraints and present constraint solving techniques for efficient inference. De......_different are integrated. We experimentally validate our approach on the biologically motivated problem of global pairwise alignment.

  5. Probability biases as Bayesian inference

    Directory of Open Access Journals (Sweden)

    Andre C. R. Martins

    2006-11-01

    Full Text Available In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated with them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors, they can be understood as adaptations to the solution of real life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. In that sense, it should be no surprise that humans reason with probability as it has been observed.

  6. Worldwide distribution of the MYH9 kidney disease susceptibility alleles and haplotypes: evidence of historical selection in Africa.

    Directory of Open Access Journals (Sweden)

    Taras K Oleksyk

    2010-07-01

    Full Text Available MYH9 was recently identified as a renal susceptibility gene (OR 3-8, p [...] ≥ 60%) than in European Americans (< 4%), revealing a genetic basis for a major health disparity. The population distributions of MYH9 risk alleles and the E-1 risk haplotype and the demographic and selective forces acting on the MYH9 region are not well explored. We reconstructed MYH9 haplotypes from 4 tagging single nucleotide polymorphisms (SNPs) spanning introns 12-23 using available data from HapMap Phase II, and by genotyping 938 DNAs from the Human Genome Diversity Panel (HGDP). The E-1 risk haplotype followed a cline, being most frequent within sub-Saharan African populations (range 50-80%), less frequent in populations from the Middle East (9-27%) and Europe (0-9%), and rare or absent in Asia, the Americas, and Oceania. The fixation indexes (F(ST)) for pairwise comparisons between the risk haplotypes for continental populations were calculated for MYH9 haplotypes; F(ST) ranged from 0.27-0.40 for Africa compared to other continental populations, possibly due to selection. Uniquely in Africa, the Yoruba population showed high frequency extended haplotype length around the core risk allele (C) compared to the alternative allele (T) at the same locus (rs4821481, iHs = 2.67), as well as high population differentiation (F(ST)(CEU vs. YRI) = 0.51) in HapMap Phase II data, also observable only in the Yoruba population from HGDP (F(ST) = 0.49), pointing to an instance of recent selection in the genomic region. The population-specific divergence in MYH9 risk allele frequencies among the world's populations may prove important in risk assessment and public health policies to mitigate the burden of kidney disease in vulnerable populations.

  7. Two extended haplotype blocks are associated with adaptation to high altitude habitats in East African honey bees.

    Directory of Open Access Journals (Sweden)

    Andreas Wallberg

    2017-05-01

    Full Text Available Understanding the genetic basis of adaption is a central task in biology. Populations of the honey bee Apis mellifera that inhabit the mountain forests of East Africa differ in behavior and morphology from those inhabiting the surrounding lowland savannahs, which likely reflects adaptation to these habitats. We performed whole genome sequencing on 39 samples of highland and lowland bees from two pairs of populations to determine their evolutionary affinities and identify the genetic basis of these putative adaptations. We find that in general, levels of genetic differentiation between highland and lowland populations are very low, consistent with them being a single panmictic population. However, we identify two loci on chromosomes 7 and 9, each several hundred kilobases in length, which exhibit near fixation for different haplotypes between highland and lowland populations. The highland haplotypes at these loci are extremely rare in samples from the rest of the world. Patterns of segregation of genetic variants suggest that recombination between haplotypes at each locus is suppressed, indicating that they comprise independent structural variants. The haplotype on chromosome 7 harbors nearly all octopamine receptor genes in the honey bee genome. These have a role in learning and foraging behavior in honey bees and are strong candidates for adaptation to highland habitats. Molecular analysis of a putative breakpoint indicates that it may disrupt the coding sequence of one of these genes. Divergence between the highland and lowland haplotypes at both loci is extremely high suggesting that they are ancient balanced polymorphisms that greatly predate divergence between the extant honey bee subspecies.

  8. The polymorphism and haplotypes of XRCC1 and survival of non-small-cell lung cancer after radiotherapy

    International Nuclear Information System (INIS)

    Yoon, Sang Min; Hong, Yun-Chul; Park, Heon Joo; Lee, Jong-Eun; Kim, Sang Yoon; Kim, Jong Hoon; Lee, Sang-Wook; Park, So-Yeon; Lee, Jung Shin; Choi, Eun Kyung

    2005-01-01

    Purpose: The X-ray repair cross-complementing Group 1 (XRCC1) protein is involved mainly in the base excision repair of the DNA repair process. This study examined the association of 3 polymorphisms (codon 194, 280, and 399) of XRCC1 and lung cancer in terms of whether or not these polymorphisms have an effect on the survival of lung cancer patients who have received radiotherapy. Methods and Materials: Between January 2000 and April 2004, 229 lung cancer patients with non-small-cell lung cancer in Stages I-III were enrolled. Genotyping was performed by single base primer extension assay using the SNP-IT Kit with genomic DNA samples from all patients. The haplotype of the XRCC1 polymorphisms was estimated by PHASE version 2.1. Results: The patients consisted of 191 (83.4%) males and 38 (16.6%) females with a median age of 62 (range, 26-88 years). Sixty percent of the patients were included in Stage I-IIIa. The median progression-free and overall survival was 13 months and 16 months, respectively. The XRCC1 codon 194, histology, and stage were shown to be significant predictors of the progression-free survival. The 6 haplotypes among the XRCC1 polymorphisms (194, 280, and 399) were estimated by PHASE v.2.1. The patients with haplotype pairs other than the homozygous TGG haplotype pairs survived significantly longer (p = 0.04). Conclusions: Polymorphisms of XRCC1 have an effect on the survival of lung cancer patients treated with radiotherapy, and this effect seems to be more significant after the haplotype pairs are considered

  9. Two extended haplotype blocks are associated with adaptation to high altitude habitats in East African honey bees

    Science.gov (United States)

    Schöning, Caspar

    2017-01-01

    Understanding the genetic basis of adaption is a central task in biology. Populations of the honey bee Apis mellifera that inhabit the mountain forests of East Africa differ in behavior and morphology from those inhabiting the surrounding lowland savannahs, which likely reflects adaptation to these habitats. We performed whole genome sequencing on 39 samples of highland and lowland bees from two pairs of populations to determine their evolutionary affinities and identify the genetic basis of these putative adaptations. We find that in general, levels of genetic differentiation between highland and lowland populations are very low, consistent with them being a single panmictic population. However, we identify two loci on chromosomes 7 and 9, each several hundred kilobases in length, which exhibit near fixation for different haplotypes between highland and lowland populations. The highland haplotypes at these loci are extremely rare in samples from the rest of the world. Patterns of segregation of genetic variants suggest that recombination between haplotypes at each locus is suppressed, indicating that they comprise independent structural variants. The haplotype on chromosome 7 harbors nearly all octopamine receptor genes in the honey bee genome. These have a role in learning and foraging behavior in honey bees and are strong candidates for adaptation to highland habitats. Molecular analysis of a putative breakpoint indicates that it may disrupt the coding sequence of one of these genes. Divergence between the highland and lowland haplotypes at both loci is extremely high suggesting that they are ancient balanced polymorphisms that greatly predate divergence between the extant honey bee subspecies. PMID:28542163

  10. Musical aptitude is associated with AVPR1A-haplotypes.

    Directory of Open Access Journals (Sweden)

    Liisa T Ukkola

    Full Text Available Artistic creativity forms the basis of music culture and music industry. Composing, improvising and arranging music are complex creative functions of the human brain, whose biological value remains unknown. We hypothesized that practicing music is social communication that needs musical aptitude and even creativity in music. In order to understand the neurobiological basis of music in human evolution and communication we analyzed polymorphisms of the arginine vasopressin receptor 1A (AVPR1A), serotonin transporter (SLC6A4), catechol-O-methyltransferase (COMT), dopamine receptor D2 (DRD2) and tyrosine hydroxylase 1 (TPH1), genes associated with social bonding and cognitive functions in 19 Finnish families (n = 343 members) with professional musicians and/or active amateurs. All family members were tested for musical aptitude using the auditory structuring ability test (Karma Music test; KMT) and Carl Seashore's tests for pitch (SP) and for time (ST). Data on creativity in music (composing, improvising and/or arranging music) was surveyed using a web-based questionnaire. Here we show for the first time that creative functions in music have a strong genetic component (h(2) = .84; composing h(2) = .40; arranging h(2) = .46; improvising h(2) = .62) in Finnish multigenerational families. We also show that high music test scores are significantly associated with creative functions in music (p<.0001). We discovered an overall haplotype association with AVPR1A gene (markers RS1 and RS3) and KMT (p = 0.0008; corrected p = 0.00002), SP (p = 0.0261; corrected p = 0.0072) and combined music test scores (COMB) (p = 0.0056; corrected p = 0.0006). AVPR1A haplotype AVR+RS1 further suggested a positive association with ST (p = 0.0038; corrected p = 0.00184) and COMB (p = 0.0083; corrected p = 0.0040) using haplotype-based association test HBAT. The results suggest that the neurobiology of music perception and production is likely to be related to the pathways affecting intrinsic attachment

  11. Bayesian inference in probabilistic risk assessment-The current state of the art

    International Nuclear Information System (INIS)

    Kelly, Dana L.; Smith, Curtis L.

    2009-01-01

    Markov chain Monte Carlo (MCMC) approaches to sampling directly from the joint posterior distribution of aleatory model parameters have led to tremendous advances in Bayesian inference capability in a wide variety of fields, including probabilistic risk analysis. The advent of freely available software coupled with inexpensive computing power has catalyzed this advance. This paper examines where the risk assessment community is with respect to implementing modern computational-based Bayesian approaches to inference. Through a series of examples in different topical areas, it introduces salient concepts and illustrates the practical application of Bayesian inference via MCMC sampling to a variety of important problems
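The MCMC sampling that this abstract refers to can be illustrated with a minimal random-walk Metropolis sketch. It is not taken from the paper: the model (Poisson event counts with a Jeffreys-type gamma prior on the rate), the data (3 events in 10 units of exposure) and all function names are assumptions chosen only because the exact posterior is known and can be used as a check.

```python
import math
import random


def log_post_theta(theta, events, exposure, a, b):
    """Log posterior density of theta = log(rate), up to an additive constant.

    Assumed model: events ~ Poisson(rate * exposure), rate ~ Gamma(a, b).
    The extra +theta term from the change of variables is already folded in.
    """
    return (a + events) * theta - (b + exposure) * math.exp(theta)


def metropolis_rate(events, exposure, a=0.5, b=0.0,
                    n_samples=20000, burn_in=2000, step=1.0, seed=0):
    """Random-walk Metropolis on log(rate); returns posterior draws of the rate."""
    rng = random.Random(seed)
    theta = math.log((events + 0.5) / exposure)  # rough starting point
    lp = log_post_theta(theta, events, exposure, a, b)
    draws = []
    for i in range(n_samples + burn_in):
        proposal = theta + rng.gauss(0.0, step)
        lp_prop = log_post_theta(proposal, events, exposure, a, b)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            theta, lp = proposal, lp_prop
        if i >= burn_in:
            draws.append(math.exp(theta))
    return draws


# Hypothetical data: 3 events observed over 10 units of exposure.
draws = metropolis_rate(events=3, exposure=10.0)
posterior_mean = sum(draws) / len(draws)
```

For this conjugate case the posterior is Gamma(a + events, b + exposure), so the sampled mean can be compared with the analytic value (3.5 / 10 = 0.35); in non-conjugate problems no such closed form exists, which is where sampling becomes useful.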

  12. Decision aid by fuzzy inference: a case study related to the problem of radioactive waste management

    International Nuclear Information System (INIS)

    Krunsch, P.; Fiordalisa, A.; Fortemps, Ph.

    1999-01-01

    This paper illustrates a fuzzy inference system (FIS) developed to assist the economic calculus in radioactive waste management (RWM). The extended time horizons and, in addition, the first-of-a-kind nature of many RWM systems induce large cost uncertainties in project funding. The traditional approach in economic calculus is to include contingency factors in basic cost estimates. A distinction is made between T-factors, used for technological uncertainties, and P-factors, used for project contingencies. In the particular case of nuclear projects, the Electric Power Research Institute (EPRI) has developed specific recommendations for defining both contingency factors. As a generalisation of the EPRI results, a new methodology using fuzzy inference rules is proposed. The inputs to the FIS are derived from the answers of experts regarding both the degrees of technological maturity and project advancement. Inferred T- and P-factors proposed by the FIS are given either as single estimates or as possibility intervals. (authors)

  13. How to deal with Haplotype data: An Extension to the Conceptual Schema of the Human Genome

    Directory of Open Access Journals (Sweden)

    José Fabián Reyes Román

    2016-12-01

    Full Text Available The goal of this work is to describe the advantages of the application of Conceptual Modeling (CM) in complex domains, such as genomics. Nowadays, the study and comprehension of the human genome is a major challenge due to its high level of complexity. The constant evolution in the genomic domain contributes to the generation of ever larger amounts of new data, which means that if we do not manage it correctly data quality could be compromised (i.e., problems related with heterogeneity and inconsistent data). In this paper, we propose the use of a Conceptual Schema of the Human Genome (CSHG), designed to understand and improve our ontological commitment to the domain and also extend (enrich) this schema with the integration of a novel concept: Haplotypes. Our focus is on improving the understanding of the relationship between genotype and phenotype, since new findings show that this question is more complex than was originally thought. Here we present the first steps in our data management approach with haplotypes (variations, frequencies and populations) and discuss the database evolution to support this data. Each new version in our conceptual schema (CS) introduces changes to the underlying database structure that has essential and practical implications for better understanding and managing the relevant information. A solution based on conceptual models gives a clear definition of the domain with direct implications in the medical field (Precision Medicine), in which Genomic Information Systems (GeIS) play a very important role.

  14. Bayesian inference for hybrid discrete-continuous stochastic kinetic models

    International Nuclear Information System (INIS)

    Sherlock, Chris; Golightly, Andrew; Gillespie, Colin S

    2014-01-01

    We consider the problem of efficiently performing simulation and inference for stochastic kinetic models. Whilst it is possible to work directly with the resulting Markov jump process (MJP), computational cost can be prohibitive for networks of realistic size and complexity. In this paper, we consider an inference scheme based on a novel hybrid simulator that classifies reactions as either ‘fast’ or ‘slow’ with fast reactions evolving as a continuous Markov process whilst the remaining slow reaction occurrences are modelled through a MJP with time-dependent hazards. A linear noise approximation (LNA) of fast reaction dynamics is employed and slow reaction events are captured by exploiting the ability to solve the stochastic differential equation driving the LNA. This simulation procedure is used as a proposal mechanism inside a particle MCMC scheme, thus allowing Bayesian inference for the model parameters. We apply the scheme to a simple application and compare the output with an existing hybrid approach and also a scheme for performing inference for the underlying discrete stochastic model. (paper)
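To make "working directly with the Markov jump process" concrete, here is a minimal sketch of exact (Gillespie-style) simulation for a toy immigration-death network. This is not the hybrid simulator of the paper; the two-reaction network, the rate constants and the function name are hypothetical and serve only to show why the cost grows with the number of reaction events.

```python
import random


def gillespie_immigration_death(k_in, k_out, x0, t_end, seed=0):
    """Exact simulation of an assumed toy network:
    immigration (0 -> X) with hazard k_in, death (X -> 0) with hazard k_out * x.
    Returns the path as a list of (time, state) pairs.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    path = [(t, x)]
    while True:
        h_in = k_in
        h_out = k_out * x
        h_total = h_in + h_out
        if h_total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate h_total.
        t += rng.expovariate(h_total)
        if t > t_end:
            break
        # Choose which reaction fired, with probability proportional to its hazard.
        if rng.random() * h_total < h_in:
            x += 1
        else:
            x -= 1
        path.append((t, x))
    return path


path = gillespie_immigration_death(k_in=10.0, k_out=1.0, x0=0, t_end=5.0)
```

Every reaction event costs one loop iteration, so networks with many fast reactions generate very long paths; that cost is the motivation the abstract gives for approximating the fast reactions with a continuous process.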

  15. Inferring Pairwise Interactions from Biological Data Using Maximum-Entropy Probability Models.

    Directory of Open Access Journals (Sweden)

    Richard R Stein

    2015-07-01

    Full Text Available Maximum entropy-based inference methods have been successfully used to infer direct interactions from biological datasets such as gene expression data or sequence ensembles. Here, we review undirected pairwise maximum-entropy probability models in two categories of data types, those with continuous and categorical random variables. As a concrete example, we present recently developed inference methods from the field of protein contact prediction and show that a basic set of assumptions leads to similar solution strategies for inferring the model parameters in both variable types. These parameters reflect interactive couplings between observables, which can be used to predict global properties of the biological system. Such methods are applicable to the important problems of protein 3-D structure prediction and association of gene-gene networks, and they enable potential applications to the analysis of gene alteration patterns and to protein design.

  16. Bayesian inference for spatio-temporal spike-and-slab priors

    DEFF Research Database (Denmark)

    Andersen, Michael Riis; Vehtari, Aki; Winther, Ole

    2017-01-01

    a transformed Gaussian process on the spike-and-slab probabilities. An expectation propagation (EP) algorithm for posterior inference under the proposed model is derived. For large scale problems, the standard EP algorithm can be prohibitively slow. We therefore introduce three different approximation schemes...

  17. The Impact of Transitive Inference Operations on Mathematics ...

    African Journals Online (AJOL)

    This study examined the extent to which operations of transitive inference tasks have affected the mathematics problem solving abilities of pre-primary school children. Four research hypotheses were tested at 0.05 level of significance using 400 nursery school children whose ages ranged between 4.5 and 5.5 years ...

  18. Sparse linear models: Variational approximate inference and Bayesian experimental design

    International Nuclear Information System (INIS)

    Seeger, Matthias W

    2009-01-01

    A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attention has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.

  19. Sparse linear models: Variational approximate inference and Bayesian experimental design

    Energy Technology Data Exchange (ETDEWEB)

    Seeger, Matthias W [Saarland University and Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbruecken (Germany)

    2009-12-01

    A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attention has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.

  20. Novel Nucleotide Variations, Haplotypes Structure and Associations with Growth Related Traits of Goat AT Motif-Binding Factor (ATBF1) Gene

    Directory of Open Access Journals (Sweden)

    Xiaoyan Zhang

    2015-10-01

    Full Text Available The AT motif-binding factor (ATBF1) not only interacts with protein inhibitor of activated signal transducer and activator of transcription 3 (STAT3) (PIAS3) to suppress STAT3 signaling regulating embryo early development and cell differentiation, but is required for early activation of the pituitary specific transcription factor 1 (Pit1) gene (also known as POU1F1) critically affecting mammalian growth and development. The goal of this study was to detect novel nucleotide variations and haplotypes structure of the ATBF1 gene, as well as to test their associations with growth-related traits in goats. Herein, a total of seven novel single nucleotide polymorphisms (SNPs) (SNP 1-7) within this gene were found in two well-known Chinese native goat breeds. Haplotypes structure analysis demonstrated that there were four haplotypes in Hainan black goat and seventeen haplotypes in Xinong Saanen dairy goat, and the two breeds shared only one haplotype (hap1). Association testing revealed that the SNP2, SNP5, SNP6, and SNP7 loci were each significantly associated with growth-related traits in goats. Moreover, one diplotype in Xinong Saanen dairy goats was significantly linked to growth related traits. These preliminary findings not only would extend the spectrum of genetic variations of the goat ATBF1 gene, but also would contribute to implementing marker-assisted selection in genetics and breeding in goats.

  1. BMP4 and FGF3 haplotypes increase the risk of tendinopathy in volleyball athletes.

    Science.gov (United States)

    Salles, José Inácio; Amaral, Marcus Vinícius; Aguiar, Diego Pinheiro; Lira, Daisy Anne; Quinelato, Valquiria; Bonato, Letícia Ladeira; Duarte, Maria Eugenia Leite; Vieira, Alexandre Rezende; Casado, Priscila Ladeira

    2015-03-01

    To investigate whether genetic variants can be correlated with tendinopathy in elite male volleyball athletes. Case-control study. Fifteen single nucleotide polymorphisms within BMP4, FGF3, FGF10, FGFR1 genes were investigated in 138 elite volleyball athletes, aged between 18 and 35 years, who undergo 4-5h of training per day: 52 with tendinopathy and 86 with no history of pain suggestive of tendinopathy in patellar, Achilles, shoulder, and hip abductors tendons. The clinical diagnostic criterion was progressive pain during training, confirmed by magnetic resonance image. Genomic DNA was obtained from saliva samples. Genetic markers were genotyped using TaqMan real-time PCR. Chi-square test compared genotypes and haplotype differences between groups. Multivariate logistic regression analyzed the significance of covariates and incidence of tendinopathy. Statistical analysis revealed participant age (p=0.005) and years of practice (p=0.004) were risk factors for tendinopathy. A significant association between BMP4 rs2761884 (p=0.03) and tendinopathy was observed. Athletes with a polymorphic genotype have 2.4 times more susceptibility to tendinopathy (OR=2.39; 95%CI=1.10-5.19). Also, association between disease and haplotype TTGGA in BMP4 (p=0.01) was observed. The FGF3 TGGTA haplotype showed a tendency of association with tendinopathy (p=0.05), and so did FGF10 rs900379. FGFR1 showed no association with disease. These findings indicate that haplotypes in BMP4 and FGF3 genes may contribute to the tendon disease process in elite volleyball athletes. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.

  2. Association of ATG16L1 gene haplotype with inflammatory bowel disease in Indians.

    Directory of Open Access Journals (Sweden)

    Srinivasan Pugazhendhi

    Full Text Available Inflammatory bowel disease (IBD) is characterized by multigenic inheritance. Defects in autophagy related genes are considered to show genetic heterogeneity between populations. We evaluated the association of several single nucleotide polymorphisms (SNPs) in the autophagy related 16 like 1 (ATG16L1) gene with IBD in Indians. The ATG16L1 gene was genotyped for ten different SNPs using DNA extracted from peripheral blood of 234 patients with Crohn's disease (CD), 249 patients with ulcerative colitis (UC) and 393 healthy controls. The SNPs rs2241880, rs4663396, rs3792106, rs10210302, rs3792109, rs2241877, rs6737398, rs11682898, rs4663402 and rs4663421 were genotyped using the Sequenom MassArray platform. PLINK was used for the association analysis and pairwise linkage disequilibrium (LD) values. Haplotype analysis was done using Haploview. All SNPs were in Hardy Weinberg equilibrium in cases and controls. The G allele at rs6737398 exhibited a protective association with both CD and UC. The T allele at rs4663402 and C allele at rs4663421 were positively associated with CD and UC. The T allele at rs2241877 exhibited protective association with UC only. The AA genotype at rs4663402 and the GG genotype at rs4663421 were protectively associated with both CD and UC. Haplotype analysis revealed that all the SNPs were in tight LD (D' = 0.76-1.0) and organized in a single haplotype block. Haplotype D was positively associated with IBD (P = 5.8 x 10^-6 for CD and 0.002 for UC). SNPs in ATG16L1 were associated with IBD in Indian patients. The relevance to management of individual patients requires further study.

  3. Evidence of a Native Northwest Atlantic COI Haplotype Clade in the Cryptogenic Colonial Ascidian Botryllus schlosseri.

    Science.gov (United States)

    Yund, Philip O; Collins, Catherine; Johnson, Sheri L

    2015-06-01

    The colonial ascidian Botryllus schlosseri should be considered cryptogenic (i.e., not definitively classified as either native or introduced) in the Northwest Atlantic. Although all the evidence is quite circumstantial, over the last 15 years most research groups have accepted the scenario of human-mediated dispersal and classified B. schlosseri as introduced; others have continued to consider it native or cryptogenic. We address the invasion status of this species by adding 174 sequences to the growing worldwide database for the mitochondrial gene cytochrome c oxidase subunit I (COI) and analyzing 1077 sequences to compare genetic diversity of one clade of haplotypes in the Northwest Atlantic with two hypothesized source regions (the Northeast Atlantic and Mediterranean). Our results lead us to reject the prevailing view of the directionality of transport across the Atlantic. We argue that the genetic diversity patterns at COI are far more consistent with the existence of at least one haplotype clade in the Northwest Atlantic (and possibly a second) that substantially pre-dates human colonization from Europe, with this native North American clade subsequently introduced to three sites in Northeast Atlantic and Mediterranean waters. However, we agree with past researchers that some sites in the Northwest Atlantic have more recently been invaded by alien haplotypes, so that some populations are currently composed of a mixture of native and invader haplotypes. © 2015 Marine Biological Laboratory.

  4. Echinococcus equinus and Echinococcus granulosus sensu stricto from the United Kingdom: genetic diversity and haplotypic variation.

    Science.gov (United States)

    Boufana, Belgees; Lett, Wai San; Lahmar, Samia; Buishi, Imad; Bodell, Anthony J; Varcasia, Antonio; Casulli, Adriano; Beeching, Nicholas J; Campbell, Fiona; Terlizzo, Monica; McManus, Donald P; Craig, Philip S

    2015-02-01

    Cystic echinococcosis is endemic in Europe including the United Kingdom. However, information on the molecular epidemiology of Echinococcus spp. from the United Kingdom is limited. Echinococcus isolates from intermediate and definitive animal hosts as well as from human cystic echinococcosis cases were analysed to determine species and genotypes within these hosts. Echinococcus equinus was identified from horse hydatid isolates, cysts retrieved from captive UK mammals and copro-DNA of foxhounds and farm dogs. Echinococcus granulosus sensu stricto (s.s.) was identified from hydatid cysts of sheep and cattle as well as in DNA extracted from farm dog and foxhound faecal samples, and from four human cystic echinococcosis isolates, including the first known molecular confirmation of E. granulosus s.s. infection in a Welsh sheep farmer. Low genetic variability for E. equinus from various hosts and from different geographical locations was detected using the mitochondrial cytochrome c oxidase subunit 1 gene (cox1), indicating the presence of a dominant haplotype (EQUK01). In contrast, greater haplotypic variation was observed for E. granulosus s.s. cox1 sequences. The haplotype network showed a star-shaped network with a centrally placed main haplotype (EgUK01) that had been reported from other world regions. Copyright © 2014 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.

  5. The DAOA/G30 locus and affective disorders: haplotype based association study in a polydiagnostic approach

    Directory of Open Access Journals (Sweden)

    Knapp Michael

    2010-07-01

    Full Text Available Abstract Background The DAOA/G30 (D-amino acid oxidase activator) gene complex at chromosomal region 13q32-33 is one of the most intriguing susceptibility loci for the major psychiatric disorders, although there is no consensus about the specific risk alleles or haplotypes across studies. Methods In a case-control sample of German descent (affective psychosis: n = 248; controls: n = 188) we examined seven single nucleotide polymorphisms (SNPs) around DAOA/G30 (rs3916966, rs1935058, rs2391191, rs1935062, rs947267, rs3918342, and rs9558575) for genetic association in a polydiagnostic approach (ICD 10; Leonhard's classification). Results No single marker showed evidence of overall association with affective disorder in either ICD10 or Leonhard's classification. Haplotype analysis revealed no association with recurrent unipolar depression or bipolar disorder according to ICD10; within Leonhard's classification, manic-depression was associated with a 3-locus haplotype (rs2391191, rs1935062, and rs3916966; P = 0.022) and monopolar depression with a 5-locus combination at the DAOA/G30 core region (P = 0.036). Conclusion Our data revealed potential evidence for partially overlapping risk haplotypes at the DAOA/G30 locus in Leonhard's affective psychoses, but do not support a common genetic contribution of the DAOA/G30 gene complex to the pathogenesis of affective disorders.

  6. De novo assembly of a haplotype-resolved human genome

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang

    2015-01-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-...

  7. Inheritance of the 8.1 ancestral haplotype in recurrent pregnancy loss

    DEFF Research Database (Denmark)

    Kolte, Astrid M; Nielsen, Henriette S; Steffensen, Rudi

    2015-01-01

    pleiotropy. It has also been proposed that the survival of long, conserved haplotypes may be due to gestational drive, i.e. selective miscarriage of fetuses who have not inherited the haplotype from a heterozygous mother. Recurrent pregnancy loss (RPL) is defined as three or more consecutive pregnancy losses....... The objective was to test the gestational drive theory for the 8.1AH in women with RPL and their live born children. METHODOLOGY: We investigated the inheritance of the 8.1AH from 82 heterozygous RPL women to 110 live born children. All participants were genotyped for HLA-A, -B and -DRB1 in DNA from EDTA......-treated blood or buccal swabs. Inheritance was compared with a Mendelian inheritance of 50% using a two-sided exact binomial test. RESULTS: We found that 55% of the live born children had inherited the 8.1AH, which was not significantly higher than the expected 50% (P = 0.29). Interestingly, we found a non...
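The two-sided exact binomial test named in this abstract can be sketched as follows. The abstract gives only percentages, so the count used here (61 of 110 children, roughly 55%) is an assumption for illustration, and the function name is hypothetical. The p-value is computed with the usual "sum of outcomes no more likely than the observed one" rule.

```python
from math import comb


def exact_binomial_two_sided(k, n, p=0.5):
    """Two-sided exact binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under Binomial(n, p).
    """
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    # Small relative tolerance so that ties are not lost to rounding.
    threshold = pmf[k] * (1 + 1e-7)
    return min(1.0, sum(q for q in pmf if q <= threshold))


# Assumed count: 61 of 110 children (about 55%) inherited the haplotype.
p_value = exact_binomial_two_sided(61, 110, p=0.5)
```

Under the Mendelian null of 50%, a share of about 55% in 110 children is well within chance variation, which is in line with the non-significant result the abstract reports.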

  8. Model averaging, optimal inference and habit formation

    Directory of Open Access Journals (Sweden)

    Thomas H B FitzGerald

    2014-06-01

    Full Text Available Postulating that the brain performs approximate Bayesian inference generates principled and empirically testable models of neuronal function – the subject of much current interest in neuroscience and related disciplines. Current formulations address inference and learning under some assumed and particular model. In reality, organisms are often faced with an additional challenge – that of determining which model or models of their environment are the best for guiding behaviour. Bayesian model averaging – which says that an agent should weight the predictions of different models according to their evidence – provides a principled way to solve this problem. Importantly, because model evidence is determined by both the accuracy and complexity of the model, optimal inference requires that these be traded off against one another. This means an agent’s behaviour should show an equivalent balance. We hypothesise that Bayesian model averaging plays an important role in cognition, given that it is both optimal and realisable within a plausible neuronal architecture. We outline model averaging and how it might be implemented, and then explore a number of implications for brain and behaviour. In particular, we propose that model averaging can explain a number of apparently suboptimal phenomena within the framework of approximate (bounded) Bayesian inference, focussing particularly upon the relationship between goal-directed and habitual behaviour.
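The weighting rule described in this abstract (predictions weighted by model evidence) can be sketched in a few lines. This is a generic illustration, not the authors' neuronal implementation: it assumes equal prior probability for every model, takes log evidences as given inputs, and uses hypothetical function names.

```python
import math


def model_weights(log_evidences):
    """Posterior model probabilities from log evidences, assuming equal priors.

    The maximum is subtracted before exponentiating for numerical stability.
    """
    m = max(log_evidences)
    unnormalised = [math.exp(le - m) for le in log_evidences]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def averaged_prediction(log_evidences, predictions):
    """Bayesian model average of scalar predictions, one per model."""
    weights = model_weights(log_evidences)
    return sum(w * p for w, p in zip(weights, predictions))


# Two hypothetical models; the second has three times the evidence of the first.
weights = model_weights([0.0, math.log(3.0)])
blend = averaged_prediction([0.0, math.log(3.0)], [0.0, 1.0])
```

Because the evidence already penalises complexity as well as rewarding accuracy, a simpler model can receive most of the weight even when a richer model fits the data slightly better.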

  9. Asymptotic inference for jump diffusions with state-dependent intensity

    NARCIS (Netherlands)

    Becheri, Gaia; Drost, Feico; Werker, Bas

    2016-01-01

    We establish the local asymptotic normality property for a class of ergodic parametric jump-diffusion processes with state-dependent intensity and known volatility function sampled at high frequency. We prove that the inference problem about the drift and jump parameters is adaptive with respect to

  10. Haplotype and genetic relationship of 27 Y-STR loci in Han population of Chaoshan area of China

    Directory of Open Access Journals (Sweden)

    Qing-hua TIAN

    2017-04-01

    Full Text Available Objective: To investigate the genetic polymorphisms of 27 Y-chromosomal short tandem repeat (Y-STR) loci included in the Yfiler® Plus kit in the Han population of the Chaoshan area, to explore the population genetic relationships, and to evaluate the kit's application value in forensic medicine. Methods: Using the Yfiler® Plus kit, 795 unrelated Chaoshan Han males were typed; haplotype frequencies and population genetics parameters of the 27 Y-STR loci were statistically analyzed and compared with available data from other populations of different races and regions to analyze the genetic distance and clustering relationships of the Chaoshan Han population. Results: Seven hundred and eighty-seven different haplotypes were observed in 795 unrelated male individuals, of which 779 haplotypes were unique and 8 haplotypes occurred twice. The haplotype diversity (HD) was 0.999975, with a discriminative capacity (DC) of 98.99%. The gene diversity (GD) at the 27 Y-STR loci ranged from 0.3637 (DYS391) to 0.9559 (DYS385a/b). In comparison with Asian reference populations, the genetic distance (Rst) between Chaoshan Han and Guangdong Han was the smallest (0.0036), while it was relatively larger between Chaoshan Han and the Gansu Tibetan population (0.0935). The multi-dimensional scaling (MDS) plot based on Rst values was similar to the results of clustering analysis. Conclusion: Multiplex detection of the 27 Y-STR loci reveals a highly polymorphic genetic distribution in the Chaoshan Han population, which demonstrates the significance of the Yfiler® Plus kit for establishing a Y-STR database, studying population genetics, and forensic practice. DOI: 10.11855/j.issn.0577-7402.2017.03.08
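
    The reported haplotype diversity and discriminative capacity can be checked from the haplotype counts given in the abstract (779 haplotypes seen once, 8 seen twice). The sketch assumes the usual Nei estimator, HD = n/(n-1) * (1 - sum of squared frequencies), and DC = distinct haplotypes / sample size; the abstract does not state which formulas were used, and the function names are hypothetical.

```python
def haplotype_diversity(counts):
    """Nei's unbiased haplotype diversity from per-haplotype counts."""
    n = sum(counts)
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_sq)

def discriminative_capacity(counts):
    """Number of distinct haplotypes divided by the number of individuals."""
    return len(counts) / sum(counts)

# Counts from the abstract: 779 unique haplotypes, 8 haplotypes seen twice.
counts = [1] * 779 + [2] * 8
hd = haplotype_diversity(counts)
dc = discriminative_capacity(counts)
print(f"HD = {hd:.6f}, DC = {dc:.2%}")  # prints: HD = 0.999975, DC = 98.99%
```

    Under these assumed formulas the counts reproduce both reported values.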

  11. No shortcut solution to the problem of Y-STR match probability calculation.

    Science.gov (United States)

    Caliebe, Amke; Jochens, Arne; Willuweit, Sascha; Roewer, Lutz; Krawczak, Michael

    2015-03-01

    Match probability calculation is deemed much more intricate for lineage genetic markers, including Y-chromosomal short tandem repeats (Y-STRs), than for autosomal markers. This is because, owing to the lack of recombination, strong interdependence between markers is likely, which implies that haplotype frequency estimates cannot simply be obtained through the multiplication of allele frequency estimates. As yet, however, the practical relevance of this problem has not been studied in much detail using real data. In fact, such scrutiny appears well warranted because the high mutation rates of Y-STRs and the possibility of backward mutation should have worked against the statistical association of Y-STRs. We examined haplotype data of 21 markers included in the PowerPlex® Y23 set (PPY23, Promega Corporation, Madison, WI) originating from six different populations (four European and two Asian). Assessing the conditional entropies of the markers, given different subsets of markers from the same panel, we demonstrate that the PowerPlex® Y23 set cannot be decomposed into smaller marker subsets that would be (conditionally) independent. Nevertheless, in all six populations, >94% of the joint entropy of the 21 markers is explained by the seven most rapidly mutating markers. Although this result might render a reduction in marker number a sensible option for practical casework, the partial haplotypes would still be almost as diverse as the full haplotypes. Therefore, match probability calculation remains difficult and calls for the improvement of currently available methods of haplotype frequency estimation. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
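
    The conditional-entropy idea in the record above can be illustrated with a small empirical estimator, using H(target | given) = H(target, given) - H(given). This is a generic sketch, not the authors' analysis pipeline; the function names and the four toy haplotypes are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (bits) of the empirical distribution of items."""
    n = len(items)
    return -sum(c / n * log2(c / n) for c in Counter(items).values())

def conditional_entropy(haplotypes, target, given):
    """H(target | given) = H(target, given) - H(given), where target is a
    marker index and given is a list of marker indices."""
    joint = [tuple(h[i] for i in given) + (h[target],) for h in haplotypes]
    cond = [tuple(h[i] for i in given) for h in haplotypes]
    return entropy(joint) - entropy(cond)

# Hypothetical toy data: four haplotypes typed at three markers.
toy = [(14, 10, 21), (14, 10, 22), (15, 11, 21), (15, 11, 22)]
h_dep = conditional_entropy(toy, target=1, given=[0])    # marker 1 is fixed by marker 0
h_indep = conditional_entropy(toy, target=2, given=[0])  # marker 2 is independent of marker 0
```

    A conditional entropy of zero means the target marker adds no information once the other markers are known. That is the situation in which multiplying allele frequencies would misestimate the haplotype frequency.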

  12. Sequence variation of bovine mitochondrial ND-5 between haplotypes of composite and Hereford Breeds of beef cattle

    Directory of Open Access Journals (Sweden)

    SUTARNO

    2002-07-01

    Full Text Available The aims of the study were to investigate polymorphisms in the ND-5 region of bovine mitochondrial DNA in the composite and purebred Hereford herds from the Wokalup selection experiment, to sequence each haplotype, and to compare the sequences between haplotypes and with the published sequence from GenBank. A total of 194 Hereford and 235 composite breed cattle from Wokalup Research Station were used in this study. Mitochondrial DNA was extracted using the Wizard genomic DNA purification system from Promega. The ND-5 fragment of mitochondrial DNA was amplified by PCR and then analyzed by RFLP. Each haplotype was sequenced: the PCR products of each haplotype were cloned into pCR II and transformed, colonies were selected, and plasmid DNA was extracted and used for cycle sequencing. PCR-RFLP analysis found polymorphisms in the ND-5 region of mitochondrial DNA in both breeds of cattle. Sequencing analysis confirmed the RFLP data.

  13. The anatomy of choice: active inference and agency

    Directory of Open Access Journals (Sweden)

    Karl Friston

    2013-09-01

    Full Text Available This paper considers agency in the setting of embodied or active inference. In brief, we associate a sense of agency with prior beliefs about action and ask what sorts of beliefs underlie optimal behaviour. In particular, we consider prior beliefs that action minimises the Kullback-Leibler divergence between desired states and attainable states in the future. This allows one to formulate bounded rationality as approximate Bayesian inference that optimises a free energy bound on model evidence. We show that constructs like expected utility, exploration bonuses, softmax choice rules and optimism bias emerge as natural consequences of this formulation. Previous accounts of active inference have focused on predictive coding and Bayesian filtering schemes for minimising free energy. Here, we consider variational Bayes as an alternative scheme that provides formal constraints on the computational anatomy of inference and action – constraints that are remarkably consistent with neuroanatomy. Furthermore, this scheme contextualises optimal decision theory and economic (utilitarian) formulations as pure inference problems. For example, expected utility theory emerges as a special case of free energy minimisation, where the sensitivity or inverse temperature (of softmax functions and quantal response equilibria) has a unique and Bayes-optimal solution – that minimises free energy. This sensitivity corresponds to the precision of beliefs about behaviour, such that attainable goals are afforded a higher precision or confidence. In turn, this means that optimal behaviour entails a representation of confidence about outcomes that are under an agent's control.

  14. The anatomy of choice: active inference and agency.

    Science.gov (United States)

    Friston, Karl; Schwartenbeck, Philipp; Fitzgerald, Thomas; Moutoussis, Michael; Behrens, Timothy; Dolan, Raymond J

    2013-01-01

    This paper considers agency in the setting of embodied or active inference. In brief, we associate a sense of agency with prior beliefs about action and ask what sorts of beliefs underlie optimal behavior. In particular, we consider prior beliefs that action minimizes the Kullback-Leibler (KL) divergence between desired states and attainable states in the future. This allows one to formulate bounded rationality as approximate Bayesian inference that optimizes a free energy bound on model evidence. We show that constructs like expected utility, exploration bonuses, softmax choice rules and optimism bias emerge as natural consequences of this formulation. Previous accounts of active inference have focused on predictive coding and Bayesian filtering schemes for minimizing free energy. Here, we consider variational Bayes as an alternative scheme that provides formal constraints on the computational anatomy of inference and action-constraints that are remarkably consistent with neuroanatomy. Furthermore, this scheme contextualizes optimal decision theory and economic (utilitarian) formulations as pure inference problems. For example, expected utility theory emerges as a special case of free energy minimization, where the sensitivity or inverse temperature (of softmax functions and quantal response equilibria) has a unique and Bayes-optimal solution-that minimizes free energy. This sensitivity corresponds to the precision of beliefs about behavior, such that attainable goals are afforded a higher precision or confidence. In turn, this means that optimal behavior entails a representation of confidence about outcomes that are under an agent's control.
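
    Two quantities named in the abstract above, the softmax choice rule with a sensitivity (inverse temperature, or precision) parameter and the Kullback-Leibler divergence, can be written down generically as follows. This is a textbook-style sketch with hypothetical function names and values; it does not implement the variational scheme of the paper.

```python
import math

def softmax_choice(utilities, precision):
    """Choice probabilities proportional to exp(precision * utility)."""
    m = max(utilities)
    weights = [math.exp(precision * (u - m)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical utilities for two actions, evaluated at low and high precision.
low = softmax_choice([1.0, 2.0], precision=1.0)
high = softmax_choice([1.0, 2.0], precision=5.0)
```

    Raising the precision concentrates probability on the higher-utility action, which is the sense in which precision encodes confidence about behaviour.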

  15. Haplotypes in the APOA1-C3-A4-A5 gene cluster affect plasma lipids in both humans and baboons

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Qian-fei; Liu, Xin; O'Connell, Jeff; Peng, Ze; Krauss, Ronald M.; Rainwater, David L.; VandeBerg, John L.; Rubin, Edward M.; Cheng, Jan-Fang; Pennacchio, Len A.

    2003-09-15

    Genetic studies in non-human primates serve as a potential strategy for identifying genomic intervals where polymorphisms impact upon human disease-related phenotypes. It remains unclear, however, whether independently arising polymorphisms in orthologous regions of non-human primates lead to similar variation in a quantitative trait found in both species. To explore this paradigm, we studied a baboon apolipoprotein gene cluster (APOA1/C3/A4/A5) for which the human gene orthologs have well established roles in influencing plasma HDL-cholesterol and triglyceride concentrations. Our extensive polymorphism analysis of this 68 kb gene cluster in 96 pedigreed baboons identified several haplotype blocks each with limited diversity, consistent with haplotype findings in humans. To determine whether baboons, like humans, also have particular haplotypes associated with lipid phenotypes, we genotyped 634 well characterized baboons using 16 haplotype tagging SNPs. Genetic analysis of single SNPs, as well as haplotypes, revealed an association of APOA5 and APOC3 variants with HDL cholesterol and triglyceride concentrations, respectively. Thus, independent variation in orthologous genomic intervals does associate with similar quantitative lipid traits in both species, supporting the possibility of uncovering human QTL genes in a highly controlled non-human primate model.

  16. Effects of Bos taurus autosome 9-located quantitative trait loci haplotypes on the disease phenotypes of dairy cows with experimentally induced Escherichia coli mastitis

    DEFF Research Database (Denmark)

    Khatun, Momena; Sørensen, Peter; Jørgensen, Hanne Birgitte Hede

    2013-01-01

    Several quantitative trait loci (QTL) affecting mastitis incidence and mastitis-related traits such as somatic cell score exist in dairy cows. Previously, QTL haplotypes associated with susceptibility to Escherichia coli mastitis in Nordic Holstein-Friesian (HF) cows were identified on Bos taurus...... autosome 9. In the present study, we induced experimental E. coli mastitis in Danish HF cows to investigate the effect of 2 E. coli mastitis-associated QTL haplotypes on the cows' disease phenotypes and recovery in early lactation. Thirty-two cows were divided in 2 groups bearing haplotypes with either low...... the HH group did. However, we also found interactions between the effects of haplotype and biopsy for body temperature, heart rate, and PMNL. In conclusion, when challenged with E. coli mastitis, HF cows with the specific Bos taurus autosome 9-located QTL haplotypes were associated with differences...

  17. Bayesian inference from count data using discrete uniform priors.

    Directory of Open Access Journals (Sweden)

    Federico Comoglio

    Full Text Available We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing a homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.
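
    A minimal sketch of the kind of posterior described above, for a single sample count only: a binomial likelihood in the unknown population size n is combined with a discrete uniform prior on an assumed range. It is written in Python for illustration and does not reproduce the dupiR interface or the paper's general formula for several counts; the function name and the example numbers are hypothetical.

```python
from math import comb

def population_posterior(count, fraction, n_min, n_max):
    """Posterior over population size n given one sample count, a binomial
    likelihood with sampling fraction `fraction`, and a discrete uniform
    prior on [n_min, n_max]."""
    support = range(max(n_min, count), n_max + 1)
    likelihood = [comb(n, count) * fraction**count * (1 - fraction) ** (n - count)
                  for n in support]
    total = sum(likelihood)  # the uniform prior cancels in the normalisation
    return {n: l / total for n, l in zip(support, likelihood)}

# Hypothetical measurement: 20 objects counted in a 10% sample.
post = population_posterior(count=20, fraction=0.1, n_min=1, n_max=1000)
n_map = max(post, key=post.get)
```

    With these assumed numbers the posterior mode sits near count / fraction, and the width of the distribution quantifies the uncertainty that a simple scaled-up count would hide.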

  18. Distribution pattern of Plasmodium falciparum chloroquine transporter (pfcrt) gene haplotypes in Sri Lanka 1996-2006

    DEFF Research Database (Denmark)

    Zhang, Jenny J; Senaratne, Tharanga N; Daniels, Rachel

    2011-01-01

    Abstract. Widespread antimalarial resistance has been a barrier to malaria elimination efforts in Sri Lanka. Analysis of genetic markers in historic parasites may uncover trends in the spread of resistance. We examined the frequency of Plasmodium falciparum chloroquine transporter (pfcrt; codons 72......-76) haplotypes in Sri Lanka in 1996-1998 and 2004-2006 using a high-resolution melting assay. Among 59 samples from 1996 to 1998, we detected the SVMNT (86%), CVMNK (10%), and CVIET (2%) haplotypes, with a positive trend in SVMNT and a negative trend in CVMNK frequency (P = 0.004) over time. Among 24 samples...

  19. Risk of Pediatric Celiac Disease According to HLA Haplotype and Country

    Science.gov (United States)

    Liu, Edwin; Lee, Hye-Seung; Aronsson, Carin A.; Hagopian, William A.; Koletzko, Sibylle; Rewers, Marian J.; Eisenbarth, George S.; Bingley, Polly J.; Bonifacio, Ezio; Simell, Ville; Agardh, Daniel

    2014-01-01

    BACKGROUND The presence of HLA haplotype DR3–DQ2 or DR4–DQ8 is associated with an increased risk of celiac disease. In addition, nearly all children with celiac disease have serum antibodies against tissue transglutaminase (tTG). METHODS We studied 6403 children with HLA haplotype DR3–DQ2 or DR4–DQ8 prospectively from birth in the United States, Finland, Germany, and Sweden. The primary end point was the development of celiac disease autoimmunity, which was defined as the presence of tTG antibodies on two consecutive tests at least 3 months apart. The secondary end point was the development of celiac disease, which was defined for the purpose of this study as either a diagnosis on biopsy or persistently high levels of tTG antibodies. RESULTS The median follow-up was 60 months (interquartile range, 46 to 77). Celiac disease autoimmunity developed in 786 children (12%). Of the 350 children who underwent biopsy, 291 had confirmed celiac disease; an additional 21 children who did not undergo biopsy had persistently high levels of tTG antibodies. The risks of celiac disease autoimmunity and celiac disease by the age of 5 years were 11% and 3%, respectively, among children with a single DR3–DQ2 haplotype, and 26% and 11%, respectively, among those with two copies (DR3–DQ2 homozygosity). In the adjusted model, the hazard ratios for celiac disease autoimmunity were 2.09 (95% confidence interval [CI], 1.70 to 2.56) among heterozygotes and 5.70 (95% CI, 4.66 to 6.97) among homozygotes, as compared with children who had the lowest-risk genotypes (DR4–DQ8 heterozygotes or homozygotes). Residence in Sweden was also independently associated with an increased risk of celiac disease autoimmunity (hazard ratio, 1.90; 95% CI, 1.61 to 2.25). CONCLUSIONS Children with the HLA haplotype DR3–DQ2, especially homozygotes, were found to be at high risk for celiac disease autoimmunity and celiac disease early in childhood. The higher risk in Sweden than in other countries

  20. Association of galanin haplotypes with alcoholism and anxiety in two ethnically distinct populations

    Science.gov (United States)

    Belfer, I; Hipp, H; McKnight, C; Evans, C; Buzas, B; Bollettino, A; Albaugh, B; Virkkunen, M; Yuan, Q; Max, MB; Goldman, D; Enoch, MA

    2009-01-01

    The neuropeptide galanin (GAL) is widely expressed in the central nervous system. Animal studies have implicated GAL in alcohol abuse and anxiety: chronic ethanol intake increases hypothalamic GAL mRNA; high levels of stress increase GAL release in the central amygdala. The coding sequence of the galanin gene, GAL, is highly conserved and a functional polymorphism has not yet been found. The aim of our study was, for the first time, to identify GAL haplotypes and investigate associations with alcoholism and anxiety. Seven single-nucleotide polymorphisms (SNPs) spanning GAL were genotyped in 65 controls from five populations: US and Finnish Caucasians, African Americans, Plains and Southwestern Indians. A single haplotype block with little evidence of historical recombination was observed for each population. Four tag SNPs were then genotyped in DSM-III-R lifetime alcoholics and nonalcoholics from two population isolates: 514 Finnish Caucasian men and 331 Plains Indian men and women. Tridimensional Personality Questionnaire harm avoidance (HA) scores, a dimensional measure of anxiety, were obtained. There was a haplotype association with alcoholism in both the Finnish (P=0.001) and Plains Indian (P=0.004) men. The SNPs were also significantly associated. Alcoholics were divided into high and low HA groups (≥ and alcoholics, low HA alcoholics and nonalcoholics. Our results from two independent populations suggest that GAL may contribute to vulnerability to alcoholism, perhaps mediated by dimensional anxiety. PMID:16314872

  1. Major soybean maturity gene haplotypes revealed by SNPViz analysis of 72 sequenced soybean genomes.

    Directory of Open Access Journals (Sweden)

    Tiffany Langewisch

    Full Text Available In this Genomics Era, vast amounts of next-generation sequencing data have become publicly available for multiple genomes across hundreds of species. Analyses of these large-scale datasets can become cumbersome, especially when comparing nucleotide polymorphisms across many samples within a dataset and among different datasets or organisms. To facilitate the exploration of allelic variation and diversity, we have developed and deployed in-house computer software to categorize and visualize these haplotypes. The SNPViz software enables users to analyze region-specific haplotypes from single nucleotide polymorphism (SNP) datasets for different sequenced genomes. The examination of allelic variation and diversity of important soybean [Glycine max (L.) Merr.] flowering time and maturity genes may provide additional insight into flowering time regulation and enhance researchers' ability to target soybean breeding for particular environments. For this study, we utilized two available soybean genomic datasets for a total of 72 soybean genotypes encompassing cultivars, landraces, and the wild species Glycine soja. The major soybean maturity genes E1, E2, E3, and E4 along with the Dt1 gene for plant growth architecture were analyzed in an effort to determine the number of major haplotypes for each gene, to evaluate the consistency of the haplotypes with characterized variant alleles, and to identify evidence of artificial selection. The results indicated classification of a small number of predominant haplogroups for each gene and important insights into possible allelic diversity for each gene within the context of known causative mutations. The software has both a stand-alone and web-based version and can be used to analyze other genes, examine additional soybean datasets, and view similar genome sequence and SNP datasets from other species.

  2. Inverse problem of solar oscillations

    International Nuclear Information System (INIS)

    Sekii, T.; Shibahashi, H.

    1987-01-01

    The authors present some preliminary results of numerical simulation to infer the sound velocity distribution in the solar interior from the oscillation data of the Sun as the inverse problem. They analyze the acoustic potential itself by taking account of some factors other than the sound velocity, and infer the sound velocity distribution in the deep interior of the Sun

  3. Inflammation, insulin resistance, and diabetes--Mendelian randomization using CRP haplotypes points upstream.

    Directory of Open Access Journals (Sweden)

    Eric J Brunner

    2008-08-01

    Full Text Available Raised C-reactive protein (CRP) is a risk factor for type 2 diabetes. According to the Mendelian randomization method, the association is likely to be causal if genetic variants that affect CRP level are associated with markers of diabetes development and diabetes. Our objective was to examine the nature of the association between CRP phenotype and diabetes development using CRP haplotypes as instrumental variables. We genotyped three tagging SNPs (CRP + 2302G > A; CRP + 1444T > C; CRP + 4899T > G) in the CRP gene and measured serum CRP in 5,274 men and women at mean ages 49 and 61 y (Whitehall II Study). Homeostasis model assessment-insulin resistance (HOMA-IR) and hemoglobin A1c (HbA1c) were measured at age 61 y. Diabetes was ascertained by glucose tolerance test and self-report. Common major haplotypes were strongly associated with serum CRP levels, but unrelated to obesity, blood pressure, and socioeconomic position, which may confound the association between CRP and diabetes risk. Serum CRP was associated with these potential confounding factors. After adjustment for age and sex, baseline serum CRP was associated with incident diabetes (hazard ratio = 1.39 [95% confidence interval 1.29-1.51]), HOMA-IR, and HbA1c, but the associations were considerably attenuated on adjustment for potential confounding factors. In contrast, CRP haplotypes were not associated with HOMA-IR or HbA1c (p = 0.52-0.92). The associations of CRP with HOMA-IR and HbA1c were all null when examined using instrumental variables analysis, with genetic variants as the instrument for serum CRP. Instrumental variables estimates differed from the directly observed associations (p = 0.007-0.11). Pooled analysis of CRP haplotypes and diabetes in Whitehall II and Northwick Park Heart Study II produced null findings (p = 0.25-0.88). Analyses based on the Wellcome Trust Case Control Consortium (1,923 diabetes cases, 2,932 controls) using three SNPs in tight linkage disequilibrium with our

  4. Haplotype analysis of the genes encoding glutamine synthetase plastic isoforms and their association with nitrogen-use- and yield-related traits in bread wheat.

    Science.gov (United States)

    Li, Xin-Peng; Zhao, Xue-Qiang; He, Xue; Zhao, Guang-Yao; Li, Bin; Liu, Dong-Cheng; Zhang, Ai-Min; Zhang, Xue-Yong; Tong, Yi-Ping; Li, Zhen-Sheng

    2011-01-01

    Glutamine synthetase (GS) plays a key role in the growth, nitrogen (N) use and yield potential of cereal crops. Investigating the haplotype variation of GS genes and its association with agronomic traits may provide useful information for improving wheat N-use efficiency and yield. We isolated the promoter and coding region sequences of the plastic glutamine synthetase isoform (GS2) genes located on chromosomes 2A, 2B and 2D in bread wheat. By analyzing nucleotide sequence variations of the coding region, two, six and two haplotypes were distinguished for TaGS2-A1 (a and b), TaGS2-B1 (a-f) and TaGS2-D1 (a and b), respectively. By analyzing the frequency data of different haplotypes and their association with N use and agronomic traits, four major and favorable TaGS2 haplotypes (A1b, B1a, B1b, D1a) were revealed. These favorable haplotypes may confer better seedling growth, better agronomic performance, and improved N uptake during vegetative growth or grain N concentration. Our data suggest that certain TaGS2 haplotypes may be valuable in breeding wheat varieties with improved agronomic performance and N-use efficiency. © The Authors (2010). Journal compilation © New Phytologist Trust (2010).

  5. Critical examination of logical formulations in quantum theory. Statistical inference and Hilbertian distance between quantum states

    International Nuclear Information System (INIS)

    Hadjisawas, Nicolas.

    1982-01-01

    After a critical study of the logical quantum mechanics formulations of Jauch and Piron, classical and quantum versions of statistical inference are studied. In order to do this, the significance of the Jaynes and Kullback principles (maximum likelihood, least squares principles) is revealed from the theorems established. In the quantum mechanics inference problem, a "distance" between states is defined. This concept is used to solve the quantum equivalent of the classical problem studied by Kullback. The "projection postulate" proposition is subsequently deduced [fr]

  6. Bayesian inference for Hawkes processes

    DEFF Research Database (Denmark)

    Rasmussen, Jakob Gulddahl

    The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....

  7. Bayesian inference for Hawkes processes

    DEFF Research Database (Denmark)

    Rasmussen, Jakob Gulddahl

    2013-01-01

    The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....

  8. Genome patterns of selection and introgression of haplotypes in natural populations of the house mouse (Mus musculus).

    Directory of Open Access Journals (Sweden)

    Fabian Staubach

    Full Text Available General parameters of selection, such as the frequency and strength of positive selection in natural populations or the role of introgression, are still insufficiently understood. The house mouse (Mus musculus) is a particularly well-suited model system to approach such questions, since it has a defined history of splits into subspecies and populations and since extensive genome information is available. We have used high-density single-nucleotide polymorphism (SNP) typing arrays to assess genomic patterns of positive selection and introgression of alleles in two natural populations of each of the subspecies M. m. domesticus and M. m. musculus. Applying different statistical procedures, we find a large number of regions subject to apparent selective sweeps, indicating frequent positive selection on rare alleles or novel mutations. Genes in the regions include well-studied imprinted loci (e.g. Plagl1/Zac1), homologues of human genes involved in adaptations (e.g. alpha-amylase genes) or in genetic diseases (e.g. Huntingtin and Parkin). Haplotype matching between the two subspecies reveals a large number of haplotypes that show patterns of introgression from specific populations of the respective other subspecies, with at least 10% of the genome being affected by partial or full introgression. Using neutral simulations for comparison, we find that the size and the fraction of introgressed haplotypes are not compatible with a pure migration or incomplete lineage sorting model. Hence, it appears that introgressed haplotypes can rise in frequency due to positive selection and thus can contribute to the adaptive genomic landscape of natural populations. Our data support the notion that natural genomes are subject to complex adaptive processes, including the introgression of haplotypes from other differentiated populations or species at a larger scale than previously assumed for animals. This implies that some of the admixture found in inbred strains of mice

  9. Bayesian Estimation and Inference using Stochastic Hardware

    Directory of Open Access Journals (Sweden)

    Chetan Singh Thakur

    2016-03-01

    Full Text Available In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using a single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.

  10. Bayesian Estimation and Inference Using Stochastic Electronics.

    Science.gov (United States)

    Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André

    2016-01-01

    In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using a single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
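
    The "Bayesian recursive equation" used for tracking in the record above corresponds, in software terms, to the HMM forward update: predict with the transition model, weight by the observation likelihood, then normalise. The sketch below assumes both models are already known, whereas the paper's tracker learns them and runs on stochastic hardware; the function name, the three-cell track and all probabilities are hypothetical.

```python
def bayes_filter_step(belief, transition, likelihood):
    """One step of the recursive Bayesian (HMM forward) update.

    belief[i]        : P(state i | observations so far)
    transition[i][j] : P(next state j | current state i)
    likelihood[j]    : P(current observation | state j)
    """
    n = len(belief)
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    unnorm = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical 3-cell track: the target tends to stay put or move one cell.
transition = [[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]]
belief = [1 / 3, 1 / 3, 1 / 3]
likelihood = [0.1, 0.7, 0.2]  # noisy sensor reading that favours cell 1
belief = bayes_filter_step(belief, transition, likelihood)
```

    Calling the function once per time step with the latest sensor likelihood keeps a running estimate of the target position.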

  11. An algebra-based method for inferring gene regulatory networks.

    Science.gov (United States)

    Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

    2014-03-26

    The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. 
Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the

  12. Genomic association for sexual precocity in beef heifers using pre-selection of genes and haplotype reconstruction.

    Directory of Open Access Journals (Sweden)

    Luciana Takada

    Full Text Available Reproductive traits are of the utmost importance for any livestock farming, but are difficult to measure and to interpret since they are influenced by various factors. The objective of this study was to detect associations between known polymorphisms in candidate genes related to sexual precocity in Nellore heifers, which could be used in breeding programs. Records of 1,689 precocious and non-precocious heifers from farms participating in the Conexão Delta G breeding program were analyzed. A subset of single nucleotide polymorphisms (SNP) located in the region of the candidate genes at a distance of up to 5 kb from the boundaries of each gene, were selected from the panel of 777,000 SNPs of the High-Density Bovine SNP BeadChip. Linear mixed models were used for statistical analysis of early heifer pregnancy, relating the trait with isolated SNPs or with haplotype groups. The model included the contemporary group (year and month of birth) as fixed effect and parent of the animal (sire effect) as random effect. The fastPHASE® and GenomeStudio® were used for reconstruction of the haplotypes and for analysis of linkage disequilibrium based on r² statistics. A total of 125 candidate genes and 2,024 SNPs forming haplotypes were analyzed. Statistical analysis after Bonferroni correction showed that nine haplotypes exerted a significant effect (p<0.05) on sexual precocity. Four of these haplotypes were located in the Pregnancy-associated plasma protein-A2 gene (PAPP-A2), two in the Estrogen-related receptor gamma gene (ESRRG), and one each in the Pregnancy-associated plasma protein-A gene (PAPP-A), Kell blood group complex subunit-related family (XKR4) and mannose-binding lectin (MBL-1) genes. Although the present results indicate that the PAPP-A2, PAPP-A, XKR4, MBL-1 and ESRRG genes influence sexual precocity in Nellore heifers, further studies are needed to evaluate their possible use in breeding programs.
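
    Two statistics named in the abstract above, the r² measure of linkage disequilibrium and the Bonferroni correction, have standard textbook definitions that can be sketched briefly. The code below is an illustration under stated assumptions, not the procedure used by fastPHASE or GenomeStudio: it assumes phased haplotypes at two biallelic SNPs coded as the characters '0' and '1', and the function names are hypothetical.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs from phased two-character haplotypes.

    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB,
    where pA and pB are the frequencies of allele '1' at each SNP and
    pAB is the frequency of the '11' haplotype.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "1" for h in haplotypes) / n
    p_b = sum(h[1] == "1" for h in haplotypes) / n
    p_ab = sum(h == "11" for h in haplotypes) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each test as significant if its p-value is below alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]
```

    For example, `ld_r2(["11", "11", "00", "00"])` describes two SNPs that are always inherited together, whereas four equally frequent haplotypes `["11", "10", "01", "00"]` give D = 0 and therefore no linkage disequilibrium.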

  13. Compound Heterozygosity for Null Mutations and a Common Hypomorphic Risk Haplotype in TBX6 Causes Congenital Scoliosis.

    Science.gov (United States)

    Takeda, Kazuki; Kou, Ikuyo; Kawakami, Noriaki; Iida, Aritoshi; Nakajima, Masahiro; Ogura, Yoji; Imagawa, Eri; Miyake, Noriko; Matsumoto, Naomichi; Yasuhiko, Yukuto; Sudo, Hideki; Kotani, Toshiaki; Nakamura, Masaya; Matsumoto, Morio; Watanabe, Kota; Ikegawa, Shiro

    2017-03-01

    Congenital scoliosis (CS) occurs as a result of vertebral malformations and has an incidence of 0.5-1/1,000 births. Recently, TBX6 on chromosome 16p11.2 was reported as a disease gene for CS; about 10% of Chinese CS patients were compound heterozygotes for rare null mutations and a common haplotype defined by three SNPs in TBX6. All patients had hemivertebrae. We recruited 94 Japanese CS patients, investigated the TBX6 locus for both mutations and the risk haplotype, examined transcriptional activities of mutant TBX6 in vitro, and evaluated clinical and radiographic features. We identified TBX6 null mutations in nine patients, including a missense mutation that had a loss of function in vitro. All had the risk haplotype in the opposite allele. One of the mutations showed dominant negative effect. Although all Chinese patients had one or more hemivertebrae, two Japanese patients did not have hemivertebra. The compound heterozygosity of null mutations and the common risk haplotype in TBX6 also causes CS in Japanese patients with similar incidence. Hemivertebra was not a specific type of spinal malformation in TBX6-associated CS (TACS). A heterozygous TBX6 loss-of-function mutation has been reported in a family with autosomal-dominant spondylocostal dysostosis, but it may represent a spectrum of the same disease with TACS. © 2017 WILEY PERIODICALS, INC.

  14. Bootstrap-Based Inference for Cube Root Consistent Estimators

    DEFF Research Database (Denmark)

    Cattaneo, Matias D.; Jansson, Michael; Nagasawa, Kenichi

    This note proposes a consistent bootstrap-based distributional approximation for cube root consistent estimators such as the maximum score estimator of Manski (1975) and the isotonic density estimator of Grenander (1956). In both cases, the standard nonparametric bootstrap is known to be inconsistent. Our method restores consistency of the nonparametric bootstrap by altering the shape of the criterion function defining the estimator whose distribution we seek to approximate. This modification leads to a generic and easy-to-implement resampling method for inference that is conceptually distinct from other available distributional approximations based on some form of modified bootstrap. We offer simulation evidence showcasing the performance of our inference method in finite samples. An extension of our methodology to general M-estimation problems is also discussed.

  15. ParaHaplo 3.0: A program package for imputation and a haplotype-based whole-genome association study using hybrid parallel computing

    Directory of Open Access Journals (Sweden)

    Kamatani Naoyuki

    2011-05-01

    Full Text Available Abstract Background Use of missing genotype imputations and haplotype reconstructions are valuable in genome-wide association studies (GWASs). By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and used for GWASs. Since millions of single nucleotide polymorphisms need to be imputed in a GWAS, faster methods for genotype imputation and haplotype reconstruction are required. Results We developed a program package for parallel computation of genotype imputation and haplotype reconstruction. Our program package, ParaHaplo 3.0, is intended for use in workstation clusters using the Intel Message Passing Interface. We compared the performance of ParaHaplo 3.0 on the Japanese in Tokyo, Japan and Han Chinese in Beijing, and Chinese in the HapMap dataset. A parallel version of ParaHaplo 3.0 can conduct genotype imputation 20 times faster than a non-parallel version of ParaHaplo. Conclusions ParaHaplo 3.0 is an invaluable tool for conducting haplotype-based GWASs. The need for faster genotype imputation and haplotype reconstruction using parallel computing will become increasingly important as the data sizes of such projects continue to increase. ParaHaplo executable binaries and program sources are available at http://en.sourceforge.jp/projects/parallelgwas/releases/.

  16. The influence of casein haplotype on morphometric characteristics of fat globules and fatty acid composition of milk in Italian Holstein cows.

    Science.gov (United States)

    Perna, Annamaria; Intaglietta, Immacolata; Simonetti, Amalia; Gambacorta, Emilio

    2016-04-01

    The aim of this work was to investigate the effect of casein haplotypes (αS1-, β-, and κ-caseins) on morphometric characteristics of fat globules and fatty acid composition of Italian Holstein milk. Casein haplotypes were determined by isoelectric focusing; milk fat globule size was measured by using a fluorescence microscope; and fatty acid profile was determined by gas chromatography. Casein haplotype significantly affected the fat globule size, the percentage incidence of each globule size class on total measured milk fat globules, and fatty acid composition. A higher incidence of smaller milk fat globules was associated with the BB-A(2)A(2)-BB genotype (αS1-, β-, and κ-casein haplotypes, respectively), whereas small globules were not detected in BB-A(2)A(1)-AA milk, but that milk had the highest percentage of large globules. A higher content of monounsaturated fatty acids was associated with the BB-A(2)A(2)-AB genotype, whereas higher contents of conjugated linoleic acid and docosahexaenoic acid were detected in BB-A(1)A(1)-AA milk. Our results indicate that casein haplotype could affect fat characteristics and, therefore, the nutritional and technological quality of milk. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  17. Exploring and Harnessing Haplotype Diversity to Improve Yield Stability in Crops

    Directory of Open Access Journals (Sweden)

    Lunwen Qian

    2017-09-01

    Full Text Available In order to meet future food, feed, fiber, and bioenergy demands, global yields of all major crops need to be increased significantly. At the same time, the increasing frequency of extreme weather events such as heat and drought necessitates improvements in the environmental resilience of modern crop cultivars. Achieving sustainably increased yields implies rapid improvement of quantitative traits with a very complex genetic architecture and strong environmental interaction. Latest advances in genome analysis technologies today provide molecular information at an ultrahigh resolution, revolutionizing crop genomic research, and paving the way for advanced quantitative genetic approaches. These include highly detailed assessment of population structure and genotypic diversity, facilitating the identification of selective sweeps and signatures of directional selection, dissection of genetic variants that underlie important agronomic traits, and genomic selection (GS) strategies that not only consider major-effect genes. Single-nucleotide polymorphism (SNP) markers today represent the genotyping system of choice for crop genetic studies because they occur abundantly in plant genomes and are easy to detect. SNPs are typically biallelic, however, hence their information content compared to multiallelic markers is low, limiting the resolution at which SNP–trait relationships can be delineated. An efficient way to overcome this limitation is to construct haplotypes based on linkage disequilibrium, one of the most important features influencing genetic analyses of crop genomes. Here, we give an overview of the latest advances in genomics-based haplotype analyses in crops, highlighting their importance in the context of polyploidy and genome evolution, linkage drag, and co-selection. We provide examples of how haplotype analyses can complement well-established quantitative genetics frameworks, such as quantitative trait analysis and GS, ultimately

  18. Exploring and Harnessing Haplotype Diversity to Improve Yield Stability in Crops.

    Science.gov (United States)

    Qian, Lunwen; Hickey, Lee T; Stahl, Andreas; Werner, Christian R; Hayes, Ben; Snowdon, Rod J; Voss-Fels, Kai P

    2017-01-01

    In order to meet future food, feed, fiber, and bioenergy demands, global yields of all major crops need to be increased significantly. At the same time, the increasing frequency of extreme weather events such as heat and drought necessitates improvements in the environmental resilience of modern crop cultivars. Achieving sustainably increase yields implies rapid improvement of quantitative traits with a very complex genetic architecture and strong environmental interaction. Latest advances in genome analysis technologies today provide molecular information at an ultrahigh resolution, revolutionizing crop genomic research, and paving the way for advanced quantitative genetic approaches. These include highly detailed assessment of population structure and genotypic diversity, facilitating the identification of selective sweeps and signatures of directional selection, dissection of genetic variants that underlie important agronomic traits, and genomic selection (GS) strategies that not only consider major-effect genes. Single-nucleotide polymorphism (SNP) markers today represent the genotyping system of choice for crop genetic studies because they occur abundantly in plant genomes and are easy to detect. SNPs are typically biallelic, however, hence their information content compared to multiallelic markers is low, limiting the resolution at which SNP-trait relationships can be delineated. An efficient way to overcome this limitation is to construct haplotypes based on linkage disequilibrium, one of the most important features influencing genetic analyses of crop genomes. Here, we give an overview of the latest advances in genomics-based haplotype analyses in crops, highlighting their importance in the context of polyploidy and genome evolution, linkage drag, and co-selection. 
We provide examples of how haplotype analyses can complement well-established quantitative genetics frameworks, such as quantitative trait analysis and GS, ultimately providing an effective tool

  19. Inferring network topology from complex dynamics

    International Nuclear Information System (INIS)

    Shandilya, Srinivas Gorur; Timme, Marc

    2011-01-01

    Inferring the network topology from dynamical observations is a fundamental problem pervading research on complex systems. Here, we present a simple, direct method for inferring the structural connection topology of a network, given an observation of one collective dynamical trajectory. The general theoretical framework is applicable to arbitrary network dynamical systems described by ordinary differential equations. No interference (external driving) is required and the type of dynamics is hardly restricted in any way. In particular, the observed dynamics may be arbitrarily complex; stationary, invariant or transient; synchronous or asynchronous and chaotic or periodic. Presupposing a knowledge of the functional form of the dynamical units and of the coupling functions between them, we present an analytical solution to the inverse problem of finding the network topology from observing a time series of state variables only. Robust reconstruction is achieved in any sufficiently long generic observation of the system. We extend our method to simultaneously reconstructing both the entire network topology and all parameters appearing linear in the system's equations of motion. Reconstruction of network topology and system parameters is viable even in the presence of external noise that distorts the original dynamics substantially. The method provides a conceptually new step towards reconstructing a variety of real-world networks, including gene and protein interaction networks and neuronal circuits.

  20. Overview of worldwide diversity of Diaphorina citri Kuwayama mitochondrial cytochrome oxidase 1 haplotypes: two Old World lineages and a New World invasion

    Science.gov (United States)

    Boykin, L.M.; De Barro, P.; Hall, D.G.; Hunter, W.B.; McKenzie, C.L.; Powell, C.A.; Shatters, R.G.

    2012-01-01

    Relationships among worldwide collections of Diaphorina citri (Asian citrus psyllid) were analyzed using mitochondrial cytochrome oxidase I (mtCOI) haplotypes from novel primers. Sequences were produced from PCR amplicons of an 821bp portion of the mtCOI gene using D. citri specific primers, derived from an existing EST library. An alignment was constructed using 612bps of this fragment and consisted of 212 individuals from 52 collections representing 15 countries. There were a total of eight polymorphic sites that separated the sequences into eight different haplotypes (Dcit-1 through Dcit-8). Phylogenetic network analysis using the statistical parsimony software, TCS, suggests two major haplotype groups with preliminary geographic bias between southwestern Asia (SWA) and southeastern Asia (SEA). The recent (within the last 15 to 25 years) invasion into the New World originated from only the SWA group in the northern hemisphere (USA and Mexico) and from both the SEA and SWA groups in the southern hemisphere (Brazil). In only one case, Reunion Island, did haplotypes from both the SEA and SWA group appear in the same location. In Brazil, both groups were present, but in separate locations. The Dcit-1 SWA haplotype was the most frequently encountered, including ~50% of the countries sampled and 87% of the total sequences obtained from India, Pakistan and Saudi Arabia. The second most frequently encountered haplotype, Dcit-2, the basis of the SEA group, represented ~50% of the countries and contained most of the sequences from Southeast Asia and China. Interestingly, only the Caribbean collections (Puerto Rico and Guadeloupe) represented a unique haplotype not found in other countries, indicating no relationship between the USA (Florida) and Caribbean introductions. There is no evidence for cryptic speciation for D. citri based on the COI region included in this study. PMID:22717059

  1. More than one kind of inference: re-examining what's learned in feature inference and classification.

    Science.gov (United States)

    Sweller, Naomi; Hayes, Brett K

    2010-08-01

    Three studies examined how task demands that impact on attention to typical or atypical category features shape the category representations formed through classification learning and inference learning. During training categories were learned via exemplar classification or by inferring missing exemplar features. In the latter condition inferences were made about missing typical features alone (typical feature inference) or about both missing typical and atypical features (mixed feature inference). Classification and mixed feature inference led to the incorporation of typical and atypical features into category representations, with both kinds of features influencing inferences about familiar (Experiments 1 and 2) and novel (Experiment 3) test items. Those in the typical inference condition focused primarily on typical features. Together with formal modelling, these results challenge previous accounts that have characterized inference learning as producing a focus on typical category features. The results show that two different kinds of inference learning are possible and that these are subserved by different kinds of category representations.

  2. Constrained bayesian inference of project performance models

    OpenAIRE

    Sunmola, Funlade

    2013-01-01

    Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability such as indications of possible delivery problems. Approaches for monitoring project performance rely on available project information including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...

  3. Transcriptome analysis reveals the same 17 S-locus F-box genes in two haplotypes of the self-incompatibility locus of Petunia inflata.

    Science.gov (United States)

    Williams, Justin S; Der, Joshua P; dePamphilis, Claude W; Kao, Teh-Hui

    2014-07-01

    Petunia possesses self-incompatibility, by which pistils reject self-pollen but accept non-self-pollen for fertilization. Self-/non-self-recognition between pollen and pistil is regulated by the pistil-specific S-RNase gene and by multiple pollen-specific S-locus F-box (SLF) genes. To date, 10 SLF genes have been identified by various methods, and seven have been shown to be involved in pollen specificity. For a given S-haplotype, each SLF interacts with a subset of its non-self S-RNases, and an as yet unknown number of SLFs are thought to collectively mediate ubiquitination and degradation of all non-self S-RNases to allow cross-compatible pollination. To identify a complete suite of SLF genes of P. inflata, we used a de novo RNA-seq approach to analyze the pollen transcriptomes of S2-haplotype and S3-haplotype, as well as the leaf transcriptome of the S3S3 genotype. We searched for genes that fit several criteria established from the properties of the known SLF genes and identified the same seven new SLF genes in S2-haplotype and S3-haplotype, suggesting that a total of 17 SLF genes constitute pollen specificity in each S-haplotype. This finding lays the foundation for understanding how multiple SLF genes evolved and the biochemical basis for differential interactions between SLF proteins and S-RNases. © 2014 American Society of Plant Biologists. All rights reserved.

  4. Perceptual inference.

    Science.gov (United States)

    Aggelopoulos, Nikolaos C

    2015-08-01

    Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Analysis of associations between major histocompatibility complex (BoLA) class I haplotypes and subclinical mastitis of dairy cows

    DEFF Research Database (Denmark)

    Aarestrup, Frank Møller; Jensen, N. E.; Østergård, H.

    1995-01-01

    The associations between BoLA class I haplotypes and subclinical mastitis were investigated using information on 333 cows from three different breeds and crossbreeds from 14 dairy herds in Denmark. Somatic cell count and bacteriological status were used as markers for subclinical mastitis. Associations between BoLA class I haplotypes and IMI status were also determined. The association between BoLA class I haplotypes and subclinical mastitis was weak. The A10(W50), A11, A12(A30), A16, A19(A6), A21, A26, and A31(A30) alleles were associated with different markers of subclinical mastitis. Susceptibility or resistance to the two bacteria categories was associated with different alleles. This study indicated that BoLA antigens may be involved in resistance to mastitis and that resistance may be specific for a particular pathogen.

  6. ABCB1 haplotype and OPRM1 118A > G genotype interaction in methadone maintenance treatment pharmacogenetics

    Directory of Open Access Journals (Sweden)

    Barratt DT

    2012-04-01

    Full Text Available Daniel T Barratt1, Janet K Coller1, Richard Hallinan2, Andrew Byrne2, Jason M White1, David JR Foster3, Andrew A Somogyi1,4. 1Discipline of Pharmacology, School of Medical Sciences, University of Adelaide, Adelaide, South Australia; 2The Byrne Surgery, Specialist Drug and Alcohol Practice, Redfern, New South Wales; 3Division of Health Sciences, Sansom Institute, School of Pharmacy and Medical Sciences, University of South Australia, Adelaide, South Australia; 4Department of Clinical Pharmacology, Royal Adelaide Hospital, Adelaide, South Australia, Australia. Background: Genetic variability in ABCB1, encoding the P-glycoprotein efflux transporter, has been linked to altered methadone maintenance treatment dose requirements. However, subsequent studies have indicated that additional environmental or genetic factors may confound ABCB1 pharmacogenetics in different methadone maintenance treatment settings. There is evidence that genetic variability in OPRM1, encoding the mu opioid receptor, and ABCB1 may interact to affect morphine response in opposite ways. This study aimed to examine whether a similar gene-gene interaction occurs for methadone in methadone maintenance treatment. Methods: Opioid-dependent subjects (n = 119) maintained on methadone (15–300 mg/day) were genotyped for five single nucleotide polymorphisms of ABCB1 (61A > G; 1199G > A; 1236C > T; 2677G > T; 3435C > T), as well as for the OPRM1 118A > G single nucleotide polymorphism. Subjects’ methadone doses and trough plasma (R)-methadone concentrations (Ctrough) were compared between ABCB1 haplotypes (with and without controlling for OPRM1 genotype), and between OPRM1 genotypes (with and without controlling for ABCB1 haplotype). Results: Among wild-type OPRM1 subjects, an ABCB1 variant haplotype group (subjects with a wild-type and 61A:1199G:1236C:2677T:3435T haplotype combination, or homozygous for the 61A:1199G:1236C:2677T:3435T haplotype) had significantly lower doses (median ± standard

  7. Asian population frequencies and haplotype distribution of killer cell immunoglobulin-like receptor (KIR) genes among Chinese, Malay, and Indian in Singapore.

    Science.gov (United States)

    Lee, Yi Chuan; Chan, Soh Ha; Ren, Ee Chee

    2008-11-01

    Killer cell immunoglobulin-like receptors (KIR) gene frequencies have been shown to be distinctly different between populations and contribute to functional variation in the immune response. We have investigated KIR gene frequencies in 370 individuals representing three Asian populations in Singapore and report here the distribution of 14 KIR genes (2DL1, 2DL2, 2DL3, 2DL4, 2DL5, 2DS1, 2DS2, 2DS3, 2DS4, 2DS5, 3DL1, 3DL2, 3DL3, 3DS1) with two pseudogenes (2DP1, 3DP1) among Singapore Chinese (n = 210); Singapore Malay (n = 80), and Singapore Indian (n = 80). Four framework genes (KIR3DL3, 3DP1, 2DL4, 3DL2) and a nonframework pseudogene 2DP1 were detected in all samples while KIR2DS2, 2DL2, 2DL5, and 2DS5 had the greatest significant variation across the three populations. Fifteen significant linkage patterns, consistent with associations between genes of A and B haplotypes, were observed. Eighty-four distinct KIR profiles were determined in our populations, 38 of which had not been described in other populations. KIR haplotype studies were performed using nine Singapore Chinese families comprising 34 individuals. All genotypes could be resolved into corresponding pairs of existing haplotypes with eight distinct KIR genotypes and eight different haplotypes. The haplotype A2 with frequency of 63.9% was dominant in Singapore Chinese, comparable to that reported in Korean and Chinese Han. The A haplotypes predominate in Singapore Chinese, with ratio of A to B haplotypes of approximately 3:1. Comparison with KIR frequencies in other populations showed that Singapore Chinese shared similar distributions with Chinese Han, Japanese, and Korean; Singapore Indian was found to be comparable with North Indian Hindus while Singapore Malay resembled the Thai.

  8. Inferring genome-wide patterns of admixture in Qataris using fifty-five ancestral populations

    Directory of Open Access Journals (Sweden)

    Omberg Larsson

    2012-06-01

    Full Text Available Abstract Background Populations of the Arabian Peninsula have a complex genetic structure that reflects waves of migrations including the earliest human migrations from Africa and eastern Asia, migrations along ancient civilization trading routes and colonization history of recent centuries. Results Here, we present a study of genome-wide admixture in this region, using 156 genotyped individuals from Qatar, a country located at the crossroads of these migration patterns. Since haplotypes of these individuals could have originated from many different populations across the world, we have developed a machine learning method "SupportMix" to infer loci-specific genomic ancestry when simultaneously analyzing many possible ancestral populations. Simulations show that SupportMix is not only more accurate than other popular admixture discovery tools but is the first admixture inference method that can efficiently scale for simultaneous analysis of 50-100 putative ancestral populations while being independent of prior demographic information. Conclusions By simultaneously using the 55 world populations from the Human Genome Diversity Panel, SupportMix was able to extract the fine-scale ancestry of the Qatar population, providing many new observations concerning the ancestry of the region. For example, as well as recapitulating the three major sub-populations in Qatar, composed of mainly Arabic, Persian, and African ancestry, SupportMix additionally identifies the specific ancestry of the Persian group to populations sampled in Greater Persia rather than from China and the ancestry of the African group to sub-Saharan origin and not Southern African Bantu origin as previously thought.

  9. Associations of Haplotypes Upstream of IRS1 with Insulin Resistance, Type 2 Diabetes, Dyslipidemia, Preclinical Atherosclerosis, and Skeletal Muscle LOC646736 mRNA Levels

    Directory of Open Access Journals (Sweden)

    Selma M. Soyal

    2015-01-01

    Full Text Available The genomic region ~500 kb upstream of IRS1 has been implicated in insulin resistance, type 2 diabetes, adverse lipid profile, and cardiovascular risk. To gain further insight into this chromosomal region, we typed four SNPs in a cross-sectional cohort and subjects with type 2 diabetes recruited from the same geographic region. From 16 possible haplotypes, 6 haplotypes with frequencies >0.01 were observed. We identified one haplotype that was protective against insulin resistance (determined by HOMA-IR and fasting plasma insulin levels), type 2 diabetes, an adverse lipid profile, increased C-reactive protein, and asymptomatic atherosclerotic disease (assessed by intima media thickness of the common carotid arteries). BMI and total adipose tissue mass as well as visceral and subcutaneous adipose tissue mass did not differ between the reference and protective haplotypes. In 92 subjects, we observed an association of the protective haplotype with higher skeletal muscle mRNA levels of LOC646736, which is located in the same haplotype block as the informative SNPs and is mainly expressed in skeletal muscle, but only at very low levels in liver or adipose tissues. These data suggest a role for LOC646736 in human insulin resistance and warrant further studies on the functional effects of this locus.
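
    The count of 16 possible haplotypes in the abstract above follows from four biallelic SNPs, since each site contributes two alleles and 2^4 = 16. A minimal sketch of that enumeration, together with a frequency filter like the >0.01 cut-off the abstract reports, is given below. The '0'/'1' allele coding, the function names, and any frequencies passed in are hypothetical and are not data from the study.

```python
from itertools import product


def possible_haplotypes(n_snps, alleles="01"):
    """List every allele combination across n_snps biallelic sites."""
    return ["".join(combo) for combo in product(alleles, repeat=n_snps)]


def common_haplotypes(frequencies, min_freq=0.01):
    """Keep only the haplotypes whose estimated frequency exceeds min_freq."""
    return {hap: freq for hap, freq in frequencies.items() if freq > min_freq}


# Four SNPs give 2 ** 4 = 16 candidate haplotypes, from "0000" to "1111".
all_haps = possible_haplotypes(4)
```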

  10. Causal inference of asynchronous audiovisual speech

    Directory of Open Access Journals (Sweden)

    John F Magnotti

    2013-11-01

    Full Text Available During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.

  11. Adaptive surrogate modeling for response surface approximations with application to bayesian inference

    KAUST Repository

    Prudhomme, Serge; Bryant, Corey M.

    2015-01-01

    Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.

  12. Adaptive surrogate modeling for response surface approximations with application to bayesian inference

    KAUST Repository

    Prudhomme, Serge

    2015-09-17

    Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.

  13. HLA-A, -B, -DRB1 allele and haplotype frequencies of 920 cord blood units from Central Chile.

    Science.gov (United States)

    Schäfer, Christian; Sauter, Jürgen; Riethmüller, Tobias; Kashi, Zahra Mehdizadeh; Schmidt, Alexander H; Barriga, Francisco J

    2016-08-01

    We present human leukocyte antigen (HLA) haplotype and allele/antigenic group frequencies derived from a data set of 920 umbilical cord blood units collected in Central Chile. HLA-A and -B genotypes were typed using sequence specific oligonucleotide probe methods while HLA-DRB1 genotypes were obtained from sequencing-based typing. The most frequent haplotype is A*29~B*44~DRB1*07:01 with an estimated frequency of 2.1%. Copyright © 2016 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.

  14. New chicken Rfp-Y haplotypes on the basis of MHC class II RFLP and MLC analyses

    DEFF Research Database (Denmark)

    Juul-Madsen, H R; Zoorob, R; Auffray, C

    1997-01-01

    New chicken Rfp-Y haplotypes were determined by the use of restriction fragment length polymorphism (RFLP) and mixed lymphocyte culture (MLC) in four different chicken haplotypes, B15, B19, B21, B201. The RFLP polymorphism was mapped to the Rfp-Y system by the use of a subclone (18.1) which maps...... near a polymorphic lectin gene located in the Rfp-Y system and DNA from families with known segregation of the implicated RFLP polymorphism. For the first time it is shown that major histocompatibility complex class II genes in the Rfp-Y system have functional implications. Sequence information...

  15. Mixed integer linear programming for maximum-parsimony phylogeny inference.

    Science.gov (United States)

    Sridhar, Srinath; Lam, Fumei; Blelloch, Guy E; Ravi, R; Schwartz, Russell

    2008-01-01

    Reconstruction of phylogenetic trees is a fundamental problem in computational biology. While excellent heuristic methods are available for many variants of this problem, new advances in phylogeny inference will be required if we are to be able to continue to make effective use of the rapidly growing stores of variation data now being gathered. In this paper, we present two integer linear programming (ILP) formulations to find the most parsimonious phylogenetic tree from a set of binary variation data. One method uses a flow-based formulation that can produce exponential numbers of variables and constraints in the worst case. The method has, however, proven extremely efficient in practice on datasets that are well beyond the reach of the available provably efficient methods, solving several large mtDNA and Y-chromosome instances within a few seconds and giving provably optimal results in times competitive with fast heuristics that cannot guarantee optimality. An alternative formulation establishes that the problem can be solved with a polynomial-sized ILP. We further present a web server developed based on the exponential-sized ILP that performs fast maximum parsimony inferences and serves as a front end to a database of precomputed phylogenies spanning the human genome.

  16. Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions

    KAUST Repository

    Ruggeri, Fabrizio; Sawlan, Zaid A; Scavino, Marco; Tempone, Raul

    2016-01-01

    In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.

  17. Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions

    KAUST Repository

    Ruggeri, Fabrizio

    2015-01-07

    In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.

  18. Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions

    KAUST Repository

    Ruggeri, Fabrizio

    2016-01-06

    In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.

  19. Frequency of alleles and haplotypes of the human leukocyte antigen system in Bauru, São Paulo, Brazil

    Directory of Open Access Journals (Sweden)

    Luana de Cassia Salvadori

    2014-04-01

    Full Text Available Background: HLA allele identification is used in bone marrow transplant programs as HLA compatibility between the donor and recipient may prevent graft rejection. Objective: This study aimed to estimate the frequency of alleles and haplotypes of the HLA system in the region of Bauru and compare these with the frequencies found in other regions of the country. Methods: HLA-A*, HLA-B*, and HLA-DRB1* allele frequencies and haplotypes were analyzed in a sample of 3542 volunteer donors at the National Registry of Voluntary Bone Marrow Donors (REDOME) in Bauru. HLA low resolution typing was performed using reverse line blot with the Dynal Reli™ SSO-HLA Typing Kit and automated Dynal AutoReli™ 48 device (Invitrogen, USA). Results: Twenty, 36, and 13 HLA-A*, HLA-B*, and HLA-DRB1* allele groups, respectively, were identified. The most common alleles for each locus were HLA-A*02, HLA-B*35, and HLA-DRB1*07. The most frequent haplotype was A*01-B*08-DRB1*03. Allele and haplotype frequencies were compared to other regions in Brazil and the similarities and differences among populations are shown. Conclusion: The knowledge of the immunogenic profile of a population contributes to the comprehension of the historical and anthropological aspects of different regions. Moreover, this helps to find suitable donors quickly, thereby shortening waiting lists for transplants and thus increasing survival rates among recipients.

  20. Generative inference for cultural evolution.

    Science.gov (United States)

    Kandler, Anne; Powell, Adam

    2018-04-05

    One of the major challenges in cultural evolution is to understand why and how various forms of social learning are used in human populations, both now and in the past. To date, much of the theoretical work on social learning has been done in isolation of data, and consequently many insights focus on revealing the learning processes or the distributions of cultural variants that are expected to have evolved in human populations. In population genetics, recent methodological advances have allowed a greater understanding of the explicit demographic and/or selection mechanisms that underlie observed allele frequency distributions across the globe, and their change through time. In particular, generative frameworks, often using coalescent-based simulation coupled with approximate Bayesian computation (ABC), have provided robust inferences on the human past, with no reliance on a priori assumptions of equilibrium. Here, we demonstrate the applicability and utility of generative inference approaches to the field of cultural evolution. The framework advocated here uses observed population-level frequency data directly to establish the likely presence or absence of particular hypothesized learning strategies. In this context, we discuss the problem of equifinality and argue that, in the light of sparse cultural data and the multiplicity of possible social learning processes, the exclusion of those processes inconsistent with the observed data might be the most instructive outcome. Finally, we summarize the findings of generative inference approaches applied to a number of case studies. This article is part of the theme issue 'Bridging cultural gaps: interdisciplinary studies in human cultural evolution'. © 2018 The Author(s).

  1. Identification of a type 1 diabetes-associated CD4 promoter haplotype with high constitutive activity

    DEFF Research Database (Denmark)

    Kristiansen, O P; Karlsen, A E; Larsen, Z M

    2004-01-01

    screened the human CD4 promoter for mutations and identified three frequent single nucleotide polymorphisms (SNPs): CD4-181C/G, CD4-521C/G and CD4-1050T/C. The SNPs are in strong linkage disequilibrium (LD) and association with the CD4-1188(TTTTC)(5-14) alleles, and we observed nine CD4 promoter haplotypes...... promoter activity and (2) the CD4-181G variant encodes higher stimulated promoter activity than the CD4-181C variant. This difference is in part neutralized in the frequently occurring CD4 promoter haplotypes by the more upstream genetic variants. Thus, we report functional impact of a novel CD4-181C/G SNP...

  2. The NIFTY way of Bayesian signal inference

    International Nuclear Information System (INIS)

    Selig, Marco

    2014-01-01

    We introduce NIFTY, 'Numerical Information Field Theory', a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTY can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTY as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.

  3. The NIFTy way of Bayesian signal inference

    Science.gov (United States)

    Selig, Marco

    2014-12-01

    We introduce NIFTy, "Numerical Information Field Theory", a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTy can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTy as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.

  4. Two major groups of chloroplast DNA haplotypes in diploid and tetraploid Aconitum subgen. Aconitum (Ranunculaceae) in the Carpathians

    Directory of Open Access Journals (Sweden)

    J. Mitka

    2016-04-01

    Full Text Available Aconitum in Europe is represented by ca. 10% of the total number of species, and the Carpathian Mts. are the center of the genus variability in the subcontinent. We studied the variability of the chloroplast DNA (cpDNA) intergenic spacer trnL(UAG)-rpl32-ndhF in Aconitum subgen. Aconitum in the Carpathians: diploids (2n=16, sect. Cammarum), tetraploids (2n=32, sect. Aconitum) and triploids (2n=24, nothosect. Acomarum). Altogether 25 Aconitum accessions representing the whole taxonomic variability of the subgenus were sequenced and subjected to phylogenetic analyses. Parsimony, Bayesian and character network analyses all showed two distinct types of cpDNA, one typical of the diploid group and the other of the tetraploid group. Some specimens had identical cpDNA sequences (haplotypes) and were scattered across the whole mountain arc. In sect. Aconitum 9 specimens shared one haplotype, while in sect. Cammarum one haplotype represents 4 accessions and the second – 5 accessions. The diploids and tetraploids diverged by 6 mutations, while the intrasectional variability amounted maximally to 3 polymorphisms. Taking into consideration the different types of cpDNA haplotypes and the ecological profiles of the sections (tetraploids – high‑mountain species, diploids – species from the forest montane belt), we speculate on the different and independent history of the sections in the Carpathians.

  5. Frequency and origin of haplotypes associated with the beta-globin gene cluster in individuals with trait and sickle cell anemia in the Atlantic and Pacific coastal regions of Colombia.

    Science.gov (United States)

    Fong, Cristian; Lizarralde-Iragorri, María Alejandra; Rojas-Gallardo, Diana; Barreto, Guillermo

    2013-12-01

    Sickle cell anemia is a genetic disease with high prevalence in people of African descent. There are five typical haplotypes associated with this disease and the haplotypes associated with the beta-globin gene cluster have been used to establish the origin of African-descendant people in America. In this work, we determined the frequency and the origin of haplotypes associated with hemoglobin S in a sample of individuals with sickle cell anemia (HbSS) and sickle cell hemoglobin trait (HbAS) in coastal regions of Colombia. Blood samples from 71 HbAS and 79 HbSS individuals were obtained. Haplotypes were determined based on the presence of variable restriction sites within the β-globin gene cluster. On the Pacific coast of Colombia the most frequent haplotype was Benin, while on the Atlantic coast Bantu was marginally higher than Benin. Eight atypical haplotypes were observed on both coasts, being more diverse in the Atlantic than in the Pacific region. These results suggest a differential settlement of the coasts, dependent on where slaves were brought from, either from the Gulf of Guinea or from Angola, where the haplotype distributions are similar. Atypical haplotypes probably originated from point mutations that lost or gained a restriction site and/or by recombination events.

  6. The Systemic Lupus Erythematosus IRF5 Risk Haplotype Is Associated with Systemic Sclerosis

    Science.gov (United States)

    Beretta, Lorenzo; Simeón, Carmen P.; Carreira, Patricia E.; Callejas, José Luis; Fernández-Castro, Mónica; Sáez-Comet, Luis; Beltrán, Emma; Camps, María Teresa; Egurbide, María Victoria; Airó, Paolo; Scorza, Raffaella; Lunardi, Claudio; Hunzelmann, Nicolas; Riemekasten, Gabriela; Witte, Torsten; Kreuter, Alexander; Distler, Jörg H. W.; Madhok, Rajan; Shiels, Paul; van Laar, Jacob M.; Fonseca, Carmen; Denton, Christopher; Herrick, Ariane; Worthington, Jane; Schuerwegh, Annemie J.; Vonk, Madelon C.; Voskuyl, Alexandre E.; Radstake, Timothy R. D. J.; Martín, Javier

    2013-01-01

    Systemic sclerosis (SSc) is a fibrotic autoimmune disease in which the genetic component plays an important role. One of the strongest SSc association signals outside the human leukocyte antigen (HLA) region corresponds to interferon (IFN) regulatory factor 5 (IRF5), a major regulator of the type I IFN pathway. In this study we aimed to evaluate whether three different haplotypic blocks within this locus, which have been shown to alter the protein function influencing systemic lupus erythematosus (SLE) susceptibility, are involved in SSc susceptibility and clinical phenotypes. For that purpose, we genotyped one representative single-nucleotide polymorphism (SNP) of each block (rs10488631, rs2004640, and rs4728142) in a total of 3,361 SSc patients and 4,012 unaffected controls of Caucasian origin from Spain, Germany, The Netherlands, Italy and United Kingdom. A meta-analysis of the allele frequencies was performed to analyse the overall effect of these IRF5 genetic variants on SSc. Allelic combination and dependency tests were also carried out. The three SNPs showed strong associations with the global disease (rs4728142: P = 1.34×10⁻⁸, OR = 1.22, 95% CI = 1.14–1.30; rs2004640: P = 4.60×10⁻⁷, OR = 0.84, 95% CI = 0.78–0.90; rs10488631: P = 7.53×10⁻²⁰, OR = 1.63, 95% CI = 1.47–1.81). However, the association of rs2004640 with SSc was not independent of rs4728142 (conditioned P = 0.598). The haplotype containing the risk alleles (rs4728142*A-rs2004640*T-rs10488631*C: P = 9.04×10⁻²², OR = 1.75, 95% CI = 1.56–1.97) better explained the observed association (likelihood P-value = 1.48×10⁻⁴), suggesting an additive effect of the three haplotypic blocks. No statistical significance was observed in the comparisons amongst SSc patients with and without the main clinical characteristics. Our data clearly indicate that the SLE risk haplotype also influences SSc predisposition, and that

  7. Optimal inference with suboptimal models: Addiction and active Bayesian inference

    Science.gov (United States)

    Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl

    2015-01-01

    When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321

  8. Endothelial Nitric Oxide Synthase Haplotypes Are Associated with Preeclampsia in Maya Mestizo Women

    Directory of Open Access Journals (Sweden)

    Lizbeth Díaz-Olguín

    2011-01-01

    Full Text Available Preeclampsia is a disease specific to pregnancy and is believed to have a genetic component. The aim of this study was to investigate whether three polymorphisms in eNOS or their haplotypes are associated with preeclampsia in Maya mestizo women.

  9. Falciparum malaria in the north of Laos: the occurrence and implications of the Plasmodium falciparum chloroquine resistance transporter (pfcrt) gene haplotype SVMNT

    DEFF Research Database (Denmark)

    Dittrich, Sabine; Alifrangis, Michael; Stohrer, Jörg M

    2005-01-01

    the SVMNT haplotype. METHOD: Eighty-eight samples from an area with reported in vivo Chloroquine and in vitro Amodiaquine-resistance were screened for the K76T mutation and their Pfcrt-haplotype (c72-76) using a new SSOP-ELISA. RESULTS: One hundred percent of the analysed samples showed the K76T mutation which......OBJECTIVE: The Pfcrt-gene encodes a transmembrane protein located in the Plasmodium falciparum digestive vacuole. Chloroquine resistant (CQR) strains of African and Southeast Asian origin carry the Pfcrt-haplotype (c72-76) CVIET, whereas most South American and Papua New Guinean CQR strains carry...... is highly associated with in vivo drug failure. This very high rate of a CQR-marker is alarming in an area where CQ is still used as first line drug. The distribution of the three main Pfcrt-haplotypes was as follows: 68% CVIET, 31% SVMNT, 0% CVMNT. CONCLUSIONS: These data show, for the first time, the South...

  10. A common haplotype in the G-protein-coupled receptor gene GPR74 is associated with leanness and increased lipolysis

    DEFF Research Database (Denmark)

    Dahlman, Ingrid; Dicker, Andrea; Jiao, Hong

    2007-01-01

    0.36; P=.036) among those selected for obese or lean phenotypes. The ATAG haplotype was associated with increased adipocyte lipid mobilization (lipolysis) in vivo and in vitro. In human fat cells, GPR74 receptor stimulation and inhibition caused a significant and marked decrease and increase......, respectively, of lipolysis, which could be linked to catecholamine stimulation of adipocytes through beta-adrenergic receptors. These findings suggest that a common haplotype in the GPR74 gene protects against obesity, which, at least in part, is caused by a relief of inhibition of lipid mobilization from......The G-protein-coupled receptor GPR74 is a novel candidate gene for body weight regulation. In humans, it is predominantly expressed in brain, heart, and adipose tissue. We report a haplotype in the GPR74 gene, ATAG, with allele frequency ~4% in Scandinavian cohorts, which was associated...

  11. Haplotype frequency distribution for 7 microsatellites in chromosome 8 and 11 in relation to the metabolic syndrome in four ethnic groups: Tehran Lipid and Glucose Study.

    Science.gov (United States)

    Daneshpour, Maryam Sadat; Hosseinzadeh, Nima; Zarkesh, Maryam; Azizi, Fereidoun

    2012-03-01

    Different variants of haplotype frequencies may lead to various frequencies of the same variants in individuals with drug resistance and disease susceptibility at the population level. In this study, the haplotype frequencies of 4 STR loci including the D8S1132, D8S1779, D8S514 and D8S1743, and 3 STR loci including D11S1304, D11S1998 and D11S934 were investigated in 563 individuals of four Iranian ethnic groups in the capital city of Iran, Tehran. One hundred thirty subjects had the metabolic syndrome. Haplotype frequencies of all markers were calculated. There were significant differences in the haplotype frequencies in short and long alleles between the metabolic affected subjects and controls. In addition, haplotype frequencies were significant in the four ethnic groups in both chromosomes 8 and 11. Our findings show a relation between the short allele of D8S1743 in all related haplotype frequencies of subjects with metabolic syndrome. These findings may require more studies of some candidate genes, including the lipoprotein lipase gene, in this chromosomal region. Copyright © 2011. Published by Elsevier B.V.

  12. Role of MAPT mutations and haplotype in frontotemporal lobar degeneration in Northern Finland

    Directory of Open Access Journals (Sweden)

    Tuominen Hannu

    2008-12-01

    Full Text Available Abstract Background Frontotemporal lobar degeneration (FTLD) consists of a clinically and neuropathologically heterogeneous group of syndromes affecting the frontal and temporal lobes of the brain. Mutations in microtubule-associated protein tau (MAPT), progranulin (PGRN) and charged multi-vesicular body protein 2B (CHMP2B) are associated with familial forms of the disease. The prevalence of these mutations varies between populations. The H1 haplotype of MAPT has been found to be closely associated with tauopathies and with sporadic FTLD. Our aim was to investigate MAPT mutations and haplotype frequencies in a clinical series of patients with FTLD in Northern Finland. Methods MAPT exons 1, 2 and 9–13 were sequenced in 59 patients with FTLD, and MAPT haplotypes were analysed in these patients, 122 patients with early onset Alzheimer's disease (eoAD) and 198 healthy controls. Results No pathogenic mutations were found. The H2 allele frequency was 11.0% (P = 0.028) in the FTLD patients, 9.8% (P = 0.029) in the eoAD patients and 5.3% in the controls. The H2 allele was especially clustered in patients with a positive family history (P = 0.011) but did not lower the age at onset of the disease. The ApoE4 allele frequency was significantly increased in the patients with eoAD and in those with FTLD. Conclusion We conclude that although pathogenic MAPT mutations are rare in Northern Finland, the MAPT H2 allele may be associated with increased risks of FTLD and eoAD in the Finnish population.
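    As a worked illustration of how an allele frequency such as the H2 figures above is obtained, the sketch below divides an allele count by the 2N chromosomes carried by N diploid individuals. The group sizes (59, 122, 198) come from the abstract; the allele counts are hypothetical values chosen only because they round to the reported percentages, and the abstract does not report them.

```python
def allele_frequency(allele_count, n_individuals):
    """Frequency of an allele among the 2N chromosomes of N diploid individuals."""
    return allele_count / (2 * n_individuals)


# Group sizes are from the abstract; the H2 allele counts are hypothetical,
# chosen only because they round to the reported percentages.
groups = {
    "FTLD": (13, 59),
    "eoAD": (24, 122),
    "controls": (21, 198),
}

percentages = {
    name: round(100 * allele_frequency(count, n), 1)
    for name, (count, n) in groups.items()
}
print(percentages)
```

    With these assumed counts the computed percentages are 11.0, 9.8 and 5.3, matching the reported frequencies; other counts could not be ruled out from the abstract alone.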

  13. Serpin peptidase inhibitor (SERPINB5) haplotypes are associated with susceptibility to hepatocellular carcinoma

    Science.gov (United States)

    Yang, Shun-Fa; Yeh, Chao-Bin; Chou, Ying-Erh; Lee, Hsiang-Lin; Liu, Yu-Fan

    2016-05-01

    Hepatocellular carcinoma (HCC) represents the second leading cause of cancer-related death worldwide. The serpin peptidase inhibitor SERPINB5 is a tumour-suppressor gene that promotes the development of various cancers in humans. However, whether SERPINB5 gene variants play a role in HCC susceptibility remains unknown. In this study, we genotyped 6 SNPs of the SERPINB5 gene in an independent cohort from a replicate population comprising 302 cases and 590 controls. Additionally, patients who had at least one rs2289520 C allele in SERPINB5 tended to exhibit better liver function than patients with genotype GG (Child-Pugh grade A vs. B or C; P = 0.047). Next, haplotype blocks were reconstructed according to the linkage disequilibrium structure of the SERPINB5 gene. A haplotype “C-C-C” (rs17071138 + rs3744941 + rs8089204) in SERPINB5-correlated promoter showed a significant association with an increased HCC risk (AOR = 1.450; P = 0.031). Haplotypes “T-C-A” and “C-C-C” (rs2289519 + rs2289520 + rs1455555) located in the SERPINB5 coding region had a decreased (AOR = 0.744; P = 0.031) and increased (AOR = 1.981; P = 0.001) HCC risk, respectively. Finally, an additional integrated in silico analysis confirmed that these SNPs affected SERPINB5 expression and protein stability, which significantly correlated with tumour expression and subsequently with tumour development and aggressiveness. Taken together, our findings regarding these biomarkers provide a prediction model for risk assessment.

  14. The problem of sampling families rather than populations: Relatedness among individuals in samples of juvenile brown trout Salmo trutta L

    DEFF Research Database (Denmark)

    Hansen, Michael Møller; Eg Nielsen, Einar; Mensberg, Karen-Lise Dons

    1997-01-01

    In species exhibiting a nonrandom distribution of closely related individuals, sampling of a few families may lead to biased estimates of allele frequencies in populations. This problem was studied in two brown trout populations, based on analysis of mtDNA and microsatellites. In both samples mtDNA haplotype frequencies differed significantly between age classes, and in one sample 17 out of 18 individuals less than 1 year of age shared one particular mtDNA haplotype. Estimates of relatedness showed that these individuals most likely represented only three full-sib families. Older trout exhibiting...

  15. Causal Inference and Explaining Away in a Spiking Network

    Science.gov (United States)

    Moreno-Bote, Rubén; Drugowitsch, Jan

    2015-01-01

    While the brain uses spiking neurons for communication, theoretical research on brain computations has mostly focused on non-spiking networks. The nature of spike-based algorithms that achieve complex computations, such as object probabilistic inference, is largely unknown. Here we demonstrate that a family of high-dimensional quadratic optimization problems with non-negativity constraints can be solved exactly and efficiently by a network of spiking neurons. The network naturally imposes the non-negativity of causal contributions that is fundamental to causal inference, and uses simple operations, such as linear synapses with realistic time constants, and neural spike generation and reset non-linearities. The network infers the set of most likely causes from an observation using explaining away, which is dynamically implemented by spike-based, tuned inhibition. The algorithm performs remarkably well even when the network intrinsically generates variable spike trains, the timing of spikes is scrambled by external sources of noise, or the network is mistuned. This type of network might underlie tasks such as odor identification and classification. PMID:26621426

  16. Mathematical inference and control of molecular networks from perturbation experiments

    Science.gov (United States)

    Mohammed-Rasheed, Mohammed

    One of the main challenges facing biologists and mathematicians in the post genomic era is to understand the behavior of molecular networks and harness this understanding into an educated intervention of the cell. The cell maintains its function via an elaborate network of interconnecting positive and negative feedback loops of genes, RNA and proteins that send different signals to a large number of pathways and molecules. These structures are referred to as genetic regulatory networks (GRNs) or molecular networks. GRNs can be viewed as dynamical systems with inherent properties and mechanisms, such as steady-state equilibriums and stability, that determine the behavior of the cell. The biological relevance of the mathematical concepts is important, as they may predict the differentiation of a stem cell, the maintenance of a normal cell, the development of cancer and its aberrant behavior, and the design of drugs and response to therapy. Uncovering the underlying GRN structure from gene/protein expression data, e.g., microarrays or perturbation experiments, is called inference or reverse engineering of the molecular network. Because of the high cost and time-consuming nature of biological experiments, the number of available measurements or experiments is very small compared to the number of molecules (genes, RNA and proteins). In addition, the observations are noisy, where the noise is due to measurement imperfections as well as the inherent stochasticity of genetic expression levels. Intra-cellular activities and extra-cellular environmental attributes are another source of variability. Thus, the inference of GRNs is, in general, an under-determined problem with a highly noisy set of observations. The ultimate goal of GRN inference and analysis is to be able to intervene within the network, in order to force it away from undesirable cellular states and into desirable ones. However, it remains a major challenge to design optimal intervention strategies

  17. The ocean circulation inverse problem

    National Research Council Canada - National Science Library

    Wunsch, C

    1996-01-01

    .... This book addresses the problem of inferring the state of the ocean circulation, understanding it dynamically, and even forecasting it through a quantitative combination of theory and observation...

  18. Haplotype Analysis Discriminates Genetic Risk for DR3-Associated Endocrine Autoimmunity and Helps Define Extreme Risk for Addison’s Disease

    Science.gov (United States)

    Baker, Peter R.; Baschal, Erin E.; Fain, Pam R.; Triolo, Taylor M.; Nanduri, Priyaanka; Siebert, Janet C.; Armstrong, Taylor K.; Babu, Sunanda R.; Rewers, Marian J.; Gottlieb, Peter A.; Barker, Jennifer M.; Eisenbarth, George S.

    2010-01-01

    Context: Multiple autoimmune disorders (e.g. Addison’s disease, type 1 diabetes, celiac disease) are associated with HLA-DR3, but it is likely that alleles of additional genes in linkage disequilibrium with HLA-DRB1 contribute to disease. Objective: The objective of the study was to characterize major histocompatibility complex (MHC) haplotypes conferring extreme risk for autoimmune Addison’s disease (AD). Design, Setting, and Participants: Eighty-six 21-hydroxylase autoantibody-positive, nonautoimmune polyendocrine syndrome type 1, Caucasian individuals collected from 1992 to 2009 with clinical AD from 68 families (12 multiplex and 56 simplex) were genotyped for HLA-DRB1, HLA-DQB1, MICA, HLA-B, and HLA-A as well as high density MHC single-nucleotide polymorphism (SNP) analysis for 34. Main Outcome Measures: AD and genotype were measured. Result: Ninety-seven percent of the multiplex individuals had both HLA-DR3 and HLA-B8 vs. 60% of simplex AD patients (P = 9.72 × 10^−4) and 13% of general population controls (P = 3.00 × 10^−19). The genotype DR3/DR4 with B8 was present in 85% of AD multiplex patients, 24% of simplex patients, and 1.5% of control individuals (P = 4.92 × 10^−191). The DR3-B8 haplotype of AD patients had HLA-A1 less often (47%) than controls (81%, P = 7.00 × 10^−5) and type 1 diabetes patients (73%, P = 1.93 × 10^−3). Analysis of 1228 SNPs across the MHC for individuals with AD revealed a shorter conserved haplotype (3.8) with the loss of the extended conserved 3.8.1 haplotype approximately halfway between HLA-B and HLA-A. Conclusion: Extreme risk for AD, especially in multiplex families, is associated with haplotypic DR3 variants, in particular a portion (3.8) but not all of the conserved 3.8.1 haplotype. PMID:20631027

  19. Haplotype Diversity at Sub1 Locus and Allelic Distribution Among Rice Varieties of Tide and Flood Prone Areas of South-East Asia

    Directory of Open Access Journals (Sweden)

    A.S.M. Masuduzzaman

    2017-07-01

    Full Text Available Single nucleotide polymorphisms and restriction digestion-based haplotype variations among 160 flood prone rice varieties were analyzed with enzymes Alu I and Cac8 I to generate polymorphisms at Sub1A and Sub1C loci (conferring submergence tolerance), respectively. Haplotype associated with phenotype was used to study the haplotype variations at Sub1A and Sub1C loci and to determine their functional influence on submergence tolerance and stem elongation. Three patterns at Sub1A locus, Sub1A0 (null allele), Sub1A1 (does not cut) and Sub1A2 (one SNP), and four patterns at Sub1C locus, Sub1C1, Sub1C2, Sub1C3 and Sub1C4, were generated. Both tolerant Sub1A1 and intolerant Sub1A2 had the same length, but the difference was presence of a restriction site in the Sub1A2, but absent at the Sub1A1. Further, two types of polymorphism were detected at the Sub1C, one included major length polymorphisms (165, 170 and 175 bp) and the other was a single restriction site at different position. Eight haplotypes (different combinations of the two loci), A1C1, A1C2, A1C4, A2C2, A2C4, A0C2, A0C3 and A0C4, were detected among 160 varieties. Haplotype A1C1 was comparatively more related to haplotypes A1C2 and A1C4, having the same Sub1A allele, and these haplotypes were found only in Bangladeshi, Sri Lankan and Indian varieties. Most tolerant varieties in A1C1 haplotype showed slow elongation, having tolerant specific Sub1A1 and Sub1C1 alleles. Further, the varieties Madabaru and Kottamali (A2C2) also showed moderate level of tolerance without Sub1A1 allele. These varieties were different with FR13A and also suspected to carry different novel tolerant genes at other loci. These materials could be used for hybridization with Sub1 varieties for pyramiding additional tolerant specific alleles into a single genotype for improving submergence tolerance in rice.

  20. Reconciling taxonomy and phylogenetic inference: formalism and algorithms for describing discord and inferring taxonomic roots

    Directory of Open Access Journals (Sweden)

    Matsen Frederick A

    2012-05-01

    Full Text Available Abstract Background Although taxonomy is often used informally to evaluate the results of phylogenetic inference and the root of phylogenetic trees, algorithmic methods to do so are lacking. Results In this paper we formalize these procedures and develop algorithms to solve the relevant problems. In particular, we introduce a new algorithm that solves a "subcoloring" problem to express the difference between a taxonomy and a phylogeny at a given rank. This algorithm improves upon the current best algorithm in terms of asymptotic complexity for the parameter regime of interest; we also describe a branch-and-bound algorithm that saves orders of magnitude in computation on real data sets. We also develop a formalism and an algorithm for rooting phylogenetic trees according to a taxonomy. Conclusions The algorithms in this paper, and the associated freely-available software, will help biologists better use and understand taxonomically labeled phylogenetic trees.

  1. Facility Activity Inference Using Radiation Networks

    Energy Technology Data Exchange (ETDEWEB)

    Rao, Nageswara S. [ORNL; Ramirez Aviles, Camila A. [ORNL

    2017-11-01

    We consider the problem of inferring the operational status of a reactor facility using measurements from a radiation sensor network deployed around the facility’s ventilation off-gas stack. The intensity of stack emissions decays with distance, and the sensor counts or measurements are inherently random with parameters determined by the intensity at the sensor’s location. We utilize the measurements to estimate the intensity at the stack, and use it in a one-sided Sequential Probability Ratio Test (SPRT) to infer on/off status of the reactor. We demonstrate the superior performance of this method over conventional majority fusers and individual sensors using (i) test measurements from a network of 21 NaI detectors, and (ii) effluence measurements collected at the stack of a reactor facility. We also analytically establish the superior detection performance of the network over individual sensors with fixed and adaptive thresholds by utilizing the Poisson distribution of the counts. We quantify the performance improvements of the network detection over individual sensors using the packing number of the intensity space.
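    The one-sided SPRT mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single stream of Poisson counts and two hypothetical known rates, `rate_off` and `rate_on`; the abstract's own method first estimates the stack intensity from the whole sensor network, which is not reproduced here. The function names, rates and the threshold `log(1/alpha)` are illustrative assumptions, not the authors' implementation.

```python
import math


def poisson_llr(count, rate_off, rate_on):
    """Log-likelihood ratio log[P(count | on) / P(count | off)] for one Poisson count."""
    return count * math.log(rate_on / rate_off) - (rate_on - rate_off)


def one_sided_sprt(counts, rate_off, rate_on, alpha=0.01):
    """Return the index at which 'on' is declared, or None if it never is.

    One-sided test: the cumulative log-likelihood ratio is compared only
    against an upper threshold log(1/alpha); 'off' is never formally accepted.
    The rates and alpha are hypothetical inputs chosen for illustration.
    """
    threshold = math.log(1.0 / alpha)
    cumulative = 0.0
    for index, count in enumerate(counts):
        cumulative += poisson_llr(count, rate_off, rate_on)
        if cumulative >= threshold:
            return index
    return None


if __name__ == "__main__":
    # Low counts followed by elevated counts (made-up data).
    print(one_sided_sprt([2, 3, 2, 9, 10, 11], rate_off=2.0, rate_on=8.0))
```

    With these made-up counts the low readings push the cumulative statistic down, so the decision is reached only after a couple of elevated readings.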

  2. PTCH1 gene haplotype association with basal cell carcinoma after transplantation.

    Science.gov (United States)

    Begnini, A; Tessari, G; Turco, A; Malerba, G; Naldi, L; Gotti, E; Boschiero, L; Forni, A; Rugiu, C; Piaserico, S; Fortina, A B; Brunello, A; Cascone, C; Girolomoni, G; Gomez Lira, M

    2010-08-01

    Basal cell carcinoma (BCC) is 10 times more frequent in organ transplant recipients (OTRs) than in the general population. Factors in OTRs conferring increased susceptibility to BCC include ultraviolet radiation exposure, immunosuppression, viral infections such as human papillomavirus, phototype and genetic predisposition. The PTCH1 gene is a negative regulator of the hedgehog pathway, that provides mitogenic signals to basal cells in skin. PTCH1 gene mutations cause naevoid BCC syndrome, and contribute to the development of sporadic BCC and other types of cancers. Associations have been reported between PTCH1 polymorphisms and BCC susceptibility in nontransplanted individuals. To search for novel common polymorphisms in the proximal 5' regulatory region upstream of PTCH1 gene exon 1B, and to investigate the possible association of PTCH1 polymorphisms and haplotypes with BCC risk after organ transplantation. Three PTCH1 single nucleotide polymorphisms (rs2297086, rs2066836 and rs357564) were analysed by restriction fragment length polymorphism analysis in 161 northern Italian OTRs (56 BCC cases and 105 controls). Two regions of the PTCH1 gene promoter were screened by heteroduplex analysis in 30 cases and 30 controls. Single locus analysis showed no significant association. Haplotype T(1686)-T(3944) appeared to confer a significantly higher risk for BCC development (odds ratio 2.98, 95% confidence interval 2.55-3.48; P = 0.001). Two novel rare polymorphisms were identified at positions 176 and 179 of the 5'UTR. Two novel alleles of the -4 (CGG)(n) microsatellite were identified. No association of this microsatellite with BCC was observed. Haplotypes containing T(1686)-T(3944) alleles were shown to be associated with an increased BCC risk in our study population. These data appear to be of great interest for further investigations in a larger group of transplant individuals. Our results do not support the hypothesis that common polymorphisms in the proximal 5

  3. Genetic variation and phylogenetic relationship analysis of Jatropha curcas L. inferred from nrDNA ITS sequences.

    Science.gov (United States)

    Guo, Guo-Ye; Chen, Fang; Shi, Xiao-Dong; Tian, Yin-Shuai; Yu, Mao-Qun; Han, Xue-Qin; Yuan, Li-Chun; Zhang, Ying

    2016-01-01

    Genetic variation and phylogenetic relationships among 102 Jatropha curcas accessions from Asia, Africa, and the Americas were assessed using the internal transcribed spacer region of nuclear ribosomal DNA (nrDNA ITS). The average G+C content (65.04%) was considerably higher than the A+T (34.96%) content. The estimated genetic diversity revealed moderate genetic variation. The pairwise genetic divergences (GD) between haplotypes were evaluated and ranged from 0.000 to 0.017, suggesting a higher level of genetic differentiation in Mexican accessions than those of other regions. Phylogenetic relationships and intraspecific divergence were inferred by Bayesian inference (BI), maximum parsimony (MP), and median joining (MJ) network analysis and were generally resolved. The J. curcas accessions were consistently divided into three lineages, groups A, B, and C, which demonstrated distant geographical isolation and genetic divergence between American accessions and those from other regions. The MJ network analysis confirmed that Central America was the possible center of origin. The putative migration route suggested that J. curcas was distributed from Mexico or Brazil, via Cape Verde and then split into two routes. One route was dispersed to Spain, then migrated to China, eventually spreading to southeastern Asia, while the other route was dispersed to Africa, via Madagascar and migrated to China, later spreading to southeastern Asia. Copyright © 2016 Académie des sciences. Published by Elsevier SAS. All rights reserved.
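    The two summary quantities reported in this abstract, G+C content and pairwise divergence between haplotypes, can be sketched as follows. This is a simplified illustration: it uses the uncorrected proportion of differing sites (p-distance) on pre-aligned sequences, which may differ from the distance model the authors used, and the function names and example sequences are hypothetical.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    gc = sum(base in "GC" for base in seq)
    total = sum(base in "ACGT" for base in seq)
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def p_distance(a, b):
    """Uncorrected proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    print(gc_content("GCGCATAT"))
    print(p_distance("ACGTACGT", "ACGTACGA"))
```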

  4. An Intuitive Dashboard for Bayesian Network Inference

    International Nuclear Information System (INIS)

    Reddy, Vikas; Farr, Anna Charisse; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K D V

    2014-01-01

    Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++

  5. An Intuitive Dashboard for Bayesian Network Inference

    Science.gov (United States)

    Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.

    2014-03-01

    Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.

  6. Evidence that the ancestral haplotype in Australian hemochromatosis patients may be associated with a common mutation in the gene.

    OpenAIRE

    Crawford, D H; Powell, L W; Leggett, B A; Francis, J S; Fletcher, L M; Webb, S I; Halliday, J W; Jazwinska, E C

    1995-01-01

    Hemochromatosis (HC) is a common inherited disorder of iron metabolism for which neither the gene nor biochemical defect have yet been identified. The aim of this study was to look for clinical evidence that the predominant ancestral haplotype in Australian patients is associated with a common mutation in the gene. We compared indices of iron metabolism and storage in three groups of HC patients categorized according to the presence of the ancestral haplotype (i.e., patients with two copies, ...

  7. Haplotypes of nine single nucleotide polymorphisms on chromosome 19q13.2-3 associated with susceptibility of lung cancer in a Chinese population

    DEFF Research Database (Denmark)

    Yin, Jiaoyang; Vogel, Ulla Birgitte; Ma, Yegang

    2008-01-01

    To evaluate the joint effect of nine single nucleotide polymorphisms for three DNA repair genes in the region of chromosome 19q13.2-3 on susceptibility of lung cancer in a Chinese population, we conducted a hospital-based case-control study consisting of 247 lung cancer cases and 253 cancer-free controls matched on age, gender and ethnicity. Associations between the haplotypes and susceptibility of lung cancer were tested. The global test of haplotype association revealed a statistically significant difference in the haplotype distribution between cases and controls (global test: chi(2) = 60.45, d...

  8. Bootstrapping phylogenies inferred from rearrangement data

    Directory of Open Access Journals (Sweden)

    Lin Yu

    2012-08-01

    Full Text Available Abstract Background Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. Results We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Conclusions Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its

  9. Bootstrapping phylogenies inferred from rearrangement data.

    Science.gov (United States)

    Lin, Yu; Rajan, Vaibhav; Moret, Bernard Me

    2012-08-29

    Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its support values follow a similar scale and its receiver

  10. Haplotype-based case-control study on human apurinic/apyrimidinic endonuclease 1/redox effector factor-1 gene and essential hypertension.

    Science.gov (United States)

    Naganuma, Takahiro; Nakayama, Tomohiro; Sato, Naoyuki; Fu, Zhenyan; Soma, Masayoshi; Yamaguchi, Mai; Shimodaira, Masanori; Aoi, Noriko; Usami, Ron

    2010-02-01

    Oxidative DNA damage is involved in the pathophysiology of essential hypertension (EH), which is a multifactorial disorder. Apurinic/apyrimidinic endonuclease 1/redox effector factor-1 (APE1/REF-1) is an essential endonuclease in the base excision repair pathway of oxidatively damaged DNA, in addition to having reducing properties that promote the binding of redox-sensitive transcription factors. Blood pressure in APE1/REF-1-knockout mice is reported to be significantly higher than in wild-type mice. The aim of this study was to investigate the relationship between EH and the human APE1/REF-1 gene through a haplotype-based case-control study using single-nucleotide polymorphisms (SNPs). We selected five SNPs in the human APE1/REF-1 gene (rs1760944, rs3136814, rs17111967, rs3136817, and rs1130409), and performed case-control studies in 265 EH patients and 266 age-matched normotensive (NT) subjects. rs17111967 was found to show nonheterogeneity among Japanese subjects. There were no significant differences in the overall distribution of genotypes or alleles for each SNP between EH and NT groups. In the overall distribution of the haplotype-based case-control study constructed based on rs1760944, rs3136817, and rs1130409, the frequency of the G-T-T haplotype was significantly higher in the EH group than in the NT group (2.1% vs. 0.0%, P = 0.001). Multiple logistic regression analysis also revealed significant differences for the G-T-T haplotype, even after adjustment for confounding factors (OR = 8.600, 95% CI: 1.073-68.951, P = 0.043). Based on the present results, the G-T-T haplotype appears to be a genetic marker of EH, and the APE1/REF-1 gene appears to be a susceptibility gene for EH.
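    For readers unfamiliar with how an odds ratio and its confidence interval are obtained in a case-control comparison like this one, the following is a minimal sketch of the unadjusted calculation from a 2x2 table with a Wald interval. It is not the study's analysis: the abstract reports an OR adjusted by multiple logistic regression, and the counts below are hypothetical. The plain formula also requires all four cells to be non-zero.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: carriers among cases       b: non-carriers among cases
    c: carriers among controls    d: non-carriers among controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this formula")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    print(odds_ratio_ci(20, 80, 10, 90))
```

    A wide interval, such as the one reported in the abstract, typically reflects a small count in at least one cell of the table.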

  11. ABO alleles are linked with haplotypes of an erythroid cell-specific regulatory element in intron 1 with a few exceptions attributable to genetic recombination.

    Science.gov (United States)

    Nakajima, T; Sano, R; Takahashi, Y; Watanabe, K; Kubo, R; Kobayashi, M; Takahashi, K; Takeshita, H; Kominato, Y

    2016-01-01

    Recent investigation of transcriptional regulation of the ABO genes has identified a candidate erythroid cell-specific regulatory element, named the +5·8-kb site, in the first intron of ABO. Six haplotypes of the site have been reported previously. The present genetic population study demonstrated that each haplotype was mostly linked with specific ABO alleles with a few exceptions, possibly as a result of hybrid formation between common ABO alleles. Thus, investigation of these haplotypes could provide a clue to further elucidation of ABO alleles. © 2015 International Society of Blood Transfusion.

  12. Exploring genetic variation in haplotypes of the filariasis vector Culex quinquefasciatus (Diptera: Culicidae) through DNA barcoding.

    Science.gov (United States)

    Vadivalagan, Chithravel; Karthika, Pushparaj; Murugan, Kadarkarai; Panneerselvam, Chellasamy; Del Serrone, Paola; Benelli, Giovanni

    2017-05-01

    Culex quinquefasciatus (Diptera: Culicidae) is a vector of many pathogens and parasites of humans, as well as domestic and wild animals. In urban and semi-urban Asian countries, Cx. quinquefasciatus is a main vector of nematodes causing lymphatic filariasis. In the African region, it vectors the Rift Valley fever virus, while in the USA it transmits West Nile, St. Louis encephalitis and Western equine encephalitis virus. In this study, DNA barcoding was used to explore the genetic variation of Cx. quinquefasciatus populations from 88 geographical regions. We presented a comprehensive approach analyzing the effectiveness of two gene markers, i.e. CO1 and 16S rRNA. The high threshold genetic divergence of CO1 (0.47%) gene was reported as an ideal marker for molecular identification of this mosquito vector. Furthermore, null substitutions were lower in CO1 if compared to 16S rRNA, which influenced its differentiating potential among Indian haplotypes. The NJ tree was supported with higher branch values for the CO1 gene than for 16S rRNA, indicating ideal genetic differentiation among haplotypes. TCS haplotype network revealed 14 distinct clusters. The intra- and inter-population polymorphism were calculated among the global and Indian Cx. quinquefasciatus lineages. The genetic diversity index Tajima's D showed negative values for all the 4 intra-population clusters (G2-4, G10). Fu's FS showed negative value for G10 cluster, which was significant and indicated recent population expansion. However, the G2-G4 (i.e. Indian lineages) had positive values, suggesting a bottleneck effect. Overall, our research firstly shed light on the genetic differences among the haplotypes of Cx. quinquefasciatus species complex, adding basic knowledge to the molecular ecology of this important mosquito vector. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Improving preimplantation genetic diagnosis (PGD) reliability by selection of sperm donor with the most informative haplotype.

    Science.gov (United States)

    Malcov, Mira; Gold, Veronica; Peleg, Sagit; Frumkin, Tsvia; Azem, Foad; Amit, Ami; Ben-Yosef, Dalit; Yaron, Yuval; Reches, Adi; Barda, Shimi; Kleiman, Sandra E; Yogev, Leah; Hauser, Ron

    2017-04-26

    The study aimed to describe a novel strategy that increases the accuracy and reliability of PGD in patients using sperm donation by pre-selecting the donor whose haplotype does not overlap the carrier's. A panel of 4-9 informative polymorphic markers, flanking the mutation in carriers of autosomal dominant/X-linked disorders, was tested in DNA of sperm donors before PGD. Whenever the lengths of donors' repeats overlapped those of the women, additional donors' DNA samples were analyzed. The donor that demonstrated the minimal overlapping with the patient was selected for IVF. In 8 out of 17 carriers the markers of the initially chosen donors overlapped the patients' alleles and 2-8 additional sperm donors for each patient were haplotyped. The selection of additional sperm donors increased the number of informative markers and reduced misdiagnosis risk from 6.00% ± 7.48 to 0.48% ± 0.68. The PGD results were confirmed and no misdiagnosis was detected. Our study demonstrates that pre-selecting a sperm donor whose haplotype has minimal overlapping with the female's haplotype is critical for reducing the misdiagnosis risk and ensuring a reliable PGD. This strategy may contribute to prevent the transmission of affected IVF-PGD embryos using a simple and economical procedure. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. DNA testing of donors was approved by the institutional Helsinki committee (registration number 319-08TLV, 2008). The present study was approved by the institutional Helsinki committee (registration number 0385-13TLV, 2013).

  14. Sparse Bayesian Inference and the Temperature Structure of the Solar Corona

    Energy Technology Data Exchange (ETDEWEB)

    Warren, Harry P. [Space Science Division, Naval Research Laboratory, Washington, DC 20375 (United States); Byers, Jeff M. [Materials Science and Technology Division, Naval Research Laboratory, Washington, DC 20375 (United States); Crump, Nicholas A. [Naval Center for Space Technology, Naval Research Laboratory, Washington, DC 20375 (United States)

    2017-02-20

    Measuring the temperature structure of the solar atmosphere is critical to understanding how it is heated to high temperatures. Unfortunately, the temperature of the upper atmosphere cannot be observed directly, but must be inferred from spectrally resolved observations of individual emission lines that span a wide range of temperatures. Such observations are “inverted” to determine the distribution of plasma temperatures along the line of sight. This inversion is ill posed and, in the absence of regularization, tends to produce wildly oscillatory solutions. We introduce the application of sparse Bayesian inference to the problem of inferring the temperature structure of the solar corona. Within a Bayesian framework a preference for solutions that utilize a minimum number of basis functions can be encoded into the prior and many ad hoc assumptions can be avoided. We demonstrate the efficacy of the Bayesian approach by considering a test library of 40 assumed temperature distributions.

  15. Goal inferences about robot behavior : goal inferences and human response behaviors

    NARCIS (Netherlands)

    Broers, H.A.T.; Ham, J.R.C.; Broeders, R.; De Silva, P.; Okada, M.

    2014-01-01

    This explorative research focused on the goal inferences human observers draw based on a robot's behavior, and the extent to which those inferences predict people's behavior in response to that robot. Results show that different robot behaviors cause different response behavior from people.

  16. The evolutionary history of the DMRT3 'Gait keeper' haplotype.

    Science.gov (United States)

    Staiger, E A; Almén, M S; Promerová, M; Brooks, S; Cothran, E G; Imsland, F; Jäderkvist Fegraeus, K; Lindgren, G; Mehrabani Yeganeh, H; Mikko, S; Vega-Pla, J L; Tozaki, T; Rubin, C J; Andersson, L

    2017-10-01

    A previous study revealed a strong association between the DMRT3:Ser301STOP mutation in horses and alternate gaits as well as performance in harness racing. Several follow-up studies have confirmed a high frequency of the mutation in gaited horse breeds and an effect on gait quality. The aim of this study was to determine when and where the mutation arose, to identify additional potential causal mutations and to determine the coalescence time for contemporary haplotypes carrying the stop mutation. We utilized sequences from 89 horses representing 26 breeds to identify 102 SNPs encompassing the DMRT3 gene that are in strong linkage disequilibrium with the stop mutation. These 102 SNPs were genotyped in an additional 382 horses representing 72 breeds, and we identified 14 unique haplotypes. The results provided conclusive evidence that DMRT3:Ser301STOP is causal, as no other sequence polymorphisms showed an equally strong association to locomotion traits. The low sequence diversity among mutant chromosomes demonstrated that they must have diverged from a common ancestral sequence within the last 10 000 years. Thus, the mutation occurred either just before domestication or more likely some time after domestication and then spread across the world as a result of selection on locomotion traits. © 2017 Stichting International Foundation for Animal Genetics.

  17. MtDNA haplotype identification of aurochs remains originating from the Czech Republic (Central Europe)

    Czech Academy of Sciences Publication Activity Database

    Kyselý, René; Hájek, M.

    2012-01-01

    Vol. 17, No. 2 (2012), pp. 118-125 ISSN 1461-4103 Institutional research plan: CEZ:AV0Z80020508 Institutional support: RVO:67985912 Keywords: wild cattle (Bos primigenius) * aDNA * haplotype P * domestication Subject RIV: AC - Archeology, Anthropology, Ethnology

  18. Inferring regulatory networks from expression data using tree-based methods.

    Directory of Open Access Journals (Sweden)

    Vân Anh Huynh-Thu

    2010-09-01

    One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was the best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene) is predicted from the expression patterns of all the other genes (input genes), using the tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. GENIE3 does not make any assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
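
    The decomposition described above (one regression problem per target gene, with the importance of each input gene read as evidence for a regulatory link) can be illustrated with a minimal sketch. This is not the GENIE3 implementation: as a loudly stated simplification, the Random Forest or Extra-Trees ensemble is replaced by a single-split regression stump per input gene, and the names `stump_importance` and `rank_links` are hypothetical.

```python
from statistics import pvariance


def stump_importance(x, y):
    """Largest per-sample reduction in the variance of y obtained by one
    threshold split on x (a depth-1 regression tree, standing in for the
    tree ensembles named in the abstract)."""
    n = len(y)
    total = pvariance(y) * n
    order = sorted(range(n), key=lambda s: x[s])
    best = 0.0
    for cut in range(1, n):
        # A threshold can only separate samples with distinct x values.
        if x[order[cut - 1]] == x[order[cut]]:
            continue
        left = [y[s] for s in order[:cut]]
        right = [y[s] for s in order[cut:]]
        within = pvariance(left) * len(left) + pvariance(right) * len(right)
        best = max(best, total - within)
    return best / n


def rank_links(expr):
    """expr maps gene name -> expression values over the same samples.
    Returns directed (score, regulator, target) triples, best first."""
    links = []
    for target in expr:              # one regression problem per target gene
        for regulator in expr:       # every other gene is a candidate input
            if regulator != target:
                score = stump_importance(expr[regulator], expr[target])
                links.append((score, regulator, target))
    return sorted(links, reverse=True)


# Toy data (invented for illustration): B switches on when A is high.
toy = {
    "A": [0, 1, 2, 3, 4, 5, 6, 7],
    "B": [0, 0, 0, 0, 5, 5, 5, 5],
    "C": [3, 1, 4, 1, 5, 9, 2, 6],
}
ranking = rank_links(toy)
```

    Because the score of A as a predictor of B generally differs from the score of B as a predictor of A, the ranking is directed, which mirrors the directed GRNs mentioned in the abstract.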

  19. Reinforcement learning or active inference?

    Science.gov (United States)

    Friston, Karl J; Daunizeau, Jean; Kiebel, Stefan J

    2009-07-29

    This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.

  20. Reinforcement learning or active inference?

    Directory of Open Access Journals (Sweden)

    Karl J Friston

    2009-07-01

    This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.

  1. The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

    Science.gov (United States)

    Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

    2016-10-11

    Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced, a stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

  2. Haplotype analysis of the HFE gene among populations of Northern Eurasia, in patients with metabolic disorders or stomach cancer, and in long-lived people.

    Science.gov (United States)

    Mikhailova, S V; Babenko, V N; Ivanoshchuk, D E; Gubina, M A; Maksimov, V N; Solovjova, I G; Voevoda, M I

    2016-06-17

    Previously, it was shown that the HFE gene (associated with human hereditary hemochromatosis) has several haplotypes of intronic polymorphisms. Some haplotype frequencies are race specific and hence can be used in phylogenetic analysis. We assumed that analysis of Caucasoid patients (living now in Western Siberia and having diseases associated with dietary habits and metabolic rate) will allow us to understand the processes of possible selection during settling of the northern part of Asia. Haplotype analysis of Northern Eurasian native and recently settled ethnic groups was performed on polymorphisms rs1799945, rs1800730, rs1800562, rs2071303, rs1800708, rs1572982, rs2794719, rs807209, and rs2032451 of this gene. The CCA haplotype of rs2071303, rs1800708, and rs1572982 was found to be associated with HLA-A2 (39 %) in Asian populations. Haplotype analysis for rs1799945, rs1800730, rs1800562, rs2071303, rs1800708, and rs1572982 was performed on Russian patients with some metabolic disorders or stomach cancer and among long-lived people. Decreased frequencies of the TTA haplotype (T in rs2071303, T in rs1800708, and A in rs1572982) were observed in the groups of patients with diseases associated with overweight (fatty liver disease, type 2 diabetes mellitus, or metabolic syndrome + arterial hypertension) as compared with the control sample. We detected significant differences in this haplotype's frequency between the patients with type 2 diabetes mellitus and Russian adolescents, elderly citizens, and long-lived people (χ² P value = 0.003, 0.010, and 0.015, respectively). No significant differences in frequencies of the alleles with mutations in coding regions of the HFE gene (C282Y, H63D, and S65C) were detected between the analyzed patients (with stomach cancer, metabolic syndrome, fatty liver disease, or type 2 diabetes mellitus) and the control Caucasoid sample. Monophyletic origin of H63D (rs1799945) was confirmed in Caucasoids and Northern

  3. Predictive minimum description length principle approach to inferring gene regulatory networks.

    Science.gov (United States)

    Chaitankar, Vijender; Zhang, Chaoyang; Ghosh, Preetam; Gong, Ping; Perkins, Edward J; Deng, Youping

    2011-01-01

    Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is determining the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we propose a new inference algorithm that incorporates mutual information (MI), conditional mutual information (CMI), and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need for a user-specified fine tuning parameter. The performance of the proposed algorithm is evaluated using both synthetic time series data sets and a biological time series data set (Saccharomyces cerevisiae). The results show that the proposed algorithm produced fewer false edges and significantly improved the precision when compared to the existing MDL algorithm.
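
    The role of the MI threshold discussed above can be illustrated with a small sketch that estimates mutual information between discretized expression profiles and keeps gene pairs whose MI reaches a cutoff. It is only an assumption-laden illustration: the threshold is passed in by hand, whereas the abstract's method selects it with the PMDL principle, and the CMI step is omitted. The names `mutual_information` and `infer_edges` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits from two equal-length
    sequences of discrete (already discretized) expression levels."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # sum over observed pairs of p(a,b) * log2( p(a,b) / (p(a) p(b)) )
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def infer_edges(profiles, threshold):
    """Return undirected gene pairs whose MI is at least `threshold`.
    The threshold here is a manual placeholder for the value that the
    PMDL step would choose."""
    genes = sorted(profiles)
    return [(g, h)
            for i, g in enumerate(genes)
            for h in genes[i + 1:]
            if mutual_information(profiles[g], profiles[h]) >= threshold]


# Toy binary profiles (invented): g1 and g2 move together, g3 is unrelated.
profiles = {
    "g1": [0, 0, 1, 1],
    "g2": [0, 0, 1, 1],
    "g3": [0, 1, 0, 1],
}
edges = infer_edges(profiles, threshold=0.5)
```

    Raising or lowering `threshold` directly changes which edges survive, which is why an automatic, data-driven choice of this value matters.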

  4. Strong association between a splice mutation (IVS12+5G→A) and haplotype 6 in hereditary tyrosinemia type I

    Energy Technology Data Exchange (ETDEWEB)

    Tanguay, R.M.; St-Louis, M.; Gibson, K. [Universite Laval, Ste-Foy (Canada)] [and others]

    1994-09-01

    Hereditary tyrosinemia type I (HT I; McKusick 276700) is a severe inborn error of the tyrosine catabolism pathway caused by a deficiency of fumarylacetoacetate hydrolase (FAH). The highest frequency reported is the one in Saguenay-Lac St-Jean (SLSJ; Quebec, Canada) where 1:1,846 births are affected. The FAH gene has been cloned and several mutations have been described. Allele specific oligonucleotide (ASO) hybridization was used to examine the frequency of a recently reported splice mutation (IVS12+5G→A), and RFLP analysis was done to identify haplotypes related to HT I. The splice mutation was found on 45/50 alleles (90%) in patients from SLSJ and 12/66 (18%) alleles from patients world-wide. All 25 patients from the SLSJ region were positive with 20 being homozygous, indicating that this mutation is the major cause of HT I in French Canada. Of these 25 patients, 96% were positive for one haplotype called no 6 which is identified by TaqI, RsaI, BglII, MspI and KpnI digestions. These data show a strong association between the mutation (IVS12+5G→A) and haplotype 6. Among our patients from around the world, approximately 52% were positive for haplotype 6, indicating its strong relation with HT I. These results provide the rationale for DNA-based carrier testing for HT I in the French-Canadian population at risk as well as in HT I patients in general.

  5. NIFTY - Numerical Information Field Theory. A versatile PYTHON library for signal inference

    Science.gov (United States)

    Selig, M.; Bell, M. R.; Junklewitz, H.; Oppermann, N.; Reinecke, M.; Greiner, M.; Pachajoa, C.; Enßlin, T. A.

    2013-06-01

    NIFTy (Numerical Information Field Theory) is a software package designed to enable the development of signal inference algorithms that operate regardless of the underlying spatial grid and its resolution. Its object-oriented framework is written in Python, although it accesses libraries written in Cython, C++, and C for efficiency. NIFTy offers a toolkit that abstracts discretized representations of continuous spaces, fields in these spaces, and operators acting on fields into classes. Thereby, the correct normalization of operations on fields is taken care of automatically without concerning the user. This allows for an abstract formulation and programming of inference algorithms, including those derived within information field theory. Thus, NIFTy permits its user to rapidly prototype algorithms in 1D, and then apply the developed code in higher-dimensional settings of real world problems. The set of spaces on which NIFTy operates comprises point sets, n-dimensional regular grids, spherical spaces, their harmonic counterparts, and product spaces constructed as combinations of those. The functionality and diversity of the package is demonstrated by a Wiener filter code example that successfully runs without modification regardless of the space on which the inference problem is defined. NIFTy homepage http://www.mpa-garching.mpg.de/ift/nifty/; Excerpts of this paper are part of the NIFTy source code and documentation.
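
    For readers unfamiliar with the Wiener filter mentioned above, the following sketch shows the textbook form of the filter in a basis where both covariances are diagonal. It does not use or imitate the NIFTy API; the function name `wiener_filter` is hypothetical, and the setting (data = signal + noise with unit response, independent zero-mean Gaussian modes) is an assumption made for illustration.

```python
def wiener_filter(data, signal_var, noise_var):
    """Per-mode Wiener filter for d = s + n, assuming independent
    zero-mean Gaussian signal and noise with known variances.

    Returns the posterior mean  m = S / (S + N) * d
    and the posterior variance  D = S * N / (S + N)  for every mode.
    """
    mean = [s / (s + n) * d
            for d, s, n in zip(data, signal_var, noise_var)]
    var = [s * n / (s + n)
           for s, n in zip(signal_var, noise_var)]
    return mean, var


# Two modes (invented numbers): the first is signal dominated, so the
# data are mostly trusted; the second is shrunk halfway towards zero.
mean, var = wiener_filter(data=[2.0, 4.0],
                          signal_var=[3.0, 1.0],
                          noise_var=[1.0, 1.0])
```

    Nothing in the function depends on what the modes represent, which loosely echoes the abstract's point that the filter can be written independently of the underlying space.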

  6. Nonlinear Inference in Partially Observed Physical Systems and Deep Neural Networks

    Science.gov (United States)

    Rozdeba, Paul J.

    The problem of model state and parameter estimation is a significant challenge in nonlinear systems. Due to practical considerations of experimental design, it is often the case that physical systems are partially observed, meaning that data is only available for a subset of the degrees of freedom required to fully model the observed system's behaviors and, ultimately, predict future observations. Estimation in this context is highly complicated by the presence of chaos, stochasticity, and measurement noise in dynamical systems. One of the aims of this dissertation is to simultaneously analyze state and parameter estimation as a regularized inverse problem, where the introduction of a model makes it possible to reverse the forward problem of partial, noisy observation; and as a statistical inference problem using data assimilation to transfer information from measurements to the model states and parameters. Ultimately these two formulations achieve the same goal. Similar aspects that appear in both are highlighted as a means for better understanding the structure of the nonlinear inference problem. An alternative approach to data assimilation that uses model reduction is then examined as a way to eliminate unresolved nonlinear gating variables from neuron models. In this formulation, only measured variables enter into the model, and the resulting errors are themselves modeled by nonlinear stochastic processes with memory. Finally, variational annealing, a data assimilation method previously applied to dynamical systems, is introduced as a potentially useful tool for understanding deep neural network training in machine learning by exploiting similarities between the two problems.

  7. Differences in meiotic recombination rates in childhood acute lymphoblastic leukemia at an MHC class II hotspot close to disease associated haplotypes.

    Directory of Open Access Journals (Sweden)

    Pamela Thompson

    Childhood Acute Lymphoblastic Leukemia (ALL) is a malignant lymphoid disease of which B-cell precursor (BCP) and T-cell (T) ALL are subtypes. The role of alleles encoded by major histocompatibility complex (MHC) loci has been examined in a number of previous studies, and results indicating weak, multi-allele associations between the HLA-DPB1 locus and BCP-ALL suggested a role for immunosusceptibility and possibly infection. Two independent SNP association studies of ALL identified loci approximately 37 kb from one another and flanking a strong meiotic recombination hotspot (DNA3), adjacent to HLA-DOA and centromeric of HLA-DPB1. To determine the relationship between this observation and HLA-DPB1 associations, we constructed high density SNP haplotypes of the 316 kb region from HLA-DMB to COL11A2 in childhood ALL and controls using a UK GWAS data subset and the software PHASE. Of four haplotype blocks identified, predicted haplotypes in Block 1 (centromeric of DNA3) differed significantly between BCP-ALL and controls (P = 0.002) and in Block 4 (including HLA-DPB1) between T-ALL and controls (P = 0.049). Of specific common (>5%) haplotypes in Block 1, two were less frequent in BCP-ALL, and in Block 4 a single haplotype was more frequent in T-ALL, compared to controls. Unexpectedly, we also observed apparent differences in ancestral meiotic recombination rates at DNA3, with BCP-ALL showing increased and T-ALL decreased levels compared to controls. In silico analysis using LDsplit software indicated that recombination rates at DNA3 are influenced by flanking loci, including SNPs identified in childhood ALL association studies. The observed differences in rates of meiotic recombination at this hotspot, and potentially others, may be a characteristic of childhood leukemia and contribute to disease susceptibility; alternatively, they may reflect interactions between ALL-associated haplotypes in this region.

  8. Polymorphic haplotypes on R408W PKU and normal PAH chromosomes in Quebec and European populations

    Energy Technology Data Exchange (ETDEWEB)

    Byck, S.; Morgan, K.; Scriver, C.R. [McGill Univ., Montreal (Canada)] [and others]

    1994-09-01

    The R408W mutation in the phenylalanine hydroxylase gene (PAH) is associated with haplotype 2.3 (RFLP haplotype 2, VNTR 3 of the HindIII system) in most European populations. Another chromosome, first observed in Quebec and then in northwest Europe, carries R408W on haplotype 1.8. The occurrence of the R408W mutation on two different PKU chromosomes could be the result of intragenic recombination, recurrent mutation or gene conversion. In this study, we analyzed both normal and R408W chromosomes carrying 1.8 and 2.3 haplotypes in Quebec and European populations; we used the (TCTA)n short tandem repeat sequence (STR) at the 5′ end of the PAH gene and the HindIII VNTR system at the 3′ end of the PAH gene to characterize chromosomes. Fourteen of sixteen R408W chromosomes from "Celtic" families in Quebec and the United Kingdom (UK) harbor a 244 bp STR allele; the remaining two chromosomes carry a 240 bp or 248 bp STR allele. Normal chromosomes (n=18) carry the 240 bp STR allele. R408W chromosomes are different from mutant H1.8 chromosomes; mutant H2.3 carries the 240 bp STR allele (14 of 16 chromosomes) or the 236 bp allele (2 of 16 chromosomes). The HindIII VNTR comprises variable numbers of 30 bp repeats (cassettes); the repeats also vary in nucleotide sequence. Variation clusters toward the 3′ end of cassettes and VNTRs. VNTR 3 alleles on normal H2 (n=9) and mutant R408W H2 (n=19) chromosomes were identical. VNTR 8 alleles on normal H1 chromosomes (n=9) and on R408W H1 chromosomes (n=15) differ by a 1 bp substitution near the 3′ end of the 6th cassette. In summary, the mutant H1.8 chromosome harboring the R408W mutation has unique features at both the 5′ and 3′ ends of the gene that distinguish it from the mutant H2.3 and normal H1.8 and H2.3 counterparts. The explanation for the occurrence of R408W on two different PAH haplotypes is recurrent mutation affecting the CpG dinucleotide in PAH codon 408.

  9. Learning Convex Inference of Marginals

    OpenAIRE

    Domke, Justin

    2012-01-01

    Graphical models trained using maximum likelihood are a common tool for probabilistic inference of marginal distributions. However, this approach suffers difficulties when either the inference process or the model is approximate. In this paper, the inference process is first defined to be the minimization of a convex function, inspired by free energy approximations. Learning is then done directly in terms of the performance of the inference process at univariate marginal prediction. The main ...

  10. Maximum Likelihood Method for Predicting Environmental Conditions from Assemblage Composition: The R Package bio.infer

    Directory of Open Access Journals (Sweden)

    Lester L. Yuan

    2007-06-01

    This paper provides a brief introduction to the R package bio.infer, a set of scripts that facilitates the use of maximum likelihood (ML) methods for predicting environmental conditions from assemblage composition. Environmental conditions can often be inferred from only biological data, and these inferences are useful when other sources of data are unavailable. ML prediction methods are statistically rigorous and applicable to a broader set of problems than more commonly used weighted averaging techniques. However, ML methods require a substantially greater investment of time to program algorithms and to perform computations. This package is designed to reduce the effort required to apply ML prediction methods.

  11. Modeling and E-M estimation of haplotype-specific relative risks from genotype data for a case-control study of unrelated individuals.

    Science.gov (United States)

    Stram, Daniel O; Leigh Pearce, Celeste; Bretsky, Phillip; Freedman, Matthew; Hirschhorn, Joel N; Altshuler, David; Kolonel, Laurence N; Henderson, Brian E; Thomas, Duncan C

    2003-01-01

    The US National Cancer Institute has recently sponsored the formation of a Cohort Consortium (http://2002.cancer.gov/scpgenes.htm) to facilitate the pooling of data on very large numbers of people, concerning the effects of genes and environment on cancer incidence. One likely goal of these efforts will be to generate a large population-based case-control series for which a number of candidate genes will be investigated using SNP haplotype as well as genotype analysis. The goal of this paper is to outline the issues involved in choosing a method of estimating haplotype-specific risk estimates for such data that is technically appropriate and yet attractive to epidemiologists who are already comfortable with odds ratios and logistic regression. Our interest is to develop and evaluate extensions of methods, based on haplotype imputation, that have been recently described (Schaid et al., Am J Hum Genet, 2002, and Zaykin et al., Hum Hered, 2002) as providing score tests of the null hypothesis of no effect of SNP haplotypes upon risk, which may be used for more complex tasks, such as providing confidence intervals, and tests of equivalence of haplotype-specific risks in two or more separate populations. In order to do so we (1) develop a cohort approach towards odds ratio analysis by expanding the E-M algorithm to provide maximum likelihood estimates of haplotype-specific odds ratios as well as genotype frequencies; (2) show how to correct the cohort approach, to give essentially unbiased estimates for population-based or nested case-control studies by incorporating the probability of selection as a case or control into the likelihood, based on a simplified model of case and control selection, and (3) finally, in an example data set (CYP17 and breast cancer, from the Multiethnic Cohort Study) we compare likelihood-based confidence interval estimates from the two methods with each other, and with the use of the single-imputation approach of Zaykin et al. applied under both
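
    As background to the E-M approach named in this record, the sketch below shows the standard E-M iteration for estimating haplotype frequencies from unphased genotypes at two biallelic SNPs. It is not the paper's method: the extensions to haplotype-specific odds ratios and to case-control sampling are not reproduced, and the function name `em_haplotype_freqs`, the genotype coding (count of allele 1 at each SNP) and the toy data are assumptions made for illustration.

```python
def em_haplotype_freqs(genotypes, iters=50):
    """Estimate frequencies of the four two-SNP haplotypes from unphased
    genotypes given as (g1, g2) pairs, each g being 0, 1 or 2 copies of
    allele 1. Only the double heterozygote (1, 1) has ambiguous phase."""
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freq = {h: 0.25 for h in haps}           # uniform starting point
    n = len(genotypes)
    for _ in range(iters):
        counts = {h: 0.0 for h in haps}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: split the individual between the two phasings
                # in proportion to their current probabilities.
                cis = freq[(0, 0)] * freq[(1, 1)]
                trans = freq[(0, 1)] * freq[(1, 0)]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                counts[(0, 0)] += w
                counts[(1, 1)] += w
                counts[(0, 1)] += 1 - w
                counts[(1, 0)] += 1 - w
            else:
                # Phase is determined: read off the two haplotypes.
                a = (min(g1, 1), min(g2, 1))
                b = (g1 - a[0], g2 - a[1])
                counts[a] += 1
                counts[b] += 1
        # M-step: new frequencies are expected counts over 2n chromosomes.
        freq = {h: c / (2 * n) for h, c in counts.items()}
    return freq


# Toy sample (invented): homozygotes point to haplotypes 00 and 11, so
# the two double heterozygotes are resolved as 00/11 rather than 01/10.
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
estimated = em_haplotype_freqs(sample)
```

    In imputation-based risk analyses of the kind the abstract describes, the phasing weights computed in the E-step are the quantities that carry haplotype uncertainty forward into the risk model.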

  12. Subjective randomness as statistical inference.

    Science.gov (United States)

    Griffiths, Thomas L; Daniels, Dylan; Austerweil, Joseph L; Tenenbaum, Joshua B

    2018-06-01

    Some events seem more random than others. For example, when tossing a coin, a sequence of eight heads in a row does not seem very random. Where do these intuitions about randomness come from? We argue that subjective randomness can be understood as the result of a statistical inference assessing the evidence that an event provides for having been produced by a random generating process. We show how this account provides a link to previous work relating randomness to algorithmic complexity, in which random events are those that cannot be described by short computer programs. Algorithmic complexity is both incomputable and too general to capture the regularities that people can recognize, but viewing randomness as statistical inference provides two paths to addressing these problems: considering regularities generated by simpler computing machines, and restricting the set of probability distributions that characterize regularity. Building on previous work exploring these different routes to a more restricted notion of randomness, we define strong quantitative models of human randomness judgments that apply not just to binary sequences - which have been the focus of much of the previous work on subjective randomness - but also to binary matrices and spatial clustering. Copyright © 2018 Elsevier Inc. All rights reserved.
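
    The idea of scoring randomness as evidence for a random generating process can be sketched as a log-likelihood ratio between a fair-coin model and a "regular" alternative. The particular regular model used here (each outcome repeats the previous one with probability `p_repeat`) is a hypothetical stand-in chosen for brevity, not one of the models defined in the paper, and `randomness_score` is an illustrative name.

```python
from math import log2


def log2_p_random(seq):
    """Fair coin: every sequence of length n has probability 2**-n."""
    return -len(seq)


def log2_p_regular(seq, p_repeat=0.9):
    """Assumed regular process: the first outcome is fair, and each
    later outcome repeats its predecessor with probability p_repeat."""
    logp = log2(0.5)
    for prev, cur in zip(seq, seq[1:]):
        logp += log2(p_repeat if cur == prev else 1 - p_repeat)
    return logp


def randomness_score(seq, p_repeat=0.9):
    """log2 [ P(seq | random) / P(seq | regular) ]; positive values
    favour the random process, negative values the regular one."""
    return log2_p_random(seq) - log2_p_regular(seq, p_repeat)


streak = randomness_score("HHHHHHHH")   # eight heads in a row
mixed = randomness_score("HTHHTHTT")    # an irregular-looking sequence
```

    Under these assumptions the run of eight heads receives a negative score and the mixed sequence a positive one, matching the intuition described in the abstract. A repetition-only alternative cannot detect other regularities such as strict alternation, which is one reason richer families of regular processes are needed.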

  13. Quantitative trait loci and the relevance of phased haplotypes

    DEFF Research Database (Denmark)

    Gregersen, Vivi Raundahl

    Genetic control of different production traits and diseases within livestock has been of great interest since domestication. SNPs have greatly facilitated the use of QTL studies in the search of genomic regions affecting different phenotypes. The studies have been conducted to identify regions...... underlying genetic control both as traditional linkage studies relying on genetic maps and as GWAS where an approach of phasing haplotypes within the QTL have been conducted to validate the regions. Overall, regions of interest have been identified for chronic pleuritis and osteochondrosis in addition to meat...... quality and boar taint in pigs, and for improved cheese production within cows...

  14. EEG Based Inference of Spatio-Temporal Brain Dynamics

    DEFF Research Database (Denmark)

    Hansen, Sofie Therese

    Electroencephalography (EEG) provides a measure of brain activity and has improved our understanding of the brain immensely. However, there is still much to be learned and the full potential of EEG is yet to be realized. In this thesis we suggest to improve the information gain of EEG using three...... different approaches; 1) by recovery of the EEG sources, 2) by representing and inferring the propagation path of EEG sources, and 3) by combining EEG with functional magnetic resonance imaging (fMRI). The common goal of the methods, and thus of this thesis, is to improve the spatial dimension of EEG...... recovery ability. The forward problem describes the propagation of neuronal activity in the brain to the EEG electrodes on the scalp. The geometry and conductivity of the head layers are normally required to model this path. We propose a framework for inferring forward models which is based on the EEG...

  15. Genetic polymorphisms in MDR1 and CYP3A4 genes in Asians and the influence of MDR1 haplotypes on cyclosporin disposition in heart transplant recipients.

    Science.gov (United States)

    Chowbay, Balram; Cumaraswamy, Sivathasan; Cheung, Yin Bun; Zhou, Qingyu; Lee, Edmund J D

    2003-02-01

    Intestinal cytochrome P450 3A4 (CYP3A4) and P-glycoprotein (P-gp) both play a vital role in the metabolism of oral cyclosporine (CsA). We investigated the genetic polymorphisms in CYP3A4 (promoter region and exons 5, 7 and 9) and MDR1 (exons 12, 21 and 26) genes and the impact of these polymorphisms on the pharmacokinetics of oral CsA in stable heart transplant patients (n = 14). CYP3A4 polymorphisms were rare in the Asian population and transplant patients. Haplotype analysis revealed 12 haplotypes in the Chinese, eight in the Malays and 10 in the Indians. T-T-T was the most common haplotype in all ethnic groups. The frequency of the homozygous mutant genotype at all three loci (TT-TT-TT) was highest in the Indians (31%) compared to 19% and 15% in the Chinese and Malays, respectively. In heart transplant patients, CsA exposure (AUC(0-4 h), AUC(0-12 h) and C(max)) was higher in patients with the T-T-T haplotypes than in those with C-G-C haplotypes. These findings suggest that haplotypes rather than genotypes influence CsA disposition in transplant patients.

  16. Epistatic interaction between haplotypes of the ghrelin ligand and receptor genes influence susceptibility to myocardial infarction and coronary artery disease.

    Science.gov (United States)

    Baessler, Andrea; Fischer, Marcus; Mayer, Bjoern; Koehler, Martina; Wiedmann, Silke; Stark, Klaus; Doering, Angela; Erdmann, Jeanette; Riegger, Guenter; Schunkert, Heribert; Kwitek, Anne E; Hengstenberg, Christian

    2007-04-15

    Data from both experimental models and humans provide evidence that ghrelin and its receptor, the growth hormone secretagogue receptor (ghrelin receptor, GHSR), possess a variety of cardiovascular effects. Thus, we hypothesized that genetic variants within the ghrelin system (ligand ghrelin and its receptor GHSR) are associated with susceptibility to myocardial infarction (MI) and coronary artery disease (CAD). Seven single nucleotide polymorphisms (SNPs) covering the GHSR region as well as eight SNPs across the ghrelin gene (GHRL) region were genotyped in index MI patients (864 Caucasians, 'index MI cases') from the German MI family study and in matched controls without evidence of CAD (864 Caucasians, 'controls', MONICA Augsburg). In addition, siblings of these MI patients with documented severe CAD (826 'affected sibs') were matched likewise with controls (n = 826 Caucasian 'controls') and used for verification. The effect of interactions between genetic variants of both genes of the ghrelin system was explored by conditional classification tree models. We found association of several GHSR SNPs with MI [best SNP odds ratio (OR) 1.7 (1.2-2.5); P = 0.002] using a recessive model. Moreover, we identified a common GHSR haplotype which significantly increases the risk for MI [multivariate adjusted OR for homozygous carriers 1.6 (1.1-2.5) and CAD OR 1.6 (1.1-2.5)]. In contrast, no relationship between genetic variants and the disease could be revealed for GHRL. However, the increase in MI/CAD frequency related to the susceptible GHSR haplotype was abolished when it coincided with a common GHRL haplotype. Multivariate adjustments as well as permutation-based methods conveyed the same results. 
    These data are the first to demonstrate an association between SNPs and haplotypes within important genes of the ghrelin system and susceptibility to MI. Whereas an association with MI/CAD could be identified for genetic variants across GHSR, no relationship could be revealed for GHRL.

  17. Polymorphism screening and haplotype analysis of the tryptophan hydroxylase gene (TPH1) and association with bipolar affective disorder in Taiwan

    Directory of Open Access Journals (Sweden)

    Lin Yi-Mei J

    2005-03-01

    Full Text Available Abstract Background Disturbances in serotonin neurotransmission are implicated in the etiology of many psychiatric disorders, including bipolar affective disorder (BPD). The tryptophan hydroxylase gene (TPH), which codes for the enzyme catalyzing the rate-limiting step in the serotonin biosynthetic pathway, is one of the leading candidate genes for psychiatric and behavioral disorders. In a preliminary study, we found that the TPH1 intron 7 A218C polymorphism was associated with BPD. This study was designed to investigate sequence variants of the TPH1 gene in Taiwanese and to test whether the TPH1 gene is a susceptibility factor for BPD. Methods Using a systematic approach, we searched the exons and promoter region of the TPH1 gene for sequence variants in Taiwanese Han and identified five variants, A-1067G, G-347T, T3804A, C27224T, and A27237G. These five variants plus another five taken from the literature and a public database were examined for an association in 108 BPD patients and 103 controls; no association was detected for any of the 10 variants. Results Haplotype constructions using these 10 SNPs showed that the 3 most common haplotypes in both patients and controls were identical. One of the fourth most common haplotypes in the patient group (i.e. GGGAGACCCA) was unique and showed a trend of significance with the disease (P = 0.028). However, the significance was abolished after Bonferroni correction, suggesting that the association is weak. In addition, three haplotype-tagged SNPs (htSNPs) were selected to represent all haplotypes with frequencies larger than 2% in the Taiwanese Han population. The defined TPH1 htSNPs significantly reduce the marker number for haplotype analysis and thus provide useful information for future association studies in our population. Conclusion Results of this study did not support the role of TPH1 gene in BPD etiology. As the current studies found the TPH1 gene under investigation belongs to the peripheral
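    The Bonferroni step mentioned in the abstract above can be illustrated with a minimal Python sketch. The abstract does not state how many haplotype tests were corrected for, so the list of raw p-values below is hypothetical; only the value 0.028 comes from the record, and the helper name `bonferroni_adjust` is an illustrative choice.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values (each p times the number of tests, capped at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical family of four haplotype tests; only 0.028 is taken from the abstract.
raw = [0.028, 0.41, 0.67, 0.90]
adjusted = bonferroni_adjust(raw)

# A result counts as significant only if its adjusted p-value stays below alpha.
alpha = 0.05
significant = [p < alpha for p in adjusted]
```

    Under this assumed family size, the nominally significant 0.028 is adjusted to roughly 0.112 and no longer passes the 0.05 threshold, which mirrors the pattern the abstract describes.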

  18. Identification of a two-marker-haplotype on Bos taurus autosome 18 associated with somatic cell score in German Holstein cattle

    Directory of Open Access Journals (Sweden)

    Reinsch Norbert

    2009-09-01

    Full Text Available Abstract Background The somatic cell score (SCS) is implemented in routine sire evaluations in many countries as an indicator trait for udder health. Somatic cell score is highly correlated with clinical mastitis, and in the German Holstein population quantitative trait loci (QTL) for SCS have been repeatedly mapped on Bos taurus autosome 18 (BTA18). In the present study, we report a refined analysis of previously detected QTL regions on BTA18 with the aim of identifying markers and marker haplotypes in linkage disequilibrium with SCS. A combined linkage and linkage disequilibrium approach was implemented, and association analyses of marker genotypes and maternally inherited two-marker-haplotypes were conducted to identify markers and haplotypes in linkage disequilibrium with a locus affecting SCS in the German Holstein population. Results We detected a genome-wide significant QTL within marker interval 9 (HAMP_c.366+109G>A - BMS833) in the middle to telomeric region on BTA18 and a second putative QTL in marker interval 12-13 (BB710 - PVRL2_c.392G>A). Association analyses with genotypes of markers flanking the most likely QTL positions revealed the microsatellite marker BMS833 (interval 9) to be associated with a locus affecting SCS within the families investigated. A further analysis of maternally inherited two-marker haplotypes and effects of maternally inherited two-marker-interval gametes indicated haplotype 249-G in marker interval 12-13 (BB710 - PVRL2_c.392G>A) to be associated with SCS in the German Holstein population. Conclusion Our results confirmed previous QTL mapping results for SCS and support the hypothesis that more than one locus presumably affects udder health in the middle to telomeric region of BTA18. However, a subsequent investigation of the reported QTL regions is necessary to verify the two-QTL hypothesis and confirm the association of two-marker-haplotype 249-G in marker interval 12-13 (BB710 - PVRL2_c.392G>A) with SCS.
For this

  19. Genetic variations and haplotype diversity of the UGT1 gene cluster in the Chinese population.

    Directory of Open Access Journals (Sweden)

    Jing Yang

    Full Text Available Vertebrates require tremendous molecular diversity to defend against numerous small hydrophobic chemicals. UDP-glucuronosyltransferases (UGTs) are a large family of detoxification enzymes that glucuronidate xenobiotics and endobiotics, facilitating their excretion from the body. The UGT1 gene cluster contains a tandem array of variable first exons, each preceded by a specific promoter, and a common set of downstream constant exons, similar to the genomic organization of the protocadherin (Pcdh), immunoglobulin, and T-cell receptor gene clusters. To assist pharmacogenomics studies in Chinese, we sequenced nine first exons, promoter and intronic regions, and five common exons of the UGT1 gene cluster in a population sample of 253 unrelated Chinese individuals. We identified 101 polymorphisms and found 15 novel SNPs. We then computed allele frequencies for each polymorphism and reconstructed their linkage disequilibrium (LD) map. The UGT1 cluster can be divided into five linkage blocks: Block 9 (UGT1A9), Block 9/7/6 (UGT1A9, UGT1A7, and UGT1A6), Block 5 (UGT1A5), Block 4/3 (UGT1A4 and UGT1A3), and Block 3' UTR. Furthermore, we inferred haplotypes and selected their tagSNPs. Finally, comparing our data with those of three other populations of the HapMap project revealed ethnic specificity of the UGT1 genetic diversity in Chinese. These findings have important implications for future molecular genetic studies of the UGT1 gene cluster as well as for personalized medical therapies in Chinese.
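    As a rough illustration of the pairwise quantities that underlie an LD map such as the one described above, the following Python sketch computes D, D' and r² for two biallelic SNPs. It assumes phased haplotype counts are already available (the study itself started from unphased genotypes and inferred haplotypes), and the function name `ld_stats` and the example counts are hypothetical.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic SNPs from phased haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype is required")

    p_AB = n_AB / total              # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total      # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / total      # frequency of allele B at the second SNP

    # D: deviation of the observed haplotype frequency from independence.
    d = p_AB - p_A * p_B

    # r^2: squared correlation between the two loci (0 if either SNP is monomorphic).
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0

    # D': |D| scaled by the largest magnitude it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    return d, d_prime, r2


# Hypothetical counts: complete LD versus no LD.
complete = ld_stats(50, 0, 0, 50)
independent = ld_stats(25, 25, 25, 25)
```

    Blocks of the kind listed in the abstract are typically delimited from many such pairwise values; the block-partitioning rule itself is not given in the record and is not sketched here.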

  20. An Efficient Forward-Reverse EM Algorithm for Statistical Inference in Stochastic Reaction Networks

    KAUST Repository

    Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro

    2016-01-01

    In this work [1], we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem

  1. A haplotype specific to North European wheat (Triticum aestivum L.)

    Czech Academy of Sciences Publication Activity Database

    Tsombalova, J.; Karafiátová, Miroslava; Vrána, Jan; Kubaláková, Marie; Peusa, H.; Jakobson, I.; Jarve, M.; Valárik, Miroslav; Doležel, Jaroslav; Jarve, K.

    2017-01-01

    Vol. 64, No. 4 (2017), pp. 653-664 ISSN 0925-9864 R&D Projects: GA MŠk(CZ) LO1204; GA ČR(CZ) GA14-07164S Institutional support: RVO:61389030 Keywords : bread wheat * genetic diversity * polyploid wheat * introgression lines * molecular analysis * tetraploid wheat * hexaploid wheat * powdery mildew * spelta l. * map * Common wheat * Triticum aestivum L * Spelt * Triticum spelta L * Chromosome 4A * Zero alleles * Haplotype * Linkage disequilibrium Subject RIV: EB - Genetics ; Molecular Biology OECD field: Plant sciences, botany Impact factor: 1.294, year: 2016

  2. Probabilistic inductive inference: a survey

    OpenAIRE

    Ambainis, Andris

    2001-01-01

    Inductive inference is a recursion-theoretic theory of learning, first developed by E. M. Gold (1967). This paper surveys developments in probabilistic inductive inference. We mainly focus on finite inference of recursive functions, since this simple paradigm has produced the most interesting (and most complex) results.

  3. LAIT: a local ancestry inference toolkit.

    Science.gov (United States)

    Hui, Daniel; Fang, Zhou; Lin, Jerome; Duan, Qing; Li, Yun; Hu, Ming; Chen, Wei

    2017-09-06

    Inferring local ancestry in individuals of mixed ancestry has many applications, most notably in identifying disease-susceptible loci that vary among different ethnic groups. Many software packages are available for inferring local ancestry in admixed individuals. However, most of these existing software packages require specific formatted input files and generate output files in various types, yielding practical inconvenience. We developed a tool set, Local Ancestry Inference Toolkit (LAIT), which can convert standardized files into software-specific input file formats as well as standardize and summarize inference results for four popular local ancestry inference software: HAPMIX, LAMP, LAMP-LD, and ELAI. We tested LAIT using both simulated and real data sets and demonstrated that LAIT provides convenience to run multiple local ancestry inference software. In addition, we evaluated the performance of local ancestry software among different supported software packages, mainly focusing on inference accuracy and computational resources used. We provided a toolkit to facilitate the use of local ancestry inference software, especially for users with limited bioinformatics background.

  4. Characterizing Metastatic HER2-Positive Gastric Cancer at the CDH1 Haplotype

    Science.gov (United States)

    Caggiari, Laura; Miolo, Gianmaria; Buonadonna, Angela; Basile, Debora; Santeufemia, Davide A.; De Zorzi, Mariangela; Fornasarig, Mara; Alessandrini, Lara; Lo Re, Giovanni; Puglisi, Fabio; Steffan, Agostino

    2017-01-01

    The CDH1 gene, coding for the E-cadherin protein, is linked to gastric cancer (GC) susceptibility and tumor invasion. The human epidermal growth factor receptor 2 (HER2) is amplified and overexpressed in a portion of GC. HER2 is an established therapeutic target in metastatic GC (mGC). Trastuzumab, in combination with various chemotherapeutic agents, is a standard treatment for these tumors leading to outcome improvement. Unfortunately, the survival benefit is limited to a fraction of patients. The aim of this study was to improve knowledge of the HER2 and the E-cadherin alterations in the context of GC to characterize subtypes of patients that could better benefit from targeted therapy. An association between the P7-CDH1 haplotype, including two polymorphisms (rs16260A-rs1801552T) and a subset of HER2-positive mGC with better prognosis was observed. Results indicated the potential evaluation of CDH1 haplotypes in mGC to stratify patients that will benefit from trastuzumab-based treatments. Moreover, data may have implications to understanding the HER2 and the E-cadherin interactions in vivo and in response to treatments. PMID:29295527

  5. A canonical correlation analysis-based dynamic bayesian network prior to infer gene regulatory networks from multiple types of biological data.

    Science.gov (United States)

    Baur, Brittany; Bozdag, Serdar

    2015-04-01

    One of the challenging and important computational problems in systems biology is to infer gene regulatory networks (GRNs) of biological systems. Several methods that exploit gene expression data have been developed to tackle this problem. In this study, we propose the use of copy number and DNA methylation data to infer GRNs. We developed an algorithm that scores regulatory interactions between genes based on canonical correlation analysis. In this algorithm, copy number or DNA methylation variables are treated as potential regulator variables, and expression variables are treated as potential target variables. We first validated that the canonical correlation analysis method is able to infer true interactions in high accuracy. We showed that the use of DNA methylation or copy number datasets leads to improved inference over steady-state expression. Our results also showed that epigenetic and structural information could be used to infer directionality of regulatory interactions. Additional improvements in GRN inference can be gleaned from incorporating the result in an informative prior in a dynamic Bayesian algorithm. This is the first study that incorporates copy number and DNA methylation into an informative prior in dynamic Bayesian framework. By closely examining top-scoring interactions with different sources of epigenetic or structural information, we also identified potential novel regulatory interactions.

  6. Lack of concordance and linkage disequilibrium among brothers for androgenetic alopecia and CAG/GGC haplotypes of the androgen receptor gene in Mexican families.

    Science.gov (United States)

    Arteaga-Vázquez, Jazmín; López-Hernández, María A; Svyryd, Yevgeniya; Mutchinick, Osvaldo M

    2015-12-01

    Androgenetic alopecia (AGA) or common baldness is the most prevalent form of hair loss in males. Familial predisposition has been recognized, and heritability estimated in monozygotic twins suggests an important genetic predisposition. Several studies indicate that the numbers of CAG/GGC repeats in exon 1 of the androgen receptor gene (AR) may be associated with AGA susceptibility. We aimed to investigate a possible correlation between AR CAG/GGC haplotypes and the presence or absence of alopecia in sibships with two or more brothers, at least one of whom has AGA. Thirty-two trios, each including an alopecic man, one brother (alopecic or not), and their mother, were enrolled. Sanger sequencing of exon 1 of the AR gene was conducted to ascertain the number of CAG/GGC repeats in each individual. A mother heterozygous for the CAG/GGC haplotypes was an inclusion criterion, to allow analysis of the haplotype segregation patterns in the family. Concordance for the number of repeats and AGA among brothers was evaluated using the kappa coefficient, and the probability of association in the presence of genetic linkage between CAG and GGC repeats and AGA was estimated by means of the family-based association test (FBAT). The median for the CAG and GGC repeats in the AR is similar to that reported in other populations. The CAG/GGC haplotypes were less polymorphic than those reported in other studies, especially due to the GGC number of repeats found. The kappa coefficient resulted in a concordance of 37.3% (95% CI, 5.0-69.0%) for the AGA phenotype and identical CAG/GGC haplotypes. There was no evidence of linkage disequilibrium. Our results do not confirm a possible correlation or linkage disequilibrium between the CAG/GGC haplotypes of the AR gene and androgenetic alopecia in Mexican brothers. © 2015 Wiley Periodicals, Inc.
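    The kappa coefficient used above to measure concordance can be sketched in Python as Cohen's kappa on paired labels. The abstract does not specify exactly how the paired observations were coded, so the data below are hypothetical and the function name `cohen_kappa` is an illustrative choice.

```python
def cohen_kappa(a, b):
    """Cohen's kappa: agreement between two paired label sequences, corrected for chance."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("sequences must be non-empty and of equal length")

    # Observed agreement: fraction of pairs with identical labels.
    p_o = sum(x == y for x, y in zip(a, b)) / n

    # Expected agreement if the two sequences were labelled independently.
    labels = set(a) | set(b)
    p_e = sum((a.count(label) / n) * (b.count(label) / n) for label in labels)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical coding for four brother pairs:
# 1 = pair shares the CAG/GGC haplotype (first list) or shares the AGA phenotype (second list).
same_haplotype = [1, 1, 0, 0]
same_phenotype = [1, 0, 0, 0]
kappa = cohen_kappa(same_haplotype, same_phenotype)
```

    A kappa of 0 means agreement no better than chance and 1 means perfect agreement; the 37.3% reported in the abstract sits between the two.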

  7. Bayesian statistical inference

    Directory of Open Access Journals (Sweden)

    Bruno De Finetti

    2017-04-01

    Full Text Available This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993. Bayesian Statistical Inference is one of the last fundamental philosophical papers in which we can find the essentials of De Finetti's approach to statistical inference.

  8. Evaluation of haplotype diversity of Achatina fulica (Lissachatina) [Bowdich] from Indian sub-continent by means of 16S rDNA sequence and its phylogenetic relationships with other global populations.

    Science.gov (United States)

    Ayyagari, Vijaya Sai; Sreerama, Krupanidhi

    2017-08-01

    Achatina fulica (Lissachatina fulica) is one of the most invasive species found across the globe causing a significant damage to crops, vegetables, and horticultural plants. This terrestrial snail is native to east Africa and spread to different parts of the world by introductions. India, a hot spot for biodiversity of several endemic gastropods, has witnessed an outburst of this snail population in several parts of the country posing a serious threat to crop loss and also to human health. With an objective to evaluate the genetic diversity of this snail, we have sampled this snail from different parts of India and analyzed its haplotype diversity by means of 16S rDNA sequence information. Apart from this, we have studied the phylogenetic relationships of the isolates sequenced in the present study in relation with other global populations by Bayesian and Maximum-likelihood approaches. Of the isolates sequenced, haplotype 'C' is the predominant one. A new haplotype 'S' from the state of Odisha was observed. The isolates sequenced in the present study clustered with its conspecifics from the Indian sub-continent. Haplotype network analyses were also carried out for studying the evolution of different haplotypes. It was observed that haplotype 'S' was associated with a Mauritius haplotype 'H', indicating the possibility of multiple introductions of A. fulica to India.

  9. Models for inference in dynamic metacommunity systems

    Science.gov (United States)

    Dorazio, Robert M.; Kery, Marc; Royle, J. Andrew; Plattner, Matthias

    2010-01-01

    A variety of processes are thought to be involved in the formation and dynamics of species assemblages. For example, various metacommunity theories are based on differences in the relative contributions of dispersal of species among local communities and interactions of species within local communities. Interestingly, metacommunity theories continue to be advanced without much empirical validation. Part of the problem is that statistical models used to analyze typical survey data either fail to specify ecological processes with sufficient complexity or they fail to account for errors in detection of species during sampling. In this paper, we describe a statistical modeling framework for the analysis of metacommunity dynamics that is based on the idea of adopting a unified approach, multispecies occupancy modeling, for computing inferences about individual species, local communities of species, or the entire metacommunity of species. This approach accounts for errors in detection of species during sampling and also allows different metacommunity paradigms to be specified in terms of species- and location-specific probabilities of occurrence, extinction, and colonization: all of which are estimable. In addition, this approach can be used to address inference problems that arise in conservation ecology, such as predicting temporal and spatial changes in biodiversity for use in making conservation decisions. To illustrate, we estimate changes in species composition associated with the species-specific phenologies of flight patterns of butterflies in Switzerland for the purpose of estimating regional differences in biodiversity.

  10. Experiments for Evaluating Application of Bayesian Inference to Situation Awareness of Human Operators in NPPs

    Energy Technology Data Exchange (ETDEWEB)

    Kang, Seong Keun; Seong, Poong Hyun [KAIST, Daejeon (Korea, Republic of)

    2014-08-15

    Bayesian methodology has been widely used in various research fields. It is a method of inference that uses Bayes' rule to update the probability estimate for a given hypothesis as additional evidence is acquired. According to current research, malfunctions of a nuclear power plant can be detected using Bayesian inference, which continually accumulates newly incoming data and updates its estimate. However, that research rests on the assumption that people process information as perfectly as a computer, which can be criticized and may cause problems in real-world application. Studies in cognitive psychology indicate that when the amount of information becomes larger, people cannot retain all of the data because they have limited memory capacity, well known as working memory, and they also have attention problems. The purpose of this paper is to consider these psychological factors and to determine how much working memory and attention affect the resulting estimate based on Bayesian inference. To determine this, an experiment on humans is needed, and the experimental tool is the Compact Nuclear Simulator (CNS).
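    The updating scheme described in this record can be sketched as a sequential application of Bayes' rule over a discrete set of hypotheses. The following Python example is only an illustration under stated assumptions: the two plant states, the prior, and the alarm likelihoods are hypothetical numbers, not values from the study, and `bayes_update` is an illustrative helper name.

```python
def bayes_update(prior, likelihood):
    """One Bayes' rule step: posterior(h) is proportional to prior(h) * P(evidence | h)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    if evidence == 0:
        raise ValueError("the evidence has zero probability under every hypothesis")
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical prior belief about the plant state.
belief = {"normal": 0.9, "malfunction": 0.1}

# Hypothetical probability of observing an alarm indication under each state.
alarm_likelihood = {"normal": 0.2, "malfunction": 0.8}

# Three independent alarm observations arrive one after another;
# each posterior becomes the prior for the next update.
for _ in range(3):
    belief = bayes_update(belief, alarm_likelihood)
```

    An ideal observer of this kind keeps every observation. The record's point is that a human operator with limited working memory may effectively drop some of them, which in this sketch would correspond to skipping iterations of the loop.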

  11. Experiments for Evaluating Application of Bayesian Inference to Situation Awareness of Human Operators in NPPs

    International Nuclear Information System (INIS)

    Kang, Seong Keun; Seong, Poong Hyun

    2014-01-01

    Bayesian methodology has been widely used in various research fields. It is a method of inference that uses Bayes' rule to update the probability estimate for a given hypothesis as additional evidence is acquired. According to current research, malfunctions of a nuclear power plant can be detected using Bayesian inference, which continually accumulates newly incoming data and updates its estimate. However, that research rests on the assumption that people process information as perfectly as a computer, which can be criticized and may cause problems in real-world application. Studies in cognitive psychology indicate that when the amount of information becomes larger, people cannot retain all of the data because they have limited memory capacity, well known as working memory, and they also have attention problems. The purpose of this paper is to consider these psychological factors and to determine how much working memory and attention affect the resulting estimate based on Bayesian inference. To determine this, an experiment on humans is needed, and the experimental tool is the Compact Nuclear Simulator (CNS).

  12. Haplotypes and Sequence Variation in the Ovine Adiponectin Gene (ADIPOQ)

    Directory of Open Access Journals (Sweden)

    Qing-Ming An

    2015-11-01

    Full Text Available The adiponectin gene (ADIPOQ) plays an important role in energy homeostasis. In this study five separate regions (regions 1 to 5) of ovine ADIPOQ were analysed using PCR-SSCP. Four different PCR-SSCP patterns (A1-D1, A2-D2) were detected in region-1 and region-2, respectively, with seven and six SNPs being revealed. In region-3, three different patterns (A3-C3) and three SNPs were observed. Two patterns (A4-B4, A5-B5) and two and one SNPs were observed in region-4 and region-5, respectively. In total, nineteen SNPs were detected, with five of them in the coding region and two (c.46T/C and c.515G/A) putatively resulting in amino acid changes (p.Tyr16His and p.Lys172Arg). In region-1, -2 and -3 of 316 sheep from eight New Zealand breeds, variants A1, A2 and A3 were the most common, although variant frequencies differed in the eight breeds. Across region-1 and region-3, nine haplotypes were identified and haplotypes A1-A3, A1-C3, B1-A3 and B1-C3 were most common. These results indicate that the ADIPOQ gene is polymorphic and suggest that further analysis is required to see if the variation in the gene is associated with animal production traits.

  13. Identification of the ancestral haplotype for apolipoprotein B suggests an African origin of Homo sapiens sapiens and traces their subsequent migration to Europe and the Pacific

    Energy Technology Data Exchange (ETDEWEB)

    Rapacz, J.; Hasler-Rapacz, J.O. (Univ. of Wisconsin, Madison (United States)); Chen, L.; Wu, Mingjiuan; Schumaker, V.N. (Univ. of California, Los Angeles (United States)); Butler-Brunner, E.; Butler, R. (Swiss Red Cross Blood Transfusion Service, Bern (Switzerland))

    1991-02-15

    The probable ancestral haplotype for human apolipoprotein B (apoB) has been identified through immunological analysis of chimpanzee and gorilla serum and sequence analysis of their DNA. Moreover, the frequency of this ancestral apoB haplotype among different human populations provides strong support for the African origin of Homo sapiens sapiens and their subsequent migration from Africa to Europe and to the Pacific. The approach used here for the identification of the ancestral human apoB haplotype is likely to be applicable to many other genes.

  14. Identification of the ancestral haplotype for apolipoprotein B suggests an African origin of Homo sapiens sapiens and traces their subsequent migration to Europe and the Pacific

    International Nuclear Information System (INIS)

    Rapacz, J.; Hasler-Rapacz, J.O.; Chen, L.; Wu, Mingjiuan; Schumaker, V.N.; Butler-Brunner, E.; Butler, R.

    1991-01-01

    The probable ancestral haplotype for human apolipoprotein B (apoB) has been identified through immunological analysis of chimpanzee and gorilla serum and sequence analysis of their DNA. Moreover, the frequency of this ancestral apoB haplotype among different human populations provides strong support for the African origin of Homo sapiens sapiens and their subsequent migration from Africa to Europe and to the Pacific. The approach used here for the identification of the ancestral human apoB haplotype is likely to be applicable to many other genes

  15. Is there a hierarchy of social inferences? The likelihood and speed of inferring intentionality, mind, and personality.

    Science.gov (United States)

    Malle, Bertram F; Holbrook, Jess

    2012-04-01

    People interpret behavior by making inferences about agents' intentionality, mind, and personality. Past research studied such inferences 1 at a time; in real life, people make these inferences simultaneously. The present studies therefore examined whether 4 major inferences (intentionality, desire, belief, and personality), elicited simultaneously in response to an observed behavior, might be ordered in a hierarchy of likelihood and speed. To achieve generalizability, the studies included a wide range of stimulus behaviors, presented them verbally and as dynamic videos, and assessed inferences both in a retrieval paradigm (measuring the likelihood and speed of accessing inferences immediately after they were made) and in an online processing paradigm (measuring the speed of forming inferences during behavior observation). Five studies provide evidence for a hierarchy of social inferences (from intentionality and desire to belief to personality) that is stable across verbal and visual presentations and that parallels the order found in developmental and primate research. (c) 2012 APA, all rights reserved.

  16. Using haplotypes to unravel the inheritance of Holstein coat color for a larger audience

    Science.gov (United States)

    Haplotype testing identifies single-nucleotide polymorphisms that bracket a group of alleles from several different genes located on a specific chromosomal section of DNA. For a trait with a limited number of genotypes and phenotypes, the rules of inheritance can be determined by matching up certain...

  17. Effect of dopamine receptor D4 (DRD4) haplotypes on general psychopathology in patients with eating disorders.

    Science.gov (United States)

    Gervasini, Guillermo; González, Luz M; Gamero-Villarroel, Carmen; Mota-Zamorano, Sonia; Carrillo, Juan Antonio; Flores, Isalud; García-Herráiz, Angustias

    2018-05-15

    Among the many candidate genes analyzed in eating disorder (ED) patients, those involved in dopaminergic functions may be of special relevance, as dopamine is known to play a significant role in feeding behavior, the distortion of body image, hyperactivity and reward and reinforcement processes. We aimed to determine the effect of functional polymorphisms and haplotypes in the Dopamine Receptor D4 (DRD4) gene on general psychopathological symptoms in ED patients. Two hundred and seventy-three ED patients [199 with Anorexia Nervosa (AN) and 74 with Bulimia Nervosa (BN)] completed the SCL-90R inventory and were genotyped for four functional, clinically relevant DRD4 polymorphisms: three variants in the promoter region [120-bp tandem repeat (TR, long vs. short allele), C-616G and C-521T] and a variable number of tandem repeats (VNTR) in exon 3 (7R vs. non-7R allele). After correcting for multiple testing, none of the assayed polymorphisms were individually associated with SCL-90R results. Four DRD4 haplotypes (*1-*4) were detected in the patients with a frequency > 0.1. In the BN group, haplotype *2 (non7R-TR long-C-C) was associated with higher scores in the three global SCL-90R indices (GSI, PSDI and PST) after Bonferroni correction (p ≤ 0.01 in all instances). Furthermore, carriers of this haplotype displayed higher scores (worse symptomatology) in Somatization, Obsessive-Compulsive, Anxiety, Phobic anxiety, Paranoid ideation and the test's additional items (p-values for the differences between carriers vs. non-carriers ranging from 0.0001 to 0.0110). Certain combinations of DRD4 variants may contribute to psychopathological features in BN patients. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Bayesian inference data evaluation and decisions

    CERN Document Server

    Harney, Hanns Ludwig

    2016-01-01

    This new edition offers a comprehensive introduction to the analysis of data using Bayes' rule. It generalizes Gaussian error intervals to situations in which the data follow distributions other than Gaussian. This is particularly useful when the observed parameter is barely above the background or the histogram of multiparametric data contains many empty bins, so that the determination of the validity of a theory cannot be based on the chi-squared criterion. In addition to the solutions of practical problems, this approach provides an epistemic insight: the logic of quantum mechanics is obtained as the logic of unbiased inference from counting data. New sections feature factorizing parameters, commuting parameters, observables in quantum mechanics, the art of fitting with coherent and with incoherent alternatives and fitting with multinomial distribution. Additional problems and examples help deepen the knowledge. Requiring no knowledge of quantum mechanics, the book is written at an introductory level, with man...

  19. Statistics for nuclear engineers and scientists. Part 1. Basic statistical inference

    Energy Technology Data Exchange (ETDEWEB)

    Beggs, W.J.

    1981-02-01

    This report is intended for the use of engineers and scientists working in the nuclear industry, especially at the Bettis Atomic Power Laboratory. It serves as the basis for several Bettis in-house statistics courses. The objectives of the report are to introduce the reader to the language and concepts of statistics and to provide a basic set of techniques to apply to problems of the collection and analysis of data. Part 1 covers subjects of basic inference. The subjects include: descriptive statistics; probability; simple inference for normally distributed populations, and for non-normal populations as well; comparison of two populations; the analysis of variance; quality control procedures; and linear regression analysis.

  20. INFERENCE BUILDING BLOCKS

    Science.gov (United States)

    2018-02-15

    expressed a variety of inference techniques on discrete and continuous distributions: exact inference, importance sampling, Metropolis-Hastings (MH...without redoing any math or rewriting any code. And although our main goal is composable reuse, our performance is also good because we can use...control paths. • The Hakaru language can express mixtures of discrete and continuous distributions, but the current disintegration transformation