WorldWideScience

Sample records for sample inference critical

  1. On the criticality of inferred models

    Science.gov (United States)

    Mastromatteo, Iacopo; Marsili, Matteo

    2011-10-01

    Advanced inference techniques allow one to reconstruct a pattern of interaction from high dimensional data sets, by probing simultaneously thousands of units of extended systems—such as cells, neural tissues and financial markets. We focus here on the statistical properties of inferred models and argue that inference procedures are likely to yield models which are close to singular values of parameters, akin to critical points in physics where phase transitions occur. These are points where the response of physical systems to external perturbations, as measured by the susceptibility, is very large and diverges in the limit of infinite size. We show that the reparameterization invariant metrics in the space of probability distributions of these models (the Fisher information) are directly related to the susceptibility of the inferred model. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. This region is the one where the estimate of inferred parameters is most stable. In order to illustrate these points, we discuss inference of interacting point processes with application to financial data and show that sensible choices of observation time scales naturally yield models which are close to criticality.
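
    The link stated above between the Fisher information and the susceptibility can be checked directly on a toy system. The following Python sketch is not the authors' code; it assumes a fully connected Ising model with uniform coupling J/N at unit temperature, small enough for exact enumeration, and shows that both quantities are variances of the same extensive quantity and peak together near the mean-field critical coupling J ≈ 1.

```python
import itertools
import numpy as np

def ising_stats(N, J):
    """Exact enumeration of a fully connected Ising model with coupling J/N (beta = 1)."""
    spins = np.array(list(itertools.product([-1, 1], repeat=N)))  # all 2^N configurations
    M = spins.sum(axis=1)                                         # total magnetization
    pair_sum = (M**2 - N) / 2.0                                   # sum over pairs s_i * s_j
    logw = (J / N) * pair_sum                                     # minus the energy
    w = np.exp(logw - logw.max())                                 # stabilized Boltzmann weights
    p = w / w.sum()
    chi = (p @ M**2 - (p @ M) ** 2) / N                           # susceptibility per spin
    fisher = (p @ pair_sum**2 - (p @ pair_sum) ** 2) / N**2       # Fisher information w.r.t. J
    return chi, fisher

for J in (0.5, 0.9, 1.0, 1.1, 1.5, 2.0):
    chi, fi = ising_stats(10, J)
    print(f"J = {J:.1f}   susceptibility = {chi:6.3f}   Fisher information = {fi:.4f}")
```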

  2. On the criticality of inferred models

    International Nuclear Information System (INIS)

    Mastromatteo, Iacopo; Marsili, Matteo

    2011-01-01

    Advanced inference techniques allow one to reconstruct a pattern of interaction from high dimensional data sets, by probing simultaneously thousands of units of extended systems—such as cells, neural tissues and financial markets. We focus here on the statistical properties of inferred models and argue that inference procedures are likely to yield models which are close to singular values of parameters, akin to critical points in physics where phase transitions occur. These are points where the response of physical systems to external perturbations, as measured by the susceptibility, is very large and diverges in the limit of infinite size. We show that the reparameterization invariant metrics in the space of probability distributions of these models (the Fisher information) are directly related to the susceptibility of the inferred model. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. This region is the one where the estimate of inferred parameters is most stable. In order to illustrate these points, we discuss inference of interacting point processes with application to financial data and show that sensible choices of observation time scales naturally yield models which are close to criticality.

  3. Inverse Ising inference with correlated samples

    International Nuclear Information System (INIS)

    Obermayer, Benedikt; Levine, Erel

    2014-01-01

    Correlations between two variables of a high-dimensional system can be indicative of an underlying interaction, but can also result from indirect effects. Inverse Ising inference is a method to distinguish one from the other. Essentially, the parameters of the least constrained statistical model are learned from the observed correlations such that direct interactions can be separated from indirect correlations. Among many other applications, this approach has been helpful for protein structure prediction, because residues which interact in the 3D structure often show correlated substitutions in a multiple sequence alignment. In this context, samples used for inference are not independent but share an evolutionary history on a phylogenetic tree. Here, we discuss the effects of correlations between samples on global inference. Such correlations could arise due to phylogeny but also via other slow dynamical processes. We present a simple analytical model to address the resulting inference biases, and develop an exact method accounting for background correlations in alignment data by combining phylogenetic modeling with an adaptive cluster expansion algorithm. We find that popular reweighting schemes are only marginally effective at removing phylogenetic bias, suggest a rescaling strategy that yields better results, and provide evidence that our conclusions carry over to the frequently used mean-field approach to the inverse Ising problem. (paper)
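
    The "popular reweighting schemes" mentioned above are typically similarity-based: each sequence is down-weighted by the number of near-identical sequences in the alignment. The sketch below shows that standard baseline only, not the paper's phylogeny-aware correction; the 80% identity threshold and the toy alignment are illustrative assumptions.

```python
import numpy as np

def sequence_weights(msa, theta=0.2):
    """Similarity-based reweighting used in many inverse Ising / DCA pipelines:
    weight = 1 / (number of sequences at least (1 - theta) identical, self included)."""
    msa = np.asarray(msa)                                          # (n_sequences, n_sites), integer-encoded
    identity = (msa[:, None, :] == msa[None, :, :]).mean(axis=2)   # pairwise fraction of identical sites
    neighbours = (identity >= 1.0 - theta).sum(axis=1)
    return 1.0 / neighbours

# toy alignment: three near-identical sequences plus one diverged sequence
msa = [[0, 1, 2, 0, 1],
       [0, 1, 2, 0, 1],
       [0, 1, 2, 1, 1],
       [2, 0, 1, 2, 0]]
w = sequence_weights(msa)
print("weights:", w, " effective sample size:", w.sum())
```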

  4. Contingency inferences driven by base rates: Valid by sampling

    Directory of Open Access Journals (Sweden)

    Florian Kutzner

    2011-04-01

    Full Text Available Fiedler et al. (2009) reviewed evidence for the utilization of a contingency inference strategy termed pseudocontingencies (PCs). In PCs, the more frequent levels (and, by implication, the less frequent levels) are assumed to be associated. PCs have been obtained using a wide range of task settings and dependent measures. Yet, the readiness with which decision makers rely on PCs is poorly understood. A computer simulation explored two potential sources of subjective validity of PCs. First, PCs are shown to perform above chance level when the task is to infer the sign of moderate to strong population contingencies from a sample of observations. Second, contingency inferences based on PCs and inferences based on cell frequencies are shown to partially agree across samples. Intriguingly, this criterion and convergent validity are by-products of random sampling error, highlighting the inductive nature of contingency inferences.
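
    A toy Monte Carlo in the spirit of the simulation described above (not the authors' code; the marginals, contingency strengths, and sample size are illustrative assumptions) shows why pairing the more frequent levels is correct above chance for moderate to strong contingencies.

```python
import numpy as np

rng = np.random.default_rng(0)

def joint_2x2(px, py, phi):
    """Cell probabilities of two binary variables with P(X=1)=px, P(Y=1)=py and correlation phi."""
    cov = phi * np.sqrt(px * (1 - px) * py * (1 - py))
    p11 = px * py + cov
    return np.array([[p11, px - p11],
                     [py - p11, 1 - px - py + p11]])  # rows: X=1/0, columns: Y=1/0

def pc_sign(counts):
    """Pseudocontingency: assume the more frequent level of X pairs with the more
    frequent level of Y; ties count as a negative guess."""
    x1 = counts[0].sum() / counts.sum()
    y1 = counts[:, 0].sum() / counts.sum()
    return 1 if (x1 - 0.5) * (y1 - 0.5) > 0 else -1

def pc_accuracy(px, py, phi, n=20, reps=5000):
    p = joint_2x2(px, py, phi).ravel()
    hits = sum(pc_sign(rng.multinomial(n, p).reshape(2, 2)) == np.sign(phi) for _ in range(reps))
    return hits / reps

for phi in (0.2, 0.4, 0.6):
    print(f"population contingency phi = {phi}: PC accuracy ~ {pc_accuracy(0.7, 0.7, phi):.2f}")
```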

  5. Fast and scalable inference of multi-sample cancer lineages.

    KAUST Repository

    Popic, Victoria; Salari, Raheleh; Hajirasouliha, Iman; Kashef-Haghighi, Dorna; West, Robert B; Batzoglou, Serafim

    2015-01-01

    Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee .

  6. Fast and scalable inference of multi-sample cancer lineages.

    KAUST Repository

    Popic, Victoria

    2015-05-06

    Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee .

  7. Network Model-Assisted Inference from Respondent-Driven Sampling Data.

    Science.gov (United States)

    Gile, Krista J; Handcock, Mark S

    2015-06-01

    Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population.
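
    For comparison with the model-assisted estimator introduced above, the sketch below shows the simpler design-based estimator in common use for RDS data: a Hajek-type mean weighted by inverse reported degree (the RDS-II / Volz-Heckathorn style estimator). The data are invented for illustration; this is not the authors' estimator.

```python
import numpy as np

def rds_ii_estimate(values, degrees):
    """Inverse-degree weighted mean: high-degree respondents are over-represented
    by link-tracing, so each respondent is weighted by 1/degree."""
    values = np.asarray(values, dtype=float)
    w = 1.0 / np.asarray(degrees, dtype=float)
    return np.sum(w * values) / np.sum(w)

# toy data: a binary outcome (e.g. infection status) and self-reported network size
y = [1, 0, 0, 1, 0, 1, 0, 0]
d = [12, 3, 5, 20, 4, 15, 2, 6]
print("naive sample mean:       ", np.mean(y))
print("degree-weighted estimate:", round(rds_ii_estimate(y, d), 3))
```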

  8. Network Model-Assisted Inference from Respondent-Driven Sampling Data

    Science.gov (United States)

    Gile, Krista J.; Handcock, Mark S.

    2015-01-01

    Summary Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population. PMID:26640328

  9. Rational Variability in Children's Causal Inferences: The Sampling Hypothesis

    Science.gov (United States)

    Denison, Stephanie; Bonawitz, Elizabeth; Gopnik, Alison; Griffiths, Thomas L.

    2013-01-01

    We present a proposal--"The Sampling Hypothesis"--suggesting that the variability in young children's responses may be part of a rational strategy for inductive inference. In particular, we argue that young learners may be randomly sampling from the set of possible hypotheses that explain the observed data, producing different hypotheses with…

  10. Working with sample data exploration and inference

    CERN Document Server

    Chaffe-Stengel, Priscilla

    2014-01-01

    Managers and analysts routinely collect and examine key performance measures to better understand their operations and make good decisions. Being able to render the complexity of operations data into a coherent account of significant events requires an understanding of how to work well with raw data and to make appropriate inferences. Although some statistical techniques for analyzing data and making inferences are sophisticated and require specialized expertise, there are methods that are understandable and applicable by anyone with basic algebra skills and the support of a spreadsheet package. By applying these fundamental methods themselves rather than turning over both the data and the responsibility for analysis and interpretation to an expert, managers will develop a richer understanding and potentially gain better control over their environment. This text is intended to describe these fundamental statistical techniques to managers, data analysts, and students. Statistical analysis of sample data is enh...

  11. Outcome-Dependent Sampling Design and Inference for Cox's Proportional Hazards Model.

    Science.gov (United States)

    Yu, Jichang; Liu, Yanyan; Cai, Jianwen; Sandler, Dale P; Zhou, Haibo

    2016-11-01

    We propose a cost-effective outcome-dependent sampling design for the failure time data and develop an efficient inference procedure for data collected with this design. To account for the biased sampling scheme, we derive estimators from a weighted partial likelihood estimating equation. The proposed estimators for regression parameters are shown to be consistent and asymptotically normally distributed. A criterion that can be used to optimally implement the ODS design in practice is proposed and studied. The small sample performance of the proposed method is evaluated by simulation studies. The proposed design and inference procedure is shown to be statistically more powerful than existing alternative designs with the same sample sizes. We illustrate the proposed method with existing real data from the Cancer Incidence and Mortality of Uranium Miners Study.
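
    A minimal sketch of a weighted partial likelihood of the kind referred to above, assuming a single covariate, Breslow-type handling of the risk sets, and user-supplied design weights (set to one here, which reduces to an ordinary Cox fit). It is a generic construction under those assumptions, not the authors' estimating equation.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def neg_weighted_partial_loglik(beta, time, event, x, w):
    """Negative weighted Cox partial log-likelihood for a single covariate;
    w would carry the inverse selection probabilities of a biased (e.g. ODS) design."""
    order = np.argsort(time)
    time, event, x, w = time[order], event[order], x[order], w[order]
    risk = w * np.exp(beta * x)
    risk_set = np.cumsum(risk[::-1])[::-1]          # weighted risk set at each ordered time
    return -np.sum(w * event * (beta * x - np.log(risk_set)))

rng = np.random.default_rng(1)
n, beta_true = 400, 0.7
x = rng.normal(size=n)
t = rng.exponential(scale=np.exp(-beta_true * x))    # failure times with hazard exp(beta * x)
c = rng.exponential(scale=2.0, size=n)               # censoring times
time, event = np.minimum(t, c), (t <= c).astype(float)
w = np.ones(n)                                       # equal weights: plain partial likelihood

fit = minimize_scalar(neg_weighted_partial_loglik, bounds=(-3, 3), method="bounded",
                      args=(time, event, x, w))
print("true log hazard ratio: 0.7   estimate:", round(fit.x, 3))
```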

  12. Outcome-Dependent Sampling Design and Inference for Cox’s Proportional Hazards Model

    Science.gov (United States)

    Yu, Jichang; Liu, Yanyan; Cai, Jianwen; Sandler, Dale P.; Zhou, Haibo

    2016-01-01

    We propose a cost-effective outcome-dependent sampling design for the failure time data and develop an efficient inference procedure for data collected with this design. To account for the biased sampling scheme, we derive estimators from a weighted partial likelihood estimating equation. The proposed estimators for regression parameters are shown to be consistent and asymptotically normally distributed. A criterion that can be used to optimally implement the ODS design in practice is proposed and studied. The small sample performance of the proposed method is evaluated by simulation studies. The proposed design and inference procedure is shown to be statistically more powerful than existing alternative designs with the same sample sizes. We illustrate the proposed method with existing real data from the Cancer Incidence and Mortality of Uranium Miners Study. PMID:28090134

  13. On Bayesian Inference under Sampling from Scale Mixtures of Normals

    NARCIS (Netherlands)

    Fernández, C.; Steel, M.F.J.

    1996-01-01

    This paper considers a Bayesian analysis of the linear regression model under independent sampling from general scale mixtures of Normals. Using a common reference prior, we investigate the validity of Bayesian inference and the existence of posterior moments of the regression and precision
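
    The tractability of such models rests on the scale-mixture representation itself: for example, Student-t errors arise from a normal whose precision is mixed over a gamma distribution. The sketch below is only a numerical check of that representation (the degrees of freedom and sample size are arbitrary choices), not a reconstruction of the paper's analysis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
nu, n = 4, 200_000

# scale-mixture draw: eps | lam ~ N(0, 1/lam) with lam ~ Gamma(shape=nu/2, rate=nu/2)
lam = rng.gamma(shape=nu / 2, scale=2.0 / nu, size=n)
eps_mix = rng.normal(size=n) / np.sqrt(lam)

# direct Student-t draws with nu degrees of freedom for comparison
eps_t = stats.t.rvs(df=nu, size=n, random_state=rng)

q = [0.05, 0.25, 0.5, 0.75, 0.95]
print("mixture quantiles:  ", np.round(np.quantile(eps_mix, q), 3))
print("Student-t quantiles:", np.round(np.quantile(eps_t, q), 3))
```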

  14. Efficient inference of population size histories and locus-specific mutation rates from large-sample genomic variation data.

    Science.gov (United States)

    Bhaskar, Anand; Wang, Y X Rachel; Song, Yun S

    2015-02-01

    With the recent increase in study sample sizes in human genetics, there has been growing interest in inferring historical population demography from genomic variation data. Here, we present an efficient inference method that can scale up to very large samples, with tens or hundreds of thousands of individuals. Specifically, by utilizing analytic results on the expected frequency spectrum under the coalescent and by leveraging the technique of automatic differentiation, which allows us to compute gradients exactly, we develop a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies. Our method is orders of magnitude faster than previous demographic inference methods based on the frequency spectrum. In addition to inferring demography, our method can also accurately estimate locus-specific mutation rates. We perform extensive validation of our method on simulated data and show that it can accurately infer multiple recent epochs of rapid exponential growth, a signal that is difficult to pick up with small sample sizes. Lastly, we use our method to analyze data from recent sequencing studies, including a large-sample exome-sequencing data set of tens of thousands of individuals assayed at a few hundred genic regions. © 2015 Bhaskar et al.; Published by Cold Spring Harbor Laboratory Press.
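
    A much-reduced sketch of the underlying idea, assuming the simplest possible case of a constant population size: the expected frequency spectrum is then theta/i, and the Poisson maximum-likelihood estimate of theta reduces to Watterson's estimator. The paper itself fits piecewise-exponential histories with automatic differentiation, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(2)

n, theta_true = 50, 40.0                 # haploid sample size and population-scaled mutation rate
i = np.arange(1, n)                      # derived-allele counts 1 .. n-1
expected_sfs = theta_true / i            # E[xi_i] = theta / i for a constant-size population

# simulate an observed spectrum under the Poisson random field model
observed = rng.poisson(expected_sfs)

# Poisson ML: maximizing sum_i [obs_i * log(theta/i) - theta/i] gives
# theta_hat = (total number of segregating sites) / (harmonic number a_n)
a_n = np.sum(1.0 / i)
theta_hat = observed.sum() / a_n
print("true theta:", theta_true, "  ML estimate:", round(theta_hat, 2))
```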

  15. A normative inference approach for optimal sample sizes in decisions from experience

    Science.gov (United States)

    Ostwald, Dirk; Starke, Ludger; Hertwig, Ralph

    2015-01-01

    “Decisions from experience” (DFE) refers to a body of work that emerged in research on behavioral decision making over the last decade. One of the major experimental paradigms employed to study experience-based choice is the “sampling paradigm,” which serves as a model of decision making under limited knowledge about the statistical structure of the world. In this paradigm respondents are presented with two payoff distributions, which, in contrast to standard approaches in behavioral economics, are specified not in terms of explicit outcome-probability information, but by the opportunity to sample outcomes from each distribution without economic consequences. Participants are encouraged to explore the distributions until they feel confident enough to decide from which they would prefer to draw in a final trial involving real monetary payoffs. One commonly employed measure to characterize the behavior of participants in the sampling paradigm is the sample size, that is, the number of outcome draws which participants choose to obtain from each distribution prior to terminating sampling. A natural question that arises in this context concerns the “optimal” sample size, which could be used as a normative benchmark to evaluate human sampling behavior in DFE. In this theoretical study, we relate the DFE sampling paradigm to the classical statistical decision theoretic literature and, under a probabilistic inference assumption, evaluate optimal sample sizes for DFE. In our treatment we go beyond analytically established results by showing how the classical statistical decision theoretic framework can be used to derive optimal sample sizes under arbitrary, but numerically evaluable, constraints. Finally, we critically evaluate the value of deriving optimal sample sizes under this framework as testable predictions for the experimental study of sampling behavior in DFE. PMID:26441720
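
    A rough Monte Carlo version of the optimal-sample-size question, assuming two Bernoulli payoff distributions, a decision maker who chooses the option with the higher sample mean, and an explicit per-draw cost; the cost term and all parameter values are illustrative assumptions, not part of the paper's Bayesian treatment.

```python
import numpy as np

rng = np.random.default_rng(3)

def expected_net_payoff(k, p_a=0.6, p_b=0.4, cost=0.005, reps=20_000):
    """Expected final-trial payoff minus sampling cost when k free draws are taken
    from each of two Bernoulli options and the higher sample mean is chosen."""
    if k == 0:
        a = b = np.zeros(reps)
    else:
        a = rng.binomial(k, p_a, size=reps)
        b = rng.binomial(k, p_b, size=reps)
    choose_a = np.where(a == b, rng.integers(0, 2, size=reps), (a > b).astype(int))
    reward = np.where(choose_a == 1, p_a, p_b)       # expected payoff of the chosen option
    return reward.mean() - 2 * k * cost

for k in (0, 1, 2, 5, 10, 20, 40):
    print(f"draws per option k = {k:2d}   expected net payoff ~ {expected_net_payoff(k):.3f}")
```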

  16. Sample Size and Robustness of Inferences from Logistic Regression in the Presence of Nonlinearity and Multicollinearity

    OpenAIRE

    Bergtold, Jason S.; Yeager, Elizabeth A.; Featherstone, Allen M.

    2011-01-01

    The logistic regression model has been widely used in the social and natural sciences and results from studies using this model can have significant impact. Thus, confidence in the reliability of inferences drawn from these models is essential. The robustness of such inferences is dependent on sample size. The purpose of this study is to examine the impact of sample size on the mean estimated bias and efficiency of parameter estimation and inference for the logistic regression model. A numbe...

  17. More practical critical height sampling.

    Science.gov (United States)

    Thomas B. Lynch; Jeffrey H. Gove

    2015-01-01

    Critical Height Sampling (CHS) (Kitamura 1964) can be used to predict cubic volumes per acre without using volume tables or equations. The critical height is defined as the height at which the tree stem appears to be in borderline condition using the point-sampling angle gauge (e.g. prism). An estimate of cubic volume per acre can be obtained from multiplication of the...

  18. Time Clustered Sampling Can Inflate the Inferred Substitution Rate in Foot-And-Mouth Disease Virus Analyses.

    Science.gov (United States)

    Pedersen, Casper-Emil T; Frandsen, Peter; Wekesa, Sabenzia N; Heller, Rasmus; Sangula, Abraham K; Wadsworth, Jemma; Knowles, Nick J; Muwanika, Vincent B; Siegismund, Hans R

    2015-01-01

    With the emergence of analytical software for the inference of viral evolution, a number of studies have focused on estimating important parameters such as the substitution rate and the time to the most recent common ancestor (tMRCA) for rapidly evolving viruses. Coupled with an increasing abundance of sequence data sampled under widely different schemes, an effort to keep results consistent and comparable is needed. This study emphasizes commonly disregarded problems in the inference of evolutionary rates in viral sequence data when sampling is unevenly distributed on a temporal scale through a study of the foot-and-mouth (FMD) disease virus serotypes SAT 1 and SAT 2. Our study shows that clustered temporal sampling in phylogenetic analyses of FMD viruses will strongly bias the inferences of substitution rates and tMRCA because the inferred rates in such data sets reflect a rate closer to the mutation rate rather than the substitution rate. Estimating evolutionary parameters from viral sequences should be performed with due consideration of the differences in short-term and longer-term evolutionary processes occurring within sets of temporally sampled viruses, and studies should carefully consider how samples are combined.

  19. Critical examination of logical formulations in quantum theory. Statistical inference and Hilbertian distance between quantum states

    International Nuclear Information System (INIS)

    Hadjisawas, Nicolas.

    1982-01-01

    After a critical study of the logical quantum mechanics formulations of Jauch and Piron, classical and quantum versions of statistical inference are studied. In order to do this, the significance of the Jaynes and Kulback principles (maximum likelihood, least squares principles) is revealed from the theorems established. In the quantum mechanics inference problem, a ''distance'' between states is defined. This concept is used to solve the quantum equivalent of the classical problem studied by Kulback. The ''projection postulate'' proposition is subsequently deduced [fr

  20. Time clustered sampling can inflate the inferred substitution rate in foot-and-mouth disease virus analyses

    DEFF Research Database (Denmark)

    Pedersen, Casper-Emil Tingskov; Frandsen, Peter; Wekesa, Sabenzia N.

    2015-01-01

    abundance of sequence data sampled under widely different schemes, an effort to keep results consistent and comparable is needed. This study emphasizes commonly disregarded problems in the inference of evolutionary rates in viral sequence data when sampling is unevenly distributed on a temporal scale...... through a study of the foot-and-mouth (FMD) disease virus serotypes SAT 1 and SAT 2. Our study shows that clustered temporal sampling in phylogenetic analyses of FMD viruses will strongly bias the inferences of substitution rates and tMRCA because the inferred rates in such data sets reflect a rate closer...... to the mutation rate rather than the substitution rate. Estimating evolutionary parameters from viral sequences should be performed with due consideration of the differences in short-term and longer-term evolutionary processes occurring within sets of temporally sampled viruses, and studies should carefully...

  1. The Accuracy of Inference in Small Samples of Dynamic Panel Data Models

    NARCIS (Netherlands)

    Bun, M.J.G.; Kiviet, J.F.

    2001-01-01

    Through Monte Carlo experiments the small sample behavior is examined of various inference techniques for dynamic panel data models when both the time-series and cross-section dimensions of the data set are small. The LSDV technique and corrected versions of it are compared with IV and GMM

  2. Geometric statistical inference

    International Nuclear Information System (INIS)

    Periwal, Vipul

    1999-01-01

    A reparametrization-covariant formulation of the inverse problem of probability is explicitly solved for finite sample sizes. The inferred distribution is explicitly continuous for finite sample size. A geometric solution of the statistical inference problem in higher dimensions is outlined

  3. An antithetic variate to facilitate upper-stem height measurements for critical height sampling with importance sampling

    Science.gov (United States)

    Thomas B. Lynch; Jeffrey H. Gove

    2013-01-01

    Critical height sampling (CHS) estimates cubic volume per unit area by multiplying the sum of critical heights measured on trees tallied in a horizontal point sample (HPS) by the HPS basal area factor. One of the barriers to practical application of CHS is the fact that trees near the field location of the point-sampling sample point have critical heights that occur...
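
    Once the critical heights have been measured, the estimator described above is a one-line computation: the basal area factor times the sum of the critical heights of the tallied trees. A minimal sketch, assuming English units and invented example heights:

```python
def chs_volume_per_acre(critical_heights_ft, baf_ft2_per_acre=10.0):
    """Critical height sampling estimate: BAF (ft^2/acre) times the sum of the
    critical heights (ft) of the trees tallied at a horizontal point sample."""
    return baf_ft2_per_acre * sum(critical_heights_ft)

# example: five "in" trees at one sample point
print(chs_volume_per_acre([34.0, 28.5, 41.2, 22.0, 30.7]), "ft^3/acre")
```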

  4. Probabilistic Inference in General Graphical Models through Sampling in Stochastic Networks of Spiking Neurons

    Science.gov (United States)

    Pecevski, Dejan; Buesing, Lars; Maass, Wolfgang

    2011-01-01

    An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows (“explaining away”) and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons. PMID:22219717

  5. Probabilistic inference in general graphical models through sampling in stochastic networks of spiking neurons.

    Directory of Open Access Journals (Sweden)

    Dejan Pecevski

    2011-12-01

    Full Text Available An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows ("explaining away") and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons.

  6. Probabilistic inference in general graphical models through sampling in stochastic networks of spiking neurons.

    Science.gov (United States)

    Pecevski, Dejan; Buesing, Lars; Maass, Wolfgang

    2011-12-01

    An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows ("explaining away") and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons.

  7. Generalizability of causal inference in observational studies under retrospective convenience sampling.

    Science.gov (United States)

    Hu, Zonghui; Qin, Jing

    2018-05-20

    Many observational studies adopt what we call retrospective convenience sampling (RCS). With the sample size in each arm prespecified, RCS randomly selects subjects from the treatment-inclined subpopulation into the treatment arm and those from the control-inclined into the control arm. Samples in each arm are representative of the respective subpopulation, but the proportion of the 2 subpopulations is usually not preserved in the sample data. We show in this work that, under RCS, existing causal effect estimators actually estimate the treatment effect over the sample population instead of the underlying study population. We investigate how to correct existing methods for consistent estimation of the treatment effect over the underlying population. Although RCS is adopted in medical studies for ethical and cost-effective purposes, it also has a big advantage for statistical inference: When the tendency to receive treatment is low in a study population, treatment effect estimators under RCS, with proper correction, are more efficient than their parallels under random sampling. These properties are investigated both theoretically and through numerical demonstration. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.

  8. Critical evaluation of sample pretreatment techniques.

    Science.gov (United States)

    Hyötyläinen, Tuulia

    2009-06-01

    Sample preparation before chromatographic separation is the most time-consuming and error-prone part of the analytical procedure. Therefore, selecting and optimizing an appropriate sample preparation scheme is a key factor in the final success of the analysis, and the judicious choice of an appropriate procedure greatly influences the reliability and accuracy of a given analysis. The main objective of this review is to critically evaluate the applicability, disadvantages, and advantages of various sample preparation techniques. Particular emphasis is placed on extraction techniques suitable for both liquid and solid samples.

  9. Inference for Local Distributions at High Sampling Frequencies: A Bootstrap Approach

    DEFF Research Database (Denmark)

    Hounyo, Ulrich; Varneskov, Rasmus T.

    of "large" jumps. Our locally dependent wild bootstrap (LDWB) accommodate issues related to the stochastic scale and jumps as well as account for a special block-wise dependence structure induced by sampling errors. We show that the LDWB replicates first and second-order limit theory from the usual...... empirical process and the stochastic scale estimate, respectively, as well as an asymptotic bias. Moreover, we design the LDWB sufficiently general to establish asymptotic equivalence between it and and a nonparametric local block bootstrap, also introduced here, up to second-order distribution theory....... Finally, we introduce LDWB-aided Kolmogorov-Smirnov tests for local Gaussianity as well as local von-Mises statistics, with and without bootstrap inference, and establish their asymptotic validity using the second-order distribution theory. The finite sample performance of CLT and LDWB-aided local...

  10. Statistical inference for the additive hazards model under outcome-dependent sampling.

    Science.gov (United States)

    Yu, Jichang; Liu, Yanyan; Sandler, Dale P; Zhou, Haibo

    2015-09-01

    Cost-effective study design and proper inference procedures for data from such designs are always of particular interest to study investigators. In this article, we propose a biased sampling scheme, an outcome-dependent sampling (ODS) design for survival data with right censoring under the additive hazards model. We develop a weighted pseudo-score estimator for the regression parameters for the proposed design and derive the asymptotic properties of the proposed estimator. We also provide some suggestions for using the proposed method by evaluating the relative efficiency of the proposed method against simple random sampling design and derive the optimal allocation of the subsamples for the proposed design. Simulation studies show that the proposed ODS design is more powerful than other existing designs and the proposed estimator is more efficient than other estimators. We apply our method to analyze a cancer study conducted at NIEHS, the Cancer Incidence and Mortality of Uranium Miners Study, to study the risk of radon exposure to cancer.

  11. Chemistry of groundwater discharge inferred from longitudinal river sampling

    Science.gov (United States)

    Batlle-Aguilar, J.; Harrington, G. A.; Leblanc, M.; Welch, C.; Cook, P. G.

    2014-02-01

    We present an approach for identifying groundwater discharge chemistry and quantifying spatially distributed groundwater discharge into rivers based on longitudinal synoptic sampling and flow gauging of a river. The method is demonstrated using a 450 km reach of a tropical river in Australia. Results obtained from sampling for environmental tracers, major ions, and selected trace element chemistry were used to calibrate a steady state one-dimensional advective transport model of tracer distribution along the river. The model closely reproduced river discharge and environmental tracer and chemistry composition along the study length. It provided a detailed longitudinal profile of groundwater inflow chemistry and discharge rates, revealing that regional fractured mudstones in the central part of the catchment contributed up to 40% of all groundwater discharge. Detailed analysis of model calibration errors and modeled/measured groundwater ion ratios elucidated that groundwater discharging in the top of the catchment is a mixture of local groundwater and bank storage return flow, making the method potentially useful to differentiate between local and regional sourced groundwater discharge. As the error in tracer concentration induced by a flow event applies equally to any conservative tracer, we show that major ion ratios can still be resolved with minimal error when river samples are collected during transient flow conditions. The ability of the method to infer groundwater inflow chemistry from longitudinal river sampling is particularly attractive in remote areas where access to groundwater is limited or not possible, and for identification of actual fluxes of salts and/or specific contaminant sources.
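
    Between two gauged and sampled stations, the simplest form of this mass balance attributes all of the discharge gained to groundwater inflow and solves for the inflow concentration of a conservative tracer. The sketch below shows only that two-station calculation with invented numbers; the paper calibrates a full one-dimensional advective transport model, which this does not attempt.

```python
def groundwater_inflow_concentration(q_up, c_up, q_down, c_down):
    """Steady-state mass balance for a conservative tracer between two river stations:
    C_gw = (Q_down * C_down - Q_up * C_up) / (Q_down - Q_up)."""
    gained = q_down - q_up
    if gained <= 0:
        raise ValueError("no net discharge gain between the two stations")
    return (q_down * c_down - q_up * c_up) / gained

# toy example: discharge in m^3/s and chloride in mg/L at an upstream and a downstream station
c_gw = groundwater_inflow_concentration(q_up=12.0, c_up=15.0, q_down=14.5, c_down=22.0)
print(f"inferred groundwater inflow concentration: {c_gw:.1f} mg/L")
```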

  12. A Bayesian Network Schema for Lessening Database Inference

    National Research Council Canada - National Science Library

    Chang, LiWu; Moskowitz, Ira S

    2001-01-01

    .... The authors introduce a formal schema for database inference analysis, based upon a Bayesian network structure, which identifies critical parameters involved in the inference problem and represents...

  13. Inferences about Variance Components and Reliability-Generalizability Coefficients in the Absence of Random Sampling.

    Science.gov (United States)

    Kane, Michael

    2002-01-01

    Reviews the criticisms of sampling assumptions in generalizability theory (and in reliability theory) and examines the feasibility of using representative sampling, stratification, homogeneity assumptions, and replications to address these criticisms. Suggests some general outlines for the conduct of generalizability theory studies. (SLD)

  14. On the Structure of Cortical Microcircuits Inferred from Small Sample Sizes.

    Science.gov (United States)

    Vegué, Marina; Perin, Rodrigo; Roxin, Alex

    2017-08-30

    The structure in cortical microcircuits deviates from what would be expected in a purely random network, which has been seen as evidence of clustering. To address this issue, we sought to reproduce the nonrandom features of cortical circuits by considering several distinct classes of network topology, including clustered networks, networks with distance-dependent connectivity, and those with broad degree distributions. To our surprise, we found that all of these qualitatively distinct topologies could account equally well for all reported nonrandom features despite being easily distinguishable from one another at the network level. This apparent paradox was a consequence of estimating network properties given only small sample sizes. In other words, networks that differ markedly in their global structure can look quite similar locally. This makes inferring network structure from small sample sizes, a necessity given the technical difficulty inherent in simultaneous intracellular recordings, problematic. We found that a network statistic called the sample degree correlation (SDC) overcomes this difficulty. The SDC depends only on parameters that can be estimated reliably given small sample sizes and is an accurate fingerprint of every topological family. We applied the SDC criterion to data from rat visual and somatosensory cortex and discovered that the connectivity was not consistent with any of these main topological classes. However, we were able to fit the experimental data with a more general network class, of which all previous topologies were special cases. The resulting network topology could be interpreted as a combination of physical spatial dependence and nonspatial, hierarchical clustering. SIGNIFICANCE STATEMENT The connectivity of cortical microcircuits exhibits features that are inconsistent with a simple random network. Here, we show that several classes of network models can account for this nonrandom structure despite qualitative differences in

  15. Reducing bias in population and landscape genetic inferences: the effects of sampling related individuals and multiple life stages.

    Science.gov (United States)

    Peterman, William; Brocato, Emily R; Semlitsch, Raymond D; Eggert, Lori S

    2016-01-01

    In population or landscape genetics studies, an unbiased sampling scheme is essential for generating accurate results, but logistics may lead to deviations from the sample design. Such deviations may come in the form of sampling multiple life stages. Presently, it is largely unknown what effect sampling different life stages can have on population or landscape genetic inference, or how mixing life stages can affect the parameters being measured. Additionally, the removal of siblings from a data set is considered best-practice, but direct comparisons of inferences made with and without siblings are limited. In this study, we sampled embryos, larvae, and adult Ambystoma maculatum from five ponds in Missouri, and analyzed them at 15 microsatellite loci. We calculated allelic richness, heterozygosity and effective population sizes for each life stage at each pond and tested for genetic differentiation (FST and DC) and isolation-by-distance (IBD) among ponds. We tested for differences in each of these measures between life stages, and in a pooled population of all life stages. All calculations were done with and without sibling pairs to assess the effect of sibling removal. We also assessed the effect of reducing the number of microsatellites used to make inference. No statistically significant differences were found among ponds or life stages for any of the population genetic measures, but patterns of IBD differed among life stages. There was significant IBD when using adult samples, but tests using embryos, larvae, or a combination of the three life stages were not significant. We found that increasing the ratio of larval or embryo samples in the analysis of genetic distance weakened the IBD relationship, and when using DC, the IBD was no longer significant when larvae and embryos exceeded 60% of the population sample. Further, power to detect an IBD relationship was reduced when fewer microsatellites were used in the analysis.

  16. Reducing bias in population and landscape genetic inferences: the effects of sampling related individuals and multiple life stages

    Directory of Open Access Journals (Sweden)

    William Peterman

    2016-03-01

    Full Text Available In population or landscape genetics studies, an unbiased sampling scheme is essential for generating accurate results, but logistics may lead to deviations from the sample design. Such deviations may come in the form of sampling multiple life stages. Presently, it is largely unknown what effect sampling different life stages can have on population or landscape genetic inference, or how mixing life stages can affect the parameters being measured. Additionally, the removal of siblings from a data set is considered best-practice, but direct comparisons of inferences made with and without siblings are limited. In this study, we sampled embryos, larvae, and adult Ambystoma maculatum from five ponds in Missouri, and analyzed them at 15 microsatellite loci. We calculated allelic richness, heterozygosity and effective population sizes for each life stage at each pond and tested for genetic differentiation (FST and DC) and isolation-by-distance (IBD) among ponds. We tested for differences in each of these measures between life stages, and in a pooled population of all life stages. All calculations were done with and without sibling pairs to assess the effect of sibling removal. We also assessed the effect of reducing the number of microsatellites used to make inference. No statistically significant differences were found among ponds or life stages for any of the population genetic measures, but patterns of IBD differed among life stages. There was significant IBD when using adult samples, but tests using embryos, larvae, or a combination of the three life stages were not significant. We found that increasing the ratio of larval or embryo samples in the analysis of genetic distance weakened the IBD relationship, and when using DC, the IBD was no longer significant when larvae and embryos exceeded 60% of the population sample. Further, power to detect an IBD relationship was reduced when fewer microsatellites were used in the analysis.

  17. Fossil biogeography: a new model to infer dispersal, extinction and sampling from palaeontological data.

    Science.gov (United States)

    Silvestro, Daniele; Zizka, Alexander; Bacon, Christine D; Cascales-Miñana, Borja; Salamin, Nicolas; Antonelli, Alexandre

    2016-04-05

    Methods in historical biogeography have revolutionized our ability to infer the evolution of ancestral geographical ranges from phylogenies of extant taxa, the rates of dispersals, and biotic connectivity among areas. However, extant taxa are likely to provide limited and potentially biased information about past biogeographic processes, due to extinction, asymmetrical dispersals and variable connectivity among areas. Fossil data hold considerable information about past distribution of lineages, but suffer from largely incomplete sampling. Here we present a new dispersal-extinction-sampling (DES) model, which estimates biogeographic parameters using fossil occurrences instead of phylogenetic trees. The model estimates dispersal and extinction rates while explicitly accounting for the incompleteness of the fossil record. Rates can vary between areas and through time, thus providing the opportunity to assess complex scenarios of biogeographic evolution. We implement the DES model in a Bayesian framework and demonstrate through simulations that it can accurately infer all the relevant parameters. We demonstrate the use of our model by analysing the Cenozoic fossil record of land plants and inferring dispersal and extinction rates across Eurasia and North America. Our results show that biogeographic range evolution is not a time-homogeneous process, as assumed in most phylogenetic analyses, but varies through time and between areas. In our empirical assessment, this is shown by the striking predominance of plant dispersals from Eurasia into North America during the Eocene climatic cooling, followed by a shift in the opposite direction, and finally, a balance in biotic interchange since the middle Miocene. We conclude by discussing the potential of fossil-based analyses to test biogeographic hypotheses and improve phylogenetic methods in historical biogeography. © 2016 The Author(s).

  18. Population genetics inference for longitudinally-sampled mutants under strong selection.

    Science.gov (United States)

    Lacerda, Miguel; Seoighe, Cathal

    2014-11-01

    Longitudinal allele frequency data are becoming increasingly prevalent. Such samples permit statistical inference of the population genetics parameters that influence the fate of mutant variants. To infer these parameters by maximum likelihood, the mutant frequency is often assumed to evolve according to the Wright-Fisher model. For computational reasons, this discrete model is commonly approximated by a diffusion process that requires the assumption that the forces of natural selection and mutation are weak. This assumption is not always appropriate. For example, mutations that impart drug resistance in pathogens may evolve under strong selective pressure. Here, we present an alternative approximation to the mutant-frequency distribution that does not make any assumptions about the magnitude of selection or mutation and is much more computationally efficient than the standard diffusion approximation. Simulation studies are used to compare the performance of our method to that of the Wright-Fisher and Gaussian diffusion approximations. For large populations, our method is found to provide a much better approximation to the mutant-frequency distribution when selection is strong, while all three methods perform comparably when selection is weak. Importantly, maximum-likelihood estimates of the selection coefficient are severely attenuated when selection is strong under the two diffusion models, but not when our method is used. This is further demonstrated with an application to mutant-frequency data from an experimental study of bacteriophage evolution. We therefore recommend our method for estimating the selection coefficient when the effective population size is too large to utilize the discrete Wright-Fisher model. Copyright © 2014 by the Genetics Society of America.
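
    For reference, the discrete Wright-Fisher model that the authors approximate can be simulated directly and, if the whole population is observed every generation, the selection coefficient can be estimated from the exact binomial transition. The sketch below works under those simplifying assumptions (haploid selection, fully observed population, illustrative parameter values); it is not the authors' approximation.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(4)

def wf_trajectory(N, s, x0, generations):
    """Wright-Fisher with haploid selection: expected post-selection frequency is
    x(1+s) / (1 + s*x); the next generation is Binomial(N, that frequency) / N."""
    x, traj = x0, [x0]
    for _ in range(generations):
        x_sel = x * (1 + s) / (1 + s * x)
        x = rng.binomial(N, x_sel) / N
        traj.append(x)
    return np.array(traj)

def loglik_s(s, traj, N):
    """Log-likelihood of a fully observed trajectory under the same binomial transition."""
    x = traj[:-1]
    x_sel = x * (1 + s) / (1 + s * x)
    counts = np.rint(traj[1:] * N).astype(int)
    return binom.logpmf(counts, N, x_sel).sum()

N, s_true = 1000, 0.3                                 # strong selection
traj = wf_trajectory(N, s_true, x0=0.05, generations=30)
grid = np.linspace(0.0, 0.8, 81)
s_hat = grid[np.argmax([loglik_s(s, traj, N) for s in grid])]
print("true s:", s_true, "  grid ML estimate:", round(s_hat, 3))
```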

  19. Sampling flies or sampling flaws? Experimental design and inference strength in forensic entomology.

    Science.gov (United States)

    Michaud, J-P; Schoenly, Kenneth G; Moreau, G

    2012-01-01

    Forensic entomology is an inferential science because postmortem interval estimates are based on the extrapolation of results obtained in field or laboratory settings. Although enormous gains in scientific understanding and methodological practice have been made in forensic entomology over the last few decades, a majority of the field studies we reviewed do not meet the standards for inference, which are 1) adequate replication, 2) independence of experimental units, and 3) experimental conditions that capture a representative range of natural variability. Using a mock case-study approach, we identify design flaws in field and lab experiments and suggest methodological solutions for increasing inference strength that can inform future casework. Suggestions for improving data reporting in future field studies are also proposed.

  20. Effects of Sample Size and Dimensionality on the Performance of Four Algorithms for Inference of Association Networks in Metabonomics

    NARCIS (Netherlands)

    Suarez Diez, M.; Saccenti, E.

    2015-01-01

    We investigated the effect of sample size and dimensionality on the performance of four algorithms (ARACNE, CLR, CORR, and PCLRC) when they are used for the inference of metabolite association networks. We report that as many as 100-400 samples may be necessary to obtain stable network estimations,
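
    The sample-size effect reported above is easy to reproduce for the simplest of the four algorithms, correlation thresholding (CORR). The sketch below uses a synthetic block-correlated data set and an arbitrary 0.6 threshold; all parameter choices are illustrative and unrelated to the metabolomics data in the paper.

```python
import numpy as np

rng = np.random.default_rng(5)

def corr_network(samples, threshold=0.6):
    """CORR-style inference: connect variables whose absolute Pearson correlation
    across samples exceeds a fixed threshold."""
    c = np.corrcoef(samples, rowvar=False)
    np.fill_diagonal(c, 0.0)
    return np.abs(c) > threshold

# ground truth: 20 variables in 4 blocks of 5, correlated 0.7 within blocks
p, block = 20, 5
cov = np.zeros((p, p))
for b in range(0, p, block):
    cov[b:b + block, b:b + block] = 0.7
np.fill_diagonal(cov, 1.0)
off_diag = ~np.eye(p, dtype=bool)
truth = (cov > 0) & off_diag

for n in (25, 50, 100, 200, 400):
    data = rng.multivariate_normal(np.zeros(p), cov, size=n)
    est = corr_network(data)
    recovered = (est & truth).sum() / truth.sum()
    false_pos = (est & ~truth & off_diag).sum() / (~truth & off_diag).sum()
    print(f"n = {n:3d}   true edges recovered: {recovered:.2f}   false positive rate: {false_pos:.2f}")
```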

  1. INFERENCE BUILDING BLOCKS

    Science.gov (United States)

    2018-02-15

    expressed a variety of inference techniques on discrete and continuous distributions: exact inference, importance sampling, Metropolis-Hastings (MH...without redoing any math or rewriting any code. And although our main goal is composable reuse, our performance is also good because we can use...control paths. • The Hakaru language can express mixtures of discrete and continuous distributions, but the current disintegration transformation

  2. Dependence of critical current on sample length analyzed by the variation of local critical current of bent BSCCO superconducting composite tape

    International Nuclear Information System (INIS)

    Matsubayashi, H.; Mukai, Y.; Shin, J.K.; Ochiai, S.; Okuda, H.; Osamura, K.; Otto, A.; Malozemoff, A.

    2008-01-01

    Using the high critical current type BSCCO composite tape fabricated at American Superconductor Corporation, the relation of overall critical current to the distribution of local critical current and the dependence of overall critical current on sample length of the bent samples were studied experimentally and analytically. The measured overall critical current was well described by the distribution of local critical current and n-value of the constituting short elements, by regarding the overall sample as composed of local series circuits and applying the voltage summation model. Also, the dependence of overall critical current on sample length could be reproduced satisfactorily in computer simulations by the proposed method.
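
    A generic sketch of the voltage summation model named above, assuming a series circuit of equal-length elements, a 1 uV/cm electric-field criterion, and an invented scatter of local critical currents and n-values; it reproduces the qualitative length dependence (more elements, lower overall critical current) rather than the measured BSCCO data.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(6)

def overall_ic(local_ic, local_n, element_len_cm=1.5, ec=1e-6):
    """Overall critical current of a series circuit of tape elements, each contributing
    V_i = ec * L_i * (I / Ic_i)**n_i, defined by the average field reaching ec (1 uV/cm)."""
    total_len = element_len_cm * len(local_ic)

    def excess_field(current):
        v = np.sum(ec * element_len_cm * (current / local_ic) ** local_n)
        return v / total_len - ec

    return brentq(excess_field, 1e-3, 2.0 * float(local_ic.max()))

# bent tape: local critical currents degraded with scatter, n-values around 15
for n_elements in (10, 40, 160):                      # longer samples contain more elements
    ic_local = rng.normal(loc=60.0, scale=8.0, size=n_elements).clip(min=20.0)
    n_local = np.full(n_elements, 15.0)
    print(f"{n_elements:3d} elements   overall Ic ~ {overall_ic(ic_local, n_local):.1f} A")
```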

  3. Causal inference in economics and marketing.

    Science.gov (United States)

    Varian, Hal R

    2016-07-05

    This is an elementary introduction to causal inference in economics written for readers familiar with machine learning methods. The critical step in any causal analysis is estimating the counterfactual: a prediction of what would have happened in the absence of the treatment. The powerful techniques used in machine learning may be useful for developing better estimates of the counterfactual, potentially improving causal inference.

  4. Likelihood inference of non-constant diversification rates with incomplete taxon sampling.

    Science.gov (United States)

    Höhna, Sebastian

    2014-01-01

    Large-scale phylogenies provide a valuable source to study background diversification rates and investigate if the rates have changed over time. Unfortunately most large-scale, dated phylogenies are sparsely sampled (fewer than 5% of the described species) and taxon sampling is not uniform. Instead, taxa are frequently sampled to obtain at least one representative per subgroup (e.g. family) and thus to maximize diversity (diversified sampling). So far, such complications have been ignored, potentially biasing the conclusions that have been reached. In this study I derive the likelihood of a birth-death process with non-constant (time-dependent) diversification rates and diversified taxon sampling. Using simulations I test if the true parameters and the sampling method can be recovered when the trees are small or medium sized (fewer than 200 taxa). The results show that the diversification rates can be inferred and the estimates are unbiased for large trees but are biased for small trees (fewer than 50 taxa). Furthermore, model selection by means of Akaike's Information Criterion favors the true model if the true rates differ sufficiently from alternative models (e.g. the birth-death model is recovered if the extinction rate is large and compared to a pure-birth model). Finally, I applied six different diversification rate models--ranging from a constant-rate pure birth process to a decreasing speciation rate birth-death process but excluding any rate shift models--on three large-scale empirical phylogenies (ants, mammals and snakes with respectively 149, 164 and 41 sampled species). All three phylogenies were constructed by diversified taxon sampling, as stated by the authors. However only the snake phylogeny supported diversified taxon sampling. Moreover, a parametric bootstrap test revealed that none of the tested models provided a good fit to the observed data. The model assumptions, such as homogeneous rates across species or no rate shifts, appear to be

  5. Critical point relascope sampling for unbiased volume estimation of downed coarse woody debris

    Science.gov (United States)

    Jeffrey H. Gove; Michael S. Williams; Mark J. Ducey; Mark J. Ducey

    2005-01-01

    Critical point relascope sampling is developed and shown to be design-unbiased for the estimation of log volume when used with point relascope sampling for downed coarse woody debris. The method is closely related to critical height sampling for standing trees when trees are first sampled with a wedge prism. Three alternative protocols for determining the critical...

  6. Fast Markov chain Monte Carlo sampling for sparse Bayesian inference in high-dimensional inverse problems using L1-type priors

    International Nuclear Information System (INIS)

    Lucka, Felix

    2012-01-01

    Sparsity has become a key concept for solving high-dimensional inverse problems using variational regularization techniques. Recently, using similar sparsity-constraints in the Bayesian framework for inverse problems by encoding them in the prior distribution has attracted attention. Important questions about the relation between regularization theory and Bayesian inference still need to be addressed when using sparsity promoting inversion. A practical obstacle for these examinations is the lack of fast posterior sampling algorithms for sparse, high-dimensional Bayesian inversion. Accessing the full range of Bayesian inference methods requires being able to draw samples from the posterior probability distribution in a fast and efficient way. This is usually done using Markov chain Monte Carlo (MCMC) sampling algorithms. In this paper, we develop and examine a new implementation of a single component Gibbs MCMC sampler for sparse priors relying on L1-norms. We demonstrate that the efficiency of our Gibbs sampler increases when the level of sparsity or the dimension of the unknowns is increased. This property is contrary to the properties of the most commonly applied Metropolis–Hastings (MH) sampling schemes. We demonstrate that the efficiency of MH schemes for L1-type priors dramatically decreases when the level of sparsity or the dimension of the unknowns is increased. Practically, Bayesian inversion for L1-type priors using MH samplers is not feasible at all. As this is commonly believed to be an intrinsic feature of MCMC sampling, the performance of our Gibbs sampler also challenges common beliefs about the applicability of sample based Bayesian inference. (paper)

  7. Reinforcement learning or active inference?

    Science.gov (United States)

    Friston, Karl J; Daunizeau, Jean; Kiebel, Stefan J

    2009-07-29

    This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.

  8. Reinforcement learning or active inference?

    Directory of Open Access Journals (Sweden)

    Karl J Friston

    2009-07-01

    Full Text Available This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.

  9. Likelihood inference of non-constant diversification rates with incomplete taxon sampling.

    Directory of Open Access Journals (Sweden)

    Sebastian Höhna

    Full Text Available Large-scale phylogenies provide a valuable source to study background diversification rates and investigate if the rates have changed over time. Unfortunately most large-scale, dated phylogenies are sparsely sampled (fewer than 5% of the described species) and taxon sampling is not uniform. Instead, taxa are frequently sampled to obtain at least one representative per subgroup (e.g. family) and thus to maximize diversity (diversified sampling). So far, such complications have been ignored, potentially biasing the conclusions that have been reached. In this study I derive the likelihood of a birth-death process with non-constant (time-dependent) diversification rates and diversified taxon sampling. Using simulations I test if the true parameters and the sampling method can be recovered when the trees are small or medium sized (fewer than 200 taxa). The results show that the diversification rates can be inferred and the estimates are unbiased for large trees but are biased for small trees (fewer than 50 taxa). Furthermore, model selection by means of Akaike's Information Criterion favors the true model if the true rates differ sufficiently from alternative models (e.g. the birth-death model is recovered if the extinction rate is large and compared to a pure-birth model). Finally, I applied six different diversification rate models--ranging from a constant-rate pure birth process to a decreasing speciation rate birth-death process but excluding any rate shift models--on three large-scale empirical phylogenies (ants, mammals and snakes with respectively 149, 164 and 41 sampled species). All three phylogenies were constructed by diversified taxon sampling, as stated by the authors. However only the snake phylogeny supported diversified taxon sampling. Moreover, a parametric bootstrap test revealed that none of the tested models provided a good fit to the observed data. The model assumptions, such as homogeneous rates across species or no rate shifts, appear

  10. The evolutionary history of ferns inferred from 25 low-copy nuclear genes.

    Science.gov (United States)

    Rothfels, Carl J; Li, Fay-Wei; Sigel, Erin M; Huiet, Layne; Larsson, Anders; Burge, Dylan O; Ruhsam, Markus; Deyholos, Michael; Soltis, Douglas E; Stewart, C Neal; Shaw, Shane W; Pokorny, Lisa; Chen, Tao; dePamphilis, Claude; DeGironimo, Lisa; Chen, Li; Wei, Xiaofeng; Sun, Xiao; Korall, Petra; Stevenson, Dennis W; Graham, Sean W; Wong, Gane K-S; Pryer, Kathleen M

    2015-07-01

    • Understanding fern (monilophyte) phylogeny and its evolutionary timescale is critical for broad investigations of the evolution of land plants, and for providing the point of comparison necessary for studying the evolution of the fern sister group, seed plants. Molecular phylogenetic investigations have revolutionized our understanding of fern phylogeny, however, to date, these studies have relied almost exclusively on plastid data. • Here we take a curated phylogenomics approach to infer the first broad fern phylogeny from multiple nuclear loci, by combining broad taxon sampling (73 ferns and 12 outgroup species) with focused character sampling (25 loci comprising 35877 bp), along with rigorous alignment, orthology inference and model selection. • Our phylogeny corroborates some earlier inferences and provides novel insights; in particular, we find strong support for Equisetales as sister to the rest of ferns, Marattiales as sister to leptosporangiate ferns, and Dennstaedtiaceae as sister to the eupolypods. Our divergence-time analyses reveal that divergences among the extant fern orders all occurred prior to ∼200 MYA. Finally, our species-tree inferences are congruent with analyses of concatenated data, but generally with lower support. Those cases where species-tree support values are higher than expected involve relationships that have been supported by smaller plastid datasets, suggesting that deep coalescence may be reducing support from the concatenated nuclear data. • Our study demonstrates the utility of a curated phylogenomics approach to inferring fern phylogeny, and highlights the need to consider underlying data characteristics, along with data quantity, in phylogenetic studies. © 2015 Botanical Society of America, Inc.

  11. Impact of preferential sampling on exposure prediction and health effect inference in the context of air pollution epidemiology.

    Science.gov (United States)

    Lee, A; Szpiro, A; Kim, S Y; Sheppard, L

    2015-06-01

    Preferential sampling has been defined in the context of geostatistical modeling as the dependence between the sampling locations and the process that describes the spatial structure of the data. It can occur when networks are designed to find high values. For example, in networks based on the U.S. Clean Air Act, monitors are sited to determine whether air quality standards are exceeded. We study the impact of the design of monitor networks in the context of air pollution epidemiology studies. The effect of preferential sampling has been illustrated in the literature by highlighting its impact on spatial predictions. In this paper, we use these predictions as input in a second stage analysis, and we assess how they affect health effect inference. Our work is motivated by data from two United States regulatory networks and health data from the Multi-Ethnic Study of Atherosclerosis and Air Pollution. The two networks were designed to monitor air pollution in urban and rural areas, respectively, and we found that the health analysis results based on the two networks can lead to different scientific conclusions. We use preferential sampling to gain insight into these differences. We designed a simulation study, and found that the validity and reliability of the health effect estimate can be greatly affected by how we sample the monitor locations. To better understand its effect on second stage inference, we identify two components of preferential sampling that shed light on how preferential sampling alters the properties of the health effect estimate.

  12. Impact of preferential sampling on exposure prediction and health effect inference in the context of air pollution epidemiology

    Science.gov (United States)

    Lee, A.; Szpiro, A.; Kim, S.Y.; Sheppard, L.

    2018-01-01

    Summary Preferential sampling has been defined in the context of geostatistical modeling as the dependence between the sampling locations and the process that describes the spatial structure of the data. It can occur when networks are designed to find high values. For example, in networks based on the U.S. Clean Air Act, monitors are sited to determine whether air quality standards are exceeded. We study the impact of the design of monitor networks in the context of air pollution epidemiology studies. The effect of preferential sampling has been illustrated in the literature by highlighting its impact on spatial predictions. In this paper, we use these predictions as input in a second stage analysis, and we assess how they affect health effect inference. Our work is motivated by data from two United States regulatory networks and health data from the Multi-Ethnic Study of Atherosclerosis and Air Pollution. The two networks were designed to monitor air pollution in urban and rural areas, respectively, and we found that the health analysis results based on the two networks can lead to different scientific conclusions. We use preferential sampling to gain insight into these differences. We designed a simulation study, and found that the validity and reliability of the health effect estimate can be greatly affected by how we sample the monitor locations. To better understand its effect on second stage inference, we identify two components of preferential sampling that shed light on how preferential sampling alters the properties of the health effect estimate. PMID:29576734
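
    A minimal simulation in the spirit of the two-stage setting described above (not the authors' analysis) can make the mechanism concrete: monitors are sited either uniformly or preferentially at high-exposure locations, exposures are predicted at subject locations, and the predicted exposures are then used in a second-stage health regression. The 1-D exposure field, the inverse-distance predictor and all parameter values are illustrative assumptions.

      import numpy as np

      rng = np.random.default_rng(1)

      def one_replicate(preferential, n_monitors=30, n_subjects=500, beta=0.5):
          """Predict exposure from a monitor network, then regress health on it."""
          # Smooth exposure field on a 1-D transect (stand-in for a 2-D spatial surface)
          grid = np.linspace(0.0, 10.0, 1000)
          field = 10.0 + 2.0 * np.sin(grid) + rng.normal(0.0, 0.3, grid.size)
          # Monitor siting: a preferential network over-samples high-exposure locations
          if preferential:
              p = np.exp(2.0 * (field - field.mean()))
              idx = rng.choice(grid.size, n_monitors, replace=False, p=p / p.sum())
          else:
              idx = rng.choice(grid.size, n_monitors, replace=False)
          # First stage: exposure prediction at subject locations (inverse-distance weighting)
          subj = rng.uniform(0.0, 10.0, n_subjects)
          w = 1.0 / (np.abs(subj[:, None] - grid[idx][None, :]) + 0.05)
          x_pred = (w * field[idx]).sum(axis=1) / w.sum(axis=1)
          # Health outcome generated from the *true* exposure at subject locations
          x_true = np.interp(subj, grid, field)
          y = beta * x_true + rng.normal(0.0, 1.0, n_subjects)
          # Second stage: health effect estimate using the predicted exposure
          X = np.column_stack([np.ones(n_subjects), x_pred])
          return np.linalg.lstsq(X, y, rcond=None)[0][1]

      for pref in (False, True):
          estimates = [one_replicate(pref) for _ in range(200)]
          label = "preferential" if pref else "uniform"
          print(f"{label:>12s} network: mean health effect estimate = {np.mean(estimates):.3f} (truth 0.5)")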

  13. Transport measurements in superconductors: critical current of granular high TC ceramic superconductor samples

    International Nuclear Information System (INIS)

    Passos, W.A.C.

    2016-01-01

    This work presents a method to obtain the critical current of granular superconductors. We have carried out transport measurements (ρxT curves and VxI curves) in a YBa_2Cu_3O_7_-_δ sample to determine its critical current density. Some specimens reveal a 'semiconductor-like' behavior (electrical resistivity decreases with increasing temperature above the critical temperature T_c of the material) competing with the superconducting behavior. Due to the high granular fraction of the sample, this competition is clearly noted in the ρxT curves. Measurements carried out from 0 to 8500 Oe of applied field show the same behavior, and the critical current density of the samples is shown. (author)

  14. Polynomial Chaos Surrogates for Bayesian Inference

    KAUST Repository

    Le Maitre, Olivier

    2016-01-06

    Bayesian inference is a popular probabilistic method to solve inverse problems, such as the identification of a field parameter in a PDE model. The inference relies on Bayes' rule to update the prior density of the sought field from observations and derive its posterior distribution. In most cases the posterior distribution has no explicit form and has to be sampled, for instance using a Markov chain Monte Carlo method. In practice the prior field parameter is decomposed and truncated (e.g. by means of a Karhunen-Loève decomposition) to recast the inference problem into the inference of a finite number of coordinates. Although proved effective in many situations, Bayesian inference as sketched above faces several difficulties requiring improvements. First, sampling the posterior can be an extremely costly task, as it requires multiple resolutions of the PDE model for different values of the field parameter. Second, when the observations are not very informative, the inferred parameter field can depend strongly on its prior, which can be somewhat arbitrary. These issues have motivated the introduction of reduced models or surrogates for the (approximate) determination of the parametrized PDE solution, and of hyperparameters in the description of the prior field. Our contribution focuses on recent developments in these two directions: the acceleration of posterior sampling by means of Polynomial Chaos expansions and the efficient treatment of parametrized covariance functions for the prior field. We also discuss the possibility of making such an approach adaptive to further improve its efficiency.
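
    A toy sketch of the acceleration idea (under simplifying assumptions, and not the authors' code) replaces an expensive scalar forward model by a probabilists'-Hermite Polynomial Chaos surrogate fitted by least squares, and then runs a Metropolis sampler that evaluates only the surrogate. The forward model, expansion degree and step size below are hypothetical.

      import numpy as np
      from numpy.polynomial.hermite_e import hermevander

      rng = np.random.default_rng(2)

      def forward(theta):
          """Stand-in for an expensive forward model (e.g. a PDE solve)."""
          return np.sin(1.5 * theta) + 0.1 * theta ** 2

      # Offline stage: fit a degree-6 PC expansion g(theta) = sum_k c_k He_k(theta)
      nodes = rng.normal(size=200)                    # samples of the standard-normal germ
      V = hermevander(nodes, 6)                       # probabilists' Hermite basis
      coef, *_ = np.linalg.lstsq(V, forward(nodes), rcond=None)

      def surrogate(theta):
          return hermevander(np.atleast_1d(theta), 6) @ coef

      # Online stage: Metropolis sampling of p(theta | y) using only the cheap surrogate
      y_obs = forward(0.8) + 0.05 * rng.normal()
      sigma = 0.05

      def log_post(theta):
          misfit = (y_obs - surrogate(theta)[0]) / sigma
          return -0.5 * misfit ** 2 - 0.5 * theta ** 2     # standard-normal prior

      theta, chain = 0.0, []
      lp = log_post(theta)
      for _ in range(5000):
          prop = theta + 0.3 * rng.normal()
          lp_prop = log_post(prop)
          if np.log(rng.random()) < lp_prop - lp:
              theta, lp = prop, lp_prop
          chain.append(theta)
      print("approximate posterior mean:", round(float(np.mean(chain[1000:])), 3))

    The expensive model is evaluated only during the offline fitting stage; every posterior evaluation in the chain costs a single polynomial evaluation, which is the source of the speed-up the abstract describes.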

  15. Inference for multivariate regression model based on multiply imputed synthetic data generated via posterior predictive sampling

    Science.gov (United States)

    Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.

    2017-06-01

    The recent popularity of the use of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods of generating and analyzing such data, but these almost always rely on asymptotic distributions and are consequently not adequate for small-sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based on exact distributions, this procedure may even be used for small-sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiter's combination rule to multiply imputed synthetic datasets, and an application to the 2000 Current Population Survey is discussed.

  16. The role of student’s critical asking question in developing student’s critical thinking skills

    Science.gov (United States)

    Santoso, T.; Yuanita, L.; Erman, E.

    2018-01-01

    Questioning means thinking, and thinking is manifested in the form of questions. Research that studies the relationship between questioning and students’ critical thinking skills is scarce, if it exists at all. The aim of this study is to examine how students’ questioning skill correlates with their critical thinking skills in learning chemistry. The research design used was a one-group pretest-posttest design. The participants were 94 students, all of whom were in their final semesters of the Chemistry Education program at Tadulako University. A pre-test was administered to check participants’ ability to ask critical questions and their critical thinking skills in learning chemistry. Then, the students were taught using a questioning technique. After completing the lesson, a post-test was given to evaluate their progress. The data were analyzed using paired-samples t-tests and correlation methods. The result shows that the level of the questions plays an important role in critical thinking skills, in particular questions at the levels of prediction, analysis, evaluation and inference.

  17. Inferring microRNA regulation of mRNA with partially ordered samples of paired expression data and exogenous prediction algorithms.

    Directory of Open Access Journals (Sweden)

    Brian Godsey

    Full Text Available MicroRNAs (miRs) are known to play an important role in mRNA regulation, often by binding to complementary sequences in "target" mRNAs. Recently, several methods have been developed by which existing sequence-based target predictions can be combined with miR and mRNA expression data to infer true miR-mRNA targeting relationships. It has been shown that the combination of these two approaches gives more reliable results than either by itself. While a few such algorithms give excellent results, none fully addresses expression data sets with a natural ordering of the samples. If the samples in an experiment can be ordered or partially ordered by their expected similarity to one another, such as for time-series or studies of development processes, stages, or types (e.g. cell type, disease, growth, aging), there are unique opportunities to infer miR-mRNA interactions that may be specific to the underlying processes, and existing methods do not exploit this. We propose an algorithm which specifically addresses [partially] ordered expression data and takes advantage of sample similarities based on the ordering structure. This is done within a Bayesian framework which specifies posterior distributions and therefore statistical significance for each model parameter and latent variable. We apply our model to a previously published expression data set of paired miR and mRNA arrays in five partially ordered conditions, with biological replicates, related to multiple myeloma, and we show how considering potential orderings can improve the inference of miR-mRNA interactions, as measured by existing knowledge about the involved transcripts.

  18. Theory of sampling: four critical success factors before analysis.

    Science.gov (United States)

    Wagner, Claas; Esbensen, Kim H

    2015-01-01

    Food and feed materials characterization, risk assessment, and safety evaluations can only be ensured if QC measures are based on valid analytical data, stemming from representative samples. The Theory of Sampling (TOS) is the only comprehensive theoretical framework that fully defines all requirements to ensure sampling correctness and representativity, and to provide the guiding principles for sampling in practice. TOS also defines the concept of material heterogeneity and its impact on the sampling process, including the effects from all potential sampling errors. TOS's primary task is to eliminate bias-generating errors and to minimize sampling variability. Quantitative measures are provided to characterize material heterogeneity, on which an optimal sampling strategy should be based. Four critical success factors preceding analysis to ensure a representative sampling process are presented here.

  19. Analysis of Critical Thinking Skills on The Topic of Static Fluid

    Science.gov (United States)

    Puspita, I.; Kaniawati, I.; Suwarma, I. R.

    2017-09-01

    This study aimed to determine the critical thinking skills profile of senior high school students. The research used a descriptive approach to analyze the critical thinking skill test results of 40 grade XI students in one of the senior high schools in Bogor District. The method used is survey research, with the sample determined by a purposive sampling technique. The instrument used is a test of critical thinking skill covering 5 indicators on static fluid topics. The test consists of 11 questions, developed by the researchers and validated by experts. The results showed that students' critical thinking skills are still low: almost every indicator of critical thinking skills reaches less than 30% (28% for elementary clarification, 10% for the basis for decisions/basic support, 6% for inference, 6% for advanced clarification, and 4% for strategies and tactics).

  20. Brain Imaging, Forward Inference, and Theories of Reasoning

    Science.gov (United States)

    Heit, Evan

    2015-01-01

    This review focuses on the issue of how neuroimaging studies address theoretical accounts of reasoning, through the lens of the method of forward inference (Henson, 2005, 2006). After theories of deductive and inductive reasoning are briefly presented, the method of forward inference for distinguishing between psychological theories based on brain imaging evidence is critically reviewed. Brain imaging studies of reasoning, comparing deductive and inductive arguments, comparing meaningful versus non-meaningful material, investigating hemispheric localization, and comparing conditional and relational arguments, are assessed in light of the method of forward inference. Finally, conclusions are drawn with regard to future research opportunities. PMID:25620926

  1. Brain imaging, forward inference, and theories of reasoning.

    Science.gov (United States)

    Heit, Evan

    2014-01-01

    This review focuses on the issue of how neuroimaging studies address theoretical accounts of reasoning, through the lens of the method of forward inference (Henson, 2005, 2006). After theories of deductive and inductive reasoning are briefly presented, the method of forward inference for distinguishing between psychological theories based on brain imaging evidence is critically reviewed. Brain imaging studies of reasoning, comparing deductive and inductive arguments, comparing meaningful versus non-meaningful material, investigating hemispheric localization, and comparing conditional and relational arguments, are assessed in light of the method of forward inference. Finally, conclusions are drawn with regard to future research opportunities.

  2. The significance of Sampling Design on Inference: An Analysis of Binary Outcome Model of Children’s Schooling Using Indonesian Large Multi-stage Sampling Data

    OpenAIRE

    Ekki Syamsulhakim

    2008-01-01

    This paper aims to exercise a rather recent trend in applied microeconometrics, namely the effect of sampling design on statistical inference, especially on binary outcome models. Much theoretical research in econometrics has shown the inappropriateness of applying i.i.d.-assumed statistical analysis to non-i.i.d. data. This research has provided proofs showing that applying an i.i.d.-assumed analysis to non-i.i.d. observations would result in inflated standard errors, which could make the esti...
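
    To make the point concrete, a small sketch (illustrative assumptions only, not the paper's data or model) contrasts i.i.d.-assumed OLS standard errors with cluster-robust (Liang-Zeger) standard errors for a linear probability model of schooling fitted on clustered, multi-stage-style data; the data-generating process and all names are hypothetical.

      import numpy as np

      rng = np.random.default_rng(3)

      # Two-stage (clustered) sample: villages first, then children within each village
      n_clusters, m = 100, 20
      village_effect = rng.normal(0.0, 0.15, n_clusters).repeat(m)   # shared within a village
      income = rng.normal(0.0, 1.0, n_clusters * m)
      p = np.clip(0.6 + 0.1 * income + village_effect, 0.01, 0.99)
      enrolled = rng.binomial(1, p)                                   # binary schooling outcome
      groups = np.arange(n_clusters).repeat(m)

      # Linear probability model fitted by OLS
      X = np.column_stack([np.ones(income.size), income])
      XtX_inv = np.linalg.inv(X.T @ X)
      beta = XtX_inv @ X.T @ enrolled
      resid = enrolled - X @ beta

      # (a) Standard errors under the i.i.d. assumption
      sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
      se_iid = np.sqrt(np.diag(XtX_inv) * sigma2)

      # (b) Cluster-robust (Liang-Zeger sandwich) standard errors
      meat = np.zeros((2, 2))
      for g in range(n_clusters):
          Xg, ug = X[groups == g], resid[groups == g]
          score = Xg.T @ ug
          meat += np.outer(score, score)
      se_cluster = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

      print("coefficients:        ", np.round(beta, 3))
      print("SE, i.i.d.-assumed:  ", np.round(se_iid, 4))
      print("SE, cluster-robust:  ", np.round(se_cluster, 4))

    The gap between the two sets of standard errors is what drives the inferential differences the abstract refers to when the sampling design is ignored.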

  3. Critical current measurements of high Tc superconductors in a scanning low temperature cryostat

    International Nuclear Information System (INIS)

    Telschow, K.L.; O'Brien, T.K.

    1991-01-01

    Maintaining uniformity of properties over long distances is one of the fabrication problems encountered with the new high T c superconductors. Uniform properties are crucial in long tapes or wires with high critical current since local nonuniformities can limit the current carrying capacity of the whole piece. Transport critical currents in high T c superconductors are conventionally measured with the contact 4-point probe DC current-voltage technique. This technique requires contact with the sample and spatially averages over the region between the two voltage contacts. Two techniques have been used to infer the critical state model. The first uses the net magnetization of a suitably shaped sample in an external magnetic field. The second combines a DC magnetic field with AC induced currents to infer spatial flux profiles. The AC magnetization technique offers an advantage in that it is noncontacting; however, it also averages the measurement over a large area and requires that the sample be shaped and positioned such that it exhibits zero demagnetizing factor. This paper describes a measurement technique and a scanning cryostat assembly that are capable of determining local critical current in a tape or wire with high resolution and without any direct sample electrical contact. A small compensated coil was used to induce AC currents in slab-shaped samples. The coil was situated near the surface on one side of the slab. With this method, the AC probe can be used as a noncontacting dissipation probe, replacing the voltage probe in the 4-point contact method, when an externally driven transport current is used, or by itself as a local critical state generator and dissipation detector. The results are shown to be meaningful even when the internal magnetic field is not uniform due to shape demagnetizing effects. 10 refs., 5 figs

  4. Statistical Inference on the Canadian Middle Class

    Directory of Open Access Journals (Sweden)

    Russell Davidson

    2018-03-01

    Full Text Available Conventional wisdom says that the middle classes in many developed countries have recently suffered losses, in terms of both the share of the total population belonging to the middle class, and also their share in total income. Here, distribution-free methods are developed for inference on these shares, by deriving expressions for the asymptotic variances of the sample estimates, and for the covariance between the estimates. Asymptotic inference can be undertaken based on asymptotic normality. Bootstrap inference can be expected to be more reliable, and appropriate bootstrap procedures are proposed. As an illustration, samples of individual earnings drawn from Canadian census data are used to test various hypotheses about the middle-class shares, and confidence intervals for them are computed. It is found that, for the earlier censuses, sample sizes are large enough for asymptotic and bootstrap inference to be almost identical, but that, in the twenty-first century, the bootstrap fails on account of a strange phenomenon whereby many presumably different incomes in the data are rounded to one and the same value. Another difference between the centuries is the appearance of heavy right-hand tails in the income distributions of both men and women.

  5. Inference in partially identified models with many moment inequalities using Lasso

    DEFF Research Database (Denmark)

    Bugni, Federico A.; Caner, Mehmet; Kock, Anders Bredahl

    This paper considers the problem of inference in a partially identified moment (in)equality model with possibly many moment inequalities. Our contribution is to propose a novel two-step inference method based on the combination of two ideas. On the one hand, our test statistic and critical...

  6. Analytical results from Tank 38H criticality Sample HTF-093

    International Nuclear Information System (INIS)

    Wilmarth, W.R.

    2000-01-01

    Resumption of processing in the 242-16H Evaporator could cause salt dissolution in the Waste Concentration Receipt Tank (Tank 38H). Therefore, High Level Waste personnel sampled the tank at the salt surface. Results of elemental analysis of the dried sludge solids from this sample (HTF-093) show significant quantities of neutron poisons (i.e., sodium, iron, and manganese) present to mitigate the potential for nuclear criticality. Comparison of this sample with previous chemical and radiometric analyses of H-Area Evaporator samples shows high poison-to-actinide ratios

  7. Design-based Sample and Probability Law-Assumed Sample: Their Role in Scientific Investigation.

    Science.gov (United States)

    Ojeda, Mario Miguel; Sahai, Hardeo

    2002-01-01

    Discusses some key statistical concepts in probabilistic and non-probabilistic sampling to provide an overview for understanding the inference process. Suggests a statistical model constituting the basis of statistical inference and provides a brief review of the finite population descriptive inference and a quota sampling inferential theory.…

  8. Phylogenetic Inference of HIV Transmission Clusters

    Directory of Open Access Journals (Sweden)

    Vlad Novitsky

    2017-10-01

    Full Text Available Better understanding the structure and dynamics of HIV transmission networks is essential for designing the most efficient interventions to prevent new HIV transmissions, and ultimately for gaining control of the HIV epidemic. The inference of phylogenetic relationships and the interpretation of results rely on the definition of the HIV transmission cluster. The definition of the HIV cluster is complex and dependent on multiple factors, including the design of sampling, accuracy of sequencing, precision of sequence alignment, evolutionary models, the phylogenetic method of inference, and specified thresholds for cluster support. While the majority of studies focus on clusters, non-clustered cases could also be highly informative. A new dimension in the analysis of the global and local HIV epidemics is the concept of phylogenetically distinct HIV sub-epidemics. The identification of active HIV sub-epidemics reveals spreading viral lineages and may help in the design of targeted interventions. HIV clustering can also be affected by sampling density. Obtaining a proper sampling density may increase statistical power and reduce sampling bias, so sampling density should be taken into account in study design and in interpretation of phylogenetic results. Finally, recent advances in long-range genotyping may enable more accurate inference of HIV transmission networks. If performed in real time, it could both inform public-health strategies and be clinically relevant (e.g., drug-resistance testing).

  9. Inference Instruction to Support Reading Comprehension for Elementary Students with Learning Disabilities

    Science.gov (United States)

    Hall, Colby; Barnes, Marcia A.

    2017-01-01

    Making inferences during reading is a critical standards-based skill and is important for reading comprehension. This article supports the improvement of reading comprehension for students with learning disabilities (LD) in upper elementary grades by reviewing what is currently known about inference instruction for students with LD and providing…

  10. Fused Regression for Multi-source Gene Regulatory Network Inference.

    Directory of Open Access Journals (Sweden)

    Kari Y Lam

    2016-12-01

    Full Text Available Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms and single cell types). We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships such as orthology, incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross species network inference. Last, we demonstrate our method's utility in learning from data collected on different experimental platforms.
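
    A toy sketch of the fusion idea under simplified assumptions (ridge-style penalties with a closed-form solution, rather than the published algorithm): regulator weights for two species are estimated jointly, with an extra penalty on differences between orthologous coefficients, so that a poorly sampled species borrows strength from a well-sampled one. All data and parameter values are hypothetical.

      import numpy as np

      rng = np.random.default_rng(4)

      def fused_regression(X1, y1, X2, y2, lam=0.1, gamma=1.0):
          """Jointly estimate regulator weights b1, b2 for two species, with a ridge
          penalty (lam) on each weight vector and a fusion penalty (gamma) on the
          differences between orthologous coefficients. Solved in closed form."""
          p = X1.shape[1]
          I = np.eye(p)
          A = np.block([[X1.T @ X1 + (lam + gamma) * I, -gamma * I],
                        [-gamma * I, X2.T @ X2 + (lam + gamma) * I]])
          b = np.concatenate([X1.T @ y1, X2.T @ y2])
          sol = np.linalg.solve(A, b)
          return sol[:p], sol[p:]

      # Toy data: one target gene, 10 candidate regulators, weights conserved across species
      b_true = np.zeros(10)
      b_true[[1, 4]] = [1.0, -0.8]
      X1 = rng.normal(size=(40, 10))                 # well-sampled species
      X2 = rng.normal(size=(12, 10))                 # poorly sampled species
      y1 = X1 @ b_true + 0.1 * rng.normal(size=40)
      y2 = X2 @ b_true + 0.1 * rng.normal(size=12)

      b1, b2 = fused_regression(X1, y1, X2, y2, gamma=5.0)
      b2_alone = np.linalg.lstsq(X2, y2, rcond=None)[0]
      print("species 2, fused with species 1:", np.round(b2, 2))
      print("species 2, estimated alone:     ", np.round(b2_alone, 2))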

  11. Causal inference in biology networks with integrated belief propagation.

    Science.gov (United States)

    Chang, Rui; Karr, Jonathan R; Schadt, Eric E

    2015-01-01

    Inferring causal relationships among molecular and higher order phenotypes is a critical step in elucidating the complexity of living systems. Here we propose a novel method for inferring causality that is no longer constrained by the conditional dependency arguments that limit the ability of statistical causal inference methods to resolve causal relationships within sets of graphical models that are Markov equivalent. Our method utilizes Bayesian belief propagation to infer the responses of perturbation events on molecular traits given a hypothesized graph structure. A distance measure between the inferred response distribution and the observed data is defined to assess the 'fitness' of the hypothesized causal relationships. To test our algorithm, we infer causal relationships within equivalence classes of gene networks in which the possible functional interactions are assumed to be nonlinear, given synthetic microarray and RNA sequencing data. We also apply our method to infer causality in a real metabolic network with a v-structure and a feedback loop. We show that our method can recapitulate the causal structure and recover the feedback loop from steady-state data alone, which conventional methods cannot.

  12. Sample-length dependence of the critical current of slightly and significantly bent-damaged Bi2223 superconducting composite tape

    International Nuclear Information System (INIS)

    Ochiai, S; Fujimoto, M; Okuda, H; Oh, S S; Ha, D W

    2007-01-01

    The local critical current along a sample length is different from position to position in a long sample, especially when the sample is damaged by externally applied strain. In the present work, we attempted to reveal the relation of the distribution of the local critical current to the overall critical current and the sample-length dependence of critical current for slightly and significantly damaged Bi2223 composite tape samples. In the experiment, 48 cm long Bi2223 composite tape samples, composed of 48 local elements with a length of 1 cm and 8 parts with a length of 6 cm, were bent by 0.37 and 1.0% to cause slight and significant damage, respectively. The V-I curve, critical current (1 μV cm -1 criterion) and n value were measured for the overall sample as well as for the local elements and parts. It was found that the critical current distributions of the 1 cm elements at 0.37 and 1.0% bending strains are described by the three-parameter- and bimodal Weibull distribution functions, respectively. The critical current of a long sample at both bending strains could be described well by substituting the distributed critical current and n value of the short elements into the series circuit model for voltage generation. Also, the measured relation of average critical current to sample length could be reproduced well in the computer by a Monte Carlo simulation method. It was shown that the critical current and n value decrease with increasing sample length at both bending strains. The extent of the decrease in critical current with sample length is dependent on the criterion of the critical current; the critical current decreases only slightly under the 1 μV cm -1 criterion which is not damage-sensitive, while it decreases greatly with increasing sample length under damage-sensitive criteria such as the 1 μV one
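
    A rough sketch of the series-circuit Monte Carlo idea (the distributions and parameter values are illustrative, not those fitted in the paper): each 1 cm element gets its own Weibull-distributed critical current and n value, element voltages follow power-law V-I curves, and the overall critical current is the current at which the summed voltage reaches the electric-field criterion times the sample length.

      import numpy as np

      rng = np.random.default_rng(5)
      V_ELEMENT = 1.0e-6        # 1 uV over a 1 cm element at its own critical current

      def sample_elements(n_elem):
          """Local critical currents (Weibull-distributed) and n-values of 1 cm elements."""
          ic = 20.0 * rng.weibull(8.0, n_elem)        # amperes; shape/scale are illustrative
          n = rng.uniform(10.0, 20.0, n_elem)
          return ic, n

      def overall_ic(ic, n, criterion_per_cm=1.0e-6):
          """Overall critical current of the series circuit: the current at which the
          summed element voltages reach (criterion electric field) x (sample length).
          Found by bisection on the monotone total V-I curve."""
          target = criterion_per_cm * ic.size
          lo, hi = 0.0, 2.0 * ic.max()
          for _ in range(60):
              mid = 0.5 * (lo + hi)
              v_total = np.sum(V_ELEMENT * (mid / ic) ** n)   # power-law element V-I curves
              lo, hi = (mid, hi) if v_total < target else (lo, mid)
          return 0.5 * (lo + hi)

      for length_cm in (1, 6, 12, 48):
          runs = [overall_ic(*sample_elements(length_cm)) for _ in range(500)]
          print(f"{length_cm:3d} cm sample: mean overall Ic = {np.mean(runs):6.2f} A")

    Averaging many such replicates reproduces the qualitative trend discussed above: the longer the sample, the more likely it contains weak elements, so the mean overall critical current decreases with sample length.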

  13. Making inference from wildlife collision data: inferring predator absence from prey strikes

    Directory of Open Access Journals (Sweden)

    Peter Caley

    2017-02-01

    Full Text Available Wildlife collision data are ubiquitous, though challenging for making ecological inference due to typically irreducible uncertainty relating to the sampling process. We illustrate a new approach that is useful for generating inference from predator data arising from wildlife collisions. By simply conditioning on a second prey species sampled via the same collision process, and by using a biologically realistic numerical response function, we can produce a coherent numerical response relationship between predator and prey. This relationship can then be used to make inference on the population size of the predator species, including the probability of extinction. The statistical conditioning enables us to account for unmeasured variation in factors influencing the runway strike incidence for individual airports and to enable valid comparisons. A practical application of the approach for testing hypotheses about the distribution and abundance of a predator species is illustrated using the hypothesized red fox incursion into Tasmania, Australia. We estimate that conditional on the numerical response between fox and lagomorph runway strikes on mainland Australia, the predictive probability of observing no runway strikes of foxes in Tasmania after observing 15 lagomorph strikes is 0.001. We conclude there is enough evidence to safely reject the null hypothesis that there is a widespread red fox population in Tasmania at a population density consistent with prey availability. The method is novel and has potential wider application.

  14. Making inference from wildlife collision data: inferring predator absence from prey strikes.

    Science.gov (United States)

    Caley, Peter; Hosack, Geoffrey R; Barry, Simon C

    2017-01-01

    Wildlife collision data are ubiquitous, though challenging for making ecological inference due to typically irreducible uncertainty relating to the sampling process. We illustrate a new approach that is useful for generating inference from predator data arising from wildlife collisions. By simply conditioning on a second prey species sampled via the same collision process, and by using a biologically realistic numerical response function, we can produce a coherent numerical response relationship between predator and prey. This relationship can then be used to make inference on the population size of the predator species, including the probability of extinction. The statistical conditioning enables us to account for unmeasured variation in factors influencing the runway strike incidence for individual airports and to enable valid comparisons. A practical application of the approach for testing hypotheses about the distribution and abundance of a predator species is illustrated using the hypothesized red fox incursion into Tasmania, Australia. We estimate that conditional on the numerical response between fox and lagomorph runway strikes on mainland Australia, the predictive probability of observing no runway strikes of foxes in Tasmania after observing 15 lagomorph strikes is 0.001. We conclude there is enough evidence to safely reject the null hypothesis that there is a widespread red fox population in Tasmania at a population density consistent with prey availability. The method is novel and has potential wider application.

  15. The Heuristic Value of p in Inductive Statistical Inference

    Directory of Open Access Journals (Sweden)

    Joachim I. Krueger

    2017-06-01

    Full Text Available Many statistical methods yield the probability of the observed data – or data more extreme – under the assumption that a particular hypothesis is true. This probability is commonly known as ‘the’ p-value. (Null Hypothesis) Significance Testing ([NH]ST) is the most prominent of these methods. The p-value has been subjected to much speculation, analysis, and criticism. We explore how well the p-value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. We also explore the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, we find that the p-value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. We conclude that despite its general usefulness, the p-value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics, to communicate what they think the data say.

  16. The Heuristic Value of p in Inductive Statistical Inference.

    Science.gov (United States)

    Krueger, Joachim I; Heck, Patrick R

    2017-01-01

    Many statistical methods yield the probability of the observed data - or data more extreme - under the assumption that a particular hypothesis is true. This probability is commonly known as 'the' p -value. (Null Hypothesis) Significance Testing ([NH]ST) is the most prominent of these methods. The p -value has been subjected to much speculation, analysis, and criticism. We explore how well the p -value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. We also explore the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, we find that the p -value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. We conclude that despite its general usefulness, the p -value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics, to communicate what they think the data say.
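
    A small simulation in the spirit of these two papers (not their code) illustrates the heuristic value of p: two-group experiments are generated with the null true half the time, and for each band of observed p-values one tabulates how often the alternative was in fact true and how often an exact replication comes out significant. Sample size, effect size and the base rate are assumed values.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(6)

      def study_pair(n=20, effect=0.5, p_h1=0.5):
          """One two-group study plus an exact replication.
          Returns (is H1 true, p-value of study, p-value of replication)."""
          h1 = rng.random() < p_h1
          d = effect if h1 else 0.0
          p1 = stats.ttest_ind(rng.normal(d, 1.0, n), rng.normal(0.0, 1.0, n)).pvalue
          p2 = stats.ttest_ind(rng.normal(d, 1.0, n), rng.normal(0.0, 1.0, n)).pvalue
          return h1, p1, p2

      sims = np.array([study_pair() for _ in range(20000)])
      h1, p1, p2 = sims[:, 0].astype(bool), sims[:, 1], sims[:, 2]

      for lo, hi in [(0.0, 0.01), (0.01, 0.05), (0.05, 0.10), (0.10, 1.0)]:
          sel = (p1 >= lo) & (p1 < hi)
          print(f"p in [{lo:.2f}, {hi:.2f}): "
                f"P(H1 true | p) = {h1[sel].mean():.2f}, "
                f"P(replication significant) = {(p2[sel] < 0.05).mean():.2f}")

    Smaller observed p-values are associated with a higher probability that the alternative is true and that the replication is significant, which is the sense in which p acts as a heuristic cue rather than a definitive answer.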

  17. Ancestry informative markers: inference of ancestry in aged bone samples using an autosomal AIM-Indel multiplex.

    Science.gov (United States)

    Romanini, Carola; Romero, Magdalena; Salado Puerto, Mercedes; Catelli, Laura; Phillips, Christopher; Pereira, Rui; Gusmão, Leonor; Vullo, Carlos

    2015-05-01

    Ancestry informative markers (AIMs) can be useful to infer ancestry proportions of the donors of forensic evidence. The probability of successfully typing degraded samples, such as human skeletal remains, is strongly influenced by the DNA fragment lengths that can be amplified and the presence of PCR inhibitors. Several AIM panels are available amongst the many forensic marker sets developed for genotyping degraded DNA. Using a 46 AIM Insertion Deletion (Indel) multiplex, we analyzed human skeletal remains with post mortem times ranging from 35 to 60 years from four different continents (Sub-Saharan Africa, South and Central America, East Asia and Europe) to ascertain the genetic ancestry components. Samples belonging to non-admixed individuals could be assigned to their corresponding continental group. For the remaining samples with admixed ancestry, it was possible to estimate the proportion of co-ancestry components from the four reference population groups. The 46 AIM Indel set was informative enough to efficiently estimate the proportion of ancestry even in samples yielding partial profiles, a frequent occurrence when analyzing inhibited and/or degraded DNA extracts. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  18. HIERARCHICAL PROBABILISTIC INFERENCE OF COSMIC SHEAR

    International Nuclear Information System (INIS)

    Schneider, Michael D.; Dawson, William A.; Hogg, David W.; Marshall, Philip J.; Bard, Deborah J.; Meyers, Joshua; Lang, Dustin

    2015-01-01

    Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probability distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.

  19. Bayesian Inference and Online Learning in Poisson Neuronal Networks.

    Science.gov (United States)

    Huang, Yanping; Rao, Rajesh P N

    2016-08-01

    Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.

  20. Critical length sampling: a method to estimate the volume of downed coarse woody debris

    Science.gov (United States)

    Göran Ståhl; Jeffrey H. Gove; Michael S. Williams; Mark J. Ducey

    2010-01-01

    In this paper, critical length sampling for estimating the volume of downed coarse woody debris is presented. Using this method, the volume of downed wood in a stand can be estimated by summing the critical lengths of down logs included in a sample obtained using a relascope or wedge prism; typically, the instrument should be tilted 90° from its usual...

  1. A Learning Algorithm for Multimodal Grammar Inference.

    Science.gov (United States)

    D'Ulizia, A; Ferri, F; Grifoni, P

    2011-12-01

    The high costs of development and maintenance of multimodal grammars in integrating and understanding input in multimodal interfaces lead to the investigation of novel algorithmic solutions in automating grammar generation and in updating processes. Many algorithms for context-free grammar inference have been developed in the natural language processing literature. An extension of these algorithms toward the inference of multimodal grammars is necessary for multimodal input processing. In this paper, we propose a novel grammar inference mechanism that allows us to learn a multimodal grammar from its positive samples of multimodal sentences. The algorithm first generates the multimodal grammar that is able to parse the positive samples of sentences and, afterward, makes use of two learning operators and the minimum description length metrics in improving the grammar description and in avoiding the over-generalization problem. The experimental results highlight the acceptable performances of the algorithm proposed in this paper since it has a very high probability of parsing valid sentences.

  2. Statistical inference

    CERN Document Server

    Rohatgi, Vijay K

    2003-01-01

    Unified treatment of probability and statistics examines and analyzes the relationship between the two fields, exploring inferential issues. Numerous problems, examples, and diagrams--some with solutions--plus clear-cut, highlighted summaries of results. Advanced undergraduate to graduate level. Contents: 1. Introduction. 2. Probability Model. 3. Probability Distributions. 4. Introduction to Statistical Inference. 5. More on Mathematical Expectation. 6. Some Discrete Models. 7. Some Continuous Models. 8. Functions of Random Variables and Random Vectors. 9. Large-Sample Theory. 10. General Meth

  3. Evaluation of Pu sample oscillations in CESAR

    Energy Technology Data Exchange (ETDEWEB)

    Brunet, M.

    1974-10-15

    A set of 12 plutonium samples of various compositions was oscillated in CESAR in 1973. Comparisons were made to the oscillated reactivity effect of a known specimen of U-235 and boron, where the detector signals were corrected against a background signal based on comparison to the motion of a control rod in the central location of the critical assembly. An equivalent sample method was tested first for various samples of U-235 and boron to establish a means of correction in the detector response. Inferred plutonium reaction rates in the experiments were compared to transport theory calculations using the APOLLO code. Additional effort is needed to reconcile differences in measured and calculated results, requiring both chemical analyses of the plutonium isotopes in the samples and improved cross sections for the plutonium isotopes.

  4. Enhancement of critical currents in superconducting cylindrical samples by circular magnetization

    Energy Technology Data Exchange (ETDEWEB)

    Sikora, A; Makiej, B

    1986-07-16

    Evidence is presented for an enhancement of the critical current by a circular magnetization in cylindrical samples of superconductors such as Sn, In, and In-Pb alloy containing 20 wt% ferromagnetic carbon steel particles. The mechanism of this phenomenon is explained.

  5. Statistical inference for discrete-time samples from affine stochastic delay differential equations

    DEFF Research Database (Denmark)

    Küchler, Uwe; Sørensen, Michael

    2013-01-01

    Statistical inference for discrete time observations of an affine stochastic delay differential equation is considered. The main focus is on maximum pseudo-likelihood estimators, which are easy to calculate in practice. A more general class of prediction-based estimating functions is investigated...

  6. Geostatistical inference using crosshole ground-penetrating radar

    DEFF Research Database (Denmark)

    Looms, Majken C; Hansen, Thomas Mejer; Cordua, Knud Skou

    2010-01-01

    of the subsurface are used to evaluate the uncertainty of the inversion estimate. We have explored the full potential of the geostatistical inference method using several synthetic models of varying correlation structures and have tested the influence of different assumptions concerning the choice of covariance...... reflection profile. Furthermore, the inferred values of the subsurface global variance and the mean velocity have been corroborated with moisture-content measurements, obtained gravimetrically from samples collected at the field site....

  7. Probabilistic Inference: Task Dependency and Individual Differences of Probability Weighting Revealed by Hierarchical Bayesian Modeling.

    Science.gov (United States)

    Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno

    2016-01-01

    Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.

  8. Probabilistic inference: Task dependency and individual differences of probability weighting revealed by hierarchical Bayesian modelling

    Directory of Open Access Journals (Sweden)

    Moritz eBoos

    2016-05-01

    Full Text Available Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modelling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behaviour. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model’s success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modelling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modelling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
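
    A minimal sketch of the kind of model being compared in these two records (hedged; the actual study fits hierarchical models to behavioural data): the normative Bayesian posterior for one urn-ball condition versus a version computed after passing the prior and likelihoods through an (inverted) S-shaped weighting function. The weighting form, the simplified handling of complements and the parameter values are assumptions.

      def weight(p, gamma):
          """An (inverted) S-shaped probability weighting function (Tversky-Kahneman form)."""
          return p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)

      def posterior_urn_a(prior_a, like_a, like_b, gamma=None):
          """P(urn A | drawn ball colour). If gamma is given, prior and likelihoods are
          first distorted by the weighting function (simplification: complements are
          taken after weighting rather than weighted separately)."""
          if gamma is not None:
              prior_a, like_a, like_b = (weight(prior_a, gamma),
                                         weight(like_a, gamma),
                                         weight(like_b, gamma))
          return prior_a * like_a / (prior_a * like_a + (1.0 - prior_a) * like_b)

      # One cell of the factorial design: prior 0.7 for urn A, red-ball likelihoods 0.8 vs 0.4
      print("normative Bayes:", round(posterior_urn_a(0.7, 0.8, 0.4), 3))
      for g in (0.6, 1.0, 1.5):
          print(f"gamma = {g:.1f}    :", round(posterior_urn_a(0.7, 0.8, 0.4, gamma=g), 3))

    Allowing gamma to differ across conditions and individuals, with weakly informative priors on the individual values, is the hierarchical extension the abstract argues is critical for the model's success.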

  9. A general Bayes weibull inference model for accelerated life testing

    International Nuclear Information System (INIS)

    Dorp, J. Rene van; Mazzuchi, Thomas A.

    2005-01-01

    This article presents the development of a general Bayes inference model for accelerated life testing. The failure times at a constant stress level are assumed to belong to a Weibull distribution, but the specification of strict adherence to a parametric time-transformation function is not required. Rather, prior information is used to indirectly define a multivariate prior distribution for the scale parameters at the various stress levels and the common shape parameter. Using the approach, Bayes point estimates as well as probability statements for use-stress (and accelerated) life parameters may be inferred from a host of testing scenarios. The inference procedure accommodates both the interval data sampling strategy and type I censored sampling strategy for the collection of ALT test data. The inference procedure uses the well-known MCMC (Markov Chain Monte Carlo) methods to derive posterior approximations. The approach is illustrated with an example

  10. Connectivity inference from neural recording data: Challenges, mathematical bases and research directions.

    Science.gov (United States)

    Magrans de Abril, Ildefons; Yoshimoto, Junichiro; Doya, Kenji

    2018-06-01

    This article presents a review of computational methods for connectivity inference from neural activity data derived from multi-electrode recordings or fluorescence imaging. We first identify biophysical and technical challenges in connectivity inference along the data processing pipeline. We then review connectivity inference methods based on two major mathematical foundations, namely, descriptive model-free approaches and generative model-based approaches. We investigate representative studies in both categories and clarify which challenges have been addressed by which method. We further identify critical open issues and possible research directions. Copyright © 2018 The Author(s). Published by Elsevier Ltd.. All rights reserved.

  11. Impact of radiobiological considerations on epidemiological inferences of age-dependent radiosensitivity

    International Nuclear Information System (INIS)

    Crawford-Brown, D.J.

    1983-01-01

    Current epidemiological studies of the age-dependent risk of radiogenic carcinomas are based on populations still in the early stages of cancer expression. The result is a set of logical uncertainties concerning the manner in which inferences may be drawn from the existing data. These uncertainties may be formalized and examined through the application of various radiobiological principles developed from more fundamental experimental data. Chief amongst these considerations are the time course of tumor expression, the role of relative and absolute risk models, the distribution of effects between initiation and promotion, the age-dependent fraction of time a critical cell remains in radiosensitive stages, and the combinatorics of the critical cellular subpopulations. Each of these principles is examined in light of its impact on the structuring of epidemiologic data and the drawing of inferences concerning age-dependent radiogenic risk. The data on atomic bomb survivors are employed as a relevant example

  12. Effect of non-Poisson samples on turbulence spectra from laser velocimetry

    Science.gov (United States)

    Sree, Dave; Kjelgaard, Scott O.; Sellers, William L., III

    1994-01-01

    Spectral analysis of laser velocimetry (LV) data plays an important role in characterizing a turbulent flow and in estimating the associated turbulence scales, which can be helpful in validating theoretical and numerical turbulence models. The determination of turbulence scales is critically dependent on the accuracy of the spectral estimates. Spectral estimations from 'individual realization' laser velocimetry data are typically based on the assumption of a Poisson sampling process. What this Note has demonstrated is that the sampling distribution must be considered before spectral estimates are used to infer turbulence scales.

  13. Monte Carlo sampling on technical parameters in criticality and burn-up-calculations

    International Nuclear Information System (INIS)

    Kirsch, M.; Hannstein, V.; Kilger, R.

    2011-01-01

    The increase in computing power in recent years allows for the introduction of Monte Carlo sampling techniques for sensitivity and uncertainty analyses in criticality safety and burn-up calculations. With these techniques it is possible to assess the influence of a variation of the input parameters within their measured or estimated uncertainties on the final value of a calculation. The probabilistic result of a statistical analysis can thus complement the traditional method of figuring out both the nominal (best estimate) and the bounding case of the neutron multiplication factor (k eff ) in criticality safety analyses, e.g. by calculating the uncertainty of k eff or tolerance limits. Furthermore, the sampling method provides a possibility to derive sensitivity information, i.e. it allows figuring out which of the uncertain input parameters contribute the most to the uncertainty of the system. The application of Monte Carlo sampling methods has become a common practice in both industry and research institutes. Within this approach, two main paths are currently under investigation: the variation of nuclear data used in a calculation and the variation of technical parameters such as manufacturing tolerances. This contribution concentrates on the latter case. The newly developed SUnCISTT (Sensitivities and Uncertainties in Criticality Inventory and Source Term Tool) is introduced. It defines an interface to SUSA, the well-established GRS tool for sensitivity and uncertainty analyses, which provides the necessary statistical methods for sampling-based analyses. The interfaced codes are programs that are used to simulate aspects of the nuclear fuel cycle, such as the criticality safety analysis sequence CSAS5 of the SCALE code system, developed by Oak Ridge National Laboratory, or the GRS burn-up system OREST. In the following, first the implementation of the SUnCISTT will be presented; then, results of its application in an exemplary evaluation of the neutron
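
    A generic sketch of the sampling workflow (not SUnCISTT itself): technical parameters are drawn from assumed tolerance distributions, pushed through a keff evaluation (here a hypothetical linear surrogate standing in for a CSAS5-type transport calculation), and summarised by a Wilks-type 95%/95% one-sided tolerance limit plus rank-correlation sensitivities. All parameter ranges and coefficients are invented for illustration.

      import numpy as np
      from scipy.stats import spearmanr

      rng = np.random.default_rng(7)

      def keff_surrogate(enrichment, density, radius):
          """Hypothetical smooth surrogate standing in for a full transport calculation
          (a real analysis would run e.g. a CSAS5 case for every sampled parameter set)."""
          return (0.80 + 0.015 * (enrichment - 4.0)
                       + 0.04 * (density - 10.4)
                       + 0.10 * (radius - 0.41))

      # 59 runs give a one-sided 95%/95% tolerance limit by Wilks' rule (1 - 0.95**59 >= 0.95)
      n = 59
      enrichment = rng.uniform(3.95, 4.05, n)       # wt% U-235 within its tolerance band
      density = rng.normal(10.4, 0.05, n)           # g/cm3 pellet density
      radius = rng.uniform(0.40, 0.42, n)           # cm pellet radius

      keff = keff_surrogate(enrichment, density, radius)
      print(f"nominal keff: {keff_surrogate(4.0, 10.4, 0.41):.5f}")
      print(f"95/95 one-sided limit (max of {n} runs): {keff.max():.5f}")

      # Sensitivity indication: rank correlation of each sampled input with keff
      for name, values in (("enrichment", enrichment), ("density", density), ("radius", radius)):
          rho, _ = spearmanr(values, keff)
          print(f"Spearman rho, {name:<11s}: {rho:+.2f}")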

  14. Critical thinking dispositions in baccalaureate nursing students.

    Science.gov (United States)

    Shin, Kyung Rim; Lee, Ja Hyung; Ha, Ju Young; Kim, Kon Hee

    2006-10-01

    This paper reports an investigation into the critical thinking disposition of students enrolled in a baccalaureate nursing programme at a university in Korea. Critical thinking may be summarized as a skilled process that conceptualizes and applies information from observation, experience, reflection, inference and communication in a technical manner. It is more of a rational act used as an instrument rather than as a result. Critical thinking is a core competency in nursing and has been widely discussed in nursing education. However, the results of previous research on the effectiveness of nursing education in improving students' critical thinking have been inconsistent. A longitudinal design was used with a convenience sample of 60 nursing students; 32 students participated four times in completing a questionnaire each March from 1999 to 2002. The California Critical Thinking Disposition Inventory was administered to measure disposition to critical thinking. There was a statistically significant improvement in critical thinking disposition score by academic year (F = 7.54, P = 0.0001). Among the subscales, open-mindedness, self-confidence, and maturity also showed a statistically significant difference by academic year (P = 0.0194, 0.0041, 0.0044). Teaching strategies to enhance critical thinking should be developed, in addition to further research on the effect of the nursing curriculum on students' critical thinking. Moreover, survey instruments could be adjusted to incorporate characteristics of the Korean culture.

  15. BagReg: Protein inference through machine learning.

    Science.gov (United States)

    Zhao, Can; Liu, Dao; Teng, Ben; He, Zengyou

    2015-08-01

    Protein inference from the identified peptides is of primary importance in shotgun proteomics. The target of protein inference is to identify whether each candidate protein is truly present in the sample. To date, many computational methods have been proposed to solve this problem. However, there is still no method that can fully utilize the information hidden in the input data. In this article, we propose a learning-based method named BagReg for protein inference. The method firstly artificially extracts five features from the input data, and then chooses each feature as the class feature to separately build models to predict the presence probabilities of proteins. Finally, the weak results from five prediction models are aggregated to obtain the final result. We test our method on six publicly available data sets. The experimental results show that our method is superior to the state-of-the-art protein inference algorithms. Copyright © 2015 Elsevier Ltd. All rights reserved.
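    The exact features and learners used by BagReg are not reproduced here. Under one plausible reading of the scheme described above, each of a handful of protein-level features is taken in turn as the prediction target, a model is trained on the remaining features, and the resulting scores are averaged. The sketch below, with hypothetical randomly generated features and a gradient-boosting learner standing in for whatever BagReg actually uses, only illustrates that aggregation pattern.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# hypothetical protein-level features (one row per candidate protein), e.g.
# peptide count, unique peptide count, best peptide score, summed score, coverage
rng = np.random.default_rng(1)
X = rng.random((500, 5))

scores = []
for target in range(X.shape[1]):
    y = X[:, target]                              # take one feature as the "class" feature
    Z = np.delete(X, target, axis=1)              # ...and predict it from the remaining four
    pred = GradientBoostingRegressor(random_state=0).fit(Z, y).predict(Z)
    # rescale each weak prediction to [0, 1] so it can serve as a presence score
    pred = (pred - pred.min()) / (pred.max() - pred.min() + 1e-12)
    scores.append(pred)

presence_probability = np.mean(scores, axis=0)    # aggregate the five weak results
ranked = np.argsort(-presence_probability)        # candidate proteins, most likely first
```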

  16. A Hierarchical Poisson Log-Normal Model for Network Inference from RNA Sequencing Data

    Science.gov (United States)

    Gallopin, Mélina; Rau, Andrea; Jaffrézic, Florence

    2013-01-01

    Gene network inference from transcriptomic data is an important methodological challenge and a key aspect of systems biology. Although several methods have been proposed to infer networks from microarray data, there is a need for inference methods able to model RNA-seq data, which are count-based and highly variable. In this work we propose a hierarchical Poisson log-normal model with a Lasso penalty to infer gene networks from RNA-seq data; this model has the advantage of directly modelling discrete data and accounting for inter-sample variance larger than the sample mean. Using real microRNA-seq data from breast cancer tumors and simulations, we compare this method to a regularized Gaussian graphical model on log-transformed data, and a Poisson log-linear graphical model with a Lasso penalty on power-transformed data. For data simulated with large inter-sample dispersion, the proposed model performs better than the other methods in terms of sensitivity, specificity and area under the ROC curve. These results show the necessity of methods specifically designed for gene network inference from RNA-seq data. PMID:24147011
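    The hierarchical Poisson log-normal model itself is not reproduced here. One of the baselines it is compared against, a regularized Gaussian graphical model fitted to log-transformed counts, can be sketched roughly as follows (synthetic counts, scikit-learn's graphical lasso).

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

# counts: n samples x p genes matrix of RNA-seq (or miRNA-seq) read counts
rng = np.random.default_rng(0)
counts = rng.poisson(lam=20, size=(60, 30))

# log-transform with a pseudo-count, then standardise each gene
x = np.log(counts + 1.0)
x = (x - x.mean(axis=0)) / x.std(axis=0)

# sparse inverse covariance = conditional (in)dependence graph under normality
model = GraphicalLassoCV().fit(x)
precision = model.precision_
edges = (np.abs(precision) > 1e-8) & ~np.eye(precision.shape[0], dtype=bool)
print("number of inferred edges:", int(edges.sum()) // 2)
```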

  17. Some limitations of public sequence data for phylogenetic inference (in plants).

    Science.gov (United States)

    Hinchliff, Cody E; Smith, Stephen Andrew

    2014-01-01

    The GenBank database contains essentially all of the nucleotide sequence data generated for published molecular systematic studies, but for the majority of taxa these data remain sparse. GenBank has value for phylogenetic methods that leverage data-mining and rapidly improving computational methods, but the limits imposed by the sparse structure of the data are not well understood. Here we present a tree representing 13,093 land plant genera--an estimated 80% of extant plant diversity--to illustrate the potential of public sequence data for broad phylogenetic inference in plants, and we explore the limits to inference imposed by the structure of these data using theoretical foundations from phylogenetic data decisiveness. We find that despite very high levels of missing data (over 96%), the present data retain the potential to inform over 86.3% of all possible phylogenetic relationships. Most of these relationships, however, are informed by small amounts of data--approximately half are informed by fewer than four loci, and more than 99% are informed by fewer than fifteen. We also apply an information theoretic measure of branch support to assess the strength of phylogenetic signal in the data, revealing many poorly supported branches concentrated near the tips of the tree, where data are sparse and the limiting effects of this sparseness are stronger. We argue that limits to phylogenetic inference and signal imposed by low data coverage may pose significant challenges for comprehensive phylogenetic inference at the species level. Computational requirements provide additional limits for large reconstructions, but these may be overcome by methodological advances, whereas insufficient data coverage can only be remedied by additional sampling effort. We conclude that public databases have exceptional value for modern systematics and evolutionary biology, and that a continued emphasis on expanding taxonomic and genomic coverage will play a critical role in developing

  18. Statistical inference for extended or shortened phase II studies based on Simon's two-stage designs.

    Science.gov (United States)

    Zhao, Junjun; Yu, Menggang; Feng, Xi-Ping

    2015-06-07

    Simon's two-stage designs are popular choices for conducting phase II clinical trials, especially in oncology trials, to reduce the number of patients placed on ineffective experimental therapies. Recently, Koyama and Chen (2008) discussed how to conduct proper inference for such studies because they found that inference procedures used with Simon's designs almost always ignore the actual sampling plan used. In particular, they proposed an inference method for studies when the actual second stage sample sizes differ from planned ones. We consider an alternative inference method based on likelihood ratio. In particular, we order permissible sample paths under Simon's two-stage designs using their corresponding conditional likelihood. In this way, we can calculate p-values using the common definition: the probability of obtaining a test statistic value at least as extreme as that observed under the null hypothesis. In addition to providing inference for a couple of scenarios where Koyama and Chen's method can be difficult to apply, the resulting estimate based on our method appears to have certain advantages in terms of inference properties in many numerical simulations. It generally led to smaller biases and narrower confidence intervals while maintaining similar coverages. We also illustrated the two methods in a real data setting. Inference procedures used with Simon's designs almost always ignore the actual sampling plan. Reported P-values, point estimates and confidence intervals for the response rate are not usually adjusted for the design's adaptiveness. Proper statistical inference procedures should be used.
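    The general recipe, enumerating every permissible sample path of a Simon design, ordering the paths by some statistic, and summing the null probabilities of paths at least as extreme as the observed one, can be sketched as below. The design constants and observed counts are made up, and the ordering statistic used here is simply the overall estimated response rate, a simpler stand-in for the conditional-likelihood ordering proposed in the paper.

```python
from math import comb

def paths(n1, n2, r1):
    """Enumerate every permissible sample path of a Simon two-stage design."""
    for x1 in range(n1 + 1):
        if x1 <= r1:
            yield x1, None                     # trial stopped after stage 1
        else:
            for x2 in range(n2 + 1):
                yield x1, x2

def path_prob(x1, x2, p, n1, n2):
    """Probability of a sample path when the true response rate is p."""
    pr = comb(n1, x1) * p ** x1 * (1 - p) ** (n1 - x1)
    if x2 is None:
        return pr
    return pr * comb(n2, x2) * p ** x2 * (1 - p) ** (n2 - x2)

def rate(x1, x2, n1, n2):
    """Ordering statistic: the overall estimated response rate of the path."""
    return x1 / n1 if x2 is None else (x1 + x2) / (n1 + n2)

def exact_p_value(x1_obs, x2_obs, p0, n1, n2, r1):
    """P(paths at least as extreme as the observed one) under H0: p = p0."""
    t_obs = rate(x1_obs, x2_obs, n1, n2)
    return sum(path_prob(x1, x2, p0, n1, n2)
               for x1, x2 in paths(n1, n2, r1)
               if rate(x1, x2, n1, n2) >= t_obs)

# hypothetical design: 13 patients in stage 1, continue only if more than 3 respond,
# then 30 more patients; observed 5 responses in stage 1 and 11 in stage 2
print(exact_p_value(5, 11, p0=0.20, n1=13, n2=30, r1=3))
```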

  19. Bayesian inference in probabilistic risk assessment-The current state of the art

    International Nuclear Information System (INIS)

    Kelly, Dana L.; Smith, Curtis L.

    2009-01-01

    Markov chain Monte Carlo (MCMC) approaches to sampling directly from the joint posterior distribution of aleatory model parameters have led to tremendous advances in Bayesian inference capability in a wide variety of fields, including probabilistic risk analysis. The advent of freely available software coupled with inexpensive computing power has catalyzed this advance. This paper examines where the risk assessment community is with respect to implementing modern computational-based Bayesian approaches to inference. Through a series of examples in different topical areas, it introduces salient concepts and illustrates the practical application of Bayesian inference via MCMC sampling to a variety of important problems
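    A typical PRA-style example of this kind of inference is a Poisson likelihood for a count of failures combined with a lognormal prior on the failure rate. The sketch below is not one of the paper's worked examples; it is a minimal random-walk Metropolis sampler for that standard setting, with made-up data and prior parameters.

```python
import numpy as np

# data: 2 failures observed in 10,000 component-hours (hypothetical)
events, exposure = 2, 1.0e4

# lognormal prior on the failure rate lambda: median 1e-4 per hour, error factor 10
mu, sigma = np.log(1.0e-4), np.log(10.0) / 1.645

def log_post(lam):
    if lam <= 0:
        return -np.inf
    log_prior = -np.log(lam) - (np.log(lam) - mu) ** 2 / (2 * sigma ** 2)
    log_like = events * np.log(lam * exposure) - lam * exposure
    return log_prior + log_like

rng = np.random.default_rng(7)
lam, samples = 1.0e-4, []
for _ in range(50_000):
    prop = lam * np.exp(0.5 * rng.standard_normal())   # random walk on the log scale
    # asymmetric proposal, so include the Jacobian correction prop/lam
    if np.log(rng.random()) < log_post(prop) - log_post(lam) + np.log(prop / lam):
        lam = prop
    samples.append(lam)

post = np.array(samples[10_000:])                       # drop burn-in
print("posterior mean:", post.mean())
print("90% credible interval:", np.percentile(post, [5, 95]))
```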

  20. Artificial Neural Network for Total Laboratory Automation to Improve the Management of Sample Dilution.

    Science.gov (United States)

    Ialongo, Cristiano; Pieri, Massimo; Bernardini, Sergio

    2017-02-01

    Diluting a sample to obtain a measure within the analytical range is a common task in clinical laboratories. However, for urgent samples, it can cause delays in test reporting, which can put patients' safety at risk. The aim of this work is to show a simple artificial neural network that can be used to make it unnecessary to predilute a sample using the information available through the laboratory information system. Particularly, the Multilayer Perceptron neural network built on a data set of 16,106 cardiac troponin I test records produced a correct inference rate of 100% for samples not requiring predilution and 86.2% for those requiring predilution. With respect to the inference reliability, the most relevant inputs were the presence of a cardiac event or surgery and the result of the previous assay. Therefore, such an artificial neural network can be easily implemented into a total automation framework to sensibly reduce the turnaround time of critical orders delayed by the operation required to retrieve, dilute, and retest the sample.
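    The network architecture and the exact laboratory-information-system features used in the paper are not reproduced here; the sketch below only illustrates the idea with a scikit-learn multilayer perceptron, hypothetical synthetic features (previous result, time since last test, cardiac event/surgery flag, age) and a made-up dilution rule used to label the data.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# hypothetical LIS-derived features for a troponin I order
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    rng.lognormal(3.0, 1.5, n),        # previous assay result (ng/L)
    rng.uniform(1, 72, n),             # hours since previous test
    rng.integers(0, 2, n),             # cardiac event or surgery flag
    rng.integers(20, 95, n),           # age
])
# label: does the new sample exceed the analytical range and need predilution?
y = ((X[:, 0] > 400) & (X[:, 2] == 1)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=0, stratify=y)
clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000,
                                  random_state=0))
clf.fit(X_tr, y_tr)
print("correct inference rate on held-out orders:", clf.score(X_te, y_te))
```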

  1. A Network Inference Workflow Applied to Virulence-Related Processes in Salmonella typhimurium

    Energy Technology Data Exchange (ETDEWEB)

    Taylor, Ronald C.; Singhal, Mudita; Weller, Jennifer B.; Khoshnevis, Saeed; Shi, Liang; McDermott, Jason E.

    2009-04-20

    Inference of the structure of mRNA transcriptional regulatory networks, protein regulatory or interaction networks, and protein activation/inactivation-based signal transduction networks is a critical task in systems biology. In this article we discuss a workflow for the reconstruction of parts of the transcriptional regulatory network of the pathogenic bacterium Salmonella typhimurium based on the information contained in sets of microarray gene expression data now available for that organism, and describe our results obtained by following this workflow. The primary tool is one of the network inference algorithms deployed in the Software Environment for BIological Network Inference (SEBINI). Specifically, we selected the algorithm called Context Likelihood of Relatedness (CLR), which uses the mutual information contained in the gene expression data to infer regulatory connections. The associated analysis pipeline automatically stores the inferred edges from the CLR runs within SEBINI and, upon request, transfers the inferred edges into either Cytoscape or the plug-in Collective Analysis of Biological Interaction Networks (CABIN) tool for further post-analysis of the inferred regulatory edges. The following article presents the outcome of this workflow, as well as the protocols followed for microarray data collection, data cleansing, and network inference. Our analysis revealed several interesting interactions, functional groups, metabolic pathways, and regulons in S. typhimurium.
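    SEBINI, Cytoscape and CABIN are not sketched here. The core CLR step itself, scoring the mutual information of each gene pair against the background distributions of its row and column, is compact enough to write out; the version below uses simple histogram binning for the mutual-information estimate, which is a simplification of what production implementations do.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def clr(expression, bins=10):
    """Context Likelihood of Relatedness on a genes x conditions matrix."""
    g = expression.shape[0]
    # pairwise mutual information from binned expression profiles
    binned = np.array([np.digitize(x, np.histogram(x, bins)[1][:-1]) for x in expression])
    mi = np.zeros((g, g))
    for i in range(g):
        for j in range(i + 1, g):
            mi[i, j] = mi[j, i] = mutual_info_score(binned[i], binned[j])
    # z-score each MI value against the background of its row and column
    mu, sd = mi.mean(axis=1, keepdims=True), mi.std(axis=1, keepdims=True) + 1e-12
    z = np.maximum((mi - mu) / sd, 0.0)
    return np.sqrt(z ** 2 + z.T ** 2)      # combined CLR score for each gene pair

rng = np.random.default_rng(3)
expr = rng.standard_normal((50, 120))       # 50 genes x 120 microarray conditions
scores = clr(expr)                          # high scores suggest regulatory connections
```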

  2. Robust inference in sample selection models

    KAUST Repository

    Zhelonkin, Mikhail; Genton, Marc G.; Ronchetti, Elvezio

    2015-01-01

    The problem of non-random sample selectivity often occurs in practice in many fields. The classical estimators introduced by Heckman are the backbone of the standard statistical analysis of these models. However, these estimators are very sensitive to small deviations from the distributional assumptions which are often not satisfied in practice. We develop a general framework to study the robustness properties of estimators and tests in sample selection models. We derive the influence function and the change-of-variance function of Heckman's two-stage estimator, and we demonstrate the non-robustness of this estimator and its estimated variance to small deviations from the model assumed. We propose a procedure for robustifying the estimator, prove its asymptotic normality and give its asymptotic variance. Both cases with and without an exclusion restriction are covered. This allows us to construct a simple robust alternative to the sample selection bias test. We illustrate the use of our new methodology in an analysis of ambulatory expenditures and we compare the performance of the classical and robust methods in a Monte Carlo simulation study.
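    The robust estimator developed in the paper is beyond a short snippet. For context, the classical Heckman two-step procedure whose sensitivity is being analysed can be sketched as follows, on simulated data and using statsmodels; the coefficients and covariates are invented for illustration.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 2000
z = rng.standard_normal((n, 2))        # selection covariates (2nd is an exclusion restriction)
x = z[:, 0]                            # outcome covariate
u, e = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], n).T

selected = (0.5 + z @ np.array([1.0, 1.0]) + u) > 0    # selection equation
y = 1.0 + 2.0 * x + e                                   # outcome, observed only if selected

# step 1: probit for the selection equation, then the inverse Mills ratio
Zc = sm.add_constant(z)
probit = sm.Probit(selected.astype(float), Zc).fit(disp=0)
xb = (Zc @ probit.params)[selected]
mills = norm.pdf(xb) / norm.cdf(xb)

# step 2: OLS of the observed outcomes on covariates plus the inverse Mills ratio
X2 = sm.add_constant(np.column_stack([x[selected], mills]))
ols = sm.OLS(y[selected], X2).fit()
print(ols.params)    # roughly [1.0, 2.0, rho*sigma]: intercept, slope, selection correction
```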

  3. Robust inference in sample selection models

    KAUST Repository

    Zhelonkin, Mikhail

    2015-11-20

    The problem of non-random sample selectivity often occurs in practice in many fields. The classical estimators introduced by Heckman are the backbone of the standard statistical analysis of these models. However, these estimators are very sensitive to small deviations from the distributional assumptions which are often not satisfied in practice. We develop a general framework to study the robustness properties of estimators and tests in sample selection models. We derive the influence function and the change-of-variance function of Heckman's two-stage estimator, and we demonstrate the non-robustness of this estimator and its estimated variance to small deviations from the model assumed. We propose a procedure for robustifying the estimator, prove its asymptotic normality and give its asymptotic variance. Both cases with and without an exclusion restriction are covered. This allows us to construct a simple robust alternative to the sample selection bias test. We illustrate the use of our new methodology in an analysis of ambulatory expenditures and we compare the performance of the classical and robust methods in a Monte Carlo simulation study.

  4. Accelerating inference for diffusions observed with measurement error and large sample sizes using approximate Bayesian computation

    DEFF Research Database (Denmark)

    Picchini, Umberto; Forman, Julie Lyng

    2016-01-01

    In recent years, dynamical modelling has been provided with a range of breakthrough methods to perform exact Bayesian inference. However, it is often computationally unfeasible to apply exact statistical methodologies in the context of large data sets and complex models. This paper considers a nonlinear stochastic differential equation model observed with correlated measurement errors and an application to protein folding modelling. An approximate Bayesian computation (ABC)-MCMC algorithm is suggested to allow inference for model parameters within reasonable time constraints. The ABC algorithm ... applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter being two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large protein data set. The suggested methodology is fairly general ...

  5. ‘Sampling the reference set’ revisited

    NARCIS (Netherlands)

    Berkum, van E.E.M.; Linssen, H.N.; Overdijk, D.A.

    1998-01-01

    The confidence level of an inference table is defined as a weighted truth probability of the inference when sampling the reference set. The reference set is recognized by conditioning on the values of maximal partially ancillary statistics. In the sampling experiment values of incidental parameters

  6. Comparative Study of Inference Methods for Bayesian Nonnegative Matrix Factorisation

    DEFF Research Database (Denmark)

    Brouwer, Thomas; Frellsen, Jes; Liò, Pietro

    2017-01-01

    In this paper, we study the trade-offs of different inference approaches for Bayesian matrix factorisation methods, which are commonly used for predicting missing values, and for finding patterns in the data. In particular, we consider Bayesian nonnegative variants of matrix factorisation and tri-factorisation, and compare non-probabilistic inference, Gibbs sampling, variational Bayesian inference, and a maximum-a-posteriori approach. The variational approach is new for the Bayesian nonnegative models. We compare their convergence, and robustness to noise and sparsity of the data, on both synthetic and real...

  7. Surrogate based approaches to parameter inference in ocean models

    KAUST Repository

    Knio, Omar

    2016-01-06

    This talk discusses the inference of physical parameters using model surrogates. Attention is focused on the use of sampling schemes to build suitable representations of the dependence of the model response on uncertain input data. Non-intrusive spectral projections and regularized regressions are used for this purpose. A Bayesian inference formalism is then applied to update the uncertain inputs based on available measurements or observations. To perform the update, we consider two alternative approaches, based on the application of Markov Chain Monte Carlo methods or of adjoint-based optimization techniques. We outline the implementation of these techniques to infer dependence of wind drag, bottom drag, and internal mixing coefficients.
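    The abstract above stays at the method level; a compressed toy version of the two-step idea (fit a cheap surrogate to a handful of forward-model runs, then run MCMC against the surrogate likelihood) is sketched below with a made-up scalar "model" and one synthetic observation, standing in for wind-drag or bottom-drag inference against real ocean data.

```python
import numpy as np

# expensive forward model (stand-in for an ocean model run): response vs a drag coefficient
def forward_model(c_drag):
    return np.tanh(c_drag) + 0.5 * c_drag

# 1) build a cheap surrogate from a small design of "simulations" (polynomial regression)
design = np.linspace(0.0, 2.0, 15)
runs = forward_model(design)
coeffs = np.polyfit(design, runs, deg=5)
def surrogate(c):
    return np.polyval(coeffs, c)

# 2) Bayesian update of the uncertain input from one noisy observation, via Metropolis MCMC
rng_obs = np.random.default_rng(1)
obs = forward_model(1.3) + 0.05 * rng_obs.standard_normal()   # synthetic measurement
obs_sigma = 0.05

def log_post(c):
    if not 0.0 <= c <= 2.0:                 # uniform prior on the admissible range
        return -np.inf
    return -0.5 * ((obs - surrogate(c)) / obs_sigma) ** 2

rng = np.random.default_rng(2)
c, chain = 1.0, []
for _ in range(20_000):
    prop = c + 0.1 * rng.standard_normal()
    if np.log(rng.random()) < log_post(prop) - log_post(c):
        c = prop
    chain.append(c)
print("posterior mean drag coefficient:", np.mean(chain[5_000:]))
```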

  8. Surrogate based approaches to parameter inference in ocean models

    KAUST Repository

    Knio, Omar

    2016-01-01

    This talk discusses the inference of physical parameters using model surrogates. Attention is focused on the use of sampling schemes to build suitable representations of the dependence of the model response on uncertain input data. Non-intrusive spectral projections and regularized regressions are used for this purpose. A Bayesian inference formalism is then applied to update the uncertain inputs based on available measurements or observations. To perform the update, we consider two alternative approaches, based on the application of Markov Chain Monte Carlo methods or of adjoint-based optimization techniques. We outline the implementation of these techniques to infer dependence of wind drag, bottom drag, and internal mixing coefficients.

  9. Inferring Population Size History from Large Samples of Genome-Wide Molecular Data - An Approximate Bayesian Computation Approach.

    Directory of Open Access Journals (Sweden)

    Simon Boitard

    2016-03-01

    Full Text Available Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.

  10. A linear programming model for protein inference problem in shotgun proteomics.

    Science.gov (United States)

    Huang, Ting; He, Zengyou

    2012-11-15

    Assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is an important issue in shotgun proteomics. The objective of protein inference is to find a subset of proteins that are truly present in the sample. Although many methods have been proposed for protein inference, several issues such as peptide degeneracy still remain unsolved. In this article, we present a linear programming model for protein inference. In this model, we use a transformation of the joint probability that each peptide/protein pair is present in the sample as the variable. Then, both the peptide probability and protein probability can be expressed as a formula in terms of the linear combination of these variables. Based on this simple fact, the protein inference problem is formulated as an optimization problem: minimize the number of proteins with non-zero probabilities under the constraint that the difference between the calculated peptide probability and the peptide probability generated from peptide identification algorithms should be less than some threshold. This model addresses the peptide degeneracy issue by forcing some joint probability variables involving degenerate peptides to be zero in a rigorous manner. The corresponding inference algorithm is named as ProteinLP. We test the performance of ProteinLP on six datasets. Experimental results show that our method is competitive with the state-of-the-art protein inference algorithms. The source code of our algorithm is available at: https://sourceforge.net/projects/prolp/. zyhe@dlut.edu.cn. Supplementary data are available at Bioinformatics Online.
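    ProteinLP's actual formulation works with joint peptide/protein probability variables; the toy linear program below is a simplification in the same spirit, minimising the summed protein scores (an L1 surrogate for the number of proteins with non-zero probability) subject to each peptide's probability being approximately reproduced by its parent proteins. The peptide map, probabilities and tolerance delta are all made up.

```python
import numpy as np
from scipy.optimize import linprog

# toy bipartite peptide-protein map and peptide probabilities from a search engine
proteins = ["A", "B", "C"]
peptides = {"pep1": (["A"], 0.95),           # peptide -> (parent proteins, probability)
            "pep2": (["A", "B"], 0.90),      # degenerate peptide shared by A and B
            "pep3": (["C"], 0.10)}
delta = 0.05                                  # allowed mismatch in peptide probability

# variables: one presence score y_j in [0, 1] per protein; objective: minimise sum(y)
c = np.ones(len(proteins))
A_ub, b_ub = [], []
for parents, prob in peptides.values():
    row = np.array([1.0 if p in parents else 0.0 for p in proteins])
    A_ub.append(row)           #  sum_j y_j <= p_i + delta
    b_ub.append(prob + delta)
    A_ub.append(-row)          # -sum_j y_j <= -(p_i - delta)
    b_ub.append(-(prob - delta))

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(0.0, 1.0)] * len(proteins), method="highs")
print(dict(zip(proteins, np.round(res.x, 3))))   # protein A kept, B explained away
```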

  11. Perceptual Decision-Making as Probabilistic Inference by Neural Sampling.

    Science.gov (United States)

    Haefner, Ralf M; Berkes, Pietro; Fiser, József

    2016-05-04

    We address two main challenges facing systems neuroscience today: understanding the nature and function of cortical feedback between sensory areas and of correlated variability. Starting from the old idea of perception as probabilistic inference, we show how to use knowledge of the psychophysical task to make testable predictions for the influence of feedback signals on early sensory representations. Applying our framework to a two-alternative forced choice task paradigm, we can explain multiple empirical findings that have been hard to account for by the traditional feedforward model of sensory processing, including the task dependence of neural response correlations and the diverging time courses of choice probabilities and psychophysical kernels. Our model makes new predictions and characterizes a component of correlated variability that represents task-related information rather than performance-degrading noise. It demonstrates a normative way to integrate sensory and cognitive components into physiologically testable models of perceptual decision-making. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. Causal inference of asynchronous audiovisual speech

    Directory of Open Access Journals (Sweden)

    John F Magnotti

    2013-11-01

    Full Text Available During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.

  13. Sensorimotor Network Crucial for Inferring Amusement from Smiles.

    Science.gov (United States)

    Paracampo, Riccardo; Tidoni, Emmanuele; Borgomaneri, Sara; di Pellegrino, Giuseppe; Avenanti, Alessio

    2017-11-01

    Understanding whether another's smile reflects authentic amusement is a key challenge in social life, yet, the neural bases of this ability have been largely unexplored. Here, we combined transcranial magnetic stimulation (TMS) with a novel empathic accuracy (EA) task to test whether sensorimotor and mentalizing networks are critical for understanding another's amusement. Participants were presented with dynamic displays of smiles and explicitly requested to infer whether the smiling individual was feeling authentic amusement or not. TMS over sensorimotor regions representing the face (i.e., in the inferior frontal gyrus (IFG) and ventral primary somatosensory cortex (SI)), disrupted the ability to infer amusement authenticity from observed smiles. The same stimulation did not affect performance on a nonsocial task requiring participants to track the smiling expression but not to infer amusement. Neither TMS over prefrontal and temporo-parietal areas supporting mentalizing, nor peripheral control stimulations, affected performance on either task. Thus, motor and somatosensory circuits for controlling and sensing facial movements are causally essential for inferring amusement from another's smile. These findings highlight the functional relevance of IFG and SI to amusement understanding and suggest that EA abilities may be grounded in sensorimotor networks for moving and feeling the body. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  14. Functional networks inference from rule-based machine learning models.

    Science.gov (United States)

    Lazzarini, Nicola; Widera, Paweł; Williamson, Stuart; Heer, Rakesh; Krasnogor, Natalio; Bacardit, Jaume

    2016-01-01

    Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables that are often different from, or complementary to, those found by similarity-based methods. We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes that genes used together within a rule-based machine learning model to classify the samples might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in a context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. The

  15. Inference and learning in sparse systems with multiple states

    International Nuclear Information System (INIS)

    Braunstein, A.; Ramezanpour, A.; Zhang, P.; Zecchina, R.

    2011-01-01

    We discuss how inference can be performed when data are sampled from the nonergodic phase of systems with multiple attractors. We take as a model system the finite connectivity Hopfield model in the memory phase and suggest a cavity method approach to reconstruct the couplings when the data are separately sampled from few attractor states. We also show how the inference results can be converted into a learning protocol for neural networks in which patterns are presented through weak external fields. The protocol is simple and fully local, and is able to store patterns with a finite overlap with the input patterns without ever reaching a spin-glass phase where all memories are lost.

  16. Bayesian inference from count data using discrete uniform priors.

    Directory of Open Access Journals (Sweden)

    Federico Comoglio

    Full Text Available We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing an homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.
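    The dupiR implementation referred to above is an R package; a minimal Python rendering of the same kind of posterior, a binomial likelihood for the number of objects counted in a known sampled fraction of the volume combined with a discrete uniform prior on the population size, looks roughly like this (the count, sampling fraction and prior range are invented).

```python
import numpy as np
from scipy.stats import binom

def posterior_population_size(k, f, n_max):
    """Posterior over population size N given k objects counted in a sampled
    fraction f of the volume, with a discrete uniform prior on N."""
    n = np.arange(k, n_max + 1)              # N cannot be smaller than the count
    like = binom.pmf(k, n, f)                # each object is sampled with probability f
    post = like / like.sum()                 # discrete uniform prior cancels out
    return n, post

n, post = posterior_population_size(k=12, f=0.05, n_max=2000)
mean_N = np.sum(n * post)
cdf = np.cumsum(post)
ci = (n[np.searchsorted(cdf, 0.025)], n[np.searchsorted(cdf, 0.975)])
print("posterior mean:", round(mean_N, 1), " 95% credible interval:", ci)
```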

  17. Magnetic response and critical current properties of mesoscopic-size YBCO superconducting samples

    International Nuclear Information System (INIS)

    Lisboa-Filho, P N; Deimling, C V; Ortiz, W A

    2010-01-01

    In this contribution superconducting specimens of YBa2Cu3O7-δ were synthesized by a modified polymeric precursor method, yielding a ceramic powder with particles of mesoscopic-size. Samples of this powder were then pressed into pellets and sintered under different conditions. The critical current density was analyzed by isothermal AC-susceptibility measurements as a function of the excitation field, as well as with isothermal DC-magnetization runs at different values of the applied field. Relevant features of the magnetic response could be associated to the microstructure of the specimens and, in particular, to the superconducting intra- and intergranular critical current properties.

  18. Magnetic response and critical current properties of mesoscopic-size YBCO superconducting samples

    Energy Technology Data Exchange (ETDEWEB)

    Lisboa-Filho, P N [UNESP - Universidade Estadual Paulista, Grupo de Materiais Avancados, Departamento de Fisica, Bauru (Brazil); Deimling, C V; Ortiz, W A, E-mail: plisboa@fc.unesp.b [Grupo de Supercondutividade e Magnetismo, Departamento de Fisica, Universidade Federal de Sao Carlos, Sao Carlos (Brazil)

    2010-01-15

    In this contribution superconducting specimens of YBa{sub 2}Cu{sub 3}O{sub 7-{delta}} were synthesized by a modified polymeric precursor method, yielding a ceramic powder with particles of mesoscopic-size. Samples of this powder were then pressed into pellets and sintered under different conditions. The critical current density was analyzed by isothermal AC-susceptibility measurements as a function of the excitation field, as well as with isothermal DC-magnetization runs at different values of the applied field. Relevant features of the magnetic response could be associated to the microstructure of the specimens and, in particular, to the superconducting intra- and intergranular critical current properties.

  19. Bootstrap inference when using multiple imputation.

    Science.gov (United States)

    Schomaker, Michael; Heumann, Christian

    2018-04-16

    Many modern estimators require bootstrapping to calculate confidence intervals because either no analytic standard error is available or the distribution of the parameter of interest is nonsymmetric. It remains however unclear how to obtain valid bootstrap inference when dealing with multiple imputation to address missing data. We present 4 methods that are intuitively appealing, easy to implement, and combine bootstrap estimation with multiple imputation. We show that 3 of the 4 approaches yield valid inference, but that the performance of the methods varies with respect to the number of imputed data sets and the extent of missingness. Simulation studies reveal the behavior of our approaches in finite samples. A topical analysis from HIV treatment research, which determines the optimal timing of antiretroviral treatment initiation in young children, demonstrates the practical implications of the 4 methods in a sophisticated and realistic setting. This analysis suffers from missing data and uses the g-formula for inference, a method for which no standard errors are available. Copyright © 2018 John Wiley & Sons, Ltd.

  20. Bayesian pedigree inference with small numbers of single nucleotide polymorphisms via a factor-graph representation.

    Science.gov (United States)

    Anderson, Eric C; Ng, Thomas C

    2016-02-01

    We develop a computational framework for addressing pedigree inference problems using small numbers (80-400) of single nucleotide polymorphisms (SNPs). Our approach relaxes the assumptions, which are commonly made, that sampling is complete with respect to the pedigree and that there is no genotyping error. It relies on representing the inferred pedigree as a factor graph and invoking the Sum-Product algorithm to compute and store quantities that allow the joint probability of the data to be rapidly computed under a large class of rearrangements of the pedigree structure. This allows efficient MCMC sampling over the space of pedigrees, and, hence, Bayesian inference of pedigree structure. In this paper we restrict ourselves to inference of pedigrees without loops using SNPs assumed to be unlinked. We present the methodology in general for multigenerational inference, and we illustrate the method by applying it to the inference of full sibling groups in a large sample (n=1157) of Chinook salmon typed at 95 SNPs. The results show that our method provides a better point estimate and estimate of uncertainty than the currently best-available maximum-likelihood sibling reconstruction method. Extensions of this work to more complex scenarios are briefly discussed. Published by Elsevier Inc.

  1. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining

    Science.gov (United States)

    Hero, Alfred O.; Rajaratnam, Bala

    2015-01-01

    When can reliable inference be drawn in the “Big Data” context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for “Big Data”. Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exascale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks. PMID:27087700

  2. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining.

    Science.gov (United States)

    Hero, Alfred O; Rajaratnam, Bala

    2016-01-01

    When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for "Big Data". Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exascale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.

  3. A simple technique for measuring the superconducting critical temperature of small (>= 10 μg) samples

    International Nuclear Information System (INIS)

    Pereira, R.F.R.; Meyer, E.; Silveira, M.F. da.

    1983-01-01

    A simple technique for measuring the superconducting critical temperature of small (>= 10 μg) samples is described. The apparatus is built in the form of a probe, which can be introduced directly into a liquid He storage dewar and permits the determination of the critical temperature, with an imprecision of ± 0.05 K above 4.2 K, in about 10 minutes. (Author) [pt

  4. In-depth analysis of protein inference algorithms using multiple search engines and well-defined metrics.

    Science.gov (United States)

    Audain, Enrique; Uszkoreit, Julian; Sachsenberg, Timo; Pfeuffer, Julianus; Liang, Xiao; Hermjakob, Henning; Sanchez, Aniel; Eisenacher, Martin; Reinert, Knut; Tabb, David L; Kohlbacher, Oliver; Perez-Riverol, Yasset

    2017-01-06

    In mass spectrometry-based shotgun proteomics, protein identifications are usually the desired result. However, most of the analytical methods are based on the identification of reliable peptides and not the direct identification of intact proteins. Thus, assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is a critical step in proteomics research. Currently, different protein inference algorithms and tools are available for the proteomics community. Here, we evaluated five software tools for protein inference (PIA, ProteinProphet, Fido, ProteinLP, MSBayesPro) using three popular database search engines: Mascot, X!Tandem, and MS-GF+. All the algorithms were evaluated using a highly customizable KNIME workflow using four different public datasets with varying complexities (different sample preparation, species and analytical instruments). We defined a set of quality control metrics to evaluate the performance of each combination of search engines, protein inference algorithm, and parameters on each dataset. We show that the results for complex samples vary not only regarding the actual numbers of reported protein groups but also concerning the actual composition of groups. Furthermore, the robustness of reported proteins when using databases of differing complexities is strongly dependent on the applied inference algorithm. Finally, merging the identifications of multiple search engines does not necessarily increase the number of reported proteins, but does increase the number of peptides per protein and thus can generally be recommended. Protein inference is one of the major challenges in MS-based proteomics nowadays. Currently, there are a vast number of protein inference algorithms and implementations available for the proteomics community. Protein assembly impacts the final results of the research, the quantitation values, and the final claims in the research manuscript. Even though protein

  5. Behavior Intention Derivation of Android Malware Using Ontology Inference

    Directory of Open Access Journals (Sweden)

    Jian Jiao

    2018-01-01

    Full Text Available Previous research on Android malware has mainly focused on malware detection, and the continual evolution of malware means that detection inevitably lags behind. The information presented by these detection results (malice judgment, family classification, and behavior characterization) is limited for analysts. Therefore, a method is needed to restore the intention of malware, which reflects the relation between the multiple behaviors of complex malware and its ultimate purpose. This paper proposes a novel description and derivation model of Android malware intention based on the theory of intention and malware reverse engineering. This approach creates an ontology for malware intention to model the semantic relation between behaviors and their objects, and automates the process of intention derivation by using SWRL rules transformed from the intention model and the Jess inference engine. Experiments on 75 typical samples show that the inference system can perform derivation of malware intention effectively, and 89.3% of the inference results are consistent with manual analysis, which proves the feasibility and effectiveness of our theory and inference system.

  6. Entropic Inference

    Science.gov (United States)

    Caticha, Ariel

    2011-03-01

    In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops—the Maximum Entropy and the Bayesian methods—into a single general inference scheme.

  7. Inferring probabilistic stellar rotation periods using Gaussian processes

    Science.gov (United States)

    Angus, Ruth; Morton, Timothy; Aigrain, Suzanne; Foreman-Mackey, Daniel; Rajpaul, Vinesh

    2018-02-01

    Variability in the light curves of spotted, rotating stars is often non-sinusoidal and quasi-periodic - spots move on the stellar surface and have finite lifetimes, causing stellar flux variations to slowly shift in phase. A strictly periodic sinusoid therefore cannot accurately model a rotationally modulated stellar light curve. Physical models of stellar surfaces have many drawbacks preventing effective inference, such as highly degenerate or high-dimensional parameter spaces. In this work, we test an appropriate effective model: a Gaussian Process with a quasi-periodic covariance kernel function. This highly flexible model allows sampling of the posterior probability density function of the periodic parameter, marginalizing over the other kernel hyperparameters using a Markov Chain Monte Carlo approach. To test the effectiveness of this method, we infer rotation periods from 333 simulated stellar light curves, demonstrating that the Gaussian process method produces periods that are more accurate than both a sine-fitting periodogram and an autocorrelation function method. We also demonstrate that it works well on real data, by inferring rotation periods for 275 Kepler stars with previously measured periods. We provide a table of rotation periods for these and many more, altogether 1102 Kepler objects of interest, and their posterior probability density function samples. Because this method delivers posterior probability density functions, it will enable hierarchical studies involving stellar rotation, particularly those involving population modelling, such as inferring stellar ages, obliquities in exoplanet systems, or characterizing star-planet interactions. The code used to implement this method is available online.
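    A toy version of the key ingredient, a quasi-periodic covariance kernel whose marginal likelihood is evaluated as a function of the period, is sketched below on simulated data. The kernel form and hyperparameter values are generic choices, not the paper's, and a simple grid search over the period stands in for the MCMC marginalisation over the other hyperparameters.

```python
import numpy as np

def quasi_periodic_kernel(t1, t2, amp, l_evol, gamma, period):
    """Quasi-periodic covariance: squared-exponential decay times a periodic term."""
    dt = t1[:, None] - t2[None, :]
    return amp ** 2 * np.exp(-dt ** 2 / (2 * l_evol ** 2)
                             - gamma * np.sin(np.pi * dt / period) ** 2)

def log_marginal_likelihood(t, y, yerr, period, amp=1.0, l_evol=30.0, gamma=2.0):
    K = quasi_periodic_kernel(t, t, amp, l_evol, gamma, period)
    K[np.diag_indices_from(K)] += yerr ** 2
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * len(t) * np.log(2 * np.pi)

# toy spotted-star light curve with an injected 12.3 day rotation period
rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0, 90, 300))
y = np.sin(2 * np.pi * t / 12.3) * np.exp(-t / 60) + 0.1 * rng.standard_normal(t.size)

periods = np.linspace(5, 30, 200)
lml = [log_marginal_likelihood(t, y, 0.1 * np.ones_like(t), p) for p in periods]
print("best period (days):", periods[int(np.argmax(lml))])   # often recovers ~12.3
```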

  8. Sparse Bayesian Inference and the Temperature Structure of the Solar Corona

    Energy Technology Data Exchange (ETDEWEB)

    Warren, Harry P. [Space Science Division, Naval Research Laboratory, Washington, DC 20375 (United States); Byers, Jeff M. [Materials Science and Technology Division, Naval Research Laboratory, Washington, DC 20375 (United States); Crump, Nicholas A. [Naval Center for Space Technology, Naval Research Laboratory, Washington, DC 20375 (United States)

    2017-02-20

    Measuring the temperature structure of the solar atmosphere is critical to understanding how it is heated to high temperatures. Unfortunately, the temperature of the upper atmosphere cannot be observed directly, but must be inferred from spectrally resolved observations of individual emission lines that span a wide range of temperatures. Such observations are “inverted” to determine the distribution of plasma temperatures along the line of sight. This inversion is ill posed and, in the absence of regularization, tends to produce wildly oscillatory solutions. We introduce the application of sparse Bayesian inference to the problem of inferring the temperature structure of the solar corona. Within a Bayesian framework a preference for solutions that utilize a minimum number of basis functions can be encoded into the prior and many ad hoc assumptions can be avoided. We demonstrate the efficacy of the Bayesian approach by considering a test library of 40 assumed temperature distributions.

  9. Inference in Complex Systems Using Multi-Phase MCMC Sampling With Gradient Matching Burn-in

    OpenAIRE

    Lazarus, Alan; Husmeier, Dirk; Papamarkou, Theodore

    2017-01-01

    We propose a novel method for parameter inference that builds on the current research in gradient matching surrogate likelihood spaces. Adopting a three phase technique, we demonstrate that it is possible to obtain parameter estimates of limited bias whilst still adopting the paradigm of the computationally cheap surrogate approximation.

  10. Risk Attitudes, Sample Selection and Attrition in a Longitudinal Field Experiment

    DEFF Research Database (Denmark)

    Harrison, Glenn W.; Lau, Morten Igel

    ... with respect to risk attitudes. Our design builds in explicit randomization on the incentives for participation. We show that there are significant sample selection effects on inferences about the extent of risk aversion, but that the effects of subsequent sample attrition are minimal. Ignoring sample selection leads to inferences that subjects in the population are more risk averse than they actually are. Correcting for sample selection and attrition affects utility curvature, but does not affect inferences about probability weighting. Properly accounting for sample selection and attrition effects leads to findings of temporal stability in overall risk aversion. However, that stability is around different levels of risk aversion than one might naively infer without the controls for sample selection and attrition we are able to implement. This evidence of “randomization bias” from sample selection ...

  11. Nonparametric predictive inference for combined competing risks data

    International Nuclear Information System (INIS)

    Coolen-Maturi, Tahani; Coolen, Frank P.A.

    2014-01-01

    The nonparametric predictive inference (NPI) approach for competing risks data has recently been presented, in particular addressing the question due to which of the competing risks the next unit will fail, and also considering the effects of unobserved, re-defined, unknown or removed competing risks. In this paper, we introduce how the NPI approach can be used to deal with situations where units are not all at risk from all competing risks. This may typically occur if one combines information from multiple samples, which can, e.g. be related to further aspects of units that define the samples or groups to which the units belong or to different applications where the circumstances under which the units operate can vary. We study the effect of combining the additional information from these multiple samples, so effectively borrowing information on specific competing risks from other units, on the inferences. Such combination of information can be relevant to competing risks scenarios in a variety of application areas, including engineering and medical studies

  12. More than one kind of inference: re-examining what's learned in feature inference and classification.

    Science.gov (United States)

    Sweller, Naomi; Hayes, Brett K

    2010-08-01

    Three studies examined how task demands that impact on attention to typical or atypical category features shape the category representations formed through classification learning and inference learning. During training categories were learned via exemplar classification or by inferring missing exemplar features. In the latter condition inferences were made about missing typical features alone (typical feature inference) or about both missing typical and atypical features (mixed feature inference). Classification and mixed feature inference led to the incorporation of typical and atypical features into category representations, with both kinds of features influencing inferences about familiar (Experiments 1 and 2) and novel (Experiment 3) test items. Those in the typical inference condition focused primarily on typical features. Together with formal modelling, these results challenge previous accounts that have characterized inference learning as producing a focus on typical category features. The results show that two different kinds of inference learning are possible and that these are subserved by different kinds of category representations.

  13. Perceptual inference.

    Science.gov (United States)

    Aggelopoulos, Nikolaos C

    2015-08-01

    Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. SEMANTIC PATCH INFERENCE

    DEFF Research Database (Denmark)

    Andersen, Jesper

    2009-01-01

    Collateral evolution is the problem of updating several library-using programs in response to API changes in the library they use. In this dissertation we address the issue of understanding collateral evolutions by automatically inferring a high-level specification of the changes evident in a given set ... specifications inferred by spdiff in Linux are shown. We find that the inferred specifications concisely capture the actual collateral evolution performed in the examples.

  15. Distribution of local critical current along sample length and its relation to overall current in a long Bi2223/Ag superconducting composite tape

    International Nuclear Information System (INIS)

    Ochiai, S; Doko, D; Okuda, H; Oh, S S; Ha, D W

    2006-01-01

    The distribution of the local critical current and the n-value along the sample length and its relation to the overall critical current were studied experimentally and analytically for the bent multifilamentary Bi2223/Ag/Ag-Mg alloy superconducting composite tape. Then, based on these results, the dependence of the critical current on the sample length was simulated on a computer. The main results are summarized as follows. The experimentally observed relation of the distributed local critical current and n-value to the overall critical current was described comprehensively with a simple voltage summation model, in which the sample was regarded as a one-dimensional series circuit. The sample length dependence of the critical current was reproduced on the computer by a Monte Carlo simulation incorporating the voltage summation model and the regression analysis results for the local critical current distribution and the relation of the n-value to the critical current.
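    A minimal numerical rendering of that one-dimensional series-circuit (voltage summation) picture is given below: each segment follows a power-law E-J relation with its own sampled critical current and n-value, and the overall critical current is the current at which the length-averaged electric field reaches the usual 1 uV/cm criterion. The segment length and the distributions of the local parameters are assumed for illustration, not taken from the paper.

```python
import numpy as np
from scipy.optimize import brentq

Ec = 1.0e-4   # electric-field criterion, V/m (1 uV/cm)

def overall_ic(local_ic, local_n, seg_len=0.05):
    """Overall critical current of segments in series: the current at which the
    length-averaged electric field over the whole tape reaches the criterion Ec."""
    total_len = seg_len * len(local_ic)
    def excess(i):
        e_local = Ec * (i / local_ic) ** local_n          # power-law E-J relation per segment
        return np.sum(e_local * seg_len) / total_len - Ec
    return brentq(excess, 1e-6, local_ic.max() * 2)

rng = np.random.default_rng(5)
n_segments = 40                                            # e.g. a 2 m tape in 5 cm pieces
local_ic = rng.normal(30.0, 3.0, n_segments).clip(min=1.0) # A, assumed Gaussian scatter
local_n = rng.normal(15.0, 2.0, n_segments).clip(min=5.0)

print("mean local Ic :", local_ic.mean())
print("overall Ic    :", overall_ic(local_ic, local_n))    # limited by the weakest segments
```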

  16. Statistical inference for the lifetime performance index based on generalised order statistics from exponential distribution

    Science.gov (United States)

    Vali Ahmadi, Mohammad; Doostparast, Mahdi; Ahmadi, Jafar

    2015-04-01

    In manufacturing industries, the lifetime of an item is usually characterised by a random variable X and considered to be satisfactory if X exceeds a given lower lifetime limit L. The probability of a satisfactory item is then ηL := P(X ≥ L), called conforming rate. In industrial companies, however, the lifetime performance index, proposed by Montgomery and denoted by CL, is widely used as a process capability index instead of the conforming rate. Assuming a parametric model for the random variable X, we show that there is a connection between the conforming rate and the lifetime performance index. Consequently, the statistical inferences about ηL and CL are equivalent. Hence, we restrict ourselves to statistical inference for CL based on generalised order statistics, which contains several ordered data models such as usual order statistics, progressively Type-II censored data and records. Various point and interval estimators for the parameter CL are obtained and optimal critical regions for the hypothesis testing problems concerning CL are proposed. Finally, two real data-sets on the lifetimes of insulating fluid and ball bearings, due to Nelson (1982) and Caroni (2002), respectively, and a simulated sample are analysed.
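
    For the simplest special case touched on in this record, a complete (uncensored) sample from an exponential lifetime distribution, the link between the conforming rate and the lifetime performance index can be shown in a few lines. The sketch below assumes that Montgomery's index reduces to CL = 1 - L/theta for an exponential distribution with mean theta, so that the conforming rate is exp(CL - 1); the data and lower limit L are simulated for illustration, and the generalised-order-statistics machinery of the record is not reproduced.

        import numpy as np

        rng = np.random.default_rng(0)

        theta_true = 2.0                             # true mean lifetime (illustrative)
        L = 0.5                                      # lower lifetime limit
        x = rng.exponential(theta_true, size=50)     # simulated complete sample

        theta_hat = x.mean()                         # MLE of the exponential mean
        CL_hat = 1.0 - L / theta_hat                 # lifetime performance index estimate
        eta_hat = np.exp(CL_hat - 1.0)               # conforming rate P(X >= L) = exp(-L/theta)

        print(f"theta_hat = {theta_hat:.3f}, C_L = {CL_hat:.3f}, conforming rate = {eta_hat:.3f}")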

  17. Graphical models for inference under outcome-dependent sampling

    DEFF Research Database (Denmark)

    Didelez, V; Kreiner, S; Keiding, N

    2010-01-01

    We consider situations where data have been collected such that the sampling depends on the outcome of interest and possibly further covariates, as for instance in case-control studies. Graphical models represent assumptions about the conditional independencies among the variables. By including...... a node for the sampling indicator, assumptions about sampling processes can be made explicit. We demonstrate how to read off such graphs whether consistent estimation of the association between exposure and outcome is possible. Moreover, we give sufficient graphical conditions for testing and estimating......

  18. Processing of Scalar Inferences by Mandarin Learners of English: An Online Measure.

    Directory of Open Access Journals (Sweden)

    Yowyu Lin

    Full Text Available Scalar inferences arise when a speaker uses a weaker expression such as some from a pragmatic scale like ⟨some, all⟩, with the intention of rejecting the stronger alternative, all. Considerable disagreement has arisen concerning how interlocutors derive these inferences. The study presented here addresses this issue by examining online scalar inferences among Mandarin learners of English. To date, Default Inference and Relevance Theory have made different predictions regarding how people process scalar inferences. Findings from recently emerging first-language studies did not fully resolve the debate but led to even more heated debates. The three online psycholinguistic experiments reported here address the processing of scalar inferences from a second-language perspective. Results showed that Mandarin learners of English had faster reaction times and a higher acceptance rate when interpreting some as some but not all, and this was true even when subjects were under time pressure, as shown in Experiment 2. Overall, the results of the experiments supported Default Theory. In addition, Experiment 3 found that working memory capacity plays a critical role during scalar inference processing. High-span readers were faster in accepting the some but not all interpretation than low-span readers. However, compared with low-span readers, high-span readers were more likely to accept the some and possibly all condition, possibly owing to their capacity to generate scenarios that fit the interpretation.

  19. Modeling Well Sampled Composite Spectral Energy Distributions of Distant Galaxies via an MCMC-driven Inference Framework

    Science.gov (United States)

    Pasha, Imad; Kriek, Mariska; Johnson, Benjamin; Conroy, Charlie

    2018-01-01

    Using a novel, MCMC-driven inference framework, we have modeled the stellar and dust emission of 32 composite spectral energy distributions (SEDs), which span from the near-ultraviolet (NUV) to the far-infrared (FIR). The composite SEDs were originally constructed in a previous work from the photometric catalogs of the NEWFIRM Medium-Band Survey, in which SEDs of individual galaxies at 0.5 MIPS 24 μm was added for each SED type, and in this work PACS 100 μm, PACS 160 μm, SPIRE 250 μm, and SPIRE 350 μm photometry have been added to extend the range of the composite SEDs into the FIR. We fit the composite SEDs with the Prospector code, which uses MCMC sampling to explore the parameter space of models created by the Flexible Stellar Population Synthesis (FSPS) code, in order to investigate how specific star formation rate (sSFR), dust temperature, and other galaxy properties vary with SED type. This work is also being used to better constrain the SPS models within FSPS.

  20. Multimodel inference and adaptive management

    Science.gov (United States)

    Rehme, S.E.; Powell, L.A.; Allen, Craig R.

    2011-01-01

    Ecology is an inherently complex science coping with correlated variables, nonlinear interactions and multiple scales of pattern and process, making it difficult for experiments to result in clear, strong inference. Natural resource managers, policy makers, and stakeholders rely on science to provide timely and accurate management recommendations. However, the time necessary to untangle the complexities of interactions within ecosystems is often far greater than the time available to make management decisions. One method of coping with this problem is multimodel inference. Multimodel inference assesses uncertainty by calculating likelihoods among multiple competing hypotheses, but multimodel inference results are often equivocal. Despite this, there may be pressure for ecologists to provide management recommendations regardless of the strength of their study’s inference. We reviewed papers in the Journal of Wildlife Management (JWM) and the journal Conservation Biology (CB) to quantify the prevalence of multimodel inference approaches, the resulting inference (weak versus strong), and how authors dealt with the uncertainty. Thirty-eight percent and 14%, respectively, of articles in the JWM and CB used multimodel inference approaches. Strong inference was rarely observed, with only 7% of JWM and 20% of CB articles resulting in strong inference. We found the majority of weak inference papers in both journals (59%) gave specific management recommendations. Model selection uncertainty was ignored in most recommendations for management. We suggest that adaptive management is an ideal method to resolve uncertainty when research results in weak inference.
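
    Multimodel inference of the kind discussed here is usually operationalized with information-theoretic model weights. The sketch below uses made-up AIC scores for four hypothetical competing models and computes Akaike weights; a flat weight profile corresponds to the "weak inference" situation the record describes, where model-averaged prediction or adaptive management is preferable to picking a single model.

        import numpy as np

        # Hypothetical AIC scores for four competing models of the same data set
        aic = np.array([210.3, 211.1, 214.8, 219.6])

        delta = aic - aic.min()              # AIC differences relative to the best model
        weights = np.exp(-0.5 * delta)
        weights /= weights.sum()             # Akaike weights (relative model support)

        for i, (d, w) in enumerate(zip(delta, weights), start=1):
            print(f"model {i}: dAIC = {d:5.1f}, weight = {w:.2f}")

        # If no single weight dominates, inference is weak and model-averaged
        # predictions are safer than recommendations based on one model.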

  1. Developing Critical Thinking with Debate: Evidence from Iranian Male and Female Students

    Directory of Open Access Journals (Sweden)

    Maryam Danaye Tous

    2016-03-01

    Full Text Available The purpose of this study was to investigate the difference between the performance of Iranian male and female EFL learners on the five dimensions of the California Critical Thinking Skills Test. 88 learners, out of 120, selected through a convenience sampling method, participated in this study. The researcher used a quantitative research method with a one-group pretest-posttest design. This group received treatment in the form of the “Meeting-House Debate” strategy. Data analysis was done using descriptive and inferential statistics. Results showed that there was no significant difference in the performance of males and females on the sub-scales measured, i.e. evaluation, analysis, inference, deductive reasoning, and inductive reasoning. It was concluded that gender did not have a significant effect on the students’ critical thinking skills.

  2. Quantum Enhanced Inference in Markov Logic Networks.

    Science.gov (United States)

    Wittek, Peter; Gogolin, Christian

    2017-04-19

    Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.
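
    The Gibbs sampling step mentioned in this record can be illustrated on a tiny ground Markov network. The sketch below assumes a pairwise model over ±1 variables with invented couplings J and biases h (not an MLN produced by any particular template); it repeatedly resamples each variable from its conditional distribution and estimates marginals from the retained sweeps.

        import numpy as np

        rng = np.random.default_rng(1)

        # Illustrative pairwise Markov network over 4 binary (+/-1) variables:
        # J holds symmetric pairwise couplings, h the single-node biases.
        J = np.array([[0.0, 0.8, 0.0, 0.2],
                      [0.8, 0.0, 0.5, 0.0],
                      [0.0, 0.5, 0.0, 0.9],
                      [0.2, 0.0, 0.9, 0.0]])
        h = np.array([0.1, -0.2, 0.0, 0.3])

        def gibbs(n_sweeps=5000, burn_in=500):
            x = rng.choice([-1, 1], size=4)
            samples = []
            for sweep in range(n_sweeps):
                for i in range(4):
                    field = h[i] + J[i] @ x - J[i, i] * x[i]   # exclude self-coupling (zero here)
                    p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))
                    x[i] = 1 if rng.random() < p_plus else -1
                if sweep >= burn_in:
                    samples.append(x.copy())
            return np.array(samples)

        marginals = (gibbs() == 1).mean(axis=0)
        print("P(x_i = +1) estimates:", np.round(marginals, 3))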

  3. Quantum Enhanced Inference in Markov Logic Networks

    Science.gov (United States)

    Wittek, Peter; Gogolin, Christian

    2017-04-01

    Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.

  4. Bootstrap-based Support of HGT Inferred by Maximum Parsimony

    Directory of Open Access Journals (Sweden)

    Nakhleh Luay

    2010-05-01

    Full Text Available Background: Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. Results: In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. Conclusions: We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/), and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution.

  5. Bootstrap-based support of HGT inferred by maximum parsimony.

    Science.gov (United States)

    Park, Hyun Jung; Jin, Guohua; Nakhleh, Luay

    2010-05-05

    Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/), and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution.
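
    The bootstrap procedure described in these two records is easy to sketch generically. In the code below, infer_events is a hypothetical stand-in for the network-inference step (for example, the maximum parsimony search); only the column-resampling and support-tallying logic of the record is illustrated, under the assumption that the inference step returns a set of hashable event identifiers.

        import numpy as np

        def bootstrap_support(alignment, infer_events, n_replicates=100, seed=0):
            """Resample alignment columns with replacement, rerun the inference on each
            pseudo-alignment, and report how often each inferred event reappears.

            `alignment` is a (taxa x sites) array of characters; `infer_events` is a
            user-supplied function (a stand-in here) returning a set of hashable events,
            e.g. reticulation edges identified by their donor/recipient pair.
            """
            rng = np.random.default_rng(seed)
            n_sites = alignment.shape[1]
            counts = {}
            for _ in range(n_replicates):
                cols = rng.integers(0, n_sites, size=n_sites)      # resample columns
                events = infer_events(alignment[:, cols])
                for e in events:
                    counts[e] = counts.get(e, 0) + 1
            return {e: c / n_replicates for e, c in counts.items()}  # support in [0, 1]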

  6. Inferring Human Mobility from Sparse Low Accuracy Mobile Sensing Data

    DEFF Research Database (Denmark)

    Cuttone, Andrea; Jørgensen, Sune Lehmann; Larsen, Jakob Eg

    2014-01-01

    Understanding both collective and personal human mobility is a central topic in Computational Social Science. Smartphone sensing data is emerging as a promising source for studying human mobility. However, most literature focuses on high-precision GPS positioning and high-frequency sampling, which...... is not always feasible in a longitudinal study or for everyday applications because location sensing has a high battery cost. In this paper we study the feasibility of inferring human mobility from sparse, low accuracy mobile sensing data. We validate our results using participants' location diaries......, and analyze the inferred geographical networks, the time spent at different places, and the number of unique places over time. Our results suggest that low resolution data allows accurate inference of human mobility patterns....

  7. Models for probability and statistical inference theory and applications

    CERN Document Server

    Stapleton, James H

    2007-01-01

    This concise, yet thorough, book is enhanced with simulations and graphs to build the intuition of readers. Models for Probability and Statistical Inference was written over a five-year period and serves as a comprehensive treatment of the fundamentals of probability and statistical inference. With detailed theoretical coverage found throughout the book, readers acquire the fundamentals needed to advance to more specialized topics, such as sampling, linear models, design of experiments, statistical computing, survival analysis, and bootstrapping. Ideal as a textbook for a two-semester sequence on probability and statistical inference, early chapters provide coverage on probability and include discussions of: discrete models and random variables; discrete distributions including binomial, hypergeometric, geometric, and Poisson; continuous, normal, gamma, and conditional distributions; and limit theory. Since limit theory is usually the most difficult topic for readers to master, the author thoroughly discusses mo...

  8. Assessing school-aged children's inference-making: the effect of story test format in listening comprehension.

    Science.gov (United States)

    Freed, Jenny; Cain, Kate

    2017-01-01

    Comprehension is critical for classroom learning and educational success. Inferences are integral to good comprehension: successful comprehension requires the listener to generate local coherence inferences, which involve integrating information between clauses, and global coherence inferences, which involve integrating textual information with background knowledge to infer motivations, themes, etc. A central priority for the diagnosis of comprehension difficulties and our understanding of why these difficulties arise is the development of valid assessment instruments. We explored typically developing children's ability to make local and global coherence inferences using a novel assessment of listening comprehension. The aims were to determine whether children were more likely to make the target inferences when these were asked during story presentation versus after presentation of the story, and whether there were any age differences between conditions. Children in Years 3 (n = 29) and 5 (n = 31) listened to short stories presented either in a segmented format, in which questions to assess local and global coherence inferences were asked at specific points during story presentation, or in a whole format, when all the questions were asked after the story had been presented. There was developmental progression between age groups for both types of inference question. Children also scored higher on the global coherence inference questions than the local coherence inference questions. There was a benefit of the segmented format for younger children, particularly for the local inference questions. The results suggest that children are more likely to make target inferences if prompted during presentation of the story, and that this format is particularly facilitative for younger children and for local coherence inferences. This has implications for the design of comprehension assessments as well as for supporting children with comprehension difficulties in the classroom

  9. Critical Zone structure inferred from multiscale near surface geophysical and hydrological data across hillslopes at the Eel River CZO

    Science.gov (United States)

    Lee, S. S.; Rempe, D. M.; Holbrook, W. S.; Schmidt, L.; Hahm, W. J.; Dietrich, W. E.

    2017-12-01

    Except for boreholes and road cut, landslide, and quarry exposures, the subsurface structure of the critical zone (CZ) of weathered bedrock is relatively invisible and unmapped, yet this structure controls the short and long term fluxes of water and solutes. Non-invasive geophysical methods such as seismic refraction are widely applied to image the structure of the CZ at the hillslope scale. However, interpretations of such data are often limited due to heterogeneity and anisotropy contributed from fracturing, moisture content, and mineralogy on the seismic signal. We develop a quantitative framework for using seismic refraction tomography from intersecting geophysical surveys and hydrologic data obtained at the Eel River Critical Zone Observatory (ERCZO) in Northern California to help quantify the nature of subsurface structure across multiple hillslopes of varying topography in the area. To enhance our understanding of modeled velocity gradients and boundaries in relation to lithological properties, we compare refraction tomography results with borehole logs of nuclear magnetic resonance (NMR), gamma and neutron density, standard penetration testing, and observation drilling logs. We also incorporate laboratory scale rock characterization including mineralogical and elemental analyses as well as porosity and density measurements made via pycnometry, helium and mercury porosimetry, and laboratory scale NMR. We evaluate the sensitivity of seismically inferred saprolite-weathered bedrock and weathered-unweathered bedrock boundaries to various velocity and inversion parameters in relation with other macro scale processes such as gravitational and tectonic forces in influencing weathered bedrock velocities. Together, our sensitivity analyses and multi-method data comparison provide insight into the interpretation of seismic refraction tomography for the quantification of CZ structure and hydrologic dynamics.

  10. Optimal inference with suboptimal models: Addiction and active Bayesian inference

    Science.gov (United States)

    Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl

    2015-01-01

    When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321

  11. Relation of n-value to critical current for local sections and overall sample in a SmBCO coated conductor pulled in tension

    International Nuclear Information System (INIS)

    Ochiai, Shojiro; Okuda, Hiroshi; Nagano, Shinji; Sugano, Michinaka; Oh, Sang-Song; Ha, Hong-Soo; Osamura, Kozo

    2014-01-01

    Under application of tensile stress to a SmBCO (SmBa2Cu3O7-δ) coated conductor sample consisting of a series electric circuit of local sections, the relation of the voltage-current curves, critical currents and n-values of the sections to those of the overall sample was studied. The change in critical current and n-value with increasing applied stress differed from section to section owing to differences in the damage behaviour of the SmBCO layer among the sections. When the difference in the extent of damage among the sections was small, the voltages developed in all sections contributed to the voltage of the overall sample. In this case, the critical current and n-value of the overall sample were within the range of the highest and lowest values among the sections. On the other hand, when the damage in one section was far severer than that in the other sections, the voltage developed in the most severely damaged section largely determined the overall voltage, and hence the voltage-current curves of the most severely damaged section were almost the same as those of the overall sample. In this case, the critical current of the overall sample was slightly higher, and its n-value lower, than those of the most severely damaged section. Accordingly, the decrease in n-value with decreasing critical current was sharper for the overall sample than for the sections. This phenomenon was accounted for by the increase in shunting current at the cracked part at higher voltage in the most severely damaged section. (author)
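
    The critical current and n-value referred to in this record are conventionally extracted by fitting the power law V = Vc (I/Ic)^n to a measured voltage-current curve, with Vc set by the 1 μV/cm electric-field criterion. The sketch below runs the fit on synthetic data with invented values; it illustrates the standard definition, not the authors' analysis code.

        import numpy as np

        def fit_n_and_Ic(I, V, length_cm, Ec=1e-6):
            """Fit the power law V = Vc * (I / Ic)**n to a measured V-I curve and return
            (Ic, n). Vc = Ec * length is the voltage at the electric-field criterion
            (1 uV/cm by default). Only points with V > 0 are used in the log-log fit."""
            Vc = Ec * length_cm
            mask = V > 0
            slope, intercept = np.polyfit(np.log(I[mask]), np.log(V[mask]), 1)
            n = slope
            Ic = np.exp((np.log(Vc) - intercept) / slope)   # solve log Vc = intercept + n * log Ic
            return Ic, n

        # Synthetic illustration: a section with Ic = 150 A, n = 25, 10 cm long
        I = np.linspace(100, 170, 50)
        V = 1e-6 * 10.0 * (I / 150.0) ** 25
        Ic_hat, n_hat = fit_n_and_Ic(I, V, length_cm=10.0)
        print(f"Ic ~ {Ic_hat:.1f} A, n ~ {n_hat:.1f}")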

  12. Inference rule and problem solving

    Energy Technology Data Exchange (ETDEWEB)

    Goto, S

    1982-04-01

    Intelligent information processing refers to having man's intellectual activity carried out on a computer, with inference, rather than ordinary calculation, used as the basic operational mechanism of such information processing. Many inference rules are derived from syllogisms in formal logic. The problem of programming this inference function is referred to as problem solving. Although inference and problem solving are logically closely related, the calculation ability of current computers is at a low level when it comes to inferring. For clarifying the relation between inference and computers, nonmonotonic logic has been considered. The paper deals with the above topics. 16 references.

  13. Evaluation of Respondent-Driven Sampling

    Science.gov (United States)

    McCreesh, Nicky; Frost, Simon; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda Ndagire; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L; Maher, Dermot; Johnston, Lisa G; Sonnenberg, Pam; Copas, Andrew J; Hayes, Richard J; White, Richard G

    2012-01-01

    Background Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex-workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total-population data. Methods Total-population data on age, tribe, religion, socioeconomic status, sexual activity and HIV status were available on a population of 2402 male household-heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, employing current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). Results We recruited 927 household-heads. Full and small RDS samples were largely representative of the total population, but both samples under-represented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven-sampling statistical-inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven-sampling bootstrap 95% confidence intervals included the population proportion. Conclusions Respondent-driven sampling produced a generally representative sample of this well-connected non-hidden population. However, current respondent-driven-sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience-sampling

  14. Evaluation of respondent-driven sampling.

    Science.gov (United States)

    McCreesh, Nicky; Frost, Simon D W; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda N; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L; Maher, Dermot; Johnston, Lisa G; Sonnenberg, Pam; Copas, Andrew J; Hayes, Richard J; White, Richard G

    2012-01-01

    Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total population data. Total population data on age, tribe, religion, socioeconomic status, sexual activity, and HIV status were available on a population of 2402 male household heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, using current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). We recruited 927 household heads. Full and small RDS samples were largely representative of the total population, but both samples underrepresented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven sampling statistical inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven sampling bootstrap 95% confidence intervals included the population proportion. Respondent-driven sampling produced a generally representative sample of this well-connected nonhidden population. However, current respondent-driven sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling method, and caution is required
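
    One widely used respondent-driven-sampling estimator of the kind evaluated in these two records is the RDS-II (Volz-Heckathorn) estimator, which weights each recruit by the inverse of their reported network degree. The sketch below uses invented status and degree data purely to show the computation; it is not the estimator implementation used in the study.

        import numpy as np

        def rds_ii_estimate(trait, degree):
            """Volz-Heckathorn (RDS-II) estimator: each recruit is weighted by the
            inverse of their reported network degree, compensating for the fact
            that well-connected people are more likely to be recruited."""
            trait = np.asarray(trait, dtype=float)
            w = 1.0 / np.asarray(degree, dtype=float)
            return np.sum(w * trait) / np.sum(w)

        # Illustrative data: trait status (1/0) and reported degree for ten recruits
        status = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1]
        degree = [12, 3, 5, 20, 4, 15, 2, 6, 3, 25]
        print(f"naive sample proportion : {np.mean(status):.2f}")
        print(f"RDS-II estimate         : {rds_ii_estimate(status, degree):.2f}")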

  15. Knowledge and inference

    CERN Document Server

    Nagao, Makoto

    1990-01-01

    Knowledge and Inference discusses an important problem for software systems: How do we treat knowledge and ideas on a computer and how do we use inference to solve problems on a computer? The book talks about the problems of knowledge and inference for the purpose of merging artificial intelligence and library science. The book begins by clarifying the concept of "knowledge" from many points of view, followed by a chapter on the current state of library science and the place of artificial intelligence in library science. Subsequent chapters cover central topics in the artificial intellig

  16. Sparse linear models: Variational approximate inference and Bayesian experimental design

    International Nuclear Information System (INIS)

    Seeger, Matthias W

    2009-01-01

    A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attention has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.

  17. Sparse linear models: Variational approximate inference and Bayesian experimental design

    Energy Technology Data Exchange (ETDEWEB)

    Seeger, Matthias W [Saarland University and Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbruecken (Germany)

    2009-12-01

    A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attention has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.

  18. Inferences on Children’s Reading Groups

    Directory of Open Access Journals (Sweden)

    Javier González García

    2009-05-01

    Full Text Available This article focuses on the non-literal information of a text, which can be inferred from key elements or clues offered by the text itself. This kind of text is called implicit text or inference, because of the thinking process it stimulates. The explicit resources that lead to information retrieval are related to other, implicit information, which has increased in relevance. In this study, carried out over two courses, we analysed how two teachers interpret three stories and how they establish a debate by dividing the class into three student groups. The sample was formed by two classes from two urban public schools of Burgos capital (Spain) and two from public schools of Tampico (Mexico). This allowed us to observe an increasing percentage of the group focused on text comprehension, and a smaller percentage of the group perceiving comprehension as a secondary objective.

  19. Active learning and adaptive sampling for non-parametric inference

    NARCIS (Netherlands)

    Castro, R.M.

    2007-01-01

    This thesis presents a general discussion of active learning and adaptive sampling. In many practical scenarios it is possible to use information gleaned from previous observations to focus the sampling process, in the spirit of the "twenty-questions" game. As more samples are collected one can

  20. Risk-Based Predictive Maintenance for Safety-Critical Systems by Using Probabilistic Inference

    Directory of Open Access Journals (Sweden)

    Tianhua Xu

    2013-01-01

    Full Text Available Risk-based maintenance (RBM) aims to improve maintenance planning and decision making by reducing the probability and consequences of equipment failure. A new predictive maintenance strategy that integrates a dynamic evolution model and risk assessment is proposed, which can be used to calculate the optimal maintenance time with minimal cost under safety constraints. The dynamic evolution model provides qualified risks by using probabilistic inference with bucket elimination and gives the prospective degradation trend of a complex system. Based on the degradation trend, an optimal maintenance time can be determined by minimizing the expected maintenance cost per time unit. The effectiveness of the proposed method is validated and demonstrated by a collision accident of high-speed trains with obstacles in the presence of safety and cost constraints.

  1. Inference for Ecological Dynamical Systems: A Case Study of Two Endemic Diseases

    Directory of Open Access Journals (Sweden)

    Daniel A. Vasco

    2012-01-01

    Full Text Available A Bayesian Markov chain Monte Carlo method is used to infer parameters for an open stochastic epidemiological model: the Markovian susceptible-infected-recovered (SIR) model, which is suitable for modeling and simulating recurrent epidemics. This allows exploring two major problems of inference appearing in many mechanistic population models. First, trajectories of these processes are often only partly observed. For example, during an epidemic the transmission process is only partly observable: one cannot record infection times. Therefore, one only records cases (infections) as the observations. As a result, some means of imputing or reconstructing individuals in the susceptible class must be devised. Second, the official reporting of observations (cases) in epidemiology is typically done not as they are actually recorded but at some temporal interval over which they have been aggregated. To address these issues, this paper investigates the following problems. Parameter inference for a perfectly sampled open Markovian SIR is considered first. Next, inference for an imperfectly observed sample path of the system is studied. Although this second problem has been solved for the case of closed epidemics, it has proven quite difficult for the case of open recurrent epidemics. Lastly, the statistical theory is applied to measles and pertussis epidemic time series data from 60 UK cities.
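
    The Markovian SIR model used as the forward model here can be simulated exactly with a Gillespie-type algorithm. The sketch below implements a closed SIR without demography for brevity (the open model in the record would add birth and immigration events in the same event-driven way); all parameter values are illustrative.

        import numpy as np

        def gillespie_sir(S, I, R, beta, gamma, t_max, seed=0):
            """Exact stochastic simulation of the Markovian SIR model.
            Event rates: infection beta*S*I/N, recovery gamma*I."""
            rng = np.random.default_rng(seed)
            N = S + I + R
            t, out = 0.0, [(0.0, S, I, R)]
            while t < t_max and I > 0:
                rate_inf = beta * S * I / N
                rate_rec = gamma * I
                total = rate_inf + rate_rec
                t += rng.exponential(1.0 / total)
                if rng.random() < rate_inf / total:
                    S, I = S - 1, I + 1          # infection event
                else:
                    I, R = I - 1, R + 1          # recovery event
                out.append((t, S, I, R))
            return np.array(out)

        traj = gillespie_sir(S=990, I=10, R=0, beta=0.5, gamma=0.25, t_max=100.0)
        print(f"{len(traj)} events; final sizes S={traj[-1, 1]:.0f}, R={traj[-1, 3]:.0f}")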

  2. Asymptotic inference for jump diffusions with state-dependent intensity

    NARCIS (Netherlands)

    Becheri, Gaia; Drost, Feico; Werker, Bas

    2016-01-01

    We establish the local asymptotic normality property for a class of ergodic parametric jump-diffusion processes with state-dependent intensity and known volatility function sampled at high frequency. We prove that the inference problem about the drift and jump parameters is adaptive with respect to

  3. Goal inferences about robot behavior : goal inferences and human response behaviors

    NARCIS (Netherlands)

    Broers, H.A.T.; Ham, J.R.C.; Broeders, R.; De Silva, P.; Okada, M.

    2014-01-01

    This explorative research focused on the goal inferences human observers draw based on a robot's behavior, and the extent to which those inferences predict people's behavior in response to that robot. Results show that different robot behaviors cause different response behavior from people.

  4. Niffler: A Context-Aware and User-Independent Side-Channel Attack System for Password Inference

    Directory of Open Access Journals (Sweden)

    Benxiao Tang

    2018-01-01

    Full Text Available Digital password locks are commonly used on mobile devices as the primary authentication method. Research has demonstrated that sensors embedded in mobile devices can be employed to infer the password. However, existing works focus on either single-keystroke inference or entire-password-sequence inference, which are user-dependent and require huge effort to collect ground-truth training data. In this paper, we design a novel side-channel attack system, called Niffler, which leverages the user-independent features of the movements of tapping consecutive buttons to infer unlocking passwords on smartphones. We extract angle features to reflect the changing trends and build a multicategory classifier combined with the dynamic time warping algorithm to infer the probability of each movement. We further use a Markov model to model the unlocking process and use the sequences with the highest probabilities as the attack candidates. Moreover, the sensor readings of successful attacks are fed back to continually improve the accuracy of the classifier. In our experiments, 100,000 samples collected from 25 participants are used to evaluate the performance of Niffler. The results show that Niffler achieves 70% and 85% accuracy with 10 attempts in user-independent and user-dependent environments with few training samples, respectively.
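
    The dynamic time warping step mentioned in this record is a standard dynamic-programming computation. The sketch below gives a plain DTW distance between two 1-D sensor traces with invented values; Niffler's actual features, classifier and Markov model are not reproduced.

        import numpy as np

        def dtw_distance(a, b):
            """Classic dynamic-time-warping distance between two 1-D sequences,
            using the absolute difference as the local cost."""
            a, b = np.asarray(a, float), np.asarray(b, float)
            n, m = len(a), len(b)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = abs(a[i - 1] - b[j - 1])
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m]

        # Two traces with the same shape but different timing still match closely
        template = [0.0, 0.3, 1.0, 0.4, 0.0]
        probe    = [0.0, 0.1, 0.3, 0.9, 1.0, 0.5, 0.1, 0.0]
        print(f"DTW distance: {dtw_distance(template, probe):.2f}")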

  5. Utilization of AHWR critical facility for research and development work on large sample NAA

    International Nuclear Information System (INIS)

    Acharya, R.; Dasari, K.B.; Pujari, P.K.; Swain, K.K.; Reddy, A.V.R.; Verma, S.K.; De, S.K.

    2014-01-01

    The graphite reflector position of the AHWR critical facility (CF) was utilized for the analysis of large (g-kg scale) samples using internal mono-standard neutron activation analysis (IM-NAA). The reactor position was characterized by the cadmium ratio method using an In monitor for the total flux and the sub-cadmium to epithermal flux ratio (f). Large sample neutron activation analysis (LSNAA) was carried out for samples of stainless steel, ancient and new clay potteries, and dross. Large as well as non-standard geometry samples (1 g - 0.5 kg) were irradiated. Radioactive assay was carried out using high resolution gamma ray spectrometry. Concentration ratios obtained by IM-NAA were used for a provenance study of 30 clay potteries obtained from excavated Buddhist sites of AP, India. Concentrations of Au and Ag were determined in three large, not-so-homogeneous samples of dross. An X-Z rotary scanning unit has been installed for counting large and not-so-homogeneous samples. (author)

  6. Constructing criticality by classification

    DEFF Research Database (Denmark)

    Machacek, Erika

    2017-01-01

    " in the bureaucratic practice of classification: Experts construct material criticality in assessments as they allot information on the materials to the parameters of the assessment framework. In so doing, they ascribe a new set of connotations to the materials, namely supply risk, and their importance to clean energy......, legitimizing a criticality discourse.Specifically, the paper introduces a typology delineating the inferences made by the experts from their produced recommendations in the classification of rare earth element criticality. The paper argues that the classification is a specific process of constructing risk....... It proposes that the expert bureaucratic practice of classification legitimizes (i) the valorisation that was made in the drafting of the assessment framework for the classification, and (ii) political operationalization when enacted that might have (non-)distributive implications for the allocation of public...

  7. Safety critical application of fuzzy control

    International Nuclear Information System (INIS)

    Schildt, G.H.

    1995-01-01

    After an introduction to safety terms, a short description of fuzzy logic is given. A possible controller structure for safety-critical applications of fuzzy controllers is then described. The following items are discussed: configuration of fuzzy controllers, and design aspects such as fuzzification, inference strategies, defuzzification and types of membership functions. As an example, a typical fuzzy rule set is presented, and the real-time behaviour of fuzzy controllers is addressed. Finally, an example of fuzzy control for temperature regulation within a nuclear reactor is presented, together with the membership functions and inference strategy of such a fuzzy controller. (author). 4 refs, 17 figs
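
    A minimal sketch of the fuzzy controller ingredients listed in this record (membership functions, inference, defuzzification) is given below. The temperature ranges, rules and output values are invented for illustration and are not the controller discussed by the author.

        def tri(x, a, b, c):
            """Triangular membership function with support [a, c] and peak at b."""
            if x <= a or x >= c:
                return 0.0
            return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

        def fuzzy_rod_adjustment(temperature):
            """Two-rule Mamdani-style controller: IF temperature is high THEN insert rods,
            IF temperature is low THEN withdraw rods. With single-antecedent rules the
            min inference step is trivial; defuzzification is a weighted average of
            singleton rule outputs (in % rod movement). All numbers are illustrative."""
            mu_low  = tri(temperature, 250.0, 280.0, 310.0)   # membership in "low"
            mu_high = tri(temperature, 290.0, 320.0, 350.0)   # membership in "high"
            out_low, out_high = -10.0, +10.0                  # consequent singletons
            if mu_low + mu_high == 0.0:
                return 0.0
            return (mu_low * out_low + mu_high * out_high) / (mu_low + mu_high)

        for T in (270.0, 300.0, 330.0):
            print(f"T = {T:.0f} C -> rod adjustment {fuzzy_rod_adjustment(T):+.1f} %")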

  8. Fooled by First Impressions? Reexamining the Diagnostic Value of Appearance-Based Inferences

    OpenAIRE

    2010-01-01

    Abstract We often form opinions about the characteristics of others from single, static samples of their appearance --the very first thing we see when, or even before, we meet them. These inferences occur spontaneously, rapidly, and can impact decisions in a variety of important domains. A crucial question, then, is whether appearance-based inferences are accurate. Using a naturalistic data set of more than 1 million appearance-based judgments obtained from a popular website (Study...

  9. Entropic Inference

    OpenAIRE

    Caticha, Ariel

    2010-01-01

    In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEn...

  10. Technical Note: How to use Winbugs to infer animal models

    DEFF Research Database (Denmark)

    Damgaard, Lars Holm

    2007-01-01

    This paper deals with Bayesian inferences of animal models using Gibbs sampling. First, we suggest a general and efficient method for updating additive genetic effects, in which the computational cost is independent of the pedigree depth and increases linearly only with the size of the pedigree....... Second, we show how this approach can be used to draw inferences from a wide range of animal models using the computer package Winbugs. Finally, we illustrate the approach in a simulation study, in which the data are generated and analyzed using Winbugs according to a linear model with i.i.d errors...... having Student's t distributions. In conclusion, Winbugs can be used to make inferences in small-sized, quantitative, genetic data sets applying a wide range of animal models that are not yet standard in the animal breeding literature...

  11. Learning Convex Inference of Marginals

    OpenAIRE

    Domke, Justin

    2012-01-01

    Graphical models trained using maximum likelihood are a common tool for probabilistic inference of marginal distributions. However, this approach suffers difficulties when either the inference process or the model is approximate. In this paper, the inference process is first defined to be the minimization of a convex function, inspired by free energy approximations. Learning is then done directly in terms of the performance of the inference process at univariate marginal prediction. The main ...

  12. Protein Inference from the Integration of Tandem MS Data and Interactome Networks.

    Science.gov (United States)

    Zhong, Jiancheng; Wang, Jianxing; Ding, Xiaojun; Zhang, Zhen; Li, Min; Wu, Fang-Xiang; Pan, Yi

    2017-01-01

    Since proteins are digested into a mixture of peptides in the preprocessing step of tandem mass spectrometry (MS), it is difficult to determine which specific protein a shared peptide belongs to. In recent studies, besides tandem MS data and peptide identification information, some other information is exploited to infer proteins. Different from the methods which first use only tandem MS data to infer proteins and then use network information to refine them, this study proposes a protein inference method named TMSIN, which uses interactome networks directly. As two interacting proteins should co-exist, it is reasonable to assume that if one of the interacting proteins is confidently inferred in a sample, its interacting partners should have a high probability in the same sample, too. Therefore, we can use the neighborhood information of a protein in an interactome network to adjust the probability that a shared peptide belongs to the protein. In TMSIN, a multi-weighted graph is constructed by incorporating the bipartite graph with interactome network information, where the bipartite graph is built with the peptide identification information. Based on multi-weighted graphs, TMSIN adopts an iterative workflow to infer proteins. At each iterative step, the probability that a shared peptide belongs to a specific protein is calculated by using Bayes' law, based on the neighbor protein support scores of each protein to which the shared peptides map. We carried out experiments on yeast data and human data to evaluate the performance of TMSIN in terms of ROC, q-value, and accuracy. The experimental results show that the AUC scores yielded by TMSIN are 0.742 and 0.874 in the yeast and human datasets, respectively, and TMSIN yields the maximum number of true positives when the q-value is less than or equal to 0.05. The overlap analysis shows that TMSIN is an effective complementary approach for protein inference.

  13. Bootstrap-Based Inference for Cube Root Consistent Estimators

    DEFF Research Database (Denmark)

    Cattaneo, Matias D.; Jansson, Michael; Nagasawa, Kenichi

    This note proposes a consistent bootstrap-based distributional approximation for cube root consistent estimators such as the maximum score estimator of Manski (1975) and the isotonic density estimator of Grenander (1956). In both cases, the standard nonparametric bootstrap is known...... to be inconsistent. Our method restores consistency of the nonparametric bootstrap by altering the shape of the criterion function defining the estimator whose distribution we seek to approximate. This modification leads to a generic and easy-to-implement resampling method for inference that is conceptually distinct...... from other available distributional approximations based on some form of modified bootstrap. We offer simulation evidence showcasing the performance of our inference method in finite samples. An extension of our methodology to general M-estimation problems is also discussed....

  14. Metis: A Pure Metropolis Markov Chain Monte Carlo Bayesian Inference Library

    Energy Technology Data Exchange (ETDEWEB)

    Bates, Cameron Russell [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Mckigney, Edward Allen [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2018-01-09

    The use of Bayesian inference in data analysis has become the standard for large scientific experiments [1, 2]. The Monte Carlo Codes Group (XCP-3) at Los Alamos has developed a simple set of algorithms currently implemented in C++ and Python to easily perform flat-prior Markov Chain Monte Carlo Bayesian inference with pure Metropolis sampling. These implementations are designed to be user friendly and extensible for customization based on specific application requirements. This document describes the algorithmic choices made and presents two use cases.
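
    The flat-prior pure-Metropolis algorithm this record describes is short enough to sketch directly. The code below is not the Metis API, only the underlying sampler: a symmetric Gaussian proposal accepted with probability min(1, L(x')/L(x)), demonstrated on a toy problem of inferring a Gaussian mean.

        import numpy as np

        def metropolis(log_likelihood, x0, step, n_samples, seed=0):
            """Pure Metropolis sampler with a flat (improper) prior and a symmetric
            Gaussian proposal: accept with probability min(1, L(x') / L(x))."""
            rng = np.random.default_rng(seed)
            x = np.asarray(x0, dtype=float)
            logl = log_likelihood(x)
            chain = np.empty((n_samples, x.size))
            for i in range(n_samples):
                proposal = x + step * rng.standard_normal(x.size)
                logl_prop = log_likelihood(proposal)
                if np.log(rng.random()) < logl_prop - logl:
                    x, logl = proposal, logl_prop
                chain[i] = x
            return chain

        # Toy use: infer the mean of Gaussian data with known unit variance
        data = np.random.default_rng(42).normal(3.0, 1.0, size=100)
        loglike = lambda mu: -0.5 * np.sum((data - mu) ** 2)
        chain = metropolis(loglike, x0=[0.0], step=0.3, n_samples=5000)
        print(f"posterior mean ~ {chain[1000:].mean():.2f}")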

  15. Inverse Ising Inference Using All the Data

    Science.gov (United States)

    Aurell, Erik; Ekeberg, Magnus

    2012-03-01

    We show that a method based on logistic regression, using all the data, solves the inverse Ising problem far better than mean-field calculations relying only on sample pairwise correlation functions, while still computationally feasible for hundreds of nodes. The largest improvement in reconstruction occurs for strong interactions. Using two examples, a diluted Sherrington-Kirkpatrick model and a two-dimensional lattice, we also show that interaction topologies can be recovered from few samples with good accuracy and that the use of l1 regularization is beneficial in this process, pushing inference abilities further into low-temperature regimes.
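
    The logistic-regression (pseudolikelihood) reconstruction described in this record can be sketched per node. Assuming scikit-learn is available, the code below regresses each spin on all the others with an l1 penalty and reads the couplings off the coefficients; the regularization strength C and the final symmetrization step are illustrative choices, not the authors' exact settings.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def infer_couplings(samples, C=2.0):
            """Pseudolikelihood / logistic-regression reconstruction of Ising couplings.
            For each spin i, P(s_i = +1 | rest) = sigmoid(2*(h_i + sum_j J_ij s_j)), so an
            l1-regularized logistic regression of s_i on the other spins recovers 2*J_ij
            as its coefficients. `samples` is an (n_samples x n_spins) array of +/-1."""
            n_spins = samples.shape[1]
            J = np.zeros((n_spins, n_spins))
            for i in range(n_spins):
                y = (samples[:, i] + 1) // 2                 # map {-1, +1} -> {0, 1}
                X = np.delete(samples, i, axis=1)
                clf = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X, y)
                coef = clf.coef_.ravel() / 2.0               # undo the factor of 2
                J[i, np.arange(n_spins) != i] = coef
            return (J + J.T) / 2.0                           # symmetrize the two estimates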

  16. sick: The Spectroscopic Inference Crank

    Science.gov (United States)

    Casey, Andrew R.

    2016-03-01

    There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal

  17. SICK: THE SPECTROSCOPIC INFERENCE CRANK

    Energy Technology Data Exchange (ETDEWEB)

    Casey, Andrew R., E-mail: arc@ast.cam.ac.uk [Institute of Astronomy, University of Cambridge, Madingley Road, Cambdridge, CB3 0HA (United Kingdom)

    2016-03-15

    There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal

  18. SICK: THE SPECTROSCOPIC INFERENCE CRANK

    International Nuclear Information System (INIS)

    Casey, Andrew R.

    2016-01-01

    There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
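
    Two ingredients of the noise model described in these records, inflating underestimated variances and treating outlier pixels with a Gaussian mixture, can be sketched as a per-pixel log-likelihood. The parameterization below (a multiplicative variance inflation exp(2 ln f) and a single broad outlier component with weight p_out) is an assumption for illustration and may differ from sick's actual model.

        import numpy as np

        def ln_likelihood(residual, sigma, ln_f, p_out, var_out):
            """Per-pixel spectral-fitting likelihood sketch: quoted variances are
            inflated by exp(2*ln_f) to absorb underestimated errors, and each pixel
            is a mixture of an inlier Gaussian and a broad outlier Gaussian with
            mixing weight p_out. `residual` is data minus model, per pixel."""
            var_in = sigma ** 2 * np.exp(2.0 * ln_f)
            ln_in = -0.5 * (residual ** 2 / var_in + np.log(2.0 * np.pi * var_in))
            v_out = var_in + var_out
            ln_out = -0.5 * (residual ** 2 / v_out + np.log(2.0 * np.pi * v_out))
            # log of the two-component mixture, computed stably pixel by pixel
            return np.sum(np.logaddexp(np.log(1.0 - p_out) + ln_in,
                                       np.log(p_out) + ln_out))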

  19. Isotope ratio mass spectrometry as a tool for source inference in forensic science: A critical review.

    Science.gov (United States)

    Gentile, Natacha; Siegwolf, Rolf T W; Esseiva, Pierre; Doyle, Sean; Zollinger, Kurt; Delémont, Olivier

    2015-06-01

    Isotope ratio mass spectrometry (IRMS) has been used in numerous fields of forensic science in a source inference perspective. This review compiles the studies published on the application of isotope ratio mass spectrometry (IRMS) to the traditional fields of forensic science so far. It completes the review of Benson et al. [1] and synthesises the extent of knowledge already gathered in the following fields: illicit drugs, flammable liquids, human provenancing, microtraces, explosives and other specific materials (packaging tapes, safety matches, plastics, etc.). For each field, a discussion assesses the state of science and highlights the relevance of the information in a forensic context. Through the different discussions which mark out the review, the potential and limitations of IRMS, as well as the needs and challenges of future studies are emphasized. The paper elicits the various dimensions of the source which can be obtained from the isotope information and demonstrates the transversal nature of IRMS as a tool for source inference. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  20. Probabilistic inductive inference: a survey

    OpenAIRE

    Ambainis, Andris

    2001-01-01

    Inductive inference is a recursion-theoretic theory of learning, first developed by E. M. Gold (1967). This paper surveys developments in probabilistic inductive inference. We mainly focus on finite inference of recursive functions, since this simple paradigm has produced the most interesting (and most complex) results.

  1. LAIT: a local ancestry inference toolkit.

    Science.gov (United States)

    Hui, Daniel; Fang, Zhou; Lin, Jerome; Duan, Qing; Li, Yun; Hu, Ming; Chen, Wei

    2017-09-06

    Inferring local ancestry in individuals of mixed ancestry has many applications, most notably in identifying disease-susceptible loci that vary among different ethnic groups. Many software packages are available for inferring local ancestry in admixed individuals. However, most of these existing software packages require specifically formatted input files and generate output files in various formats, which is inconvenient in practice. We developed a tool set, the Local Ancestry Inference Toolkit (LAIT), which can convert standardized files into software-specific input formats as well as standardize and summarize inference results for four popular local ancestry inference packages: HAPMIX, LAMP, LAMP-LD, and ELAI. We tested LAIT using both simulated and real data sets and demonstrated that LAIT makes it convenient to run multiple local ancestry inference packages. In addition, we evaluated the performance of the supported software packages, focusing mainly on inference accuracy and the computational resources used. We provide a toolkit to facilitate the use of local ancestry inference software, especially for users with a limited bioinformatics background.

  2. Bayesian statistical inference

    Directory of Open Access Journals (Sweden)

    Bruno De Finetti

    2017-04-01

    This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993. Bayesian Statistical Inference is one of the last fundamental philosophical papers in which we can find De Finetti's essential approach to statistical inference.

  3. Critical thinking in nurse managers.

    Science.gov (United States)

    Zori, Susan; Morrison, Barbara

    2009-01-01

    Formal education and support is needed for nurse managers to effectively function in their role in the current health care environment. Many nurse managers assume their positions based on expertise in a clinical role with little expertise in managerial and leadership skills. Operating as a manager and leader requires ongoing development of critical thinking skills and the inclination to use those skills. Critical thinking can have a powerful influence on the decision making and problem solving that nurse managers are faced with on a daily basis. The skills that typify critical thinking include analysis, evaluation, inference, and deductive and inductive reasoning. It is intuitive that nurse managers require both the skills and the dispositions of critical thinking to be successful in this pivotal role at a time of transformation in health care. Incorporating critical thinking into education and support programs for the nurse manager is necessary to position the nurse manager for success.

  4. The Relationship of Critical-Thinking Skills and the Clinical-Judgment Skills of Baccalaureate Nursing Students.

    Science.gov (United States)

    Bowles, Kathleen

    2000-01-01

    Nursing graduates (n=65) completed a critical thinking instrument and clinical decision-making scale. The critical thinking subscales of inference and inductive reasoning were positively correlated to clinical judgment. A significant relationship was found between critical thinking score and grade point average in nursing. (SK)

  5. Hierarchical modeling and inference in ecology: The analysis of data from populations, metapopulations and communities

    Science.gov (United States)

    Royle, J. Andrew; Dorazio, Robert M.

    2008-01-01

    A guide to data collection, modeling and inference strategies for biological survey data using Bayesian and classical statistical methods. This book describes a general and flexible framework for modeling and inference in ecological systems based on hierarchical models, with a strict focus on the use of probability models and parametric inference. Hierarchical models represent a paradigm shift in the application of statistics to ecological inference problems because they combine explicit models of ecological system structure or dynamics with models of how ecological systems are observed. The principles of hierarchical modeling are developed and applied to problems in population, metapopulation, community, and metacommunity systems. The book provides the first synthetic treatment of many recent methodological advances in ecological modeling and unifies disparate methods and procedures. The authors apply principles of hierarchical modeling to ecological problems, including * occurrence or occupancy models for estimating species distribution * abundance models based on many sampling protocols, including distance sampling * capture-recapture models with individual effects * spatial capture-recapture models based on camera trapping and related methods * population and metapopulation dynamic models * models of biodiversity, community structure and dynamics.
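
    As a concrete example of the hierarchical models the book describes, a single-season occupancy model separates the ecological state (a site is occupied with probability psi) from the observation process (an occupied site yields a detection with probability p on each of J visits). The sketch below is a simplified maximum-likelihood illustration with simulated data; the survey design and parameter values are made up.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import binom

rng = np.random.default_rng(1)

# Simulated survey: n_sites sites, each visited J times; detections are only
# possible at occupied sites.
n_sites, J = 200, 5
psi_true, p_true = 0.6, 0.4
occupied = rng.random(n_sites) < psi_true
detections = rng.binomial(J, p_true * occupied)          # detections per site

def neg_log_lik(params):
    psi, p = expit(params)                               # keep probabilities in (0, 1)
    # Marginal likelihood of y detections out of J visits at a site:
    #   psi * Binom(y; J, p)  +  (1 - psi) * 1[y == 0]
    lik = psi * binom.pmf(detections, J, p) + (1.0 - psi) * (detections == 0)
    return -np.sum(np.log(lik))

fit = minimize(neg_log_lik, x0=np.zeros(2), method="Nelder-Mead")
psi_hat, p_hat = expit(fit.x)
print(f"psi_hat = {psi_hat:.2f} (true 0.60), p_hat = {p_hat:.2f} (true 0.40)")
```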

  6. Learning Additional Languages as Hierarchical Probabilistic Inference: Insights From First Language Processing.

    Science.gov (United States)

    Pajak, Bozena; Fine, Alex B; Kleinschmidt, Dave F; Jaeger, T Florian

    2016-12-01

    We present a framework of second and additional language (L2/L n ) acquisition motivated by recent work on socio-indexical knowledge in first language (L1) processing. The distribution of linguistic categories covaries with socio-indexical variables (e.g., talker identity, gender, dialects). We summarize evidence that implicit probabilistic knowledge of this covariance is critical to L1 processing, and propose that L2/L n learning uses the same type of socio-indexical information to probabilistically infer latent hierarchical structure over previously learned and new languages. This structure guides the acquisition of new languages based on their inferred place within that hierarchy, and is itself continuously revised based on new input from any language. This proposal unifies L1 processing and L2/L n acquisition as probabilistic inference under uncertainty over socio-indexical structure. It also offers a new perspective on crosslinguistic influences during L2/L n learning, accommodating gradient and continued transfer (both negative and positive) from previously learned to novel languages, and vice versa.

  7. Classical methods for interpreting objective function minimization as intelligent inference

    Energy Technology Data Exchange (ETDEWEB)

    Golden, R.M. [Univ. of Texas, Dallas, TX (United States)

    1996-12-31

    Most recognition algorithms and neural networks can be formally viewed as seeking a minimum value of an appropriate objective function during either classification or learning phases. The goal of this paper is to argue that in order to show a recognition algorithm is making intelligent inferences, it is not sufficient to show that the recognition algorithm is computing (or trying to compute) the global minimum of some objective function. One must explicitly define a "relational system" for the recognition algorithm or neural network which identifies: (i) the sample space, (ii) the relevant sigma-field of events generated by the sample space, and (iii) the "relation" for that relational system. Only when such a "relational system" is properly defined is it possible to formally establish the sense in which computing the global minimum of an objective function is an intelligent inference.

  8. Is there a hierarchy of social inferences? The likelihood and speed of inferring intentionality, mind, and personality.

    Science.gov (United States)

    Malle, Bertram F; Holbrook, Jess

    2012-04-01

    People interpret behavior by making inferences about agents' intentionality, mind, and personality. Past research studied such inferences 1 at a time; in real life, people make these inferences simultaneously. The present studies therefore examined whether 4 major inferences (intentionality, desire, belief, and personality), elicited simultaneously in response to an observed behavior, might be ordered in a hierarchy of likelihood and speed. To achieve generalizability, the studies included a wide range of stimulus behaviors, presented them verbally and as dynamic videos, and assessed inferences both in a retrieval paradigm (measuring the likelihood and speed of accessing inferences immediately after they were made) and in an online processing paradigm (measuring the speed of forming inferences during behavior observation). Five studies provide evidence for a hierarchy of social inferences-from intentionality and desire to belief to personality-that is stable across verbal and visual presentations and that parallels the order found in developmental and primate research. (c) 2012 APA, all rights reserved.

  9. Learning to Run with Actor-Critic Ensemble

    OpenAIRE

    Huang, Zhewei; Zhou, Shuchang; Zhuang, BoEr; Zhou, Xinyu

    2017-01-01

    We introduce an Actor-Critic Ensemble(ACE) method for improving the performance of Deep Deterministic Policy Gradient(DDPG) algorithm. At inference time, our method uses a critic ensemble to select the best action from proposals of multiple actors running in parallel. By having a larger candidate set, our method can avoid actions that have fatal consequences, while staying deterministic. Using ACE, we have won the 2nd place in NIPS'17 Learning to Run competition, under the name of "Megvii-hzw...
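
    The ensemble action-selection step can be illustrated in a few lines. This is a toy sketch only: random arrays stand in for the actors' proposed actions and the critics' Q-value estimates, and no training is shown.

```python
import numpy as np

rng = np.random.default_rng(7)

n_actors, n_critics, action_dim = 5, 4, 3

# Each actor proposes one candidate action for the current state.
candidate_actions = rng.normal(size=(n_actors, action_dim))

# Each critic scores every candidate action (random stand-in Q-values here).
q_values = rng.normal(size=(n_critics, n_actors))

# Ensemble rule: pick the candidate with the highest mean critic score.
best = int(np.argmax(q_values.mean(axis=0)))
chosen_action = candidate_actions[best]
print("chosen actor:", best, "action:", np.round(chosen_action, 2))
```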

  10. Practical Bayesian Inference

    Science.gov (United States)

    Bailer-Jones, Coryn A. L.

    2017-04-01

    Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.

  11. Reliability of dose volume constraint inference from clinical data

    Science.gov (United States)

    Lutz, C. M.; Møller, D. S.; Hoffmann, L.; Knap, M. M.; Alber, M.

    2017-04-01

    Dose volume histogram points (DVHPs) frequently serve as dose constraints in radiotherapy treatment planning. An experiment was designed to investigate the reliability of DVHP inference from clinical data for multiple cohort sizes and complication incidence rates. The experimental background was radiation pneumonitis in non-small cell lung cancer and the DVHP inference method was based on logistic regression. From 102 NSCLC real-life dose distributions and a postulated DVHP model, an ‘ideal’ cohort was generated where the most predictive model was equal to the postulated model. A bootstrap and a Cohort Replication Monte Carlo (CoRepMC) approach were applied to create 1000 equally sized populations each. The cohorts were then analyzed to establish inference frequency distributions. This was applied to nine scenarios: cohort sizes of 102 (1), 500 (2), and 2000 (3) patients (by sampling with replacement) and three postulated DVHP models. The Bootstrap was repeated for a ‘non-ideal’ cohort, where the most predictive model did not coincide with the postulated model. The Bootstrap produced chaotic results for all models of cohort size 1 for both the ideal and non-ideal cohorts. For cohort sizes 2 and 3, the distributions for all populations were more concentrated around the postulated DVHP. For the CoRepMC, the inference frequency increased with cohort size and incidence rate. Correct inference rates >85% were only achieved by cohorts with more than 500 patients. Both Bootstrap and CoRepMC indicate that inference of the correct or approximate DVHP for typical cohort sizes is highly uncertain. CoRepMC results were less spurious than Bootstrap results, demonstrating the large influence that randomness in dose-response has on the statistical analysis.
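
    A simplified sketch of the bootstrap part of such an experiment is given below. The candidate DVH points, cohort, and logistic model are synthetic placeholders rather than the authors' pipeline; the sketch only illustrates how often resampled cohorts recover the postulated dose-volume predictor.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Synthetic cohort: V20, V30, V40 (relative volumes receiving >= 20/30/40 Gy).
n_patients = 500
dvh = rng.uniform(0.0, 0.6, size=(n_patients, 3))
candidate_names = ["V20", "V30", "V40"]

# Postulated model: complication risk driven by V30 only.
logit = -3.0 + 6.0 * dvh[:, 1]
complication = rng.random(n_patients) < 1.0 / (1.0 + np.exp(-logit))

def most_predictive(X, y):
    """Return the index of the single DVH point with the best log-likelihood."""
    scores = []
    for j in range(X.shape[1]):
        m = LogisticRegression().fit(X[:, [j]], y)
        p = m.predict_proba(X[:, [j]])[:, 1]
        scores.append(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))
    return int(np.argmax(scores))

# Bootstrap: how often is the postulated DVH point (V30) recovered?
hits, n_boot = 0, 200
for _ in range(n_boot):
    idx = rng.integers(0, n_patients, n_patients)
    hits += most_predictive(dvh[idx], complication[idx]) == 1
print(f"correct inference rate: {hits / n_boot:.2f}")
```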

  12. Benchmarking Relatedness Inference Methods with Genome-Wide Data from Thousands of Relatives.

    Science.gov (United States)

    Ramstetter, Monica D; Dyer, Thomas D; Lehman, Donna M; Curran, Joanne E; Duggirala, Ravindranath; Blangero, John; Mezey, Jason G; Williams, Amy L

    2017-09-01

    Inferring relatedness from genomic data is an essential component of genetic association studies, population genetics, forensics, and genealogy. While numerous methods exist for inferring relatedness, thorough evaluation of these approaches in real data has been lacking. Here, we report an assessment of 12 state-of-the-art pairwise relatedness inference methods using a data set with 2485 individuals contained in several large pedigrees that span up to six generations. We find that all methods have high accuracy (92-99%) when detecting first- and second-degree relationships, but their accuracy dwindles to 76% of relative pairs. Overall, the most accurate methods are Estimation of Recent Shared Ancestry (ERSA) and approaches that compute total IBD sharing using the output from GERMLINE and Refined IBD to infer relatedness. Combining information from the most accurate methods provides little accuracy improvement, indicating that novel approaches, such as new methods that leverage relatedness signals from multiple samples, are needed to achieve a sizeable jump in performance. Copyright © 2017 Ramstetter et al.

  13. Systematic sampling of discrete and continuous populations: sample selection and the choice of estimator

    Science.gov (United States)

    Harry T. Valentine; David L. R. Affleck; Timothy G. Gregoire

    2009-01-01

    Systematic sampling is easy, efficient, and widely used, though it is not generally recognized that a systematic sample may be drawn from the population of interest with or without restrictions on randomization. The restrictions or the lack of them determine which estimators are unbiased, when using the sampling design as the basis for inference. We describe the...

  14. Causal learning and inference as a rational process: the new synthesis.

    Science.gov (United States)

    Holyoak, Keith J; Cheng, Patricia W

    2011-01-01

    Over the past decade, an active line of research within the field of human causal learning and inference has converged on a general representational framework: causal models integrated with bayesian probabilistic inference. We describe this new synthesis, which views causal learning and inference as a fundamentally rational process, and review a sample of the empirical findings that support the causal framework over associative alternatives. Causal events, like all events in the distal world as opposed to our proximal perceptual input, are inherently unobservable. A central assumption of the causal approach is that humans (and potentially nonhuman animals) have been designed in such a way as to infer the most invariant causal relations for achieving their goals based on observed events. In contrast, the associative approach assumes that learners only acquire associations among important observed events, omitting the representation of the distal relations. By incorporating bayesian inference over distributions of causal strength and causal structures, along with noisy-logical (i.e., causal) functions for integrating the influences of multiple causes on a single effect, human judgments about causal strength and structure can be predicted accurately for relatively simple causal structures. Dynamic models of learning based on the causal framework can explain patterns of acquisition observed with serial presentation of contingency data and are consistent with available neuroimaging data. The approach has been extended to a diverse range of inductive tasks, including category-based and analogical inferences.
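
    For a single candidate cause, the noisy-logical (noisy-OR) integration mentioned above reduces to Cheng's generative "causal power": w = (P(e|c) - P(e|~c)) / (1 - P(e|~c)). A tiny illustrative computation from hypothetical contingency counts:

```python
# Hypothetical contingency data: (effect, no effect) counts with and without the cause.
e_c, ne_c = 27, 13      # cause present
e_nc, ne_nc = 10, 30    # cause absent

p_e_c = e_c / (e_c + ne_c)          # P(e | c)
p_e_nc = e_nc / (e_nc + ne_nc)      # P(e | ~c)

delta_p = p_e_c - p_e_nc                       # associative contrast
causal_power = delta_p / (1.0 - p_e_nc)        # noisy-OR generative power

print(f"Delta-P = {delta_p:.2f}, causal power = {causal_power:.2f}")
```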

  15. Logical inference and evaluation

    International Nuclear Information System (INIS)

    Perey, F.G.

    1981-01-01

    Most methodologies of evaluation currently used are based upon the theory of statistical inference. It is generally perceived that this theory is not capable of dealing satisfactorily with what are called systematic errors. Theories of logical inference should be capable of treating all of the information available, including that not involving frequency data. A theory of logical inference is presented as an extension of deductive logic via the concept of plausibility and the application of group theory. Some conclusions, based upon the application of this theory to evaluation of data, are also given

  16. The confounding effect of population structure on bayesian skyline plot inferences of demographic history

    DEFF Research Database (Denmark)

    Heller, Rasmus; Chikhi, Lounes; Siegismund, Hans

    2013-01-01

    Many coalescent-based methods aiming to infer the demographic history of populations assume a single, isolated and panmictic population (i.e. a Wright-Fisher model). While this assumption may be reasonable under many conditions, several recent studies have shown that the results can be misleading when it is violated. Among the most widely applied demographic inference methods are Bayesian skyline plots (BSPs), which are used across a range of biological fields. Violations of the panmixia assumption are to be expected in many biological systems, but the consequences for skyline plot inferences … the best scheme for inferring demographic change over a typical time scale. Analyses of data from a structured African buffalo population demonstrate how BSP results can be strengthened by simulations. We recommend that sample selection should be carefully considered in relation to population structure…

  17. A local non-parametric model for trade sign inference

    Science.gov (United States)

    Blazejewski, Adam; Coggins, Richard

    2005-03-01

    We investigate a regularity in market order submission strategies for 12 stocks with large market capitalization on the Australian Stock Exchange. The regularity is evidenced by a predictable relationship between the trade sign (trade initiator), size of the trade, and the contents of the limit order book before the trade. We demonstrate this predictability by developing an empirical inference model to classify trades into buyer-initiated and seller-initiated. The model employs a local non-parametric method, k-nearest neighbor, which in the past was used successfully for chaotic time series prediction. The k-nearest neighbor with three predictor variables achieves an average out-of-sample classification accuracy of 71.40%, compared to 63.32% for the linear logistic regression with seven predictor variables. The result suggests that a non-linear approach may produce a more parsimonious trade sign inference model with a higher out-of-sample classification accuracy. Furthermore, for most of our stocks the observed regularity in market order submissions seems to have a memory of at least 30 trading days.
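
    A hedged sketch of this kind of local non-parametric trade-sign classifier is shown below; the synthetic order-book features and the neighborhood size are stand-ins for the paper's actual predictor variables and tuning.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)

# Synthetic features per trade: trade size, best bid depth, best ask depth (all scaled).
n = 5000
X = rng.normal(size=(n, 3))
# Synthetic rule: buyer-initiated trades tend to follow thin ask-side books.
buy = (0.8 * X[:, 0] - 1.2 * X[:, 2] + rng.normal(scale=0.8, size=n)) > 0

X_train, X_test, y_train, y_test = train_test_split(X, buy, test_size=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=25).fit(X_train, y_train)
print(f"out-of-sample accuracy: {knn.score(X_test, y_test):.3f}")
```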

  18. Inference

    DEFF Research Database (Denmark)

    Møller, Jesper

    (This text, written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by Clarendon Press, Oxford, and planned to appear as Section 4.1 with the title ‘Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation-free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus…

  19. Bayesian inference for Markov jump processes with informative observations.

    Science.gov (United States)

    Golightly, Andrew; Wilkinson, Darren J

    2015-04-01

    In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis.

  20. Lower complexity bounds for lifted inference

    DEFF Research Database (Denmark)

    Jaeger, Manfred

    2015-01-01

    …instances of the model. Numerous approaches for such “lifted inference” techniques have been proposed. While it has been demonstrated that these techniques will lead to significantly more efficient inference on some specific models, there are only very recent and still quite restricted results that show the feasibility of lifted inference on certain syntactically defined classes of models. Lower complexity bounds that imply some limitations for the feasibility of lifted inference on more expressive model classes were established earlier in Jaeger (2000; Jaeger, M. 2000. On the complexity of inference about…). It is shown that under the assumption that NETIME≠ETIME, there is no polynomial lifted inference algorithm for knowledge bases of weighted, quantifier-, and function-free formulas. Further strengthening earlier results, this is also shown to hold for approximate inference and for knowledge bases not containing…

  1. Pointwise probability reinforcements for robust statistical inference.

    Science.gov (United States)

    Frénay, Benoît; Verleysen, Michel

    2014-02-01

    Statistical inference using machine learning techniques may be difficult with small datasets because of abnormally frequent data (AFDs). AFDs are observations that are much more frequent in the training sample than they should be, with respect to their theoretical probability, and include e.g. outliers. Estimates of parameters tend to be biased towards models which support such data. This paper proposes to introduce pointwise probability reinforcements (PPRs): the probability of each observation is reinforced by a PPR and a regularisation allows controlling the amount of reinforcement which compensates for AFDs. The proposed solution is very generic, since it can be used to robustify any statistical inference method which can be formulated as a likelihood maximisation. Experiments show that PPRs can be easily used to tackle regression, classification and projection: models are freed from the influence of outliers. Moreover, outliers can be filtered manually since an abnormality degree is obtained for each observation. Copyright © 2013 Elsevier Ltd. All rights reserved.

  2. Variations on Bayesian Prediction and Inference

    Science.gov (United States)

    2016-05-09

    There are a number of statistical inference problems that are not generally formulated via a full probability model. For the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood, which can be an obstacle.

  3. Parametric inference for discretely sampled stochastic differential equations

    DEFF Research Database (Denmark)

    Sørensen, Michael

    A review is given of parametric estimation methods for discretely sampled multivariate diffusion processes. The main focus is on estimating functions and asymptotic results. Maximum likelihood estimation is briefly considered, but the emphasis is on computationally less demanding martingale…

  4. Ancient Biomolecules and Evolutionary Inference.

    Science.gov (United States)

    Cappellini, Enrico; Prohaska, Ana; Racimo, Fernando; Welker, Frido; Pedersen, Mikkel Winther; Allentoft, Morten E; de Barros Damgaard, Peter; Gutenbrunner, Petra; Dunne, Julie; Hammann, Simon; Roffet-Salque, Mélanie; Ilardo, Melissa; Moreno-Mayar, J Víctor; Wang, Yucheng; Sikora, Martin; Vinner, Lasse; Cox, Jürgen; Evershed, Richard P; Willerslev, Eske

    2018-04-25

    Over the last decade, studies of ancient biomolecules-particularly ancient DNA, proteins, and lipids-have revolutionized our understanding of evolutionary history. Though initially fraught with many challenges, the field now stands on firm foundations. Researchers now successfully retrieve nucleotide and amino acid sequences, as well as lipid signatures, from progressively older samples, originating from geographic areas and depositional environments that, until recently, were regarded as hostile to long-term preservation of biomolecules. Sampling frequencies and the spatial and temporal scope of studies have also increased markedly, and with them the size and quality of the data sets generated. This progress has been made possible by continuous technical innovations in analytical methods, enhanced criteria for the selection of ancient samples, integrated experimental methods, and advanced computational approaches. Here, we discuss the history and current state of ancient biomolecule research, its applications to evolutionary inference, and future directions for this young and exciting field. Expected final online publication date for the Annual Review of Biochemistry Volume 87 is June 20, 2018. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

  5. Adaptive Inference on General Graphical Models

    OpenAIRE

    Acar, Umut A.; Ihler, Alexander T.; Mettu, Ramgopal; Sumer, Ozgur

    2012-01-01

    Many algorithms and applications involve repeatedly solving variations of the same inference problem; for example we may want to introduce new evidence to the model or perform updates to conditional dependencies. The goal of adaptive inference is to take advantage of what is preserved in the model and perform inference more rapidly than from scratch. In this paper, we describe techniques for adaptive inference on general graphs that support marginal computation and updates to the conditional ...

  6. Road Traffic Anomaly Detection via Collaborative Path Inference from GPS Snippets

    Directory of Open Access Journals (Sweden)

    Hongtao Wang

    2017-03-01

    Road traffic anomaly denotes a road segment that is anomalous in terms of traffic flow of vehicles. Detecting road traffic anomalies from GPS (Global Positioning System) snippet data is becoming critical in urban computing since they often suggest underlying events. However, the noisy and sparse nature of GPS snippet data has introduced multiple problems, which make the detection of road traffic anomalies very challenging. To address these issues, we propose a two-stage solution which consists of two components: a Collaborative Path Inference (CPI) model and a Road Anomaly Test (RAT) model. The CPI model performs path inference incorporating both static and dynamic features into a Conditional Random Field (CRF). Dynamic context features are learned collaboratively from large GPS snippets via a tensor decomposition technique. Then RAT calculates the anomalous degree for each road segment from the inferred fine-grained trajectories in given time intervals. We evaluated our method using a large-scale real-world dataset, which includes one month of GPS location data from more than eight thousand taxi cabs in Beijing. The evaluation results show the advantages of our method beyond other baseline techniques.

  7. Inferring on the Intentions of Others by Hierarchical Bayesian Learning

    Science.gov (United States)

    Diaconescu, Andreea O.; Mathys, Christoph; Weber, Lilian A. E.; Daunizeau, Jean; Kasper, Lars; Lomakina, Ekaterina I.; Fehr, Ernst; Stephan, Klaas E.

    2014-01-01

    Inferring on others' (potentially time-varying) intentions is a fundamental problem during many social transactions. To investigate the underlying mechanisms, we applied computational modeling to behavioral data from an economic game in which 16 pairs of volunteers (randomly assigned to “player” or “adviser” roles) interacted. The player performed a probabilistic reinforcement learning task, receiving information about a binary lottery from a visual pie chart. The adviser, who received more predictive information, issued an additional recommendation. Critically, the game was structured such that the adviser's incentives to provide helpful or misleading information varied in time. Using a meta-Bayesian modeling framework, we found that the players' behavior was best explained by the deployment of hierarchical learning: they inferred upon the volatility of the advisers' intentions in order to optimize their predictions about the validity of their advice. Beyond learning, volatility estimates also affected the trial-by-trial variability of decisions: participants were more likely to rely on their estimates of advice accuracy for making choices when they believed that the adviser's intentions were presently stable. Finally, our model of the players' inference predicted the players' interpersonal reactivity index (IRI) scores, explicit ratings of the advisers' helpfulness and the advisers' self-reports on their chosen strategy. Overall, our results suggest that humans (i) employ hierarchical generative models to infer on the changing intentions of others, (ii) use volatility estimates to inform decision-making in social interactions, and (iii) integrate estimates of advice accuracy with non-social sources of information. The Bayesian framework presented here can quantify individual differences in these mechanisms from simple behavioral readouts and may prove useful in future clinical studies of maladaptive social cognition. PMID:25187943

  8. Testing AGN unification via inference from large catalogs

    Science.gov (United States)

    Nikutta, Robert; Ivezic, Zeljko; Elitzur, Moshe; Nenkova, Maia

    2018-01-01

    Source orientation and clumpiness of the central dust are the main factors in AGN classification. Type-1 QSOs are easy to observe and large samples are available (e.g. in SDSS), but obscured type-2 AGN are dimmer and redder as our line of sight is more obscured, making it difficult to obtain a complete sample. WISE has found up to a million QSOs. With only 4 bands and a relatively small aperture the analysis of individual sources is challenging, but the large sample allows inference of bulk properties at a very significant level.CLUMPY (www.clumpy.org) is arguably the most popular database of AGN torus SEDs. We model the ensemble properties of the entire WISE AGN content using regularized linear regression, with orientation-dependent CLUMPY color-color-magnitude (CCM) tracks as basis functions. We can reproduce the observed number counts per CCM bin with percent-level accuracy, and simultaneously infer the probability distributions of all torus parameters, redshifts, additional SED components, and identify type-1/2 AGN populations through their IR properties alone. We increase the statistical power of our AGN unification tests even further, by adding other datasets as axes in the regression problem. To this end, we make use of the NOAO Data Lab (datalab.noao.edu), which hosts several high-level large datasets and provides very powerful tools for handling large data, e.g. cross-matched catalogs, fast remote queries, etc.
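
    A highly simplified sketch of fitting observed number counts as a combination of model tracks follows; a hypothetical binned histogram and non-negative least squares stand in for the CCM binning and regularized regression described above.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(6)

# Hypothetical basis: number-count histograms (per color-magnitude bin) predicted
# by three model configurations, e.g. different torus orientations.
n_bins, n_models = 50, 3
basis = np.abs(rng.normal(size=(n_bins, n_models)))

# Observed counts built from a "true" mixture of the models plus Poisson noise.
true_weights = np.array([200.0, 50.0, 0.0])
observed = rng.poisson(basis @ true_weights).astype(float)

# Non-negative least squares: how many sources does each configuration contribute?
weights, residual = nnls(basis, observed)
print("recovered weights:", np.round(weights, 1))
```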

  9. Critical Thinking Skills Evidenced in Graduate Students Blogs

    Science.gov (United States)

    Cain, Holly Reed; Giraud, Vivana; Stedman, Nicole L. P.; Adams, Brittany L.

    2012-01-01

    The objective of this research was to identify Facione's six critical thinking skills using graduate students blogs as a reflection tool in the context of leadership using structured and unstructured blogs. The skills researched were (a) Interpretation, (b) Analysis, (c) Evaluation, (d) Inference, (e) Explanation, and (f) Self-Regulation (Facione,…

  10. Inference of population splits and mixtures from genome-wide allele frequency data.

    Directory of Open Access Journals (Sweden)

    Joseph K Pickrell

    Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication, and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and "ancient" Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com.

  11. The inference from a single case: moral versus scientific inferences in implementing new biotechnologies.

    Science.gov (United States)

    Hofmann, B

    2008-06-01

    Are there similarities between scientific and moral inference? This is the key question in this article. It takes as its point of departure an instance of one person's story in the media changing both Norwegian public opinion and a brand-new Norwegian law prohibiting the use of saviour siblings. The case appears to falsify existing norms and to establish new ones. The analysis of this case reveals similarities in the modes of inference in science and morals, inasmuch as (a) a single case functions as a counter-example to an existing rule; (b) there is a common presupposition of stability, similarity and order, which makes it possible to reason from a few cases to a general rule; and (c) this makes it possible to hold things together and retain order. In science, these modes of inference are referred to as falsification, induction and consistency. In morals, they have a variety of other names. Hence, even without abandoning the fact-value divide, there appear to be similarities between inference in science and inference in morals, which may encourage communication across the boundaries between "the two cultures" and which are relevant to medical humanities.

  12. Analysis Critical Thinking Stage of Eighth Grade in PBL-Scaffolding Setting To Solve Mathematical Problems

    OpenAIRE

    Nur Aisyah Isti; Arief Agoestanto; Ary Woro Kurniasih

    2017-01-01

    The purpose of this research was to describe the critical thinking stages of grade VIII students in a PBL-scaffolding setting when solving mathematics problems. The critical thinking stages consist of clarification, assessment, inference, and strategy/tactics. The subjects were two students at each level of critical thinking capacity (uncritical, less critical, quite critical, and critical), so the research subjects were 8 students in class VIII A of One State Junior High School of Temanggung. The result showed a…

  13. Introductory statistical inference

    CERN Document Server

    Mukhopadhyay, Nitis

    2014-01-01

    This gracefully organized text reveals the rigorous theory of probability and statistical inference in the style of a tutorial, using worked examples, exercises, figures, tables, and computer simulations to develop and illustrate concepts. Drills and boxed summaries emphasize and reinforce important ideas and special techniques.Beginning with a review of the basic concepts and methods in probability theory, moments, and moment generating functions, the author moves to more intricate topics. Introductory Statistical Inference studies multivariate random variables, exponential families of dist

  14. Bayesian inference of chemical kinetic models from proposed reactions

    KAUST Repository

    Galagali, Nikhil

    2015-02-01

    © 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure-such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.

  15. Action Simulation Plays a Critical Role in Deceptive Action Recognition

    NARCIS (Netherlands)

    Tidoni, Emmanuele; Borgomaneri, Sara; Di Pellegrino, Giuseppe; Avenanti, Alessio

    2013-01-01

    The ability to infer deceptive intents from nonverbal behavior is critical for social interactions. By combining single-pulse and repetitive transcranial magnetic stimulation (TMS) in healthy humans, we provide both correlational and causative evidence that action simulation is actively involved in

  16. Measuring change in critical thinking skills of dental students educated in a PBL curriculum.

    Science.gov (United States)

    Pardamean, Bens

    2012-04-01

    This study measured the change in critical thinking skills of dental students educated in a problem-based learning (PBL) pedagogical method. The quantitative analysis was focused on measuring students' critical thinking skills achievement from their first through third years of dental education at the University of Southern California. This non-experimental evaluation was based on a volunteer sample of ninety-eight dental students who completed a demographics/academic questionnaire and a psychometric assessment known as the Health Sciences Reasoning Test (HSRT). The HSRT produced the overall critical thinking skills score. Additionally, the HSRT generated five subscale scores: analysis, inference, evaluation, deductive reasoning, and inductive reasoning. The results of this study concluded that the students showed no continuous and significant incremental improvement in their overall critical thinking skills score achievement during their PBL-based dental education. Except for the inductive reasoning score, this result was very consistent with the four subscale scores. Moreover, after performing the statistical adjustment on total score and subscale scores, no significant statistical differences were found among the three student groups. However, the results of this study found some aspects of critical thinking achievements that differed by categories of gender, race, English as first language, and education level.

  17. Constrained statistical inference: sample-size tables for ANOVA and regression

    Directory of Open Access Journals (Sweden)

    Leonard eVanbrabant

    2015-01-01

    Researchers in the social and behavioral sciences often have clear expectations about the order/direction of the parameters in their statistical model. For example, a researcher might expect that regression coefficient beta1 is larger than beta2 and beta3. The corresponding hypothesis is H: beta1 > {beta2, beta3}, and this is known as an (order-)constrained hypothesis. A major advantage of testing such a hypothesis is that power can be gained and inherently a smaller sample size is needed. This article discusses this gain in sample-size reduction when an increasing number of constraints is included in the hypothesis. The main goal is to present sample-size tables for constrained hypotheses. A sample-size table contains the necessary sample size at a prespecified power (say, 0.80) for an increasing number of constraints. To obtain sample-size tables, two Monte Carlo simulations were performed, one for ANOVA and one for multiple regression. Three results are salient. First, in an ANOVA the needed sample size decreases by 30% to 50% when complete ordering of the parameters is taken into account. Second, small deviations from the imposed order have only a minor impact on the power. Third, at the maximum number of constraints, the linear regression results are comparable with the ANOVA results. However, in the case of fewer constraints, ordering the parameters (e.g., beta1 > beta2) results in a higher power than assigning a positive or a negative sign to the parameters (e.g., beta1 > 0).
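
    The gain from ordering information can be illustrated with a small Monte Carlo sketch. This is not the article's informative-hypothesis test: a one-sided ordered contrast is used as a simplified stand-in, and the effect sizes and group structure are invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def power(n_per_group, n_sim=2000, alpha=0.05):
    """Monte Carlo power of an omnibus F-test vs. a one-sided ordered contrast."""
    means = np.array([0.5, 0.0, 0.0])          # group 1 larger, as the order hypothesis states
    hits_f, hits_c = 0, 0
    for _ in range(n_sim):
        groups = [rng.normal(m, 1.0, n_per_group) for m in means]
        # (a) unconstrained omnibus test
        hits_f += stats.f_oneway(*groups).pvalue < alpha
        # (b) directional contrast implied by the order hypothesis
        contrast = groups[0].mean() - 0.5 * (groups[1].mean() + groups[2].mean())
        se = np.sqrt((1 + 0.25 + 0.25) / n_per_group)   # known unit variance, for simplicity
        hits_c += (1 - stats.norm.cdf(contrast / se)) < alpha
    return hits_f / n_sim, hits_c / n_sim

for n in (20, 30, 40, 50):
    pf, pc = power(n)
    print(f"n per group = {n}: omnibus power = {pf:.2f}, ordered-contrast power = {pc:.2f}")
```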

  18. Active inference, communication and hermeneutics.

    Science.gov (United States)

    Friston, Karl J; Frith, Christopher D

    2015-07-01

    Hermeneutics refers to interpretation and translation of text (typically ancient scriptures) but also applies to verbal and non-verbal communication. In a psychological setting it nicely frames the problem of inferring the intended content of a communication. In this paper, we offer a solution to the problem of neural hermeneutics based upon active inference. In active inference, action fulfils predictions about how we will behave (e.g., predicting we will speak). Crucially, these predictions can be used to predict both self and others--during speaking and listening respectively. Active inference mandates the suppression of prediction errors by updating an internal model that generates predictions--both at fast timescales (through perceptual inference) and slower timescales (through perceptual learning). If two agents adopt the same model, then--in principle--they can predict each other and minimise their mutual prediction errors. Heuristically, this ensures they are singing from the same hymn sheet. This paper builds upon recent work on active inference and communication to illustrate perceptual learning using simulated birdsongs. Our focus here is the neural hermeneutics implicit in learning, where communication facilitates long-term changes in generative models that are trying to predict each other. In other words, communication induces perceptual learning and enables others to (literally) change our minds and vice versa. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.

  19. Random Walks on Directed Networks: Inference and Respondent-Driven Sampling

    Directory of Open Access Journals (Sweden)

    Malmros Jens

    2016-06-01

    Respondent-driven sampling (RDS) is often used to estimate population properties (e.g., sexual risk behavior) in hard-to-reach populations. In RDS, already sampled individuals recruit population members to the sample from their social contacts in an efficient snowball-like sampling procedure. By assuming a Markov model for the recruitment of individuals, asymptotically unbiased estimates of population characteristics can be obtained. Current RDS estimation methodology assumes that the social network is undirected, that is, all edges are reciprocal. However, empirical social networks in general also include a substantial number of nonreciprocal edges. In this article, we develop an estimation method for RDS in populations connected by social networks that include reciprocal and nonreciprocal edges. We derive estimators of the selection probabilities of individuals as a function of the number of outgoing edges of sampled individuals. The proposed estimators are evaluated on artificial and empirical networks and are shown to generally perform better than existing estimators. This is the case in particular when the fraction of directed edges in the network is large.
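
    The degree-weighted estimation idea can be sketched in a few lines. The inverse out-degree weighting below follows the spirit of the estimators derived in the article but is not their exact form, and the data are made up.

```python
import numpy as np

# Sampled individuals: trait indicator y_i and number of outgoing edges d_i.
y = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
out_degree = np.array([3, 8, 2, 5, 10, 7, 4, 6, 3, 2])

# A random-walk-style sample reaches high-degree nodes more often, so weight
# each observation by the inverse of its (out-)degree.
weights = 1.0 / out_degree
estimate = np.sum(weights * y) / np.sum(weights)

naive = y.mean()
print(f"naive sample mean = {naive:.2f}, degree-weighted estimate = {estimate:.2f}")
```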

  20. Phylogeny of the gymnosperm genus Cycas L. (Cycadaceae) as inferred from plastid and nuclear loci based on a large-scale sampling: Evolutionary relationships and taxonomical implications.

    Science.gov (United States)

    Liu, Jian; Zhang, Shouzhou; Nagalingum, Nathalie S; Chiang, Yu-Chung; Lindstrom, Anders J; Gong, Xun

    2018-05-18

    The gymnosperm genus Cycas is the sole member of Cycadaceae, and is the largest genus of extant cycads. There are about 115 accepted Cycas species mainly distributed in the paleotropics. Based on morphology, the genus has been divided into six sections and eight subsections, but this taxonomy has not yet been tested in a molecular phylogenetic framework. Although the monophyly of Cycas is broadly accepted, the intrageneric relationships inferred from previous molecular phylogenetic analyses are unclear due to insufficient sampling or uninformative DNA sequence data. In this study, we reconstructed a phylogeny of Cycas using four chloroplast intergenic spacers and seven low-copy nuclear genes and sampling 90% of extant Cycas species. The maximum likelihood and Bayesian inference phylogenies suggest: (1) matrices of either concatenated cpDNA markers or of concatenated nDNA lack sufficient informative sites to resolve the phylogeny alone, however, the phylogeny from the combined cpDNA-nDNA dataset suggests the genus can be roughly divided into 13 clades and six sections that are in agreement with the current classification of the genus; (2) although with partial support, a clade combining sections Panzhihuaenses + Asiorientales is resolved as the earliest diverging branch; (3) section Stangerioides is not monophyletic because the species resolve as a grade; (4) section Indosinenses is not monophyletic as it includes Cycas macrocarpa and C. pranburiensis from section Cycas; (5) section Cycas is the most derived group and its subgroups correspond with geography. Copyright © 2018 Elsevier Inc. All rights reserved.

  1. Morality Principles for Risk Modelling: Needs and Links with the Origins of Plausible Inference

    Science.gov (United States)

    Solana-Ortega, Alberto; Solana, Vicente

    2009-12-01

    In comparison with the foundations of probability calculus, the inescapable and controversial issue of how to assign probabilities has only recently become a matter of formal study. The introduction of information as a technical concept was a milestone, but the most promising entropic assignment methods still face unsolved difficulties, manifesting the incompleteness of plausible inference theory. In this paper we examine the situation faced by risk analysts in the critical field of extreme events modelling, where the former difficulties are especially visible, due to scarcity of observational data, the large impact of these phenomena and the obligation to assume professional responsibilities. To respond to the claim for a sound framework to deal with extremes, we propose a metafoundational approach to inference, based on a canon of extramathematical requirements. We highlight their strong moral content, and show how this emphasis in morality, far from being new, is connected with the historic origins of plausible inference. Special attention is paid to the contributions of Caramuel, a contemporary of Pascal, unfortunately ignored in the usual mathematical accounts of probability.

  2. Optimization methods for logical inference

    CERN Document Server

    Chandru, Vijay

    2011-01-01

    Merging logic and mathematics in deductive inference-an innovative, cutting-edge approach. Optimization methods for logical inference? Absolutely, say Vijay Chandru and John Hooker, two major contributors to this rapidly expanding field. And even though "solving logical inference problems with optimization methods may seem a bit like eating sauerkraut with chopsticks... it is the mathematical structure of a problem that determines whether an optimization model can help solve it, not the context in which the problem occurs." Presenting powerful, proven optimization techniques for logic in…

  3. Inference in "poor" languages

    Energy Technology Data Exchange (ETDEWEB)

    Petrov, S.

    1996-10-01

    Languages with a solvable implication problem but without complete and consistent systems of inference rules ("poor" languages) are considered. The problem of the existence of a finite complete and consistent inference rule system for a "poor" language is stated independently of the language or rule syntax. Several properties of the problem are proved. An application of the results to the language of join dependencies is given.

  4. On incomplete sampling under birth-death models and connections to the sampling-based coalescent.

    Science.gov (United States)

    Stadler, Tanja

    2009-11-07

    The constant rate birth-death process is used as a stochastic model for many biological systems, for example phylogenies or disease transmission. As the biological data are usually not fully available, it is crucial to understand the effect of incomplete sampling. In this paper, we analyze the constant rate birth-death process with incomplete sampling. We derive the density of the bifurcation events for trees on n leaves which evolved under this birth-death-sampling process. This density is used for calculating prior distributions in Bayesian inference programs and for efficiently simulating trees. We show that the birth-death-sampling process can be interpreted as a birth-death process with reduced rates and complete sampling. This shows that joint inference of birth rate, death rate and sampling probability is not possible. The birth-death-sampling process is compared to the sampling-based population genetics model, the coalescent. It is shown that despite many similarities between these two models, the distribution of bifurcation times remains different even in the case of very large population sizes. We illustrate these findings on an Hepatitis C virus dataset from Egypt. We show that the transmission times estimates are significantly different-the widely used Gamma statistic even changes its sign from negative to positive when switching from the coalescent to the birth-death process.

  5. Affective expressions in groups and inferences about members' relational well-being: The effects of socially engaging and disengaging emotions.

    Science.gov (United States)

    Rothman, Naomi B; Magee, Joe C

    2016-01-01

    Our findings draw attention to the interpersonal communication function of a relatively unexplored dimension of emotions-the level of social engagement versus disengagement. In four experiments, regardless of valence and target group gender, observers infer greater relational well-being (more cohesiveness and less conflict) between group members from socially engaging (sadness and appreciation) versus disengaging (anger and pride) emotion expressions. Supporting our argument that social (dis)engagement is a critical dimension communicated by these emotions, we demonstrate (1) that inferences about group members' self-interest mediate the effect of socially engaging emotions on cohesiveness and (2) that the influence of socially disengaging emotion expressions on inferences of conflict is attenuated when groups have collectivistic norms (i.e., members value a high level of social engagement). Furthermore, we show an important downstream consequence of these inferences of relational well-being: Groups that seem less cohesive because of their members' proud (versus appreciative) expressions are also expected to have worse task performance.

  6. Essays on inference in economics, competition, and the rate of profit

    Science.gov (United States)

    Scharfenaker, Ellis S.

    This dissertation comprises three papers that demonstrate the role of Bayesian methods of inference and Shannon's information theory in classical political economy. The first chapter explores the empirical distribution of profit rate data from North American firms from 1962-2012. This chapter addresses the fact that existing methods for sample selection from noisy profit rate data in the industrial organization field of economics tend to condition on a covariate's value, which risks discarding information. Conditioning sample selection instead on the profit rate data's structure, by means of a two-component (signal and noise) Bayesian mixture model, we find the profit rate sample to be time-stationary Laplace distributed, corroborating earlier estimates of cross-section distributions. The second chapter compares alternative probabilistic approaches to discrete (quantal) choice analysis and examines the various ways in which they overlap. In particular, the work on individual choice behavior by Duncan Luce and the extension of this work to quantal response problems by game theoreticians is shown to be related both to the rational inattention work of Christopher Sims, through Shannon's information theory, and to the maximum entropy principle of inference proposed by the physicist Edwin T. Jaynes. In the third chapter I propose a model of "classically" competitive firms facing informational entropy constraints in their decisions to enter or exit markets based on profit rate differentials. The result is a three-parameter logit quantal response distribution for firm entry and exit decisions. Bayesian methods are used for inference into the distribution of entry and exit decisions conditional on profit rate deviations, and firm-level data from Compustat is used to test these predictions.
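
    A rough sketch of a two-component signal-plus-noise decomposition in the spirit of the first chapter is given below. The synthetic data, the Gaussian form of the noise component, and the use of direct maximum likelihood (rather than the dissertation's Bayesian mixture) are all simplifying assumptions.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(5)

# Synthetic "profit rate" data: a sharply peaked Laplace signal plus diffuse noise.
signal = rng.laplace(loc=0.10, scale=0.05, size=900)
noise = rng.normal(loc=0.0, scale=0.8, size=100)
x = np.concatenate([signal, noise])

def neg_log_lik(params):
    loc, log_scale, log_sigma, logit_w = params
    w = expit(logit_w)                                  # mixing weight of the signal
    f_signal = stats.laplace.pdf(x, loc=loc, scale=np.exp(log_scale))
    f_noise = stats.norm.pdf(x, loc=0.0, scale=np.exp(log_sigma))
    return -np.sum(np.log(w * f_signal + (1.0 - w) * f_noise))

fit = minimize(neg_log_lik, x0=[0.0, np.log(0.1), np.log(1.0), 0.0],
               method="Nelder-Mead", options={"maxiter": 5000})
loc, log_scale, log_sigma, logit_w = fit.x
print(f"signal location = {loc:.3f}, scale = {np.exp(log_scale):.3f}, "
      f"signal weight = {expit(logit_w):.2f}")
```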

  7. Approaching Reflexivity through Reflection: Issues for Critical Management Education

    Science.gov (United States)

    Hibbert, Paul

    2013-01-01

    This conceptual article seeks to develop insights for teaching reflexivity in undergraduate management classes through developing processes of critical reflection. Theoretical inferences to support this aim are developed and organized in relation to four principles. They are as follows: first, preparing and making space for reflection in the…

  8. EI: A Program for Ecological Inference

    Directory of Open Access Journals (Sweden)

    Gary King

    2004-09-01

    The program EI provides a method of inferring individual behavior from aggregate data. It implements the statistical procedures, diagnostics, and graphics from the book A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data (King 1997). Ecological inference, as traditionally defined, is the process of using aggregate (i.e., "ecological") data to infer discrete individual-level relationships of interest when individual-level data are not available. Ecological inferences are required in political science research when individual-level surveys are unavailable (e.g., local or comparative electoral politics), unreliable (racial politics), insufficient (political geography), or infeasible (political history). They are also required in numerous areas of major significance in public policy (e.g., for applying the Voting Rights Act) and in other academic disciplines ranging from epidemiology and marketing to sociology and quantitative history.

  9. Critical considerations for the application of environmental DNA methods to detect aquatic species

    Science.gov (United States)

    Goldberg, Caren S.; Turner, Cameron R.; Deiner, Kristy; Klymus, Katy E.; Thomsen, Philip Francis; Murphy, Melanie A.; Spear, Stephen F.; McKee, Anna; Oyler-McCance, Sara J.; Cornman, Robert S.; Laramie, Matthew B.; Mahon, Andrew R.; Lance, Richard F.; Pilliod, David S.; Strickler, Katherine M.; Waits, Lisette P.; Fremier, Alexander K.; Takahara, Teruhiko; Herder, Jelger E.; Taberlet, Pierre

    2016-01-01

    Species detection using environmental DNA (eDNA) has tremendous potential for contributing to the understanding of the ecology and conservation of aquatic species. Detecting species using eDNA methods, rather than directly sampling the organisms, can reduce impacts on sensitive species and increase the power of field surveys for rare and elusive species. The sensitivity of eDNA methods, however, requires a heightened awareness and attention to quality assurance and quality control protocols. Additionally, the interpretation of eDNA data demands careful consideration of multiple factors. As eDNA methods have grown in application, diverse approaches have been implemented to address these issues. With interest in eDNA continuing to expand, supportive guidelines for undertaking eDNA studies are greatly needed. Environmental DNA researchers from around the world have collaborated to produce this set of guidelines and considerations for implementing eDNA methods to detect aquatic macroorganisms. Critical considerations for study design include preventing contamination in the field and the laboratory, choosing appropriate sample analysis methods, validating assays, testing for sample inhibition and following minimum reporting guidelines. Critical considerations for inference include temporal and spatial processes, limits of correlation of eDNA with abundance, uncertainty of positive and negative results, and potential sources of allochthonous DNA. We present a synthesis of knowledge at this stage for application of this new and powerful detection method.

  10. Statistical inference involving binomial and negative binomial parameters.

    Science.gov (United States)

    García-Pérez, Miguel A; Núñez-Antón, Vicente

    2009-05-01

    Statistical inference about two binomial parameters implies that they are both estimated by binomial sampling. There are occasions in which one aims at testing the equality of two binomial parameters before and after the occurrence of the first success along a sequence of Bernoulli trials. In these cases, the binomial parameter before the first success is estimated by negative binomial sampling whereas that after the first success is estimated by binomial sampling, and both estimates are related. This paper derives statistical tools to test two hypotheses, namely, that both binomial parameters equal some specified value and that both parameters are equal though unknown. Simulation studies are used to show that in small samples both tests are accurate in keeping the nominal Type-I error rates, and also to determine sample size requirements to detect large, medium, and small effects with adequate power. Additional simulations also show that the tests are sufficiently robust to certain violations of their assumptions.
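
    The setting described above is easy to reproduce numerically. The sketch below is a generic likelihood-ratio illustration of the hypothesis that the success probability is the same before and after the first success, with the pre-success trials treated as geometric (negative binomial) sampling and the post-success trials as binomial sampling; it is not the specific test derived in the paper, and the trial counts and probabilities are arbitrary.

        import math, random

        def loglik_geom(p, k):
            """Log-likelihood of observing the first success on trial k (geometric)."""
            out = math.log(p)
            if k > 1:
                out += (k - 1) * math.log(1 - p)
            return out

        def loglik_binom(p, m, x):
            """Binomial log-likelihood (constant term dropped) for x successes in m trials."""
            out = 0.0
            if x > 0:
                out += x * math.log(p)
            if x < m:
                out += (m - x) * math.log(1 - p)
            return out

        def simulate(p, m, rng):
            """Bernoulli(p) trials: count trials up to the first success, then m more trials."""
            k = 1
            while rng.random() >= p:
                k += 1
            x = sum(rng.random() < p for _ in range(m))
            return k, x

        rng = random.Random(0)
        m, rejections, n_rep = 30, 0, 2000
        for _ in range(n_rep):
            k, x = simulate(0.3, m, rng)
            p1, p2, p0 = 1 / k, x / m, (1 + x) / (k + m)     # separate and pooled MLEs
            lrt = 2 * (loglik_geom(p1, k) + loglik_binom(p2, m, x)
                       - loglik_geom(p0, k) - loglik_binom(p0, m, x))
            rejections += lrt > 3.841                        # chi-square(1), alpha = 0.05
        print("empirical Type-I error:", rejections / n_rep)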

  11. Likelihood-Based Inference of B Cell Clonal Families.

    Directory of Open Access Journals (Sweden)

    Duncan K Ralph

    2016-10-01

    The human immune system depends on a highly diverse collection of antibody-making B cells. B cell receptor sequence diversity is generated by a random recombination process called "rearrangement" forming progenitor B cells, then a Darwinian process of lineage diversification and selection called "affinity maturation." The resulting receptors can be sequenced in high throughput for research and diagnostics. Such a collection of sequences contains a mixture of various lineages, each of which may be quite numerous, or may consist of only a single member. As a step to understanding the process and result of this diversification, one may wish to reconstruct lineage membership, i.e. to cluster sampled sequences according to which came from the same rearrangement events. We call this clustering problem "clonal family inference." In this paper we describe and validate a likelihood-based framework for clonal family inference based on a multi-hidden Markov Model (multi-HMM) framework for B cell receptor sequences. We describe an agglomerative algorithm to find a maximum likelihood clustering, two approximate algorithms with various trade-offs of speed versus accuracy, and a third, fast algorithm for finding specific lineages. We show that under simulation these algorithms greatly improve upon existing clonal family inference methods, and that they also give significantly different clusters than previous methods when applied to two real data sets.

  12. A combinatorial perspective of the protein inference problem.

    Science.gov (United States)

    Yang, Chao; He, Zengyou; Yu, Weichuan

    2013-01-01

    In a shotgun proteomics experiment, proteins are the most biologically meaningful output. The success of proteomics studies depends on the ability to accurately and efficiently identify proteins. Many methods have been proposed to facilitate the identification of proteins from peptide identification results. However, the relationship between protein identification and peptide identification has not been thoroughly explained before. In this paper, we devote ourselves to a combinatorial perspective of the protein inference problem. We employ combinatorial mathematics to calculate the conditional protein probabilities (protein probability means the probability that a protein is correctly identified) under three assumptions, which lead to a lower bound, an upper bound, and an empirical estimation of protein probabilities, respectively. The combinatorial perspective enables us to obtain an analytical expression for protein inference. Our method achieves comparable results with ProteinProphet in a more efficient manner in experiments on two data sets of standard protein mixtures and two data sets of real samples. Based on our model, we study the impact of unique peptides and degenerate peptides (degenerate peptides are peptides shared by at least two proteins) on protein probabilities. Meanwhile, we also study the relationship between our model and ProteinProphet. We name our program ProteinInfer. Its Java source code, our supplementary document and experimental results are available at: http://bioinformatics.ust.hk/proteininfer.

  13. A Calibration-Capture-Recapture Model for Inferring Natural Gas Leak Population Characteristics Using Data from Google Street View Cars

    Science.gov (United States)

    Weller, Z.; Hoeting, J.; von Fischer, J.

    2017-12-01

    Pipeline systems that distribute natural gas (NG) within cities can leak, leading to safety hazards and wasted product. Moreover, these leaks are climate-altering because NG is primarily composed of methane, a potent greenhouse gas. Scientists have recently developed an innovative method for mapping NG leak locations by installing atmospheric methane analyzers on Google Street View cars. We develop new statistical methodology to answer key inferential questions using data collected by these mobile air monitors. The new calibration-capture-recapture (CCR) model utilizes data from controlled methane releases and data collected by GSV cars to provide inference for several desired quantities, including the number of undetected methane sources and the total methane output rate in a surveyed region. The CCR model addresses challenges associated with using a capture-recapture model to analyze data collected by a mobile detection system including variable sampling effort and lack of physically marking individuals. We develop a Markov chain Monte Carlo algorithm for parameter estimation and apply the CCR model to methane data collected in two U.S. cities. The CCR model provides a new framework for inferring the total number of leaks in NG distribution systems and offers critical insights for informing intelligent repair policy that is both cost-effective and environmentally friendly.
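
    The paper's calibration-capture-recapture model is considerably richer than textbook capture-recapture, but the underlying logic of estimating undetected sources can be recalled with the classic two-pass Lincoln-Petersen estimator in its Chapman form; the leak counts below are invented for illustration.

        def lincoln_petersen(n1, n2, m):
            """Two-sample capture-recapture estimate of total population size
            (Chapman's bias-corrected form): n1 detections on pass 1, n2 on
            pass 2, m detected on both passes."""
            return (n1 + 1) * (n2 + 1) / (m + 1) - 1

        # e.g. 40 leaks found on the first drive, 35 on the second, 20 found on both
        print(lincoln_petersen(40, 35, 20))   # ~69.3 leaks estimated in total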

  14. Models for inference in dynamic metacommunity systems

    Science.gov (United States)

    Dorazio, Robert M.; Kery, Marc; Royle, J. Andrew; Plattner, Matthias

    2010-01-01

    A variety of processes are thought to be involved in the formation and dynamics of species assemblages. For example, various metacommunity theories are based on differences in the relative contributions of dispersal of species among local communities and interactions of species within local communities. Interestingly, metacommunity theories continue to be advanced without much empirical validation. Part of the problem is that statistical models used to analyze typical survey data either fail to specify ecological processes with sufficient complexity or they fail to account for errors in detection of species during sampling. In this paper, we describe a statistical modeling framework for the analysis of metacommunity dynamics that is based on the idea of adopting a unified approach, multispecies occupancy modeling, for computing inferences about individual species, local communities of species, or the entire metacommunity of species. This approach accounts for errors in detection of species during sampling and also allows different metacommunity paradigms to be specified in terms of species- and location-specific probabilities of occurrence, extinction, and colonization: all of which are estimable. In addition, this approach can be used to address inference problems that arise in conservation ecology, such as predicting temporal and spatial changes in biodiversity for use in making conservation decisions. To illustrate, we estimate changes in species composition associated with the species-specific phenologies of flight patterns of butterflies in Switzerland for the purpose of estimating regional differences in biodiversity.
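
    The building block that the multispecies dynamic model generalizes is the single-species, single-season occupancy likelihood, which separates the probability of occurrence from the probability of detection during sampling. A minimal sketch (illustrative parameter values and detection histories, not from the study) is:

        import math

        def occupancy_loglik(psi, p, detection_histories):
            """Log-likelihood of a single-season occupancy model: psi is the
            probability a site is occupied, p the per-visit detection probability,
            and detection_histories a list of 0/1 visit records per site."""
            ll = 0.0
            for y in detection_histories:
                n_visits, n_detections = len(y), sum(y)
                if n_detections > 0:
                    # site is certainly occupied; some visits simply missed it
                    ll += math.log(psi * p**n_detections * (1 - p)**(n_visits - n_detections))
                else:
                    # all-zero history: occupied-but-missed or truly unoccupied
                    ll += math.log(psi * (1 - p)**n_visits + (1 - psi))
            return ll

        histories = [[1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
        print(occupancy_loglik(psi=0.6, p=0.4, detection_histories=histories))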

  15. Legal Education: Critical of Contemporaneity

    Directory of Open Access Journals (Sweden)

    Patrícia Verônica Nunes Carvalho Sobral

    2016-10-01

    This study reflects on legal education in light of the critiques of contemporaneity. To reach this goal, the text is divided into: critique, idealization and reality of legal education; the professor in law schools; educational legislation; questions of legal education methodology; and pedagogy and the law. The reading of the sources informed inferences about the teaching of law, the methodological approach and the didactic-pedagogical preparation, following the Associação Latino Americana de Metodologia do Ensino do Direito. The study contributes to the continuity of an ongoing academic debate on a problem that concerns professional higher education.

  16. Use of an oscillation technique to measure effective cross-sections of fissionable samples in critical assemblies

    International Nuclear Information System (INIS)

    Tretiakoff, O.; Vidal, R.; Carre, J.C.; Robin, M.

    1964-01-01

    The authors describe the technique used to measure the effective absorption and neutron-yield cross-sections of a fissionable sample. These two values are determined by analysing the signals due to the variation in reactivity (over-all signal) and the local perturbation in the flux (local signal) produced by the oscillating sample. These signals are standardized by means of a set of samples containing quantities of fissionable material (235U) and an absorber, boron, which are well known. The measurements are made for different neutron spectra characterized by lattice parameters which constitute the central zone within which the sample moves. This technique is used to study the effective cross-sections of uranium-plutonium alloys for different heavy-water and graphite lattices in the MINERVE and MARIUS critical assemblies. The same experiments are carried out on fuel samples of different irradiations in order to determine the evolution of effective cross-sections as a function of the spectrum and the irradiations. (authors) [fr

  17. Simulated Sampling of Estuary Plankton

    Science.gov (United States)

    Fortner, Rosanne W.; Jenkins, Deborah Bainer

    2009-01-01

    To find out about the microscopic life in the valuable estuary environment, it is usually necessary to be near the water. This dry lab offers an alternative, using authentic data and a simulation of plankton sampling. From the types of organisms found in the sample, middle school students can infer relationships in the biological and physical…

  18. An Inference Language for Imaging

    DEFF Research Database (Denmark)

    Pedemonte, Stefano; Catana, Ciprian; Van Leemput, Koen

    2014-01-01

    We introduce iLang, a language and software framework for probabilistic inference. The iLang framework enables the definition of directed and undirected probabilistic graphical models and the automated synthesis of high performance inference algorithms for imaging applications. The iLang framewor...

  19. Inference

    DEFF Research Database (Denmark)

    Møller, Jesper

    2010-01-01

    Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation-free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with Markov chain Monte Carlo...... (MCMC) techniques. Due to space limitations the focus is on spatial point processes....

  20. Critical-state model for the determination of critical currents in disk-shaped superconductors

    International Nuclear Information System (INIS)

    Frankel, D.J.

    1979-01-01

    A series of experiments has been carried out on the flux trapping and shielding capabilities of a flat strip of Nb-Ti/Cu composite material. A circular piece of material from the strip was tested in a uniform field directed perpendicularly to the surface of the sample. Profiles of the normal component of the field along the sample diameter were measured. The critical-state model was adapted for this geometry and proved capable of reproducing the measured field profiles. Model curves agreed well with experimental field profiles generated when the full sample was in the critical state, when only a portion of the sample was in the critical state, and when profiles were obtained after the direction of the rate change of the magnetic field was reversed. The adaptation of the critical-state model to disk geometry provides a possible method either to derive values of the critical current from measurements of field profiles above thin flat samples, or to predict the trapping and shielding behavior of such samples if the critical current is already known. This method of determining critical currents does not require that samples be formed into narrow strips or wires, as is required for direct measurements of Jc, or into tubes or cylinders, as is usually required for magnetization-type measurements. Only a relatively small approximately circular piece of material is needed. The method relies on induced currents, so there is no need to pass large currents into the sample. The field-profile measurements are easily performed with inexpensive Hall probes and do not require detection of the resistive transition of the superconductor.

  1. Feature Inference Learning and Eyetracking

    Science.gov (United States)

    Rehder, Bob; Colner, Robert M.; Hoffman, Aaron B.

    2009-01-01

    Besides traditional supervised classification learning, people can learn categories by inferring the missing features of category members. It has been proposed that feature inference learning promotes learning a category's internal structure (e.g., its typical features and interfeature correlations) whereas classification promotes the learning of…

  2. Directed partial correlation: inferring large-scale gene regulatory network through induced topology disruptions.

    Directory of Open Access Journals (Sweden)

    Yinyin Yuan

    Inferring regulatory relationships among many genes based on their temporal variation in transcript abundance has been a popular research topic. Due to the nature of microarray experiments, classical tools for time series analysis lose power since the number of variables far exceeds the number of the samples. In this paper, we describe some of the existing multivariate inference techniques that are applicable to hundreds of variables and show the potential challenges for small-sample, large-scale data. We propose a directed partial correlation (DPC) method as an efficient and effective solution to regulatory network inference using these data. Specifically for genomic data, the proposed method is designed to deal with large-scale datasets. It combines the efficiency of partial correlation for setting up network topology by testing conditional independence, and the concept of Granger causality to assess topology change with induced interruptions. The idea is that when a transcription factor is induced artificially within a gene network, the disruption of the network by the induction signifies a gene's role in transcriptional regulation. The benchmarking results using GeneNetWeaver, the simulator for the DREAM challenges, provide strong evidence of the outstanding performance of the proposed DPC method. When applied to real biological data, the inferred starch metabolism network in Arabidopsis reveals many biologically meaningful network modules worthy of further investigation. These results collectively suggest DPC is a versatile tool for genomics research. The R package DPC is available for download (http://code.google.com/p/dpcnet/).
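
    The undirected ingredient of such methods, partial correlation obtained from the inverse covariance (precision) matrix, can be sketched in a few lines; the DPC method then adds directionality via induced interruptions and Granger-style reasoning, which is not shown here. The data below are synthetic.

        import numpy as np

        def partial_correlations(data):
            """Partial correlation between every pair of variables, conditioning on
            all the others, via the inverse covariance (precision) matrix.
            data: (n_samples, n_variables) array; requires n_samples > n_variables."""
            precision = np.linalg.inv(np.cov(data, rowvar=False))
            d = np.sqrt(np.diag(precision))
            pcorr = -precision / np.outer(d, d)
            np.fill_diagonal(pcorr, 1.0)
            return pcorr

        rng = np.random.default_rng(0)
        x = rng.normal(size=(200, 5))
        x[:, 1] += 0.8 * x[:, 0]      # direct link 0 -> 1
        x[:, 2] += 0.8 * x[:, 1]      # direct link 1 -> 2; 0 and 2 are only indirectly related
        print(np.round(partial_correlations(x), 2))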

  3. Phylodynamic Inference with Kernel ABC and Its Application to HIV Epidemiology.

    Science.gov (United States)

    Poon, Art F Y

    2015-09-01

    The shapes of phylogenetic trees relating virus populations are determined by the adaptation of viruses within each host, and by the transmission of viruses among hosts. Phylodynamic inference attempts to reverse this flow of information, estimating parameters of these processes from the shape of a virus phylogeny reconstructed from a sample of genetic sequences from the epidemic. A key challenge to phylodynamic inference is quantifying the similarity between two trees in an efficient and comprehensive way. In this study, I demonstrate that a new distance measure, based on a subset tree kernel function from computational linguistics, confers a significant improvement over previous measures of tree shape for classifying trees generated under different epidemiological scenarios. Next, I incorporate this kernel-based distance measure into an approximate Bayesian computation (ABC) framework for phylodynamic inference. ABC bypasses the need for an analytical solution of model likelihood, as it only requires the ability to simulate data from the model. I validate this "kernel-ABC" method for phylodynamic inference by estimating parameters from data simulated under a simple epidemiological model. Results indicate that kernel-ABC attained greater accuracy for parameters associated with virus transmission than leading software on the same data sets. Finally, I apply the kernel-ABC framework to study a recent outbreak of a recombinant HIV subtype in China. Kernel-ABC provides a versatile framework for phylodynamic inference because it can fit a broader range of models than methods that rely on the computation of exact likelihoods.
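
    Stripped of the tree kernel, the rejection-ABC skeleton that the method builds on is short. The sketch below substitutes a toy exponential waiting-time model and a plain absolute-difference distance for the kernel-based tree distance; the model, prior and tolerance are illustrative choices, not those of the paper.

        import random

        def abc_rejection(observed_stat, simulate, distance, prior_sample,
                          n_draws=20000, epsilon=0.05, seed=0):
            """Plain rejection ABC: keep prior draws whose simulated summary
            statistic lands within epsilon of the observed one."""
            rng = random.Random(seed)
            accepted = []
            for _ in range(n_draws):
                theta = prior_sample(rng)
                if distance(simulate(theta, rng), observed_stat) < epsilon:
                    accepted.append(theta)
            return accepted

        # Toy example: infer the rate of an exponential waiting-time model from
        # the mean of 50 observations (true rate 2.0 gives an observed mean near 0.5).
        posterior = abc_rejection(
            observed_stat=0.5,
            simulate=lambda rate, rng: sum(rng.expovariate(rate) for _ in range(50)) / 50,
            distance=lambda a, b: abs(a - b),
            prior_sample=lambda rng: rng.uniform(0.1, 10.0),
        )
        print(len(posterior), "accepted; posterior mean rate ~", sum(posterior) / len(posterior))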

  4. Network inference from functional experimental data (Conference Presentation)

    Science.gov (United States)

    Desrosiers, Patrick; Labrecque, Simon; Tremblay, Maxime; Bélanger, Mathieu; De Dorlodot, Bertrand; Côté, Daniel C.

    2016-03-01

    Functional connectivity maps of neuronal networks are critical tools to understand how neurons form circuits, how information is encoded and processed by neurons, how memory is shaped, and how these basic processes are altered under pathological conditions. Current light microscopy allows one to observe calcium or electrical activity of thousands of neurons simultaneously, yet assessing comprehensive connectivity maps directly from such data remains a non-trivial analytical task. There exist simple statistical methods, such as cross-correlation and Granger causality, but they only detect linear interactions between neurons. Other more involved inference methods inspired by information theory, such as mutual information and transfer entropy, identify more accurately connections between neurons but also require more computational resources. We carried out a comparative study of common connectivity inference methods. The relative accuracy and computational cost of each method was determined via simulated fluorescence traces generated with realistic computational models of interacting neurons in networks of different topologies (clustered or non-clustered) and sizes (10-1000 neurons). To bridge the computational and experimental works, we observed the intracellular calcium activity of live hippocampal neuronal cultures infected with the fluorescent calcium marker GCaMP6f. The spontaneous activity of the networks, consisting of 50-100 neurons per field of view, was recorded at 20 to 50 Hz on a microscope controlled by homemade software. We implemented all connectivity inference methods in the software, which rapidly loads calcium fluorescence movies, segments the images, extracts the fluorescence traces, and assesses the functional connections (with strengths and directions) between each pair of neurons. We used this software to assess, in real time, the functional connectivity from real calcium imaging data in basal conditions, under plasticity protocols, and epileptic
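
    Among the simple linear baselines mentioned above, lagged cross-correlation is the easiest to write down; the sketch below scores each ordered pair of traces by its peak absolute correlation over a small range of lags (synthetic toy traces, not data from the study).

        import numpy as np

        def lagged_cross_correlation(traces, max_lag=5):
            """Pairwise connectivity scores from activity traces: the peak absolute
            correlation of trace i(t) with trace j(t+lag) over lags 1..max_lag.
            traces: (n_neurons, n_frames) array."""
            n, t = traces.shape
            z = (traces - traces.mean(axis=1, keepdims=True)) / traces.std(axis=1, keepdims=True)
            scores = np.zeros((n, n))
            for lag in range(1, max_lag + 1):
                c = z[:, :-lag] @ z[:, lag:].T / (t - lag)   # corr of i(t) with j(t+lag)
                scores = np.maximum(scores, np.abs(c))
            np.fill_diagonal(scores, 0.0)
            return scores

        rng = np.random.default_rng(1)
        x = rng.normal(size=(4, 1000))
        x[2, 3:] += 0.9 * x[0, :-3]      # neuron 0 drives neuron 2 with a 3-frame delay
        print(np.round(lagged_cross_correlation(x), 2))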

  5. Fractional Yields Inferred from Halo and Thick Disk Stars

    Science.gov (United States)

    Caimmi, R.

    2013-12-01

    Linear [Q/H]-[O/H] relations, Q = Na, Mg, Si, Ca, Ti, Cr, Fe, Ni, are inferred from a sample (N=67) of recently studied FGK-type dwarf stars in the solar neighbourhood including different populations (Nissen and Schuster 2010, Ramirez et al. 2012), namely LH (N=24, low-α halo), HH (N=25, high-α halo), KD (N=16, thick disk), and OL (N=2, globular cluster outliers). Regression line slope and intercept estimators and related variance estimators are determined. With regard to the straight line, [Q/H] = a_Q [O/H] + b_Q, sample stars are displayed along a "main sequence", [Q,O] = [a_Q, b_Q, Δb_Q], leaving aside the two OL stars, which, in most cases (e.g. Na), lie outside. The unit slope, a_Q = 1, implies Q is a primary element synthesised via SNII progenitors in the presence of a universal stellar initial mass function (defined as a simple primary element). In this respect, Mg, Si, Ti show a_Q = 1 within ±2 standard errors of the slope estimate; Cr, Fe, Ni within ±3 standard errors; Na, Ca within ±r standard errors, r > 3. The empirical, differential element abundance distributions are inferred from the LH, HH, KD, and HA = HH + KD subsamples, where related regression lines represent their theoretical counterparts within the framework of simple MCBR (multistage closed box + reservoir) chemical evolution models. Hence, the fractional yield estimates, p_Q/p_O, are determined and (as an example) a comparison is shown with their theoretical counterparts inferred from SNII progenitor nucleosynthesis under the assumption of a power-law stellar initial mass function. The generalized fractional yields, C_Q = Z_Q/Z_O^{a_Q}, are determined regardless of the chemical evolution model. The ratio of outflow to star formation rate is compared for different populations in the framework of simple MCBR models. The opposite situation of element abundance variation entirely due to cosmic scatter is also considered under reasonable assumptions. The related differential element abundance

  6. Nested Sampling with Constrained Hamiltonian Monte Carlo

    OpenAIRE

    Betancourt, M. J.

    2010-01-01

    Nested sampling is a powerful approach to Bayesian inference ultimately limited by the computationally demanding task of sampling from a heavily constrained probability distribution. An effective algorithm in its own right, Hamiltonian Monte Carlo is readily adapted to efficiently sample from any smooth, constrained distribution. Utilizing this constrained Hamiltonian Monte Carlo, I introduce a general implementation of the nested sampling algorithm.
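
    A bare-bones version of the nested sampling loop makes the bottleneck explicit: the constrained draw below is done by naive rejection from the prior, which is precisely the step that constrained Hamiltonian Monte Carlo is designed to replace in harder problems. The one-dimensional likelihood and prior are toy choices.

        import math
        import numpy as np

        def nested_sampling(log_likelihood, prior_draw, n_live=100, n_iter=500, seed=0):
            """Bare-bones nested sampling estimate of the log-evidence."""
            rng = np.random.default_rng(seed)
            live = [prior_draw(rng) for _ in range(n_live)]
            live_ll = [log_likelihood(x) for x in live]
            log_z, x_prev = -np.inf, 1.0
            for i in range(1, n_iter + 1):
                worst = int(np.argmin(live_ll))
                ll_min = live_ll[worst]
                x_i = math.exp(-i / n_live)                  # expected remaining prior volume
                log_z = np.logaddexp(log_z, ll_min + math.log(x_prev - x_i))
                x_prev = x_i
                while True:                                  # naive constrained draw: L > L_min
                    candidate = prior_draw(rng)
                    if log_likelihood(candidate) > ll_min:
                        live[worst], live_ll[worst] = candidate, log_likelihood(candidate)
                        break
            # add the contribution of the remaining live points
            log_z = np.logaddexp(log_z, math.log(np.mean(np.exp(live_ll))) + math.log(x_prev))
            return float(log_z)

        # Toy check: standard normal likelihood with a Uniform(-5, 5) prior; evidence ~ 1/10.
        log_l = lambda x: -0.5 * x * x - 0.5 * math.log(2 * math.pi)
        print(nested_sampling(log_l, lambda rng: rng.uniform(-5, 5)), "vs", math.log(0.1))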

  7. Critical investigation of the separation of noradrenaline and adrenaline from urine samples using Al2O3 as adsorbent

    International Nuclear Information System (INIS)

    Neidhart, B.; Kringe, K.-P.; Deutschmann, P.

    1983-01-01

    A critical investigation of the separation of free noradrenaline and adrenaline from urine samples revealed serious errors during sample pretreatment using Al2O3 as adsorbent. An exact and rapid pH adjustment of the sample, using thymol-blue as indicator, proved to be the chief prerequisite for precise and accurate results. Increasing temperature and pH favour the oxidative decomposition of the catecholamines during routine analysis. This was examined, using the radiotracer method and liquid scintillation counting. (author)

  8. Forward and backward inference in spatial cognition.

    Directory of Open Access Journals (Sweden)

    Will D Penny

    This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.
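
    A toy discretization of the forward (filtering) computation described here, combining a path-integration prediction with a sensory likelihood on a one-dimensional circular track, might look as follows; the grid size, noise levels and observations are invented for illustration.

        import numpy as np

        def forward_step(belief, obs, drift=1, motion_noise=0.1, obs_sigma=1.5):
            """One forward-inference step on a circular track of grid cells: predict
            with the path-integration (motion) model, then weight by the likelihood
            of a noisy position observation and renormalise."""
            n = belief.size
            predicted = np.roll(belief, drift)                               # expected move
            predicted = (1 - 2 * motion_noise) * predicted + motion_noise * (
                np.roll(predicted, 1) + np.roll(predicted, -1))              # motion noise
            cells = np.arange(n)
            likelihood = np.exp(-0.5 * ((cells - obs) / obs_sigma) ** 2)     # sensory input
            posterior = predicted * likelihood
            return posterior / posterior.sum()

        belief = np.full(20, 1 / 20)           # start fully uncertain about location
        for obs in [3, 4, 5, 6]:               # noisy observations as the agent moves right
            belief = forward_step(belief, obs)
        print(belief.argmax())                 # belief concentrates near cell 6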

  9. Experiments for Evaluating Application of Bayesian Inference to Situation Awareness of Human Operators in NPPs

    Energy Technology Data Exchange (ETDEWEB)

    Kang, Seong Keun; Seong, Poong Hyun [KAIST, Daejeon (Korea, Republic of)

    2014-08-15

    Bayesian methodology has been widely used in various research fields. It is a method of inference that uses Bayes' rule to update the estimated probability of a hypothesis as additional evidence is acquired. According to current research, malfunctions of a nuclear power plant can be detected by using Bayesian inference, which consistently accumulates newly incoming data and updates its estimate. However, those studies are based on the assumption that people behave like perfect computers, which can be criticized and may cause problems in real-world applications. Studies in cognitive psychology indicate that when the amount of information becomes large, people cannot retain all of it because they have a limited memory capacity, well known as working memory, and they also have limited attention. The purpose of this paper is to consider these psychological factors and confirm how much working memory and attention affect the estimation based on Bayesian inference. To confirm this, experiments on humans are needed, and the experimental tool is the Compact Nuclear Simulator (CNS).

  10. Experiments for Evaluating Application of Bayesian Inference to Situation Awareness of Human Operators in NPPs

    International Nuclear Information System (INIS)

    Kang, Seong Keun; Seong, Poong Hyun

    2014-01-01

    Bayesian methodology has been widely used in various research fields. It is a method of inference that uses Bayes' rule to update the estimated probability of a hypothesis as additional evidence is acquired. According to current research, malfunctions of a nuclear power plant can be detected by using Bayesian inference, which consistently accumulates newly incoming data and updates its estimate. However, those studies are based on the assumption that people behave like perfect computers, which can be criticized and may cause problems in real-world applications. Studies in cognitive psychology indicate that when the amount of information becomes large, people cannot retain all of it because they have a limited memory capacity, well known as working memory, and they also have limited attention. The purpose of this paper is to consider these psychological factors and confirm how much working memory and attention affect the estimation based on Bayesian inference. To confirm this, experiments on humans are needed, and the experimental tool is the Compact Nuclear Simulator (CNS).

  11. Spatially explicit inference for open populations: estimating demographic parameters from camera-trap studies.

    Science.gov (United States)

    Gardner, Beth; Reppucci, Juan; Lucherini, Mauro; Royle, J Andrew

    2010-11-01

    We develop a hierarchical capture-recapture model for demographically open populations when auxiliary spatial information about location of capture is obtained. Such spatial capture-recapture data arise from studies based on camera trapping, DNA sampling, and other situations in which a spatial array of devices records encounters of unique individuals. We integrate an individual-based formulation of a Jolly-Seber type model with recently developed spatially explicit capture-recapture models to estimate density and demographic parameters for survival and recruitment. We adopt a Bayesian framework for inference under this model using the method of data augmentation, which is implemented in the software program WinBUGS. The model was motivated by a camera trapping study of Pampas cats Leopardus colocolo from Argentina, which we present as an illustration of the model in this paper. We provide estimates of density and the first quantitative assessment of vital rates for the Pampas cat in the High Andes. The precision of these estimates is poor, likely due to the sparse data set. Unlike conventional inference methods which usually rely on asymptotic arguments, Bayesian inferences are valid for arbitrary sample sizes, and thus the method is ideal for the study of rare or endangered species for which small data sets are typical.

  12. A Bayesian Method for Weighted Sampling

    OpenAIRE

    Lo, Albert Y.

    1993-01-01

    Bayesian statistical inference for sampling from weighted distribution models is studied. Small-sample Bayesian bootstrap clone (BBC) approximations to the posterior distribution are discussed. A second-order property for the BBC in unweighted i.i.d. sampling is given. A consequence is that BBC approximations to a posterior distribution of the mean and to the sampling distribution of the sample average, can be made asymptotically accurate by a proper choice of the random variables that genera...

  13. A formal model of interpersonal inference

    Directory of Open Access Journals (Sweden)

    Michael Moutoussis

    2014-03-01

    Introduction: We propose that active Bayesian inference – a general framework for decision-making – can equally be applied to interpersonal exchanges. Social cognition, however, entails special challenges. We address these challenges through a novel formulation of a formal model and demonstrate its psychological significance. Method: We review relevant literature, especially with regard to interpersonal representations, formulate a mathematical model and present a simulation study. The model accommodates normative models from utility theory and places them within the broader setting of Bayesian inference. Crucially, we endow people's prior beliefs, into which utilities are absorbed, with preferences of self and others. The simulation illustrates the model's dynamics and furnishes elementary predictions of the theory. Results: 1. Because beliefs about self and others inform both the desirability and plausibility of outcomes, in this framework interpersonal representations become beliefs that have to be actively inferred. This inference, akin to 'mentalising' in the psychological literature, is based upon the outcomes of interpersonal exchanges. 2. We show how some well-known social-psychological phenomena (e.g. self-serving biases) can be explained in terms of active interpersonal inference. 3. Mentalising naturally entails Bayesian updating of how people value social outcomes. Crucially this includes inference about one’s own qualities and preferences. Conclusion: We inaugurate a Bayes optimal framework for modelling intersubject variability in mentalising during interpersonal exchanges. Here, interpersonal representations are endowed with explicit functional and affective properties. We suggest the active inference framework lends itself to the study of psychiatric conditions where mentalising is distorted.

  14. Supervised learning for infection risk inference using pathology data.

    Science.gov (United States)

    Hernandez, Bernard; Herrero, Pau; Rawson, Timothy Miles; Moore, Luke S P; Evans, Benjamin; Toumazou, Christofer; Holmes, Alison H; Georgiou, Pantelis

    2017-12-08

    Antimicrobial Resistance is threatening our ability to treat common infectious diseases and overuse of antimicrobials to treat human infections in hospitals is accelerating this process. Clinical Decision Support Systems (CDSSs) have been proven to enhance quality of care by promoting change in prescription practices through antimicrobial selection advice. However, bypassing an initial assessment to determine the existence of an underlying disease that justifies the need of antimicrobial therapy might lead to indiscriminate and often unnecessary prescriptions. From pathology laboratory tests, six biochemical markers were selected and combined with microbiology outcomes from susceptibility tests to create a unique dataset with over one and a half million daily profiles to perform infection risk inference. Outliers were discarded using the inter-quartile range rule and several sampling techniques were studied to tackle the class imbalance problem. The first phase selects the most effective and robust model during training using ten-fold stratified cross-validation. The second phase evaluates the final model after isotonic calibration in scenarios with missing inputs and imbalanced class distributions. More than 50% of infected profiles have daily requested laboratory tests for the six biochemical markers with very promising infection inference results: area under the receiver operating characteristic curve (0.80-0.83), sensitivity (0.64-0.75) and specificity (0.92-0.97). Standardization consistently outperforms normalization and sensitivity is enhanced by using the SMOTE sampling technique. Furthermore, models operated without noticeable loss in performance if at least four biomarkers were available. The selected biomarkers comprise enough information to perform infection risk inference with a high degree of confidence even in the presence of incomplete and imbalanced data. Since they are commonly available in hospitals, Clinical Decision Support Systems could
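
    A hedged sketch of this kind of workflow, standardization, minority-class oversampling applied only inside the training folds, and an isotonically calibrated classifier evaluated by stratified cross-validation, is given below. It uses synthetic features in place of the six biochemical markers, assumes the scikit-learn and imbalanced-learn packages are available, and mirrors the described workflow rather than the authors' implementation.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.preprocessing import StandardScaler
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.calibration import CalibratedClassifierCV
        from sklearn.model_selection import StratifiedKFold, cross_val_score
        from imblearn.over_sampling import SMOTE
        from imblearn.pipeline import Pipeline

        # Synthetic stand-in for the six biochemical markers, with imbalanced classes.
        X, y = make_classification(n_samples=5000, n_features=6, n_informative=4,
                                   weights=[0.9, 0.1], random_state=0)

        pipeline = Pipeline([
            ("scale", StandardScaler()),        # standardization, per the abstract
            ("smote", SMOTE(random_state=0)),   # oversample the minority class in training folds only
            ("clf", CalibratedClassifierCV(RandomForestClassifier(n_estimators=200, random_state=0),
                                           method="isotonic", cv=3)),
        ])

        cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
        auc = cross_val_score(pipeline, X, y, cv=cv, scoring="roc_auc")
        print("AUROC: %.3f +/- %.3f" % (auc.mean(), auc.std()))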

  15. Incorporation of an Explicit Critical-Thinking Curriculum to Improve Pharmacy Students' Critical-Thinking Skills.

    Science.gov (United States)

    Cone, Catherine; Godwin, Donald; Salazar, Krista; Bond, Rucha; Thompson, Megan; Myers, Orrin

    2016-04-25

    Objective. The Health Sciences Reasoning Test (HSRT) is a validated instrument to assess critical-thinking skills. The objective of this study was to determine if HSRT results improved in second-year student pharmacists after exposure to an explicit curriculum designed to develop critical-thinking skills. Methods. In December 2012, the HSRT was administered to students who were in their first year of pharmacy school. Starting in August 2013, students attended a 16-week laboratory curriculum using simulation, formative feedback, and clinical reasoning to teach critical-thinking skills. Following completion of this course, the HSRT was readministered to the same cohort of students. Results. All students enrolled in the course (83) took the HSRT, and following exclusion criteria, 90% of the scores were included in the statistical analysis. Exclusion criteria included students who did not finish more than 60% of the questions or who took less than 15 minutes to complete the test. Significant changes in the HSRT occurred in overall scores and in the subdomains of deduction, evaluation, and inference after students completed the critical-thinking curriculum. Conclusions. Significant improvement in HSRT scores occurred following student immersion in an explicit critical-thinking curriculum. The HSRT was useful in detecting these changes, showing that critical-thinking skills can be learned and then assessed over a relatively short period using a standardized, validated assessment tool like the HSRT.

  16. Incorporation of an Explicit Critical-Thinking Curriculum to Improve Pharmacy Students’ Critical-Thinking Skills

    Science.gov (United States)

    Godwin, Donald; Salazar, Krista; Bond, Rucha; Thompson, Megan; Myers, Orrin

    2016-01-01

    Objective. The Health Sciences Reasoning Test (HSRT) is a validated instrument to assess critical-thinking skills. The objective of this study was to determine if HSRT results improved in second-year student pharmacists after exposure to an explicit curriculum designed to develop critical-thinking skills. Methods. In December 2012, the HSRT was administered to students who were in their first year of pharmacy school. Starting in August 2013, students attended a 16-week laboratory curriculum using simulation, formative feedback, and clinical reasoning to teach critical-thinking skills. Following completion of this course, the HSRT was readministered to the same cohort of students. Results. All students enrolled in the course (83) took the HSRT, and following exclusion criteria, 90% of the scores were included in the statistical analysis. Exclusion criteria included students who did not finish more than 60% of the questions or who took less than 15 minutes to complete the test. Significant changes in the HSRT occurred in overall scores and in the subdomains of deduction, evaluation, and inference after students completed the critical-thinking curriculum. Conclusions. Significant improvement in HSRT scores occurred following student immersion in an explicit critical-thinking curriculum. The HSRT was useful in detecting these changes, showing that critical-thinking skills can be learned and then assessed over a relatively short period using a standardized, validated assessment tool like the HSRT. PMID:27170812

  17. Distributional Inference

    NARCIS (Netherlands)

    Kroese, A.H.; van der Meulen, E.A.; Poortema, Klaas; Schaafsma, W.

    1995-01-01

    The making of statistical inferences in distributional form is conceptually complicated because the epistemic 'probabilities' assigned are mixtures of fact and fiction. In this respect they are essentially different from 'physical' or 'frequency-theoretic' probabilities. The distributional form is

  18. The perils of straying from protocol: sampling bias and interviewer effects.

    Directory of Open Access Journals (Sweden)

    Carrie J Ngongo

    Fidelity to research protocol is critical. In a contingent valuation study in an informal urban settlement in Nairobi, Kenya, participants responded differently to the three trained interviewers. Interviewer effects were present during the survey pilot, then magnified at the start of the main survey after a seemingly slight adaptation of the survey sampling protocol allowed interviewers to speak with the "closest neighbor" in the event that no one was home at a selected household. This slight degree of interviewer choice led to inferred sampling bias. Multinomial logistic regression and post-estimation tests revealed that the three interviewers' samples differed significantly from one another according to six demographic characteristics. The two female interviewers were 2.8 and 7.7 times less likely to talk with respondents of low socio-economic status than the male interviewer. Systematic error renders it impossible to determine which of the survey responses might be "correct." This experience demonstrates why researchers must take care to strictly follow sampling protocols, consistently train interviewers, and monitor responses by interviewer to ensure similarity between interviewers' groups and produce unbiased estimates of the parameters of interest.

  19. Robust Inference of Population Structure for Ancestry Prediction and Correction of Stratification in the Presence of Relatedness

    Science.gov (United States)

    Conomos, Matthew P.; Miller, Mike; Thornton, Timothy

    2016-01-01

    Population structure inference with genetic data has been motivated by a variety of applications in population genetics and genetic association studies. Several approaches have been proposed for the identification of genetic ancestry differences in samples where study participants are assumed to be unrelated, including principal components analysis (PCA), multi-dimensional scaling (MDS), and model-based methods for proportional ancestry estimation. Many genetic studies, however, include individuals with some degree of relatedness, and existing methods for inferring genetic ancestry fail in related samples. We present a method, PC-AiR, for robust population structure inference in the presence of known or cryptic relatedness. PC-AiR utilizes genome-screen data and an efficient algorithm to identify a diverse subset of unrelated individuals that is representative of all ancestries in the sample. The PC-AiR method directly performs PCA on the identified ancestry representative subset and then predicts components of variation for all remaining individuals based on genetic similarities. In simulation studies and in applications to real data from Phase III of the HapMap Project, we demonstrate that PC-AiR provides a substantial improvement over existing approaches for population structure inference in related samples. We also demonstrate significant efficiency gains, where a single axis of variation from PC-AiR provides better prediction of ancestry in a variety of structure settings than using ten (or more) components of variation from widely used PCA and MDS approaches. Finally, we illustrate that PC-AiR can provide improved population stratification correction over existing methods in genetic association studies with population structure and relatedness. PMID:25810074

  20. Transport measurements in superconductors: critical current of granular high TC ceramic superconductor samples; Medidas de transporte em supercondutores: corrente critica de supercondutores granulares de alta temperatura critica

    Energy Technology Data Exchange (ETDEWEB)

    Passos, W.A.C., E-mail: wagner.passos@univasf.edu.br [Universidade Federal do Vale do Sao Francisco (IPCM/UNIVASF), Juazeiro do Norte, BA (Brazil). Instituto de Pesquisas em Ciencia dos Materiais; Silva, E.B. [Companhia Energetica do Sao Francisco (CHESF), Recife, PE (Brazil)

    2016-07-01

    This work presents a method to obtain the critical current of granular superconductors. We have carried out transport measurements (ρxT curves and VxI curves) on a YBa2Cu3O7-δ sample to determine its critical current density. Some specimens reveal a 'semiconductor-like' behavior (electrical resistivity decreases with increasing temperature above the critical temperature Tc of the material) competing with the superconducting behavior. Due to the high granular fraction of the sample, this competition is clearly noted in the ρxT curves. Measurements carried out with applied fields from 0 to 8500 Oe show the same behavior, and the critical current density of the samples is reported. (author)

  1. Continuous Integrated Invariant Inference, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — The proposed project will develop a new technique for invariant inference and embed this and other current invariant inference and checking techniques in an...

  2. Inferring category characteristics from sample characteristics: inductive reasoning and social projection.

    Science.gov (United States)

    Krueger, J; Clement, R W

    1996-03-01

    Inductive reasoning involves generalization from sample observations to categories. This research examined the conditions under which generalizations go beyond the boundaries of the sampled categories. In Experiment 1, participants sampled colored chips from urns. When categorization was not salient, participants revised their estimates of the probability of a particular color even in urns they had not sampled. As categorization became more salient, generalization became limited to the sampled urn. In Experiment 2 the salience of categorization in social induction was varied. When social categorization was not salient, participants projected their own responses to test items to members of a laboratory group even when they themselves did not belong to this group. When salience increased, projection decreased among nonmembers but not among members. In Experiment 3 these results were replicated in a field setting.

  3. Statistical Models for Inferring Vegetation Composition from Fossil Pollen

    Science.gov (United States)

    Paciorek, C.; McLachlan, J. S.; Shang, Z.

    2011-12-01

    Fossil pollen provides information about vegetation composition that can be used to help understand how vegetation has changed in the past. However, these data have not traditionally been analyzed in a way that allows for statistical inference about spatio-temporal patterns and trends. We build a Bayesian hierarchical model called STEPPS (Spatio-Temporal Empirical Prediction from Pollen in Sediments) that predicts forest composition in southern New England, USA, over the last two millennia based on fossil pollen. The critical relationships between abundances of tree taxa in the pollen record and abundances in actual vegetation are estimated using modern (Forest Inventory Analysis) data and (witness tree) data from colonial records. This gives us two time points at which both pollen and direct vegetation data are available. Based on these relationships, and incorporating our uncertainty about them, we predict forest composition using fossil pollen. We estimate the spatial distribution and relative abundances of tree species and draw inference about how these patterns have changed over time. Finally, we describe ongoing work to extend the modeling to the upper Midwest of the U.S., including an approach to infer tree density and thereby estimate the prairie-forest boundary in Minnesota and Wisconsin. This work is part of the PalEON project, which brings together a team of ecosystem modelers, paleoecologists, and statisticians with the goal of reconstructing vegetation responses to climate during the last two millennia in the northeastern and midwestern United States. The estimates from the statistical modeling will be used to assess and calibrate ecosystem models that are used to project ecological changes in response to global change.

  4. Estimating uncertainty of inference for validation

    Energy Technology Data Exchange (ETDEWEB)

    Booker, Jane M [Los Alamos National Laboratory; Langenbrunner, James R [Los Alamos National Laboratory; Hemez, Francois M [Los Alamos National Laboratory; Ross, Timothy J [UNM

    2010-09-30

    We present a validation process based upon the concept that validation is an inference-making activity. This has always been true, but the association has not been as important before as it is now. Previously, theory had been confirmed by more data, and predictions were possible based on data. The process today is to infer from theory to code and from code to prediction, making the role of prediction somewhat automatic, and a machine function. Validation is defined as determining the degree to which a model and code is an accurate representation of experimental test data. Imbedded in validation is the intention to use the computer code to predict. To predict is to accept the conclusion that an observable final state will manifest; therefore, prediction is an inference whose goodness relies on the validity of the code. Quantifying the uncertainty of a prediction amounts to quantifying the uncertainty of validation, and this involves the characterization of uncertainties inherent in theory/models/codes and the corresponding data. An introduction to inference making and its associated uncertainty is provided as a foundation for the validation problem. A mathematical construction for estimating the uncertainty in the validation inference is then presented, including a possibility distribution constructed to represent the inference uncertainty for validation under uncertainty. The estimation of inference uncertainty for validation is illustrated using data and calculations from Inertial Confinement Fusion (ICF). The ICF measurements of neutron yield and ion temperature were obtained for direct-drive inertial fusion capsules at the Omega laser facility. The glass capsules, containing the fusion gas, were systematically selected with the intent of establishing a reproducible baseline of high-yield 10^13-10^14 neutron output. The deuterium-tritium ratio in these experiments was varied to study its influence upon yield. This paper on validation inference is the

  5. IELTS ACADEMIC READING ACHIEVEMENT: THE CONTRIBUTION OF INFERENCE-MAKING AND EVALUATION OF ARGUMENTS

    OpenAIRE

    Afsaneh Ghanizadeh; Azam Vahidian Pour; Akram Hosseini

    2017-01-01

    The pivotal undertaking of education today is to endow individuals with the capacity to think flexibly, reason rationally, and keep an open mind so as to evaluate and interpret situations. In line with the studies demonstrating the positive relationship between higher-order thinking skills and academic achievement, this study aimed to examine in particular the impact of two subcomponents of critical thinking, i.e., inference-making and evaluation of arguments, on academic IELT...

  6. Choice of sample size for high transport critical current density in a granular superconductor: percolation versus self-field effects

    International Nuclear Information System (INIS)

    Mulet, R.; Diaz, O.; Altshuler, E.

    1997-01-01

    The percolative character of the current paths and the self-field effects were considered to estimate optimal sample dimensions for the transport current of a granular superconductor by means of a Monte Carlo algorithm and critical-state model calculations. We showed that, under certain conditions, self-field effects are negligible and the Jc dependence on sample dimensions is determined by the percolative character of the current. Optimal dimensions are demonstrated to be a function of the fraction of superconducting phase in the sample. (author)

  7. Human brain lesion-deficit inference remapped.

    Science.gov (United States)

    Mah, Yee-Haur; Husain, Masud; Rees, Geraint; Nachev, Parashkev

    2014-09-01

    Our knowledge of the anatomical organization of the human brain in health and disease draws heavily on the study of patients with focal brain lesions. Historically the first method of mapping brain function, it is still potentially the most powerful, establishing the necessity of any putative neural substrate for a given function or deficit. Great inferential power, however, carries a crucial vulnerability: without stronger alternatives any consistent error cannot be easily detected. A hitherto unexamined source of such error is the structure of the high-dimensional distribution of patterns of focal damage, especially in ischaemic injury-the commonest aetiology in lesion-deficit studies-where the anatomy is naturally shaped by the architecture of the vascular tree. This distribution is so complex that analysis of lesion data sets of conventional size cannot illuminate its structure, leaving us in the dark about the presence or absence of such error. To examine this crucial question we assembled the largest known set of focal brain lesions (n = 581), derived from unselected patients with acute ischaemic injury (mean age = 62.3 years, standard deviation = 17.8, male:female ratio = 0.547), visualized with diffusion-weighted magnetic resonance imaging, and processed with validated automated lesion segmentation routines. High-dimensional analysis of this data revealed a hidden bias within the multivariate patterns of damage that will consistently distort lesion-deficit maps, displacing inferred critical regions from their true locations, in a manner opaque to replication. Quantifying the size of this mislocalization demonstrates that past lesion-deficit relationships estimated with conventional inferential methodology are likely to be significantly displaced, by a magnitude dependent on the unknown underlying lesion-deficit relationship itself. Past studies therefore cannot be retrospectively corrected, except by new knowledge that would render them redundant

  8. Polynomial Chaos–Based Bayesian Inference of K-Profile Parameterization in a General Circulation Model of the Tropical Pacific

    KAUST Repository

    Sraj, Ihab

    2016-08-26

    The authors present a polynomial chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-profile parameterization (KPP) within the MIT general circulation model (MITgcm) of the tropical Pacific. The inference of the uncertain parameters is based on a Markov chain Monte Carlo (MCMC) scheme that utilizes a newly formulated test statistic taking into account the different components representing the structures of turbulent mixing on both daily and seasonal time scales in addition to the data quality, and filters for the effects of parameter perturbations over those as a result of changes in the wind. To avoid the prohibitive computational cost of integrating the MITgcm model at each MCMC iteration, a surrogate model for the test statistic using the PC method is built. Because of the noise in the model predictions, a basis-pursuit-denoising (BPDN) compressed sensing approach is employed to determine the PC coefficients of a representative surrogate model. The PC surrogate is then used to evaluate the test statistic in the MCMC step for sampling the posterior of the uncertain parameters. Results of the posteriors indicate good agreement with the default values for two parameters of the KPP model, namely the critical bulk and gradient Richardson numbers; while the posteriors of the remaining parameters were barely informative. © 2016 American Meteorological Society.

  9. Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples.

    Science.gov (United States)

    Pettengill, James B; Pightling, Arthur W; Baugher, Joseph D; Rand, Hugh; Strain, Errol

    2016-01-01

    The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionarily diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). When analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.
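
    The k-mer-profile distances evaluated in this record can be illustrated with a minimal sketch of the Jaccard case; the toy sequences and k = 4 are illustrative assumptions, and the sketch ignores the MinHash sketching step used by Mash.

        def kmers(seq, k=4):
            """Set of k-mers present in a sequence."""
            return {seq[i:i + k] for i in range(len(seq) - k + 1)}

        def jaccard_distance(a, b, k=4):
            """1 - |intersection| / |union| on k-mer sets; absence of a k-mer is treated as informative."""
            ka, kb = kmers(a, k), kmers(b, k)
            return 1.0 - len(ka & kb) / len(ka | kb)

        # Toy example: two short sequences differing by a single substitution.
        s1 = "ACGTACGTTGCA"
        s2 = "ACGTACGATGCA"
        print(jaccard_distance(s1, s2))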

  10. Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples.

    Directory of Open Access Journals (Sweden)

    James B Pettengill

    Full Text Available The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionarily diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). When analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.

  11. Impact of Sample Type and DNA Isolation Procedure on Genomic Inference of Microbiome Composition

    DEFF Research Database (Denmark)

    Knudsen, Berith Elkær; Bergmark, Lasse; Munk, Patrick

    2016-01-01

    Explorations of complex microbiomes using genomics greatly enhance our understanding about their diversity, biogeography, and function. The isolation of DNA from microbiome specimens is a key prerequisite for such examinations, but challenges remain in obtaining sufficient DNA quantities required...... for certain sequencing approaches, achieving accurate genomic inference of microbiome composition, and facilitating comparability of findings across specimen types and sequencing projects. These aspects are particularly relevant for the genomics-based global surveillance of infectious agents and antimicrobial...... that in standard protocols. Based on this insight, we designed an improved DNA isolation procedure optimized for microbiome genomics that can be used for the three examined specimen types and potentially also for other biological specimens. A standard operating procedure is available from https://dx.doi.org/10......

  12. A critical review of microextraction by packed sorbent as a sample preparation approach in drug bioanalysis.

    Science.gov (United States)

    Alves, Gilberto; Rodrigues, Márcio; Fortuna, Ana; Falcão, Amílcar; Queiroz, João

    2013-06-01

    Sample preparation is widely accepted as the most labor-intensive and error-prone part of the bioanalytical process. The recent advances in this field have been focused on the miniaturization and integration of sample preparation online with analytical instrumentation, in order to reduce laboratory workload and increase analytical performance. From this perspective, microextraction by packed sorbent (MEPS) has emerged in the last few years as a powerful sample preparation approach that can be easily automated with liquid and gas chromatographic systems applied in a variety of bioanalytical areas (pharmaceutical, clinical, toxicological, environmental and food research). This paper aims to provide an overview and a critical discussion of recent MEPS-based bioanalytical methods reported in the literature, with special emphasis on those developed for the quantification of therapeutic drugs and/or metabolites in biological samples. The advantages and some limitations of MEPS, as well as its comparison with other extraction techniques, are also addressed herein.

  13. On sampling and modeling complex systems

    International Nuclear Information System (INIS)

    Marsili, Matteo; Mastromatteo, Iacopo; Roudi, Yasser

    2013-01-01

    The study of complex systems is limited by the fact that only a few variables are accessible for modeling and sampling, which are not necessarily the most relevant ones to explain the system behavior. In addition, empirical data typically undersample the space of possible states. We study a generic framework where a complex system is seen as a system of many interacting degrees of freedom, which are known only in part, that optimize a given function. We show that the underlying distribution with respect to the known variables has the Boltzmann form, with a temperature that depends on the number of unknown variables. In particular, when the influence of the unknown degrees of freedom on the known variables is not too irregular, the temperature decreases as the number of variables increases. This suggests that models can be predictable only when the number of relevant variables is less than a critical threshold. Concerning sampling, we argue that the information that a sample contains on the behavior of the system is quantified by the entropy of the frequency with which different states occur. This allows us to characterize the properties of maximally informative samples: within a simple approximation, the most informative frequency size distributions have power law behavior and Zipf’s law emerges at the crossover between the undersampled regime and the regime where the sample contains enough statistics to make inferences on the behavior of the system. These ideas are illustrated in some applications, showing that they can be used to identify relevant variables or to select the most informative representations of data, e.g. in data clustering. (paper)
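
    The sample-entropy idea described above can be sketched in a few lines: draw a sample from a Zipf-like distribution of states and compute the entropy of the empirical state frequencies. The synthetic distribution and sample size below are illustrative assumptions.

        import numpy as np
        from collections import Counter

        rng = np.random.default_rng(1)

        # Synthetic sample whose state probabilities follow a Zipf-like law.
        n_states, n_obs = 1000, 5000
        p = 1.0 / np.arange(1, n_states + 1)
        p /= p.sum()
        sample = rng.choice(n_states, size=n_obs, p=p)

        # Entropy of the empirical frequency with which different states occur.
        counts = np.array(list(Counter(sample).values()), dtype=float)
        freq = counts / counts.sum()
        entropy = -np.sum(freq * np.log(freq))
        print(f"distinct states observed: {len(counts)}, empirical entropy: {entropy:.2f} nats")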

  14. Functional inference of complex anatomical tendinous networks at a macroscopic scale via sparse experimentation.

    Science.gov (United States)

    Saxena, Anupam; Lipson, Hod; Valero-Cuevas, Francisco J

    2012-01-01

    In systems and computational biology, much effort is devoted to functional identification of systems and networks at the molecular or cellular scale. However, similarly important networks exist at anatomical scales such as the tendon network of human fingers: the complex array of collagen fibers that transmits and distributes muscle forces to finger joints. This network is critical to the versatility of the human hand, and its function has been debated since at least the 16th century. Here, we experimentally infer the structure (both topology and parameter values) of this network through sparse interrogation with force inputs. A population of models representing this structure co-evolves in simulation with a population of informative future force inputs via the predator-prey estimation-exploration algorithm. Model fitness depends on the models' ability to explain experimental data, while the fitness of future force inputs depends on causing maximal functional discrepancy among current models. We validate our approach by inferring two known synthetic Latex networks, and one anatomical tendon network harvested from a cadaver's middle finger. We find that functionally similar but structurally diverse models can exist within a narrow range of the training set and cross-validation errors. For the Latex networks, models with low training set error [functional structure of complex anatomical networks. This work expands current bioinformatics inference approaches by demonstrating that sparse, yet informative interrogation of biological specimens holds significant computational advantages in accurate and efficient inference over random testing, or assuming model topology and only inferring parameter values. These findings also hold clues to both our evolutionary history and the development of versatile machines.

  15. Systematic parameter inference in stochastic mesoscopic modeling

    Energy Technology Data Exchange (ETDEWEB)

    Lei, Huan; Yang, Xiu [Pacific Northwest National Laboratory, Richland, WA 99352 (United States); Li, Zhen [Division of Applied Mathematics, Brown University, Providence, RI 02912 (United States); Karniadakis, George Em, E-mail: george_karniadakis@brown.edu [Division of Applied Mathematics, Brown University, Providence, RI 02912 (United States)

    2017-02-01

    We propose a method to efficiently determine the optimal coarse-grained force field in mesoscopic stochastic simulations of Newtonian fluid and polymer melt systems modeled by dissipative particle dynamics (DPD) and energy conserving dissipative particle dynamics (eDPD). The response surfaces of various target properties (viscosity, diffusivity, pressure, etc.) with respect to model parameters are constructed based on the generalized polynomial chaos (gPC) expansion using simulation results on sampling points (e.g., individual parameter sets). To alleviate the computational cost of evaluating the target properties, we employ the compressive sensing method to compute the coefficients of the dominant gPC terms given the prior knowledge that the coefficients are “sparse”. The proposed method shows comparable accuracy with the standard probabilistic collocation method (PCM) while it imposes a much weaker restriction on the number of the simulation samples especially for systems with high dimensional parametric space. Full access to the response surfaces within the confidence range enables us to infer the optimal force parameters given the desirable values of target properties at the macroscopic scale. Moreover, it enables us to investigate the intrinsic relationship between the model parameters, identify possible degeneracies in the parameter space, and optimize the model by eliminating model redundancies. The proposed method provides an efficient alternative approach for constructing mesoscopic models by inferring model parameters to recover target properties of the physical systems (e.g., from experimental measurements), where the force field parameters and formulation cannot be derived from the microscopic level in a straightforward way.
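
    The compressive-sensing step described above (recovering a sparse set of expansion coefficients from relatively few simulation samples) can be sketched with a small l1-regularized solver; the random design matrix, the iterative soft-thresholding scheme, and all problem sizes below are illustrative assumptions rather than the gPC basis or solver used by the authors.

        import numpy as np

        rng = np.random.default_rng(2)

        # Underdetermined system: 30 "simulation samples", 100 candidate basis terms,
        # only 5 of which carry non-zero coefficients.
        n_samples, n_terms = 30, 100
        A = rng.standard_normal((n_samples, n_terms)) / np.sqrt(n_samples)
        c_true = np.zeros(n_terms)
        c_true[rng.choice(n_terms, 5, replace=False)] = rng.standard_normal(5)
        y = A @ c_true + 0.01 * rng.standard_normal(n_samples)

        # Iterative soft thresholding: a basic l1-regularized (basis-pursuit-denoising-style) solver.
        lam, step = 0.05, 1.0 / np.linalg.norm(A, 2) ** 2
        c = np.zeros(n_terms)
        for _ in range(2000):
            c = c - step * (A.T @ (A @ c - y))
            c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)

        print("recovered support:", np.nonzero(np.abs(c) > 1e-3)[0])
        print("true support:     ", np.sort(np.nonzero(c_true)[0]))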

  16. Extended likelihood inference in reliability

    International Nuclear Information System (INIS)

    Martz, H.F. Jr.; Beckman, R.J.; Waller, R.A.

    1978-10-01

    Extended likelihood methods of inference are developed in which subjective information in the form of a prior distribution is combined with sampling results by means of an extended likelihood function. The extended likelihood function is standardized for use in obtaining extended likelihood intervals. Extended likelihood intervals are derived for the mean of a normal distribution with known variance, the failure-rate of an exponential distribution, and the parameter of a binomial distribution. Extended second-order likelihood methods are developed and used to solve several prediction problems associated with the exponential and binomial distributions. In particular, such quantities as the next failure-time, the number of failures in a given time period, and the time required to observe a given number of failures are predicted for the exponential model with a gamma prior distribution on the failure-rate. In addition, six types of life testing experiments are considered. For the binomial model with a beta prior distribution on the probability of nonsurvival, methods are obtained for predicting the number of nonsurvivors in a given sample size and for predicting the required sample size for observing a specified number of nonsurvivors. Examples illustrate each of the methods developed. Finally, comparisons are made with Bayesian intervals in those cases where these are known to exist.
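
    A closely related prediction problem (the number of nonsurvivors in a future sample under a binomial model with a beta prior) has the familiar beta-binomial predictive form sketched below; the prior parameters and data are illustrative assumptions, and the sketch shows the standard Bayesian predictive distribution rather than the extended-likelihood intervals developed in the report.

        import numpy as np
        from scipy.special import betaln, gammaln

        def log_comb(n, k):
            return gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

        def beta_binomial_pmf(k, n, a, b):
            """Predictive probability of k nonsurvivors in n future trials,
            given a Beta(a, b) distribution on the probability of nonsurvival."""
            return np.exp(log_comb(n, k) + betaln(k + a, n - k + b) - betaln(a, b))

        # Illustrative numbers: Beta(2, 8) prior updated with 1 nonsurvivor in 20 trials.
        a, b = 2 + 1, 8 + 19
        n_future = 10
        for k in range(n_future + 1):
            print(f"P({k} nonsurvivors in {n_future} future trials) = "
                  f"{beta_binomial_pmf(k, n_future, a, b):.3f}")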

  17. Unexpected presence of Fagus orientalis complex in Italy as inferred from 45,000-year-old DNA pollen samples from Venice lagoon.

    Science.gov (United States)

    Paffetti, Donatella; Vettori, Cristina; Caramelli, David; Vernesi, Cristiano; Lari, Martina; Paganelli, Arturo; Paule, Ladislav; Giannini, Raffaello

    2007-08-16

    Phylogeographic analyses of the Western Euroasiatic Fagus taxa (F. orientalis, F. sylvatica, F. taurica and F. moesiaca) are available; however, the subdivision of Fagus spp. is unresolved and there is no consensus on the phylogeny or on the identification (with either morphological or molecular markers) of the Eurasiatic Fagus taxa. For the first time, molecular analyses of ancient pollen, dated to at least 45,000 years ago, were used in combination with phylogenetic analysis of current species to identify the Fagus spp. present during the Last Interglacial period in Italy. In this work we test whether the trnL-trnF chloroplast DNA (cpDNA) region, which has previously proved efficient in discriminating different Quercus taxa, can be employed to distinguish the Fagus species and to identify the ancient pollen. 86 populations from 4 Western Euroasiatic taxa were sampled and sequenced for the trnL-trnF region to verify the efficiency of this cpDNA region in identifying the Fagus spp. Furthermore, Fagus crenata (2 populations), Fagus grandifolia (2 populations), Fagus japonica, Fagus hayatae, Quercus species and Castanea species were analysed to better resolve the phylogenetic inference. Our results show that this cpDNA region harbours informative sites that allow relationships among the species within the Fagaceae family to be inferred. In particular, a few specific and fixed mutations were able to discriminate and identify all the different Fagus species. Within a short fragment of 176 base pairs in the trnL intron, 2 transversions were able to distinguish the F. orientalis complex taxa (F. orientalis, F. taurica and F. moesiaca) from the remaining Fagus spp. (F. sylvatica, F. japonica, F. hayatae, F. crenata and F. grandifolia). This makes it possible to analyse this fragment also in ancient samples, where DNA is usually highly degraded. The sequence data indicate that the DNA recovered from ancient pollen belongs to the F. orientalis complex since

  18. The aggregate site frequency spectrum for comparative population genomic inference.

    Science.gov (United States)

    Xue, Alexander T; Hickerson, Michael J

    2015-12-01

    Understanding how assemblages of species responded to past climate change is a central goal of comparative phylogeography and comparative population genomics, an endeavour that has increasing potential to integrate with community ecology. New sequencing technology now provides the potential to perform complex demographic inference at unprecedented resolution across assemblages of nonmodel species. To this end, we introduce the aggregate site frequency spectrum (aSFS), an expansion of the site frequency spectrum to use single nucleotide polymorphism (SNP) data sets collected from multiple, co-distributed species for assemblage-level demographic inference. We describe how the aSFS is constructed over an arbitrary number of independent population samples and then demonstrate how the aSFS can differentiate various multispecies demographic histories under a wide range of sampling configurations while allowing effective population sizes and expansion magnitudes to vary independently. We subsequently couple the aSFS with a hierarchical approximate Bayesian computation (hABC) framework to estimate degree of temporal synchronicity in expansion times across taxa, including an empirical demonstration with a data set consisting of five populations of the threespine stickleback (Gasterosteus aculeatus). Corroborating what is generally understood about the recent postglacial origins of these populations, the joint aSFS/hABC analysis strongly suggests that the stickleback data are most consistent with synchronous expansion after the Last Glacial Maximum (posterior probability = 0.99). The aSFS will have general application for multilevel statistical frameworks to test models involving assemblages and/or communities, and as large-scale SNP data from nonmodel species become routine, the aSFS expands the potential for powerful next-generation comparative population genomic inference. © 2015 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.
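
    How a single population's folded site frequency spectrum is computed from a SNP matrix, and how spectra from several co-distributed species might be stacked into one aggregate vector, can be sketched as follows; the random genotype matrices and the simple sort-and-concatenate aggregation are illustrative assumptions rather than the authors' exact aSFS construction.

        import numpy as np

        rng = np.random.default_rng(3)

        def folded_sfs(genotypes):
            """Folded site frequency spectrum from a (haploid samples x SNPs) 0/1 matrix."""
            n = genotypes.shape[0]
            counts = genotypes.sum(axis=0)            # derived-allele count per SNP
            minor = np.minimum(counts, n - counts)    # fold onto the minor allele
            return np.bincount(minor, minlength=n // 2 + 1)[1:]  # drop the monomorphic class

        # Toy data: three co-distributed "species", each with 10 haploid samples and 200 SNPs.
        spectra = []
        for _ in range(3):
            g = (rng.random((10, 200)) < rng.uniform(0.05, 0.4)).astype(int)
            sfs = folded_sfs(g)
            spectra.append(sfs / sfs.sum())           # normalise each species' spectrum

        # A crude "aggregate" vector: per-species spectra sorted and concatenated.
        aggregate = np.concatenate([np.sort(s)[::-1] for s in spectra])
        print(aggregate.round(3))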

  19. Quantum-Like Representation of Non-Bayesian Inference

    Science.gov (United States)

    Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.

    2013-01-01

    This research is related to the problem of "irrational decision making or inference" that has been discussed in cognitive psychology. There are some experimental studies, and these statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to the classical Bayesian inference. For this problem, a number of quantum-like cognitive models of decision making have been proposed. Our previous work represented in a natural way the classical Bayesian inference in the framework of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like quantum interference. Further, we describe a "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.

  20. Bayesian Inference Methods for Sparse Channel Estimation

    DEFF Research Database (Denmark)

    Pedersen, Niels Lovmand

    2013-01-01

    This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...
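
    The core sparse-Bayesian-learning iteration that such work builds on (per-coefficient hyperparameters that shrink the weights of irrelevant dictionary columns) can be sketched with a generic ARD-style evidence-maximization loop; the complex Gaussian dictionary, noise level, and EM update below are textbook-style assumptions, not the hierarchical priors developed in the thesis.

        import numpy as np

        rng = np.random.default_rng(4)

        # Sparse complex "channel": 50 observations, 80 dictionary columns, 4 active taps.
        n, m = 50, 80
        Phi = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2 * n)
        w_true = np.zeros(m, dtype=complex)
        idx = rng.choice(m, 4, replace=False)
        w_true[idx] = rng.standard_normal(4) + 1j * rng.standard_normal(4)
        noise_var = 1e-3
        y = Phi @ w_true + np.sqrt(noise_var / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

        # ARD-style evidence maximisation: gamma[i] is the prior variance of weight i;
        # irrelevant columns are driven towards zero variance and thus pruned.
        gamma = np.ones(m)
        for _ in range(200):
            Sigma = np.linalg.inv(Phi.conj().T @ Phi / noise_var + np.diag(1.0 / (gamma + 1e-12)))
            mu = Sigma @ Phi.conj().T @ y / noise_var
            gamma = np.abs(mu) ** 2 + np.real(np.diag(Sigma))   # EM update of the hyperparameters

        print("true taps:     ", np.sort(idx))
        print("estimated taps:", np.sort(np.argsort(gamma)[-4:]))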

  1. Statistical inference: an integrated Bayesian/likelihood approach

    CERN Document Server

    Aitkin, Murray

    2010-01-01

    Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing.After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre

  2. A Story-Based Simulation for Teaching Sampling Distributions

    Science.gov (United States)

    Turner, Stephen; Dabney, Alan R.

    2015-01-01

    Statistical inference relies heavily on the concept of sampling distributions. However, sampling distributions are difficult to teach. We present a series of short animations that are story-based, with associated assessments. We hope that our contribution can be useful as a tool to teach sampling distributions in the introductory statistics…
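
    The underlying concept (the distribution of a statistic over repeated samples) is easy to demonstrate with a short simulation; the exponential population and sample size below are arbitrary illustrative choices.

        import numpy as np

        rng = np.random.default_rng(5)

        # Population: a skewed (exponential) distribution with mean 1.
        # Approximate the sampling distribution of the sample mean for n = 30 by simulation.
        n, reps = 30, 10000
        sample_means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

        print("mean of the sample means:", sample_means.mean())       # close to 1
        print("SD of the sample means:  ", sample_means.std(ddof=1))  # close to 1/sqrt(30), about 0.183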

  3. Design and implementation of an adaptive critic-based neuro-fuzzy controller on an unmanned bicycle

    OpenAIRE

    Shafiekhani, Ali; Mahjoob, Mohammad J.; Akraminia, Mehdi

    2017-01-01

    Fuzzy critic-based learning forms a reinforcement learning method based on dynamic programming. In this paper, an adaptive critic-based neuro-fuzzy system is presented for an unmanned bicycle. The only information available for the critic agent is the system feedback which is interpreted as the last action performed by the controller in the previous state. The signal produced by the critic agent is used along with the error back propagation to tune (online) conclusion parts of the fuzzy infer...

  4. Inference and testing on the boundary in extended constant conditional correlation GARCH models

    DEFF Research Database (Denmark)

    Pedersen, Rasmus Søndergaard

    2017-01-01

    We consider inference and testing in extended constant conditional correlation GARCH models in the case where the true parameter vector is a boundary point of the parameter space. This is of particular importance when testing for volatility spillovers in the model. The large-sample properties...

  5. Examining patterns of change in the critical thinking skills of graduate nursing students.

    Science.gov (United States)

    McMullen, Maureen A; McMullen, William F

    2009-06-01

    Although critical thinking in undergraduate nursing education has been explored in depth, little is known about the critical thinking skills of graduate nursing students. Prior research on change in critical thinking scores is based primarily on pretest and posttest assessments that provide minimal information about change. This study used individual growth modeling to investigate how critical thinking skills change during a 2-year graduate nurse program. Scores from the evaluation, inference, and analysis subscales of the California Critical Thinking Skills Test comprised the empirical growth record. Change in the three critical thinking skills was more dynamic than that reported in previous studies. Patterns of change differed by critical thinking skill and in relation to students' initial critical thinking skill levels at program entry.

  6. A Critical Appraisal of the `Day' Diagram

    Science.gov (United States)

    Roberts, A. P.; Tauxe, L.; Heslop, D.

    2017-12-01

    The `Day' diagram [Day et al., 1977; doi:10.1016/0031-9201(77)90108-X] is used widely to infer the mean domain state of magnetic mineral assemblages. The Day plot coordinates are the ratios of the saturation remanent magnetization to saturation magnetization (Mrs/Ms) and the coercivity of remanence to coercivity (Bcr/Bc), as determined from a major hysteresis loop and a backfield demagnetization curve. Based on theoretical and empirical arguments, Day plots are typically demarcated into stable single domain (SD), `pseudosingle domain' (`PSD'), and multidomain (MD) zones. It is a simple task to determine Mrs/Ms and Bcr/Bc for a sample and to assign a mean domain state based on the boundaries defined by Day et al. [1977]. Many other parameters contribute to variability in a Day diagram, including surface oxidation, mineral stoichiometry, stress state, magnetostatic interactions, and mixtures of magnetic particles with different sizes and shapes. Bulk magnetic measurements usually lack detailed independent evidence to constrain each free parameter, which makes the Day diagram fundamentally ambiguous. This raises questions about its usefulness for diagnosing magnetic particle size variations. The Day diagram is also used to make inferences about binary mixing of magnetic particles, where, for example, mixtures of SD and MD particles give rise to a bulk `PSD' response even though the concentration of `PSD' grains could be zero. In our assessment of thousands of hysteresis measurements of geological samples, binary mixing occurs in a tiny number of cases. Ternary, quaternary, and higher order mixing are usually observed. Also, uniaxial SD and MD end-members are nearly always inappropriate for considering mixing because uniaxial SD particles are virtually non-existent in igneous rocks. Thus, use of mixing lines in Day diagrams routinely provides unsatisfactory representations of particle size variations. We critically appraise the Day diagram and argue that its many

  7. Proper and Paradigmatic Metonymy as a Lens for Characterizing Student Conceptions of Distributions and Sampling

    Science.gov (United States)

    Noll, Jennifer; Hancock, Stacey

    2015-01-01

    This research investigates what students' use of statistical language can tell us about their conceptions of distribution and sampling in relation to informal inference. Prior research documents students' challenges in understanding ideas of distribution and sampling as tools for making informal statistical inferences. We know that these…

  8. Inference Attacks and Control on Database Structures

    Directory of Open Access Journals (Sweden)

    Muhamed Turkanovic

    2015-02-01

    Full Text Available Today’s databases store information with sensitivity levels that range from public to highly sensitive, hence ensuring confidentiality can be highly important, but also requires costly control. This paper focuses on the inference problem on different database structures. It presents possible threats to privacy in relation to inference, and control methods for mitigating these threats. The paper shows that using only access control, without any inference control, is inadequate, since these models are unable to protect against indirect data access. Furthermore, it covers new inference problems which arise from the dimensions of new technologies like XML, semantics, etc.
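
    The point that access control alone cannot prevent indirect disclosure can be illustrated with a toy example: two individually permitted aggregate queries whose difference reveals one person's value. The table and figures below are invented for illustration.

        # Hypothetical salary table; the attacker may only run aggregate (sum) queries.
        salaries = {"alice": 52000, "bob": 48000, "carol": 61000, "dave": 57000}

        def sum_salaries(names):
            """An 'allowed' aggregate query."""
            return sum(salaries[n] for n in names)

        # Both queries are individually permitted aggregates,
        # yet their difference exposes Carol's salary exactly.
        q1 = sum_salaries(["alice", "bob", "carol", "dave"])
        q2 = sum_salaries(["alice", "bob", "dave"])
        print("inferred salary of carol:", q1 - q2)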

  9. A unified framework for haplotype inference in nuclear families.

    Science.gov (United States)

    Iliadis, Alexandros; Anastassiou, Dimitris; Wang, Xiaodong

    2012-07-01

    Many large genome-wide association studies include nuclear families with more than one child (trio families), allowing for analysis of differences between siblings (sib pair analysis). Statistical power can be increased when haplotypes are used instead of genotypes. Currently, haplotype inference in families with more than one child can be performed either using the familial information or statistical information derived from the population samples but not both. Building on our recently proposed tree-based deterministic framework (TDS) for trio families, we augment its applicability to general nuclear families. We impose a minimum recombinant approach locally and independently on each multiple children family, while resorting to the population-derived information to solve the remaining ambiguities. Thus our framework incorporates all available information (familial and population) in a given study. We demonstrate that using all the constraints in our approach we can have gains in the accuracy as opposed to breaking the multiple children families to separate trios and resorting to a trio inference algorithm or phasing each family in isolation. We believe that our proposed framework could be the method of choice for haplotype inference in studies that include nuclear families with multiple children. Our software (tds2.0) is downloadable from www.ee.columbia.edu/∼anastas/tds. © 2012 The Authors Annals of Human Genetics © 2012 Blackwell Publishing Ltd/University College London.

  10. Scanning electron microscope autoradiography of critical point dried biological samples

    International Nuclear Information System (INIS)

    Weiss, R.L.

    1980-01-01

    A technique has been developed for the localization of isotopes in the scanning electron microscope. Autoradiographic studies have been performed using a model system and a unicellular biflagellate alga. One requirement of this technique is that all manipulations be carried out on samples that are maintained in a liquid state. Observations of a source of radiation (125I-ferritin) show that the nuclear emulsion used to detect radiation is active under these conditions. Efficiency measurements performed using 125I-ferritin indicate that 125I-SEM autoradiography is an efficient process that exhibits a 'dose dependent' response. Two types of labeling methods were used with cells: surface labeling with 125I and internal labeling with 3H. Silver grains appeared on labeled cells after autoradiography, removal of residual gelatin and critical point drying. The location of grains was examined on a flagellated green alga (Chlamydomonas reinhardi) capable of undergoing cell fusion. Fusion experiments using labeled and unlabeled cells indicate that (1) labeling is specific for incorporated radioactivity; (2) cell surface structure is preserved in SEM autoradiographs; and (3) the technique appears to produce reliable autoradiographs. Thus scanning electron microscope autoradiography should provide a new and useful experimental approach.

  11. Analysis of critical thinking ability of VII grade students based on the mathematical anxiety level through learning cycle 7E model

    Science.gov (United States)

    Widyaningsih, E.; Waluya, S. B.; Kurniasih, A. W.

    2018-03-01

    This study aims to assess whether students' critical thinking ability reaches mastery learning with the learning cycle 7E model, to determine whether the critical thinking ability of students taught with learning cycle 7E is better than that of students taught with an expository model, and to describe students' critical thinking phases based on their mathematical anxiety level. The method is a mixed method with a concurrent embedded design. The population is VII grade students of SMP Negeri 3 Kebumen in the 2016/2017 academic year. Subjects were determined by purposive sampling, selecting two students from each level of mathematical anxiety. Data collection techniques include a test, a questionnaire, interviews, and documentation. Quantitative data analysis techniques include a mean test, a proportion test, a difference test of two means, and a difference test of two proportions; the qualitative data were analysed with the Miles and Huberman model. The results show that: (1) students' critical thinking ability with learning cycle 7E achieves mastery learning; (2) students' critical thinking ability with learning cycle 7E is better than students' critical thinking ability with the expository model; (3) regarding the critical thinking phases by mathematical anxiety level, the lower the mathematical anxiety level, the more fully subjects fulfil the indicators of the clarification, assessment, inference, and strategies phases.

  12. Use of genetic data to infer population-specific ecological and phenotypic traits from mixed aggregations.

    Directory of Open Access Journals (Sweden)

    Paul Moran

    Full Text Available Many applications in ecological genetics involve sampling individuals from a mixture of multiple biological populations and subsequently associating those individuals with the populations from which they arose. Analytical methods that assign individuals to their putative population of origin have utility in both basic and applied research, providing information about population-specific life history and habitat use, ecotoxins, pathogen and parasite loads, and many other non-genetic ecological, or phenotypic traits. Although the question is initially directed at the origin of individuals, in most cases the ultimate desire is to investigate the distribution of some trait among populations. Current practice is to assign individuals to a population of origin and study properties of the trait among individuals within population strata as if they constituted independent samples. It seemed that approach might bias population-specific trait inference. In this study we made trait inferences directly through modeling, bypassing individual assignment. We extended a Bayesian model for population mixture analysis to incorporate parameters for the phenotypic trait and compared its performance to that of individual assignment with a minimum probability threshold for assignment. The Bayesian mixture model outperformed individual assignment under some trait inference conditions. However, by discarding individuals whose origins are most uncertain, the individual assignment method provided a less complex analytical technique whose performance may be adequate for some common trait inference problems. Our results provide specific guidance for method selection under various genetic relationships among populations with different trait distributions.

  13. Use of genetic data to infer population-specific ecological and phenotypic traits from mixed aggregations

    Science.gov (United States)

    Moran, Paul; Bromaghin, Jeffrey F.; Masuda, Michele

    2014-01-01

    Many applications in ecological genetics involve sampling individuals from a mixture of multiple biological populations and subsequently associating those individuals with the populations from which they arose. Analytical methods that assign individuals to their putative population of origin have utility in both basic and applied research, providing information about population-specific life history and habitat use, ecotoxins, pathogen and parasite loads, and many other non-genetic ecological, or phenotypic traits. Although the question is initially directed at the origin of individuals, in most cases the ultimate desire is to investigate the distribution of some trait among populations. Current practice is to assign individuals to a population of origin and study properties of the trait among individuals within population strata as if they constituted independent samples. It seemed that approach might bias population-specific trait inference. In this study we made trait inferences directly through modeling, bypassing individual assignment. We extended a Bayesian model for population mixture analysis to incorporate parameters for the phenotypic trait and compared its performance to that of individual assignment with a minimum probability threshold for assignment. The Bayesian mixture model outperformed individual assignment under some trait inference conditions. However, by discarding individuals whose origins are most uncertain, the individual assignment method provided a less complex analytical technique whose performance may be adequate for some common trait inference problems. Our results provide specific guidance for method selection under various genetic relationships among populations with different trait distributions.

  14. Implicit transitive inference and the human hippocampus: does intravenous midazolam function as a reversible hippocampal lesion?

    Directory of Open Access Journals (Sweden)

    Greene Anthony J

    2007-09-01

    Full Text Available Abstract Recent advances have led to an understanding that the hippocampus is involved more broadly than explicit or declarative memory alone. Tasks which involve the acquisition of complex associations involve the hippocampus whether the learning is explicit or implicit. One hippocampal-dependent implicit task is transitive inference (TI). Recently it was suggested that implicit transitive inference does not depend upon the hippocampus (Frank, M. J., O'Reilly, R. C., & Curran, T. 2006. When memory fails, intuition reigns: midazolam enhances implicit inference in humans. Psychological Science, 17, 700–707). The authors demonstrated that intravenous midazolam, which is thought to inactivate the hippocampus, may enhance TI performance. Three critical assumptions are required but not met: (1) that deactivations of other regions could not account for the effect; (2) that intravenous midazolam does indeed deactivate the hippocampus; and (3) that midazolam influences explicit but not implicit memory. Each of these assumptions is seriously flawed. Consequently, the suggestion that implicit TI does not depend upon the hippocampus is unfounded.

  15. Type Inference with Inequalities

    DEFF Research Database (Denmark)

    Schwartzbach, Michael Ignatieff

    1991-01-01

    Type inference can be phrased as constraint-solving over types. We consider an implicitly typed language equipped with recursive types, multiple inheritance, 1st order parametric polymorphism, and assignments. Type correctness is expressed as satisfiability of a possibly infinite collection of (monotonic) inequalities on the types of variables and expressions. A general result about systems of inequalities over semilattices yields a solvable form. We distinguish between deciding typability (the existence of solutions) and type inference (the computation of a minimal solution). In our case, both......

  16. Inferring Smoking Status from User Generated Content in an Online Cessation Community.

    Science.gov (United States)

    Amato, Michael S; Papandonatos, George D; Cha, Sarah; Wang, Xi; Zhao, Kang; Cohn, Amy M; Pearson, Jennifer L; Graham, Amanda L

    2018-01-22

    User generated content (UGC) is a valuable but underutilized source of information about individuals who participate in online cessation interventions. This study represents a first effort to passively detect smoking status among members of an online cessation program using UGC. Secondary data analysis was performed on data from 826 participants in a web-based smoking cessation randomized trial that included an online community. Domain experts from the online community reviewed each post and comment written by participants and attempted to infer the author's smoking status at the time it was written. Inferences from UGC were validated by comparison with self-reported 30-day point prevalence abstinence (PPA). Following validation, the impact of this method was evaluated across all individuals and timepoints in the study period. Of the 826 participants in the analytic sample, 719 had written at least one post from which content inference was possible. Among participants for whom unambiguous smoking status was inferred during the 30 days preceding their 3-month follow-up survey, concordance with self-report was almost perfect (kappa = 0.94). Posts indicating abstinence tended to be written shortly after enrollment (median = 14 days). Passive inference of smoking status from UGC in online cessation communities is possible and highly reliable for smokers who actively produce content. These results lay the groundwork for further development of observational research tools and intervention innovations. © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  17. SU-E-T-144: Bayesian Inference of Local Relapse Data Using a Poisson-Based Tumour Control Probability Model

    Energy Technology Data Exchange (ETDEWEB)

    La Russa, D [The Ottawa Hospital Cancer Centre, Ottawa, ON (Canada)

    2015-06-15

    Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributions found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10^7 cm^-3. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.
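
    A stripped-down version of this workflow (a Poisson tumour-control-probability curve fitted to local-control counts with a random-walk Metropolis sampler) is sketched below; the simplified two-parameter TCP form, the synthetic dose bins, and the plain numpy sampler are illustrative assumptions, not the proliferation-corrected model or the PyMC3 implementation described in the abstract.

        import numpy as np

        rng = np.random.default_rng(6)

        def tcp(dose, log_n0, alpha):
            """Poisson TCP: probability that no clonogen survives a uniform dose."""
            return np.exp(-np.exp(log_n0) * np.exp(-alpha * dose))

        # Synthetic data: (EQD2 dose, number of patients, number of local controls).
        dose = np.array([40.0, 55.0, 70.0, 85.0, 100.0])
        n_pat = np.array([120, 150, 180, 100, 73])
        controls = np.array([30, 70, 130, 88, 70])

        def log_post(log_n0, alpha):
            if alpha <= 0:
                return -np.inf
            p = np.clip(tcp(dose, log_n0, alpha), 1e-9, 1 - 1e-9)
            return np.sum(controls * np.log(p) + (n_pat - controls) * np.log(1 - p))

        # Random-walk Metropolis over (log N0, alpha).
        theta = np.array([np.log(1e7), 0.3])
        lp = log_post(*theta)
        chain = []
        for _ in range(30000):
            prop = theta + rng.normal(0.0, [0.3, 0.01])
            lp_prop = log_post(*prop)
            if np.log(rng.random()) < lp_prop - lp:
                theta, lp = prop, lp_prop
            chain.append(theta.copy())

        chain = np.array(chain[10000:])
        print("posterior means (log N0, alpha):", chain.mean(axis=0))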

  18. Hierarchy-associated semantic-rule inference framework for classifying indoor scenes

    Science.gov (United States)

    Yu, Dan; Liu, Peng; Ye, Zhipeng; Tang, Xianglong; Zhao, Wei

    2016-03-01

    Typically, the initial task of classifying indoor scenes is challenging, because the spatial layout and decoration of a scene can vary considerably. Recent efforts at classifying object relationships commonly depend on the results of scene annotation and predefined rules, making classification inflexible. Furthermore, annotation results are easily affected by external factors. Inspired by human cognition, a scene-classification framework was proposed using the empirically based annotation (EBA) and a match-over rule-based (MRB) inference system. The semantic hierarchy of images is exploited by EBA to construct rules empirically for MRB classification. The problem of scene classification is divided into low-level annotation and high-level inference from a macro perspective. Low-level annotation involves detecting the semantic hierarchy and annotating the scene with a deformable-parts model and a bag-of-visual-words model. In high-level inference, hierarchical rules are extracted to train the decision tree for classification. The categories of testing samples are generated from the parts to the whole. Compared with traditional classification strategies, the proposed semantic hierarchy and corresponding rules reduce the effect of a variable background and improve the classification performance. The proposed framework was evaluated on a popular indoor scene dataset, and the experimental results demonstrate its effectiveness.

  19. Inference in models with adaptive learning

    NARCIS (Netherlands)

    Chevillon, G.; Massmann, M.; Mavroeidis, S.

    2010-01-01

    Identification of structural parameters in models with adaptive learning can be weak, causing standard inference procedures to become unreliable. Learning also induces persistent dynamics, and this makes the distribution of estimators and test statistics non-standard. Valid inference can be

  20. Experiment, monitoring, and gradient methods used to infer climate change effects on plant communities yield consistent patterns

    Science.gov (United States)

    Sarah C. Elmendorf; Gregory H.R. Henry; Robert D. Hollisterd; Anna Maria Fosaa; William A. Gould; Luise Hermanutz; Annika Hofgaard; Ingibjorg I. Jonsdottir; Janet C. Jorgenson; Esther Levesque; Borgbor Magnusson; Ulf Molau; Isla H. Myers-Smith; Steven F. Oberbauer; Christian Rixen; Craig E. Tweedie; Marilyn Walkers

    2015-01-01

    Inference about future climate change impacts typically relies on one of three approaches: manipulative experiments, historical comparisons (broadly defined to include monitoring the response to ambient climate fluctuations using repeat sampling of plots, dendroecology, and paleoecology techniques), and space-for-time substitutions derived from sampling along...

  1. Fiducial inference - A Neyman-Pearson interpretation

    NARCIS (Netherlands)

    Salome, D; VonderLinden, W; Dose; Fischer, R; Preuss, R

    1999-01-01

    Fisher's fiducial argument is a tool for deriving inferences in the form of a probability distribution on the parameter space, not based on Bayes's Theorem. Lindley established that in exceptional situations fiducial inferences coincide with posterior distributions; in the other situations fiducial

  2. Uncertainty in prediction and in inference

    NARCIS (Netherlands)

    Hilgevoord, J.; Uffink, J.

    1991-01-01

    The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close relationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in

  3. Role of Personality Traits, Learning Styles and Metacognition in Predicting Critical Thinking of Undergraduate Students

    Directory of Open Access Journals (Sweden)

    Soliemanifar O

    2015-04-01

    The aim of this study was to investigate the role of personality traits, learning styles and metacognition in predicting critical thinking. Instrument & Methods: In this descriptive correlative study, 240 students (130 girls and 110 boys) of Ahvaz Shahid Chamran University were selected by a multi-stage random sampling method. The instruments for collecting data were the NEO Five-Factor Inventory, the learning style inventory of Kolb (LSI), the metacognitive assessment inventory (MAI) of Schraw & Dennison (1994) and the California Critical Thinking Skills Test (CCTST). The data were analyzed using Pearson correlation coefficients, stepwise regression analysis and canonical correlation analysis. Findings: Openness to experiment (b=0.41), conscientiousness (b=0.28), abstract conceptualization (b=0.39), active experimentation (b=0.22), reflective observation (b=0.12), knowledge of cognition (b=0.47) and regulation of cognition (b=0.29) were effective in predicting critical thinking. Openness to experiment and conscientiousness (r2=0.25), active experimentation, abstract conceptualization and reflective observation learning styles (r2=0.21), and knowledge and regulation of cognition metacognitions (r2=0.3) had an important role in explaining critical thinking. The linear combination of critical thinking skills (evaluation, analysis, inference) was predictable by a linear combination of dispositional-cognitive factors (openness, conscientiousness, abstract conceptualization, active experimentation, knowledge of cognition and regulation of cognition). Conclusion: Personality traits, learning styles and metacognition, as dispositional-cognitive factors, play a significant role in students' critical thinking.

  4. Polynomial Chaos Acceleration for the Bayesian Inference of Random Fields with Gaussian Priors and Uncertain Covariance Hyper-Parameters

    KAUST Repository

    Le Maitre, Olivier

    2015-01-07

    We address model dimensionality reduction in the Bayesian inference of Gaussian fields, considering a prior covariance function with unknown hyper-parameters. The Karhunen-Loeve (KL) expansion of a prior Gaussian process is traditionally derived assuming a fixed covariance function with pre-assigned hyper-parameter values. Thus, the mode strengths of the Karhunen-Loeve expansion inferred using available observations, as well as the resulting inferred process, depend on the pre-assigned values for the covariance hyper-parameters. Here, we seek to infer the process and its covariance hyper-parameters in a single Bayesian inference. To this end, the uncertainty in the hyper-parameters is treated by means of a coordinate transformation, leading to a KL-type expansion on a fixed reference basis of spatial modes, but with random coordinates conditioned on the hyper-parameters. A Polynomial Chaos (PC) expansion of the model prediction is also introduced to accelerate the Bayesian inference and the sampling of the posterior distribution with an MCMC method. The PC expansion of the model prediction also relies on a coordinate transformation, enabling us to avoid expanding the dependence of the prediction with respect to the covariance hyper-parameters. We demonstrate the efficiency of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data.
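
    The Karhunen-Loeve construction that the method builds on (eigendecomposition of a prior covariance matrix on a grid, for one fixed hyper-parameter value) can be sketched as follows; the squared-exponential kernel, grid, and truncation level are illustrative assumptions, and the coordinate transformation that handles uncertain hyper-parameters is not shown.

        import numpy as np

        rng = np.random.default_rng(7)

        # Grid and a squared-exponential prior covariance with a fixed correlation length.
        x = np.linspace(0.0, 1.0, 200)
        length_scale, variance = 0.2, 1.0
        C = variance * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / length_scale ** 2)

        # Discrete Karhunen-Loeve expansion: keep the leading eigenpairs.
        eigvals, eigvecs = np.linalg.eigh(C)
        order = np.argsort(eigvals)[::-1]
        eigvals, eigvecs = eigvals[order], eigvecs[:, order]
        n_modes = 10

        # One prior realisation of the field from n_modes independent standard normal coordinates.
        xi = rng.standard_normal(n_modes)
        field = eigvecs[:, :n_modes] @ (np.sqrt(np.maximum(eigvals[:n_modes], 0.0)) * xi)
        print("fraction of prior variance captured:", eigvals[:n_modes].sum() / eigvals.sum())
        print("standard deviation of the sampled field:", field.std())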

  5. Critical masses for the even-neutron-numbered transuranium actinides

    International Nuclear Information System (INIS)

    Westfall, R.M.

    1981-01-01

    As part of a standards effort of the American Nuclear Society to establish subcritical mass limits for the transuranium actinides, critical masses were calculated for seven actinide elements in bare, water-reflected, and steel-reflected metal systems. For the nuclides 242Pu and 241Am, values obtained with ENDF/B-V cross-section data were in much better agreement with values inferred from experimental measurement than were initial values calculated with ENDF/B-IV data. A brief description of the analytical methods employed is followed by a presentation of the results. 10 refs.

  6. Critical current density improvements in MgB2 superconducting bulk samples by K2CO3 additions  

    DEFF Research Database (Denmark)

    Grivel, J.-C.

    2018-01-01

    MgB2 bulk samples with potassium carbonate doping were made by means of reaction of elemental Mg and B powders mixed with various amounts of K2CO3. The Tc of the superconducting phase as well as its a-axis parameter were decreased as a result of carbon doping. Potassium escaped the samples during...... reaction. The critical current density of MgB2 was improved both in self field and under applied magnetic field for T ≤ 30 K, with optimum results for 1 mol% K2CO3 addition. The normalized flux pinning force (f(b)) shows that the flux pinning mechanism at low field is similar for all samples, following...

  7. Interactive Instruction in Bayesian Inference

    DEFF Research Database (Denmark)

    Khan, Azam; Breslav, Simon; Hornbæk, Kasper

    2018-01-01

    An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction. These principles concern coherence, personalization, signaling, segmenting, multimedia, spatial contiguity, and pretraining. Principles of self-explanation and interactivity are also applied. Four experiments on the Mammography Problem showed that these principles help participants answer the questions...... that an instructional approach to improving human performance in Bayesian inference is a promising direction....
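
    The Mammography Problem itself is a one-line application of Bayes' rule; the rates below are commonly used textbook values and may differ from those in the study materials.

        # P(cancer | positive mammogram) via Bayes' rule, using common textbook rates.
        p_cancer = 0.01              # prevalence
        p_pos_given_cancer = 0.80    # sensitivity
        p_pos_given_healthy = 0.096  # false-positive rate

        p_pos = p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer)
        p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos
        print(f"P(cancer | positive) = {p_cancer_given_pos:.3f}")    # about 0.078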

  8. Acceptance sampling using judgmental and randomly selected samples

    Energy Technology Data Exchange (ETDEWEB)

    Sego, Landon H.; Shulman, Stanley A.; Anderson, Kevin K.; Wilson, John E.; Pulsipher, Brent A.; Sieber, W. Karl

    2010-09-01

    We present a Bayesian model for acceptance sampling where the population consists of two groups, each with different levels of risk of containing unacceptable items. Expert opinion, or judgment, may be required to distinguish between the high- and low-risk groups. Hence, high-risk items are likely to be identified (and sampled) using expert judgment, while the remaining low-risk items are sampled randomly. We focus on the situation where all observed samples must be acceptable. Consequently, the objective of the statistical inference is to quantify the probability that a large percentage of the unsampled items in the population are also acceptable. We demonstrate that traditional (frequentist) acceptance sampling and simpler Bayesian formulations of the problem are essentially special cases of the proposed model. We explore the properties of the model in detail, and discuss the conditions necessary to ensure that required sample sizes are a non-decreasing function of the population size. The method is applicable to a variety of acceptance sampling problems, and, in particular, to environmental sampling where the objective is to demonstrate the safety of reoccupying a remediated facility that has been contaminated with a lethal agent.
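
    The central quantity (the probability that a high fraction of the unsampled items is acceptable, given that every sampled item was acceptable) can be sketched with a simple simulation; the uniform prior and the single homogeneous group below are illustrative simplifications of the two-group judgmental/random model proposed in the paper.

        import numpy as np

        rng = np.random.default_rng(8)

        # Illustrative single-group version: N items in total, n sampled at random, all acceptable.
        N, n = 1000, 59
        threshold = 0.99          # we want at least 99% of the unsampled items to be acceptable

        # Uniform Beta(1, 1) prior on the per-item probability of being unacceptable;
        # observing n acceptable items gives a Beta(1, 1 + n) posterior.
        draws = 100000
        p_bad = rng.beta(1.0, 1.0 + n, size=draws)
        bad_unsampled = rng.binomial(N - n, p_bad)
        ok_fraction = 1.0 - bad_unsampled / (N - n)
        print("P(at least 99% of unsampled items acceptable):", np.mean(ok_fraction >= threshold))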

  9. Inferring Phylogenetic Networks Using PhyloNet.

    Science.gov (United States)

    Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay

    2018-07-01

    PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.

  10. [Inferring landmark displacements from changes in cephalometric angles].

    Science.gov (United States)

    Xu, T; Baumrind, S

    2001-07-01

    To investigate the appropriateness of using changes in angular measurements to reflect the actual profile changes. The sample consists of 48 growing malocclusion patients, comprising 24 Class I and 24 Class II subjects, treated by an experienced orthodontist using the Edgewise technique. Landmark and superimpositional data were extracted from the previously prepared numerical database. Three pairs of angular and linear measures were computed by the Craniofacial Software Package. Although the associations between all three angular measures and their corresponding linear measures are statistically significant at the 0.001 level, the disagreement between these three pairs of measures is 10.4%, 22.9% and 37.5%, respectively, in this sample. The direction of displacement of anterior facial landmarks during growth and treatment cannot reliably be inferred merely from changes in cephalometric angles.

  11. Active inference and learning.

    Science.gov (United States)

    Friston, Karl; FitzGerald, Thomas; Rigoli, Francesco; Schwartenbeck, Philipp; O Doherty, John; Pezzulo, Giovanni

    2016-09-01

    This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
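
    For orientation, the epistemic/pragmatic split mentioned above is usually expressed through the expected free energy of a policy; a common schematic form from the active inference literature (not quoted from this paper) is

    $$ G(\pi) \;=\; -\underbrace{\mathbb{E}_{Q(o,s\mid\pi)}\!\left[\ln Q(s\mid o,\pi)-\ln Q(s\mid\pi)\right]}_{\text{epistemic (ambiguity-resolving) value}} \;-\; \underbrace{\mathbb{E}_{Q(o\mid\pi)}\!\left[\ln P(o)\right]}_{\text{pragmatic (goal-seeking) value}}, $$

    where P(o) encodes prior preferences over outcomes, so that policies are favoured both when they resolve uncertainty about hidden states s and when they make preferred outcomes o likely.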

  12. Understanding the Complex Relationship between Critical Thinking and Science Reasoning among Undergraduate Thesis Writers.

    Science.gov (United States)

    Dowd, Jason E; Thompson, Robert J; Schiff, Leslie A; Reynolds, Julie A

    2018-01-01

    Developing critical-thinking and scientific reasoning skills is a core learning objective of science education, but little empirical evidence exists regarding the interrelationships between these constructs. Writing effectively fosters students' development of these constructs, and it offers a unique window into studying how they relate. In this study of undergraduate thesis writing in biology at two universities, we examine how scientific reasoning exhibited in writing (assessed using the Biology Thesis Assessment Protocol) relates to general and specific critical-thinking skills (assessed using the California Critical Thinking Skills Test), and we consider implications for instruction. We find that scientific reasoning in writing is strongly related to inference, while other aspects of science reasoning that emerge in writing (epistemological considerations, writing conventions, etc.) are not significantly related to critical-thinking skills. Science reasoning in writing is not merely a proxy for critical thinking. In linking features of students' writing to their critical-thinking skills, this study 1) provides a bridge to prior work suggesting that engagement in science writing enhances critical thinking and 2) serves as a foundational step for subsequently determining whether instruction focused explicitly on developing critical-thinking skills (particularly inference) can actually improve students' scientific reasoning in their writing. © 2018 J. E. Dowd et al. CBE—Life Sciences Education © 2018 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).

  13. Active Inference, homeostatic regulation and adaptive behavioural control.

    Science.gov (United States)

    Pezzulo, Giovanni; Rigoli, Francesco; Friston, Karl

    2015-11-01

    We review a theory of homeostatic regulation and adaptive behavioural control within the Active Inference framework. Our aim is to connect two research streams that are usually considered independently; namely, Active Inference and associative learning theories of animal behaviour. The former uses a probabilistic (Bayesian) formulation of perception and action, while the latter calls on multiple (Pavlovian, habitual, goal-directed) processes for homeostatic and behavioural control. We offer a synthesis of these classical processes and cast them as successive hierarchical contextualisations of sensorimotor constructs, using the generative models that underpin Active Inference. This dissolves any apparent mechanistic distinction between the optimization processes that mediate classical control or learning. Furthermore, we generalize the scope of Active Inference by emphasizing interoceptive inference and homeostatic regulation. The ensuing homeostatic (or allostatic) perspective provides an intuitive explanation for how priors act as drives or goals to enslave action, and emphasises the embodied nature of inference. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  14. Generative Inferences Based on Learned Relations

    Science.gov (United States)

    Chen, Dawn; Lu, Hongjing; Holyoak, Keith J.

    2017-01-01

    A key property of relational representations is their "generativity": From partial descriptions of relations between entities, additional inferences can be drawn about other entities. A major theoretical challenge is to demonstrate how the capacity to make generative inferences could arise as a result of learning relations from…

  15. NetBenchmark: a bioconductor package for reproducible benchmarks of gene regulatory network inference.

    Science.gov (United States)

    Bellot, Pau; Olsen, Catharina; Salembier, Philippe; Oliveras-Vergés, Albert; Meyer, Patrick E

    2015-09-29

    In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow those methods to be evaluated accurately and reproducibly. Hence, we propose here a new tool, able to perform a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. The benchmarking framework, which uses various datasets, highlights the specialization of some methods toward particular network types and data. As a result, it is possible to identify the techniques that have broad overall performance.

  16. A caveat regarding diatom-inferred nitrogen concentrations in oligotrophic lakes

    Science.gov (United States)

    Arnett, Heather A.; Saros, Jasmine E.; Mast, M. Alisa

    2012-01-01

    Atmospheric deposition of reactive nitrogen (Nr) has enriched oligotrophic lakes with nitrogen (N) in many regions of the world and elicited dramatic changes in diatom community structure. The lakewater concentrations of nitrate that cause these community changes remain unclear, raising interest in the development of diatom-based transfer functions to infer nitrate. We developed a diatom calibration set using surface sediment samples from 46 high-elevation lakes across the Rocky Mountains of the western US, a region spanning an N deposition gradient from very low to moderate levels; phosphorus and hypolimnetic water temperature were related to diatom distributions. A transfer function was developed for nitrate and applied to a sedimentary diatom profile from Heart Lake in the central Rockies. The model coefficient of determination (bootstrapping validation) of 0.61 suggested potential for diatom-inferred reconstructions of lakewater nitrate concentrations over time, but a comparison of observed versus diatom-inferred nitrate values revealed the poor performance of this model at low nitrate concentrations. Resource physiology experiments revealed that the nitrogen requirements of two key taxa were opposite to the nitrate optima defined in the transfer function. Our data set reveals two underlying ecological constraints that impede the development of nitrate transfer functions in oligotrophic lakes: (1) even in lakes with nitrate concentrations below quantification (<1 μg L−1), diatom assemblages were already dominated by species indicative of moderate N enrichment; (2) N-limited oligotrophic lakes switch to P limitation after receiving only modest inputs of reactive N, shifting the controls on diatom species changes along the length of the nitrate gradient. These constraints suggest that quantitative inferences of nitrate from diatom assemblages will likely require experimental approaches.
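
    The transfer-function idea can be sketched in a few lines of code. The example below is a bare-bones weighted-averaging (WA) model of the general kind used for such reconstructions; the abundance matrix, nitrate values, and fossil assemblage are invented for illustration, and real applications add cross-validation and deshrinking.

    ```python
    # A minimal sketch of a weighted-averaging (WA) transfer function; the arrays
    # are illustrative assumptions, not the Rocky Mountain calibration set.
    import numpy as np

    # counts[i, j]: relative abundance of taxon j in calibration lake i
    counts = np.array([[0.6, 0.3, 0.1],
                       [0.2, 0.5, 0.3],
                       [0.1, 0.2, 0.7]])
    nitrate = np.array([1.0, 4.0, 9.0])   # observed lakewater nitrate (ug/L)

    # Regression step: each taxon's optimum is its abundance-weighted mean nitrate
    optima = (counts * nitrate[:, None]).sum(axis=0) / counts.sum(axis=0)

    # Calibration step: a fossil sample's inferred nitrate is the
    # abundance-weighted mean of the taxon optima (deshrinking omitted here)
    fossil = np.array([0.5, 0.4, 0.1])    # hypothetical downcore assemblage
    inferred = (fossil * optima).sum() / fossil.sum()
    print(f"taxon optima: {optima.round(2)}, inferred nitrate: {inferred:.2f}")
    ```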

  17. RISK MANAGEMENT AUTOMATION OF SOFTWARE PROJECTS BASED ОN FUZZY INFERENCE

    Directory of Open Access Journals (Sweden)

    T. M. Zubkova

    2015-09-01

    The suitability of an intelligent method for risk management of software projects is shown, based on a review of existing fuzzy inference algorithms for applied problems. Information sources in the management of software projects are analyzed, and major and minor risks are highlighted. The most critical parameters for estimating the occurrence of adverse situations have been singled out (project duration, frequency of changes in customer requirements, work deadlines, the developers' experience with similar projects, and others). A method of qualitative fuzzy description based on fuzzy logic has been developed for the analysis of these parameters. Evaluation of possible situations and formation of the knowledge base rely on a survey of experts. The main limitations of existing automated systems have been identified in relation to their applicability to risk management in software design. This theoretical research set the stage for a software system that makes it possible to automate the risk management process for software projects. The developed software system automates the fuzzy inference process in the following stages: formation of the rule base of the fuzzy inference system, fuzzification of input variables, aggregation of sub-conditions, activation and accumulation of conclusions for fuzzy production rules, and defuzzification of variables. The result of automating risk management in software design is a quantitative and qualitative assessment of risks and expert advice for their minimization. The practical significance of the work lies in the fact that implementation of the developed automated system makes it possible to improve the performance of software projects.
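
    The fuzzification-aggregation-defuzzification pipeline listed above can be illustrated with a toy Mamdani-style rule base. The membership functions, the single input (project duration), and the rules below are assumptions made for this example, not the system described in the paper.

    ```python
    # A minimal sketch of the fuzzy inference stages: fuzzification, rule
    # activation, accumulation and centroid defuzzification. All membership
    # functions and rules are illustrative assumptions.
    import numpy as np

    def tri(x, a, b, c):
        """Triangular membership function with support [a, c] and peak at b."""
        return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

    risk_axis = np.linspace(0, 10, 101)   # output universe: risk score 0..10

    def infer_risk(duration_months):
        # Fuzzification of the crisp input (project duration)
        short = tri(duration_months, 0, 3, 12)
        long_ = tri(duration_months, 6, 18, 30)

        # Rule activation (Mamdani, min): short -> low risk, long -> high risk
        low_risk = np.minimum(short, tri(risk_axis, 0, 2, 5))
        high_risk = np.minimum(long_, tri(risk_axis, 5, 8, 10))

        # Accumulation (max) and defuzzification (centroid)
        agg = np.maximum(low_risk, high_risk)
        return float((agg * risk_axis).sum() / agg.sum())

    print(f"risk score for an 18-month project: {infer_risk(18):.2f}")
    ```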

  18. Inference for feature selection using the Lasso with high-dimensional data

    DEFF Research Database (Denmark)

    Brink-Jensen, Kasper; Ekstrøm, Claus Thorn

    2014-01-01

    Penalized regression models such as the Lasso have proved useful for variable selection in many fields - especially for situations with high-dimensional data where the number of predictors far exceeds the number of observations. These methods identify and rank variables of importance but do not generally provide any inference on the selected variables. Thus, the variables selected might be the "most important" but need not be significant. We propose a significance test for the selection found by the Lasso. We introduce a procedure that computes inference and p-values for features chosen by the Lasso. This method rephrases the null hypothesis and uses a randomization approach which ensures that the error rate is controlled even for small samples. We demonstrate the ability of the algorithm to compute p-values of the expected magnitude with simulated data using a multitude of scenarios...
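
    A permutation-style version of this idea can be sketched as follows. This is a generic randomization test for the strongest Lasso coefficient (a simplified stand-in, not the authors' exact procedure), with simulated data and an arbitrary penalty value.

    ```python
    # A minimal sketch of a randomization-based significance check for features
    # selected by the Lasso. Data, penalty and number of permutations are
    # illustrative assumptions. Uses scikit-learn.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(1)
    n, p = 50, 200                          # high-dimensional: p >> n
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:3] = 2.0                          # only the first 3 predictors matter
    y = X @ beta + rng.standard_normal(n)

    def max_abs_coef(y_vec):
        return np.max(np.abs(Lasso(alpha=0.1).fit(X, y_vec).coef_))

    observed = max_abs_coef(y)

    # Null distribution: permute the response, breaking any X-y association
    null = np.array([max_abs_coef(rng.permutation(y)) for _ in range(200)])
    p_value = (1 + np.sum(null >= observed)) / (1 + len(null))
    print(f"observed max |coef| = {observed:.2f}, permutation p-value = {p_value:.3f}")
    ```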

  19. Application of lot quality assurance sampling for leprosy elimination monitoring--examination of some critical factors.

    Science.gov (United States)

    Gupte, M D; Murthy, B N; Mahmood, K; Meeralakshmi, S; Nagaraju, B; Prabhakaran, R

    2004-04-01

    The concept of elimination of an infectious disease is different from eradication and, in a way, from control as well. In disease elimination programmes, the desired reduced level of prevalence is set as the target to be achieved in a practical time frame. Elimination can be considered at national or regional levels. Prevalence levels depend on the occurrence of new cases and thus can fluctuate. There are no ready pragmatic methods to monitor the progress of leprosy elimination programmes. We therefore tried to explore newer methods to answer these demands. With the lowering of the prevalence of leprosy to the desired level of 1 case per 10000 population at the global level, the programme administrators' concern will shift to smaller areas, e.g. national and sub-national levels. For monitoring this situation, we earlier observed that lot quality assurance sampling (LQAS), a quality control tool in industry, was useful in initially high-endemic areas. However, critical factors such as the geographical distribution of cases and the adoption of a cluster sampling design instead of a simple random sampling design deserve attention before LQAS can generally be recommended. The present exercise was aimed at validating the applicability of LQAS, with these modifications, for monitoring leprosy elimination in Tamil Nadu state, which was highly endemic for leprosy. A representative sample of 64000 people drawn from eight districts of Tamil Nadu state, India, with a maximum allowable number of 25 cases, was considered, using LQAS methodology, to test whether leprosy prevalence was at or below 7 per 10000 population. The expected number of cases for each district was obtained assuming a Poisson distribution. Goodness of fit between the observed and expected cases (closeness of the expected number of cases to those observed) was tested with a chi-square test. The enhancing factor (design effect) for the sample size was obtained by computing the intraclass correlation. The survey actually
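
    The core LQAS decision rule can be checked with a short calculation: under a given true prevalence, what is the probability that a random sample of the stated size contains no more than the allowable number of cases? The sketch below uses the figures quoted in the abstract (n = 64000, allowable cases = 25, threshold 7 per 10000) with a simple binomial model and deliberately ignores the cluster-sampling design effect, so it is only an illustration.

    ```python
    # A minimal sketch of the LQAS operating characteristic: probability of
    # observing at most d cases in a sample of size n at a given true prevalence.
    from scipy.stats import binom

    n, d = 64_000, 25
    for prevalence_per_10000 in (1, 4, 7, 10):
        p = prevalence_per_10000 / 10_000
        accept = binom.cdf(d, n, p)   # P(observed cases <= d)
        print(f"prevalence {prevalence_per_10000}/10000 -> P(accept lot) = {accept:.3f}")
    ```

    The rule therefore "accepts" (declares elimination-level prevalence) with high probability only when the true prevalence is well below the 7 per 10000 threshold, which is exactly the behaviour LQAS is designed to deliver.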

  20. Parametric statistical inference basic theory and modern approaches

    CERN Document Server

    Zacks, Shelemyahu; Tsokos, C P

    1981-01-01

    Parametric Statistical Inference: Basic Theory and Modern Approaches presents the developments and modern trends in statistical inference to students who do not have advanced mathematical and statistical preparation. The topics discussed in the book are basic and common to many fields of statistical inference and thus serve as a jumping board for in-depth study. The book is organized into eight chapters. Chapter 1 provides an overview of how the theory of statistical inference is presented in subsequent chapters. Chapter 2 briefly discusses statistical distributions and their properties. Chapt

  1. The JCO criticality accident at Tokai-mura, Japan: an overview of the sampling campaign and preliminary results

    International Nuclear Information System (INIS)

    Komura, K.; Yamamoto, M.; Muroyama, T.; Murata, Y.; Nakanishi, T.; Hoshi, M.; Takada, J.; Ishikawa, M.; Takeoka, S.; Kitagawa, K.; Suga, S.; Endo, S.; Tosaki, N.; Mitsugashira, T.; Hara, M.; Hashimoto, T.; Takano, M.; Yanagawa, Y.; Tsuboi, T.; Ichimasa, M.; Ichimasa, Y.; Imura, H.; Sasajima, E.; Seki, R.; Saito, Y.; Kondo, M.; Kojima, S.; Muramatsu, Y.; Yoshida, S.; Shibata, S.; Yonehara, H.; Watanabe, Y.; Kimura, S.; Shiraishi, K.; Ban-nai, T.; Sahoo, S.K.; Igarashi, Y.; Aoyama, M.; Hirose, K.; Uehiro, T.; Doi, T.; Tanaka, A.; Matsuzawa, T.

    2000-01-01

    A criticality accident occurred on September 30, 1999 at the uranium conversion facility of the JCO Company Ltd. in Tokai-mura, Japan. A collaborating scientific investigation team was organized in two groups, the first to carry out research on the environmental impact (the environmental research group) and the second to assess the radiation effects on residents (the biological research group). This report concerns only the activities of the environmental research group. Four investigative teams were sent on different dates to the accident site and its vicinity to collect samples. About 400 samples were collected and subjected to analysis. An outline of the sampling campaign is presented here along with a brief chronology of the accident and the preliminary key results obtained by the independent research group are summarised in this Special Issue of the Journal of Environmental Radioactivity

  2. Ancestry inference using principal component analysis and spatial analysis: a distance-based analysis to account for population substructure.

    Science.gov (United States)

    Byun, Jinyoung; Han, Younghun; Gorlov, Ivan P; Busam, Jonathan A; Seldin, Michael F; Amos, Christopher I

    2017-10-16

    Accurate inference of genetic ancestry is of fundamental interest to many biomedical, forensic, and anthropological research areas. Genetic ancestry memberships may relate to genetic disease risks. In a genome association study, failing to account for differences in genetic ancestry between cases and controls may also lead to false-positive results. Although a number of strategies for inferring and taking into account the confounding effects of genetic ancestry are available, applying them to large studies (tens of thousands of samples) is challenging. The goal of this study is to develop an approach for inferring the genetic ancestry of samples with unknown ancestry among closely related populations and to provide accurate estimates of ancestry for application to large-scale studies. In this study we developed a novel distance-based approach, Ancestry Inference using Principal component analysis and Spatial analysis (AIPS), that incorporates an Inverse Distance Weighted (IDW) interpolation method from spatial analysis to assign individuals to population memberships. We demonstrate the benefits of AIPS in analyzing population substructure, specifically in comparison with the four most commonly used tools EIGENSTRAT, STRUCTURE, fastSTRUCTURE, and ADMIXTURE, using genotype data from various intra-European panels and European-Americans. While the aforementioned commonly used tools performed poorly in inferring ancestry from a large number of subpopulations, AIPS accurately distinguished variation between and within subpopulations. Our results show that AIPS can be applied to large-scale data sets to discriminate the modest variability among intra-continental populations as well as to characterize inter-continental variation. The method we developed will protect against spurious associations when mapping the genetic basis of a disease. Our approach is a more accurate and computationally efficient method for inferring genetic ancestry in large-scale genetic studies.
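
    The distance-based assignment step can be sketched very simply: project samples into principal-component space and let nearby reference individuals vote with inverse-distance weights. The coordinates, labels, and IDW power below are illustrative assumptions, not the AIPS implementation.

    ```python
    # A minimal sketch of IDW-based ancestry assignment in PCA space.
    # Reference coordinates, labels and the IDW power are toy assumptions.
    import numpy as np

    rng = np.random.default_rng(2)
    # Reference samples: 2-D PCA coordinates and known population labels
    ref_coords = np.vstack([rng.normal([0, 0], 0.3, (50, 2)),
                            rng.normal([2, 1], 0.3, (50, 2))])
    ref_labels = np.array([0] * 50 + [1] * 50)

    def idw_assign(query, power=2.0):
        d = np.linalg.norm(ref_coords - query, axis=1)
        w = 1.0 / np.maximum(d, 1e-9) ** power       # inverse distance weights
        scores = np.bincount(ref_labels, weights=w)  # weighted vote per population
        return scores / scores.sum()

    print(idw_assign(np.array([0.3, 0.1])))   # dominated by population 0
    print(idw_assign(np.array([1.0, 0.5])))   # intermediate membership
    ```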

  3. Amygdala volume linked to individual differences in mental state inference in early childhood and adulthood

    Directory of Open Access Journals (Sweden)

    Katherine Rice

    2014-04-01

    We investigated the role of the amygdala in mental state inference in a sample of adults and in a sample of children aged 4 and 6 years. This period in early childhood represents a time when mentalizing abilities undergo dramatic changes. Both children and adults inferred mental states from pictures of others' eyes, and children also inferred the mental states of others from stories (e.g., a false belief task). We also collected structural MRI data from these participants, to determine whether larger amygdala volumes (controlling for age and total gray matter volume) were related to better face-based and story-based mentalizing. For children, larger amygdala volumes were related to better face-based, but not story-based, mentalizing. In contrast, in adults, amygdala volume was not related to face-based mentalizing. We next divided the face-based items into two subscales: cognitive (e.g., thinking, not believing) versus affective (e.g., friendly, kind) items. For children, performance on cognitive items was positively correlated with amygdala volume, but for adults, only performance on affective items was positively correlated with amygdala volume. These results indicate that the amygdala's role in mentalizing may be specific to face-based tasks and that the nature of its involvement may change over development.

  4. Variational inference & deep learning: A new synthesis

    OpenAIRE

    Kingma, D.P.

    2017-01-01

    In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.

  5. Variational inference & deep learning : A new synthesis

    NARCIS (Netherlands)

    Kingma, D.P.

    2017-01-01

    In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.

  6. Ensemble stacking mitigates biases in inference of synaptic connectivity

    Directory of Open Access Journals (Sweden)

    Brendan Chambers

    2018-03-01

    A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: a signing procedure improves the power of unsigned mutual-information-based approaches, and a correction that accounts for differences in the mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches. Mapping the routing of spikes through local circuitry is crucial for understanding neocortical computation. Under appropriate experimental conditions, these maps can be used to infer likely patterns of synaptic recruitment, linking activity to underlying anatomical connections. Such inferences help to reveal the synaptic implementation of population dynamics and computation. We compare a number of standard functional measures to infer underlying connectivity. We find that regularization impacts measures
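
    A stripped-down version of the stacking step might look like the following: individual inference scores become features, and a linear combiner is fit against ground-truth connectivity. The synthetic scores and the logistic combiner are assumptions made for illustration; the paper's ensemble is built from its own set of inference measures.

    ```python
    # A minimal sketch of ensemble stacking for connectivity inference: combine
    # the scores of several inference measures with a learned linear model.
    # Synthetic scores stand in for real correlation/MI measures.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(3)
    n_pairs = 2000
    truth = rng.random(n_pairs) < 0.1                   # ground-truth synapses (10%)

    # Scores from two hypothetical inference methods: noisy views of the truth
    score_corr = truth + rng.normal(0, 0.8, n_pairs)    # e.g. cross-correlation based
    score_mi = truth + rng.normal(0, 1.0, n_pairs)      # e.g. mutual-information based
    X = np.column_stack([score_corr, score_mi])

    # Stacking: a logistic combiner learns how much to trust each measure
    stack = LogisticRegression().fit(X, truth)
    ensemble_score = stack.predict_proba(X)[:, 1]
    print("learned combination weights:", stack.coef_.round(2))
    ```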

  7. Constraint Satisfaction Inference : Non-probabilistic Global Inference for Sequence Labelling

    NARCIS (Netherlands)

    Canisius, S.V.M.; van den Bosch, A.; Daelemans, W.; Basili, R.; Moschitti, A.

    2006-01-01

    We present a new method for performing sequence labelling based on the idea of using a machine-learning classifier to generate several possible output sequences, and then applying an inference procedure to select the best sequence among those. Most sequence labelling methods following a similar

  8. From Neutron Star Observables to the Equation of State. II. Bayesian Inference of Equation of State Pressures

    Science.gov (United States)

    Raithel, Carolyn A.; Özel, Feryal; Psaltis, Dimitrios

    2017-08-01

    One of the key goals of observing neutron stars is to infer the equation of state (EoS) of the cold, ultradense matter in their interiors. Here, we present a Bayesian statistical method of inferring the pressures at five fixed densities, from a sample of mock neutron star masses and radii. We show that while five polytropic segments are needed for maximum flexibility in the absence of any prior knowledge of the EoS, regularizers are also necessary to ensure that simple underlying EoS are not over-parameterized. For ideal data with small measurement uncertainties, we show that the pressure at roughly twice the nuclear saturation density, ρ_sat, can be inferred to within 0.3 dex for many realizations of potential sources of uncertainties. The pressures of more complicated EoS with significant phase transitions can also be inferred to within ~30%. We also find that marginalizing the multi-dimensional parameter space of pressure to infer a mass-radius relation can lead to biases of nearly 1 km in radius, toward larger radii. Using the full, five-dimensional posterior likelihoods avoids this bias.

  9. Hysteresis critical point of nitrogen in porous glass: occurrence of sample spanning transition in capillary condensation.

    Science.gov (United States)

    Morishige, Kunimitsu

    2009-06-02

    To examine the mechanisms for capillary condensation and for capillary evaporation in porous glass, we measured the hysteresis critical points and desorption scanning curves of nitrogen in four kinds of porous glasses with different pore sizes (Vycor, CPG75A, CPG120A, and CPG170A). The shapes of the hysteresis loop in the adsorption isotherm of nitrogen for the Vycor and the CPG75A changed with temperature, whereas those for the CPG120A and the CPG170A remained almost unchanged with temperature. The hysteresis critical points for the Vycor and the CPG75A fell on the common line observed previously for ordered mesoporous silicas. On the other hand, the hysteresis critical points for the CPG120A and the CPG170A deviated appreciably from the common line. This strongly suggests that capillary evaporation of nitrogen in the interconnected and disordered pores of both the Vycor and the CPG75A follows a cavitation process, at least in the vicinity of their hysteresis critical temperatures, in the same way as that in the cagelike pores of the ordered silicas, whereas the hysteresis critical points in the CPG120A and the CPG170A have an origin different from that in the cagelike pores. The desorption scanning curves for the CPG75A indicated the nonindependence of the porous domains. On the other hand, for both the CPG120A and the CPG170A, we obtained the scanning curves that are expected from the independent domain theory. All these results suggest that sample spanning transitions in capillary condensation and evaporation take place inside the interconnected pores of both the CPG120A and the CPG170A.

  10. Reasoning about Informal Statistical Inference: One Statistician's View

    Science.gov (United States)

    Rossman, Allan J.

    2008-01-01

    This paper identifies key concepts and issues associated with the reasoning of informal statistical inference. I focus on key ideas of inference that I think all students should learn, including at secondary level as well as tertiary. I argue that a fundamental component of inference is to go beyond the data at hand, and I propose that statistical…

  11. Meta-learning framework applied in bioinformatics inference system design.

    Science.gov (United States)

    Arredondo, Tomás; Ormazábal, Wladimir

    2015-01-01

    This paper describes a meta-learner inference system development framework which is applied and tested in the implementation of bioinformatic inference systems. These inference systems are used for the systematic classification of the best candidates for inclusion in bacterial metabolic pathway maps. This meta-learner-based approach utilises a workflow in which the user provides feedback in the form of final classification decisions, which are stored together with the analysed genetic sequences for periodic retraining of the inference system. The inference systems were trained and tested with three different data sets related to the bacterial degradation of aromatic compounds. The analysis of the meta-learner-based framework involved contrasting several different optimisation methods with various parameter settings. The obtained inference systems were also contrasted with other standard classification methods, and accurate prediction capabilities were observed.

  12. Contributions of Teachers' Thinking Styles to Critical Thinking Dispositions (Istanbul-Fatih Sample)

    Science.gov (United States)

    Emir, Serap

    2013-01-01

    The main purpose of the research was to determine the contributions of teachers' thinking styles to critical thinking dispositions. Hence, it aimed to determine whether thinking styles are related to critical thinking dispositions and whether thinking styles predict critical thinking dispositions. The research was designed in relational…

  13. Measuring the critical current in superconducting samples made of NT-50 under pulse irradiation by high-energy particles

    International Nuclear Information System (INIS)

    Vasilev, P.G.; Vladimirova, N.M.; Volkov, V.I.; Goncharov, I.N.; Zajtsev, L.N.; Zel'dich, B.D.; Ivanov, V.I.; Kleshchenko, E.D.; Khvostov, V.B.

    1981-01-01

    The results of tests of superconducting samples of an uninsulated wire of 0.5 mm diameter, containing 1045 superconducting filaments of 10 μm diameter made of NT-50 superconductor in a copper matrix, are given. The upper ("closed") part of the sample is placed between two glass-cloth-base laminate plates of 50 mm length, and the lower ("open") part, of 45 mm length, is immersed in liquid helium. The sample is located perpendicular to the magnetic field of a superconducting solenoid and is irradiated by charged particle beams with energies of several GeV. Results are reported for the permissible energy release in the sample as a function of subcriticality (I/I_c, where I is the operating current through the sample and I_c is the critical current in the absence of the beam) and of the particle flux density, as well as for the maximum permissible fluence as a function of subcriticality. For the "closed" sample irradiated by short pulses (approximately 1 ms) at I/I_c

  14. Chironomid-based water depth reconstructions: an independent evaluation of site-specific and local inference models

    NARCIS (Netherlands)

    Engels, S.; Cwynar, L.C.; Rees, A.B.H.; Shuman, B.N.

    2012-01-01

    Water depth is an important environmental variable that explains a significant portion of the variation in the chironomid fauna of shallow lakes. We developed site-specific and local chironomid water-depth inference models using 26 and 104 surface-sediment samples, respectively, from seven

  15. Statistical inference using weak chaos and infinite memory

    International Nuclear Information System (INIS)

    Welling, Max; Chen Yutian

    2010-01-01

    We describe a class of deterministic weakly chaotic dynamical systems with infinite memory. These 'herding systems' combine learning and inference into one algorithm, where moments or data-items are converted directly into an arbitrarily long sequence of pseudo-samples. This sequence has infinite range correlations and as such is highly structured. We show that its information content, as measured by sub-extensive entropy, can grow as fast as K log T, which is faster than the usual 1/2 K log T for exchangeable sequences generated by random posterior sampling from a Bayesian model. In one dimension we prove that herding sequences are equivalent to Sturmian sequences which have complexity exactly log(T + 1). More generally, we advocate the application of the rich theoretical framework around nonlinear dynamical systems, chaos theory and fractal geometry to statistical learning.
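
    The herding recipe (a greedy state selection followed by a weight update that adds the target moment and subtracts the realized feature) is compact enough to sketch directly; the target moment below is an arbitrary illustrative value, not one from the paper.

    ```python
    # A minimal sketch of herding for one binary variable with feature phi(x) = x:
    # a deterministic weight dynamics whose pseudo-sample average tracks the
    # target moment at a fast (roughly 1/T) rate. The target value is an assumption.
    target_mean = 0.7          # desired E[x] for x in {0, 1}
    w = 0.0                    # herding weight
    samples = []
    for t in range(1000):
        x = 1 if w > 0 else 0          # greedy choice: argmax_x w * phi(x)
        samples.append(x)
        w += target_mean - x           # update: add moment, subtract realized feature
    print(f"empirical mean after {len(samples)} steps: {sum(samples) / len(samples):.3f}")
    ```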

  16. Statistical inference using weak chaos and infinite memory

    Energy Technology Data Exchange (ETDEWEB)

    Welling, Max; Chen Yutian, E-mail: welling@ics.uci.ed, E-mail: yutian.chen@uci.ed [Donald Bren School of Information and Computer Science, University of California Irvine CA 92697-3425 (United States)

    2010-06-01

    We describe a class of deterministic weakly chaotic dynamical systems with infinite memory. These 'herding systems' combine learning and inference into one algorithm, where moments or data-items are converted directly into an arbitrarily long sequence of pseudo-samples. This sequence has infinite range correlations and as such is highly structured. We show that its information content, as measured by sub-extensive entropy, can grow as fast as K log T, which is faster than the usual 1/2 K log T for exchangeable sequences generated by random posterior sampling from a Bayesian model. In one dimension we prove that herding sequences are equivalent to Sturmian sequences which have complexity exactly log(T + 1). More generally, we advocate the application of the rich theoretical framework around nonlinear dynamical systems, chaos theory and fractal geometry to statistical learning.

  17. Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes.

    Directory of Open Access Journals (Sweden)

    Fei Xiao

    Combining path consistency (PC) algorithms with conditional mutual information (CMI) is widely used in the reconstruction of gene regulatory networks. CMI has many advantages over the Pearson correlation coefficient in measuring the non-linear dependence used to infer gene regulatory networks. It can also discriminate direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computational complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference), to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from the DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noise interference analysis using different types of noise.
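
    The basic CMI pruning step that such methods build on can be illustrated in a few lines: an apparent edge between two genes is discarded when conditioning on a third gene removes their dependence. The toy discrete data, the estimator, and the interpretation below are assumptions for illustration only.

    ```python
    # A minimal sketch of pruning an indirect edge with conditional mutual
    # information: I(x; y) is large, but I(x; y | z) is near zero because the
    # x-y association is explained by the common regulator z.
    import numpy as np
    from sklearn.metrics import mutual_info_score

    rng = np.random.default_rng(4)
    z = rng.integers(0, 2, 5000)                    # regulator
    x = (z + (rng.random(5000) < 0.1)) % 2          # target 1, driven by z
    y = (z + (rng.random(5000) < 0.1)) % 2          # target 2, driven by z

    def cmi(a, b, c):
        """I(a; b | c) for discrete variables, averaged over the levels of c."""
        total = 0.0
        for level in np.unique(c):
            m = c == level
            total += m.mean() * mutual_info_score(a[m], b[m])
        return total

    print(f"I(x; y)     = {mutual_info_score(x, y):.3f}")   # large: looks like an edge
    print(f"I(x; y | z) = {cmi(x, y, z):.3f}")              # near zero: indirect, prune
    ```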

  18. An empirical Bayesian approach for model-based inference of cellular signaling networks

    Directory of Open Access Journals (Sweden)

    Klinke David J

    2009-11-01

    Background: A common challenge in systems biology is to infer mechanistic descriptions of biological processes given limited observations of a biological system. Mathematical models are frequently used to represent a belief about the causal relationships among proteins within a signaling network. Bayesian methods provide an attractive framework for inferring the validity of those beliefs in the context of the available data. However, efficient sampling of high-dimensional parameter space and appropriate convergence criteria provide barriers for implementing an empirical Bayesian approach. The objective of this study was to apply an adaptive Markov chain Monte Carlo technique to a typical study of cellular signaling pathways. Results: As an illustrative example, a kinetic model for the early signaling events associated with the epidermal growth factor (EGF) signaling network was calibrated against dynamic measurements observed in primary rat hepatocytes. A convergence criterion, based upon the Gelman-Rubin potential scale reduction factor, was applied to the model predictions. The posterior distributions of the parameters exhibited complicated structure, including significant covariance between specific parameters and a broad range of variance among the parameters. The model predictions, in contrast, were narrowly distributed and were used to identify areas of agreement among a collection of experimental studies. Conclusion: In summary, an empirical Bayesian approach was developed for inferring the confidence that one can place in a particular model that describes signal transduction mechanisms and for inferring inconsistencies in experimental measurements.

  19. Intercoalescence time distribution of incomplete gene genealogies in temporally varying populations, and applications in population genetic inference.

    Science.gov (United States)

    Chen, Hua

    2013-03-01

    Tracing back to a specific time T in the past, the genealogy of a sample of haplotypes may not have reached its common ancestor and may leave m lineages extant. For such an incomplete genealogy truncated at a specific time T in the past, the distribution and expectation of the intercoalescence times conditional on T are derived in exact form in this paper for populations of deterministically time-varying size, specifically for populations growing exponentially. The derived intercoalescence time distribution can be integrated into the coalescent-based joint allele frequency spectrum (JAFS) theory, and is useful for population genetic inference from large-scale genomic data, without relying on computationally intensive approaches such as importance sampling and Markov chain Monte Carlo (MCMC) methods. The inference of several important parameters relying on this derived conditional distribution is demonstrated: quantifying the population growth rate and onset time, and estimating the number of ancestral lineages at a specific ancient time. Simulation studies confirm the validity of the derivation and the statistical efficiency of the methods using the derived intercoalescence time distribution. Two examples with real data are given, showing inference of the population growth rate of a European sample from the NIEHS Environmental Genome Project, and of the number of ancient lineages of 31 mitochondrial genomes from Tibetan populations. © 2013 Blackwell Publishing Ltd/University College London.
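
    As background for the kind of calculation involved (a standard coalescent identity, not a formula quoted from this paper): with k lineages and a haploid population whose size u generations before the present is N(u), the waiting time to the next coalescence satisfies

    $$ \Pr\{\text{no coalescence within } t \text{ generations}\} \;=\; \exp\!\left[-\binom{k}{2}\int_{0}^{t}\frac{\mathrm{d}u}{N(u)}\right], \qquad \int_{0}^{t}\frac{\mathrm{d}u}{N(u)} \;=\; \frac{e^{\beta t}-1}{\beta N_{0}} \quad\text{for } N(u)=N_{0}e^{-\beta u}, $$

    so exponential growth (rate β, present size N_0) simply rescales time; the distributions derived in the paper condition such waiting times on the truncation time T and the number of surviving lineages.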

  20. Statistical inference and Aristotle's Rhetoric.

    Science.gov (United States)

    Macdonald, Ranald R

    2004-11-01

    Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.

  1. Children's and adults' judgments of the certainty of deductive inferences, inductive inferences, and guesses.

    Science.gov (United States)

    Pillow, Bradford H; Pearson, Raeanne M; Hecht, Mary; Bremer, Amanda

    2010-01-01

    Children and adults rated their own certainty following inductive inferences, deductive inferences, and guesses. Beginning in kindergarten, participants rated deductions as more certain than weak inductions or guesses. Deductions were rated as more certain than strong inductions beginning in Grade 3, and fourth-grade children and adults differentiated strong inductions, weak inductions, and informed guesses from pure guesses. By Grade 3, participants also gave different types of explanations for their deductions and inductions. These results are discussed in relation to children's concepts of cognitive processes, logical reasoning, and epistemological development.

  2. Probability sampling design in ethnobotanical surveys of medicinal plants

    Directory of Open Access Journals (Sweden)

    Mariano Martinez Espinosa

    2012-07-01

    Non-probability sampling designs can be used in ethnobotanical surveys of medicinal plants. However, such methods do not allow statistical inferences to be made from the data generated. The aim of this paper is to present a probability sampling design that is applicable to ethnobotanical studies of medicinal plants. The sampling design employed in the research titled "Ethnobotanical knowledge of medicinal plants used by traditional communities of Nossa Senhora Aparecida do Chumbo district (NSACD), Poconé, Mato Grosso, Brazil" was used as a case study. Probability sampling methods (simple random and stratified sampling) were used in this study. In order to determine the sample size, the following data were considered: population size (N) of 1179 families; confidence coefficient of 95%; sampling error (d) of 0.05; and a proportion (p) of 0.5. The application of this sampling method resulted in a sample size (n) of at least 290 families in the district. The present study concludes that probability sampling methods necessarily have to be employed in ethnobotanical studies of medicinal plants, particularly where statistical inferences have to be made using the data obtained. This can be achieved by applying different existing probability sampling methods, or better still, a combination of such methods.
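
    The reported sample size can be reproduced with the standard finite-population formula for estimating a proportion; the short check below uses the inputs listed in the abstract (the choice of formula is my reading of those inputs, since the abstract does not state it explicitly).

    ```python
    # A worked check of the sample size quoted above, using the standard
    # finite-population formula n = N z^2 p(1-p) / (d^2 (N-1) + z^2 p(1-p)).
    import math

    N = 1179        # population size (families)
    z = 1.96        # 95% confidence coefficient
    d = 0.05        # sampling error
    p = 0.5         # assumed proportion

    n = (N * z**2 * p * (1 - p)) / (d**2 * (N - 1) + z**2 * p * (1 - p))
    print(math.ceil(n))   # -> 290, matching the sample size quoted in the abstract
    ```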

  3. The importance of sound methodology in environmental DNA sampling

    Science.gov (United States)

    T. M. Wilcox; K. J. Carim; M. K. Young; K. S. McKelvey; T. W. Franklin; M. K. Schwartz

    2018-01-01

    Environmental DNA (eDNA) sampling - which enables inferences of species’ presence from genetic material in the environment - is a powerful tool for sampling rare fishes. Numerous studies have demonstrated that eDNA sampling generally provides greater probabilities of detection than traditional techniques (e.g., Thomsen et al. 2012; McKelvey et al. 2016; Valentini et al...

  4. Sampling, Probability Models and Statistical Reasoning Statistical

    Indian Academy of Sciences (India)

    Sampling, Probability Models and Statistical Reasoning: Statistical Inference. Mohan Delampady, V R Padmawar. General Article, Resonance – Journal of Science Education, Volume 1, Issue 5, May 1996, pp. 49-58.

  5. Towards a better understanding of the legibility bias in performance assessments: the case of gender-based inferences.

    Science.gov (United States)

    Greifeneder, Rainer; Zelt, Sarah; Seele, Tim; Bottenberg, Konstantin; Alt, Alexander

    2012-09-01

    Handwriting legibility systematically biases evaluations in that highly legible handwriting results in more positive evaluations than less legible handwriting. Because performance assessments in educational contexts are not only based on computerized or multiple choice tests but often include the evaluation of handwritten work samples, understanding the causes of this bias is critical. This research was designed to replicate and extend the legibility bias in two tightly controlled experiments and to explore whether gender-based inferences contribute to its occurrence. A total of 132 students from a German university participated in one pre-test and two independent experiments. Participants were asked to read and evaluate several handwritten essays varying in content quality. Each essay was presented to some participants in highly legible handwriting and to other participants in less legible handwriting. In addition, the assignment of legibility to participant group was reversed from essay to essay, resulting in a mixed-factor design. The legibility bias was replicated in both experiments. Results suggest that gender-based inferences do not account for its occurrence. Rather it appears that fluency from legibility exerts a biasing impact on evaluations of content and author abilities. The legibility bias was shown to be genuine and strong. By refuting a series of alternative explanations, this research contributes to a better understanding of what underlies the legibility bias. The present research may inform those who grade on what to focus and thus help to better allocate cognitive resources when trying to reduce this important source of error. ©2011 The British Psychological Society.

  6. Deep Learning for Population Genetic Inference.

    Science.gov (United States)

    Sheehan, Sara; Song, Yun S

    2016-03-01

    Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
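
    The likelihood-free idea (learn a regression from summary statistics of simulated data to the parameters that generated them, then apply it to the observed summaries) can be sketched with a toy simulator. Everything below, including the simulator and the network size, is an assumption for illustration; the paper's networks are trained on population genetic summary statistics.

    ```python
    # A minimal sketch of likelihood-free inference with a neural network:
    # map summary statistics of simulated data to the generating parameter.
    # The "simulator" is a toy stand-in, not a population genetic model.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(5)

    def simulate(theta, n=200):
        """Toy simulator: data whose mean and spread depend on theta."""
        x = rng.normal(theta, 1 + 0.5 * theta, n)
        return np.array([x.mean(), x.std(), np.median(x)])   # summary statistics

    thetas = rng.uniform(0, 5, 3000)
    summaries = np.array([simulate(t) for t in thetas])

    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500).fit(summaries, thetas)

    # "Observed" data generated with a parameter value the network never saw
    obs = simulate(2.5)
    print(f"inferred theta ~= {net.predict(obs.reshape(1, -1))[0]:.2f}")
    ```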

  7. Using Alien Coins to Test Whether Simple Inference Is Bayesian

    Science.gov (United States)

    Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.

    2016-01-01

    Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…

  8. On Maximum Entropy and Inference

    Directory of Open Access Journals (Sweden)

    Luigi Gresele

    2017-11-01

    Maximum entropy is a powerful concept that entails a sharp separation between relevant and irrelevant variables. It is typically invoked in inference, once an assumption is made on what the relevant variables are, in order to estimate a model from data that affords predictions on all other (dependent) variables. Conversely, maximum entropy can be invoked to retrieve the relevant variables (sufficient statistics) directly from the data, once a model is identified by Bayesian model selection. We explore this approach in the case of spin models with interactions of arbitrary order, and we discuss how relevant interactions can be inferred. In this perspective, the dimensionality of the inference problem is not set by the number of parameters in the model, but by the frequency distribution of the data. We illustrate the method by showing its ability to recover the correct model in a few prototype cases and discuss its application to a real dataset.

  9. Reliability of a method of sampling stream invertebrates

    CSIR Research Space (South Africa)

    Chutter, FM

    1966-05-01

    In field ecological studies inferences must often be drawn from dissimilarities in numbers and species of organisms found in biological samples collected at different times and under various conditions....

  10. BayesCLUMPY: BAYESIAN INFERENCE WITH CLUMPY DUSTY TORUS MODELS

    International Nuclear Information System (INIS)

    Asensio Ramos, A.; Ramos Almeida, C.

    2009-01-01

    Our aim is to present a fast and general Bayesian inference framework based on the synergy between machine learning techniques and standard sampling methods, and to apply it to infer the physical properties of clumpy dusty tori using infrared photometric high spatial resolution observations of active galactic nuclei. We make use of the Metropolis-Hastings Markov Chain Monte Carlo algorithm for sampling the posterior distribution function. Such a distribution results from combining all a priori knowledge about the parameters of the model with the information introduced by the observations. The main difficulty resides in the fact that the model used to explain the observations is computationally demanding and the sampling is very time consuming. For this reason, we apply a set of artificial neural networks that are used to approximate and interpolate a database of models. As a consequence, models not present in the original database can be computed, ensuring continuity. We focus on the application of this solution scheme to the recently developed public database of clumpy dusty torus models. The machine learning scheme used in this paper allows us to generate any model from the database using only a factor of 10^-4 of the original size of the database and a factor of 10^-3 in computing time. The posterior distribution obtained for each model parameter allows us to investigate how the observations constrain the parameters and which ones remain partially or completely undetermined, providing statistically relevant confidence intervals. As an example, the application to the nuclear region of Centaurus A shows that the optical depth of the clouds, the total number of clouds, and the radial extent of the cloud distribution zone are well constrained using only six filters. The code is freely available from the authors.

  11. Finite-sample instrumental variables inference using an asymptotically pivotal statistic

    NARCIS (Netherlands)

    Bekker, P; Kleibergen, F

    2003-01-01

    We consider the K-statistic, Kleibergen's (2002, Econometrica 70, 1781-1803) adaptation of the Anderson-Rubin (AR) statistic in instrumental variables regression. Whereas Kleibergen (2002) especially analyzes the asymptotic behavior of the statistic, we focus on finite-sample properties in, a

  12. Compiling Relational Bayesian Networks for Exact Inference

    DEFF Research Database (Denmark)

    Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan

    2004-01-01

    We describe a system for exact inference with relational Bayesian networks as defined in the publicly available \\primula\\ tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...

  13. Critical current density in MgB2 bulk samples after co-doping with nano-SiC and poly zinc acrylate complexes

    International Nuclear Information System (INIS)

    Zhang, Z.; Suo, H.; Ma, L.; Zhang, T.; Liu, M.; Zhou, M.

    2011-01-01

    MgB2 bulk co-doped with SiC and poly zinc acrylate complexes has been synthesized. Co-doping induces a higher level of carbon substitution and second-phase particles, and further increases the Jc value of MgB2 bulk beyond that obtained with SiC doping alone. The co-doped MgB2 bulk samples were synthesized using an in situ reaction process. The additives were 8 wt.% SiC nano powders and 10 wt.% [(CH2CHCOO)2Zn]n poly zinc acrylate complexes (PZA). A systematic study was performed on samples doped with SiC or PZA and samples co-doped with both of them. The effects of doping and co-doping on phase formation, microstructure, and the variation of lattice parameters were studied. The amount of substituted carbon, the critical temperature (Tc) and the critical current density (Jc) were determined. The calculated lattice parameters show a decrease of the a-axis, while no obvious change was detected for the c-axis parameter in co-doped samples. This indicates that carbon was substituted for boron in MgB2. The amount of substituted carbon for the co-doped sample shows an enhancement compared to that of both singly doped samples. The co-doped samples show the highest Jc values, reaching 3.3 x 10^4 A/cm^2 at 5 K and 7 T. It is shown that co-doping with SiC and an organic compound is an effective way to further improve the superconducting properties of MgB2.

  14. Uncertainty in prediction and in inference

    International Nuclear Information System (INIS)

    Hilgevoord, J.; Uffink, J.

    1991-01-01

    The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close relationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in inference can be obtained by means of the so-called statistical distance between probability distributions. When applied to quantum mechanics, this distance leads to a measure of the distinguishability of quantum states, which essentially is the absolute value of the matrix element between the states. The importance of this result to the quantum mechanical uncertainty principle is noted. The second part of the paper provides a derivation of the statistical distance on the basis of the so-called method of support

  15. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2000-01-01

    New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities based on

  16. Compiling Relational Bayesian Networks for Exact Inference

    DEFF Research Database (Denmark)

    Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark

    2006-01-01

    We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference...

  17. Introduction to statistical inference and its applications with R

    CERN Document Server

    Trosset, Michael W

    2009-01-01

    Experiments; Examples; Randomization; The Importance of Probability; Games of Chance; Mathematical Preliminaries; Sets; Counting; Functions; Limits; Probability; Interpretations of Probability; Axioms of Probability; Finite Sample Spaces; Conditional Probability; Random Variables; Case Study: Padrolling in Milton Murayama's All I asking for is my body; Discrete Random Variables; Basic Concepts; Examples; Expectation; Binomial Distributions; Continuous Random Variables; A Motivating Example; Basic Concepts; Elementary Examples; Normal Distributions; Normal Sampling Distributions; Quantifying Population Attributes; Symmetry; Quantiles; The Method of Least Squares; Data; The Plug-In Principle; Plug-In Estimates of Mean and Variance; Plug-In Estimates of Quantiles; Kernel Density Estimates; Case Study: Are Forearm Lengths Normally Distributed?; Transformations; Lots of Data; Averaging Decreases Variation; The Weak Law of Large Numbers; The Central Limit Theorem; Inference; A Motivating Example; Point Estimation; Heuristics of Hypothesis Testing; Testing Hypotheses about...

  18. Inference of Tumor Phylogenies with Improved Somatic Mutation Discovery

    KAUST Repository

    Salari, Raheleh

    2013-01-01

    Next-generation sequencing technologies provide a powerful tool for studying genome evolution during progression of advanced diseases such as cancer. Although many recent studies have employed new sequencing technologies to detect mutations across multiple, genetically related tumors, current methods do not exploit available phylogenetic information to improve the accuracy of their variant calls. Here, we present a novel algorithm that uses somatic single nucleotide variations (SNVs) in multiple, related tissue samples as lineage markers for phylogenetic tree reconstruction. Our method then leverages the inferred phylogeny to improve the accuracy of SNV discovery. Experimental analyses demonstrate that our method achieves up to 32% improvement for somatic SNV calling of multiple related samples over the accuracy of GATK's Unified Genotyper, the state of the art multisample SNV caller. © 2013 Springer-Verlag.

  19. Efficient Bayesian inference for ARFIMA processes

    Science.gov (United States)

    Graves, T.; Gramacy, R. B.; Franzke, C. L. E.; Watkins, N. W.

    2015-03-01

    Many geophysical quantities, like atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long-range dependence (LRD). LRD means that these quantities experience non-trivial temporal memory, which potentially enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LRD. In this paper we present a modern and systematic approach to the inference of LRD. Rather than Mandelbrot's fractional Gaussian noise, we use the more flexible Autoregressive Fractional Integrated Moving Average (ARFIMA) model which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LRD, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g. short memory effects) can be integrated over in order to focus on long memory parameters, and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data, with favorable comparison to the standard estimators.
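
    As a rough illustration of the fractional differencing operator (1 - B)^d at the heart of ARFIMA models, the numpy sketch below fractionally integrates white noise to create a long-memory series and then recovers d by a crude grid search; this is not the approximate likelihood or Bayesian machinery of the paper, and all settings are illustrative.

```python
import numpy as np

def frac_diff(x, d, n_weights=100):
    """Apply the truncated fractional differencing filter (1 - B)^d to a series x."""
    w = np.zeros(n_weights)
    w[0] = 1.0
    for k in range(1, n_weights):
        w[k] = w[k - 1] * (k - 1 - d) / k        # binomial expansion of (1 - B)^d
    out = np.convolve(x, w, mode="full")[: len(x)]
    return out[n_weights:]                        # drop the filter warm-up region

def estimate_d(x, grid=np.linspace(0.0, 0.49, 50)):
    """Crude profile estimate: pick the d whose filtered series is closest to white noise."""
    def abs_lag1_autocorr(y):
        y = y - y.mean()
        return abs(np.sum(y[1:] * y[:-1]) / np.sum(y * y))
    return grid[int(np.argmin([abs_lag1_autocorr(frac_diff(x, d)) for d in grid]))]

rng = np.random.default_rng(0)
noise = rng.normal(size=3000)
series = frac_diff(noise, -0.3, n_weights=500)    # fractional integration: ARFIMA(0, 0.3, 0)
print("estimated d:", estimate_d(series))
```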

  20. A Statistical Framework for Microbial Source Attribution: Measuring Uncertainty in Host Transmission Events Inferred from Genetic Data (Part 2 of a 2 Part Report)

    Energy Technology Data Exchange (ETDEWEB)

    Allen, J; Velsko, S

    2009-11-16

    This report explores the question of whether meaningful conclusions can be drawn regarding the transmission relationship between two microbial samples on the basis of differences observed between the two samples' respective genomes. Unlike similar forensic applications using human DNA, the rapid rate of microbial genome evolution combined with the dynamics of infectious disease requires a shift in thinking on what it means for two samples to 'match' in support of a forensic hypothesis. Previous outbreaks for SARS-CoV, FMDV and HIV were examined to investigate the question of how microbial sequence data can be used to draw inferences that link two infected individuals by direct transmission. The results are counterintuitive with respect to human DNA forensic applications in that some genetic change, rather than exact matching, improves confidence in inferring direct transmission links; however, too much genetic change poses challenges, which can weaken confidence in inferred links. High rates of infection coupled with relatively weak selective pressure observed in the SARS-CoV and FMDV data lead to fairly low confidence for direct transmission links. Confidence values for forensic hypotheses increased when testing for the possibility that samples are separated by at most a few intermediate hosts. Moreover, the observed outbreak conditions support the potential to provide high confidence values for hypotheses that exclude direct transmission links. Transmission inferences are based on the total number of observed or inferred genetic changes separating two sequences rather than uniquely weighing the importance of any one genetic mismatch. Thus, inferences are surprisingly robust in the presence of sequencing errors provided the error rates are randomly distributed across all samples in the reference outbreak database and the novel sequence samples in question. When the number of observed nucleotide mutations is limited due to characteristics of the

  1. Deep Learning for Population Genetic Inference.

    Directory of Open Access Journals (Sweden)

    Sara Sheehan

    2016-03-01

    Full Text Available Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.

  2. Deep Learning for Population Genetic Inference

    Science.gov (United States)

    Sheehan, Sara; Song, Yun S.

    2016-01-01

    Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme. PMID:27018908
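
    The following toy sketch illustrates the likelihood-free idea in miniature, not the authors' simulator or network architecture: a small neural network (scikit-learn's MLPRegressor) is trained on simulated (summary statistics, parameter) pairs and then applied to "observed" summaries. The simulate_summaries function and every number in it are invented stand-ins.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def simulate_summaries(theta, n_stats=50):
    """Stand-in for a population-genetic simulator: returns noisy summary statistics
    whose distribution depends on the (scalar) parameter theta."""
    base = np.linspace(0.1, 1.0, n_stats)
    return theta * base + rng.normal(scale=0.05, size=n_stats)

# Build a training set of (summary statistics, parameter) pairs from simulations.
thetas = rng.uniform(0.5, 2.0, size=2000)
X = np.stack([simulate_summaries(t) for t in thetas])
y = thetas

# A multilayer network learns the mapping from summaries to parameters,
# sidestepping any explicit likelihood computation.
net = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
net.fit(X, y)

observed = simulate_summaries(1.3)          # pretend these are the observed data
print("inferred parameter:", net.predict(observed.reshape(1, -1))[0])
```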

  3. Statistical Inference for Data Adaptive Target Parameters.

    Science.gov (United States)

    Hubbard, Alan E; Kherad-Pajouh, Sara; van der Laan, Mark J

    2016-05-01

    Consider one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target we partition the sample in V equal size sub-samples, and use this partitioning to define V splits in an estimation sample (one of the V subsamples) and corresponding complementary parameter-generating sample. For each of the V parameter-generating samples, we apply an algorithm that maps the sample to a statistical target parameter. We define our sample-split data adaptive statistical target parameter as the average of these V-sample specific target parameters. We present an estimator (and corresponding central limit theorem) of this type of data adaptive target parameter. This general methodology for generating data adaptive target parameters is demonstrated with a number of practical examples that highlight new opportunities for statistical learning from data. This new framework provides a rigorous statistical methodology for both exploratory and confirmatory analysis within the same data. Given that more research is becoming "data-driven", the theory developed within this paper provides a new impetus for a greater involvement of statistical inference into problems that are being increasingly addressed by clever, yet ad hoc pattern finding methods. To suggest such potential, and to verify the predictions of the theory, extensive simulation studies, along with a data analysis based on adaptively determined intervention rules are shown and give insight into how to structure such an approach. The results show that the data adaptive target parameter approach provides a general framework and resulting methodology for data-driven science.
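
    A minimal numpy sketch of the sample-splitting scheme described above, under the simplifying assumption that the data-adaptive algorithm simply picks the covariate most correlated with the outcome; the estimator and data are illustrative, not the authors' implementation.

```python
import numpy as np

def data_adaptive_estimate(X, y, V=5, seed=0):
    """Sketch of a sample-split data adaptive target parameter: each
    parameter-generating sample picks the target (here: the most
    outcome-correlated covariate), the held-out estimation sample estimates it,
    and the V fold-specific estimates are averaged."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), V)
    estimates = []
    for v in range(V):
        est_idx = folds[v]                                    # estimation sample
        gen_idx = np.concatenate([folds[u] for u in range(V) if u != v])
        corrs = [abs(np.corrcoef(X[gen_idx, j], y[gen_idx])[0, 1])
                 for j in range(X.shape[1])]
        j_star = int(np.argmax(corrs))                        # data-adaptive target
        estimates.append(np.corrcoef(X[est_idx, j_star], y[est_idx])[0, 1])
    return float(np.mean(estimates))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 0.7 * X[:, 3] + rng.normal(size=200)
print("data adaptive estimate:", data_adaptive_estimate(X, y))
```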

  4. Type Inference for Session Types in the Pi-Calculus

    DEFF Research Database (Denmark)

    Graversen, Eva Fajstrup; Harbo, Jacob Buchreitz; Huttel, Hans

    2014-01-01

    In this paper we present a direct algorithm for session type inference for the π-calculus. Type inference for session types has previously been achieved by either imposing limitations and restriction on the π-calculus, or by reducing the type inference problem to that for linear types. Our approach...

  5. Randomization-Based Inference about Latent Variables from Complex Samples: The Case of Two-Stage Sampling

    Science.gov (United States)

    Li, Tiandong

    2012-01-01

    In large-scale assessments, such as the National Assessment of Educational Progress (NAEP), plausible values based on Multiple Imputations (MI) have been used to estimate population characteristics for latent constructs under complex sample designs. Mislevy (1991) derived a closed-form analytic solution for a fixed-effect model in creating…

  6. The Role of the Sampling Distribution in Understanding Statistical Inference

    Science.gov (United States)

    Lipson, Kay

    2003-01-01

    Many statistics educators believe that few students develop the level of conceptual understanding essential for them to apply correctly the statistical techniques at their disposal and to interpret their outcomes appropriately. It is also commonly believed that the sampling distribution plays an important role in developing this understanding.…

  7. Explanatory Preferences Shape Learning and Inference.

    Science.gov (United States)

    Lombrozo, Tania

    2016-10-01

    Explanations play an important role in learning and inference. People often learn by seeking explanations, and they assess the viability of hypotheses by considering how well they explain the data. An emerging body of work reveals that both children and adults have strong and systematic intuitions about what constitutes a good explanation, and that these explanatory preferences have a systematic impact on explanation-based processes. In particular, people favor explanations that are simple and broad, with the consequence that engaging in explanation can shape learning and inference by leading people to seek patterns and favor hypotheses that support broad and simple explanations. Given the prevalence of explanation in everyday cognition, understanding explanation is therefore crucial to understanding learning and inference. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Grammatical inference algorithms, routines and applications

    CERN Document Server

    Wieczorek, Wojciech

    2017-01-01

    This book focuses on grammatical inference, presenting classic and modern methods of grammatical inference from the perspective of practitioners. To do so, it employs the Python programming language to present all of the methods discussed. Grammatical inference is a field that lies at the intersection of multiple disciplines, with contributions from computational linguistics, pattern recognition, machine learning, computational biology, formal learning theory and many others. Though the book is largely practical, it also includes elements of learning theory, combinatorics on words, the theory of automata and formal languages, plus references to real-world problems. The listings presented here can be directly copied and pasted into other programs, thus making the book a valuable source of ready recipes for students, academic researchers, and programmers alike, as well as an inspiration for their further development.

  9. Efficient computation of the joint sample frequency spectra for multiple populations.

    Science.gov (United States)

    Kamm, John A; Terhorst, Jonathan; Song, Yun S

    2017-01-01

    A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity.
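
    The sketch below only tabulates an observed joint SFS from per-site derived-allele counts in two populations; computing the expected joint SFS under a demographic model (the contribution of momi) is a much harder problem and is not attempted here. The toy counts are invented.

```python
import numpy as np

def joint_sfs(derived_counts_pop1, derived_counts_pop2, n1, n2):
    """Tabulate the joint sample frequency spectrum for two populations.
    derived_counts_pop* give, per polymorphic site, how many of the n*
    sampled haplotypes carry the derived allele."""
    sfs = np.zeros((n1 + 1, n2 + 1), dtype=int)
    for c1, c2 in zip(derived_counts_pop1, derived_counts_pop2):
        sfs[c1, c2] += 1
    return sfs

# Toy data: 5 polymorphic sites, 4 haplotypes sampled per population.
pop1 = [1, 2, 0, 4, 3]
pop2 = [0, 2, 1, 4, 1]
print(joint_sfs(pop1, pop2, 4, 4))
```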

  10. Inference of Causal Relationships between Biomarkers and Outcomes in High Dimensions

    Directory of Open Access Journals (Sweden)

    Felix Agakov

    2011-12-01

    Full Text Available We describe a unified computational framework for learning causal dependencies between genotypes, biomarkers, and phenotypic outcomes from large-scale data. In contrast to previous studies, our framework allows for noisy measurements, hidden confounders, missing data, and pleiotropic effects of genotypes on outcomes. The method exploits the use of genotypes as “instrumental variables” to infer causal associations between phenotypic biomarkers and outcomes, without requiring the assumption that genotypic effects are mediated only through the observed biomarkers. The framework builds on sparse linear methods developed in statistics and machine learning and modified here for inferring structures of richer networks with latent variables. Where the biomarkers are gene transcripts, the method can be used for fine mapping of quantitative trait loci (QTLs) detected in genetic linkage studies. To demonstrate our method, we examined effects of gene transcript levels in the liver on plasma HDL cholesterol levels in a sample of 260 mice from a heterogeneous stock.
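
    A toy instrumental-variable example (here the simple Wald ratio rather than the paper's sparse-network machinery), with a synthetic genotype G used as instrument for a biomarker B, a hidden confounder U, and an outcome Y; all effect sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Toy model: genotype G affects biomarker B; hidden confounder U affects both
# B and the outcome Y; B has a true causal effect of 0.5 on Y.
G = rng.binomial(2, 0.3, n).astype(float)
U = rng.normal(size=n)
B = 1.0 * G + U + rng.normal(size=n)
Y = 0.5 * B - 1.0 * U + rng.normal(size=n)

# Naive regression of Y on B is biased by the confounder U.
naive = np.cov(B, Y)[0, 1] / np.var(B, ddof=1)

# Instrumental-variable (Wald ratio) estimate: effect of G on Y divided by effect of G on B.
stage1 = np.cov(G, B)[0, 1] / np.var(G, ddof=1)
reduced = np.cov(G, Y)[0, 1] / np.var(G, ddof=1)
iv_estimate = reduced / stage1

print(f"naive: {naive:.2f}, IV: {iv_estimate:.2f}  (true effect 0.5)")
```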

  11. Inferring microbial interaction networks from metagenomic data using SgLV-EKF algorithm.

    Science.gov (United States)

    Alshawaqfeh, Mustafa; Serpedin, Erchin; Younes, Ahmad Bani

    2017-03-27

    Inferring the microbial interaction networks (MINs) and modeling their dynamics are critical in understanding the mechanisms of the bacterial ecosystem and designing antibiotic and/or probiotic therapies. Recently, several approaches were proposed to infer MINs using the generalized Lotka-Volterra (gLV) model. A main drawback of these models is that they consider only measurement noise, without accounting for uncertainty in the underlying dynamics. Furthermore, inferring the MIN is characterized by the limited number of observations and nonlinearity in the regulatory mechanisms. Therefore, novel estimation techniques are needed to address these challenges. This work proposes SgLV-EKF: a stochastic gLV model that adopts the extended Kalman filter (EKF) algorithm to model the MIN dynamics. In particular, SgLV-EKF employs a stochastic modeling of the MIN by adding a noise term to the dynamical model to compensate for modeling uncertainties. This stochastic modeling is more realistic than the conventional gLV model, which assumes that the MIN dynamics are perfectly governed by the gLV equations. After specifying the stochastic model structure, we propose the EKF to estimate the MIN. SgLV-EKF was compared with two similarity-based algorithms, one algorithm from the integral-based family and two regression-based algorithms, in terms of the achieved performance on two synthetic data-sets and two real data-sets. The first data-set models the randomness in measurement data, whereas the second data-set incorporates uncertainties in the underlying dynamics. The real data-sets are provided by a recent study pertaining to an antibiotic-mediated Clostridium difficile infection. The experimental results demonstrate that SgLV-EKF outperforms the alternative methods in terms of robustness to measurement noise, modeling errors, and tracking the dynamics of the MIN. Performance analysis demonstrates that the proposed SgLV-EKF algorithm
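
    For intuition only, the sketch below runs an extended Kalman filter on a two-species stochastic gLV system with known growth rates and interactions, filtering the abundances; the paper's SgLV-EKF additionally estimates the interaction network itself, which is omitted here, and all parameter values are invented.

```python
import numpy as np

# Minimal EKF sketch for a stochastic generalized Lotka-Volterra (gLV) system.
r = np.array([0.8, 0.5])                     # growth rates (toy values)
A = np.array([[-1.0, -0.3],                  # interaction matrix (toy values)
              [-0.2, -0.8]])
dt, T = 0.05, 200
Q = 1e-4 * np.eye(2)                         # process-noise covariance (model uncertainty)
R = 1e-2 * np.eye(2)                         # measurement-noise covariance

def f(x):                                    # gLV drift: x_i * (r_i + sum_j A_ij x_j)
    return x * (r + A @ x)

def jac(x):                                  # Jacobian of the Euler step x + dt * f(x)
    J = np.diag(r + A @ x) + x[:, None] * A
    return np.eye(2) + dt * J

rng = np.random.default_rng(0)
x_true = np.array([0.5, 0.4])
x_est, P = np.array([0.3, 0.3]), 0.1 * np.eye(2)

for _ in range(T):
    # Simulate the "true" stochastic system and a noisy measurement.
    x_true = x_true + dt * f(x_true) + rng.multivariate_normal(np.zeros(2), Q)
    y = x_true + rng.multivariate_normal(np.zeros(2), R)

    # EKF predict step.
    F = jac(x_est)
    x_pred = x_est + dt * f(x_est)
    P = F @ P @ F.T + Q

    # EKF update step (observation model is the identity, H = I).
    K = P @ np.linalg.inv(P + R)
    x_est = x_pred + K @ (y - x_pred)
    P = (np.eye(2) - K) @ P

print("true:", x_true, "filtered:", x_est)
```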

  12. Ensemble stacking mitigates biases in inference of synaptic connectivity.

    Science.gov (United States)

    Chambers, Brendan; Levy, Maayan; Dechery, Joseph B; MacLean, Jason N

    2018-01-01

    A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches.
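
    A minimal sketch of the stacking idea: per-edge scores from several (here synthetic) inference methods are combined with weights fit against ground-truth connectivity on simulated networks; the simulated scores and least-squares weighting are stand-ins, not the authors' pipeline.

```python
import numpy as np

def fit_ensemble_weights(score_list, ground_truth):
    """Learn a linear combination of per-edge scores from several inference
    methods (one column per method) against known ground-truth connectivity."""
    X = np.column_stack([s.ravel() for s in score_list])
    y = ground_truth.ravel().astype(float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def ensemble_score(score_list, w):
    X = np.column_stack([s.ravel() for s in score_list])
    return (X @ w).reshape(score_list[0].shape)

# Toy usage: three noisy methods scoring a 50x50 network, combined into one prediction.
rng = np.random.default_rng(0)
truth = rng.random((50, 50)) < 0.1
scores = [truth + rng.normal(scale=s, size=truth.shape) for s in (0.3, 0.5, 0.8)]
w = fit_ensemble_weights(scores, truth)
combined = ensemble_score(scores, w)
```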

  13. Molecular phylogeny of Acer monspessulanum L. subspecies from Iran inferred using the ITS region of nuclear ribosomal DNA

    Directory of Open Access Journals (Sweden)

    HANIF KHADEMI

    2016-04-01

    Full Text Available Abstract. Khademi H, Mehregan I, Assadi M, Nejadsatari T, Zarre S. 2015. Molecular phylogeny of Acer monspessulanum L. subspecies from Iran inferred using the ITS region of nuclear ribosomal DNA. Biodiversitas 17: 16-23. This study was carried out on the Acer monspessulanum complex growing wild in Iran. Internal transcribed spacer (ITS) sequences for 75 samples representing five different subspecies of Acer monspessulanum were analyzed. Besides this, 86 previously published ITS sequences from GenBank were used to test the monophyly of the complex worldwide. Phylogenetic analyses were conducted using Bayesian inference and maximum parsimony. The results indicate that most samples of A. monspessulanum species from Iran were part of a monophyletic clade with 8 samples of A. ibericum from Georgia, A. hyrcanum from Iran and one of A. sempervirens from Greece (PP = 1; BS = 79%). Our results indicate that use of morphological characteristics coupled with molecular data will be most effective.

  14. Stochastic processes inference theory

    CERN Document Server

    Rao, Malempati M

    2014-01-01

    This is the revised and enlarged 2nd edition of the authors’ original text, which was intended to be a modest complement to Grenander's fundamental memoir on stochastic processes and related inference theory. The present volume gives a substantial account of regression analysis, both for stochastic processes and measures, and includes recent material on Ridge regression with some unexpected applications, for example in econometrics. The first three chapters can be used for a quarter or semester graduate course on inference on stochastic processes. The remaining chapters provide more advanced material on stochastic analysis suitable for graduate seminars and discussions, leading to dissertation or research work. In general, the book will be of interest to researchers in probability theory, mathematical statistics and electrical and information theory.

  15. Russell and Humean Inferences

    Directory of Open Access Journals (Sweden)

    João Paulo Monteiro

    2001-12-01

    Full Text Available Russell's The Problems of Philosophy tries to establish a new theory of induction, at the same time that Hume is there accused of an "irrational scepticism about induction". But a careful analysis of the theory of knowledge explicitly acknowledged by Hume reveals that, contrary to the standard interpretation in the XXth century, possibly influenced by Russell, Hume deals exclusively with causal inference (which he never classifies as "causal induction", although now we are entitled to do so), never with inductive inference in general, mainly generalizations about sensible qualities of objects (whether, e.g., "all crows are black" or not is not among Hume's concerns). Russell's theories are thus only false alternatives to Hume's, in (1912) or in his (1948).

  16. Efficient algorithms for conditional independence inference

    Czech Academy of Sciences Publication Activity Database

    Bouckaert, R.; Hemmecke, R.; Lindner, S.; Studený, Milan

    2010-01-01

    Roč. 11, č. 1 (2010), s. 3453-3479 ISSN 1532-4435 R&D Projects: GA ČR GA201/08/0539; GA MŠk 1M0572 Institutional research plan: CEZ:AV0Z10750506 Keywords : conditional independence inference * linear programming approach Subject RIV: BA - General Mathematics Impact factor: 2.949, year: 2010 http://library.utia.cas.cz/separaty/2010/MTR/studeny-efficient algorithms for conditional independence inference.pdf

  17. State-Space Inference and Learning with Gaussian Processes

    OpenAIRE

    Turner, R; Deisenroth, MP; Rasmussen, CE

    2010-01-01

    State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model. C...

  18. Enhancing Transparency and Control When Drawing Data-Driven Inferences About Individuals.

    Science.gov (United States)

    Chen, Daizhuo; Fraiberger, Samuel P; Moakler, Robert; Provost, Foster

    2017-09-01

    Recent studies show the remarkable power of fine-grained information disclosed by users on social network sites to infer users' personal characteristics via predictive modeling. Similar fine-grained data are being used successfully in other commercial applications. In response, attention is turning increasingly to the transparency that organizations provide to users as to what inferences are drawn and why, as well as to what sort of control users can be given over inferences that are drawn about them. In this article, we focus on inferences about personal characteristics based on information disclosed by users' online actions. As a use case, we explore personal inferences that are made possible from "Likes" on Facebook. We first present a means for providing transparency into the information responsible for inferences drawn by data-driven models. We then introduce the "cloaking device"-a mechanism for users to inhibit the use of particular pieces of information in inference. Using these analytical tools we ask two main questions: (1) How much information must users cloak to significantly affect inferences about their personal traits? We find that usually users must cloak only a small portion of their actions to inhibit inference. We also find that, encouragingly, false-positive inferences are significantly easier to cloak than true-positive inferences. (2) Can firms change their modeling behavior to make cloaking more difficult? The answer is a definitive yes. We demonstrate a simple modeling change that requires users to cloak substantially more information to affect the inferences drawn. The upshot is that organizations can provide transparency and control even into complicated, predictive model-driven inferences, but they also can make control easier or harder for their users.
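
    The sketch below imitates the cloaking idea on synthetic data: a logistic-regression trait predictor is trained on binary "Like"-style features, and a user's most inference-driving features are zeroed out until the predicted probability changes; the data, model, and cloak function are illustrative assumptions, not the authors' system.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: 1000 users x 200 binary "Like"-style features, synthetic trait labels.
X = (rng.random((1000, 200)) < 0.05).astype(float)
true_w = rng.normal(size=200)
y = (X @ true_w + rng.normal(size=1000) > 0).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

def cloak(x, model, k):
    """Zero out the k active features that push the prediction hardest
    toward the positive class (the user's cloaked footprint)."""
    x = x.copy()
    contrib = x * model.coef_[0]                 # per-feature contribution to the score
    for j in np.argsort(contrib)[::-1][:k]:
        if x[j] > 0:
            x[j] = 0.0
    return x

user = X[0]
for k in (0, 2, 5, 10):
    p = model.predict_proba(cloak(user, model, k).reshape(1, -1))[0, 1]
    print(f"cloaked {k:2d} likes -> P(trait) = {p:.2f}")
```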

  19. Multi-Wavelength Spectroscopy of Tidal Disruption Flares: A Legacy Sample for the LSST Era

    Science.gov (United States)

    Cenko, Stephen

    2017-08-01

    When a star passes within the sphere of disruption of a massive black hole, tidal forces will overcome self-gravity and unbind the star. While approximately half of the stellar debris is ejected at high velocities, the remaining material stays bound to the black hole and accretes, resulting in a luminous, long-lived transient known as a tidal disruption flare (TDF). In addition to serving as unique laboratories for accretion physics, TDFs offer the hope of measuring black hole masses in galaxies much too distant for resolved kinematic studies. In order to realize this potential, we must better understand the detailed processes by which the bound debris circularizes and forms an accretion disk. Spectroscopy is critical to this effort, as emission and absorption line diagnostics provide insight into the location and physical state (velocity, density, composition) of the emitting gas (in analogy with quasars). UV spectra are particularly critical, as most strong atomic features fall in this bandpass, and high-redshift TDF discoveries from LSST will sample rest-frame UV wavelengths. Here we propose to obtain a sequence of UV (HST) and optical (Gemini/GMOS) spectra for a sample of 5 TDFs discovered by the Zwicky Transient Facility, doubling the number of TDFs with UV spectra. Our observations will directly test models for the generation of the UV/optical emission (circularization vs reprocessing) by searching for outflows and measuring densities, temperatures, and composition as a function of time. This effort is critical to developing the framework by which we can infer black hole properties (e.g., mass) from LSST TDF discoveries.

  20. Hydrologic Synthesis Across the Critical Zone Observatory Network: A Step Towards Understanding the Coevolution of Critical Zone Function and Structure

    Science.gov (United States)

    Wlostowski, A. N.; Harman, C. J.; Molotch, N. P.

    2017-12-01

    The physical and biological architecture of the Earth's Critical Zone controls hydrologic partitioning, storage, and chemical evolution of precipitated water. The Critical Zone Observatory (CZO) Network provides an ideal platform to explore linkages between catchment structure and hydrologic function across a gradient of geologic and climatic settings. A legacy of hypothesis-motivated research at each site has generated a wealth of data characterizing the architecture and hydrologic function of the critical zone. We will present a synthesis of this data that aims to elucidate and explain (in the sense of making mutually intelligible) variations in hydrologic function across the CZO network. Top-down quantitative signatures of the storage and partitioning of water at catchment scales extracted from precipitation, streamflow, and meteorological data will be compared with each other, and provide quantitative benchmarks to assess differences in perceptual models of hydrologic function at each CZO site. Annual water balance analyses show that CZO sites span a wide gradient of aridity and evaporative partitioning. The aridity index (PET/P) ranges from 0.3 at Luquillo to 4.3 at Reynolds Creek, while the evaporative index (E/P) ranges from 0.3 at Luquillo (Rio Mamayes) to 0.9 at Reynolds Creek (Reynolds Creek Outlet). Snow depth and SWE observations reveal that snowpack is an important seasonal storage reservoir at four sites: Boulder, Jemez, Reynolds Creek and Southern Sierra. Simple dynamical models are also used to infer seasonal patterns of subsurface catchment storage. A root-zone water balance model reveals unique seasonal variations in plant-available water storage. Seasonal patterns of plant-available storage are driven by the asynchronicity of seasonal precipitation and evaporation cycles. Catchment sensitivity functions are derived at each site to infer relative changes in hydraulic storage (the apparent storage reservoir responsible for modulating streamflow
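
    A trivial sketch of the two indices quoted above; the precipitation, PET, and evaporation numbers are invented to roughly reproduce the quoted Luquillo and Reynolds Creek values and are not CZO data.

```python
def water_balance_indices(precip, pet, evap):
    """Annual water-balance summary: aridity index PET/P and evaporative index E/P."""
    return {"aridity_index": pet / precip, "evaporative_index": evap / precip}

# Illustrative values only: a humid site (aridity ~0.3) and an arid site (aridity ~4.3).
print(water_balance_indices(precip=3500.0, pet=1100.0, evap=1000.0))
print(water_balance_indices(precip=300.0, pet=1300.0, evap=270.0))
```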

  1. Bayesian structural inference for hidden processes

    Science.gov (United States)

    Strelioff, Christopher C.; Crutchfield, James P.

    2014-04-01

    We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian structural inference (BSI) relies on a set of candidate unifilar hidden Markov model (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological ɛ-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be ɛ-machines, irrespective of estimated transition probabilities. Properties of ɛ-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite process internal structure being only indirectly present in data. We demonstrate BSI's effectiveness in estimating a process's randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare using the posterior distribution over candidate models and the single, maximum a posteriori model for point estimation and show that the former more accurately reflects uncertainty in estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well as to an out-of-class, infinite-state hidden process.

  2. The Impact of Disablers on Predictive Inference

    Science.gov (United States)

    Cummins, Denise Dellarosa

    2014-01-01

    People consider alternative causes when deciding whether a cause is responsible for an effect (diagnostic inference) but appear to neglect them when deciding whether an effect will occur (predictive inference). Five experiments were conducted to test a 2-part explanation of this phenomenon: namely, (a) that people interpret standard predictive…

  3. Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks

    Science.gov (United States)

    2017-01-01

    Whole-genome sequencing of pathogens from host samples becomes more and more routine during infectious disease outbreaks. These data provide information on possible transmission events which can be used for further epidemiologic analyses, such as identification of risk factors for infectivity and transmission. However, the relationship between transmission events and sequence data is obscured by uncertainty arising from four largely unobserved processes: transmission, case observation, within-host pathogen dynamics and mutation. To properly resolve transmission events, these processes need to be taken into account. Recent years have seen much progress in theory and method development, but existing applications make simplifying assumptions that often break up the dependency between the four processes, or are tailored to specific datasets with matching model assumptions and code. To obtain a method with wider applicability, we have developed a novel approach to reconstruct transmission trees with sequence data. Our approach combines elementary models for transmission, case observation, within-host pathogen dynamics, and mutation, under the assumption that the outbreak is over and all cases have been observed. We use Bayesian inference with MCMC for which we have designed novel proposal steps to efficiently traverse the posterior distribution, taking account of all unobserved processes at once. This allows for efficient sampling of transmission trees from the posterior distribution, and robust estimation of consensus transmission trees. We implemented the proposed method in a new R package phybreak. The method performs well in tests of both new and published simulated data. We apply the model to five datasets on densely sampled infectious disease outbreaks, covering a wide range of epidemiological settings. Using only sampling times and sequences as data, our analyses confirmed the original results or improved on them: the more realistic infection times place more

  4. Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks.

    Science.gov (United States)

    Klinkenberg, Don; Backer, Jantien A; Didelot, Xavier; Colijn, Caroline; Wallinga, Jacco

    2017-05-01

    Whole-genome sequencing of pathogens from host samples becomes more and more routine during infectious disease outbreaks. These data provide information on possible transmission events which can be used for further epidemiologic analyses, such as identification of risk factors for infectivity and transmission. However, the relationship between transmission events and sequence data is obscured by uncertainty arising from four largely unobserved processes: transmission, case observation, within-host pathogen dynamics and mutation. To properly resolve transmission events, these processes need to be taken into account. Recent years have seen much progress in theory and method development, but existing applications make simplifying assumptions that often break up the dependency between the four processes, or are tailored to specific datasets with matching model assumptions and code. To obtain a method with wider applicability, we have developed a novel approach to reconstruct transmission trees with sequence data. Our approach combines elementary models for transmission, case observation, within-host pathogen dynamics, and mutation, under the assumption that the outbreak is over and all cases have been observed. We use Bayesian inference with MCMC for which we have designed novel proposal steps to efficiently traverse the posterior distribution, taking account of all unobserved processes at once. This allows for efficient sampling of transmission trees from the posterior distribution, and robust estimation of consensus transmission trees. We implemented the proposed method in a new R package phybreak. The method performs well in tests of both new and published simulated data. We apply the model to five datasets on densely sampled infectious disease outbreaks, covering a wide range of epidemiological settings. Using only sampling times and sequences as data, our analyses confirmed the original results or improved on them: the more realistic infection times place more

  5. Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks.

    Directory of Open Access Journals (Sweden)

    Don Klinkenberg

    2017-05-01

    Full Text Available Whole-genome sequencing of pathogens from host samples becomes more and more routine during infectious disease outbreaks. These data provide information on possible transmission events which can be used for further epidemiologic analyses, such as identification of risk factors for infectivity and transmission. However, the relationship between transmission events and sequence data is obscured by uncertainty arising from four largely unobserved processes: transmission, case observation, within-host pathogen dynamics and mutation. To properly resolve transmission events, these processes need to be taken into account. Recent years have seen much progress in theory and method development, but existing applications make simplifying assumptions that often break up the dependency between the four processes, or are tailored to specific datasets with matching model assumptions and code. To obtain a method with wider applicability, we have developed a novel approach to reconstruct transmission trees with sequence data. Our approach combines elementary models for transmission, case observation, within-host pathogen dynamics, and mutation, under the assumption that the outbreak is over and all cases have been observed. We use Bayesian inference with MCMC for which we have designed novel proposal steps to efficiently traverse the posterior distribution, taking account of all unobserved processes at once. This allows for efficient sampling of transmission trees from the posterior distribution, and robust estimation of consensus transmission trees. We implemented the proposed method in a new R package phybreak. The method performs well in tests of both new and published simulated data. We apply the model to five datasets on densely sampled infectious disease outbreaks, covering a wide range of epidemiological settings. Using only sampling times and sequences as data, our analyses confirmed the original results or improved on them: the more realistic infection

  6. Automatic physical inference with information maximizing neural networks

    Science.gov (United States)

    Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.

    2018-04-01

    Compressing large data sets to a manageable number of summaries that are informative about the underlying parameters vastly simplifies both frequentist and Bayesian inference. When only simulations are available, these summaries are typically chosen heuristically, so they may inadvertently miss important information. We introduce a simulation-based machine learning technique that trains artificial neural networks to find nonlinear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). In test cases where the posterior can be derived exactly, likelihood-free inference based on automatically derived IMNN summaries produces nearly exact posteriors, showing that these summaries are good approximations to sufficient statistics. In a series of numerical examples of increasing complexity and astrophysical relevance we show that IMNNs are robustly capable of automatically finding optimal, nonlinear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima. We anticipate that the automatic physical inference method described in this paper will be essential to obtain both accurate and precise cosmological parameter estimates from complex and large astronomical data sets, including those from LSST and Euclid.

  7. Statistical inference of seabed sound-speed structure in the Gulf of Oman Basin.

    Science.gov (United States)

    Sagers, Jason D; Knobles, David P

    2014-06-01

    Addressed is the statistical inference of the sound-speed depth profile of a thick soft seabed from broadband sound propagation data recorded in the Gulf of Oman Basin in 1977. The acoustic data are in the form of time series signals recorded on a sparse vertical line array and generated by explosive sources deployed along a 280 km track. The acoustic data offer a unique opportunity to study a deep-water bottom-limited thickly sedimented environment because of the large number of time series measurements, very low seabed attenuation, and auxiliary measurements. A maximum entropy method is employed to obtain a conditional posterior probability distribution (PPD) for the sound-speed ratio and the near-surface sound-speed gradient. The multiple data samples allow for a determination of the average error constraint value required to uniquely specify the PPD for each data sample. Two complicating features of the statistical inference study are addressed: (1) the need to develop an error function that can both utilize the measured multipath arrival structure and mitigate the effects of data errors and (2) the effect of small bathymetric slopes on the structure of the bottom interacting arrivals.

  8. Inference as Prediction

    Science.gov (United States)

    Watson, Jane

    2007-01-01

    Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…

  9. Problem solving and inference mechanisms

    Energy Technology Data Exchange (ETDEWEB)

    Furukawa, K; Nakajima, R; Yonezawa, A; Goto, S; Aoyama, A

    1982-01-01

    The heart of the fifth generation computer will be powerful mechanisms for problem solving and inference. A deduction-oriented language is to be designed, which will form the core of the whole computing system. The language is based on predicate logic with the extended features of structuring facilities, meta structures and relational data base interfaces. Parallel computation mechanisms and specialized hardware architectures are being investigated to make possible efficient realization of the language features. The project includes research into an intelligent programming system, a knowledge representation language and system, and a meta inference system to be built on the core. 30 references.

  10. How accurately can other people infer your thoughts—And does culture matter?

    Science.gov (United States)

    Valanides, Constantinos; Sheppard, Elizabeth; Mitchell, Peter

    2017-01-01

    This research investigated how accurately people infer what others are thinking after observing a brief sample of their behaviour and whether culture/similarity is a relevant factor. Target participants (14 British and 14 Mediterraneans) were cued to think about either positive or negative events they had experienced. Subsequently, perceiver participants (16 British and 16 Mediterraneans) watched videos of the targets thinking about these things. Perceivers (both groups) were significantly accurate in judging when targets had been cued to think of something positive versus something negative, indicating notable inferential ability. Additionally, Mediterranean perceivers were better than British perceivers in making such inferences, irrespective of nationality of the targets, something that was statistically accounted for by corresponding group differences in levels of independently measured collectivism. The results point to the need for further research to investigate the possibility that being reared in a collectivist culture fosters ability in interpreting others’ behaviour. PMID:29112972

  11. How accurately can other people infer your thoughts-And does culture matter?

    Science.gov (United States)

    Valanides, Constantinos; Sheppard, Elizabeth; Mitchell, Peter

    2017-01-01

    This research investigated how accurately people infer what others are thinking after observing a brief sample of their behaviour and whether culture/similarity is a relevant factor. Target participants (14 British and 14 Mediterraneans) were cued to think about either positive or negative events they had experienced. Subsequently, perceiver participants (16 British and 16 Mediterraneans) watched videos of the targets thinking about these things. Perceivers (both groups) were significantly accurate in judging when targets had been cued to think of something positive versus something negative, indicating notable inferential ability. Additionally, Mediterranean perceivers were better than British perceivers in making such inferences, irrespective of nationality of the targets, something that was statistically accounted for by corresponding group differences in levels of independently measured collectivism. The results point to the need for further research to investigate the possibility that being reared in a collectivist culture fosters ability in interpreting others' behaviour.

  12. Inferring diameters of spheres and cylinders using interstitial water.

    Science.gov (United States)

    Herrera, Sheryl L; Mercredi, Morgan E; Buist, Richard; Martin, Melanie

    2018-06-04

    Most early methods to infer axon diameter distributions using magnetic resonance imaging (MRI) used single diffusion encoding sequences such as pulsed gradient spin echo (SE) and are thus sensitive to axons of diameters > 5 μm. We previously simulated oscillating gradient (OG) SE sequences for diffusion spectroscopy to study smaller axons including the majority constituting cortical connections. That study suggested the model of constant extra-axonal diffusion breaks down at OG accessible frequencies. In this study we present data from phantoms to test a time-varying interstitial apparent diffusion coefficient. Diffusion spectra were measured in four samples from water packed around beads of diameters 3, 6 and 10 μm; and 151 μm diameter tubes. Surface-to-volume ratios, and diameters were inferred. The bead pore radii estimates were 0.60±0.08 μm, 0.54±0.06 μm and 1.0±0.1 μm corresponding to bead diameters ranging from 2.9±0.4 μm to 5.3±0.7 μm, 2.6±0.3 μm to 4.8±0.6 μm, and 4.9±0.7 μm to 9±1 μm. The tube surface-to-volume ratio estimate was 0.06±0.02 μm⁻¹ corresponding to a tube diameter of 180±70 μm. Interstitial models with OG inferred 3-10 μm bead diameters from 0.54±0.06 μm to 1.0±0.1 μm pore radii and 151 μm tube diameters from 0.06±0.02 μm⁻¹ surface-to-volume ratios.

  13. Inference of Transmission Network Structure from HIV Phylogenetic Trees.

    Science.gov (United States)

    Giardina, Federica; Romero-Severson, Ethan Obie; Albert, Jan; Britton, Tom; Leitner, Thomas

    2017-01-01

    Phylogenetic inference is an attractive means to reconstruct transmission histories and epidemics. However, there is not a perfect correspondence between transmission history and virus phylogeny. Both node height and topological differences may occur, depending on the interaction between within-host evolutionary dynamics and between-host transmission patterns. To investigate these interactions, we added a within-host evolutionary model in epidemiological simulations and examined if the resulting phylogeny could recover different types of contact networks. To further improve realism, we also introduced patient-specific differences in infectivity across disease stages, and on the epidemic level we considered incomplete sampling and the age of the epidemic. Second, we implemented an inference method based on approximate Bayesian computation (ABC) to discriminate among three well-studied network models and jointly estimate both network parameters and key epidemiological quantities such as the infection rate. Our ABC framework used both topological and distance-based tree statistics for comparison between simulated and observed trees. Overall, our simulations showed that a virus time-scaled phylogeny (genealogy) may be substantially different from the between-host transmission tree. This has important implications for the interpretation of what a phylogeny reveals about the underlying epidemic contact network. In particular, we found that while the within-host evolutionary process obscures the transmission tree, the diversification process and infectivity dynamics also add discriminatory power to differentiate between different types of contact networks. We also found that the possibility to differentiate contact networks depends on how far an epidemic has progressed, where distance-based tree statistics have more power early in an epidemic. Finally, we applied our ABC inference on two different outbreaks from the Swedish HIV-1 epidemic.
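
    A bare-bones ABC rejection sketch of the general approach (not the authors' tree statistics or network models): draw an infection-rate parameter from its prior, simulate an outbreak, and keep draws whose summary statistic lands near the observed one. The simulate_outbreak function is an invented stand-in for the full epidemic-plus-phylogeny simulator.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_outbreak(beta, n_hosts=100, t_max=15):
    """Very crude SI-type simulator returning one summary statistic
    (final number infected); a placeholder for simulating an epidemic over a
    contact network and summarising the resulting phylogeny."""
    infected = 1
    for _ in range(t_max):
        new = rng.binomial(n_hosts - infected, 1 - np.exp(-beta * infected / n_hosts))
        infected += new
    return infected

observed_summary = simulate_outbreak(0.2)        # pretend this came from real data

# ABC rejection: keep prior draws whose simulated summary is close to the observed one.
prior_draws = rng.uniform(0.05, 0.6, size=20000)
accepted = [b for b in prior_draws
            if abs(simulate_outbreak(b) - observed_summary) <= 3]

print(f"posterior mean of beta ~ {np.mean(accepted):.2f} "
      f"({len(accepted)} of {len(prior_draws)} draws accepted)")
```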

  14. Elements of Causal Inference: Foundations and Learning Algorithms

    DEFF Research Database (Denmark)

    Peters, Jonas Martin; Janzing, Dominik; Schölkopf, Bernhard

    A concise and self-contained introduction to causal inference, increasingly important in data science and machine learning.

  15. Bayesian methods for hackers probabilistic programming and Bayesian inference

    CERN Document Server

    Davidson-Pilon, Cameron

    2016-01-01

    Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice–freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples a...

  16. Hierarchical mark-recapture models: a framework for inference about demographic processes

    Science.gov (United States)

    Link, W.A.; Barker, R.J.

    2004-01-01

    The development of sophisticated mark-recapture models over the last four decades has provided fundamental tools for the study of wildlife populations, allowing reliable inference about population sizes and demographic rates based on clearly formulated models for the sampling processes. Mark-recapture models are now routinely described by large numbers of parameters. These large models provide the next challenge to wildlife modelers: the extraction of signal from noise in large collections of parameters. Pattern among parameters can be described by strong, deterministic relations (as in ultrastructural models) but is more flexibly and credibly modeled using weaker, stochastic relations. Trend in survival rates is not likely to be manifest by a sequence of values falling precisely on a given parametric curve; rather, if we could somehow know the true values, we might anticipate a regression relation between parameters and explanatory variables, in which true value equals signal plus noise. Hierarchical models provide a useful framework for inference about collections of related parameters. Instead of regarding parameters as fixed but unknown quantities, we regard them as realizations of stochastic processes governed by hyperparameters. Inference about demographic processes is based on investigation of these hyperparameters. We advocate the Bayesian paradigm as a natural, mathematically and scientifically sound basis for inference about hierarchical models. We describe analysis of capture-recapture data from an open population based on hierarchical extensions of the Cormack-Jolly-Seber model. In addition to recaptures of marked animals, we model first captures of animals and losses on capture, and are thus able to estimate survival probabilities w (i.e., the complement of death or permanent emigration) and per capita growth rates f (i.e., the sum of recruitment and immigration rates). Covariation in these rates, a feature of demographic interest, is explicitly

  17. Phylogeny of the Celastraceae inferred from phytochrome B gene sequence and morphology.

    Science.gov (United States)

    Simmons, M P; Clevinger, C C; Savolainen, V; Archer, R H; Mathews, S; Doyle, J J

    2001-02-01

    Phylogenetic relationships within Celastraceae were inferred using a simultaneous analysis of 61 morphological characters and 1123 base pairs of phytochrome B exon 1 from the nuclear genome. No gaps were inferred, and the gene tree topology suggests that the primers were specific to a single locus that did not duplicate among the lineages sampled. This region of phytochrome B was most useful for examining relationships among closely related genera. Fifty-one species from 38 genera of Celastraceae were sampled. The Celastraceae sensu lato (including Hippocrateaceae) were resolved as a monophyletic group. Loesener's subfamilies and tribes of Celastraceae were not supported. The Hippocrateaceae were resolved as a monophyletic group nested within a paraphyletic Celastraceae sensu stricto. Goupia was resolved as more closely related to Euphorbiaceae, Corynocarpaceae, and Linaceae than to Celastraceae. Plagiopteron (Flacourtiaceae) was resolved as the sister group of Hippocrateoideae. Brexia (Brexiaceae) was resolved as closely related to Elaeodendron and Pleurostylia. Canotia was resolved as the sister group of Acanthothamnus within Celastraceae. Perrottetia and Mortonia were resolved as the sister group of the rest of the Celastraceae. Siphonodon was resolved as a derived member of Celastraceae. Maytenus was resolved as three disparate groups, suggesting that this large genus needs to be recircumscribed.

  18. Self-criticism, dependency, and stress reactivity: an experience sampling approach to testing Blatt and Zuroff's (1992) theory of personality predispositions to depression in high-risk youth.

    Science.gov (United States)

    Adams, Philippe; Abela, John R Z; Auerbach, Randy; Skitch, Steven

    2009-11-01

    S. J. Blatt and D. C. Zuroff's 1992 theory of personality predispositions to depression posits that individuals who possess high levels of self-criticism and/or dependency are vulnerable to developing depression following negative events. The current study used experience sampling methodology to test this theory in a sample of 49 children ages 7 to 14. Children completed measures of dependency, self-criticism, and depressive symptoms. Subsequently, children were given a handheld computer that signaled them to complete measures of depressive symptoms and negative events at randomly selected times over 2 months. Results of hierarchical linear modeling analyses indicated that higher levels of both self-criticism and dependency were associated with greater elevations in depressive symptoms following negative events. Furthermore, each personality predisposition remained a significant predictor of such elevations after controlling for the interaction between the other personality predisposition and negative events. The results suggest that dependency and self-criticism represent distinct vulnerability factors to depression in youth.

  19. Inference for the Sharpe Ratio Using a Likelihood-Based Approach

    Directory of Open Access Journals (Sweden)

    Ying Liu

    2012-01-01

    Full Text Available The Sharpe ratio is the prominent risk-adjusted performance measure used by practitioners. Statistical testing of this ratio using its asymptotic distribution has lagged behind its use. In this paper, highly accurate likelihood analysis is applied for inference on the Sharpe ratio. Both the one- and two-sample problems are considered. The methodology has O(n^(−3/2)) distributional accuracy and can be implemented using any parametric return distribution structure. Simulations are provided to demonstrate the method's superior accuracy over existing methods used for testing in the literature.
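
    For contrast with the likelihood-based method described above, the sketch below implements only the standard first-order asymptotic test of a one-sample Sharpe ratio (variance approximately (1 + SR²/2)/n under i.i.d. returns); the returns are simulated toy data.

```python
import numpy as np
from scipy import stats

def sharpe_test(returns, null_value=0.0):
    """One-sample z-test for the Sharpe ratio using the first-order asymptotic
    standard error sqrt((1 + SR^2/2)/n); the paper's likelihood-based method
    achieves higher-order accuracy than this simple approximation."""
    r = np.asarray(returns, dtype=float)
    n = len(r)
    sr = r.mean() / r.std(ddof=1)
    se = np.sqrt((1.0 + 0.5 * sr ** 2) / n)
    z = (sr - null_value) / se
    return sr, 2 * (1 - stats.norm.cdf(abs(z)))

rng = np.random.default_rng(0)
monthly_returns = rng.normal(0.01, 0.04, size=60)       # 5 years of toy monthly returns
sr, p = sharpe_test(monthly_returns)
print(f"Sharpe ratio {sr:.2f}, p-value {p:.3f} for H0: SR = 0")
```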

  20. Causal inference in econometrics

    CERN Document Server

    Kreinovich, Vladik; Sriboonchitta, Songsak

    2016-01-01

    This book is devoted to the analysis of causal inference which is one of the most difficult tasks in data analysis: when two phenomena are observed to be related, it is often difficult to decide whether one of them causally influences the other one, or whether these two phenomena have a common cause. This analysis is the main focus of this volume. To get a good understanding of the causal inference, it is important to have models of economic phenomena which are as accurate as possible. Because of this need, this volume also contains papers that use non-traditional economic models, such as fuzzy models and models obtained by using neural networks and data mining techniques. It also contains papers that apply different econometric models to analyze real-life economic dependencies.

  1. Are great apes able to reason from multi-item samples to populations of food items?

    Science.gov (United States)

    Eckert, Johanna; Rakoczy, Hannes; Call, Josep

    2017-10-01

    Inductive learning from limited observations is a cognitive capacity of fundamental importance. In humans, it is underwritten by our intuitive statistics, the ability to draw systematic inferences from populations to randomly drawn samples and vice versa. According to recent research in cognitive development, human intuitive statistics develops early in infancy. Recent work in comparative psychology has produced first evidence for analogous cognitive capacities in great apes who flexibly drew inferences from populations to samples. In the present study, we investigated whether great apes (Pongo abelii, Pan troglodytes, Pan paniscus, Gorilla gorilla) also draw inductive inferences in the opposite direction, from samples to populations. In two experiments, apes saw an experimenter randomly drawing one multi-item sample from each of two populations of food items. The populations differed in their proportion of preferred to neutral items (24:6 vs. 6:24) but apes saw only the distribution of food items in the samples that reflected the distribution of the respective populations (e.g., 4:1 vs. 1:4). Based on this observation they were then allowed to choose between the two populations. Results show that apes seemed to make inferences from samples to populations and thus chose the population from which the more favorable (4:1) sample was drawn in Experiment 1. In this experiment, the more attractive sample not only contained proportionally but also absolutely more preferred food items than the less attractive sample. Experiment 2, however, revealed that when absolute and relative frequencies were disentangled, apes performed at chance level. Whether these limitations in apes' performance reflect true limits of cognitive competence or merely performance limitations due to accessory task demands is still an open question. © 2017 Wiley Periodicals, Inc.

  2. Assessment of network inference methods: how to cope with an underdetermined problem.

    Directory of Open Access Journals (Sweden)

    Caroline Siegenthaler

    Full Text Available The inference of biological networks is an active research area in the field of systems biology. The number of network inference algorithms has grown tremendously in the last decade, underlining the importance of a fair assessment and comparison among these methods. Current assessments of the performance of an inference method typically involve the application of the algorithm to benchmark datasets and the comparison of the network predictions against the gold standard or reference networks. While the network inference problem is often deemed underdetermined, implying that the inference problem does not have a (unique) solution, the consequences of such an attribute have not been rigorously taken into consideration. Here, we propose a new procedure for assessing the performance of gene regulatory network (GRN) inference methods. The procedure takes into account the underdetermined nature of the inference problem, in which gene regulatory interactions that are inferable or non-inferable are determined based on causal inference. The assessment relies on a new definition of the confusion matrix, which excludes errors associated with non-inferable gene regulations. For demonstration purposes, the proposed assessment procedure is applied to the DREAM 4 In Silico Network Challenge. The results show a marked change in the ranking of participating methods when taking network inferability into account.
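
    The paper's criterion for deciding which interactions are inferable rests on causal inference and is not reproduced here; assuming such an inferability mask is already available, the sketch below illustrates the core bookkeeping idea of scoring predictions only over inferable gene pairs. All names and the random test data are illustrative.

```python
import numpy as np

def restricted_confusion(pred, gold, inferable):
    """Confusion-matrix counts restricted to inferable entries.
    pred, gold, inferable: boolean adjacency matrices of the same shape."""
    mask = inferable & ~np.eye(gold.shape[0], dtype=bool)   # ignore self-loops
    tp = int(np.sum(pred & gold & mask))
    fp = int(np.sum(pred & ~gold & mask))
    fn = int(np.sum(~pred & gold & mask))
    tn = int(np.sum(~pred & ~gold & mask))
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn,
            "precision": precision, "recall": recall}

rng = np.random.default_rng(0)
gold = rng.random((10, 10)) < 0.2          # "true" regulatory interactions
pred = rng.random((10, 10)) < 0.2          # predicted interactions
inferable = rng.random((10, 10)) < 0.7     # mask of inferable gene pairs
print(restricted_confusion(pred, gold, inferable))
```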

  3. Critical current measurements of high-temperature superconducting short samples at a wide range of temperatures and magnetic fields

    Science.gov (United States)

    Ma, Hongjun; Liu, Huajun; Liu, Fang; Zhang, Huahui; Ci, Lu; Shi, Yi; Lei, Lei

    2018-01-01

    High-Temperature Superconductors (HTS) are potential materials for high-field magnets, low-loss transmission cables, and Superconducting Magnetic Energy Storage (SMES) due to their high upper critical magnetic field (Hc2) and critical temperature (Tc). The critical current (Ic) of HTS, which is one of the most important parameters for superconductor application, depends strongly on the magnetic fields and temperatures. A new Ic measurement system that can carry out accurate Ic measurement for HTS short samples with various temperatures (4.2-80 K), magnetic fields (0-14 T), and angles of the magnetic field (0°-90°) has been developed. The Ic measurement system mainly consists of a measurement holder, temperature-control system, background magnet, test cryostat, data acquisition system, and DC power supply. The accuracy of temperature control is better than ±0.1 K over the 20-80 K range and ±0.05 K when measured below 20 K. The maximum current is over 1000 A with a measurement uncertainty of 1%. The system has been successfully used for Ic determination of YBa2Cu3O7-x (YBCO) tapes at different temperatures and magnetic fields.

  5. Critical current density and upper critical field of the PbMo6S8 Chevrel phase

    International Nuclear Information System (INIS)

    Seeber, B.; Decroux, M.; Fischer, O.

    1988-01-01

    A detailed discussion of critical current density and upper critical field for PbMo6S8 (PMS) is given. It is shown that PMS bulk as well as wire samples can be prepared with sufficient quality to observe the scaling law for the volume pinning force. Using the scaling law an estimation for the critical current density as a function of field and temperature was made. The study also indicates that a substantial improvement of the critical current density can be expected by optimizing the upper critical field without changing the microstructure. It is shown that the availability of high quality samples of EuMo6S8, to which PMS is similar, makes it possible to study separately the different physical parameters which determine the upper critical field in PMS.

  6. Probability and Statistical Inference

    OpenAIRE

    Prosper, Harrison B.

    2006-01-01

    These lectures introduce key concepts in probability and statistical inference at a level suitable for graduate students in particle physics. Our goal is to paint as vivid a picture as possible of the concepts covered.

  7. Fuzzy logic controller using different inference methods

    International Nuclear Information System (INIS)

    Liu, Z.; De Keyser, R.

    1994-01-01

    This paper introduces the design of fuzzy controllers using different inference methods. The configuration of the fuzzy controllers includes a general rule base, which is a collection of fuzzy PI or PD rules, a triangular fuzzy data model and a centre-of-gravity defuzzification algorithm. The generalized modus ponens (GMP) is used with the minimum operator of the triangular norm. Under the sup-min inference rule, six fuzzy implication operators are employed to calculate the fuzzy look-up tables for each rule base. The performance is tested in simulated systems with MATLAB/SIMULINK. Results show the effects of using fuzzy controllers with different inference methods applied to different test processes.
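
    A minimal sketch of the common building blocks named above (triangular membership functions, sup-min inference and centre-of-gravity defuzzification) is shown below for a single-input, single-output controller with two invented rules; it illustrates the mechanics only, not the rule bases or implication operators studied in the paper.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def fuzzy_controller_output(error, universe):
    """Sup-min (Mamdani) inference over two toy rules, followed by
    centre-of-gravity defuzzification on the output universe."""
    # Firing strengths of the two rules for the crisp input 'error'
    w_neg = tri(error, -2.0, -1.0, 0.0)   # IF error is negative THEN output is negative
    w_pos = tri(error,  0.0,  1.0, 2.0)   # IF error is positive THEN output is positive
    # Min-implication clips each output set; max aggregates the rules
    out_neg = np.minimum(w_neg, tri(universe, -2.0, -1.0, 0.0))
    out_pos = np.minimum(w_pos, tri(universe,  0.0,  1.0, 2.0))
    aggregated = np.maximum(out_neg, out_pos)
    # Centre of gravity (discrete approximation on a uniform grid)
    weight = float(np.sum(aggregated))
    return float(np.sum(aggregated * universe)) / weight if weight > 0 else 0.0

u = np.linspace(-2.0, 2.0, 401)
print(fuzzy_controller_output(0.4, u))    # positive control action for a positive error
```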

  8. An algebra-based method for inferring gene regulatory networks.

    Science.gov (United States)

    Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

    2014-03-26

    The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the
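
    The authors' evolutionary inference algorithm is not reproduced here, but the class of models it searches over can be illustrated: the sketch below writes a toy three-gene Boolean network as polynomial update rules over GF(2) (XOR as addition, AND as multiplication) and simulates the time series such a model generates. The particular rules are invented for illustration.

```python
import itertools

def step(state):
    """One synchronous update of a toy 3-gene Boolean polynomial system over GF(2):
    addition is XOR, multiplication is AND, and all arithmetic is mod 2."""
    x1, x2, x3 = state
    f1 = (x2 * x3) % 2        # x1' = x2 AND x3
    f2 = (x1 + x3) % 2        # x2' = x1 XOR x3
    f3 = (1 + x1) % 2         # x3' = NOT x1
    return (f1, f2, f3)

def trajectory(state, steps):
    """Time series generated by iterating the update map."""
    out = [state]
    for _ in range(steps):
        state = step(state)
        out.append(state)
    return out

# Enumerate the dynamics from every initial condition (2^3 states for 3 genes)
for init in itertools.product((0, 1), repeat=3):
    print(init, "->", trajectory(init, 4))
```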

  9. Statistical inference based on divergence measures

    CERN Document Server

    Pardo, Leandro

    2005-01-01

    The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. However, in spite of the fact that divergence statistics have become a very good alternative to the classical likelihood ratio test and the Pearson-type statistic in discrete models, many statisticians remain unaware of this powerful approach. Statistical Inference Based on Divergence Measures explores classical problems of statistical inference, such as estimation and hypothesis testing, on the basis of measures of entropy and divergence. The first two chapters form an overview, from a statistical perspective, of the most important measures of entropy and divergence and study their properties. The author then examines the statistical analysis of discrete multivariate data with emphasis on problems in contingency tables and loglinear models using phi-divergence test statistics as well as minimum phi-divergence estimators. The final chapter looks at testing in general populations, prese...

  10. Two‐phase designs for joint quantitative‐trait‐dependent and genotype‐dependent sampling in post‐GWAS regional sequencing

    Science.gov (United States)

    Espin‐Garcia, Osvaldo; Craiu, Radu V.

    2017-01-01

    We evaluate two-phase designs to follow up findings from a genome-wide association study (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation-maximization-based inference under a semiparametric maximum likelihood formulation tailored for post-GWAS inference. A GWAS-SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT-SNP-dependent sampling and analysis under alternative sample allocations by simulations. Joint allocation balanced on SNP genotype and extreme-QT strata yields significant power improvements compared to marginal QT- or SNP-based allocations. We illustrate the proposed method and evaluate the sensitivity of sample allocation to sampling variation using data from a sequencing study of systolic blood pressure. PMID:29239496

  11. Active inference, sensory attenuation and illusions.

    Science.gov (United States)

    Brown, Harriet; Adams, Rick A; Parees, Isabel; Edwards, Mark; Friston, Karl

    2013-11-01

    Active inference provides a simple and neurobiologically plausible account of how action and perception are coupled in producing (Bayes) optimal behaviour. This can be seen most easily as minimising prediction error: we can either change our predictions to explain sensory input through perception, or we can actively change sensory input to fulfil our predictions. In active inference, this action is mediated by classical reflex arcs that minimise proprioceptive prediction error created by descending proprioceptive predictions. However, this creates a conflict between action and perception, in that self-generated movements require predictions to override the sensory evidence that one is not actually moving. However, ignoring sensory evidence means that externally generated sensations will not be perceived. Conversely, attending to (proprioceptive and somatosensory) sensations enables the detection of externally generated events but precludes generation of actions. This conflict can be resolved by attenuating the precision of sensory evidence during movement or, equivalently, attending away from the consequences of self-made acts. We propose that this Bayes optimal withdrawal of precise sensory evidence during movement is the cause of psychophysical sensory attenuation. Furthermore, it explains the force-matching illusion and reproduces empirical results almost exactly. Finally, if attenuation is removed, the force-matching illusion disappears and false (delusional) inferences about agency emerge. This is important, given the negative correlation between sensory attenuation and delusional beliefs in normal subjects, and the reduction in the magnitude of the illusion in schizophrenia. Active inference therefore links the neuromodulatory optimisation of precision to sensory attenuation and illusory phenomena during the attribution of agency in normal subjects. It also provides a functional account of deficits in syndromes characterised by false inference

  12. Reinforcement and inference in cross-situational word learning.

    Science.gov (United States)

    Tilles, Paulo F C; Fontanari, José F

    2013-01-01

    Cross-situational word learning is based on the notion that a learner can determine the referent of a word by finding something in common across many observed uses of that word. Here we propose an adaptive learning algorithm that contains a parameter that controls the strength of the reinforcement applied to associations between concurrent words and referents, and a parameter that regulates inference, which includes built-in biases, such as mutual exclusivity, and information of past learning events. By adjusting these parameters so that the model predictions agree with data from representative experiments on cross-situational word learning, we were able to explain the learning strategies adopted by the participants of those experiments in terms of a trade-off between reinforcement and inference. These strategies can vary wildly depending on the conditions of the experiments. For instance, for fast mapping experiments (i.e., the correct referent could, in principle, be inferred in a single observation) inference is prevalent, whereas for segregated contextual diversity experiments (i.e., the referents are separated in groups and are exhibited with members of their groups only) reinforcement is predominant. Other experiments are explained with more balanced doses of reinforcement and inference.
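
    The exact update rules of the adaptive algorithm are not reproduced here; the sketch below only illustrates, in heavily simplified form, the two ingredients the abstract contrasts: a reinforcement parameter that strengthens word-referent associations that co-occur, and an inference parameter implementing a crude mutual-exclusivity bias. All names, parameter values and the toy learning events are assumptions made for illustration.

```python
import numpy as np

def cross_situational_learner(events, n_words, n_referents,
                              reinforcement=0.5, inference=0.3):
    """Toy cross-situational learner. Each event is (words, referents) observed
    together. Co-occurring pairs are reinforced; an inference step (a crude
    mutual-exclusivity bias) shifts weight away from referents that another
    word already 'owns' much more strongly."""
    assoc = np.full((n_words, n_referents), 1.0 / n_referents)
    for words, referents in events:
        for w in words:
            for r in referents:
                assoc[w, r] += reinforcement          # reinforce co-occurrence
            for r in referents:                       # mutual-exclusivity-style inference
                best_owner = int(np.argmax(assoc[:, r]))
                if best_owner != w and assoc[best_owner, r] > 2.0 * assoc[w, r]:
                    assoc[w, r] -= inference * assoc[w, r]
        assoc = np.clip(assoc, 1e-6, None)
    return assoc / assoc.sum(axis=1, keepdims=True)   # normalised word -> referent map

# Three ambiguous learning events over 3 words and 3 referents (word i names referent i)
events = [((0, 1), (0, 1)), ((1, 2), (1, 2)), ((0, 2), (0, 2))]
print(np.round(cross_situational_learner(events, 3, 3), 2))
```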

  13. Data-driven inference for the spatial scan statistic.

    Science.gov (United States)

    Almeida, Alexandre C L; Duarte, Anderson R; Duczmal, Luiz H; Oliveira, Fernando L P; Takahashi, Ricardo H C

    2011-08-02

    Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is proposed, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under the null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based on this inference. A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
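
    Neither the scan statistic itself nor the cluster search is reimplemented here; assuming those quantities have already been computed for the observed map and for a set of null (Monte Carlo) replications, the sketch below shows the size-conditioned comparison described above: the observed most likely cluster is compared only against null replications whose most likely cluster has the same size k. Function and argument names are illustrative.

```python
import numpy as np

def conditional_p_value(obs_stat, obs_size, null_stats, null_sizes):
    """Size-conditioned Monte Carlo p-value for the most likely cluster.
    obs_stat, obs_size    : statistic and size k of the observed most likely cluster
    null_stats, null_sizes: the same quantities from each null replication."""
    null_stats = np.asarray(null_stats, dtype=float)
    null_sizes = np.asarray(null_sizes)
    same_k = null_sizes == obs_size                    # keep only clusters of size k
    if not np.any(same_k):
        return float("nan")                            # no comparable replications
    exceed = int(np.sum(null_stats[same_k] >= obs_stat))
    return (exceed + 1.0) / (int(np.sum(same_k)) + 1.0)   # standard +1 correction
```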

  14. Conflation of Short Identity-by-Descent Segments Bias Their Inferred Length Distribution

    Directory of Open Access Journals (Sweden)

    Charleston W. K. Chiang

    2016-05-01

    Full Text Available Identity-by-descent (IBD) is a fundamental concept in genetics with many applications. In a common definition, two haplotypes are said to share an IBD segment if that segment is inherited from a recent shared common ancestor without intervening recombination. Segments several cM long can be efficiently detected by a number of algorithms using high-density SNP array data from a population sample, and there are currently efforts to detect shorter segments from sequencing. Here, we study a problem of identifiability: because existing approaches detect IBD based on contiguous segments of identity-by-state, inferred long segments of IBD may arise from the conflation of smaller, nearby IBD segments. We quantified this effect using coalescent simulations, finding that significant proportions of inferred segments 1–2 cM long are results of conflations of two or more shorter segments, each 0.2 cM or longer, under demographic scenarios typical for modern humans for all programs tested. The impact of such conflation is much smaller for longer (> 2 cM) segments. This biases the inferred IBD segment length distribution, and so can affect downstream inferences that depend on the assumption that each segment of IBD derives from a single common ancestor. As an example, we present and analyze an estimator of the de novo mutation rate using IBD segments, and demonstrate that unmodeled conflation leads to underestimates of the ages of the common ancestors on these segments, and hence a significant overestimate of the mutation rate. Understanding the conflation effect in detail will make its correction in future methods more tractable.

  15. Vertically Integrated Seismological Analysis II : Inference

    Science.gov (United States)

    Arora, N. S.; Russell, S.; Sudderth, E.

    2009-12-01

    Methods for automatically associating detected waveform features with hypothesized seismic events, and localizing those events, are a critical component of efforts to verify the Comprehensive Test Ban Treaty (CTBT). As outlined in our companion abstract, we have developed a hierarchical model which views detection, association, and localization as an integrated probabilistic inference problem. In this abstract, we provide more details on the Markov chain Monte Carlo (MCMC) methods used to solve this inference task. MCMC generates samples from a posterior distribution π(x) over possible worlds x by defining a Markov chain whose states are the worlds x, and whose stationary distribution is π(x). In the Metropolis-Hastings (M-H) method, transitions in the Markov chain are constructed in two steps. First, given the current state x, a candidate next state x′ is generated from a proposal distribution q(x′ | x), which may be (more or less) arbitrary. Second, the transition to x′ is not automatic, but occurs with an acceptance probability α(x′ | x) = min(1, π(x′)q(x | x′) / (π(x)q(x′ | x))). The seismic event model outlined in our companion abstract is quite similar to those used in multitarget tracking, for which MCMC has proved very effective. In this model, each world x is defined by a collection of events, a list of properties characterizing those events (times, locations, magnitudes, and types), and the association of each event to a set of observed detections. The target distribution is π(x) = P(x | y), the posterior distribution over worlds x given the observed waveform data y at all stations. Proposal distributions then implement several types of moves between worlds. For example, birth moves create new events; death moves delete existing events; split moves partition the detections for an event into two new events; merge moves combine event pairs; swap moves modify the properties and associations for pairs of events. Importantly, the rules for
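
    The structured birth/death/split/merge moves over variable-dimension worlds are specific to the seismic model and are not reproduced here; the sketch below only illustrates the generic Metropolis-Hastings recipe quoted above (propose from q, accept with probability α) on a one-dimensional toy target. All names and the Gaussian random-walk proposal are illustrative.

```python
import math
import random

def metropolis_hastings(log_pi, propose, log_q, x0, n_steps):
    """Generic Metropolis-Hastings sampler.
    log_pi(x)            : unnormalised log target density
    propose(x)           : draw x' ~ q(x' | x)
    log_q(x_to, x_from)  : log proposal density q(x_to | x_from)"""
    x = x0
    samples = []
    for _ in range(n_steps):
        x_new = propose(x)
        log_alpha = (log_pi(x_new) + log_q(x, x_new)) - (log_pi(x) + log_q(x_new, x))
        if random.random() < math.exp(min(0.0, log_alpha)):
            x = x_new                      # accept; otherwise keep the current state
        samples.append(x)
    return samples

# Toy example: standard normal target, Gaussian random-walk proposal (symmetric,
# so the q terms cancel, but they are kept for generality).
sigma = 0.8
draws = metropolis_hastings(
    log_pi=lambda x: -0.5 * x * x,
    propose=lambda x: random.gauss(x, sigma),
    log_q=lambda x_to, x_from: -0.5 * ((x_to - x_from) / sigma) ** 2,
    x0=0.0,
    n_steps=5000,
)
print(sum(draws) / len(draws))             # should be close to 0
```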

  16. Robust inference from multiple test statistics via permutations: a better alternative to the single test statistic approach for randomized trials.

    Science.gov (United States)

    Ganju, Jitendra; Yu, Xinxin; Ma, Guoguang Julie

    2013-01-01

    Formal inference in randomized clinical trials is based on controlling the type I error rate associated with a single pre-specified statistic. The deficiency of using just one method of analysis is that it depends on assumptions that may not be met. For robust inference, we propose pre-specifying multiple test statistics and relying on the minimum p-value for testing the null hypothesis of no treatment effect. The null hypothesis associated with the various test statistics is that the treatment groups are indistinguishable. The critical value for hypothesis testing comes from permutation distributions. Rejection of the null hypothesis when the smallest p-value is less than the critical value controls the type I error rate at its designated value. Even if one of the candidate test statistics has low power, the adverse effect on the power of the minimum p-value statistic is small. Its use is illustrated with examples. We conclude that it is better to rely on the minimum p-value rather than a single statistic particularly when that single statistic is the logrank test, because of the cost and complexity of many survival trials. Copyright © 2013 John Wiley & Sons, Ltd.
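
    The trial setting and the specific candidate statistics discussed in the paper (e.g. the logrank test) are not reproduced here; the sketch below illustrates the general minimum p-value permutation recipe for a two-group comparison with two arbitrary candidate statistics, where the reference distribution of the minimum p-value is itself taken from the permutations. Statistic choices, sample data and parameter names are assumptions.

```python
import numpy as np

def min_p_permutation_test(x, y, n_perm=1000, seed=0):
    """Permutation test based on the minimum of several candidate statistics.
    Candidates here: |difference in means| and |difference in medians|."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    pooled, n_x = np.concatenate([x, y]), len(x)

    def stat_vector(a, b):
        return np.array([abs(a.mean() - b.mean()),
                         abs(np.median(a) - np.median(b))])

    obs = stat_vector(x, y)
    perm = np.empty((n_perm, obs.size))
    for i in range(n_perm):                     # relabel groups at random
        idx = rng.permutation(pooled.size)
        perm[i] = stat_vector(pooled[idx[:n_x]], pooled[idx[n_x:]])

    # Per-statistic permutation p-values, combined via the minimum
    p_each = (1 + (perm >= obs).sum(axis=0)) / (n_perm + 1)
    p_min_obs = p_each.min()
    # Null distribution of the minimum p-value (score each permutation against all)
    p_perm = (1 + (perm[None, :, :] >= perm[:, None, :]).sum(axis=1)) / (n_perm + 1)
    p_min_null = p_perm.min(axis=1)
    return (1 + (p_min_null <= p_min_obs).sum()) / (n_perm + 1)

rng0 = np.random.default_rng(42)
grp_a = rng0.normal(0.0, 1.0, 40)
grp_b = rng0.normal(0.5, 1.0, 40)
print(min_p_permutation_test(grp_a, grp_b))
```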

  17. Type Ia Supernova Light Curve Inference: Hierarchical Models for Nearby SN Ia in the Optical and Near Infrared

    Science.gov (United States)

    Mandel, Kaisey; Kirshner, R. P.; Narayan, G.; Wood-Vasey, W. M.; Friedman, A. S.; Hicken, M.

    2010-01-01

    I have constructed a comprehensive statistical model for Type Ia supernova light curves spanning optical through near infrared data simultaneously. The near infrared light curves are found to be excellent standard candles (sigma(MH) = 0.11 +/- 0.03 mag) that are less vulnerable to systematic error from dust extinction, a major confounding factor for cosmological studies. A hierarchical statistical framework incorporates coherently multiple sources of randomness and uncertainty, including photometric error, intrinsic supernova light curve variations and correlations, dust extinction and reddening, peculiar velocity dispersion and distances, for probabilistic inference with Type Ia SN light curves. Inferences are drawn from the full probability density over individual supernovae and the SN Ia and dust populations, conditioned on a dataset of SN Ia light curves and redshifts. To compute probabilistic inferences with hierarchical models, I have developed BayeSN, a Markov Chain Monte Carlo algorithm based on Gibbs sampling. This code explores and samples the global probability density of parameters describing individual supernovae and the population. I have applied this hierarchical model to optical and near infrared data of over 100 nearby Type Ia SN from PAIRITEL, the CfA3 sample, and the literature. Using this statistical model, I find that SN with optical and NIR data have a smaller residual scatter in the Hubble diagram than SN with only optical data. The continued study of Type Ia SN in the near infrared will be important for improving their utility as precise and accurate cosmological distance indicators.

  18. Analysis Critical Thinking Stage of Eighth Grade in PBL-Scaffolding Setting To Solve Mathematical Problems

    Directory of Open Access Journals (Sweden)

    Nur Aisyah Isti

    2017-03-01

    Full Text Available The purpose of this research was to describe the critical thinking stages of grade VIII students in a PBL-with-scaffolding setting when solving mathematics problems. The critical thinking stages consist of clarification, assessment, inference, and strategy/tactics. The subjects were two students at each level of critical thinking capacity (uncritical, less critical, quite critical, and critical), so the research subjects were 8 students of class VIII A of One State Junior High School of Temanggung. The results provide a description of (1) the critical thinking stages of students in the PBL setting: in clarification, the higher a student's level of critical thinking capacity, the more fully the student could identify information from the question, the more detailed the identification of the problem, and the better the student could explore the relationships among the information; (2) the scaffolding strategy given by critical thinking stage and TKBK level: in assessment, the scaffolding given was hints/keys delivered classically; and (3) the change in students' critical thinking stages after scaffolding was given, attributed to habituation to the PBL and scaffolding setting.

  19. Eight challenges in phylodynamic inference

    Directory of Open Access Journals (Sweden)

    Simon D.W. Frost

    2015-03-01

    Full Text Available The field of phylodynamics, which attempts to enhance our understanding of infectious disease dynamics using pathogen phylogenies, has made great strides in the past decade. Basic epidemiological and evolutionary models are now well characterized with inferential frameworks in place. However, significant challenges remain in extending phylodynamic inference to more complex systems. These challenges include accounting for evolutionary complexities such as changing mutation rates, selection, reassortment, and recombination, as well as epidemiological complexities such as stochastic population dynamics, host population structure, and different patterns at the within-host and between-host scales. An additional challenge exists in making efficient inferences from an ever increasing corpus of sequence data.

  20. Human Inferences about Sequences: A Minimal Transition Probability Model.

    Directory of Open Access Journals (Sweden)

    Florent Meyniel

    2016-12-01

    Full Text Available The brain constantly infers the causes of the inputs it receives and uses these inferences to generate statistical expectations about future observations. Experimental evidence for these expectations and their violations include explicit reports, sequential effects on reaction times, and mismatch or surprise signals recorded in electrophysiology and functional MRI. Here, we explore the hypothesis that the brain acts as a near-optimal inference device that constantly attempts to infer the time-varying matrix of transition probabilities between the stimuli it receives, even when those stimuli are in fact fully unpredictable. This parsimonious Bayesian model, with a single free parameter, accounts for a broad range of findings on surprise signals, sequential effects and the perception of randomness. Notably, it explains the pervasive asymmetry between repetitions and alternations encountered in those studies. Our analysis suggests that a neural machinery for inferring transition probabilities lies at the core of human sequence knowledge.
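
    The published model's exact parameterisation is not reproduced here; the sketch below illustrates one simple way to realise the idea described above, estimating a time-varying transition-probability matrix for a binary stimulus stream by combining Dirichlet-style pseudo-counts with exponential forgetting, whose single forgetting factor plays the role of the model's free parameter. Names and default values are assumptions.

```python
import numpy as np

def leaky_transition_estimates(seq, leak=0.95, prior=1.0):
    """Running estimates of P(next | current) for a binary sequence.
    leak  : forgetting factor in (0, 1]; smaller means faster forgetting
    prior : symmetric pseudo-count (Dirichlet-style prior)."""
    counts = np.zeros((2, 2))                 # counts[i, j]: transitions i -> j
    estimates = []
    for prev, nxt in zip(seq[:-1], seq[1:]):
        counts *= leak                        # exponentially forget old evidence
        counts[prev, nxt] += 1.0
        post = counts + prior
        estimates.append(post / post.sum(axis=1, keepdims=True))
    return estimates

rng = np.random.default_rng(1)
sequence = rng.integers(0, 2, size=200)       # fully unpredictable stimuli
print(leaky_transition_estimates(sequence)[-1])   # latest transition-probability estimate
```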

  1. A generative inference framework for analysing patterns of cultural change in sparse population data with evidence for fashion trends in LBK culture.

    Science.gov (United States)

    Kandler, Anne; Shennan, Stephen

    2015-12-06

    Cultural change can be quantified by temporal changes in frequency of different cultural artefacts and it is a central question to identify what underlying cultural transmission processes could have caused the observed frequency changes. Observed changes, however, often describe the dynamics in samples of the population of artefacts, whereas transmission processes act on the whole population. Here we develop a modelling framework aimed at addressing this inference problem. To do so, we firstly generate population structures from which the observed sample could have been drawn randomly and then determine theoretical samples at a later time t2 produced under the assumption that changes in frequencies are caused by a specific transmission process. Thereby we also account for the potential effect of time-averaging processes in the generation of the observed sample. Subsequent statistical comparisons (e.g. using Bayesian inference) of the theoretical and observed samples at t2 can establish which processes could have produced the observed frequency data. In this way, we infer underlying transmission processes directly from available data without any equilibrium assumption. We apply this framework to a dataset describing pottery from settlements of some of the first farmers in Europe (the LBK culture) and conclude that the observed frequency dynamic of different types of decorated pottery is consistent with age-dependent selection, a preference for 'young' pottery types which is potentially indicative of fashion trends. © 2015 The Author(s).

  2. Distinguishing between statistical significance and practical/clinical meaningfulness using statistical inference.

    Science.gov (United States)

    Wilkinson, Michael

    2014-03-01

    Decisions about support for predictions of theories in light of data are made using statistical inference. The dominant approach in sport and exercise science is the Neyman-Pearson (N-P) significance-testing approach. When applied correctly it provides a reliable procedure for making dichotomous decisions for accepting or rejecting zero-effect null hypotheses with known and controlled long-run error rates. Type I and type II error rates must be specified in advance and the latter controlled by conducting an a priori sample size calculation. The N-P approach does not provide the probability of hypotheses or indicate the strength of support for hypotheses in light of data, yet many scientists believe it does. Outcomes of analyses allow conclusions only about the existence of non-zero effects, and provide no information about the likely size of true effects or their practical/clinical value. Bayesian inference can show how much support data provide for different hypotheses, and how personal convictions should be altered in light of data, but the approach is complicated by formulating probability distributions about prior subjective estimates of population effects. A pragmatic solution is magnitude-based inference, which allows scientists to estimate the true magnitude of population effects and how likely they are to exceed an effect magnitude of practical/clinical importance, thereby integrating elements of subjective Bayesian-style thinking. While this approach is gaining acceptance, progress might be hastened if scientists appreciate the shortcomings of traditional N-P null hypothesis significance testing.
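
    As a purely illustrative companion to the contrast drawn above, the sketch below shows the kind of calculation used in magnitude-based inference: given a point estimate and its standard error, approximate the chances that the true effect is beneficial, trivial or harmful relative to a smallest worthwhile change, treating the sampling distribution as normal. The threshold of 0.2 and the example numbers are assumptions, not values from the paper.

```python
from scipy import stats

def magnitude_based_chances(effect, se, smallest_worthwhile=0.2):
    """Approximate chances that the true effect exceeds +SWC (beneficial),
    lies within +/-SWC (trivial), or falls below -SWC (harmful)."""
    beneficial = 1.0 - stats.norm.cdf(smallest_worthwhile, loc=effect, scale=se)
    harmful = stats.norm.cdf(-smallest_worthwhile, loc=effect, scale=se)
    trivial = 1.0 - beneficial - harmful
    return {"beneficial": beneficial, "trivial": trivial, "harmful": harmful}

print(magnitude_based_chances(effect=0.35, se=0.15))
```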

  3. Integration of steady-state and temporal gene expression data for the inference of gene regulatory networks.

    Science.gov (United States)

    Wang, Yi Kan; Hurley, Daniel G; Schnell, Santiago; Print, Cristin G; Crampin, Edmund J

    2013-01-01

    We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data.

  4. Making Type Inference Practical

    DEFF Research Database (Denmark)

    Schwartzbach, Michael Ignatieff; Oxhøj, Nicholas; Palsberg, Jens

    1992-01-01

    We present the implementation of a type inference algorithm for untyped object-oriented programs with inheritance, assignments, and late binding. The algorithm significantly improves our previous one, presented at OOPSLA'91, since it can handle collection classes, such as List, in a useful way. Also, the complexity has been dramatically improved, from exponential time to low polynomial time. The implementation uses the techniques of incremental graph construction and constraint template instantiation to avoid representing intermediate results, doing superfluous work, and recomputing type information. Experiments indicate that the implementation type checks as much as 100 lines per second. This results in a mature product, on which a number of tools can be based, for example a safety tool, an image compression tool, a code optimization tool, and an annotation tool. This may make type inference for object-oriented languages practical.

  5. Examples in parametric inference with R

    CERN Document Server

    Dixit, Ulhas Jayram

    2016-01-01

    This book discusses examples in parametric inference with R. Combining basic theory with modern approaches, it presents the latest developments and trends in statistical inference for students who do not have an advanced mathematical and statistical background. The topics discussed in the book are fundamental and common to many fields of statistical inference and thus serve as a point of departure for in-depth study. The book is divided into eight chapters: Chapter 1 provides an overview of topics on sufficiency and completeness, while Chapter 2 briefly discusses unbiased estimation. Chapter 3 focuses on the study of moments and maximum likelihood estimators, and Chapter 4 presents bounds for the variance. In Chapter 5, topics on consistent estimator are discussed. Chapter 6 discusses Bayes, while Chapter 7 studies some more powerful tests. Lastly, Chapter 8 examines unbiased and other tests. Senior undergraduate and graduate students in statistics and mathematics, and those who have taken an introductory cou...

  6. Causal Effect Inference with Deep Latent-Variable Models

    NARCIS (Netherlands)

    Louizos, C; Shalit, U.; Mooij, J.; Sontag, D.; Zemel, R.; Welling, M.

    2017-01-01

    Learning individual-level causal effects from observational data, such as inferring the most effective medication for a specific patient, is a problem of growing importance for policy makers. The most important aspect of inferring causal effects from observational data is the handling of

  7. Causal inference in survival analysis using pseudo-observations

    DEFF Research Database (Denmark)

    Andersen, Per K; Syriopoulou, Elisavet; Parner, Erik T

    2017-01-01

    Causal inference for non-censored response variables, such as binary or quantitative outcomes, is often based on either (1) direct standardization ('G-formula') or (2) inverse probability of treatment assignment weights ('propensity score'). To do causal inference in survival analysis, one needs ...

  8. How accurately can other people infer your thoughts-And does culture matter?

    Directory of Open Access Journals (Sweden)

    Constantinos Valanides

    Full Text Available This research investigated how accurately people infer what others are thinking after observing a brief sample of their behaviour and whether culture/similarity is a relevant factor. Target participants (14 British and 14 Mediterraneans) were cued to think about either positive or negative events they had experienced. Subsequently, perceiver participants (16 British and 16 Mediterraneans) watched videos of the targets thinking about these things. Perceivers (both groups) were significantly accurate in judging when targets had been cued to think of something positive versus something negative, indicating notable inferential ability. Additionally, Mediterranean perceivers were better than British perceivers in making such inferences, irrespective of nationality of the targets, something that was statistically accounted for by corresponding group differences in levels of independently measured collectivism. The results point to the need for further research to investigate the possibility that being reared in a collectivist culture fosters ability in interpreting others' behaviour.

  9. Statistical Inference at Work: Statistical Process Control as an Example

    Science.gov (United States)

    Bakker, Arthur; Kent, Phillip; Derry, Jan; Noss, Richard; Hoyles, Celia

    2008-01-01

    To characterise statistical inference in the workplace this paper compares a prototypical type of statistical inference at work, statistical process control (SPC), with a type of statistical inference that is better known in educational settings, hypothesis testing. Although there are some similarities between the reasoning structure involved in…
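
    As a small illustration of the workplace-inference side of that comparison, the sketch below computes a basic Shewhart X-bar chart: a centre line and three-sigma control limits estimated from subgroup measurements, with subgroups falling outside the limits flagged. The within-subgroup sigma estimate is a simple average of subgroup standard deviations rather than a bias-corrected one, and all data are synthetic.

```python
import numpy as np

def xbar_chart(subgroups):
    """Shewhart X-bar chart from a 2-D array (subgroup x observation).
    Limits are the grand mean +/- 3 standard errors of the subgroup mean."""
    data = np.asarray(subgroups, dtype=float)
    means = data.mean(axis=1)
    grand_mean = means.mean()
    sigma_within = data.std(axis=1, ddof=1).mean()    # simple pooled estimate
    se = sigma_within / np.sqrt(data.shape[1])
    ucl, lcl = grand_mean + 3 * se, grand_mean - 3 * se
    out_of_control = np.where((means > ucl) | (means < lcl))[0]
    return grand_mean, lcl, ucl, out_of_control

rng = np.random.default_rng(2)
samples = rng.normal(10.0, 0.5, size=(25, 5))   # 25 subgroups of 5 measurements
samples[20] += 1.5                              # simulate a process shift
print(xbar_chart(samples))
```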

  10. On quantum statistical inference

    NARCIS (Netherlands)

    Barndorff-Nielsen, O.E.; Gill, R.D.; Jupp, P.E.

    2003-01-01

    Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, developments in the theory of quantum measurements have

  11. When is an image a health claim? A false-recollection method to detect implicit inferences about products' health benefits.

    Science.gov (United States)

    Klepacz, Naomi A; Nash, Robert A; Egan, M Bernadette; Hodgkins, Charo E; Raats, Monique M

    2016-08-01

    Images on food and dietary supplement packaging might lead people to infer (appropriately or inappropriately) certain health benefits of those products. Research on this issue largely involves direct questions, which could (a) elicit inferences that would not be made unprompted, and (b) fail to capture inferences made implicitly. Using a novel memory-based method, in the present research, we explored whether packaging imagery elicits health inferences without prompting, and the extent to which these inferences are made implicitly. In 3 experiments, participants saw fictional product packages accompanied by written claims. Some packages contained an image that implied a health-related function (e.g., a brain), and some contained no image. Participants studied these packages and claims, and subsequently their memory for seen and unseen claims were tested. When a health image was featured on a package, participants often subsequently recognized health claims that-despite being implied by the image-were not truly presented. In Experiment 2, these recognition errors persisted despite an explicit warning against treating the images as informative. In Experiment 3, these findings were replicated in a large consumer sample from 5 European countries, and with a cued-recall test. These findings confirm that images can act as health claims, by leading people to infer health benefits without prompting. These inferences appear often to be implicit, and could therefore be highly pervasive. The data underscore the importance of regulating imagery on product packaging; memory-based methods represent innovative ways to measure how leading (or misleading) specific images can be. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  12. Discrete Multiwavelet Critical-Sampling Transform-Based OFDM System over Rayleigh Fading Channels

    Directory of Open Access Journals (Sweden)

    Sameer A. Dawood

    2015-01-01

    Full Text Available Discrete multiwavelet critical-sampling transform (DMWCST) has been proposed instead of fast Fourier transform (FFT) in the realization of the orthogonal frequency division multiplexing (OFDM) system. The proposed structure further reduces the level of interference and improves the bandwidth efficiency through the elimination of the cyclic prefix due to the good orthogonality and time-frequency localization properties of the multiwavelet transform. The proposed system was simulated using MATLAB to allow various parameters of the system to be varied and tested. The performance of DMWCST-based OFDM (DMWCST-OFDM) was compared with that of the discrete wavelet transform-based OFDM (DWT-OFDM) and the traditional FFT-based OFDM (FFT-OFDM) over flat fading and frequency-selective fading channels. Results obtained indicate that the performance of the proposed DMWCST-OFDM system achieves significant improvement compared to those of DWT-OFDM and FFT-OFDM systems. DMWCST improves the performance of the OFDM system by a factor of 1.5–2.5 dB and 13–15.5 dB compared with the DWT and FFT, respectively. Therefore the proposed system offers higher data rate in wireless mobile communications.

  13. The Impact of Contextual Clue Selection on Inference

    Directory of Open Access Journals (Sweden)

    Leila Barati

    2010-05-01

    Full Text Available Linguistic information can be conveyed in the form of speech and written text, but it is the content of the message that is ultimately essential for higher-level processes in language comprehension, such as making inferences and associations between text information and knowledge about the world. Linguistically, inference is the shovel that allows receivers to dig meaning out from the text by selecting different embedded contextual clues. Naturally, people with different world experiences infer similar contextual situations differently. Lack of contextual knowledge of the target language can present an obstacle to comprehension (Anderson & Lynch, 2003). This paper tries to investigate how correct contextual clue selection from the text can influence a listener's inference. In the present study 60 male and female teenagers (13-19) and 60 male and female young adults (20-26) were selected randomly based on the Oxford Placement Test (OPT). During the study two fiction and two non-fiction passages were read to the participants in the experimental and control groups respectively, and they were given scores according to Lexile's Score (LS)[1] based on their correct inference and logical thinking ability. In general the results show that participants' clue selection, based on their personal schematic references and background knowledge, differs between teenagers and young adults and influences inference and listening comprehension. [1] This is a framework for reading and listening which matches the appropriate score to each text based on the degree of difficulty of the text; each text was given a Lexile score from zero to four.

  14. Static security-based available transfer capability using adaptive neuro fuzzy inference system

    Energy Technology Data Exchange (ETDEWEB)

    Venkaiah, C.; Vinod Kumar, D.M.

    2010-07-01

    In a deregulated power system, power transactions between a seller and a buyer can only be scheduled when there is sufficient available transfer capability (ATC). Internet-based, open access same-time information systems (OASIS) provide market participants with ATC information that is continuously updated in real time. Static security-based ATC can be computed for the base case system as well as for the critical line outages of the system. Since critical line outages are based on static security analysis, the computation of static security-based ATC using conventional methods is both tedious and time consuming. In this study, static security-based ATC was computed for real-time applications using three artificial intelligence methods, namely the back propagation algorithm (BPA), the radial basis function (RBF) neural network, and the adaptive neuro fuzzy inference system (ANFIS). An IEEE 24-bus reliability test system (RTS) and a 75-bus practical system were used to test these three different intelligent methods. The results were compared with the conventional full alternating current (AC) load flow method for different transactions.

  16. Inferring Demographic History Using Two-Locus Statistics.

    Science.gov (United States)

    Ragsdale, Aaron P; Gutenkunst, Ryan N

    2017-06-01

    Population demographic history may be learned from contemporary genetic variation data. Methods based on aggregating the statistics of many single loci into an allele frequency spectrum (AFS) have proven powerful, but such methods ignore potentially informative patterns of linkage disequilibrium (LD) between neighboring loci. To leverage such patterns, we developed a composite-likelihood framework for inferring demographic history from aggregated statistics of pairs of loci. Using this framework, we show that two-locus statistics are more sensitive to demographic history than single-locus statistics such as the AFS. In particular, two-locus statistics escape the notorious confounding of depth and duration of a bottleneck, and they provide a means to estimate effective population size based on the recombination rather than mutation rate. We applied our approach to a Zambian population of Drosophila melanogaster. Notably, using both single- and two-locus statistics, we inferred a substantially lower ancestral effective population size than previous works and did not infer a bottleneck history. Together, our results demonstrate the broad potential for two-locus statistics to enable powerful population genetic inference. Copyright © 2017 by the Genetics Society of America.

  17. The importance of learning when making inferences

    Directory of Open Access Journals (Sweden)

    Jorg Rieskamp

    2008-03-01

    Full Text Available The assumption that people possess a repertoire of strategies to solve the inference problems they face has been made repeatedly. The experimental findings of two previous studies on strategy selection are reexamined from a learning perspective, which argues that people learn to select strategies for making probabilistic inferences. This learning process is modeled with the strategy selection learning (SSL) theory, which assumes that people develop subjective expectancies for the strategies they have. They select strategies proportional to their expectancies, which are updated on the basis of experience. For the study by Newell, Weston, and Shanks (2003) it can be shown that people did not anticipate the success of a strategy from the beginning of the experiment. Instead, the behavior observed at the end of the experiment was the result of a learning process that can be described by the SSL theory. For the second study, by Bröder and Schiffer (2006), the SSL theory is able to provide an explanation for why participants only slowly adapted to new environments in a dynamic inference situation. The reanalysis of the previous studies illustrates the importance of learning for probabilistic inferences.
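
    The SSL theory's published parameterisation is not reproduced here; the sketch below only illustrates the two mechanisms named above, expectancies updated from experienced payoffs and strategies chosen with probability proportional to their expectancies, in a toy environment. The strategy names, payoff probabilities and parameter values are assumptions made for illustration.

```python
import random

def ssl_simulation(strategy_payoffs, n_trials=100, initial_expectancy=5.0,
                   learning_rate=1.0, seed=0):
    """Strategy selection learning sketch: expectancies start equal, strategies
    are chosen with probability proportional to expectancy, and the chosen
    strategy's expectancy is incremented by the payoff it produces."""
    rng = random.Random(seed)
    names = list(strategy_payoffs)
    expectancy = {name: initial_expectancy for name in names}
    choices = []
    for _ in range(n_trials):
        total = sum(expectancy.values())
        r, acc, chosen = rng.random() * total, 0.0, names[-1]
        for name in names:                        # roulette-wheel selection
            acc += expectancy[name]
            if r <= acc:
                chosen = name
                break
        payoff = strategy_payoffs[chosen](rng)    # experienced outcome
        expectancy[chosen] += learning_rate * payoff
        choices.append(chosen)
    return expectancy, choices

# Illustrative environment: a "take-the-best"-like strategy pays off more often.
env = {"TTB": lambda rng: 1.0 if rng.random() < 0.8 else 0.0,
       "WADD": lambda rng: 1.0 if rng.random() < 0.6 else 0.0}
print(ssl_simulation(env)[0])
```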

  18. Bayesian inference of substrate properties from film behavior

    International Nuclear Information System (INIS)

    Aggarwal, R; Demkowicz, M J; Marzouk, Y M

    2015-01-01

    We demonstrate that by observing the behavior of a film deposited on a substrate, certain features of the substrate may be inferred with quantified uncertainty using Bayesian methods. We carry out this demonstration on an illustrative film/substrate model where the substrate is a Gaussian random field and the film is a two-component mixture that obeys the Cahn–Hilliard equation. We construct a stochastic reduced order model to describe the film/substrate interaction and use it to infer substrate properties from film behavior. This quantitative inference strategy may be adapted to other film/substrate systems. (paper)

  19. Data-driven inference for the spatial scan statistic

    Directory of Open Access Journals (Sweden)

    Duczmal Luiz H

    2011-08-01

    Full Text Available Background: Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. Results: A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is proposed, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under the null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based on this inference. Conclusions: A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.

  20. Toward decentralized analysis of mercury (II) in real samples. A critical review on nanotechnology-based methodologies.

    Science.gov (United States)

    Botasini, Santiago; Heijo, Gonzalo; Méndez, Eduardo

    2013-10-24

    In recent years, the number of works focused on the development of novel nanoparticle-based sensors for mercury detection has increased, mainly motivated by the need for low-cost portable devices capable of giving a fast and reliable analytical response, thus contributing to analytical decentralization. Methodologies employing colorimetric, fluorometric, magnetic, and electrochemical output signals allowed reaching detection limits within the pM and nM ranges. Most of these developments proved their suitability in detecting and quantifying mercury (II) ions in synthetic solutions or spiked water samples. However, the state of the art in these technologies is still behind the standard methods of mercury quantification, such as cold vapor atomic absorption spectrometry and inductively coupled plasma techniques, in terms of reliability and sensitivity. This is mainly because the response of nanoparticle-based sensors is highly affected by the sample matrix. The developed analytical nanosystems may fail in real samples because of the negative incidence of the ionic strength and the presence of exchangeable ligands. The aim of this review is to critically consider the recently published innovations in this area, and highlight the need to include more realistic assays in future research in order to make these advances suitable for on-site analysis. Copyright © 2013 Elsevier B.V. All rights reserved.

  1. The interplay between post-critical beliefs and anxiety: an exploratory study in a Polish sample.

    Science.gov (United States)

    Śliwak, Jacek; Zarzycka, Beata

    2012-06-01

    The present research investigates the relationship between anxiety and the religiosity dimensions that Wulff (Psychology of religion: classic and contemporary views, Wiley, New York, 1991; Psychology of religion. Classic and contemporary views, Wiley, New York, 1997; Psychologia religii. Klasyczna i współczesna, Wydawnictwo Szkolne i Pedagogiczne, Warszawa, 1999) described as Exclusion vs. Inclusion of Transcendence and Literal vs. Symbolic. The researchers used the Post-Critical Belief scale (Hutsebaut in J Empir Theol 9(2):48-66, 1996; J Empir Theol 10(1):39-54, 1997) to measure Wulff's religiosity dimensions and the IPAT scale (Krug et al. 1967) to measure anxiety. Results from an adult sample (N = 83) suggest that three dimensions show significant relations with anxiety. Orthodoxy correlated negatively with suspiciousness (L) and positively with the guilt proneness (O) factor in the whole sample. Among women, Historical Relativism negatively correlated with suspiciousness (L), lack of integration (Q3), general anxiety and covert anxiety. Among men, Historical Relativism positively correlated with tension (Q4) and emotional instability (C), general anxiety, covert anxiety and overt anxiety. External Critique was correlated with suspiciousness (L) among men.

  2. 30 CFR 7.405 - Critical characteristics.

    Science.gov (United States)

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Critical characteristics. 7.405 Section 7.405... Cable Splice Kits § 7.405 Critical characteristics. (a) A sample from each production run, batch, or lot... tested, or (b) A sample of the materials that contribute to the flame-resistant characteristic of the...

  3. Uncertainty quantification and inference of Manning's friction coefficients using DART buoy data during the Tōhoku tsunami

    KAUST Repository

    Sraj, Ihab; Mandli, Kyle T.; Knio, Omar; Dawson, Clint N.; Hoteit, Ibrahim

    2014-01-01

    Tsunami computational models are employed to explore multiple flooding scenarios and to predict water elevations. However, accurate estimation of water elevations requires accurate estimation of many model parameters including the Manning's n friction parameterization. Our objective is to develop an efficient approach for the uncertainty quantification and inference of the Manning's n coefficient which we characterize here by three different parameters set to be constant in the on-shore, near-shore and deep-water regions as defined using iso-baths. We use Polynomial Chaos (PC) to build an inexpensive surrogate for the GeoClaw model and employ Bayesian inference to estimate and quantify uncertainties related to relevant parameters using the DART buoy data collected during the Tōhoku tsunami. The surrogate model significantly reduces the computational burden of the Markov Chain Monte Carlo (MCMC) sampling of the Bayesian inference. The PC surrogate is also used to perform a sensitivity analysis.
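
    The surrogate-plus-MCMC workflow described above can be sketched in a few lines. The sketch below is not GeoClaw or polynomial chaos proper: a cheap analytic stand-in plays the role of the forward model, an ordinary least-squares polynomial fit plays the role of the PC surrogate, and a basic Metropolis-Hastings loop plays the role of the MCMC sampler; all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for an expensive forward model (e.g., a tsunami solver): maps a
# friction coefficient n to a predicted gauge amplitude.
def forward_model(n):
    return 1.0 / (0.5 + 5.0 * n)

# 1) Build an inexpensive polynomial surrogate from a handful of model runs.
train_n = np.linspace(0.01, 0.08, 9)
train_y = np.array([forward_model(n) for n in train_n])
coeffs = np.polynomial.polynomial.polyfit(train_n, train_y, deg=3)
surrogate = lambda n: np.polynomial.polynomial.polyval(n, coeffs)

# 2) Synthetic "buoy" observation generated with noise at a true n.
n_true, sigma = 0.03, 0.02
y_obs = forward_model(n_true) + rng.normal(0.0, sigma)

def log_post(n):
    if not (0.005 < n < 0.1):                  # uniform prior bounds
        return -np.inf
    resid = y_obs - surrogate(n)
    return -0.5 * (resid / sigma) ** 2         # Gaussian likelihood (up to a constant)

# 3) Metropolis-Hastings on the surrogate, which is cheap to evaluate.
samples, n_cur, lp_cur = [], 0.05, log_post(0.05)
for _ in range(20000):
    n_prop = n_cur + rng.normal(0.0, 0.005)
    lp_prop = log_post(n_prop)
    if np.log(rng.uniform()) < lp_prop - lp_cur:
        n_cur, lp_cur = n_prop, lp_prop
    samples.append(n_cur)

post = np.array(samples[5000:])
print(f"posterior mean n = {post.mean():.4f}, 95% interval = "
      f"({np.percentile(post, 2.5):.4f}, {np.percentile(post, 97.5):.4f})")
```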

  4. Uncertainty quantification and inference of Manning's friction coefficients using DART buoy data during the Tōhoku tsunami

    KAUST Repository

    Sraj, Ihab

    2014-11-01

    Tsunami computational models are employed to explore multiple flooding scenarios and to predict water elevations. However, accurate estimation of water elevations requires accurate estimation of many model parameters including the Manning's n friction parameterization. Our objective is to develop an efficient approach for the uncertainty quantification and inference of the Manning's n coefficient which we characterize here by three different parameters set to be constant in the on-shore, near-shore and deep-water regions as defined using iso-baths. We use Polynomial Chaos (PC) to build an inexpensive surrogate for the GeoClaw model and employ Bayesian inference to estimate and quantify uncertainties related to relevant parameters using the DART buoy data collected during the Tōhoku tsunami. The surrogate model significantly reduces the computational burden of the Markov Chain Monte Carlo (MCMC) sampling of the Bayesian inference. The PC surrogate is also used to perform a sensitivity analysis.

  5. Toward decentralized analysis of mercury (II) in real samples. A critical review on nanotechnology-based methodologies

    International Nuclear Information System (INIS)

    Botasini, Santiago; Heijo, Gonzalo; Méndez, Eduardo

    2013-01-01

    Graphical abstract: -- Highlights: •Several methods based on nanotechnology achieve limits of detection in the pM and nM ranges for mercury (II) analysis. •Most of these methods are validated in filtered water samples and/or spiked samples. •Thiols in real samples constitute real competition for any sensor based on the binding of mercury (II) ions. •Future research should include the study of matrix interferences including thiols and dissolved organic matter. -- Abstract: In recent years, the number of works focused on the development of novel nanoparticle-based sensors for mercury detection has increased, mainly motivated by the need for low-cost portable devices capable of giving a fast and reliable analytical response, thus contributing to analytical decentralization. Methodologies employing colorimetric, fluorometric, magnetic, and electrochemical output signals have reached detection limits within the pM and nM ranges. Most of these developments proved their suitability for detecting and quantifying mercury (II) ions in synthetic solutions or spiked water samples. However, the state of the art in these technologies still lags behind the standard methods of mercury quantification, such as cold vapor atomic absorption spectrometry and inductively coupled plasma techniques, in terms of reliability and sensitivity. This is mainly because the response of nanoparticle-based sensors is strongly affected by the sample matrix. The developed analytical nanosystems may fail in real samples because of the adverse effects of ionic strength and the presence of exchangeable ligands. The aim of this review is to critically consider the recently published innovations in this area, and to highlight the need to include more realistic assays in future research in order to make these advances suitable for on-site analysis.

  6. Statistical inference an integrated approach

    CERN Document Server

    Migon, Helio S; Louzada, Francisco

    2014-01-01

    Introduction Information The concept of probability Assessing subjective probabilities An example Linear algebra and probability Notation Outline of the book Elements of Inference Common statistical models Likelihood-based functions Bayes theorem Exchangeability Sufficiency and exponential family Parameter elimination Prior Distribution Entirely subjective specification Specification through functional forms Conjugacy with the exponential family Non-informative priors Hierarchical priors Estimation Introduction to decision theory Bayesian point estimation Classical point estimation Empirical Bayes estimation Comparison of estimators Interval estimation Estimation in the Normal model Approximating Methods The general problem of inference Optimization techniques Asymptotic theory Other analytical approximations Numerical integration methods Simulation methods Hypothesis Testing Introduction Classical hypothesis testing Bayesian hypothesis testing Hypothesis testing and confidence intervals Asymptotic tests Prediction...

  7. Statistical learning and selective inference.

    Science.gov (United States)

    Taylor, Jonathan; Tibshirani, Robert J

    2015-06-23

    We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
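
    A toy simulation makes the cherry-picking problem concrete. The sketch below is a generic illustration, not the selective-inference machinery the authors develop for forward stepwise regression or the lasso: with 200 pure-noise features, the strongest observed correlation looks highly significant if judged as though it had been pre-specified, but not once it is compared against the null distribution of the maximum correlation over all candidates. All data are synthetic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, m = 50, 200            # 50 samples, 200 candidate features, all pure noise

def max_abs_corr(y, X):
    """Largest absolute Pearson correlation between y and any column of X."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = Xc.T @ yc / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.abs(r).max()

# Observed data under the global null: no feature is truly associated with y.
X, y = rng.normal(size=(n, m)), rng.normal(size=n)
r_best = max_abs_corr(y, X)

# Naive p-value treats the selected feature as if it had been pre-specified.
t = r_best * np.sqrt((n - 2) / (1 - r_best ** 2))
p_naive = 2 * stats.t.sf(t, df=n - 2)

# Selection-aware p-value: compare against the null distribution of the
# *maximum* absolute correlation over all m features.
null_max = [max_abs_corr(rng.normal(size=n), rng.normal(size=(n, m)))
            for _ in range(500)]
p_adjusted = np.mean([v >= r_best for v in null_max])

print(f"strongest |correlation|  = {r_best:.2f}")
print(f"naive p-value            = {p_naive:.4f}   (misleadingly small)")
print(f"selection-aware p-value  = {p_adjusted:.2f}")
```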

  8. Life-history traits of the Miocene Hipparion concudense (Spain) inferred from bone histological structure.

    Directory of Open Access Journals (Sweden)

    Cayetana Martinez-Maza

    Full Text Available Histological analyses of fossil bones have provided clues on the growth patterns and life-history traits of several extinct vertebrates that would be unavailable to classical morphological studies. We analyzed the bone histology of Hipparion to infer features of its life-history traits and growth pattern. Microscope analysis of thin sections of a large sample of humeri, femora, tibiae and metapodials of Hipparion concudense from the upper Miocene site of Los Valles de Fuentidueña (Segovia, Spain) has shown that the number of growth marks is similar among the different limb bones, suggesting that equivalent skeletochronological inferences for this Hipparion population might be achieved by means of any of the elements studied. Considering their abundance, we conducted a skeletochronological study based on the large sample of third metapodials from Los Valles de Fuentidueña together with another large sample from the upper Miocene locality of Concud (Teruel, Spain). The data obtained enabled us to distinguish four age groups in both samples and to determine that Hipparion concudense tended to reach skeletal maturity during its third year of life. Integration of bone microstructure and skeletochronological data allowed us to identify ontogenetic changes in bone structure and growth rate and to distinguish three histologic ontogenetic stages corresponding to immature, subadult and adult individuals. Data on secondary osteon density revealed an increase in bone remodeling throughout the ontogenetic stages and a lesser degree thereof in the Concud population, which indicates different biomechanical stresses in the two populations, likely due to environmental differences. Several individuals showed atypical growth patterns in the Concud sample, which may also reflect environmental differences between the two localities. Finally, classification of the specimens' age within groups enabled us to characterize the age structure of both samples, which is

  9. The Impact of Selection, Gene Conversion, and Biased Sampling on the Assessment of Microbial Demography.

    Science.gov (United States)

    Lapierre, Marguerite; Blin, Camille; Lambert, Amaury; Achaz, Guillaume; Rocha, Eduardo P C

    2016-07-01

    Recent studies have linked demographic changes and epidemiological patterns in bacterial populations using coalescent-based approaches. We identified 26 studies using skyline plots and found that 21 inferred overall population expansion. This surprising result led us to analyze the impact of natural selection, recombination (gene conversion), and sampling biases on demographic inference using skyline plots and site frequency spectra (SFS). Forward simulations based on biologically relevant parameters from Escherichia coli populations showed that theoretical arguments on the detrimental impact of recombination and especially natural selection on the reconstructed genealogies cannot be ignored in practice. In fact, both processes systematically lead to spurious interpretations of population expansion in skyline plots (and in SFS for selection). Weak purifying selection, and especially positive selection, had important effects on skyline plots, showing patterns akin to those of population expansions. State-of-the-art techniques to remove recombination further amplified these biases. We simulated three common sampling biases in microbiological research: uniform, clustered, and mixed sampling. Alone, or together with recombination and selection, they further mislead demographic inferences, producing almost any possible skyline shape or SFS. Interestingly, sampling sub-populations also affected skyline plots and SFS, because the coalescent rates of populations and their sub-populations had different distributions. This study suggests that extreme caution is needed to infer demographic changes solely based on reconstructed genealogies. We suggest that the development of novel sampling strategies and the joint analyses of diverse population genetic methods are strictly necessary to estimate demographic changes in populations where selection, recombination, and biased sampling are present. © The Author 2016. Published by Oxford University Press on behalf of the Society for
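
    The site frequency spectrum used alongside skyline plots in this study is straightforward to compute. Below is a minimal, self-contained sketch on a synthetic binary genotype matrix (it does not reproduce the study's forward simulations or any skyline estimation); the unfolded and folded spectra are the summaries whose excess of low-frequency variants is read, rightly or wrongly, as a signature of expansion.

```python
import numpy as np

rng = np.random.default_rng(3)
n_samples, n_sites = 30, 5000

# Synthetic alignment: each column is a biallelic site, entries 0/1.
# (Real analyses would use genotypes called from sequence data.)
freqs = rng.uniform(0.01, 0.5, n_sites)
genotypes = rng.binomial(1, freqs, size=(n_samples, n_sites))

# Unfolded SFS: number of sites whose derived-allele count equals i (i = 1..n-1).
counts = genotypes.sum(axis=0)
sfs = np.bincount(counts, minlength=n_samples + 1)[1:n_samples]

# Folded SFS (minor-allele counts), used when the ancestral state is unknown.
minor = np.minimum(counts, n_samples - counts)
folded = np.bincount(minor, minlength=n_samples // 2 + 1)[1:]

print("unfolded SFS (first 10 classes):", sfs[:10])
print("folded SFS   (first 10 classes):", folded[:10])
```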

  10. Critical Magnetic Field Strengths for Unipolar Solar Coronal Plumes In Quiet Regions and Coronal Holes?

    Science.gov (United States)

    Avallone, Ellis; Tiwari, Sanjiv K.; Panesar, Navdeep K.; Moore, Ronald L.; Winebarger, Amy

    2017-01-01

    Coronal plumes are bright magnetic funnels, found in quiet regions and coronal holes, that extend high into the solar corona and whose lifetimes can last from hours to days. The heating processes that make plumes bright involve the magnetic field at the base of the plume, but their intricacies remain mysterious. Raouafi et al. (2014) infer from observation that plume heating is a consequence of magnetic reconnection at the base, whereas Wang et al. (2016) infer that plume heating is a result of convergence of the magnetic flux at the plume's base, or base flux. Both papers suggest that the base flux in their plumes is of mixed polarity, but do not quantitatively measure the base flux or consider whether a critical magnetic field strength is required for plume production. To investigate the magnetic origins of plume heating, we track plume luminosity in the 171 Å wavelength as well as the abundance and strength of the base flux over the lifetimes of six unipolar coronal plumes. Of these, three are in coronal holes and three are in quiet regions. For this sample, we find that plume heating is triggered when convergence of the base flux surpasses a field strength of approximately 300-500 Gauss, and that the luminosity of both quiet-region and coronal-hole plumes responds similarly to the strength of the magnetic field in the base.

  11. TYPE Ia SUPERNOVA LIGHT-CURVE INFERENCE: HIERARCHICAL BAYESIAN ANALYSIS IN THE NEAR-INFRARED

    International Nuclear Information System (INIS)

    Mandel, Kaisey S.; Friedman, Andrew S.; Kirshner, Robert P.; Wood-Vasey, W. Michael

    2009-01-01

    We present a comprehensive statistical analysis of the properties of Type Ia supernova (SN Ia) light curves in the near-infrared using recent data from Peters Automated InfraRed Imaging TELescope and the literature. We construct a hierarchical Bayesian framework, incorporating several uncertainties including photometric error, peculiar velocities, dust extinction, and intrinsic variations, for principled and coherent statistical inference. SN Ia light-curve inferences are drawn from the global posterior probability of parameters describing both individual supernovae and the population conditioned on the entire SN Ia NIR data set. The logical structure of the hierarchical model is represented by a directed acyclic graph. Fully Bayesian analysis of the model and data is enabled by an efficient Markov Chain Monte Carlo algorithm exploiting the conditional probabilistic structure using Gibbs sampling. We apply this framework to the JHK_s SN Ia light-curve data. A new light-curve model captures the observed J-band light-curve shape variations. The marginal intrinsic variances in peak absolute magnitudes are σ(M_J) = 0.17 ± 0.03, σ(M_H) = 0.11 ± 0.03, and σ(M_Ks) = 0.19 ± 0.04. We describe the first quantitative evidence for correlations between the NIR absolute magnitudes and J-band light-curve shapes, and demonstrate their utility for distance estimation. The average residual in the Hubble diagram for the training set SNe at cz > 2000 km s^-1 is 0.10 mag. The new application of bootstrap cross-validation to SN Ia light-curve inference tests the sensitivity of the statistical model fit to the finite sample and estimates the prediction error at 0.15 mag. These results demonstrate that SN Ia NIR light curves are as effective as corrected optical light curves, and, because they are less vulnerable to dust absorption, they have great potential as precise and accurate cosmological distance indicators.

  12. How to Make Correct Predictions in False Belief Tasks without Attributing False Beliefs: An Analysis of Alternative Inferences and How to Avoid Them

    Directory of Open Access Journals (Sweden)

    Ricardo Augusto Perera

    2018-04-01

    Full Text Available The use of new paradigms of false belief tasks (FBT) has allowed the age at which children pass the test to be reduced from 4 years in the standard version to only 15 months, or even a striking 6 months, in the nonverbal modification. These results are often taken as evidence that infants already possess an at least implicit theory of mind (ToM). We criticize this inferential leap on the grounds that inferring a ToM from predictive success on a false belief task requires assuming as a premise that belief reasoning is a necessary condition for correct action prediction. It is argued that the FBT does not satisfactorily constrain the predictive means, leaving room for the use of belief-independent inferences (which can rely on the attribution of non-representational mental states or on the consideration of behavioral patterns that dispense with any reference to other minds). These heuristics, when applied to the FBT, can achieve the same predictive success as a belief-based inference because information provided by the test stimulus allows the recognition of particular situations that can be subsumed by their 'laws'. Instead of solving this issue by designing a single experimentum crucis that would render unfeasible the use of non-representational inferences, we suggest the application of a set of tests in which, although individually they can support inferences dissociated from a ToM, only an inference that makes use of false beliefs is able to correctly predict all the outcomes.

  13. Critical transport current in granular high temperature superconductors

    International Nuclear Information System (INIS)

    Bogolyubov, N.A.

    1999-01-01

    The temperature and size dependence of the critical current in a zero magnetic field of three bismuth-based ceramic samples with round cross section and one sample with right-triangular cross section have been studied by a contactless technique. It is shown that the critical current can be presented as a product of a temperature-dependent and a size-dependent factor. The temperature-dependent multiplier reflects the individual peculiarities of the Josephson network of each sample, while the size factor is a homogeneous function of the cross-section sizes. The index of this function is independent of the cross-section form, the temperature and the individual properties of HTSC samples. The radial distribution of critical current density in round samples and the dependence of the critical current density on the magnetic conduction in granular HTSC have been found from the analysis of experimental data

  14. Critical thinking and creativity in nursing: learners' perspectives.

    Science.gov (United States)

    Chan, Zenobia C Y

    2013-05-01

    Although the development of critical thinking and the development of creativity are major areas in nursing programme, little has been explored about learners' perspectives towards these two concepts, especially in Chinese contexts. This study aimed to reveal nursing learners' perspectives on creativity and critical thinking. Qualitative data collection methods were adopted, namely group interviews and concept map drawings. The process of data collection was conducted in private rooms at a University. 36 nursing students from two problem-based learning classes were recruited in two groups for the study. After data collection, content analysis with axial coding approach was conducted to explore the narrative themes, to summarise the main ideas, and to make valid inferences from the connections among critical thinking, creativity, and other exogenous variables. Based on the findings, six major themes were identified: "revisiting the meanings of critical thinking"; "critical thinking and knowledge: partners or rivals?"; "is critical thinking criticising?"; "revising the meanings of creativity"; "creativity and experience: partners or rivals?"; and "should creativity be practical?". This study showed that learners had diverse perspectives towards critical thinking and creativity, and their debate on these two domains provided implications on nursing education, since the voices of learners are crucial in teaching. By closing the gap between learners and educators, this study offered some insights on nursing education in the new curriculum, in particular to co-construct nursing knowledge which is student-driven, and to consider students' voices towards understanding and applying creativity and critical thinking in nursing. Copyright © 2012 Elsevier Ltd. All rights reserved.

  15. Inferring Molecular Processes Heterogeneity from Transcriptional Data.

    Science.gov (United States)

    Gogolewski, Krzysztof; Wronowska, Weronika; Lech, Agnieszka; Lesyng, Bogdan; Gambin, Anna

    2017-01-01

    RNA microarrays and RNA-seq are nowadays standard technologies to study the transcriptional activity of cells. Most studies focus on tracking transcriptional changes caused by specific experimental conditions. Information on gene up- and downregulation is evaluated by analyzing the behaviour of a relatively large population of cells and averaging its properties. However, even assuming perfect sample homogeneity, different subpopulations of cells can exhibit diverse transcriptomic profiles, as they may follow different regulatory/signaling pathways. The purpose of this study is to provide a novel methodological scheme to account for possible internal, functional heterogeneity in homogeneous cell lines, including cancer ones. We propose a novel computational method to infer the proportion between subpopulations of cells that manifest various functional behaviour in a given sample. Our method was validated using two datasets from RNA microarray experiments. Both experiments aimed to examine cell viability in specific experimental conditions. The presented methodology can be easily extended to RNA-seq data as well as other molecular processes. Moreover, it complements standard tools that indicate the most important networks from transcriptomic data, and in particular could be useful in the analysis of cancer cell lines affected by biologically active compounds or drugs.
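
    The core idea, inferring the mixing proportions of functional subpopulations from a bulk measurement, can be illustrated with a much simpler stand-in than the paper's method: given reference expression profiles for each subpopulation, non-negative least squares recovers the proportions in a mixed sample. The sketch below uses synthetic profiles and only illustrates the deconvolution idea, not the authors' algorithm.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(9)

# Reference expression profiles for 3 hypothetical subpopulations over 500 genes.
n_genes, n_subpops = 500, 3
profiles = rng.gamma(2.0, 1.0, size=(n_genes, n_subpops))

# A bulk sample mixing the subpopulations 60/30/10, plus measurement noise.
true_props = np.array([0.6, 0.3, 0.1])
bulk = profiles @ true_props + 0.05 * rng.normal(size=n_genes)

# Non-negative least squares recovers the weights; renormalize to proportions.
w, _ = nnls(profiles, bulk)
props = w / w.sum()
print("true proportions     :", true_props)
print("estimated proportions:", np.round(props, 3))
```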

  16. Object-Oriented Type Inference

    DEFF Research Database (Denmark)

    Schwartzbach, Michael Ignatieff; Palsberg, Jens

    1991-01-01

    We present a new approach to inferring types in untyped object-oriented programs with inheritance, assignments, and late binding. It guarantees that all messages are understood, annotates the program with type information, allows polymorphic methods, and can be used as the basis of an op...

  17. Statistical inferences under the Null hypothesis: Common mistakes and pitfalls in neuroimaging studies.

    Directory of Open Access Journals (Sweden)

    Jean-Michel eHupé

    2015-02-01

    Full Text Available Published studies using functional and structural MRI include many errors in the way data are analyzed and conclusions reported. This was observed when working on a comprehensive review of the neural bases of synesthesia, but these errors are probably endemic to neuroimaging studies. All studies reviewed had based their conclusions on Null Hypothesis Significance Tests (NHST). NHST have been criticized since their inception because they are more appropriate for taking decisions related to a Null hypothesis (as in manufacturing) than for making inferences about behavioral and neuronal processes. Here I focus on a few key problems of NHST related to brain imaging techniques, and explain why or when we should not rely on significance tests. I also observed that the ill-posed logic of NHST was often not even correctly applied, and describe what I identified as common mistakes or at least problematic practices in published papers, in light of what could be considered the very basics of statistical inference. MRI statistics also involve much more complex issues than standard statistical inference. Analysis pipelines vary a lot between studies, even for those using the same software, and there is no consensus on which pipeline is best. I propose a synthetic view of the logic behind the possible methodological choices, and warn against the usage and interpretation of two statistical methods popular in brain imaging studies, the false discovery rate (FDR) procedure and permutation tests. I suggest that current models for the analysis of brain imaging data suffer from serious limitations and call for a revision taking into account the logic of the new statistics (confidence intervals).

  18. Energy metabolism in mobile, wild-sampled sharks inferred by plasma lipids.

    Science.gov (United States)

    Gallagher, Austin J; Skubel, Rachel A; Pethybridge, Heidi R; Hammerschlag, Neil

    2017-01-01

    Evaluating how predators metabolize energy is increasingly useful for conservation physiology, as it can provide information on their current nutritional condition. However, obtaining metabolic information from mobile marine predators is inherently challenging owing to their relative rarity, cryptic nature and often wide-ranging underwater movements. Here, we investigate aspects of energy metabolism in four free-ranging shark species (n = 281; blacktip, bull, nurse, and tiger) by measuring three metabolic parameters [plasma triglycerides (TAG), free fatty acids (FFA) and cholesterol (CHOL)] via non-lethal biopsy sampling. Plasma TAG, FFA and total CHOL concentrations (in millimoles per litre) varied inter-specifically, and varied with season, year, and shark length within a species. TAG were highest in the plasma of less active species (nurse and tiger sharks), whereas FFA were highest among species with relatively high energetic demands (blacktip and bull sharks), and CHOL concentrations were highest in bull sharks. Although temporal patterns in all metabolites varied among species, there appeared to be peaks in the spring and summer, with ratios of TAG/CHOL (a proxy for condition) in all species displaying a notable peak in summer. These results provide baseline information on energy metabolism in large sharks and are an important step in understanding how these metabolic parameters can be assessed through non-lethal sampling in the future. In particular, this study emphasizes the importance of accounting for intra-specific and temporal variability in sampling designs seeking to monitor the nutritional condition and metabolic responses of shark populations.

  19. The Probabilistic Convolution Tree: Efficient Exact Bayesian Inference for Faster LC-MS/MS Protein Inference

    Science.gov (United States)

    Serang, Oliver

    2014-01-01

    Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called “causal independence”). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to and the space to where is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions. PMID:24626234
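
    The key primitive the paper exploits, combining probabilistic adder nodes, can be sketched in a few lines: the exact distribution of a sum of independent count variables is the convolution of their probability mass functions, and performing the convolutions pairwise in a balanced tree is the shape of the speedup. The sketch below (hypothetical two-state indicator variables, numpy.convolve in place of the paper's machinery) illustrates only that primitive, not the full inference method.

```python
import numpy as np

def convolve_tree(pmfs):
    """Combine independent count variables by pairwise convolution of their
    PMFs (a balanced 'convolution tree'), giving the exact PMF of their sum."""
    layer = [np.asarray(p, dtype=float) for p in pmfs]
    while len(layer) > 1:
        nxt = []
        for i in range(0, len(layer) - 1, 2):
            nxt.append(np.convolve(layer[i], layer[i + 1]))
        if len(layer) % 2:                 # odd one out is carried to the next level
            nxt.append(layer[-1])
        layer = nxt
    return layer[0]

# Three hypothetical "present or absent" indicators with different probabilities;
# each PMF is over counts {0, 1}.
pmfs = [[0.7, 0.3], [0.4, 0.6], [0.9, 0.1]]
pmf_sum = convolve_tree(pmfs)
print("P(total count = k) for k = 0..3:", np.round(pmf_sum, 4))
print("probabilities sum to", pmf_sum.sum())
```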

  20. Inference-based procedural modeling of solids

    KAUST Repository

    Biggers, Keith

    2011-11-01

    As virtual environments become larger and more complex, there is an increasing need for more automated construction algorithms to support the development process. We present an approach for modeling solids by combining prior examples with a simple sketch. Our algorithm uses an inference-based approach to incrementally fit patches together in a consistent fashion to define the boundary of an object. This algorithm samples and extracts surface patches from input models, and develops a Petri net structure that describes the relationship between patches along an imposed parameterization. Then, given a new parameterized line or curve, we use the Petri net to logically fit patches together in a manner consistent with the input model. This allows us to easily construct objects of varying sizes and configurations using arbitrary articulation, repetition, and interchanging of parts. The result of our process is a solid model representation of the constructed object that can be integrated into a simulation-based environment. © 2011 Elsevier Ltd. All rights reserved.

  1. Reward inference by primate prefrontal and striatal neurons.

    Science.gov (United States)

    Pan, Xiaochuan; Fan, Hongwei; Sawa, Kosuke; Tsuda, Ichiro; Tsukada, Minoru; Sakagami, Masamichi

    2014-01-22

    The brain contains multiple yet distinct systems involved in reward prediction. To understand the nature of these processes, we recorded single-unit activity from the lateral prefrontal cortex (LPFC) and the striatum in monkeys performing a reward inference task using an asymmetric reward schedule. We found that neurons both in the LPFC and in the striatum predicted reward values for stimuli that had been previously well experienced with set reward quantities in the asymmetric reward task. Importantly, these LPFC neurons could predict the reward value of a stimulus using transitive inference even when the monkeys had not yet learned the stimulus-reward association directly; whereas these striatal neurons did not show such an ability. Nevertheless, because there were two set amounts of reward (large and small), the selected striatal neurons were able to exclusively infer the reward value (e.g., large) of one novel stimulus from a pair after directly experiencing the alternative stimulus with the other reward value (e.g., small). Our results suggest that although neurons that predict reward value for old stimuli in the LPFC could also do so for new stimuli via transitive inference, those in the striatum could only predict reward for new stimuli via exclusive inference. Moreover, the striatum showed more complex functions than was surmised previously for model-free learning.

  2. Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.

    Science.gov (United States)

    Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui

    2017-01-01

    The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS- Bcl-2 and RAS-AKT, and found significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.

  3. Evolutionary inference via the Poisson Indel Process.

    Science.gov (United States)

    Bouchard-Côté, Alexandre; Jordan, Michael I

    2013-01-22

    We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.

  4. Inference about density and temporary emigration in unmarked populations

    Science.gov (United States)

    Chandler, Richard B.; Royle, J. Andrew; King, David I.

    2011-01-01

    Few species are distributed uniformly in space, and populations of mobile organisms are rarely closed with respect to movement, yet many models of density rely upon these assumptions. We present a hierarchical model allowing inference about the density of unmarked populations subject to temporary emigration and imperfect detection. The model can be fit to data collected using a variety of standard survey methods such as repeated point counts in which removal sampling, double-observer sampling, or distance sampling is used during each count. Simulation studies demonstrated that parameter estimators are unbiased when temporary emigration is either "completely random" or is determined by the size and location of home ranges relative to survey points. We also applied the model to repeated removal sampling data collected on Chestnut-sided Warblers (Dendroica pensylvanica) in the White Mountain National Forest, USA. The density estimate from our model, 1.09 birds/ha, was similar to an estimate of 1.11 birds/ha produced by an intensive spot-mapping effort. Our model is also applicable when processes other than temporary emigration affect the probability of being available for detection, such as in studies using cue counts. Functions to implement the model have been added to the R package unmarked.
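
    The model itself is distributed in the R package unmarked; as a rough illustration of the underlying idea, the Python sketch below fits the simplest related model, a binomial N-mixture in which one constant probability lumps together availability (temporary emigration) and detection, to synthetic repeated point counts by marginalizing over the unknown site abundances. It shows the likelihood structure only, not the authors' full hierarchical model.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(4)

# Synthetic repeated point counts: R sites, T visits, abundance N_i ~ Poisson(lam),
# each count y_it ~ Binomial(N_i, p), with p lumping availability and detection.
R, T, lam_true, p_true = 100, 4, 3.0, 0.4
N = rng.poisson(lam_true, R)
y = rng.binomial(N[:, None], p_true, size=(R, T))

def neg_log_lik(params, y, n_max=60):
    log_lam, logit_p = params
    lam, p = np.exp(log_lam), 1.0 / (1.0 + np.exp(-logit_p))
    n_grid = np.arange(n_max + 1)                       # grid for marginalizing N_i
    log_prior_n = stats.poisson.logpmf(n_grid, lam)
    ll = 0.0
    for i in range(y.shape[0]):
        log_obs = stats.binom.logpmf(y[i][:, None], n_grid[None, :], p).sum(axis=0)
        ll += np.logaddexp.reduce(log_prior_n + log_obs)  # sum over possible N_i
    return -ll

res = optimize.minimize(neg_log_lik, x0=[0.0, 0.0], args=(y,), method="Nelder-Mead")
lam_hat = np.exp(res.x[0])
p_hat = 1.0 / (1.0 + np.exp(-res.x[1]))
print(f"estimated abundance per site = {lam_hat:.2f} (true {lam_true})")
print(f"estimated detection prob     = {p_hat:.2f} (true {p_true})")
```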

  5. Statistical Inference on Stochastic Dominance Efficiency. Do Omitted Risk Factors Explain the Size and Book-to-Market Effects?

    NARCIS (Netherlands)

    G.T. Post (Thierry)

    2003-01-01

    This paper discusses statistical inference on the second-order stochastic dominance (SSD) efficiency of a given portfolio relative to all portfolios formed from a set of assets. We derive the asymptotic sampling distribution of the Post test statistic for SSD efficiency. Unfortunately, a

  6. Gene expression inference with deep learning.

    Science.gov (United States)

    Chen, Yifei; Li, Yi; Narayan, Rajiv; Subramanian, Aravind; Xie, Xiaohui

    2016-06-15

    Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only ∼1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes. D-GEX is available at https://github.com/uci-cbcl/D-GEX CONTACT: xhx@ics.uci.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
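
    The modeling contrast the abstract describes, linear regression versus a deep network mapping landmark-gene expression to target-gene expression, can be mimicked on synthetic data with scikit-learn. The sketch below is not D-GEX (no LINCS/GEO data, far fewer genes, a much smaller network); it only illustrates how a nonlinear model can beat linear regression when targets depend nonlinearly on landmarks.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)

# Synthetic stand-in: 2000 profiles, 50 "landmark" genes, 20 "target" genes
# whose expression depends nonlinearly on the landmarks.
n, n_landmark, n_target = 2000, 50, 20
X = rng.normal(size=(n, n_landmark))
W1 = rng.normal(size=(n_landmark, n_target))
W2 = rng.normal(size=(n_landmark, n_target))
Y = X @ W1 + np.tanh(X) @ W2 + 0.1 * rng.normal(size=(n, n_target))

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.25, random_state=0)

lr = LinearRegression().fit(X_tr, Y_tr)
mlp = MLPRegressor(hidden_layer_sizes=(100, 100), max_iter=300,
                   random_state=0).fit(X_tr, Y_tr)

mae = lambda Y_true, Y_pred: np.abs(Y_true - Y_pred).mean()
print(f"linear regression MAE: {mae(Y_te, lr.predict(X_te)):.3f}")
print(f"small MLP MAE:         {mae(Y_te, mlp.predict(X_te)):.3f}")
```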

  7. Inferences about nested subsets structure when not all species are detected

    Science.gov (United States)

    Cam, E.; Nichols, J.D.; Hines, J.E.; Sauer, J.R.

    2000-01-01

    Comparisons of species composition among ecological communities of different size have often provided evidence that the species in communities with lower species richness form nested subsets of the species in larger communities. In the vast majority of studies, the question of nested subsets has been addressed using information on presence-absence, where a '0' is interpreted as the absence of a given species from a given location. Most of the methodological discussion in earlier studies investigating nestedness concerns the approach to generation of model-based matrices. However, it is most likely that in many situations investigators cannot detect all the species present in the locations sampled. The possibility that zeros in incidence matrices reflect nondetection rather than absence of species has not been considered in studies addressing nested subsets, even though the position of zeros in these matrices forms the basis of earlier inference methods. These sampling artifacts are likely to lead to erroneous conclusions about both variation over space in species richness and the degree of similarity of the various locations. Here we propose an approach to the investigation of nestedness, based on statistical inference methods explicitly incorporating species detection probability, that take into account the probabilistic nature of the sampling process. We use presence-absence data collected under Pollock's robust capture-recapture design, and resort to an estimator of species richness originally developed for closed populations to assess the proportion of species shared by different locations. We develop testable predictions corresponding to the null hypothesis of a nonnested pattern, and an alternative hypothesis of perfect nestedness. We also present an index for assessing the degree of nestedness of a system of ecological communities. We illustrate our approach using avian data from the North American Breeding Bird Survey collected in the Florida Keys.
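
    The authors' estimators come from closed-population capture-recapture theory; as a minimal, generic illustration of why raw incidence matrices understate richness when detection is imperfect, the sketch below computes the classic bias-corrected Chao2 estimator from synthetic repeated-visit incidence data. This is not the estimator used in the paper, just the simplest member of the same family of detection-corrected richness estimators.

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic incidence data for one location: S species, K repeated visits,
# species i detected on a visit with probability p_i (many species rarely seen).
S, K = 60, 5
p = rng.beta(0.8, 3.0, S)
detections = rng.binomial(1, p, size=(K, S))     # visits x species

present = detections.any(axis=0)
S_obs = present.sum()
freq = detections.sum(axis=0)                    # visits on which each species was seen
Q1, Q2 = (freq == 1).sum(), (freq == 2).sum()    # "uniques" and "duplicates"

# Chao2 incidence-based richness estimator (bias-corrected form).
S_chao2 = S_obs + ((K - 1) / K) * Q1 * (Q1 - 1) / (2 * (Q2 + 1))

print(f"true richness     = {S}")
print(f"observed richness = {S_obs}")
print(f"Chao2 estimate    = {S_chao2:.1f}")
```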

  8. Model selection and inference a practical information-theoretic approach

    CERN Document Server

    Burnham, Kenneth P

    1998-01-01

    This book is unique in that it covers the philosophy of model-based data analysis and an omnibus strategy for the analysis of empirical data. The book introduces information theoretic approaches and focuses critical attention on a priori modeling and the selection of a good approximating model that best represents the inference supported by the data. Kullback-Leibler information represents a fundamental quantity in science and is Hirotugu Akaike's basis for model selection. The maximized log-likelihood function can be bias-corrected to provide an estimate of expected, relative Kullback-Leibler information. This leads to Akaike's Information Criterion (AIC) and various extensions, and these are relatively simple and easy to use in practice, but little taught in statistics classes and far less understood in the applied sciences than should be the case. The information theoretic approaches provide a unified and rigorous theory, an extension of likelihood theory, an important application of information theory, and are ...
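
    The basic workflow the book formalizes, scoring each candidate model by AIC and converting AIC differences into Akaike weights, fits in a short sketch. Everything below is synthetic and deliberately simple (polynomial regression candidates with Gaussian errors); it illustrates the arithmetic, not the book's broader philosophy of a priori model building.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic data generated by a quadratic relationship.
x = np.linspace(0, 1, 80)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(0.0, 0.2, x.size)

def gaussian_aic(y, y_hat, k_params):
    """AIC = -2 log L + 2k for a least-squares fit with Gaussian errors."""
    n = y.size
    rss = np.sum((y - y_hat) ** 2)
    log_lik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    return -2 * log_lik + 2 * k_params

aics = {}
for deg in (1, 2, 4):                                 # candidate models
    coeffs = np.polyfit(x, y, deg)
    y_hat = np.polyval(coeffs, x)
    aics[f"degree {deg}"] = gaussian_aic(y, y_hat, k_params=deg + 2)  # +1 intercept, +1 sigma

delta = {m: a - min(aics.values()) for m, a in aics.items()}
weights = {m: np.exp(-0.5 * d) for m, d in delta.items()}
total = sum(weights.values())
for m in aics:
    print(f"{m}: AIC = {aics[m]:.1f}, dAIC = {delta[m]:.1f}, "
          f"Akaike weight = {weights[m] / total:.2f}")
```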

  9. System Support for Forensic Inference

    Science.gov (United States)

    Gehani, Ashish; Kirchner, Florent; Shankar, Natarajan

    Digital evidence is playing an increasingly important role in prosecuting crimes. The reasons are manifold: financially lucrative targets are now connected online, systems are so complex that vulnerabilities abound and strong digital identities are being adopted, making audit trails more useful. If the discoveries of forensic analysts are to hold up to scrutiny in court, they must meet the standard for scientific evidence. Software systems are currently developed without consideration of this fact. This paper argues for the development of a formal framework for constructing “digital artifacts” that can serve as proxies for physical evidence; a system so imbued would facilitate sound digital forensic inference. A case study involving a filesystem augmentation that provides transparent support for forensic inference is described.

  10. Bayesian inference of the number of factors in gene-expression analysis: application to human virus challenge studies

    Directory of Open Access Journals (Sweden)

    Hero Alfred

    2010-11-01

    Full Text Available Abstract Background Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. Results Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of performing nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. Conclusions Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.

  11. Bayesian inference of the number of factors in gene-expression analysis: application to human virus challenge studies.

    Science.gov (United States)

    Chen, Bo; Chen, Minhua; Paisley, John; Zaas, Aimee; Woods, Christopher; Ginsburg, Geoffrey S; Hero, Alfred; Lucas, Joseph; Dunson, David; Carin, Lawrence

    2010-11-09

    Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of performing nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
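
    The two records above describe nonparametric Bayesian machinery (Beta Process and Indian Buffet Process priors with Gibbs or variational inference) for choosing the number of factors. As a much simpler, non-Bayesian point of reference, the number of factors can also be scored by held-out log-likelihood with scikit-learn's FactorAnalysis; the sketch below does that on synthetic data and is in no way the papers' method.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)

# Synthetic "gene expression": 200 samples generated from 4 latent factors.
n_samples, n_genes, k_true = 200, 100, 4
loadings = rng.normal(size=(k_true, n_genes))
scores = rng.normal(size=(n_samples, k_true))
X = scores @ loadings + 0.5 * rng.normal(size=(n_samples, n_genes))

# Score candidate numbers of factors by cross-validated average log-likelihood.
for k in (1, 2, 4, 6, 8):
    fa = FactorAnalysis(n_components=k, random_state=0)
    ll = cross_val_score(fa, X, cv=5).mean()     # uses FactorAnalysis.score
    print(f"k = {k}: held-out log-likelihood = {ll:.1f}")
```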

  12. Real-World Outcomes and Critical Thinking: Differential Analysis by Academic Major and Gender

    Directory of Open Access Journals (Sweden)

    Amanda Franco

    2015-08-01

    Full Text Available The Real-World Outcomes is an inventory that measures everyday problematic behaviors that represent decisions where critical thinking is presumably absent; assessing the negative outcomes of poor daily decisions helps to infer the degree of critical thinking that mediates everyday reasoning. In the present paper, we describe the process of translation and cultural adaptation of this inventory to Portuguese. We present evidence of its administration to 259 college students concerning reliability, and differences based on academic major and gender. No statistically significant differences were found, either due to academic major or gender. Results suggest the value of this instrument to assessing daily decision making and life outcomes, and also, to estimate the quality of critical thinking in everyday life.

  13. The Dual Role of Vegetation as a Constraint on Mass and Energy Flux into the Critical Zone and as an Emergent Property of Geophysical Critical Zone Structure

    Science.gov (United States)

    Brooks, P. D.; Swetnam, T. L.; Barnard, H. R.; Singha, K.; Harpold, A.; Litvak, M. E.

    2017-12-01

    Spatial patterns in vegetation have long been used to scale land surface-atmosphere exchanges of water and carbon as well as to infer subsurface structure. These pursuits typically proceed in isolation, and rarely do inferences gained from one community propagate to related efforts in another. Perhaps more importantly, vegetation often is treated as an emergent property of landscape-climate interactions rather than an active modifier of both critical zone structure and energy fluxes. We posit that vegetation structure and activity are underutilized as a tool towards understanding landscape evolution and present examples that begin to disentangle the role of vegetation as both an emergent property and an active control on critical zone structure and function. As climate change, population growth, and land use changes threaten water resources worldwide, the need for the new insights vegetation can provide becomes not just a basic science priority, but a pressing applied science question with clear societal importance. This presentation will provide an overview of recent efforts to address the dual role of vegetation in both modifying and reflecting critical zone structure in western North American forests. For example, interactions between topography and stand-scale vegetation structure influence both solar radiation and turbulence, altering landscape-scale partitioning of evaporation vs. transpiration with major impacts on surface water supply. Similarly, interactions between topographic shading, lateral redistribution of plant-available water, and subsurface storage create a mosaic of drought resistance and resilience across complex terrain. These complex interactions between geophysical and vegetation components of critical zone structure result in predictable patterns in catchment-scale hydrologic partitioning within individual watersheds while simultaneously suggesting testable hypotheses for why catchments under similar climate regimes respond so

  14. Bayesian Inference for Functional Dynamics Exploring in fMRI Data

    Directory of Open Access Journals (Sweden)

    Xuan Guo

    2016-01-01

    Full Text Available This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
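
    The change-point detection these models share can be illustrated with a deliberately stripped-down example: a Bayesian posterior over the location of a single mean shift in a synthetic signal, with the segment means integrated out analytically and the noise level assumed known. This sketch is not BMCPM, BCCPM, or DBVPM; it only shows the kind of posterior over temporal boundaries that those models compute on real fMRI data.

```python
import numpy as np

rng = np.random.default_rng(10)

# Synthetic "fMRI-like" signal: the mean shifts from 0 to 1 at time 120.
T, sigma, tau_true = 200, 1.0, 120
y = rng.normal(0.0, sigma, T)
y[tau_true:] += 1.0

def log_marginal_given_tau(y, tau, sigma):
    """Log evidence of a single mean shift at tau, with flat priors on the two
    segment means integrated out analytically (Gaussian conjugacy); terms that
    are constant across tau are dropped."""
    out = 0.0
    for seg in (y[:tau], y[tau:]):
        n = seg.size
        out += -0.5 * (np.sum(seg**2) - seg.sum()**2 / n) / sigma**2 - 0.5 * np.log(n)
    return out

taus = np.arange(5, T - 5)                     # candidate change points
log_post = np.array([log_marginal_given_tau(y, t, sigma) for t in taus])
post = np.exp(log_post - log_post.max())
post /= post.sum()                              # normalized posterior over tau

print(f"posterior mode for the change point: t = {taus[post.argmax()]} (true {tau_true})")
```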

  15. Assessing the critical thinking skills of faculty: What do the findings mean for nursing education?

    Science.gov (United States)

    Zygmont, Dolores M; Schaefer, Karen Moore

    2006-01-01

    The purpose of this study was twofold: to determine the critical thinking skills of nurse faculty and to examine the relationship between epistemological position and critical thinking. Most participants reported having no education on critical thinking. Data were collected using the California Critical Thinking Skills Test (CCTST) and the Learning Environment Preferences (LEP). Findings from the CCTST indicated that faculty varied considerably in their ability to think critically; LEP findings suggested that participants had not reached the intellectual level needed for critical thinking. In addition, 12 faculty participated in one-hour telephone interviews in which they described experiences in which students demonstrated critical thinking. Despite a lack of clarity on the definition of critical thinking, faculty described clinical examples where students engaged in analysis, inference, and evaluation. Based on these findings, it is recommended that faculty transfer their ability to engage students in critical thinking in the clinical setting to the classroom setting. Benchmarks can be established based on the ability of faculty to engage in critical thinking.

  16. Inferring Groups of Objects, Preferred Routes, and Facility Locations from Trajectories

    DEFF Research Database (Denmark)

    Ceikute, Vaida

    (i) infer groups of objects traveling together, (ii) determine routes preferred by local drivers, and (iii) identify attractive facility locations. First, we present a framework that efficiently supports online discovery of groups of moving objects that travel together. We adopt a sampling-independent approach that makes no assumptions about when object positions are sampled and that supports the use of approximate trajectories. The framework’s algorithms exploit density-based clustering to identify groups. Such identified groups are scored based on cardinality and duration. With the use of domination and similarity notions, groups of low interest are pruned, and a variety of different, interesting groups are returned. Results from empirical studies with real and synthetic data offer insight into the effectiveness and efficiency of the proposed framework. Next, we view GPS trajectories as trips that represent...

  17. The Role of Working Memory in the Probabilistic Inference of Future Sensory Events.

    Science.gov (United States)

    Cashdollar, Nathan; Ruhnau, Philipp; Weisz, Nathan; Hasson, Uri

    2017-05-01

    The ability to represent the emerging regularity of sensory information from the external environment has been thought to allow one to probabilistically infer future sensory occurrences and thus optimize behavior. However, the underlying neural implementation of this process is still not comprehensively understood. Through a convergence of behavioral and neurophysiological evidence, we establish that the probabilistic inference of future events is critically linked to people's ability to maintain the recent past in working memory. Magnetoencephalography recordings demonstrated that when visual stimuli occurring over an extended time series had a greater statistical regularity, individuals with higher working-memory capacity (WMC) displayed enhanced slow-wave neural oscillations in the θ frequency band (4-8 Hz.) prior to, but not during stimulus appearance. This prestimulus neural activity was specifically linked to contexts where information could be anticipated and influenced the preferential sensory processing for this visual information after its appearance. A separate behavioral study demonstrated that this process intrinsically emerges during continuous perception and underpins a realistic advantage for efficient behavioral responses. In this way, WMC optimizes the anticipation of higher level semantic concepts expected to occur in the near future. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  18. Large sample NAA of a pottery replica utilizing thermal neutron flux at AHWR critical facility and X-Z rotary scanning unit

    International Nuclear Information System (INIS)

    Acharya, R.; Dasari, K.B.; Pujari, P.K.; Swain, K.K.; Shinde, A.D.; Reddy, A.V.R.

    2013-01-01

    Large sample neutron activation analysis (LSNAA) of a clay pottery replica from Peru was carried out using a low neutron flux graphite reflector position of the Advanced Heavy Water Reactor (AHWR) critical facility. This work was taken up as a part of an inter-comparison exercise under the IAEA CRP on LSNAA of archaeological objects. The irradiated large-size sample, placed on an X-Z rotary scanning unit, was assayed using a 40% relative efficiency HPGe detector. The k0-based internal monostandard NAA (IM-NAA) in conjunction with in-situ relative detection efficiency was used to calculate concentration ratios of 12 elements with respect to Na. Analyses of both small and large size samples were carried out to check homogeneity and to arrive at absolute concentrations. (author)

  19. Analysis of students critical thinking skills in socio-scientific issues of biodiversity subject

    Science.gov (United States)

    Santika, A. R.; Purwianingsih, W.; Nuraeni, E.

    2018-05-01

    Critical thinking is a skill which students should have in order to face 21st century demands. Critical thinking skills can help people in facing their daily problems, especially problems which relate to science. This research aimed to analyze students' critical thinking skills in socio-scientific issues of the biodiversity subject. The method used in this research was the descriptive method. The research subjects were first-grade students in senior high school. The data were collected by interview and open-ended questions, which were classified based on a framework: (1) question at issue, (2) information, (3) purpose, (4) concepts, (5) assumptions, (6) point of view, (7) interpretation and inference, and (8) implication and consequences, and then assessed by using rubrics. The results showed that students' critical thinking skills in socio-scientific issues of the biodiversity subject are in the low and medium categories. Therefore, a learning activity is needed that is able to develop students' critical thinking skills, especially regarding socio-scientific issues.

  20. Working memory supports inference learning just like classification learning.

    Science.gov (United States)

    Craig, Stewart; Lewandowsky, Stephan

    2013-08-01

    Recent research has found a positive relationship between people's working memory capacity (WMC) and their speed of category learning. To date, only classification-learning tasks have been considered, in which people learn to assign category labels to objects. It is unknown whether learning to make inferences about category features might also be related to WMC. We report data from a study in which 119 participants undertook classification learning and inference learning, and completed a series of WMC tasks. Working memory capacity was positively related to people's classification and inference learning performance.

  1. Use of Multi-Response Format Test in the Assessment of Medical Students’ Critical Thinking Ability

    Science.gov (United States)

    Mafinejad, Mahboobeh Khabaz; Monajemi, Alireza; Jalili, Mohammad; Soltani, Akbar; Rasouli, Javad

    2017-01-01

    Introduction To evaluate students' critical thinking skills effectively, a change in assessment practices is a must. The assessment of a student's ability to think critically is a constant challenge, and yet there is considerable debate on the best assessment method. There is evidence that open- and closed-ended response questions intrinsically measure separate cognitive abilities. Aim To assess the critical thinking ability of medical students by using a multi-response format of assessment. Materials and Methods A cross-sectional study was conducted on a group of 159 undergraduate third-year medical students. All the participants completed the California Critical Thinking Skills Test (CCTST), consisting of 34 multiple-choice questions to measure general critical thinking skills, and a researcher-developed test that combines open and closed-ended questions. A researcher-developed 48-question exam, consisting of 8 short-answer and 5 essay questions, 19 Multiple-Choice Questions (MCQ), and 16 True-False (TF) questions, was used to measure critical thinking skills. Correlation analyses were performed using Pearson's coefficient to explore the association between the total scores of tests and subtests. Results One hundred and fifty-nine students participated in this study. The sample comprised 81 females (51%) and 78 males (49%) with an age range of 20±2.8 years (mean 21.2 years). The response rate was 64.1%. A significant positive correlation was found between types of questions and critical thinking scores, of which the correlations of MCQ (r=0.82) and essay questions (r=0.77) were strongest. Significant positive correlations between the multi-response format test and the CCTST's subscales were seen for analysis, evaluation, inference and inductive reasoning. Unlike the CCTST subscales, the multi-response format test had a weak correlation with the CCTST total score (r=0.45, p=0.06). Conclusion This study highlights the importance of considering multi-response format test in

  2. Use of Multi-Response Format Test in the Assessment of Medical Students' Critical Thinking Ability.

    Science.gov (United States)

    Mafinejad, Mahboobeh Khabaz; Arabshahi, Seyyed Kamran Soltani; Monajemi, Alireza; Jalili, Mohammad; Soltani, Akbar; Rasouli, Javad

    2017-09-01

    To evaluate students' critical thinking skills effectively, a change in assessment practices is a must. The assessment of a student's ability to think critically is a constant challenge, and yet there is considerable debate on the best assessment method. There is evidence that open- and closed-ended response questions intrinsically measure separate cognitive abilities. To assess the critical thinking ability of medical students by using a multi-response format of assessment. A cross-sectional study was conducted on a group of 159 undergraduate third-year medical students. All the participants completed the California Critical Thinking Skills Test (CCTST), consisting of 34 multiple-choice questions to measure general critical thinking skills, and a researcher-developed test that combines open and closed-ended questions. A researcher-developed 48-question exam, consisting of 8 short-answer and 5 essay questions, 19 Multiple-Choice Questions (MCQ), and 16 True-False (TF) questions, was used to measure critical thinking skills. Correlation analyses were performed using Pearson's coefficient to explore the association between the total scores of tests and subtests. One hundred and fifty-nine students participated in this study. The sample comprised 81 females (51%) and 78 males (49%) with an age range of 20±2.8 years (mean 21.2 years). The response rate was 64.1%. A significant positive correlation was found between types of questions and critical thinking scores, of which the correlations of MCQ (r=0.82) and essay questions (r=0.77) were strongest. Significant positive correlations between the multi-response format test and the CCTST's subscales were seen for analysis, evaluation, inference and inductive reasoning. Unlike the CCTST subscales, the multi-response format test had a weak correlation with the CCTST total score (r=0.45, p=0.06). This study highlights the importance of considering multi-response format test in the assessment of critical thinking abilities of medical
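
    The correlation analysis reported here (Pearson's coefficient between question-format scores and the CCTST) is straightforward to reproduce. The score arrays below are made-up placeholders, not the study's data.

        import numpy as np
        from scipy.stats import pearsonr

        # Hypothetical per-student scores; the real study had 159 students.
        mcq_scores   = np.array([14, 11, 16,  9, 13, 15, 10, 12])
        essay_scores = np.array([ 4,  3,  5,  2,  4,  4,  3,  3])
        cctst_total  = np.array([21, 17, 25, 14, 20, 23, 16, 18])

        # Correlate each question format with the overall critical thinking score.
        for name, scores in [("MCQ", mcq_scores), ("essay", essay_scores)]:
            r, p = pearsonr(scores, cctst_total)
            print(f"{name}: r = {r:.2f}, p = {p:.3f}")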

  3. Statistical inference for stochastic processes

    National Research Council Canada - National Science Library

    Basawa, Ishwar V; Prakasa Rao, B. L. S

    1980-01-01

    The aim of this monograph is to attempt to reduce the gap between theory and applications in the area of stochastic modelling, by directing the interest of future researchers to the inference aspects...

  4. Method for critical current testing

    International Nuclear Information System (INIS)

    Siddall, M.B.; Smathers, D.B.

    1989-01-01

    Superconducting critical current testing software was developed with four important features not feasible with analog test equipment. First, quasi-steady-state sample current conditions are achieved by incrementing the sample current and then holding for some milliseconds until the transient voltage decays before voltage sampling. Then the self-field correction for a helically wound sample is computed and subtracted from each sampled field reading. A copper-wire, inductively wound shunt used for quench protection has a constant measured resistance, from which the shunt leakage current is computed and subtracted from the sample current by measuring the shunt voltage after each sample current reading. Finally, the critical current is recomputed from a least-squares curve fit to the power law E = A·I^n when the correlation coefficient for the fit is high enough to ensure a better result than the raw datum. Comparison with NBS Standard Reference Material (NbTi) and current round-robin Nb3Sn testing is examined
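
    The final step the abstract describes, a least-squares fit of the measured V-I curve to the power law E = A·I^n followed by evaluation of the critical current at an electric-field criterion, can be sketched as follows. The criterion value and the sample readings are assumptions for illustration; the original software also checks the correlation coefficient before preferring the fit over the raw datum.

        import numpy as np

        def fit_critical_current(current, voltage, e_criterion=1e-6):
            """Fit E = A * I**n in log-log space; return (Ic at criterion, n).

            current     : sample current readings [A]
            voltage     : electric-field readings [V/cm], baseline removed
            e_criterion : field criterion defining Ic (here 1 uV/cm, assumed)
            """
            mask = voltage > 0                     # log fit needs positive E values
            logI, logE = np.log(current[mask]), np.log(voltage[mask])
            n, logA = np.polyfit(logI, logE, 1)    # log E = n*log I + log A
            ic = np.exp((np.log(e_criterion) - logA) / n)
            return ic, n

        # Hypothetical readings near the transition of a superconducting sample.
        I = np.array([80.0, 85.0, 90.0, 95.0, 100.0, 105.0])
        E = np.array([1e-8, 4e-8, 1.5e-7, 6e-7, 2.2e-6, 8e-6])
        print(fit_critical_current(I, E))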

  5. Inference of Large Phylogenies Using Neighbour-Joining

    DEFF Research Database (Denmark)

    Simonsen, Martin; Mailund, Thomas; Pedersen, Christian Nørgaard Storm

    2011-01-01

    The neighbour-joining method is a widely used method for phylogenetic reconstruction which scales to thousands of taxa. However, advances in sequencing technology have made data sets with more than 10,000 related taxa widely available. Inference of such large phylogenies takes hours or days using the Neighbour-Joining method on a normal desktop computer because of the O(n^3) running time. RapidNJ is a search heuristic which reduces the running time of the Neighbour-Joining method significantly, but at the cost of an increased memory consumption that makes inference of large phylogenies infeasible. We present two extensions for RapidNJ which reduce the memory requirements and allow phylogenies with more than 50,000 taxa to be inferred efficiently on a desktop computer. Furthermore, an improved version of the search heuristic is presented which reduces the running time of RapidNJ on many data...
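
    For context, the O(n^3) cost mentioned above comes from repeatedly searching the Q matrix of the neighbour-joining method for the pair of taxa to join next; RapidNJ's heuristic accelerates exactly that search. A minimal, unoptimised sketch of plain neighbour joining (topology only, no branch lengths), with a toy distance matrix as input:

        import numpy as np

        def neighbour_joining(D, names):
            """Plain O(n^3) neighbour joining on a distance matrix D (topology only)."""
            D = np.asarray(D, dtype=float)
            nodes = list(names)
            while len(nodes) > 2:
                n = len(nodes)
                r = D.sum(axis=1)
                # Q[i, j] = (n - 2) * D[i, j] - r[i] - r[j]; join the minimising pair.
                Q = (n - 2) * D - r[:, None] - r[None, :]
                np.fill_diagonal(Q, np.inf)
                i, j = np.unravel_index(np.argmin(Q), Q.shape)
                d_new = 0.5 * (D[i] + D[j] - D[i, j])   # distances to the new node
                keep = [k for k in range(n) if k not in (i, j)]
                new_D = np.zeros((len(keep) + 1, len(keep) + 1))
                new_D[:-1, :-1] = D[np.ix_(keep, keep)]
                new_D[:-1, -1] = new_D[-1, :-1] = d_new[keep]
                D = new_D
                nodes = [nodes[k] for k in keep] + [(nodes[i], nodes[j])]
            return (nodes[0], nodes[1])

        print(neighbour_joining([[0, 5, 9, 9], [5, 0, 10, 10],
                                 [9, 10, 0, 8], [9, 10, 8, 0]],
                                ["A", "B", "C", "D"]))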

  6. Statistical causal inferences and their applications in public health research

    CERN Document Server

    Wu, Pan; Chen, Ding-Geng

    2016-01-01

    This book compiles and presents new developments in statistical causal inference. The accompanying data and computer programs are publicly available so readers may replicate the model development and data analysis presented in each chapter. In this way, methodology is taught so that readers may implement it directly. The book brings together experts engaged in causal inference research to present and discuss recent issues in causal inference methodological development. This is also a timely look at causal inference applied to scenarios that range from clinical trials to mediation and public health research more broadly. In an academic setting, this book will serve as a reference and guide to a course in causal inference at the graduate level (Master's or Doctorate). It is particularly relevant for students pursuing degrees in Statistics, Biostatistics and Computational Biology. Researchers and data analysts in public health and biomedical research will also find this book to be an important reference.

  7. Sample Size of One: Operational Qualitative Analysis in the Classroom

    Directory of Open Access Journals (Sweden)

    John Hoven

    2015-10-01

    Full Text Available Qualitative analysis has two extraordinary capabilities: first, finding answers to questions we are too clueless to ask; and second, causal inference – hypothesis testing and assessment – within a single unique context (sample size of one). These capabilities are broadly useful, and they are critically important in village-level civil-military operations. Company commanders need to learn quickly, "What are the problems and possibilities here and now, in this specific village? What happens if we do A, B, and C?" – and that is an ill-defined, one-of-a-kind problem. The U.S. Army's Eighty-Third Civil Affairs Battalion is our "first user" innovation partner in a new project to adapt qualitative research methods to an operational tempo and purpose. Our aim is to develop a simple, low-cost methodology and training program for local civil-military operations conducted by non-specialist conventional forces. Complementary to that, this paper focuses on some essential basics that can be implemented by college professors without significant cost, effort, or disruption.

  8. The anatomy of choice: active inference and agency

    Directory of Open Access Journals (Sweden)

    Karl eFriston

    2013-09-01

    Full Text Available This paper considers agency in the setting of embodied or active inference. In brief, we associate a sense of agency with prior beliefs about action and ask what sorts of beliefs underlie optimal behaviour. In particular, we consider prior beliefs that action minimises the Kullback-Leibler divergence between desired states and attainable states in the future. This allows one to formulate bounded rationality as approximate Bayesian inference that optimises a free energy bound on model evidence. We show that constructs like expected utility, exploration bonuses, softmax choice rules and optimism bias emerge as natural consequences of this formulation. Previous accounts of active inference have focused on predictive coding and Bayesian filtering schemes for minimising free energy. Here, we consider variational Bayes as an alternative scheme that provides formal constraints on the computational anatomy of inference and action – constraints that are remarkably consistent with neuroanatomy. Furthermore, this scheme contextualises optimal decision theory and economic (utilitarian) formulations as pure inference problems. For example, expected utility theory emerges as a special case of free energy minimisation, where the sensitivity or inverse temperature (of softmax functions and quantal response equilibria) has a unique and Bayes-optimal solution – that minimises free energy. This sensitivity corresponds to the precision of beliefs about behaviour, such that attainable goals are afforded a higher precision or confidence. In turn, this means that optimal behaviour entails a representation of confidence about outcomes that are under an agent's control.

  9. The anatomy of choice: active inference and agency.

    Science.gov (United States)

    Friston, Karl; Schwartenbeck, Philipp; Fitzgerald, Thomas; Moutoussis, Michael; Behrens, Timothy; Dolan, Raymond J

    2013-01-01

    This paper considers agency in the setting of embodied or active inference. In brief, we associate a sense of agency with prior beliefs about action and ask what sorts of beliefs underlie optimal behavior. In particular, we consider prior beliefs that action minimizes the Kullback-Leibler (KL) divergence between desired states and attainable states in the future. This allows one to formulate bounded rationality as approximate Bayesian inference that optimizes a free energy bound on model evidence. We show that constructs like expected utility, exploration bonuses, softmax choice rules and optimism bias emerge as natural consequences of this formulation. Previous accounts of active inference have focused on predictive coding and Bayesian filtering schemes for minimizing free energy. Here, we consider variational Bayes as an alternative scheme that provides formal constraints on the computational anatomy of inference and action-constraints that are remarkably consistent with neuroanatomy. Furthermore, this scheme contextualizes optimal decision theory and economic (utilitarian) formulations as pure inference problems. For example, expected utility theory emerges as a special case of free energy minimization, where the sensitivity or inverse temperature (of softmax functions and quantal response equilibria) has a unique and Bayes-optimal solution-that minimizes free energy. This sensitivity corresponds to the precision of beliefs about behavior, such that attainable goals are afforded a higher precision or confidence. In turn, this means that optimal behavior entails a representation of confidence about outcomes that are under an agent's control.
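
    The softmax choice rule and its precision (inverse temperature) parameter, which the authors tie to free-energy minimization, are easy to illustrate. The utilities and precision values below are arbitrary placeholders, not quantities from the paper.

        import numpy as np

        def softmax_choice(utilities, precision):
            """P(action) proportional to exp(precision * expected utility)."""
            u = precision * np.asarray(utilities, dtype=float)
            p = np.exp(u - u.max())            # subtract max for numerical stability
            return p / p.sum()

        utilities = [1.0, 0.5, -0.2]           # hypothetical expected utilities of 3 actions
        for precision in (0.5, 2.0, 8.0):      # higher precision -> more deterministic choice
            print(precision, softmax_choice(utilities, precision).round(3))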

  10. Universal Darwinism As a Process of Bayesian Inference.

    Science.gov (United States)

    Campbell, John O

    2016-01-01

    Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
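
    The Bayesian update at the heart of this argument is compact enough to state directly. The variant frequencies and survival probabilities below are made-up numbers used only to show the mechanics of treating selection as Bayesian updating.

        import numpy as np

        def bayes_update(prior, likelihood):
            """Posterior proportional to prior * likelihood (Bayes' theorem)."""
            posterior = np.asarray(prior, dtype=float) * np.asarray(likelihood, dtype=float)
            return posterior / posterior.sum()

        prior      = [0.5, 0.3, 0.2]    # current frequencies of three variants
        likelihood = [0.2, 0.6, 0.9]    # probability each variant survives the "experiment"
        print(bayes_update(prior, likelihood))   # frequencies after one round of selection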

  11. Inference of Time-Evolving Coupled Dynamical Systems in the Presence of Noise

    Science.gov (United States)

    Stankovski, Tomislav; Duggento, Andrea; McClintock, Peter V. E.; Stefanovska, Aneta

    2012-07-01

    A new method is introduced for analysis of interactions between time-dependent coupled oscillators, based on the signals they generate. It distinguishes unsynchronized dynamics from noise-induced phase slips and enables the evolution of the coupling functions and other parameters to be followed. It is based on phase dynamics, with Bayesian inference of the time-evolving parameters achieved by shaping the prior densities to incorporate knowledge of previous samples. The method is tested numerically and applied to reveal and quantify the time-varying nature of cardiorespiratory interactions.

  12. On principles of inductive inference

    OpenAIRE

    Kostecki, Ryszard Paweł

    2011-01-01

    We propose an intersubjective epistemic approach to foundations of probability theory and statistical inference, based on relative entropy and category theory, and aimed to bypass the mathematical and conceptual problems of existing foundational approaches.

  13. Estimating the spatial distribution of a plant disease epidemic from a sample

    Science.gov (United States)

    Sampling is of central importance in plant pathology. It facilitates our understanding of how epidemics develop in space and time and can also be used to inform disease management decisions. Making inferences from a sample is necessary because we rarely have the resources to conduct a complete censu...

  14. Model averaging, optimal inference and habit formation

    Directory of Open Access Journals (Sweden)

    Thomas H B FitzGerald

    2014-06-01

    Full Text Available Postulating that the brain performs approximate Bayesian inference generates principled and empirically testable models of neuronal function – the subject of much current interest in neuroscience and related disciplines. Current formulations address inference and learning under some assumed and particular model. In reality, organisms are often faced with an additional challenge – that of determining which model or models of their environment are the best for guiding behaviour. Bayesian model averaging – which says that an agent should weight the predictions of different models according to their evidence – provides a principled way to solve this problem. Importantly, because model evidence is determined by both the accuracy and complexity of the model, optimal inference requires that these be traded off against one another. This means an agent's behaviour should show an equivalent balance. We hypothesise that Bayesian model averaging plays an important role in cognition, given that it is both optimal and realisable within a plausible neuronal architecture. We outline model averaging and how it might be implemented, and then explore a number of implications for brain and behaviour. In particular, we propose that model averaging can explain a number of apparently suboptimal phenomena within the framework of approximate (bounded) Bayesian inference, focussing particularly upon the relationship between goal-directed and habitual behaviour.
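
    Bayesian model averaging itself reduces to weighting each model's prediction by its posterior probability, which is computed from the model evidence. A minimal sketch; the predictions and log-evidence values below are placeholders.

        import numpy as np

        def model_average(predictions, log_evidence):
            """Weight each model's prediction by its posterior probability,
            computed from (log) model evidence."""
            le = np.asarray(log_evidence, dtype=float)
            w = np.exp(le - le.max())            # stabilised, unnormalised weights
            w /= w.sum()
            return w @ np.asarray(predictions, dtype=float), w

        predictions  = [1.0, 0.4, 0.7]       # three models' predictions of some quantity
        log_evidence = [-3.2, -4.0, -3.5]    # accuracy minus complexity, per model
        avg, weights = model_average(predictions, log_evidence)
        print(avg, weights.round(3))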

  15. Using Spectroscopy to Infer the Eruption Style and Volatile History of Volcanic Tephras

    Science.gov (United States)

    McBride, M. J.; Horgan, B. H. N.; Rowe, M. C.; Wall, K. T.; Oxley, B. M.

    2017-12-01

    The interaction between volatiles and magma strongly influences volcanic eruption styles, and results in an increase in the glass component of volcanic tephra. On Earth, both phreatomagmatic and magmatic explosive eruptions create glassy tephras. Phreatomagmatic eruptions form abundant glass by quickly quenching lava through interaction with meteoric water while magmatic eruptions create less glass through slower cooling within larger pyroclasts or eruption columns. Wall et al. (2014) used X-ray diffraction (XRD) of diverse tephra samples to show that glass content correlates with eruption style, as magmatic samples contain less glass than phreatomagmatic samples. While use of XRD is limited to Earth and the Curiosity rover on Mars, orbital spectroscopy is a much more common technique in the exploration of terrestrial bodies. In this study, we evaluate whether or not spectroscopy can be used to infer eruption style and thus volatile history. Visible/near-infrared (VNIR) and thermal-infrared (TIR) spectra were collected of the Wall et al. (2014) tephra samples, and were analyzed for trends related to glass content and thus eruption style. VNIR spectra can detect glass at high abundances as well as hydrothermal alteration minerals produced during interactions with meteoric water. Using TIR, glass abundances can be derived by deconvolving the spectra with a standard spectral library; however, due to the non-unique spectral shape of glass, intermediate to high glass abundances in tephras are difficult to differentiate using TIR alone. Synthetic mixtures of glass and crystalline minerals verify these results. Therefore, the most effective method for determining glass abundance and thus eruption style from volcanic deposits is a combination of VNIR and TIR spectral analysis. Using standard planetary remote sensing instrumentation to infer eruption styles will provide a new window into the volcanic and volatile histories of terrestrial bodies.
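
    The TIR deconvolution step described here amounts to modelling the measured spectrum as a non-negative linear mixture of library end-member spectra. A minimal sketch using non-negative least squares; the four-band "library" and the mixture are synthetic placeholders, not real VNIR/TIR data.

        import numpy as np
        from scipy.optimize import nnls

        def unmix(spectrum, library):
            """Estimate non-negative end-member abundances for one spectrum.
            library: (n_endmembers, n_bands); spectrum: (n_bands,)."""
            coeffs, _ = nnls(library.T, spectrum)
            return coeffs / coeffs.sum()

        # Synthetic 4-band end members standing in for glass and crystalline phases.
        library = np.array([[0.95, 0.90, 0.85, 0.92],
                            [0.80, 0.96, 0.88, 0.90],
                            [0.85, 0.87, 0.97, 0.89]])
        measured = 0.6 * library[0] + 0.3 * library[1] + 0.1 * library[2]
        print(unmix(measured, library).round(2))   # recovers ~[0.6, 0.3, 0.1]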

  16. Bootstrapping phylogenies inferred from rearrangement data

    Directory of Open Access Journals (Sweden)

    Lin Yu

    2012-08-01

    Full Text Available Abstract Background Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. Results We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Conclusions Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its

  17. Bootstrapping phylogenies inferred from rearrangement data.

    Science.gov (United States)

    Lin, Yu; Rajan, Vaibhav; Moret, Bernard Me

    2012-08-29

    Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its support values follow a similar scale and its receiver
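
    The classic sequence-based bootstrap that the authors use as their baseline is simple to sketch: resample alignment columns with replacement, rebuild the tree, and count how often each clade recurs. The build_tree callable below is a placeholder for any reconstruction method; for rearrangement data this column-resampling step is exactly what breaks down, since the whole genome is a single character.

        import numpy as np

        def bootstrap_support(alignment, build_tree, n_replicates=100, seed=0):
            """Classic phylogenetic bootstrap for an (n_taxa, n_sites) alignment.
            build_tree must return the set of clades (bipartitions) of its tree."""
            rng = np.random.default_rng(seed)
            aln = np.asarray(alignment)
            n_sites = aln.shape[1]
            counts = {}
            for _ in range(n_replicates):
                cols = rng.integers(0, n_sites, n_sites)   # resample columns
                for clade in build_tree(aln[:, cols]):
                    counts[clade] = counts.get(clade, 0) + 1
            return {c: k / n_replicates for c, k in counts.items()}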

  18. Classification versus inference learning contrasted with real-world categories.

    Science.gov (United States)

    Jones, Erin L; Ross, Brian H

    2011-07-01

    Categories are learned and used in a variety of ways, but the research focus has been on classification learning. Recent work contrasting classification with inference learning of categories found important later differences in category performance. However, theoretical accounts differ on whether this is due to an inherent difference between the tasks or to the implementation decisions. The inherent-difference explanation argues that inference learners focus on the internal structure of the categories--what each category is like--while classification learners focus on diagnostic information to predict category membership. In two experiments, using real-world categories and controlling for earlier methodological differences, inference learners learned more about what each category was like than did classification learners, as evidenced by higher performance on a novel classification test. These results suggest that there is an inherent difference between learning new categories by classifying an item versus inferring a feature.

  19. Statistical inference via fiducial methods

    OpenAIRE

    Salomé, Diemer

    1998-01-01

    In this thesis the attention is restricted to inductive reasoning using a mathematical probability model. A statistical procedure prescribes, for every theoretically possible set of data, the inference about the unknown of interest. ... See: Summary

  20. Information-Theoretic Inference of Large Transcriptional Regulatory Networks

    Directory of Open Access Journals (Sweden)

    Meyer Patrick

    2007-01-01

    Full Text Available The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR), an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting, among the least redundant variables, the ones that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousands of genes) network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
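
    The MRMR selection step that MRNET applies can be sketched as a greedy loop over discretised expression profiles. The data layout and the use of sklearn's mutual_info_score are assumptions for illustration; MRNET itself repeats this selection with every gene in turn as the target in order to assemble the network.

        import numpy as np
        from sklearn.metrics import mutual_info_score

        def mrmr(data, target, n_select):
            """Greedy MRMR: pick variables maximising relevance to the target
            minus mean redundancy with variables already selected.
            data: {gene: 1-D array of discretised expression}; target: 1-D array."""
            relevance = {g: mutual_info_score(x, target) for g, x in data.items()}
            selected = []
            while len(selected) < min(n_select, len(data)):
                best, best_score = None, -np.inf
                for g, x in data.items():
                    if g in selected:
                        continue
                    redundancy = (np.mean([mutual_info_score(x, data[s]) for s in selected])
                                  if selected else 0.0)
                    score = relevance[g] - redundancy
                    if score > best_score:
                        best, best_score = g, score
                selected.append(best)
            return selected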