Robust Inference with Multi-way Clustering
A. Colin Cameron; Jonah B. Gelbach; Douglas L. Miller
2009-01-01
In this paper we propose a variance estimator for the OLS estimator as well as for nonlinear estimators such as logit, probit and GMM. This variance estimator enables cluster-robust inference when there is two-way or multi-way clustering that is non-nested. The variance estimator extends the standard cluster-robust variance estimator or sandwich estimator for one-way clustering (e.g. Liang and Zeger (1986), Arellano (1987)) and relies on similar relatively weak distributional assumptions. Our...
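To make the idea concrete for the OLS case, the following is a minimal sketch in Python (NumPy assumed). It assumes the two-way estimator takes the inclusion-exclusion form: the one-way cluster-robust "meat" for each clustering dimension, minus the meat for their intersection. The function names `cluster_meat` and `twoway_cluster_vcov` are illustrative only, and no finite-sample correction is applied.

```python
import numpy as np

def cluster_meat(X, resid, groups):
    """Sum over clusters of (X_c' u_c)(X_c' u_c)' for one clustering."""
    scores = X * resid[:, None]                      # per-observation scores
    _, inv = np.unique(groups, return_inverse=True, axis=0)
    inv = inv.ravel()                                # cluster index per row
    sums = np.zeros((inv.max() + 1, X.shape[1]))
    np.add.at(sums, inv, scores)                     # within-cluster score sums
    return sums.T @ sums

def twoway_cluster_vcov(X, y, g1, g2):
    """OLS coefficients and a two-way cluster-robust covariance (sketch)."""
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    both = np.column_stack([g1, g2])                 # intersection clusters
    meat = (cluster_meat(X, resid, g1)
            + cluster_meat(X, resid, g2)
            - cluster_meat(X, resid, both))
    return beta, bread @ meat @ bread
```

When the two clustering variables coincide, the expression collapses to the usual one-way sandwich estimator, which is a convenient sanity check.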
Phylogenetic Inference of HIV Transmission Clusters
Directory of Open Access Journals (Sweden)
Vlad Novitsky
2017-10-01
Better understanding the structure and dynamics of HIV transmission networks is essential for designing the most efficient interventions to prevent new HIV transmissions, and ultimately for gaining control of the HIV epidemic. The inference of phylogenetic relationships and the interpretation of results rely on the definition of the HIV transmission cluster. The definition of the HIV cluster is complex and dependent on multiple factors, including the design of sampling, accuracy of sequencing, precision of sequence alignment, evolutionary models, the phylogenetic method of inference, and specified thresholds for cluster support. While the majority of studies focus on clusters, non-clustered cases could also be highly informative. A new dimension in the analysis of the global and local HIV epidemics is the concept of phylogenetically distinct HIV sub-epidemics. The identification of active HIV sub-epidemics reveals spreading viral lineages and may help in the design of targeted interventions. HIV clustering can also be affected by sampling density. Obtaining a proper sampling density may increase statistical power and reduce sampling bias, so sampling density should be taken into account in study design and in interpretation of phylogenetic results. Finally, recent advances in long-range genotyping may enable more accurate inference of HIV transmission networks. If performed in real time, it could both inform public-health strategies and be clinically relevant (e.g., drug-resistance testing).
Likelihood-based inference for clustered line transect data
DEFF Research Database (Denmark)
Waagepetersen, Rasmus; Schweder, Tore
2006-01-01
The uncertainty in estimation of spatial animal density from line transect surveys depends on the degree of spatial clustering in the animal population. To quantify the clustering we model line transect data as independent thinnings of spatial shot-noise Cox processes. Likelihood-based inference...
Inferring hierarchical clustering structures by deterministic annealing
International Nuclear Information System (INIS)
Hofmann, T.; Buhmann, J.M.
1996-01-01
The unsupervised detection of hierarchical structures is a major topic in unsupervised learning and one of the key questions in data analysis and representation. We propose a novel algorithm for the problem of learning decision trees for data clustering and related problems. In contrast to many other methods based on successive tree growing and pruning, we propose an objective function for tree evaluation and we derive a non-greedy technique for tree growing. Applying the principles of maximum entropy and minimum cross entropy, a deterministic annealing algorithm is derived in a mean-field approximation. This technique allows us to canonically superimpose tree structures and to fit parameters to averaged or 'fuzzified' trees.
Bootstrap-Based Improvements for Inference with Clustered Errors
Doug Miller; A. Colin Cameron; Jonah B. Gelbach
2006-01-01
Microeconometrics researchers have increasingly realized the essential need to account for any within-group dependence in estimating standard errors of regression parameter estimates. The typical preferred solution is to calculate cluster-robust or sandwich standard errors that permit quite general heteroskedasticity and within-cluster error correlation, but presume that the number of clusters is large. In applications with few (5-30) clusters, standard asymptotic tests can over-reject consid...
Modulated modularity clustering as an exploratory tool for functional genomic inference.
Directory of Open Access Journals (Sweden)
Eric A Stone
2009-05-01
In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and to reduce scale and improve inference, clustering genes is common. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC), seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity) that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation.
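As an illustration of the kind of objective involved, the sketch below evaluates modularity for a fixed partition of a weighted graph. It assumes the standard Newman-style weighted modularity, Q = (1/2m) Σ_ij (A_ij − k_i k_j / 2m) δ(c_i, c_j); the step in which MMC modulates edge strengths is not reproduced here, and the function name is hypothetical.

```python
import numpy as np

def modularity(A, labels):
    """Weighted modularity of a partition (assumed Newman-style definition)."""
    A = np.asarray(A, dtype=float)        # symmetric weighted adjacency matrix
    labels = np.asarray(labels)
    k = A.sum(axis=1)                     # weighted degree of each vertex
    two_m = k.sum()                       # twice the total edge weight
    same = labels[:, None] == labels[None, :]
    return float(((A - np.outer(k, k) / two_m) * same).sum() / two_m)
```

For two disconnected triangles, assigning each triangle to its own community gives a higher score than placing all six vertices in one community, which is the behaviour a community-structure objective should show.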
DEFF Research Database (Denmark)
Møller, Jesper
2010-01-01
Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with Markov chain Monte Carlo (MCMC) techniques. Due to space limitations the focus is on spatial point processes.
DEFF Research Database (Denmark)
Møller, Jesper
(This text, written by Jesper Møller, Aalborg University, is submitted for the collection 'Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by Clarendon Press, Oxford, and planned to appear as Section 4.1 with the title 'Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...
Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering
Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland
2000-01-01
Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review data mining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.
Field line distribution of density at L=4.8 inferred from observations by CLUSTER
Directory of Open Access Journals (Sweden)
S. Schäfer
2009-02-01
For two events observed by the CLUSTER spacecraft, the field line distribution of mass density ρ was inferred from Alfvén wave harmonic frequencies and compared to the electron density n_e from plasma wave data and the oxygen density n_O+ from the ion composition experiment. In one case, the average ion mass M≈ρ/n_e was about 5 amu (28 October 2002), while in the other it was about 3 amu (10 September 2002). Both events occurred when the CLUSTER 1 (C1) spacecraft was in the plasmatrough. Nevertheless, the electron density n_e was significantly lower for the first event (n_e=8 cm^{-3}) than for the second event (n_e=22 cm^{-3}), and this seems to be the main difference leading to a different value of M. For the first event (28 October 2002), we were able to measure the Alfvén wave frequencies for eight harmonics with unprecedented precision, so that the error in the inferred mass density is probably dominated by factors other than the uncertainty in frequency (e.g., magnetic field model and theoretical wave equation). This field line distribution (at L=4.8) was very flat for magnetic latitude |MLAT|≲20° but very steeply increasing with respect to |MLAT| for |MLAT|≳40°. The total variation in ρ was about four orders of magnitude, with values at large |MLAT| roughly consistent with ionospheric values. For the second event (10 September 2002), there was a small local maximum in mass density near the magnetic equator. The inferred mass density decreases to a minimum 23% lower than the equatorial value at |MLAT|=15.5°, and then steeply increases as one moves along the field line toward the ionosphere. For this event we were also able to examine the spatial dependence of the electron density using measurements of n_e from all four CLUSTER spacecraft. Our analysis indicates that the density varies with L at L~5 roughly like L^{-4}, and that n_e is also locally peaked at the magnetic equator, but with a smaller peak. The value of n_e reaches a density minimum
Approximation Of Multi-Valued Inverse Functions Using Clustering And Sugeno Fuzzy Inference
Walden, Maria A.; Bikdash, Marwan; Homaifar, Abdollah
1998-01-01
Finding the inverse of a continuous function can be challenging and computationally expensive when the inverse function is multi-valued. Difficulties may be compounded when the function itself is difficult to evaluate. We show that we can use fuzzy-logic approximators such as Sugeno inference systems to compute the inverse on-line. To do so, a fuzzy clustering algorithm can be used in conjunction with a discriminating function to split the function data into branches for the different values of the forward function. These data sets are then fed into a recursive least-squares learning algorithm that finds the proper coefficients of the Sugeno approximators; each Sugeno approximator finds one value of the inverse function. Discussions about the accuracy of the approximation will be included.
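The learning step described above can be sketched as follows in Python (NumPy assumed). This is only an illustration for a single branch: it assumes a first-order Sugeno system with Gaussian memberships at fixed, hand-picked centers, and fits the consequent coefficients by recursive least squares. The fuzzy clustering and discriminating-function steps that split the data into branches are not reproduced, and all names and parameter values are hypothetical.

```python
import numpy as np

def regressor(x, centers, sigma):
    """Regressor for a first-order Sugeno system: [w_i * x ..., w_i ...]."""
    w = np.exp(-0.5 * ((x - centers) / sigma) ** 2)   # Gaussian rule firing
    w = w / w.sum()                                   # normalized rule weights
    return np.concatenate([w * x, w])

def fit_rls(xs, ys, centers, sigma, p0=1e6):
    """Recursive least squares over the consequent coefficients."""
    n = 2 * len(centers)
    theta = np.zeros(n)
    P = p0 * np.eye(n)                                # large initial covariance
    for x, y in zip(xs, ys):
        phi = regressor(x, centers, sigma)
        gain = P @ phi / (1.0 + phi @ P @ phi)
        theta = theta + gain * (y - phi @ theta)      # correct by the prediction error
        P = P - np.outer(gain, phi @ P)
    return theta

def predict(x, theta, centers, sigma):
    """Sugeno output: weighted sum of the linear rule consequents."""
    return float(regressor(x, centers, sigma) @ theta)
```

In a multi-valued setting, one such approximator would be trained per branch, each returning one value of the inverse.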
Directory of Open Access Journals (Sweden)
Fajar Ibnu Tufeil
2009-06-01
Fuzzy models are able to describe linguistically a system that is too complex. The rules in a fuzzy model are generally built from human expertise and heuristic knowledge of the system being modeled. This technique was subsequently developed into one that can identify rules from a database that has been grouped according to structural similarity. Here, a fuzzy clustering method serves to find the groups in the data. The information produced by the clustering method, namely the cluster centers, is used to form the rules of the fuzzy inference system. This undergraduate thesis discusses the application of a fuzzy inference system with the fuzzy subtractive clustering method, in order to build a fuzzy inference system using a first-order Takagi-Sugeno fuzzy model. The fuzzy subtractive clustering method is then applied to model a marketing problem, namely predicting market demand for a dairy product. The application was built using Borland Delphi 6.0. Testing yielded a smallest prediction error with an Error Average of 0.08%.
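The cluster-center search can be sketched as follows in Python (NumPy assumed). This is a simplified version of subtractive clustering, not the thesis's Delphi implementation: each point gets a potential from its neighbours, the highest-potential point becomes a center, and potential near that center is subtracted before the search repeats. The radius `ra`, the factor `rb_factor`, and the single stopping threshold `eps` are assumed values, and the usual accept/reject band is omitted.

```python
import numpy as np

def subtractive_clustering(X, ra=0.5, rb_factor=1.5, eps=0.15):
    """Return cluster centers chosen among the data points (simplified sketch)."""
    X = np.asarray(X, dtype=float)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # squared distances
    potential = np.exp(-4.0 * d2 / ra**2).sum(axis=1)        # density-like potential
    rb = rb_factor * ra                                      # suppression radius
    first_peak = potential.max()
    centers = []
    while True:
        c = int(np.argmax(potential))
        peak = potential[c]
        if peak < eps * first_peak:                          # remaining peaks too weak
            break
        centers.append(X[c].copy())
        # Reduce the potential of points close to the new center.
        potential = potential - peak * np.exp(-4.0 * d2[c] / rb**2)
    return np.array(centers)
```

Each center found this way would then seed one Takagi-Sugeno rule, with the consequent parameters estimated separately.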
A stepwise-cluster microbial biomass inference model in food waste composting
International Nuclear Information System (INIS)
Sun Wei; Huang, Guo H.; Zeng Guangming; Qin Xiaosheng; Sun Xueling
2009-01-01
A stepwise-cluster microbial biomass inference (SMI) model was developed through introducing stepwise-cluster analysis (SCA) into composting process modeling to tackle the nonlinear relationships among state variables and microbial activities. The essence of SCA is to form a classification tree based on a series of cutting or merging processes according to given statistical criteria. Eight runs of designed experiments in bench-scale reactors in a laboratory were constructed to demonstrate the feasibility of the proposed method. The results indicated that SMI could help establish a statistical relationship between state variables and composting microbial characteristics, where discrete and nonlinear complexities exist. Significance levels of cutting/merging were provided such that the accuracies of the developed forecasting trees were controllable. Through an attempted definition of input effects on the output in SMI, the effects of the state variables on thermophilic bacteria were ranked in descending order as: Time (day) > moisture content (%) > ash content (%, dry) > Lower Temperature (°C) > pH > NH4+-N (mg/kg, dry) > Total N (%, dry) > Total C (%, dry); the effects on mesophilic bacteria were ordered as: Time > Upper Temperature (°C) > Total N > moisture content > NH4+-N > Total C > pH. This study made the first attempt at applying SCA to mapping the nonlinear and discrete relationships in composting processes.
Pedersen, Casper-Emil T; Frandsen, Peter; Wekesa, Sabenzia N; Heller, Rasmus; Sangula, Abraham K; Wadsworth, Jemma; Knowles, Nick J; Muwanika, Vincent B; Siegismund, Hans R
2015-01-01
With the emergence of analytical software for the inference of viral evolution, a number of studies have focused on estimating important parameters such as the substitution rate and the time to the most recent common ancestor (tMRCA) for rapidly evolving viruses. Coupled with an increasing abundance of sequence data sampled under widely different schemes, an effort to keep results consistent and comparable is needed. This study emphasizes commonly disregarded problems in the inference of evolutionary rates in viral sequence data when sampling is unevenly distributed on a temporal scale through a study of the foot-and-mouth (FMD) disease virus serotypes SAT 1 and SAT 2. Our study shows that clustered temporal sampling in phylogenetic analyses of FMD viruses will strongly bias the inferences of substitution rates and tMRCA because the inferred rates in such data sets reflect a rate closer to the mutation rate rather than the substitution rate. Estimating evolutionary parameters from viral sequences should be performed with due consideration of the differences in short-term and longer-term evolutionary processes occurring within sets of temporally sampled viruses, and studies should carefully consider how samples are combined.
Bansal, Ravi; Peterson, Bradley S
2018-06-01
Identifying regional effects of interest in MRI datasets usually entails testing a priori hypotheses across many thousands of brain voxels, requiring control for false positive findings in this multiple hypothesis testing. Recent studies have suggested that parametric statistical methods may have incorrectly modeled functional MRI data, thereby leading to higher false positive rates than their nominal rates. Nonparametric methods for statistical inference when conducting multiple statistical tests, in contrast, are thought to produce false positives at the nominal rate, which has thus led to the suggestion that previously reported studies should reanalyze their fMRI data using nonparametric tools. To understand better why parametric methods may yield excessive false positives, we assessed their performance when applied both to simulated datasets of 1D, 2D, and 3D Gaussian Random Fields (GRFs) and to 710 real-world, resting-state fMRI datasets. We showed that both the simulated 2D and 3D GRFs and the real-world data contain a small percentage (<6%) of very large clusters (on average 60 times larger than the average cluster size), which were not present in 1D GRFs. These unexpectedly large clusters were deemed statistically significant using parametric methods, leading to empirical familywise error rates (FWERs) as high as 65%: the high empirical FWERs were not a consequence of parametric methods failing to model spatial smoothness accurately, but rather of these very large clusters that are inherently present in smooth, high-dimensional random fields. In fact, when discounting these very large clusters, the empirical FWER for parametric methods was 3.24%. Furthermore, even an empirical FWER of 65% would yield on average less than one of those very large clusters in each brain-wide analysis. Nonparametric methods, in contrast, estimated distributions from those large clusters, and therefore, by construction, rejected the large clusters as false positives at the nominal
Directory of Open Access Journals (Sweden)
Fonseca Carlos M
2010-10-01
Background: Irregularly shaped spatial clusters are difficult to delineate. A cluster found by an algorithm often spreads through large portions of the map, impacting its geographical meaning. Penalized likelihood methods for Kulldorff's spatial scan statistics have been used to control the excessive freedom of the shape of clusters. Penalty functions based on cluster geometry and non-connectivity have been proposed recently. Another approach involves the use of a multi-objective algorithm to maximize two objectives: the spatial scan statistics and the geometric penalty function. Results & Discussion: We present a novel scan statistic algorithm employing a function based on the graph topology to penalize the presence of under-populated disconnection nodes in candidate clusters, the disconnection nodes cohesion function. A disconnection node is defined as a region within a cluster, such that its removal disconnects the cluster. By applying this function, the most geographically meaningful clusters are sifted through the immense set of possible irregularly shaped candidate cluster solutions. To evaluate the statistical significance of solutions for multi-objective scans, a statistical approach based on the concept of attainment function is used. In this paper we compared different penalized likelihoods employing the geometric and non-connectivity regularity functions and the novel disconnection nodes cohesion function. We also build multi-objective scans using those three functions and compare them with the previous penalized likelihood scans. An application is presented using comprehensive state-wide data for Chagas' disease in puerperal women in Minas Gerais state, Brazil. Conclusions: We show that, compared to the other single-objective algorithms, multi-objective scans present better performance, regarding power, sensitivity and positive predictive value. The multi-objective non-connectivity scan is faster and better suited for the
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. © 2011 by the authors; licensee MDPI, Basel, Switzerland.
Smoothed Particle Inference: A Kilo-Parametric Method for X-ray Galaxy Cluster Modeling
Energy Technology Data Exchange (ETDEWEB)
Peterson, John R.; Marshall, P.J.; /KIPAC, Menlo Park; Andersson, K.; /Stockholm U. /SLAC
2005-08-05
We propose an ambitious new method that models the intracluster medium in clusters of galaxies as a set of X-ray emitting smoothed particles of plasma. Each smoothed particle is described by a handful of parameters including temperature, location, size, and elemental abundances. Hundreds to thousands of these particles are used to construct a model cluster of galaxies, with the appropriate complexity estimated from the data quality. This model is then compared iteratively with X-ray data in the form of adaptively binned photon lists via a two-sample likelihood statistic and iterated via Markov Chain Monte Carlo. The complex cluster model is propagated through the X-ray instrument response using direct sampling Monte Carlo methods. Using this approach the method can reproduce many of the features observed in the X-ray emission in a less assumption-dependent way than traditional analyses, and it allows for a more detailed characterization of the density, temperature, and metal abundance structure of clusters. Multi-instrument X-ray analyses and simultaneous X-ray, Sunyaev-Zeldovich (SZ), and lensing analyses are a straightforward extension of this methodology. Significant challenges still exist in understanding the degeneracy in these models and the statistical noise induced by the complexity of the models.
Cycle-Based Cluster Variational Method for Direct and Inverse Inference
Furtlehner, Cyril; Decelle, Aurélien
2016-08-01
Large scale inference problems of practical interest can often be addressed with the help of Markov random fields (MRFs). In principle this requires solving two related problems: the first is to find offline the parameters of the MRF from empirical data (inverse problem); the second (direct problem) is to set up the inference algorithm to make it as precise, robust and efficient as possible. In this work we address both the direct and inverse problems with mean-field methods of statistical physics, going beyond the Bethe approximation and the associated belief propagation algorithm. We elaborate on the idea that loop corrections to belief propagation can be dealt with in a systematic way on pairwise Markov random fields, by using the elements of a cycle basis to define regions in a generalized belief propagation setting. For the direct problem, the region graph is specified in such a way as to avoid feedback loops as much as possible by selecting a minimal cycle basis. Following this line we are led to propose a two-level algorithm, where a belief propagation algorithm is run alternately at the level of each cycle and at the inter-region level. Next we observe that the inverse problem can be addressed region by region independently, with one small inverse problem per region to be solved. It turns out that each elementary inverse problem on the loop geometry can be solved efficiently. In particular in the random Ising context we propose two complementary methods based respectively on fixed point equations and on a one-parameter log likelihood function minimization. Numerical experiments confirm the effectiveness of this approach both for the direct and inverse MRF inference. Heterogeneous problems of size up to 10^5 are addressed in a reasonable computational time, notably with better convergence properties than ordinary belief propagation.
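For orientation, the baseline that this work improves on, ordinary belief propagation on a pairwise Ising model, can be sketched as below in Python (NumPy assumed). This is not the cycle-based generalized belief propagation proposed in the paper; it is the plain sum-product algorithm with parallel updates, and the function names and the convention P(x) ∝ exp(Σ h_i x_i + Σ J_ij x_i x_j) with x_i ∈ {−1, +1} are assumptions made for the example. A brute-force routine is included for checking small cases.

```python
import itertools
import numpy as np

STATES = np.array([-1.0, 1.0])

def bp_marginals(n, edges, J, h, iters=50):
    """Single-site marginals from ordinary (loopy) belief propagation."""
    nbrs = {i: [] for i in range(n)}
    coupling = {}
    for (i, j), Jij in zip(edges, J):
        nbrs[i].append(j)
        nbrs[j].append(i)
        coupling[(i, j)] = coupling[(j, i)] = Jij
    # msgs[(i, j)] is the message from i to j, a vector over x_j.
    msgs = {(i, j): np.ones(2) / 2 for i in nbrs for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            prod = np.exp(h[i] * STATES)              # local evidence at i
            for k in nbrs[i]:
                if k != j:
                    prod = prod * msgs[(k, i)]        # incoming, excluding j
            pair = np.exp(coupling[(i, j)] * np.outer(STATES, STATES))
            m = prod @ pair                           # sum over x_i
            new[(i, j)] = m / m.sum()
        msgs = new
    beliefs = np.zeros((n, 2))
    for i in range(n):
        b = np.exp(h[i] * STATES)
        for k in nbrs[i]:
            b = b * msgs[(k, i)]
        beliefs[i] = b / b.sum()
    return beliefs

def exact_marginals(n, edges, J, h):
    """Brute-force marginals by enumeration, for small n only."""
    probs = np.zeros((n, 2))
    for x in itertools.product([-1, 1], repeat=n):
        energy = sum(h[i] * x[i] for i in range(n))
        energy += sum(Jij * x[i] * x[j] for (i, j), Jij in zip(edges, J))
        for i in range(n):
            probs[i, (x[i] + 1) // 2] += np.exp(energy)
    return probs / probs.sum(axis=1, keepdims=True)
```

On a tree such as a chain the two routines agree; on graphs with cycles the beliefs are only approximate, which is the gap that loop corrections aim to close.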
McEwen, Joseph E.; Weinberg, David H.
2018-04-01
The combination of galaxy-galaxy lensing (GGL) and galaxy clustering is a promising route to measuring the amplitude of matter clustering and testing modified gravity theories of cosmic acceleration. Halo occupation distribution (HOD) modeling can extend the approach down to nonlinear scales, but galaxy assembly bias could introduce systematic errors by causing the HOD to vary with large scale environment at fixed halo mass. We investigate this problem using the mock galaxy catalogs created by Hearin & Watson (2013, HW13), which exhibit significant assembly bias because galaxy luminosity is tied to halo peak circular velocity and galaxy colour is tied to halo formation time. The preferential placement of galaxies (especially red galaxies) in older halos affects the cutoff of the mean occupation function for central galaxies, with halos in overdense regions more likely to host galaxies. The effect of assembly bias on the satellite galaxy HOD is minimal. We introduce an extended, environment dependent HOD (EDHOD) prescription to describe these results and fit galaxy correlation measurements. Crucially, we find that the galaxy-matter cross-correlation coefficient, r_{gm}(r) ≡ ξ_{gm}(r) [ξ_{mm}(r) ξ_{gg}(r)]^{-1/2}, is insensitive to assembly bias on scales r ≳ 1 h^{-1} Mpc, even though ξ_{gm}(r) and ξ_{gg}(r) are both affected individually. We can therefore recover the correct ξ_{mm}(r) from the HW13 galaxy-galaxy and galaxy-matter correlations using either a standard HOD or EDHOD fitting method. For M_r ≤ -19 or M_r ≤ -20 samples the recovery of ξ_{mm}(r) is accurate to 2% or better. For a sample of red M_r ≤ -20 galaxies we achieve 2% recovery at r ≳ 2 h^{-1} Mpc with EDHOD modeling but lower accuracy at smaller scales or with a standard HOD fit. Most of our mock galaxy samples are consistent with r_{gm} = 1 down to r = 1 h^{-1} Mpc, to within the uncertainties set by our finite simulation volume.
McEwen, Joseph E.; Weinberg, David H.
2018-07-01
The combination of galaxy-galaxy lensing and galaxy clustering is a promising route to measuring the amplitude of matter clustering and testing modified gravity theories of cosmic acceleration. Halo occupation distribution (HOD) modelling can extend the approach down to non-linear scales, but galaxy assembly bias could introduce systematic errors by causing the HOD to vary with the large-scale environment at fixed halo mass. We investigate this problem using the mock galaxy catalogs created by Hearin & Watson (2013, HW13), which exhibit significant assembly bias because galaxy luminosity is tied to halo peak circular velocity and galaxy colour is tied to halo formation time. The preferential placement of galaxies (especially red galaxies) in older haloes affects the cutoff of the mean occupation function ⟨N_{cen}(M_{min})⟩ for central galaxies, with haloes in overdense regions more likely to host galaxies. The effect of assembly bias on the satellite galaxy HOD is minimal. We introduce an extended, environment-dependent HOD (EDHOD) prescription to describe these results and fit galaxy correlation measurements. Crucially, we find that the galaxy-matter cross-correlation coefficient, r_{gm}(r) ≡ ξ_{gm}(r) [ξ_{mm}(r) ξ_{gg}(r)]^{-1/2}, is insensitive to assembly bias on scales r ≳ 1 h^{-1} Mpc, even though ξ_{gm}(r) and ξ_{gg}(r) are both affected individually. We can therefore recover the correct ξ_{mm}(r) from the HW13 galaxy-galaxy and galaxy-matter correlations using either a standard HOD or EDHOD fitting method. For M_r ≤ -19 or M_r ≤ -20 samples the recovery of ξ_{mm}(r) is accurate to 2 per cent or better. For a sample of red M_r ≤ -20 galaxies, we achieve 2 per cent recovery at r ≳ 2 h^{-1} Mpc with EDHOD modelling but lower accuracy at smaller scales or with a standard HOD fit. Most of our mock galaxy samples are consistent with r_{gm} = 1 down to r = 1 h^{-1} Mpc, to within the uncertainties set by our finite simulation volume.
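The definition of the cross-correlation coefficient, and the way it lets the matter correlation function be recovered, can be written out in a few lines of Python (NumPy assumed). The function names are illustrative, the inputs are taken to be arrays of correlation-function values on a common set of radii, and the inversion simply rearranges the definition to ξ_mm = ξ_gm² / (r_gm² ξ_gg). The HOD or EDHOD fitting itself is not shown.

```python
import numpy as np

def cross_correlation_coefficient(xi_gm, xi_mm, xi_gg):
    """r_gm(r) = xi_gm(r) / sqrt(xi_mm(r) * xi_gg(r))."""
    return np.asarray(xi_gm) / np.sqrt(np.asarray(xi_mm) * np.asarray(xi_gg))

def recover_xi_mm(xi_gm, xi_gg, r_gm=1.0):
    """Invert the definition for xi_mm, given an assumed value of r_gm."""
    return (np.asarray(xi_gm) / r_gm) ** 2 / np.asarray(xi_gg)
```

For a purely linear bias b, with ξ_gg = b² ξ_mm and ξ_gm = b ξ_mm, the coefficient is exactly 1 and the inversion returns ξ_mm regardless of b.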
From GPS tracks to context: Inference of high-level context information through spatial clustering
Moreira, Adriano; Santos, Maribel Yasmina
2005-01-01
Location-aware applications use the location of users to adapt their behaviour and to select the relevant information for users in a particular situation. This location information is obtained through a set of location sensors, or from network-based location services, and is often used directly, without any further processing, as a parameter in a selection process. In this paper we propose a method to infer high-level context information from a series of position records obtained from a GPS r...
Pearson's chi-square test and rank correlation inferences for clustered data.
Shih, Joanna H; Fay, Michael P
2017-09-01
Pearson's chi-square test has been widely used in testing for association between two categorical responses. Spearman rank correlation and Kendall's tau are often used for measuring and testing association between two continuous or ordered categorical responses. However, the established statistical properties of these tests are only valid when each pair of responses are independent, where each sampling unit has only one pair of responses. When each sampling unit consists of a cluster of paired responses, the assumption of independent pairs is violated. In this article, we apply the within-cluster resampling technique to U-statistics to form new tests and rank-based correlation estimators for possibly tied clustered data. We develop large sample properties of the new proposed tests and estimators and evaluate their performance by simulations. The proposed methods are applied to a data set collected from a PET/CT imaging study for illustration. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
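The resampling idea can be illustrated with a short Python sketch using only the standard library. It assumes the simplest form of within-cluster resampling: draw one pair of responses at random from each cluster, compute the statistic on the resulting independent pairs, and average over many resamples. Kendall's tau-a is used as the example statistic; the handling of ties and the large-sample variance developed in the article are not reproduced, and the function names are hypothetical.

```python
import random

def kendall_tau_a(pairs):
    """Kendall's tau-a: (concordant - discordant) / number of pairs of points."""
    n = len(pairs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)

def wcr_estimate(clusters, stat=kendall_tau_a, n_resamples=200, seed=0):
    """Average a statistic over resamples that take one pair per cluster."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_resamples):
        sample = [rng.choice(cluster) for cluster in clusters]
        total += stat(sample)
    return total / n_resamples
```

Here `clusters` is a list of clusters, each a list of `(x, y)` response pairs; because every resample contains exactly one pair per cluster, the pairs within a resample can be treated as independent.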
REEXAMINING THE LITHIUM DEPLETION BOUNDARY IN THE PLEIADES AND THE INFERRED AGE OF THE CLUSTER
Energy Technology Data Exchange (ETDEWEB)
Dahm, S. E. [W. M. Keck Observatory, Kamuela, HI 96743 (United States)]
2015-11-10
Moderate-dispersion (R ∼ 5400), optical spectroscopy of seven brown dwarf candidate members of the Pleiades was obtained using the Echellette Spectrograph and Imager on the Keck II telescope. The proper motion and photometrically selected sample lies on the single-star main sequence of the cluster and effectively brackets the established lithium depletion boundary. The brown dwarf candidates range in spectral type from M6 to M7, implying effective temperatures between ∼2800 and 2650 K. All sources exhibit Hα emission, consistent with enhanced chromospheric activity that is expected for young, very low-mass stars and brown dwarfs. Li I λ6708 absorption is confidently detected in the photospheres of two of the seven sources. A revised lithium depletion boundary is established in the near-infrared where the effects of extinction and variability are minimized. This lithium depletion edge occurs near $K_o = 14.45$ or $M_K = 8.78$ mag (UKIRT Infrared Deep Sky Survey), assuming the most accurate and precise distance estimate for the cluster of 136.2 pc. From recent theoretical evolutionary models, a revised age of τ = 112 ± 5 Myr is determined for the Pleiades. Accounting for the effects of magnetic activity on the photospheres of these very low-mass stars and brown dwarfs, however, would imply an even younger age for the cluster of ∼100 Myr.
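The two magnitudes quoted for the lithium depletion edge are linked by the standard distance modulus. The short sketch below checks that arithmetic, assuming the quoted apparent magnitude is already corrected for extinction so that no extinction term is needed; the function name is hypothetical.

```python
import math


def absolute_magnitude(apparent_mag, distance_pc):
    """Distance modulus: M = m - 5 * log10(d / 10 pc), with no extinction term."""
    return apparent_mag - 5.0 * math.log10(distance_pc / 10.0)


# Values quoted in the abstract: apparent K magnitude 14.45 at 136.2 pc.
m_k = absolute_magnitude(14.45, 136.2)
```

The result rounds to the absolute magnitude of 8.78 given in the abstract.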
Husein, A. M.; Harahap, M.; Aisyah, S.; Purba, W.; Muhazir, A.
2018-03-01
Medication planning aims to determine the types and amounts of medicine needed and to avoid stock-outs, based on patterns of disease. In practice, planning still relies on the ability and experience of management, because it is time-consuming, requires skill, depends on disease data that are difficult to obtain reliably, needs good record keeping and reporting, and is constrained by the budget; as a result, planning does not go well and leads to frequent shortages and surpluses of medicines. In this research, we propose the Adaptive Neuro-Fuzzy Inference System (ANFIS) method to predict medication needs in 2016 and 2017 based on medical data from 2015 and 2016 from two hospital sources. The analysis framework uses two approaches. The first applies ANFIS directly to a data source, while the second applies ANFIS after clustering with the K-Means algorithm; for both approaches, the Root Mean Square Error (RMSE) is calculated for training and testing. In testing, the proposed method gives better prediction rates than existing systems according to quantitative and qualitative evaluation; however, applying the K-Means algorithm before ANFIS affects the training time and gives significantly better classification accuracy than without clustering.
Magnetic field gradients inferred from multi-point measurements of Cluster FGM and EDI
Teubenbacher, Robert; Nakamura, Rumi; Giner, Lukas; Plaschke, Ferdinand; Baumjohann, Wolfgang; Magnes, Werner; Eichelberger, Hans; Steller, Manfred; Torbert, Roy
2013-04-01
We use Cluster data from fluxgate magnetometer (FGM) and electron drift instrument (EDI) to determine the magnetic field gradients in the near-Earth magnetotail. Here we use the magnetic field data from FGM measurements as well as the gyro-time data of electrons determined from the time of flight measurements of EDI. The results are compared with the values estimated from empirical magnetic field models for different magnetospheric conditions. We also estimated the spin axis offset of FGM based on comparison between EDI and FGM data and discuss the possible effect in determining the current sheet characteristics.
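The link between an electron gyro time and the magnetic field magnitude follows from the gyrofrequency relation T = 2π m_e / (e B). The sketch below applies that relation in its non-relativistic form, which is an assumption made here for simplicity; the function names and the 100 nT example field are hypothetical and not taken from the study.

```python
import math

ELECTRON_MASS_KG = 9.1093837015e-31
ELEMENTARY_CHARGE_C = 1.602176634e-19


def field_from_gyro_time(t_gyro_s):
    """|B| in tesla from the electron gyro period (non-relativistic)."""
    return 2.0 * math.pi * ELECTRON_MASS_KG / (ELEMENTARY_CHARGE_C * t_gyro_s)


def gyro_time_from_field(b_tesla):
    """Electron gyro period in seconds for a field magnitude in tesla."""
    return 2.0 * math.pi * ELECTRON_MASS_KG / (ELEMENTARY_CHARGE_C * b_tesla)


# Hypothetical example: a 100 nT field gives a gyro period of roughly 0.36 ms.
t_gyro = gyro_time_from_field(100e-9)
b_back = field_from_gyro_time(t_gyro)
```

Because the gyro time depends only on the field magnitude, it gives a measurement that is independent of the fluxgate offsets, which is what makes a comparison between the two instruments useful for offset estimation.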
Galactic globular cluster NGC 6752 and its stellar population as inferred from multicolor photometry
Energy Technology Data Exchange (ETDEWEB)
Kravtsov, Valery [Instituto de Astronomía, Universidad Católica del Norte, Avenida Angamos 0610, Casilla 1280, Antofagasta (Chile); Alcaíno, Gonzalo [Isaac Newton Institute of Chile, Ministerio de Educación de Chile, Casilla 8-9, Correo 9, Santiago (Chile); Marconi, Gianni; Alvarado, Franklin, E-mail: vkravtsov@ucn.cl, E-mail: inewton@terra.cl, E-mail: falvarad@eso.org, E-mail: gmarconi@eso.org [ESO-European Southern Observatory, Alonso de Cordova 3107, Vitacura, Santiago (Chile)
2014-03-01
This paper is devoted to a photometric study of the Galactic globular cluster (GGC) NGC 6752 in UBVI, focusing on the multiplicity of its stellar population. We emphasize that our U passband is (1) narrower than the standard one due to its smaller extension blueward and (2) redshifted by ∼300 Å relative to its counterparts, such as the HST F336W filter. Accordingly, both the spectral features encompassed by it and photometric effects of the multiplicity revealed in our study are somewhat different than in recent studies of NGC 6752. Main-sequence stars that are bluer in U – B are less centrally concentrated, as is the case for red giants. We find a statistically significant increase in the luminosity of the red giant branch (RGB) bump of ΔU ≈ 0.2 mag toward the cluster outskirts, with no comparably obvious effect in V. The photometric results are correlated with spectroscopic data: the bluer RGB stars in U – B have lower nitrogen abundances. We draw attention to a larger width of the RGB than the blue horizontal branch (BHB) in U – B. This seems to agree with the effects predicted to be caused by molecular bands produced by nitrogen-containing molecules. We find that brighter BHB stars, especially the brightest ones, are more centrally concentrated. This implies that red giants that are redder in U – B, i.e., more nitrogen enriched and centrally concentrated, are the main progenitors of the brighter BHB stars. However, such a progenitor-progeny relationship disagrees with theoretical predictions and with the results on the elemental abundances in horizontal branch stars. We isolated the asymptotic giant branch clump and estimated the parameter $\Delta V_{\rm ZAHB}^{\rm clump} = 0.98 \pm 0.12$.
ACTION-SPACE CLUSTERING OF TIDAL STREAMS TO INFER THE GALACTIC POTENTIAL
Energy Technology Data Exchange (ETDEWEB)
Sanderson, Robyn E.; Helmi, Amina [Kapteyn Astronomical Institute, P.O. Box 800, 9700 AV Groningen (Netherlands); Hogg, David W., E-mail: robyn@astro.columbia.edu [Center for Cosmology and Particle Physics, Department of Physics, New York University, 4 Washington Place, New York, NY 10003 (United States)
2015-03-10
We present a new method for constraining the Milky Way halo gravitational potential by simultaneously fitting multiple tidal streams. This method requires three-dimensional positions and velocities for all stars to be fit, but does not require identification of any specific stream or determination of stream membership for any star. We exploit the principle that the action distribution of stream stars is most clustered when the potential used to calculate the actions is closest to the true potential. Clustering is quantified with the Kullback-Leibler Divergence (KLD), which also provides conditional uncertainties for our parameter estimates. We show, for toy Gaia-like data in a spherical isochrone potential, that maximizing the KLD of the action distribution relative to a smoother distribution recovers the input potential. The precision depends on the observational errors and number of streams; using K III giants as tracers, we measure the enclosed mass at the average radius of the sample stars accurate to 3% and precise to 20%-40%. Recovery of the scale radius is precise to 25%, biased 50% high by the small galactocentric distance range of stars in our mock sample (1-25 kpc, or about three scale radii, with mean 6.5 kpc). 20-25 streams with at least 100 stars each are required for a stable confidence interval. With radial velocities (RVs) to 100 kpc, all parameters are determined with ∼10% accuracy and 20% precision (1.3% accuracy for the enclosed mass), underlining the need to complete the RV catalog for faint halo stars observed by Gaia.
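The clustering measure named in the abstract can be illustrated with a discrete Kullback-Leibler divergence between a binned distribution and a smoother reference. This is a minimal sketch of the divergence itself, not the authors' action-space pipeline; the histograms below are hypothetical, and the reference distribution is assumed to be non-zero wherever the binned distribution is non-zero.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * ln(p_i / q_i), skipping bins where p_i is zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical normalized histograms over four action-space bins.
smooth = [0.25, 0.25, 0.25, 0.25]      # smoother reference distribution
clustered = [0.70, 0.10, 0.10, 0.10]   # actions computed in a near-true potential
spread = [0.30, 0.25, 0.25, 0.20]      # actions computed in a poor trial potential

kld_clustered = kl_divergence(clustered, smooth)
kld_spread = kl_divergence(spread, smooth)
```

A more tightly clustered distribution lies further from the smooth reference and so gives a larger divergence, which is the quantity a potential fit of this kind would maximize over trial potentials.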
Lovasi, Gina S; Fink, David S; Mooney, Stephen J; Link, Bruce G
2017-12-01
Accounting for non-independence in health research often warrants attention. Particularly, the availability of geographic information systems data has increased the ease with which studies can add measures of the local "neighborhood" even if participant recruitment was through other contexts, such as schools or clinics. We highlight a tension between two perspectives that is often present, but particularly salient when more than one type of potentially health-relevant context is indexed (e.g., both neighborhood and school). On the one hand, a model-based perspective emphasizes the processes producing outcome variation, and observed data are used to make inference about that process. On the other hand, a design-based perspective emphasizes inference to a well-defined finite population, and is commonly invoked by those using complex survey samples or those with responsibility for the health of local residents. These two perspectives have divergent implications when deciding whether clustering must be accounted for analytically and how to select among candidate cluster definitions, though the perspectives are by no means monolithic. There are tensions within each perspective as well as between perspectives. We aim to provide insight into these perspectives and their implications for population health researchers. We focus on the crucial step of deciding which cluster definition or definitions to use at the analysis stage, as this has consequences for all subsequent analytic and interpretational challenges with potentially clustered data.
Directory of Open Access Journals (Sweden)
Gina S. Lovasi
2017-12-01
Accounting for non-independence in health research often warrants attention. Particularly, the availability of geographic information systems data has increased the ease with which studies can add measures of the local “neighborhood” even if participant recruitment was through other contexts, such as schools or clinics. We highlight a tension between two perspectives that is often present, but particularly salient when more than one type of potentially health-relevant context is indexed (e.g., both neighborhood and school). On the one hand, a model-based perspective emphasizes the processes producing outcome variation, and observed data are used to make inference about that process. On the other hand, a design-based perspective emphasizes inference to a well-defined finite population, and is commonly invoked by those using complex survey samples or those with responsibility for the health of local residents. These two perspectives have divergent implications when deciding whether clustering must be accounted for analytically and how to select among candidate cluster definitions, though the perspectives are by no means monolithic. There are tensions within each perspective as well as between perspectives. We aim to provide insight into these perspectives and their implications for population health researchers. We focus on the crucial step of deciding which cluster definition or definitions to use at the analysis stage, as this has consequences for all subsequent analytic and interpretational challenges with potentially clustered data.
DEFF Research Database (Denmark)
Hekker, S.; Basu, S.; Stello, D.
2011-01-01
and metallicity contribute to the observed difference in locations in the H-R diagram of the old metal-rich cluster NGC 6791 and the middle-aged solar-metallicity cluster NGC 6819. For the young cluster NGC 6811, the explanation of the position of the stars in the H-R diagram challenges the assumption of solar...
Directory of Open Access Journals (Sweden)
Fanny Perraudeau
2017-07-01
Novel single-cell transcriptome sequencing assays allow researchers to measure gene expression levels at the resolution of single cells and offer the unprecedented opportunity to investigate at the molecular level fundamental biological questions, such as stem cell differentiation or the discovery and characterization of rare cell types. However, such assays raise challenging statistical and computational questions and require the development of novel methodology and software. Using stem cell differentiation in the mouse olfactory epithelium as a case study, this integrated workflow provides a step-by-step tutorial to the methodology and associated software for the following four main tasks: (1) dimensionality reduction accounting for zero inflation and overdispersion and adjusting for gene and cell-level covariates; (2) cell clustering using resampling-based sequential ensemble clustering; (3) inference of cell lineages and pseudotimes; and (4) differential expression analysis along lineages.
Directory of Open Access Journals (Sweden)
Wills Rachael A
2009-05-01
Background: The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods: This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results: Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion: In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
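The sensitivity to the number of silent comparisons can be seen directly from the Dunn-Sidak formula, in which a raw p-value p becomes 1 - (1 - p)^m after m independent comparisons. The sketch below is illustrative only; the function names, the raw p-value and the comparison counts are hypothetical and are not the values from the two re-analysed investigations.

```python
def sidak_adjusted_p(p, m):
    """Dunn-Sidak adjusted p-value for m independent comparisons."""
    return 1.0 - (1.0 - p) ** m


def sidak_per_test_alpha(alpha, m):
    """Per-comparison significance level that keeps the family-wise level at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


# Hypothetical raw p-value, adjusted under two guesses for the number of
# silent comparisons (the "guesstimate" the abstract refers to).
raw_p = 0.001
adjusted_100 = sidak_adjusted_p(raw_p, 100)
adjusted_1000 = sidak_adjusted_p(raw_p, 1000)
```

The same raw p-value is roughly 0.1 after 100 assumed comparisons and above 0.6 after 1000, so the conclusion depends heavily on a number that cannot be observed.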
Weiler, Martin; Nakamura, Takashi; Sekiya, Hiroshi; Dopfer, Otto; Miyazaki, Mitsuhiko; Fujii, Masaaki
2012-12-07
We present the resonance-enhanced multiphoton ionization, infrared-ultraviolet hole burning (IR-UV HB), and IR dip spectra of the trans-acetanilide-methanol (AA-MeOH) cluster in the $S_0$, $S_1$, and cationic ground state ($D_0$) in a supersonic jet. The IR-UV HB spectra demonstrate the co-existence of two isomers in $S_{0,1}$, in which MeOH binds either to the NH or the CO site of the peptide linkage in AA, denoted as AA(NH)-MeOH and AA(CO)-MeOH. When AA(CO)-MeOH is selectively ionized, its IR spectrum in $D_0$ is the same as that measured for AA$^+$(NH)-MeOH. Thus, photoionization of AA(CO)-MeOH induces migration of MeOH from the CO to the NH site with 100% yield. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Indian Academy of Sciences (India)
2017-09-27
Sep 27, 2017 ... Author for correspondence (zh4403701@126.com). MS received 15 ... lic clusters using density functional theory (DFT)-GGA of the DMOL3 package. ... In the process of geometric optimization, convergence thresholds ..... and Postgraduate Research & Practice Innovation Program of. Jiangsu Province ...
Indian Academy of Sciences (India)
environmental as well as technical problems during fuel gas utilization. ... adsorption on some alloys of Pd, namely PdAu, PdAg ... ried out on small neutral and charged Au24,26,27, Cu,28 ... study of Zanti et al.29 on Pdn (n = 1–9) clusters.
Directory of Open Access Journals (Sweden)
N. A. Tsyganenko
2009-04-01
A detailed statistical study of the magnetic structure of the dayside polar cusps is presented, based on multi-year sets of magnetometer data of Polar and Cluster spacecraft, taken in 1996–2006 and 2001–2007, respectively. Thanks to the dense data coverage in both Northern and Southern Hemispheres, the analysis spanned nearly the entire length of the cusps, from low altitudes to the cusp "throat" and the magnetosheath. Subsets of data falling inside the polar cusp "funnels" were selected with the help of TS05 and IGRF magnetic field models, taking into account the dipole tilt and the solar wind/IMF conditions. The selection funnels were shifted within ±10° of SM latitude around the model cusp location, and linear regression parameters were calculated for each sliding subset, further divided into 10 bins of distance in the range $2 \le R \le 12\,R_E$, with the following results. (1) Diamagnetic depression, caused by the penetrated magnetosheath plasma, becomes first visible at $R \sim 4$–$5\,R_E$, rapidly deepens with growing R, peaks at $R \sim 6$–$9\,R_E$, and then partially subsides and widens in latitude at the cusp's outer end. (2) The depression peak is systematically shifted poleward (by ~2° of the footpoint latitude) with respect to the model cusp field line, passing through the min{|B|} point at the magnetopause. (3) At all radial distances, clear and distinct peaks of the correlation between the local By and By(IMF) and of the corresponding proportionality coefficient are observed. A remarkably regular variation of that coefficient with R quantitatively confirms the field-aligned geometry of the cusp currents associated with the IMF By.
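The regression step described above can be sketched as an ordinary least-squares fit of the local By against the IMF By within one subset, returning the proportionality coefficient and the correlation. This is a minimal illustration and not the study's actual processing; the function name and the five data points are hypothetical.

```python
import math


def linear_fit(x, y):
    """Least-squares slope, intercept and Pearson correlation of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    correlation = sxy / math.sqrt(sxx * syy)
    return slope, intercept, correlation


# Hypothetical subset for one distance bin (nT): local By responds to IMF By
# with a proportionality coefficient of 0.5 plus a constant offset of 1.
by_imf = [-4.0, -2.0, 0.0, 2.0, 4.0]
by_local = [0.5 * v + 1.0 for v in by_imf]

slope, intercept, correlation = linear_fit(by_imf, by_local)
```

Repeating such a fit for each latitude shift and distance bin would give the profiles of coefficient and correlation whose peaks the abstract describes.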
TreeCluster: Massively scalable transmission clustering using phylogenetic trees
Moshiri, Alexander
2018-01-01
Background: The ability to infer transmission clusters from molecular data is critical to designing and evaluating viral control strategies. Viral sequencing datasets are growing rapidly, but standard methods of transmission cluster inference do not scale well beyond thousands of sequences. Results: I present TreeCluster, a cross-platform tool that performs transmission cluster inference on a given phylogenetic tree orders of magnitude faster than existing inference methods and supports multi...
International Nuclear Information System (INIS)
Schaeffer, R.
1987-01-01
The galaxy and cluster luminosity functions are constructed from a model of the mass distribution based on hierarchical clustering at an epoch where the matter distribution is non-linear. These luminosity functions are seen to reproduce the present distribution of objects as can be inferred from the observations. They can be used to deduce the redshift dependence of the cluster distribution and to extrapolate the observations towards the past. The predicted evolution of the cluster distribution is quite strong, although somewhat less rapid than predicted by the linear theory.
Kim, Chan Moon; Parnichkun, Manukid
2017-11-01
Coagulation is an important process in drinking water treatment to attain acceptable treated water quality. However, the determination of coagulant dosage is still a challenging task for operators, because coagulation is a nonlinear and complicated process. Feedback control to achieve the desired treated water quality is difficult due to the lengthy process time. In this research, a hybrid of k-means clustering and adaptive neuro-fuzzy inference system (k-means-ANFIS) is proposed for settled water turbidity prediction and optimal coagulant dosage determination using full-scale historical data. To build a model that adapts well to different process states of the influent water, raw water quality data are classified into four clusters according to their properties by a k-means clustering technique. The sub-models are developed individually on the basis of each clustered data set. Results reveal that the sub-models constructed by the hybrid k-means-ANFIS perform better than not only a single ANFIS model, but also seasonal models based on an artificial neural network (ANN). The completed model consisting of the sub-models shows more accurate and consistent prediction ability than a single ANFIS model and a single ANN model on all five evaluation indices. Therefore, the hybrid k-means-ANFIS model can be employed as a robust tool for managing both treated water quality and production costs simultaneously.
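The cluster-then-model structure can be sketched with a one-dimensional k-means followed by one sub-model per cluster. To stay self-contained, each sub-model here is just the mean target value of its cluster, which is a deliberate stand-in for ANFIS and not the method of the study; the two clusters, the data and all names are hypothetical (the study uses four clusters on multivariate raw water quality data).

```python
import math


def nearest(value, centers):
    """Index of the center closest to value."""
    return min(range(len(centers)), key=lambda k: abs(value - centers[k]))


def kmeans_1d(values, centers, n_iter=20):
    """Lloyd's algorithm in one dimension, starting from the given centers."""
    centers = list(centers)
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest(v, centers)].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def fit_submodels(x, y, centers):
    """One constant predictor (mean of y) per cluster; a stand-in for ANFIS."""
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for xi, yi in zip(x, y):
        k = nearest(xi, centers)
        sums[k] += yi
        counts[k] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Hypothetical data: raw water turbidity and the coagulant dose that was used.
turbidity = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]
dose = [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]

centers = kmeans_1d(turbidity, centers=[0.0, 5.0])
models = fit_submodels(turbidity, dose, centers)
pred_clustered = [models[nearest(t, centers)] for t in turbidity]
pred_single = [sum(dose) / len(dose)] * len(dose)

rmse_clustered = rmse(dose, pred_clustered)
rmse_single = rmse(dose, pred_single)
```

On this toy data the per-cluster predictors fit each regime exactly while the single global predictor does not, which mirrors the motivation for training sub-models on clustered data.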
Caticha, Ariel
2011-03-01
In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops, the Maximum Entropy and the Bayesian methods, into a single general inference scheme.
Kroese, A.H.; van der Meulen, E.A.; Poortema, Klaas; Schaafsma, W.
1995-01-01
The making of statistical inferences in distributional form is conceptually complicated because the epistemic 'probabilities' assigned are mixtures of fact and fiction. In this respect they are essentially different from 'physical' or 'frequency-theoretic' probabilities. The distributional form is
Caticha, Ariel
2010-01-01
In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEn...
Aggelopoulos, Nikolaos C
2015-08-01
Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.
International Nuclear Information System (INIS)
Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C.S. (Yale Univ., New Haven, CT; California Univ., Santa Barbara; Cambridge Univ., England; Sussex Univ., Brighton, England)
1985-01-01
The cluster correlation function $\xi_c(r)$ is compared with the particle correlation function $\xi(r)$ in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, $\xi_c$ and $\xi$ are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of $\xi_c$ increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), $\xi_c$ is steeper than $\xi$, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of $\xi_c$ found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales. 30 references.
Rodríguez-Ezpeleta, Naiara
2016-03-03
Restriction-site associated DNA sequencing (RAD-seq) and related methods are revolutionizing the field of population genomics in non-model organisms as they allow generating an unprecedented number of single nucleotide polymorphisms (SNPs) even when no genomic information is available. Yet, RAD-seq data analyses rely on assumptions about the nature and number of nucleotide variants present in a single locus, the choice of which may lead to an under- or overestimated number of SNPs and/or to incorrectly called genotypes. Using the Atlantic mackerel (Scomber scombrus L.) and a close relative, the Atlantic chub mackerel (Scomber colias), as a case study, here we explore the sensitivity of population structure inferences to two crucial aspects in RAD-seq data analysis: the maximum number of mismatches allowed to merge reads into a locus and the relatedness of the individuals used for genotype calling and SNP selection. Our study resolves the population structure of the Atlantic mackerel, but, most importantly, provides insights into the effects of alternative RAD-seq data analysis strategies on population structure inferences that are directly applicable to other species.
Directory of Open Access Journals (Sweden)
Evanita Evanita
2016-04-01
In Indonesia, heavy traffic occurs during commuting hours to and from work, on long holidays, and on national holidays, especially Eid al-Fitr (Lebaran). Mudik, the homecoming journey, has become a long-awaited tradition for Indonesians ahead of Lebaran, when people flock back to their hometowns to meet and gather with family. This annual routine is carried out especially by residents of large cities such as Jakarta, the capital of the Republic of Indonesia and a destination for migrants from villages who hope to find better jobs. Every year the volume of vehicles increases from 7 days before Lebaran until 7 days after it, especially on routes out of and into Central Java, which is a major mudik destination. This ever-increasing vehicle volume during the mudik flow is studied further with the ANFIS method, so that the results can inform the steps to be taken in the following year to improve traffic service and to reduce long traffic jams and the number of accidents. With the ANFIS input parameters set to clustering of up to 5 clusters, 100 epochs, and an error goal of 0, the best ANFIS performance was obtained with K-Means clustering into 3 clusters, a best epoch of 20, a best training RMSE of 0.1198, a best testing RMSE of 0.0282, and a shortest processing time of 0.0695. The prediction results are expected to be useful as an alternative basis for the steps to be taken in the following year so that traffic service improves further. Keywords: Lebaran transport, Central Java, ANFIS.
Rohatgi, Vijay K
2003-01-01
Unified treatment of probability and statistics examines and analyzes the relationship between the two fields, exploring inferential issues. Numerous problems, examples, and diagrams--some with solutions--plus clear-cut, highlighted summaries of results. Advanced undergraduate to graduate level. Contents: 1. Introduction. 2. Probability Model. 3. Probability Distributions. 4. Introduction to Statistical Inference. 5. More on Mathematical Expectation. 6. Some Discrete Models. 7. Some Continuous Models. 8. Functions of Random Variables and Random Vectors. 9. Large-Sample Theory. 10. General Meth
Directory of Open Access Journals (Sweden)
Lucy van Dorp
2015-08-01
The Ari peoples of Ethiopia are comprised of different occupational groups that can be distinguished genetically, with Ari Cultivators and the socially marginalised Ari Blacksmiths recently shown to have a similar level of genetic differentiation between them (FST ≈ 0.023 - 0.04) as that observed among multiple ethnic groups sampled throughout Ethiopia. Anthropologists have proposed two competing theories to explain the origins of the Ari Blacksmiths as (i) remnants of a population that inhabited Ethiopia prior to the arrival of agriculturists (e.g. Cultivators), or (ii) relatively recently related to the Cultivators but presently marginalized in the community due to their trade. Two recent studies by different groups analysed genome-wide DNA from samples of Ari Blacksmiths and Cultivators and suggested that genetic patterns between the two groups were more consistent with model (i) and subsequent assimilation of the indigenous peoples into the expanding agriculturalist community. We analysed the same samples using approaches designed to attenuate signals of genetic differentiation that are attributable to allelic drift within a population. By doing so, we provide evidence that the genetic differences between Ari Blacksmiths and Cultivators can be entirely explained by bottleneck effects consistent with hypothesis (ii). This finding serves as both a cautionary tale about interpreting results from unsupervised clustering algorithms, and suggests that social constructions are contributing directly to genetic differentiation over a relatively short time period among previously genetically similar groups.
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
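The conditional intensity approach can be illustrated with the common exponential-kernel form of the Hawkes process, in which the intensity at time t is a background rate plus a decaying contribution from each earlier event. The kernel choice, parameter names and values below are assumptions made for illustration and are not taken from the paper; in an MCMC scheme a log-likelihood of this kind would be combined with a log-prior to form the target density.

```python
import math


def hawkes_intensity(t, events, mu, alpha, beta):
    """lambda(t) = mu + sum over earlier events of alpha * exp(-beta * (t - t_i))."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti < t)


def log_likelihood(events, t_end, mu, alpha, beta):
    """Point-process log-likelihood on [0, t_end]:
    sum of log-intensities at the events minus the integrated intensity."""
    log_term = sum(
        math.log(hawkes_intensity(ti, events, mu, alpha, beta)) for ti in events
    )
    compensator = mu * t_end + sum(
        (alpha / beta) * (1.0 - math.exp(-beta * (t_end - ti))) for ti in events
    )
    return log_term - compensator


# Hypothetical event times and parameters (alpha / beta < 1 keeps the process stable).
events = [1.0, 1.5, 4.0]
ll = log_likelihood(events, t_end=10.0, mu=0.5, alpha=0.8, beta=1.0)
```

The second approach mentioned in the abstract instead treats each event as either a background event or an offspring of an earlier event, which leads to a different sampler but targets the same model.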
Clustering of near clusters versus cluster compactness
International Nuclear Information System (INIS)
Yu Gao; Yipeng Jing
1989-01-01
The clustering properties of near Zwicky clusters are studied by using the two-point angular correlation function. The angular correlation functions for compact and medium compact clusters, for open clusters, and for all near Zwicky clusters are estimated. The results show much stronger clustering for compact and medium compact clusters than for open clusters, and that open clusters have nearly the same clustering strength as galaxies. A detailed study of how the correlation function strength depends on compactness would be worthwhile. (author)
DEFF Research Database (Denmark)
Andersen, Jesper
2009-01-01
Collateral evolution is the problem of updating several library-using programs in response to API changes in the used library. In this dissertation we address the issue of understanding collateral evolutions by automatically inferring a high-level specification of the changes evident in a given set ...... specifications inferred by spdiff in Linux are shown. We find that the inferred specifications concisely capture the actual collateral evolution performed in the examples....
Energy Technology Data Exchange (ETDEWEB)
Petrov, S.
1996-10-01
Languages with a solvable implication problem but without complete and consistent systems of inference rules ("poor" languages) are considered. The problem of existence of a finite complete and consistent inference rule system for a "poor" language is stated independently of the language or rules syntax. Several properties of the problem are proved. An application of results to the language of join dependencies is given.
Bayesian statistical inference
Directory of Open Access Journals (Sweden)
Bruno De Finetti
2017-04-01
This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993. Bayesian Statistical Inference is one of the last fundamental philosophical papers in which we can find the essentials of De Finetti's approach to statistical inference.
Geometric statistical inference
International Nuclear Information System (INIS)
Periwal, Vipul
1999-01-01
A reparametrization-covariant formulation of the inverse problem of probability is explicitly solved for finite sample sizes. The inferred distribution is explicitly continuous for finite sample size. A geometric solution of the statistical inference problem in higher dimensions is outlined
Bailer-Jones, Coryn A. L.
2017-04-01
Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.
Data-driven inference for the spatial scan statistic
Directory of Open Access Journals (Sweden)
Duczmal Luiz H
2011-08-01
Background: Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. Results: A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is done, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based in this inference. Conclusions: A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
Data-driven inference for the spatial scan statistic.
Almeida, Alexandre C L; Duarte, Anderson R; Duczmal, Luiz H; Oliveira, Fernando L P; Takahashi, Ricardo H C
2011-08-02
Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is done, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based in this inference. A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
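The modified inference question can be sketched as a comparison of two Monte Carlo p-values. This is a minimal illustration, not the authors' procedure: it assumes each null simulation has already been reduced to a hypothetical pair (scan statistic, size of its most likely cluster), and it uses the common rank-based estimate (1 + exceedances) / (1 + replicas).

```python
def monte_carlo_p_values(observed_stat, observed_size, null_replicas):
    """Return (usual p-value, size-conditional p-value).

    null_replicas is a list of (statistic, cluster_size) pairs, one per map
    simulated under the null hypothesis (a hypothetical input format).
    """
    # Usual inference: compare against every null replica.
    exceed_all = sum(stat >= observed_stat for stat, _ in null_replicas)
    usual = (1 + exceed_all) / (1 + len(null_replicas))

    # Data-driven inference: compare only against replicas whose most
    # likely cluster has the same size k as the observed one.
    same_size = [stat for stat, size in null_replicas if size == observed_size]
    exceed_same = sum(stat >= observed_stat for stat in same_size)
    conditional = (1 + exceed_same) / (1 + len(same_size))
    return usual, conditional


replicas = [(5.0, 1), (7.0, 3), (9.0, 3), (4.0, 2), (8.5, 1)]
usual_p, conditional_p = monte_carlo_p_values(8.0, 3, replicas)
```

With so few replicas the numbers are only illustrative. In practice the two p-values would be compared when the usual one lies near the chosen significance level, which is the case the abstract singles out.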
Nagao, Makoto
1990-01-01
Knowledge and Inference discusses an important problem for software systems: How do we treat knowledge and ideas on a computer and how do we use inference to solve problems on a computer? The book talks about the problems of knowledge and inference for the purpose of merging artificial intelligence and library science. The book begins by clarifying the concept of "knowledge" from many points of view, followed by a chapter on the current state of library science and the place of artificial intelligence in library science. Subsequent chapters cover central topics in the artificial intellig
Logical inference and evaluation
International Nuclear Information System (INIS)
Perey, F.G.
1981-01-01
Most methodologies of evaluation currently used are based upon the theory of statistical inference. It is generally perceived that this theory is not capable of dealing satisfactorily with what are called systematic errors. Theories of logical inference should be capable of treating all of the information available, including that not involving frequency data. A theory of logical inference is presented as an extension of deductive logic via the concept of plausibility and the application of group theory. Some conclusions, based upon the application of this theory to evaluation of data, are also given
Subjective randomness as statistical inference.
Griffiths, Thomas L; Daniels, Dylan; Austerweil, Joseph L; Tenenbaum, Joshua B
2018-06-01
Some events seem more random than others. For example, when tossing a coin, a sequence of eight heads in a row does not seem very random. Where do these intuitions about randomness come from? We argue that subjective randomness can be understood as the result of a statistical inference assessing the evidence that an event provides for having been produced by a random generating process. We show how this account provides a link to previous work relating randomness to algorithmic complexity, in which random events are those that cannot be described by short computer programs. Algorithmic complexity is both incomputable and too general to capture the regularities that people can recognize, but viewing randomness as statistical inference provides two paths to addressing these problems: considering regularities generated by simpler computing machines, and restricting the set of probability distributions that characterize regularity. Building on previous work exploring these different routes to a more restricted notion of randomness, we define strong quantitative models of human randomness judgments that apply not just to binary sequences - which have been the focus of much of the previous work on subjective randomness - but also to binary matrices and spatial clustering. Copyright © 2018 Elsevier Inc. All rights reserved.
Probability and Statistical Inference
Prosper, Harrison B.
2006-01-01
These lectures introduce key concepts in probability and statistical inference at a level suitable for graduate students in particle physics. Our goal is to paint as vivid a picture as possible of the concepts covered.
On quantum statistical inference
Barndorff-Nielsen, O.E.; Gill, R.D.; Jupp, P.E.
2003-01-01
Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, developments in the theory of quantum measurements have
2018-02-15
expressed a variety of inference techniques on discrete and continuous distributions: exact inference, importance sampling, Metropolis-Hastings (MH...without redoing any math or rewriting any code. And although our main goal is composable reuse, our performance is also good because we can use...control paths. • The Hakaru language can express mixtures of discrete and continuous distributions, but the current disintegration transformation
Introductory statistical inference
Mukhopadhyay, Nitis
2014-01-01
This gracefully organized text reveals the rigorous theory of probability and statistical inference in the style of a tutorial, using worked examples, exercises, figures, tables, and computer simulations to develop and illustrate concepts. Drills and boxed summaries emphasize and reinforce important ideas and special techniques. Beginning with a review of the basic concepts and methods in probability theory, moments, and moment generating functions, the author moves to more intricate topics. Introductory Statistical Inference studies multivariate random variables, exponential families of dist
The cluster bootstrap consistency in generalized estimating equations
Cheng, Guang
2013-03-01
The cluster bootstrap resamples clusters or subjects instead of individual observations in order to preserve the dependence within each cluster or subject. In this paper, we provide a theoretical justification of using the cluster bootstrap for the inferences of the generalized estimating equations (GEE) for clustered/longitudinal data. Under the general exchangeable bootstrap weights, we show that the cluster bootstrap yields a consistent approximation of the distribution of the regression estimate, and a consistent approximation of the confidence sets. We also show that a computationally more efficient one-step version of the cluster bootstrap provides asymptotically equivalent inference. © 2012.
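The resampling scheme described here can be sketched in a few lines. The sketch makes two loud assumptions that are not in the abstract: the estimator is a plain least-squares slope (what a GEE with identity link and independence working correlation reduces to), and the bootstrap weights are the multinomial ones obtained by drawing clusters with replacement. All function and variable names are hypothetical.

```python
import random


def slope(rows):
    """Least-squares slope of y on x for a list of (x, y) rows."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    return sxy / sxx


def cluster_bootstrap(clusters, n_boot=500, seed=0):
    """Resample whole clusters with replacement and re-estimate the slope each time."""
    rng = random.Random(seed)
    ids = list(clusters)
    estimates = []
    for _ in range(n_boot):
        drawn = rng.choices(ids, k=len(ids))
        # Keep every observation of a drawn cluster together, so the
        # within-cluster dependence is preserved in the resample.
        rows = [row for cid in drawn for row in clusters[cid]]
        estimates.append(slope(rows))
    return estimates


# Toy clustered data (subject id -> repeated measurements), purely illustrative.
data = {
    "a": [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)],
    "b": [(1.0, 2.0), (3.0, 6.0)],
    "c": [(0.5, 1.0), (2.5, 5.0), (4.0, 8.0)],
}
boot = cluster_bootstrap(data, n_boot=200, seed=1)
```

The spread of `boot` would then serve as an approximation to the sampling distribution of the estimate, for example through percentile confidence limits.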
Convex Clustering: An Attractive Alternative to Hierarchical Clustering
Chen, Gary K.; Chi, Eric C.; Ranola, John Michael O.; Lange, Kenneth
2015-01-01
The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/ PMID:25965340
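For orientation, the objective that convex clustering minimizes is usually written as a data-fidelity term plus a weighted fusion penalty on pairs of cluster centres. The sketch below only evaluates that objective; it is not the proximal distance algorithm of the paper and is unrelated to the CONVEXCLUSTER code. The Euclidean norm in the penalty and the names `gamma` and `weights` are assumptions.

```python
import math


def convex_clustering_objective(points, centroids, weights, gamma):
    """Evaluate 0.5 * sum_i ||x_i - u_i||^2 + gamma * sum_{i<j} w_ij * ||u_i - u_j||.

    points and centroids are equal-length lists of coordinate tuples;
    weights maps index pairs (i, j) to non-negative weights w_ij.
    """
    fidelity = 0.5 * sum(math.dist(x, u) ** 2 for x, u in zip(points, centroids))
    fusion = sum(w * math.dist(centroids[i], centroids[j]) for (i, j), w in weights.items())
    return fidelity + gamma * fusion


points = [(0.0, 0.0), (2.0, 0.0)]
weights = {(0, 1): 1.0}
separate = convex_clustering_objective(points, points, weights, gamma=0.5)
fused = convex_clustering_objective(points, [(1.0, 0.0), (1.0, 0.0)], weights, gamma=0.5)
```

As `gamma` grows, fusing centres becomes cheaper than keeping them apart, which is what produces the solution path mentioned in the abstract. In this two-point toy case both configurations cost the same at `gamma = 0.5`.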
Mahgoub, Ahmed Nasser; Böhnel, Harald; Siebe, Claus; Salinas, Sergio; Guilbaud, Marie-Noëlle
2017-11-01
The paleomagnetic dating procedure was applied to a cluster of four partly overlapping monogenetic Holocene volcanoes and associated lava flows, namely La Tinaja, La Palma, Mesa La Muerta, and Malpaís de Cutzaróndiro, located in the Tacámbaro-Puruarán area, at the southeastern margin of the Michoacán-Guanajuato volcanic field. For this purpose, 21 sites distributed as far apart as possible from each other were sampled to obtain a well-averaged mean paleomagnetic direction for each single lava flow. For intensity determinations, double-heating Thellier experiments using the IZZI protocol were conducted on 55 selected samples. La Tinaja is the oldest of these flows and was dated by the 14C method at 5115 ± 130 years BP (cal 4184-3655 BCE). It is stratigraphically underneath the other three flows with Malpaís de Cutzaróndiro lava flow being the youngest. The paleomagnetic dating procedure was applied using the Matlab archaeo-dating tool coupled with the geomagnetic field model SHA.DIF.14k. Accordingly, for La Tinaja several possible age ranges were obtained, of which the range 3650-3480 BCE is closest to the 14C age. Paleomagnetic dating on La Palma produced a unique age range of 3220-2880 BCE. Two age ranges of 2240-2070 BCE and 760-630 BCE were obtained for Mesa La Muerta and a well-constrained age of 420-320 BCE for Malpaís de Cutzaróndiro. Although systematic archaeological excavations have so far not been carried out in this area, it is possible that the younger eruptions were contemporary to local human occupation. Our paleomagnetic dates indicate that all four eruptions, although closely clustered in space, occurred separately in time with varying recurrence intervals ranging between 300 and 2300 years. This finding should be considered when constraining the nature of the magmatic plumbing system and developing a strategy aimed at reducing risk in the volcanically active Michoacán-Guanajuato volcanic field, where several young monogenetic volcano
Type Inference with Inequalities
DEFF Research Database (Denmark)
Schwartzbach, Michael Ignatieff
1991-01-01
Type inference can be phrased as constraint-solving over types. We consider an implicitly typed language equipped with recursive types, multiple inheritance, 1st order parametric polymorphism, and assignments. Type correctness is expressed as satisfiability of a possibly infinite collection of (monotonic) inequalities on the types of variables and expressions. A general result about systems of inequalities over semilattices yields a solvable form. We distinguish between deciding typability (the existence of solutions) and type inference (the computation of a minimal solution). In our case, both...
Histamine headache; Headache - histamine; Migrainous neuralgia; Headache - cluster; Horton's headache; Vascular headache - cluster ... Doctors do not know exactly what causes cluster headaches. They ... (chemical in the body released during an allergic response) or ...
Watson, Jane
2007-01-01
Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…
Hybrid Optical Inference Machines
1991-09-27
with labels. Now, events. a set of facts can be generated in the dyadic form "u, R 1,2" Eichmann and Caulfield [19] consider the same type of and can...these encoding schemes. These architectures are based primarily on optical inner 19. G. Eichmann and H. J. Caulfield, "Optical Learning (Inference)
Directory of Open Access Journals (Sweden)
Pablo Vinuesa
2018-05-01
The massive accumulation of genome-sequences in public databases promoted the proliferation of genome-level phylogenetic analyses in many areas of biological research. However, due to diverse evolutionary and genetic processes, many loci have undesirable properties for phylogenetic reconstruction. These, if undetected, can result in erroneous or biased estimates, particularly when estimating species trees from concatenated datasets. To deal with these problems, we developed GET_PHYLOMARKERS, a pipeline designed to identify high-quality markers to estimate robust genome phylogenies from the orthologous clusters, or the pan-genome matrix (PGM), computed by GET_HOMOLOGUES. In the first context, a set of sequential filters are applied to exclude recombinant alignments and those producing anomalous or poorly resolved trees. Multiple sequence alignments and maximum likelihood (ML) phylogenies are computed in parallel on multi-core computers. A ML species tree is estimated from the concatenated set of top-ranking alignments at the DNA or protein levels, using either FastTree or IQ-TREE (IQT). The latter is used by default due to its superior performance revealed in an extensive benchmark analysis. In addition, parsimony and ML phylogenies can be estimated from the PGM. We demonstrate the practical utility of the software by analyzing 170 Stenotrophomonas genome sequences available in RefSeq and 10 new complete genomes of Mexican environmental S. maltophilia complex (Smc) isolates reported herein. A combination of core-genome and PGM analyses was used to revise the molecular systematics of the genus. An unsupervised learning approach that uses a goodness of clustering statistic identified 20 groups within the Smc at a core-genome average nucleotide identity (cgANIb) of 95.9% that are perfectly consistent with strongly supported clades on the core- and pan-genome trees. In addition, we identified 16 misclassified RefSeq genome sequences, 14 of them labeled as
Vinuesa, Pablo; Ochoa-Sánchez, Luz E; Contreras-Moreira, Bruno
2018-01-01
The massive accumulation of genome-sequences in public databases promoted the proliferation of genome-level phylogenetic analyses in many areas of biological research. However, due to diverse evolutionary and genetic processes, many loci have undesirable properties for phylogenetic reconstruction. These, if undetected, can result in erroneous or biased estimates, particularly when estimating species trees from concatenated datasets. To deal with these problems, we developed GET_PHYLOMARKERS, a pipeline designed to identify high-quality markers to estimate robust genome phylogenies from the orthologous clusters, or the pan-genome matrix (PGM), computed by GET_HOMOLOGUES. In the first context, a set of sequential filters are applied to exclude recombinant alignments and those producing anomalous or poorly resolved trees. Multiple sequence alignments and maximum likelihood (ML) phylogenies are computed in parallel on multi-core computers. A ML species tree is estimated from the concatenated set of top-ranking alignments at the DNA or protein levels, using either FastTree or IQ-TREE (IQT). The latter is used by default due to its superior performance revealed in an extensive benchmark analysis. In addition, parsimony and ML phylogenies can be estimated from the PGM. We demonstrate the practical utility of the software by analyzing 170 Stenotrophomonas genome sequences available in RefSeq and 10 new complete genomes of Mexican environmental S. maltophilia complex (Smc) isolates reported herein. A combination of core-genome and PGM analyses was used to revise the molecular systematics of the genus. An unsupervised learning approach that uses a goodness of clustering statistic identified 20 groups within the Smc at a core-genome average nucleotide identity (cgANIb) of 95.9% that are perfectly consistent with strongly supported clades on the core- and pan-genome trees. In addition, we identified 16 misclassified RefSeq genome sequences, 14 of them labeled as S. maltophilia
Inverse Ising inference with correlated samples
International Nuclear Information System (INIS)
Obermayer, Benedikt; Levine, Erel
2014-01-01
Correlations between two variables of a high-dimensional system can be indicative of an underlying interaction, but can also result from indirect effects. Inverse Ising inference is a method to distinguish one from the other. Essentially, the parameters of the least constrained statistical model are learned from the observed correlations such that direct interactions can be separated from indirect correlations. Among many other applications, this approach has been helpful for protein structure prediction, because residues which interact in the 3D structure often show correlated substitutions in a multiple sequence alignment. In this context, samples used for inference are not independent but share an evolutionary history on a phylogenetic tree. Here, we discuss the effects of correlations between samples on global inference. Such correlations could arise due to phylogeny but also via other slow dynamical processes. We present a simple analytical model to address the resulting inference biases, and develop an exact method accounting for background correlations in alignment data by combining phylogenetic modeling with an adaptive cluster expansion algorithm. We find that popular reweighting schemes are only marginally effective at removing phylogenetic bias, suggest a rescaling strategy that yields better results, and provide evidence that our conclusions carry over to the frequently used mean-field approach to the inverse Ising problem. (paper)
Inference rule and problem solving
Energy Technology Data Exchange (ETDEWEB)
Goto, S
1982-04-01
Intelligent information processing signifies an opportunity of having man's intellectual activity executed on the computer, in which inference, in place of ordinary calculation, is used as the basic operational mechanism for such an information processing. Many inference rules are derived from syllogisms in formal logic. The problem of programming this inference function is referred to as a problem solving. Although logically inference and problem-solving are in close relation, the calculation ability of current computers is on a low level for inferring. For clarifying the relation between inference and computers, nonmonotonic logic has been considered. The paper deals with the above topics. 16 references.
Stochastic processes inference theory
Rao, Malempati M
2014-01-01
This is the revised and enlarged 2nd edition of the authors’ original text, which was intended to be a modest complement to Grenander's fundamental memoir on stochastic processes and related inference theory. The present volume gives a substantial account of regression analysis, both for stochastic processes and measures, and includes recent material on Ridge regression with some unexpected applications, for example in econometrics. The first three chapters can be used for a quarter or semester graduate course on inference on stochastic processes. The remaining chapters provide more advanced material on stochastic analysis suitable for graduate seminars and discussions, leading to dissertation or research work. In general, the book will be of interest to researchers in probability theory, mathematical statistics and electrical and information theory.
Making Type Inference Practical
DEFF Research Database (Denmark)
Schwartzbach, Michael Ignatieff; Oxhøj, Nicholas; Palsberg, Jens
1992-01-01
We present the implementation of a type inference algorithm for untyped object-oriented programs with inheritance, assignments, and late binding. The algorithm significantly improves our previous one, presented at OOPSLA'91, since it can handle collection classes, such as List, in a useful way. Also, the complexity has been dramatically improved, from exponential time to low polynomial time. The implementation uses the techniques of incremental graph construction and constraint template instantiation to avoid representing intermediate results, doing superfluous work, and recomputing type information. Experiments indicate that the implementation type checks as much as 100 lines per second. This results in a mature product, on which a number of tools can be based, for example a safety tool, an image compression tool, a code optimization tool, and an annotation tool. This may make type inference for object...
Directory of Open Access Journals (Sweden)
João Paulo Monteiro
2001-12-01
Russell's The Problems of Philosophy tries to establish a new theory of induction, at the same time that Hume is there accused of an "irrational scepticism about induction". But a careful analysis of the theory of knowledge explicitly acknowledged by Hume reveals that, contrary to the standard interpretation in the XXth century, possibly influenced by Russell, Hume deals exclusively with causal inference (which he never classifies as "causal induction", although now we are entitled to do so), never with inductive inference in general, mainly generalizations about sensible qualities of objects (whether, e.g., "all crows are black" or not is not among Hume's concerns). Russell's theories are thus only false alternatives to Hume's, in his (1912) or in his (1948).
Causal inference in econometrics
Kreinovich, Vladik; Sriboonchitta, Songsak
2016-01-01
This book is devoted to the analysis of causal inference which is one of the most difficult tasks in data analysis: when two phenomena are observed to be related, it is often difficult to decide whether one of them causally influences the other one, or whether these two phenomena have a common cause. This analysis is the main focus of this volume. To get a good understanding of the causal inference, it is important to have models of economic phenomena which are as accurate as possible. Because of this need, this volume also contains papers that use non-traditional economic models, such as fuzzy models and models obtained by using neural networks and data mining techniques. It also contains papers that apply different econometric models to analyze real-life economic dependencies.
Active inference and learning.
Friston, Karl; FitzGerald, Thomas; Rigoli, Francesco; Schwartenbeck, Philipp; O Doherty, John; Pezzulo, Giovanni
2016-09-01
This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.
DEFF Research Database (Denmark)
Ackerman, Margareta; Ben-David, Shai; Branzei, Simina
2012-01-01
We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify...
Learning Convex Inference of Marginals
Domke, Justin
2012-01-01
Graphical models trained using maximum likelihood are a common tool for probabilistic inference of marginal distributions. However, this approach suffers difficulties when either the inference process or the model is approximate. In this paper, the inference process is first defined to be the minimization of a convex function, inspired by free energy approximations. Learning is then done directly in terms of the performance of the inference process at univariate marginal prediction. The main ...
Probabilistic inductive inference: a survey
Ambainis, Andris
2001-01-01
Inductive inference is a recursion-theoretic theory of learning, first developed by E. M. Gold (1967). This paper surveys developments in probabilistic inductive inference. We mainly focus on finite inference of recursive functions, since this simple paradigm has produced the most interesting (and most complex) results.
Multimodel inference and adaptive management
Rehme, S.E.; Powell, L.A.; Allen, Craig R.
2011-01-01
Ecology is an inherently complex science coping with correlated variables, nonlinear interactions and multiple scales of pattern and process, making it difficult for experiments to result in clear, strong inference. Natural resource managers, policy makers, and stakeholders rely on science to provide timely and accurate management recommendations. However, the time necessary to untangle the complexities of interactions within ecosystems is often far greater than the time available to make management decisions. One method of coping with this problem is multimodel inference. Multimodel inference assesses uncertainty by calculating likelihoods among multiple competing hypotheses, but multimodel inference results are often equivocal. Despite this, there may be pressure for ecologists to provide management recommendations regardless of the strength of their study’s inference. We reviewed papers in the Journal of Wildlife Management (JWM) and the journal Conservation Biology (CB) to quantify the prevalence of multimodel inference approaches, the resulting inference (weak versus strong), and how authors dealt with the uncertainty. Thirty-eight percent and 14%, respectively, of articles in the JWM and CB used multimodel inference approaches. Strong inference was rarely observed, with only 7% of JWM and 20% of CB articles resulting in strong inference. We found the majority of weak inference papers in both journals (59%) gave specific management recommendations. Model selection uncertainty was ignored in most recommendations for management. We suggest that adaptive management is an ideal method to resolve uncertainty when research results in weak inference.
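One common way to express the relative support for competing models is through Akaike weights. The abstract does not say which criterion the reviewed papers used, so the use of AIC here is an assumption, and the cut-off that separates weak from strong inference is left to the reader. A minimal sketch:

```python
import math


def akaike_weights(aic_values):
    """Convert AIC scores into weights that sum to one.

    Each weight is exp(-delta / 2) normalised over all models, where delta
    is the model's AIC minus the smallest AIC in the candidate set.
    """
    best = min(aic_values)
    relative = [math.exp(-(aic - best) / 2.0) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical AIC scores for three competing hypotheses.
model_weights = akaike_weights([100.0, 102.0, 110.0])
```

When no single weight clearly dominates, model selection uncertainty is substantial, which is the situation the abstract describes as weak inference.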
Inferring time-varying network topologies from gene expression data.
Rao, Arvind; Hero, Alfred O; States, David J; Engel, James Douglas
2007-01-01
Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster--to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence.
Katz, R
1992-11-01
Cluster management is a management model that fosters decentralization of management, develops leadership potential of staff, and creates ownership of unit-based goals. Unlike shared governance models, there is no formal structure created by committees and it is less threatening for managers. There are two parts to the cluster management model. One is the formation of cluster groups, consisting of all staff and facilitated by a cluster leader. The cluster groups function for communication and problem-solving. The second part of the cluster management model is the creation of task forces. These task forces are designed to work on short-term goals, usually in response to solving one of the unit's goals. Sometimes the task forces are used for quality improvement or system problems. Clusters are groups of not more than five or six staff members, facilitated by a cluster leader. A cluster is made up of individuals who work the same shift. For example, people with job titles who work days would be in a cluster. There would be registered nurses, licensed practical nurses, nursing assistants, and unit clerks in the cluster. The cluster leader is chosen by the manager based on certain criteria and is trained for this specialized role. The concept of cluster management, criteria for choosing leaders, training for leaders, using cluster groups to solve quality improvement issues, and the learning process necessary for manager support are described.
Nonparametric statistical inference
Gibbons, Jean Dickinson
2010-01-01
Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference. -Eugenia Stoimenova, Journal of Applied Statistics, June 2012 … one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students. -Biometrics, 67, September 2011 This excellently presente
Emotional inferences by pragmatics
Iza-Miqueleiz, Mauricio
2017-01-01
It has for long been taken for granted that, along the course of reading a text, world knowledge is often required in order to establish coherent links between sentences (McKoon & Ratcliff 1992, Iza & Ezquerro 2000). The content grasped from a text turns out to be strongly dependent upon the reader’s additional knowledge that allows a coherent interpretation of the text as a whole. The world knowledge directing the inference may be of distinctive nature. Gygax et al. (2007) showed that m...
DEFF Research Database (Denmark)
Andersen, Jesper; Lawall, Julia
2010-01-01
A key issue in maintaining Linux device drivers is the need to keep them up to date with respect to evolutions in Linux internal libraries. Currently, there is little tool support for performing and documenting such changes. In this paper we present a tool, spdiff, that identifies common changes...... developers can use it to extract an abstract representation of the set of changes that others have made. Our experiments on recent changes in Linux show that the inferred generic patches are more concise than the corresponding patches found in commits to the Linux source tree while being safe with respect...
International Nuclear Information System (INIS)
Geraedts, J.M.P.
1983-01-01
Spectra of isotopically mixed clusters (dimers of SF6) are calculated as well as transition frequencies. The result leads to speculations about the suitability of the laser-cluster fragmentation process for isotope separation. (Auth.)
... a role. Unlike migraine and tension headache, cluster headache generally isn't associated with triggers, such as foods, hormonal changes or stress. Once a cluster period begins, however, drinking alcohol ...
Ensemble stacking mitigates biases in inference of synaptic connectivity.
Chambers, Brendan; Levy, Maayan; Dechery, Joseph B; MacLean, Jason N
2018-01-01
A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches.
Pearce, Iris
1985-01-01
Cluster headache is the most severe primary headache with recurrent pain attacks described as worse than giving birth. The aim of this paper was to make an overview of current knowledge on cluster headache with a focus on pathophysiology and treatment. This paper presents hypotheses of cluster headache pathophysiology, current treatment options and possible future therapy approaches. For years, the hypothalamus was regarded as the key structure in cluster headache, but is now thought to be pa...
Queiroz, Dayane Andrade
2015-01-01
In this work we present cluster categories, which were introduced by Aslak Bakke Buan, Robert Marsh, Markus Reineke, Idun Reiten and Gordana Todorov with the aim of categorifying the cluster algebras created in 2002 by Sergey Fomin and Andrei Zelevinsky. These authors showed in [4] that there is a close relationship between cluster algebras and cluster categories for quivers whose underlying graph is a Dynkin diagram. To this end they developed a tilting theory in the triang...
Energy Technology Data Exchange (ETDEWEB)
Sanfilippo, Antonio P.; Calapristi, Augustin J.; Crow, Vernon L.; Hetzler, Elizabeth G.; Turner, Alan E.
2004-05-26
We present an approach to the disambiguation of cluster labels that capitalizes on the notion of semantic similarity to assign WordNet senses to cluster labels. The approach provides interesting insights on how document clustering can provide the basis for developing a novel approach to word sense disambiguation.
SHERSTIUK S.V.; POSYLAYEVA K.I.
2013-01-01
The article presents theoretical and methodological approaches to the nature and existence of the cluster. It describes how clusters differ from other kinds of cooperative and integration associations. Scientific and practical recommendations are developed for forming a competitive horticultural cluster.
DEFF Research Database (Denmark)
Gulati, Mukesh; Lund-Thomsen, Peter; Suresh, Sangeetha
2018-01-01
sell their products successfully in international markets, but there is also an increasingly large consumer base within India. Indeed, Indian industrial clusters have contributed to a substantial part of this growth process, and there are several hundred registered clusters within the country...... of this handbook, which focuses on the role of CSR in MSMEs. Hence we contribute to the literature on CSR in industrial clusters and specifically CSR in Indian industrial clusters by investigating the drivers of CSR in India’s industrial clusters....
Likelihood-Based Inference of B Cell Clonal Families.
Directory of Open Access Journals (Sweden)
Duncan K Ralph
2016-10-01
The human immune system depends on a highly diverse collection of antibody-making B cells. B cell receptor sequence diversity is generated by a random recombination process called "rearrangement" forming progenitor B cells, then a Darwinian process of lineage diversification and selection called "affinity maturation." The resulting receptors can be sequenced in high throughput for research and diagnostics. Such a collection of sequences contains a mixture of various lineages, each of which may be quite numerous, or may consist of only a single member. As a step to understanding the process and result of this diversification, one may wish to reconstruct lineage membership, i.e. to cluster sampled sequences according to which came from the same rearrangement events. We call this clustering problem "clonal family inference." In this paper we describe and validate a likelihood-based framework for clonal family inference based on a multi-hidden Markov Model (multi-HMM) framework for B cell receptor sequences. We describe an agglomerative algorithm to find a maximum likelihood clustering, two approximate algorithms with various trade-offs of speed versus accuracy, and a third, fast algorithm for finding specific lineages. We show that under simulation these algorithms greatly improve upon existing clonal family inference methods, and that they also give significantly different clusters than previous methods when applied to two real data sets.
Wagstaff, Kiri L.
2012-03-01
On obtaining a new data set, the researcher is immediately faced with the challenge of obtaining a high-level understanding from the observations. What does a typical item look like? What are the dominant trends? How many distinct groups are included in the data set, and how is each one characterized? Which observable values are common, and which rarely occur? Which items stand out as anomalies or outliers from the rest of the data? This challenge is exacerbated by the steady growth in data set size [11] as new instruments push into new frontiers of parameter space, via improvements in temporal, spatial, and spectral resolution, or by the desire to "fuse" observations from different modalities and instruments into a larger-picture understanding of the same underlying phenomenon. Data clustering algorithms provide a variety of solutions for this task. They can generate summaries, locate outliers, compress data, identify dense or sparse regions of feature space, and build data models. It is useful to note up front that "clusters" in this context refer to groups of items within some descriptive feature space, not (necessarily) to "galaxy clusters" which are dense regions in physical space. The goal of this chapter is to survey a variety of data clustering methods, with an eye toward their applicability to astronomical data analysis. In addition to improving the individual researcher’s understanding of a given data set, clustering has led directly to scientific advances, such as the discovery of new subclasses of stars [14] and gamma-ray bursts (GRBs) [38]. All clustering algorithms seek to identify groups within a data set that reflect some observed, quantifiable structure. Clustering is traditionally an unsupervised approach to data analysis, in the sense that it operates without any direct guidance about which items should be assigned to which clusters. There has been a recent trend in the clustering literature toward supporting semisupervised or constrained
Feature Inference Learning and Eyetracking
Rehder, Bob; Colner, Robert M.; Hoffman, Aaron B.
2009-01-01
Besides traditional supervised classification learning, people can learn categories by inferring the missing features of category members. It has been proposed that feature inference learning promotes learning a category's internal structure (e.g., its typical features and interfeature correlations) whereas classification promotes the learning of…
An Inference Language for Imaging
DEFF Research Database (Denmark)
Pedemonte, Stefano; Catana, Ciprian; Van Leemput, Koen
2014-01-01
We introduce iLang, a language and software framework for probabilistic inference. The iLang framework enables the definition of directed and undirected probabilistic graphical models and the automated synthesis of high performance inference algorithms for imaging applications. The iLang framewor...
Energy Technology Data Exchange (ETDEWEB)
Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Ahn, Sungsoo [Korea Advanced Inst. Science and Technology (KAIST), Daejeon (Korea, Republic of); Shin, Jinwoo [Korea Advanced Inst. Science and Technology (KAIST), Daejeon (Korea, Republic of)
2017-05-25
Computing the partition function is the most important statistical inference task arising in applications of Graphical Models (GM). Since it is computationally intractable, approximate methods have been used to resolve the issue in practice, where mean-field (MF) and belief propagation (BP) are arguably the most popular and successful approaches of a variational type. In this paper, we propose two new variational schemes, coined Gauged-MF (G-MF) and Gauged-BP (G-BP), improving MF and BP, respectively. Both provide lower bounds for the partition function by utilizing the so-called gauge transformation, which modifies factors of the GM while keeping the partition function invariant. Moreover, we prove that both G-MF and G-BP are exact for GMs with a single loop of a special structure, even though the bare MF and BP perform badly in this case. Our extensive experiments, on complete GMs of relatively small size and on large GMs (up to 300 variables), confirm that the newly proposed algorithms outperform and generalize MF and BP.
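To make the notion of a variational lower bound on the partition function concrete, here is a minimal sketch of the plain (naive) mean-field bound compared against brute-force enumeration. It does not implement the gauged schemes G-MF or G-BP from the abstract. The pairwise ±1 spin model, the parameter names `h` and `J`, and both function names are assumptions chosen for illustration.

```python
import itertools
import math

def log_partition_exact(h, J):
    """Brute-force log Z for spins x_i in {-1, +1}; cost is O(2^n)."""
    n = len(h)
    total = 0.0
    for x in itertools.product((-1, 1), repeat=n):
        score = sum(h[i] * x[i] for i in range(n))
        score += sum(w * x[i] * x[j] for (i, j), w in J.items())
        total += math.exp(score)
    return math.log(total)

def binary_entropy(m):
    """Entropy of a +/-1 spin whose mean is m."""
    ent = 0.0
    for p in ((1 + m) / 2, (1 - m) / 2):
        if p > 0:
            ent -= p * math.log(p)
    return ent

def mean_field_bound(h, J, sweeps=200):
    """Naive mean-field lower bound on log Z via coordinate ascent."""
    n = len(h)
    m = [0.1] * n  # initial means (arbitrary small value)
    for _ in range(sweeps):
        for i in range(n):
            field = h[i]
            for (a, b), w in J.items():
                if a == i:
                    field += w * m[b]
                elif b == i:
                    field += w * m[a]
            m[i] = math.tanh(field)
    # Expected score under the product distribution plus its entropy.
    energy = sum(h[i] * m[i] for i in range(n))
    energy += sum(w * m[i] * m[j] for (i, j), w in J.items())
    return energy + sum(binary_entropy(mi) for mi in m)

# Hypothetical 3-variable model with a single loop.
h = [0.2, -0.1, 0.3]
J = {(0, 1): 0.5, (1, 2): -0.4, (0, 2): 0.3}
exact = log_partition_exact(h, J)
bound = mean_field_bound(h, J)
```

Any choice of means yields a valid lower bound, so `bound` never exceeds `exact`; the gap between them is what improved schemes aim to shrink.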
Social Inference Through Technology
Oulasvirta, Antti
Awareness cues are computer-mediated, real-time indicators of people’s undertakings, whereabouts, and intentions. Already in the mid-1970s, UNIX users could use commands such as “finger” and “talk” to find out who was online and to chat. The small icons in instant messaging (IM) applications that indicate coconversants’ presence in the discussion space are the successors of “finger” output. Similar indicators can be found in online communities, media-sharing services, Internet relay chat (IRC), and location-based messaging applications. But presence and availability indicators are only the tip of the iceberg. Technological progress has enabled richer, more accurate, and more intimate indicators. For example, there are mobile services that allow friends to query and follow each other’s locations. Remote monitoring systems developed for health care allow relatives and doctors to assess the wellbeing of homebound patients (see, e.g., Tang and Venables 2000). But users also utilize cues that have not been deliberately designed for this purpose. For example, online gamers pay attention to other characters’ behavior to infer what the other players are like “in real life.” There is a common denominator underlying these examples: shared activities rely on the technology’s representation of the remote person. The other human being is not physically present but present only through a narrow technological channel.
An intelligent clustering based methodology for confusable diseases ...
African Journals Online (AJOL)
Journal of Computer Science and Its Application ... In this paper, an intelligent system driven by fuzzy clustering algorithm and Adaptive Neuro-Fuzzy Inference System for ... Data on patients diagnosed and confirmed by laboratory tests of viral ...
Ensemble stacking mitigates biases in inference of synaptic connectivity
Directory of Open Access Journals (Sweden)
Brendan Chambers
2018-03-01
A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches. Mapping the routing of spikes through local circuitry is crucial for understanding neocortical computation. Under appropriate experimental conditions, these maps can be used to infer likely patterns of synaptic recruitment, linking activity to underlying anatomical connections. Such inferences help to reveal the synaptic implementation of population dynamics and computation. We compare a number of standard functional measures to infer underlying connectivity. We find that regularization impacts measures
Optimization methods for logical inference
Chandru, Vijay
2011-01-01
Merging logic and mathematics in deductive inference: an innovative, cutting-edge approach. Optimization methods for logical inference? Absolutely, say Vijay Chandru and John Hooker, two major contributors to this rapidly expanding field. And even though "solving logical inference problems with optimization methods may seem a bit like eating sauerkraut with chopsticks. . . it is the mathematical structure of a problem that determines whether an optimization model can help solve it, not the context in which the problem occurs." Presenting powerful, proven optimization techniques for logic in
Minku, Leandro L.
2017-10-06
Background: Software Effort Estimation (SEE) can be formulated as an online learning problem, where new projects are completed over time and may become available for training. In this scenario, a Cross-Company (CC) SEE approach called Dycom can drastically reduce the number of Within-Company (WC) projects needed for training, saving the high cost of collecting such training projects. However, Dycom relies on splitting CC projects into different subsets in order to create its CC models. Such splitting can have a significant impact on Dycom's predictive performance. Aims: This paper investigates whether clustering methods can be used to help find good CC splits for Dycom. Method: Dycom is extended to use clustering methods for creating the CC subsets. Three different clustering methods are investigated, namely Hierarchical Clustering, K-Means, and Expectation-Maximisation. Clustering Dycom is compared against the original Dycom with CC subsets of different sizes, based on four SEE databases. A baseline WC model is also included in the analysis. Results: Clustering Dycom with K-Means can potentially help to split the CC projects, managing to achieve similar or better predictive performance than Dycom. However, K-Means still requires the number of CC subsets to be pre-defined, and a poor choice can negatively affect predictive performance. EM enables Dycom to automatically set the number of CC subsets while still maintaining or improving predictive performance with respect to the baseline WC model. Clustering Dycom with Hierarchical Clustering did not offer significant advantage in terms of predictive performance. Conclusion: Clustering methods can be an effective way to automatically generate Dycom's CC subsets.
International Nuclear Information System (INIS)
Romli
1997-01-01
Cluster analysis is the name of a group of multivariate techniques whose principal purpose is to distinguish similar entities based on the characteristics they possess. Several algorithms can be used for this analysis. This paper therefore discusses those algorithms, including similarity measures and hierarchical clustering, which covers the single linkage, complete linkage and average linkage methods. The non-hierarchical clustering method popularly known as the K-means method is also discussed. Finally, the paper describes the advantages and disadvantages of each method.
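The three hierarchical linkage rules named in the abstract differ only in how the distance between two clusters is defined. The sketch below shows this with a small agglomerative procedure; the function names, the Euclidean distance, and the sample points are assumptions for illustration, not taken from the paper.

```python
import math

def linkage_distance(a, b, method):
    """Distance between two clusters (lists of points) under a linkage rule."""
    dists = [math.dist(p, q) for p in a for q in b]
    if method == "single":      # closest pair of points
        return min(dists)
    if method == "complete":    # farthest pair of points
        return max(dists)
    if method == "average":     # mean over all cross-cluster pairs
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage: {method}")

def agglomerate(points, k, method="single"):
    """Merge the closest pair of clusters until only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage_distance(clusters[i], clusters[j], method)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical 2-D data: two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
groups = agglomerate(points, k=2, method="average")
```

On well-separated data like this the three rules agree; they tend to diverge on elongated or noisy groups, where single linkage is prone to chaining.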
Everitt, Brian S; Leese, Morven; Stahl, Daniel
2011-01-01
Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demons
DEFF Research Database (Denmark)
Böcker, S.; Baumbach, Jan
2013-01-01
. The problem has been the inspiration for numerous algorithms in bioinformatics, aiming at clustering entities such as genes, proteins, phenotypes, or patients. In this paper, we review exact and heuristic methods that have been proposed for the Cluster Editing problem, and also applications......The Cluster Editing problem asks to transform a graph into a disjoint union of cliques using a minimum number of edge modifications. Although the problem has been proven NP-complete several times, it has nevertheless attracted much research both from the theoretical and the applied side...
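The Cluster Editing objective stated in the abstract can be written down directly for tiny graphs: try every partition of the vertices and count the edge insertions and deletions needed to turn the graph into a disjoint union of cliques on those blocks. This exhaustive search is only a sketch for illustration (the problem is NP-complete, as the abstract notes); the function names and the example graph are hypothetical.

```python
import itertools

def partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing block in turn...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ...or into a new block of its own.
        yield [[first]] + part

def editing_cost(blocks, edges):
    """Insertions plus deletions needed so each block is a clique and
    no edge runs between blocks."""
    inside = set()
    for block in blocks:
        inside.update(frozenset(p) for p in itertools.combinations(block, 2))
    return len(inside ^ edges)  # symmetric difference = required edits

def cluster_editing(vertices, edge_list):
    """Exhaustive search; only feasible for very small graphs."""
    edges = {frozenset(e) for e in edge_list}
    return min(
        ((editing_cost(p, edges), p) for p in partitions(list(vertices))),
        key=lambda pair: pair[0],
    )

# Hypothetical graph: a triangle 0-1-2 plus a path 2-3-4.
cost, blocks = cluster_editing(range(5), [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)])
```

For this graph a single deletion (the edge between vertices 2 and 3) suffices, leaving the triangle and the remaining edge as two cliques.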
On principles of inductive inference
Kostecki, Ryszard Paweł
2011-01-01
We propose an intersubjective epistemic approach to foundations of probability theory and statistical inference, based on relative entropy and category theory, and aimed to bypass the mathematical and conceptual problems of existing foundational approaches.
Statistical inference via fiducial methods
Salomé, Diemer
1998-01-01
In this thesis the attention is restricted to inductive reasoning using a mathematical probability model. A statistical procedure prescribes, for every theoretically possible set of data, the inference about the unknown of interest. ... See: Summary
Statistical inference for stochastic processes
National Research Council Canada - National Science Library
Basawa, Ishwar V; Prakasa Rao, B. L. S
1980-01-01
The aim of this monograph is to attempt to reduce the gap between theory and applications in the area of stochastic modelling, by directing the interest of future researchers to the inference aspects...
Mapping Dark Matter in Simulated Galaxy Clusters
Bowyer, Rachel
2018-01-01
Galaxy clusters are the most massive bound objects in the Universe with most of their mass being dark matter. Cosmological simulations of structure formation show that clusters are embedded in a cosmic web of dark matter filaments and large scale structure. It is thought that these filaments are found preferentially close to the long axes of clusters. We extract galaxy clusters from the simulations "cosmo-OWLS" in order to study their properties directly and also to infer their properties from weak gravitational lensing signatures. We investigate various stacking procedures to enhance the signal of the filaments and large scale structure surrounding the clusters to better understand how the filaments of the cosmic web connect with galaxy clusters. This project was supported in part by the NSF REU grant AST-1358980 and by the Nantucket Maria Mitchell Association.
Dynamics of Galaxy Clusters and their Outskirts
DEFF Research Database (Denmark)
Falco, Martina
Galaxy clusters have proven to be powerful probes of cosmology, since their mass and abundance depend on the cosmological model that describes the Universe and on the gravitational formation process of cosmological structures. The main challenge in using clusters to constrain cosmology...... is that their masses cannot be measured directly, but need to be inferred indirectly through their observable properties. The most common methods extract the cluster mass from their strong X-ray emission or from the measured redshifts of the galaxy members. The gravitational lensing effect caused by clusters...... on the background galaxies is also an important trace of their total mass distribution. In the work presented within this thesis, we exploit the connection between the gravitational potential of galaxy clusters and the kinematical properties of their surroundings, in order to determine the total cluster mass...
Active inference, communication and hermeneutics.
Friston, Karl J; Frith, Christopher D
2015-07-01
Hermeneutics refers to interpretation and translation of text (typically ancient scriptures) but also applies to verbal and non-verbal communication. In a psychological setting it nicely frames the problem of inferring the intended content of a communication. In this paper, we offer a solution to the problem of neural hermeneutics based upon active inference. In active inference, action fulfils predictions about how we will behave (e.g., predicting we will speak). Crucially, these predictions can be used to predict both self and others--during speaking and listening respectively. Active inference mandates the suppression of prediction errors by updating an internal model that generates predictions--both at fast timescales (through perceptual inference) and slower timescales (through perceptual learning). If two agents adopt the same model, then--in principle--they can predict each other and minimise their mutual prediction errors. Heuristically, this ensures they are singing from the same hymn sheet. This paper builds upon recent work on active inference and communication to illustrate perceptual learning using simulated birdsongs. Our focus here is the neural hermeneutics implicit in learning, where communication facilitates long-term changes in generative models that are trying to predict each other. In other words, communication induces perceptual learning and enables others to (literally) change our minds and vice versa. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Pottawattamie County School System, Council Bluffs, IA.
The 15 occupational clusters (transportation, fine arts and humanities, communications and media, personal service occupations, construction, hospitality and recreation, health occupations, marine science occupations, consumer and homemaking-related occupations, agribusiness and natural resources, environment, public service, business and office…
DEFF Research Database (Denmark)
Berks, G.; Keyserlingk, Diedrich Graf von; Jantzen, Jan
2000-01-01
A symptom is a condition indicating the presence of a disease, especially when regarded as an aid in diagnosis. Symptoms are the smallest units indicating the existence of a disease. A syndrome on the other hand is an aggregate, set or cluster of concurrent symptoms which together indicate...... and clustering are the basic concerns in medicine. Classification depends on definitions of the classes and their required degree of participant of the elements in the cases' symptoms. In medicine imprecise conditions are the rule and therefore fuzzy methods are much more suitable than crisp ones. Fuzzy c......-mean clustering is an easy and well improved tool, which has been applied in many medical fields. We used c-mean fuzzy clustering after feature extraction from an aphasia database. Factor analysis was applied on a correlation matrix of 26 symptoms of language disorders and led to five factors. The factors...
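Fuzzy c-means differs from crisp clustering in that each item receives a degree of membership in every cluster. The following is a minimal one-dimensional sketch of the standard algorithm, not the authors' pipeline; the function name, the fuzzifier value `m=2.0`, the starting centers, and the data are assumptions for illustration.

```python
def fuzzy_c_means(xs, centers, m=2.0, iters=100):
    """1-D fuzzy c-means; returns (centers, memberships[point][cluster])."""
    centers = list(centers)
    for _ in range(iters):
        memberships = []
        for x in xs:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # Point coincides with a center: full membership there.
                hit = dists.index(0.0)
                row = [1.0 if k == hit else 0.0 for k in range(len(centers))]
            else:
                # Membership is proportional to distance^(-2/(m-1)).
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                row = [v / total for v in inv]
            memberships.append(row)
        # Update each center as a membership-weighted mean of the data.
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            centers[k] = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    return centers, memberships

# Hypothetical scores forming two loose groups.
xs = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
centers, memberships = fuzzy_c_means(xs, centers=[0.0, 6.0])
```

Each row of `memberships` sums to 1, so a borderline item would show comparable degrees in both clusters instead of being forced into one.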
Non-parametric co-clustering of large scale sparse bipartite networks on the GPU
DEFF Research Database (Denmark)
Hansen, Toke Jansen; Mørup, Morten; Hansen, Lars Kai
2011-01-01
of row and column clusters from a hypothesis space of an infinite number of clusters. To reach large scale applications of co-clustering we exploit that parameter inference for co-clustering is well suited for parallel computing. We develop a generic GPU framework for efficient inference on large scale...... sparse bipartite networks and achieve a speedup of two orders of magnitude compared to estimation based on conventional CPUs. In terms of scalability we find for networks with more than 100 million links that reliable inference can be achieved in less than an hour on a single GPU. To efficiently manage...
Donchev, Todor I [Urbana, IL; Petrov, Ivan G [Champaign, IL
2011-05-31
Described herein is an apparatus and a method for producing atom clusters based on a gas discharge within a hollow cathode. The hollow cathode includes one or more walls. The one or more walls define a sputtering chamber within the hollow cathode and include a material to be sputtered. A hollow anode is positioned at an end of the sputtering chamber, and atom clusters are formed when a gas discharge is generated between the hollow anode and the hollow cathode.
Massey, Richard; Kitching, Thomas; Nagai, Daisuke
2010-01-01
The unique properties of dark matter are revealed during collisions between clusters of galaxies, such as the bullet cluster (1E 0657−56) and baby bullet (MACS J0025−12). These systems provide evidence for an additional, invisible mass in the separation between the distributions of their total mass, measured via gravitational lensing, and their ordinary ‘baryonic’ matter, measured via its X-ray emission. Unfortunately, the information available from these systems is limited by their rarity. C...
Leroux, Elizabeth; Ducros, Anne
2008-01-01
Abstract Cluster headache (CH) is a primary headache disease characterized by recurrent short-lasting attacks (15 to 180 minutes) of excruciating unilateral periorbital pain accompanied by ipsilateral autonomic signs (lacrimation, nasal congestion, ptosis, miosis, lid edema, redness of the eye). It affects young adults, predominantly males. Prevalence is estimated at 0.5–1.0/1,000. CH has a circannual and circadian periodicity, attacks being clustered (hence the name) in bouts that can occur ...
Gene cluster statistics with gene families.
Raghupathy, Narayanan; Durand, Dannie
2009-05-01
Identifying genomic regions that descended from a common ancestor is important for understanding the function and evolution of genomes. In distantly related genomes, clusters of homologous gene pairs are evidence of candidate homologous regions. Demonstrating the statistical significance of such "gene clusters" is an essential component of comparative genomic analyses. However, currently there are no practical statistical tests for gene clusters that model the influence of the number of homologs in each gene family on cluster significance. In this work, we demonstrate empirically that failure to incorporate gene family size in gene cluster statistics results in overestimation of significance, leading to incorrect conclusions. We further present novel analytical methods for estimating gene cluster significance that take gene family size into account. Our methods do not require complete genome data and are suitable for testing individual clusters found in local regions, such as contigs in an unfinished assembly. We consider pairs of regions drawn from the same genome (paralogous clusters), as well as regions drawn from two different genomes (orthologous clusters). Determining cluster significance under general models of gene family size is computationally intractable. By assuming that all gene families are of equal size, we obtain analytical expressions that allow fast approximation of cluster probabilities. We evaluate the accuracy of this approximation by comparing the resulting gene cluster probabilities with cluster probabilities obtained by simulating a realistic, power-law distributed model of gene family size, with parameters inferred from genomic data. Surprisingly, despite the simplicity of the underlying assumption, our method accurately approximates the true cluster probabilities. It slightly overestimates these probabilities, yielding a conservative test. We present additional simulation results indicating the best choice of parameter values for data
Optimal inference with suboptimal models: Addiction and active Bayesian inference
Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl
2015-01-01
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Statistical inference for Cox processes
DEFF Research Database (Denmark)
Møller, Jesper; Waagepetersen, Rasmus Plenge
2002-01-01
Research has generated a number of advances in methods for spatial cluster modelling in recent years, particularly in the area of Bayesian cluster modelling. Along with these advances has come an explosion of interest in the potential applications of this work, especially in epidemiology and genome...... research. In one integrated volume, this book reviews the state-of-the-art in spatial clustering and spatial cluster modelling, bringing together research and applications previously scattered throughout the literature. It begins with an overview of the field, then presents a series of chapters...... that illuminate the nature and purpose of cluster modelling within different application areas, including astrophysics, epidemiology, ecology, and imaging. The focus then shifts to methods, with discussions on point and object process modelling, perfect sampling of cluster processes, partitioning in space...
International Nuclear Information System (INIS)
Goodman, J.; Hut, P.
1985-01-01
The enigma of core collapse receives much attention in this volume. In addition, several observational papers summarize recent techniques and results and discuss the stellar dynamical implications of the enormous progress in the quality of surface photometry, proper motion studies, radial velocity determinations, as well as space-based measurements in a variety of wavelengths. The value of these Proceedings as a standard reference work is enhanced by the inclusion of two appendices, featuring English translations of two seminal papers on stellar dynamics published in Russian and not previously available in a Western language. A third appendix contains an up-to-date catalogue of observationally determined parameters of galactic globular clusters, as well as theoretically inferred parameters. This catalogue will prove to be an essential reference for phenomenological studies and an ideal testing ground for new theoretical developments. (orig.)
Interactive Instruction in Bayesian Inference
DEFF Research Database (Denmark)
Khan, Azam; Breslav, Simon; Hornbæk, Kasper
2018-01-01
An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction....... These principles concern coherence, personalization, signaling, segmenting, multimedia, spatial contiguity, and pretraining. Principles of self-explanation and interactivity are also applied. Four experiments on the Mammography Problem showed that these principles help participants answer the questions...... that an instructional approach to improving human performance in Bayesian inference is a promising direction....
On Maximum Entropy and Inference
Directory of Open Access Journals (Sweden)
Luigi Gresele
2017-11-01
Maximum entropy is a powerful concept that entails a sharp separation between relevant and irrelevant variables. It is typically invoked in inference, once an assumption is made on what the relevant variables are, in order to estimate a model from data that affords predictions on all other (dependent) variables. Conversely, maximum entropy can be invoked to retrieve the relevant variables (sufficient statistics) directly from the data, once a model is identified by Bayesian model selection. We explore this approach in the case of spin models with interactions of arbitrary order, and we discuss how relevant interactions can be inferred. In this perspective, the dimensionality of the inference problem is not set by the number of parameters in the model, but by the frequency distribution of the data. We illustrate the method by showing its ability to recover the correct model in a few prototype cases and discuss its application to a real dataset.
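Identifikasi Gangguan Neurologis Menggunakan Metode Adaptive Neuro Fuzzy Inference System (ANFIS) [Identification of Neurological Disorders Using the Adaptive Neuro Fuzzy Inference System (ANFIS) Method]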
Directory of Open Access Journals (Sweden)
Jani Kusanti
2015-07-01
Abstract This study uses the Adaptive Neuro Fuzzy Inference System (ANFIS) method to identify a neurological disorder of the head, known medically as ischemic stroke, from a CT scan of the head, in order to locate the ischemic stroke. The identification process begins by extracting features from the head CT image using a histogram. The intensity-histogram image is then enhanced using an Otsu threshold, so that pixels valued 1 correspond to the object and pixels valued 0 correspond to the background. The result is passed to an image clustering step that uses fuzzy c-means (FCM); the clustering output is a row of cluster centers, which is used to construct a fuzzy inference system (FIS). The FIS applied is the Takagi-Sugeno-Kang model. In this study ANFIS is used to optimize the determination of the location of the ischemic stroke blockage, with a recursive least squares estimator (RLSE) used for learning. The RMSE obtained in training was 0.0432053, and the accuracy obtained in testing was 98.66%. Keywords— Ischemic stroke, Global threshold, Fuzzy Inference System model Sugeno, ANFIS, RMSE
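The Otsu thresholding step named in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the function name `otsu_threshold` and the synthetic test image are assumptions made purely for illustration, and a real CT image would need its own preprocessing.

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Return the grey level that maximizes between-class variance (Otsu's method)."""
    hist, edges = np.histogram(image.ravel(), bins=bins, range=(0, 256))
    prob = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(prob)                 # weight of the background class up to each bin
    w1 = 1.0 - w0                        # weight of the object class
    cum_mean = np.cumsum(prob * centers)
    total_mean = cum_mean[-1]
    # Between-class variance for every candidate split; empty classes give 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    sigma_b = np.nan_to_num(sigma_b, nan=0.0, posinf=0.0, neginf=0.0)
    k = int(np.argmax(sigma_b))
    return edges[k + 1]                  # upper edge of the last background bin

# Hypothetical image: dark background (~50) with a bright 20x20 square (~200).
rng = np.random.default_rng(0)
image = rng.normal(50, 10, size=(64, 64))
image[20:40, 20:40] = rng.normal(200, 10, size=(20, 20))
image = np.clip(image, 0, 255)

t = otsu_threshold(image)
mask = (image >= t).astype(np.uint8)     # 1 = object pixel, 0 = background pixel
print("threshold:", t, "object pixels:", int(mask.sum()))
```

The binary mask plays the role described in the abstract (1 for object, 0 for background); a clustering step such as fuzzy c-means would then operate on the pixels the mask selects.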
Eight challenges in phylodynamic inference
Directory of Open Access Journals (Sweden)
Simon D.W. Frost
2015-03-01
The field of phylodynamics, which attempts to enhance our understanding of infectious disease dynamics using pathogen phylogenies, has made great strides in the past decade. Basic epidemiological and evolutionary models are now well characterized with inferential frameworks in place. However, significant challenges remain in extending phylodynamic inference to more complex systems. These challenges include accounting for evolutionary complexities such as changing mutation rates, selection, reassortment, and recombination, as well as epidemiological complexities such as stochastic population dynamics, host population structure, and different patterns at the within-host and between-host scales. An additional challenge exists in making efficient inferences from an ever increasing corpus of sequence data.
Problem solving and inference mechanisms
Energy Technology Data Exchange (ETDEWEB)
Furukawa, K; Nakajima, R; Yonezawa, A; Goto, S; Aoyama, A
1982-01-01
The heart of the fifth generation computer will be powerful mechanisms for problem solving and inference. A deduction-oriented language is to be designed, which will form the core of the whole computing system. The language is based on predicate logic with the extended features of structuring facilities, meta structures and relational data base interfaces. Parallel computation mechanisms and specialized hardware architectures are being investigated to make possible efficient realization of the language features. The project includes research into an intelligent programming system, a knowledge representation language and system, and a meta inference system to be built on the core. 30 references.
Minku, Leandro L.; Hou, Siqing
2017-01-01
baseline WC model is also included in the analysis. Results: Clustering Dycom with K-Means can potentially help to split the CC projects, managing to achieve similar or better predictive performance than Dycom. However, K-Means still requires the number
Object-Oriented Type Inference
DEFF Research Database (Denmark)
Schwartzbach, Michael Ignatieff; Palsberg, Jens
1991-01-01
We present a new approach to inferring types in untyped object-oriented programs with inheritance, assignments, and late binding. It guarantees that all messages are understood, annotates the program with type information, allows polymorphic methods, and can be used as the basis of an op...
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Langseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2009-01-01
Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....
Mixed normal inference on multicointegration
Boswijk, H.P.
2009-01-01
Asymptotic likelihood analysis of cointegration in I(2) models, see Johansen (1997, 2006), Boswijk (2000) and Paruolo (2000), has shown that inference on most parameters is mixed normal, implying hypothesis test statistics with an asymptotic χ² null distribution. The asymptotic distribution of the
Statistical inference and Aristotle's Rhetoric.
Macdonald, Ranald R
2004-11-01
Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.
DEFF Research Database (Denmark)
Christensen, Thomas Budde
The cluster theory attributed to Michael Porter has significantly influenced industrial policies in countries across Europe and North America since the beginning of the 1990s. Institutions such as the EU, OECD and the World Bank and governments in countries such as the UK, France, The Netherlands...... or management. Both the Accelerate Wales and the Accelerate Cluster programmes target this issue by trying to establish networks between companies that can be used to supply knowledge from research institutions to manufacturing companies. The paper concludes that public sector interventions can make...... businesses. The universities were not considered by the participating companies to be important parts of the local business environment and inputs from universities did not appear to be an important source to access knowledge about new product development or new techniques in production, distribution...
Small Business Administration — The Regional Innovation Clusters serve a diverse group of sectors and geographies. Three of the initial pilot clusters, termed Advanced Defense Technology clusters,...
Mucha, Hans-Joachim; Sofyan, Hizir
2000-01-01
As an explorative technique, cluster analysis provides a description or a reduction in the dimension of the data. It classifies a set of observations into two or more mutually exclusive unknown groups based on combinations of many variables. Its aim is to construct groups in such a way that the profiles of objects in the same groups are relatively homogeneous whereas the profiles of objects in different groups are relatively heterogeneous. Clustering is distinct from classification techniques, ...
Molecular Polarizability of Sc and C (Fullerene and Graphite) Clusters
Directory of Open Access Journals (Sweden)
Francisco Torrens
2001-05-01
A method (POLAR) for the calculation of the molecular polarizability is presented. It uses the interacting induced dipoles polarization model. As an example, the method is applied to Scn and Cn (fullerene and one-shell graphite model) clusters. On varying the number of atoms, the clusters show numbers indicative of particularly polarizable structures. The results are compared with reference calculations (PAPID). In general, the Scn calculated (POLAR) and Cn computed (POLAR and PAPID) are less polarizable than what is inferred from the bulk. However, the Scn calculated (PAPID) are more polarizable than what is inferred. Moreover, previous theoretical work yielded the same trend for Sin, Gen and GanAsm small clusters. The high polarizability of the Scn clusters (PAPID) is attributed to dangling bonds at the surface of the cluster.
Statistical learning and selective inference.
Taylor, Jonathan; Tibshirani, Robert J
2015-06-23
We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
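The cherry-picking problem described in this abstract can be seen in a small simulation. The sketch below is an illustration under stated assumptions, not the selective-inference methods the authors develop: all features are pure noise, the strongest one is selected, and a naive p-value is compared against a Bonferroni-adjusted one (Bonferroni is used here only as a simple, well-known stand-in for "setting a higher bar"). The trial count, feature count and variable names are hypothetical.

```python
import math
import numpy as np

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

rng = np.random.default_rng(1)
n_trials, n_features, alpha = 2000, 20, 0.05

naive_hits = 0
bonferroni_hits = 0
for _ in range(n_trials):
    z = rng.standard_normal(n_features)      # every feature is null (no real association)
    z_best = z[np.argmax(np.abs(z))]         # "cherry-pick" the strongest association
    p = two_sided_p(z_best)
    naive_hits += int(p < alpha)                     # test as if no selection happened
    bonferroni_hits += int(p < alpha / n_features)   # raise the bar for the selected feature

naive_rate = naive_hits / n_trials
bonferroni_rate = bonferroni_hits / n_trials
print(f"false-positive rate, naive:      {naive_rate:.3f}")
print(f"false-positive rate, Bonferroni: {bonferroni_rate:.3f}")
```

Under these assumptions the naive test declares a null feature significant far more often than the nominal 5%, while the adjusted test stays near it.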
Bayesian inference with ecological applications
Link, William A
2009-01-01
This text is written to provide a mathematically sound but accessible and engaging introduction to Bayesian inference specifically for environmental scientists, ecologists and wildlife biologists. It emphasizes the power and usefulness of Bayesian methods in an ecological context. The advent of fast personal computers and easily available software has simplified the use of Bayesian and hierarchical models. One obstacle remains for ecologists and wildlife biologists, namely the near absence of Bayesian texts written specifically for them. The book includes many relevant examples, is supported by software and examples on a companion website and will become an essential grounding in this approach for students and research ecologists. Engagingly written text specifically designed to demystify a complex subject. Examples drawn from ecology and wildlife research. An essential grounding for graduate and research ecologists in the increasingly prevalent Bayesian approach to inference. Companion website with analyt...
Statistical inference an integrated approach
Migon, Helio S; Louzada, Francisco
2014-01-01
Introduction Information The concept of probability Assessing subjective probabilities An example Linear algebra and probability Notation Outline of the book Elements of Inference Common statistical models Likelihood-based functions Bayes theorem Exchangeability Sufficiency and exponential family Parameter elimination Prior Distribution Entirely subjective specification Specification through functional forms Conjugacy with the exponential family Non-informative priors Hierarchical priors Estimation Introduction to decision theory Bayesian point estimation Classical point estimation Empirical Bayes estimation Comparison of estimators Interval estimation Estimation in the Normal model Approximating Methods The general problem of inference Optimization techniques Asymptotic theory Other analytical approximations Numerical integration methods Simulation methods Hypothesis Testing Introduction Classical hypothesis testing Bayesian hypothesis testing Hypothesis testing and confidence intervals Asymptotic tests Prediction...
Bayesian inference on proportional elections.
Directory of Open Access Journals (Sweden)
Gabriel Hideki Vatanabe Brunello
Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not necessarily guarantee that the candidate with the largest percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied to proportional elections. In this context, the purpose of this paper was to perform a Bayesian inference on proportional elections considering the Brazilian system of seats distribution. More specifically, a methodology to estimate the probability that a given party will have representation in the chamber of deputies was developed. Inferences were made in a Bayesian scenario using the Monte Carlo simulation technique, and the developed methodology was applied to data from the Brazilian elections for Members of the Legislative Assembly and Federal Chamber of Deputies in 2010. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software.
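The general idea of this abstract, estimating by Monte Carlo the probability that a party wins at least one seat, can be sketched as follows. This is a simplified illustration, not the paper's methodology: it is written in Python rather than the R used by the authors, the plain D'Hondt highest-averages rule stands in for the Brazilian seat-distribution rules, the Dirichlet(1, ..., 1) prior is an assumption, and the poll counts are invented.

```python
import numpy as np

def dhondt(votes, seats):
    """Allocate seats with the D'Hondt highest-averages rule."""
    votes = np.asarray(votes, dtype=float)
    alloc = np.zeros(len(votes), dtype=int)
    for _ in range(seats):
        quotients = votes / (alloc + 1)      # current average for each party
        alloc[np.argmax(quotients)] += 1     # next seat goes to the highest average
    return alloc

rng = np.random.default_rng(42)
poll_counts = np.array([420, 310, 180, 60, 30])   # hypothetical poll, one count per party
seats = 10
n_draws = 5000

# Posterior of vote shares: multinomial likelihood with a uniform Dirichlet prior.
shares = rng.dirichlet(poll_counts + 1, size=n_draws)

# Allocate seats for every posterior draw, then count how often each party is represented.
allocs = np.array([dhondt(s, seats) for s in shares])
prob_represented = (allocs > 0).mean(axis=0)

for party, p in enumerate(prob_represented):
    print(f"party {party}: P(at least one seat) = {p:.3f}")
```

The same loop yields the full posterior distribution of seat counts per party, since `allocs` holds one allocation per draw.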
Causal inference based on counterfactuals
Directory of Open Access Journals (Sweden)
Höfler M
2005-09-01
Abstract Background The counterfactual or potential outcome model has become increasingly standard for causal inference in epidemiological and medical studies. Discussion This paper provides an overview on the counterfactual and related approaches. A variety of conceptual as well as practical issues when estimating causal effects are reviewed. These include causal interactions, imperfect experiments, adjustment for confounding, time-varying exposures, competing risks and the probability of causation. It is argued that the counterfactual model of causal effects captures the main aspects of causality in health sciences and relates to many statistical procedures. Summary Counterfactuals are the basis of causal inference in medicine and epidemiology. Nevertheless, the estimation of counterfactual differences poses several difficulties, primarily in observational studies. These problems, however, reflect fundamental barriers only when learning from observations, and this does not invalidate the counterfactual concept.
System Support for Forensic Inference
Gehani, Ashish; Kirchner, Florent; Shankar, Natarajan
Digital evidence is playing an increasingly important role in prosecuting crimes. The reasons are manifold: financially lucrative targets are now connected online, systems are so complex that vulnerabilities abound and strong digital identities are being adopted, making audit trails more useful. If the discoveries of forensic analysts are to hold up to scrutiny in court, they must meet the standard for scientific evidence. Software systems are currently developed without consideration of this fact. This paper argues for the development of a formal framework for constructing “digital artifacts” that can serve as proxies for physical evidence; a system so imbued would facilitate sound digital forensic inference. A case study involving a filesystem augmentation that provides transparent support for forensic inference is described.
Probability biases as Bayesian inference
Directory of Open Access Journals (Sweden)
Andre C. R. Martins
2006-11-01
In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated with them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors under the be understood as adaptations to the solution of real life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. In that sense, it should be no surprise that humans reason with probability as it has been observed.
Statistical inference on residual life
Jeong, Jong-Hyeon
2014-01-01
This is a monograph on the concept of residual life, which is an alternative summary measure of time-to-event data, or survival data. The mean residual life has been used for many years under the name of life expectancy, so it is a natural concept for summarizing survival or reliability data. It is also more interpretable than the popular hazard function, especially for communications between patients and physicians regarding the efficacy of a new drug in the medical field. This book reviews existing statistical methods to infer the residual life distribution. The review and comparison includes existing inference methods for mean and median, or quantile, residual life analysis through medical data examples. The concept of the residual life is also extended to competing risks analysis. The targeted audience includes biostatisticians, graduate students, and PhD (bio)statisticians. Knowledge in survival analysis at an introductory graduate level is advisable prior to reading this book.
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Statistical inference a short course
Panik, Michael J
2012-01-01
A concise, easily accessible introduction to descriptive and inferential techniques Statistical Inference: A Short Course offers a concise presentation of the essentials of basic statistics for readers seeking to acquire a working knowledge of statistical concepts, measures, and procedures. The author conducts tests on the assumption of randomness and normality, provides nonparametric methods when parametric approaches might not work. The book also explores how to determine a confidence interval for a population median while also providing coverage of ratio estimation, randomness, and causal
On Quantum Statistical Inference, II
Barndorff-Nielsen, O. E.; Gill, R. D.; Jupp, P. E.
2003-01-01
Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, theoretical developments in the theory of quantum measurements have brought the basic mathematical framework for the probability calculations much closer to that of classical probability theory. The present paper reviews this field and proposes and inte...
Nonparametric predictive inference in reliability
International Nuclear Information System (INIS)
Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.
2002-01-01
We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function for a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI for replacement decisions. The emphasis is on introduction and illustration of NPI in reliability contexts; detailed mathematical justifications are presented elsewhere.
Variational inference & deep learning : A new synthesis
Kingma, D.P.
2017-01-01
In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.
Continuous Integrated Invariant Inference, Phase I
National Aeronautics and Space Administration — The proposed project will develop a new technique for invariant inference and embed this and other current invariant inference and checking techniques in an...
Scalable inference for stochastic block models
Peng, Chengbin
2017-12-08
Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of "big data," traditional inference algorithms for such a model are increasingly limited due to their high time complexity and poor scalability. In this paper, we propose a multi-stage maximum likelihood approach to recover the latent parameters of the stochastic block model, in time linear with respect to the number of edges. We also propose a parallel algorithm based on message passing. Our algorithm can overlap communication and computation, providing speedup without compromising accuracy as the number of processors grows. For example, to process a real-world graph with about 1.3 million nodes and 10 million edges, our algorithm requires about 6 seconds on 64 cores of a contemporary commodity Linux cluster. Experiments demonstrate that the algorithm can produce high quality results on both benchmark and real-world graphs. An example of finding more meaningful communities is illustrated consequently in comparison with a popular modularity maximization algorithm.
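To make the model in this abstract concrete, the sketch below samples a small two-block stochastic block model and computes the maximum likelihood estimate of the block connection probabilities. It is not the multi-stage or parallel algorithm the paper proposes: the community labels are assumed known here, whereas recovering them is the hard part the paper addresses. The graph size, the matrix `P` and the function name `estimate_block_probs` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
n_nodes, n_blocks = 200, 2
labels = rng.integers(0, n_blocks, size=n_nodes)   # community of each node (assumed known)
P = np.array([[0.30, 0.02],
              [0.02, 0.25]])                       # P[r, s] = edge probability between blocks r and s

# Sample an undirected graph without self-loops: edge (i, j) ~ Bernoulli(P[label_i, label_j]).
edge_probs = P[labels][:, labels]
upper = np.triu(rng.random((n_nodes, n_nodes)) < edge_probs, k=1)
A = (upper | upper.T).astype(int)

def estimate_block_probs(A, labels, n_blocks):
    """MLE of block probabilities given labels: observed edges / possible node pairs."""
    est = np.zeros((n_blocks, n_blocks))
    for r in range(n_blocks):
        idx_r = np.flatnonzero(labels == r)
        for s in range(r, n_blocks):
            idx_s = np.flatnonzero(labels == s)
            block = A[np.ix_(idx_r, idx_s)]
            if r == s:
                edges = block.sum() / 2                      # each edge appears twice
                pairs = len(idx_r) * (len(idx_r) - 1) / 2
            else:
                edges = block.sum()
                pairs = len(idx_r) * len(idx_s)
            est[r, s] = est[s, r] = edges / pairs if pairs > 0 else 0.0
    return est

P_hat = estimate_block_probs(A, labels, n_blocks)
print(np.round(P_hat, 3))
```

Each estimate only needs edge counts per block pair, which is consistent with the abstract's point that inference can run in time linear in the number of edges once the assignment is available.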
Variations on Bayesian Prediction and Inference
2016-05-09
There are a number of statistical inference problems that are not generally formulated via a full probability model...the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle
Adaptive Inference on General Graphical Models
Acar, Umut A.; Ihler, Alexander T.; Mettu, Ramgopal; Sumer, Ozgur
2012-01-01
Many algorithms and applications involve repeatedly solving variations of the same inference problem; for example we may want to introduce new evidence to the model or perform updates to conditional dependencies. The goal of adaptive inference is to take advantage of what is preserved in the model and perform inference more rapidly than from scratch. In this paper, we describe techniques for adaptive inference on general graphs that support marginal computation and updates to the conditional ...
Numerical approximations for speeding up mcmc inference in the infinite relational model
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Albers, Kristoffer Jon
2015-01-01
The infinite relational model (IRM) is a powerful model for discovering clusters in complex networks; however, the computational speed of Markov chain Monte Carlo inference in the model can be a limiting factor when analyzing large networks. We investigate how using numerical approximations...
Directory of Open Access Journals (Sweden)
Amol P. Bhondekar
2010-03-01
The sensor deployment scheme strongly governs the effectiveness of a distributed wireless sensor network. Issues such as energy conservation and clustering make the deployment problem much more complex. A multiobjective Fuzzy Inference System based strategy for mobile sensor deployment is presented in this paper. This strategy gives a synergistic combination of energy capacity, clustering and peer-to-peer deployment. The performance of our strategy is evaluated in terms of coverage, uniformity, speed and clustering. Our algorithm is compared against a modified distributed self-spreading algorithm and exhibits better performance.
Nuclear clustering - a cluster core model study
International Nuclear Information System (INIS)
Paul Selvi, G.; Nandhini, N.; Balasubramaniam, M.
2015-01-01
Nuclear clustering, like other clustering phenomena in nature, warrants study, since it would help us understand the nature of the binding of nucleons inside the nucleus, closed-shell behaviour when the system is highly deformed, and dynamics and structure at extremes. Several models account for the clustering phenomenon in nuclei. In this work we present a cluster core model study of nuclear clustering in light mass nuclei.
Sweller, Naomi; Hayes, Brett K
2010-08-01
Three studies examined how task demands that impact on attention to typical or atypical category features shape the category representations formed through classification learning and inference learning. During training categories were learned via exemplar classification or by inferring missing exemplar features. In the latter condition inferences were made about missing typical features alone (typical feature inference) or about both missing typical and atypical features (mixed feature inference). Classification and mixed feature inference led to the incorporation of typical and atypical features into category representations, with both kinds of features influencing inferences about familiar (Experiments 1 and 2) and novel (Experiment 3) test items. Those in the typical inference condition focused primarily on typical features. Together with formal modelling, these results challenge previous accounts that have characterized inference learning as producing a focus on typical category features. The results show that two different kinds of inference learning are possible and that these are subserved by different kinds of category representations.
Generative inference for cultural evolution.
Kandler, Anne; Powell, Adam
2018-04-05
One of the major challenges in cultural evolution is to understand why and how various forms of social learning are used in human populations, both now and in the past. To date, much of the theoretical work on social learning has been done in isolation of data, and consequently many insights focus on revealing the learning processes or the distributions of cultural variants that are expected to have evolved in human populations. In population genetics, recent methodological advances have allowed a greater understanding of the explicit demographic and/or selection mechanisms that underlie observed allele frequency distributions across the globe, and their change through time. In particular, generative frameworks, often using coalescent-based simulation coupled with approximate Bayesian computation (ABC), have provided robust inferences on the human past, with no reliance on a priori assumptions of equilibrium. Here, we demonstrate the applicability and utility of generative inference approaches to the field of cultural evolution. The framework advocated here uses observed population-level frequency data directly to establish the likely presence or absence of particular hypothesized learning strategies. In this context, we discuss the problem of equifinality and argue that, in the light of sparse cultural data and the multiplicity of possible social learning processes, the exclusion of those processes inconsistent with the observed data might be the most instructive outcome. Finally, we summarize the findings of generative inference approaches applied to a number of case studies. This article is part of the theme issue 'Bridging cultural gaps: interdisciplinary studies in human cultural evolution'. © 2018 The Author(s).
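The approximate Bayesian computation (ABC) step mentioned in this abstract can be illustrated with a minimal rejection sampler. This is only a toy sketch, not the authors' framework: the uniform prior, the binomial-style simulator, the tolerance, and all names (`abc_rejection`, `observed_count`, `n_individuals`) are assumptions made for illustration.

```python
import random


def abc_rejection(observed_count, n_individuals, n_draws, tolerance, rng):
    """Toy ABC rejection sampler for the frequency p of one cultural variant.

    Assumed model (hypothetical): each of n_individuals independently
    carries the variant with probability p, and p has a uniform prior.
    """
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # candidate parameter drawn from the uniform prior
        # Generative step: simulate how many individuals carry the variant.
        simulated = sum(1 for _ in range(n_individuals) if rng.random() < p)
        # Keep the candidate only if simulated and observed data are close.
        if abs(simulated - observed_count) <= tolerance:
            accepted.append(p)
    return accepted


rng = random.Random(0)  # fixed seed so the sketch is reproducible
posterior = abc_rejection(
    observed_count=30, n_individuals=100, n_draws=5000, tolerance=2, rng=rng
)
posterior_mean = sum(posterior) / len(posterior)
```

The accepted values approximate the posterior for `p`. Candidate values whose simulations never resemble the data are simply excluded, which mirrors the abstract's point that ruling out inconsistent processes can be the most instructive outcome.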
sick: The Spectroscopic Inference Crank
Casey, Andrew R.
2016-03-01
There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
Inferring network structure from cascades
Ghonge, Sushrut; Vural, Dervis Can
2017-07-01
Many physical, biological, and social phenomena can be described by cascades taking place on a network. Often, the activity can be empirically observed, but not the underlying network of interactions. In this paper we offer three topological methods to infer the structure of any directed network given a set of cascade arrival times. Our formulas hold for a very general class of models where the activation probability of a node is a generic function of its degree and the number of its active neighbors. We report high success rates for synthetic and real networks, for several different cascade models.
SICK: THE SPECTROSCOPIC INFERENCE CRANK
Energy Technology Data Exchange (ETDEWEB)
Casey, Andrew R., E-mail: arc@ast.cam.ac.uk [Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge, CB3 0HA (United Kingdom)
2016-03-15
Inference in hybrid Bayesian networks
International Nuclear Information System (INIS)
Langseth, Helge; Nielsen, Thomas D.; Rumi, Rafael; Salmeron, Antonio
2009-01-01
Since the 1980s, Bayesian networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability techniques (like fault trees and reliability block diagrams). However, limitations in the BNs' calculation engine have prevented BNs from becoming equally popular for domains containing mixtures of both discrete and continuous variables (the so-called hybrid domains). In this paper we focus on these difficulties, and summarize some of the last decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability.
Copy-number analysis and inference of subclonal populations in cancer genomes using Sclust.
Cun, Yupeng; Yang, Tsun-Po; Achter, Viktor; Lang, Ulrich; Peifer, Martin
2018-06-01
The genomes of cancer cells constantly change during pathogenesis. This evolutionary process can lead to the emergence of drug-resistant mutations in subclonal populations, which can hinder therapeutic intervention in patients. Data derived from massively parallel sequencing can be used to infer these subclonal populations using tumor-specific point mutations. The accurate determination of copy-number changes and tumor impurity is necessary to reliably infer subclonal populations by mutational clustering. This protocol describes how to use Sclust, a copy-number analysis method with a recently developed mutational clustering approach. In a series of simulations and comparisons with alternative methods, we have previously shown that Sclust accurately determines copy-number states and subclonal populations. Performance tests show that the method is computationally efficient, with copy-number analysis and mutational clustering taking … Linux/Unix command-line syntax should be able to carry out analyses of subclonal populations.
Directory of Open Access Journals (Sweden)
Ducros Anne
2008-07-01
Cluster headache (CH) is a primary headache disease characterized by recurrent short-lasting attacks (15 to 180 minutes) of excruciating unilateral periorbital pain accompanied by ipsilateral autonomic signs (lacrimation, nasal congestion, ptosis, miosis, lid edema, redness of the eye). It affects young adults, predominantly males. Prevalence is estimated at 0.5–1.0/1,000. CH has a circannual and circadian periodicity, attacks being clustered (hence the name) in bouts that can occur during specific months of the year. Alcohol is the only dietary trigger of CH; strong odors (mainly solvents and cigarette smoke) and napping may also trigger CH attacks. During bouts, attacks may happen at precise hours, especially during the night. During the attacks, patients tend to be restless. CH may be episodic or chronic, depending on the presence of remission periods. CH is associated with trigeminovascular activation and neuroendocrine and vegetative disturbances; however, the precise causative mechanisms remain unknown. Involvement of the hypothalamus (a structure regulating endocrine function and sleep-wake rhythms) has been confirmed, explaining, at least in part, the cyclic aspects of CH. The disease is familial in about 10% of cases. Genetic factors play a role in CH susceptibility, and a causative role has been suggested for the hypocretin receptor gene. Diagnosis is clinical. Differential diagnoses include other primary headache diseases such as migraine, paroxysmal hemicrania and SUNCT syndrome. At present, there is no curative treatment. There are efficient treatments to shorten the painful attacks (acute treatments) and to reduce the number of daily attacks (prophylactic treatments). Acute treatment is based on subcutaneous administration of sumatriptan and high-flow oxygen. Verapamil, lithium, methysergide, prednisone, greater occipital nerve blocks and topiramate may be used for prophylaxis. In refractory cases, deep-brain stimulation of the
Network inference from functional experimental data (Conference Presentation)
Desrosiers, Patrick; Labrecque, Simon; Tremblay, Maxime; Bélanger, Mathieu; De Dorlodot, Bertrand; Côté, Daniel C.
2016-03-01
Functional connectivity maps of neuronal networks are critical tools to understand how neurons form circuits, how information is encoded and processed by neurons, how memory is shaped, and how these basic processes are altered under pathological conditions. Current light microscopy allows the calcium or electrical activity of thousands of neurons to be observed simultaneously, yet assessing comprehensive connectivity maps directly from such data remains a non-trivial analytical task. There exist simple statistical methods, such as cross-correlation and Granger causality, but they only detect linear interactions between neurons. Other more involved inference methods inspired by information theory, such as mutual information and transfer entropy, identify connections between neurons more accurately but also require more computational resources. We carried out a comparative study of common connectivity inference methods. The relative accuracy and computational cost of each method were determined via simulated fluorescence traces generated with realistic computational models of interacting neurons in networks of different topologies (clustered or non-clustered) and sizes (10-1000 neurons). To bridge the computational and experimental works, we observed the intracellular calcium activity of live hippocampal neuronal cultures infected with the fluorescent calcium marker GCaMP6f. The spontaneous activity of the networks, consisting of 50-100 neurons per field of view, was recorded from 20 to 50 Hz on a microscope controlled by homemade software. We implemented all connectivity inference methods in the software, which rapidly loads calcium fluorescence movies, segments the images, extracts the fluorescence traces, and assesses the functional connections (with strengths and directions) between each pair of neurons. We used this software to assess, in real time, the functional connectivity from real calcium imaging data in basal conditions, under plasticity protocols, and epileptic
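Of the methods named in this abstract, cross-correlation is the simplest to sketch. The example below is a hedged illustration only, not the authors' software: it scans a few non-negative lags and reports the one at which a source trace is most linearly correlated with a target trace. The helper names (`pearson`, `best_lag`) and the synthetic traces are assumptions.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def best_lag(source, target, max_lag):
    """Return (lag, r) maximising the correlation of source[t] with target[t + lag]."""
    best = (0, -1.0)
    for lag in range(max_lag + 1):
        a = source[: len(source) - lag]  # source samples that have a lagged partner
        b = target[lag:]                 # target samples shifted back by `lag`
        r = pearson(a, b)
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic traces (assumed): the target copies the source two samples later.
rng = random.Random(1)
source_trace = [rng.gauss(0.0, 1.0) for _ in range(200)]
target_trace = [0.0, 0.0] + source_trace[:-2]
lag, r = best_lag(source_trace, target_trace, max_lag=5)
```

A peak at a positive lag suggests a directed influence from source to target, but, as the abstract notes, this measure only captures linear interactions.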
Inferring modules from human protein interactome classes
Directory of Open Access Journals (Sweden)
Chaurasia Gautam
2010-07-01
Background: The integration of protein-protein interaction networks derived from high-throughput screening approaches and complementary sources is a key topic in systems biology. Although integration of protein interaction data is conventionally performed, the effects of this procedure on the results of network analyses have not been examined yet. In particular, in order to optimize the fusion of heterogeneous interaction datasets, it is crucial to consider not only their degree of coverage and accuracy, but also their mutual dependencies and additional salient features. Results: We examined this issue based on the analysis of modules detected by network clustering methods applied to both integrated and individual (disaggregated) data sources, which we call interactome classes. Due to class diversity, we deal with variable dependencies of data features arising from structural specificities and biases, but also from possible overlaps. Since highly connected regions of the human interactome may point to potential protein complexes, we have focused on the concept of modularity, and elucidated the detection power of module extraction algorithms by independent validations based on GO, MIPS and KEGG. From the combination of protein interactions with gene expressions, a confidence scoring scheme has been proposed before proceeding via GO with further classification in permanent and transient modules. Conclusions: Disaggregated interactomes are shown to be informative for inferring modularity, thus contributing to an effective integrative analysis. Validation of the extracted modules by multiple annotation allows for the assessment of confidence measures assigned to the modules in a protein pathway context. Notably, the proposed multilayer confidence scheme can be used for network calibration by enabling a transition from unweighted to weighted interactomes based on biological evidence.
Young star clusters in nearby molecular clouds
Getman, K. V.; Kuhn, M. A.; Feigelson, E. D.; Broos, P. S.; Bate, M. R.; Garmire, G. P.
2018-06-01
The SFiNCs (Star Formation in Nearby Clouds) project is an X-ray/infrared study of the young stellar populations in 22 star-forming regions with distances ≲ 1 kpc designed to extend our earlier MYStIX (Massive Young Star-Forming Complex Study in Infrared and X-ray) survey of more distant clusters. Our central goal is to give empirical constraints on cluster formation mechanisms. Using parametric mixture models applied homogeneously to the catalogue of SFiNCs young stars, we identify 52 SFiNCs clusters and 19 unclustered stellar structures. The procedure gives cluster properties including location, population, morphology, association with molecular clouds, absorption, age (AgeJX), and infrared spectral energy distribution (SED) slope. Absorption, SED slope, and AgeJX are age indicators. SFiNCs clusters are examined individually, and collectively with MYStIX clusters, to give the following results. (1) SFiNCs is dominated by smaller, younger, and more heavily obscured clusters than MYStIX. (2) SFiNCs cloud-associated clusters have the high ellipticities aligned with their host molecular filaments indicating morphology inherited from their parental clouds. (3) The effect of cluster expansion is evident from the radius-age, radius-absorption, and radius-SED correlations. Core radii increase dramatically from ∼0.08 to ∼0.9 pc over the age range 1-3.5 Myr. Inferred gas removal time-scales are longer than 1 Myr. (4) Rich, spatially distributed stellar populations are present in SFiNCs clouds representing early generations of star formation. An appendix compares the performance of the mixture models and non-parametric minimum spanning tree to identify clusters. This work is a foundation for future SFiNCs/MYStIX studies including disc longevity, age gradients, and dynamical modelling.
Lower complexity bounds for lifted inference
DEFF Research Database (Denmark)
Jaeger, Manfred
2015-01-01
instances of the model. Numerous approaches for such “lifted inference” techniques have been proposed. While it has been demonstrated that these techniques will lead to significantly more efficient inference on some specific models, there are only very recent and still quite restricted results that show...... the feasibility of lifted inference on certain syntactically defined classes of models. Lower complexity bounds that imply some limitations for the feasibility of lifted inference on more expressive model classes were established earlier in Jaeger (2000; Jaeger, M. 2000. On the complexity of inference about...... that under the assumption that NETIME≠ETIME, there is no polynomial lifted inference algorithm for knowledge bases of weighted, quantifier-, and function-free formulas. Further strengthening earlier results, this is also shown to hold for approximate inference and for knowledge bases not containing...
Statistical inference for financial engineering
Taniguchi, Masanobu; Ogata, Hiroaki; Taniai, Hiroyuki
2014-01-01
This monograph provides the fundamentals of statistical inference for financial engineering and covers some selected methods suitable for analyzing financial time series data. In order to describe the actual financial data, various stochastic processes, e.g. non-Gaussian linear processes, non-linear processes, long-memory processes, locally stationary processes etc. are introduced and their optimal estimation is considered as well. This book also includes several statistical approaches, e.g., discriminant analysis, the empirical likelihood method, control variate method, quantile regression, realized volatility etc., which have been recently developed and are considered to be powerful tools for analyzing the financial data, establishing a new bridge between time series and financial engineering. This book is well suited as a professional reference book on finance, statistics and statistical financial engineering. Readers are expected to have an undergraduate-level knowledge of statistics.
Type inference for correspondence types
DEFF Research Database (Denmark)
Hüttel, Hans; Gordon, Andy; Hansen, Rene Rydhof
2009-01-01
We present a correspondence type/effect system for authenticity in a π-calculus with polarized channels, dependent pair types and effect terms and show how one may, given a process P and an a priori type environment E, generate constraints that are formulae in the Alternating Least Fixed......-Point (ALFP) logic. We then show how a reasonable model of the generated constraints yields a type/effect assignment such that P becomes well-typed with respect to E if and only if this is possible. The formulae generated satisfy a finite model property; a system of constraints is satisfiable if and only...... if it has a finite model. As a consequence, we obtain the result that type/effect inference in our system is polynomial-time decidable....
Causal inference in public health.
Glass, Thomas A; Goodman, Steven N; Hernán, Miguel A; Samet, Jonathan M
2013-01-01
Causal inference has a central role in public health; the determination that an association is causal indicates the possibility for intervention. We review and comment on the long-used guidelines for interpreting evidence as supporting a causal association and contrast them with the potential outcomes framework that encourages thinking in terms of causes that are interventions. We argue that in public health this framework is more suitable, providing an estimate of an action's consequences rather than the less precise notion of a risk factor's causal effect. A variety of modern statistical methods adopt this approach. When an intervention cannot be specified, causal relations can still exist, but how to intervene to change the outcome will be unclear. In application, the often-complex structure of causal processes needs to be acknowledged and appropriate data collected to study them. These newer approaches need to be brought to bear on the increasingly complex public health challenges of our globalized world.
Brightest Cluster Galaxies in REXCESS Clusters
Haarsma, Deborah B.; Leisman, L.; Bruch, S.; Donahue, M.
2009-01-01
Most galaxy clusters contain a Brightest Cluster Galaxy (BCG) which is larger than the other cluster ellipticals and has a more extended profile. In the hierarchical model, the BCG forms through many galaxy mergers in the crowded center of the cluster, and thus its properties give insight into the assembly of the cluster as a whole. In this project, we are working with the Representative XMM-Newton Cluster Structure Survey (REXCESS) team (Boehringer et al 2007) to study BCGs in 33 X-ray luminous galaxy clusters, 0.055 < z < 0.183. We are imaging the BCGs in R band at the Southern Observatory for Astrophysical Research (SOAR) in Chile. In this poster, we discuss our methods and give preliminary measurements of the BCG magnitudes, morphology, and stellar mass. We compare these BCG properties with the properties of their host clusters, particularly of the X-ray emitting gas.
Partitional clustering algorithms
2015-01-01
This book summarizes the state-of-the-art in partitional clustering. Clustering, the unsupervised classification of patterns into groups, is one of the most important tasks in exploratory data analysis. Primary goals of clustering include gaining insight into, classifying, and compressing data. Clustering has a long and rich history that spans a variety of scientific disciplines including anthropology, biology, medicine, psychology, statistics, mathematics, engineering, and computer science. As a result, numerous clustering algorithms have been proposed since the early 1950s. Among these algorithms, partitional (nonhierarchical) ones have found many applications, especially in engineering and computer science. This book provides coverage of consensus clustering, constrained clustering, large scale and/or high dimensional clustering, cluster validity, cluster visualization, and applications of clustering. Examines clustering as it applies to large and/or high-dimensional data sets commonly encountered in reali...
Inference Attacks and Control on Database Structures
Directory of Open Access Journals (Sweden)
Muhamed Turkanovic
2015-02-01
Today’s databases store information with sensitivity levels that range from public to highly sensitive, hence ensuring confidentiality can be highly important, but also requires costly control. This paper focuses on the inference problem on different database structures. It presents possible threats to privacy in relation to inference, and control methods for mitigating these threats. The paper shows that using only access control, without any inference control, is inadequate, since these models are unable to protect against indirect data access. Furthermore, it covers new inference problems which arise from the dimensions of new technologies like XML, semantics, etc.
BioCluster: Tool for Identification and Clustering of Enterobacteriaceae Based on Biochemical Data
Directory of Open Access Journals (Sweden)
Ahmed Abdullah
2015-06-01
Presumptive identification of different Enterobacteriaceae species is routinely achieved based on biochemical properties. Traditional practice includes manual comparison of each biochemical property of the unknown sample with known reference samples and inference of its identity based on the maximum similarity pattern with the known samples. This process is labor-intensive, time-consuming, error-prone, and subjective. Therefore, automation of sorting and similarity calculation would be advantageous. Here we present a MATLAB-based graphical user interface (GUI) tool named BioCluster. This tool was designed for automated clustering and identification of Enterobacteriaceae based on biochemical test results. In this tool, we used two types of algorithms, i.e., traditional hierarchical clustering (HC) and the Improved Hierarchical Clustering (IHC), a modified algorithm that was developed specifically for the clustering and identification of Enterobacteriaceae species. IHC takes into account the variability in the results of 1–47 biochemical tests within the Enterobacteriaceae family. This tool also provides different options to optimize the clustering in a user-friendly way. Using computer-generated synthetic data and some real data, we have demonstrated that BioCluster has high accuracy in clustering and identifying enterobacterial species based on biochemical test data. This tool can be freely downloaded at http://microbialgen.du.ac.bd/biocluster/.
Diversity among galaxy clusters
International Nuclear Information System (INIS)
Struble, M.F.; Rood, H.J.
1988-01-01
The classification of galaxy clusters is discussed. Consideration is given to the classification schemes of Abell (1950s), Zwicky (1950s), Morgan, Matthews, and Schmidt (1964), and Morgan-Bautz (1970). Galaxies can be classified based on morphology, chemical composition, spatial distribution, and motion. The correlation between a galaxy's environment and morphology is examined. The classification scheme of Rood-Sastry (1971), which is based on the clusters' morphology and galaxy population, is described. The six types of clusters they define are: (1) a cD-cluster dominated by a single large galaxy, (2) a cluster dominated by a binary, (3) a core-halo cluster, (4) a cluster dominated by several bright galaxies, (5) a cluster appearing flattened, and (6) an irregularly shaped cluster. Attention is also given to the evolution of cluster structures, which is related to initial density and cluster motion.
LAIT: a local ancestry inference toolkit.
Hui, Daniel; Fang, Zhou; Lin, Jerome; Duan, Qing; Li, Yun; Hu, Ming; Chen, Wei
2017-09-06
Inferring local ancestry in individuals of mixed ancestry has many applications, most notably in identifying disease-susceptible loci that vary among different ethnic groups. Many software packages are available for inferring local ancestry in admixed individuals. However, most of these existing software packages require specific formatted input files and generate output files in various types, yielding practical inconvenience. We developed a tool set, Local Ancestry Inference Toolkit (LAIT), which can convert standardized files into software-specific input file formats as well as standardize and summarize inference results for four popular local ancestry inference software: HAPMIX, LAMP, LAMP-LD, and ELAI. We tested LAIT using both simulated and real data sets and demonstrated that LAIT provides convenience to run multiple local ancestry inference software. In addition, we evaluated the performance of local ancestry software among different supported software packages, mainly focusing on inference accuracy and computational resources used. We provided a toolkit to facilitate the use of local ancestry inference software, especially for users with limited bioinformatics background.
Forward and backward inference in spatial cognition.
Directory of Open Access Journals (Sweden)
Will D Penny
This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, we propose that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.
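The localization example in this abstract (combining a path-integration estimate with sensory input) corresponds to one step of a discrete Bayesian forward filter. The sketch below is a generic textbook version under assumed inputs, not the paper's model: `transition[i][j]` is a hypothetical probability of moving from location `i` to `j`, and `likelihood[j]` is a hypothetical probability of the current sensory observation at location `j`.

```python
def forward_step(belief, transition, likelihood):
    """One forward-inference step over discrete locations.

    belief:      prior probability of each location
    transition:  transition[i][j] = P(next = j | current = i), the motion model
    likelihood:  likelihood[j] = P(observation | location = j), the sensory model
    """
    n = len(belief)
    # Prediction (path integration): push the belief through the motion model.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Correction (sensory input): weight each location by the observation likelihood.
    unnormalised = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalised)
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalised]


# Three locations on a ring; the agent usually steps to the next location.
motion = [
    [0.2, 0.8, 0.0],
    [0.0, 0.2, 0.8],
    [0.8, 0.0, 0.2],
]
start = [1.0, 0.0, 0.0]
uninformative = forward_step(start, motion, [0.5, 0.5, 0.5])
informative = forward_step(start, motion, [0.9, 0.1, 0.1])
```

With an uninformative observation the posterior equals the motion prediction. With an observation that favours location 0, the posterior shifts back toward location 0 even though the motion model predicted a move.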
Generative Inferences Based on Learned Relations
Chen, Dawn; Lu, Hongjing; Holyoak, Keith J.
2017-01-01
A key property of relational representations is their "generativity": From partial descriptions of relations between entities, additional inferences can be drawn about other entities. A major theoretical challenge is to demonstrate how the capacity to make generative inferences could arise as a result of learning relations from…
Inference in models with adaptive learning
Chevillon, G.; Massmann, M.; Mavroeidis, S.
2010-01-01
Identification of structural parameters in models with adaptive learning can be weak, causing standard inference procedures to become unreliable. Learning also induces persistent dynamics, and this makes the distribution of estimators and test statistics non-standard. Valid inference can be
Fiducial inference - A Neyman-Pearson interpretation
Salome, D; VonderLinden, W; Dose,; Fischer, R; Preuss, R
1999-01-01
Fisher's fiducial argument is a tool for deriving inferences in the form of a probability distribution on the parameter space, not based on Bayes's Theorem. Lindley established that in exceptional situations fiducial inferences coincide with posterior distributions; in the other situations fiducial
Uncertainty in prediction and in inference
Hilgevoord, J.; Uffink, J.
1991-01-01
The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close relationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in
Causal inference in economics and marketing.
Varian, Hal R
2016-07-05
This is an elementary introduction to causal inference in economics written for readers familiar with machine learning methods. The critical step in any causal analysis is estimating the counterfactual: a prediction of what would have happened in the absence of the treatment. The powerful techniques used in machine learning may be useful for developing better estimates of the counterfactual, potentially improving causal inference.
Nonparametric predictive inference in statistical process control
Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.
2000-01-01
New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities based on
The Impact of Disablers on Predictive Inference
Cummins, Denise Dellarosa
2014-01-01
People consider alternative causes when deciding whether a cause is responsible for an effect (diagnostic inference) but appear to neglect them when deciding whether an effect will occur (predictive inference). Five experiments were conducted to test a 2-part explanation of this phenomenon: namely, (a) that people interpret standard predictive…
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark
2006-01-01
We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference...
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan
2004-01-01
We describe a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...
Extended likelihood inference in reliability
International Nuclear Information System (INIS)
Martz, H.F. Jr.; Beckman, R.J.; Waller, R.A.
1978-10-01
Extended likelihood methods of inference are developed in which subjective information in the form of a prior distribution is combined with sampling results by means of an extended likelihood function. The extended likelihood function is standardized for use in obtaining extended likelihood intervals. Extended likelihood intervals are derived for the mean of a normal distribution with known variance, the failure-rate of an exponential distribution, and the parameter of a binomial distribution. Extended second-order likelihood methods are developed and used to solve several prediction problems associated with the exponential and binomial distributions. In particular, such quantities as the next failure-time, the number of failures in a given time period, and the time required to observe a given number of failures are predicted for the exponential model with a gamma prior distribution on the failure-rate. In addition, six types of life testing experiments are considered. For the binomial model with a beta prior distribution on the probability of nonsurvival, methods are obtained for predicting the number of nonsurvivors in a given sample size and for predicting the required sample size for observing a specified number of nonsurvivors. Examples illustrate each of the methods developed. Finally, comparisons are made with Bayesian intervals in those cases where these are known to exist
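For orientation, the exponential model with a gamma prior on the failure rate mentioned in this abstract has a well-known conjugate Bayesian treatment, sketched below. This is the standard Bayesian calculation, not the report's extended likelihood method, and the prior parameters and test data are hypothetical.

```python
def gamma_exponential_update(prior_shape, prior_rate, n_failures, total_time):
    """Conjugate update for an exponential failure-time model.

    With a Gamma(shape, rate) prior on the failure rate and n_failures
    observed over total_time units of testing, the posterior is
    Gamma(shape + n_failures, rate + total_time).
    """
    return prior_shape + n_failures, prior_rate + total_time


def predictive_survival(shape, rate, t):
    """P(next failure time > t), averaging exp(-lambda * t) over Gamma(shape, rate)."""
    return (rate / (rate + t)) ** shape


def predictive_median(shape, rate):
    """Time t at which predictive_survival(shape, rate, t) equals 0.5."""
    return rate * (2.0 ** (1.0 / shape) - 1.0)


# Hypothetical prior (shape 2, rate 100 hours) and data (3 failures in 400 hours).
post_shape, post_rate = gamma_exponential_update(2.0, 100.0, 3, 400.0)
posterior_mean_rate = post_shape / post_rate
median_next_failure = predictive_median(post_shape, post_rate)
```

The predictive distribution of the next failure time is heavier-tailed than an exponential with the posterior mean rate, because it also carries the remaining uncertainty about the rate itself.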
Reinforcement learning or active inference?
Friston, Karl J; Daunizeau, Jean; Kiebel, Stefan J
2009-07-29
This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.
Active inference and epistemic value.
Friston, Karl; Rigoli, Francesco; Ognibene, Dimitri; Mathys, Christoph; Fitzgerald, Thomas; Pezzulo, Giovanni
2015-01-01
We offer a formal treatment of choice behavior based on the premise that agents minimize the expected free energy of future outcomes. Crucially, the negative free energy or quality of a policy can be decomposed into extrinsic and epistemic (or intrinsic) value. Minimizing expected free energy is therefore equivalent to maximizing extrinsic value or expected utility (defined in terms of prior preferences or goals), while maximizing information gain or intrinsic value (or reducing uncertainty about the causes of valuable outcomes). The resulting scheme resolves the exploration-exploitation dilemma: Epistemic value is maximized until there is no further information gain, after which exploitation is assured through maximization of extrinsic value. This is formally consistent with the Infomax principle, generalizing formulations of active vision based upon salience (Bayesian surprise) and optimal decisions based on expected utility and risk-sensitive (Kullback-Leibler) control. Furthermore, as with previous active inference formulations of discrete (Markovian) problems, ad hoc softmax parameters become the expected (Bayes-optimal) precision of beliefs about, or confidence in, policies. This article focuses on the basic theory, illustrating the ideas with simulations. A key aspect of these simulations is the similarity between precision updates and dopaminergic discharges observed in conditioning paradigms.
Ancient Biomolecules and Evolutionary Inference.
Cappellini, Enrico; Prohaska, Ana; Racimo, Fernando; Welker, Frido; Pedersen, Mikkel Winther; Allentoft, Morten E; de Barros Damgaard, Peter; Gutenbrunner, Petra; Dunne, Julie; Hammann, Simon; Roffet-Salque, Mélanie; Ilardo, Melissa; Moreno-Mayar, J Víctor; Wang, Yucheng; Sikora, Martin; Vinner, Lasse; Cox, Jürgen; Evershed, Richard P; Willerslev, Eske
2018-04-25
Over the last decade, studies of ancient biomolecules (particularly ancient DNA, proteins, and lipids) have revolutionized our understanding of evolutionary history. Though initially fraught with many challenges, the field now stands on firm foundations. Researchers now successfully retrieve nucleotide and amino acid sequences, as well as lipid signatures, from progressively older samples, originating from geographic areas and depositional environments that, until recently, were regarded as hostile to long-term preservation of biomolecules. Sampling frequencies and the spatial and temporal scope of studies have also increased markedly, and with them the size and quality of the data sets generated. This progress has been made possible by continuous technical innovations in analytical methods, enhanced criteria for the selection of ancient samples, integrated experimental methods, and advanced computational approaches. Here, we discuss the history and current state of ancient biomolecule research, its applications to evolutionary inference, and future directions for this young and exciting field. Expected final online publication date for the Annual Review of Biochemistry Volume 87 is June 20, 2018. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Bayesian Inference Methods for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand
2013-01-01
This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...
EI: A Program for Ecological Inference
Directory of Open Access Journals (Sweden)
Gary King
2004-09-01
The program EI provides a method of inferring individual behavior from aggregate data. It implements the statistical procedures, diagnostics, and graphics from the book A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data (King 1997). Ecological inference, as traditionally defined, is the process of using aggregate (i.e., "ecological") data to infer discrete individual-level relationships of interest when individual-level data are not available. Ecological inferences are required in political science research when individual-level surveys are unavailable (e.g., local or comparative electoral politics), unreliable (racial politics), insufficient (political geography), or infeasible (political history). They are also required in numerous areas of major significance in public policy (e.g., for applying the Voting Rights Act) and other academic disciplines ranging from epidemiology and marketing to sociology and quantitative history.
The cylindrical K-function and Poisson line cluster point processes
DEFF Research Database (Denmark)
Møller, Jesper; Safavimanesh, Farzaneh; Rasmussen, Jakob G.
Poisson line cluster point processes, is also introduced. Parameter estimation based on moment methods or Bayesian inference for this model is discussed when the underlying Poisson line process and the cluster memberships are treated as hidden processes. To illustrate the methodologies, we analyze two...
DEFF Research Database (Denmark)
Østergaard, Christian Richter; Park, Eun Kyung
2015-01-01
Most studies on regional clusters focus on identifying factors and processes that make clusters grow. However, sometimes technologies and market conditions suddenly shift, and clusters decline. This paper analyses the process of decline of the wireless communication cluster in Denmark. The longit...... but being quick to withdraw in times of crisis....
Clustering of correlated networks
Dorogovtsev, S. N.
2003-01-01
We obtain the clustering coefficient, the degree-dependent local clustering, and the mean clustering of networks with arbitrary correlations between the degrees of the nearest-neighbor vertices. The resulting formulas allow one to determine the nature of the clustering of a network.
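The three quantities named in this abstract can be measured directly on a given graph, which may help when comparing against the analytical formulas. The sketch below computes them empirically for an undirected graph stored as an adjacency dictionary; it does not reproduce the paper's formulas for correlated networks, and the small example graph is hypothetical.

```python
from itertools import combinations


def linked_neighbour_pairs(adj, v):
    """Number of pairs of neighbours of v that are themselves connected."""
    return sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])


def local_clustering(adj, v):
    """Fraction of neighbour pairs of v that are connected (0 if degree < 2)."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    return 2.0 * linked_neighbour_pairs(adj, v) / (k * (k - 1))


def mean_clustering(adj):
    """Average of the local clustering over all vertices."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


def degree_dependent_clustering(adj):
    """Average local clustering of the vertices of each degree k."""
    by_degree = {}
    for v in adj:
        by_degree.setdefault(len(adj[v]), []).append(local_clustering(adj, v))
    return {k: sum(c) / len(c) for k, c in by_degree.items()}


def clustering_coefficient(adj):
    """Transitivity: 3 * (number of triangles) / (number of connected triples)."""
    closed = sum(linked_neighbour_pairs(adj, v) for v in adj)  # 3 per triangle
    triples = sum(len(adj[v]) * (len(adj[v]) - 1) // 2 for v in adj)
    return closed / triples if triples else 0.0


# Hypothetical graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

On this example the mean clustering (7/12) and the clustering coefficient (3/5) differ, which is why the two measures are reported separately.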
DEFF Research Database (Denmark)
Müller, Emmanuel; Assent, Ira; Günnemann, Stephan
2009-01-01
Subspace clustering aims at detecting clusters in any subspace projection of a high dimensional space. As the number of possible subspace projections is exponential in the number of dimensions, the result is often tremendously large. Recent approaches fail to reduce results to relevant subspace...... clusters. Their results are typically highly redundant, i.e. many clusters are detected multiple times in several projections. In this work, we propose a novel model for relevant subspace clustering (RESCU). We present a global optimization which detects the most interesting non-redundant subspace clusters...... achieves top clustering quality while competing approaches show greatly varying performance....
International Nuclear Information System (INIS)
Popok, V.N.; Prasalovich, S.V.; Odzhaev, V.B.; Campbell, E.E.B.
2001-01-01
A brief state-of-the-art review in the field of cluster-surface interactions is presented. Ionised cluster beams could become a powerful and versatile tool for the modification and processing of surfaces as an alternative to ion implantation and ion assisted deposition. The main effects of cluster-surface collisions and possible applications of cluster ion beams are discussed. The outlook for the Cluster Implantation and Deposition Apparatus (CIDA) being developed at Göteborg University is presented.
PREFACE: Nuclear Cluster Conference; Cluster'07
Freer, Martin
2008-05-01
The Cluster Conference is a long-running conference series dating back to the 1960s, the first being initiated by Wildermuth in Bochum, Germany, in 1969. The most recent meeting was held in Nara, Japan, in 2003, and in 2007 the 9th Cluster Conference was held in Stratford-upon-Avon, UK. As the name suggests the town of Stratford lies upon the River Avon, and shortly before the conference, due to unprecedented rainfall in the area (approximately 10 cm within half a day), lay in the River Avon! Stratford is the birthplace of the 'Bard of Avon' William Shakespeare, and this formed an intriguing conference backdrop. The meeting was attended by some 90 delegates and the programme contained 65-70 oral presentations, and was opened by a historical perspective presented by Professor Brink (Oxford) and closed by Professor Horiuchi (RCNP) with an overview of the conference and future perspectives. In between, the conference covered aspects of clustering in exotic nuclei (both neutron and proton-rich), molecular structures in which valence neutrons are exchanged between cluster cores, condensates in nuclei, neutron-clusters, superheavy nuclei, clusters in nuclear astrophysical processes and exotic cluster decays such as 2p and ternary cluster decay. The field of nuclear clustering has become strongly influenced by the physics of radioactive beam facilities (reflected in the programme), and by the excitement that clustering may have an important impact on the structure of nuclei at the neutron drip-line. It was clear that since Nara the field had progressed substantially and that new themes had emerged and others had crystallized. Two particular topics resonated strongly: condensates and nuclear molecules. These topics are thus likely to be central in the next cluster conference which will be held in 2011 in the Hungarian city of Debrecen. Martin Freer
Prosperi, Mattia C F; De Luca, Andrea; Di Giambenedetto, Simona; Bracciale, Laura; Fabbiani, Massimiliano; Cauda, Roberto; Salemi, Marco
2010-10-25
Phylogenetic methods produce hierarchies of molecular species, inferring knowledge about taxonomy and evolution. However, there is not yet a consensus methodology that provides a crisp partition of taxa, desirable when considering the problem of intra/inter-patient quasispecies classification or infection transmission event identification. We introduce the threshold bootstrap clustering (TBC), a new methodology for partitioning molecular sequences, that does not require a phylogenetic tree estimation. The TBC is an incremental partition algorithm, inspired by the stochastic Chinese restaurant process, and takes advantage of resampling techniques and models of sequence evolution. TBC uses as input a multiple alignment of molecular sequences and its output is a crisp partition of the taxa into an automatically determined number of clusters. By varying initial conditions, the algorithm can produce different partitions. We describe a procedure that selects a prime partition among a set of candidate ones and calculates a measure of cluster reliability. TBC was successfully tested for the identification of type-1 human immunodeficiency and hepatitis C virus subtypes, and compared with previously established methodologies. It was also evaluated in the problem of HIV-1 intra-patient quasispecies clustering, and for transmission cluster identification, using a set of sequences from patients with known transmission event histories. TBC has been shown to be effective for the subtyping of HIV and HCV, and for identifying intra-patient quasispecies. To some extent, the algorithm was also able to infer clusters corresponding to events of infection transmission. The computational complexity of TBC is quadratic in the number of taxa, lower than other established methods; in addition, TBC has been enhanced with a measure of cluster reliability. The TBC can be useful to characterise molecular quasispecies in a broad context.
A new fast method for inferring multiple consensus trees using k-medoids.
Tahiri, Nadia; Willems, Matthieu; Makarenkov, Vladimir
2018-04-05
Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant. We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms. The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees having similar intragroup and different intergroup evolutionary histories. The main advantage of our method is that it is much faster than the existing tree clustering approaches, while
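The partitioning step described in this abstract can be sketched as k-medoids over a precomputed matrix of tree-to-tree distances. The code below is a simple alternating (assign, then update) k-medoids variant with a farthest-first start. It is not the authors' algorithm and omits their adapted Silhouette and Caliński-Harabasz indices. The distance values are hypothetical, and the choice of tree metric (for example Robinson-Foulds) is an assumption, since the abstract does not name one.

```python
def farthest_first(dist, k):
    """Pick k starting medoids: the most central item, then the farthest ones."""
    n = len(dist)
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(max(range(n),
                           key=lambda i: min(dist[i][m] for m in medoids)))
    return medoids


def assign(dist, medoids):
    """Label each item with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(dist, k, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = farthest_first(dist, k)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if a cluster happens to be empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimises total distance to its cluster members.
            new_medoids.append(min(members,
                                   key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Hypothetical pairwise distances between five gene trees.
dist = [[0, 2, 4, 8, 10],
        [2, 0, 2, 8, 8],
        [4, 2, 0, 10, 8],
        [8, 8, 10, 0, 2],
        [10, 8, 8, 2, 0]]
medoids, labels = k_medoids(dist, k=2)
```

Each medoid is itself one of the input trees, so a consensus tree would still need to be built per cluster; choosing k is left to a validity index and is not shown here.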
Determining KPS Recipients Using a Fuzzy Inference System with the Tsukamoto Method
Directory of Open Access Journals (Sweden)
Sugianti
2016-10-01
Social assistance programs launched by the Government, in particular the first Cluster program, have attracted growing public attention. To make the selection of recipient households objective and efficient, a decision support system is needed to assist village/ward authorities in decision making. This study builds a prototype system that identifies the poor households eligible to receive KPS using a Fuzzy Inference System with the Tsukamoto method, based on the 14 BPS poverty criteria. The system outputs a score for each household, its aid status, and the number of villages/wards. The study concludes that the system runs in accordance with the specified poverty parameters and can adapt to the poverty conditions of regions with different poverty indices.
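To show how a Tsukamoto-style score is formed, the sketch below uses two hypothetical inputs and two hypothetical rules; it does not reproduce the 14 BPS criteria or the rule base of the study. In the Tsukamoto method each rule's consequent is a monotonic membership function, so the rule's firing strength can be inverted to a crisp value, and the final score is the firing-strength-weighted average of those values.

```python
def falling(x, lo, hi):
    """Membership equal to 1 at or below lo, falling linearly to 0 at hi."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Membership equal to 0 at or below lo, rising linearly to 1 at hi."""
    return 1.0 - falling(x, lo, hi)


def inverse_rising(alpha, lo, hi):
    """Crisp value whose rising membership equals alpha."""
    return lo + alpha * (hi - lo)


def inverse_falling(alpha, lo, hi):
    """Crisp value whose falling membership equals alpha."""
    return hi - alpha * (hi - lo)


def tsukamoto_score(income, dependents):
    """Eligibility score in [0, 100] from two hypothetical criteria."""
    rules = [
        # IF income is low AND dependents are many THEN eligibility is high.
        (min(falling(income, 1.0, 3.0), rising(dependents, 1, 5)), inverse_rising),
        # IF income is high THEN eligibility is low.
        (rising(income, 1.0, 3.0), inverse_falling),
    ]
    numerator = sum(alpha * inverse(alpha, 0.0, 100.0) for alpha, inverse in rules)
    denominator = sum(alpha for alpha, _ in rules)
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical household: income in arbitrary units, number of dependents.
score = tsukamoto_score(income=2.0, dependents=3)
```

A household's aid status could then be decided by comparing the score with a threshold, which is again an assumption rather than something stated in the abstract.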
Fast Bayesian Inference in Dirichlet Process Mixture Models.
Wang, Lianming; Dunson, David B
2011-01-01
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
Statistical Inference: An Integrated Bayesian/Likelihood Approach
Aitkin, Murray
2010-01-01
Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing. After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre
Management of cluster headache
DEFF Research Database (Denmark)
Tfelt-Hansen, Peer C; Jensen, Rigmor H
2012-01-01
The prevalence of cluster headache is 0.1% and cluster headache is often not diagnosed or misdiagnosed as migraine or sinusitis. In cluster headache there is often a considerable diagnostic delay - an average of 7 years in a population-based survey. Cluster headache is characterized by very severe...... or severe orbital or periorbital pain with a duration of 15-180 minutes. The cluster headache attacks are accompanied by characteristic associated unilateral symptoms such as tearing, nasal congestion and/or rhinorrhoea, eyelid oedema, miosis and/or ptosis. In addition, there is a sense of restlessness...... and agitation. Patients may have up to eight attacks per day. Episodic cluster headache (ECH) occurs in clusters of weeks to months duration, whereas chronic cluster headache (CCH) attacks occur for more than 1 year without remissions. Management of cluster headache is divided into acute attack treatment...
Symmetries of cluster configurations
International Nuclear Information System (INIS)
Kramer, P.
1975-01-01
A deeper understanding of clustering phenomena in nuclei must encompass at least two interrelated aspects of the subject: (A) Given a system of A nucleons with two-body interactions, what are the relevant and persistent modes of clustering involved? What is the nature of the correlated nucleon groups which form the clusters, and what is their mutual interaction? (B) Given the cluster modes and their interaction, what systematic patterns of nuclear structure and reactions emerge from them? Are there, for example, families of states which share the same "cluster parents"? Which cluster modes are compatible or exclude each other? What quantum numbers could characterize cluster configurations? There is no doubt that we can learn a good deal from the experimentalists who have discovered many of the features relevant to aspect (B). Symmetries specific to cluster configurations which can throw some light on both aspects of clustering are discussed.
Inferring Stop-Locations from WiFi.
Directory of Open Access Journals (Sweden)
David Kofoed Wind
Human mobility patterns are inherently complex. In terms of understanding these patterns, the process of converting raw data into series of stop-locations and transitions is an important first step which greatly reduces the volume of data, thus simplifying the subsequent analyses. Previous research into the mobility of individuals has focused on inferring 'stop locations' (places of stationarity) from GPS or CDR data, or on detection of state (static/active). In this paper we bridge the gap between the two approaches: we introduce methods for detecting both mobility state and stop-locations. In addition, our methods are based exclusively on WiFi data. We study two months of WiFi data collected every two minutes by a smartphone, and infer stop-locations in the form of labelled time-intervals. For this purpose, we investigate two algorithms, both of which scale to large datasets: a greedy approach to select the most important routers and one which uses a density-based clustering algorithm to detect router fingerprints. We validate our results using participants' GPS data as well as ground truth data collected during a two month period.
Self-similar gravitational clustering
International Nuclear Information System (INIS)
Efstathiou, G.; Fall, S.M.; Hogan, C.
1979-01-01
The evolution of gravitational clustering is considered and several new scaling relations are derived for the multiplicity function. These include generalizations of the Press-Schechter theory to different densities and cosmological parameters. The theory is then tested against multiplicity function and correlation function estimates for a series of 1000-body experiments. The results are consistent with the theory and show some dependence on initial conditions and cosmological density parameter. The statistical significance of the results, however, is fairly low because of several small number effects in the experiments. There is no evidence for a non-linear bootstrap effect or a dependence of the multiplicity function on the internal dynamics of condensed groups. Empirical estimates of the multiplicity function by Gott and Turner have a feature near the characteristic luminosity predicted by the theory. The scaling relations allow the inference from estimates of the galaxy luminosity function that galaxies must have suffered considerable dissipation if they originally formed from a self-similar hierarchy. A method is also developed for relating the multiplicity function to similar measures of clustering, such as those of Bhavsar, for the distribution of galaxies on the sky. These are shown to depend on the luminosity function in a complicated way. (author)
Cluster Decline and Resilience
DEFF Research Database (Denmark)
Østergaard, Christian Richter; Park, Eun Kyung
Most studies on regional clusters focus on identifying factors and processes that make clusters grow. However, sometimes technologies and market conditions suddenly shift, and clusters decline. This paper analyses the process of decline of the wireless communication cluster in Denmark, 1963......-2011. Our longitudinal study reveals that technological lock-in and exit of key firms have contributed to impairment of the cluster’s resilience in adapting to disruptions. Entrepreneurship has a positive effect on cluster resilience, while multinational companies have contradicting effects by bringing...... in new resources to the cluster but being quick to withdraw in times of crisis....
Inferring Domain Plans in Question-Answering
National Research Council Canada - National Science Library
Pollack, Martha E
1986-01-01
The importance of plan inference in models of conversation has been widely noted in the computational-linguistics literature, and its incorporation in question-answering systems has enabled a range...
Scalable inference for stochastic block models
Peng, Chengbin; Zhang, Zhihua; Wong, Ka-Chun; Zhang, Xiangliang; Keyes, David E.
2017-01-01
Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of "big data," traditional inference
Comprehensive cluster analysis with Transitivity Clustering.
Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan
2011-03-01
Transitivity Clustering is a method for partitioning biological data into groups of similar objects, such as genes. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.
Efficient algorithms for conditional independence inference
Czech Academy of Sciences Publication Activity Database
Bouckaert, R.; Hemmecke, R.; Lindner, S.; Studený, Milan
2010-01-01
Vol. 11, No. 1 (2010), pp. 3453-3479. ISSN 1532-4435. R&D Projects: GA ČR GA201/08/0539; GA MŠk 1M0572. Institutional research plan: CEZ:AV0Z10750506. Keywords: conditional independence inference; linear programming approach. Subject RIV: BA - General Mathematics. Impact factor: 2.949 (2010). http://library.utia.cas.cz/separaty/2010/MTR/studeny-efficient algorithms for conditional independence inference.pdf
On the criticality of inferred models
Mastromatteo, Iacopo; Marsili, Matteo
2011-10-01
Advanced inference techniques allow one to reconstruct a pattern of interaction from high dimensional data sets, from probing simultaneously thousands of units of extended systems—such as cells, neural tissues and financial markets. We focus here on the statistical properties of inferred models and argue that inference procedures are likely to yield models which are close to singular values of parameters, akin to critical points in physics where phase transitions occur. These are points where the response of physical systems to external perturbations, as measured by the susceptibility, is very large and diverges in the limit of infinite size. We show that the reparameterization invariant metrics in the space of probability distributions of these models (the Fisher information) are directly related to the susceptibility of the inferred model. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. This region is the one where the estimate of inferred parameters is most stable. In order to illustrate these points, we discuss inference of interacting point processes with application to financial data and show that sensible choices of observation time scales naturally yield models which are close to criticality.
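The stated link between the Fisher information and the susceptibility can be made explicit in the simplest setting. The following is a standard identity for exponential-family (maximum-entropy) models, sketched under that assumption; the symbols g, φ and Z are introduced here for illustration and are not taken from the abstract.

```latex
p(s \mid g) = \frac{1}{Z(g)} \exp\Big( \sum_i g_i \phi_i(s) \Big), \qquad
Z(g) = \sum_s \exp\Big( \sum_i g_i \phi_i(s) \Big)

\chi_{ij} \equiv \frac{\partial \langle \phi_i \rangle}{\partial g_j}
          = \frac{\partial^2 \log Z}{\partial g_i \, \partial g_j}
          = \langle \phi_i \phi_j \rangle - \langle \phi_i \rangle \langle \phi_j \rangle

I_{ij}(g) \equiv \Big\langle \frac{\partial \log p}{\partial g_i}
                             \frac{\partial \log p}{\partial g_j} \Big\rangle
          = \big\langle (\phi_i - \langle \phi_i \rangle)(\phi_j - \langle \phi_j \rangle) \big\rangle
          = \chi_{ij}
```

In this setting a large susceptibility means that nearby parameter values give easily distinguishable distributions, which is the sense in which distinguishable models concentrate near critical points.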
Polynomial Chaos Surrogates for Bayesian Inference
Le Maitre, Olivier
2016-01-06
Bayesian inference is a popular probabilistic method for solving inverse problems, such as the identification of a field parameter in a PDE model. The inference relies on Bayes' rule to update the prior density of the sought field from observations and to derive its posterior distribution. In most cases the posterior distribution has no explicit form and has to be sampled, for instance using a Markov chain Monte Carlo method. In practice the prior field parameter is decomposed and truncated (e.g. by means of a Karhunen-Loève decomposition) to recast the inference problem as the inference of a finite number of coordinates. Although it has proved effective in many situations, Bayesian inference as sketched above faces several difficulties that call for improvements. First, sampling the posterior can be an extremely costly task, as it requires multiple solutions of the PDE model for different values of the field parameter. Second, when the observations are not very informative, the inferred parameter field can depend strongly on its prior, which can be somewhat arbitrary. These issues have motivated the introduction of reduced models or surrogates for the (approximate) determination of the parametrized PDE solution, and of hyperparameters in the description of the prior field. Our contribution focuses on recent developments in these two directions: the acceleration of posterior sampling by means of Polynomial Chaos expansions and the efficient treatment of parametrized covariance functions for the prior field. We also discuss the possibility of making such an approach adaptive to further improve its efficiency.
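The idea of accelerating posterior sampling with a Polynomial Chaos surrogate can be illustrated on a one-dimensional toy problem. This is a minimal sketch, not the authors' method: `forward_model` is a hypothetical stand-in for an expensive PDE solve, the prior is assumed standard normal, the projection uses a plain midpoint rule, and the sampler is a basic random-walk Metropolis. All names and parameter values are assumptions chosen for illustration.

```python
import math
import random


def forward_model(xi):
    # Hypothetical stand-in for an expensive PDE solve.
    return math.exp(0.3 * xi)


def hermite(n, x):
    # Probabilists' Hermite polynomial He_n(x) via the three-term recurrence
    # He_{k+1}(x) = x * He_k(x) - k * He_{k-1}(x).
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def pc_coefficients(model, order, n_quad=2001, half_width=8.0):
    # Project the model onto He_0..He_order under a standard normal prior:
    # c_n = E[model(xi) * He_n(xi)] / n!, with the expectation approximated
    # by a midpoint rule (a Gauss-Hermite rule would need fewer model calls).
    dx = 2.0 * half_width / n_quad
    coeffs = []
    for n in range(order + 1):
        total = 0.0
        for i in range(n_quad):
            x = -half_width + (i + 0.5) * dx
            weight = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
            total += model(x) * hermite(n, x) * weight * dx
        coeffs.append(total / math.factorial(n))
    return coeffs


def surrogate(coeffs, xi):
    # Cheap polynomial replacement for the forward model.
    return sum(c * hermite(n, xi) for n, c in enumerate(coeffs))


def log_posterior(xi, data, sigma, predict):
    # Standard normal prior plus Gaussian likelihood around predict(xi).
    log_prior = -0.5 * xi * xi
    log_like = -0.5 * sum((d - predict(xi)) ** 2 for d in data) / sigma ** 2
    return log_prior + log_like


def metropolis(log_post, n_steps, step=0.5, seed=0):
    # Random-walk Metropolis sampler started at xi = 0.
    rng = random.Random(seed)
    xi, lp = 0.0, log_post(0.0)
    samples = []
    for _ in range(n_steps):
        proposal = xi + rng.gauss(0.0, step)
        lp_prop = log_post(proposal)
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            xi, lp = proposal, lp_prop
        samples.append(xi)
    return samples


coeffs = pc_coefficients(forward_model, order=6)
data = [forward_model(1.0)] * 5  # synthetic noise-free observations at xi = 1
samples = metropolis(
    lambda x: log_posterior(x, data, 0.1, lambda z: surrogate(coeffs, z)),
    n_steps=5000,
)
kept = samples[1000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

The expensive model is called only while building the coefficients; every Metropolis step then evaluates the polynomial, which is where the saving would come from when the true model is a PDE solve.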
The Morphologies and Alignments of Gas, Mass, and the Central Galaxies of CLASH Clusters of Galaxies
Donahue, Megan; Ettori, Stefano; Rasia, Elena; Sayers, Jack; Zitrin, Adi; Meneghetti, Massimo; Voit, G. Mark; Golwala, Sunil; Czakon, Nicole; Yepes, Gustavo; Baldi, Alessandro; Koekemoer, Anton; Postman, Marc
2016-03-01
Morphology is often used to infer the state of relaxation of galaxy clusters. The regularity, symmetry, and degree to which a cluster is centrally concentrated inform quantitative measures of cluster morphology. The Cluster Lensing and Supernova survey with Hubble Space Telescope (CLASH) used weak and strong lensing to measure the distribution of matter within a sample of 25 clusters, 20 of which were deemed to be “relaxed” based on their X-ray morphology and alignment of the X-ray emission with the Brightest Cluster Galaxy. Toward a quantitative characterization of this important sample of clusters, we present uniformly estimated X-ray morphological statistics for all 25 CLASH clusters. We compare X-ray morphologies of CLASH clusters with those identically measured for a large sample of simulated clusters from the MUSIC-2 simulations, selected by mass. We confirm a threshold in X-ray surface brightness concentration of C ≳ 0.4 for cool-core clusters, where C is the ratio of X-ray emission inside 100 h_{70}^{-1} kpc compared to inside 500 h_{70}^{-1} kpc. We report and compare morphologies of these clusters inferred from Sunyaev-Zeldovich Effect (SZE) maps of the hot gas and from projected mass maps based on strong and weak lensing. We find a strong agreement in alignments of the orientation of major axes for the lensing, X-ray, and SZE maps of nearly all of the CLASH clusters at radii of 500 kpc (approximately 1/2 R_{500} for these clusters). We also find a striking alignment of cluster shapes at the 500 kpc scale, as measured with X-ray, SZE, and lensing, with that of the near-infrared stellar light at 10 kpc scales for the 20 “relaxed” clusters. This strong alignment indicates a powerful coupling between the cluster- and galaxy-scale galaxy formation processes.
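The concentration statistic defined in the abstract is a simple ratio of enclosed emission, which can be sketched as follows. This is an illustration only: the function name, the input layout (per-annulus radii and fluxes) and the example numbers are assumptions, and the actual analysis works on calibrated X-ray surface brightness maps.

```python
def concentration(radii_kpc, fluxes, r_in=100.0, r_out=500.0):
    """Ratio of emission inside r_in to emission inside r_out.

    radii_kpc: projected radius of each annulus or event (assumed h70^-1 kpc)
    fluxes:    emission attributed to each annulus or event
    """
    inner = sum(f for r, f in zip(radii_kpc, fluxes) if r <= r_in)
    outer = sum(f for r, f in zip(radii_kpc, fluxes) if r <= r_out)
    if outer <= 0:
        raise ValueError("no emission inside the outer aperture")
    return inner / outer


# Made-up profile: one annulus inside 100 kpc, three more inside 500 kpc,
# and one outside the outer aperture (ignored).
radii = [50.0, 150.0, 300.0, 450.0, 600.0]
flux = [4.0, 2.0, 2.0, 2.0, 5.0]
c_value = concentration(radii, flux)
is_cool_core_candidate = c_value >= 0.4  # threshold quoted in the abstract
```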
A Bayesian Network Schema for Lessening Database Inference
National Research Council Canada - National Science Library
Chang, LiWu; Moskowitz, Ira S
2001-01-01
.... The authors introduce a formal schema for database inference analysis, based upon a Bayesian network structure, which identifies critical parameters involved in the inference problem and represents...
International Nuclear Information System (INIS)
Freeman, K.C.
1980-01-01
The young globular clusters of the LMC have ages of 10^7-10^8 y. Their masses and structure are similar to those of the smaller galactic globular clusters. Their stellar mass functions (in the mass range 6 solar masses to 1.2 solar masses) vary greatly from cluster to cluster, although the clusters are similar in total mass, age, structure and chemical composition. It would be very interesting to know why these clusters are forming now in the LMC and not in the Galaxy. The author considers the 'young globular' or 'blue populous' clusters of the LMC. The ages of these objects are 10^7 to 10^8 y, and their masses are 10^4 to 10^5 solar masses, so they are populous enough to be really useful for studying the evolution of massive stars. The author concentrates on the structure and stellar content of these young clusters. (Auth.)
Star clusters and associations
International Nuclear Information System (INIS)
Ruprecht, J.; Palous, J.
1983-01-01
All 33 papers presented at the symposium were inputted to INIS. They dealt with open clusters, globular clusters, stellar associations and moving groups, and local kinematics and galactic structures. (E.S.)
International Nuclear Information System (INIS)
Bottiglioni, F.; Coutant, J.; Fois, M.
1978-01-01
Areas of possible applications of cluster injection are discussed. The deposition inside the plasma of molecules, issued from the dissociation of the injected clusters, has been computed. Some empirical scaling laws for the penetration are given
International Nuclear Information System (INIS)
Shaver, P.A.
1986-01-01
Evidence for clustering of and with high-redshift QSOs is discussed. QSOs of different redshifts show no clustering, but QSOs of similar redshifts appear to be clustered on a scale comparable to that of galaxies at the present epoch. In addition, spectroscopic studies of close pairs of QSOs indicate that QSOs are surrounded by a relatively high density of absorbing matter, possibly clusters of galaxies
Cluster Physics with Merging Galaxy Clusters
Directory of Open Access Journals (Sweden)
Sandor M. Molnar
2016-02-01
Collisions between galaxy clusters provide a unique opportunity to study matter in a parameter space which cannot be explored in our laboratories on Earth. In the standard ΛCDM model, where the total density is dominated by the cosmological constant (Λ) and the matter density by cold dark matter (CDM), structure formation is hierarchical, and clusters grow mostly by merging. Mergers of two massive clusters are the most energetic events in the universe after the Big Bang, hence they provide a unique laboratory to study cluster physics. The two main mass components in clusters behave differently during collisions: the dark matter is nearly collisionless, responding only to gravity, while the gas is subject to pressure forces and dissipation, and shocks and turbulence are developed during collisions. In the present contribution we review the different methods used to derive the physical properties of merging clusters. Different physical processes leave their signatures on different wavelengths, thus our review is based on a multifrequency analysis. In principle, the best way to analyze multifrequency observations of merging clusters is to model them using N-body/HYDRO numerical simulations. We discuss the results of such detailed analyses. New high spatial and spectral resolution ground and space based telescopes will come online in the near future. Motivated by these new opportunities, we briefly discuss methods which will be feasible in the near future in studying merging clusters.
Exploiting visual search theory to infer social interactions
Rota, Paolo; Dang-Nguyen, Duc-Tien; Conci, Nicola; Sebe, Nicu
2013-03-01
In this paper we propose a new method to infer human social interactions using typical techniques adopted in the literature for visual search and information retrieval. The main piece of information we use to discriminate among different types of interactions is provided by proxemics cues acquired by a tracker, and used to distinguish between intentional and casual interactions. The proxemics information has been acquired through the analysis of two different metrics: on the one hand we observe the current distance between subjects, and on the other hand we measure the O-space synergy between subjects. The obtained values are taken at every time step over a temporal sliding window, and processed in the Discrete Fourier Transform (DFT) domain. The features are eventually merged into a unique array, and clustered using the K-means algorithm. The clusters are reorganized using a second larger temporal window into a Bag Of Words framework, so as to build the feature vector that will feed the SVM classifier.
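The first stages of this pipeline (sliding window, DFT, merged feature array, K-means) can be sketched in a few lines. This is a hedged illustration with assumed names, window width and synthetic signals; the O-space synergy values are made up, and the Bag Of Words and SVM stages are not covered.

```python
import cmath
import math
import random


def dft_magnitudes(window):
    # Magnitude of each DFT coefficient of a short real-valued window.
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(window)))
        for k in range(n)
    ]


def sliding_features(distance, synergy, width):
    # One feature vector per window position: DFT magnitudes of both
    # proxemics metrics merged into a single array.
    features = []
    for start in range(len(distance) - width + 1):
        d = dft_magnitudes(distance[start:start + width])
        s = dft_magnitudes(synergy[start:start + width])
        features.append(d + s)
    return features


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=20, seed=0):
    # Plain Lloyd iterations with randomly chosen initial centres.
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Synthetic tracks: two subjects close together, then far apart.
distance = [1.0] * 10 + [5.0] * 10
synergy = [0.8] * 10 + [0.1] * 10
features = sliding_features(distance, synergy, width=4)
labels, centers = kmeans(features, k=2)
```

Working on DFT magnitudes makes each feature insensitive to where inside the window a change occurs, which is one plausible reason for moving to the frequency domain before clustering.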
Indian Academy of Sciences (India)
The Optical Absorption Spectra of Small Silver Clusters (5-11) ... Soft Landing and Fragmentation of Small Clusters Deposited in Noble-Gas Films. Harbich, W.; Fedrigo, S.; Buttet, J. Phys. Rev. B 1998, 58, 7428. CO combustion on supported gold clusters. Arenz M ...
DEFF Research Database (Denmark)
Lorentzen, Jochen; Robbins, Glen; Barnes, Justin
2004-01-01
The paper describes the formation of the Durban Auto Cluster in the context of trade liberalization. It argues that the improvement of operational competitiveness of firms in the cluster is prominently due to joint action. It tests this proposition by comparing the gains from cluster activities...
Marketing research cluster analysis
Directory of Open Access Journals (Sweden)
Marić Nebojša
2002-01-01
One area of application of cluster analysis in marketing is the identification of groups of cities and towns with similar demographic profiles. This paper considers the main aspects of cluster analysis through an example of clustering 12 cities using Minitab software.
International Nuclear Information System (INIS)
Choi, Chang-Yeong; Kim, Jeong-Hyun; Kim, Seyong
2004-01-01
Using barebone PC components and NICs, we construct a Linux cluster with a 2-dimensional mesh structure. This cluster has a smaller footprint, is less expensive, and uses less power than a conventional Linux cluster. Here, we report our experience in building such a machine and discuss our current lattice project on the machine.
Abrahamsen, M.; de Berg, M.T.; Buchin, K.A.; Mehr, M.; Mehrabi, A.D.
2017-01-01
In a geometric k-clustering problem the goal is to partition a set of points in R^d into k subsets such that a certain cost function of the clustering is minimized. We present data structures for orthogonal range-clustering queries on a point set S: given a query box Q and an integer k>2, compute
Cosmology with cluster surveys
Indian Academy of Sciences (India)
Abstract. Surveys of clusters of galaxies provide us with a powerful probe of the density and nature of the dark energy. The red-shift distribution of detected clusters is highly sensitive to the dark energy equation of state parameter w. Upcoming Sunyaev–Zel'dovich (SZ) surveys would provide us large yields of clusters to ...
MASSCLEANage-STELLAR CLUSTER AGES FROM INTEGRATED COLORS
International Nuclear Information System (INIS)
Popescu, Bogdan; Hanson, M. M.
2010-01-01
We present the recently updated and expanded MASSCLEANcolors, a database of 70 million Monte Carlo models selected to match the properties (metallicity, ages, and masses) of stellar clusters found in the Large Magellanic Cloud (LMC). This database shows the rather extreme and non-Gaussian distribution of integrated colors and magnitudes expected with different cluster age and mass and the enormous age degeneracy of integrated colors when mass is unknown. This degeneracy could lead to catastrophic failures in estimating age with standard simple stellar population models, particularly if most of the clusters are of intermediate or low mass, like in the LMC. Utilizing the MASSCLEANcolors database, we have developed MASSCLEANage, a statistical inference package which assigns the most likely age and mass (solved simultaneously) to a cluster based only on its integrated broadband photometric properties. Finally, we use MASSCLEANage to derive the age and mass of LMC clusters based on integrated photometry alone. First, we compare our cluster ages against those obtained for the same seven clusters using more accurate integrated spectroscopy. We find improved agreement with the integrated spectroscopy ages over the original photometric ages. A close examination of our results demonstrates the necessity of solving simultaneously for mass and age to reduce degeneracies in the cluster ages derived via integrated colors. We then selected an additional subset of 30 photometric clusters with previously well-constrained ages and independently derive their age using the MASSCLEANage with the same photometry with very good agreement. The MASSCLEANage program is freely available under GNU General Public License.
Intracluster age gradients in numerous young stellar clusters
Getman, K. V.; Feigelson, E. D.; Kuhn, M. A.; Bate, M. R.; Broos, P. S.; Garmire, G. P.
2018-05-01
The pace and pattern of star formation leading to rich young stellar clusters is quite uncertain. In this context, we analyse the spatial distribution of ages within 19 young (median t ≲ 3 Myr on the Siess et al. time-scale), morphologically simple, isolated, and relatively rich stellar clusters. Our analysis is based on young stellar object (YSO) samples from the Massive Young Star-Forming Complex Study in Infrared and X-ray and Star Formation in Nearby Clouds surveys, and a new estimator of pre-main sequence (PMS) stellar ages, AgeJX, derived from X-ray and near-infrared photometric data. Median cluster ages are computed within four annular subregions of the clusters. We confirm and extend the earlier result of Getman et al. (2014): 80 per cent of the clusters show age trends where stars in cluster cores are younger than in outer regions. Our cluster stacking analyses establish the existence of an age gradient to high statistical significance in several ways. Time-scales vary with the choice of PMS evolutionary model; the inferred median age gradient across the studied clusters ranges from 0.75 to 1.5 Myr pc^-1. The empirical finding reported in the present study - late or continuing formation of stars in the cores of star clusters with older stars dispersed in the outer regions - is strongly supported by other observational studies and by astrophysical models such as the global hierarchical collapse model of Vázquez-Semadeni et al.
Fractional Yields Inferred from Halo and Thick Disk Stars
Caimmi, R.
2013-12-01
Linear [Q/H]-[O/H] relations, Q = Na, Mg, Si, Ca, Ti, Cr, Fe, Ni, are inferred from a sample (N=67) of recently studied FGK-type dwarf stars in the solar neighbourhood including different populations (Nissen and Schuster 2010, Ramirez et al. 2012), namely LH (N=24, low-α halo), HH (N=25, high-α halo), KD (N=16, thick disk), and OL (N=2, globular cluster outliers). Regression line slope and intercept estimators and related variance estimators are determined. With regard to the straight line, [Q/H]=a_{Q}[O/H]+b_{Q}, sample stars are displayed along a "main sequence", [Q,O] = [a_{Q},b_{Q},Δb_{Q}], leaving aside the two OL stars, which, in most cases (e.g. Na), lie outside. The unit slope, a_{Q}=1, implies Q is a primary element synthesised via SNII progenitors in the presence of a universal stellar initial mass function (defined as simple primary element). In this respect, Mg, Si, Ti show \hat{a}_{Q}=1 within ∓2\hat{σ}_{\hat{a}_{Q}}; Cr, Fe, Ni, within ∓3\hat{σ}_{\hat{a}_{Q}}; Na, Ca, within ∓r\hat{σ}_{\hat{a}_{Q}}, r>3. The empirical, differential element abundance distributions are inferred from LH, HH, KD, HA = HH + KD subsamples, where related regression lines represent their theoretical counterparts within the framework of simple MCBR (multistage closed box + reservoir) chemical evolution models. Hence, the fractional yields, \hat{p}_{Q}/\hat{p}_{O}, are determined and (as an example) a comparison is shown with their theoretical counterparts inferred from SNII progenitor nucleosynthesis under the assumption of a power-law stellar initial mass function. The generalized fractional yields, C_{Q}=Z_{Q}/Z_{O}^{a_{Q}}, are determined regardless of the chemical evolution model. The ratio of outflow to star formation rate is compared for different populations in the framework of simple MCBR models. The opposite situation of element abundance variation entirely due to cosmic scatter is also considered under reasonable assumptions. The related differential element abundance
Origin and distribution of epipolythiodioxopiperazine (ETP) gene clusters in filamentous ascomycetes
Directory of Open Access Journals (Sweden)
Gardiner Donald M
2007-09-01
Background: Genes responsible for biosynthesis of fungal secondary metabolites are usually tightly clustered in the genome and co-regulated with metabolite production. Epipolythiodioxopiperazines (ETPs) are a class of secondary metabolite toxins produced by disparate ascomycete fungi and implicated in several animal and plant diseases. Gene clusters responsible for their production have previously been defined in only two fungi. Fungal genome sequence data have been surveyed for the presence of putative ETP clusters and cluster data have been generated from several fungal taxa where genome sequences are not available. Phylogenetic analysis of cluster genes has been used to investigate the assembly and heredity of these gene clusters. Results: Putative ETP gene clusters are present in 14 ascomycete taxa, but absent in numerous other ascomycetes examined. These clusters are discontinuously distributed in ascomycete lineages. Gene content is not absolutely fixed; however, common genes are identified and phylogenies of six of these are separately inferred. In each phylogeny almost all cluster genes form monophyletic clades with non-cluster fungal paralogues being the nearest outgroups. This relatedness of cluster genes suggests that a progenitor ETP gene cluster assembled within an ancestral taxon. Within each of the cluster clades, the cluster genes group together in consistent subclades; however, these relationships do not always reflect the phylogeny of ascomycetes. Micro-synteny of several of the genes within the clusters provides further support for these subclades. Conclusion: ETP gene clusters appear to have a single origin and have been inherited relatively intact rather than assembling independently in the different ascomycete lineages. This progenitor cluster has given rise to a small number of distinct phylogenetic classes of clusters that are represented in a discontinuous pattern throughout ascomycetes. The disjunct heredity of
A formal model of interpersonal inference
Directory of Open Access Journals (Sweden)
Michael Moutoussis
2014-03-01
Introduction: We propose that active Bayesian inference, a general framework for decision-making, can equally be applied to interpersonal exchanges. Social cognition, however, entails special challenges. We address these challenges through a novel formulation of a formal model and demonstrate its psychological significance. Method: We review relevant literature, especially with regard to interpersonal representations, formulate a mathematical model and present a simulation study. The model accommodates normative models from utility theory and places them within the broader setting of Bayesian inference. Crucially, we endow people's prior beliefs, into which utilities are absorbed, with preferences of self and others. The simulation illustrates the model's dynamics and furnishes elementary predictions of the theory. Results: 1. Because beliefs about self and others inform both the desirability and plausibility of outcomes, in this framework interpersonal representations become beliefs that have to be actively inferred. This inference, akin to 'mentalising' in the psychological literature, is based upon the outcomes of interpersonal exchanges. 2. We show how some well-known social-psychological phenomena (e.g. self-serving biases) can be explained in terms of active interpersonal inference. 3. Mentalising naturally entails Bayesian updating of how people value social outcomes. Crucially this includes inference about one’s own qualities and preferences. Conclusion: We inaugurate a Bayes optimal framework for modelling intersubject variability in mentalising during interpersonal exchanges. Here, interpersonal representations are endowed with explicit functional and affective properties. We suggest the active inference framework lends itself to the study of psychiatric conditions where mentalising is distorted.
Tang, Jinjun; Zou, Yajie; Ash, John; Zhang, Shen; Liu, Fang; Wang, Yinhai
2016-01-01
Travel time is an important measurement used to evaluate the extent of congestion within road networks. This paper presents a new method to estimate the travel time based on an evolving fuzzy neural inference system. The input variables in the system are traffic flow data (volume, occupancy, and speed) collected from loop detectors located at points both upstream and downstream of a given link, and the output variable is the link travel time. A first order Takagi-Sugeno fuzzy rule set is used to complete the inference. For training the evolving fuzzy neural network (EFNN), two learning processes are proposed: (1) a K-means method is employed to partition input samples into different clusters, and a Gaussian fuzzy membership function is designed for each cluster to measure the membership degree of samples to the cluster centers. As the number of input samples increases, the cluster centers are modified and membership functions are also updated; (2) a weighted recursive least squares estimator is used to optimize the parameters of the linear functions in the Takagi-Sugeno type fuzzy rules. Testing datasets consisting of actual and simulated data are used to test the proposed method. Three common criteria including mean absolute error (MAE), root mean square error (RMSE), and mean absolute relative error (MARE) are utilized to evaluate the estimation performance. Estimation results demonstrate the accuracy and effectiveness of the EFNN method through comparison with existing methods including: multiple linear regression (MLR), instantaneous model (IM), linear model (LM), neural network (NN), and cumulative plots (CP).
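Two ingredients named in this abstract are easy to make concrete: first-order Takagi-Sugeno inference with Gaussian memberships, and the three error criteria. The sketch below is illustrative only; the rule layout, function names and numbers are assumptions, MARE is taken here as the mean of |actual - predicted| / actual, and the K-means and recursive least squares training steps are not shown.

```python
import math


def gaussian_membership(x, center, width):
    # Membership degree of sample x to a cluster centre.
    d2 = sum((a - c) ** 2 for a, c in zip(x, center))
    return math.exp(-d2 / (2.0 * width ** 2))


def ts_predict(x, rules):
    # First-order Takagi-Sugeno inference: each rule is
    # (center, width, coefficients, bias) and contributes a linear output
    # weighted by its normalised membership degree.
    weights = [gaussian_membership(x, c, w) for c, w, _, _ in rules]
    outputs = [sum(a * xi for a, xi in zip(coef, x)) + b
               for _, _, coef, b in rules]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("sample is too far from every rule centre")
    return sum(w * y for w, y in zip(weights, outputs)) / total


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mare(actual, predicted):
    # Relative error, so actual travel times must be non-zero.
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical single rule over two inputs, and made-up travel times (s).
rules = [((0.0, 0.0), 1.0, (1.0, 2.0), 3.0)]
estimate = ts_predict((1.0, 1.0), rules)
actual_times = [10.0, 20.0]
predicted_times = [12.0, 18.0]
errors = (mae(actual_times, predicted_times),
          rmse(actual_times, predicted_times),
          mare(actual_times, predicted_times))
```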
Cluster analysis for applications
Anderberg, Michael R
1973-01-01
Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis. Comprised of 10 chapters, this book begins with an introduction to the subject o
Estimating uncertainty of inference for validation
Energy Technology Data Exchange (ETDEWEB)
Booker, Jane M [Los Alamos National Laboratory; Langenbrunner, James R [Los Alamos National Laboratory; Hemez, Francois M [Los Alamos National Laboratory; Ross, Timothy J [UNM
2010-09-30
We present a validation process based upon the concept that validation is an inference-making activity. This has always been true, but the association has not been as important before as it is now. Previously, theory had been confirmed by more data, and predictions were possible based on data. The process today is to infer from theory to code and from code to prediction, making the role of prediction somewhat automatic, and a machine function. Validation is defined as determining the degree to which a model and code is an accurate representation of experimental test data. Imbedded in validation is the intention to use the computer code to predict. To predict is to accept the conclusion that an observable final state will manifest; therefore, prediction is an inference whose goodness relies on the validity of the code. Quantifying the uncertainty of a prediction amounts to quantifying the uncertainty of validation, and this involves the characterization of uncertainties inherent in theory/models/codes and the corresponding data. An introduction to inference making and its associated uncertainty is provided as a foundation for the validation problem. A mathematical construction for estimating the uncertainty in the validation inference is then presented, including a possibility distribution constructed to represent the inference uncertainty for validation under uncertainty. The estimation of inference uncertainty for validation is illustrated using data and calculations from Inertial Confinement Fusion (ICF). The ICF measurements of neutron yield and ion temperature were obtained for direct-drive inertial fusion capsules at the Omega laser facility. The glass capsules, containing the fusion gas, were systematically selected with the intent of establishing a reproducible baseline of high-yield 10^13-10^14 neutron output. The deuterium-tritium ratio in these experiments was varied to study its influence upon yield. This paper on validation inference is the
Following the pioneering discovery of alpha clustering and of molecular resonances, the field of nuclear clustering is today one of those domains of heavy-ion nuclear physics that faces the greatest challenges, yet also contains the greatest opportunities. After many summer schools and workshops, in particular over the last decade, the community of nuclear molecular physicists has decided to collaborate in producing a comprehensive collection of lectures and tutorial reviews covering the field. This third volume follows the successful Lect. Notes Phys. 818 (Vol. 1) and 848 (Vol. 2), and comprises six extensive lectures covering the following topics: - Gamma Rays and Molecular Structure - Faddeev Equation Approach for Three Cluster Nuclear Reactions - Tomography of the Cluster Structure of Light Nuclei Via Relativistic Dissociation - Clustering Effects Within the Dinuclear Model : From Light to Hyper-heavy Molecules in Dynamical Mean-field Approach - Clusterization in Ternary Fission - Clusters in Light N...
Lawson, Andrew B
2002-01-01
Research has generated a number of advances in methods for spatial cluster modelling in recent years, particularly in the area of Bayesian cluster modelling. Along with these advances has come an explosion of interest in the potential applications of this work, especially in epidemiology and genome research. In one integrated volume, this book reviews the state-of-the-art in spatial clustering and spatial cluster modelling, bringing together research and applications previously scattered throughout the literature. It begins with an overview of the field, then presents a series of chapters that illuminate the nature and purpose of cluster modelling within different application areas, including astrophysics, epidemiology, ecology, and imaging. The focus then shifts to methods, with discussions on point and object process modelling, perfect sampling of cluster processes, partitioning in space and space-time, spatial and spatio-temporal process modelling, nonparametric methods for clustering, and spatio-temporal ...
Deep Learning for Population Genetic Inference.
Sheehan, Sara; Song, Yun S
2016-03-01
Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
Deep Learning for Population Genetic Inference
Sheehan, Sara; Song, Yun S.
2016-01-01
Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme. PMID:27018908
Inferring Phylogenetic Networks Using PhyloNet.
Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay
2018-07-01
PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
Clusters and how to make it work : Cluster Strategy Toolkit
Manickam, Anu; van Berkel, Karel
2014-01-01
Clusters are the magic answer to regional economic development. Firms in clusters are more innovative; cluster policy dominates EU policy; ‘top-sectors’ and excellence are the choice of national policy makers; clusters are ‘in’. But, clusters are complex, clusters are ‘messy’; there is no clear
Goal inferences about robot behavior : goal inferences and human response behaviors
Broers, H.A.T.; Ham, J.R.C.; Broeders, R.; De Silva, P.; Okada, M.
2014-01-01
This explorative research focused on the goal inferences human observers draw based on a robot's behavior, and the extent to which those inferences predict people's behavior in response to that robot. Results show that different robot behaviors cause different response behavior from people.
Directory of Open Access Journals (Sweden)
Junha Shin
Phylogenetic profiling, a network inference method based on gene inheritance profiles, has been widely used to construct functional gene networks in microbes. However, its utility for network inference in higher eukaryotes has been limited. An improved algorithm with an in-depth understanding of pathway evolution may overcome this limitation. In this study, we investigated the effects of taxonomic structures on co-inheritance analysis using 2,144 reference species in four query species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, and Homo sapiens. We observed three clusters of reference species based on a principal component analysis of the phylogenetic profiles, which correspond to the three domains of life (Archaea, Bacteria, and Eukaryota), suggesting that pathways inherit primarily within specific domains or lower-ranked taxonomic groups during speciation. Hence, the co-inheritance pattern within a taxonomic group may be eroded by confounding inheritance patterns from irrelevant taxonomic groups. We demonstrated that co-inheritance analysis within domains substantially improved network inference not only in microbe species but also in the higher eukaryotes, including humans. Although we observed two sub-domain clusters of reference species within Eukaryota, co-inheritance analysis within these sub-domain taxonomic groups only marginally improved network inference. Therefore, we conclude that co-inheritance analysis within domains is the optimal approach to network inference with the given reference species. The construction of a series of human gene networks with increasing sample sizes of the reference species for each domain revealed that the size of the high-accuracy networks increased as additional reference species genomes were included, suggesting that within-domain co-inheritance analysis will continue to expand human gene networks as genomes of additional species are sequenced. Taken together, we propose that co
Using Alien Coins to Test Whether Simple Inference Is Bayesian
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Explanatory Preferences Shape Learning and Inference.
Lombrozo, Tania
2016-10-01
Explanations play an important role in learning and inference. People often learn by seeking explanations, and they assess the viability of hypotheses by considering how well they explain the data. An emerging body of work reveals that both children and adults have strong and systematic intuitions about what constitutes a good explanation, and that these explanatory preferences have a systematic impact on explanation-based processes. In particular, people favor explanations that are simple and broad, with the consequence that engaging in explanation can shape learning and inference by leading people to seek patterns and favor hypotheses that support broad and simple explanations. Given the prevalence of explanation in everyday cognition, understanding explanation is therefore crucial to understanding learning and inference. Copyright © 2016 Elsevier Ltd. All rights reserved.
Fuzzy logic controller using different inference methods
International Nuclear Information System (INIS)
Liu, Z.; De Keyser, R.
1994-01-01
In this paper the design of fuzzy controllers using different inference methods is introduced. The configuration of the fuzzy controllers includes a general rule base, which is a collection of fuzzy PI or PD rules, the triangular fuzzy data model, and a centre of gravity defuzzification algorithm. The generalized modus ponens (GMP) is used with the minimum operator of the triangular norm. Under the sup-min inference rule, six fuzzy implication operators are employed to calculate the fuzzy look-up tables for each rule base. The performance is tested in simulated systems with MATLAB/SIMULINK. Results show the effects of using the fuzzy controllers with different inference methods when applied to different test processes.
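The ingredients named in this abstract (triangular fuzzy sets, the minimum operator, sup-min composition, and centre of gravity defuzzification) can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration: the three labels N/Z/P, the normalised universe [-1, 1], the PD-like rule table, and the use of the Mamdani (minimum) implication, which is only one of many possible implication operators and is not claimed to be one of the six studied in the paper.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy sets on a normalised universe [-1, 1].
SETS = {"N": (-2.0, -1.0, 0.0), "Z": (-1.0, 0.0, 1.0), "P": (0.0, 1.0, 2.0)}

# Hypothetical PD-like rule base: (error, change of error) -> control output.
RULES = {
    ("N", "N"): "N", ("N", "Z"): "N", ("N", "P"): "Z",
    ("Z", "N"): "N", ("Z", "Z"): "Z", ("Z", "P"): "P",
    ("P", "N"): "Z", ("P", "Z"): "P", ("P", "P"): "P",
}

def infer(error, d_error, steps=201):
    """Return a crisp control value for crisp inputs in [-1, 1]."""
    grid = [-1.0 + 2.0 * i / (steps - 1) for i in range(steps)]
    aggregated = [0.0] * steps
    for (e_label, de_label), out_label in RULES.items():
        # Rule firing strength: minimum of the two antecedent memberships.
        strength = min(tri(error, *SETS[e_label]), tri(d_error, *SETS[de_label]))
        if strength == 0.0:
            continue
        for i, u in enumerate(grid):
            # Mamdani implication clips the consequent; max aggregates rules.
            clipped = min(strength, tri(u, *SETS[out_label]))
            aggregated[i] = max(aggregated[i], clipped)
    total = sum(aggregated)
    if total == 0.0:
        return 0.0
    # Discrete centre of gravity of the aggregated output set.
    return sum(u * m for u, m in zip(grid, aggregated)) / total
```

Evaluating `infer` over a grid of input pairs would give a look-up table of the kind the abstract mentions; swapping the `min` in the clipping step for another operator changes the implication while leaving the rest of the sketch intact.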
Uncertainty in prediction and in inference
International Nuclear Information System (INIS)
Hilgevoord, J.; Uffink, J.
1991-01-01
The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close relationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in inference can be obtained by means of the so-called statistical distance between probability distributions. When applied to quantum mechanics, this distance leads to a measure of the distinguishability of quantum states, which is essentially the absolute value of the matrix element between the states. The importance of this result to the quantum mechanical uncertainty principle is noted. The second part of the paper provides a derivation of the statistical distance on the basis of the so-called method of support.
A Learning Algorithm for Multimodal Grammar Inference.
D'Ulizia, A; Ferri, F; Grifoni, P
2011-12-01
The high costs of development and maintenance of multimodal grammars in integrating and understanding input in multimodal interfaces lead to the investigation of novel algorithmic solutions in automating grammar generation and in updating processes. Many algorithms for context-free grammar inference have been developed in the natural language processing literature. An extension of these algorithms toward the inference of multimodal grammars is necessary for multimodal input processing. In this paper, we propose a novel grammar inference mechanism that allows us to learn a multimodal grammar from its positive samples of multimodal sentences. The algorithm first generates the multimodal grammar that is able to parse the positive samples of sentences and, afterward, makes use of two learning operators and the minimum description length metrics in improving the grammar description and in avoiding the over-generalization problem. The experimental results highlight the acceptable performances of the algorithm proposed in this paper since it has a very high probability of parsing valid sentences.
Examples in parametric inference with R
Dixit, Ulhas Jayram
2016-01-01
This book discusses examples in parametric inference with R. Combining basic theory with modern approaches, it presents the latest developments and trends in statistical inference for students who do not have an advanced mathematical and statistical background. The topics discussed in the book are fundamental and common to many fields of statistical inference and thus serve as a point of departure for in-depth study. The book is divided into eight chapters: Chapter 1 provides an overview of topics on sufficiency and completeness, while Chapter 2 briefly discusses unbiased estimation. Chapter 3 focuses on the study of moments and maximum likelihood estimators, and Chapter 4 presents bounds for the variance. In Chapter 5, topics on consistent estimator are discussed. Chapter 6 discusses Bayes, while Chapter 7 studies some more powerful tests. Lastly, Chapter 8 examines unbiased and other tests. Senior undergraduate and graduate students in statistics and mathematics, and those who have taken an introductory cou...
Grammatical inference algorithms, routines and applications
Wieczorek, Wojciech
2017-01-01
This book focuses on grammatical inference, presenting classic and modern methods of grammatical inference from the perspective of practitioners. To do so, it employs the Python programming language to present all of the methods discussed. Grammatical inference is a field that lies at the intersection of multiple disciplines, with contributions from computational linguistics, pattern recognition, machine learning, computational biology, formal learning theory and many others. Though the book is largely practical, it also includes elements of learning theory, combinatorics on words, the theory of automata and formal languages, plus references to real-world problems. The listings presented here can be directly copied and pasted into other programs, thus making the book a valuable source of ready recipes for students, academic researchers, and programmers alike, as well as an inspiration for their further development.
Statistical inference based on divergence measures
Pardo, Leandro
2005-01-01
The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. However, in spite of the fact that divergence statistics have become a very good alternative to the classical likelihood ratio test and the Pearson-type statistic in discrete models, many statisticians remain unaware of this powerful approach. Statistical Inference Based on Divergence Measures explores classical problems of statistical inference, such as estimation and hypothesis testing, on the basis of measures of entropy and divergence. The first two chapters form an overview, from a statistical perspective, of the most important measures of entropy and divergence and study their properties. The author then examines the statistical analysis of discrete multivariate data with emphasis on problems in contingency tables and loglinear models using phi-divergence test statistics as well as minimum phi-divergence estimators. The final chapter looks at testing in general populations, prese...
Agricultural Clusters in the Netherlands
Schouten, M.A.; Heijman, W.J.M.
2012-01-01
Michael Porter was the first to use the term cluster in an economic context. He introduced the term in The Competitive Advantage of Nations (1990). The term cluster is also known as business cluster, industry cluster, competitive cluster or Porterian cluster. This article aims at determining and
Open source clustering software.
de Hoon, M J L; Imoto, S; Nolan, J; Miyano, S
2004-06-12
We have implemented k-means clustering, hierarchical clustering and self-organizing maps in a single multipurpose open-source library of C routines, callable from other C and C++ programs. Using this library, we have created an improved version of Michael Eisen's well-known Cluster program for Windows, Mac OS X and Linux/Unix. In addition, we generated a Python and a Perl interface to the C Clustering Library, thereby combining the flexibility of a scripting language with the speed of C. The C Clustering Library and the corresponding Python C extension module Pycluster were released under the Python License, while the Perl module Algorithm::Cluster was released under the Artistic License. The GUI code Cluster 3.0 for Windows, Macintosh and Linux/Unix, as well as the corresponding command-line program, were released under the same license as the original Cluster code. The complete source code is available at http://bonsai.ims.u-tokyo.ac.jp/mdehoon/software/cluster. Alternatively, Algorithm::Cluster can be downloaded from CPAN, while Pycluster is also available as part of the Biopython distribution.
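As an illustration of the first of the three methods listed, the following is a minimal pure-Python sketch of k-means using Lloyd's iteration. It is not the C Clustering Library, Pycluster, or Algorithm::Cluster interface; the function name `kmeans` and its parameters are hypothetical and chosen only for this example.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Cluster a list of equal-length tuples into k groups (Lloyd's algorithm)."""
    rng = random.Random(seed)
    # Initialise the centres with k distinct data points.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centre
        # (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centres are too
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers
```

A call such as `kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)` returns one label per point together with the final centres. The result of k-means generally depends on the initial centres, which is why a seed is exposed here.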
Improved Inference of Heteroscedastic Fixed Effects Models
Directory of Open Access Journals (Sweden)
Afshan Saeed
2016-12-01
Heteroscedasticity is a serious problem that distorts estimation and testing of the panel data model (PDM). Arellano (1987) proposed the White (1980) estimator for the PDM with heteroscedastic errors, but it provides erroneous inference for data sets that include high leverage points. In this paper, we attempt to improve the heteroscedasticity-consistent covariance matrix estimator (HCCME) for panel data sets with high leverage points. To draw robust inference for the PDM, our focus is to improve the kernel bootstrap estimators proposed by Racine and MacKinnon (2007). A Monte Carlo scheme is used to assess the results.
Likelihood inference for unions of interacting discs
DEFF Research Database (Denmark)
Møller, Jesper; Helisova, K.
2010-01-01
This is probably the first paper which discusses likelihood inference for a random set using a germ-grain model, where the individual grains are unobservable, edge effects occur and other complications appear. We consider the case where the grains form a disc process modelled by a marked point...... process, where the germs are the centres and the marks are the associated radii of the discs. We propose to use a recent parametric class of interacting disc process models, where the minimal sufficient statistic depends on various geometric properties of the random set, and the density is specified......-based maximum likelihood inference and the effect of specifying different reference Poisson models....
IMAGINE: Interstellar MAGnetic field INference Engine
Steininger, Theo
2018-03-01
IMAGINE (Interstellar MAGnetic field INference Engine) performs inference on generic parametric models of the Galaxy. The modular open source framework uses highly optimized tools and technology such as the MultiNest sampler (ascl:1109.006) and the information field theory framework NIFTy (ascl:1302.013) to create an instance of the Milky Way based on a set of parameters for physical observables, using Bayesian statistics to judge the mismatch between measured data and model prediction. The flexibility of the IMAGINE framework allows for simple refitting for newly available data sets and makes state-of-the-art Bayesian methods easily accessible particularly for random components of the Galactic magnetic field.
Inferring causality from noisy time series data
DEFF Research Database (Denmark)
Mønster, Dan; Fusaroli, Riccardo; Tylén, Kristian
2016-01-01
Convergent Cross-Mapping (CCM) has shown high potential to perform causal inference in the absence of models. We assess the strengths and weaknesses of the method by varying coupling strength and noise levels in coupled logistic maps. We find that CCM fails to infer accurate coupling strength...... and even causality direction in synchronized time-series and in the presence of intermediate coupling. We find that the presence of noise deterministically reduces the level of cross-mapping fidelity, while the convergence rate exhibits higher levels of robustness. Finally, we propose that controlled noise...
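The test system mentioned in this abstract, a pair of coupled logistic maps with adjustable coupling strength and noise, can be sketched as follows. The abstract does not give the equations, so the functional form, the default parameter values, and the choice of additive Gaussian observational noise are all assumptions taken from the form commonly used in the CCM literature; the function name `coupled_logistic` is hypothetical.

```python
import random

def coupled_logistic(n, rx=3.8, ry=3.5, y_on_x=0.02, x_on_y=0.1,
                     noise_sd=0.0, x0=0.4, y0=0.2, seed=0):
    """Simulate two coupled logistic maps and return noisy observations.

    Assumed dynamics:
        x[t+1] = x[t] * (rx - rx * x[t] - y_on_x * y[t])
        y[t+1] = y[t] * (ry - ry * y[t] - x_on_y * x[t])
    Gaussian noise with standard deviation noise_sd is added to the
    recorded values only, so it does not feed back into the dynamics.
    """
    rng = random.Random(seed)
    x, y = x0, y0
    xs, ys = [], []
    for _ in range(n):
        xs.append(x + rng.gauss(0.0, noise_sd))
        ys.append(y + rng.gauss(0.0, noise_sd))
        # Both updates use the values from the current time step.
        x, y = x * (rx - rx * x - y_on_x * y), y * (ry - ry * y - x_on_y * x)
    return xs, ys
```

Varying `y_on_x`, `x_on_y`, and `noise_sd` produces series of the kind on which a cross-mapping method could then be evaluated; the cross-mapping step itself is not shown here.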
A heuristic approach to possibilistic clustering algorithms and applications
Viattchenin, Dmitri A
2013-01-01
The present book outlines a new approach to possibilistic clustering in which the sought clustering structure of the set of objects is based directly on the formal definition of fuzzy cluster and the possibilistic memberships are determined directly from the values of the pairwise similarity of objects. The proposed approach can be used for solving different classification problems. Here, some techniques that might be useful for this purpose are outlined, including a methodology for constructing a set of labeled objects for a semi-supervised clustering algorithm, a methodology for reducing analyzed attribute space dimensionality and a method for asymmetric data processing. Moreover, a technique for constructing a subset of the most appropriate alternatives for a set of weak fuzzy preference relations, which are defined on a universe of alternatives, is described in detail, and a method for rapidly prototyping Mamdani fuzzy inference systems is introduced. This book addresses engineers, scientist...
Seismic Signal Compression Using Nonparametric Bayesian Dictionary Learning via Clustering
Directory of Open Access Journals (Sweden)
Xin Tian
2017-06-01
We introduce a seismic signal compression method based on nonparametric Bayesian dictionary learning via clustering. The seismic data is compressed patch by patch, and the dictionary is learned online. Clustering is introduced for dictionary learning: a set of dictionaries is generated, and each dictionary is used for one cluster’s sparse coding. In this way, the signals in one cluster can be well represented by their corresponding dictionary. A nonparametric Bayesian dictionary learning method is used to learn the dictionaries, which naturally infers an appropriate dictionary size for each cluster. A uniform quantizer and an adaptive arithmetic coding algorithm are adopted to code the sparse coefficients. The effectiveness of the proposed method is validated experimentally through comparisons with other state-of-the-art approaches.
Electron: Cluster interactions
International Nuclear Information System (INIS)
Scheidemann, A.A.; Knight, W.D.
1994-02-01
Beam depletion spectroscopy has been used to measure absolute total inelastic electron-sodium cluster collision cross sections in the energy range from E ∼ 0.1 to E ∼ 6 eV. The investigation focused on the closed shell clusters Na8, Na20, and Na40. The measured cross sections show an increase for the lowest collision energies where electron attachment is the primary scattering channel. The electron attachment cross section can be understood in terms of Langevin scattering, connecting this measurement with the polarizability of the cluster. For energies above the dissociation energy the measured electron-cluster cross section is energy independent, thus defining an electron-cluster interaction range. This interaction range increases with the cluster size.
Clustering high dimensional data
DEFF Research Database (Denmark)
Assent, Ira
2012-01-01
High-dimensional data, i.e., data described by a large number of attributes, pose specific challenges to clustering. The so-called ‘curse of dimensionality’, coined originally to describe the general increase in complexity of various computational problems as dimensionality increases, is known...... to render traditional clustering algorithms ineffective. The curse of dimensionality, among other effects, means that with increasing number of dimensions, a loss of meaningful differentiation between similar and dissimilar objects is observed. As high-dimensional objects appear almost alike, new approaches...... for clustering are required. Consequently, recent research has focused on developing techniques and clustering algorithms specifically for high-dimensional data. Still, open research issues remain. Clustering is a data mining task devoted to the automatic grouping of data based on mutual similarity. Each cluster...
Substructure in clusters of galaxies
International Nuclear Information System (INIS)
Fitchett, M.J.
1988-01-01
Optical observations suggesting the existence of substructure in clusters of galaxies are examined. Models of cluster formation and methods used to detect substructure in clusters are reviewed. Consideration is given to classification schemes based on a departure of bright cluster galaxies from a spherically symmetric distribution, evidence for statistically significant substructure, and various types of substructure, including velocity, spatial, and spatial-velocity substructure. The substructure observed in the galaxy distribution in clusters is discussed, focusing on observations from general cluster samples, the Virgo cluster, the Hydra cluster, Centaurus, the Coma cluster, and the Cancer cluster. 88 refs
fastBMA: scalable network inference and transitive reduction.
Hung, Ling-Hong; Shi, Kaiyuan; Wu, Migao; Young, William Chad; Raftery, Adrian E; Yeung, Ka Yee
2017-10-01
Inferring genetic networks from genome-wide expression data is extremely demanding computationally. We have developed fastBMA, a distributed, parallel, and scalable implementation of Bayesian model averaging (BMA) for this purpose. fastBMA also includes a computationally efficient module for eliminating redundant indirect edges in the network by mapping the transitive reduction to an easily solved shortest-path problem. We evaluated the performance of fastBMA on synthetic data and experimental genome-wide time series yeast and human datasets. When using a single CPU core, fastBMA is up to 100 times faster than the next fastest method, LASSO, with increased accuracy. It is a memory-efficient, parallel, and distributed application that scales to human genome-wide expression data. A 10 000-gene regulation network can be obtained in a matter of hours using a 32-core cloud cluster (2 nodes of 16 cores). fastBMA is a significant improvement over its predecessor ScanBMA. It is more accurate and orders of magnitude faster than other fast network inference methods such as the one based on LASSO. The improved scalability allows it to calculate networks from genome scale data in a reasonable time frame. The transitive reduction method can improve accuracy in denser networks. fastBMA is available as code (M.I.T. license) from GitHub (https://github.com/lhhunghimself/fastBMA), as part of the updated networkBMA Bioconductor package (https://www.bioconductor.org/packages/release/bioc/html/networkBMA.html) and as ready-to-deploy Docker images (https://hub.docker.com/r/biodepot/fastbma/). © The Authors 2017. Published by Oxford University Press.
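The idea of mapping transitive reduction to a shortest-path problem can be illustrated with a small sketch. This is not the fastBMA code: it assumes each edge carries a probability-like weight p in (0, 1], converts it to a cost of -log(p), and drops a direct edge when an indirect path between the same two nodes has a cost that is no higher. The function names and the pruning criterion are assumptions made for this example.

```python
import heapq
import math

def shortest_path_cost(graph, src, dst, skip_edge):
    """Dijkstra's algorithm from src to dst, ignoring the edge skip_edge."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            return cost
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, weight in graph.get(node, {}).items():
            if (node, nxt) == skip_edge:
                continue
            new_cost = cost + weight
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return math.inf

def prune_indirect_edges(edge_probs):
    """Keep only edges that no indirect path explains at least as well.

    edge_probs maps (source, target) to a weight in (0, 1].
    """
    graph = {}
    for (u, v), p in edge_probs.items():
        # -log turns products of probabilities into sums of costs.
        graph.setdefault(u, {})[v] = -math.log(p)
    kept = {}
    for (u, v), p in edge_probs.items():
        indirect = shortest_path_cost(graph, u, v, skip_edge=(u, v))
        if indirect > -math.log(p):
            kept[(u, v)] = p
    return kept
```

For example, with edges a→b and b→c at weight 0.9 and a direct edge a→c at weight 0.5, the indirect path has product 0.81, so the direct edge is removed; at weight 0.95 it would be kept.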
International Nuclear Information System (INIS)
Rae, W.D.M.; Merchant, A.C.
1993-01-01
We review clustering in light nuclei including molecular resonances in heavy ion reactions. In particular we study the systematics, paying special attention to the relationships between cluster states and superdeformed configurations. We emphasise the selection rules which govern the formation and decay of cluster states. We review some recent experimental results from Daresbury and elsewhere. In particular we report on the evidence for a 7-α chain state in 28Si in experiments recently performed at the NSF, Daresbury. Finally we begin to address theoretically the important question of the lifetimes of cluster states as deduced from the experimental energy widths of the resonances. (Author)
The Cluster Active Archive: Studying the Earth’s Space Plasma Environment
Laakso, Harri; Escoubet, C. Philippe
2010-01-01
Since the year 2000 the ESA Cluster mission has been investigating the small-scale structures and processes of the Earth's plasma environment, such as those involved in the interaction between the solar wind and the magnetospheric plasma, in global magnetotail dynamics, in cross-tail currents, and in the formation and dynamics of the neutral line and of plasmoids. This book contains presentations made at the 15th Cluster workshop held in March 2008. It also presents several articles about the Cluster Active Archive and its datasets, a few overview papers on the Cluster mission, and articles reporting on scientific findings on the solar wind, the magnetosheath, the magnetopause and the magnetotail.
International Nuclear Information System (INIS)
Sator, N.
2003-01-01
This article concerns the correspondence between thermodynamics and the morphology of simple fluids in terms of clusters. Definitions of clusters providing a geometric interpretation of the liquid-gas phase transition are reviewed with an eye to establishing their physical relevance. The author emphasizes their main features and basic hypotheses, and shows how these definitions lead to a recent approach based on self-bound clusters. Although theoretical, this tutorial review is also addressed to readers interested in experimental aspects of clustering in simple fluids
Model averaging, optimal inference and habit formation
Directory of Open Access Journals (Sweden)
Thomas H B FitzGerald
2014-06-01
Postulating that the brain performs approximate Bayesian inference generates principled and empirically testable models of neuronal function – the subject of much current interest in neuroscience and related disciplines. Current formulations address inference and learning under some assumed and particular model. In reality, organisms are often faced with an additional challenge – that of determining which model or models of their environment are the best for guiding behaviour. Bayesian model averaging – which says that an agent should weight the predictions of different models according to their evidence – provides a principled way to solve this problem. Importantly, because model evidence is determined by both the accuracy and complexity of the model, optimal inference requires that these be traded off against one another. This means an agent’s behaviour should show an equivalent balance. We hypothesise that Bayesian model averaging plays an important role in cognition, given that it is both optimal and realisable within a plausible neuronal architecture. We outline model averaging and how it might be implemented, and then explore a number of implications for brain and behaviour. In particular, we propose that model averaging can explain a number of apparently suboptimal phenomena within the framework of approximate (bounded) Bayesian inference, focussing particularly upon the relationship between goal-directed and habitual behaviour.
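The weighting rule described in this abstract (predictions weighted by model evidence) can be illustrated with a minimal sketch. It assumes a uniform prior over models; the function names and the example log-evidence and prediction values are hypothetical and are not taken from the paper.

```python
import math

def bma_weights(log_evidences):
    """Posterior model probabilities, assuming a uniform prior over models."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_evidences)
    unnorm = [math.exp(le - m) for le in log_evidences]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def bma_predict(predictions, log_evidences):
    """Evidence-weighted average of the per-model predictions."""
    weights = bma_weights(log_evidences)
    return sum(w * p for w, p in zip(weights, predictions))

# Hypothetical log evidences and predictions for three candidate models.
log_evidences = [-10.0, -11.0, -14.0]
predictions = [0.9, 0.5, 0.1]

weights = bma_weights(log_evidences)
averaged = bma_predict(predictions, log_evidences)
print(weights, averaged)
```

Because the weights are normalised, the averaged prediction always lies between the smallest and largest individual prediction, with the best-supported model dominating.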
Efficient Bayesian inference for ARFIMA processes
Graves, T.; Gramacy, R. B.; Franzke, C. L. E.; Watkins, N. W.
2015-03-01
Many geophysical quantities, like atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long-range dependence (LRD). LRD means that these quantities experience non-trivial temporal memory, which potentially enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LRD. In this paper we present a modern and systematic approach to the inference of LRD. Rather than Mandelbrot's fractional Gaussian noise, we use the more flexible Autoregressive Fractional Integrated Moving Average (ARFIMA) model which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LRD, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g. short memory effects) can be integrated over in order to focus on long memory parameters, and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data, with favorable comparison to the standard estimators.
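The long-memory component of an ARFIMA model is the fractional differencing operator (1 - B)^d, where B is the backshift operator. As a hedged illustration (not the approximate likelihood proposed in the paper), the sketch below computes the coefficients of that operator's binomial expansion; the function name and the choice d = 0.4 are assumptions made for the example.

```python
def frac_diff_weights(d, n_terms):
    """First n_terms coefficients pi_k in (1 - B)^d = sum_k pi_k B^k."""
    weights = [1.0]
    for k in range(1, n_terms):
        # Recurrence from the binomial series: pi_k = pi_{k-1} * (k - 1 - d) / k
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights

# d = 1 recovers ordinary first differencing: x_t - x_{t-1}.
first_diff = frac_diff_weights(1.0, 4)

# A fractional d in (0, 0.5) gives slowly decaying weights (long memory).
long_memory = frac_diff_weights(0.4, 50)
print(first_diff, long_memory[:5])
```

For 0 < d < 0.5 the coefficients decay as a power law rather than exponentially, which is the signature of long-range dependence discussed in the abstract.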
Campbell's and Rubin's Perspectives on Causal Inference
West, Stephen G.; Thoemmes, Felix
2010-01-01
Donald Campbell's approach to causal inference (D. T. Campbell, 1957; W. R. Shadish, T. D. Cook, & D. T. Campbell, 2002) is widely used in psychology and education, whereas Donald Rubin's causal model (P. W. Holland, 1986; D. B. Rubin, 1974, 2005) is widely used in economics, statistics, medicine, and public health. Campbell's approach focuses on…
Bayesian structural inference for hidden processes
Strelioff, Christopher C.; Crutchfield, James P.
2014-04-01
We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian structural inference (BSI) relies on a set of candidate unifilar hidden Markov model (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological ɛ-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be ɛ-machines, irrespective of estimated transition probabilities. Properties of ɛ-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite process internal structure being only indirectly present in data. We demonstrate BSI's effectiveness in estimating a process's randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare using the posterior distribution over candidate models and the single, maximum a posteriori model for point estimation and show that the former more accurately reflects uncertainty in estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well as to an out-of-class, infinite-state hidden process.
HIERARCHICAL PROBABILISTIC INFERENCE OF COSMIC SHEAR
International Nuclear Information System (INIS)
Schneider, Michael D.; Dawson, William A.; Hogg, David W.; Marshall, Philip J.; Bard, Deborah J.; Meyers, Joshua; Lang, Dustin
2015-01-01
Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probability distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.
Interest, Inferences, and Learning from Texts
Clinton, Virginia; van den Broek, Paul
2012-01-01
Topic interest and learning from texts have been found to be positively associated with each other. However, the reason for this positive association is not well understood. The purpose of this study is to examine a cognitive process, inference generation, that could explain the positive association between interest and learning from texts. In…
Ignorability in Statistical and Probabilistic Inference
DEFF Research Database (Denmark)
Jaeger, Manfred
2005-01-01
When dealing with incomplete data in statistical learning, or incomplete observations in probabilistic inference, one needs to distinguish the fact that a certain event is observed from the fact that the observed event has happened. Since the modeling and computational complexities entailed...
Evolutionary inference via the Poisson Indel Process.
Bouchard-Côté, Alexandre; Jordan, Michael I
2013-01-22
We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.
Culture and Pragmatic Inference in Interpersonal Communication
African Journals Online (AJOL)
cognitive process, and that the human capacity for inference is crucially important ... been noted that research in interpersonal communication is currently pushing the ... communicative actions, the social-cultural world of everyday life is not only ... personal experiences of the authors', as documented over time and recreated ...
Inference and the Introductory Statistics Course
Pfannkuch, Maxine; Regan, Matt; Wild, Chris; Budgett, Stephanie; Forbes, Sharleen; Harraway, John; Parsonage, Ross
2011-01-01
This article sets out some of the rationale and arguments for making major changes to the teaching and learning of statistical inference in introductory courses at our universities by changing from a norm-based, mathematical approach to more conceptually accessible computer-based approaches. The core problem of the inferential argument with its…
Statistical Inference on the Canadian Middle Class
Directory of Open Access Journals (Sweden)
Russell Davidson
2018-03-01
Conventional wisdom says that the middle classes in many developed countries have recently suffered losses, in terms of both the share of the total population belonging to the middle class, and also their share in total income. Here, distribution-free methods are developed for inference on these shares, by deriving expressions for the asymptotic variances of the sample estimates and for the covariance of the estimates. Asymptotic inference can be undertaken based on asymptotic normality. Bootstrap inference can be expected to be more reliable, and appropriate bootstrap procedures are proposed. As an illustration, samples of individual earnings drawn from Canadian census data are used to test various hypotheses about the middle-class shares, and confidence intervals for them are computed. It is found that, for the earlier censuses, sample sizes are large enough for asymptotic and bootstrap inference to be almost identical, but that, in the twenty-first century, the bootstrap fails on account of a strange phenomenon whereby many presumably different incomes in the data are rounded to one and the same value. Another difference between the centuries is the appearance of heavy right-hand tails in the income distributions of both men and women.
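A percentile bootstrap for a middle-class population share can be sketched as follows. This is only an illustration: the abstract does not say how the middle class is defined, so the 0.5 to 1.5 times the median income band is an assumption, the incomes are synthetic, and the bootstrap procedures actually proposed in the paper may differ.

```python
import random
import statistics

def middle_class_share(incomes, lo=0.5, hi=1.5):
    """Fraction of incomes within [lo, hi] times the median (assumed definition)."""
    med = statistics.median(incomes)
    return sum(lo * med <= x <= hi * med for x in incomes) / len(incomes)

def bootstrap_ci(incomes, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the population share."""
    rng = random.Random(seed)
    n = len(incomes)
    # Resample with replacement and recompute the share each time.
    stats = sorted(
        middle_class_share([rng.choice(incomes) for _ in range(n)])
        for _ in range(n_boot)
    )
    alpha = (1 - level) / 2
    lower = stats[int(alpha * n_boot)]
    upper = stats[int((1 - alpha) * n_boot) - 1]
    return lower, upper

# Synthetic log-normal incomes, for illustration only.
data_rng = random.Random(1)
incomes = [data_rng.lognormvariate(10.5, 0.6) for _ in range(400)]

share = middle_class_share(incomes)
lower, upper = bootstrap_ci(incomes)
print(share, (lower, upper))
```

Resampling with replacement produces many tied values, so a statistic built on the sample median can behave poorly when the data are heavily rounded, which is the kind of failure the abstract reports for recent censuses.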
Spurious correlations and inference in landscape genetics
Samuel A. Cushman; Erin L. Landguth
2010-01-01
Reliable interpretation of landscape genetic analyses depends on statistical methods that have high power to identify the correct process driving gene flow while rejecting incorrect alternative hypotheses. Little is known about statistical power and inference in individual-based landscape genetics. Our objective was to evaluate the power of causal modelling with partial...
Cortical information flow during inferences of agency
Dogge, Myrthel; Hofman, Dennis; Boersma, Maria; Dijkerman, H Chris; Aarts, Henk
2014-01-01
Building on the recent finding that agency experiences do not merely rely on sensorimotor information but also on cognitive cues, this exploratory study uses electroencephalographic recordings to examine functional connectivity during agency inference processing in a setting where action and outcome
Quasi-Experimental Designs for Causal Inference
Kim, Yongnam; Steiner, Peter
2016-01-01
When randomized experiments are infeasible, quasi-experimental designs can be exploited to evaluate causal treatment effects. The strongest quasi-experimental designs for causal inference are regression discontinuity designs, instrumental variable designs, matching and propensity score designs, and comparative interrupted time series designs. This…
The importance of learning when making inferences
Directory of Open Access Journals (Sweden)
Jorg Rieskamp
2008-03-01
The assumption that people possess a repertoire of strategies to solve the inference problems they face has been made repeatedly. The experimental findings of two previous studies on strategy selection are reexamined from a learning perspective, which argues that people learn to select strategies for making probabilistic inferences. This learning process is modeled with the strategy selection learning (SSL) theory, which assumes that people develop subjective expectancies for the strategies they have. They select strategies proportional to their expectancies, which are updated on the basis of experience. For the study by Newell, Weston, and Shanks (2003), it can be shown that people did not anticipate the success of a strategy from the beginning of the experiment. Instead, the behavior observed at the end of the experiment was the result of a learning process that can be described by the SSL theory. For the second study, by Bröder and Schiffer (2006), the SSL theory is able to provide an explanation for why participants only slowly adapted to new environments in a dynamic inference situation. The reanalysis of the previous studies illustrates the importance of learning for probabilistic inferences.
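The selection and updating rule summarised in this abstract (strategies chosen in proportion to expectancies, expectancies updated from experience) can be sketched roughly as below. The additive update, the strategy names, the initial expectancies, and the success probabilities are all hypothetical choices for illustration; the published SSL theory has additional parameters not described here.

```python
import random

def choose_strategy(expectancies, rng):
    """Pick a strategy with probability proportional to its expectancy."""
    names = list(expectancies)
    weights = [expectancies[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def update(expectancies, strategy, payoff, floor=1e-6):
    """Add the obtained payoff to the chosen strategy's expectancy."""
    expectancies[strategy] = max(expectancies[strategy] + payoff, floor)

rng = random.Random(0)
# Hypothetical strategies with equal initial expectancies.
expectancies = {"strategy_a": 1.0, "strategy_b": 1.0}
# Hypothetical probabilities that each strategy yields a correct inference.
success_prob = {"strategy_a": 0.8, "strategy_b": 0.6}

for _ in range(500):
    chosen = choose_strategy(expectancies, rng)
    payoff = 1.0 if rng.random() < success_prob[chosen] else 0.0
    update(expectancies, chosen, payoff)

total = sum(expectancies.values())
choice_probs = {name: value / total for name, value in expectancies.items()}
print(choice_probs)
```

Under this kind of rule, preferences emerge gradually from accumulated feedback rather than being present from the first trial, which matches the abstract's point that participants did not anticipate a strategy's success at the outset.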
Colligation, Or the Logical Inference of Interconnection
DEFF Research Database (Denmark)
Falster, Peter
1998-01-01
laws or assumptions. Yet interconnection as an abstract concept seems to be without scientific underpinning in pure logic. Adopting a historical viewpoint, our aim is to show that the reasoning of interconnection may be identified with a neglected kind of logical inference, called "colligation...
Colligation or, The Logical Inference of Interconnection
DEFF Research Database (Denmark)
Franksen, Ole Immanuel; Falster, Peter
2000-01-01
laws or assumptions. Yet interconnection as an abstract concept seems to be without scientific underpinning in pure logic. Adopting a historical viewpoint, our aim is to show that the reasoning of interconnection may be identified with a neglected kind of logical inference, called "colligation...
Inferring motion and location using WLAN RSSI
Kavitha Muthukrishnan, K.; van der Zwaag, B.J.; Havinga, Paul J.M.; Fuller, R.; Koutsoukos, X.
2009-01-01
We present novel algorithms to infer movement by making use of inherent fluctuations in the received signal strengths from existing WLAN infrastructure. We evaluate the performance of the presented algorithms based on classification metrics such as recall and precision using annotated traces
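The classification metrics named in this abstract, recall and precision, can be computed from annotated traces as in the following sketch. The labels "moving" and "still" and the example traces are hypothetical; the paper's actual annotation scheme is not given here.

```python
def precision_recall(true_labels, predicted_labels, positive="moving"):
    """Precision and recall for one class, from paired label sequences."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical annotated trace and classifier output.
truth = ["moving", "moving", "still", "still", "moving", "still"]
predicted = ["moving", "still", "still", "moving", "moving", "still"]

precision, recall = precision_recall(truth, predicted)
print(precision, recall)
```

Here two of the three predicted "moving" windows are correct and two of the three true "moving" windows are detected, so both metrics equal 2/3.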
International Nuclear Information System (INIS)
Elmegreen, Bruce G.; Galliano, Emmanuel; Alloin, Danielle
2009-01-01
Cluster formation and gas dynamics in the central regions of barred galaxies are not well understood. This paper reviews the environment of three 10^7 M_sun clusters near the inner Lindblad resonance (ILR) of the barred spiral NGC 1365. The morphology, mass, and flow of H I and CO gas in the spiral and barred regions are examined for evidence of the location and mechanism of cluster formation. The accretion rate is compared with the star formation rate to infer the lifetime of the starburst. The gas appears to move from inside corotation in the spiral region to looping filaments in the interbar region at a rate of ∼6 M_sun yr^-1 before impacting the bar dustlane somewhere along its length. The gas in this dustlane moves inward, growing in flux as a result of the accretion to ∼40 M_sun yr^-1 near the ILR. This inner rate exceeds the current nuclear star formation rate by a factor of 4, suggesting continued buildup of nuclear mass for another ∼0.5 Gyr. The bar may be only 1-2 Gyr old. Extrapolating the bar flow back in time, we infer that the clusters formed in the bar dustlane outside the central dust ring at a position where an interbar filament currently impacts the lane. The ram pressure from this impact is comparable to the pressure in the bar dustlane, and both are comparable to the pressure in the massive clusters. Impact triggering is suggested. The isothermal assumption in numerical simulations seems inappropriate for the rarefaction parts of spiral and bar gas flows. The clusters have enough lower-mass counterparts to suggest they are part of a normal power-law mass distribution. Gas trapping in the most massive clusters could explain their [Ne II] emission, which is not evident from the lower-mass clusters nearby.
Active inference, sensory attenuation and illusions.
Brown, Harriet; Adams, Rick A; Parees, Isabel; Edwards, Mark; Friston, Karl
2013-11-01
Active inference provides a simple and neurobiologically plausible account of how action and perception are coupled in producing (Bayes) optimal behaviour. This can be seen most easily as minimising prediction error: we can either change our predictions to explain sensory input through perception or actively change sensory input to fulfil our predictions. In active inference, this action is mediated by classical reflex arcs that minimise proprioceptive prediction error created by descending proprioceptive predictions. However, this creates a conflict between action and perception, in that self-generated movements require predictions to override the sensory evidence that one is not actually moving. However, ignoring sensory evidence means that externally generated sensations will not be perceived. Conversely, attending to (proprioceptive and somatosensory) sensations enables the detection of externally generated events but precludes generation of actions. This conflict can be resolved by attenuating the precision of sensory evidence during movement or, equivalently, attending away from the consequences of self-made acts. We propose that this Bayes optimal withdrawal of precise sensory evidence during movement is the cause of psychophysical sensory attenuation. Furthermore, it explains the force-matching illusion and reproduces empirical results almost exactly. Finally, if attenuation is removed, the force-matching illusion disappears and false (delusional) inferences about agency emerge. This is important, given the negative correlation between sensory attenuation and delusional beliefs in normal subjects and the reduction in the magnitude of the illusion in schizophrenia. Active inference therefore links the neuromodulatory optimisation of precision to sensory attenuation and illusory phenomena during the attribution of agency in normal subjects. It also provides a functional account of deficits in syndromes characterised by false inference.
High Speed White Dwarf Asteroseismology with the Herty Hall Cluster
Gray, Aaron; Kim, A.
2012-01-01
Asteroseismology is the process of using observed oscillations of stars to infer their interior structure. In high speed asteroseismology, we do so by quickly computing hundreds of thousands of models to match the observed period spectra. Each model on a single processor takes five to ten seconds to run. Therefore, we use a cluster of sixteen Dell Workstations with dual-core processors. The computers use the Ubuntu operating system and Apache Hadoop software to manage workloads.
Lifting to cluster-tilting objects in higher cluster categories
Liu, Pin
2008-01-01
In this note, we consider the $d$-cluster-tilted algebras, the endomorphism algebras of $d$-cluster-tilting objects in $d$-cluster categories. We show that a tilting module over such an algebra lifts to a $d$-cluster-tilting object in this $d$-cluster category.
International Nuclear Information System (INIS)
Webb, T. M. A.; O'Donnell, D.; Coppin, Kristen; Faloon, Ashley; Geach, James E.; Noble, Allison; Yee, H. K. C.; Gilbank, David; Ellingson, Erica; Gladders, Mike; Muzzin, Adam; Wilson, Gillian; Yan, Renbin
2013-01-01
We present the results of an infrared (IR) study of high-redshift galaxy clusters with the MIPS camera on board the Spitzer Space Telescope. We have assembled a sample of 42 clusters from the Red-Sequence Cluster Survey-1 over the redshift range 0.3 14-15 M ☉ . We statistically measure the number of IR-luminous galaxies in clusters above a fixed inferred IR luminosity of 2 × 10^11 M☉, assuming a star forming galaxy template, per unit cluster mass and find it increases to higher redshift. Fitting a simple power-law we measure evolution of (1 + z)^(5.1±1.9) over the range 0.3 cluster ). The evolution is similar, with ΣSFR/M_cluster ∼ (1 + z)^(5.4±1.9). We show that this can be accounted for by the evolution of the IR-bright field population over the same redshift range; that is, the evolution can be attributed entirely to the change in the in-falling field galaxy population. We show that the ΣSFR/M_cluster (binned over all redshift) decreases with increasing cluster mass with a slope (ΣSFR/M_cluster ∼ M_cluster^(-1.5±0.4)) consistent with the dependence of the stellar-to-total mass per unit cluster mass seen locally. The inferred star formation seen here could produce ∼5%-10% of the total stellar mass in massive clusters at z = 0, but we cannot constrain the descendant population, nor how rapidly the star formation must shut down once the galaxies have entered the cluster environment. Finally, we show a clear decrease in the number of IR-bright galaxies per unit optical galaxy in the cluster cores, confirming star formation continues to avoid the highest density regions of the universe at z ∼ 0.75 (the average redshift of the high-redshift clusters). While several previous studies appear to show enhanced star formation in high-redshift clusters relative to the field, we note that these papers have not accounted for the overall increase in galaxy or dark matter density at the location of clusters. Once this is done, clusters at z ∼ 0.75 have the same
Neurostimulation in cluster headache
DEFF Research Database (Denmark)
Pedersen, Jeppe L; Barloese, Mads; Jensen, Rigmor H
2013-01-01
PURPOSE OF REVIEW: Neurostimulation has emerged as a viable treatment for intractable chronic cluster headache. Several therapeutic strategies are being investigated including stimulation of the hypothalamus, occipital nerves and sphenopalatine ganglion. The aim of this review is to provide...... effective strategy must be preferred as first-line therapy for intractable chronic cluster headache....
DEFF Research Database (Denmark)
Ghorbani, Mohammad
2013-01-01
In this paper we introduce an instance of the well-known Neyman–Scott cluster process model with clusters having a long tail behaviour. In our model the offspring points are distributed around the parent points according to a circular Cauchy distribution. Using a modified Cramér-von Mises test...
S.M.W. Phlippen (Sandra); G.A. van der Knaap (Bert)
2007-01-01
Policy makers spend large amounts of public resources on the foundation of science parks and other forms of geographically clustered business activities, in order to stimulate regional innovation. Underlying the relation between clusters and innovation is the assumption that co-located
Huang, Yifen
2010-01-01
Mixed-initiative clustering is a task where a user and a machine work collaboratively to analyze a large set of documents. We hypothesize that a user and a machine can both learn better clustering models through enriched communication and interactive learning from each other. The first contribution of this thesis is providing a framework of…
1999-01-01
Atlas Image mosaic, covering 34' x 34' on the sky, of the Coma cluster, aka Abell 1656. This is a particularly rich cluster of individual galaxies (over 1000 members), most prominently the two giant ellipticals, NGC 4874 (right) and NGC 4889 (left). The remaining members are mostly smaller ellipticals, but spiral galaxies are also evident in the 2MASS image. The cluster is seen toward the constellation Coma Berenices, but is actually at a distance of about 100 Mpc (330 million light years, or a redshift of 0.023) from us. At this distance, the cluster is in what is known as the 'Hubble flow,' or the overall expansion of the Universe. As such, astronomers can measure the Hubble Constant, or the universal expansion rate, based on the distance to this cluster. Large, rich clusters, such as Coma, allow astronomers to measure the 'missing mass,' i.e., the matter in the cluster that we cannot see, since it gravitationally influences the motions of the member galaxies within the cluster. The near-infrared maps the overall luminous mass content of the member galaxies, since the light at these wavelengths is dominated by the more numerous older stellar populations. Galaxies, as seen by 2MASS, look fairly smooth and homogeneous, as can be seen from the Hubble 'tuning fork' diagram of near-infrared galaxy morphology. Image mosaic by S. Van Dyk (IPAC).
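The figures quoted in this caption (a distance of about 100 Mpc and a redshift of 0.023) allow a rough back-of-envelope estimate of the Hubble constant. The sketch below assumes the low-redshift approximation v ≈ cz and ignores the cluster's peculiar velocity, so it is an illustration rather than a measurement; the function name is hypothetical.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in km/s

def hubble_constant(redshift, distance_mpc):
    """H0 in km/s/Mpc from the low-redshift approximation v = c * z."""
    recession_velocity = SPEED_OF_LIGHT_KM_S * redshift
    return recession_velocity / distance_mpc

# Values quoted in the caption for the Coma cluster.
h0 = hubble_constant(0.023, 100.0)
print(round(h0, 1))
```

With these inputs the estimate comes out near 69 km/s/Mpc, which depends entirely on the rounded distance and redshift given in the caption.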
International Nuclear Information System (INIS)
Dubovik, V.M.; Gal'perin, A.G.; Rikhvitskij, V.S.; Lushnikov, A.A.
2000-01-01
The processes by which traffic blockages come into existence are treated as probabilistic. We study analytic solutions of models for the dynamics of cluster growth, both without and with fragmentation, in systems with a finite number of objects. Assuming constant rates of both coalescence and fragmentation, the models under consideration are linear in the probability functions.
International Nuclear Information System (INIS)
Hodgson, P.E.
1990-01-01
The effects of nucleon clustering in nuclei are described, with reference to both nuclear structure and nuclear reactions, and the advantages of using the cluster formalism to describe a range of phenomena are discussed. It is shown that bound and scattering alpha-particle states can be described in a unified way using an energy-dependent alpha-nucleus potential. (author)
Negotiating Cluster Boundaries
DEFF Research Database (Denmark)
Giacomin, Valeria
2017-01-01
Palm oil was introduced to Malay(si)a as an alternative to natural rubber, inheriting its cluster organizational structure. In the late 1960s, Malaysia became the world’s largest palm oil exporter. Based on archival material from British colonial institutions and agency houses, this paper focuses...... on the governance dynamics that drove institutional change within this cluster during decolonization. The analysis presents three main findings: (i) cluster boundaries are defined by continuous tug-of-war style negotiations between public and private actors; (ii) this interaction produces institutional change...... within the cluster, in the form of cumulative ‘institutional rounds’ – the correction or disruption of existing institutions or the creation of new ones; and (iii) this process leads to a broader inclusion of local actors in the original cluster configuration. The paper challenges the prevalent argument...
Mathematical classification and clustering
Mirkin, Boris
1996-01-01
I am very happy to have this opportunity to present the work of Boris Mirkin, a distinguished Russian scholar in the areas of data analysis and decision making methodologies. The monograph is devoted entirely to clustering, a discipline dispersed through many theoretical and application areas, from mathematical statistics and combinatorial optimization to biology, sociology and organizational structures. It compiles an immense amount of research done to date, including many original Russian developments never presented to the international community before (for instance, cluster-by-cluster versions of the K-Means method in Chapter 4 or uniform partitioning in Chapter 5). The author's approach, approximation clustering, allows him both to systematize a great part of the discipline and to develop many innovative methods in the framework of optimization problems. The optimization methods considered are proved to be meaningful in the contexts of data analysis and clustering. The material presented in ...
Neutrosophic Hierarchical Clustering Algoritms
Directory of Open Access Journals (Sweden)
Rıdvan Şahin
2014-03-01
Interval neutrosophic set (INS) is a generalization of interval valued intuitionistic fuzzy set (IVIFS), in which the membership and non-membership values of elements consist of fuzzy ranges, while single valued neutrosophic set (SVNS) is regarded as an extension of intuitionistic fuzzy set (IFS). In this paper, we extend the hierarchical clustering techniques proposed for IFSs and IVIFSs to SVNSs and INSs respectively. Based on the traditional hierarchical clustering procedure, the single valued neutrosophic aggregation operator, and the basic distance measures between SVNSs, we define a single valued neutrosophic hierarchical clustering algorithm for clustering SVNSs. Then we extend the algorithm to classify interval neutrosophic data. Finally, we present some numerical examples in order to show the effectiveness and availability of the developed clustering algorithms.
Cosmological constraints from Chandra observations of galaxy clusters.
Allen, Steven W
2002-09-15
Chandra observations of rich, relaxed galaxy clusters allow the properties of the X-ray gas and the total gravitating mass to be determined precisely. Here, we present results for a sample of the most X-ray luminous, dynamically relaxed clusters known. We show that the Chandra data and independent gravitational lensing studies provide consistent answers on the mass distributions in the clusters. The mass profiles exhibit a form in good agreement with the predictions from numerical simulations. Combining Chandra results on the X-ray gas mass fractions in the clusters with independent measurements of the Hubble constant and the mean baryonic matter density in the Universe, we obtain a tight constraint on the mean total matter density of the Universe, Omega(m), and an interesting constraint on the cosmological constant, Omega(Lambda). We also describe the 'virial relations' linking the masses, X-ray temperatures and luminosities of galaxy clusters. These relations provide a key step in linking the observed number density and spatial distribution of clusters to the predictions from cosmological models. The Chandra data confirm the presence of a systematic offset of ca. 40% between the normalization of the observed mass-temperature relation and the predictions from standard simulations. This finding leads to a significant revision of the best-fit value of sigma(8) inferred from the observed temperature and luminosity functions of clusters.
Testing the accuracy of clustering redshifts with simulations
Scottez, V.; Benoit-Lévy, A.; Coupon, J.; Ilbert, O.; Mellier, Y.
2018-03-01
We explore the accuracy of clustering-based redshift inference within the MICE2 simulation. This method uses the spatial clustering of galaxies between a spectroscopic reference sample and an unknown sample. This study gives an estimate of the reachable accuracy of this method. First, we discuss the requirements for the number of objects in the two samples, confirming that this method does not require a representative spectroscopic sample for calibration. In the context of the next generation of cosmological surveys, we estimated that the density of the Quasi Stellar Objects in BOSS allows us to reach 0.2 per cent accuracy in the mean redshift. Secondly, we estimate individual redshifts for galaxies in the densest regions of colour space (~30 per cent of the galaxies) without using the photometric redshifts procedure. The advantage of this procedure is threefold. It allows: (i) the use of cluster-zs for any field in astronomy, (ii) the possibility to combine photo-zs and cluster-zs to get an improved redshift estimation, (iii) the use of cluster-z to define tomographic bins for weak lensing. Finally, we explore this last option and build five cluster-z selected tomographic bins from redshift 0.2 to 1. We found a bias on the mean redshift estimate of 0.002 per bin. We conclude that cluster-z could be used as a primary redshift estimator by the next generation of cosmological surveys.
Herd Clustering: A synergistic data clustering approach using collective intelligence
Wong, Kachun; Peng, Chengbin; Li, Yue; Chan, Takming
2014-01-01
, this principle is used to develop a new clustering algorithm. Inspired by herd behavior, the clustering method is a synergistic approach using collective intelligence called Herd Clustering (HC). The novel part is laid in its first stage where data instances
Likelihood inference for unions of interacting discs
DEFF Research Database (Denmark)
Møller, Jesper; Helisová, Katarina
To the best of our knowledge, this is the first paper which discusses likelihood inference for a random set using a germ-grain model, where the individual grains are unobservable, edge effects occur, and other complications appear. We consider the case where the grains form a disc process modelled...... is specified with respect to a given marked Poisson model (i.e. a Boolean model). We show how edge effects and other complications can be handled by considering a certain conditional likelihood. Our methodology is illustrated by analyzing Peter Diggle's heather dataset, where we discuss the results...... of simulation-based maximum likelihood inference and the effect of specifying different reference Poisson models....
An Intuitive Dashboard for Bayesian Network Inference
International Nuclear Information System (INIS)
Reddy, Vikas; Farr, Anna Charisse; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K D V
2014-01-01
Current Bayesian network software packages provide a good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.
The NIFTY way of Bayesian signal inference
International Nuclear Information System (INIS)
Selig, Marco
2014-01-01
We introduce NIFTY, 'Numerical Information Field Theory', a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTY can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTY as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.
The NIFTy way of Bayesian signal inference
Selig, Marco
2014-12-01
We introduce NIFTy, "Numerical Information Field Theory", a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTy can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTy as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.
Bayesianism and inference to the best explanation
Directory of Open Access Journals (Sweden)
Valeriano IRANZO
2008-01-01
Bayesianism and Inference to the best explanation (IBE) are two different models of inference. Recently there has been some debate about the possibility of “bayesianizing” IBE. Firstly I explore several alternatives to include explanatory considerations in Bayes’s Theorem. Then I distinguish two different interpretations of prior probabilities: “IBE-Bayesianism” (IBE-Bay) and “frequentist-Bayesianism” (Freq-Bay). After detailing the content of the latter, I propose a rule for assessing the priors. I also argue that Freq-Bay: (i) endorses a role for explanatory value in the assessment of scientific hypotheses; (ii) avoids a purely subjectivist reading of prior probabilities; and (iii) fits better than IBE-Bayesianism with two basic facts about science, i.e., the prominent role played by empirical testing and the existence of many scientific theories in the past that failed to fulfil their promises and were subsequently abandoned.
Dopamine, reward learning, and active inference
Directory of Open Access Journals (Sweden)
Thomas FitzGerald
2015-11-01
Temporal difference learning models propose that phasic dopamine signalling encodes reward prediction errors that drive learning. This is supported by studies where optogenetic stimulation of dopamine neurons can stand in lieu of actual reward. Nevertheless, a large body of data also shows that dopamine is not necessary for learning, and that dopamine depletion primarily affects task performance. We offer a resolution to this paradox based on a hypothesis that dopamine encodes the precision of beliefs about alternative actions, and thus controls the outcome-sensitivity of behaviour. We extend an active inference scheme for solving Markov decision processes to include learning, and show that simulated dopamine dynamics strongly resemble those actually observed during instrumental conditioning. Furthermore, simulated dopamine depletion impairs performance but spares learning, while simulated excitation of dopamine neurons drives reward learning, through aberrant inference about outcome states. Our formal approach provides a novel and parsimonious reconciliation of apparently divergent experimental findings.
Dopamine, reward learning, and active inference.
FitzGerald, Thomas H B; Dolan, Raymond J; Friston, Karl
2015-01-01
Temporal difference learning models propose phasic dopamine signaling encodes reward prediction errors that drive learning. This is supported by studies where optogenetic stimulation of dopamine neurons can stand in lieu of actual reward. Nevertheless, a large body of data also shows that dopamine is not necessary for learning, and that dopamine depletion primarily affects task performance. We offer a resolution to this paradox based on an hypothesis that dopamine encodes the precision of beliefs about alternative actions, and thus controls the outcome-sensitivity of behavior. We extend an active inference scheme for solving Markov decision processes to include learning, and show that simulated dopamine dynamics strongly resemble those actually observed during instrumental conditioning. Furthermore, simulated dopamine depletion impairs performance but spares learning, while simulated excitation of dopamine neurons drives reward learning, through aberrant inference about outcome states. Our formal approach provides a novel and parsimonious reconciliation of apparently divergent experimental findings.
Inferring genetic interactions from comparative fitness data.
Crona, Kristina; Gavryushkin, Alex; Greene, Devin; Beerenwinkel, Niko
2017-12-20
Darwinian fitness is a central concept in evolutionary biology. In practice, however, it is hardly possible to measure fitness for all genotypes in a natural population. Here, we present quantitative tools to make inferences about epistatic gene interactions when the fitness landscape is only incompletely determined due to imprecise measurements or missing observations. We demonstrate that genetic interactions can often be inferred from fitness rank orders, where all genotypes are ordered according to fitness, and even from partial fitness orders. We provide a complete characterization of rank orders that imply higher order epistasis. Our theory applies to all common types of gene interactions and facilitates comprehensive investigations of diverse genetic interactions. We analyzed various genetic systems comprising HIV-1, the malaria-causing parasite Plasmodium vivax, the fungus Aspergillus niger, and the TEM-family of β-lactamase associated with antibiotic resistance. For all systems, our approach revealed higher order interactions among mutations.
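The idea that a rank order alone can imply epistasis can be illustrated in the simplest case of two biallelic loci. The sketch below is only an illustration under stated assumptions, not the paper's general characterization: it assumes a strict ranking of the four genotypes (no ties), uses the usual two-locus epistasis u = w00 + w11 - w01 - w10, and the function name `implied_epistasis_sign` is hypothetical. The sign of u is forced by the ranking exactly when the two terms of one sign can be paired off against strictly lower-ranked terms of the other sign.

```python
def implied_epistasis_sign(rank_order):
    """Return +1, -1 or 0 for a strict two-locus fitness rank order.

    rank_order lists the genotypes "00", "01", "10", "11" from fittest
    to least fit. The result is the sign of u = w00 + w11 - w01 - w10
    when the ranking alone determines it, and 0 when it does not.
    """
    # Smaller position means higher fitness.
    position = {g: i for i, g in enumerate(rank_order)}
    plus = sorted(position[g] for g in ("00", "11"))   # terms entering u with +
    minus = sorted(position[g] for g in ("01", "10"))  # terms entering u with -

    # If each "+" genotype outranks its paired "-" genotype, u > 0 for
    # every fitness assignment compatible with the ranking.
    if all(p < m for p, m in zip(plus, minus)):
        return 1
    # Symmetric case: each "-" genotype outranks its paired "+" genotype.
    if all(m < p for p, m in zip(plus, minus)):
        return -1
    # Otherwise both signs of u are compatible with the ranking.
    return 0


print(implied_epistasis_sign(["00", "01", "11", "10"]))  # ranking forces u > 0
print(implied_epistasis_sign(["00", "01", "10", "11"]))  # ranking is inconclusive
```

In the first call, w00 > w01 and w11 > w10 together give w00 + w11 > w01 + w10, so positive epistasis follows from the order alone; in the second call the lowest-ranked genotype 11 can be made arbitrarily unfit or nearly as fit as 10, so the sign of u is not determined.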
An emergent approach to analogical inference
Thibodeau, Paul H.; Flusberg, Stephen J.; Glick, Jeremy J.; Sternberg, Daniel A.
2013-03-01
In recent years, a growing number of researchers have proposed that analogy is a core component of human cognition. According to the dominant theoretical viewpoint, analogical reasoning requires a specific suite of cognitive machinery, including explicitly coded symbolic representations and a mapping or binding mechanism that operates over these representations. Here we offer an alternative approach: we find that analogical inference can emerge naturally and spontaneously from a relatively simple, error-driven learning mechanism without the need to posit any additional analogy-specific machinery. The results also parallel findings from the developmental literature on analogy, demonstrating a shift from an initial reliance on surface feature similarity to the use of relational similarity later in training. Variants of the model allow us to consider and rule out alternative accounts of its performance. We conclude by discussing how these findings can potentially refine our understanding of the processes that are required to perform analogical inference.
Pointwise probability reinforcements for robust statistical inference.
Frénay, Benoît; Verleysen, Michel
2014-02-01
Statistical inference using machine learning techniques may be difficult with small datasets because of abnormally frequent data (AFDs). AFDs are observations that are much more frequent in the training sample than they should be, with respect to their theoretical probability, and include e.g. outliers. Estimates of parameters tend to be biased towards models which support such data. This paper proposes to introduce pointwise probability reinforcements (PPRs): the probability of each observation is reinforced by a PPR and a regularisation allows controlling the amount of reinforcement which compensates for AFDs. The proposed solution is very generic, since it can be used to robustify any statistical inference method which can be formulated as a likelihood maximisation. Experiments show that PPRs can be easily used to tackle regression, classification and projection: models are freed from the influence of outliers. Moreover, outliers can be filtered manually since an abnormality degree is obtained for each observation. Copyright © 2013 Elsevier Ltd. All rights reserved.
Statistical inference from imperfect photon detection
International Nuclear Information System (INIS)
Audenaert, Koenraad M R; Scheel, Stefan
2009-01-01
We consider the statistical properties of photon detection with imperfect detectors that exhibit dark counts and less than unit efficiency, in the context of tomographic reconstruction. In this context, the detectors are used to implement certain positive operator-valued measures (POVMs) that would allow us to reconstruct the quantum state or quantum process under consideration. Here we look at the intermediate step of inferring outcome probabilities from measured outcome frequencies, and show how this inference can be performed in a statistically sound way in the presence of detector imperfections. Merging outcome probabilities for different sets of POVMs into a consistent quantum state picture has been treated elsewhere (Audenaert and Scheel 2009 New J. Phys. 11 023028). Single-photon pulsed measurements as well as continuous wave measurements are covered.
An Intuitive Dashboard for Bayesian Network Inference
Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.
2014-03-01
Current Bayesian network software packages provide a good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.
Working with sample data exploration and inference
Chaffe-Stengel, Priscilla
2014-01-01
Managers and analysts routinely collect and examine key performance measures to better understand their operations and make good decisions. Being able to render the complexity of operations data into a coherent account of significant events requires an understanding of how to work well with raw data and to make appropriate inferences. Although some statistical techniques for analyzing data and making inferences are sophisticated and require specialized expertise, there are methods that are understandable and applicable by anyone with basic algebra skills and the support of a spreadsheet package. By applying these fundamental methods themselves rather than turning over both the data and the responsibility for analysis and interpretation to an expert, managers will develop a richer understanding and potentially gain better control over their environment. This text is intended to describe these fundamental statistical techniques to managers, data analysts, and students. Statistical analysis of sample data is enh...
Parametric inference for biological sequence analysis.
Pachter, Lior; Sturmfels, Bernd
2004-11-16
One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
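For readers unfamiliar with the sum-product algorithm mentioned above, a minimal sketch on a hidden Markov model (where it reduces to the forward algorithm) may help. The two-state parameters `INIT`, `TRANS`, `EMIT` and the observation sequence `OBS` are made-up toy values, and the function names are hypothetical. The block computes the likelihood of the observations by the sum-product recursion and checks it against explicit enumeration of all hidden paths; it does not implement polytope propagation itself.

```python
from itertools import product

# Toy two-state HMM; all numbers are illustrative assumptions.
INIT = [0.6, 0.4]                    # P(first hidden state)
TRANS = [[0.7, 0.3], [0.4, 0.6]]     # TRANS[r][s] = P(next = s | current = r)
EMIT = [[0.9, 0.1], [0.2, 0.8]]      # EMIT[s][o] = P(observation o | state s)
OBS = [0, 1, 0]                      # observed symbols


def forward_likelihood(init, trans, emit, obs):
    """Sum-product on a chain: P(obs) in time linear in len(obs)."""
    n = len(init)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


def brute_force_likelihood(init, trans, emit, obs):
    """Same quantity by summing over every hidden path (exponential cost)."""
    total = 0.0
    for path in product(range(len(init)), repeat=len(obs)):
        p = init[path[0]] * emit[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]]
        total += p
    return total


fast = forward_likelihood(INIT, TRANS, EMIT, OBS)
slow = brute_force_likelihood(INIT, TRANS, EMIT, OBS)
print(abs(fast - slow) < 1e-12)
```

The recursion only ever uses sums and products of the model parameters, which is what makes it possible to swap in other operations (as the geometric variant described in the abstract does) without changing the structure of the computation.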
Inferences on Children’s Reading Groups
Directory of Open Access Journals (Sweden)
Javier González García
2009-05-01
This article focuses on the non-literal information of a text, which can be inferred from key elements or clues offered by the text itself. This kind of text is called implicit text or inference, due to the thinking process that it stimulates. The explicit resources that lead to information retrieval are related to others of implicit information, which have increased their relevance. This study analyzed, over two courses, how two teachers interpret three stories and how they establish a debate by dividing the class into three student groups. The sample was formed by two classes of two urban public schools of Burgos capital (Spain), and two of public schools of Tampico (Mexico). This allowed us to observe an increasing percentage value of the group focused on text comprehension, and a lesser percentage of the group perceiving comprehension as a secondary objective.
Inferring Genetic Ancestry: Opportunities, Challenges, and Implications
Royal, Charmaine D.; Novembre, John; Fullerton, Stephanie M.; Goldstein, David B.; Long, Jeffrey C.; Bamshad, Michael J.; Clark, Andrew G.
2010-01-01
Increasing public interest in direct-to-consumer (DTC) genetic ancestry testing has been accompanied by growing concern about issues ranging from the personal and societal implications of the testing to the scientific validity of ancestry inference. The very concept of “ancestry” is subject to misunderstanding in both the general and scientific communities. What do we mean by ancestry? How exactly is ancestry measured? How far back can such ancestry be defined and by which genetic tools? How ...
Spatial Inference Based on Geometric Proportional Analogies
Mullally, Emma-Claire; O'Donoghue, Diarmuid P.
2006-01-01
We describe an instance-based reasoning solution to a variety of spatial reasoning problems. The solution centers on identifying an isomorphic mapping between labelled graphs that represent some problem data and a known solution instance. We describe a number of spatial reasoning problems that are solved by generating non-deductive inferences, integrating topology with area (and other) features. We report the accuracy of our algorithm on different categories of spatial reasoning tasks from th...
Inferring ontology graph structures using OWL reasoning
Rodriguez-Garcia, Miguel Angel
2018-01-05
Ontologies are representations of a conceptualization of a domain. Traditionally, ontologies in biology were represented as directed acyclic graphs (DAG) which represent the backbone taxonomy and additional relations between classes. These graphs are widely exploited for data analysis in the form of ontology enrichment or computation of semantic similarity. More recently, ontologies are developed in a formal language such as the Web Ontology Language (OWL) and consist of a set of axioms through which classes are defined or constrained. While the taxonomy of an ontology can be inferred directly from the axioms of an ontology as one of the standard OWL reasoning tasks, creating general graph structures from OWL ontologies that exploit the ontologies' semantic content remains a challenge. We developed a method to transform ontologies into graphs using an automated reasoner while taking into account all relations between classes. Searching for (existential) patterns in the deductive closure of ontologies, we can identify relations between classes that are implied but not asserted and generate graph structures that encode for a large part of the ontologies' semantic content. We demonstrate the advantages of our method by applying it to inference of protein-protein interactions through semantic similarity over the Gene Ontology and demonstrate that performance is increased when graph structures are inferred using deductive inference according to our method. Our software and experiment results are available at http://github.com/bio-ontology-research-group/Onto2Graph . Onto2Graph is a method to generate graph structures from OWL ontologies using automated reasoning. The resulting graphs can be used for improved ontology visualization and ontology-based data analysis.
Role of Speaker Cues in Attention Inference
Jin Joo Lee; Cynthia Breazeal; David DeSteno
2017-01-01
Current state-of-the-art approaches to emotion recognition primarily focus on modeling the nonverbal expressions of the sole individual without reference to contextual elements such as the co-presence of the partner. In this paper, we demonstrate that the accurate inference of listeners’ social-emotional state of attention depends on accounting for the nonverbal behaviors of their storytelling partner, namely their speaker cues. To gain a deeper understanding of the role of speaker cues in at...
Inferring ontology graph structures using OWL reasoning.
Rodríguez-García, Miguel Ángel; Hoehndorf, Robert
2018-01-05
Ontologies are representations of a conceptualization of a domain. Traditionally, ontologies in biology were represented as directed acyclic graphs (DAG) which represent the backbone taxonomy and additional relations between classes. These graphs are widely exploited for data analysis in the form of ontology enrichment or computation of semantic similarity. More recently, ontologies are developed in a formal language such as the Web Ontology Language (OWL) and consist of a set of axioms through which classes are defined or constrained. While the taxonomy of an ontology can be inferred directly from the axioms of an ontology as one of the standard OWL reasoning tasks, creating general graph structures from OWL ontologies that exploit the ontologies' semantic content remains a challenge. We developed a method to transform ontologies into graphs using an automated reasoner while taking into account all relations between classes. Searching for (existential) patterns in the deductive closure of ontologies, we can identify relations between classes that are implied but not asserted and generate graph structures that encode for a large part of the ontologies' semantic content. We demonstrate the advantages of our method by applying it to inference of protein-protein interactions through semantic similarity over the Gene Ontology and demonstrate that performance is increased when graph structures are inferred using deductive inference according to our method. Our software and experiment results are available at http://github.com/bio-ontology-research-group/Onto2Graph . Onto2Graph is a method to generate graph structures from OWL ontologies using automated reasoning. The resulting graphs can be used for improved ontology visualization and ontology-based data analysis.
Constrained bayesian inference of project performance models
Sunmola, Funlade
2013-01-01
Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability such as indications of possible delivery problems. Approaches for monitoring project performance rely on available project information including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...
Using metacognitive cues to infer others' thinking
André Mata; Tiago Almeida
2014-01-01
Three studies tested whether people use cues about the way other people think (for example, whether others respond fast vs. slow) to infer what responses other people might give to reasoning problems. People who solve reasoning problems using deliberative thinking have better insight than intuitive problem-solvers into the responses that other people might give to the same problems. Presumably because deliberative responders think of intuitive responses before they think o...
Thermodynamics of statistical inference by cells.
Lang, Alex H; Fisher, Charles K; Mora, Thierry; Mehta, Pankaj
2014-10-03
The deep connection between thermodynamics, computation, and information is now well established both theoretically and experimentally. Here, we extend these ideas to show that thermodynamics also places fundamental constraints on statistical estimation and learning. To do so, we investigate the constraints placed by (nonequilibrium) thermodynamics on the ability of biochemical signaling networks to estimate the concentration of an external signal. We show that accuracy is limited by energy consumption, suggesting that there are fundamental thermodynamic constraints on statistical inference.
Cluster analysis of rural, urban, and curbside atmospheric particle size data.
Beddows, David C S; Dall'Osto, Manuel; Harrison, Roy M
2009-07-01
Particle size is a key determinant of the hazard posed by airborne particles. Continuous multivariate particle size data have been collected using aerosol particle size spectrometers sited at four locations within the UK: Harwell (Oxfordshire); Regents Park (London); British Telecom Tower (London); and Marylebone Road (London). These data have been analyzed using k-means cluster analysis, deduced to be the preferred cluster analysis technique, selected from an option of four partitional cluster packages, namely the following: Fuzzy; k-means; k-median; and Model-Based clustering. Using cluster validation indices, k-means clustering was shown to produce clusters with the smallest size, furthest separation, and importantly the highest degree of similarity between the elements within each partition. Using k-means clustering, the complexity of the data set is reduced, allowing characterization of the data according to the temporal and spatial trends of the clusters. At Harwell, the rural background measurement site, the cluster analysis showed that the spectra may be differentiated by their modal-diameters and average temporal trends showing either high counts during the day-time or night-time hours. Likewise for the urban sites, the cluster analysis differentiated the spectra into a small number of size distributions according to their modal-diameter, the location of the measurement site, and time of day. The responsible aerosol emission, formation, and dynamic processes can be inferred according to the cluster characteristics and correlation to concurrently measured meteorological, gas phase, and particle phase measurements.
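The k-means step named in this abstract can be illustrated with a minimal sketch of Lloyd's algorithm in plain Python. The toy two-bin "spectra", the function name `kmeans` and the deterministic initialisation from the first k points are assumptions made for illustration only; the study's actual software, preprocessing and cluster validation indices are not reproduced here.

```python
def kmeans(points, k, iterations=100):
    """Lloyd's algorithm; returns (centroids, labels).

    points is a list of equal-length numeric tuples. Centroids are
    initialised from the first k points so the run is deterministic
    (an illustrative choice, not a recommended initialisation).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for i, p in enumerate(points):
            j = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            labels[i] = j
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids, labels


# Toy normalised spectra with two size bins: mass in bin 1 or in bin 2.
spectra = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
centroids, labels = kmeans(spectra, k=2)
print(labels)
```

With real size spectra each tuple would hold the counts in every size bin, and the resulting centroid of a cluster can be read as a typical size distribution whose modal diameter and time-of-day frequency are then examined.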
Directory of Open Access Journals (Sweden)
Ciira wa Maina
2014-05-01
Gene transcription mediated by RNA polymerase II (pol-II) is a key step in gene expression. The dynamics of pol-II moving along the transcribed region influence the rate and timing of gene expression. In this work, we present a probabilistic model of transcription dynamics which is fitted to pol-II occupancy time course data measured using ChIP-Seq. The model can be used to estimate transcription speed and to infer the temporal pol-II activity profile at the gene promoter. Model parameters are estimated using either maximum likelihood estimation or via Bayesian inference using Markov chain Monte Carlo sampling. The Bayesian approach provides confidence intervals for parameter estimates and allows the use of priors that capture domain knowledge, e.g. the expected range of transcription speeds, based on previous experiments. The model describes the movement of pol-II down the gene body and can be used to identify the time of induction for transcriptionally engaged genes. By clustering the inferred promoter activity time profiles, we are able to determine which genes respond quickly to stimuli and group genes that share activity profiles and may therefore be co-regulated. We apply our methodology to biological data obtained using ChIP-seq to measure pol-II occupancy genome-wide when MCF-7 human breast cancer cells are treated with estradiol (E2). The transcription speeds we obtain agree with those obtained previously for smaller numbers of genes with the advantage that our approach can be applied genome-wide. We validate the biological significance of the pol-II promoter activity clusters by investigating cluster-specific transcription factor binding patterns and determining canonical pathway enrichment. We find that rapidly induced genes are enriched for both estrogen receptor alpha (ERα) and FOXA1 binding in their proximal promoter regions.
Fielicke, André; von Helden, Gert; Meijer, Gerard; Pedersen, David B; Simard, Benoit; Rayner, David M
2005-06-15
We report on the interaction of carbon monoxide with cationic gold clusters in the gas phase. Successive adsorption of CO molecules on the Au(n)(+) clusters proceeds until a cluster size specific saturation coverage is reached. Structural information for the bare gold clusters is obtained by comparing the saturation stoichiometry with the number of available equivalent sites presented by candidate structures of Au(n)(+). Our findings are in agreement with the planar structures of the Au(n)(+) cluster cations with n ≤ 7 that are suggested by ion mobility experiments [Gilb, S.; Weis, P.; Furche, F.; Ahlrichs, R.; Kappes, M. M. J. Chem. Phys. 2001, 116, 4094]. By inference we also establish the structure of the saturated Au(n)(CO)(m)(+) complexes. In certain cases we find evidence suggesting that successive adsorption of CO can distort the metal cluster framework. In addition, the vibrational spectra of the Au(n)(CO)(m)(+) complexes in both the CO stretching region and in the region of the Au-C stretch and the Au-C-O bend are measured using infrared photodepletion spectroscopy. The spectra further aid in the structure determination of Au(n)(+), provide information on the structure of the Au(n)(+)-CO complexes, and can be compared with spectra of CO adsorbates on deposited clusters or surfaces.
Sanfilippo, Antonio [Richland, WA; Calapristi, Augustin J [West Richland, WA; Crow, Vernon L [Richland, WA; Hetzler, Elizabeth G [Kennewick, WA; Turner, Alan E [Kennewick, WA
2009-12-22
Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture are described. In one aspect, a document clustering method includes providing a document set comprising a plurality of documents, providing a cluster comprising a subset of the documents of the document set, using a plurality of terms of the documents, providing a cluster label indicative of subject matter content of the documents of the cluster, wherein the cluster label comprises a plurality of word senses, and selecting one of the word senses of the cluster label.
Cluster-cluster correlations and constraints on the correlation hierarchy
Hamilton, A. J. S.; Gott, J. R., III
1988-01-01
The hypothesis that galaxies cluster around clusters at least as strongly as they cluster around galaxies imposes constraints on the hierarchy of correlation amplitudes in hierarchical clustering models. The distributions which saturate these constraints are the Rayleigh-Levy random walk fractals proposed by Mandelbrot; for these fractal distributions cluster-cluster correlations are all identically equal to galaxy-galaxy correlations. If correlation amplitudes exceed the constraints, as is observed, then cluster-cluster correlations must exceed galaxy-galaxy correlations, as is observed.
Formation of stable products from cluster-cluster collisions
International Nuclear Information System (INIS)
Alamanova, Denitsa; Grigoryan, Valeri G; Springborg, Michael
2007-01-01
The formation of stable products from copper cluster-cluster collisions is investigated by using classical molecular-dynamics simulations in combination with an embedded-atom potential. The dependence of the product clusters on impact energy, relative orientation of the clusters, and size of the clusters is studied. The structures and total energies of the product clusters are analysed and compared with those of the colliding clusters before impact. These results, together with the internal temperature, are used in obtaining an increased understanding of cluster fusion processes.
Bootstrap inference when using multiple imputation.
Schomaker, Michael; Heumann, Christian
2018-04-16
Many modern estimators require bootstrapping to calculate confidence intervals because either no analytic standard error is available or the distribution of the parameter of interest is nonsymmetric. It remains however unclear how to obtain valid bootstrap inference when dealing with multiple imputation to address missing data. We present 4 methods that are intuitively appealing, easy to implement, and combine bootstrap estimation with multiple imputation. We show that 3 of the 4 approaches yield valid inference, but that the performance of the methods varies with respect to the number of imputed data sets and the extent of missingness. Simulation studies reveal the behavior of our approaches in finite samples. A topical analysis from HIV treatment research, which determines the optimal timing of antiretroviral treatment initiation in young children, demonstrates the practical implications of the 4 methods in a sophisticated and realistic setting. This analysis suffers from missing data and uses the g-formula for inference, a method for which no standard errors are available. Copyright © 2018 John Wiley & Sons, Ltd.
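The abstract above does not spell out its four methods, so the following is only a minimal sketch of one generic way to nest imputation inside a bootstrap: resample the data first, impute each resample several times, and take a percentile interval over the averaged estimates. The function names, the hot-deck imputation step, and the choice of the mean as the estimator are all assumptions made for illustration; they are not taken from the paper.

```python
import random
import statistics


def impute_hot_deck(sample, rng):
    """Fill missing values (None) by drawing from the observed values."""
    observed = [x for x in sample if x is not None]
    if not observed:
        raise ValueError("resample contains no observed values")
    return [x if x is not None else rng.choice(observed) for x in sample]


def bootstrap_then_impute(data, n_boot=200, n_imp=5, alpha=0.05, seed=0):
    """Percentile interval for the mean: resample first, then impute."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Draw a bootstrap resample of the (partly missing) data.
        resample = [rng.choice(data) for _ in data]
        if all(x is None for x in resample):
            continue  # skip degenerate resamples with nothing observed
        # Average the estimate over several imputed copies of this resample.
        imputed_means = [
            statistics.fmean(impute_hot_deck(resample, rng)) for _ in range(n_imp)
        ]
        estimates.append(statistics.fmean(imputed_means))
    estimates.sort()
    lo = estimates[int((alpha / 2) * len(estimates))]
    hi = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lo, hi


if __name__ == "__main__":
    data = [1.0, 2.0, None, 4.0, 5.0, None, 3.0]
    print(bootstrap_then_impute(data))
```

Hot-deck imputation is used here only because it needs no model; a realistic analysis would replace it with a proper imputation model.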
Inferring epidemic network topology from surveillance data.
Directory of Open Access Journals (Sweden)
Xiang Wan
The transmission of infectious diseases can be affected by many factors, some of them hidden, making it difficult to accurately predict when and where outbreaks may emerge. One current approach is to develop and deploy surveillance systems in an effort to detect outbreaks as early as possible. This enables policy makers to modify and implement strategies for the control of the transmission. The accumulated surveillance data, including temporal, spatial, clinical, and demographic information, can provide valuable information with which to infer the underlying epidemic networks. Such networks can be quite informative and insightful as they characterize how infectious diseases transmit from one location to another. The aim of this work is to develop a computational model that allows inferences to be made regarding epidemic network topology in heterogeneous populations. We apply our model to the surveillance data from the 2009 H1N1 pandemic in Hong Kong. The inferred epidemic network displays a significant effect on the propagation of infectious diseases.
Role of Speaker Cues in Attention Inference
Directory of Open Access Journals (Sweden)
Jin Joo Lee
2017-10-01
Current state-of-the-art approaches to emotion recognition primarily focus on modeling the nonverbal expressions of the sole individual without reference to contextual elements such as the co-presence of the partner. In this paper, we demonstrate that the accurate inference of listeners’ social-emotional state of attention depends on accounting for the nonverbal behaviors of their storytelling partner, namely their speaker cues. To gain a deeper understanding of the role of speaker cues in attention inference, we conduct investigations into real-world interactions of children (5–6 years old) storytelling with their peers. Through in-depth analysis of human–human interaction data, we first identify nonverbal speaker cues (i.e., backchannel-inviting cues) and listener responses (i.e., backchannel feedback). We then demonstrate how speaker cues can modify the interpretation of attention-related backchannels as well as serve as a means to regulate the responsiveness of listeners. We discuss the design implications of our findings toward our primary goal of developing attention recognition models for storytelling robots, and we argue that social robots can proactively use speaker cues to form more accurate inferences about the attentive state of their human partners.
Cortical information flow during inferences of agency
Directory of Open Access Journals (Sweden)
Myrthel eDogge
2014-08-01
Building on the recent finding that agency experiences do not merely rely on sensorimotor information but also on cognitive cues, this exploratory study uses electroencephalographic recordings to examine functional connectivity during agency inference processing in a setting where action and outcome are independent. Participants completed a computerized task in which they pressed a button followed by one of two color words (red or blue) and rated their experienced agency over producing the color. Before executing the action, a matching or mismatching color word was pre-activated by explicitly instructing participants to produce the color (goal condition) or by briefly presenting the color word (prime condition). In both conditions, experienced agency was higher in matching versus mismatching trials. Furthermore, increased electroencephalography (EEG)-based connectivity strength was observed between parietal and frontal nodes and within the (pre)frontal cortex when color-outcomes matched with goals and participants reported high agency. This pattern of increased connectivity was not identified in trials where outcomes were pre-activated through primes. These results suggest that different connections are involved in the experience and in the loss of agency, as well as in inferences of agency resulting from different types of pre-activation. Moreover, the findings provide novel support for the involvement of a fronto-parietal network in agency inferences.
Causal inference of asynchronous audiovisual speech
Directory of Open Access Journals (Sweden)
John F Magnotti
2013-11-01
During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.
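The common-cause step described in the abstract above can be written as a generic application of Bayes' rule. The symbols here are assumptions introduced for illustration and are not taken from the paper: x is the observer's noisy measurement of the audiovisual asynchrony, C = 1 denotes a common cause, C = 2 denotes separate causes, and p_c is the prior probability of a common cause.

```latex
\[
P(C = 1 \mid x)
  = \frac{p(x \mid C = 1)\, p_c}
         {p(x \mid C = 1)\, p_c + p(x \mid C = 2)\,(1 - p_c)}
\]
```

Under this sketch, an observer would report "synchronous" when P(C = 1 | x) exceeds 1/2. The reliability of the cues would enter through the width of the likelihoods p(x | C); the abstract does not give the specific distributions used.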
Functional neuroanatomy of intuitive physical inference.
Fischer, Jason; Mikhael, John G; Tenenbaum, Joshua B; Kanwisher, Nancy
2016-08-23
To engage with the world (to understand the scene in front of us, plan actions, and predict what will happen next), we must have an intuitive grasp of the world's physical structure and dynamics. How do the objects in front of us rest on and support each other, how much force would be required to move them, and how will they behave when they fall, roll, or collide? Despite the centrality of physical inferences in daily life, little is known about the brain mechanisms recruited to interpret the physical structure of a scene and predict how physical events will unfold. Here, in a series of fMRI experiments, we identified a set of cortical regions that are selectively engaged when people watch and predict the unfolding of physical events: a "physics engine" in the brain. These brain regions are selective to physical inferences relative to nonphysical but otherwise highly similar scenes and tasks. However, these regions are not exclusively engaged in physical inferences per se or, indeed, even in scene understanding; they overlap with the domain-general "multiple demand" system, especially the parts of that system involved in action planning and tool use, pointing to a close relationship between the cognitive and neural mechanisms involved in parsing the physical content of a scene and preparing an appropriate action.
Elements of Causal Inference: Foundations and Learning Algorithms
DEFF Research Database (Denmark)
Peters, Jonas Martin; Janzing, Dominik; Schölkopf, Bernhard
A concise and self-contained introduction to causal inference, increasingly important in data science and machine learning...
Integrating distributed Bayesian inference and reinforcement learning for sensor management
Grappiolo, C.; Whiteson, S.; Pavlin, G.; Bakker, B.
2009-01-01
This paper introduces a sensor management approach that integrates distributed Bayesian inference (DBI) and reinforcement learning (RL). DBI is implemented using distributed perception networks (DPNs), a multiagent approach to performing efficient inference, while RL is used to automatically
An integrative approach to inferring biologically meaningful gene modules
Directory of Open Access Journals (Sweden)
Wang Kai
2011-07-01
Background: The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method which directly incorporates gene ontology (GO) annotation in construction of gene modules in order to gain better functional association. Results: We have devised a method, Semantic Similarity-Integrated approach for Modularization (SSIM), that integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: (1) the osmotic shock response in Saccharomyces cerevisiae, and (2) the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. Conclusions: The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high throughput experimental data at the gene network level.
Tune Your Brown Clustering, Please
DEFF Research Database (Denmark)
Derczynski, Leon; Chester, Sean; Bøgh, Kenneth Sejdenfaden
2015-01-01
Brown clustering, an unsupervised hierarchical clustering technique based on ngram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly...
Cluster Management Institutionalization
DEFF Research Database (Denmark)
Normann, Leo; Agger Nielsen, Jeppe
2015-01-01
of how it was legitimized as a “ready-to-use” management model. Further, our account reveals how cluster management translated into considerably different local variants as it travelled into specific organizations. However, these processes have not occurred sequentially with cluster management first...... legitimized at the field level, then spread, and finally translated into action in the adopting organizations. Instead, we observed entangled field and organizational-level processes. Accordingly, we argue that cluster management institutionalization is most readily understood by simultaneously investigating...
DEFF Research Database (Denmark)
Laursen, Lea Louise Holst; Møller, Jørgen
2013-01-01
villages in order to secure their future. This paper will address the concept of cluster-villages as a possible approach to strengthen the conditions of contemporary Danish villages. Cluster-villages is a concept that gathers a number of villages in a network-structure where the villages both work together...... two different positions we see a new opportunity for village development, which we call Clustervillages. In order to investigate the potentials and possibilities of the cluster-village concept the paper will seek to unfold the concept strategically, looking into the benefits of such a concept. Further, the paper seeks...
Dennis, Andrew K
2013-01-01
This book follows a step-by-step, tutorial-based approach which will teach you how to develop your own super cluster using Raspberry Pi computers quickly and efficiently. Raspberry Pi Super Cluster is an introductory guide for those interested in experimenting with parallel computing at home. Aimed at Raspberry Pi enthusiasts, this book is a primer for getting your first cluster up and running. Basic knowledge of C or Java would be helpful but no prior knowledge of parallel computing is necessary.
Introduction to cluster dynamics
Reinhard, Paul-Gerhard
2008-01-01
Clusters as mesoscopic particles represent an intermediate state of matter between single atoms and solid material. The tendency to miniaturise technical objects requires knowledge about systems which contain only a "small" number of atoms or molecules. This is all the more true for dynamical aspects, particularly in relation to the quick development of laser technology and femtosecond spectroscopy. Here, for the first time, is a highly qualitative introduction to cluster physics. With its emphasis on cluster dynamics, this will be vital to everyone involved in this interdisciplinary subje
DEFF Research Database (Denmark)
Giacomin, Valeria
This dissertation examines the case of the palm oil cluster in Malaysia and Indonesia, today one of the largest agricultural clusters in the world. My analysis focuses on the evolution of the cluster from the 1880s to the 1970s in order to understand how it helped these two countries to integrate...... into the global economy in both colonial and post-colonial times. The study is based on empirical material drawn from five UK archives and background research using secondary sources, interviews, and archive visits to Malaysia and Singapore. The dissertation comprises three articles, each discussing a major under...
Korol, Andrey V.; Solov'yov, Andrey
2013-01-01
Atomic cluster collisions are a field of rapidly emerging research interest for both experimentalists and theorists. The international symposium on atomic cluster collisions (ISSAC) is the premier forum to present cutting-edge research in this field. It was established in 2003 and the most recent conference was held in Berlin, Germany in July of 2011. This Topical Issue presents original research results from some of the participants who attended this conference. This issue specifically focuses on two research areas, namely Clusters and Fullerenes in External Fields and Nanoscale Insights in Radiation Biodamage.
Using AFLP markers and the Geneland program for the inference of population genetic structure
DEFF Research Database (Denmark)
Guillot, Gilles; Santos, Filipe
2010-01-01
the computer program Geneland designed to infer population structure has been adapted to deal with dominant markers; and (ii) we use Geneland for numerical comparison of dominant and codominant markers to perform clustering. AFLP markers lead to less accurate results than bi-allelic codominant markers...... such as single nucleotide polymorphisms (SNP) markers but this difference becomes negligible for data sets of common size (number of individuals n≥100, number of markers L≥200). The latest Geneland version (3.2.1) handling dominant markers is freely available as an R package with a fully clickable graphical...
Combining cluster number counts and galaxy clustering
Energy Technology Data Exchange (ETDEWEB)
Lacasa, Fabien; Rosenfeld, Rogerio, E-mail: fabien@ift.unesp.br, E-mail: rosenfel@ift.unesp.br [ICTP South American Institute for Fundamental Research, Instituto de Física Teórica, Universidade Estadual Paulista, São Paulo (Brazil)
2016-08-01
The abundance of clusters and the clustering of galaxies are two of the important cosmological probes for current and future large scale surveys of galaxies, such as the Dark Energy Survey. In order to combine them one has to account for the fact that they are not independent quantities, since they probe the same density field. It is important to develop a good understanding of their correlation in order to extract parameter constraints. We present a detailed modelling of the joint covariance matrix between cluster number counts and the galaxy angular power spectrum. We employ the framework of the halo model complemented by a Halo Occupation Distribution model (HOD). We demonstrate the importance of accounting for non-Gaussianity to produce accurate covariance predictions. Indeed, we show that the non-Gaussian covariance becomes dominant at small scales, low redshifts or high cluster masses. We discuss in particular the case of the super-sample covariance (SSC), including the effects of galaxy shot-noise, halo second order bias and non-local bias. We demonstrate that the SSC obeys mathematical inequalities and positivity. Using the joint covariance matrix and a Fisher matrix methodology, we examine the prospects of combining these two probes to constrain cosmological and HOD parameters. We find that the combination indeed results in noticeably better constraints, with improvements of order 20% on cosmological parameters compared to the best single probe, and even greater improvement on HOD parameters, with reduction of error bars by a factor 1.4-4.8. This happens in particular because the cross-covariance introduces a synergy between the probes on small scales. We conclude that accounting for non-Gaussian effects is required for the joint analysis of these observables in galaxy surveys.
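A Fisher forecast with a joint covariance, as mentioned in the abstract above, typically takes the following form. This is a generic sketch under the assumption of a Gaussian likelihood with parameter-independent covariance; the symbols (data vector D, blocks C_NN, C_NC, C_CC, parameters θ) are introduced here for illustration and the abstract does not give the paper's exact expressions.

```latex
\[
\mathbf{D} =
\begin{pmatrix} N_{\mathrm{cl}} \\ C_\ell^{\mathrm{gal}} \end{pmatrix},
\qquad
\mathsf{C} =
\begin{pmatrix}
  \mathsf{C}_{NN} & \mathsf{C}_{NC} \\
  \mathsf{C}_{NC}^{T} & \mathsf{C}_{CC}
\end{pmatrix},
\qquad
F_{\alpha\beta} =
\frac{\partial \mathbf{D}^{T}}{\partial \theta_\alpha}\,
\mathsf{C}^{-1}\,
\frac{\partial \mathbf{D}}{\partial \theta_\beta},
\qquad
\sigma(\theta_\alpha) \ge \sqrt{\left(F^{-1}\right)_{\alpha\alpha}}
\]
```

Setting the off-diagonal block C_NC to zero would treat the two probes as independent; keeping it is what allows the cross-covariance to affect the forecast constraints.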
Bootstrapping phylogenies inferred from rearrangement data
Directory of Open Access Journals (Sweden)
Lin Yu
2012-08-01
Background: Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. Results: We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Conclusions: Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its
Bootstrapping phylogenies inferred from rearrangement data.
Lin, Yu; Rajan, Vaibhav; Moret, Bernard Me
2012-08-29
Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its support values follow a similar scale and its receiver
Type Inference for Session Types in the Pi-Calculus
DEFF Research Database (Denmark)
Graversen, Eva Fajstrup; Harbo, Jacob Buchreitz; Huttel, Hans
2014-01-01
In this paper we present a direct algorithm for session type inference for the π-calculus. Type inference for session types has previously been achieved by either imposing limitations and restrictions on the π-calculus, or by reducing the type inference problem to that for linear types. Our approach...
Reasoning about Informal Statistical Inference: One Statistician's View
Rossman, Allan J.
2008-01-01
This paper identifies key concepts and issues associated with the reasoning of informal statistical inference. I focus on key ideas of inference that I think all students should learn, including at secondary level as well as tertiary. I argue that a fundamental component of inference is to go beyond the data at hand, and I propose that statistical…
Statistical Inference at Work: Statistical Process Control as an Example
Bakker, Arthur; Kent, Phillip; Derry, Jan; Noss, Richard; Hoyles, Celia
2008-01-01
To characterise statistical inference in the workplace this paper compares a prototypical type of statistical inference at work, statistical process control (SPC), with a type of statistical inference that is better known in educational settings, hypothesis testing. Although there are some similarities between the reasoning structure involved in…
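As a minimal sketch of the kind of inference a control chart supports, the snippet below computes three-sigma limits from baseline measurements and flags later points that fall outside them. The function names and the use of the sample standard deviation are assumptions for illustration; the abstract does not describe a specific chart, and practical Shewhart charts often estimate sigma from subgroup or moving ranges instead.

```python
import statistics


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits from in-control baseline data."""
    center = statistics.fmean(baseline)
    sigma = statistics.stdev(baseline)  # sample standard deviation
    return center - k * sigma, center, center + k * sigma


def out_of_control(points, limits):
    """Indices of points that fall outside the control limits."""
    lower, _, upper = limits
    return [i for i, x in enumerate(points) if x < lower or x > upper]


if __name__ == "__main__":
    baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0]
    limits = control_limits(baseline)
    print(out_of_control([10.0, 10.1, 11.5, 9.9], limits))  # prints [2]
```

A point outside the limits plays a role loosely analogous to rejecting a null hypothesis of "process in control", which is the similarity in reasoning structure the abstract alludes to.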
Malle, Bertram F; Holbrook, Jess
2012-04-01
People interpret behavior by making inferences about agents' intentionality, mind, and personality. Past research studied such inferences 1 at a time; in real life, people make these inferences simultaneously. The present studies therefore examined whether 4 major inferences (intentionality, desire, belief, and personality), elicited simultaneously in response to an observed behavior, might be ordered in a hierarchy of likelihood and speed. To achieve generalizability, the studies included a wide range of stimulus behaviors, presented them verbally and as dynamic videos, and assessed inferences both in a retrieval paradigm (measuring the likelihood and speed of accessing inferences immediately after they were made) and in an online processing paradigm (measuring the speed of forming inferences during behavior observation). Five studies provide evidence for a hierarchy of social inferences-from intentionality and desire to belief to personality-that is stable across verbal and visual presentations and that parallels the order found in developmental and primate research. (c) 2012 APA, all rights reserved.
International Nuclear Information System (INIS)
Walther, B.
1988-01-01
This part of the review on metal cluster compounds deals with clusters containing isolated main group element atoms, with high nuclearity clusters and metal cluster fluxionality. It will be obvious that main group element atoms strongly influence the geometry, stability and reactivity of the clusters. High nuclearity clusters are of interest in their own right due to the diversity of the structures adopted, but their intermediate position between molecules and the metallic state makes them a fascinating research object too. Both of these sides of metal cluster chemistry, as well as the frequently observed ligand and core fluxionality, are related to the cluster metal and surface analogy. (author)
Disentangling Porterian Clusters
DEFF Research Database (Denmark)
Jagtfelt, Tue
, contested theory become so widely disseminated and applied as a normative and prescriptive strategy for economic development? The dissertation traces the introduction of the cluster notion into the EU’s Lisbon Strategy and demonstrates how its inclusion originates from Porter’s colleagues: Professor Örjan...... to his membership on the Commission on Industrial Competitiveness, and that the cluster notion found in his influential book, Nations, represents a significant shift in his conception of cluster compared with his early conceptions. This shift, it is argued, is a deliberate attempt by Porter to create...... a paradigmatic textbook that follows Kuhn’s blueprint for scientific revolutions by instilling Nations with circular references and thus creating a local linguistic holism conceptualized through an encompassing notion of cluster. The dissertation concludes that the two research questions are philosophically...
International Nuclear Information System (INIS)
Teller, E.
1985-01-01
In the following, a few simple remarks on the evolution and properties of stellar clusters will be collected. In particular, globular clusters will be considered. Though details of such clusters are often not known, a few questions can be clarified with the help of primitive arguments. These are:- why are spherical clusters spherical, why do they have high densities, why do they consist of approximately a million stars, how may a black hole of great mass form within them, may they be the origin of gamma-ray bursts, may their invisible remnants account for the missing mass of our galaxy. The available data do not warrant a detailed evaluation. However, it is remarkable that exceedingly simple models can shed some light on the questions enumerated above. (author)
DEFF Research Database (Denmark)
Loukonen, Ville; Bork, Nicolai; Vehkamaki, Hanna
2014-01-01
-principles molecular dynamics collision simulations of (sulphuric acid)1(water)0, 1 + (dimethylamine) → (sulphuric acid)1(dimethylamine)1(water)0, 1 cluster formation processes. The simulations indicate that the sticking factor in the collisions is unity: the interaction between the molecules is strong enough...... control. As a consequence, the clusters show very dynamic ion pair structure, which differs from both the static structure optimisation calculations and the equilibrium first-principles molecular dynamics simulations. In some of the simulation runs, water mediates the proton transfer by acting as a proton...... to overcome the possible initial non-optimal collision orientations. No post-collisional cluster break up is observed. The reasons for the efficient clustering are (i) the proton transfer reaction which takes place in each of the collision simulations and (ii) the subsequent competition over the proton...
Ruzmaikin, A.
1997-01-01
Observations show that newly emerging flux tends to appear on the Solar surface at sites where there is flux already. This results in clustering of solar activity. Standard dynamo theories do not predict this effect.
Technology innovation clusters are geographic concentrations of interconnected companies, universities, and other organizations with a focus on environmental technology. They play a key role in addressing the nation’s pressing environmental problems.
Evolution of clustered storage
CERN. Geneva; Van de Vyvre, Pierre
2007-01-01
The session actually featured two presentations: * Evolution of clustered storage by Lance Hukill, Quantum Corporation * ALICE DAQ - Usage of a Cluster-File System: Quantum StorNext by Pierre Vande Vyvre, CERN-PH, the second one prepared at short notice by Pierre (thanks!) to present how the Quantum technologies are being used in the ALICE experiment. The abstract of Mr Hukill's presentation follows. Clustered Storage is a technology that is driven by business and mission applications. The evolution of Clustered Storage solutions starts first at the alignment between end-user needs and industry trends: * Push-and-Pull between managing for today versus planning for tomorrow * Breaking down the real business problems to the core applications * Commoditization of clients, servers, and target devices * Interchangeability, Interoperability, Remote Access, Centralized control * Oh, and yes, there is a budget and the "real world" to deal with This presentation will talk through these needs and trends, and then ask the question, ...
White, S
1994-01-01
Galaxy clusters are the largest coherent objects in the Universe. It has been known since 1933 that their dynamical properties require either a modification of the theory of gravity, or the presence of a dominant component of unseen material of unknown nature. Clusters still provide the best laboratories for studying the amount and distribution of this dark matter relative to the material which can be observed directly: the galaxies themselves and the hot, X-ray-emitting gas which lies between them. Imaging and spectroscopy of clusters by satellite-borne X-ray telescopes has greatly improved our knowledge of the structure and composition of this intergalactic medium. The results permit a number of new approaches to some fundamental cosmological questions, but current indications from the data are contradictory. The observed irregularity of real clusters seems to imply recent formation epochs which would require a universe with approximately the critical density. On the other hand, the large baryon fraction observ...
Indian Academy of Sciences (India)
Applications of Clustering. Biology – medical imaging, bioinformatics, ecology, phylogeny problems etc. Market research. Data Mining. Social Networks. Any problem measuring similarity/correlation. (dimensions represent different parameters)
DEFF Research Database (Denmark)
Bauckhage, C.; Drachen, Anders; Sifa, Rafet
2015-01-01
of the causes, the proliferation of behavioral data poses the problem of how to derive insights therefrom. Behavioral data sets can be large, time-dependent and high-dimensional. Clustering offers a way to explore such data and to discover patterns that can reduce the overall complexity of the data. Clustering...... and other techniques for player profiling and play style analysis have, therefore, become popular in the nascent field of game analytics. However, the proper use of clustering techniques requires expertise and an understanding of games is essential to evaluate results. With this paper, we address game data...... scientists and present a review and tutorial focusing on the application of clustering techniques to mine behavioral game data. Several algorithms are reviewed and examples of their application shown. Key topics such as feature normalization are discussed and open problems in the context of game analytics...
DEFF Research Database (Denmark)
Johannes, Ludger; Pezeshkian, Weria; Ipsen, John H
2018-01-01
Clustering of extracellular ligands and proteins on the plasma membrane is required to perform specific cellular functions, such as signaling and endocytosis. Attractive forces that originate in perturbations of the membrane's physical properties contribute to this clustering, in addition to direct...... protein-protein interactions. However, these membrane-mediated forces have not all been equally considered, despite their importance. In this review, we describe how line tension, lipid depletion, and membrane curvature contribute to membrane-mediated clustering. Additional attractive forces that arise...... from protein-induced perturbation of a membrane's fluctuations are also described. This review aims to provide a survey of the current understanding of membrane-mediated clustering and how this supports precise biological functions....
2015-06-01
Air void clustering around coarse aggregate in concrete has been identified as a potential source of low strengths in concrete mixes by several Departments of Transportation around the country. Research was carried out to (1) develop a quantitati...
Hofmann, B
2008-06-01
Are there similarities between scientific and moral inference? This is the key question in this article. It takes as its point of departure an instance of one person's story in the media changing both Norwegian public opinion and a brand-new Norwegian law prohibiting the use of saviour siblings. The case appears to falsify existing norms and to establish new ones. The analysis of this case reveals similarities in the modes of inference in science and morals, inasmuch as (a) a single case functions as a counter-example to an existing rule; (b) there is a common presupposition of stability, similarity and order, which makes it possible to reason from a few cases to a general rule; and (c) this makes it possible to hold things together and retain order. In science, these modes of inference are referred to as falsification, induction and consistency. In morals, they have a variety of other names. Hence, even without abandoning the fact-value divide, there appear to be similarities between inference in science and inference in morals, which may encourage communication across the boundaries between "the two cultures" and which are relevant to medical humanities.
Merging history of three bimodal clusters
Maurogordato, S.; Sauvageot, J. L.; Bourdin, H.; Cappi, A.; Benoist, C.; Ferrari, C.; Mars, G.; Houairi, K.
2011-01-01
We present a combined X-ray and optical analysis of three bimodal galaxy clusters selected as merging candidates at z ~ 0.1. These targets are part of MUSIC (MUlti-Wavelength Sample of Interacting Clusters), which is a general project designed to study the physics of merging clusters by means of multi-wavelength observations. Observations include spectro-imaging with XMM-Newton EPIC camera, multi-object spectroscopy (260 new redshifts), and wide-field imaging at the ESO 3.6 m and 2.2 m telescopes. We build a global picture of these clusters using X-ray luminosity and temperature maps together with galaxy density and velocity distributions. Idealized numerical simulations were used to constrain the merging scenario for each system. We show that A2933 is very likely an equal-mass advanced pre-merger ~200 Myr before the core collapse, while A2440 and A2384 are post-merger systems (~450 Myr and ~1.5 Gyr after core collapse, respectively). In the case of A2384, we detect a spectacular filament of galaxies and gas spreading over more than 1 h^-1 Mpc, which we infer to have been stripped during the previous collision. The analysis of the MUSIC sample allows us to outline some general properties of merging clusters: a strong luminosity segregation of galaxies in recent post-mergers; the existence of preferential axes, corresponding to the merging directions, along which the BCGs and structures on various scales are aligned; the concomitance, in most major merger cases, of secondary merging or accretion events, with groups infalling onto the main cluster, and in some cases the evidence of previous merging episodes in one of the main components. These results are in good agreement with the hierarchical scenario of structure formation, in which clusters are expected to form by successive merging events, and matter is accreted along large-scale filaments. Based on data obtained with the European Southern Observatory, Chile (programs 072.A-0595, 075.A-0264, and 079.A-0425).
Speaker segmentation and clustering
Kotti, M; Moschou, V; Kotropoulos, C
2008-01-01
This survey focuses on two challenging speech processing topics, namely: speaker segmentation and speaker clustering. Speaker segmentation aims at finding speaker change points in an audio stream, whereas speaker clustering aims at grouping speech segments based on speaker characteristics. Model-based, metric-based, and hybrid speaker segmentation algorithms are reviewed. Concerning speaker...
International Nuclear Information System (INIS)
Chandrasekharan, Shailesh
2000-01-01
Cluster algorithms have been recently used to eliminate sign problems that plague Monte-Carlo methods in a variety of systems. In particular such algorithms can also be used to solve sign problems associated with the permutation of fermion world lines. This solution leads to the possibility of designing fermion cluster algorithms in certain cases. Using the example of free non-relativistic fermions we discuss the ideas underlying the algorithm
Milan Davidovic
2013-01-01
E-clusters are strategic alliances in the TIMES technology sector (Telecommunication, Information technology, Multimedia, Entertainment, Security), where products and processes are digitalized. They enable horizontal and vertical integration of small and medium companies and establish new added-value e-chains. E-clusters also build supply chains based on cooperative relationships, innovation, organizational knowledge and compliance of intellectual properties. As an innovative approach for economic p...
International Nuclear Information System (INIS)
Schiffer, J.P.
1975-01-01
An attempt is made to present some data which may be construed as indicating that perhaps clusters play a role in high energy and exotic pion or kaon interactions with complex (A much greater than 16) nuclei. Also an attempt is made to summarize some very recent experimental work on pion interactions with nuclei which may or may not in the end support a picture in which clusters play an important role. (U.S.)
Shah, Sohil Atul; Koltun, Vladlen
2017-09-12
Clustering is a fundamental procedure in the analysis of scientific data. It is used ubiquitously across the sciences. Despite decades of research, existing clustering algorithms have limited effectiveness in high dimensions and often require tuning parameters for different domains and datasets. We present a clustering algorithm that achieves high accuracy across multiple domains and scales efficiently to high dimensions and large datasets. The presented algorithm optimizes a smooth continuous objective, which is based on robust statistics and allows heavily mixed clusters to be untangled. The continuous nature of the objective also allows clustering to be integrated as a module in end-to-end feature learning pipelines. We demonstrate this by extending the algorithm to perform joint clustering and dimensionality reduction by efficiently optimizing a continuous global objective. The presented approach is evaluated on large datasets of faces, hand-written digits, objects, newswire articles, sensor readings from the Space Shuttle, and protein expression levels. Our method achieves high accuracy across all datasets, outperforming the best prior algorithm by a factor of 3 in average rank.
Mansour, Ahmad M; Hamade, Haya; Ghaddar, Ayman; Mokadem, Ahmad Samih; El Hajj Ali, Mohamad; Awwad, Shady
2012-01-01
To present the visual outcomes and ocular sequelae of victims of cluster bombs. This retrospective, multicenter case series of ocular injury due to cluster bombs was conducted for 3 years after the war in South Lebanon (July 2006). Data were gathered from the reports to the Information Management System for Mine Action. There were 308 victims of cluster bombs: 36 individuals were killed, of whom 2 received ocular lacerations, and 272 individuals were injured, with 18 receiving ocular injury. These 18 surviving individuals were assessed by the authors. Ocular injury occurred in 6.5% (20/308) of cluster bomb victims. Trauma to multiple organs occurred in 12 of 18 cases (67%) with ocular injury. Ocular findings included corneal or scleral lacerations (16 eyes), corneal foreign bodies (9 eyes), corneal decompensation (2 eyes), ruptured cataract (6 eyes), and intravitreal foreign bodies (10 eyes). The corneas of one patient had extreme attenuation of the endothelium. Ocular injury occurred in 6.5% of cluster bomb victims and 67% of the patients with ocular injury sustained trauma to multiple organs. Visual morbidity in civilians is an additional reason for a global ban on the use of cluster bombs.
Is age really the second parameter in globular clusters?
International Nuclear Information System (INIS)
Vandenberg, D.A.; Durrell, P.R.
1990-01-01
From the close similarity of the magnitude difference between the tip of the red giant branch and the turnoff in the Fe/H = about -1.3 globular clusters NGC 288, NGC 362, and M5, it is inferred that the ages of these three systems (and Palomar 5, whose horizontal branch is used to define its distance relative to the others) are not detectably different. An identical conclusion, by similar means, is reached for the Fe/H = about -2.1 globular clusters M15, M30, M68, and M92. Several recent claims that age is responsible for the wide variation in horizontal-branch morphology among clusters of the same metal abundance are not supported. 73 refs
Directory of Open Access Journals (Sweden)
Jinjun Tang
Travel time is an important measurement used to evaluate the extent of congestion within road networks. This paper presents a new method to estimate the travel time based on an evolving fuzzy neural inference system. The input variables in the system are traffic flow data (volume, occupancy, and speed) collected from loop detectors located at points both upstream and downstream of a given link, and the output variable is the link travel time. A first order Takagi-Sugeno fuzzy rule set is used to complete the inference. For training the evolving fuzzy neural network (EFNN), two learning processes are proposed: (1) a K-means method is employed to partition input samples into different clusters, and a Gaussian fuzzy membership function is designed for each cluster to measure the membership degree of samples to the cluster centers. As the number of input samples increases, the cluster centers are modified and membership functions are also updated; (2) a weighted recursive least squares estimator is used to optimize the parameters of the linear functions in the Takagi-Sugeno type fuzzy rules. Testing datasets consisting of actual and simulated data are used to test the proposed method. Three common criteria, mean absolute error (MAE), root mean square error (RMSE), and mean absolute relative error (MARE), are utilized to evaluate the estimation performance. Estimation results demonstrate the accuracy and effectiveness of the EFNN method through comparison with existing methods including multiple linear regression (MLR), instantaneous model (IM), linear model (LM), neural network (NN), and cumulative plots (CP).
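The building blocks named in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the definitions below are the common textbook forms and should be read as assumptions: MARE is taken as the mean of |actual - predicted| / |actual|, and the Gaussian membership uses a hypothetical width parameter `sigma` per cluster.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mare(actual, predicted):
    """Mean absolute relative error (assumed form: error relative to the actual value)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def gaussian_membership(x, center, sigma):
    """Degree of membership of sample x in a cluster with the given center.

    sigma is a hypothetical width parameter; the paper's choice is not stated here.
    """
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

# Illustrative travel times in seconds (made-up numbers, not data from the paper).
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mae(actual, predicted), rmse(actual, predicted), mare(actual, predicted))
```

A sample lying exactly on a cluster center gets membership 1, and the membership decays smoothly with distance, which is what lets a sample belong partially to several K-means clusters.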
Determination of atomic cluster structure with cluster fusion algorithm
DEFF Research Database (Denmark)
Obolensky, Oleg I.; Solov'yov, Ilia; Solov'yov, Andrey V.
2005-01-01
We report an efficient scheme of global optimization, called cluster fusion algorithm, which has proved its reliability and high efficiency in determination of the structure of various atomic clusters.
Fitting Latent Cluster Models for Networks with latentnet
Directory of Open Access Journals (Sweden)
Pavel N. Krivitsky
2007-12-01
latentnet is a package to fit and evaluate statistical latent position and cluster models for networks. Hoff, Raftery, and Handcock (2002) suggested an approach to modeling networks based on positing the existence of a latent space of characteristics of the actors. Relationships form as a function of distances between these characteristics as well as functions of observed dyadic level covariates. In latentnet social distances are represented in a Euclidean space. It also includes a variant of the extension of the latent position model to allow for clustering of the positions developed in Handcock, Raftery, and Tantrum (2007). The package implements Bayesian inference for the models based on a Markov chain Monte Carlo algorithm. It can also compute maximum likelihood estimates for the latent position model and a two-stage maximum likelihood method for the latent position cluster model. For latent position cluster models, the package provides a Bayesian way of assessing how many groups there are, and thus whether or not there is any clustering (since if the preferred number of groups is 1, there is little evidence for clustering). It also estimates which cluster each actor belongs to. These estimates are probabilistic, and provide the probability of each actor belonging to each cluster. It computes four types of point estimates for the coefficients and positions: maximum likelihood estimate, posterior mean, posterior mode and the estimator which minimizes Kullback-Leibler divergence from the posterior. You can assess the goodness-of-fit of the model via posterior predictive checks. It has a function to simulate networks from a latent position or latent position cluster model.
INDIVIDUAL AND GROUP GALAXIES IN CNOC1 CLUSTERS
International Nuclear Information System (INIS)
Li, I. H.; Yee, H. K. C.; Ellingson, E.
2009-01-01
Using wide-field BVRcI imaging for a sample of 16 intermediate redshift (0.17 red ) to infer the evolutionary status of galaxies in clusters, using both individual galaxies and galaxies in groups. We apply the local galaxy density, Σ_5, derived using the fifth nearest neighbor distance, as a measure of local environment, and the cluster-centric radius, r_CL, as a proxy for global cluster environment. Our cluster sample exhibits a Butcher-Oemler effect in both luminosity-selected and stellar-mass-selected samples. We find that f_red depends strongly on Σ_5 and r_CL, and the Butcher-Oemler effect is observed in all Σ_5 and r_CL bins. However, when the cluster galaxies are separated into r_CL bins, or into group and nongroup subsamples, the dependence on local galaxy density becomes much weaker. This suggests that the properties of the dark matter halo in which the galaxy resides have a dominant effect on its galaxy population and evolutionary history. We find that our data are consistent with the scenario that cluster galaxies situated in successively richer groups (i.e., more massive dark matter halos) reach a high f_red value at earlier redshifts. Associated with this, we observe a clear signature of 'preprocessing', in which cluster galaxies belonging to moderately massive infalling galaxy groups show a much stronger evolution in f_red than those classified as nongroup galaxies, especially at the outskirts of the cluster. This result suggests that galaxies in groups infalling into clusters are significant contributors to the Butcher-Oemler effect.
Globular Clusters: Absolute Proper Motions and Galactic Orbits
Chemel, A. A.; Glushkova, E. V.; Dambis, A. K.; Rastorguev, A. S.; Yalyalieva, L. N.; Klinichev, A. D.
2018-04-01
We cross-match objects from several different astronomical catalogs to determine the absolute proper motions of stars within the 30-arcmin radius fields of 115 Milky-Way globular clusters with the accuracy of 1-2 mas yr^-1. The proper motions are based on positional data recovered from the USNO-B1, 2MASS, URAT1, ALLWISE, UCAC5, and Gaia DR1 surveys with up to ten positions spanning an epoch difference of up to about 65 years, and reduced to Gaia DR1 TGAS frame using UCAC5 as the reference catalog. Cluster members are photometrically identified by selecting horizontal- and red-giant branch stars on color-magnitude diagrams, and the mean absolute proper motions of the clusters with a typical formal error of about 0.4 mas yr^-1 are computed by averaging the proper motions of selected members. The inferred absolute proper motions of clusters are combined with available radial-velocity data and heliocentric distance estimates to compute the cluster orbits in terms of the Galactic potential models based on Miyamoto and Nagai disk, Hernquist spheroid, and modified isothermal dark-matter halo (axisymmetric model without a bar) and the same model + rotating Ferrers bar (non-axisymmetric). Five distant clusters have higher-than-escape velocities, most likely due to large errors of computed transversal velocities, whereas the computed orbits of all other clusters remain bound to the Galaxy. Unlike previously published results, we find the bar to affect substantially the orbits of most of the clusters, even those at large Galactocentric distances, bringing appreciable chaotization, especially in the portions of the orbits close to the Galactic center, and stretching out the orbits of some of the thick-disk clusters.
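The remark about distant clusters can be made concrete with the standard conversion from proper motion to transverse velocity, v_t [km/s] ≈ 4.74 × μ [mas/yr] × d [kpc]. The Python sketch below is only an illustration: the function names are hypothetical, the numbers are made up, and the error estimate propagates the proper-motion error alone (distance errors are ignored by assumption).

```python
import math

# Conversion constant: 1 mas/yr at 1 kpc corresponds to about 4.74 km/s.
K = 4.74047

def tangential_velocity(pm_ra_cosdec, pm_dec, distance_kpc):
    """Transverse velocity in km/s from proper-motion components in mas/yr."""
    mu = math.hypot(pm_ra_cosdec, pm_dec)  # total proper motion, mas/yr
    return K * mu * distance_kpc

def velocity_error(pm_error, distance_kpc):
    """Velocity uncertainty in km/s from a proper-motion error in mas/yr.

    Assumes the distance is known exactly, which is a simplification.
    """
    return K * pm_error * distance_kpc

# A 0.4 mas/yr formal error matters little nearby but a great deal far away.
for d in (5.0, 50.0):
    print(d, "kpc ->", round(velocity_error(0.4, d), 1), "km/s")
```

Because the velocity error grows linearly with distance, the same 0.4 mas/yr uncertainty becomes roughly 95 km/s at 50 kpc, which is consistent with the abstract's explanation for the apparently unbound distant clusters.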
Colosimo, Giuliano; Knapp, Charles R.; Wallace, Lisa E.; Welch, Mark E.
2014-01-01
Ecological data, the primary source of information on patterns and rates of migration, can be integrated with genetic data to more accurately describe the realized connectivity between geographically isolated demes. In this paper we implement this approach and discuss its implications for managing populations of the endangered Andros Island Rock Iguana, Cyclura cychlura cychlura. This iguana is endemic to Andros, a highly fragmented landmass of large islands and smaller cays. Field observations suggest that geographically isolated demes were panmictic due to high, inferred rates of gene flow. We expand on these observations using 16 polymorphic microsatellites to investigate the genetic structure and rates of gene flow from 188 Andros Iguanas collected across 23 island sites. Bayesian clustering of specimens assigned individuals to three distinct genotypic clusters. An analysis of molecular variance (AMOVA) indicates that allele frequency differences are responsible for a significant portion of the genetic variance across the three defined clusters (Fst = 0.117, p<0.01). These clusters are associated with larger islands and satellite cays isolated by broad water channels with strong currents. These findings imply that broad water channels present greater obstacles to gene flow than was inferred from field observation alone. Additionally, rates of gene flow were indirectly estimated using BAYESASS 3.0. The proportion of individuals originating from within each identified cluster varied from 94.5 to 98.7%, providing further support for local isolation. Our assessment reveals a major disparity between inferred and realized gene flow. We discuss our results in a conservation perspective for species inhabiting highly fragmented landscapes. PMID:25229344
Nonparametric inference of network structure and dynamics
Peixoto, Tiago P.
The network structure of complex systems determines their function and serves as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among
Impact of noise on molecular network inference.
Directory of Open Access Journals (Sweden)
Radhakrishnan Nagarajan
Molecular entities work in concert as a system and mediate phenotypic outcomes and disease states. There has been recent interest in modelling the associations between molecular entities from their observed expression profiles as networks using a battery of algorithms. These networks have proven to be useful abstractions of the underlying pathways and signalling mechanisms. Noise is ubiquitous in molecular data and can have a pronounced effect on the inferred network. Noise can be an outcome of several factors including: inherent stochastic mechanisms at the molecular level, variation in the abundance of molecules, heterogeneity, sensitivity of the biological assay or measurement artefacts prevalent especially in high-throughput settings. The present study investigates the impact of discrepancies in noise variance on pair-wise dependencies, conditional dependencies and constraint-based Bayesian network structure learning algorithms that incorporate conditional independence tests as a part of the learning process. Popular network motifs and fundamental connections, namely (a) common-effect, (b) three-chain, and (c) coherent type-I feed-forward loop (FFL), are investigated. The choice of these elementary networks can be attributed to their prevalence across more complex networks. Analytical expressions elucidating the impact of discrepancies in noise variance on pairwise dependencies and conditional dependencies for special cases of these motifs are presented. Subsequently, the impact of noise on two popular constraint-based Bayesian network structure learning algorithms, Grow-Shrink (GS) and Incremental Association Markov Blanket (IAMB), that implicitly incorporate tests for conditional independence is investigated. Finally, the impact of noise on networks inferred from publicly available single cell molecular expression profiles is investigated. While discrepancies in noise variance are overlooked in routine molecular network inference, the
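The effect of noise variance on a pairwise dependency can be illustrated with a small Python sketch. It is not the paper's analysis: it uses the standard attenuation result for independent additive noise, corr(X, Y + e) = corr(X, Y) × sqrt(Var(Y) / (Var(Y) + Var(e))), and a toy linear-Gaussian link X → Y whose coefficients are assumptions chosen for illustration.

```python
import math
import random

def attenuated_correlation(rho, signal_var, noise_var):
    """Correlation left after independent noise is added to one of the two variables."""
    return rho * math.sqrt(signal_var / (signal_var + noise_var))

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_pair(n, noise_sd, seed=0):
    """Simulate X -> Y (toy model) and return (clean, noisy) correlations."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [x + rng.gauss(0.0, 1.0) for x in xs]          # Var(Y) = 2, corr(X, Y) = 1/sqrt(2)
    y_obs = [y + rng.gauss(0.0, noise_sd) for y in ys]  # added measurement noise
    return pearson(xs, ys), pearson(xs, y_obs)

clean, noisy = simulate_pair(20000, math.sqrt(2.0))
print(round(clean, 3), round(noisy, 3))
print(attenuated_correlation(1.0 / math.sqrt(2.0), 2.0, 2.0))  # analytical value for the noisy case
```

Since conditional independence tests in constraint-based learners are built from such correlations, unequal noise variances across variables can weaken some edges more than others and change the inferred structure.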
Bayesian Estimation and Inference using Stochastic Hardware
Directory of Open Access Journals (Sweden)
Chetan Singh Thakur
2016-03-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using a single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
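The Bayesian recursive equation that such a tracker solves can be sketched in a few lines of Python. This is a conventional floating-point version for a discrete one-dimensional HMM, not the stochastic bit-stream hardware the paper describes; the function names and the example probabilities are assumptions made for illustration.

```python
def normalize(p):
    """Scale a list of non-negative weights so that it sums to 1."""
    total = sum(p)
    return [v / total for v in p]

def bayes_filter_step(belief, transition, likelihood):
    """One predict-update step of the recursive Bayesian filter.

    belief[i]        = P(x_{t-1} = i | z_1..z_{t-1})
    transition[i][j] = P(x_t = j | x_{t-1} = i)
    likelihood[j]    = P(z_t | x_t = j) for the observation just received
    """
    n = len(belief)
    # Prediction: push the previous belief through the transition model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Update: weight by the observation likelihood and renormalize.
    return normalize([likelihood[j] * predicted[j] for j in range(n)])

# Three candidate positions; the target mostly stays put (made-up numbers).
transition = [
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]
belief = [1 / 3, 1 / 3, 1 / 3]
likelihood = [0.1, 0.8, 0.1]  # a noisy sensor reading that favors the middle position
belief = bayes_filter_step(belief, transition, likelihood)
print(belief)
```

Repeating the step for each new observation gives the online estimate of the target position; the products and sums in the loop are the operations a stochastic implementation would map onto logic gates acting on bit streams.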
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
Cluster dynamics at different cluster size and incident laser wavelengths
International Nuclear Information System (INIS)
Desai, Tara; Bernardinello, Andrea
2002-01-01
X-ray emission spectra from aluminum clusters of diameter ∼0.4 μm and gold clusters of diameter ∼1.25 μm are experimentally studied by irradiating the cluster foil targets with a 1.06 μm laser, 10 ns (FWHM), at an intensity of ∼10¹² W/cm². Aluminum clusters show different spectra compared to bulk material, whereas gold clusters evolve towards bulk gold. Experimental data are analyzed on the basis of cluster dimension, laser wavelength and pulse duration. PIC simulations are performed to study the behavior of clusters at higher intensity, I ≥ 10¹⁷ W/cm², for different cluster sizes irradiated at different laser wavelengths. Results indicate the dependence of cluster dynamics on cluster size and incident laser wavelength.
Cluster fusion algorithm: application to Lennard-Jones clusters
DEFF Research Database (Denmark)
Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter
2006-01-01
We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...
Cluster fusion algorithm: application to Lennard-Jones clusters
DEFF Research Database (Denmark)
Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter
2008-01-01
We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...
Weighted community detection and data clustering using message passing
Shi, Cheng; Liu, Yanchen; Zhang, Pan
2018-03-01
Grouping objects into clusters based on the similarities or weights between them is one of the most important problems in science and engineering. In this work, by extending message-passing algorithms and spectral algorithms proposed for an unweighted community detection problem, we develop a non-parametric method based on statistical physics, by mapping the problem to the Potts model at the critical temperature of spin-glass transition and applying belief propagation to solve the marginals corresponding to the Boltzmann distribution. Our algorithm is robust to over-fitting and gives a principled way to determine whether there are significant clusters in the data and how many clusters there are. We apply our method to different clustering tasks. In the community detection problem in weighted and directed networks, we show that our algorithm significantly outperforms existing algorithms. In the clustering problem, where the data were generated by mixture models in the sparse regime, we show that our method works all the way down to the theoretical limit of detectability and gives accuracy very close to that of the optimal Bayesian inference. In the semi-supervised clustering problem, our method only needs several labels to work perfectly in classic datasets. Finally, we further develop Thouless-Anderson-Palmer equations which heavily reduce the computation complexity in dense networks but give almost the same performance as belief propagation.
Metallothionein Zn(2+)- and Cu(2+)-clusters from first-principles calculations
DEFF Research Database (Denmark)
Greisen, Per Junior; Jespersen, Jakob Berg; Kepp, Kasper Planeta
2012-01-01
Detailed electronic structures of Zn(II) and Cu(II) clusters from metallothioneins (MT) have been obtained using density functional theory (DFT), in order to investigate how oxidative stress-caused Cu(II) intermediates affect Zn-binding to MT and cooperatively lead to Cu(I)MT. The inferred accura...
Pillow, Bradford H; Pearson, Raeanne M; Hecht, Mary; Bremer, Amanda
2010-01-01
Children and adults rated their own certainty following inductive inferences, deductive inferences, and guesses. Beginning in kindergarten, participants rated deductions as more certain than weak inductions or guesses. Deductions were rated as more certain than strong inductions beginning in Grade 3, and fourth-grade children and adults differentiated strong inductions, weak inductions, and informed guesses from pure guesses. By Grade 3, participants also gave different types of explanations for their deductions and inductions. These results are discussed in relation to children's concepts of cognitive processes, logical reasoning, and epistemological development.
Approximate Inference and Deep Generative Models
CERN. Geneva
2018-01-01
Advances in deep generative models are at the forefront of deep learning research because of the promise they offer for allowing data-efficient learning, and for model-based reinforcement learning. In this talk I'll review a few standard methods for approximate inference and introduce modern approximations which allow for efficient large-scale training of a wide variety of generative models. Finally, I'll demonstrate several important applications of these models to density estimation, missing data imputation, data compression and planning.
Abductive Inference using Array-Based Logic
DEFF Research Database (Denmark)
Frisvad, Jeppe Revall; Falster, Peter; Møller, Gert L.
The notion of abduction has found its usage within a wide variety of AI fields. Computing abductive solutions has, however, been shown to be highly intractable in logic programming. To avoid this intractability we present a new approach to logic-based abduction; through the geometrical view of data employed in array-based logic we embrace abduction in a simple structural operation. We argue that a theory of abduction on this form allows for an implementation which, at runtime, can perform abductive inference quite efficiently on arbitrary rules of logic representing knowledge of finite domains.
DEFF Research Database (Denmark)
Andersen, Jesper; Lawall, Julia Laetitia
2008-01-01
A key issue in maintaining Linux device drivers is the need to update drivers in response to evolutions in Linux internal libraries. Currently, there is little tool support for performing and documenting such changes. In this paper we present a tool, spfind, that identifies common changes made...... developers can use it to extract an abstract representation of the set of changes that others have made. Our experiments on recent changes in Linux show that the inferred generic patches are more concise than the corresponding patches found in commits to the Linux source tree while being safe with respect...
Inverse Ising Inference Using All the Data
Aurell, Erik; Ekeberg, Magnus
2012-03-01
We show that a method based on logistic regression, using all the data, solves the inverse Ising problem far better than mean-field calculations relying only on sample pairwise correlation functions, while still computationally feasible for hundreds of nodes. The largest improvement in reconstruction occurs for strong interactions. Using two examples, a diluted Sherrington-Kirkpatrick model and a two-dimensional lattice, we also show that interaction topologies can be recovered from few samples with good accuracy and that the use of l1 regularization is beneficial in this process, pushing inference abilities further into low-temperature regimes.
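The idea of solving the inverse Ising problem by logistic regression can be sketched as follows: for spins s_i = ±1 with zero external field, P(s_i | rest) depends on the local field h_i = Σ_j J_ij s_j, so each row of J can be fitted by a regression of s_i on the other spins. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it uses the exact Boltzmann distribution of a tiny system instead of samples (so the result is deterministic), omits external fields and the l1 penalty, and all function names are hypothetical.

```python
import itertools
import numpy as np

def ising_states(n):
    """All 2**n configurations of n spins taking values -1 or +1."""
    return np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)

def boltzmann_weights(states, J):
    """P(s) proportional to exp(sum_{i<j} J_ij s_i s_j), J symmetric, zero diagonal."""
    log_w = 0.5 * np.einsum("ki,ij,kj->k", states, J, states)
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

def fit_couplings(states, weights, steps=5000, lr=0.1):
    """Gradient ascent on the weighted conditional log-likelihood of each spin."""
    n = states.shape[1]
    J = np.zeros((n, n))
    for _ in range(steps):
        field = states @ J.T                   # field[k, i] = sum_j J_ij s_kj
        resid = states - np.tanh(field)        # s_i - E[s_i | other spins]
        grad = (weights[:, None] * resid).T @ states
        np.fill_diagonal(grad, 0.0)            # no self-coupling
        J += lr * grad
    return 0.5 * (J + J.T)                     # symmetrise the row-wise fits
```

With sampled data, `weights` would simply be uniform over the observed configurations, and the recovered couplings would then carry sampling noise.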
GibbsCluster: unsupervised clustering and alignment of peptide sequences
DEFF Research Database (Denmark)
Andreatta, Massimo; Alvarez, Bruno; Nielsen, Morten
2017-01-01
motif characterizing each cluster. Several parameters are available to customize cluster analysis, including adjustable penalties for small clusters and overlapping groups and a trash cluster to remove outliers. As an example application, we used the server to deconvolute multiple specificities in large-scale peptidome data generated by mass spectrometry. The server is available at http://www.cbs.dtu.dk/services/GibbsCluster-2.0.
Fan, Wentao; Bouguila, Nizar
2013-11-01
A large class of problems can be formulated in terms of the clustering process. Mixture models are an increasingly important tool in statistical pattern recognition and for analyzing and clustering complex data. Two challenging aspects that should be addressed when considering mixture models are how to choose between a set of plausible models and how to estimate the model's parameters. In this paper, we address both problems simultaneously within a unified online nonparametric Bayesian framework that we develop to learn a Dirichlet process mixture of Beta-Liouville distributions (i.e., an infinite Beta-Liouville mixture model). The proposed infinite model is used for the online modeling and clustering of proportional data for which the Beta-Liouville mixture has been shown to be effective. We propose a principled approach for approximating the intractable model's posterior distribution by a tractable one-which we develop-such that all the involved mixture's parameters can be estimated simultaneously and effectively in a closed form. This is done through variational inference that enjoys important advantages, such as handling of unobserved attributes and preventing under or overfitting; we explain that in detail. The effectiveness of the proposed work is evaluated on three challenging real applications, namely facial expression recognition, behavior modeling and recognition, and dynamic textures clustering.
Cluster Implantation and Deposition Apparatus
DEFF Research Database (Denmark)
Hanif, Muhammad; Popok, Vladimir
2015-01-01
In the current report, a design and capabilities of a cluster implantation and deposition apparatus (CIDA) involving two different cluster sources are described. The clusters produced from gas precursors (Ar, N etc.) by PuCluS-2 can be used to study cluster ion implantation in order to develop...
Timmerman, Marieke E.; Ceulemans, Eva; De Roover, Kim; Van Leeuwen, Karla
2013-01-01
To achieve an insightful clustering of multivariate data, we propose subspace K-means. Its central idea is to model the centroids and cluster residuals in reduced spaces, which allows for dealing with a wide range of cluster types and yields rich interpretations of the clusters. We review the
Projected coupled cluster theory.
Qiu, Yiheng; Henderson, Thomas M; Zhao, Jinmo; Scuseria, Gustavo E
2017-08-14
Coupled cluster theory is the method of choice for weakly correlated systems. But in the strongly correlated regime, it faces a symmetry dilemma, where it either completely fails to describe the system or has to artificially break certain symmetries. On the other hand, projected Hartree-Fock theory captures the essential physics of many kinds of strong correlations via symmetry breaking and restoration. In this work, we combine and try to retain the merits of these two methods by applying symmetry projection to broken symmetry coupled cluster wave functions. The non-orthogonal nature of states resulting from the application of symmetry projection operators furnishes particle-hole excitations to all orders, thus creating an obstacle for the exact evaluation of overlaps. Here we provide a solution via a disentanglement framework theory that can be approximated rigorously and systematically. Results of projected coupled cluster theory are presented for molecules and the Hubbard model, showing that spin projection significantly improves unrestricted coupled cluster theory while restoring good quantum numbers. The energy of projected coupled cluster theory reduces to the unprojected one in the thermodynamic limit, albeit at a much slower rate than projected Hartree-Fock.
Globular Clusters - Guides to Galaxies
Richtler, Tom; Joint ESO-FONDAP Workshop on Globular Clusters
2009-01-01
The principal question of whether and how globular clusters can contribute to a better understanding of galaxy formation and evolution is perhaps the main driving force behind the overall endeavour of studying globular cluster systems. Naturally, this splits up into many individual problems. The objective of the Joint ESO-FONDAP Workshop on Globular Clusters - Guides to Galaxies was to bring together researchers, both observational and theoretical, to present and discuss the most recent results. Topics covered in these proceedings are: internal dynamics of globular clusters and interaction with host galaxies (tidal tails, evolution of cluster masses), accretion of globular clusters, detailed descriptions of nearby cluster systems, ultracompact dwarfs, formations of massive clusters in mergers and elsewhere, the ACS Virgo survey, galaxy formation and globular clusters, dynamics and kinematics of globular cluster systems and dark matter-related problems. With its wide coverage of the topic, this book constitute...
Quantum Enhanced Inference in Markov Logic Networks.
Wittek, Peter; Gogolin, Christian
2017-04-19
Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.
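For orientation, the classical baseline that such quantum protocols aim to accelerate, Gibbs sampling in a Markov network, can be sketched in a few lines. This is a generic illustration on a small pairwise binary network, not an MLN engine and not the method of the paper; the parameterisation (log-potential b·x + Σ_{i<j} W_ij x_i x_j with x_i ∈ {0, 1}) and the function names are assumptions.

```python
import itertools
import numpy as np

def exact_marginals(bias, coupling):
    """Exact P(x_i = 1) by enumerating all binary states (feasible for small n only)."""
    bias, coupling = np.asarray(bias, float), np.asarray(coupling, float)
    states = np.array(list(itertools.product([0, 1], repeat=len(bias))), dtype=float)
    log_p = states @ bias + 0.5 * np.einsum("ki,ij,kj->k", states, coupling, states)
    p = np.exp(log_p - log_p.max())
    p /= p.sum()
    return p @ states

def gibbs_marginals(bias, coupling, n_sweeps=20000, burn_in=1000, seed=0):
    """Estimate P(x_i = 1) by resampling each variable from its conditional."""
    bias, coupling = np.asarray(bias, float), np.asarray(coupling, float)
    rng = np.random.default_rng(seed)
    n = len(bias)
    x = rng.integers(0, 2, size=n).astype(float)
    counts = np.zeros(n)
    for sweep in range(burn_in + n_sweeps):
        for i in range(n):
            # coupling has a zero diagonal, so x[i] itself does not contribute
            activation = bias[i] + coupling[i] @ x
            x[i] = float(rng.random() < 1.0 / (1.0 + np.exp(-activation)))
        if sweep >= burn_in:
            counts += x
    return counts / n_sweeps
```

Comparing `gibbs_marginals` with `exact_marginals` on a small network gives a quick check that the sampler has mixed; for larger networks only the sampled estimate is available, which is where faster sampling schemes become interesting.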
Inferring network topology from complex dynamics
International Nuclear Information System (INIS)
Shandilya, Srinivas Gorur; Timme, Marc
2011-01-01
Inferring the network topology from dynamical observations is a fundamental problem pervading research on complex systems. Here, we present a simple, direct method for inferring the structural connection topology of a network, given an observation of one collective dynamical trajectory. The general theoretical framework is applicable to arbitrary network dynamical systems described by ordinary differential equations. No interference (external driving) is required and the type of dynamics is hardly restricted in any way. In particular, the observed dynamics may be arbitrarily complex; stationary, invariant or transient; synchronous or asynchronous and chaotic or periodic. Presupposing a knowledge of the functional form of the dynamical units and of the coupling functions between them, we present an analytical solution to the inverse problem of finding the network topology from observing a time series of state variables only. Robust reconstruction is achieved in any sufficiently long generic observation of the system. We extend our method to simultaneously reconstructing both the entire network topology and all parameters appearing linear in the system's equations of motion. Reconstruction of network topology and system parameters is viable even in the presence of external noise that distorts the original dynamics substantially. The method provides a conceptually new step towards reconstructing a variety of real-world networks, including gene and protein interaction networks and neuronal circuits.
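When the functional form of the units and couplings is known, the reconstruction described above reduces to a linear problem in the unknown coupling strengths. The following sketch illustrates that reduction for an assumed toy model, dx_i/dt = -x_i + Σ_j J_ij tanh(x_j); the model, the Euler integrator and the function names are illustrative choices, not taken from the paper. Because the data are generated with the same Euler step used to form the finite differences, the recovery here is exact up to rounding, which would not hold for measured data.

```python
import numpy as np

def simulate(J, x0, dt=0.01, n_steps=500):
    """Euler-integrate dx/dt = -x + J @ tanh(x); returns an (n_steps + 1, n) array."""
    traj = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        x = traj[-1]
        traj.append(x + dt * (-x + J @ np.tanh(x)))
    return np.array(traj)

def infer_coupling(traj, dt):
    """Least-squares estimate of J from a single observed trajectory."""
    x = traj[:-1]
    dxdt = (traj[1:] - traj[:-1]) / dt       # forward-difference derivative
    target = dxdt + x                         # equals tanh(x) @ J.T under the model
    features = np.tanh(x)
    J_T, *_ = np.linalg.lstsq(features, target, rcond=None)
    return J_T.T
```

The network topology is then read off as the pattern of entries of the estimated matrix that differ from zero beyond a chosen tolerance.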
Inferring climate sensitivity from volcanic events
Energy Technology Data Exchange (ETDEWEB)
Boer, G.J. [Environment Canada, University of Victoria, Canadian Centre for Climate Modelling and Analysis, Victoria, BC (Canada); Stowasser, M.; Hamilton, K. [University of Hawaii, International Pacific Research Centre, Honolulu, HI (United States)
2007-04-15
The possibility of estimating the equilibrium climate sensitivity of the earth-system from observations following explosive volcanic eruptions is assessed in the context of a perfect model study. Two modern climate models (the CCCma CGCM3 and the NCAR CCSM2) with different equilibrium climate sensitivities are employed in the investigation. The models are perturbed with the same transient volcano-like forcing and the responses analysed to infer climate sensitivities. For volcano-like forcing the global mean surface temperature responses of the two models are very similar, despite their differing equilibrium climate sensitivities, indicating that climate sensitivity cannot be inferred from the temperature record alone even if the forcing is known. Equilibrium climate sensitivities can be reasonably determined only if both the forcing and the change in heat storage in the system are known very accurately. The geographic patterns of clear-sky atmosphere/surface and cloud feedbacks are similar for both the transient volcano-like and near-equilibrium constant forcing simulations showing that, to a considerable extent, the same feedback processes are invoked, and determine the climate sensitivity, in both cases. (orig.)
Facility Activity Inference Using Radiation Networks
Energy Technology Data Exchange (ETDEWEB)
Rao, Nageswara S. [ORNL; Ramirez Aviles, Camila A. [ORNL
2017-11-01
We consider the problem of inferring the operational status of a reactor facility using measurements from a radiation sensor network deployed around the facility’s ventilation off-gas stack. The intensity of stack emissions decays with distance, and the sensor counts or measurements are inherently random with parameters determined by the intensity at the sensor’s location. We utilize the measurements to estimate the intensity at the stack, and use it in a one-sided Sequential Probability Ratio Test (SPRT) to infer on/off status of the reactor. We demonstrate the superior performance of this method over conventional majority fusers and individual sensors using (i) test measurements from a network of 21 NaI detectors, and (ii) effluence measurements collected at the stack of a reactor facility. We also analytically establish the superior detection performance of the network over individual sensors with fixed and adaptive thresholds by utilizing the Poisson distribution of the counts. We quantify the performance improvements of the network detection over individual sensors using the packing number of the intensity space.
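A sequential probability ratio test on Poisson counts can be sketched as below. This is the standard two-threshold Wald SPRT between an assumed "off" rate and an assumed "on" rate, offered only as an illustration; it is not the one-sided variant or the network fusion scheme of the report, and the function name, the rates and the error levels are hypothetical.

```python
import math

def poisson_sprt(counts, rate_off, rate_on, alpha=0.05, beta=0.05):
    """Wald SPRT deciding between Poisson rates rate_off and rate_on.

    alpha: target false-alarm probability, beta: target missed-detection probability.
    Returns (decision, number of counts consumed).
    """
    upper = math.log((1 - beta) / alpha)   # crossing it accepts "on"
    lower = math.log(beta / (1 - alpha))   # crossing it accepts "off"
    llr = 0.0
    for n, k in enumerate(counts, start=1):
        # log-likelihood ratio contribution of one Poisson count k
        llr += k * math.log(rate_on / rate_off) - (rate_on - rate_off)
        if llr >= upper:
            return "on", n
        if llr <= lower:
            return "off", n
    return "undecided", len(counts)
```

The test stops as soon as the accumulated evidence crosses either threshold, so clear-cut measurements lead to a decision after very few counts, while ambiguous ones keep the test running.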
Models for inference in dynamic metacommunity systems
Dorazio, Robert M.; Kery, Marc; Royle, J. Andrew; Plattner, Matthias
2010-01-01
A variety of processes are thought to be involved in the formation and dynamics of species assemblages. For example, various metacommunity theories are based on differences in the relative contributions of dispersal of species among local communities and interactions of species within local communities. Interestingly, metacommunity theories continue to be advanced without much empirical validation. Part of the problem is that statistical models used to analyze typical survey data either fail to specify ecological processes with sufficient complexity or they fail to account for errors in detection of species during sampling. In this paper, we describe a statistical modeling framework for the analysis of metacommunity dynamics that is based on the idea of adopting a unified approach, multispecies occupancy modeling, for computing inferences about individual species, local communities of species, or the entire metacommunity of species. This approach accounts for errors in detection of species during sampling and also allows different metacommunity paradigms to be specified in terms of species- and location-specific probabilities of occurrence, extinction, and colonization: all of which are estimable. In addition, this approach can be used to address inference problems that arise in conservation ecology, such as predicting temporal and spatial changes in biodiversity for use in making conservation decisions. To illustrate, we estimate changes in species composition associated with the species-specific phenologies of flight patterns of butterflies in Switzerland for the purpose of estimating regional differences in biodiversity.
Causal inference, probability theory, and graphical insights.
Baker, Stuart G
2013-11-10
Causal inference from observational studies is a fundamental topic in biostatistics. The causal graph literature typically views probability theory as insufficient to express causal concepts in observational studies. In contrast, the view here is that probability theory is a desirable and sufficient basis for many topics in causal inference for the following two reasons. First, probability theory is generally more flexible than causal graphs: Besides explaining such causal graph topics as M-bias (adjusting for a collider) and bias amplification and attenuation (when adjusting for instrumental variable), probability theory is also the foundation of the paired availability design for historical controls, which does not fit into a causal graph framework. Second, probability theory is the basis for insightful graphical displays including the BK-Plot for understanding Simpson's paradox with a binary confounder, the BK2-Plot for understanding bias amplification and attenuation in the presence of an unobserved binary confounder, and the PAD-Plot for understanding the principal stratification component of the paired availability design. Published 2013. This article is a US Government work and is in the public domain in the USA.
Inferring relevance in a changing world
Directory of Open Access Journals (Sweden)
Robert C Wilson
2012-01-01
Reinforcement learning models of human and animal learning usually concentrate on how we learn the relationship between different stimuli or actions and rewards. However, in real world situations stimuli are ill-defined. On the one hand, our immediate environment is extremely multi-dimensional. On the other hand, in every decision-making scenario only a few aspects of the environment are relevant for obtaining reward, while most are irrelevant. Thus a key question is how do we learn these relevant dimensions, that is, how do we learn what to learn about? We investigated this process of representation learning experimentally, using a task in which one stimulus dimension was relevant for determining reward at each point in time. As in real life situations, in our task the relevant dimension can change without warning, adding ever-present uncertainty engendered by a constantly changing environment. We show that human performance on this task is better described by a suboptimal strategy based on selective attention and serial hypothesis testing rather than a normative strategy based on probabilistic inference. From this, we conjecture that the problem of inferring relevance in general scenarios is too computationally demanding for the brain to solve optimally. As a result the brain utilizes approximations, employing these even in simplified scenarios in which optimal representation learning is tractable, such as the one in our experiment.
Automated adaptive inference of phenomenological dynamical models
Daniels, Bryan
Understanding the dynamics of biochemical systems can seem impossibly complicated at the microscopic level: detailed properties of every molecular species, including those that have not yet been discovered, could be important for producing macroscopic behavior. The profusion of data in this area has raised the hope that microscopic dynamics might be recovered in an automated search over possible models, yet the combinatorial growth of this space has limited these techniques to systems that contain only a few interacting species. We take a different approach inspired by coarse-grained, phenomenological models in physics. Akin to a Taylor series producing Hooke's Law, forgoing microscopic accuracy allows us to constrain the search over dynamical models to a single dimension. This makes it feasible to infer dynamics with very limited data, including cases in which important dynamical variables are unobserved. We name our method Sir Isaac after its ability to infer the dynamical structure of the law of gravitation given simulated planetary motion data. Applying the method to output from a microscopically complicated but macroscopically simple biological signaling model, it is able to adapt the level of detail to the amount of available data. Finally, using nematode behavioral time series data, the method discovers an effective switch between behavioral attractors after the application of a painful stimulus.
Graphical models for inferring single molecule dynamics
Directory of Open Access Journals (Sweden)
Gonzalez Ruben L
2010-10-01
Background: The recent explosion of experimental techniques in single molecule biophysics has generated a variety of novel time series data requiring equally novel computational tools for analysis and inference. This article describes in general terms how graphical modeling may be used to learn from biophysical time series data using the variational Bayesian expectation maximization algorithm (VBEM). The discussion is illustrated by the example of single-molecule fluorescence resonance energy transfer (smFRET) versus time data, where the smFRET time series is modeled as a hidden Markov model (HMM) with Gaussian observables. A detailed description of smFRET is provided as well. Results: The VBEM algorithm returns the model’s evidence and an approximating posterior parameter distribution given the data. The former provides a metric for model selection via maximum evidence (ME), and the latter a description of the model’s parameters learned from the data. ME/VBEM provide several advantages over the more commonly used approach of maximum likelihood (ML) optimized by the expectation maximization (EM) algorithm, the most important being a natural form of model selection and a well-posed (non-divergent) optimization problem. Conclusions: The results demonstrate the utility of graphical modeling for inference of dynamic processes in single molecule biophysics.
Quantum Enhanced Inference in Markov Logic Networks
Wittek, Peter; Gogolin, Christian
2017-04-01
Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.
Causal Inference in the Perception of Verticality.
de Winkel, Ksander N; Katliar, Mikhail; Diers, Daniel; Bülthoff, Heinrich H
2018-04-03
The perceptual upright is thought to be constructed by the central nervous system (CNS) as a vector sum; by combining estimates on the upright provided by the visual system and the body's inertial sensors with prior knowledge that upright is usually above the head. Recent findings furthermore show that the weighting of the respective sensory signals is proportional to their reliability, consistent with a Bayesian interpretation of a vector sum (Forced Fusion, FF). However, violations of FF have also been reported, suggesting that the CNS may rely on a single sensory system (Cue Capture, CC), or choose to process sensory signals based on inferred signal causality (Causal Inference, CI). We developed a novel alternative-reality system to manipulate visual and physical tilt independently. We tasked participants (n = 36) to indicate the perceived upright for various (in-)congruent combinations of visual-inertial stimuli, and compared models based on their agreement with the data. The results favor the CI model over FF, although this effect became unambiguous only for large discrepancies (±60°). We conclude that the notion of a vector sum does not provide a comprehensive explanation of the perception of the upright, and that CI offers a better alternative.
Massardo, Darli; Fornel, Rodrigo; Kronforst, Marcus; Gonçalves, Gislene Lopes; Moreira, Gilson Rudinei Pires
2015-01-01
The tribe Heliconiini (Lepidoptera: Nymphalidae) is a diverse group of butterflies distributed throughout the Neotropics, which has been studied extensively, in particular the genus Heliconius. However, most of the other lineages, such as Dione, which are less diverse and considered basal within the group, have received little attention. Basic information, such as species limits and geographical distributions, remains uncertain for this genus. Here we used multilocus DNA sequence data and the geographical distribution analysis across the entire range of Dione in the Neotropical region in order to make inferences on the evolutionary history of this poorly explored lineage. Bayesian time-tree reconstruction allows inferring two major diversification events in this tribe around 25 mya. Lineages thought to be ancient, such as Dione and Agraulis, are as recent as Heliconius. Dione formed a monophyletic clade, sister to the genus Agraulis. Dione juno, D. glycera and D. moneta were reciprocally monophyletic and formed genetic clusters, with the first two more closely related to each other than to the third. Divergence time estimates support the hypothesis that speciation in Dione coincided with both the rise of Passifloraceae (the host plants) and the uplift of the Andes. Since the sister species D. glycera and D. moneta are specialized feeders on passion-vine lineages that are endemic to areas located either within or adjacent to the Andes, we inferred that they co-speciated with their host plants during this vicariant event. Copyright © 2014 Elsevier Inc. All rights reserved.
Spanning Tree Based Attribute Clustering
DEFF Research Database (Denmark)
Zeng, Yifeng; Jorge, Cordero Hernandez
2009-01-01
Attribute clustering has been previously employed to detect statistical dependence between subsets of variables. We propose a novel attribute clustering algorithm motivated by research of complex networks, called the Star Discovery algorithm. The algorithm partitions and indirectly discards...... inconsistent edges from a maximum spanning tree by starting appropriate initial modes, therefore generating stable clusters. It discovers sound clusters through simple graph operations and achieves significant computational savings. We compare the Star Discovery algorithm against earlier attribute clustering...
Constraint Satisfaction Inference : Non-probabilistic Global Inference for Sequence Labelling
Canisius, S.V.M.; van den Bosch, A.; Daelemans, W.; Basili, R.; Moschitti, A.
2006-01-01
We present a new method for performing sequence labelling based on the idea of using a machine-learning classifier to generate several possible output sequences, and then applying an inference procedure to select the best sequence among those. Most sequence labelling methods following a similar
Gekhtman, M; Vainshtein, A
2017-01-01
This is the second paper in the series of papers dedicated to the study of natural cluster structures in the rings of regular functions on simple complex Lie groups and Poisson-Lie structures compatible with these cluster structures. According to our main conjecture, each class in the Belavin-Drinfeld classification of Poisson-Lie structures on \mathcal{G} corresponds to a cluster structure in \mathcal{O}(\mathcal{G}). The authors have shown before that this conjecture holds for any \mathcal{G} in the case of the standard Poisson-Lie structure and for all Belavin-Drinfeld classes in SL_n, n<5. In this paper the authors establish it for the Cremmer-Gervais Poisson-Lie structure on SL_n, which is the least similar to the standard one.
From superdeformation to clusters
Energy Technology Data Exchange (ETDEWEB)
Betts, R R [Argonne National Lab., IL (United States). Physics Div.
1992-08-01
Much of the discussion at the conference centred on superdeformed states and their study by precise gamma spectrometry. The author suggests that the study of superdeformation by fission fragments and by auto-scattering is of importance, and may become more important. He concludes that there exists clear evidence of shell effects at extreme deformation in light nuclei studied by fission or cluster decay. The connection between the deformed shell model and the multi-center shell model can be exploited to give insight into the cluster structure of these extremely deformed states, and also gives hope of a spectroscopy based on selection rules for cluster decay. A clear disadvantage at this stage is the inability to make this spectroscopy more quantitative through calculation of the decay widths. The introduction of a new generation of high segmentation, high resolution, particle arrays has and will have a major impact on this aspect of the study of highly deformed nuclei. 20 refs., 16 figs.
Refractory chronic cluster headache
DEFF Research Database (Denmark)
Mitsikostas, Dimos D; Edvinsson, Lars; Jensen, Rigmor H
2014-01-01
Chronic cluster headache (CCH) often resists prophylactic pharmaceutical treatments, damaging patients' lives. In this rare but pragmatic situation escalation to invasive management is needed but framing criteria are lacking. We aimed to reach a consensus for refractory CCH definition...... for clinical and research use. The preparation of the final consensus followed three stages: an internal one among the authors, a larger one among all European Headache Federation members, and finally an international one among all investigators who have published clinical studies on cluster headache in the last five years...
Directory of Open Access Journals (Sweden)
Maurizio Rosina
2010-03-01
Full Text Available Geographic Clusters. Over the past decade, public alphanumeric databases have been growing at an exceptional rate. Most of the data can be georeferenced, so it is possible to gain new knowledge from such databases. The contribution of this paper is two-fold. We first present a model of geographic clusters, which uses only geographic and functional data properties. The model is useful for processing huge amounts of public/government data, even with daily updates. After that, we merge the model into the framework GEOPOI (GEOcoding Points Of Interest), and show some graphic map results.
Clustering via Kernel Decomposition
DEFF Research Database (Denmark)
Have, Anna Szynkowiak; Girolami, Mark A.; Larsen, Jan
2006-01-01
Methods for spectral clustering have been proposed recently which rely on the eigenvalue decomposition of an affinity matrix. In this work it is proposed that the affinity matrix is created based on the elements of a non-parametric density estimator. This matrix is then decomposed to obtain...... posterior probabilities of class membership using an appropriate form of nonnegative matrix factorization. The troublesome selection of hyperparameters such as kernel width and number of clusters can be obtained using standard cross-validation methods as is demonstrated on a number of diverse data sets....
Energy Technology Data Exchange (ETDEWEB)
Webb, T. M. A.; O' Donnell, D.; Coppin, Kristen; Faloon, Ashley; Geach, James E.; Noble, Allison [McGill University, 3600 rue University, Montreal, QC, H3A 2T8 (Canada); Yee, H. K. C. [Department of Astronomy and Astrophysics, University of Toronto, 50 St. George St., Toronto, ON, M5S 3H4 (Canada); Gilbank, David [South African Astronomical Observatory, P.O. Box 9, Observatory, 7935 (South Africa); Ellingson, Erica [Department of Astrophysical and Planetary Sciences, University of Colorado at Boulder, Boulder, CO 80309 (United States); Gladders, Mike [Department of Astronomy and Astrophysics, University of Chicago, 5640 S. Ellis Ave., Chicago, IL 60637 (United States); Muzzin, Adam [Leiden Observatory, University of Leiden, Niels Bohrweg 2, NL-2333 CA, Leiden (Netherlands); Wilson, Gillian [Department of Physics and Astronomy, University of California at Riverside, 900 University Avenue, Riverside, CA 92521 (United States); Yan, Renbin [Center for Cosmology and Particle Physics, Department of Physics, New York University, 4 Washington Place, New York, NY 10003 (United States)
2013-10-01
We present the results of an infrared (IR) study of high-redshift galaxy clusters with the MIPS camera on board the Spitzer Space Telescope. We have assembled a sample of 42 clusters from the Red-Sequence Cluster Survey-1 over the redshift range 0.3 < z < 1.0 and spanning an approximate range in mass of 10^{14-15} M_☉. We statistically measure the number of IR-luminous galaxies in clusters above a fixed inferred IR luminosity of 2 × 10^{11} M_☉, assuming a star forming galaxy template, per unit cluster mass and find it increases to higher redshift. Fitting a simple power-law we measure evolution of (1 + z)^{5.1±1.9} over the range 0.3 < z < 1.0. These results are tied to the adoption of a single star forming galaxy template; the presence of active galactic nuclei, and an evolution in their relative contribution to the mid-IR galaxy emission, will alter the overall number counts per cluster and their rate of evolution. Under the star formation assumption we infer the approximate total star formation rate per unit cluster mass (ΣSFR/M_cluster). The evolution is similar, with ΣSFR/M_cluster ∼ (1 + z)^{5.4±1.9}. We show that this can be accounted for by the evolution of the IR-bright field population over the same redshift range; that is, the evolution can be attributed entirely to the change in the in-falling field galaxy population. We show that the ΣSFR/M_cluster (binned over all redshift) decreases with increasing cluster mass with a slope (ΣSFR/M_cluster ∼ M_cluster^{-1.5±0.4}) consistent with the dependence of the stellar-to-total mass per unit cluster mass seen locally. The inferred star formation seen here could produce ∼5%-10% of the total stellar mass in massive clusters at z = 0, but we cannot constrain the descendant population, nor how rapidly the star-formation must shut-down once the galaxies have entered the cluster environment. Finally, we show a clear decrease in the number of IR
Human brain lesion-deficit inference remapped.
Mah, Yee-Haur; Husain, Masud; Rees, Geraint; Nachev, Parashkev
2014-09-01
Our knowledge of the anatomical organization of the human brain in health and disease draws heavily on the study of patients with focal brain lesions. Historically the first method of mapping brain function, it is still potentially the most powerful, establishing the necessity of any putative neural substrate for a given function or deficit. Great inferential power, however, carries a crucial vulnerability: without stronger alternatives any consistent error cannot be easily detected. A hitherto unexamined source of such error is the structure of the high-dimensional distribution of patterns of focal damage, especially in ischaemic injury-the commonest aetiology in lesion-deficit studies-where the anatomy is naturally shaped by the architecture of the vascular tree. This distribution is so complex that analysis of lesion data sets of conventional size cannot illuminate its structure, leaving us in the dark about the presence or absence of such error. To examine this crucial question we assembled the largest known set of focal brain lesions (n = 581), derived from unselected patients with acute ischaemic injury (mean age = 62.3 years, standard deviation = 17.8, male:female ratio = 0.547), visualized with diffusion-weighted magnetic resonance imaging, and processed with validated automated lesion segmentation routines. High-dimensional analysis of this data revealed a hidden bias within the multivariate patterns of damage that will consistently distort lesion-deficit maps, displacing inferred critical regions from their true locations, in a manner opaque to replication. Quantifying the size of this mislocalization demonstrates that past lesion-deficit relationships estimated with conventional inferential methodology are likely to be significantly displaced, by a magnitude dependent on the unknown underlying lesion-deficit relationship itself. Past studies therefore cannot be retrospectively corrected, except by new knowledge that would render them redundant
Differential Retention of Gene Functions in a Secondary Metabolite Cluster.
Reynolds, Hannah T; Slot, Jason C; Divon, Hege H; Lysøe, Erik; Proctor, Robert H; Brown, Daren W
2017-08-01
In fungi, distribution of secondary metabolite (SM) gene clusters is often associated with host- or environment-specific benefits provided by SMs. In the plant pathogen Alternaria brassicicola (Dothideomycetes), the DEP cluster confers an ability to synthesize the SM depudecin, a histone deacetylase inhibitor that contributes weakly to virulence. The DEP cluster includes genes encoding enzymes, a transporter, and a transcription regulator. We investigated the distribution and evolution of the DEP cluster in 585 fungal genomes and found a wide but sporadic distribution among Dothideomycetes, Sordariomycetes, and Eurotiomycetes. We confirmed DEP gene expression and depudecin production in one fungus, Fusarium langsethiae. Phylogenetic analyses suggested 6-10 horizontal gene transfers (HGTs) of the cluster, including a transfer that led to the presence of closely related cluster homologs in Alternaria and Fusarium. The analyses also indicated that HGTs were frequently followed by loss/pseudogenization of one or more DEP genes. Independent cluster inactivation was inferred in at least four fungal classes. Analyses of transitions among functional, pseudogenized, and absent states of DEP genes among Fusarium species suggest enzyme-encoding genes are lost at higher rates than the transporter (DEP3) and regulatory (DEP6) genes. The phenotype of an experimentally-induced DEP3 mutant of Fusarium did not support the hypothesis that selective retention of DEP3 and DEP6 protects fungi from exogenous depudecin. Together, the results suggest that HGT and gene loss have contributed significantly to DEP cluster distribution, and that some DEP genes provide a greater fitness benefit possibly due to a differential tendency to form network connections. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.
Energy Technology Data Exchange (ETDEWEB)
Beerman, Lori C.; Johnson, L. Clifton; Fouesneau, Morgan; Dalcanton, Julianne J.; Weisz, Daniel R.; Williams, Ben F. [Department of Astronomy, University of Washington, Box 351580, Seattle, WA 98195 (United States); Seth, Anil C. [Department of Physics and Astronomy, University of Utah, Salt Lake City, UT 84112 (United States); Bell, Eric F. [Department of Astronomy, University of Michigan, 500 Church Street, Ann Arbor, MI 48109 (United States); Bianchi, Luciana C. [Department of Physics and Astronomy, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218 (United States); Caldwell, Nelson [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Dolphin, Andrew E. [Raytheon Company, 1151 East Hermans Road, Tucson, AZ 85756 (United States); Gouliermis, Dimitrios A. [Zentrum fuer Astronomie, Institut fuer Theoretische Astrophysik, Universitaet Heidelberg, Albert-Ueberle-Strasse 2, D-69120 Heidelberg (Germany); Kalirai, Jason S. [Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218 (United States); Larsen, Soren S. [Department of Astrophysics, IMAPP, Radboud University Nijmegen, P.O. Box 9010, NL-6500 GL Nijmegen (Netherlands); Melbourne, Jason L. [Caltech Optical Observatories, Division of Physics, Mathematics and Astronomy, Mail Stop 301-17, California Institute of Technology, Pasadena, CA 91125 (United States); Rix, Hans-Walter [Max-Planck-Institut fuer Astronomie, Koenigstuhl 17, D-69117 Heidelberg (Germany); Skillman, Evan D., E-mail: beermalc@astro.washington.edu [Department of Astronomy, University of Minnesota, 116 Church Street SE, Minneapolis, MN 55455 (United States)
2012-12-01
The apparent age and mass of a stellar cluster can be strongly affected by stochastic sampling of the stellar initial mass function (IMF), when inferred from the integrated color of low-mass clusters (≲10^4 M_☉). We use simulated star clusters to show that these effects are minimized when the brightest, rapidly evolving stars in a cluster can be resolved, and the light of the fainter, more numerous unresolved stars can be analyzed separately. When comparing the light from the less luminous cluster members to models of unresolved light, more accurate age estimates can be obtained than when analyzing the integrated light from the entire cluster under the assumption that the IMF is fully populated. We show the success of this technique first using simulated clusters, and then with a stellar cluster in M31. This method represents one way of accounting for the discrete, stochastic sampling of the stellar IMF in less massive clusters and can be leveraged in studies of clusters throughout the Local Group and other nearby galaxies.
Dynamical Mass Measurements of Contaminated Galaxy Clusters Using Support Distribution Machines
Ntampaka, Michelle; Trac, Hy; Sutherland, Dougal; Fromenteau, Sebastien; Poczos, Barnabas; Schneider, Jeff
2018-01-01
We study dynamical mass measurements of galaxy clusters contaminated by interlopers and show that a modern machine learning (ML) algorithm can predict masses by better than a factor of two compared to a standard scaling relation approach. We create two mock catalogs from Multidark’s publicly available N-body MDPL1 simulation, one with perfect galaxy cluster membership information and the other where a simple cylindrical cut around the cluster center allows interlopers to contaminate the clusters. In the standard approach, we use a power-law scaling relation to infer cluster mass from galaxy line-of-sight (LOS) velocity dispersion. Assuming perfect membership knowledge, this unrealistic case produces a wide fractional mass error distribution, with a width E=0.87. Interlopers introduce additional scatter, significantly widening the error distribution further (E=2.13). We employ the support distribution machine (SDM) class of algorithms to learn from distributions of data to predict single values. Applied to distributions of galaxy observables such as LOS velocity and projected distance from the cluster center, SDM yields better than a factor-of-two improvement (E=0.67) for the contaminated case. Remarkably, SDM applied to contaminated clusters is better able to recover masses than even the scaling relation approach applied to uncontaminated clusters. We show that the SDM method more accurately reproduces the cluster mass function, making it a valuable tool for employing cluster observations to evaluate cosmological models.
Performance quantification of clustering algorithms for false positive removal in fMRI by ROC curves
Directory of Open Access Journals (Sweden)
André Salles Cunha Peres
Full Text Available Abstract Introduction Functional magnetic resonance imaging (fMRI) is a non-invasive technique that allows the detection of specific cerebral functions in humans based on hemodynamic changes. The contrast changes are about 5%, making visual inspection impossible. Thus, statistical strategies are applied to infer which brain region is engaged in a task. However, traditional methods such as the general linear model and cross-correlation use voxel-wise calculation, introducing many false positives. So, in this work we tested post-processing cluster algorithms to reduce the false positives. Methods In this study, three clustering algorithms (the hierarchical cluster, k-means and self-organizing maps) were tested and compared for false-positive removal in the post-processing of cross-correlation analyses. Results Our results showed that the hierarchical cluster presented the best performance in removing the false positives in fMRI, being 2.3 times more accurate than k-means, and 1.9 times more accurate than self-organizing maps. Conclusion The hierarchical cluster presented the best performance in false-positive removal because it uses the inconsistency coefficient threshold, while k-means and self-organizing maps use an a priori cluster number (number of centroids and neurons); thus, the hierarchical cluster avoids clustering scattered voxels, as the inconsistency coefficient threshold allows only the voxels to be clustered that are at a minimum distance to some cluster.
Multi-Optimisation Consensus Clustering
Li, Jian; Swift, Stephen; Liu, Xiaohui
Ensemble Clustering has been developed to provide an alternative way of obtaining more stable and accurate clustering results. It aims to avoid the biases of individual clustering algorithms. However, it is still a challenge to develop an efficient and robust method for Ensemble Clustering. Based on an existing ensemble clustering method, Consensus Clustering (CC), this paper introduces an advanced Consensus Clustering algorithm called Multi-Optimisation Consensus Clustering (MOCC), which utilises an optimised Agreement Separation criterion and a Multi-Optimisation framework to improve the performance of CC. Fifteen different data sets are used for evaluating the performance of MOCC. The results reveal that MOCC can generate more accurate clustering results than the original CC algorithm.
Photochemistry in rare gas clusters
International Nuclear Information System (INIS)
Moeller, T.; Haeften, K. von; Pietrowski, R. von
1999-01-01
In this contribution photochemical processes in pure rare gas clusters will be discussed. The relaxation dynamics of electronically excited He clusters is investigated with luminescence spectroscopy. After electronic excitation of He clusters many sharp lines are observed in the visible and infrared spectral range which can be attributed to He atoms and molecules desorbing from the cluster. It turns out that the desorption of electronically excited He atoms and molecules is an important decay channel. The findings for He clusters are compared with results for Ar clusters. While desorption of electronically excited He atoms is observed for all clusters containing up to several thousand atoms a corresponding process in Ar clusters is only observed for very small clusters (N<10). (orig.)
Photochemistry in rare gas clusters
Energy Technology Data Exchange (ETDEWEB)
Moeller, T.; Haeften, K. von; Pietrowski, R. von [Deutsches Elektronen-Synchrotron (DESY), Hamburg (Germany). Hamburger Synchrotronstrahlungslabor; Laarman, T. [Universitaet Hamburg, II. Institut fuer Experimentalphysik, Luruper Chaussee 149, D-22761 Hamburg (Germany)
1999-12-01
In this contribution photochemical processes in pure rare gas clusters will be discussed. The relaxation dynamics of electronically excited He clusters is investigated with luminescence spectroscopy. After electronic excitation of He clusters many sharp lines are observed in the visible and infrared spectral range which can be attributed to He atoms and molecules desorbing from the cluster. It turns out that the desorption of electronically excited He atoms and molecules is an important decay channel. The findings for He clusters are compared with results for Ar clusters. While desorption of electronically excited He atoms is observed for all clusters containing up to several thousand atoms a corresponding process in Ar clusters is only observed for very small clusters (N<10). (orig.)
Globular clusters, old and young
International Nuclear Information System (INIS)
Samus', N.N.
1984-01-01
The problem of similarity of and difference in the globular and scattered star clusters is considered. Star clusters in astronomy are related either to globular or to scattered ones according to the structure of the Hertzsprung-Russell diagram constructed for star clusters, but not according to the appearance. The globular clusters in the Galaxy are composed of giants and subgiants, which testifies to the old age of the globular clusters. The globular clusters in the Magellanic clouds are classified into ''red'' ones - similar to the globular clusters of the Galaxy, and ''blue'' ones - similar to them in appearance but differing extremely by the star composition and so by the age. The old star clusters are suggested to be called globular ones, while another name (''populous'', for example) is suggested to be used for other clusters similar to globular ones only in appearance.
Meta-learning framework applied in bioinformatics inference system design.
Arredondo, Tomás; Ormazábal, Wladimir
2015-01-01
This paper describes a meta-learner inference system development framework which is applied and tested in the implementation of bioinformatic inference systems. These inference systems are used for the systematic classification of the best candidates for inclusion in bacterial metabolic pathway maps. This meta-learner-based approach utilises a workflow where the user provides feedback with final classification decisions which are stored in conjunction with analysed genetic sequences for periodic inference system training. The inference systems were trained and tested with three different data sets related to the bacterial degradation of aromatic compounds. The analysis of the meta-learner-based framework involved contrasting several different optimisation methods with various different parameters. The obtained inference systems were also contrasted with other standard classification methods with accurate prediction capabilities observed.
Active Inference, homeostatic regulation and adaptive behavioural control.
Pezzulo, Giovanni; Rigoli, Francesco; Friston, Karl
2015-11-01
We review a theory of homeostatic regulation and adaptive behavioural control within the Active Inference framework. Our aim is to connect two research streams that are usually considered independently; namely, Active Inference and associative learning theories of animal behaviour. The former uses a probabilistic (Bayesian) formulation of perception and action, while the latter calls on multiple (Pavlovian, habitual, goal-directed) processes for homeostatic and behavioural control. We offer a synthesis of these classical processes and cast them as successive hierarchical contextualisations of sensorimotor constructs, using the generative models that underpin Active Inference. This dissolves any apparent mechanistic distinction between the optimization processes that mediate classical control or learning. Furthermore, we generalize the scope of Active Inference by emphasizing interoceptive inference and homeostatic regulation. The ensuing homeostatic (or allostatic) perspective provides an intuitive explanation for how priors act as drives or goals to enslave action, and emphasises the embodied nature of inference. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Globular clusters and galaxy halos
International Nuclear Information System (INIS)
Van Den Bergh, S.
1984-01-01
Using semipartial correlation coefficients and bootstrap techniques, a study is made of the important features of globular clusters with respect to the total number of galaxy clusters and dependence of specific galaxy cluster on parent galaxy type, cluster radii, luminosity functions and cluster ellipticity. It is shown that the ellipticity of LMC clusters correlates significantly with cluster luminosity functions, but not with cluster age. The cluster luminosity value above which globulars are noticeably flattened may differ by a factor of about 100 from galaxy to galaxy. Both in the Galaxy and in M31 globulars with small core radii have a Gaussian distribution over luminosity, whereas clusters with large core radii do not. In the cluster systems surrounding the Galaxy, M31 and NGC 5128 the mean radii of globular clusters were found to increase with the distance from the nucleus. Central galaxies in rich clusters have much higher values for specific globular cluster frequency than do other cluster ellipticals, suggesting that such central galaxies must already have been different from normal ellipticals at the time they were formed.
Clustering of resting state networks.
Directory of Open Access Journals (Sweden)
Megan H Lee
Full Text Available The goal of the study was to demonstrate a hierarchical structure of resting state activity in the healthy brain using a data-driven clustering algorithm. The fuzzy-c-means clustering algorithm was applied to resting state fMRI data in cortical and subcortical gray matter from two groups acquired separately, one of 17 healthy individuals and the second of 21 healthy individuals. Different numbers of clusters and different starting conditions were used. A cluster dispersion measure determined the optimal numbers of clusters. An inner product metric provided a measure of similarity between different clusters. The two cluster result found the task-negative and task-positive systems. The cluster dispersion measure was minimized with seven and eleven clusters. Each of the clusters in the seven and eleven cluster result was associated with either the task-negative or task-positive system. Applying the algorithm to find seven clusters recovered previously described resting state networks, including the default mode network, frontoparietal control network, ventral and dorsal attention networks, somatomotor, visual, and language networks. The language and ventral attention networks had significant subcortical involvement. This parcellation was consistently found in a large majority of algorithm runs under different conditions and was robust to different methods of initialization. The clustering of resting state activity using different optimal numbers of clusters identified resting state networks comparable to previously obtained results. This work reinforces the observation that resting state networks are hierarchically organized.
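For readers unfamiliar with the fuzzy-c-means algorithm named in this abstract, the following is a minimal sketch of the textbook update rules in pure Python. It is not the authors' pipeline: the toy one-dimensional data, the starting centers, and the fuzzifier m = 2 are all hypothetical, and the function names are chosen for illustration only.

```python
import math

def memberships_for(points, centers, m=2.0):
    """Soft membership of every point in every cluster (each row sums to 1)."""
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if min(dists) == 0.0:
            # Point sits exactly on a center: assign it fully to that center.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            rows.append([v / sum(inv) for v in inv])
    return rows

def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of steps."""
    dim = len(points[0])
    for _ in range(n_iter):
        u = memberships_for(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            w = [row[k] ** m for row in u]
            new_centers.append(tuple(
                sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
                for d in range(dim)))
        centers = new_centers
    return centers, memberships_for(points, centers, m)

# Toy data: two well-separated groups on a line (hypothetical values).
points = [(0.0,), (0.2,), (0.1,), (5.0,), (5.2,), (5.1,)]
centers, u = fuzzy_c_means(points, centers=[(1.0,), (4.0,)])
```

Unlike k-means, each point keeps a graded membership in every cluster, which is one reason the method suits overlapping structures. Choosing the number of clusters (the study used a cluster dispersion measure) is outside this sketch.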
Bayesian inference data evaluation and decisions
Harney, Hanns Ludwig
2016-01-01
This new edition offers a comprehensive introduction to the analysis of data using Bayes rule. It generalizes Gaussian error intervals to situations in which the data follow distributions other than Gaussian. This is particularly useful when the observed parameter is barely above the background or the histogram of multiparametric data contains many empty bins, so that the determination of the validity of a theory cannot be based on the chi-squared-criterion. In addition to the solutions of practical problems, this approach provides an epistemic insight: the logic of quantum mechanics is obtained as the logic of unbiased inference from counting data. New sections feature factorizing parameters, commuting parameters, observables in quantum mechanics, the art of fitting with coherent and with incoherent alternatives and fitting with multinomial distribution. Additional problems and examples help deepen the knowledge. Requiring no knowledge of quantum mechanics, the book is written on introductory level, with man...
Bayesian inference and updating of reliability data
International Nuclear Information System (INIS)
Sabri, Z.A.; Cullingford, M.C.; David, H.T.; Husseiny, A.A.
1980-01-01
A Bayes methodology for inference of reliability values using available but scarce current data is discussed. The method can be used to update failure rates as more information becomes available from field experience, assuming that the performance of a given component (or system) exhibits a nonhomogeneous Poisson process. Bayes' theorem is used to summarize the historical evidence and current component data in the form of a posterior distribution suitable for prediction and for smoothing or interpolation. An example is given. It may be appropriate to apply the methodology developed here to human error data, in which case the exponential model might be used to describe the learning behavior of the operator or maintenance crew personnel
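The updating step described in this abstract can be illustrated with a simplified conjugate sketch. It assumes a constant failure rate with a Gamma prior and Poisson-distributed failure counts, which is a simpler setting than the nonhomogeneous Poisson process the abstract mentions; the function names and all numbers are hypothetical.

```python
def update_failure_rate(alpha, beta, failures, exposure_hours):
    """Gamma(alpha, beta) prior on a failure rate (shape, rate in hours).

    With a Poisson likelihood for the number of failures observed over
    exposure_hours, the posterior is Gamma(alpha + failures,
    beta + exposure_hours).
    """
    return alpha + failures, beta + exposure_hours

def posterior_mean(alpha, beta):
    """Mean of a Gamma(shape=alpha, rate=beta) distribution."""
    return alpha / beta

# Hypothetical historical evidence: prior mean of 2 failures per 1000 hours.
prior = (2.0, 1000.0)
# First batch of field data: 1 failure in 4000 operating hours.
post1 = update_failure_rate(*prior, failures=1, exposure_hours=4000.0)
# Second batch: no failures in a further 5000 hours.
post2 = update_failure_rate(*post1, failures=0, exposure_hours=5000.0)
rates = [posterior_mean(*p) for p in (prior, post1, post2)]
```

Each batch of field experience simply adds to the two parameters, so the posterior from one update serves as the prior for the next.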
Automatic inference of indexing rules for MEDLINE
Directory of Open Access Journals (Sweden)
Shooshan Sonya E
2008-11-01
Full Text Available Abstract Background: Indexing is a crucial step in any information retrieval system. In MEDLINE, a widely used database of the biomedical literature, the indexing process involves the selection of Medical Subject Headings in order to describe the subject matter of articles. The need for automatic tools to assist MEDLINE indexers in this task is growing with the increasing number of publications being added to MEDLINE. Methods: In this paper, we describe the use and the customization of Inductive Logic Programming (ILP) to infer indexing rules that may be used to produce automatic indexing recommendations for MEDLINE indexers. Results: Our results show that this original ILP-based approach outperforms manual rules when they exist. In addition, the use of ILP rules also improves the overall performance of the Medical Text Indexer (MTI), a system producing automatic indexing recommendations for MEDLINE. Conclusion: We expect the sets of ILP rules obtained in this experiment to be integrated into MTI.
Progression inference for somatic mutations in cancer
Directory of Open Access Journals (Sweden)
Leif E. Peterson
2017-04-01
Full Text Available Computational methods were employed to determine progression inference of genomic alterations in commonly occurring cancers. Using cross-sectional TCGA data, we computed evolutionary trajectories involving selectivity relationships among pairs of gene-specific genomic alterations such as somatic mutations, deletions, amplifications, downregulation, and upregulation among the top 20 driver genes associated with each cancer. Results indicate that the majority of hierarchies involved TP53, PIK3CA, ERBB2, APC, KRAS, EGFR, IDH1, VHL, etc. Research into the order and accumulation of genomic alterations among cancer driver genes will continue to increase as the costs of next-generation sequencing fall and personalized/precision medicine incorporates whole-genome scans into the diagnosis and treatment of cancer. Keywords: Oncology, Cancer research, Genetics, Computational biology
Inferring Phylogenetic Networks from Gene Order Data
Directory of Open Access Journals (Sweden)
Alexey Anatolievich Morozov
2013-01-01
Full Text Available Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using the gene order data as an input. Here we describe several methods to build split networks from the gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from genome structures under study and used as an input for network construction algorithms. Three intermediates are used: set of jackknife trees, distance matrix, and binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and distance matrix (when used with Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used.
Supplier Selection Using Fuzzy Inference System
Directory of Open Access Journals (Sweden)
hamidreza kadhodazadeh
2014-01-01
Full Text Available Suppliers are one of the most vital parts of the supply chain, and their operation has a significant indirect effect on customer satisfaction. Since customers' expectations of organizations differ, organizations should consider different standards accordingly. In recent years, much research in this field has used different standards and methods. The purpose of this study is to propose an approach for choosing a supplier in a food manufacturing company, considering the cost, quality, service, type of relationship, and organizational structure standards of the supplier. To evaluate suppliers according to these standards, a fuzzy inference system is used. The input data of this system are each supplier's scores on each standard, obtained by the AHP approach, and the output is the final score of each supplier. Finally, a supplier was selected that, although not the best in price and quality, achieved a good score on all of the standards.
Cosmological constraints with clustering-based redshifts
Kovetz, Ely D.; Raccanelli, Alvise; Rahman, Mubdi
2017-07-01
We demonstrate that observations lacking reliable redshift information, such as photometric and radio continuum surveys, can produce robust measurements of cosmological parameters when empowered by clustering-based redshift estimation. This method infers the redshift distribution based on the spatial clustering of sources, using cross-correlation with a reference data set with known redshifts. Applying this method to the existing Sloan Digital Sky Survey (SDSS) photometric galaxies, and projecting to future radio continuum surveys, we show that sources can be efficiently divided into several redshift bins, increasing their ability to constrain cosmological parameters. We forecast constraints on the dark-energy equation of state and on local non-Gaussianity parameters. We explore several pertinent issues, including the trade-off between including more sources and minimizing the overlap between bins, the shot-noise limitations on binning and the predicted performance of the method at high redshifts, and most importantly pay special attention to possible degeneracies with the galaxy bias. Remarkably, we find that once this technique is implemented, constraints on dynamical dark energy from the SDSS imaging catalogue can be competitive with, or better than, those from the spectroscopic BOSS survey and even future planned experiments. Further, constraints on primordial non-Gaussianity from future large-sky radio-continuum surveys can outperform those from the Planck cosmic microwave background experiment and rival those from future spectroscopic galaxy surveys. The application of this method thus holds tremendous promise for cosmology.
Gene expression inference with deep learning.
Chen, Yifei; Li, Yi; Narayan, Rajiv; Subramanian, Aravind; Xie, Xiaohui
2016-06-15
Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only ∼1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes. D-GEX is available at https://github.com/uci-cbcl/D-GEX. Contact: xhx@ics.uci.edu. Supplementary data are available at Bioinformatics online.
Systematic parameter inference in stochastic mesoscopic modeling
Energy Technology Data Exchange (ETDEWEB)
Lei, Huan; Yang, Xiu [Pacific Northwest National Laboratory, Richland, WA 99352 (United States); Li, Zhen [Division of Applied Mathematics, Brown University, Providence, RI 02912 (United States); Karniadakis, George Em, E-mail: george_karniadakis@brown.edu [Division of Applied Mathematics, Brown University, Providence, RI 02912 (United States)
2017-02-01
We propose a method to efficiently determine the optimal coarse-grained force field in mesoscopic stochastic simulations of Newtonian fluid and polymer melt systems modeled by dissipative particle dynamics (DPD) and energy conserving dissipative particle dynamics (eDPD). The response surfaces of various target properties (viscosity, diffusivity, pressure, etc.) with respect to model parameters are constructed based on the generalized polynomial chaos (gPC) expansion using simulation results on sampling points (e.g., individual parameter sets). To alleviate the computational cost to evaluate the target properties, we employ the compressive sensing method to compute the coefficients of the dominant gPC terms given the prior knowledge that the coefficients are “sparse”. The proposed method shows comparable accuracy with the standard probabilistic collocation method (PCM) while it imposes a much weaker restriction on the number of the simulation samples, especially for systems with high dimensional parametric space. Full access to the response surfaces within the confidence range enables us to infer the optimal force parameters given the desirable values of target properties at the macroscopic scale. Moreover, it enables us to investigate the intrinsic relationship between the model parameters, identify possible degeneracies in the parameter space, and optimize the model by eliminating model redundancies. The proposed method provides an efficient alternative approach for constructing mesoscopic models by inferring model parameters to recover target properties of the physics systems (e.g., from experimental measurements), where those force field parameters and formulation cannot be derived from the microscopic level in a straightforward way.
African Journals Online (AJOL)
Background: The importance of local variations in patterns of health and disease is increasingly recognised, but, particularly in the case of tropical infections, available methods and resources for characterising disease clusters in time and space are limited. Whilst the Global Positioning System (GPS) allows accurate and ...
Indian Academy of Sciences (India)
Hardness of Clustering. Both k-means and k-medians intractable (when n and d are both inputs even for k = 2). The best known deterministic algorithms are based on Voronoi partitioning that takes about time. Need for approximation – “close” to optimal.
International Nuclear Information System (INIS)
Bernardes, N.
1984-01-01
A discussion is presented of zero-point motion effects on the binding energy of a small cluster of identical particles interacting through short range attractive-repulsive forces. The model is appropriate to a discussion of both Van der Waals as well as nuclear forces. (Author)
Emergence of regional clusters
DEFF Research Database (Denmark)
Dahl, Michael S.; Østergaard, Christian Richter; Dalum, Bent
2010-01-01
The literature on regional clusters has increased considerably during the last decade. The emergence and growth patterns are usually explained by such factors as unique local culture, regional capabilities, tacit knowledge or the existence of location-specific externalities (knowledge spillovers...
International Nuclear Information System (INIS)
Powers, D.E.; Hansen, S.G.; Geusic, M.E.; Michalopoulos, D.L.; Smalley, R.E.
1983-01-01
Copper clusters ranging in size from 1 to 29 atoms have been prepared in a supersonic beam by laser vaporization of a rotating copper target rod within the throat of a pulsed supersonic nozzle using helium for the carrier gas. The clusters were cooled extensively in the supersonic expansion [T(translational) 1 to 4 K, T(rotational) = 4 K, T(vibrational) = 20 to 70 K]. These clusters were detected in the supersonic beam by laser photoionization with time-of-flight mass analysis. Using a number of fixed frequency outputs of an exciplex laser, the threshold behavior of the photoionization cross section was monitored as a function of cluster size. Resonance two-photon ionization (R2PI) with mass selective detection allowed the detection of five new electronic band systems in the region between 2690 and 3200 Å, for each of the three naturally occurring isotopic forms of Cu2. In the process of scanning the R2PI spectrum of these new electronic states, the ionization potential of the copper dimer was determined to be 7.894 ± 0.015 eV.
2016-09-01
We consider the problem of subspace clustering: given points that lie on or near the union of many low-dimensional linear subspaces, recover the subspaces. To this end, one first identifies sets of points close to the same subspace and uses the sets ...
State-Space Inference and Learning with Gaussian Processes
Turner, R; Deisenroth, MP; Rasmussen, CE
2010-01-01
State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model. C...
Yau, Christopher; Holmes, Chris
2011-07-01
We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.
Data clustering algorithms and applications
Aggarwal, Charu C
2013-01-01
Research on the problem of clustering tends to be fragmented across the pattern recognition, database, data mining, and machine learning communities. Addressing this problem in a unified way, Data Clustering: Algorithms and Applications provides complete coverage of the entire area of clustering, from basic methods to more refined and complex data clustering approaches. It pays special attention to recent issues in graphs, social networks, and other domains. The book focuses on three primary aspects of data clustering: Methods, describing key techniques commonly used for clustering, such as fea
Cluster structures in light nuclei
International Nuclear Information System (INIS)
Horiuchi, H.
2000-01-01
Clustering in neutron-rich nuclei is discussed. To understand the novel features (1,2,3) of the clustering in neutron-rich nuclei, the basic features of the clustering in stable nuclei (4) are briefly reviewed. In neutron-rich nuclei, the requirement of the stability of clusters is questioned and the threshold rule is no longer obeyed. Examples of clustering in Be and B isotopes (4,5) are discussed in some detail. The possible existence of a novel type of clustering near the neutron dripline is suggested (1). (author)
International Nuclear Information System (INIS)
Horiuchi, H.; Ikeda, K.
1986-01-01
This article reviews the development of the cluster model study. The stress is put on two points: one is how the cluster structure has come to be regarded as a fundamental structure in light nuclei together with the shell-model structure, and the other is how at present the cluster model is extended to and connected with the studies of various subjects, many of which are in the neighbouring fields. The authors then present the main theme with detailed explanations of the fundamentals of the microscopic cluster model which have promoted the development of the cluster model. Examples of the microscopic cluster model study of light nuclear structure are given.
Probabilistic logic networks a comprehensive framework for uncertain inference
Goertzel, Ben; Goertzel, Izabela Freire; Heljakka, Ari
2008-01-01
This comprehensive book describes Probabilistic Logic Networks (PLN), a novel conceptual, mathematical and computational approach to uncertain inference. A broad scope of reasoning types are considered.
Parametric statistical inference basic theory and modern approaches
Zacks, Shelemyahu; Tsokos, C P
1981-01-01
Parametric Statistical Inference: Basic Theory and Modern Approaches presents the developments and modern trends in statistical inference to students who do not have advanced mathematical and statistical preparation. The topics discussed in the book are basic and common to many fields of statistical inference and thus serve as a jumping board for in-depth study. The book is organized into eight chapters. Chapter 1 provides an overview of how the theory of statistical inference is presented in subsequent chapters. Chapter 2 briefly discusses statistical distributions and their properties. Chapt
The rotation of galaxy clusters
International Nuclear Information System (INIS)
Tovmassian, H.M.
2015-01-01
A method is proposed for detecting the rotation of galaxy clusters, based on studying how member galaxies with velocities lower and higher than the cluster mean velocity are distributed over the cluster image. The search for rotation is made for flat clusters with a/b > 1.8 and for BMI-type clusters, which are expected to be rotating. For comparison, round clusters and clusters of NBMI type, in which the second-brightest galaxy does not differ significantly from the cluster cD galaxy, were also studied. Seventeen of the 65 clusters studied are found to be rotating. The detection rate is sufficiently high for flat clusters, over 60 per cent, and for BMI-type clusters with a dominant cD galaxy, ≈ 35 per cent. The results show that clusters were formed from huge primordial gas clouds and preserved the rotation of those clouds, unless they merged with other clusters and groups of galaxies, as a result of which the rotation was prevented.
Single-cluster dynamics for the random-cluster model
Deng, Y.; Qian, X.; Blöte, H.W.J.
2009-01-01
We formulate a single-cluster Monte Carlo algorithm for the simulation of the random-cluster model. This algorithm is a generalization of the Wolff single-cluster method for the q-state Potts model to noninteger values q>1. Its results for static quantities are in a satisfactory agreement with those
Choosing the Number of Clusters in K-Means Clustering
Steinley, Douglas; Brusco, Michael J.
2011-01-01
Steinley (2007) provided a lower bound for the sum-of-squares error criterion function used in K-means clustering. In this article, on the basis of the lower bound, the authors propose a method to distinguish between 1 cluster (i.e., a single distribution) versus more than 1 cluster. Additionally, conditional on indicating there are multiple…
Serang, Oliver
2014-01-01
Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called “causal independence”). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to and the space to where is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions. PMID:24626234
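The core idea of arranging additions as a tree of convolutions can be sketched as follows. This is only the forward (prior-to-sum) pass for independent non-negative integer variables, using direct convolution via `numpy.convolve`; the function name `sum_distribution` is an illustrative assumption, and the sketch does not reproduce the full algorithm or the complexity bounds of the paper.

```python
import numpy as np


def sum_distribution(pmfs):
    """Distribution of the sum of independent non-negative integer variables.

    Each entry of `pmfs` is a 1-D sequence where index v holds P(X = v).
    Neighbouring distributions are convolved pairwise, layer by layer,
    so the additions are arranged as a balanced binary tree.
    """
    layer = [np.asarray(p, dtype=float) for p in pmfs]
    if not layer:
        return np.array([1.0])  # the empty sum equals 0 with probability 1
    while len(layer) > 1:
        merged = [np.convolve(layer[i], layer[i + 1])
                  for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2 == 1:
            merged.append(layer[-1])  # an unpaired node moves up unchanged
        layer = merged
    return layer[0]


# Three hypothetical fair-coin indicators: their sum is Binomial(3, 0.5).
total = sum_distribution([[0.5, 0.5]] * 3)
print(total)
```

Keeping the intermediate layers, rather than discarding them as this sketch does, is what would allow evidence about the sum to be passed back down to the individual variables.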
Making inference from wildlife collision data: inferring predator absence from prey strikes
Directory of Open Access Journals (Sweden)
Peter Caley
2017-02-01
Full Text Available Wildlife collision data are ubiquitous, though challenging for making ecological inference due to typically irreducible uncertainty relating to the sampling process. We illustrate a new approach that is useful for generating inference from predator data arising from wildlife collisions. By simply conditioning on a second prey species sampled via the same collision process, and by using a biologically realistic numerical response function, we can produce a coherent numerical response relationship between predator and prey. This relationship can then be used to make inference on the population size of the predator species, including the probability of extinction. The statistical conditioning enables us to account for unmeasured variation in factors influencing the runway strike incidence for individual airports and to enable valid comparisons. A practical application of the approach for testing hypotheses about the distribution and abundance of a predator species is illustrated using the hypothesized red fox incursion into Tasmania, Australia. We estimate that conditional on the numerical response between fox and lagomorph runway strikes on mainland Australia, the predictive probability of observing no runway strikes of foxes in Tasmania after observing 15 lagomorph strikes is 0.001. We conclude there is enough evidence to safely reject the null hypothesis that there is a widespread red fox population in Tasmania at a population density consistent with prey availability. The method is novel and has potential wider application.
Making inference from wildlife collision data: inferring predator absence from prey strikes.
Caley, Peter; Hosack, Geoffrey R; Barry, Simon C
2017-01-01
Wildlife collision data are ubiquitous, though challenging for making ecological inference due to typically irreducible uncertainty relating to the sampling process. We illustrate a new approach that is useful for generating inference from predator data arising from wildlife collisions. By simply conditioning on a second prey species sampled via the same collision process, and by using a biologically realistic numerical response function, we can produce a coherent numerical response relationship between predator and prey. This relationship can then be used to make inference on the population size of the predator species, including the probability of extinction. The statistical conditioning enables us to account for unmeasured variation in factors influencing the runway strike incidence for individual airports and to enable valid comparisons. A practical application of the approach for testing hypotheses about the distribution and abundance of a predator species is illustrated using the hypothesized red fox incursion into Tasmania, Australia. We estimate that conditional on the numerical response between fox and lagomorph runway strikes on mainland Australia, the predictive probability of observing no runway strikes of foxes in Tasmania after observing 15 lagomorph strikes is 0.001. We conclude there is enough evidence to safely reject the null hypothesis that there is a widespread red fox population in Tasmania at a population density consistent with prey availability. The method is novel and has potential wider application.
Heavy hitters via cluster-preserving clustering
DEFF Research Database (Denmark)
Larsen, Kasper Green; Nelson, Jelani; Nguyen, Huy L.
2016-01-01
In the turnstile ℓ_p heavy hitters problem with parameter ε, one must maintain a high-dimensional vector x ∈ R^n subject to updates of the form update(i, Δ) causing the change x_i ← x_i + Δ, where i ∈ [n], Δ ∈ R. Upon receiving a query, the goal is to report every "heavy hitter" i ∈ [n] with |x_i| ≥ ε......‖x‖_p as part of a list L ⊆ [n] of size O(1/ε^p), i.e. proportional to the maximum possible number of heavy hitters. For any p ∈ (0,2] the COUNTSKETCH of [CCFC04] solves ℓ_p heavy hitters using O(ε^-p log n) words of space with O(log n) update time, O(n log n) query time to output L, and whose output after any query......, providing correctness whp. In fact, a simpler version of our algorithm for p = 1 in the strict turnstile model answers queries even faster than the "dyadic trick" by roughly a log n factor, dominating it in all regards. Our main innovation is an efficient reduction from the heavy hitters to a clustering...
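For orientation, the baseline COUNTSKETCH mentioned in this abstract can be sketched in a few lines. This is a toy version under stated assumptions: Python's built-in `hash` on salted integer tuples stands in for the limited-independence hash families used in the analysis, the sizes `rows` and `width` are arbitrary, and nothing here reproduces the new algorithm of the paper.

```python
import random
import statistics


class CountSketch:
    """Toy CountSketch supporting turnstile updates update(i, delta)."""

    def __init__(self, rows=5, width=64, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.table = [[0.0] * width for _ in range(rows)]
        # One (bucket salt, sign salt) pair per row; an illustrative stand-in
        # for properly chosen hash functions.
        self.salts = [(rng.getrandbits(61), rng.getrandbits(61))
                      for _ in range(rows)]

    def _bucket(self, salts, i):
        return hash((salts[0], i)) % self.width

    def _sign(self, salts, i):
        return 1.0 if hash((salts[1], i)) & 1 else -1.0

    def update(self, i, delta):
        # Add the signed update to one counter in every row.
        for row, salts in zip(self.table, self.salts):
            row[self._bucket(salts, i)] += self._sign(salts, i) * delta

    def estimate(self, i):
        # Median over rows of the sign-corrected counter for coordinate i.
        return statistics.median(
            self._sign(salts, i) * row[self._bucket(salts, i)]
            for row, salts in zip(self.table, self.salts)
        )


# Hypothetical stream: one heavy coordinate among 200 light ones.
sketch = CountSketch()
sketch.update(42, 1000.0)
for i in range(200):
    sketch.update(i + 100, 1.0)
print(sketch.estimate(42))
```

A query for the full list L would call `estimate` on every coordinate in [n], which is the slow query step the abstract contrasts with its own approach.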
Evolution of the spherical clusters
International Nuclear Information System (INIS)
Surdin, V.G.
1978-01-01
The possible processes of formation and evolution of the Galaxy's spherical clusters are described at a popular level. The orbits of spherical clusters and their spatial velocities are determined. The distributions of spherical cluster stars by velocity and the observed distribution of spherical clusters in the region of slow evolution in the Galaxy are given. The dissipation and dynamical friction processes that destroy clusters with masses below 10^4 solar masses and reduce the number of clusters in the Galaxy are considered. The paradox of X-ray sources forming mainly in spherical clusters is explained. A schematic picture of possible ways of forming X-ray sources in spherical clusters is given.
International Nuclear Information System (INIS)
Beck, Christian
2010-01-01
Following the pioneering discovery of alpha clustering and of molecular resonances, the field of nuclear clustering is presently one of the domains of heavy-ion nuclear physics facing both the greatest challenges and opportunities. After many summer schools and workshops, in particular over the last decade, the community of nuclear molecular physics decided to team up in producing a comprehensive collection of lectures and tutorial reviews covering the field. This first volume, gathering seven extensive lectures, covers the following topics: - Cluster Radioactivity - Cluster States and Mean Field Theories - Alpha Clustering and Alpha Condensates - Clustering in Neutron-rich Nuclei - Di-neutron Clustering - Collective Clusterization in Nuclei - Giant Nuclear Molecules. By promoting new ideas and developments while retaining a pedagogical nature of presentation throughout, these lectures will serve both as a reference and as advanced teaching material for future courses and schools in the fields of nuclear physics and nuclear astrophysics. (orig.)
Structure and bonding in clusters
International Nuclear Information System (INIS)
Kumar, V.
1991-10-01
We review here the recent progress made in the understanding of the electronic and atomic structure of small clusters of s-p bonded materials using the density functional molecular dynamics technique within the local density approximation. Starting with a brief description of the method, results are presented for alkali metal clusters, clusters of divalent metals such as Mg and Be which show a transition from van der Waals or weak chemical bonding to metallic behaviour as the cluster size grows and clusters of Al, Sn and Sb. In the case of semiconductors, we discuss results for Si, Ge and GaAs clusters. Clusters of other materials such as P, C, S, and Se are also briefly discussed. From these and other available results we suggest the possibility of unique structures for the magic clusters. (author). 69 refs, 7 figs, 1 tab
Random matrix improved subspace clustering
Couillet, Romain; Kammoun, Abla
2017-01-01
This article introduces a spectral method for statistical subspace clustering. The method is built upon standard kernel spectral clustering techniques, however carefully tuned by theoretical understanding arising from random matrix findings. We show
Eclipsing binaries in open clusters
DEFF Research Database (Denmark)
Southworth, John; Clausen, J.V.
2006-01-01
Stars: fundamental parameters - Stars: binaries: eclipsing - Stars: binaries: spectroscopic - Open clusters and associations: general. Date of publication: 5 August...
Dynamical aspects of galaxy clustering
International Nuclear Information System (INIS)
Fall, S.M.
1980-01-01
Some recent work on the origin and evolution of galaxy clustering is reviewed, particularly within the context of the gravitational instability theory and the hot big-bang cosmological model. Statistical measures of clustering, including correlation functions and multiplicity functions, are explained and discussed. The close connection between galaxy formation and clustering is emphasized. Additional topics include the dependence of galaxy clustering on the spectrum of primordial density fluctuations and the mean mass density of the Universe. (author)
[Cluster analysis in biomedical researches].
Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D
2013-01-01
Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. Cluster analysis reveals the internal structure of the data, grouping separate observations by their degree of similarity. The review defines the basic concepts of cluster analysis and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, and Kohonen network algorithms. Examples of the use of these algorithms in biomedical research are given.
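As an illustration of the first algorithm named in this review, the following is a minimal sketch of k-means using the standard Lloyd iteration. The function names, the random initialisation from data points, and the toy data are assumptions made for the example and are not taken from the review.

```python
import numpy as np


def nearest_center(points, centers):
    # Squared distance from every point to every center, shape (n, k).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm: alternate assignment and mean-update steps."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Start from k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = nearest_center(points, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return nearest_center(points, centers), centers


# Hypothetical observations forming two well-separated groups.
data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = kmeans(data, k=2)
print(labels)
```

The result depends on the initial centers in general, so practical use typically repeats the run from several random starts and keeps the partition with the lowest within-cluster sum of squares.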
THE ASSEMBLY OF GALAXY CLUSTERS
International Nuclear Information System (INIS)
Berrier, Joel C.; Stewart, Kyle R.; Bullock, James S.; Purcell, Chris W.; Barton, Elizabeth J.; Wechsler, Risa H.
2009-01-01
We study the formation of 53 galaxy cluster-size dark matter halos (M = 10^14.0-14.76 M_sun) formed within a pair of cosmological Λ cold dark matter N-body simulations, and track the accretion histories of cluster subhalos with masses large enough to host ∼0.3 L* galaxies. By associating subhalos with cluster galaxies, we find the majority of galaxies in clusters experience no 'preprocessing' in the group environment prior to their accretion into the cluster. On average, 70% of cluster galaxies fall into the cluster potential directly from the field, with no luminous companions in their host halos at the time of accretion; less than 12% are accreted as members of groups with five or more galaxies. Moreover, we find that cluster galaxies are significantly less likely to have experienced a merger in the recent past (≲6 Gyr) than a field halo of the same mass. These results suggest that local cluster processes such as ram pressure stripping, galaxy harassment, or strangulation play the dominant role in explaining the difference between cluster and field populations at a fixed stellar mass, and that pre-evolution or past merging in the group environment is of secondary importance for setting cluster galaxy properties for most clusters. The accretion times for z = 0 cluster members are quite extended, with ∼20% incorporated into the cluster halo more than 7 Gyr ago and ∼20% within the last 2 Gyr. By comparing the observed morphological fractions in cluster and field populations, we estimate an approximate timescale for late-type to early-type transformation within the cluster environment to be ∼6 Gyr.
Chronorisk in cluster headache
DEFF Research Database (Denmark)
Barloese, Mads; Haddock, Bryan; Lund, Nunu T
2018-01-01
and a spectral analysis identifying oscillations in risk. Results The Gaussian model fit for the chronorisk distribution for all patients reporting diurnal rhythmicity (n = 286) had a goodness of fit R2 value of 0.97 and identified three times of increased risk peaking at 21:41, 02:02 and 06:23 hours....... In subgroups, three to five modes of increased risk were found and goodness of fit values ranged from 0.85-0.99. Spectral analysis revealed multiple distinct oscillation frequencies in chronorisk in subgroups including a dominant circadian oscillation in episodic patients and an ultradian in chronic....... Conclusions Chronorisk in cluster headache can be characterised as a sum of individual, timed events of increased risk, each having a Gaussian distribution. In episodic cluster headache, attacks follow a circadian rhythmicity whereas, in the chronic variant, ultradian oscillations are dominant reflecting...
DEFF Research Database (Denmark)
Lund-Thomsen, Peter; Pillay, Renginee G.
2012-01-01
Purpose – The paper seeks to review the literature on CSR in industrial clusters in developing countries, identifying the main strengths, weaknesses, and gaps in this literature, pointing to future research directions and policy implications in the area of CSR and industrial cluster development....... Design/methodology/approach – A literature review is conducted of both academic and policy-oriented writings that contain the keywords “industrial clusters” and “developing countries” in combination with one or more of the following terms: corporate social responsibility, environmental management, labor...... standards, child labor, climate change, social upgrading, and environmental upgrading. The authors examine the key themes in this literature, identify the main gaps, and point to areas where future work in this area could usefully be undertaken. Feedback has been sought from some of the leading authors...
Huchtmeier, W. K.; Richter, O. G.; Materne, J.
1981-09-01
The large-scale structure of the universe is dominated by clustering. Most galaxies seem to be members of pairs, groups, clusters, and superclusters. To that degree we are able to recognize a hierarchical structure of the universe. Our local group of galaxies (LG) is centred on two large spiral galaxies: the Andromeda nebula and our own galaxy. Three smaller galaxies - like M 33 - and at least 23 dwarf galaxies (Kraan-Korteweg and Tammann, 1979, Astronomische Nachrichten, 300, 181) can be found in the environment of these two large galaxies. Neighbouring groups have comparable sizes (about 1 Mpc in extent) and comparable numbers of bright members. Small dwarf galaxies cannot at present be observed at great distances.
Fractal properties of percolation clusters in Euclidian neural networks
International Nuclear Information System (INIS)
Franovic, Igor; Miljkovic, Vladimir
2009-01-01
The process of spike packet propagation is observed in two-dimensional recurrent networks, consisting of locally coupled neuron pools. Local population dynamics is characterized by three key parameters - probability for pool connectedness, synaptic strength and neuron refractoriness. The formation of dynamic attractors in our model, synfire chains, exhibits critical behavior, corresponding to percolation phase transition, with probability for non-zero synaptic strength values representing the critical parameter. Applying the finite-size scaling method, we infer a family of critical lines for various synaptic strengths and refractoriness values, and determine the Hausdorff-Besicovitch fractal dimension of the percolation clusters.
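As a rough illustration of what measuring the fractal dimension of a percolation cluster involves, the sketch below uses plain site percolation on a square lattice and a box-counting estimate. This is a simplified stand-in: the abstract's model is a network of neuron pools analysed with finite-size scaling, and the lattice size, occupation probability (set near the known square-lattice site threshold of about 0.5927) and random seed here are illustrative assumptions.

```python
import math
import random
from collections import deque

def largest_cluster(grid):
    """Return the sites of the largest 4-connected cluster of occupied cells."""
    n = len(grid)
    seen = [[False] * n for _ in range(n)]
    best = []
    for i in range(n):
        for j in range(n):
            if grid[i][j] and not seen[i][j]:
                seen[i][j] = True
                queue = deque([(i, j)])
                cluster = []
                while queue:  # breadth-first flood fill
                    x, y = queue.popleft()
                    cluster.append((x, y))
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        u, v = x + dx, y + dy
                        if 0 <= u < n and 0 <= v < n and grid[u][v] and not seen[u][v]:
                            seen[u][v] = True
                            queue.append((u, v))
                if len(cluster) > len(best):
                    best = cluster
    return best

def box_counting_dimension(points, n):
    """Least-squares slope of log(box count) against log(1 / box size)."""
    xs, ys = [], []
    size = 1
    while size < n:
        boxes = {(x // size, y // size) for x, y in points}
        xs.append(math.log(1.0 / size))
        ys.append(math.log(len(boxes)))
        size *= 2
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)

# Illustrative run: 64 x 64 lattice, occupation probability near threshold.
random.seed(0)
N, P = 64, 0.5927
lattice = [[random.random() < P for _ in range(N)] for _ in range(N)]
dimension_estimate = box_counting_dimension(largest_cluster(lattice), N)
```

On a lattice this small the estimate is noisy; a filled square gives 2 and a straight line gives 1, which serves as a sanity check on the estimator.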
GAMMA RAYS FROM STAR FORMATION IN CLUSTERS OF GALAXIES
International Nuclear Information System (INIS)
Storm, Emma M.; Jeltema, Tesla E.; Profumo, Stefano
2012-01-01
Star formation in galaxies is observed to be associated with gamma-ray emission, presumably from non-thermal processes connected to the acceleration of cosmic-ray nuclei and electrons. The detection of gamma rays from starburst galaxies by the Fermi Large Area Telescope (LAT) has allowed the determination of a functional relationship between star formation rate and gamma-ray luminosity. Since star formation is known to scale with total infrared (8-1000 μm) and radio (1.4 GHz) luminosity, the observed infrared and radio emission from a star-forming galaxy can be used to quantitatively infer the galaxy's gamma-ray luminosity. Similarly, star-forming galaxies within galaxy clusters allow us to derive lower limits on the gamma-ray emission from clusters, which have not yet been conclusively detected in gamma rays. In this study, we apply the functional relationships between gamma-ray luminosity and radio and IR luminosities of galaxies derived by the Fermi Collaboration to a sample of the best candidate galaxy clusters for detection in gamma rays in order to place lower limits on the gamma-ray emission associated with star formation in galaxy clusters. We find that several clusters have predicted gamma-ray emission from star formation that is within an order of magnitude of the upper limits derived in Ackermann et al. based on non-detection by Fermi-LAT. Given the current gamma-ray limits, star formation likely plays a significant role in the gamma-ray emission in some clusters, especially those with cool cores. We predict that both Fermi-LAT over the course of its lifetime and the future Cerenkov Telescope Array will be able to detect gamma-ray emission from star-forming galaxies in clusters.
Text Clustering Algorithm Based on Random Cluster Core
Directory of Open Access Journals (Sweden)
Huang Long-Jun
2016-01-01
Clustering has become a popular text mining technique, but very large data sets place higher demands on the accuracy and performance of text mining. To address the performance bottleneck of traditional text clustering algorithms, this paper proposes a text clustering algorithm with random features. The algorithm is based on text density and uses neighboring heuristic rules; it also introduces the concept of a random cluster, which effectively reduces the complexity of the distance calculation.
On clusters and clustering from atoms to fractals
Reynolds, PJ
1993-01-01
This book attempts to answer why there is so much interest in clusters. Clusters occur on all length scales, and as a result occur in a variety of fields. Clusters are interesting scientifically, but they also have important consequences technologically. The division of the book into three parts roughly separates the field into small, intermediate, and large-scale clusters. Small clusters are the regime of atomic and molecular physics and chemistry. The intermediate regime is the transitional regime, with its characteristics including the onset of bulk-like behavior, growth and aggregation, a
Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data
Directory of Open Access Journals (Sweden)
Wei Du
2013-01-01
Phylogenetic trees are used to represent the evolutionary relationship among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies using multiple types of genomic information is proposed. The method is called CGCPhy and is based on the distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes are determined by sequence similarity, genomic function, and genomic structure information. Second, genes involving potential HGT events are eliminated, since such genes are considered to be the highly conserved genes across different species and the genes located on fragments with an abnormal genome barcode. Third, we calculate the distance of the orthologous gene clusters between each genome pair in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across different species. CGCPhy has been examined on different datasets from 617 complete single-chromosome prokaryotic genomes and achieved practical accuracies on different species sets in agreement with Bergey's taxonomy in quartet topologies. Simulation results show that CGCPhy achieves high average accuracy and has a low standard deviation on different datasets, so it has practical potential for phylogenetic analysis.
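The final step, neighbor joining on a pairwise distance matrix, can be sketched as follows. This is a textbook version of the algorithm, not the CGCPhy implementation; the abstract does not give the distance formula, so the matrix is assumed to be supplied, and the four-taxon matrix, the function name and the internal node labels (n1, n2, ...) are hypothetical.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Build an unrooted tree by neighbor joining.

    names: list of leaf labels (must not collide with internal labels 'n1', 'n2', ...).
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a list of (node, node, branch_length) edges.
    """
    d = dict(dist)
    nodes = list(names)
    edges = []
    counter = 0

    def get(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Row sums used by the Q criterion.
        r = {a: sum(get(a, b) for b in nodes if b != a) for a in nodes}
        # Pick the pair minimising Q(i, j) = (n - 2) d(i, j) - r_i - r_j.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * get(*p) - r[p[0]] - r[p[1]],
        )
        counter += 1
        u = f"n{counter}"  # new internal node
        li = 0.5 * get(i, j) + (r[i] - r[j]) / (2 * (n - 2))
        lj = get(i, j) - li
        edges += [(i, u, li), (j, u, lj)]
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (get(i, k) + get(j, k) - get(i, j))
        nodes = [k for k in nodes if k not in (i, j)] + [u]

    a, b = nodes
    edges.append((a, b, get(a, b)))
    return edges

# Toy additive distances for the tree ((A:1, B:2):5, (C:3, D:4)).
toy = {
    frozenset(("A", "B")): 3, frozenset(("A", "C")): 9, frozenset(("A", "D")): 10,
    frozenset(("B", "C")): 10, frozenset(("B", "D")): 11, frozenset(("C", "D")): 7,
}
toy_edges = neighbor_joining(["A", "B", "C", "D"], toy)
```

Because the toy distances are exactly additive, the recovered branch lengths match the tree used to generate them; real genome-pair distances would generally not be additive, and the result would then be an approximation.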