protohox cluster inferred: Topics by WorldWideScience.org

Sample records for protohox cluster inferred

Phylogenetic Inference of HIV Transmission Clusters

Directory of Open Access Journals (Sweden)

Vlad Novitsky

2017-10-01

Better understanding the structure and dynamics of HIV transmission networks is essential for designing the most efficient interventions to prevent new HIV transmissions, and ultimately for gaining control of the HIV epidemic. The inference of phylogenetic relationships and the interpretation of results rely on the definition of the HIV transmission cluster. The definition of the HIV cluster is complex and dependent on multiple factors, including the design of sampling, accuracy of sequencing, precision of sequence alignment, evolutionary models, the phylogenetic method of inference, and specified thresholds for cluster support. While the majority of studies focus on clusters, non-clustered cases could also be highly informative. A new dimension in the analysis of the global and local HIV epidemics is the concept of phylogenetically distinct HIV sub-epidemics. The identification of active HIV sub-epidemics reveals spreading viral lineages and may help in the design of targeted interventions. HIV clustering can also be affected by sampling density. Obtaining a proper sampling density may increase statistical power and reduce sampling bias, so sampling density should be taken into account in study design and in interpretation of phylogenetic results. Finally, recent advances in long-range genotyping may enable more accurate inference of HIV transmission networks. If performed in real time, it could both inform public-health strategies and be clinically relevant (e.g., drug-resistance testing).
Robust Inference with Multi-way Clustering

OpenAIRE

A. Colin Cameron; Jonah B. Gelbach; Douglas L. Miller; Doug Miller

2009-01-01

In this paper we propose a variance estimator for the OLS estimator as well as for nonlinear estimators such as logit, probit and GMM. This variance estimator enables cluster-robust inference when there is two-way or multi-way clustering that is non-nested. The variance estimator extends the standard cluster-robust variance estimator or sandwich estimator for one-way clustering (e.g. Liang and Zeger (1986), Arellano (1987)) and relies on similar relatively weak distributional assumptions. Our...
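
The idea of extending the one-way sandwich estimator to two non-nested clustering dimensions can be sketched for OLS as follows. This is a minimal illustration, not the authors' code: it assumes the usual inclusion-exclusion construction (the one-way "meat" for each dimension, minus the meat for the intersection of the two), omits all finite-sample corrections, and the function names `cluster_meat` and `twoway_cluster_vcov` are hypothetical.

```python
import numpy as np

def cluster_meat(X, resid, groups):
    """Sum over clusters g of (X_g' u_g)(X_g' u_g)', the 'meat' of the sandwich."""
    groups = np.asarray(groups)
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        mask = groups == g
        score = X[mask].T @ resid[mask]  # summed score vector for cluster g
        meat += np.outer(score, score)
    return meat

def twoway_cluster_vcov(X, y, g1, g2):
    """OLS coefficients and a two-way cluster-robust covariance (no small-sample fix)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Label each observation by its (g1, g2) pair: the intersection clusters.
    pairs = list(zip(np.asarray(g1).tolist(), np.asarray(g2).tolist()))
    pair_id = {p: i for i, p in enumerate(dict.fromkeys(pairs))}
    g12 = np.array([pair_id[p] for p in pairs])
    # Inclusion-exclusion: add both one-way terms, subtract the double-counted overlap.
    meat = (cluster_meat(X, resid, g1)
            + cluster_meat(X, resid, g2)
            - cluster_meat(X, resid, g12))
    return beta, XtX_inv @ meat @ XtX_inv
```

When both dimensions coincide (`g1` equal to `g2`), the three terms collapse to the ordinary one-way estimator, which is a quick sanity check on any implementation. In small samples the resulting matrix is not guaranteed to be positive semi-definite.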
Likelihood-based inference for clustered line transect data

DEFF Research Database (Denmark)

Waagepetersen, Rasmus Plenge; Schweder, Tore

The uncertainty in estimation of spatial animal density from line transect surveys depends on the degree of spatial clustering in the animal population. To quantify the clustering we model line transect data as independent thinnings of spatial shot-noise Cox processes. Likelihood-based inference...
Likelihood-based inference for clustered line transect data

DEFF Research Database (Denmark)

Waagepetersen, Rasmus; Schweder, Tore

2006-01-01

The uncertainty in estimation of spatial animal density from line transect surveys depends on the degree of spatial clustering in the animal population. To quantify the clustering we model line transect data as independent thinnings of spatial shot-noise Cox processes. Likelihood-based inference...
Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering

Science.gov (United States)

D'haeseleer, Patrik; Liang, Shoudan; Somogyi, Roland

2000-01-01

Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review data mining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.
Modulated modularity clustering as an exploratory tool for functional genomic inference.

Directory of Open Access Journals (Sweden)

Eric A Stone

2009-05-01

In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and clustering genes is a common way to reduce scale and improve inference. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC), seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity) that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation.
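
To make the "modularity" objective concrete, the sketch below evaluates the standard Newman modularity of a given partition of a weighted, undirected graph. It is an assumption that this is the form of the objective meant here, and the sketch covers only the scoring step: MMC's modulation of edge weights and its maximization procedure are not shown. The function name `modularity` is illustrative.

```python
import numpy as np

def modularity(A, labels):
    """Newman modularity Q of a partition.

    A      : symmetric (n, n) array of non-negative edge weights
    labels : length-n array of community labels
    Q = (1 / 2m) * sum_ij [A_ij - k_i k_j / (2m)] * [labels_i == labels_j]
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    two_m = A.sum()                      # total weight, each edge counted twice
    k = A.sum(axis=1)                    # weighted degree of each vertex
    same = labels[:, None] == labels[None, :]
    expected = np.outer(k, k) / two_m    # expected weight under the null model
    return float(((A - expected) * same).sum() / two_m)
```

For two triangles joined by a single edge, splitting the graph into the two triangles gives Q = 5/14, while putting every vertex in one community gives Q = 0, so the natural split scores higher.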
TreeCluster: Massively scalable transmission clustering using phylogenetic trees

OpenAIRE

Moshiri, Alexander

2018-01-01

Background: The ability to infer transmission clusters from molecular data is critical to designing and evaluating viral control strategies. Viral sequencing datasets are growing rapidly, but standard methods of transmission cluster inference do not scale well beyond thousands of sequences. Results: I present TreeCluster, a cross-platform tool that performs transmission cluster inference on a given phylogenetic tree orders of magnitude faster than existing inference methods and supports multi...
Field line distribution of density at L=4.8 inferred from observations by CLUSTER

Directory of Open Access Journals (Sweden)

S. Schäfer

2009-02-01

For two events observed by the CLUSTER spacecraft, the field line distribution of mass density ρ was inferred from Alfvén wave harmonic frequencies and compared to the electron density ne from plasma wave data and the oxygen density nO+ from the ion composition experiment. In one case, the average ion mass M≈ρ/ne was about 5 amu (28 October 2002), while in the other it was about 3 amu (10 September 2002). Both events occurred when the CLUSTER 1 (C1) spacecraft was in the plasmatrough. Nevertheless, the electron density ne was significantly lower for the first event (ne=8 cm^−3) than for the second event (ne=22 cm^−3), and this seems to be the main difference leading to a different value of M. For the first event (28 October 2002), we were able to measure the Alfvén wave frequencies for eight harmonics with unprecedented precision, so that the error in the inferred mass density is probably dominated by factors other than the uncertainty in frequency (e.g., magnetic field model and theoretical wave equation). This field line distribution (at L=4.8) was very flat for magnetic latitude |MLAT|≲20° but very steeply increasing with respect to |MLAT| for |MLAT|≳40°. The total variation in ρ was about four orders of magnitude, with values at large |MLAT| roughly consistent with ionospheric values. For the second event (10 September 2002), there was a small local maximum in mass density near the magnetic equator. The inferred mass density decreases to a minimum 23% lower than the equatorial value at |MLAT|=15.5°, and then steeply increases as one moves along the field line toward the ionosphere. For this event we were also able to examine the spatial dependence of the electron density using measurements of ne from all four CLUSTER spacecraft. Our analysis indicates that the density varies with L at L~5 roughly like L^−4, and that ne is also locally peaked at the magnetic equator, but with a smaller peak. The value of ne reaches a density minimum
Time clustered sampling can inflate the inferred substitution rate in foot-and-mouth disease virus analyses

DEFF Research Database (Denmark)

Pedersen, Casper-Emil Tingskov; Frandsen, Peter; Wekesa, Sabenzia N.

2015-01-01

abundance of sequence data sampled under widely different schemes, an effort to keep results consistent and comparable is needed. This study emphasizes commonly disregarded problems in the inference of evolutionary rates in viral sequence data when sampling is unevenly distributed on a temporal scale...... through a study of the foot-and-mouth (FMD) disease virus serotypes SAT 1 and SAT 2. Our study shows that clustered temporal sampling in phylogenetic analyses of FMD viruses will strongly bias the inferences of substitution rates and tMRCA because the inferred rates in such data sets reflect a rate closer...... to the mutation rate rather than the substitution rate. Estimating evolutionary parameters from viral sequences should be performed with due consideration of the differences in short-term and longer-term evolutionary processes occurring within sets of temporally sampled viruses, and studies should carefully...
Data-driven inference for the spatial scan statistic.

Science.gov (United States)

Almeida, Alexandre C L; Duarte, Anderson R; Duczmal, Luiz H; Oliveira, Fernando L P; Takahashi, Ricardo H C

2011-08-02

Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is done, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based in this inference. A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
Data-driven inference for the spatial scan statistic

Directory of Open Access Journals (Sweden)

Duczmal Luiz H

2011-08-01

Background: Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. Results: A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is done, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based in this inference. Conclusions: A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
Time Clustered Sampling Can Inflate the Inferred Substitution Rate in Foot-And-Mouth Disease Virus Analyses.

Science.gov (United States)

Pedersen, Casper-Emil T; Frandsen, Peter; Wekesa, Sabenzia N; Heller, Rasmus; Sangula, Abraham K; Wadsworth, Jemma; Knowles, Nick J; Muwanika, Vincent B; Siegismund, Hans R

2015-01-01

With the emergence of analytical software for the inference of viral evolution, a number of studies have focused on estimating important parameters such as the substitution rate and the time to the most recent common ancestor (tMRCA) for rapidly evolving viruses. Coupled with an increasing abundance of sequence data sampled under widely different schemes, an effort to keep results consistent and comparable is needed. This study emphasizes commonly disregarded problems in the inference of evolutionary rates in viral sequence data when sampling is unevenly distributed on a temporal scale through a study of the foot-and-mouth (FMD) disease virus serotypes SAT 1 and SAT 2. Our study shows that clustered temporal sampling in phylogenetic analyses of FMD viruses will strongly bias the inferences of substitution rates and tMRCA because the inferred rates in such data sets reflect a rate closer to the mutation rate rather than the substitution rate. Estimating evolutionary parameters from viral sequences should be performed with due consideration of the differences in short-term and longer-term evolutionary processes occurring within sets of temporally sampled viruses, and studies should carefully consider how samples are combined.
A stepwise-cluster microbial biomass inference model in food waste composting

International Nuclear Information System (INIS)

Sun Wei; Huang, Guo H.; Zeng Guangming; Qin Xiaosheng; Sun Xueling

2009-01-01

A stepwise-cluster microbial biomass inference (SMI) model was developed through introducing stepwise-cluster analysis (SCA) into composting process modeling to tackle the nonlinear relationships among state variables and microbial activities. The essence of SCA is to form a classification tree based on a series of cutting or mergence processes according to given statistical criteria. Eight runs of designed experiments in bench-scale reactors in a laboratory were constructed to demonstrate the feasibility of the proposed method. The results indicated that SMI could help establish a statistical relationship between state variables and composting microbial characteristics, where discrete and nonlinear complexities exist. Significance levels of cutting/merging were provided such that the accuracies of the developed forecasting trees were controllable. Through an attempted definition of input effects on the output in SMI, the effects of the state variables on thermophilic bacteria were ranked in descending order as: Time (day) > moisture content (%) > ash content (%, dry) > Lower Temperature (deg. C) > pH > NH4+-N (mg/kg, dry) > Total N (%, dry) > Total C (%, dry); the effects on mesophilic bacteria were ordered as: Time > Upper Temperature (deg. C) > Total N > moisture content > NH4+-N > Total C > pH. This study made the first attempt in applying SCA to mapping the nonlinear and discrete relationships in composting processes.
Approximation Of Multi-Valued Inverse Functions Using Clustering And Sugeno Fuzzy Inference

Science.gov (United States)

Walden, Maria A.; Bikdash, Marwan; Homaifar, Abdollah

1998-01-01

Finding the inverse of a continuous function can be challenging and computationally expensive when the inverse function is multi-valued. Difficulties may be compounded when the function itself is difficult to evaluate. We show that we can use fuzzy-logic approximators such as Sugeno inference systems to compute the inverse on-line. To do so, a fuzzy clustering algorithm can be used in conjunction with a discriminating function to split the function data into branches for the different values of the forward function. These data sets are then fed into a recursive least-squares learning algorithm that finds the proper coefficients of the Sugeno approximators; each Sugeno approximator finds one value of the inverse function. Discussions about the accuracy of the approximation will be included.
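
How a Sugeno approximator turns a set of rules into a single output can be sketched as follows. This is a minimal one-input, first-order illustration under stated assumptions: Gaussian membership functions, rule consequents of the form a*x + b, and the hypothetical names `sugeno_predict`, `centers`, `widths` and `coefs`. The clustering, branch-splitting and recursive least-squares steps described in the abstract are not shown.

```python
import numpy as np

def sugeno_predict(x, centers, widths, coefs):
    """Evaluate a first-order Sugeno system with one input.

    centers, widths : shape (R,) parameters of the Gaussian membership of each rule
    coefs           : shape (R, 2), rule r predicts coefs[r, 0] * x + coefs[r, 1]
    Output is the firing-strength-weighted average of the rule predictions.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    centers = np.asarray(centers, dtype=float)
    widths = np.asarray(widths, dtype=float)
    coefs = np.asarray(coefs, dtype=float)
    # Firing strength of every rule at every input point, shape (N, R).
    w = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / widths[None, :]) ** 2)
    # Linear consequent of every rule at every input point, shape (N, R).
    f = coefs[None, :, 0] * x[:, None] + coefs[None, :, 1]
    return (w * f).sum(axis=1) / w.sum(axis=1)
```

In the setting of the abstract, one such approximator would be fitted per branch of the multi-valued inverse, with `coefs` being the quantities a least-squares learner estimates once the memberships are fixed.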
Cluster-level statistical inference in fMRI datasets: The unexpected behavior of random fields in high dimensions.

Science.gov (United States)

Bansal, Ravi; Peterson, Bradley S

2018-06-01

Identifying regional effects of interest in MRI datasets usually entails testing a priori hypotheses across many thousands of brain voxels, requiring control for false positive findings in these multiple hypotheses testing. Recent studies have suggested that parametric statistical methods may have incorrectly modeled functional MRI data, thereby leading to higher false positive rates than their nominal rates. Nonparametric methods for statistical inference when conducting multiple statistical tests, in contrast, are thought to produce false positives at the nominal rate, which has thus led to the suggestion that previously reported studies should reanalyze their fMRI data using nonparametric tools. To understand better why parametric methods may yield excessive false positives, we assessed their performance when applied both to simulated datasets of 1D, 2D, and 3D Gaussian Random Fields (GRFs) and to 710 real-world, resting-state fMRI datasets. We showed that both the simulated 2D and 3D GRFs and the real-world data contain a small percentage (<6%) of very large clusters (on average 60 times larger than the average cluster size), which were not present in 1D GRFs. These unexpectedly large clusters were deemed statistically significant using parametric methods, leading to empirical familywise error rates (FWERs) as high as 65%: the high empirical FWERs were not a consequence of parametric methods failing to model spatial smoothness accurately, but rather of these very large clusters that are inherently present in smooth, high-dimensional random fields. In fact, when discounting these very large clusters, the empirical FWER for parametric methods was 3.24%. Furthermore, even an empirical FWER of 65% would yield on average less than one of those very large clusters in each brain-wide analysis. Nonparametric methods, in contrast, estimated distributions from those large clusters, and therefore, by construct rejected the large clusters as false positives at the nominal
Model-based and design-based inference goals frame how to account for neighborhood clustering in studies of health in overlapping context types.

Science.gov (United States)

Lovasi, Gina S; Fink, David S; Mooney, Stephen J; Link, Bruce G

2017-12-01

Accounting for non-independence in health research often warrants attention. Particularly, the availability of geographic information systems data has increased the ease with which studies can add measures of the local "neighborhood" even if participant recruitment was through other contexts, such as schools or clinics. We highlight a tension between two perspectives that is often present, but particularly salient when more than one type of potentially health-relevant context is indexed (e.g., both neighborhood and school). On the one hand, a model-based perspective emphasizes the processes producing outcome variation, and observed data are used to make inference about that process. On the other hand, a design-based perspective emphasizes inference to a well-defined finite population, and is commonly invoked by those using complex survey samples or those with responsibility for the health of local residents. These two perspectives have divergent implications when deciding whether clustering must be accounted for analytically and how to select among candidate cluster definitions, though the perspectives are by no means monolithic. There are tensions within each perspective as well as between perspectives. We aim to provide insight into these perspectives and their implications for population health researchers. We focus on the crucial step of deciding which cluster definition or definitions to use at the analysis stage, as this has consequences for all subsequent analytic and interpretational challenges with potentially clustered data.
Bootstrap-Based Improvements for Inference with Clustered Errors

OpenAIRE

Doug Miller; A. Colin Cameron; Jonah B. Gelbach

2006-01-01

Microeconometrics researchers have increasingly realized the essential need to account for any within-group dependence in estimating standard errors of regression parameter estimates. The typical preferred solution is to calculate cluster-robust or sandwich standard errors that permit quite general heteroskedasticity and within-cluster error correlation, but presume that the number of clusters is large. In applications with few (5-30) clusters, standard asymptotic tests can over-reject consid...
Convex Clustering: An Attractive Alternative to Hierarchical Clustering

Science.gov (United States)

Chen, Gary K.; Chi, Eric C.; Ranola, John Michael O.; Lange, Kenneth

2015-01-01

The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/ PMID:25965340
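
The objective function that convex clustering minimizes can be illustrated with a short sketch. It assumes the commonly used form, a squared-error fit term plus a weighted sum of pairwise distances between cluster centers, and only evaluates that objective; it does not implement the proximal distance algorithm or CONVEXCLUSTER. The symbols `U`, `W` and `gamma` are names chosen for illustration.

```python
import numpy as np

def convex_clustering_objective(X, U, W, gamma):
    """Value of 0.5 * sum_i ||x_i - u_i||^2 + gamma * sum_{i<j} w_ij ||u_i - u_j||.

    X     : (n, p) data points
    U     : (n, p) one candidate center per data point
    W     : (n, n) symmetric non-negative pair weights
    gamma : penalty strength; larger values fuse more centers together
    """
    X = np.asarray(X, dtype=float)
    U = np.asarray(U, dtype=float)
    W = np.asarray(W, dtype=float)
    fit = 0.5 * np.sum((X - U) ** 2)
    i, j = np.triu_indices(X.shape[0], k=1)        # each unordered pair once
    penalty = np.sum(W[i, j] * np.linalg.norm(U[i] - U[j], axis=1))
    return fit + gamma * penalty
```

At `gamma = 0` the minimizer is `U = X` (every point is its own cluster). As `gamma` grows, centers merge until all rows of `U` coincide; tracing the minimizer over `gamma` gives the solution path mentioned in the abstract.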
Model-based and design-based inference goals frame how to account for neighborhood clustering in studies of health in overlapping context types

Directory of Open Access Journals (Sweden)

Gina S. Lovasi

2017-12-01

Accounting for non-independence in health research often warrants attention. Particularly, the availability of geographic information systems data has increased the ease with which studies can add measures of the local “neighborhood” even if participant recruitment was through other contexts, such as schools or clinics. We highlight a tension between two perspectives that is often present, but particularly salient when more than one type of potentially health-relevant context is indexed (e.g., both neighborhood and school). On the one hand, a model-based perspective emphasizes the processes producing outcome variation, and observed data are used to make inference about that process. On the other hand, a design-based perspective emphasizes inference to a well-defined finite population, and is commonly invoked by those using complex survey samples or those with responsibility for the health of local residents. These two perspectives have divergent implications when deciding whether clustering must be accounted for analytically and how to select among candidate cluster definitions, though the perspectives are by no means monolithic. There are tensions within each perspective as well as between perspectives. We aim to provide insight into these perspectives and their implications for population health researchers. We focus on the crucial step of deciding which cluster definition or definitions to use at the analysis stage, as this has consequences for all subsequent analytic and interpretational challenges with potentially clustered data.
An Application of Fuzzy Inference System by Clustering Subtractive Fuzzy Method for Estimating of Product Requirement

Directory of Open Access Journals (Sweden)

Fajar Ibnu Tufeil

2009-06-01

Fuzzy models are able to describe linguistically a system that is too complex to describe otherwise. The rules in a fuzzy model are generally built from human expertise and heuristic knowledge of the system being modeled. This technique was subsequently developed into one that can identify rules from a database that has been grouped according to structural similarity. Here, the fuzzy clustering method serves to find groups of data. The information produced by the clustering method, namely the cluster centers, is used to form the rules of the fuzzy inference system. This thesis discusses the application of a fuzzy inference system with the fuzzy subtractive clustering method, namely to build a fuzzy inference system using a first-order Takagi-Sugeno fuzzy model. The fuzzy subtractive clustering method is then applied to model a marketing problem, namely predicting market demand for a dairy product. The application was built using Borland Delphi 6.0. Testing gave a smallest prediction error with an Error Average of 0.08%.

The cluster bootstrap consistency in generalized estimating equations

KAUST Repository

Cheng, Guang

2013-03-01

The cluster bootstrap resamples clusters or subjects instead of individual observations in order to preserve the dependence within each cluster or subject. In this paper, we provide a theoretical justification of using the cluster bootstrap for the inferences of the generalized estimating equations (GEE) for clustered/longitudinal data. Under the general exchangeable bootstrap weights, we show that the cluster bootstrap yields a consistent approximation of the distribution of the regression estimate, and a consistent approximation of the confidence sets. We also show that a computationally more efficient one-step version of the cluster bootstrap provides asymptotically equivalent inference. © 2012.
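
The resampling scheme described above, drawing whole clusters rather than individual observations, can be sketched in a few lines. This is a generic illustration with classic resampling with replacement, not the paper's GEE procedure or its one-step variant; `estimator` stands for any user-supplied function of the resampled rows, and all names are hypothetical.

```python
import numpy as np

def cluster_bootstrap(data, cluster_ids, estimator, n_boot=500, seed=0):
    """Bootstrap distribution of `estimator`, resampling clusters with replacement.

    data        : (n, ...) array with one row per observation
    cluster_ids : length-n array giving each row's cluster (or subject)
    estimator   : callable applied to the resampled rows
    """
    data = np.asarray(data)
    cluster_ids = np.asarray(cluster_ids)
    rng = np.random.default_rng(seed)
    ids = np.unique(cluster_ids)
    rows_by_cluster = {c: np.flatnonzero(cluster_ids == c) for c in ids}
    draws = []
    for _ in range(n_boot):
        # Draw as many clusters as in the sample; a cluster may repeat.
        sampled = rng.choice(ids, size=len(ids), replace=True)
        # Keep every row of each drawn cluster so within-cluster dependence is intact.
        rows = np.concatenate([rows_by_cluster[c] for c in sampled])
        draws.append(estimator(data[rows]))
    return np.asarray(draws)
```

A percentile interval could then be read off the returned draws, for example with `np.percentile(draws, [2.5, 97.5])`.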
Identification of Neurological Disorders Using the Adaptive Neuro Fuzzy Inference System (ANFIS) Method

Directory of Open Access Journals (Sweden)

Jani Kusanti

2015-07-01

The Adaptive Neuro Fuzzy Inference System (ANFIS) method is used to identify a neurological disorder of the head, known medically as ischemic stroke, from CT scans of the head, in order to locate the ischemic stroke. Identification begins by extracting features from the head CT image using a histogram. The intensity-histogram image is then enhanced with an Otsu threshold, so that pixels valued 1 correspond to the object and pixels valued 0 correspond to the background. The result is passed to image clustering with fuzzy c-means (FCM); the clustering yields a row of cluster centers, which are used to construct a fuzzy inference system (FIS). The fuzzy inference system applied is the Takagi-Sugeno-Kang model. In this study ANFIS is used to optimize the determination of the location of the blockage in ischemic stroke, with a recursive least squares estimator (RLSE) used for learning. The RMSE obtained in training was 0.0432053, while testing produced an accuracy of 98.66%. Keywords: Ischemic Stroke, Global threshold, Fuzzy Inference System model Sugeno, ANFIS, RMSE
Cluster evolution

International Nuclear Information System (INIS)

Schaeffer, R.

1987-01-01

The galaxy and cluster luminosity functions are constructed from a model of the mass distribution based on hierarchical clustering at an epoch where the matter distribution is non-linear. These luminosity functions are seen to reproduce the present distribution of objects as can be inferred from the observations. They can be used to deduce the redshift dependence of the cluster distribution and to extrapolate the observations towards the past. The predicted evolution of the cluster distribution is quite strong, although somewhat less rapid than predicted by the linear theory.
Likelihood-Based Inference of B Cell Clonal Families.

Directory of Open Access Journals (Sweden)

Duncan K Ralph

2016-10-01

The human immune system depends on a highly diverse collection of antibody-making B cells. B cell receptor sequence diversity is generated by a random recombination process called "rearrangement" forming progenitor B cells, then a Darwinian process of lineage diversification and selection called "affinity maturation." The resulting receptors can be sequenced in high throughput for research and diagnostics. Such a collection of sequences contains a mixture of various lineages, each of which may be quite numerous, or may consist of only a single member. As a step to understanding the process and result of this diversification, one may wish to reconstruct lineage membership, i.e. to cluster sampled sequences according to which came from the same rearrangement events. We call this clustering problem "clonal family inference." In this paper we describe and validate a likelihood-based framework for clonal family inference based on a multi-hidden Markov Model (multi-HMM) framework for B cell receptor sequences. We describe an agglomerative algorithm to find a maximum likelihood clustering, two approximate algorithms with various trade-offs of speed versus accuracy, and a third, fast algorithm for finding specific lineages. We show that under simulation these algorithms greatly improve upon existing clonal family inference methods, and that they also give significantly different clusters than previous methods when applied to two real data sets.
Inferring time-varying network topologies from gene expression data.

Science.gov (United States)

Rao, Arvind; Hero, Alfred O; States, David J; Engel, James Douglas

2007-01-01

Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster--to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence.
A new fast method for inferring multiple consensus trees using k-medoids.

Science.gov (United States)

Tahiri, Nadia; Willems, Matthieu; Makarenkov, Vladimir

2018-04-05

Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant. We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms. The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees having similar intragroup and different intergroup evolutionary histories. The main advantage of our method is that it is much faster than the existing tree clustering approaches, while
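The partitioning step described in this abstract can be illustrated with a minimal k-medoids sketch. It assumes the trees have already been reduced to a symmetric matrix of pairwise tree distances (for instance a Robinson-Foulds distance, which is an assumption here since the abstract does not name the metric). The function name `k_medoids` and the toy matrix are hypothetical, and this is the plain alternating scheme, not the authors' implementation or their adapted validity indices.

```python
import random


def k_medoids(dist, k, seed=0, max_iter=100):
    """Alternating k-medoids on a precomputed symmetric distance matrix.

    dist[i][j] is the distance between items i and j (here: between trees).
    Returns (sorted medoid indices, list of clusters as sorted index lists).
    """
    n = len(dist)
    rng = random.Random(seed)
    medoids = rng.sample(range(n), k)  # random initial medoids
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each item joins its nearest medoid;
        # a medoid always stays in its own cluster.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            nearest = i if i in clusters else min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to the other members.
        new_medoids = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break  # converged: medoids did not change
        medoids = new_medoids
    return sorted(medoids), [sorted(c) for c in clusters.values()]


# Toy example: six "trees" forming two well-separated groups.
toy_dist = [
    [0, 1, 1, 10, 10, 10],
    [1, 0, 1, 10, 10, 10],
    [1, 1, 0, 10, 10, 10],
    [10, 10, 10, 0, 1, 1],
    [10, 10, 10, 1, 0, 1],
    [10, 10, 10, 1, 1, 0],
]
toy_medoids, toy_clusters = k_medoids(toy_dist, k=2)
```

Each resulting cluster of trees could then be summarized by its own consensus tree; choosing `k` would require a validity index such as the ones the abstract mentions.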
The implementation of two stages clustering (k-means clustering and adaptive neuro fuzzy inference system) for prediction of medicine need based on medical data

Science.gov (United States)

Husein, A. M.; Harahap, M.; Aisyah, S.; Purba, W.; Muhazir, A.

2018-03-01

Medication planning aims to determine the types and amounts of medicine needed and to avoid stock-outs, based on patterns of disease. In practice, planning still relies on the ability and experience of leadership: it takes a long time, requires skill, definite disease data are difficult to obtain, good record keeping and reporting are needed, and dependence on the budget means that planning does not go well, leading to frequent shortages and excesses of medicines. In this research, we propose the Adaptive Neuro Fuzzy Inference System (ANFIS) method to predict medication needs in 2016 and 2017 based on medical data from 2015 and 2016 from two hospital sources. The analysis framework uses two approaches. The first applies ANFIS directly to a data source; the second also uses ANFIS, but after clustering with the K-Means algorithm. For both approaches, Root Mean Square Error (RMSE) values are calculated for training and testing. In testing, the proposed method gives better prediction rates than existing systems according to quantitative and qualitative evaluation; however, applying the K-Means algorithm before ANFIS affects the training time, and provides significantly better classification accuracy than without clustering.
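Both approaches in this abstract are scored by RMSE on training and testing data. A minimal sketch of that metric follows; the function name `rmse` and the numbers in the usage line are illustrative assumptions, not values from the study.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical monthly demand versus a model's forecast.
example_error = rmse([120, 95, 143], [118, 101, 140])
```

A lower RMSE on the held-out test period, rather than on the training period, is the figure that indicates predictive quality.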
Non-parametric co-clustering of large scale sparse bipartite networks on the GPU

DEFF Research Database (Denmark)

Hansen, Toke Jansen; Mørup, Morten; Hansen, Lars Kai

2011-01-01

of row and column clusters from a hypothesis space of an infinite number of clusters. To reach large scale applications of co-clustering we exploit that parameter inference for co-clustering is well suited for parallel computing. We develop a generic GPU framework for efficient inference on large scale...... sparse bipartite networks and achieve a speedup of two orders of magnitude compared to estimation based on conventional CPUs. In terms of scalability we find for networks with more than 100 million links that reliable inference can be achieved in less than an hour on a single GPU. To efficiently manage...
Bioconductor workflow for single-cell RNA sequencing: Normalization, dimensionality reduction, clustering, and lineage inference [version 1; referees: 1 approved, 2 approved with reservations]

Directory of Open Access Journals (Sweden)

Fanny Perraudeau

2017-07-01

Full Text Available Novel single-cell transcriptome sequencing assays allow researchers to measure gene expression levels at the resolution of single cells and offer the unprecedented opportunity to investigate at the molecular level fundamental biological questions, such as stem cell differentiation or the discovery and characterization of rare cell types. However, such assays raise challenging statistical and computational questions and require the development of novel methodology and software. Using stem cell differentiation in the mouse olfactory epithelium as a case study, this integrated workflow provides a step-by-step tutorial to the methodology and associated software for the following four main tasks: (1) dimensionality reduction accounting for zero inflation and over dispersion and adjusting for gene and cell-level covariates; (2) cell clustering using resampling-based sequential ensemble clustering; (3) inference of cell lineages and pseudotimes; and (4) differential expression analysis along lineages.
Molecular Polarizability of Sc and C (Fullerene and Graphite) Clusters

Directory of Open Access Journals (Sweden)

Francisco Torrens

2001-05-01

Full Text Available A method (POLAR) for the calculation of the molecular polarizability is presented. It uses the interacting induced dipoles polarization model. As an example, the method is applied to Scn and Cn (fullerene and one-shell graphite model) clusters. On varying the number of atoms, the clusters show numbers indicative of particularly polarizable structures. They are compared with reference calculations (PAPID). In general, the Scn calculated (POLAR) and Cn computed (POLAR and PAPID) are less polarizable than what is inferred from the bulk. However, the Scn calculated (PAPID) are more polarizable than what is inferred. Moreover, previous theoretical work yielded the same trend for Sin, Gen and GanAsm small clusters. The high polarizability of the Scn clusters (PAPID) is attributed to dangling bonds at the surface of the cluster.
Cycle-Based Cluster Variational Method for Direct and Inverse Inference

Science.gov (United States)

Furtlehner, Cyril; Decelle, Aurélien

2016-08-01

Large scale inference problems of practical interest can often be addressed with the help of Markov random fields. In principle this requires solving two related problems: the first one is to find offline the parameters of the MRF from empirical data (inverse problem); the second one (direct problem) is to set up the inference algorithm to make it as precise, robust and efficient as possible. In this work we address both the direct and inverse problem with mean-field methods of statistical physics, going beyond the Bethe approximation and associated belief propagation algorithm. We elaborate on the idea that loop corrections to belief propagation can be dealt with in a systematic way on pairwise Markov random fields, by using the elements of a cycle basis to define regions in a generalized belief propagation setting. For the direct problem, the region graph is specified in such a way as to avoid feed-back loops as much as possible by selecting a minimal cycle basis. Following this line we are led to propose a two-level algorithm, where a belief propagation algorithm is run alternately at the level of each cycle and at the inter-region level. Next we observe that the inverse problem can be addressed region by region independently, with one small inverse problem per region to be solved. It turns out that each elementary inverse problem on the loop geometry can be solved efficiently. In particular in the random Ising context we propose two complementary methods based respectively on fixed point equations and on a one-parameter log likelihood function minimization. Numerical experiments confirm the effectiveness of this approach both for the direct and inverse MRF inference. Heterogeneous problems of size up to 10^5 are addressed in a reasonable computational time, notably with better convergence properties than ordinary belief propagation.
Ensemble stacking mitigates biases in inference of synaptic connectivity.

Science.gov (United States)

Chambers, Brendan; Levy, Maayan; Dechery, Joseph B; MacLean, Jason N

2018-01-01

A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches.
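The ensemble described in this abstract is a linear combination of the scores produced by several inference algorithms. A minimal sketch of that combination step follows; the function name `ensemble_scores`, the method names, and the weights are hypothetical (in the study the weightings are derived from simulated networks with known connectivity, which this sketch does not attempt).

```python
def ensemble_scores(method_scores, weights):
    """Combine per-method connection scores into one ensemble score per pair.

    method_scores: list of dicts mapping (pre, post) neuron pairs to a score,
                   one dict per inference method, all with the same keys.
    weights:       one weight per method (assumed given, not fitted here).
    """
    if len(method_scores) != len(weights):
        raise ValueError("need exactly one weight per method")
    pairs = method_scores[0].keys()
    return {
        pair: sum(w * scores[pair] for w, scores in zip(weights, method_scores))
        for pair in pairs
    }


# Two hypothetical methods scoring two candidate connections.
lag_correlation = {(0, 1): 1.0, (1, 2): 0.0}
mutual_information = {(0, 1): 0.0, (1, 2): 1.0}
combined = ensemble_scores([lag_correlation, mutual_information], [0.5, 0.5])
```

Thresholding `combined` would then give the predicted set of connections; how the threshold is chosen is left open here.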
Inverse Ising inference with correlated samples

International Nuclear Information System (INIS)

Obermayer, Benedikt; Levine, Erel

2014-01-01

Correlations between two variables of a high-dimensional system can be indicative of an underlying interaction, but can also result from indirect effects. Inverse Ising inference is a method to distinguish one from the other. Essentially, the parameters of the least constrained statistical model are learned from the observed correlations such that direct interactions can be separated from indirect correlations. Among many other applications, this approach has been helpful for protein structure prediction, because residues which interact in the 3D structure often show correlated substitutions in a multiple sequence alignment. In this context, samples used for inference are not independent but share an evolutionary history on a phylogenetic tree. Here, we discuss the effects of correlations between samples on global inference. Such correlations could arise due to phylogeny but also via other slow dynamical processes. We present a simple analytical model to address the resulting inference biases, and develop an exact method accounting for background correlations in alignment data by combining phylogenetic modeling with an adaptive cluster expansion algorithm. We find that popular reweighting schemes are only marginally effective at removing phylogenetic bias, suggest a rescaling strategy that yields better results, and provide evidence that our conclusions carry over to the frequently used mean-field approach to the inverse Ising problem. (paper)
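For orientation, the mean-field approach to the inverse Ising problem mentioned at the end of this abstract is commonly written as follows. This is the standard naive mean-field estimate for spins $s_i = \pm 1$ with magnetizations $m_i = \langle s_i \rangle$, stated here as background and not as the exact method developed in the paper.

```latex
C_{ij} = \langle s_i s_j \rangle - \langle s_i \rangle \langle s_j \rangle, \qquad
J_{ij}^{\mathrm{MF}} = -\left(C^{-1}\right)_{ij} \quad (i \neq j), \qquad
h_i^{\mathrm{MF}} = \operatorname{artanh}(m_i) - \sum_{j \neq i} J_{ij}^{\mathrm{MF}} m_j
```

Because the couplings come from inverting the connected correlation matrix $C$, any bias in the observed correlations, such as the phylogenetic correlations between samples discussed above, propagates directly into the inferred $J_{ij}$.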
Copy-number analysis and inference of subclonal populations in cancer genomes using Sclust.

Science.gov (United States)

Cun, Yupeng; Yang, Tsun-Po; Achter, Viktor; Lang, Ulrich; Peifer, Martin

2018-06-01

The genomes of cancer cells constantly change during pathogenesis. This evolutionary process can lead to the emergence of drug-resistant mutations in subclonal populations, which can hinder therapeutic intervention in patients. Data derived from massively parallel sequencing can be used to infer these subclonal populations using tumor-specific point mutations. The accurate determination of copy-number changes and tumor impurity is necessary to reliably infer subclonal populations by mutational clustering. This protocol describes how to use Sclust, a copy-number analysis method with a recently developed mutational clustering approach. In a series of simulations and comparisons with alternative methods, we have previously shown that Sclust accurately determines copy-number states and subclonal populations. Performance tests show that the method is computationally efficient, with copy-number analysis and mutational clustering taking [...] Linux/Unix command-line syntax should be able to carry out analyses of subclonal populations.
Inferring hierarchical clustering structures by deterministic annealing

International Nuclear Information System (INIS)

Hofmann, T.; Buhmann, J.M.

1996-01-01

The unsupervised detection of hierarchical structures is a major topic in unsupervised learning and one of the key questions in data analysis and representation. We propose a novel algorithm for the problem of learning decision trees for data clustering and related problems. In contrast to many other methods based on successive tree growing and pruning, we propose an objective function for tree evaluation and we derive a non-greedy technique for tree growing. Applying the principles of maximum entropy and minimum cross entropy, a deterministic annealing algorithm is derived in a mean-field approximation. This technique allows us to canonically superimpose tree structures and to fit parameters to averaged or "fuzzified" trees.
The cylindrical K-function and Poisson line cluster point processes

DEFF Research Database (Denmark)

Møller, Jesper; Safavimanesh, Farzaneh; Rasmussen, Jakob G.

Poisson line cluster point processes, is also introduced. Parameter estimation based on moment methods or Bayesian inference for this model is discussed when the underlying Poisson line process and the cluster memberships are treated as hidden processes. To illustrate the methodologies, we analyze two...
THE EVOLUTION OF DUSTY STAR FORMATION IN GALAXY CLUSTERS TO z = 1: SPITZER INFRARED OBSERVATIONS OF THE FIRST RED-SEQUENCE CLUSTER SURVEY

Energy Technology Data Exchange (ETDEWEB)

Webb, T. M. A.; O'Donnell, D.; Coppin, Kristen; Faloon, Ashley; Geach, James E.; Noble, Allison [McGill University, 3600 rue University, Montreal, QC, H3A 2T8 (Canada)]; Yee, H. K. C. [Department of Astronomy and Astrophysics, University of Toronto, 50 St. George St., Toronto, ON, M5S 3H4 (Canada)]; Gilbank, David [South African Astronomical Observatory, P.O. Box 9, Observatory, 7935 (South Africa)]; Ellingson, Erica [Department of Astrophysical and Planetary Sciences, University of Colorado at Boulder, Boulder, CO 80309 (United States)]; Gladders, Mike [Department of Astronomy and Astrophysics, University of Chicago, 5640 S. Ellis Ave., Chicago, IL 60637 (United States)]; Muzzin, Adam [Leiden Observatory, University of Leiden, Niels Bohrweg 2, NL-2333 CA, Leiden (Netherlands)]; Wilson, Gillian [Department of Physics and Astronomy, University of California at Riverside, 900 University Avenue, Riverside, CA 92521 (United States)]; Yan, Renbin [Center for Cosmology and Particle Physics, Department of Physics, New York University, 4 Washington Place, New York, NY 10003 (United States)]

2013-10-01

We present the results of an infrared (IR) study of high-redshift galaxy clusters with the MIPS camera on board the Spitzer Space Telescope. We have assembled a sample of 42 clusters from the Red-Sequence Cluster Survey-1 over the redshift range $0.3 < z < 1.0$ and spanning an approximate range in mass of $10^{14-15}\,M_\odot$. We statistically measure the number of IR-luminous galaxies in clusters above a fixed inferred IR luminosity of $2 \times 10^{11}\,M_\odot$, assuming a star forming galaxy template, per unit cluster mass and find it increases to higher redshift. Fitting a simple power-law we measure evolution of $(1 + z)^{5.1 \pm 1.9}$ over the range $0.3 < z < 1.0$. These results are tied to the adoption of a single star forming galaxy template; the presence of active galactic nuclei, and an evolution in their relative contribution to the mid-IR galaxy emission, will alter the overall number counts per cluster and their rate of evolution. Under the star formation assumption we infer the approximate total star formation rate per unit cluster mass ($\Sigma\mathrm{SFR}/M_{\mathrm{cluster}}$). The evolution is similar, with $\Sigma\mathrm{SFR}/M_{\mathrm{cluster}} \sim (1 + z)^{5.4 \pm 1.9}$. We show that this can be accounted for by the evolution of the IR-bright field population over the same redshift range; that is, the evolution can be attributed entirely to the change in the in-falling field galaxy population. We show that the $\Sigma\mathrm{SFR}/M_{\mathrm{cluster}}$ (binned over all redshift) decreases with increasing cluster mass with a slope ($\Sigma\mathrm{SFR}/M_{\mathrm{cluster}} \sim M_{\mathrm{cluster}}^{-1.5 \pm 0.4}$) consistent with the dependence of the stellar-to-total mass per unit cluster mass seen locally. The inferred star formation seen here could produce ∼5%-10% of the total stellar mass in massive clusters at z = 0, but we cannot constrain the descendant population, nor how rapidly the star-formation must shut-down once the galaxies have entered the cluster environment. Finally, we show a clear decrease in the number of IR
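The redshift evolution quoted in this abstract is a power law in (1 + z). A minimal sketch of how such an exponent can be estimated is an unweighted least-squares line in log-log space. The function name `fit_power_law` and the synthetic data are assumptions for illustration; the abstract does not describe the authors' actual fitting procedure, and this sketch ignores measurement uncertainties.

```python
import math


def fit_power_law(redshifts, values):
    """Fit values ~ amplitude * (1 + z)**alpha by least squares in log space.

    Returns (amplitude, alpha). Measurement errors are ignored.
    """
    xs = [math.log(1.0 + z) for z in redshifts]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                      # slope of the log-log line
    amplitude = math.exp(mean_y - alpha * mean_x)  # intercept, back-transformed
    return amplitude, alpha


# Synthetic, noise-free data generated with alpha = 5.1 (not survey data).
synthetic_z = [0.3, 0.5, 0.7, 0.9]
synthetic_values = [2.0 * (1.0 + z) ** 5.1 for z in synthetic_z]
fitted_amplitude, fitted_alpha = fit_power_law(synthetic_z, synthetic_values)
```

With real, noisy counts the uncertainty on the exponent matters as much as its value, as the ±1.9 in the abstract indicates; a weighted fit or a resampling estimate would be needed to reproduce such an error bar.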
MASSIVE CLUSTERS IN THE INNER REGIONS OF NGC 1365: CLUSTER FORMATION AND GAS DYNAMICS IN GALACTIC BARS

International Nuclear Information System (INIS)

Elmegreen, Bruce G.; Galliano, Emmanuel; Alloin, Danielle

2009-01-01

Cluster formation and gas dynamics in the central regions of barred galaxies are not well understood. This paper reviews the environment of three $10^7\,M_\odot$ clusters near the inner Lindblad resonance (ILR) of the barred spiral NGC 1365. The morphology, mass, and flow of H I and CO gas in the spiral and barred regions are examined for evidence of the location and mechanism of cluster formation. The accretion rate is compared with the star formation rate to infer the lifetime of the starburst. The gas appears to move from inside corotation in the spiral region to looping filaments in the interbar region at a rate of $\sim 6\,M_\odot\,\mathrm{yr}^{-1}$ before impacting the bar dustlane somewhere along its length. The gas in this dustlane moves inward, growing in flux as a result of the accretion to $\sim 40\,M_\odot\,\mathrm{yr}^{-1}$ near the ILR. This inner rate exceeds the current nuclear star formation rate by a factor of 4, suggesting continued buildup of nuclear mass for another ∼0.5 Gyr. The bar may be only 1-2 Gyr old. Extrapolating the bar flow back in time, we infer that the clusters formed in the bar dustlane outside the central dust ring at a position where an interbar filament currently impacts the lane. The ram pressure from this impact is comparable to the pressure in the bar dustlane, and both are comparable to the pressure in the massive clusters. Impact triggering is suggested. The isothermal assumption in numerical simulations seems inappropriate for the rarefaction parts of spiral and bar gas flows. The clusters have enough lower-mass counterparts to suggest they are part of a normal power-law mass distribution. Gas trapping in the most massive clusters could explain their [Ne II] emission, which is not evident from the lower-mass clusters nearby.
Bayesian inference for Hawkes processes

DEFF Research Database (Denmark)

Rasmussen, Jakob Gulddahl

2013-01-01

The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
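The conditional intensity function on which the first approach is based can be sketched for one common special case. The exponential excitation kernel below is an assumption made for illustration (the abstract does not fix a kernel), and the function and parameter names are hypothetical.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity of a Hawkes process with an exponential kernel.

    lambda(t) = mu + sum over past events t_i < t of
                alpha * beta * exp(-beta * (t - t_i))

    mu:    constant background (immigrant) rate
    alpha: expected number of offspring events triggered by each event
    beta:  decay rate of the excitation
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t  # only strictly earlier events contribute
    )
    return mu + excitation


# Intensity one time unit after a single event at t = 0.
example_intensity = hawkes_intensity(1.0, [0.0], mu=0.2, alpha=0.5, beta=1.0)
```

The same decomposition into a background term and per-event excitation terms is what gives rise to the clustering and branching structure used by the second approach: each event is either an immigrant from the background or an offspring of one earlier event.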

The threshold bootstrap clustering: a new approach to find families or transmission clusters within molecular quasispecies.

Directory of Open Access Journals (Sweden)

Mattia C F Prosperi

2010-10-01

Full Text Available Phylogenetic methods produce hierarchies of molecular species, inferring knowledge about taxonomy and evolution. However, there is not yet a consensus methodology that provides a crisp partition of taxa, desirable when considering the problem of intra/inter-patient quasispecies classification or infection transmission event identification. We introduce the threshold bootstrap clustering (TBC), a new methodology for partitioning molecular sequences, that does not require a phylogenetic tree estimation. The TBC is an incremental partition algorithm, inspired by the stochastic Chinese restaurant process, and takes advantage of resampling techniques and models of sequence evolution. TBC uses as input a multiple alignment of molecular sequences and its output is a crisp partition of the taxa into an automatically determined number of clusters. By varying initial conditions, the algorithm can produce different partitions. We describe a procedure that selects a prime partition among a set of candidate ones and calculates a measure of cluster reliability. TBC was successfully tested for the identification of type-1 human immunodeficiency and hepatitis C virus subtypes, and compared with previously established methodologies. It was also evaluated in the problem of HIV-1 intra-patient quasispecies clustering, and for transmission cluster identification, using a set of sequences from patients with known transmission event histories. TBC has been shown to be effective for the subtyping of HIV and HCV, and for identifying intra-patient quasispecies. To some extent, the algorithm was also able to infer clusters corresponding to events of infection transmission. The computational complexity of TBC is quadratic in the number of taxa, lower than other established methods; in addition, TBC has been enhanced with a measure of cluster reliability. The TBC can be useful to characterise molecular quasispecies in a broad context.
The threshold bootstrap clustering: a new approach to find families or transmission clusters within molecular quasispecies.

Science.gov (United States)

Prosperi, Mattia C F; De Luca, Andrea; Di Giambenedetto, Simona; Bracciale, Laura; Fabbiani, Massimiliano; Cauda, Roberto; Salemi, Marco

2010-10-25

Phylogenetic methods produce hierarchies of molecular species, inferring knowledge about taxonomy and evolution. However, there is not yet a consensus methodology that provides a crisp partition of taxa, desirable when considering the problem of intra/inter-patient quasispecies classification or infection transmission event identification. We introduce the threshold bootstrap clustering (TBC), a new methodology for partitioning molecular sequences, that does not require a phylogenetic tree estimation. The TBC is an incremental partition algorithm, inspired by the stochastic Chinese restaurant process, and takes advantage of resampling techniques and models of sequence evolution. TBC uses as input a multiple alignment of molecular sequences and its output is a crisp partition of the taxa into an automatically determined number of clusters. By varying initial conditions, the algorithm can produce different partitions. We describe a procedure that selects a prime partition among a set of candidate ones and calculates a measure of cluster reliability. TBC was successfully tested for the identification of type-1 human immunodeficiency and hepatitis C virus subtypes, and compared with previously established methodologies. It was also evaluated in the problem of HIV-1 intra-patient quasispecies clustering, and for transmission cluster identification, using a set of sequences from patients with known transmission event histories. TBC has been shown to be effective for the subtyping of HIV and HCV, and for identifying intra-patient quasispecies. To some extent, the algorithm was also able to infer clusters corresponding to events of infection transmission. The computational complexity of TBC is quadratic in the number of taxa, lower than other established methods; in addition, TBC has been enhanced with a measure of cluster reliability. The TBC can be useful to characterise molecular quasispecies in a broad context.
Pearson's chi-square test and rank correlation inferences for clustered data.

Science.gov (United States)

Shih, Joanna H; Fay, Michael P

2017-09-01

Pearson's chi-square test has been widely used in testing for association between two categorical responses. Spearman rank correlation and Kendall's tau are often used for measuring and testing association between two continuous or ordered categorical responses. However, the established statistical properties of these tests are only valid when each pair of responses are independent, where each sampling unit has only one pair of responses. When each sampling unit consists of a cluster of paired responses, the assumption of independent pairs is violated. In this article, we apply the within-cluster resampling technique to U-statistics to form new tests and rank-based correlation estimators for possibly tied clustered data. We develop large sample properties of the new proposed tests and estimators and evaluate their performance by simulations. The proposed methods are applied to a data set collected from a PET/CT imaging study for illustration. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
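The within-cluster resampling idea named in this abstract can be sketched as follows: repeatedly draw one pair of responses from each cluster, compute the statistic on that sample of independent pairs, and average over the resamples. The function names are hypothetical, Kendall's tau-a is used as the example statistic, and the variance estimation and large-sample theory developed in the article are not reproduced.

```python
import random


def kendall_tau_a(pairs):
    """Kendall's tau-a for a list of (x, y) pairs; tied pairs contribute zero."""
    def sign(v):
        return (v > 0) - (v < 0)

    n = len(pairs)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sign(pairs[i][0] - pairs[j][0]) * sign(pairs[i][1] - pairs[j][1])
    return total / (n * (n - 1) / 2)


def within_cluster_resampling(clusters, statistic, n_resamples=1000, seed=0):
    """Average a statistic over resamples that take one pair from each cluster."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_resamples):
        # One randomly chosen pair per cluster, so the pairs are independent.
        sample = [rng.choice(cluster) for cluster in clusters]
        total += statistic(sample)
    return total / n_resamples


# Toy clustered data: each inner list holds the (x, y) pairs of one sampling unit.
toy_clusters = [
    [(1.0, 1.0), (1.5, 1.5)],
    [(2.0, 2.0)],
    [(3.0, 3.0), (3.5, 3.5)],
    [(4.0, 4.0)],
]
tau_estimate = within_cluster_resampling(toy_clusters, kendall_tau_a, n_resamples=200)
```

Because every resample contains exactly one pair per cluster, the usual independence assumption holds within each resample even though it fails for the pooled data.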
BioCluster: Tool for Identification and Clustering of Enterobacteriaceae Based on Biochemical Data

Directory of Open Access Journals (Sweden)

Ahmed Abdullah

2015-06-01

Full Text Available Presumptive identification of different Enterobacteriaceae species is routinely achieved based on biochemical properties. Traditional practice includes manual comparison of each biochemical property of the unknown sample with known reference samples and inference of its identity based on the maximum similarity pattern with the known samples. This process is labor-intensive, time-consuming, error-prone, and subjective. Therefore, automation of sorting and similarity calculation would be advantageous. Here we present a MATLAB-based graphical user interface (GUI) tool named BioCluster. This tool was designed for automated clustering and identification of Enterobacteriaceae based on biochemical test results. In this tool, we used two types of algorithms, i.e., traditional hierarchical clustering (HC) and the Improved Hierarchical Clustering (IHC), a modified algorithm that was developed specifically for the clustering and identification of Enterobacteriaceae species. IHC takes into account the variability in the results of 1–47 biochemical tests within this Enterobacteriaceae family. This tool also provides different options to optimize the clustering in a user-friendly way. Using computer-generated synthetic data and some real data, we have demonstrated that BioCluster has high accuracy in clustering and identifying enterobacterial species based on biochemical test data. This tool can be freely downloaded at http://microbialgen.du.ac.bd/biocluster/.
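The manual practice this abstract sets out to automate, matching an unknown sample to the reference with the most similar pattern of test results, can be sketched in a few lines. This is not the BioCluster tool or its IHC algorithm; the function name `identify`, the placeholder species names, and the test profiles are all hypothetical and carry no microbiological meaning.

```python
def identify(unknown, references):
    """Return (best_matching_name, similarity) for a vector of test results.

    unknown:    list of 0/1 results (negative/positive), one per biochemical test.
    references: dict mapping a reference name to its list of 0/1 results.
    Similarity is the fraction of tests with identical results.
    """
    def similarity(a, b):
        return sum(x == y for x, y in zip(a, b)) / len(a)

    best_name = max(references, key=lambda name: similarity(unknown, references[name]))
    return best_name, similarity(unknown, references[best_name])


# Hypothetical reference profiles over four tests.
reference_profiles = {
    "species_A": [1, 0, 1, 1],
    "species_B": [0, 0, 0, 1],
}
best_match, match_score = identify([1, 0, 1, 0], reference_profiles)
```

A simple matching score like this treats every test as equally informative and every reference result as fixed, which is the limitation that accounting for within-family variability in test results is meant to address.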
Ensemble stacking mitigates biases in inference of synaptic connectivity

Directory of Open Access Journals (Sweden)

Brendan Chambers

2018-03-01

Full Text Available A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches. Mapping the routing of spikes through local circuitry is crucial for understanding neocortical computation. Under appropriate experimental conditions, these maps can be used to infer likely patterns of synaptic recruitment, linking activity to underlying anatomical connections. Such inferences help to reveal the synaptic implementation of population dynamics and computation. We compare a number of standard functional measures to infer underlying connectivity. We find that regularization impacts measures
Numerical approximations for speeding up MCMC inference in the infinite relational model

DEFF Research Database (Denmark)

Schmidt, Mikkel Nørgaard; Albers, Kristoffer Jon

2015-01-01

The infinite relational model (IRM) is a powerful model for discovering clusters in complex networks; however, the computational speed of Markov chain Monte Carlo inference in the model can be a limiting factor when analyzing large networks. We investigate how using numerical approximations...
Mapping Dark Matter in Simulated Galaxy Clusters

Science.gov (United States)

Bowyer, Rachel

2018-01-01

Galaxy clusters are the most massive bound objects in the Universe with most of their mass being dark matter. Cosmological simulations of structure formation show that clusters are embedded in a cosmic web of dark matter filaments and large scale structure. It is thought that these filaments are found preferentially close to the long axes of clusters. We extract galaxy clusters from the simulations "cosmo-OWLS" in order to study their properties directly and also to infer their properties from weak gravitational lensing signatures. We investigate various stacking procedures to enhance the signal of the filaments and large scale structure surrounding the clusters to better understand how the filaments of the cosmic web connect with galaxy clusters. This project was supported in part by the NSF REU grant AST-1358980 and by the Nantucket Maria Mitchell Association.
Dynamics of Galaxy Clusters and their Outskirts

DEFF Research Database (Denmark)

Falco, Martina

Galaxy clusters have proven to be powerful probes of cosmology, since their mass and abundance depend on the cosmological model that describes the Universe and on the gravitational formation process of cosmological structures. The main challenge in using clusters to constrain cosmology is that their masses cannot be measured directly, but need to be inferred indirectly through their observable properties. The most common methods extract the cluster mass from their strong X-ray emission or from the measured redshifts of the galaxy members. The gravitational lensing effect caused by clusters on the background galaxies is also an important tracer of their total mass distribution. In the work presented within this thesis, we exploit the connection between the gravitational potential of galaxy clusters and the kinematical properties of their surroundings, in order to determine the total cluster mass...
A Multiobjective Fuzzy Inference System based Deployment Strategy for a Distributed Mobile Sensor Network

Directory of Open Access Journals (Sweden)

Amol P. Bhondekar

2010-03-01

The sensor deployment scheme strongly governs the effectiveness of a distributed wireless sensor network. Issues such as energy conservation and clustering make the deployment problem much more complex. A multiobjective Fuzzy Inference System based strategy for mobile sensor deployment is presented in this paper. This strategy gives a synergistic combination of energy capacity, clustering and peer-to-peer deployment. Performance of our strategy is evaluated in terms of coverage, uniformity, speed and clustering. Our algorithm is compared against a modified distributed self-spreading algorithm and shows better performance.
An intelligent clustering based methodology for confusable diseases ...

African Journals Online (AJOL)

Journal of Computer Science and Its Application ... In this paper, an intelligent system driven by fuzzy clustering algorithm and Adaptive Neuro-Fuzzy Inference System for ... Data on patients diagnosed and confirmed by laboratory tests of viral ...
Prediction of settled water turbidity and optimal coagulant dosage in drinking water treatment plant using a hybrid model of k-means clustering and adaptive neuro-fuzzy inference system

Science.gov (United States)

Kim, Chan Moon; Parnichkun, Manukid

2017-11-01

Coagulation is an important process in drinking water treatment to attain acceptable treated water quality. However, the determination of coagulant dosage is still a challenging task for operators, because coagulation is a nonlinear and complicated process. Feedback control to achieve the desired treated water quality is difficult due to the lengthy process time. In this research, a hybrid of k-means clustering and adaptive neuro-fuzzy inference system (k-means-ANFIS) is proposed for the settled water turbidity prediction and the optimal coagulant dosage determination using full-scale historical data. To build a model that adapts well to different process states of the influent water, raw water quality data are classified into four clusters according to their properties by a k-means clustering technique. The sub-models are developed individually on the basis of each clustered data set. Results reveal that the sub-models constructed by a hybrid k-means-ANFIS perform better than not only a single ANFIS model, but also seasonal models by artificial neural network (ANN). The final model, consisting of the sub-models, shows more accurate and consistent prediction ability than a single model of ANFIS and a single model of ANN based on all five evaluation indices. Therefore, the hybrid model of k-means-ANFIS can be employed as a robust tool for managing both treated water quality and production costs simultaneously.
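The cluster-then-model idea in this abstract can be illustrated with a minimal sketch. It assumes synthetic one-feature data and two clusters rather than the four used in the study, and it substitutes an ordinary least-squares line for each ANFIS sub-model, so it is not the authors' method. The names `assign`, `kmeans`, `fit_submodels` and `predict`, and all data values, are hypothetical.

```python
import numpy as np

def assign(X, centroids):
    """Index of the nearest centroid for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns the k centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Move each centroid to the mean of its members (keep it if empty).
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids

def fit_submodels(X, y, centroids):
    """Fit one least-squares linear model per cluster (stand-in for ANFIS)."""
    labels = assign(X, centroids)
    design = np.hstack([X, np.ones((len(X), 1))])  # add an intercept column
    models = {}
    for j in range(len(centroids)):
        members = labels == j
        if members.any():
            models[j] = np.linalg.lstsq(design[members], y[members], rcond=None)[0]
    return models

def predict(X, centroids, models):
    """Route each sample to the sub-model of its nearest cluster."""
    labels = assign(X, centroids)
    design = np.hstack([X, np.ones((len(X), 1))])
    return np.array([design[i] @ models[labels[i]] for i in range(len(X))])

# Synthetic data: two operating regimes with different linear responses.
x = np.concatenate([np.linspace(0, 1, 20), np.linspace(10, 11, 20)])
X = x[:, None]
y = np.where(x < 5, 2 * x + 1, -3 * x + 50)

centroids = kmeans(X, k=2)
models = fit_submodels(X, y, centroids)
preds = predict(np.array([[0.5], [10.5]]), centroids, models)
```

A single global line would fit neither regime well here, which is the motivation the abstract gives for building sub-models per cluster.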
Comparison of Bayesian clustering and edge detection methods for inferring boundaries in landscape genetics

Science.gov (United States)

Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.

2011-01-01

Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. © 2011 by the authors; licensee MDPI, Basel, Switzerland.
Penalized likelihood and multi-objective spatial scans for the detection and inference of irregular clusters

Directory of Open Access Journals (Sweden)

Fonseca Carlos M

2010-10-01

Background: Irregularly shaped spatial clusters are difficult to delineate. A cluster found by an algorithm often spreads through large portions of the map, impacting its geographical meaning. Penalized likelihood methods for Kulldorff's spatial scan statistics have been used to control the excessive freedom of the shape of clusters. Penalty functions based on cluster geometry and non-connectivity have been proposed recently. Another approach involves the use of a multi-objective algorithm to maximize two objectives: the spatial scan statistics and the geometric penalty function. Results & Discussion: We present a novel scan statistic algorithm employing a function based on the graph topology to penalize the presence of under-populated disconnection nodes in candidate clusters, the disconnection nodes cohesion function. A disconnection node is defined as a region within a cluster, such that its removal disconnects the cluster. By applying this function, the most geographically meaningful clusters are sifted through the immense set of possible irregularly shaped candidate cluster solutions. To evaluate the statistical significance of solutions for multi-objective scans, a statistical approach based on the concept of attainment function is used. In this paper we compared different penalized likelihoods employing the geometric and non-connectivity regularity functions and the novel disconnection nodes cohesion function. We also build multi-objective scans using those three functions and compare them with the previous penalized likelihood scans. An application is presented using comprehensive state-wide data for Chagas' disease in puerperal women in Minas Gerais state, Brazil. Conclusions: We show that, compared to the other single-objective algorithms, multi-objective scans present better performance, regarding power, sensitivity and positive predicted value. The multi-objective non-connectivity scan is faster and better suited for the
Inferred vs realized patterns of gene flow: an analysis of population structure in the Andros Island Rock Iguana.

Science.gov (United States)

Colosimo, Giuliano; Knapp, Charles R; Wallace, Lisa E; Welch, Mark E

2014-01-01

Ecological data, the primary source of information on patterns and rates of migration, can be integrated with genetic data to more accurately describe the realized connectivity between geographically isolated demes. In this paper we implement this approach and discuss its implications for managing populations of the endangered Andros Island Rock Iguana, Cyclura cychlura cychlura. This iguana is endemic to Andros, a highly fragmented landmass of large islands and smaller cays. Field observations suggest that geographically isolated demes were panmictic due to high, inferred rates of gene flow. We expand on these observations using 16 polymorphic microsatellites to investigate the genetic structure and rates of gene flow from 188 Andros Iguanas collected across 23 island sites. Bayesian clustering of specimens assigned individuals to three distinct genotypic clusters. An analysis of molecular variance (AMOVA) indicates that allele frequency differences are responsible for a significant portion of the genetic variance across the three defined clusters (Fst = 0.117, p<0.01). These clusters are associated with larger islands and satellite cays isolated by broad water channels with strong currents. These findings imply that broad water channels present greater obstacles to gene flow than was inferred from field observation alone. Additionally, rates of gene flow were indirectly estimated using BAYESASS 3.0. The proportion of individuals originating from within each identified cluster varied from 94.5 to 98.7%, providing further support for local isolation. Our assessment reveals a major disparity between inferred and realized gene flow. We discuss our results in a conservation perspective for species inhabiting highly fragmented landscapes.
Inferred vs Realized Patterns of Gene Flow: An Analysis of Population Structure in the Andros Island Rock Iguana

Science.gov (United States)

Colosimo, Giuliano; Knapp, Charles R.; Wallace, Lisa E.; Welch, Mark E.

2014-01-01

Ecological data, the primary source of information on patterns and rates of migration, can be integrated with genetic data to more accurately describe the realized connectivity between geographically isolated demes. In this paper we implement this approach and discuss its implications for managing populations of the endangered Andros Island Rock Iguana, Cyclura cychlura cychlura. This iguana is endemic to Andros, a highly fragmented landmass of large islands and smaller cays. Field observations suggest that geographically isolated demes were panmictic due to high, inferred rates of gene flow. We expand on these observations using 16 polymorphic microsatellites to investigate the genetic structure and rates of gene flow from 188 Andros Iguanas collected across 23 island sites. Bayesian clustering of specimens assigned individuals to three distinct genotypic clusters. An analysis of molecular variance (AMOVA) indicates that allele frequency differences are responsible for a significant portion of the genetic variance across the three defined clusters (Fst = 0.117, p<0.01). These clusters are associated with larger islands and satellite cays isolated by broad water channels with strong currents. These findings imply that broad water channels present greater obstacles to gene flow than was inferred from field observation alone. Additionally, rates of gene flow were indirectly estimated using BAYESASS 3.0. The proportion of individuals originating from within each identified cluster varied from 94.5 to 98.7%, providing further support for local isolation. Our assessment reveals a major disparity between inferred and realized gene flow. We discuss our results in a conservation perspective for species inhabiting highly fragmented landscapes. PMID:25229344
Network inference from functional experimental data (Conference Presentation)

Science.gov (United States)

Desrosiers, Patrick; Labrecque, Simon; Tremblay, Maxime; Bélanger, Mathieu; De Dorlodot, Bertrand; Côté, Daniel C.

2016-03-01

Functional connectivity maps of neuronal networks are critical tools to understand how neurons form circuits, how information is encoded and processed by neurons, how memory is shaped, and how these basic processes are altered under pathological conditions. Current light microscopy allows the calcium or electrical activity of thousands of neurons to be observed simultaneously, yet assessing comprehensive connectivity maps directly from such data remains a non-trivial analytical task. There exist simple statistical methods, such as cross-correlation and Granger causality, but they only detect linear interactions between neurons. Other more involved inference methods inspired by information theory, such as mutual information and transfer entropy, identify connections between neurons more accurately but also require more computational resources. We carried out a comparative study of common connectivity inference methods. The relative accuracy and computational cost of each method were determined via simulated fluorescence traces generated with realistic computational models of interacting neurons in networks of different topologies (clustered or non-clustered) and sizes (10-1000 neurons). To bridge the computational and experimental works, we observed the intracellular calcium activity of live hippocampal neuronal cultures infected with the fluorescent calcium marker GCaMP6f. The spontaneous activity of the networks, consisting of 50-100 neurons per field of view, was recorded from 20 to 50 Hz on a microscope controlled by custom-built software. We implemented all connectivity inference methods in the software, which rapidly loads calcium fluorescence movies, segments the images, extracts the fluorescence traces, and assesses the functional connections (with strengths and directions) between each pair of neurons. We used this software to assess, in real time, the functional connectivity from real calcium imaging data in basal conditions, under plasticity protocols, and epileptic
THE EVOLUTION OF DUSTY STAR FORMATION IN GALAXY CLUSTERS TO z = 1: SPITZER INFRARED OBSERVATIONS OF THE FIRST RED-SEQUENCE CLUSTER SURVEY

International Nuclear Information System (INIS)

Webb, T. M. A.; O'Donnell, D.; Coppin, Kristen; Faloon, Ashley; Geach, James E.; Noble, Allison; Yee, H. K. C.; Gilbank, David; Ellingson, Erica; Gladders, Mike; Muzzin, Adam; Wilson, Gillian; Yan, Renbin

2013-01-01

We present the results of an infrared (IR) study of high-redshift galaxy clusters with the MIPS camera on board the Spitzer Space Telescope. We have assembled a sample of 42 clusters from the Red-Sequence Cluster Survey-1 over the redshift range 0.3 14-15 M☉. We statistically measure the number of IR-luminous galaxies in clusters above a fixed inferred IR luminosity of 2 × 10^11 M☉, assuming a star forming galaxy template, per unit cluster mass and find it increases to higher redshift. Fitting a simple power-law we measure evolution of (1 + z)^(5.1±1.9) over the range 0.3 cluster ). The evolution is similar, with ΣSFR/M_cluster ∼ (1 + z)^(5.4±1.9). We show that this can be accounted for by the evolution of the IR-bright field population over the same redshift range; that is, the evolution can be attributed entirely to the change in the in-falling field galaxy population. We show that the ΣSFR/M_cluster (binned over all redshift) decreases with increasing cluster mass with a slope (ΣSFR/M_cluster ∼ M_cluster^(-1.5±0.4)) consistent with the dependence of the stellar-to-total mass per unit cluster mass seen locally. The inferred star formation seen here could produce ∼5%-10% of the total stellar mass in massive clusters at z = 0, but we cannot constrain the descendant population, nor how rapidly the star-formation must shut-down once the galaxies have entered the cluster environment. Finally, we show a clear decrease in the number of IR-bright galaxies per unit optical galaxy in the cluster cores, confirming star formation continues to avoid the highest density regions of the universe at z ∼ 0.75 (the average redshift of the high-redshift clusters). While several previous studies appear to show enhanced star formation in high-redshift clusters relative to the field we note that these papers have not accounted for the overall increase in galaxy or dark matter density at the location of clusters. Once this is done, clusters at z ∼ 0.75 have the same
Comparative analysis on the selection of number of clusters in community detection

Science.gov (United States)

Kawamoto, Tatsuro; Kabashima, Yoshiyuki

2018-02-01

We conduct a comparative analysis on various estimates of the number of clusters in community detection. An exhaustive comparison requires testing of all possible combinations of frameworks, algorithms, and assessment criteria. In this paper we focus on the framework based on a stochastic block model, and investigate the performance of greedy algorithms, statistical inference, and spectral methods. For the assessment criteria, we consider modularity, map equation, Bethe free energy, prediction errors, and isolated eigenvalues. From the analysis, the tendencies of the assessment criteria and algorithms to overfit and underfit become apparent. In addition, we propose that the alluvial diagram is a suitable tool to visualize statistical inference results and can be useful to determine the number of clusters.
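Of the assessment criteria listed in this abstract, modularity is the simplest to show concretely. The sketch below is an illustration only and not the authors' procedure: it scores three hand-written candidate partitions of a hypothetical toy graph with the standard Newman-Girvan modularity and keeps the number of clusters with the highest score. The graph, the candidate partitions and the names `modularity`, `candidates` and `best_k` are assumptions.

```python
import numpy as np

def modularity(adj, labels):
    """Newman-Girvan modularity Q of a partition of an undirected graph."""
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()  # twice the number of edges
    same = labels[:, None] == labels[None, :]  # True where i and j share a cluster
    expected = np.outer(degrees, degrees) / two_m
    return float(((adj - expected) * same).sum() / two_m)

# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1

# Hypothetical candidate partitions, keyed by their number of clusters.
candidates = {
    1: [0, 0, 0, 0, 0, 0],
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
}
scores = {k: modularity(adj, part) for k, part in candidates.items()}
best_k = max(scores, key=scores.get)
```

On this graph the two-triangle split scores Q = 5/14 and the single-cluster partition scores 0, so the criterion selects two clusters.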
Smoothed Particle Inference: A Kilo-Parametric Method for X-ray Galaxy Cluster Modeling

Energy Technology Data Exchange (ETDEWEB)

Peterson, John R.; Marshall, P.J.; /KIPAC, Menlo Park; Andersson, K.; /Stockholm U. /SLAC

2005-08-05

We propose an ambitious new method that models the intracluster medium in clusters of galaxies as a set of X-ray emitting smoothed particles of plasma. Each smoothed particle is described by a handful of parameters including temperature, location, size, and elemental abundances. Hundreds to thousands of these particles are used to construct a model cluster of galaxies, with the appropriate complexity estimated from the data quality. This model is then compared iteratively with X-ray data in the form of adaptively binned photon lists via a two-sample likelihood statistic and iterated via Markov Chain Monte Carlo. The complex cluster model is propagated through the X-ray instrument response using direct sampling Monte Carlo methods. Using this approach the method can reproduce many of the features observed in the X-ray emission in a less assumption-dependent way than traditional analyses, and it allows for a more detailed characterization of the density, temperature, and metal abundance structure of clusters. Multi-instrument X-ray analyses and simultaneous X-ray, Sunyaev-Zeldovich (SZ), and lensing analyses are a straightforward extension of this methodology. Significant challenges still exist in understanding the degeneracy in these models and the statistical noise induced by the complexity of the models.

Subjective randomness as statistical inference.

Science.gov (United States)

Griffiths, Thomas L; Daniels, Dylan; Austerweil, Joseph L; Tenenbaum, Joshua B

2018-06-01

Some events seem more random than others. For example, when tossing a coin, a sequence of eight heads in a row does not seem very random. Where do these intuitions about randomness come from? We argue that subjective randomness can be understood as the result of a statistical inference assessing the evidence that an event provides for having been produced by a random generating process. We show how this account provides a link to previous work relating randomness to algorithmic complexity, in which random events are those that cannot be described by short computer programs. Algorithmic complexity is both incomputable and too general to capture the regularities that people can recognize, but viewing randomness as statistical inference provides two paths to addressing these problems: considering regularities generated by simpler computing machines, and restricting the set of probability distributions that characterize regularity. Building on previous work exploring these different routes to a more restricted notion of randomness, we define strong quantitative models of human randomness judgments that apply not just to binary sequences - which have been the focus of much of the previous work on subjective randomness - but also to binary matrices and spatial clustering. Copyright © 2018 Elsevier Inc. All rights reserved.
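The inference described in this abstract can be made concrete for coin sequences with a small worked example. It assumes, for illustration only, that the "regular" alternative is a coin of unknown bias with a uniform prior; the models in the paper are richer. The name `randomness_score` is hypothetical. The score is the log-likelihood ratio between the fair-coin hypothesis and that alternative, so negative values mean the sequence is evidence against a random source.

```python
from math import lgamma, log

def randomness_score(sequence):
    """log P(seq | fair coin) - log P(seq | coin with unknown bias)."""
    n = len(sequence)
    heads = sequence.count("H")
    tails = n - heads
    log_p_random = -n * log(2)
    # Marginal likelihood under a uniform prior on the bias t:
    # integral of t^h (1 - t)^(n - h) dt = h! (n - h)! / (n + 1)!
    log_p_regular = lgamma(heads + 1) + lgamma(tails + 1) - lgamma(n + 2)
    return log_p_random - log_p_regular

eight_heads = randomness_score("HHHHHHHH")
mixed = randomness_score("HHTHTTHT")
```

Eight heads in a row scores log(9/256), which is negative, while the mixed sequence scores log(630/256), which is positive. A biased-coin alternative only sees the head count, so it cannot flag patterns such as strict alternation; that limitation is one reason to consider broader families of regularities.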
Inference of RNA polymerase II transcription dynamics from chromatin immunoprecipitation time course data.

Directory of Open Access Journals (Sweden)

Ciira wa Maina

2014-05-01

Gene transcription mediated by RNA polymerase II (pol-II) is a key step in gene expression. The dynamics of pol-II moving along the transcribed region influence the rate and timing of gene expression. In this work, we present a probabilistic model of transcription dynamics which is fitted to pol-II occupancy time course data measured using ChIP-Seq. The model can be used to estimate transcription speed and to infer the temporal pol-II activity profile at the gene promoter. Model parameters are estimated using either maximum likelihood estimation or via Bayesian inference using Markov chain Monte Carlo sampling. The Bayesian approach provides confidence intervals for parameter estimates and allows the use of priors that capture domain knowledge, e.g. the expected range of transcription speeds, based on previous experiments. The model describes the movement of pol-II down the gene body and can be used to identify the time of induction for transcriptionally engaged genes. By clustering the inferred promoter activity time profiles, we are able to determine which genes respond quickly to stimuli and group genes that share activity profiles and may therefore be co-regulated. We apply our methodology to biological data obtained using ChIP-seq to measure pol-II occupancy genome-wide when MCF-7 human breast cancer cells are treated with estradiol (E2). The transcription speeds we obtain agree with those obtained previously for smaller numbers of genes with the advantage that our approach can be applied genome-wide. We validate the biological significance of the pol-II promoter activity clusters by investigating cluster-specific transcription factor binding patterns and determining canonical pathway enrichment. We find that rapidly induced genes are enriched for both estrogen receptor alpha (ERα) and FOXA1 binding in their proximal promoter regions.
Inferring Stop-Locations from WiFi.

Directory of Open Access Journals (Sweden)

David Kofoed Wind

Human mobility patterns are inherently complex. In terms of understanding these patterns, the process of converting raw data into series of stop-locations and transitions is an important first step which greatly reduces the volume of data, thus simplifying the subsequent analyses. Previous research into the mobility of individuals has focused on inferring 'stop locations' (places of stationarity) from GPS or CDR data, or on detection of state (static/active). In this paper we bridge the gap between the two approaches: we introduce methods for detecting both mobility state and stop-locations. In addition, our methods are based exclusively on WiFi data. We study two months of WiFi data collected every two minutes by a smartphone, and infer stop-locations in the form of labelled time-intervals. For this purpose, we investigate two algorithms, both of which scale to large datasets: a greedy approach to select the most important routers and one which uses a density-based clustering algorithm to detect router fingerprints. We validate our results using participants' GPS data as well as ground truth data collected during a two month period.
Gold cluster carbonyls: saturated adsorption of CO on gold cluster cations, vibrational spectroscopy, and implications for their structures.

Science.gov (United States)

Fielicke, André; von Helden, Gert; Meijer, Gerard; Pedersen, David B; Simard, Benoit; Rayner, David M

2005-06-15

We report on the interaction of carbon monoxide with cationic gold clusters in the gas phase. Successive adsorption of CO molecules on the Au(n)(+) clusters proceeds until a cluster size specific saturation coverage is reached. Structural information for the bare gold clusters is obtained by comparing the saturation stoichiometry with the number of available equivalent sites presented by candidate structures of Au(n)(+). Our findings are in agreement with the planar structures of the Au(n)(+) cluster cations with n ≤ 7 that are suggested by ion mobility experiments [Gilb, S.; Weis, P.; Furche, F.; Ahlrichs, R.; Kappes, M. M. J. Chem. Phys. 2001, 116, 4094]. By inference we also establish the structure of the saturated Au(n)(CO)(m)(+) complexes. In certain cases we find evidence suggesting that successive adsorption of CO can distort the metal cluster framework. In addition, the vibrational spectra of the Au(n)(CO)(m)(+) complexes in both the CO stretching region and in the region of the Au-C stretch and the Au-C-O bend are measured using infrared photodepletion spectroscopy. The spectra further aid in the structure determination of Au(n)(+), provide information on the structure of the Au(n)(+)-CO complexes, and can be compared with spectra of CO adsorbates on deposited clusters or surfaces.
Is age really the second parameter in globular clusters?

International Nuclear Information System (INIS)

Vandenberg, D.A.; Durrell, P.R.

1990-01-01

From the close similarity of the magnitude difference between the tip of the red giant branch and the turnoff in the Fe/H = about -1.3 globular clusters NGC 288, NGC 362, and M5, it is inferred that the ages of these three systems (and Palomar 5, whose horizontal branch is used to define its distance relative to the others) are not detectably different. An identical conclusion, by similar means, is reached for the Fe/H = about -2.1 globular clusters M15, M30, M68, and M92. Several recent claims that age is responsible for the wide variation in horizontal-branch morphology among clusters of the same metal abundance are not supported. 73 refs
Exploiting visual search theory to infer social interactions

Science.gov (United States)

Rota, Paolo; Dang-Nguyen, Duc-Tien; Conci, Nicola; Sebe, Nicu

2013-03-01

In this paper we propose a new method to infer human social interactions using typical techniques adopted in the literature for visual search and information retrieval. The main piece of information we use to discriminate among different types of interactions is provided by proxemics cues acquired by a tracker, and used to distinguish between intentional and casual interactions. The proxemics information has been acquired through the analysis of two different metrics: on the one hand we observe the current distance between subjects, and on the other hand we measure the O-space synergy between subjects. The obtained values are taken at every time step over a temporal sliding window, and processed in the Discrete Fourier Transform (DFT) domain. The features are eventually merged into a unique array, and clustered using the K-means algorithm. The clusters are reorganized using a second larger temporal window into a Bag Of Words framework, so as to build the feature vector that will feed the SVM classifier.
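The sliding-window DFT step in this abstract can be sketched as follows. This is a minimal illustration under assumptions, not the authors' pipeline: the window length, step size and synthetic signals are hypothetical, as is the function name `windowed_dft_features`, and the later K-means, Bag Of Words and SVM stages are omitted.

```python
import numpy as np

def windowed_dft_features(distance, synergy, window=8, step=4):
    """DFT magnitudes of two proxemics signals over a sliding window."""
    distance = np.asarray(distance, dtype=float)
    synergy = np.asarray(synergy, dtype=float)
    features = []
    for start in range(0, len(distance) - window + 1, step):
        stop = start + window
        # rfft keeps the window // 2 + 1 non-redundant coefficients of a real signal.
        d_spec = np.abs(np.fft.rfft(distance[start:stop]))
        s_spec = np.abs(np.fft.rfft(synergy[start:stop]))
        # Merge both spectra into one feature row per window.
        features.append(np.concatenate([d_spec, s_spec]))
    return np.array(features)

t = np.arange(20)
distance = np.full(20, 1.5)           # subjects at a constant distance
synergy = np.cos(2 * np.pi * t / 8)   # one full oscillation per window
feats = windowed_dft_features(distance, synergy)
```

Each row of `feats` holds the spectrum of one window for both metrics; rows like these are what a clustering step such as K-means would then group.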
The Morphologies and Alignments of Gas, Mass, and the Central Galaxies of CLASH Clusters of Galaxies

Science.gov (United States)

Donahue, Megan; Ettori, Stefano; Rasia, Elena; Sayers, Jack; Zitrin, Adi; Meneghetti, Massimo; Voit, G. Mark; Golwala, Sunil; Czakon, Nicole; Yepes, Gustavo; Baldi, Alessandro; Koekemoer, Anton; Postman, Marc

2016-03-01

Morphology is often used to infer the state of relaxation of galaxy clusters. The regularity, symmetry, and degree to which a cluster is centrally concentrated inform quantitative measures of cluster morphology. The Cluster Lensing and Supernova survey with Hubble Space Telescope (CLASH) used weak and strong lensing to measure the distribution of matter within a sample of 25 clusters, 20 of which were deemed to be “relaxed” based on their X-ray morphology and alignment of the X-ray emission with the Brightest Cluster Galaxy. Toward a quantitative characterization of this important sample of clusters, we present uniformly estimated X-ray morphological statistics for all 25 CLASH clusters. We compare X-ray morphologies of CLASH clusters with those identically measured for a large sample of simulated clusters from the MUSIC-2 simulations, selected by mass. We confirm a threshold in X-ray surface brightness concentration of C ≳ 0.4 for cool-core clusters, where C is the ratio of X-ray emission inside 100 h_70^-1 kpc compared to inside 500 h_70^-1 kpc. We report and compare morphologies of these clusters inferred from Sunyaev-Zeldovich Effect (SZE) maps of the hot gas and from projected mass maps based on strong and weak lensing. We find a strong agreement in alignments of the orientation of major axes for the lensing, X-ray, and SZE maps of nearly all of the CLASH clusters at radii of 500 kpc (approximately 1/2 R500 for these clusters). We also find a striking alignment of cluster shapes at the 500 kpc scale, as measured with X-ray, SZE, and lensing, with that of the near-infrared stellar light at 10 kpc scales for the 20 “relaxed” clusters. This strong alignment indicates a powerful coupling between the cluster- and galaxy-scale galaxy formation processes.
Hierarchical Bayesian nonparametric mixture models for clustering with variable relevance determination.

Science.gov (United States)

Yau, Christopher; Holmes, Chris

2011-07-01

We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.
Fuzzy-Logic Based Distributed Energy-Efficient Clustering Algorithm for Wireless Sensor Networks.

Science.gov (United States)

Zhang, Ying; Wang, Jun; Han, Dezhi; Wu, Huafeng; Zhou, Rundong

2017-07-03

Owing to its high energy efficiency and scalability, the clustering routing algorithm has been widely used in wireless sensor networks (WSNs). To gather information more efficiently, each sensor node transmits data to the Cluster Head (CH) to which it belongs by multi-hop communication. However, multi-hop communication within the cluster causes excessive energy consumption in the relay nodes closer to the CH. These nodes deplete their energy more quickly than the farther nodes, which harms load balance across the whole network. Therefore, we propose an energy-efficient distributed clustering algorithm based on a fuzzy approach with non-uniform distribution (EEDCF). During CH election, we take each node's energy, node degree, and neighbor nodes' residual energies as the input parameters. In addition, we use the Takagi, Sugeno and Kang (TSK) fuzzy model instead of the traditional method as our inference system to make the quantitative analysis more reasonable. In our scheme, each sensor node calculates its probability of becoming a CH with the help of the fuzzy inference system in a distributed way. The experimental results indicate that the EEDCF algorithm outperforms some current representative methods in terms of data transmission, energy consumption, and network lifetime.
Seismic Signal Compression Using Nonparametric Bayesian Dictionary Learning via Clustering

Directory of Open Access Journals (Sweden)

Xin Tian

2017-06-01

Full Text Available We introduce a seismic signal compression method based on nonparametric Bayesian dictionary learning via clustering. The seismic data is compressed patch by patch, and the dictionary is learned online. Clustering is introduced for dictionary learning. A set of dictionaries is generated, and each dictionary is used for one cluster’s sparse coding. In this way, the signals in one cluster can be well represented by their corresponding dictionary. A nonparametric Bayesian dictionary learning method is used to learn the dictionaries, which naturally infers an appropriate dictionary size for each cluster. A uniform quantizer and an adaptive arithmetic coding algorithm are adopted to code the sparse coefficients. Experimental comparisons with other state-of-the-art approaches validate the effectiveness of the proposed method.
Gene cluster statistics with gene families.

Science.gov (United States)

Raghupathy, Narayanan; Durand, Dannie

2009-05-01

Identifying genomic regions that descended from a common ancestor is important for understanding the function and evolution of genomes. In distantly related genomes, clusters of homologous gene pairs are evidence of candidate homologous regions. Demonstrating the statistical significance of such "gene clusters" is an essential component of comparative genomic analyses. However, currently there are no practical statistical tests for gene clusters that model the influence of the number of homologs in each gene family on cluster significance. In this work, we demonstrate empirically that failure to incorporate gene family size in gene cluster statistics results in overestimation of significance, leading to incorrect conclusions. We further present novel analytical methods for estimating gene cluster significance that take gene family size into account. Our methods do not require complete genome data and are suitable for testing individual clusters found in local regions, such as contigs in an unfinished assembly. We consider pairs of regions drawn from the same genome (paralogous clusters), as well as regions drawn from two different genomes (orthologous clusters). Determining cluster significance under general models of gene family size is computationally intractable. By assuming that all gene families are of equal size, we obtain analytical expressions that allow fast approximation of cluster probabilities. We evaluate the accuracy of this approximation by comparing the resulting gene cluster probabilities with cluster probabilities obtained by simulating a realistic, power-law distributed model of gene family size, with parameters inferred from genomic data. Surprisingly, despite the simplicity of the underlying assumption, our method accurately approximates the true cluster probabilities. It slightly overestimates these probabilities, yielding a conservative test. We present additional simulation results indicating the best choice of parameter values for data
Asteroseismic inferences on red giants in open clusters NGC 6791, NGC 6819, and NGC 6811 using Kepler

DEFF Research Database (Denmark)

Hekker, S.; Basu, S.; Stello, D.

2011-01-01

and metallicity contribute to the observed difference in locations in the H-R diagram of the old metal-rich cluster NGC 6791 and the middle-aged solar-metallicity cluster NGC 6819. For the young cluster NGC 6811, the explanation of the position of the stars in the H-R diagram challenges the assumption of solar...
Travel Time Estimation Using Freeway Point Detector Data Based on Evolving Fuzzy Neural Inference System.

Directory of Open Access Journals (Sweden)

Jinjun Tang

Full Text Available Travel time is an important measurement used to evaluate the extent of congestion within road networks. This paper presents a new method to estimate the travel time based on an evolving fuzzy neural inference system. The input variables in the system are traffic flow data (volume, occupancy, and speed) collected from loop detectors located at points both upstream and downstream of a given link, and the output variable is the link travel time. A first order Takagi-Sugeno fuzzy rule set is used to complete the inference. For training the evolving fuzzy neural network (EFNN), two learning processes are proposed: (1) a K-means method is employed to partition input samples into different clusters, and a Gaussian fuzzy membership function is designed for each cluster to measure the membership degree of samples to the cluster centers. As the number of input samples increases, the cluster centers are modified and membership functions are also updated; (2) a weighted recursive least squares estimator is used to optimize the parameters of the linear functions in the Takagi-Sugeno type fuzzy rules. Testing datasets consisting of actual and simulated data are used to test the proposed method. Three common criteria including mean absolute error (MAE), root mean square error (RMSE), and mean absolute relative error (MARE) are utilized to evaluate the estimation performance. Estimation results demonstrate the accuracy and effectiveness of the EFNN method through comparison with existing methods including: multiple linear regression (MLR), instantaneous model (IM), linear model (LM), neural network (NN), and cumulative plots (CP).
Travel Time Estimation Using Freeway Point Detector Data Based on Evolving Fuzzy Neural Inference System.

Science.gov (United States)

Tang, Jinjun; Zou, Yajie; Ash, John; Zhang, Shen; Liu, Fang; Wang, Yinhai

2016-01-01

Travel time is an important measurement used to evaluate the extent of congestion within road networks. This paper presents a new method to estimate the travel time based on an evolving fuzzy neural inference system. The input variables in the system are traffic flow data (volume, occupancy, and speed) collected from loop detectors located at points both upstream and downstream of a given link, and the output variable is the link travel time. A first order Takagi-Sugeno fuzzy rule set is used to complete the inference. For training the evolving fuzzy neural network (EFNN), two learning processes are proposed: (1) a K-means method is employed to partition input samples into different clusters, and a Gaussian fuzzy membership function is designed for each cluster to measure the membership degree of samples to the cluster centers. As the number of input samples increases, the cluster centers are modified and membership functions are also updated; (2) a weighted recursive least squares estimator is used to optimize the parameters of the linear functions in the Takagi-Sugeno type fuzzy rules. Testing datasets consisting of actual and simulated data are used to test the proposed method. Three common criteria including mean absolute error (MAE), root mean square error (RMSE), and mean absolute relative error (MARE) are utilized to evaluate the estimation performance. Estimation results demonstrate the accuracy and effectiveness of the EFNN method through comparison with existing methods including: multiple linear regression (MLR), instantaneous model (IM), linear model (LM), neural network (NN), and cumulative plots (CP).
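The inference step described in this abstract (Gaussian memberships around cluster centers, combined with linear rule consequents) can be sketched as follows. This is a generic first-order Takagi-Sugeno evaluation under stated assumptions, not the authors' EFNN: the function name, the isotropic Gaussian widths, and all numeric values are hypothetical, and the K-means and recursive least squares training steps are omitted.

```python
import numpy as np

def ts_predict(x, centers, widths, coeffs, biases):
    """First-order Takagi-Sugeno output for a single input vector x.

    One rule per cluster: the firing strength is a Gaussian membership of x
    around the cluster center, and the consequent is a linear function of x.
    """
    x = np.asarray(x, dtype=float)
    d2 = np.sum((x - centers) ** 2, axis=1)        # squared distance to each center
    w = np.exp(-d2 / (2.0 * widths ** 2))          # Gaussian firing strengths
    y_rule = coeffs @ x + biases                   # linear consequent of each rule
    return float(np.sum(w * y_rule) / np.sum(w))   # membership-weighted average

# Hypothetical two-rule system on two inputs (values chosen only for illustration)
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
widths = np.array([1.0, 1.0])
coeffs = np.array([[1.0, 0.0], [0.0, 2.0]])
biases = np.array([5.0, -1.0])

near_first = ts_predict([0.0, 0.0], centers, widths, coeffs, biases)
near_second = ts_predict([10.0, 10.0], centers, widths, coeffs, biases)
midway = ts_predict([5.0, 5.0], centers, widths, coeffs, biases)
print(near_first, near_second, midway)
```

Close to a cluster center the output is dominated by that cluster's linear rule; midway between the two centers both rules contribute equally, so the output is the average of the two consequents.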
Metallothionein Zn(2+)- and Cu(2+)-clusters from first-principles calculations

DEFF Research Database (Denmark)

Greisen, Per Junior; Jespersen, Jakob Berg; Kepp, Kasper Planeta

2012-01-01

Detailed electronic structures of Zn(II) and Cu(II) clusters from metallothioneins (MT) have been obtained using density functional theory (DFT), in order to investigate how oxidative stress-caused Cu(II) intermediates affect Zn-binding to MT and cooperatively lead to Cu(I)MT. The inferred accura...
Co-Inheritance Analysis within the Domains of Life Substantially Improves Network Inference by Phylogenetic Profiling.

Directory of Open Access Journals (Sweden)

Junha Shin

Full Text Available Phylogenetic profiling, a network inference method based on gene inheritance profiles, has been widely used to construct functional gene networks in microbes. However, its utility for network inference in higher eukaryotes has been limited. An improved algorithm with an in-depth understanding of pathway evolution may overcome this limitation. In this study, we investigated the effects of taxonomic structures on co-inheritance analysis using 2,144 reference species in four query species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, and Homo sapiens. We observed three clusters of reference species based on a principal component analysis of the phylogenetic profiles, which correspond to the three domains of life (Archaea, Bacteria, and Eukaryota), suggesting that pathways inherit primarily within specific domains or lower-ranked taxonomic groups during speciation. Hence, the co-inheritance pattern within a taxonomic group may be eroded by confounding inheritance patterns from irrelevant taxonomic groups. We demonstrated that co-inheritance analysis within domains substantially improved network inference not only in microbe species but also in the higher eukaryotes, including humans. Although we observed two sub-domain clusters of reference species within Eukaryota, co-inheritance analysis within these sub-domain taxonomic groups only marginally improved network inference. Therefore, we conclude that co-inheritance analysis within domains is the optimal approach to network inference with the given reference species. The construction of a series of human gene networks with increasing sample sizes of the reference species for each domain revealed that the size of the high-accuracy networks increased as additional reference species genomes were included, suggesting that within-domain co-inheritance analysis will continue to expand human gene networks as genomes of additional species are sequenced. Taken together, we propose that co
MASSCLEANage-STELLAR CLUSTER AGES FROM INTEGRATED COLORS

International Nuclear Information System (INIS)

Popescu, Bogdan; Hanson, M. M.

2010-01-01

We present the recently updated and expanded MASSCLEANcolors, a database of 70 million Monte Carlo models selected to match the properties (metallicity, ages, and masses) of stellar clusters found in the Large Magellanic Cloud (LMC). This database shows the rather extreme and non-Gaussian distribution of integrated colors and magnitudes expected with different cluster age and mass and the enormous age degeneracy of integrated colors when mass is unknown. This degeneracy could lead to catastrophic failures in estimating age with standard simple stellar population models, particularly if most of the clusters are of intermediate or low mass, like in the LMC. Utilizing the MASSCLEANcolors database, we have developed MASSCLEANage, a statistical inference package which assigns the most likely age and mass (solved simultaneously) to a cluster based only on its integrated broadband photometric properties. Finally, we use MASSCLEANage to derive the age and mass of LMC clusters based on integrated photometry alone. First, we compare our cluster ages against those obtained for the same seven clusters using more accurate integrated spectroscopy. We find improved agreement with the integrated spectroscopy ages over the original photometric ages. A close examination of our results demonstrates the necessity of solving simultaneously for mass and age to reduce degeneracies in the cluster ages derived via integrated colors. We then selected an additional subset of 30 photometric clusters with previously well-constrained ages and independently derive their age using the MASSCLEANage with the same photometry with very good agreement. The MASSCLEANage program is freely available under GNU General Public License.
Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions.

Science.gov (United States)

Tokuda, Tomoki; Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji

2017-01-01

We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.
Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions.

Directory of Open Access Journals (Sweden)

Tomoki Tokuda

Full Text Available We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.
Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

Science.gov (United States)

Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji

2017-01-01

We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data. PMID:29049392

Measuring the Mean and Scatter of the X-ray Luminosity -- Optical Richness Relation for maxBCG Galaxy Clusters

Energy Technology Data Exchange (ETDEWEB)

Rykoff, E.S.; McKay, T.A.; Becker, M.A.; Evrard, A.; Johnston, D.E.; Koester, B.P.; Rozo, E.; Sheldon, E.S.; Wechsler, Risa H.

2007-10-02

We interpret and model the statistical weak lensing measurements around 130,000 groups and clusters of galaxies in the Sloan Digital Sky Survey presented by Sheldon et al. (2007). We present non-parametric inversions of the 2D shear profiles to the mean 3D cluster density and mass profiles in bins of both optical richness and cluster i-band luminosity. Since the mean cluster density profile is proportional to the cluster-mass correlation function, the mean profile is spherically symmetric by the assumptions of large-scale homogeneity and isotropy. We correct the inferred 3D profiles for systematic effects, including non-linear shear and the fact that cluster halos are not all precisely centered on their brightest galaxies. We also model the measured cluster shear profile as a sum of contributions from the brightest central galaxy, the cluster dark matter halo, and neighboring halos. We infer the relations between mean cluster virial mass and optical richness and luminosity over two orders of magnitude in cluster mass; the virial mass at fixed richness or luminosity is determined with a precision of approximately 13% including both statistical and systematic errors. We also constrain the halo concentration parameter and halo bias as a function of cluster mass; both are in good agreement with predictions from N-body simulations of LCDM models. The methods employed here will be applicable to deeper, wide-area optical surveys that aim to constrain the nature of the dark energy, such as the Dark Energy Survey, the Large Synoptic Survey Telescope and space-based surveys.
Intracluster age gradients in numerous young stellar clusters

Science.gov (United States)

Getman, K. V.; Feigelson, E. D.; Kuhn, M. A.; Bate, M. R.; Broos, P. S.; Garmire, G. P.

2018-05-01

The pace and pattern of star formation leading to rich young stellar clusters is quite uncertain. In this context, we analyse the spatial distribution of ages within 19 young (median t ≲ 3 Myr on the Siess et al. time-scale), morphologically simple, isolated, and relatively rich stellar clusters. Our analysis is based on young stellar object (YSO) samples from the Massive Young Star-Forming Complex Study in Infrared and X-ray and Star Formation in Nearby Clouds surveys, and a new estimator of pre-main sequence (PMS) stellar ages, AgeJX, derived from X-ray and near-infrared photometric data. Median cluster ages are computed within four annular subregions of the clusters. We confirm and extend the earlier result of Getman et al. (2014): 80 per cent of the clusters show age trends where stars in cluster cores are younger than in outer regions. Our cluster stacking analyses establish the existence of an age gradient to high statistical significance in several ways. Time-scales vary with the choice of PMS evolutionary model; the inferred median age gradient across the studied clusters ranges from 0.75 to 1.5 Myr pc$^{-1}$. The empirical finding reported in the present study - late or continuing formation of stars in the cores of star clusters with older stars dispersed in the outer regions - is well supported by other observational studies and by astrophysical models such as the global hierarchical collapse model of Vázquez-Semadeni et al.
Star Formation Activity in CLASH Brightest Cluster Galaxies

Science.gov (United States)

Fogarty, Kevin; Postman, Marc; Connor, Thomas; Donahue, Megan; Moustakas, John

2015-11-01

The CLASH X-ray selected sample of 20 galaxy clusters contains 10 brightest cluster galaxies (BCGs) that exhibit significant (>5σ) extinction-corrected star formation rates (SFRs). Star formation activity is inferred from photometric estimates of UV and Hα+[N ii] emission in knots and filaments detected in CLASH Hubble Space Telescope ACS and WFC3 observations. UV-derived SFRs in these BCGs span two orders of magnitude, including two with a SFR ≳ 100 $M_\odot\,\mathrm{yr}^{-1}$. These measurements are supplemented with [O ii], [O iii], and Hβ fluxes measured from spectra obtained with the SOAR telescope. We confirm that photoionization from ongoing star formation powers the line emission nebulae in these BCGs, although in many BCGs there is also evidence of a LINER-like contribution to the line emission. Coupling these data with Chandra X-ray measurements, we infer that the star formation occurs exclusively in low-entropy cluster cores and exhibits a correlation with gas properties related to cooling. We also perform an in-depth study of the starburst history of the BCG in the cluster RXJ1532.9+3021, and create 2D maps of stellar properties on scales down to ~350 pc. These maps reveal evidence for an ongoing burst occurring in elongated filaments, generally on ~0.5-1.0 Gyr timescales, although some filaments are consistent with much younger (≲100 Myr) burst timescales and may be correlated with recent activity from the active galactic nucleus. The relationship between BCG SFRs and the surrounding intracluster medium gas properties provides new support for the process of feedback-regulated cooling in galaxy clusters and is consistent with recent theoretical predictions. Based on observations obtained at the Southern Astrophysical Research (SOAR) telescope, which is a joint project of the Ministério da Ciência, Tecnologia, e Inovação (MCTI) da República Federativa do Brasil, the U.S. National Optical Astronomy Observatory (NOAO), the University of North Carolina at Chapel
Entropic Inference

Science.gov (United States)

Caticha, Ariel

2011-03-01

In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops, the Maximum Entropy and the Bayesian methods, into a single general inference scheme.
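For orientation, the logarithmic relative entropy referred to in this abstract is usually written as below. The symbols are assumptions introduced here for illustration: $q(x)$ denotes the prior, $p(x)$ a candidate posterior, and $f(x)$ a generic constrained quantity with expected value $F$; the abstract itself does not fix any notation.

```latex
% Relative entropy of a candidate posterior p with respect to the prior q
S[p, q] = -\int dx \, p(x) \, \log \frac{p(x)}{q(x)}

% The ME update selects the p that maximizes S[p, q] subject to the
% available constraints, for example normalization and an expected value:
\int dx \, p(x) = 1 , \qquad \int dx \, p(x) \, f(x) = F
```

With a uniform prior $q$, maximizing $S[p, q]$ reduces to the familiar MaxEnt procedure, which is one of the two special cases the abstract mentions.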
Possible systematic decreases in the age of globular clusters

Energy Technology Data Exchange (ETDEWEB)

Shi, X. [Univ. of Chicago, Chicago, IL (United States); Schramm, D. N. [Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States); Univ. of Chicago, Chicago, IL (United States); Dearborn, D. S.P. [Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States); Truran, J. W. [Univ. of Chicago, Chicago, IL (United States)

1994-03-01

The ages of globular clusters inferred from observations depend sensitively on assumptions like the initial helium abundance and the mass loss rate. A high helium abundance (e.g., $Y \approx 0.28$) or a mass loss rate of $\sim 10^{-11}\,M_\odot\,\mathrm{yr}^{-1}$ near the main sequence turn-off region lowers the current age estimate from 14 Gyr to about 10-12 Gyr, significantly relaxing the constraints on the Hubble constant, allowing values as high as 60 km/sec/Mpc for a universe with the critical density and 90 km/sec/Mpc for a baryon-only universe. Possible mechanisms for the helium enhancement in globular clusters are discussed, as are arguments for an instability strip induced mass loss near the turn-off. Ages lower than 10 Gyr are not possible even with the operation of both of these mechanisms unless the initial helium abundance in globular clusters is >0.30, which would conflict with indirect measurements of helium abundances in globular clusters.
A heuristic approach to possibilistic clustering algorithms and applications

CERN Document Server

Viattchenin, Dmitri A

2013-01-01

The present book outlines a new approach to possibilistic clustering in which the sought clustering structure of the set of objects is based directly on the formal definition of fuzzy cluster and the possibilistic memberships are determined directly from the values of the pairwise similarity of objects. The proposed approach can be used for solving different classification problems. Here, some techniques that might be useful for this purpose are outlined, including a methodology for constructing a set of labeled objects for a semi-supervised clustering algorithm, a methodology for reducing analyzed attribute space dimensionality, and methods for asymmetric data processing. Moreover, a technique for constructing a subset of the most appropriate alternatives for a set of weak fuzzy preference relations, which are defined on a universe of alternatives, is described in detail, and a method for rapidly prototyping Mamdani fuzzy inference systems is introduced. This book addresses engineers, scientist...
Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data

Directory of Open Access Journals (Sweden)

Wei Du

2013-01-01

Full Text Available Phylogenetic trees are used to represent the evolutionary relationship among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies using multiple kinds of genomic information is proposed. The method is called CGCPhy and is based on the distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes are determined by sequence similarity, genomic function, and genomic structure information. Second, genes involved in potential horizontal gene transfer (HGT) events are eliminated; these are taken to be genes that are highly conserved across different species and genes located on fragments with an abnormal genome barcode. Third, we calculate the distance of the orthologous gene clusters between each genome pair in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across different species. CGCPhy has been examined on different datasets from 617 complete single-chromosome prokaryotic genomes and achieved practically useful accuracies on different species sets, in agreement with Bergey's taxonomy in quartet topologies. Simulation results show that CGCPhy achieves high average accuracy and has a low standard deviation on different datasets, so it has practical potential for phylogenetic analysis.
Bayesian versus frequentist statistical inference for investigating a one-off cancer cluster reported to a health department

Directory of Open Access Journals (Sweden)

Wills Rachael A

2009-05-01

Full Text Available Abstract. Background: The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods: This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results: Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion: In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
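The sensitivity to the assumed number of silent comparisons can be illustrated with the standard Šidák adjustment, p_adj = 1 - (1 - p)^m. The sketch below is a generic illustration in Python; the unadjusted p-value of 0.001 and the candidate values of m are hypothetical and are not taken from the two re-analysed investigations.

```python
def sidak_adjusted_p(p, m):
    """Sidak-adjusted p-value for one test out of m independent comparisons."""
    return 1.0 - (1.0 - p) ** m

def sidak_per_comparison_alpha(alpha, m):
    """Per-comparison significance level that keeps the family-wise level at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)

# Hypothetical unadjusted p-value for a reported cluster
p_raw = 0.001

# The number of silent comparisons m is unknown, so several guesses are tried
adjusted = {m: sidak_adjusted_p(p_raw, m) for m in (1, 10, 100, 1000)}
for m, p_adj in adjusted.items():
    print(f"m = {m:5d}  adjusted p = {p_adj:.4f}")
```

With these illustrative numbers the adjusted p-value stays below 0.05 for m = 10 but exceeds it for m = 100 and m = 1000, which mirrors the abstract's point that the conclusion hinges on a subjective guess for m.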
Cosmological constraints from Chandra observations of galaxy clusters.

Science.gov (United States)

Allen, Steven W

2002-09-15

Chandra observations of rich, relaxed galaxy clusters allow the properties of the X-ray gas and the total gravitating mass to be determined precisely. Here, we present results for a sample of the most X-ray luminous, dynamically relaxed clusters known. We show that the Chandra data and independent gravitational lensing studies provide consistent answers on the mass distributions in the clusters. The mass profiles exhibit a form in good agreement with the predictions from numerical simulations. Combining Chandra results on the X-ray gas mass fractions in the clusters with independent measurements of the Hubble constant and the mean baryonic matter density in the Universe, we obtain a tight constraint on the mean total matter density of the Universe, $\Omega_m$, and an interesting constraint on the cosmological constant, $\Omega_\Lambda$. We also describe the 'virial relations' linking the masses, X-ray temperatures and luminosities of galaxy clusters. These relations provide a key step in linking the observed number density and spatial distribution of clusters to the predictions from cosmological models. The Chandra data confirm the presence of a systematic offset of ca. 40% between the normalization of the observed mass-temperature relation and the predictions from standard simulations. This finding leads to a significant revision of the best-fit value of $\sigma_8$ inferred from the observed temperature and luminosity functions of clusters.
ACTION-SPACE CLUSTERING OF TIDAL STREAMS TO INFER THE GALACTIC POTENTIAL

Energy Technology Data Exchange (ETDEWEB)

Sanderson, Robyn E.; Helmi, Amina [Kapteyn Astronomical Institute, P.O. Box 800, 9700 AV Groningen (Netherlands); Hogg, David W., E-mail: robyn@astro.columbia.edu [Center for Cosmology and Particle Physics, Department of Physics, New York University, 4 Washington Place, New York, NY 10003 (United States)

2015-03-10

We present a new method for constraining the Milky Way halo gravitational potential by simultaneously fitting multiple tidal streams. This method requires three-dimensional positions and velocities for all stars to be fit, but does not require identification of any specific stream or determination of stream membership for any star. We exploit the principle that the action distribution of stream stars is most clustered when the potential used to calculate the actions is closest to the true potential. Clustering is quantified with the Kullback-Leibler Divergence (KLD), which also provides conditional uncertainties for our parameter estimates. We show, for toy Gaia-like data in a spherical isochrone potential, that maximizing the KLD of the action distribution relative to a smoother distribution recovers the input potential. The precision depends on the observational errors and number of streams; using K III giants as tracers, we measure the enclosed mass at the average radius of the sample stars accurate to 3% and precise to 20%-40%. Recovery of the scale radius is precise to 25%, biased 50% high by the small galactocentric distance range of stars in our mock sample (1-25 kpc, or about three scale radii, with mean 6.5 kpc). 20-25 streams with at least 100 stars each are required for a stable confidence interval. With radial velocities (RVs) to 100 kpc, all parameters are determined with ∼10% accuracy and 20% precision (1.3% accuracy for the enclosed mass), underlining the need to complete the RV catalog for faint halo stars observed by Gaia.
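The idea of scoring how clustered a distribution is, relative to a smoother one, can be illustrated with a discrete Kullback-Leibler divergence between two histograms. This is only a minimal sketch and not the estimator used in the paper: the bin counts are invented, and the function name `kld` and the floor `eps` are assumptions made for the example.

```python
import math


def kld(p_counts, q_counts, eps=1e-12):
    """Discrete KL divergence D(P || Q) in nats from two equal-length histograms."""
    if len(p_counts) != len(q_counts):
        raise ValueError("histograms must have the same number of bins")
    p_total = sum(p_counts)
    q_total = sum(q_counts)
    divergence = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = pc / p_total
        # Floor q so that empty reference bins do not produce log(0) (assumption).
        q = max(qc / q_total, eps)
        if p > 0.0:
            divergence += p * math.log(p / q)
    return divergence


smooth = [1, 1, 1, 1, 1, 1]            # hypothetical smooth reference
clumpy = [0, 90, 0, 10, 0, 0]          # hypothetical tightly clustered actions
diffuse = [15, 20, 15, 20, 15, 15]     # hypothetical weakly clustered actions
print(kld(clumpy, smooth), kld(diffuse, smooth))
```

In this toy setting the more tightly clumped histogram gives the larger divergence, which mirrors the principle that the best trial potential is the one that maximizes the KLD.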
More than one kind of inference: re-examining what's learned in feature inference and classification.

Science.gov (United States)

Sweller, Naomi; Hayes, Brett K

2010-08-01

Three studies examined how task demands that impact on attention to typical or atypical category features shape the category representations formed through classification learning and inference learning. During training categories were learned via exemplar classification or by inferring missing exemplar features. In the latter condition inferences were made about missing typical features alone (typical feature inference) or about both missing typical and atypical features (mixed feature inference). Classification and mixed feature inference led to the incorporation of typical and atypical features into category representations, with both kinds of features influencing inferences about familiar (Experiments 1 and 2) and novel (Experiment 3) test items. Those in the typical inference condition focused primarily on typical features. Together with formal modelling, these results challenge previous accounts that have characterized inference learning as producing a focus on typical category features. The results show that two different kinds of inference learning are possible and that these are subserved by different kinds of category representations.
Perceptual inference.

Science.gov (United States)

Aggelopoulos, Nikolaos C

2015-08-01

Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.
SEMANTIC PATCH INFERENCE

DEFF Research Database (Denmark)

Andersen, Jesper

2009-01-01

Collateral evolution is the problem of updating several library-using programs in response to API changes in the used library. In this dissertation we address the issue of understanding collateral evolutions by automatically inferring a high-level specification of the changes evident in a given set ...... specifications inferred by spdiff in Linux are shown. We find that the inferred specifications concisely capture the actual collateral evolution performed in the examples....
From GPS tracks to context: Inference of high-level context information through spatial clustering

OpenAIRE

Moreira, Adriano; Santos, Maribel Yasmina

2005-01-01

Location-aware applications use the location of users to adapt their behaviour and to select the relevant information for users in a particular situation. This location information is obtained through a set of location sensors, or from network-based location services, and is often used directly, without any further processing, as a parameter in a selection process. In this paper we propose a method to infer high-level context information from a series of position records obtained from a GPS r...
Young star clusters in nearby molecular clouds

Science.gov (United States)

Getman, K. V.; Kuhn, M. A.; Feigelson, E. D.; Broos, P. S.; Bate, M. R.; Garmire, G. P.

2018-06-01

The SFiNCs (Star Formation in Nearby Clouds) project is an X-ray/infrared study of the young stellar populations in 22 star-forming regions with distances ≲ 1 kpc designed to extend our earlier MYStIX (Massive Young Star-Forming Complex Study in Infrared and X-ray) survey of more distant clusters. Our central goal is to give empirical constraints on cluster formation mechanisms. Using parametric mixture models applied homogeneously to the catalogue of SFiNCs young stars, we identify 52 SFiNCs clusters and 19 unclustered stellar structures. The procedure gives cluster properties including location, population, morphology, association with molecular clouds, absorption, age (AgeJX), and infrared spectral energy distribution (SED) slope. Absorption, SED slope, and AgeJX are age indicators. SFiNCs clusters are examined individually, and collectively with MYStIX clusters, to give the following results. (1) SFiNCs is dominated by smaller, younger, and more heavily obscured clusters than MYStIX. (2) SFiNCs cloud-associated clusters have the high ellipticities aligned with their host molecular filaments indicating morphology inherited from their parental clouds. (3) The effect of cluster expansion is evident from the radius-age, radius-absorption, and radius-SED correlations. Core radii increase dramatically from ~0.08 to ~0.9 pc over the age range 1-3.5 Myr. Inferred gas removal time-scales are longer than 1 Myr. (4) Rich, spatially distributed stellar populations are present in SFiNCs clouds representing early generations of star formation. An appendix compares the performance of the mixture models and non-parametric minimum spanning tree to identify clusters. This work is a foundation for future SFiNCs/MYStIX studies including disc longevity, age gradients, and dynamical modelling.
INDIVIDUAL AND GROUP GALAXIES IN CNOC1 CLUSTERS

International Nuclear Information System (INIS)

Li, I. H.; Yee, H. K. C.; Ellingson, E.

2009-01-01

Using wide-field BVR_cI imaging for a sample of 16 intermediate redshift (0.17 red ) to infer the evolutionary status of galaxies in clusters, using both individual galaxies and galaxies in groups. We apply the local galaxy density, Σ_5, derived using the fifth nearest neighbor distance, as a measure of local environment, and the cluster-centric radius, r_CL, as a proxy for global cluster environment. Our cluster sample exhibits a Butcher-Oemler effect in both luminosity-selected and stellar-mass-selected samples. We find that f_red depends strongly on Σ_5 and r_CL, and the Butcher-Oemler effect is observed in all Σ_5 and r_CL bins. However, when the cluster galaxies are separated into r_CL bins, or into group and nongroup subsamples, the dependence on local galaxy density becomes much weaker. This suggests that the properties of the dark matter halo in which the galaxy resides have a dominant effect on its galaxy population and evolutionary history. We find that our data are consistent with the scenario that cluster galaxies situated in successively richer groups (i.e., more massive dark matter halos) reach a high f_red value at earlier redshifts. Associated with this, we observe a clear signature of 'preprocessing', in which cluster galaxies belonging to moderately massive infalling galaxy groups show a much stronger evolution in f_red than those classified as nongroup galaxies, especially at the outskirts of the cluster. This result suggests that galaxies in groups infalling into clusters are significant contributors to the Butcher-Oemler effect.
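A local galaxy density built from the fifth nearest neighbor distance is often computed as the number of neighbors divided by the projected area of the circle that encloses them. The abstract does not give the exact estimator, so the sketch below assumes that common convention; the function name `sigma_n`, the coordinates, and the units are hypothetical.

```python
import math


def sigma_n(target, others, n=5):
    """Projected density n / (pi * d_n^2), where d_n is the n-th nearest neighbor distance.

    target: (x, y) position of the galaxy of interest.
    others: iterable of (x, y) positions of the other galaxies, in the same units.
    """
    distances = sorted(
        math.hypot(x - target[0], y - target[1]) for x, y in others
    )
    if len(distances) < n:
        raise ValueError("need at least n neighbors")
    d_n = distances[n - 1]
    if d_n == 0.0:
        raise ValueError("n-th neighbor coincides with the target")
    return n / (math.pi * d_n ** 2)


# Hypothetical projected positions in Mpc; the result is in galaxies per Mpc^2.
neighbors = [(1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0)]
print(sigma_n((0, 0), neighbors))
```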
GLOBULAR CLUSTER FORMATION EFFICIENCIES FROM BLACK HOLE X-RAY BINARY FEEDBACK

Energy Technology Data Exchange (ETDEWEB)

Justham, Stephen [The Key Laboratory of Optical Astronomy, National Astronomical Observatories, The Chinese Academy of Sciences, Datun Road, Beijing 100012 (China); Peng, Eric W. [Department of Astronomy, Peking University, Beijing 100871 (China); Schawinski, Kevin, E-mail: sjustham@nao.cas.cn [Institute for Astronomy, ETH Zurich, Wolfgang-Pauli-Strasse 27, 8093 Zurich (Switzerland)

2015-08-10

We investigate a scenario in which feedback from black hole X-ray binaries (BHXBs) sometimes begins inside young star clusters before strong supernova (SN) feedback. Those BHXBs could reduce the gas fraction inside embedded young clusters while maintaining virial equilibrium, which may help globular clusters (GCs) to stay bound when SN-driven gas ejection subsequently occurs. Adopting a simple toy model with parameters guided by BHXB population models, we produce GC formation efficiencies consistent with empirically inferred values. The metallicity dependence of BHXB formation could naturally explain why GC formation efficiency is higher at lower metallicity. For reasonable assumptions about that metallicity dependence, our toy model can produce a GC metallicity bimodality in some galaxies without a bimodality in the field-star metallicity distribution.
Multimodel inference and adaptive management

Science.gov (United States)

Rehme, S.E.; Powell, L.A.; Allen, Craig R.

2011-01-01

Ecology is an inherently complex science coping with correlated variables, nonlinear interactions and multiple scales of pattern and process, making it difficult for experiments to result in clear, strong inference. Natural resource managers, policy makers, and stakeholders rely on science to provide timely and accurate management recommendations. However, the time necessary to untangle the complexities of interactions within ecosystems is often far greater than the time available to make management decisions. One method of coping with this problem is multimodel inference. Multimodel inference assesses uncertainty by calculating likelihoods among multiple competing hypotheses, but multimodel inference results are often equivocal. Despite this, there may be pressure for ecologists to provide management recommendations regardless of the strength of their study’s inference. We reviewed papers in the Journal of Wildlife Management (JWM) and the journal Conservation Biology (CB) to quantify the prevalence of multimodel inference approaches, the resulting inference (weak versus strong), and how authors dealt with the uncertainty. Thirty-eight percent and 14%, respectively, of articles in the JWM and CB used multimodel inference approaches. Strong inference was rarely observed, with only 7% of JWM and 20% of CB articles resulting in strong inference. We found the majority of weak inference papers in both journals (59%) gave specific management recommendations. Model selection uncertainty was ignored in most recommendations for management. We suggest that adaptive management is an ideal method to resolve uncertainty when research results in weak inference.
Detection of enhancement in number densities of background galaxies due to magnification by massive galaxy clusters

Energy Technology Data Exchange (ETDEWEB)

Chiu, I.; Dietrich, J. P.; Mohr, J.; Applegate, D. E.; Benson, B. A.; Bleem, L. E.; Bayliss, M. B.; Bocquet, S.; Carlstrom, J. E.; Capasso, R.; Desai, S.; Gangkofner, C.; Gonzalez, A. H.; Gupta, N.; Hennig, C.; Hoekstra, H.; von der Linden, A.; Liu, J.; McDonald, M.; Reichardt, C. L.; Saro, A.; Schrabback, T.; Strazzullo, V.; Stubbs, C. W.; Zenteno, A.

2016-02-18

We present a detection of the enhancement in the number densities of background galaxies induced from lensing magnification and use it to test the Sunyaev-Zel'dovich effect (SZE)-inferred masses in a sample of 19 galaxy clusters with median redshift z ≃ 0.42 selected from the South Pole Telescope SPT-SZ survey. These clusters are observed by the Megacam on the Magellan Clay Telescope through gri filters. Two background galaxy populations are selected for this study through their photometric colours; they have median redshifts z_median ≃ 0.9 (low-z background) and z_median ≃ 1.8 (high-z background). Stacking these populations, we detect the magnification bias effect at 3.3σ and 1.3σ for the low- and high-z backgrounds, respectively. We fit Navarro, Frenk and White models simultaneously to all observed magnification bias profiles to estimate the multiplicative factor η that describes the ratio of the weak lensing mass to the mass inferred from the SZE observable-mass relation. We further quantify systematic uncertainties in η resulting from the photometric noise and bias, the cluster galaxy contamination and the estimations of the background properties. The resulting η for the combined background populations with 1σ uncertainties is 0.83 ± 0.24(stat) ± 0.074(sys), indicating good consistency between the lensing and the SZE-inferred masses. We use our best-fitting η to predict the weak lensing shear profiles and compare these predictions with observations, showing agreement between the magnification and shear mass constraints. This work demonstrates the promise of using the magnification as a complementary method to estimate cluster masses in large surveys.
Study on Data Clustering and Intelligent Decision Algorithm of Indoor Localization

Science.gov (United States)

Liu, Zexi

2018-01-01

Indoor positioning technology gives people positional awareness inside buildings, but it suffers from the limited coverage of any single network and from redundant location data. This article therefore studies a clustering algorithm for indoor positioning data together with intelligent decision-making. It sets out the basic ideas of multi-source indoor positioning and analyzes a fingerprint localization algorithm based on distance measurement, integrated with position and orientation data from inertial devices. Optimizing the clustering of massive indoor location data yields data normalization pretreatment, multi-dimensional controllable cluster centers and multi-factor clustering, and reduces the redundancy of the location data. In addition, a path inference and decision method based on a neural network is proposed, with a sparse data input layer, a dynamic feedback hidden layer and an output layer; its low-dimensional results improve intelligent navigation path planning.

A phylogenomic gene cluster resource: The phylogenetically inferred groups (PhIGs) database

Energy Technology Data Exchange (ETDEWEB)

Dehal, Paramvir S.; Boore, Jeffrey L.

2005-08-25

We present here the PhIGs database, a phylogenomic resource for sequenced genomes. Although many methods exist for clustering gene families, very few attempt to create truly orthologous clusters sharing descent from a single ancestral gene across a range of evolutionary depths. Although these non-phylogenetic gene family clusters have been used broadly for gene annotation, errors are known to be introduced by the artifactual association of slowly evolving paralogs and lack of annotation for those more rapidly evolving. A full phylogenetic framework is necessary for accurate inference of function and for many studies that address pattern and mechanism of the evolution of the genome. The automated generation of evolutionary gene clusters, creation of gene trees, determination of orthology and paralogy relationships, and the correlation of this information with gene annotations, expression information, and genomic context is an important resource to the scientific community.
Testing the accuracy of clustering redshifts with simulations

Science.gov (United States)

Scottez, V.; Benoit-Lévy, A.; Coupon, J.; Ilbert, O.; Mellier, Y.

2018-03-01

We explore the accuracy of clustering-based redshift inference within the MICE2 simulation. This method uses the spatial clustering of galaxies between a spectroscopic reference sample and an unknown sample. This study gives an estimate of the reachable accuracy of this method. First, we discuss the requirements for the number of objects in the two samples, confirming that this method does not require a representative spectroscopic sample for calibration. In the context of the next generation of cosmological surveys, we estimated that the density of the Quasi Stellar Objects in BOSS allows us to reach 0.2 per cent accuracy in the mean redshift. Secondly, we estimate individual redshifts for galaxies in the densest regions of colour space (~30 per cent of the galaxies) without using the photometric redshifts procedure. The advantage of this procedure is threefold. It allows: (i) the use of cluster-zs for any field in astronomy, (ii) the possibility to combine photo-zs and cluster-zs to get an improved redshift estimation, (iii) the use of cluster-z to define tomographic bins for weak lensing. Finally, we explore this last option and build five cluster-z selected tomographic bins from redshift 0.2 to 1. We found a bias on the mean redshift estimate of 0.002 per bin. We conclude that cluster-z could be used as a primary redshift estimator by the next generation of cosmological surveys.
THERE ARE NO STARLESS MASSIVE PROTO-CLUSTERS IN THE FIRST QUADRANT OF THE GALAXY

Energy Technology Data Exchange (ETDEWEB)

Ginsburg, A.; Bally, J.; Battersby, C. [Center for Astrophysics and Space Astronomy, University of Colorado, Boulder, CO 80309 (United States); Bressert, E. [European Southern Observatory, Karl Schwarzschild str. 2, D-85748 Garching bei Muenchen (Germany)

2012-10-20

We search the λ = 1.1 mm Bolocam Galactic Plane Survey for clumps containing sufficient mass to form ~10^4 M_⊙ star clusters. Eighteen candidate massive proto-clusters are identified in the first Galactic quadrant outside of the central kiloparsec. This sample is complete to clumps with mass M_clump > 10^4 M_⊙ and radius r ≲ 2.5 pc. The overall Galactic massive cluster formation rate is CFR(M_cluster > 10^4) ≲ 5 Myr^-1, which is in agreement with the rates inferred from Galactic open clusters and M31 massive clusters. We find that all massive proto-clusters in the first quadrant are actively forming massive stars and place an upper limit of τ_starless < 0.5 Myr on the lifetime of the starless phase of massive cluster formation. If massive clusters go through a starless phase with all of their mass in a single clump, the lifetime of this phase is very short.
Optimal inference with suboptimal models: Addiction and active Bayesian inference

Science.gov (United States)

Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl

2015-01-01

When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Inference rule and problem solving

Energy Technology Data Exchange (ETDEWEB)

Goto, S

1982-04-01

Intelligent information processing signifies an opportunity of having man's intellectual activity executed on the computer, with inference, in place of ordinary calculation, used as the basic operational mechanism. Many inference rules are derived from syllogisms in formal logic. The problem of programming this inference function is referred to as problem solving. Although inference and problem solving are logically closely related, the calculation ability of current computers is still at a low level for inference. To clarify the relation between inference and computers, nonmonotonic logic has been considered. The paper deals with the above topics. 16 references.
Cluster analysis of rural, urban, and curbside atmospheric particle size data.

Science.gov (United States)

Beddows, David C S; Dall'Osto, Manuel; Harrison, Roy M

2009-07-01

Particle size is a key determinant of the hazard posed by airborne particles. Continuous multivariate particle size data have been collected using aerosol particle size spectrometers sited at four locations within the UK: Harwell (Oxfordshire); Regents Park (London); British Telecom Tower (London); and Marylebone Road (London). These data have been analyzed using k-means cluster analysis, deduced to be the preferred cluster analysis technique, selected from an option of four partitional cluster packages, namely the following: Fuzzy; k-means; k-median; and Model-Based clustering. Using cluster validation indices, k-means clustering was shown to produce clusters with the smallest size, furthest separation, and importantly the highest degree of similarity between the elements within each partition. Using k-means clustering, the complexity of the data set is reduced, allowing characterization of the data according to the temporal and spatial trends of the clusters. At Harwell, the rural background measurement site, the cluster analysis showed that the spectra may be differentiated by their modal-diameters and average temporal trends showing either high counts during the day-time or night-time hours. Likewise for the urban sites, the cluster analysis differentiated the spectra into a small number of size distributions according to their modal-diameter, the location of the measurement site, and time of day. The responsible aerosol emission, formation, and dynamic processes can be inferred according to the cluster characteristics and correlation to concurrently measured meteorological, gas phase, and particle phase measurements.
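The k-means partitioning step described above can be sketched with plain Lloyd iterations. This is a minimal illustration under stated assumptions, not the package used in the study: the four two-bin "spectra" and the starting centroids are invented, and the function name `kmeans` is hypothetical.

```python
def kmeans(points, centroids, n_iter=100):
    """Lloyd's algorithm with caller-supplied starting centroids.

    points: list of equal-length numeric vectors (e.g. normalized size spectra).
    centroids: list of k starting vectors.
    Returns (centroids, labels).
    """
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical two-bin spectra: two dominated by small particles, two by large.
spectra = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
centers, labels = kmeans(spectra, [spectra[0], spectra[2]])
print(labels, centers)
```

Supplying the starting centroids keeps the sketch deterministic; real analyses usually try several random initializations and compare them with validation indices.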
THE PANCHROMATIC HUBBLE ANDROMEDA TREASURY. III. MEASURING AGES AND MASSES OF PARTIALLY RESOLVED STELLAR CLUSTERS

Energy Technology Data Exchange (ETDEWEB)

Beerman, Lori C.; Johnson, L. Clifton; Fouesneau, Morgan; Dalcanton, Julianne J.; Weisz, Daniel R.; Williams, Ben F. [Department of Astronomy, University of Washington, Box 351580, Seattle, WA 98195 (United States); Seth, Anil C. [Department of Physics and Astronomy, University of Utah, Salt Lake City, UT 84112 (United States); Bell, Eric F. [Department of Astronomy, University of Michigan, 500 Church Street, Ann Arbor, MI 48109 (United States); Bianchi, Luciana C. [Department of Physics and Astronomy, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218 (United States); Caldwell, Nelson [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Dolphin, Andrew E. [Raytheon Company, 1151 East Hermans Road, Tucson, AZ 85756 (United States); Gouliermis, Dimitrios A. [Zentrum fuer Astronomie, Institut fuer Theoretische Astrophysik, Universitaet Heidelberg, Albert-Ueberle-Strasse 2, D-69120 Heidelberg (Germany); Kalirai, Jason S. [Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218 (United States); Larsen, Soren S. [Department of Astrophysics, IMAPP, Radboud University Nijmegen, P.O. Box 9010, NL-6500 GL Nijmegen (Netherlands); Melbourne, Jason L. [Caltech Optical Observatories, Division of Physics, Mathematics and Astronomy, Mail Stop 301-17, California Institute of Technology, Pasadena, CA 91125 (United States); Rix, Hans-Walter [Max-Planck-Institut fuer Astronomie, Koenigstuhl 17, D-69117 Heidelberg (Germany); Skillman, Evan D., E-mail: beermalc@astro.washington.edu [Department of Astronomy, University of Minnesota, 116 Church Street SE, Minneapolis, MN 55455 (United States)

2012-12-01

The apparent age and mass of a stellar cluster can be strongly affected by stochastic sampling of the stellar initial mass function (IMF), when inferred from the integrated color of low-mass clusters (≲10^4 M_⊙). We use simulated star clusters to show that these effects are minimized when the brightest, rapidly evolving stars in a cluster can be resolved, and the light of the fainter, more numerous unresolved stars can be analyzed separately. When comparing the light from the less luminous cluster members to models of unresolved light, more accurate age estimates can be obtained than when analyzing the integrated light from the entire cluster under the assumption that the IMF is fully populated. We show the success of this technique first using simulated clusters, and then with a stellar cluster in M31. This method represents one way of accounting for the discrete, stochastic sampling of the stellar IMF in less massive clusters and can be leveraged in studies of clusters throughout the Local Group and other nearby galaxies.
Fitting Latent Cluster Models for Networks with latentnet

Directory of Open Access Journals (Sweden)

Pavel N. Krivitsky

2007-12-01

Full Text Available latentnet is a package to fit and evaluate statistical latent position and cluster models for networks. Hoff, Raftery, and Handcock (2002) suggested an approach to modeling networks based on positing the existence of a latent space of characteristics of the actors. Relationships form as a function of distances between these characteristics as well as functions of observed dyadic level covariates. In latentnet social distances are represented in a Euclidean space. It also includes a variant of the extension of the latent position model to allow for clustering of the positions developed in Handcock, Raftery, and Tantrum (2007). The package implements Bayesian inference for the models based on a Markov chain Monte Carlo algorithm. It can also compute maximum likelihood estimates for the latent position model and a two-stage maximum likelihood method for the latent position cluster model. For latent position cluster models, the package provides a Bayesian way of assessing how many groups there are, and thus whether or not there is any clustering (since if the preferred number of groups is 1, there is little evidence for clustering). It also estimates which cluster each actor belongs to. These estimates are probabilistic, and provide the probability of each actor belonging to each cluster. It computes four types of point estimates for the coefficients and positions: maximum likelihood estimate, posterior mean, posterior mode and the estimator which minimizes Kullback-Leibler divergence from the posterior. You can assess the goodness-of-fit of the model via posterior predictive checks. It has a function to simulate networks from a latent position or latent position cluster model.
Knowledge and inference

CERN Document Server

Nagao, Makoto

1990-01-01

Knowledge and Inference discusses an important problem for software systems: How do we treat knowledge and ideas on a computer and how do we use inference to solve problems on a computer? The book talks about the problems of knowledge and inference for the purpose of merging artificial intelligence and library science. The book begins by clarifying the concept of "knowledge" from many points of view, followed by a chapter on the current state of library science and the place of artificial intelligence in library science. Subsequent chapters cover central topics in the artificial intellig
THE IMPACT OF CONTAMINATED RR LYRAE/GLOBULAR CLUSTER PHOTOMETRY ON THE DISTANCE SCALE

Energy Technology Data Exchange (ETDEWEB)

Majaess, D.; Turner, D.; Lane, D. [Department of Astronomy and Physics, Saint Mary's University, Halifax, NS B3H 3C3 (Canada); Gieren, W., E-mail: dmajaess@ap.smu.ca [Departamento de Astronomia, Universidad de Concepcion, Casilla 160-C, CL Concepcion (Chile)

2012-06-10

RR Lyrae variables and the stellar constituents of globular clusters are employed to establish the cosmic distance scale and age of the universe. However, photometry for RR Lyrae variables in the globular clusters M3, M15, M54, M92, NGC 2419, and NGC 6441 exhibits a dependence on the clustercentric distance. For example, variables and stars positioned near the crowded high-surface brightness cores of the clusters may suffer from photometric contamination, which invariably affects a suite of inferred parameters (e.g., distance, color excess, absolute magnitude, etc.). The impetus for this study is to mitigate the propagation of systematic uncertainties by increasing awareness of the pernicious impact of contaminated and radial-dependent photometry.
Geometric statistical inference

International Nuclear Information System (INIS)

Periwal, Vipul

1999-01-01

A reparametrization-covariant formulation of the inverse problem of probability is explicitly solved for finite sample sizes. The inferred distribution is explicitly continuous for finite sample size. A geometric solution of the statistical inference problem in higher dimensions is outlined.
Dynamics of voids and clusters and fluctuations in the cosmic background radiation

International Nuclear Information System (INIS)

Salpeter, E.E.

1983-01-01

The author summarizes briefly calculations on spherically symmetric models without dissipation for the dynamical development of large voids and galaxy (super)clusters from small underdensities and overdensities, respectively, at the recombination era. Implications are mentioned and conjectures for more complex geometries are discussed. He infers the density fluctuations which must have been present just after the recombination era to produce some present-day configuration. Fluctuations in the present-day cosmic background radiation are related to this and their inferred amplitude depends very strongly on the present-day value of the cosmological density parameter. The relation to observed upper limits on these fluctuations is discussed. (Auth.)
Origin and distribution of epipolythiodioxopiperazine (ETP) gene clusters in filamentous ascomycetes

Directory of Open Access Journals (Sweden)

Gardiner Donald M

2007-09-01

Full Text Available Background: Genes responsible for biosynthesis of fungal secondary metabolites are usually tightly clustered in the genome and co-regulated with metabolite production. Epipolythiodioxopiperazines (ETPs) are a class of secondary metabolite toxins produced by disparate ascomycete fungi and implicated in several animal and plant diseases. Gene clusters responsible for their production have previously been defined in only two fungi. Fungal genome sequence data have been surveyed for the presence of putative ETP clusters and cluster data have been generated from several fungal taxa where genome sequences are not available. Phylogenetic analysis of cluster genes has been used to investigate the assembly and heredity of these gene clusters. Results: Putative ETP gene clusters are present in 14 ascomycete taxa, but absent in numerous other ascomycetes examined. These clusters are discontinuously distributed in ascomycete lineages. Gene content is not absolutely fixed; however, common genes are identified and phylogenies of six of these are separately inferred. In each phylogeny almost all cluster genes form monophyletic clades with non-cluster fungal paralogues being the nearest outgroups. This relatedness of cluster genes suggests that a progenitor ETP gene cluster assembled within an ancestral taxon. Within each of the cluster clades, the cluster genes group together in consistent subclades; however, these relationships do not always reflect the phylogeny of ascomycetes. Micro-synteny of several of the genes within the clusters provides further support for these subclades. Conclusion: ETP gene clusters appear to have a single origin and have been inherited relatively intact rather than assembling independently in the different ascomycete lineages. This progenitor cluster has given rise to a small number of distinct phylogenetic classes of clusters that are represented in a discontinuous pattern throughout ascomycetes. The disjunct heredity of
Goal inferences about robot behavior : goal inferences and human response behaviors

NARCIS (Netherlands)

Broers, H.A.T.; Ham, J.R.C.; Broeders, R.; De Silva, P.; Okada, M.

2014-01-01

This explorative research focused on the goal inferences human observers draw based on a robot's behavior, and the extent to which those inferences predict people's behavior in response to that robot. Results show that different robot behaviors cause different response behavior from people.
Evolution of homeobox genes.

Science.gov (United States)

Holland, Peter W H

2013-01-01

Many homeobox genes encode transcription factors with regulatory roles in animal and plant development. Homeobox genes are found in almost all eukaryotes, and have diversified into 11 gene classes and over 100 gene families in animal evolution, and 10 to 14 gene classes in plants. The largest group in animals is the ANTP class which includes the well-known Hox genes, plus other genes implicated in development including ParaHox (Cdx, Xlox, Gsx), Evx, Dlx, En, NK4, NK3, Msx, and Nanog. Genomic data suggest that the ANTP class diversified by extensive tandem duplication to generate a large array of genes, including an NK gene cluster and a hypothetical ProtoHox gene cluster that duplicated to generate Hox and ParaHox genes. Expression and functional data suggest that NK, Hox, and ParaHox gene clusters acquired distinct roles in patterning the mesoderm, nervous system, and gut. The PRD class is also diverse and includes Pax2/5/8, Pax3/7, Pax4/6, Gsc, Hesx, Otx, Otp, and Pitx genes. PRD genes are not generally arranged in ancient genomic clusters, although the Dux, Obox, and Rhox gene clusters arose in mammalian evolution as did several non-clustered PRD genes. Tandem duplication and genome duplication expanded the number of homeobox genes, possibly contributing to the evolution of developmental complexity, but homeobox gene loss must not be ignored. Evolutionary changes to homeobox gene expression have also been documented, including Hox gene expression patterns shifting in concert with segmental diversification in vertebrates and crustaceans, and deletion of a Pitx1 gene enhancer in pelvic-reduced sticklebacks. WIREs Dev Biol 2013, 2:31-45. doi: 10.1002/wdev.78 The author declares that he has no conflicts of interest. Copyright © 2012 Wiley Periodicals, Inc.
Clustering of near clusters versus cluster compactness

International Nuclear Information System (INIS)

Yu Gao; Yipeng Jing

1989-01-01

The clustering properties of near Zwicky clusters are studied by using the two-point angular correlation function. The angular correlation functions for compact and medium compact clusters, for open clusters, and for all near Zwicky clusters are estimated. The results show much stronger clustering for compact and medium compact clusters than for open clusters, and that open clusters have nearly the same clustering strength as galaxies. The compactness dependence of the correlation function strength merits detailed study. (author)
Weighted community detection and data clustering using message passing

Science.gov (United States)

Shi, Cheng; Liu, Yanchen; Zhang, Pan

2018-03-01

Grouping objects into clusters based on the similarities or weights between them is one of the most important problems in science and engineering. In this work, by extending message-passing algorithms and spectral algorithms proposed for an unweighted community detection problem, we develop a non-parametric method based on statistical physics, by mapping the problem to the Potts model at the critical temperature of spin-glass transition and applying belief propagation to solve the marginals corresponding to the Boltzmann distribution. Our algorithm is robust to over-fitting and gives a principled way to determine whether there are significant clusters in the data and how many clusters there are. We apply our method to different clustering tasks. In the community detection problem in weighted and directed networks, we show that our algorithm significantly outperforms existing algorithms. In the clustering problem, where the data were generated by mixture models in the sparse regime, we show that our method works all the way down to the theoretical limit of detectability and gives accuracy very close to that of the optimal Bayesian inference. In the semi-supervised clustering problem, our method only needs several labels to work perfectly in classic datasets. Finally, we further develop Thouless-Anderson-Palmer equations which greatly reduce the computational complexity in dense networks but give almost the same performance as belief propagation.
Entropic Inference

OpenAIRE

Caticha, Ariel

2010-01-01

In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEn...
STAR CLUSTER FORMATION WITH STELLAR FEEDBACK AND LARGE-SCALE INFLOW

International Nuclear Information System (INIS)

Matzner, Christopher D.; Jumper, Peter H.

2015-01-01

During star cluster formation, ongoing mass accretion is resisted by stellar feedback in the form of protostellar outflows from the low-mass stars and photo-ionization and radiation pressure feedback from the massive stars. We model the evolution of cluster-forming regions during a phase in which both accretion and feedback are present and use these models to investigate how star cluster formation might terminate. Protostellar outflows are the strongest form of feedback in low-mass regions, but these cannot stop cluster formation if matter continues to flow in. In more massive clusters, radiation pressure and photo-ionization rapidly clear the cluster-forming gas when its column density is too small. We assess the rates of dynamical mass ejection and of evaporation, while accounting for the important effect of dust opacity on photo-ionization. Our models are consistent with the census of protostellar outflows in NGC 1333 and Serpens South and with the dust temperatures observed in regions of massive star formation. Comparing observations of massive cluster-forming regions against our model parameter space, and against our expectations for accretion-driven evolution, we infer that massive-star feedback is a likely cause of gas disruption in regions with velocity dispersions less than a few kilometers per second, but that more massive and more turbulent regions are too strongly bound for stellar feedback to be disruptive
Cluster-cluster clustering

International Nuclear Information System (INIS)

Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C.S. (Yale Univ., New Haven, CT; California Univ., Santa Barbara; Cambridge Univ., England; Sussex Univ., Brighton, England)

1985-01-01

The cluster correlation function $\xi_c(r)$ is compared with the particle correlation function $\xi(r)$ in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, $\xi_c$ and $\xi$ are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of $\xi_c$ increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), $\xi_c$ is steeper than $\xi$, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of $\xi_c$ found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales. 30 references
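As a rough illustration of what a two-point correlation function measures, the sketch below estimates it from point positions with the simple "natural" estimator, DD/RR - 1, where DD and RR are normalized pair counts in the data and in a random comparison catalog. The choice of estimator, the function names, and the binning are assumptions made for this example; the record above does not say which estimator the authors used.

```python
import numpy as np

def pair_counts(points, edges):
    """Histogram the separations of all distinct pairs in `points`."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)  # each pair counted once
    counts, _ = np.histogram(dist[upper], bins=edges)
    return counts.astype(float)

def two_point_xi(data, randoms, edges):
    """Natural estimator xi = (normalized DD) / (normalized RR) - 1."""
    dd = pair_counts(data, edges)
    rr = pair_counts(randoms, edges)
    n_d, n_r = len(data), len(randoms)
    # Rescale so both catalogs are compared per available pair.
    norm = (n_r * (n_r - 1)) / (n_d * (n_d - 1))
    with np.errstate(divide="ignore", invalid="ignore"):
        xi = norm * dd / rr - 1.0
    return np.where(rr > 0, xi, np.nan)  # undefined where RR is empty

# Toy usage: tight groups of points should give xi > 0 at small separations.
rng = np.random.default_rng(0)
centers = rng.random((50, 3))
clustered = np.concatenate([centers + 0.01 * rng.normal(size=(50, 3)) for _ in range(4)])
randoms = rng.random((400, 3))
xi = two_point_xi(clustered, randoms, np.array([0.0, 0.05, 0.5]))
```

The all-pairs distance matrix keeps the example short but scales quadratically in memory, so it is only suitable for small catalogs.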

Dynamical Mass Measurements of Contaminated Galaxy Clusters Using Support Distribution Machines

Science.gov (United States)

Ntampaka, Michelle; Trac, Hy; Sutherland, Dougal; Fromenteau, Sebastien; Poczos, Barnabas; Schneider, Jeff

2018-01-01

We study dynamical mass measurements of galaxy clusters contaminated by interlopers and show that a modern machine learning (ML) algorithm can predict masses by better than a factor of two compared to a standard scaling relation approach. We create two mock catalogs from Multidark’s publicly available N-body MDPL1 simulation, one with perfect galaxy cluster membership information and the other where a simple cylindrical cut around the cluster center allows interlopers to contaminate the clusters. In the standard approach, we use a power-law scaling relation to infer cluster mass from galaxy line-of-sight (LOS) velocity dispersion. Assuming perfect membership knowledge, this unrealistic case produces a wide fractional mass error distribution, with a width E=0.87. Interlopers introduce additional scatter, significantly widening the error distribution further (E=2.13). We employ the support distribution machine (SDM) class of algorithms to learn from distributions of data to predict single values. Applied to distributions of galaxy observables such as LOS velocity and projected distance from the cluster center, SDM yields better than a factor-of-two improvement (E=0.67) for the contaminated case. Remarkably, SDM applied to contaminated clusters is better able to recover masses than even the scaling relation approach applied to uncontaminated clusters. We show that the SDM method more accurately reproduces the cluster mass function, making it a valuable tool for employing cluster observations to evaluate cosmological models.
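The "standard approach" mentioned in this record, a power-law relation between cluster mass and line-of-sight velocity dispersion, can be sketched as a straight-line fit in log-log space. The function names and the synthetic numbers below are assumptions for illustration only; the record gives neither the slope nor the normalization the authors obtained.

```python
import numpy as np

def fit_power_law(sigma, mass):
    """Fit log10(M) = intercept + slope * log10(sigma) by least squares."""
    slope, intercept = np.polyfit(np.log10(sigma), np.log10(mass), 1)
    return slope, intercept

def predict_mass(sigma, slope, intercept):
    """Apply the fitted scaling relation to new velocity dispersions."""
    return 10.0 ** (intercept + slope * np.log10(sigma))

# Synthetic, noise-free clusters following M = 1e6 * sigma^3 (made-up units).
sigma = np.array([300.0, 500.0, 800.0, 1200.0])
mass = 1e6 * sigma ** 3
slope, intercept = fit_power_law(sigma, mass)
predicted = predict_mass(sigma, slope, intercept)
```

With interlopers in the sample, the measured dispersion is biased, which is one way the scatter described in the abstract can enter a relation of this form.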
STAR-TO-STAR IRON ABUNDANCE VARIATIONS IN RED GIANT BRANCH STARS IN THE GALACTIC GLOBULAR CLUSTER NGC 3201

International Nuclear Information System (INIS)

Simmerer, Jennifer; Ivans, Inese I.; Filler, Dan; Francois, Patrick; Charbonnel, Corinne; Monier, Richard; James, Gaël

2013-01-01

We present the metallicity as traced by the abundance of iron in the retrograde globular cluster NGC 3201, measured from high-resolution, high signal-to-noise spectra of 24 red giant branch stars. A spectroscopic analysis reveals a spread in [Fe/H] in the cluster stars at least as large as 0.4 dex. Star-to-star metallicity variations are supported both through photometry and through a detailed examination of spectra. We find no correlation between iron abundance and distance from the cluster core, as might be inferred from recent photometric studies. NGC 3201 is the lowest mass halo cluster to date to contain stars with significantly different [Fe/H] values.
Star-to-star Iron Abundance Variations in Red Giant Branch Stars in the Galactic Globular Cluster NGC 3201

Science.gov (United States)

Simmerer, Jennifer; Ivans, Inese I.; Filler, Dan; Francois, Patrick; Charbonnel, Corinne; Monier, Richard; James, Gaël

2013-02-01

We present the metallicity as traced by the abundance of iron in the retrograde globular cluster NGC 3201, measured from high-resolution, high signal-to-noise spectra of 24 red giant branch stars. A spectroscopic analysis reveals a spread in [Fe/H] in the cluster stars at least as large as 0.4 dex. Star-to-star metallicity variations are supported both through photometry and through a detailed examination of spectra. We find no correlation between iron abundance and distance from the cluster core, as might be inferred from recent photometric studies. NGC 3201 is the lowest mass halo cluster to date to contain stars with significantly different [Fe/H] values.
Globular Clusters: Absolute Proper Motions and Galactic Orbits

Science.gov (United States)

Chemel, A. A.; Glushkova, E. V.; Dambis, A. K.; Rastorguev, A. S.; Yalyalieva, L. N.; Klinichev, A. D.

2018-04-01

We cross-match objects from several different astronomical catalogs to determine the absolute proper motions of stars within the 30-arcmin radius fields of 115 Milky-Way globular clusters with an accuracy of 1-2 mas yr$^{-1}$. The proper motions are based on positional data recovered from the USNO-B1, 2MASS, URAT1, ALLWISE, UCAC5, and Gaia DR1 surveys with up to ten positions spanning an epoch difference of up to about 65 years, and reduced to Gaia DR1 TGAS frame using UCAC5 as the reference catalog. Cluster members are photometrically identified by selecting horizontal- and red-giant branch stars on color-magnitude diagrams, and the mean absolute proper motions of the clusters with a typical formal error of about 0.4 mas yr$^{-1}$ are computed by averaging the proper motions of selected members. The inferred absolute proper motions of clusters are combined with available radial-velocity data and heliocentric distance estimates to compute the cluster orbits in terms of the Galactic potential models based on Miyamoto and Nagai disk, Hernquist spheroid, and modified isothermal dark-matter halo (axisymmetric model without a bar) and the same model + rotating Ferrers bar (non-axisymmetric). Five distant clusters have higher-than-escape velocities, most likely due to large errors of computed transversal velocities, whereas the computed orbits of all other clusters remain bound to the Galaxy. Unlike previously published results, we find the bar to affect substantially the orbits of most of the clusters, even those at large Galactocentric distances, bringing appreciable chaotization, especially in the portions of the orbits close to the Galactic center, and stretching out the orbits of some of the thick-disk clusters.
Data mining in forecasting PVT correlations of crude oil systems based on Type1 fuzzy logic inference systems

Science.gov (United States)

El-Sebakhy, Emad A.

2009-09-01

Pressure-volume-temperature (PVT) properties are very important in reservoir engineering computations. There are many empirical approaches for predicting various PVT properties based on empirical correlations and statistical regression models. In the last decade, researchers utilized neural networks to develop more accurate PVT correlations. These achievements of neural networks open the door for data mining techniques to play a major role in the oil and gas industry. Unfortunately, the developed neural network correlations are often limited, and global correlations are usually less accurate compared to local correlations. Recently, adaptive neuro-fuzzy inference systems have been proposed as a new intelligence framework for both prediction and classification based on fuzzy clustering optimization criterion and ranking. This paper proposes neuro-fuzzy inference systems for estimating PVT properties of crude oil systems. This new framework is an efficient hybrid intelligence machine learning scheme for modeling the kind of uncertainty associated with vagueness and imprecision. We briefly describe the learning steps and the use of the Takagi-Sugeno-Kang model and Gustafson-Kessel clustering algorithm with K-detected clusters from the given database. This approach has featured in a wide range of medical, power control system, and business journals, often with promising results. A comparative study will be carried out to compare the performance of this new framework with the most popular modeling techniques, such as neural networks, nonlinear regression, and the empirical correlations algorithms. The results show that neuro-fuzzy systems are accurate and reliable and outperform most of the existing forecasting techniques. Future work could apply neuro-fuzzy systems to clustering 3D seismic data, identifying lithofacies types, and other reservoir characterization tasks.
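For readers unfamiliar with Takagi-Sugeno-Kang (TSK) models, the sketch below shows the inference step of a first-order TSK system: each rule has a firing strength and a linear consequent, and the output is the firing-strength-weighted average of the rule outputs. Gaussian membership functions and every parameter value here are assumptions for illustration; the record does not describe the paper's membership functions, and the Gustafson-Kessel clustering it uses to find rules is not implemented.

```python
import numpy as np

def tsk_predict(x, centers, widths, coeffs, biases):
    """First-order TSK inference for one input vector.

    centers, widths, coeffs have shape (n_rules, n_inputs);
    biases has shape (n_rules,). All are hypothetical parameters.
    """
    x = np.asarray(x, dtype=float)
    # Firing strength: product of Gaussian memberships over the inputs.
    z = (x[None, :] - centers) / widths
    strength = np.exp(-0.5 * (z ** 2).sum(axis=1))
    # Each rule proposes a linear function of the inputs.
    rule_output = coeffs @ x + biases
    # Defuzzify with a weighted average of the rule outputs.
    return float((strength * rule_output).sum() / strength.sum())

# Two rules on one input; at the midpoint both fire equally.
centers = np.array([[0.0], [2.0]])
widths = np.array([[1.0], [1.0]])
coeffs = np.array([[0.0], [0.0]])
biases = np.array([0.0, 10.0])
midpoint_output = tsk_predict([1.0], centers, widths, coeffs, biases)
```

In an adaptive neuro-fuzzy system the centers, widths, and consequent coefficients would be learned from data rather than set by hand.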
Learning Convex Inference of Marginals

OpenAIRE

Domke, Justin

2012-01-01

Graphical models trained using maximum likelihood are a common tool for probabilistic inference of marginal distributions. However, this approach suffers difficulties when either the inference process or the model is approximate. In this paper, the inference process is first defined to be the minimization of a convex function, inspired by free energy approximations. Learning is then done directly in terms of the performance of the inference process at univariate marginal prediction. The main ...
REEXAMINING THE LITHIUM DEPLETION BOUNDARY IN THE PLEIADES AND THE INFERRED AGE OF THE CLUSTER

Energy Technology Data Exchange (ETDEWEB)

Dahm, S. E. [W. M. Keck Observatory, Kamuela, HI 96743 (United States)

2015-11-10

Moderate-dispersion (R ∼ 5400), optical spectroscopy of seven brown dwarf candidate members of the Pleiades was obtained using the Echellette Spectrograph and Imager on the Keck II telescope. The proper motion and photometrically selected sample lies on the single-star main sequence of the cluster and effectively brackets the established lithium depletion boundary. The brown dwarf candidates range in spectral type from M6 to M7, implying effective temperatures between ∼2800 and 2650 K. All sources exhibit Hα emission, consistent with enhanced chromospheric activity that is expected for young, very low-mass stars and brown dwarfs. Li I λ6708 absorption is confidently detected in the photospheres of two of the seven sources. A revised lithium depletion boundary is established in the near-infrared where the effects of extinction and variability are minimized. This lithium depletion edge occurs near $K_o = 14.45$ or $M_K = 8.78$ mag (UKIRT Infrared Deep Sky Survey), assuming the most accurate and precise distance estimate for the cluster of 136.2 pc. From recent theoretical evolutionary models, a revised age of τ = 112 ± 5 Myr is determined for the Pleiades. Accounting for the effects of magnetic activity on the photospheres of these very low-mass stars and brown dwarfs, however, would imply an even younger age for the cluster of ∼100 Myr.
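The two magnitudes quoted for the lithium depletion edge are tied together by the standard distance modulus, M = m - 5 log10(d / 10 pc). The short check below reproduces the absolute magnitude from the quoted apparent magnitude and distance. It assumes the quoted K-band apparent magnitude is already corrected for extinction, so no extinction term is included; the function name is invented for this example.

```python
import math

def absolute_magnitude(apparent_mag, distance_pc):
    """Distance modulus: M = m - 5 * log10(d / 10 pc), with no extinction term."""
    return apparent_mag - 5.0 * math.log10(distance_pc / 10.0)

# Values quoted in the abstract: K = 14.45 mag at a distance of 136.2 pc.
m_k = absolute_magnitude(14.45, 136.2)
print(round(m_k, 2))  # 8.78
```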
On the Structure of Cortical Microcircuits Inferred from Small Sample Sizes.

Science.gov (United States)

Vegué, Marina; Perin, Rodrigo; Roxin, Alex

2017-08-30

The structure in cortical microcircuits deviates from what would be expected in a purely random network, which has been seen as evidence of clustering. To address this issue, we sought to reproduce the nonrandom features of cortical circuits by considering several distinct classes of network topology, including clustered networks, networks with distance-dependent connectivity, and those with broad degree distributions. To our surprise, we found that all of these qualitatively distinct topologies could account equally well for all reported nonrandom features despite being easily distinguishable from one another at the network level. This apparent paradox was a consequence of estimating network properties given only small sample sizes. In other words, networks that differ markedly in their global structure can look quite similar locally. This makes inferring network structure from small sample sizes, a necessity given the technical difficulty inherent in simultaneous intracellular recordings, problematic. We found that a network statistic called the sample degree correlation (SDC) overcomes this difficulty. The SDC depends only on parameters that can be estimated reliably given small sample sizes and is an accurate fingerprint of every topological family. We applied the SDC criterion to data from rat visual and somatosensory cortex and discovered that the connectivity was not consistent with any of these main topological classes. However, we were able to fit the experimental data with a more general network class, of which all previous topologies were special cases. The resulting network topology could be interpreted as a combination of physical spatial dependence and nonspatial, hierarchical clustering. SIGNIFICANCE STATEMENT The connectivity of cortical microcircuits exhibits features that are inconsistent with a simple random network. Here, we show that several classes of network models can account for this nonrandom structure despite qualitative differences in
Agglomerative concentric hypersphere clustering applied to structural damage detection

Science.gov (United States)

Silva, Moisés; Santos, Adam; Santos, Reginaldo; Figueiredo, Eloi; Sales, Claudomiro; Costa, João C. W. A.

2017-08-01

The present paper proposes a novel cluster-based method, named agglomerative concentric hypersphere (ACH), to detect structural damage in engineering structures. Continuous structural monitoring systems often require unsupervised approaches to automatically infer the health condition of a structure. However, when a structure is under linear and nonlinear effects caused by environmental and operational variability, data normalization procedures are also required to overcome these effects. The proposed approach aims, through a straightforward clustering procedure, to discover automatically the optimal number of clusters, representing the main state conditions of a structural system. Three initialization procedures are introduced to evaluate the impact of deterministic and stochastic initializations on the performance of this approach. The ACH is compared to state-of-the-art approaches, based on Gaussian mixture models and Mahalanobis squared distance, on standard data sets from a post-tensioned bridge located in Switzerland: the Z-24 Bridge. The proposed approach demonstrates more efficiency in modeling the normal condition of the structure and its corresponding main clusters. Furthermore, it reveals a better classification performance than the alternatives in terms of false-positive and false-negative indications of damage, demonstrating a promising applicability in real-world structural health monitoring scenarios.
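One of the baseline approaches named in this record, the Mahalanobis squared distance (MSD), can be sketched as follows: learn the mean and covariance of features from the undamaged condition, then score new observations by their squared distance from that baseline. This is a generic sketch of the baseline, not the ACH method itself; the synthetic features, the function names, and the idea of flagging scores above a threshold chosen from the baseline data are all assumptions.

```python
import numpy as np

def fit_baseline(features):
    """Mean and inverse sample covariance of undamaged-condition features."""
    mean = features.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(features, rowvar=False))
    return mean, cov_inv

def mahalanobis_sq(features, mean, cov_inv):
    """Squared Mahalanobis distance of each row from the baseline."""
    d = features - mean
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)

# Synthetic stand-in for features extracted from monitoring data.
rng = np.random.default_rng(0)
baseline = rng.normal(size=(500, 3))
mean, cov_inv = fit_baseline(baseline)
baseline_scores = mahalanobis_sq(baseline, mean, cov_inv)
shifted_score = mahalanobis_sq(mean[None, :] + 10.0, mean, cov_inv)[0]
```

A single mean and covariance cannot represent several distinct normal operating conditions, which is one motivation for cluster-based alternatives of the kind this record describes.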
On the Analysis of Case-Control Studies in Cluster-correlated Data Settings.

Science.gov (United States)

Haneuse, Sebastien; Rivera-Rodriguez, Claudia

2018-01-01

In resource-limited settings, long-term evaluation of national antiretroviral treatment (ART) programs often relies on aggregated data, the analysis of which may be subject to ecological bias. As researchers and policy makers consider evaluating individual-level outcomes such as treatment adherence or mortality, the well-known case-control design is appealing in that it provides efficiency gains over random sampling. In the context that motivates this article, valid estimation and inference require acknowledging any clustering, although, to our knowledge, no statistical methods have been published for the analysis of case-control data for which the underlying population exhibits clustering. Furthermore, in the specific context of an ongoing collaboration in Malawi, rather than performing case-control sampling across all clinics, case-control sampling within clinics has been suggested as a more practical strategy. To our knowledge, although similar outcome-dependent sampling schemes have been described in the literature, a case-control design specific to correlated data settings is new. In this article, we describe this design, discuss balanced versus unbalanced sampling techniques, and provide a general approach to analyzing case-control studies in cluster-correlated settings based on inverse probability-weighted generalized estimating equations. Inference is based on a robust sandwich estimator with correlation parameters estimated to ensure appropriate accounting of the outcome-dependent sampling scheme. We conduct comprehensive simulations, based in part on real data on a sample of N = 78,155 program registrants in Malawi between 2005 and 2007, to evaluate small-sample operating characteristics and potential trade-offs associated with standard case-control sampling or when case-control sampling is performed within clusters.
Probabilistic inductive inference: a survey

OpenAIRE

Ambainis, Andris

2001-01-01

Inductive inference is a recursion-theoretic theory of learning, first developed by E. M. Gold (1967). This paper surveys developments in probabilistic inductive inference. We mainly focus on finite inference of recursive functions, since this simple paradigm has produced the most interesting (and most complex) results.
LAIT: a local ancestry inference toolkit.

Science.gov (United States)

Hui, Daniel; Fang, Zhou; Lin, Jerome; Duan, Qing; Li, Yun; Hu, Ming; Chen, Wei

2017-09-06

Inferring local ancestry in individuals of mixed ancestry has many applications, most notably in identifying disease-susceptible loci that vary among different ethnic groups. Many software packages are available for inferring local ancestry in admixed individuals. However, most of these existing software packages require specifically formatted input files and generate output files of various types, which is inconvenient in practice. We developed a tool set, Local Ancestry Inference Toolkit (LAIT), which can convert standardized files into software-specific input file formats as well as standardize and summarize inference results for four popular local ancestry inference software packages: HAPMIX, LAMP, LAMP-LD, and ELAI. We tested LAIT using both simulated and real data sets and demonstrated that LAIT makes it convenient to run multiple local ancestry inference software packages. In addition, we evaluated the performance of the supported software packages, mainly focusing on inference accuracy and computational resources used. We provided a toolkit to facilitate the use of local ancestry inference software, especially for users with limited bioinformatics background.
Bayesian statistical inference

Directory of Open Access Journals (Sweden)

Bruno De Finetti

2017-04-01

Full Text Available This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993. Bayesian Statistical Inference is one of the last fundamental philosophical papers in which we can find the essence of De Finetti's approach to statistical inference.
Is there a hierarchy of social inferences? The likelihood and speed of inferring intentionality, mind, and personality.

Science.gov (United States)

Malle, Bertram F; Holbrook, Jess

2012-04-01

People interpret behavior by making inferences about agents' intentionality, mind, and personality. Past research studied such inferences 1 at a time; in real life, people make these inferences simultaneously. The present studies therefore examined whether 4 major inferences (intentionality, desire, belief, and personality), elicited simultaneously in response to an observed behavior, might be ordered in a hierarchy of likelihood and speed. To achieve generalizability, the studies included a wide range of stimulus behaviors, presented them verbally and as dynamic videos, and assessed inferences both in a retrieval paradigm (measuring the likelihood and speed of accessing inferences immediately after they were made) and in an online processing paradigm (measuring the speed of forming inferences during behavior observation). Five studies provide evidence for a hierarchy of social inferences, from intentionality and desire to belief to personality, that is stable across verbal and visual presentations and that parallels the order found in developmental and primate research. (c) 2012 APA, all rights reserved.
Online learning of a Dirichlet process mixture of Beta-Liouville distributions via variational inference.

Science.gov (United States)

Fan, Wentao; Bouguila, Nizar

2013-11-01

A large class of problems can be formulated in terms of the clustering process. Mixture models are an increasingly important tool in statistical pattern recognition and for analyzing and clustering complex data. Two challenging aspects that should be addressed when considering mixture models are how to choose between a set of plausible models and how to estimate the model's parameters. In this paper, we address both problems simultaneously within a unified online nonparametric Bayesian framework that we develop to learn a Dirichlet process mixture of Beta-Liouville distributions (i.e., an infinite Beta-Liouville mixture model). The proposed infinite model is used for the online modeling and clustering of proportional data for which the Beta-Liouville mixture has been shown to be effective. We propose a principled approach for approximating the intractable model's posterior distribution by a tractable one, which we develop, such that all the involved mixture's parameters can be estimated simultaneously and effectively in a closed form. This is done through variational inference that enjoys important advantages, such as handling of unobserved attributes and preventing under- or overfitting; we explain that in detail. The effectiveness of the proposed work is evaluated on three challenging real applications, namely facial expression recognition, behavior modeling and recognition, and dynamic textures clustering.
Using Spatial Clustering in Forecasting Groundwater Quality Parameters by ANFIS

Directory of Open Access Journals (Sweden)

MohammadTaghi Alami

2016-07-01

Full Text Available Groundwater is a major source of water supply for domestic, agricultural, and industrial uses; hence, its quality modeling is an important task in hydro-environmental studies. While many data-based models have been developed for this purpose, the performance of such models can be drastically enhanced if they are based on temporal and spatial pre-processing. In this study, geostatistics tools (e.g., Co-Kriging) as spatial estimators and a self-organizing map (SOM) as a clustering technique were employed in conjunction with an Adaptive Neuro-Fuzzy Inference System (ANFIS) for the temporal forecasting of such quality parameters as electrical conductivity (EC) and total dissolved solids (TDS) of the groundwater in Ardabil Plain. Using the results thus obtained, the impact of spatial data clustering on the same parameters was also investigated. The results showed that, if proper input data are selected, the proposed spatial clustering technique is capable of improving groundwater quality forecasts made by ANFIS.
INFERENCE BUILDING BLOCKS

Science.gov (United States)

2018-02-15

expressed a variety of inference techniques on discrete and continuous distributions: exact inference, importance sampling, Metropolis-Hastings (MH...without redoing any math or rewriting any code. And although our main goal is composable reuse, our performance is also good because we can use...control paths. • The Hakaru language can express mixtures of discrete and continuous distributions, but the current disintegration transformation
Practical Bayesian Inference

Science.gov (United States)

Bailer-Jones, Coryn A. L.

2017-04-01

Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.
Interpretable inference on the mixed effect model with the Box-Cox transformation.

Science.gov (United States)

Maruo, K; Yamaguchi, Y; Noma, H; Gosho, M

2017-07-10

We derived results for inference on parameters of the marginal model of the mixed effect model with the Box-Cox transformation based on the asymptotic theory approach. We also provided a robust variance estimator of the maximum likelihood estimator of the parameters of this model in consideration of the model misspecifications. Using these results, we developed an inference procedure for the difference of the model median between treatment groups at the specified occasion in the context of mixed effects models for repeated measures analysis for randomized clinical trials, which provided interpretable estimates of the treatment effect. From simulation studies, it was shown that our proposed method controlled type I error of the statistical test for the model median difference in almost all the situations and had moderate or high performance for power compared with the existing methods. We illustrated our method with cluster of differentiation 4 (CD4) data in an AIDS clinical trial, where the interpretability of the analysis results based on our proposed method is demonstrated. Copyright © 2017 John Wiley & Sons, Ltd.
Observations of Hα-emission stars in the young cluster NGC 2264

International Nuclear Information System (INIS)

Rydgren, A.E.

1979-01-01

UBVRI photometry is given for a sample of 25 late-type Hα-emission stars in the young cluster NGC 2264. The stars are in the magnitude range 12 ≤ V < 16. Some but not all appear to be T Tauri stars. The color-color diagrams support the view that the deviations from normal photospheric colors (due to "spectral veiling" and line emission) decrease with increasing wavelength between the U and I filters. In the (V, V-R) diagram, the Hα-emission stars lie in a well-defined pre-main-sequence band. Within this sample, there is a trend toward stronger line emission and spectral veiling with later spectral type. All of the likely legitimate T Tauri stars have inferred spectral types later than about K3. The question of cluster membership for stars in the cluster field with very small proper motions is considered.

Social cognition in people with schizophrenia: a cluster-analytic approach.

Science.gov (United States)

Rocca, P; Galderisi, S; Rossi, A; Bertolino, A; Rucci, P; Gibertoni, D; Montemagni, C; Sigaudo, M; Mucci, A; Bucci, P; Acciavatti, T; Aguglia, E; Amore, M; Bellomo, A; De Ronchi, D; Dell'Osso, L; Di Fabio, F; Girardi, P; Goracci, A; Marchesi, C; Monteleone, P; Niolu, C; Pinna, F; Roncone, R; Sacchetti, E; Santonastaso, P; Zeppegno, P; Maj, M

2016-10-01

The study aimed to subtype patients with schizophrenia on the basis of social cognition (SC), and to identify cut-offs that best discriminate among subtypes in 809 out-patients recruited in the context of the Italian Network for Research on Psychoses. A two-step cluster analysis of The Awareness of Social Inference Test (TASIT), the Facial Emotion Identification Test and Mayer-Salovey-Caruso Emotional Intelligence Test scores was performed. Classification and regression tree analysis was used to identify the cut-offs of variables that best discriminated among clusters. We identified three clusters, characterized by unimpaired (42%), impaired (50.4%) and very impaired (7.5%) SC. Three theory-of-mind domains were more important for the cluster definition as compared with emotion perception and emotional intelligence. Patients more able to understand simple sarcasm (⩾14 for TASIT-SS) were very likely to belong to the unimpaired SC cluster. Compared with patients in the impaired SC cluster, those in the very impaired SC cluster performed significantly worse in lie scenes (TASIT-LI <10), but not in simple sarcasm. Moreover, functioning, neurocognition, disorganization and SC had a linear relationship across the three clusters, while positive symptoms were significantly lower in patients with unimpaired SC as compared with patients with impaired and very impaired SC. On the other hand, negative symptoms were highest in patients with impaired levels of SC. If replicated, the identification of such subtypes in clinical practice may help in tailoring rehabilitation efforts to the person's strengths to gain more benefit to the person.
Logical inference and evaluation

International Nuclear Information System (INIS)

Perey, F.G.

1981-01-01

Most methodologies of evaluation currently used are based upon the theory of statistical inference. It is generally perceived that this theory is not capable of dealing satisfactorily with what are called systematic errors. Theories of logical inference should be capable of treating all of the information available, including that not involving frequency data. A theory of logical inference is presented as an extension of deductive logic via the concept of plausibility and the application of group theory. Some conclusions, based upon the application of this theory to evaluation of data, are also given
Comparison of Cluster Lensing Profiles with Lambda CDM Predictions

Energy Technology Data Exchange (ETDEWEB)

Broadhurst, Tom; /Tel Aviv U.; Umetsu, Keiichi; /Taipei, Inst. Astron. Astrophys.; Medezinski, Elinor; /Tel Aviv U.; Oguri, Masamune; /KIPAC, Menlo Park; Rephaeli, Yoel; /Tel Aviv U. /San Diego, CASS

2008-05-21

We derive lens distortion and magnification profiles of four well known clusters observed with Subaru. Each cluster is very well fitted by the general form predicted for Cold Dark Matter (CDM) dominated halos, with good consistency found between the independent distortion and magnification measurements. The inferred level of mass concentration is surprisingly high, $8 < c_{\rm vir} < 15$ (mean $10.39 \pm 0.91$), compared to the relatively shallow profiles predicted by the ΛCDM model, $c_{\rm vir} = 5.06 \pm 1.10$ (for a mass of $1.25 \times 10^{15}\,M_\odot/h$). This represents a 4σ discrepancy, and includes the relatively modest effects of projection bias and profile evolution derived from N-body simulations, which oppose each other with little residual effect. In the context of CDM based cosmologies, this discrepancy implies some modification of the widely assumed spectrum of initial density perturbations, so clusters collapse earlier ($z \geq 1$) than predicted ($z < 0.5$) when the Universe was correspondingly denser.
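As a rough cross-check of the quoted significance, the two concentration values can be compared in units of their combined uncertainty. The sketch below assumes (this is not stated in the abstract) that the two errors are independent and Gaussian, so that they add in quadrature; the function name is illustrative only.

```python
import math

def discrepancy_sigma(x1, err1, x2, err2):
    """Difference of two measurements in units of their quadrature-combined error.

    Assumes independent Gaussian errors; the authors' own calculation may differ.
    """
    return abs(x1 - x2) / math.hypot(err1, err2)

# Observed mean concentration vs. the predicted value, as quoted in the abstract.
n_sigma = discrepancy_sigma(10.39, 0.91, 5.06, 1.10)
print(f"{n_sigma:.2f} sigma")  # prints "3.73 sigma"
```

Under these assumptions the difference comes out a little below four standard deviations, which is in line with the rounded figure in the abstract.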
A Local Poisson Graphical Model for inferring networks from sequencing data.

Science.gov (United States)

Allen, Genevera I; Liu, Zhandong

2013-09-01

Gaussian graphical models, a class of undirected graphs or Markov Networks, are often used to infer gene networks based on microarray expression data. Many scientists, however, have begun using high-throughput sequencing technologies such as RNA-sequencing or next generation sequencing to measure gene expression. As the resulting data consists of counts of sequencing reads for each gene, Gaussian graphical models are not optimal for this discrete data. In this paper, we propose a novel method for inferring gene networks from sequencing data: the Local Poisson Graphical Model. Our model assumes a Local Markov property where each variable conditional on all other variables is Poisson distributed. We develop a neighborhood selection algorithm to fit our model locally by performing a series of l1 penalized Poisson, or log-linear, regressions. This yields a fast parallel algorithm for estimating networks from next generation sequencing data. In simulations, we illustrate the effectiveness of our methods for recovering network structure from count data. A case study on breast cancer microRNAs (miRNAs), a novel application of graphical models, finds known regulators of breast cancer genes and discovers novel miRNA clusters and hubs that are targets for future research.
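The neighborhood selection idea can be sketched as one ℓ1-penalized Poisson regression per variable, with an edge kept wherever a coefficient is nonzero. The code below is a minimal illustration, not the authors' implementation: the proximal-gradient solver, the log-transformed and standardized predictors, the fixed step size, and the "OR" symmetrization rule are all assumptions, and every function name is hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_poisson_regression(X, y, lam, step=0.01, n_iter=2000):
    """Fit a log-linear Poisson model with an l1 penalty by proximal gradient.

    The intercept is not penalized. The fixed step size is an assumption that
    suits small, standardized problems; it is not tuned for general use.
    """
    n, p = X.shape
    beta = np.zeros(p)
    b0 = np.log(y.mean() + 1e-8)
    for _ in range(n_iter):
        mu = np.exp(np.clip(b0 + X @ beta, -30.0, 30.0))
        resid = mu - y                      # gradient of the Poisson loss w.r.t. the linear predictor
        b0 -= step * resid.mean()
        beta = soft_threshold(beta - step * (X.T @ resid) / n, step * lam)
    return beta

def neighborhood_selection(counts, lam):
    """Regress each variable on all others and return a boolean adjacency matrix."""
    n, p = counts.shape
    Z = np.log1p(counts)                    # assumed predictor transform
    Z = (Z - Z.mean(axis=0)) / (Z.std(axis=0) + 1e-8)
    coef = np.zeros((p, p))
    for j in range(p):                      # the p regressions are independent, hence parallelizable
        others = [k for k in range(p) if k != j]
        coef[j, others] = l1_poisson_regression(Z[:, others], counts[:, j].astype(float), lam)
    nonzero = coef != 0
    return nonzero | nonzero.T              # "OR" rule: keep an edge if either regression selects it
```

Calling `neighborhood_selection(counts, lam)` on an n-by-p array of read counts returns a p-by-p adjacency matrix; larger values of `lam` give sparser graphs, and a sufficiently large value removes every edge.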
Effect of primordial non-Gaussianities on galaxy clusters scaling relations

Science.gov (United States)

Trindade, A. M. M.; da Silva, Antonio

2017-07-01

Galaxy clusters are a valuable source of cosmological information. Their formation and evolution depend on the underlying cosmology and on the statistical nature of the primordial density fluctuations. Here we investigate the impact of primordial non-Gaussianities (PNG) on the scaling properties of galaxy clusters. We performed a series of hydrodynamic N-body simulations featuring adiabatic gas physics and different levels of non-Gaussianity within the Λ cold dark matter framework. We focus on the T-M, S-M, Y-M and YX-M scalings relating the total cluster mass with temperature, entropy and Sunyaev-Zel'dovich integrated pressure that reflect the thermodynamic state of the intracluster medium. Our results show that PNG have an impact on cluster scaling laws. The mass power-law indexes of the scalings are almost unaffected by the existence of PNG, but the amplitude and redshift evolution of their normalizations are clearly affected. Changes in the Y-M and YX-M normalizations are as high as 22 per cent and 16 per cent when fNL varies from -500 to 500, respectively. Results are consistent with the view that positive/negative fNL affect cluster profiles due to an increase/decrease of cluster concentrations. At low values of fNL, as suggested by present Planck constraints on a scale invariant fNL, the impact on the scaling normalizations is only a few per cent. However, if fNL varies with scale, PNG may have larger amplitudes at cluster scales; thus, our results suggest that PNG should be taken into account when cluster data are used to infer or forecast cosmological parameters from existing or future cluster surveys.
High Speed White Dwarf Asteroseismology with the Herty Hall Cluster

Science.gov (United States)

Gray, Aaron; Kim, A.

2012-01-01

Asteroseismology is the process of using observed oscillations of stars to infer their interior structure. In high speed asteroseismology, we complete that by quickly computing hundreds of thousands of models to match the observed period spectra. Each model on a single processor takes five to ten seconds to run. Therefore, we use a cluster of sixteen Dell Workstations with dual-core processors. The computers use the Ubuntu operating system and Apache Hadoop software to manage workloads.
Using AFLP markers and the Geneland program for the inference of population genetic structure

DEFF Research Database (Denmark)

Guillot, Gilles; Santos, Filipe

2010-01-01

the computer program Geneland designed to infer population structure has been adapted to deal with dominant markers; and (ii) we use Geneland for numerical comparison of dominant and codominant markers to perform clustering. AFLP markers lead to less accurate results than bi-allelic codominant markers...... such as single nucleotide polymorphisms (SNP) markers but this difference becomes negligible for data sets of common size (number of individuals n≥100, number of markers L≥200). The latest Geneland version (3.2.1) handling dominant markers is freely available as an R package with a fully clickable graphical...
Inference

DEFF Research Database (Denmark)

Møller, Jesper

(This text written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by Clarendon Press, Oxford, and planned to appear as Section 4.1 with the title ‘Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...
Determining KPS Recipients Using the Tsukamoto Fuzzy Inference System Method

Directory of Open Access Journals (Sweden)

Sugianti .

2016-10-01

Full Text Available Social assistance programs launched by the Government, in particular the first Cluster program, have received considerable attention from citizens. To determine the recipient households of an assistance program objectively and efficiently, a decision support system is needed that helps village/ward authorities in their decision making. In this study, a prototype system was constructed to identify the poor households that receive KPS, using the Tsukamoto Fuzzy Inference System method with the 14 BPS poverty criteria. The outputs of the system are a household score, the aid status, and the number of villages/wards. The conclusion of this study is that the system runs in accordance with the specified poverty parameters and is able to adjust to the poverty conditions of regions with different poverty indices.
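In a Tsukamoto system every rule has a monotonic consequent, so each rule's firing strength maps back to one crisp output value, and the final score is the firing-strength-weighted average of those values. The sketch below shows that mechanism with two made-up inputs instead of the 14 BPS criteria; the variable names, membership ranges and rule base are all hypothetical and are not taken from the study.

```python
def rising(x, lo, hi):
    """Linear membership that rises from 0 at lo to 1 at hi."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

def falling(x, lo, hi):
    """Linear membership that falls from 1 at lo to 0 at hi."""
    return 1.0 - rising(x, lo, hi)

def tsukamoto_score(income, floor_area):
    """Eligibility score in [0, 100] from two illustrative criteria."""
    # Fuzzification (ranges are invented for the example).
    low_income = falling(income, 500.0, 2000.0)
    high_income = 1.0 - low_income
    small_house = falling(floor_area, 8.0, 40.0)
    large_house = 1.0 - small_house

    # Monotonic consequents on a 0..100 scale, inverted analytically:
    # "eligible" rises with the score, "not eligible" falls with it.
    def eligible(alpha):
        return 100.0 * alpha

    def not_eligible(alpha):
        return 100.0 * (1.0 - alpha)

    # Rule base: firing strength (min as fuzzy AND) paired with its consequent.
    rules = [
        (min(low_income, small_house), eligible),
        (min(low_income, large_house), eligible),
        (min(high_income, small_house), not_eligible),
        (min(high_income, large_house), not_eligible),
    ]

    total = sum(alpha for alpha, _ in rules)
    if total == 0.0:
        return 0.0
    # Tsukamoto defuzzification: weighted average of the per-rule crisp outputs.
    return sum(alpha * consequent(alpha) for alpha, consequent in rules) / total
```

For instance, `tsukamoto_score(500, 8)` gives the maximum score and `tsukamoto_score(2000, 40)` the minimum; a threshold on the score would then decide the aid status.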
Lower complexity bounds for lifted inference

DEFF Research Database (Denmark)

Jaeger, Manfred

2015-01-01

instances of the model. Numerous approaches for such “lifted inference” techniques have been proposed. While it has been demonstrated that these techniques will lead to significantly more efficient inference on some specific models, there are only very recent and still quite restricted results that show...... the feasibility of lifted inference on certain syntactically defined classes of models. Lower complexity bounds that imply some limitations for the feasibility of lifted inference on more expressive model classes were established earlier in Jaeger (2000; Jaeger, M. 2000. On the complexity of inference about...... that under the assumption that NETIME≠ETIME, there is no polynomial lifted inference algorithm for knowledge bases of weighted, quantifier-, and function-free formulas. Further strengthening earlier results, this is also shown to hold for approximate inference and for knowledge bases not containing...
Self-similar gravitational clustering

International Nuclear Information System (INIS)

Efstathiou, G.; Fall, S.M.; Hogan, C.

1979-01-01

The evolution of gravitational clustering is considered and several new scaling relations are derived for the multiplicity function. These include generalizations of the Press-Schechter theory to different densities and cosmological parameters. The theory is then tested against multiplicity function and correlation function estimates for a series of 1000-body experiments. The results are consistent with the theory and show some dependence on initial conditions and cosmological density parameter. The statistical significance of the results, however, is fairly low because of several small number effects in the experiments. There is no evidence for a non-linear bootstrap effect or a dependence of the multiplicity function on the internal dynamics of condensed groups. Empirical estimates of the multiplicity function by Gott and Turner have a feature near the characteristic luminosity predicted by the theory. The scaling relations allow the inference from estimates of the galaxy luminosity function that galaxies must have suffered considerable dissipation if they originally formed from a self-similar hierarchy. A method is also developed for relating the multiplicity function to similar measures of clustering, such as those of Bhavsar, for the distribution of galaxies on the sky. These are shown to depend on the luminosity function in a complicated way. (author)
Variations on Bayesian Prediction and Inference

Science.gov (United States)

2016-05-09

inference 2.2.1 Background There are a number of statistical inference problems that are not generally formulated via a full probability model...the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle
The Next Generation Virgo Cluster Survey. VII. The Intrinsic Shapes of Low-luminosity Galaxies in the Core of the Virgo Cluster, and a Comparison with the Local Group

Science.gov (United States)

Sánchez-Janssen, Rubén; Ferrarese, Laura; MacArthur, Lauren A.; Côté, Patrick; Blakeslee, John P.; Cuillandre, Jean-Charles; Duc, Pierre-Alain; Durrell, Patrick; Gwyn, Stephen; McConnacchie, Alan W.; Boselli, Alessandro; Courteau, Stéphane; Emsellem, Eric; Mei, Simona; Peng, Eric; Puzia, Thomas H.; Roediger, Joel; Simard, Luc; Boyer, Fred; Santos, Matthew

2016-03-01

We investigate the intrinsic shapes of low-luminosity galaxies in the central 300 kpc of the Virgo Cluster using deep imaging obtained as part of the Next Generation Virgo Cluster Survey (NGVS). We build a sample of nearly 300 red-sequence cluster members in the yet-unexplored -14 [...] families of triaxial models with normally distributed intrinsic ellipticities, $E = 1 - C/A$, and triaxialities, $T = (A^2 - B^2)/(A^2 - C^2)$. We develop a Bayesian framework to explore the posterior distribution of the model parameters, which allows us to work directly on discrete data, and to account for individual, surface-brightness-dependent axis ratio uncertainties. For this population we infer a mean intrinsic ellipticity $\bar{E} = 0.43^{+0.02}_{-0.02}$ and a mean triaxiality $\bar{T} = 0.16^{+0.07}_{-0.06}$. This implies that faint Virgo galaxies are best described as a family of thick, nearly oblate spheroids with mean intrinsic axis ratios 1:0.94:0.57. The core of Virgo lacks highly elongated low-luminosity galaxies, with 95% of the population having q > 0.45. We additionally attempt a study of the intrinsic shapes of Local Group (LG) satellites of similar luminosities. For the LG population we infer a slightly larger mean intrinsic ellipticity $\bar{E} = 0.51^{+0.07}_{-0.06}$, and the paucity of objects with round apparent shapes translates into more triaxial mean shapes, 1:0.76:0.49. Numerical studies that follow the tidal evolution of satellites within LG-sized halos are in good agreement with the inferred shape distributions, but the mismatch for faint galaxies in Virgo highlights the need for more adequate simulations of this population in the cluster environment. We finally compare the intrinsic shapes of NGVS low-mass galaxies with samples of more massive quiescent systems, and with field, star-forming galaxies of similar luminosities. We find that the intrinsic flattening in this low-luminosity regime is almost independent of the environment in which the galaxy resides, but there is a hint
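The quoted mean axis ratios follow directly from the definitions of ellipticity and triaxiality once the longest axis is normalized to A = 1. The short sketch below inverts the two definitions; the function name is illustrative, and it uses only the mean values given in the abstract.

```python
import math

def axis_ratios(ellipticity, triaxiality):
    """Return (B/A, C/A) for a triaxial ellipsoid with A >= B >= C.

    Inverts E = 1 - C/A and T = (A^2 - B^2) / (A^2 - C^2) with A = 1.
    """
    c = 1.0 - ellipticity
    b = math.sqrt(1.0 - triaxiality * (1.0 - c * c))
    return b, c

# Mean values inferred for the faint Virgo population.
b_virgo, c_virgo = axis_ratios(0.43, 0.16)
print(f"1:{b_virgo:.2f}:{c_virgo:.2f}")  # prints "1:0.94:0.57"
```

The same function applied to a ratio of 1:0.76:0.49 in reverse implies a triaxiality of roughly 0.56 for the Local Group sample, consistent with the statement that those satellites are more triaxial on average.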
The effects of assembly bias on the inference of matter clustering from galaxy-galaxy lensing and galaxy clustering

Science.gov (United States)

McEwen, Joseph E.; Weinberg, David H.

2018-04-01

The combination of galaxy-galaxy lensing (GGL) and galaxy clustering is a promising route to measuring the amplitude of matter clustering and testing modified gravity theories of cosmic acceleration. Halo occupation distribution (HOD) modeling can extend the approach down to nonlinear scales, but galaxy assembly bias could introduce systematic errors by causing the HOD to vary with large scale environment at fixed halo mass. We investigate this problem using the mock galaxy catalogs created by Hearin & Watson (2013, HW13), which exhibit significant assembly bias because galaxy luminosity is tied to halo peak circular velocity and galaxy colour is tied to halo formation time. The preferential placement of galaxies (especially red galaxies) in older halos affects the cutoff of the mean occupation function for central galaxies, with halos in overdense regions more likely to host galaxies. The effect of assembly bias on the satellite galaxy HOD is minimal. We introduce an extended, environment dependent HOD (EDHOD) prescription to describe these results and fit galaxy correlation measurements. Crucially, we find that the galaxy-matter cross-correlation coefficient, r_gm(r) ≡ ξ_gm(r) [ξ_mm(r) ξ_gg(r)]^{-1/2}, is insensitive to assembly bias on scales r ≳ 1 h^{-1} Mpc, even though ξ_gm(r) and ξ_gg(r) are both affected individually. We can therefore recover the correct ξ_mm(r) from the HW13 galaxy-galaxy and galaxy-matter correlations using either a standard HOD or EDHOD fitting method. For M_r ≤ -19 or M_r ≤ -20 samples the recovery of ξ_mm(r) is accurate to 2% or better. For a sample of red M_r ≤ -20 galaxies we achieve 2% recovery at r ≳ 2 h^{-1} Mpc with EDHOD modeling but lower accuracy at smaller scales or with a standard HOD fit. Most of our mock galaxy samples are consistent with r_gm = 1 down to r = 1 h^{-1} Mpc, to within the uncertainties set by our finite simulation volume.
The effects of assembly bias on the inference of matter clustering from galaxy-galaxy lensing and galaxy clustering

Science.gov (United States)

McEwen, Joseph E.; Weinberg, David H.

2018-07-01

The combination of galaxy-galaxy lensing and galaxy clustering is a promising route to measuring the amplitude of matter clustering and testing modified gravity theories of cosmic acceleration. Halo occupation distribution (HOD) modelling can extend the approach down to non-linear scales, but galaxy assembly bias could introduce systematic errors by causing the HOD to vary with the large-scale environment at fixed halo mass. We investigate this problem using the mock galaxy catalogs created by Hearin & Watson (2013, HW13), which exhibit significant assembly bias because galaxy luminosity is tied to halo peak circular velocity and galaxy colour is tied to halo formation time. The preferential placement of galaxies (especially red galaxies) in older haloes affects the cutoff of the mean occupation function ⟨N_cen(M_min)⟩ for central galaxies, with haloes in overdense regions more likely to host galaxies. The effect of assembly bias on the satellite galaxy HOD is minimal. We introduce an extended, environment-dependent HOD (EDHOD) prescription to describe these results and fit galaxy correlation measurements. Crucially, we find that the galaxy-matter cross-correlation coefficient, r_gm(r) ≡ ξ_gm(r) [ξ_mm(r) ξ_gg(r)]^{-1/2}, is insensitive to assembly bias on scales r ≳ 1 h^{-1} Mpc, even though ξ_gm(r) and ξ_gg(r) are both affected individually. We can therefore recover the correct ξ_mm(r) from the HW13 galaxy-galaxy and galaxy-matter correlations using either a standard HOD or EDHOD fitting method. For M_r ≤ -19 or M_r ≤ -20 samples the recovery of ξ_mm(r) is accurate to 2 per cent or better. For a sample of red M_r ≤ -20 galaxies, we achieve 2 per cent recovery at r ≳ 2 h^{-1} Mpc with EDHOD modelling but lower accuracy at smaller scales or with a standard HOD fit. Most of our mock galaxy samples are consistent with r_gm = 1 down to r = 1 h^{-1} Mpc, to within the uncertainties set by our finite simulation volume.
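Rearranging the definition of the cross-correlation coefficient gives the matter correlation function in terms of the two galaxy observables: the square of the galaxy-matter correlation divided by the product of the squared coefficient and the galaxy-galaxy correlation. The sketch below applies that rearrangement to a toy linear-bias model; the power-law correlation function, the bias value and the function name are illustrative assumptions, and the authors' actual recovery goes through HOD or EDHOD fitting rather than this direct inversion.

```python
import numpy as np

def recover_xi_mm(xi_gm, xi_gg, r_gm=1.0):
    """Matter correlation function implied by xi_gm, xi_gg and an assumed r_gm."""
    xi_gm = np.asarray(xi_gm, dtype=float)
    xi_gg = np.asarray(xi_gg, dtype=float)
    return xi_gm**2 / (r_gm**2 * xi_gg)

# Toy check: scale-independent linear bias, for which r_gm = 1 exactly.
r = np.array([1.0, 2.0, 5.0, 10.0])        # separations in h^-1 Mpc
xi_mm_true = (r / 5.0) ** -1.8             # illustrative power law
bias = 1.5                                 # hypothetical galaxy bias
xi_gg = bias**2 * xi_mm_true
xi_gm = bias * xi_mm_true

xi_mm_rec = recover_xi_mm(xi_gm, xi_gg)
print(np.allclose(xi_mm_rec, xi_mm_true))
```

The bias cancels in the ratio, which is why a coefficient close to unity makes the combination insensitive to how galaxies populate haloes.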
Scalable inference for stochastic block models

KAUST Repository

Peng, Chengbin

2017-12-08

Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of "big data," traditional inference algorithms for such a model are increasingly limited due to their high time complexity and poor scalability. In this paper, we propose a multi-stage maximum likelihood approach to recover the latent parameters of the stochastic block model, in time linear with respect to the number of edges. We also propose a parallel algorithm based on message passing. Our algorithm can overlap communication and computation, providing speedup without compromising accuracy as the number of processors grows. For example, to process a real-world graph with about 1.3 million nodes and 10 million edges, our algorithm requires about 6 seconds on 64 cores of a contemporary commodity Linux cluster. Experiments demonstrate that the algorithm can produce high quality results on both benchmark and real-world graphs. Finally, an example shows the algorithm finding more meaningful communities than a popular modularity maximization algorithm.
Active learning for semi-supervised clustering based on locally linear propagation reconstruction.

Science.gov (United States)

Chang, Chin-Chun; Lin, Po-Yi

2015-03-01

The success of semi-supervised clustering relies on the effectiveness of side information. To get effective side information, a new active learner learning pairwise constraints known as must-link and cannot-link constraints is proposed in this paper. Three novel techniques are developed for learning effective pairwise constraints. The first technique is used to identify samples less important to cluster structures. This technique makes use of a kernel version of locally linear embedding for manifold learning. Samples neither important to locally linear propagation reconstructions of other samples nor on flat patches in the learned manifold are regarded as unimportant samples. The second is a novel criterion for query selection. This criterion considers not only the importance of a sample to expanding the space coverage of the learned samples but also the expected number of queries needed to learn the sample. To facilitate semi-supervised clustering, the third technique yields inferred must-links for passing information about flat patches in the learned manifold to semi-supervised clustering algorithms. Experimental results have shown that the learned pairwise constraints can capture the underlying cluster structures and proven the feasibility of the proposed approach. Copyright © 2014 Elsevier Ltd. All rights reserved.
Fractional Yields Inferred from Halo and Thick Disk Stars

Science.gov (United States)

Caimmi, R.

2013-12-01

Linear [Q/H]-[O/H] relations, Q = Na, Mg, Si, Ca, Ti, Cr, Fe, Ni, are inferred from a sample (N=67) of recently studied FGK-type dwarf stars in the solar neighbourhood including different populations (Nissen and Schuster 2010, Ramirez et al. 2012), namely LH (N=24, low-α halo), HH (N=25, high-α halo), KD (N=16, thick disk), and OL (N=2, globular cluster outliers). Regression line slope and intercept estimators and related variance estimators are determined. With regard to the straight line, $[Q/H]=a_{Q}[O/H]+b_{Q}$, sample stars are displayed along a "main sequence", $[Q,O] = [a_{Q},b_{Q},\Delta b_{Q}]$, leaving aside the two OL stars, which, in most cases (e.g. Na), lie outside. The unit slope, $a_{Q}=1$, implies Q is a primary element synthesised via SNII progenitors in the presence of a universal stellar initial mass function (defined as simple primary element). In this respect, Mg, Si, Ti, show $\hat{a}_{Q}=1$ within $\mp 2\hat{\sigma}_{\hat{a}_{Q}}$; Cr, Fe, Ni, within $\mp 3\hat{\sigma}_{\hat{a}_{Q}}$; Na, Ca, within $\mp r\hat{\sigma}_{\hat{a}_{Q}}$, $r>3$. The empirical, differential element abundance distributions are inferred from LH, HH, KD, HA = HH + KD subsamples, where related regression lines represent their theoretical counterparts within the framework of simple MCBR (multistage closed box + reservoir) chemical evolution models. Hence, the fractional yields, $\hat{p}_{Q}/\hat{p}_{O}$, are determined and (as an example) a comparison is shown with their theoretical counterparts inferred from SNII progenitor nucleosynthesis under the assumption of a power-law stellar initial mass function. The generalized fractional yields, $C_{Q}=Z_{Q}/Z_{O}^{a_{Q}}$, are determined regardless of the chemical evolution model. The ratio of outflow to star formation rate is compared for different populations in the framework of simple MCBR models. The opposite situation of element abundance variation entirely due to cosmic scatter is also considered under reasonable assumptions. The related differential element abundance
Magnetic field gradients inferred from multi-point measurements of Cluster FGM and EDI

Science.gov (United States)

Teubenbacher, Robert; Nakamura, Rumi; Giner, Lukas; Plaschke, Ferdinand; Baumjohann, Wolfgang; Magnes, Werner; Eichelberger, Hans; Steller, Manfred; Torbert, Roy

2013-04-01

We use Cluster data from fluxgate magnetometer (FGM) and electron drift instrument (EDI) to determine the magnetic field gradients in the near-Earth magnetotail. Here we use the magnetic field data from FGM measurements as well as the gyro-time data of electrons determined from the time of flight measurements of EDI. The results are compared with the values estimated from empirical magnetic field models for different magnetospheric conditions. We also estimated the spin axis offset of FGM based on comparison between EDI and FGM data and discuss the possible effect in determining the current sheet characteristics.
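The reason an electron gyro-time can stand in for a field measurement is that the gyroperiod depends only on the field magnitude: T = 2π m_e / (e B). The sketch below converts between the two using the textbook non-relativistic relation; it is not the authors' processing chain, and the function names and the sample gyro-time are illustrative.

```python
import math

ELECTRON_MASS = 9.1093837015e-31      # kg
ELEMENTARY_CHARGE = 1.602176634e-19   # C

def field_from_gyro_time(t_gyro):
    """Magnetic field magnitude in tesla from an electron gyroperiod in seconds."""
    return 2.0 * math.pi * ELECTRON_MASS / (ELEMENTARY_CHARGE * t_gyro)

def gyro_time_from_field(b):
    """Electron gyroperiod in seconds for a field magnitude in tesla."""
    return 2.0 * math.pi * ELECTRON_MASS / (ELEMENTARY_CHARGE * b)

# A gyro-time of about 0.357 ms corresponds to a field of roughly 100 nT.
b_example = field_from_gyro_time(0.357e-3)
print(f"{b_example * 1e9:.1f} nT")
```

Because this estimate of the magnitude is independent of the fluxgate calibration, comparing it with the FGM magnitude is one way an offset along the spin axis could show up.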
Adaptive Inference on General Graphical Models

OpenAIRE

Acar, Umut A.; Ihler, Alexander T.; Mettu, Ramgopal; Sumer, Ozgur

2012-01-01

Many algorithms and applications involve repeatedly solving variations of the same inference problem; for example we may want to introduce new evidence to the model or perform updates to conditional dependencies. The goal of adaptive inference is to take advantage of what is preserved in the model and perform inference more rapidly than from scratch. In this paper, we describe techniques for adaptive inference on general graphs that support marginal computation and updates to the conditional ...

The inference from a single case: moral versus scientific inferences in implementing new biotechnologies.

Science.gov (United States)

Hofmann, B

2008-06-01

Are there similarities between scientific and moral inference? This is the key question in this article. It takes as its point of departure an instance of one person's story in the media changing both Norwegian public opinion and a brand-new Norwegian law prohibiting the use of saviour siblings. The case appears to falsify existing norms and to establish new ones. The analysis of this case reveals similarities in the modes of inference in science and morals, inasmuch as (a) a single case functions as a counter-example to an existing rule; (b) there is a common presupposition of stability, similarity and order, which makes it possible to reason from a few cases to a general rule; and (c) this makes it possible to hold things together and retain order. In science, these modes of inference are referred to as falsification, induction and consistency. In morals, they have a variety of other names. Hence, even without abandoning the fact-value divide, there appear to be similarities between inference in science and inference in morals, which may encourage communication across the boundaries between "the two cultures" and which are relevant to medical humanities.
QSdpR: Viral quasispecies reconstruction via correlation clustering.

Science.gov (United States)

Barik, Somsubhra; Das, Shreepriya; Vikalo, Haris

2017-12-19

RNA viruses are characterized by high mutation rates that give rise to populations of closely related genomes, known as viral quasispecies. Underlying heterogeneity enables the quasispecies to adapt to changing conditions and proliferate over the course of an infection. Determining genetic diversity of a virus (i.e., inferring haplotypes and their proportions in the population) is essential for understanding its mutation patterns, and for effective drug developments. Here, we present QSdpR, a method and software for the reconstruction of quasispecies from short sequencing reads. The reconstruction is achieved by solving a correlation clustering problem on a read-similarity graph and the results of the clustering are used to estimate frequencies of sub-species; the number of sub-species is determined using pseudo F index. Extensive tests on both synthetic datasets and experimental HIV-1 and Zika virus data demonstrate that QSdpR compares favorably to existing methods in terms of various performance metrics. Copyright © 2018 Elsevier Inc. All rights reserved.
Stability of maximum-likelihood-based clustering methods: exploring the backbone of classifications

International Nuclear Information System (INIS)

Mungan, Muhittin; Ramasco, José J

2010-01-01

Components of complex systems are often classified according to the way they interact with each other. In graph theory such groups are known as clusters or communities. Many different techniques have been recently proposed to detect them, some of which involve inference methods using either Bayesian or maximum likelihood approaches. In this paper, we study a statistical model designed for detecting clusters based on connection similarity. The basic assumption of the model is that the graph was generated by a certain grouping of the nodes and an expectation maximization algorithm is employed to infer that grouping. We show that the method admits further development to yield a stability analysis of the groupings that quantifies the extent to which each node influences its neighbors' group membership. Our approach naturally allows for the identification of the key elements responsible for the grouping and their resilience to changes in the network. Given the generality of the assumptions underlying the statistical model, such nodes are likely to play special roles in the original system. We illustrate this point by analyzing several empirical networks for which further information about the properties of the nodes is available. The search and identification of stabilizing nodes constitutes thus a novel technique to characterize the relevance of nodes in complex networks
Introductory statistical inference

CERN Document Server

Mukhopadhyay, Nitis

2014-01-01

This gracefully organized text reveals the rigorous theory of probability and statistical inference in the style of a tutorial, using worked examples, exercises, figures, tables, and computer simulations to develop and illustrate concepts. Drills and boxed summaries emphasize and reinforce important ideas and special techniques. Beginning with a review of the basic concepts and methods in probability theory, moments, and moment generating functions, the author moves to more intricate topics. Introductory Statistical Inference studies multivariate random variables, exponential families of dist
Dynamics of star clusters

International Nuclear Information System (INIS)

Goodman, J.; Hut, P.

1985-01-01

The enigma of core collapse receives much attention in this volume. In addition, several observational papers summarize recent techniques and results and discuss the stellar dynamical implications of the enormous progress in the quality of surface photometry, proper motion studies, radial velocity determinations, as well as space-based measurements in a variety of wavelengths. The value of these Proceedings as a standard reference work is enhanced by the inclusion of two appendices, featuring English translations of two seminal papers on stellar dynamics published in Russian and not previously available in a Western language. A third appendix contains an up-to-date catalogue of observationally determined parameters of galactic globular clusters, as well as theoretically inferred parameters. This catalogue will prove to be an essential reference for phenomenological studies and an ideal testing ground for new theoretical developments. (orig.)
Merging history of three bimodal clusters

Science.gov (United States)

Maurogordato, S.; Sauvageot, J. L.; Bourdin, H.; Cappi, A.; Benoist, C.; Ferrari, C.; Mars, G.; Houairi, K.

2011-01-01

We present a combined X-ray and optical analysis of three bimodal galaxy clusters selected as merging candidates at z ~ 0.1. These targets are part of MUSIC (MUlti-Wavelength Sample of Interacting Clusters), which is a general project designed to study the physics of merging clusters by means of multi-wavelength observations. Observations include spectro-imaging with XMM-Newton EPIC camera, multi-object spectroscopy (260 new redshifts), and wide-field imaging at the ESO 3.6 m and 2.2 m telescopes. We build a global picture of these clusters using X-ray luminosity and temperature maps together with galaxy density and velocity distributions. Idealized numerical simulations were used to constrain the merging scenario for each system. We show that A2933 is very likely an equal-mass advanced pre-merger ~200 Myr before the core collapse, while A2440 and A2384 are post-merger systems (~450 Myr and ~1.5 Gyr after core collapse, respectively). In the case of A2384, we detect a spectacular filament of galaxies and gas spreading over more than 1 h-1 Mpc, which we infer to have been stripped during the previous collision. The analysis of the MUSIC sample allows us to outline some general properties of merging clusters: a strong luminosity segregation of galaxies in recent post-mergers; the existence of preferential axes - corresponding to the merging directions - along which the BCGs and structures on various scales are aligned; the concomitance, in most major merger cases, of secondary merging or accretion events, with groups infalling onto the main cluster, and in some cases the evidence of previous merging episodes in one of the main components. These results are in good agreement with the hierarchical scenario of structure formation, in which clusters are expected to form by successive merging events, and matter is accreted along large-scale filaments. Based on data obtained with the European Southern Observatory, Chile (programs 072.A-0595, 075.A-0264, and 079.A-0425
Performance quantification of clustering algorithms for false positive removal in fMRI by ROC curves

Directory of Open Access Journals (Sweden)

André Salles Cunha Peres

Full Text Available Abstract. Introduction: Functional magnetic resonance imaging (fMRI) is a non-invasive technique that allows the detection of specific cerebral functions in humans based on hemodynamic changes. The contrast changes are about 5%, making visual inspection impossible. Thus, statistical strategies are applied to infer which brain region is engaged in a task. However, traditional methods such as the general linear model and cross-correlation rely on voxel-wise calculation, which introduces many false positives. In this work we therefore tested post-processing clustering algorithms to reduce the false positives. Methods: Three clustering algorithms (hierarchical clustering, k-means and self-organizing maps) were tested and compared for false-positive removal in the post-processing of cross-correlation analyses. Results: Hierarchical clustering performed best at removing false positives in fMRI, being 2.3 times more accurate than k-means and 1.9 times more accurate than self-organizing maps. Conclusion: Hierarchical clustering performed best in false-positive removal because it uses an inconsistency coefficient threshold, whereas k-means and self-organizing maps require an a priori number of clusters (the number of centroids and neurons, respectively); hierarchical clustering thus avoids clustering scattered voxels, since the inconsistency coefficient threshold allows only voxels lying within a minimum distance of some cluster to be clustered.
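The idea of discarding scattered supra-threshold voxels can be sketched in a few lines. Note that this is a deliberately simplified stand-in, not the method of the abstract: it groups active voxels into spatially connected components and drops groups below a size threshold, rather than applying an inconsistency coefficient threshold to a hierarchical clustering tree. The function name and the parameters `max_gap` and `min_size` are hypothetical.

```python
from collections import deque
from itertools import product


def remove_isolated_voxels(active, max_gap=1, min_size=3):
    """Keep only active voxels that belong to a sufficiently large group.

    active   -- iterable of (x, y, z) integer voxel coordinates
    max_gap  -- two voxels are linked if every coordinate differs by
                at most max_gap (hypothetical parameter)
    min_size -- groups with fewer voxels than this are treated as
                scattered false positives (hypothetical parameter)
    """
    remaining = set(active)
    kept = set()
    steps = range(-max_gap, max_gap + 1)
    offsets = [o for o in product(steps, repeat=3) if o != (0, 0, 0)]

    while remaining:
        seed = remaining.pop()
        component = {seed}
        queue = deque([seed])
        # Breadth-first search over linked voxels.
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in offsets:
                neighbor = (x + dx, y + dy, z + dz)
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        if len(component) >= min_size:
            kept |= component
    return kept


if __name__ == "__main__":
    voxels = {(0, 0, 0), (1, 0, 0), (1, 1, 0), (10, 10, 10)}
    print(sorted(remove_isolated_voxels(voxels)))
```

Like the thresholded hierarchical approach, this sketch does not need the number of clusters in advance, which is the property the abstract credits for the better false-positive removal.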
Active inference, communication and hermeneutics.

Science.gov (United States)

Friston, Karl J; Frith, Christopher D

2015-07-01

Hermeneutics refers to interpretation and translation of text (typically ancient scriptures) but also applies to verbal and non-verbal communication. In a psychological setting it nicely frames the problem of inferring the intended content of a communication. In this paper, we offer a solution to the problem of neural hermeneutics based upon active inference. In active inference, action fulfils predictions about how we will behave (e.g., predicting we will speak). Crucially, these predictions can be used to predict both self and others--during speaking and listening respectively. Active inference mandates the suppression of prediction errors by updating an internal model that generates predictions--both at fast timescales (through perceptual inference) and slower timescales (through perceptual learning). If two agents adopt the same model, then--in principle--they can predict each other and minimise their mutual prediction errors. Heuristically, this ensures they are singing from the same hymn sheet. This paper builds upon recent work on active inference and communication to illustrate perceptual learning using simulated birdsongs. Our focus here is the neural hermeneutics implicit in learning, where communication facilitates long-term changes in generative models that are trying to predict each other. In other words, communication induces perceptual learning and enables others to (literally) change our minds and vice versa. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Core condensation in heavy halos: a two-stage theory for galaxy formation and clustering

Energy Technology Data Exchange (ETDEWEB)

White, S D.M.; Rees, M J [Cambridge Univ. Inst. of Astronomy (UK)

1978-05-01

It is suggested that most of the material in the Universe condensed at an early epoch into small 'dark' objects. Irrespective of their nature, these objects must subsequently have undergone hierarchical clustering, whose present scale is inferred from the large-scale distribution of galaxies. As each stage of the hierarchy forms and collapses, relaxation effects wipe out its substructure, leading to a self-similar distribution of bound masses. The entire luminous content of galaxies, however, results from the cooling and fragmentation of residual gas within the transient potential wells provided by the dark matter. Every galaxy thus forms as a concentrated luminous core embedded in an extensive dark halo. The observed sizes of galaxies and their survival through later stages of the hierarchy seem inexplicable without invoking substantial dissipation; this dissipation allows the galaxies to become sufficiently concentrated to survive the disruption of their halos in groups and clusters of galaxies. A specific model is proposed in which Ω ≈ 0.2, the dark matter makes up 80 per cent of the total mass, and half the residual gas has been converted into luminous galaxies by the present time. This model is consistent with the inferred proportions of dark matter and gas in rich clusters, with the observed luminosity density of the Universe and with the observed radii of galaxies; further, it predicts the characteristic luminosities of bright galaxies and can give a luminosity function of the observed shape.
The effect of redshift-space distortions on projected 2-pt clustering measurements

OpenAIRE

Nock, Kelly; Percival, Will J.; Ross, Ashley J.

2010-01-01

Although redshift-space distortions only affect inferred distances and not angles, they still distort the projected angular clustering of galaxy samples selected using redshift dependent quantities. From an Eulerian view-point, this effect is caused by the apparent movement of galaxies into or out of the sample. From a Lagrangian view-point, we find that projecting the redshift-space overdensity field over a finite radial distance does not remove all the anisotropic distortions. We investigat...
THE DISCOVERY OF A MASSIVE CLUSTER OF RED SUPERGIANTS WITH GLIMPSE

International Nuclear Information System (INIS)

Alexander, Michael J.; Kobulnicky, Henry A.; Clemens, Dan P.; Jameson, Katherine; Pinnick, April; Pavel, Michael

2009-01-01

We report the discovery of a previously unknown massive Galactic star cluster at l = 29.22°, b = -0.20°. Identified visually in mid-IR images from the Spitzer GLIMPSE survey, the cluster contains at least eight late-type supergiants, based on follow-up near-IR spectroscopy, and an additional 3-6 candidate supergiant members having IR photometry consistent with a similar distance and reddening. The cluster lies at a local minimum in the 13CO column density and 8 μm emission. We interpret this feature as a hole carved by the energetic winds of the evolving massive stars. The 13CO hole seen in molecular maps at V_LSR ∼ 95 km s^-1 corresponds to near/far kinematic distances of 6.1/8.7 ± 1 kpc. We calculate a mean spectrophotometric distance of 7.0 (+3.7/-2.4) kpc, broadly consistent with the kinematic distances inferred. This location places it near the northern end of the Galactic bar. For the mean extinction of A_V = 12.6 ± 0.5 mag (A_K = 1.5 ± 0.1 mag), the color-magnitude diagram of probable cluster members is well fit by isochrones in the age range 18-24 Myr. The estimated cluster mass is ∼20,000 M_sun. With the most massive original cluster stars likely deceased, no strong radio emission is detected in this vicinity. As such, this red supergiant (RSG) cluster is representative of adolescent massive Galactic clusters that lie hidden behind many magnitudes of dust obscuration. This cluster joins two similar RSG clusters as residents of the volatile region where the end of our Galaxy's bar joins the base of the Scutum-Crux spiral arm, suggesting a recent episode of widespread massive star formation there.
Inferring reputation promotes the evolution of cooperation in spatial social dilemma games.

Directory of Open Access Journals (Sweden)

Zhen Wang

Full Text Available In the real world, individuals with high reputation are more likely to influence collective behavior. Because information dissemination is costly and error-prone, however, it is unreasonable to assume that every individual has complete cognitive power; not everyone can accurately assess the reputation of others. Here we introduce the mechanism of inferring reputation into the selection of potential strategy sources to explore the evolution of cooperation. Before the game, each player is assigned a randomly distributed parameter p denoting his ability to infer the reputation of others. The parameter p of each individual is kept constant during the game. The neighbor with the highest reputation is chosen with probability p, and a randomly chosen opponent is selected with probability 1-p. We find that this novel mechanism can be seen as a universally applicable promoter of cooperation, which works on various interaction networks and in different types of evolutionary games. Of particular interest is the fact that, in the early stages of the evolutionary process, cooperators with high reputation, who are readily regarded as potential strategy donors, can quickly lead to the formation of extremely robust clusters of cooperators that are impervious to defector attacks. These clusters eventually help cooperators reach undisputed dominance, which transcends what can be warranted by spatial reciprocity alone. Moreover, we provide complete phase diagrams to depict the impact of uncertainty in strategy adoptions and conclude that the effective interaction topology may be altered under such a mechanism. When the estimation of reputation is extended, we also show that a moderate value of the evaluation factor enables cooperation to thrive best. We thus present a viable method of understanding the ubiquitous cooperative behaviors in nature and hope that it will inspire further studies.
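The selection rule described in this abstract (highest-reputation neighbor with probability p, a random one otherwise) can be written down directly. The following is a minimal sketch under stated assumptions: the function name, the dictionary of reputation scores, and the tie-breaking behavior (first maximum wins) are illustrative choices, and nothing here reproduces the authors' simulation code.

```python
import random


def choose_strategy_source(neighbors, reputation, p, rng=random):
    """Pick the neighbor whose strategy a player may imitate.

    neighbors  -- non-empty list of neighbor identifiers
    reputation -- mapping from identifier to reputation score
                  (hypothetical representation)
    p          -- the player's ability to infer reputation, in [0, 1]
    rng        -- object providing random() and choice(); defaults to
                  the standard random module
    """
    if not neighbors:
        raise ValueError("a player needs at least one neighbor")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if rng.random() < p:
        # Reputation is inferred correctly: take the best-reputed neighbor.
        return max(neighbors, key=lambda n: reputation[n])
    # Otherwise fall back to a uniformly random neighbor.
    return rng.choice(neighbors)


if __name__ == "__main__":
    scores = {"a": 1, "b": 5, "c": 2}
    picks = [choose_strategy_source(list(scores), scores, 0.8) for _ in range(1000)]
    print(picks.count("b") / len(picks))
```

Note that the random fallback can also return the best-reputed neighbor, so with k neighbors that neighbor is selected with overall probability p + (1-p)/k in this sketch.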
Velocity-based movement modeling for individual and population level inference.

Directory of Open Access Journals (Sweden)

Ephraim M Hanks

Full Text Available Understanding animal movement and resource selection provides important information about the ecology of the animal, but an animal's movement and behavior are not typically constant in time. We present a velocity-based approach for modeling animal movement in space and time that allows for temporal heterogeneity in an animal's response to the environment, allows for temporal irregularity in telemetry data, and accounts for the uncertainty in the location information. Population-level inference on movement patterns and resource selection can then be made through cluster analysis of the parameters related to movement and behavior. We illustrate this approach through a study of northern fur seal (Callorhinus ursinus) movement in the Bering Sea, Alaska, USA. Results show sex differentiation, with female northern fur seals exhibiting stronger response to environmental variables.
GAMMA RAYS FROM STAR FORMATION IN CLUSTERS OF GALAXIES

International Nuclear Information System (INIS)

Storm, Emma M.; Jeltema, Tesla E.; Profumo, Stefano

2012-01-01

Star formation in galaxies is observed to be associated with gamma-ray emission, presumably from non-thermal processes connected to the acceleration of cosmic-ray nuclei and electrons. The detection of gamma rays from starburst galaxies by the Fermi Large Area Telescope (LAT) has allowed the determination of a functional relationship between star formation rate and gamma-ray luminosity. Since star formation is known to scale with total infrared (8-1000 μm) and radio (1.4 GHz) luminosity, the observed infrared and radio emission from a star-forming galaxy can be used to quantitatively infer the galaxy's gamma-ray luminosity. Similarly, star-forming galaxies within galaxy clusters allow us to derive lower limits on the gamma-ray emission from clusters, which have not yet been conclusively detected in gamma rays. In this study, we apply the functional relationships between gamma-ray luminosity and radio and IR luminosities of galaxies derived by the Fermi Collaboration to a sample of the best candidate galaxy clusters for detection in gamma rays in order to place lower limits on the gamma-ray emission associated with star formation in galaxy clusters. We find that several clusters have predicted gamma-ray emission from star formation that are within an order of magnitude of the upper limits derived in Ackermann et al. based on non-detection by Fermi-LAT. Given the current gamma-ray limits, star formation likely plays a significant role in the gamma-ray emission in some clusters, especially those with cool cores. We predict that both Fermi-LAT over the course of its lifetime and the future Cerenkov Telescope Array will be able to detect gamma-ray emission from star-forming galaxies in clusters.
Optimization methods for logical inference

CERN Document Server

Chandru, Vijay

2011-01-01

Merging logic and mathematics in deductive inference: an innovative, cutting-edge approach. Optimization methods for logical inference? Absolutely, say Vijay Chandru and John Hooker, two major contributors to this rapidly expanding field. And even though "solving logical inference problems with optimization methods may seem a bit like eating sauerkraut with chopsticks. . . it is the mathematical structure of a problem that determines whether an optimization model can help solve it, not the context in which the problem occurs." Presenting powerful, proven optimization techniques for logic in
THE REDSHIFT EVOLUTION OF THE MEAN TEMPERATURE, PRESSURE, AND ENTROPY PROFILES IN 80 SPT-SELECTED GALAXY CLUSTERS

Energy Technology Data Exchange (ETDEWEB)

McDonald, M.; Benson, B. A.; Vikhlinin, A.; Aird, K. A.; Allen, S. W.; Bautz, M.; Bayliss, M.; Bleem, L. E.; Bocquet, S.; Brodwin, M.; Carlstrom, J. E.; Chang, C. L.; Cho, H. M.; Clocchiatti, A.; Crawford, T. M.; Crites, A. T.; de Haan, T.; Dobbs, M. A.; Foley, R. J.; Forman, W. R.; George, E. M.; Gladders, M. D.; Gonzalez, A. H.; Halverson, N. W.; Hlavacek-Larrondo, J.; Holder, G. P.; Holzapfel, W. L.; Hrubes, J. D.; Jones, C.; Keisler, R.; Knox, L.; Lee, A. T.; Leitch, E. M.; Liu, J.; Lueker, M.; Luong-Van, D.; Mantz, A.; Marrone, D. P.; McMahon, J. J.; Meyer, S. S.; Miller, E. D.; Mocanu, L.; Mohr, J. J.; Murray, S. S.; Padin, S.; Pryke, C.; Reichardt, C. L.; Rest, A.; Ruhl, J. E.; Saliwanchik, B. R.; Saro, A.; Sayre, J. T.; Schaffer, K. K.; Shirokoff, E.; Spieler, H. G.; Stalder, B.; Stanford, S. A.; Staniszewski, Z.; Stark, A. A.; Story, K. T.; Stubbs, C. W.; Vanderlinde, K.; Vieira, J. D.; Williamson, R.; Zahn, O.; Zenteno, A.

2014-09-24

We present the results of an X-ray analysis of 80 galaxy clusters selected in the 2500 deg^2 South Pole Telescope survey and observed with the Chandra X-ray Observatory. We divide the full sample into subsamples of ~20 clusters based on redshift and central density, performing a joint X-ray spectral fit to all clusters in a subsample simultaneously, assuming self-similarity of the temperature profile. This approach allows us to constrain the shape of the temperature profile over 0 < r < 1.5 R_500, which would be impossible on a per-cluster basis, since the observations of individual clusters have, on average, 2000 X-ray counts. The results presented here represent the first constraints on the evolution of the average temperature profile from z = 0 to z = 1.2. We find that high-z (0.6 < z < 1.2) clusters are slightly (~30%) cooler both in the inner (r < 0.1 R_500) and outer (r > R_500) regions than their low-z (0.3 < z < 0.6) counterparts. Combining the average temperature profile with measured gas density profiles from our earlier work, we infer the average pressure and entropy profiles for each subsample. Confirming earlier results from this data set, we find an absence of strong cool cores at high z, manifested in this analysis as a significantly lower observed pressure in the central 0.1 R_500 of the high-z cool-core subset of clusters compared to the low-z cool-core subset. Overall, our observed pressure profiles agree well with earlier lower-redshift measurements, suggesting minimal redshift evolution in the pressure profile outside of the core. We find no measurable redshift evolution in the entropy profile at r ≲ 0.7 R_500; this may reflect a long-standing balance between cooling and feedback over long timescales and large physical scales. We observe a slight flattening of the entropy profile at r ≳ R_500 in our high-z subsample. This flattening is consistent with a temperature bias due to the enhanced (~3×) rate at which group-mass (~2...
Inference in "poor" languages

Energy Technology Data Exchange (ETDEWEB)

Petrov, S.

1996-10-01

Languages with a solvable implication problem but without complete and consistent systems of inference rules ("poor" languages) are considered. The problem of the existence of a finite, complete and consistent inference rule system for a "poor" language is stated independently of the language or rule syntax. Several properties of the problem are proved. An application of the results to the language of join dependencies is given.
EI: A Program for Ecological Inference

Directory of Open Access Journals (Sweden)

Gary King

2004-09-01

Full Text Available The program EI provides a method of inferring individual behavior from aggregate data. It implements the statistical procedures, diagnostics, and graphics from the book A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data (King 1997). Ecological inference, as traditionally defined, is the process of using aggregate (i.e., "ecological") data to infer discrete individual-level relationships of interest when individual-level data are not available. Ecological inferences are required in political science research when individual-level surveys are unavailable (e.g., local or comparative electoral politics), unreliable (racial politics), insufficient (political geography), or infeasible (political history). They are also required in numerous areas of major significance in public policy (e.g., for applying the Voting Rights Act) and other academic disciplines ranging from epidemiology and marketing to sociology and quantitative history.
Differential Retention of Gene Functions in a Secondary Metabolite Cluster.

Science.gov (United States)

Reynolds, Hannah T; Slot, Jason C; Divon, Hege H; Lysøe, Erik; Proctor, Robert H; Brown, Daren W

2017-08-01

In fungi, distribution of secondary metabolite (SM) gene clusters is often associated with host- or environment-specific benefits provided by SMs. In the plant pathogen Alternaria brassicicola (Dothideomycetes), the DEP cluster confers an ability to synthesize the SM depudecin, a histone deacetylase inhibitor that contributes weakly to virulence. The DEP cluster includes genes encoding enzymes, a transporter, and a transcription regulator. We investigated the distribution and evolution of the DEP cluster in 585 fungal genomes and found a wide but sporadic distribution among Dothideomycetes, Sordariomycetes, and Eurotiomycetes. We confirmed DEP gene expression and depudecin production in one fungus, Fusarium langsethiae. Phylogenetic analyses suggested 6-10 horizontal gene transfers (HGTs) of the cluster, including a transfer that led to the presence of closely related cluster homologs in Alternaria and Fusarium. The analyses also indicated that HGTs were frequently followed by loss/pseudogenization of one or more DEP genes. Independent cluster inactivation was inferred in at least four fungal classes. Analyses of transitions among functional, pseudogenized, and absent states of DEP genes among Fusarium species suggest enzyme-encoding genes are lost at higher rates than the transporter (DEP3) and regulatory (DEP6) genes. The phenotype of an experimentally-induced DEP3 mutant of Fusarium did not support the hypothesis that selective retention of DEP3 and DEP6 protects fungi from exogenous depudecin. Together, the results suggest that HGT and gene loss have contributed significantly to DEP cluster distribution, and that some DEP genes provide a greater fitness benefit possibly due to a differential tendency to form network connections. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.
Fast Bayesian Inference in Dirichlet Process Mixture Models.

Science.gov (United States)

Wang, Lianming; Dunson, David B

2011-01-01

There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.

LASSIM-A network inference toolbox for genome-wide mechanistic modeling.

Directory of Open Access Journals (Sweden)

Rasmus Magnusson

2017-06-01

Full Text Available Recent technological advancements have made time-resolved, quantitative, multi-omics data available for many model systems, which could be integrated for systems pharmacokinetic use. Here, we present large-scale simulation modeling (LASSIM), which is a novel mathematical tool for performing large-scale inference using mechanistically defined ordinary differential equations (ODEs) for gene regulatory networks (GRNs). LASSIM integrates structural knowledge about regulatory interactions and non-linear equations with multiple steady state and dynamic response expression datasets. The rationale behind LASSIM is that biological GRNs can be simplified using a limited subset of core genes that are assumed to regulate all other gene transcription events in the network. The LASSIM method is implemented as a general-purpose toolbox using the PyGMO Python package to make the most of multicore computers and high performance clusters, and is available at https://gitlab.com/Gustafsson-lab/lassim. As a method, LASSIM works in two steps, where it first infers a non-linear ODE system of the pre-specified core gene expression. Second, LASSIM in parallel optimizes the parameters that model the regulation of peripheral genes by core system genes. We showed the usefulness of this method by applying LASSIM to infer a large-scale non-linear model of naïve Th2 cell differentiation, made possible by integrating Th2 specific bindings, time-series together with six public and six novel siRNA-mediated knock-down experiments. ChIP-seq showed significant overlap for all tested transcription factors. Next, we performed novel time-series measurements of total T-cells during differentiation towards Th2 and verified that our LASSIM model could monitor those data significantly better than comparable models that used the same Th2 bindings. In summary, the LASSIM toolbox opens the door to a new type of model-based data analysis that combines the strengths of reliable mechanistic models
On the criticality of inferred models

Science.gov (United States)

Mastromatteo, Iacopo; Marsili, Matteo

2011-10-01

Advanced inference techniques allow one to reconstruct a pattern of interaction from high dimensional data sets, from probing simultaneously thousands of units of extended systems—such as cells, neural tissues and financial markets. We focus here on the statistical properties of inferred models and argue that inference procedures are likely to yield models which are close to singular values of parameters, akin to critical points in physics where phase transitions occur. These are points where the response of physical systems to external perturbations, as measured by the susceptibility, is very large and diverges in the limit of infinite size. We show that the reparameterization invariant metrics in the space of probability distributions of these models (the Fisher information) are directly related to the susceptibility of the inferred model. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. This region is the one where the estimate of inferred parameters is most stable. In order to illustrate these points, we discuss inference of interacting point processes with application to financial data and show that sensible choices of observation time scales naturally yield models which are close to criticality.
An Inference Language for Imaging

DEFF Research Database (Denmark)

Pedemonte, Stefano; Catana, Ciprian; Van Leemput, Koen

2014-01-01

We introduce iLang, a language and software framework for probabilistic inference. The iLang framework enables the definition of directed and undirected probabilistic graphical models and the automated synthesis of high performance inference algorithms for imaging applications. The iLang framewor...
The age distributions of clusters and field stars in the Small Magellanic Cloud — implications for star formation histories

NARCIS (Netherlands)

Kruijssen, J.M.D.; Lamers, H.J.G.L.M.

2008-01-01

Differences between the inferred star formation histories (SFHs) of star clusters and field stars seem to suggest distinct star formation processes for the two. The Small Magellanic Cloud (SMC) is an example of a galaxy where such a discrepancy is observed. We model the observed age distributions of
Using Blue Stragglers to Predict Retained Black Hole Population in Globular Clusters

Science.gov (United States)

Hermanek, Keith; Chatterjee, Sourav; Rasio, Frederic

2018-01-01

Large numbers of black holes (BHs) are expected to form in massive star clusters typical of the globular clusters (GCs). Sophisticated theoretical models suggest that many of these BHs can be retained in present-day GCs. Observations have also identified several BH candidates in Galactic and extragalactic GCs (e.g., Maccarone et al. 2007; Irwin et al. 2010; Strader et al. 2012; Chomiuk et al. 2013; Miller-Jones et al. 2014). It has also been shown that high-mass and high-density clusters such as GCs are efficient factories of merging binary BHs similar to those observed by the LIGO observatories (Abbott et al. 2016a,b,c,d,e; Rodriguez et al. 2016). Understanding the formation rate and properties of binary BHs depends on a detailed understanding of how the BHs dynamically evolve within GCs. Nevertheless, directly detecting BHs in GCs is extremely challenging; only BHs in binaries with limited configurations can be directly detected through gravitational-wave, X-ray, or radio emission. We propose an indirect method of inferring the number of undetected retained BHs in a GC by investigating the dynamical effects of a large number of BHs on the production of other tracer populations such as Blue Straggler Stars (BSS). Using a large grid of detailed GC models we show that there is a clear anti-correlation between the number of BSS in a cluster and the number of retained BHs. Being the most massive species, large numbers of retained BHs will dominate the core of the cluster as a result of mass segregation, driving away other low-mass species such as main-sequence stars from central high-density regions. BSS are expected to form from physical collisions between main-sequence (MS) stars mediated by binary encounters (e.g., Chatterjee et al. 2013) in cores of GCs. Production of BSS by collision or mass-transfer channels is suppressed if a large number of retained BHs in a cluster restricts the number of MS stars in the core. Extensive observational data exist on
Inference

DEFF Research Database (Denmark)

Møller, Jesper

2010-01-01

Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation-free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with Markov chain Monte Carlo...... (MCMC) techniques. Due to space limitations the focus is on spatial point processes....
Does cluster loading enhance lower body power development in preseason preparation of elite rugby union players?

Science.gov (United States)

Hansen, Keir T; Cronin, John B; Pickering, Stuart L; Newton, Michael J

2011-08-01

The purpose of this study was to ascertain whether cluster training led to improved power training adaptations in the preseason preparation of elite level rugby union players. Eighteen highly trained athletes were divided into 2 training groups, a traditional training (TT, N = 9) group and a cluster training (CT, N = 9) group before undertaking 8 weeks of lower body resistance training. Force-velocity-power profiling in the jump squat movement was undertaken, and maximum strength was assessed in the back squat before and after the training intervention. Two-way analysis of variance and magnitude-based inferences were used to assess changes in maximum strength and force, velocity, and power values pretraining to posttraining. Both TT and CT groups significantly (p benefit of cluster type loading in training prescription for lower body power development.
Feature Inference Learning and Eyetracking

Science.gov (United States)

Rehder, Bob; Colner, Robert M.; Hoffman, Aaron B.

2009-01-01

Besides traditional supervised classification learning, people can learn categories by inferring the missing features of category members. It has been proposed that feature inference learning promotes learning a category's internal structure (e.g., its typical features and interfeature correlations) whereas classification promotes the learning of…
Fractal properties of percolation clusters in Euclidian neural networks

International Nuclear Information System (INIS)

Franovic, Igor; Miljkovic, Vladimir

2009-01-01

The process of spike packet propagation is observed in two-dimensional recurrent networks, consisting of locally coupled neuron pools. Local population dynamics is characterized by three key parameters - probability for pool connectedness, synaptic strength and neuron refractoriness. The formation of dynamic attractors in our model, synfire chains, exhibits critical behavior, corresponding to percolation phase transition, with probability for non-zero synaptic strength values representing the critical parameter. Applying the finite-size scaling method, we infer a family of critical lines for various synaptic strengths and refractoriness values, and determine the Hausdorff-Besicovitch fractal dimension of the percolation clusters.
Forward and backward inference in spatial cognition.

Directory of Open Access Journals (Sweden)

Will D Penny

Full Text Available This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain; specifically, that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.
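The forward-inference step described above (combining a path-integration prediction with sensory evidence) can be sketched as a discrete Bayes filter. This is a minimal illustration under assumed names (`forward_filter`, `prior`, `transition`, `likelihoods`) and a toy three-location environment; it is not the model used in the paper.

```python
def forward_filter(prior, transition, likelihoods):
    """Return the filtered beliefs p(x_t | y_1..t) for each time step.

    prior[i]          : p(x_0 = i), belief over locations before any step
    transition[i][j]  : p(x_t = j | x_{t-1} = i), the motion model
    likelihoods[t][j] : p(y_t | x_t = j), the sensory model at step t
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for lik in likelihoods:
        # Predict: push the current belief through the motion model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the sensory likelihood, then renormalize.
        unnorm = [predicted[j] * lik[j] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history


# Toy example (hypothetical numbers): three locations on a ring, the agent
# usually moves one step clockwise, and the sensors are noisy.
prior = [1.0, 0.0, 0.0]
transition = [[0.2, 0.8, 0.0],
              [0.0, 0.2, 0.8],
              [0.8, 0.0, 0.2]]
likelihoods = [[0.1, 0.8, 0.1],
               [0.1, 0.1, 0.8]]
beliefs = forward_filter(prior, transition, likelihoods)
```

Backward inference would add a second pass over the same quantities in reverse time; it is omitted here to keep the sketch short.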
The structure of the nuclear stellar cluster of the Milky Way

International Nuclear Information System (INIS)

Schoedel, Rainer; Eckart, Andreas

2006-01-01

The structure of the nuclear stellar cluster of the Milky Way is of particular interest because it is the densest stellar cluster in our Galaxy, where the theoretical prediction of the formation of a stellar cusp around the central supermassive black hole, Sagittarius A* (Sgr A*) can be examined. We present high-resolution adaptive optics observations with multiple intermediate band filters of the inner ∼20'' around Sgr A*. From the images, stellar number counts and a detailed map of the interstellar extinction toward the central 0.5 pc of the Milky Way were determined. The extinction map is consistent with a putative southwest-northeast aligned outflow from the central arcseconds. An azimuthally averaged, crowding and extinction corrected stellar density profile presents clear evidence for the existence of a stellar cusp around Sgr A*. We show that the profile of the surface brightness density is dominated by the brightest stars in the central arcseconds and is different from the shape of the stellar cluster as inferred from the number counts. Several density peaks found in the cluster may indicate clumping, possibly related to the last epoch of star formation in the Galactic Center. There is evidence for a common proper motion of the stars in one of these clumps.
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.

Science.gov (United States)

Ferrari, Alberto; Comelli, Mario

2016-12-01

In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods are available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; plus, we describe results from the application of these methods on data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
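To see why per-subject success counts can break the assumptions of a plain binomial model, the sketch below compares the variance of a beta-binomial count with that of a binomial count with the same mean. The function names and the parameter values (`n = 20`, `a = b = 2`) are illustrative assumptions, not settings from the study.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """P(K = k) when K ~ Binomial(n, p) and p ~ Beta(a, b)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def mean_and_variance(pmf_values):
    """Mean and variance of a distribution given as [P(K=0), P(K=1), ...]."""
    mean = sum(k * p for k, p in enumerate(pmf_values))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf_values))
    return mean, var


n, a, b = 20, 2.0, 2.0                      # hypothetical trial count and shape
bb = [beta_binomial_pmf(k, n, a, b) for k in range(n + 1)]
bb_mean, bb_var = mean_and_variance(bb)
binomial_var = n * 0.5 * 0.5                # Binomial(n, 0.5) has the same mean
```

With these values both distributions have mean 10, but the beta-binomial variance is several times the binomial one, which is the overdispersion that between-subject variability in success probability produces.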
Bayesian Predictive Inference of a Proportion Under a Twofold Small-Area Model

Directory of Open Access Journals (Sweden)

Nandram Balgobin

2016-03-01

Full Text Available We extend the twofold small-area model of Stukel and Rao (1997; 1999) to accommodate binary data. An example is the Third International Mathematics and Science Study (TIMSS), in which pass-fail data for mathematics of students from US schools (clusters) are available at the third grade by regions and communities (small areas). We compare the finite population proportions of these small areas. We present a hierarchical Bayesian model in which the first-stage binary responses have independent Bernoulli distributions, and each subsequent stage is modeled using a beta distribution, which is parameterized by its mean and a correlation coefficient. This twofold small-area model has an intracluster correlation at the first stage and an intercluster correlation at the second stage. The final-stage mean and all correlations are assumed to be noninformative independent random variables. We show how to infer the finite population proportion of each area. We have applied our models to synthetic TIMSS data to show that the twofold model is preferred over a onefold small-area model that ignores the clustering within areas. We further compare these models using a simulation study, which shows that the intracluster correlation is particularly important.
Novel Method To Identify Source-Associated Phylogenetic Clustering Shows that Listeria monocytogenes Includes Niche-Adapted Clonal Groups with Distinct Ecological Preferences

DEFF Research Database (Denmark)

Nightingale, K. K.; Lyles, K.; Ayodele, M.

2006-01-01

population are identified (TreeStats test). Analysis of sequence data for 120 L. monocytogenes isolates revealed evidence of clustering between isolates from the same source, based on the phylogenies inferred from actA and inlA (P = 0.02 and P = 0.07, respectively; SourceCluster test). Overall, the Tree...... are biologically valid. Overall, our data show that (i) the SourceCluster and TreeStats tests can identify biologically meaningful source-associated phylogenetic clusters and (ii) L. monocytogenes includes clonal groups that have adapted to infect specific host species or colonize nonhost environments......., including humans, animals, and food. If the null hypothesis that the genetic distances for isolates within and between source populations are identical can be rejected (SourceCluster test), then particular clades in the phylogenetic tree with significant overrepresentation of sequences from a given source...
A formal model of interpersonal inference

Directory of Open Access Journals (Sweden)

Michael Moutoussis

2014-03-01

Full Text Available Introduction: We propose that active Bayesian inference – a general framework for decision-making – can equally be applied to interpersonal exchanges. Social cognition, however, entails special challenges. We address these challenges through a novel formulation of a formal model and demonstrate its psychological significance. Method: We review relevant literature, especially with regard to interpersonal representations, formulate a mathematical model and present a simulation study. The model accommodates normative models from utility theory and places them within the broader setting of Bayesian inference. Crucially, we endow people's prior beliefs, into which utilities are absorbed, with preferences of self and others. The simulation illustrates the model's dynamics and furnishes elementary predictions of the theory. Results: 1. Because beliefs about self and others inform both the desirability and plausibility of outcomes, in this framework interpersonal representations become beliefs that have to be actively inferred. This inference, akin to 'mentalising' in the psychological literature, is based upon the outcomes of interpersonal exchanges. 2. We show how some well-known social-psychological phenomena (e.g. self-serving biases) can be explained in terms of active interpersonal inference. 3. Mentalising naturally entails Bayesian updating of how people value social outcomes. Crucially this includes inference about one’s own qualities and preferences. Conclusion: We inaugurate a Bayes optimal framework for modelling intersubject variability in mentalising during interpersonal exchanges. Here, interpersonal representations are endowed with explicit functional and affective properties. We suggest the active inference framework lends itself to the study of psychiatric conditions where mentalising is distorted.
Inferring Social Functions Available in the Metro Station Area from Passengers’ Staying Activities in Smart Card Data

Directory of Open Access Journals (Sweden)

Yang Zhou

2017-12-01

Full Text Available The function of a metro station area is vital for city planners to consider when establishing a context-aware Transit-Oriented Development policy around the station area. However, the functions of metro station areas are hard to infer using the static land use distribution and other traditional survey datasets. In this paper, we propose a method to infer the functions occurring around the metro station catchment areas according to the patterns of staying activities derived from smart card data. We first define the staying activities by the spatial and temporal constraints of the two consecutive alighting and boarding records from the individual travel profile. Then we cluster and label the whole staying activities by considering the features of duration, frequency, and start time. By analyzing the percentage of different types of aggregated activities happening around each metro station, we cluster and explore the functions of the metro station area. Taking Wuhan as a case study, we analyze the results of Wuhan metro systems and discuss the similarities and differences between the functions and the land use distribution around the station area. The results show that although there exist some agreements, there is also a gap between the human activities and the land uses around the station area. These findings could give us deeper insight into how people act around the stations by metro systems, which will ultimately benefit the urban planning and policy development.
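The first step of the method, deriving staying activities from two consecutive alighting and boarding records, can be sketched as follows. The record layout (`Trip`, `Stay`), the same-station spatial constraint, and the gap thresholds are hypothetical choices made for this illustration; the paper's actual constraints are not given in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    board_station: str
    board_time: int      # minutes since midnight
    alight_station: str
    alight_time: int     # minutes since midnight


@dataclass
class Stay:
    station: str
    start: int           # minutes since midnight
    duration: int        # minutes


def extract_stays(trips, min_gap=30, max_gap=18 * 60):
    """Derive staying activities from one card's trips on a single day.

    A stay is recorded when the next boarding happens at the station of the
    previous alighting (assumed spatial constraint) and the time gap lies
    within [min_gap, max_gap] minutes (assumed temporal constraint).
    """
    ordered = sorted(trips, key=lambda t: t.board_time)
    stays = []
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = nxt.board_time - prev.alight_time
        same_place = prev.alight_station == nxt.board_station
        if same_place and min_gap <= gap <= max_gap:
            stays.append(Stay(prev.alight_station, prev.alight_time, gap))
    return stays


# Hypothetical day: commute to B, return to A in the evening, then a quick
# transfer at A that is too short to count as a stay.
day = [
    Trip("A", 480, "B", 510),
    Trip("B", 1080, "A", 1110),
    Trip("A", 1120, "C", 1140),
]
stays = extract_stays(day)
```

The duration, start time, and (across days) frequency of each `Stay` are the kinds of features the abstract mentions as inputs to the subsequent clustering and labelling step.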
Nucleation of Small Silicon Carbide Dust Clusters in AGB Stars

Energy Technology Data Exchange (ETDEWEB)

Gobrecht, David; Cristallo, Sergio; Piersanti, Luciano [Osservatorio Astronomico di Teramo, INAF, I-64100 Teramo (Italy); Bromley, Stefan T. [Departament de Ciència de Materials i Química Física and Institut de Química Teòrica i Computacional (IQTCUB), Universitat de Barcelona, E-08028 Barcelona (Spain)

2017-05-10

Silicon carbide (SiC) grains are a major dust component in carbon-rich asymptotic giant branch stars. However, the formation pathways of these grains are not fully understood. We calculate ground states and energetically low-lying structures of (SiC)ₙ, n = 1, 16 clusters by means of simulated annealing and Monte Carlo simulations of seed structures and subsequent quantum-mechanical calculations on the density functional level of theory. We derive the infrared (IR) spectra of these clusters and compare the IR signatures to observational and laboratory data. According to energetic considerations, we evaluate the viability of SiC cluster growth at several densities and temperatures, characterizing various locations and evolutionary states in circumstellar envelopes. We discover new, energetically low-lying structures for Si₄C₄, Si₅C₅, Si₁₅C₁₅, and Si₁₆C₁₆ and new ground states for Si₁₀C₁₀ and Si₁₅C₁₅. The clusters with carbon-segregated substructures tend to be more stable by 4–9 eV than their bulk-like isomers with alternating Si–C bonds. However, we find ground states with cage geometries resembling buckminsterfullerenes (“bucky-like”) for Si₁₂C₁₂ and Si₁₆C₁₆ and low-lying stable cage structures for n ≥ 12. The latter findings thus indicate a regime of cluster sizes that differ from small clusters as well as from large-scale crystals. Thus, and owing to their stability and geometry, the latter clusters may mark a transition from a quantum-confined cluster regime to a crystalline, solid bulk-material. The calculated vibrational IR spectra of the ground-state SiC clusters show significant emission. They include the 10–13 μm wavelength range and the 11.3 μm feature inferred from laboratory measurements and observations, respectively, although the overall intensities are rather low.
Distributional Inference

NARCIS (Netherlands)

Kroese, A.H.; van der Meulen, E.A.; Poortema, Klaas; Schaafsma, W.

1995-01-01

The making of statistical inferences in distributional form is conceptually complicated because the epistemic 'probabilities' assigned are mixtures of fact and fiction. In this respect they are essentially different from 'physical' or 'frequency-theoretic' probabilities. The distributional form is
Inferring Groups of Objects, Preferred Routes, and Facility Locations from Trajectories

DEFF Research Database (Denmark)

Ceikute, Vaida

(i) infer groups of objects traveling together, (ii) determine routes preferred by local drivers, and (iii) identify attractive facility locations. First, we present a framework that efficiently supports online discovery of groups of moving objects that travel together. We adopt a sampling......-independent approach that makes no assumptions about when object positions are sampled and that supports the use of approximate trajectories. The framework’s algorithms exploit density-based clustering to identify groups. Such identified groups are scored based on cardinality and duration. With the use of domination...... and similarity notions, groups of low interest are pruned, and a variety of different, interesting groups are returned. Results from empirical studies with real and synthetic data offer insight into the effectiveness and efficiency of the proposed framework. Next, we view GPS trajectories as trips that represent...

Continuous Integrated Invariant Inference, Phase I

Data.gov (United States)

National Aeronautics and Space Administration — The proposed project will develop a new technique for invariant inference and embed this and other current invariant inference and checking techniques in an...
Estimating uncertainty of inference for validation

Energy Technology Data Exchange (ETDEWEB)

Booker, Jane M [Los Alamos National Laboratory; Langenbrunner, James R [Los Alamos National Laboratory; Hemez, Francois M [Los Alamos National Laboratory; Ross, Timothy J [UNM

2010-09-30

We present a validation process based upon the concept that validation is an inference-making activity. This has always been true, but the association has not been as important before as it is now. Previously, theory had been confirmed by more data, and predictions were possible based on data. The process today is to infer from theory to code and from code to prediction, making the role of prediction somewhat automatic, and a machine function. Validation is defined as determining the degree to which a model and code is an accurate representation of experimental test data. Imbedded in validation is the intention to use the computer code to predict. To predict is to accept the conclusion that an observable final state will manifest; therefore, prediction is an inference whose goodness relies on the validity of the code. Quantifying the uncertainty of a prediction amounts to quantifying the uncertainty of validation, and this involves the characterization of uncertainties inherent in theory/models/codes and the corresponding data. An introduction to inference making and its associated uncertainty is provided as a foundation for the validation problem. A mathematical construction for estimating the uncertainty in the validation inference is then presented, including a possibility distribution constructed to represent the inference uncertainty for validation under uncertainty. The estimation of inference uncertainty for validation is illustrated using data and calculations from Inertial Confinement Fusion (ICF). The ICF measurements of neutron yield and ion temperature were obtained for direct-drive inertial fusion capsules at the Omega laser facility. The glass capsules, containing the fusion gas, were systematically selected with the intent of establishing a reproducible baseline of high-yield 10¹³–10¹⁴ neutron output. The deuterium-tritium ratio in these experiments was varied to study its influence upon yield. This paper on validation inference is the
The Dynamical Properties of Virgo Cluster Disk Galaxies

Science.gov (United States)

Ouellette, N. N. Q.; Courteau, S.; Holtzman, J. A.; Dalcanton, J. J.; McDonald, M.; Zhu, Y.

2014-03-01

By virtue of its proximity, the Virgo Cluster is an ideal laboratory for testing our understanding of structure formation in the Universe. In this spirit, we present a dynamical study of Virgo galaxies as part of the Spectroscopic and H-band Imaging of Virgo (SHIVir) survey. Hα rotation curves (RC) for our gas-rich galaxies were modeled with a multi-parameter fit function from which various velocity measurements were inferred. Our study takes advantage of archival and our own new data as we aim to compile the largest Tully-Fisher relation (TFR) for a cluster to date. Extended velocity dispersion profiles (VDP) are integrated over varying aperture sizes to extract representative velocity dispersions (VDs) for gas-poor galaxies. Considering the lack of a common standard for the measurement of a fiducial galaxy VD in the literature, we rectify this situation by determining the radius at which the measured VD yields the tightest Fundamental Plane (FP). We found that radius to be at least 1 Re, which exceeds the extent of most dispersion profiles in other works.
fastBMA: scalable network inference and transitive reduction.

Science.gov (United States)

Hung, Ling-Hong; Shi, Kaiyuan; Wu, Migao; Young, William Chad; Raftery, Adrian E; Yeung, Ka Yee

2017-10-01

Inferring genetic networks from genome-wide expression data is extremely demanding computationally. We have developed fastBMA, a distributed, parallel, and scalable implementation of Bayesian model averaging (BMA) for this purpose. fastBMA also includes a computationally efficient module for eliminating redundant indirect edges in the network by mapping the transitive reduction to an easily solved shortest-path problem. We evaluated the performance of fastBMA on synthetic data and experimental genome-wide time series yeast and human datasets. When using a single CPU core, fastBMA is up to 100 times faster than the next fastest method, LASSO, with increased accuracy. It is a memory-efficient, parallel, and distributed application that scales to human genome-wide expression data. A 10 000-gene regulation network can be obtained in a matter of hours using a 32-core cloud cluster (2 nodes of 16 cores). fastBMA is a significant improvement over its predecessor ScanBMA. It is more accurate and orders of magnitude faster than other fast network inference methods such as the one based on LASSO. The improved scalability allows it to calculate networks from genome scale data in a reasonable time frame. The transitive reduction method can improve accuracy in denser networks. fastBMA is available as code (M.I.T. license) from GitHub (https://github.com/lhhunghimself/fastBMA), as part of the updated networkBMA Bioconductor package (https://www.bioconductor.org/packages/release/bioc/html/networkBMA.html) and as ready-to-deploy Docker images (https://hub.docker.com/r/biodepot/fastbma/). © The Authors 2017. Published by Oxford University Press.
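One way to read "mapping the transitive reduction to a shortest-path problem" is sketched below: each edge probability p becomes a cost of -log(p), and a direct edge is dropped when some indirect path is at least as cheap. This is an assumption-laden illustration of the general idea, with hypothetical function names and toy probabilities; it is not fastBMA's actual implementation.

```python
import heapq
from math import log


def shortest_indirect_cost(graph, source, target):
    """Dijkstra from source to target that ignores the direct source->target edge."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph.get(u, {}).items():
            if u == source and v == target:
                continue  # skip the edge under test
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def transitive_reduction(edge_probs):
    """Keep an edge only if no indirect path explains it at least as well.

    edge_probs maps (regulator, target) to a probability in (0, 1].
    Costs are -log(p), so a path's cost is -log of the product of its
    edge probabilities and is never negative.
    """
    graph = {}
    for (u, v), p in edge_probs.items():
        graph.setdefault(u, {})[v] = -log(p)
    return {
        (u, v): p
        for (u, v), p in edge_probs.items()
        if shortest_indirect_cost(graph, u, v) > -log(p)
    }


# Toy network (hypothetical probabilities): A->C is weaker than A->B->C.
edges = {("A", "B"): 0.9, ("B", "C"): 0.9, ("A", "C"): 0.5}
reduced = transitive_reduction(edges)
```

Every edge is tested against the original graph, so in cyclic networks two edges could each be judged redundant because of the other; a production method would need a rule for that case.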
Simultaneous inference for multilevel linear mixed models - with an application to a large-scale school meal study

DEFF Research Database (Denmark)

Ritz, Christian; Laursen, Rikke Pilmann; Damsgaard, Camilla Trab

2017-01-01

of a school meal programme. We propose a novel and versatile framework for simultaneous inference on parameters estimated from linear mixed models that were fitted separately for several outcomes from the same study, but did not necessarily contain the same fixed or random effects. By combining asymptotic...... sizes of practical relevance we studied simultaneous coverage through simulation, which showed that the approach achieved acceptable coverage probabilities even for small sample sizes (10 clusters) and for 2–16 outcomes. The approach also compared favourably with a joint modelling approach. We also...
Quantum-Like Representation of Non-Bayesian Inference

Science.gov (United States)

Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.

2013-01-01

This research is related to the problem of "irrational decision making or inference" that has been discussed in cognitive psychology. There are some experimental studies, and these statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to the classical Bayesian inference. For this problem, a number of quantum-like cognitive models of decision making have been proposed. Our previous work represented in a natural way the classical Bayesian inference in the framework of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like the quantum interference. Further, we describe "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.
Bayesian Inference Methods for Sparse Channel Estimation

DEFF Research Database (Denmark)

Pedersen, Niels Lovmand

2013-01-01

This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...
Statistical inference: an integrated Bayesian/likelihood approach

CERN Document Server

Aitkin, Murray

2010-01-01

Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing. After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre
RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

Energy Technology Data Exchange (ETDEWEB)

Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

2010-05-26

The RegPredict web server provides comparative genomics tools for the reconstruction and analysis of microbial regulons. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. The two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences and on ortholog and operon predictions from MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.
Population genetic structure of the cotton bollworm Helicoverpa armigera (Hübner) (Lepidoptera: Noctuidae) in India as inferred from EPIC-PCR DNA markers.

Science.gov (United States)

Behere, Gajanan Tryambak; Tay, Wee Tek; Russell, Derek Alan; Kranthi, Keshav Raj; Batterham, Philip

2013-01-01

Helicoverpa armigera is an important pest of cotton and other agricultural crops in the Old World. Its wide host range, high mobility and fecundity, and the ability to adapt and develop resistance against all common groups of insecticides used for its management have exacerbated its pest status. An understanding of the population genetic structure in H. armigera under Indian agricultural conditions will help ascertain gene flow patterns across different agricultural zones. This study inferred the population genetic structure of Indian H. armigera using five Exon-Primed Intron-Crossing (EPIC)-PCR markers. Nested alternative EPIC markers detected moderate null allele frequencies (4.3% to 9.4%) in loci used to infer population genetic structure, but the apparently genome-wide heterozygote deficit suggests inbreeding or a Wahlund effect rather than a null allele effect. Population genetic analysis of the 26 populations suggested significant genetic differentiation within India, especially in cotton-feeding populations in the 2006-07 cropping season. In contrast, overall pair-wise F(ST) estimates from populations feeding on food crops indicated no significant population substructure irrespective of cropping seasons. A Bayesian cluster analysis was used to assign the genetic make-up of individuals to likely membership of population clusters. Some evidence was found for four major clusters with individuals in two populations from cotton in one year (from two populations in northern India) showing especially high homogeneity. Taken as a whole, this study found evidence of population substructure at host crop, temporal and spatial levels in Indian H. armigera, without, however, a clear biological rationale for these structures being evident.
Inference Attacks and Control on Database Structures

Directory of Open Access Journals (Sweden)

Muhamed Turkanovic

2015-02-01

Today’s databases store information with sensitivity levels that range from public to highly sensitive; ensuring confidentiality can therefore be highly important, but it also requires costly control. This paper focuses on the inference problem on different database structures. It presents possible threats to privacy related to inference, and control methods for mitigating these threats. The paper shows that using only access control, without any inference control, is inadequate, since access control models are unable to protect against indirect data access. Furthermore, it covers new inference problems which arise from the dimensions of new technologies like XML, semantics, etc.
Clustering network layers with the strata multilayer stochastic block model.

Science.gov (United States)

Stanley, Natalie; Shai, Saray; Taylor, Dane; Mucha, Peter J

2016-01-01

Multilayer networks are a useful data structure for simultaneously capturing multiple types of relationships between a set of nodes. In such networks, each relational definition gives rise to a layer. While each layer provides its own set of information, community structure across layers can be collectively utilized to discover and quantify underlying relational patterns between nodes. To concisely extract information from a multilayer network, we propose to identify and combine sets of layers with meaningful similarities in community structure. In this paper, we describe the "strata multilayer stochastic block model" (sMLSBM), a probabilistic model for multilayer community structure. The central extension of the model is that there exist groups of layers, called "strata", which are defined such that all layers in a given stratum have community structure described by a common stochastic block model (SBM). That is, layers in a stratum exhibit similar node-to-community assignments and SBM probability parameters. Fitting the sMLSBM to a multilayer network provides a joint clustering that yields node-to-community and layer-to-stratum assignments, which cooperatively aid one another during inference. We describe an algorithm for separating layers into their appropriate strata and an inference technique for estimating the SBM parameters for each stratum. We demonstrate our method using synthetic networks and a multilayer network inferred from data collected in the Human Microbiome Project.
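The generative idea described in this abstract, that every layer in a stratum is drawn from one shared SBM, can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy partitions, and the block probabilities are all hypothetical, and only the sampling direction is shown, not the inference algorithm.

```python
import random

def sample_sbm_layer(communities, block_probs, rng):
    """Sample one undirected layer: edge (i, j) appears with probability
    block_probs[c_i][c_j], where c_i is the community of node i."""
    n = len(communities)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < block_probs[communities[i]][communities[j]]:
                edges.add((i, j))
    return edges

def sample_strata_multilayer(strata, layer_to_stratum, seed=0):
    """strata: one (communities, block_probs) pair per stratum.
    layer_to_stratum: the stratum index of each layer.
    Layers in the same stratum share node assignments and SBM parameters."""
    rng = random.Random(seed)
    return [sample_sbm_layer(*strata[s], rng) for s in layer_to_stratum]

# Two hypothetical strata over the same 6 nodes, with different partitions.
strata = [
    ([0, 0, 0, 1, 1, 1], [[0.9, 0.05], [0.05, 0.9]]),
    ([0, 1, 0, 1, 0, 1], [[0.8, 0.1], [0.1, 0.8]]),
]
layer_to_stratum = [0, 0, 1, 1, 1]  # five layers: two in stratum 0, three in stratum 1
layers = sample_strata_multilayer(strata, layer_to_stratum)
```

Fitting the model runs this process in reverse: given `layers`, one would estimate both `layer_to_stratum` and each stratum's communities and probabilities.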
An efficient forward–reverse expectation-maximization algorithm for statistical inference in stochastic reaction networks

KAUST Repository

Bayer, Christian

2016-02-20

© 2016 Taylor & Francis Group, LLC. ABSTRACT: In this work, we present an extension of the forward–reverse representation introduced by Bayer and Schoenmakers (Annals of Applied Probability, 24(5):1994–2032, 2014) to the context of stochastic reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, that is, SRNs conditional on their values in the extremes of given time intervals. We then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate ordinary differential equations approximation; then, during phase II, we apply the Monte Carlo version of the expectation-maximization algorithm to the phase I output. By selecting a set of overdispersed seeds as initial points in phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are supported by numerical examples.
An efficient forward-reverse expectation-maximization algorithm for statistical inference in stochastic reaction networks

KAUST Repository

Vilanova, Pedro

2016-01-07

In this work, we present an extension of the forward-reverse representation introduced in "Simulation of forward-reverse stochastic representations for conditional diffusions", a 2014 paper by Bayer and Schoenmakers, to the context of stochastic reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, i.e., SRNs conditional on their values in the extremes of given time-intervals. We then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate ordinary differential equations approximation; then, during phase II, we apply the Monte Carlo version of the Expectation-Maximization algorithm to the phase I output. By selecting a set of over-dispersed seeds as initial points in phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are supported by numerical examples.
Risk Mapping of Cutaneous Leishmaniasis via a Fuzzy C Means-based Neuro-Fuzzy Inference System

Science.gov (United States)

Akhavan, P.; Karimi, M.; Pahlavani, P.

2014-10-01

Identifying pathogenic factors and how they spread in the environment has recently become a global concern. Cutaneous leishmaniasis (CL), caused by Leishmania, is a parasitic disease that can be transmitted to humans by Phlebotomus vectors. Studies show that economic situation, cultural issues, and environmental and ecological conditions can affect the prevalence of this disease. In this study, data mining is used to predict the CL prevalence rate and obtain a risk map, based on the environmental parameters that affect CL, using a neuro-fuzzy system. Neuro-fuzzy systems combine the learning capacity of neural networks with the reasoning power of fuzzy systems, which makes them very efficient to use. To predict the CL prevalence rate, an adaptive neuro-fuzzy inference system was applied, with fuzzy C-means clustering used to determine the initial membership functions of the fuzzy inference structure. Given the high incidence of CL in Ilam province, the counties of Ilam, Mehran, and Dehloran were examined and evaluated. The CL prevalence rate was predicted in 2012 using maps of the relevant environmental and topographic properties, including temperature, moisture, annual rainfall, vegetation, and elevation. The results indicate that the model with the fuzzy C-means clustering structure yields acceptable RMSE values on both training and checking data, supporting our analyses. With the proposed data mining approach, the spatial distribution pattern of the disease and the vulnerable areas become identifiable, and the map can be used by public health experts and decision makers as a tool for management and optimal decision-making.
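For readers unfamiliar with fuzzy C-means, the two alternating updates of the standard algorithm can be sketched as follows. This is a generic textbook version, not the authors' implementation; the function names, the one-dimensional toy data, and the starting centers are hypothetical, and the fuzzifier m = 2 is only a common default.

```python
def fcm_memberships(points, centers, m=2.0):
    """Return u[k][i], the membership of point k in cluster i.
    Standard update: u_ki = 1 / sum_j (d_ki / d_kj) ** (2 / (m - 1))."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5 for c in centers]
        if 0.0 in dists:
            # A point lying exactly on a center belongs fully to that center.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        memberships.append(
            [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]
        )
    return memberships

def fcm_centers(points, memberships, m=2.0):
    """Update each center as the mean of the points weighted by u ** m."""
    dim = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [u[i] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * x[d] for w, x in zip(weights, points)) / total for d in range(dim)]
        )
    return centers

points = [[0.0], [1.0], [9.0], [10.0]]   # toy 1-D data with two obvious groups
centers = [[2.0], [8.0]]                 # hypothetical starting guesses
for _ in range(20):
    u = fcm_memberships(points, centers)
    centers = fcm_centers(points, u)
```

In an ANFIS setting, the resulting centers and membership degrees would typically seed the initial membership functions, which training then adjusts.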
Risk Mapping of Cutaneous Leishmaniasis via a Fuzzy C Means-based Neuro-Fuzzy Inference System

Directory of Open Access Journals (Sweden)

P. Akhavan

2014-10-01

Identifying pathogenic factors and how they spread in the environment has recently become a global concern. Cutaneous leishmaniasis (CL), caused by Leishmania, is a parasitic disease that can be transmitted to humans by Phlebotomus vectors. Studies show that economic situation, cultural issues, and environmental and ecological conditions can affect the prevalence of this disease. In this study, data mining is used to predict the CL prevalence rate and obtain a risk map, based on the environmental parameters that affect CL, using a neuro-fuzzy system. Neuro-fuzzy systems combine the learning capacity of neural networks with the reasoning power of fuzzy systems, which makes them very efficient to use. To predict the CL prevalence rate, an adaptive neuro-fuzzy inference system was applied, with fuzzy C-means clustering used to determine the initial membership functions of the fuzzy inference structure. Given the high incidence of CL in Ilam province, the counties of Ilam, Mehran, and Dehloran were examined and evaluated. The CL prevalence rate was predicted in 2012 using maps of the relevant environmental and topographic properties, including temperature, moisture, annual rainfall, vegetation, and elevation. The results indicate that the model with the fuzzy C-means clustering structure yields acceptable RMSE values on both training and checking data, supporting our analyses. With the proposed data mining approach, the spatial distribution pattern of the disease and the vulnerable areas become identifiable, and the map can be used by public health experts and decision makers as a tool for management and optimal decision-making.
Approximate Bayesian computation for modular inference problems with many parameters: the example of migration rates.

Science.gov (United States)

Aeschbacher, S; Futschik, A; Beaumont, M A

2013-02-01

We propose a two-step procedure for estimating multiple migration rates in an approximate Bayesian computation (ABC) framework, accounting for global nuisance parameters. The approach is not limited to migration, but generally of interest for inference problems with multiple parameters and a modular structure (e.g. independent sets of demes or loci). We condition on a known, but complex demographic model of a spatially subdivided population, motivated by the reintroduction of Alpine ibex (Capra ibex) into Switzerland. In the first step, the global parameters ancestral mutation rate and male mating skew have been estimated for the whole population in Aeschbacher et al. (Genetics 2012; 192: 1027). In the second step, we estimate in this study the migration rates independently for clusters of demes putatively connected by migration. For large clusters (many migration rates), ABC faces the problem of too many summary statistics. We therefore assess by simulation if estimation per pair of demes is a valid alternative. We find that the trade-off between reduced dimensionality for the pairwise estimation on the one hand and lower accuracy due to the assumption of pairwise independence on the other depends on the number of migration rates to be inferred: the accuracy of the pairwise approach increases with the number of parameters, relative to the joint estimation approach. To distinguish between low and zero migration, we perform ABC-type model comparison between a model with migration and one without. Applying the approach to microsatellite data from Alpine ibex, we find no evidence for substantial gene flow via migration, except for one pair of demes in one direction. © 2013 Blackwell Publishing Ltd.
Type Inference with Inequalities

DEFF Research Database (Denmark)

Schwartzbach, Michael Ignatieff

1991-01-01

Type inference can be phrased as constraint-solving over types. We consider an implicitly typed language equipped with recursive types, multiple inheritance, 1st order parametric polymorphism, and assignments. Type correctness is expressed as satisfiability of a possibly infinite collection of (monotonic) inequalities on the types of variables and expressions. A general result about systems of inequalities over semilattices yields a solvable form. We distinguish between deciding typability (the existence of solutions) and type inference (the computation of a minimal solution). In our case, both...
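The notion of a minimal solution to monotonic inequalities over a semilattice can be illustrated with a small fixed-point sketch. It is only a toy stand-in for the paper's setting: types are modelled here as finite sets of hypothetical type names ordered by inclusion, with union as the join, and the constraint set is finite, whereas the abstract allows a possibly infinite collection.

```python
def least_solution(variables, constraints):
    """Each constraint (target, fn) means fn(env) is a subset of env[target],
    where fn is monotone in env. Iterating upward from the bottom element
    (all empty sets) reaches the least solution on a finite lattice."""
    env = {v: frozenset() for v in variables}
    changed = True
    while changed:
        changed = False
        for target, fn in constraints:
            joined = env[target] | fn(env)  # join in the powerset semilattice
            if joined != env[target]:
                env[target] = joined
                changed = True
    return env

# Hypothetical program fragment and the inequalities it induces.
constraints = [
    ("x", lambda env: frozenset({"Int"})),    # x := 1
    ("y", lambda env: env["x"]),              # y := x
    ("x", lambda env: frozenset({"Bool"})),   # x := true
    ("z", lambda env: env["x"] | env["y"]),   # z := x or y, depending on a branch
]
solution = least_solution(["x", "y", "z"], constraints)
```

Here every variable ends up with the set containing both `Int` and `Bool`. Any other assignment satisfying all four inequalities must contain this one, which is what makes it minimal.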
Kepler red-clump stars in the field and in open clusters

DEFF Research Database (Denmark)

Bossini, D.; Miglio, A.; Salaris, M.

2017-01-01

Convective mixing in helium-core-burning (HeCB) stars is one of the outstanding issues in stellar modelling. The precise asteroseismic measurements of gravity-mode period spacing (Delta Pi(1)) have opened the door to detailed studies of the near-core structure of such stars, which had not been...... possible before. Here, we provide stringent tests of various core-mixing scenarios against the largely unbiased population of red-clump stars belonging to the old-open clusters monitored by Kepler, and by coupling the updated precise inference on Delta Pi(1) in thousands of field stars with spectroscopic...... constraints. We find that models with moderate overshooting successfully reproduce the range observed of Delta Pi(1) in clusters. In particular, we show that there is no evidence for the need to extend the size of the adiabatically stratified core, at least at the beginning of the HeCB phase. This conclusion...
Galactic globular cluster NGC 6752 and its stellar population as inferred from multicolor photometry

Energy Technology Data Exchange (ETDEWEB)

Kravtsov, Valery [Instituto de Astronomía, Universidad Católica del Norte, Avenida Angamos 0610, Casilla 1280, Antofagasta (Chile); Alcaíno, Gonzalo [Isaac Newton Institute of Chile, Ministerio de Educación de Chile, Casilla 8-9, Correo 9, Santiago (Chile); Marconi, Gianni; Alvarado, Franklin, E-mail: vkravtsov@ucn.cl, E-mail: inewton@terra.cl, E-mail: falvarad@eso.org, E-mail: gmarconi@eso.org [ESO-European Southern Observatory, Alonso de Cordova 3107, Vitacura, Santiago (Chile)

2014-03-01

This paper is devoted to a photometric study of the Galactic globular cluster (GGC) NGC 6752 in UBVI, focusing on the multiplicity of its stellar population. We emphasize that our U passband is (1) narrower than the standard one due to its smaller extension blueward and (2) redshifted by ∼300 Å relative to its counterparts, such as the HST F336W filter. Accordingly, both the spectral features encompassed by it and photometric effects of the multiplicity revealed in our study are somewhat different than in recent studies of NGC 6752. Main sequence stars bluer in U – B are less centrally concentrated, as red giants are. We find a statistically significant increase in the luminosity of the red giant branch (RGB) bump, of ΔU ≈ 0.2 mag, toward the cluster outskirts, with no such obvious effect in V. The photometric results are correlated with spectroscopic data: the bluer RGB stars in U – B have lower nitrogen abundances. We draw attention to a larger width of the RGB than the blue horizontal branch (BHB) in U – B. This seems to agree with the effects predicted to be caused by molecular bands produced by nitrogen-containing molecules. We find that brighter BHB stars, especially the brightest ones, are more centrally concentrated. This implies that red giants that are redder in U – B, i.e., more nitrogen enriched and centrally concentrated, are the main progenitors of the brighter BHB stars. However, such a progenitor-progeny relationship disagrees with theoretical predictions and with the results on the elemental abundances in horizontal branch stars. We isolated the asymptotic giant branch clump and estimated the parameter $\Delta V_{\mathrm{ZAHB}}^{\mathrm{clump}} = 0.98 \pm 0.12$.

Diversification of the silverspot butterflies (Nymphalidae) in the Neotropics inferred from multi-locus DNA sequences.

Science.gov (United States)

Massardo, Darli; Fornel, Rodrigo; Kronforst, Marcus; Gonçalves, Gislene Lopes; Moreira, Gilson Rudinei Pires

2015-01-01

The tribe Heliconiini (Lepidoptera: Nymphalidae) is a diverse group of butterflies distributed throughout the Neotropics, which has been studied extensively, in particular the genus Heliconius. However, most of the other lineages, such as Dione, which are less diverse and considered basal within the group, have received little attention. Basic information, such as species limits and geographical distributions, remains uncertain for this genus. Here we used multilocus DNA sequence data and geographical distribution analysis across the entire range of Dione in the Neotropical region to make inferences on the evolutionary history of this poorly explored lineage. Bayesian time-tree reconstruction allows inferring two major diversification events in this tribe around 25 Mya. Lineages thought to be ancient, such as Dione and Agraulis, are as recent as Heliconius. Dione formed a monophyletic clade, sister to the genus Agraulis. Dione juno, D. glycera and D. moneta were reciprocally monophyletic and formed genetic clusters, with the first two more closely related to each other than to the third. Divergence time estimates support the hypothesis that speciation in Dione coincided with both the rise of Passifloraceae (the host plants) and the uplift of the Andes. Since the sister species D. glycera and D. moneta are specialized feeders on passion-vine lineages that are endemic to areas located either within or adjacent to the Andes, we inferred that they co-speciated with their host plants during this vicariant event. Copyright © 2014 Elsevier Inc. All rights reserved.
Integrative inference of population history in the Ibero-Maghrebian endemic Pleurodeles waltl (Salamandridae).

Science.gov (United States)

Gutiérrez-Rodríguez, Jorge; Barbosa, A Márcia; Martínez-Solano, Íñigo

2017-07-01

Inference of population histories from the molecular signatures of past demographic processes is challenging, but recent methodological advances in species distribution models and their integration in time-calibrated phylogeographic studies allow detailed reconstruction of complex biogeographic scenarios. We apply an integrative approach to infer the evolutionary history of the Iberian ribbed newt (Pleurodeles waltl), an Ibero-Maghrebian endemic with populations north and south of the Strait of Gibraltar. We analyzed an extensive multilocus dataset (mitochondrial and nuclear DNA sequences and ten polymorphic microsatellite loci) and found a deep east-west phylogeographic break in Iberian populations dating back to the Plio-Pleistocene. This break is inferred to result from vicariance associated with the formation of the Guadalquivir river basin. In contrast with previous studies, North African populations showed exclusive mtDNA haplotypes, and formed a monophyletic clade within the Eastern Iberian lineage in the mtDNA genealogy. On the other hand, microsatellites failed to recover Moroccan populations as a differentiated genetic cluster. This is interpreted to result from post-divergence gene flow based on the results of IMA2 and Migrate analyses. Thus, Moroccan populations would have originated after overseas dispersal from the Iberian Peninsula in the Pleistocene, with subsequent gene flow in more recent times, implying at least two trans-marine dispersal events. We modeled the distribution of the species and of each lineage, and projected these models back in time to infer climatically favourable areas during the mid-Holocene, the last glacial maximum (LGM) and the last interglacial (LIG), to reconstruct more recent population dynamics. We found minor differences in climatic favourability across lineages, suggesting intraspecific niche conservatism. Genetic diversity was significantly correlated with the intersection of environmental favourability in the LIG and
THE VERY MASSIVE STAR CONTENT OF THE NUCLEAR STAR CLUSTERS IN NGC 5253

Energy Technology Data Exchange (ETDEWEB)

Smith, L. J. [Space Telescope Science Institute and European Space Agency, 3700 San Martin Drive, Baltimore, MD 21218 (United States); Crowther, P. A. [Department of Physics and Astronomy, University of Sheffield, Sheffield S3 7RH (United Kingdom); Calzetti, D. [Department of Astronomy, University of Massachusetts—Amherst, Amherst, MA 01003 (United States); Sidoli, F., E-mail: lsmith@stsci.edu [London Centre for Nanotechnology, University College London, London WC1E 6BT (United Kingdom)

2016-05-20

The blue compact dwarf galaxy NGC 5253 hosts a very young starburst containing twin nuclear star clusters, separated by a projected distance of 5 pc. One cluster (#5) coincides with the peak of the Hα emission and the other (#11) with a massive ultracompact H II region. A recent analysis of these clusters shows that they have a photometric age of 1 ± 1 Myr, in apparent contradiction with the age of 3–5 Myr inferred from the presence of Wolf-Rayet features in the cluster #5 spectrum. We examine Hubble Space Telescope ultraviolet and Very Large Telescope optical spectroscopy of #5 and show that the stellar features arise from very massive stars (VMSs), with masses greater than 100 M⊙, at an age of 1–2 Myr. We further show that the very high ionizing flux from the nuclear clusters can only be explained if VMSs are present. We investigate the origin of the observed nitrogen enrichment in the circumcluster ionized gas and find that the excess N can be produced by massive rotating stars within the first 1 Myr. We find similarities between the NGC 5253 cluster spectrum and those of metal-poor, high-redshift galaxies. We discuss the presence of VMSs in young, star-forming galaxies at high redshift; these should be detected in rest-frame UV spectra to be obtained with the James Webb Space Telescope. We emphasize that population synthesis models with upper mass cutoffs greater than 100 M⊙ are crucial for future studies of young massive star clusters at all redshifts.
Inference in models with adaptive learning

NARCIS (Netherlands)

Chevillon, G.; Massmann, M.; Mavroeidis, S.

2010-01-01

Identification of structural parameters in models with adaptive learning can be weak, causing standard inference procedures to become unreliable. Learning also induces persistent dynamics, and this makes the distribution of estimators and test statistics non-standard. Valid inference can be
The quiescent intracluster medium in the core of the Perseus cluster

Energy Technology Data Exchange (ETDEWEB)

Aharonian, Felix; Akamatsu, Hiroki; Akimoto, Fumie; Allen, Steven W.; Anabuki, Naohisa; Angelini, Lorella; Arnaud, Keith; Audard, Marc; Awaki, Hisamitsu; Axelsson, Magnus; Bamba, Aya; Bautz, Marshall; Blandford, Roger; Brenneman, Laura; Brown, Gregory V.; Bulbul, Esra; Cackett, Edward; Chernyakova, Maria; Chiao, Meng; Coppi, Paolo; Costantini, Elisa; de Plaa, Jelle; den Herder, Jan-Willem; Done, Chris; Dotani, Tadayasu; Ebisawa, Ken; Eckart, Megan; Enoto, Teruaki; Ezoe, Yuichiro; Fabian, Andrew C.; Ferrigno, Carlo; Foster, Adam; Fujimoto, Ryuichi; Fukazawa, Yasushi; Furuzawa, Akihiro; Galeazzi, Massimiliano; Gallo, Luigi; Gandhi, Poshak; Giustini, Margherita; Goldwurm, Andrea; Gu, Liyi; Guainazzi, Matteo; Haba, Yoshito; Hagino, Kouichi; Hamaguchi, Kenji; Harrus, Ilana; Hatsukade, Isamu; Hayashi, Katsuhiro; Hayashi, Takayuki; Hayashida, Kiyoshi; Hiraga, Junko; Hornschemeier, Ann; Hoshino, Akio; Hughes, John; Iizuka, Ryo; Inoue, Hajime; Inoue, Yoshiyuki; Ishibashi, Kazunori; Ishida, Manabu; Ishikawa, Kumi; Ishisaki, Yoshitaka; Itoh, Masayuki; Iyomoto, Naoko; Kaastra, Jelle; Kallman, Timothy; Kamae, Tuneyoshi; Kara, Erin; Kataoka, Jun; Katsuda, Satoru; Katsuta, Junichiro; Kawaharada, Madoka; Kawai, Nobuyuki; Kelley, Richard; Khangulyan, Dmitry; Kilbourne, Caroline; King, Ashley; Kitaguchi, Takao; Kitamoto, Shunji; Kitayama, Tetsu; Kohmura, Takayoshi; Kokubun, Motohide; Koyama, Shu; Koyama, Katsuji; Kretschmar, Peter; Krimm, Hans; Kubota, Aya; Kunieda, Hideyo; Laurent, Philippe; Lebrun, François; Lee, Shiu-Hang; Leutenegger, Maurice; Limousin, Olivier; Loewenstein, Michael; Long, Knox S.; Lumb, David; Madejski, Grzegorz; Maeda, Yoshitomo; Maier, Daniel; Makishima, Kazuo; Markevitch, Maxim; Matsumoto, Hironori; Matsushita, Kyoko; McCammon, Dan; McNamara, Brian; Mehdipour, Missagh; Miller, Eric; Miller, Jon; Mineshige, Shin; Mitsuda, Kazuhisa; Mitsuishi, Ikuyuki; Miyazawa, Takuya; Mizuno, Tsunefumi; Mori, Hideyuki; Mori, Koji; Moseley, Harvey; Mukai, Koji; Murakami, 
Hiroshi; Murakami, Toshio; Mushotzky, Richard; Nagino, Ryo; Nakagawa, Takao; Nakajima, Hiroshi; Nakamori, Takeshi; Nakano, Toshio; Nakashima, Shinya; Nakazawa, Kazuhiro; Nobukawa, Masayoshi; Noda, Hirofumi; Nomachi, Masaharu; O’Dell, Steve; Odaka, Hirokazu; Ohashi, Takaya; Ohno, Masanori; Okajima, Takashi; Ota, Naomi; Ozaki, Masanobu; Paerels, Frits; Paltani, Stephane; Parmar, Arvind; Petre, Robert; Pinto, Ciro; Pohl, Martin; Porter, F. Scott; Pottschmidt, Katja; Ramsey, Brian; Reynolds, Christopher; Russell, Helen; Safi-Harb, Samar; Saito, Shinya; Sakai, Kazuhiro; Sameshima, Hiroaki; Sato, Goro; Sato, Kosuke; Sato, Rie; Sawada, Makoto; Schartel, Norbert; Serlemitsos, Peter; Seta, Hiromi; Shidatsu, Megumi; Simionescu, Aurora; Smith, Randall; Soong, Yang; Stawarz, Lukasz; Sugawara, Yasuharu; Sugita, Satoshi; Szymkowiak, Andrew; Tajima, Hiroyasu; Takahashi, Hiromitsu; Takahashi, Tadayuki; Takeda, Shin’ichiro; Takei, Yoh; Tamagawa, Toru; Tamura, Keisuke; Tamura, Takayuki; Tanaka, Takaaki; Tanaka, Yasuo; Tanaka, Yasuyuki; Tashiro, Makoto; Tawara, Yuzuru; Terada, Yukikatsu; Terashima, Yuichi; Tombesi, Francesco; Tomida, Hiroshi; Tsuboi, Yohko; Tsujimoto, Masahiro; Tsunemi, Hiroshi; Tsuru, Takeshi; Uchida, Hiroyuki; Uchiyama, Hideki; Uchiyama, Yasunobu; Ueda, Shutaro; Ueda, Yoshihiro; Ueno, Shiro; Uno, Shin’ichiro; Urry, Meg; Ursino, Eugenio; de Vries, Cor; Watanabe, Shin; Werner, Norbert; Wik, Daniel; Wilkins, Dan; Williams, Brian; Yamada, Shinya; Yamaguchi, Hiroya; Yamaoka, Kazutaka; Yamasaki, Noriko Y.; Yamauchi, Makoto; Yamauchi, Shigeo; Yaqoob, Tahir; Yatsu, Yoichi; Yonetoku, Daisuke; Yoshida, Atsumasa; Yuasa, Takayuki; Zhuravleva, Irina; Zoghbi, Abderahmen

2016-07-06

Clusters of galaxies are the most massive gravitationally bound objects in the Universe and are still forming. They are thus important probes of cosmological parameters and many astrophysical processes. However, knowledge of the dynamics of the pervasive hot gas, the mass of which is much larger than the combined mass of all the stars in the cluster, is lacking. Such knowledge would enable insights into the injection of mechanical energy by the central supermassive black hole and the use of hydrostatic equilibrium for determining cluster masses. X-rays from the core of the Perseus cluster are emitted by the 50-million-kelvin diffuse hot plasma filling its gravitational potential well. The active galactic nucleus of the central galaxy NGC 1275 is pumping jetted energy into the surrounding intracluster medium, creating buoyant bubbles filled with relativistic plasma. These bubbles probably induce motions in the intracluster medium and heat the inner gas, preventing runaway radiative cooling, a process known as active galactic nucleus feedback. Here we report X-ray observations of the core of the Perseus cluster, which reveal a remarkably quiescent atmosphere in which the gas has a line-of-sight velocity dispersion of 164 ± 10 kilometres per second in the region 30–60 kiloparsecs from the central nucleus. A gradient in the line-of-sight velocity of 150 ± 70 kilometres per second is found across the 60-kiloparsec image of the cluster core. Turbulent pressure support in the gas is four per cent of the thermodynamic pressure, with large-scale shear at most doubling this estimate. We infer that a total cluster mass determined from hydrostatic equilibrium in a central region would require little correction for turbulent pressure.
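The four per cent figure can be roughly checked from the numbers quoted in the abstract. The sketch below rests on assumptions the abstract does not state: isotropic turbulence, so that the turbulent pressure is ρσ² for a line-of-sight dispersion σ; an ideal-gas thermal pressure ρkT/(μ m_p); and a mean molecular weight μ ≈ 0.6. Under these assumptions the ratio reduces to μ m_p σ² / (kT). The function name is illustrative.

```python
K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV per kelvin
M_P_EV = 938.272088e6           # proton rest energy m_p * c^2 in eV
C_KM_S = 299792.458             # speed of light in km/s

def turbulent_pressure_fraction(sigma_km_s, temperature_k, mu=0.6):
    """Ratio of turbulent to thermal pressure, mu * m_p * sigma^2 / (k * T),
    assuming isotropic turbulence and an ideal gas (mu = 0.6 is an assumption)."""
    kinetic_ev = mu * M_P_EV * (sigma_km_s / C_KM_S) ** 2
    thermal_ev = K_B_EV_PER_K * temperature_k
    return kinetic_ev / thermal_ev

# Values quoted in the abstract: 164 km/s dispersion, 50-million-kelvin plasma.
fraction = turbulent_pressure_fraction(164.0, 5.0e7)  # roughly 0.04
```

The result lands near the quoted four per cent, although the published value will depend on the temperature and the treatment of bulk motions actually adopted by the authors.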
Fiducial inference - A Neyman-Pearson interpretation

NARCIS (Netherlands)

Salome, D; VonderLinden, W; Dose; Fischer, R; Preuss, R

1999-01-01

Fisher's fiducial argument is a tool for deriving inferences in the form of a probability distribution on the parameter space, not based on Bayes's Theorem. Lindley established that in exceptional situations fiducial inferences coincide with posterior distributions; in the other situations fiducial
Uncertainty in prediction and in inference

NARCIS (Netherlands)

Hilgevoord, J.; Uffink, J.

1991-01-01

The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close relationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in
Preliminary Test of Adaptive Neuro-Fuzzy Inference System Controller for Spacecraft Attitude Control

Directory of Open Access Journals (Sweden)

Sung-Woo Kim

2012-12-01

The problem of spacecraft attitude control is solved using an adaptive neuro-fuzzy inference system (ANFIS). An ANFIS produces a control signal for one of the three axes of a spacecraft’s body frame, so in total three ANFISs are constructed for 3-axis attitude control. The fuzzy inference system of the ANFIS is initialized using a subtractive clustering method. The ANFIS is trained by a hybrid learning algorithm using data obtained from attitude control simulations with a state-dependent Riccati equation controller. The training data set for each axis is composed of state errors for the 3 axes (roll, pitch, and yaw) and a control signal for one of the 3 axes. The stability region of the ANFIS controller is estimated numerically based on Lyapunov stability theory, using a numerical method to calculate the Jacobian matrix. To measure the performance of the ANFIS controller, root mean square error and correlation factor are used as performance indicators. The performance is tested on two ANFIS controllers trained in different conditions. The test results show that the performance indicators are appropriate, in the sense that the ANFIS controller with the larger stability region performs better according to these indicators.
Performance assessment of the SIMFAP parallel cluster at IFIN-HH Bucharest

International Nuclear Information System (INIS)

Adam, Gh.; Adam, S.; Ayriyan, A.; Dushanov, E.; Hayryan, E.; Korenkov, V.; Lutsenko, A.; Mitsyn, V.; Sapozhnikova, T.; Sapozhnikov, A; Streltsova, O.; Buzatu, F.; Dulea, M.; Vasile, I.; Sima, A.; Visan, C.; Busa, J.; Pokorny, I.

2008-01-01

Performance assessment and case study outputs of the parallel SIMFAP cluster at IFIN-HH Bucharest point to its effective and reliable operation. A comparison with results on the supercomputing system in LIT-JINR Dubna adds insight on resource allocation for problem solving by parallel computing. Solving models that require very large numbers of knots in the discretization mesh calls for migration to high performance computing based on parallel cluster architectures. Because the acquisition of ready-to-use parallel computing facilities was beyond the limited budgetary resources, the solution at IFIN-HH was to buy the hardware and the inter-processor network, and to implement in-house the open software for both the operating system and the parallel computing standard. The present paper reports the successful solution of these tasks. The implementation of the well-known HPL (High Performance LINPACK) Benchmark points to the effective and reliable operation of the cluster. The comparison of HPL outputs obtained on parallel clusters of different magnitudes shows that there is an optimum range of the order N of the linear algebraic system over which a given parallel cluster provides optimum parallel solutions. For the SIMFAP cluster, this range can be inferred to correspond to about $1$ to $2 \times 10^{4}$ linear algebraic equations. For an algorithm of polynomial complexity $N^{\alpha}$, the task sharing among p processors within a parallel solution mainly follows an $(N/p)^{\alpha}$ behaviour under peak performance achievement. Thus, while the problem complexity remains the same, a substantial decrease of the coefficient of the leading order of the polynomial complexity is achieved. (authors)
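The closing scaling argument can be written out explicitly. As an illustration only, assume a serial cost with a constant prefactor c (a symbol that does not appear in the abstract) and an ideal split of the work among p processors:

```latex
T_{1}(N) = c\,N^{\alpha},
\qquad
T_{p}(N) \approx c\left(\frac{N}{p}\right)^{\alpha}
         = \frac{c}{p^{\alpha}}\,N^{\alpha}
```

Under this assumption the exponent α, and hence the complexity class, is unchanged, while the leading coefficient is divided by p^α.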
Case-control geographic clustering for residential histories accounting for risk factors and covariates

Science.gov (United States)

2006-01-01

Background: Methods for analyzing space-time variation in risk in case-control studies typically ignore residential mobility. We develop an approach for analyzing case-control data for mobile individuals and apply it to study bladder cancer in 11 counties in southeastern Michigan. At this time data collection is incomplete and no inferences should be drawn; we analyze these data to demonstrate the novel methods. Global, local and focused clustering of residential histories for 219 cases and 437 controls is quantified using time-dependent nearest neighbor relationships. Business address histories for 268 industries that release known or suspected bladder cancer carcinogens are analyzed. A logistic model accounting for smoking, gender, age, race and education specifies the probability of being a case, and is incorporated into the cluster randomization procedures. Sensitivity of clustering to definition of the proximity metric is assessed for 1 to 75 k nearest neighbors. Results: Global clustering is partly explained by the covariates but remains statistically significant at 12 of the 14 levels of k considered. After accounting for the covariates, 26 local clusters are found in Lapeer, Ingham, Oakland and Jackson counties, with the clusters in Ingham and Oakland counties appearing in 1950 and persisting to the present. Statistically significant focused clusters are found about the business address histories of 22 industries located in Oakland (19 clusters), Ingham (2) and Jackson (1) counties. Clusters in central and southeastern Oakland County appear in the 1930s and persist to the present day. Conclusion: These methods provide a systematic approach for evaluating a series of increasingly realistic alternative hypotheses regarding the sources of excess risk. So long as selection of cases and controls is population-based and not geographically biased, these tools can provide insights into geographic risk factors that were not specifically assessed in the case
Case-control geographic clustering for residential histories accounting for risk factors and covariates

Directory of Open Access Journals (Sweden)

Goovaerts Pierre

2006-08-01

Full Text Available Background: Methods for analyzing space-time variation in risk in case-control studies typically ignore residential mobility. We develop an approach for analyzing case-control data for mobile individuals and apply it to study bladder cancer in 11 counties in southeastern Michigan. At this time data collection is incomplete and no inferences should be drawn; we analyze these data to demonstrate the novel methods. Global, local and focused clustering of residential histories for 219 cases and 437 controls is quantified using time-dependent nearest neighbor relationships. Business address histories for 268 industries that release known or suspected bladder cancer carcinogens are analyzed. A logistic model accounting for smoking, gender, age, race and education specifies the probability of being a case, and is incorporated into the cluster randomization procedures. Sensitivity of clustering to definition of the proximity metric is assessed for 1 to 75 k nearest neighbors. Results: Global clustering is partly explained by the covariates but remains statistically significant at 12 of the 14 levels of k considered. After accounting for the covariates, 26 local clusters are found in Lapeer, Ingham, Oakland and Jackson counties, with the clusters in Ingham and Oakland counties appearing in 1950 and persisting to the present. Statistically significant focused clusters are found about the business address histories of 22 industries located in Oakland (19 clusters), Ingham (2) and Jackson (1) counties. Clusters in central and southeastern Oakland County appear in the 1930s and persist to the present day. Conclusion: These methods provide a systematic approach for evaluating a series of increasingly realistic alternative hypotheses regarding the sources of excess risk. So long as selection of cases and controls is population-based and not geographically biased, these tools can provide insights into geographic risk factors that were not specifically
Hybrid clustering based fuzzy structure for vibration control - Part 1: A novel algorithm for building neuro-fuzzy system

Science.gov (United States)

Nguyen, Sy Dzung; Nguyen, Quoc Hung; Choi, Seung-Bok

2015-01-01

This paper presents a new algorithm, called B-ANFIS, for building an adaptive neuro-fuzzy inference system (ANFIS) from a training data set. To increase the accuracy of the model, the following steps are carried out. First, a data merging rule is proposed to build and perform a data-clustering strategy. Subsequently, a combination of clustering processes in the input data space and in the joint input-output data space is presented. The crucial reason for this step is to overcome problems related to initialization and contradictory fuzzy rules, which usually arise when building an ANFIS. The clustering process in the input data space is accomplished with a proposed merging-possibilistic clustering (MPC) algorithm. The effectiveness of this process is evaluated to resume a clustering process in the joint input-output data space. The optimal parameters obtained after completion of the clustering process are used to build the ANFIS. Simulations based on numerical data, 'Daily Data of Stock A', and measured data sets of a smart damper are performed to analyze and estimate accuracy. In addition, convergence and robustness of the proposed algorithm are investigated using both theoretical and testing approaches.
Polynomial Chaos Surrogates for Bayesian Inference

KAUST Repository

Le Maitre, Olivier

2016-01-06

Bayesian inference is a popular probabilistic method for solving inverse problems, such as the identification of a field parameter in a PDE model. The inference relies on Bayes' rule to update the prior density of the sought field from observations and to derive its posterior distribution. In most cases the posterior distribution has no explicit form and has to be sampled, for instance using a Markov chain Monte Carlo method. In practice the prior field parameter is decomposed and truncated (e.g. by means of a Karhunen-Loève decomposition) to recast the inference problem as the inference of a finite number of coordinates. Although it has proved effective in many situations, Bayesian inference as sketched above faces several difficulties that call for improvements. First, sampling the posterior can be an extremely costly task, as it requires multiple resolutions of the PDE model for different values of the field parameter. Second, when the observations are not very informative, the inferred parameter field can depend heavily on its prior, which can be somewhat arbitrary. These issues have motivated the introduction of reduced modeling or surrogates for the (approximate) determination of the parametrized PDE solution, and of hyperparameters in the description of the prior field. Our contribution focuses on recent developments in these two directions: the acceleration of the posterior sampling by means of Polynomial Chaos expansions and the efficient treatment of parametrized covariance functions for the prior field. We also discuss the possibility of making such an approach adaptive to further improve its efficiency.
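The posterior sampling step described in this abstract can be illustrated with a minimal random-walk Metropolis sketch. Everything specific here is an assumption made for illustration: the forward model is a trivial stand-in (the prediction equals the parameter) rather than a PDE solve, the prior and noise are Gaussian with hypothetical standard deviations, and no Polynomial Chaos surrogate is involved.

```python
import math
import random


def log_posterior(theta, observations, noise_sd=0.5, prior_sd=2.0):
    """Unnormalized log posterior for a toy model (hypothetical stand-in for a PDE)."""
    log_prior = -0.5 * (theta / prior_sd) ** 2
    # Toy forward model: every observation is predicted to equal theta.
    log_like = sum(-0.5 * ((y - theta) / noise_sd) ** 2 for y in observations)
    return log_prior + log_like


def metropolis(observations, n_steps=5000, step_sd=0.5, seed=0):
    """Random-walk Metropolis sampler; returns the list of visited states."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_posterior(theta, observations)
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step_sd)
        candidate = log_posterior(proposal, observations)
        # Accept with probability min(1, exp(candidate - current)).
        if candidate >= current or rng.random() < math.exp(candidate - current):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples
```

Each step calls the forward model once, which is why a cheap surrogate for an expensive PDE solve can reduce the sampling cost so much.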
Inference and Analysis of Population Structure Using Genetic Data and Network Theory.

Science.gov (United States)

Greenbaum, Gili; Templeton, Alan R; Bar-David, Shirli

2016-04-01

Clustering individuals to subpopulations based on genetic data has become commonplace in many genetic studies. Inference about population structure is most often done by applying model-based approaches, aided by visualization using distance-based approaches such as multidimensional scaling. While existing distance-based approaches suffer from a lack of statistical rigor, model-based approaches entail assumptions of prior conditions such as that the subpopulations are at Hardy-Weinberg equilibria. Here we present a distance-based approach for inference about population structure using genetic data by defining population structure using network theory terminology and methods. A network is constructed from a pairwise genetic-similarity matrix of all sampled individuals. The community partition, a partition of a network to dense subgraphs, is equated with population structure, a partition of the population to genetically related groups. Community-detection algorithms are used to partition the network into communities, interpreted as a partition of the population to subpopulations. The statistical significance of the structure can be estimated by using permutation tests to evaluate the significance of the partition's modularity, a network theory measure indicating the quality of community partitions. To further characterize population structure, a new measure of the strength of association (SA) for an individual to its assigned community is presented. The strength of association distribution (SAD) of the communities is analyzed to provide additional population structure characteristics, such as the relative amount of gene flow experienced by the different subpopulations and identification of hybrid individuals. Human genetic data and simulations are used to demonstrate the applicability of the analyses. The approach presented here provides a novel, computationally efficient model-free method for inference about population structure that does not entail assumption of
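The modularity measure mentioned in this abstract can be sketched as follows. This is a minimal illustration using the standard Newman modularity for a weighted, undirected network without self-loops; the edge-list format and function name are hypothetical, and the permutation test and strength-of-association measure from the abstract are not reproduced.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition.

    edges: iterable of (u, v, weight) for an undirected network without self-loops.
    community_of: dict mapping each node to its community label.
    """
    total_weight = 0.0
    internal = defaultdict(float)  # edge weight inside each community
    degree = defaultdict(float)    # summed weighted degree of each community
    for u, v, w in edges:
        total_weight += w
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    if total_weight == 0:
        raise ValueError("network has no edge weight")
    # Q = sum over communities of (internal fraction - expected fraction).
    return sum(
        internal[c] / total_weight - (degree[c] / (2.0 * total_weight)) ** 2
        for c in degree
    )
```

In a genetic setting the weights would come from the pairwise similarity matrix, and a permutation test would compare the observed Q with values obtained from shuffled data.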
Interactive Instruction in Bayesian Inference

DEFF Research Database (Denmark)

Khan, Azam; Breslav, Simon; Hornbæk, Kasper

2018-01-01

An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction. These principles concern coherence, personalization, signaling, segmenting, multimedia, spatial contiguity, and pretraining. Principles of self-explanation and interactivity are also applied. Four experiments on the Mammography Problem showed that these principles help participants answer the questions... that an instructional approach to improving human performance in Bayesian inference is a promising direction.
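The kind of calculation the Mammography Problem asks for can be sketched with Bayes' rule. The numbers below (1% prevalence, 80% sensitivity, 9.6% false-positive rate) are the values commonly quoted for this problem and are an assumption here; the abstract itself does not state them, and the function name is hypothetical.

```python
def posterior_probability(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) computed with Bayes' rule."""
    for value in (prior, sensitivity, false_positive_rate):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be probabilities in [0, 1]")
    true_positive = prior * sensitivity
    false_positive = (1.0 - prior) * false_positive_rate
    evidence = true_positive + false_positive  # P(positive test)
    if evidence == 0.0:
        raise ValueError("a positive test has zero probability")
    return true_positive / evidence


# Assumed illustrative values, not taken from the abstract.
p = posterior_probability(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"P(cancer | positive mammogram) = {p:.3f}")
```

With these assumed inputs the posterior is a little under 8%, far lower than most people intuitively expect, which is what makes the problem a useful teaching case.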
Inferring Phylogenetic Networks Using PhyloNet.

Science.gov (United States)

Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay

2018-07-01

PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format, which is readily viewable by existing visualization software.
Active inference and learning.

Science.gov (United States)

Friston, Karl; FitzGerald, Thomas; Rigoli, Francesco; Schwartenbeck, Philipp; O'Doherty, John; Pezzulo, Giovanni

2016-09-01

This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
A spatio-temporal nonparametric Bayesian variable selection model of fMRI data for clustering correlated time courses.

Science.gov (United States)

Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina

2014-07-15

In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows us to detect regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, to infer the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, thereby capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. Copyright © 2014 Elsevier Inc. All rights reserved.
DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data.

Science.gov (United States)

Sun, Zhe; Wang, Ting; Deng, Ke; Wang, Xiao-Feng; Lafyatis, Robert; Ding, Ying; Hu, Ming; Chen, Wei

2018-01-01

Single cell transcriptome sequencing (scRNA-Seq) has become a revolutionary tool to study cellular and molecular processes at single cell resolution. Among existing technologies, the recently developed droplet-based platform enables efficient parallel processing of thousands of single cells with direct counting of transcript copies using Unique Molecular Identifiers (UMIs). Despite these technological advances, statistical methods and computational tools are still lacking for analyzing droplet-based scRNA-Seq data. In particular, model-based approaches for clustering large-scale single cell transcriptomic data are still under-explored. We developed DIMM-SC, a Dirichlet Mixture Model for clustering droplet-based Single Cell transcriptomic data. This approach explicitly models UMI count data from scRNA-Seq experiments and characterizes variations across different cell clusters via a Dirichlet mixture prior. We performed comprehensive simulations to evaluate DIMM-SC and compared it with existing clustering methods such as K-means, CellTree and Seurat. In addition, we analyzed public scRNA-Seq datasets with known cluster labels and in-house scRNA-Seq datasets from a study of systemic sclerosis with prior biological knowledge to benchmark and validate DIMM-SC. Both simulation studies and real data applications demonstrated that, overall, DIMM-SC achieves substantially improved clustering accuracy and much lower clustering variability compared to other existing clustering methods. More importantly, as a model-based approach, DIMM-SC is able to quantify the clustering uncertainty for each single cell, facilitating rigorous statistical inference and biological interpretations, which are typically unavailable from existing clustering methods. DIMM-SC has been implemented in a user-friendly R package with a detailed tutorial available on www.pitt.edu/~wec47/singlecell.html. Contact: wei.chen@chp.edu or hum@ccf.org. Supplementary data are available at Bioinformatics online. © The Author
Brightest Cluster Galaxies in REXCESS Clusters

Science.gov (United States)

Haarsma, Deborah B.; Leisman, L.; Bruch, S.; Donahue, M.

2009-01-01

Most galaxy clusters contain a Brightest Cluster Galaxy (BCG) which is larger than the other cluster ellipticals and has a more extended profile. In the hierarchical model, the BCG forms through many galaxy mergers in the crowded center of the cluster, and thus its properties give insight into the assembly of the cluster as a whole. In this project, we are working with the Representative XMM-Newton Cluster Structure Survey (REXCESS) team (Boehringer et al 2007) to study BCGs in 33 X-ray luminous galaxy clusters, 0.055 < z < 0.183. We are imaging the BCGs in R band at the Southern Observatory for Astrophysical Research (SOAR) in Chile. In this poster, we discuss our methods and give preliminary measurements of the BCG magnitudes, morphology, and stellar mass. We compare these BCG properties with the properties of their host clusters, particularly of the X-ray emitting gas.

Active Inference, homeostatic regulation and adaptive behavioural control.

Science.gov (United States)

Pezzulo, Giovanni; Rigoli, Francesco; Friston, Karl

2015-11-01

We review a theory of homeostatic regulation and adaptive behavioural control within the Active Inference framework. Our aim is to connect two research streams that are usually considered independently; namely, Active Inference and associative learning theories of animal behaviour. The former uses a probabilistic (Bayesian) formulation of perception and action, while the latter calls on multiple (Pavlovian, habitual, goal-directed) processes for homeostatic and behavioural control. We offer a synthesis of these classical processes and cast them as successive hierarchical contextualisations of sensorimotor constructs, using the generative models that underpin Active Inference. This dissolves any apparent mechanistic distinction between the optimization processes that mediate classical control or learning. Furthermore, we generalize the scope of Active Inference by emphasizing interoceptive inference and homeostatic regulation. The ensuing homeostatic (or allostatic) perspective provides an intuitive explanation for how priors act as drives or goals to enslave action, and emphasises the embodied nature of inference. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Generative Inferences Based on Learned Relations

Science.gov (United States)

Chen, Dawn; Lu, Hongjing; Holyoak, Keith J.

2017-01-01

A key property of relational representations is their "generativity": From partial descriptions of relations between entities, additional inferences can be drawn about other entities. A major theoretical challenge is to demonstrate how the capacity to make generative inferences could arise as a result of learning relations from…
Local wavelet correlation: application to timing analysis of multi-satellite CLUSTER data

Directory of Open Access Journals (Sweden)

J. Soucek

2004-12-01

Full Text Available Multi-spacecraft space observations, such as those of CLUSTER, can be used to infer information about local plasma structures by exploiting the timing differences between subsequent encounters of these structures by individual satellites. We introduce a novel wavelet-based technique, the Local Wavelet Correlation (LWC), which allows one to match the corresponding signatures of large-scale structures in the data from multiple spacecraft and determine the relative time shifts between the crossings. The LWC is especially suitable for analysis of strongly non-stationary time series, where it enables one to estimate the time lags in a more robust and systematic way than ordinary cross-correlation techniques. The technique, together with its properties and some examples of its application to timing analysis of bow shock and magnetopause crossings observed by CLUSTER, is presented. We also compare the performance and reliability of the technique with classical discontinuity analysis methods. Key words. Radio science (signal processing); Space plasma physics (discontinuities; instruments and techniques)
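For orientation, the ordinary cross-correlation baseline that the abstract contrasts with the LWC can be sketched as below. This is not the LWC itself: it is a minimal lag search over two equally sampled series, and the function name, the mean removal, and the integer-sample lag grid are assumptions made for illustration.

```python
def best_lag(a, b, max_lag):
    """Return the shift s (in samples) that maximizes sum_i a[i] * b[i + s].

    A positive result means the feature appears later in b than in a.
    """
    if not a or not b or max_lag < 0:
        raise ValueError("need non-empty series and a non-negative max_lag")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, value in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):  # use only the overlapping samples
                score += (value - mean_a) * (b[j] - mean_b)
        if score > best_score:
            best, best_score = lag, score
    return best
```

Multiplying the returned lag by the sampling interval gives the time shift between two spacecraft crossings; a single global lag of this kind is what becomes unreliable for strongly non-stationary series.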
Metal cluster compounds - chemistry and importance; clusters containing isolated main group element atoms, large metal cluster compounds, cluster fluxionality

International Nuclear Information System (INIS)

Walther, B.

1988-01-01

This part of the review on metal cluster compounds deals with clusters containing isolated main group element atoms, with high nuclearity clusters, and with metal cluster fluxionality. It is shown that main group element atoms strongly influence the geometry, stability and reactivity of the clusters. High nuclearity clusters are of interest in their own right because of the diversity of the structures adopted, but their intermediate position between molecules and the metallic state also makes them a fascinating research object. Both of these aspects of metal cluster chemistry, as well as the frequently observed ligand and core fluxionality, are related to the analogy between metal clusters and metal surfaces. (author)
Parametric statistical inference basic theory and modern approaches

CERN Document Server

Zacks, Shelemyahu; Tsokos, C P

1981-01-01

Parametric Statistical Inference: Basic Theory and Modern Approaches presents the developments and modern trends in statistical inference to students who do not have advanced mathematical and statistical preparation. The topics discussed in the book are basic and common to many fields of statistical inference and thus serve as a jumping board for in-depth study. The book is organized into eight chapters. Chapter 1 provides an overview of how the theory of statistical inference is presented in subsequent chapters. Chapter 2 briefly discusses statistical distributions and their properties. Chapt
CONSTRAINING THE SCATTER IN THE MASS-RICHNESS RELATION OF maxBCG CLUSTERS WITH WEAK LENSING AND X-RAY DATA

International Nuclear Information System (INIS)

Rozo, Eduardo; Rykoff, Eli S.; Evrard, August; McKay, Timothy; Hao Jiangang; Becker, Matthew; Wechsler, Risa H.; Koester, Benjamin P.; Hansen, Sarah; Frieman, Joshua; Sheldon, Erin; Johnston, David; Annis, James

2009-01-01

We measure the logarithmic scatter in mass at fixed richness for clusters in the maxBCG cluster catalog, an optically selected cluster sample drawn from Sloan Digital Sky Survey imaging data. Our measurement is achieved by demanding consistency between available weak-lensing and X-ray measurements of the maxBCG clusters, and the X-ray luminosity-mass relation inferred from the 400 days X-ray cluster survey, a flux-limited X-ray cluster survey. We find $\sigma_{\ln M|N_{200}} = 0.45^{+0.20}_{-0.18}$ (95% CL) at $N_{200} \sim 40$, where $N_{200}$ is the number of red sequence galaxies in a cluster. As a byproduct of our analysis, we also obtain a constraint on the correlation coefficient between $\ln L_X$ and $\ln M$ at fixed richness, which is best expressed as a lower limit, $r_{L,M|N} \geq 0.85$ (95% CL). This is the first observational constraint placed on a correlation coefficient involving two different cluster mass tracers. We use our results to produce a state-of-the-art estimate of the halo mass function at z = 0.23 (the median redshift of the maxBCG cluster sample) and find that it is consistent with the WMAP5 cosmology. Both the mass function data and its covariance matrix are presented.
Constraining the Scatter in the Mass-Richness Relation of maxBCG Clusters With Weak Lensing and X-ray Data

Energy Technology Data Exchange (ETDEWEB)

Rozo, Eduardo; /Ohio State U.; Rykoff, Eli S.; /UC, Santa Barbara; Evrard, August; /Michigan U.; Becker, Matthew R.; /Chicago U.; McKay, Timothy; /Michigan U.; Wechsler, Risa H.; /SLAC; Koester, Benjamin P.; /Chicago U. /KICP, Chicago; Hao, Jiangang; /Michigan U.; Hansen, Sarah; /Chicago U. /KICP, Chicago; Sheldon, Erin; /New York U.; Johnston, David; /Houston U.; Annis, James T.; /Fermilab; Frieman, Joshua A.; /Chicago U. /KICP, Chicago /Fermilab

2009-08-03

We measure the logarithmic scatter in mass at fixed richness for clusters in the maxBCG cluster catalog, an optically selected cluster sample drawn from SDSS imaging data. Our measurement is achieved by demanding consistency between available weak lensing and X-ray measurements of the maxBCG clusters, and the X-ray luminosity-mass relation inferred from the 400d X-ray cluster survey, a flux limited X-ray cluster survey. We find $\sigma_{\ln M|N_{200}} = 0.45^{+0.20}_{-0.18}$ (95% CL) at $N_{200} \approx 40$, where $N_{200}$ is the number of red sequence galaxies in a cluster. As a byproduct of our analysis, we also obtain a constraint on the correlation coefficient between $\ln L_X$ and $\ln M$ at fixed richness, which is best expressed as a lower limit, $r_{L,M|N} \geq 0.85$ (95% CL). This is the first observational constraint placed on a correlation coefficient involving two different cluster mass tracers. We use our results to produce a state of the art estimate of the halo mass function at z = 0.23 (the median redshift of the maxBCG cluster sample) and find that it is consistent with the WMAP5 cosmology. Both the mass function data and its covariance matrix are presented.
A New Method to Constrain Supernova Fractions Using X-ray Observations of Clusters of Galaxies

Science.gov (United States)

Bulbul, Esra; Smith, Randall K.; Loewenstein, Michael

2012-01-01

Supernova (SN) explosions enrich the intracluster medium (ICM) both by creating and dispersing metals. We introduce a method to measure the number of SNe and relative contribution of Type Ia supernovae (SNe Ia) and core-collapse supernovae (SNe cc) by directly fitting X-ray spectral observations. The method has been implemented as an XSPEC model called snapec. snapec utilizes a single-temperature thermal plasma code (apec) to model the spectral emission based on metal abundances calculated using the latest SN yields from SN Ia and SN cc explosion models. This approach provides a self-consistent single set of uncertainties on the total number of SN explosions and relative fraction of SN types in the ICM over the cluster lifetime by directly allowing these parameters to be determined by SN yields provided by simulations. We apply our approach to XMM-Newton European Photon Imaging Camera (EPIC), Reflection Grating Spectrometer (RGS), and 200 ks simulated Astro-H observations of a cooling flow cluster, A3112. We find that various sets of SN yields present in the literature produce an acceptable fit to the EPIC and RGS spectra of A3112. We infer that 30.3% ± 5.4% to 37.1% ± 7.1% of the total SN explosions are SNe Ia, and the total number of SN explosions required to create the observed metals is in the range of $(1.06 \pm 0.34) \times 10^{9}$ to $(1.28 \pm 0.43) \times 10^{9}$, from snapec fits to RGS spectra. These values may be compared to the enrichment expected based on well-established empirically measured SN rates per star formed. The proportions of SNe Ia and SNe cc inferred to have enriched the ICM in the inner 52 kiloparsecs of A3112 are consistent with these specific rates, if one applies a correction for the metals locked up in stars. At the same time, the inferred level of SN enrichment corresponds to a star-to-gas mass ratio that is several times greater than the 10% estimated globally for clusters in the A3112 mass range.
Variational inference & deep learning: A new synthesis

OpenAIRE

Kingma, D.P.

2017-01-01

In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.
Variational inference & deep learning : A new synthesis

NARCIS (Netherlands)

Kingma, D.P.

2017-01-01

In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.
PREFACE: Nuclear Cluster Conference; Cluster'07

Science.gov (United States)

Freer, Martin

2008-05-01

The Cluster Conference is a long-running conference series dating back to the 1960s, the first being initiated by Wildermuth in Bochum, Germany, in 1969. The most recent meeting was held in Nara, Japan, in 2003, and in 2007 the 9th Cluster Conference was held in Stratford-upon-Avon, UK. As the name suggests, the town of Stratford lies upon the River Avon and, shortly before the conference, due to unprecedented rainfall in the area (approximately 10 cm within half a day), lay in the River Avon! Stratford is the birthplace of the 'Bard of Avon', William Shakespeare, and this formed an intriguing conference backdrop. The meeting was attended by some 90 delegates and the programme contained 65–70 oral presentations; it was opened by a historical perspective presented by Professor Brink (Oxford) and closed by Professor Horiuchi (RCNP) with an overview of the conference and future perspectives. In between, the conference covered aspects of clustering in exotic nuclei (both neutron- and proton-rich), molecular structures in which valence neutrons are exchanged between cluster cores, condensates in nuclei, neutron clusters, superheavy nuclei, clusters in nuclear astrophysical processes and exotic cluster decays such as 2p and ternary cluster decay. The field of nuclear clustering has become strongly influenced by the physics of radioactive beam facilities (reflected in the programme), and by the excitement that clustering may have an important impact on the structure of nuclei at the neutron drip-line. It was clear that since Nara the field had progressed substantially and that new themes had emerged and others had crystallized. Two particular topics resonated strongly: condensates and nuclear molecules. These topics are thus likely to be central in the next cluster conference, which will be held in 2011 in the Hungarian city of Debrecen. Martin Freer
A measurement of gravitational lensing of the cosmic microwave background by galaxy clusters using data from the south pole telescope

Energy Technology Data Exchange (ETDEWEB)

Baxter, E. J.; Keisler, R.; Dodelson, S.; Aird, K. A.; Allen, S. W.; Ashby, M. L. N.; Bautz, M.; Bayliss, M.; Benson, B. A.; Bleem, L. E.; Bocquet, S.; Brodwin, M.; Carlstrom, J. E.; Chang, C. L.; Chiu, I.; Cho, H-M.; Clocchiatti, A.; Crawford, T. M.; Crites, A. T.; Desai, S.; Dietrich, J. P.; de Haan, T.; Dobbs, M. A.; Foley, R. J.; Forman, W. R.; George, E. M.; Gladders, M. D.; Gonzalez, A. H.; Halverson, N. W.; Harrington, N. L.; Hennig, C.; Hoekstra, H.; Holder, G. P.; Holzapfel, W. L.; Hou, Z.; Hrubes, J. D.; Jones, C.; Knox, L.; Lee, A. T.; Leitch, E. M.; Liu, J.; Lueker, M.; Luong-Van, D.; Mantz, A.; Marrone, D. P.; McDonald, M.; McMahon, J. J.; Meyer, S. S.; Millea, M.; Mocanu, L. M.; Murray, S. S.; Padin, S.; Pryke, C.; Reichardt, C. L.; Rest, A.; Ruhl, J. E.; Saliwanchik, B. R.; Saro, A.; Sayre, J. T.; Schaffer, K. K.; Shirokoff, E.; Song, J.; Spieler, H. G.; Stalder, B.; Stanford, S. A.; Staniszewski, Z.; Stark, A. A.; Story, K. T.; van Engelen, A.; Vanderlinde, K.; Vieira, J. D.; Vikhlinin, A.; Williamson, R.; Zahn, O.; Zenteno, A.

2015-06-20

Clusters of galaxies are expected to gravitationally lens the cosmic microwave background (CMB) and thereby generate a distinct signal in the CMB on arcminute scales. Measurements of this effect can be used to constrain the masses of galaxy clusters with CMB data alone. Here we present a measurement of lensing of the CMB by galaxy clusters using data from the South Pole Telescope (SPT). We develop a maximum likelihood approach to extract the CMB cluster lensing signal and validate the method on mock data. We quantify the effects on our analysis of several potential sources of systematic error and find that they generally act to reduce the best-fit cluster mass. It is estimated that this bias to lower cluster mass is roughly 0.85σ in units of the statistical error bar, although this estimate should be viewed as an upper limit. We apply our maximum likelihood technique to 513 clusters selected via their Sunyaev–Zeldovich (SZ) signatures in SPT data, and rule out the null hypothesis of no lensing at 3.1σ. The lensing-derived mass estimate for the full cluster sample is consistent with that inferred from the SZ flux: $M_{200,\mathrm{lens}} = 0.83_{-0.37}^{+0.38}\, M_{200,\mathrm{SZ}}$ (68% C.L., statistical error only).
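As a quick reading of the quoted ratio, the distance between the best-fit value and unity can be expressed in units of the statistical error. The sketch below assumes the upper error bar (+0.38) is the relevant one, since the best fit lies below 1, and treats that error as roughly Gaussian; neither assumption is stated in the abstract.

```latex
\frac{1 - 0.83}{0.38} \approx 0.45
```

Under those assumptions the lensing-derived mass sits about 0.45σ below the SZ-derived mass, which is what "consistent" amounts to here (statistical error only).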
Combining Galaxy-Galaxy Lensing and Galaxy Clustering

Energy Technology Data Exchange (ETDEWEB)

Park, Youngsoo [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Krause, Elisabeth [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Dodelson, Scott [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Jain, Bhuvnesh [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Amara, Adam [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Becker, Matt [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Bridle, Sarah [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Clampitt, Joseph [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Crocce, Martin [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Honscheid, Klaus [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Gaztanaga, Enrique [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Sanchez, Carles [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Wechsler, Risa [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)

2015-01-01

Combining galaxy-galaxy lensing and galaxy clustering is a promising method for inferring the growth rate of large scale structure, a quantity that will shed light on the mechanism driving the acceleration of the Universe. The Dark Energy Survey (DES) is a prime candidate for such an analysis, with its measurements of both the distribution of galaxies on the sky and the tangential shears of background galaxies induced by these foreground lenses. By constructing an end-to-end analysis that combines large-scale galaxy clustering and small-scale galaxy-galaxy lensing, we also forecast the potential of a combined probes analysis on DES datasets. In particular, we develop a practical approach to a DES combined probes analysis by jointly modeling the assumptions and systematics affecting the different components of the data vector, employing a shared halo model, HOD parametrization, photometric redshift errors, and shear measurement errors. Furthermore, we study the effect of external priors on different subsets of these parameters. We conclude that DES data will provide powerful constraints on the evolution of structure growth in the universe, conservatively/optimistically constraining the growth function to 8%/4.9% with its first-year data covering 1000 square degrees, and to 4%/2.3% with its full five-year data covering 5000 square degrees.
Constraint Satisfaction Inference : Non-probabilistic Global Inference for Sequence Labelling

NARCIS (Netherlands)

Canisius, S.V.M.; van den Bosch, A.; Daelemans, W.; Basili, R.; Moschitti, A.

2006-01-01

We present a new method for performing sequence labelling based on the idea of using a machine-learning classifier to generate several possible output sequences, and then applying an inference procedure to select the best sequence among those. Most sequence labelling methods following a similar
Population structure of Atlantic Mackerel inferred from RAD-seq derived SNP markers: effects of sequence clustering parameters and hierarchical SNP selection

KAUST Repository

Rodríguez-Ezpeleta, Naiara

2016-03-03

Restriction-site associated DNA sequencing (RAD-seq) and related methods are revolutionizing the field of population genomics in non-model organisms, as they allow generating an unprecedented number of single nucleotide polymorphisms (SNPs) even when no genomic information is available. Yet RAD-seq data analyses rely on assumptions about the nature and number of nucleotide variants present in a single locus, the choice of which may lead to an under- or overestimated number of SNPs and/or to incorrectly called genotypes. Using the Atlantic mackerel (Scomber scombrus L.) and a close relative, the Atlantic chub mackerel (Scomber colias), as a case study, here we explore the sensitivity of population structure inferences to two crucial aspects of RAD-seq data analysis: the maximum number of mismatches allowed to merge reads into a locus and the relatedness of the individuals used for genotype calling and SNP selection. Our study resolves the population structure of the Atlantic mackerel but, most importantly, provides insights into the effects of alternative RAD-seq data analysis strategies on population structure inferences that are directly applicable to other species.
Reasoning about Informal Statistical Inference: One Statistician's View

Science.gov (United States)

Rossman, Allan J.

2008-01-01

This paper identifies key concepts and issues associated with the reasoning of informal statistical inference. I focus on key ideas of inference that I think all students should learn, including at secondary level as well as tertiary. I argue that a fundamental component of inference is to go beyond the data at hand, and I propose that statistical…
Meta-learning framework applied in bioinformatics inference system design.

Science.gov (United States)

Arredondo, Tomás; Ormazábal, Wladimir

2015-01-01

This paper describes a meta-learner inference system development framework which is applied and tested in the implementation of bioinformatic inference systems. These inference systems are used for the systematic classification of the best candidates for inclusion in bacterial metabolic pathway maps. This meta-learner-based approach utilises a workflow where the user provides feedback with final classification decisions which are stored in conjunction with analysed genetic sequences for periodic inference system training. The inference systems were trained and tested with three different data sets related to the bacterial degradation of aromatic compounds. The analysis of the meta-learner-based framework involved contrasting several different optimisation methods with various different parameters. The obtained inference systems were also contrasted with other standard classification methods with accurate prediction capabilities observed.
Formation of stable products from cluster-cluster collisions

International Nuclear Information System (INIS)

Alamanova, Denitsa; Grigoryan, Valeri G; Springborg, Michael

2007-01-01

The formation of stable products from copper cluster-cluster collisions is investigated by using classical molecular-dynamics simulations in combination with an embedded-atom potential. The dependence of the product clusters on impact energy, relative orientation of the clusters, and size of the clusters is studied. The structures and total energies of the product clusters are analysed and compared with those of the colliding clusters before impact. These results, together with the internal temperature, are used in obtaining an increased understanding of cluster fusion processes.
Extending the Functionality of Behavioural Change-Point Analysis with k-Means Clustering: A Case Study with the Little Penguin (Eudyptula minor)

Science.gov (United States)

Zhang, Jingjing; Dennis, Todd E.

2015-01-01

We present a simple framework for classifying mutually exclusive behavioural states within the geospatial lifelines of animals. This method involves use of three sequentially applied statistical procedures: (1) behavioural change point analysis to partition movement trajectories into discrete bouts of same-state behaviours, based on abrupt changes in the spatio-temporal autocorrelation structure of movement parameters; (2) hierarchical multivariate cluster analysis to determine the number of different behavioural states; and (3) k-means clustering to classify inferred bouts of same-state location observations into behavioural modes. We demonstrate application of the method by analysing synthetic trajectories of known ‘artificial behaviours’ comprised of different correlated random walks, as well as real foraging trajectories of little penguins (Eudyptula minor) obtained by global-positioning-system telemetry. Our results show that the modelling procedure correctly classified 92.5% of all individual location observations in the synthetic trajectories, demonstrating reasonable ability to successfully discriminate behavioural modes. Most individual little penguins were found to exhibit three unique behavioural states (resting, commuting/active searching, area-restricted foraging), with variation in the timing and locations of observations apparently related to ambient light, bathymetry, and proximity to coastlines and river mouths. Addition of k-means clustering extends the utility of behavioural change point analysis, by providing a simple means through which the behaviours inferred for the location observations comprising individual movement trajectories can be objectively classified. PMID:25922935
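The third step of the framework above (k-means classification of bouts into behavioural modes) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-bout features (mean speed, mean absolute turning angle), their values, and the helper `make_bouts` are hypothetical; the number of clusters is simply set to three, standing in for the hierarchical-clustering step; and a deterministic farthest-point initialisation is swapped in so the result does not depend on a random seed. The change-point step that would produce the bouts is not shown.

```python
# Hypothetical per-bout features: (mean speed in m/s, mean absolute turning angle in rad).
def make_bouts(center, n=10):
    """Generate n synthetic bouts tightly scattered around a feature centre."""
    cx, cy = center
    return [(cx + 0.05 * ((i % 3) - 1), cy + 0.05 * (i % 2)) for i in range(n)]


def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from all centroids chosen so far (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each bout goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Three synthetic "behaviours": resting, commuting, area-restricted foraging.
bouts = make_bouts((0.1, 0.2)) + make_bouts((2.0, 0.1)) + make_bouts((0.8, 1.5))
labels, centroids = kmeans(bouts, k=3)
print(labels)
```

With groups this well separated, every bout from the same synthetic behaviour receives the same label; real trajectories would need feature scaling and a data-driven choice of k.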
Extending the Functionality of Behavioural Change-Point Analysis with k-Means Clustering: A Case Study with the Little Penguin (Eudyptula minor).

Science.gov (United States)

Zhang, Jingjing; O'Reilly, Kathleen M; Perry, George L W; Taylor, Graeme A; Dennis, Todd E

2015-01-01

We present a simple framework for classifying mutually exclusive behavioural states within the geospatial lifelines of animals. This method involves use of three sequentially applied statistical procedures: (1) behavioural change point analysis to partition movement trajectories into discrete bouts of same-state behaviours, based on abrupt changes in the spatio-temporal autocorrelation structure of movement parameters; (2) hierarchical multivariate cluster analysis to determine the number of different behavioural states; and (3) k-means clustering to classify inferred bouts of same-state location observations into behavioural modes. We demonstrate application of the method by analysing synthetic trajectories of known 'artificial behaviours' comprised of different correlated random walks, as well as real foraging trajectories of little penguins (Eudyptula minor) obtained by global-positioning-system telemetry. Our results show that the modelling procedure correctly classified 92.5% of all individual location observations in the synthetic trajectories, demonstrating reasonable ability to successfully discriminate behavioural modes. Most individual little penguins were found to exhibit three unique behavioural states (resting, commuting/active searching, area-restricted foraging), with variation in the timing and locations of observations apparently related to ambient light, bathymetry, and proximity to coastlines and river mouths. Addition of k-means clustering extends the utility of behavioural change point analysis, by providing a simple means through which the behaviours inferred for the location observations comprising individual movement trajectories can be objectively classified.

Extending the Functionality of Behavioural Change-Point Analysis with k-Means Clustering: A Case Study with the Little Penguin (Eudyptula minor).

Directory of Open Access Journals (Sweden)

Jingjing Zhang

Full Text Available We present a simple framework for classifying mutually exclusive behavioural states within the geospatial lifelines of animals. This method involves use of three sequentially applied statistical procedures: (1) behavioural change point analysis to partition movement trajectories into discrete bouts of same-state behaviours, based on abrupt changes in the spatio-temporal autocorrelation structure of movement parameters; (2) hierarchical multivariate cluster analysis to determine the number of different behavioural states; and (3) k-means clustering to classify inferred bouts of same-state location observations into behavioural modes. We demonstrate application of the method by analysing synthetic trajectories of known 'artificial behaviours' comprised of different correlated random walks, as well as real foraging trajectories of little penguins (Eudyptula minor) obtained by global-positioning-system telemetry. Our results show that the modelling procedure correctly classified 92.5% of all individual location observations in the synthetic trajectories, demonstrating reasonable ability to successfully discriminate behavioural modes. Most individual little penguins were found to exhibit three unique behavioural states (resting, commuting/active searching, area-restricted foraging), with variation in the timing and locations of observations apparently related to ambient light, bathymetry, and proximity to coastlines and river mouths. Addition of k-means clustering extends the utility of behavioural change point analysis, by providing a simple means through which the behaviours inferred for the location observations comprising individual movement trajectories can be objectively classified.
Statistical inference and Aristotle's Rhetoric.

Science.gov (United States)

Macdonald, Ranald R

2004-11-01

Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.
Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture

Science.gov (United States)

Sanfilippo, Antonio [Richland, WA; Calapristi, Augustin J [West Richland, WA; Crow, Vernon L [Richland, WA; Hetzler, Elizabeth G [Kennewick, WA; Turner, Alan E [Kennewick, WA

2009-12-22

Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture are described. In one aspect, a document clustering method includes providing a document set comprising a plurality of documents, providing a cluster comprising a subset of the documents of the document set, using a plurality of terms of the documents, providing a cluster label indicative of subject matter content of the documents of the cluster, wherein the cluster label comprises a plurality of word senses, and selecting one of the word senses of the cluster label.
Children's and adults' judgments of the certainty of deductive inferences, inductive inferences, and guesses.

Science.gov (United States)

Pillow, Bradford H; Pearson, Raeanne M; Hecht, Mary; Bremer, Amanda

2010-01-01

Children and adults rated their own certainty following inductive inferences, deductive inferences, and guesses. Beginning in kindergarten, participants rated deductions as more certain than weak inductions or guesses. Deductions were rated as more certain than strong inductions beginning in Grade 3, and fourth-grade children and adults differentiated strong inductions, weak inductions, and informed guesses from pure guesses. By Grade 3, participants also gave different types of explanations for their deductions and inductions. These results are discussed in relation to children's concepts of cognitive processes, logical reasoning, and epistemological development.
Deep Learning for Population Genetic Inference.

Science.gov (United States)

Sheehan, Sara; Song, Yun S

2016-03-01

Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
Surface self-diffusion behavior of individual tungsten adatoms on rhombohedral clusters

International Nuclear Information System (INIS)

Yang Jianyu; Hu Wangyu; Tang Jianfeng

2011-01-01

The diffusion of single tungsten adatoms on the surfaces of rhombohedral clusters is studied by means of molecular dynamics and the embedded atom method. The energy barriers for the adatom diffusing across and along the step edge between a {110} facet and a neighboring {110} facet are calculated using the nudged elastic band method. We notice that the tungsten adatom diffusion across the step edge has a much higher barrier than that for face-centered cubic metal clusters. The result shows that diffusion from the {110} facet to a neighboring {110} facet could not take place at low temperatures. In addition, the calculated energy barrier for an adatom diffusing along the step edge is lower than that for an adatom on the flat (110) surface. The results show that the adatom could diffuse easily along the step edge, and could be trapped by the facet corner. Taking all of this evidence together, we infer that the {110} facet starts to grow from the facet corner, and then along the step edge, and finally toward the {110} facet center. So the tungsten rhombohedron can grow epitaxially along the {110} facet one facet at a time and the rhombohedron should be the stable structure for both large and small tungsten clusters. (paper)
Using Alien Coins to Test Whether Simple Inference Is Bayesian

Science.gov (United States)

Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.

2016-01-01

Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
On Maximum Entropy and Inference

Directory of Open Access Journals (Sweden)

Luigi Gresele

2017-11-01

Full Text Available Maximum entropy is a powerful concept that entails a sharp separation between relevant and irrelevant variables. It is typically invoked in inference, once an assumption is made on what the relevant variables are, in order to estimate a model from data that affords predictions on all other (dependent) variables. Conversely, maximum entropy can be invoked to retrieve the relevant variables (sufficient statistics) directly from the data, once a model is identified by Bayesian model selection. We explore this approach in the case of spin models with interactions of arbitrary order, and we discuss how relevant interactions can be inferred. In this perspective, the dimensionality of the inference problem is not set by the number of parameters in the model, but by the frequency distribution of the data. We illustrate the method showing its ability to recover the correct model in a few prototype cases and discuss its application on a real dataset.
Mass profile and dynamical status of the z ~ 0.8 galaxy cluster LCDCS 0504

Science.gov (United States)

Guennou, L.; Biviano, A.; Adami, C.; Limousin, M.; Lima Neto, G. B.; Mamon, G. A.; Ulmer, M. P.; Gavazzi, R.; Cypriano, E. S.; Durret, F.; Clowe, D.; LeBrun, V.; Allam, S.; Basa, S.; Benoist, C.; Cappi, A.; Halliday, C.; Ilbert, O.; Johnston, D.; Jullo, E.; Just, D.; Kubo, J. M.; Márquez, I.; Marshall, P.; Martinet, N.; Maurogordato, S.; Mazure, A.; Murphy, K. J.; Plana, H.; Rostagni, F.; Russeil, D.; Schirmer, M.; Schrabback, T.; Slezak, E.; Tucker, D.; Zaritsky, D.; Ziegler, B.

2014-06-01

Context. Constraints on the mass distribution in high-redshift clusters of galaxies are currently not very strong. Aims: We aim to constrain the mass profile, M(r), and dynamical status of the z ~ 0.8 LCDCS 0504 cluster of galaxies that is characterized by prominent giant gravitational arcs near its center. Methods: Our analysis is based on deep X-ray, optical, and infrared imaging as well as optical spectroscopy, collected with various instruments, which we complemented with archival data. We modeled the mass distribution of the cluster with three different mass density profiles, whose parameters were constrained by the strong lensing features of the inner cluster region, by the X-ray emission from the intracluster medium, and by the kinematics of 71 cluster members. Results: We obtain consistent M(r) determinations from three methods based on kinematics (dispersion-kurtosis, caustics, and MAMPOSSt), out to the cluster virial radius, ≃1.3 Mpc and beyond. The mass profile inferred by the strong lensing analysis in the central cluster region is slightly higher than, but still consistent with, the kinematics estimate. On the other hand, the X-ray based M(r) is significantly lower than the kinematics and strong lensing estimates. Theoretical predictions from ΛCDM cosmology for the concentration-mass relation agree with our observational results, when taking into account the uncertainties in the observational and theoretical estimates. There appears to be a central deficit in the intracluster gas mass fraction compared with nearby clusters. Conclusions: Despite the relaxed appearance of this cluster, the determinations of its mass profile by different probes show substantial discrepancies, the origin of which remains to be determined. The extension of a dynamical analysis similar to that of other clusters of the DAFT/FADA survey with multiwavelength data of sufficient quality will allow shedding light on the possible systematics that affect the determination of mass
Nuclear clustering - a cluster core model study

International Nuclear Information System (INIS)

Paul Selvi, G.; Nandhini, N.; Balasubramaniam, M.

2015-01-01

Nuclear clustering, like other clustering phenomena in nature, warrants study, since it would help us understand the nature of the binding of nucleons inside the nucleus, closed-shell behaviour when the system is highly deformed, and dynamics and structure at extremes. Several models account for the clustering phenomenon in nuclei. In this work we present a cluster core model study of nuclear clustering in light-mass nuclei.
Detection of the YORP Effect for Small Asteroids in the Karin Cluster

Science.gov (United States)

Carruba, V.; Nesvorný, D.; Vokrouhlický, D.

2016-06-01

The Karin cluster is a young asteroid family thought to have formed only ≃ 5.75 Myr ago. The young age can be demonstrated by numerically integrating the orbits of Karin cluster members backward in time and showing the convergence of the perihelion and nodal longitudes (as well as other orbital elements). Previous work has pointed out that the convergence is not ideal if the backward integration only accounts for the gravitational perturbations from the solar system planets. It improves when the thermal radiation force known as the Yarkovsky effect is accounted for. This argument can be used to estimate the spin obliquities of the Karin cluster members. Here we take advantage of the fast growing membership of the Karin cluster and show that the obliquity distribution of diameter D ≃ 1–2 km Karin asteroids is bimodal, as expected if the YORP effect acted to move obliquities toward extreme values (0° or 180°). The measured magnitude of the effect is consistent with the standard YORP model. The surface thermal conductivity is inferred to be 0.07–0.2 W m$^{-1}$ K$^{-1}$ (thermal inertia ≃ 300–500 J m$^{-2}$ K$^{-1}$ s$^{-1/2}$). We find that the strength of the YORP effect is roughly 0.7 of the nominal strength obtained for a collection of random Gaussian spheroids. These results are consistent with a surface composed of rough, rocky regolith. The obliquity values predicted here for 480 members of the Karin cluster can be validated by the light-curve inversion method.
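The conductivity and thermal-inertia ranges quoted above can be cross-checked with the standard definition of thermal inertia, Γ = sqrt(κ ρ c), where κ is thermal conductivity, ρ density and c specific heat capacity. The sketch below is only a consistency check: the abstract does not state ρ or c, and pairing the lower conductivity with the lower thermal inertia (and upper with upper) is an assumption made here.

```python
import math


def thermal_inertia(kappa, rho_c):
    """Thermal inertia in J m^-2 K^-1 s^-1/2 from conductivity kappa (W m^-1 K^-1)
    and volumetric heat capacity rho_c = density * specific heat (J m^-3 K^-1)."""
    return math.sqrt(kappa * rho_c)


def implied_rho_c(kappa, gamma):
    """Invert Gamma = sqrt(kappa * rho * c) for the product rho * c."""
    return gamma ** 2 / kappa


# Assumed pairing of the range endpoints quoted in the abstract.
rho_c_low = implied_rho_c(0.07, 300.0)
rho_c_high = implied_rho_c(0.2, 500.0)

print(f"implied rho*c: {rho_c_low:.3g} and {rho_c_high:.3g} J m^-3 K^-1")
```

Both endpoints imply nearly the same volumetric heat capacity (about 1.3 × 10^6 J m^-3 K^-1), so under this pairing the two quoted ranges describe a single assumed ρc.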
Large-scale dynamics associated with clustering of extratropical cyclones affecting Western Europe

Science.gov (United States)

Pinto, Joaquim G.; Gómara, Iñigo; Masato, Giacomo; Dacre, Helen F.; Woollings, Tim; Caballero, Rodrigo

2015-04-01

Some recent winters in Western Europe have been characterized by the occurrence of multiple extratropical cyclones following a similar path. The occurrence of such cyclone clusters leads to large socio-economic impacts due to damaging winds, storm surges, and floods. Recent studies have statistically characterized the clustering of extratropical cyclones over the North Atlantic and Europe and hypothesized potential physical mechanisms responsible for their formation. Here we analyze 4 months characterized by multiple cyclones over Western Europe (February 1990, January 1993, December 1999, and January 2007). The evolution of the eddy driven jet stream, Rossby wave-breaking, and upstream/downstream cyclone development are investigated to infer the role of the large-scale flow and to determine if clustered cyclones are related to each other. Results suggest that optimal conditions for the occurrence of cyclone clusters are provided by a recurrent extension of an intensified eddy driven jet toward Western Europe lasting at least 1 week. Multiple Rossby wave-breaking occurrences on both the poleward and equatorward flanks of the jet contribute to the development of these anomalous large-scale conditions. The analysis of the daily weather charts reveals that upstream cyclone development (secondary cyclogenesis, where new cyclones are generated on the trailing fronts of mature cyclones) is strongly related to cyclone clustering, with multiple cyclones developing on a single jet streak. The present analysis permits a deeper understanding of the physical reasons leading to the occurrence of cyclone families over the North Atlantic, enabling a better estimation of the associated cumulative risk over Europe.
A Visual Analysis Approach for Inferring Personal Job and Housing Locations Based on Public Bicycle Data

Directory of Open Access Journals (Sweden)

Xiaoying Shi

2017-07-01

Full Text Available Information concerning the home and workplace of residents is the basis of analyzing the urban job-housing spatial relationship. Traditional methods conduct time-consuming user surveys to obtain personal job and housing location information. Some new methods define rules to detect personal places based on human mobility data. However, because the travel patterns of residents are variable, simple rule-based methods are unable to generalize highly changing and complex travel modes. In this paper, we propose a visual analysis approach to assist the analyzer in inferring personal job and housing locations interactively based on public bicycle data. All users are first clustered to find potential commuting users. Then, several visual views are designed to find the key candidate stations for a specific user, and the visited temporal pattern of stations and the user’s hire behavior are analyzed, which helps with the inference of station semantic meanings. Finally, a number of users’ job and housing locations are detected by the analyzer and visualized. Our approach can manage the complex and diverse cycling habits of users. The effectiveness of the approach is shown through case studies based on a real-world public bicycle dataset.
Compiling Relational Bayesian Networks for Exact Inference

DEFF Research Database (Denmark)

Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan

2004-01-01

We describe a system for exact inference with relational Bayesian networks as defined in the publicly available Primula tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...
Causal inference in economics and marketing.

Science.gov (United States)

Varian, Hal R

2016-07-05

This is an elementary introduction to causal inference in economics written for readers familiar with machine learning methods. The critical step in any causal analysis is estimating the counterfactual: a prediction of what would have happened in the absence of the treatment. The powerful techniques used in machine learning may be useful for developing better estimates of the counterfactual, potentially improving causal inference.
Cluster fusion algorithm: application to Lennard-Jones clusters

DEFF Research Database (Denmark)

Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter

2006-01-01

We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...
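The two quantities this abstract relies on, the Lennard-Jones energy of a cluster and the second difference of the binding energy per atom with respect to cluster size, can be sketched as follows. This is a minimal illustration in reduced units (ε = σ = 1), not the authors' cluster fusion algorithm; the function names and the sign convention of the second difference are assumptions made here.

```python
import itertools
import math


def lj_energy(positions, epsilon=1.0, sigma=1.0):
    """Total Lennard-Jones energy: sum over pairs of 4*eps*((s/r)^12 - (s/r)^6)."""
    total = 0.0
    for a, b in itertools.combinations(positions, 2):
        sr6 = (sigma / math.dist(a, b)) ** 6
        total += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return total


def second_difference(energies):
    """Discrete second derivative of the binding energy per atom, -E_N / N.

    `energies` maps cluster size N to total energy E_N. A value is returned
    for every N whose neighbours N-1 and N+1 are also present.
    """
    per_atom = {n: -e / n for n, e in energies.items()}
    return {
        n: per_atom[n + 1] + per_atom[n - 1] - 2.0 * per_atom[n]
        for n in per_atom
        if n - 1 in per_atom and n + 1 in per_atom
    }


# Regular tetrahedron with edge length 2^(1/6) * sigma, the pair-potential minimum.
edge = 2.0 ** (1.0 / 6.0)
s = edge / (2.0 * math.sqrt(2.0))
tetrahedron = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]
dimer = [(0.0, 0.0, 0.0), (edge, 0.0, 0.0)]
triangle = [(0.0, 0.0, 0.0), (edge, 0.0, 0.0), (edge / 2.0, edge * math.sqrt(3.0) / 2.0, 0.0)]

energies = {2: lj_energy(dimer), 3: lj_energy(triangle), 4: lj_energy(tetrahedron)}
print(energies)
print(second_difference(energies))
```

In the tetrahedron every one of the six pairs sits at the potential minimum, so its energy is -6ε. For larger clusters the same `second_difference` helper would be applied to a table of minimum energies obtained from whatever optimizer is in use.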
Cluster fusion algorithm: application to Lennard-Jones clusters

DEFF Research Database (Denmark)

Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter

2008-01-01

We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...
A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data.

Science.gov (United States)

Mo, Qianxing; Shen, Ronglai; Guo, Cui; Vannucci, Marina; Chan, Keith S; Hilsenbeck, Susan G

2018-01-01

Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improves on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Likelihood-based inference for discretely observed birth-death-shift processes, with applications to evolution of mobile genetic elements.

Science.gov (United States)

Xu, Jason; Guttorp, Peter; Kato-Maeda, Midori; Minin, Vladimir N

2015-12-01

Continuous-time birth-death-shift (BDS) processes are frequently used in stochastic modeling, with many applications in ecology and epidemiology. In particular, such processes can model evolutionary dynamics of transposable elements, which are important genetic markers in molecular epidemiology. Estimation of the effects of individual covariates on the birth, death, and shift rates of the process can be accomplished by analyzing patient data, but inferring these rates in a discretely and unevenly observed setting presents computational challenges. We propose a multi-type branching process approximation to BDS processes and develop a corresponding expectation maximization algorithm, where we use spectral techniques to reduce calculation of expected sufficient statistics to low-dimensional integration. These techniques yield an efficient and robust optimization routine for inferring the rates of the BDS process, and apply broadly to multi-type branching processes whose rates can depend on many covariates. After rigorously testing our methodology in simulation studies, we apply our method to study intrapatient time evolution of the IS6110 transposable element, a genetic marker frequently used during estimation of epidemiological clusters of Mycobacterium tuberculosis infections. © 2015, The International Biometric Society.
Uncertainty in prediction and in inference

International Nuclear Information System (INIS)

Hilgevoord, J.; Uffink, J.

1991-01-01

The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close relationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in inference can be obtained by means of the so-called statistical distance between probability distributions. When applied to quantum mechanics, this distance leads to a measure of the distinguishability of quantum states, which essentially is the absolute value of the matrix element between the states. The importance of this result to the quantum mechanical uncertainty principle is noted. The second part of the paper provides a derivation of the statistical distance on the basis of the so-called method of support

Statistical inference for Cox processes

DEFF Research Database (Denmark)

Møller, Jesper; Waagepetersen, Rasmus Plenge

2002-01-01

Research has generated a number of advances in methods for spatial cluster modelling in recent years, particularly in the area of Bayesian cluster modelling. Along with these advances has come an explosion of interest in the potential applications of this work, especially in epidemiology and genome research. In one integrated volume, this book reviews the state-of-the-art in spatial clustering and spatial cluster modelling, bringing together research and applications previously scattered throughout the literature. It begins with an overview of the field, then presents a series of chapters that illuminate the nature and purpose of cluster modelling within different application areas, including astrophysics, epidemiology, ecology, and imaging. The focus then shifts to methods, with discussions on point and object process modelling, perfect sampling of cluster processes, partitioning in space...
Nonparametric predictive inference in statistical process control

NARCIS (Netherlands)

Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

2000-01-01

New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities based on
Compiling Relational Bayesian Networks for Exact Inference

DEFF Research Database (Denmark)

Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark

2006-01-01

We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference...
Making inference from wildlife collision data: inferring predator absence from prey strikes

Directory of Open Access Journals (Sweden)

Peter Caley

2017-02-01

Full Text Available Wildlife collision data are ubiquitous, though challenging for making ecological inference due to typically irreducible uncertainty relating to the sampling process. We illustrate a new approach that is useful for generating inference from predator data arising from wildlife collisions. By simply conditioning on a second prey species sampled via the same collision process, and by using a biologically realistic numerical response function, we can produce a coherent numerical response relationship between predator and prey. This relationship can then be used to make inference on the population size of the predator species, including the probability of extinction. The statistical conditioning enables us to account for unmeasured variation in factors influencing the runway strike incidence for individual airports and to enable valid comparisons. A practical application of the approach for testing hypotheses about the distribution and abundance of a predator species is illustrated using the hypothesized red fox incursion into Tasmania, Australia. We estimate that, conditional on the numerical response between fox and lagomorph runway strikes on mainland Australia, the predictive probability of observing no runway strikes of foxes in Tasmania after observing 15 lagomorph strikes is 0.001. We conclude there is enough evidence to safely reject the null hypothesis that there is a widespread red fox population in Tasmania at a population density consistent with prey availability. The method is novel and has potential wider application.
Making inference from wildlife collision data: inferring predator absence from prey strikes.

Science.gov (United States)

Caley, Peter; Hosack, Geoffrey R; Barry, Simon C

2017-01-01

Wildlife collision data are ubiquitous, though challenging for making ecological inference due to typically irreducible uncertainty relating to the sampling process. We illustrate a new approach that is useful for generating inference from predator data arising from wildlife collisions. By simply conditioning on a second prey species sampled via the same collision process, and by using a biologically realistic numerical response function, we can produce a coherent numerical response relationship between predator and prey. This relationship can then be used to make inference on the population size of the predator species, including the probability of extinction. The statistical conditioning enables us to account for unmeasured variation in factors influencing the runway strike incidence for individual airports and to enable valid comparisons. A practical application of the approach for testing hypotheses about the distribution and abundance of a predator species is illustrated using the hypothesized red fox incursion into Tasmania, Australia. We estimate that, conditional on the numerical response between fox and lagomorph runway strikes on mainland Australia, the predictive probability of observing no runway strikes of foxes in Tasmania after observing 15 lagomorph strikes is 0.001. We conclude there is enough evidence to safely reject the null hypothesis that there is a widespread red fox population in Tasmania at a population density consistent with prey availability. The method is novel and has potential wider application.
The E-MOSAICS project: simulating the formation and co-evolution of galaxies and their star cluster populations

Science.gov (United States)

Pfeffer, Joel; Kruijssen, J. M. Diederik; Crain, Robert A.; Bastian, Nate

2018-04-01

We introduce the MOdelling Star cluster population Assembly In Cosmological Simulations within EAGLE (E-MOSAICS) project. E-MOSAICS incorporates models describing the formation, evolution, and disruption of star clusters into the EAGLE galaxy formation simulations, enabling the examination of the co-evolution of star clusters and their host galaxies in a fully cosmological context. A fraction of the star formation rate of dense gas is assumed to yield a cluster population; this fraction and the population's initial properties are governed by the physical properties of the natal gas. The subsequent evolution and disruption of the entire cluster population are followed accounting for two-body relaxation, stellar evolution, and gravitational shocks induced by the local tidal field. This introductory paper presents a detailed description of the model and initial results from a suite of 10 simulations of ∼L⋆ galaxies with disc-like morphologies at z = 0. The simulations broadly reproduce key observed characteristics of young star clusters and globular clusters (GCs), without invoking separate formation mechanisms for each population. The simulated GCs are the surviving population of massive clusters formed at early epochs (z ≳ 1-2), when the characteristic pressures and surface densities of star-forming gas were significantly higher than observed in local galaxies. We examine the influence of the star formation and assembly histories of galaxies on their cluster populations, finding that (at similar present-day mass) earlier-forming galaxies foster a more massive and disruption-resilient cluster population, while galaxies with late mergers are capable of forming massive clusters even at late cosmic epochs. We find that the phenomenological treatment of interstellar gas in EAGLE precludes the accurate modelling of cluster disruption in low-density environments, but infer that simulations incorporating an explicitly modelled cold interstellar gas phase will overcome
Causal inference in biology networks with integrated belief propagation.

Science.gov (United States)

Chang, Rui; Karr, Jonathan R; Schadt, Eric E

2015-01-01

Inferring causal relationships among molecular and higher order phenotypes is a critical step in elucidating the complexity of living systems. Here we propose a novel method for inferring causality that is no longer constrained by the conditional dependency arguments that limit the ability of statistical causal inference methods to resolve causal relationships within sets of graphical models that are Markov equivalent. Our method utilizes Bayesian belief propagation to infer the responses of perturbation events on molecular traits given a hypothesized graph structure. A distance measure between the inferred response distribution and the observed data is defined to assess the 'fitness' of the hypothesized causal relationships. To test our algorithm, we infer causal relationships within equivalence classes of gene networks in which the form of the possible functional interactions is assumed to be nonlinear, given synthetic microarray and RNA sequencing data. We also apply our method to infer causality in a real metabolic network with a v-structure and a feedback loop. We show that our method can recapitulate the causal structure and recover the feedback loop from steady-state data alone, which conventional methods cannot.
Efficient Bayesian inference for ARFIMA processes

Science.gov (United States)

Graves, T.; Gramacy, R. B.; Franzke, C. L. E.; Watkins, N. W.

2015-03-01

Many geophysical quantities, like atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long-range dependence (LRD). LRD means that these quantities experience non-trivial temporal memory, which potentially enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LRD. In this paper we present a modern and systematic approach to the inference of LRD. Rather than Mandelbrot's fractional Gaussian noise, we use the more flexible Autoregressive Fractional Integrated Moving Average (ARFIMA) model which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LRD, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g. short memory effects) can be integrated over in order to focus on long memory parameters, and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data, with favorable comparison to the standard estimators.
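The long-memory part of an ARFIMA model is the fractional differencing operator (1 - B)^d, where B is the backshift operator and d is the long-memory parameter. As a minimal sketch of what that operator does (this is the textbook binomial expansion, not the approximate likelihood proposed in the paper), its coefficients follow the recursion w_0 = 1, w_k = w_{k-1} (k - 1 - d) / k. The function names below are hypothetical.

```python
def frac_diff_weights(d, n):
    """First n coefficients of the expansion of (1 - B)^d in powers of B."""
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d):
    """Apply (1 - B)^d to a finite series, truncating the filter at the start."""
    w = frac_diff_weights(d, len(series))
    return [
        sum(w[k] * series[t - k] for k in range(t + 1))
        for t in range(len(series))
    ]


# d = 1 reduces to ordinary first differencing (the first value is kept as-is).
print(frac_diff([1.0, 3.0, 6.0, 10.0], 1.0))
# For 0 < d < 0.5 the weights decay slowly, which encodes long-range dependence.
print(frac_diff_weights(0.5, 4))
```

With d = 1 the weights collapse to 1, -1, 0, 0, ..., while fractional d gives a slowly decaying sequence; that slow decay is what lets the model represent long memory with a single parameter.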
Deep Learning for Population Genetic Inference.

Directory of Open Access Journals (Sweden)

Sara Sheehan

2016-03-01

Full Text Available Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
Deep Learning for Population Genetic Inference

Science.gov (United States)

Sheehan, Sara; Song, Yun S.

2016-01-01

Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme. PMID:27018908
Ionization-induced solvent migration in acetanilide-methanol clusters inferred from isomer-selective infrared spectroscopy.

Science.gov (United States)

Weiler, Martin; Nakamura, Takashi; Sekiya, Hiroshi; Dopfer, Otto; Miyazaki, Mitsuhiko; Fujii, Masaaki

2012-12-07

We present the resonance-enhanced multiphoton ionization, infrared-ultraviolet hole burning (IR-UV HB), and IR dip spectra of the trans-acetanilide-methanol (AA-MeOH) cluster in the S(0), S(1), and cationic ground state (D(0)) in a supersonic jet. The IR-UV HB spectra demonstrate the co-existence of two isomers in S(0,1), in which MeOH binds either to the NH or the CO site of the peptide linkage in AA, denoted as AA(NH)-MeOH and AA(CO)-MeOH. When AA(CO)-MeOH is selectively ionized, its IR spectrum in D(0) is the same as that measured for AA(+) (NH)-MeOH. Thus, photoionization of AA(CO)-MeOH induces migration of MeOH from the CO to the NH site with 100% yield. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A Bayesian Network Schema for Lessening Database Inference

National Research Council Canada - National Science Library

Chang, LiWu; Moskowitz, Ira S

2001-01-01

.... The authors introduce a formal schema for database inference analysis, based upon a Bayesian network structure, which identifies critical parameters involved in the inference problem and represents...
Type Inference for Session Types in the Pi-Calculus

DEFF Research Database (Denmark)

Graversen, Eva Fajstrup; Harbo, Jacob Buchreitz; Huttel, Hans

2014-01-01

In this paper we present a direct algorithm for session type inference for the π-calculus. Type inference for session types has previously been achieved by either imposing limitations and restrictions on the π-calculus, or by reducing the type inference problem to that for linear types. Our approach...
Comprehensive cluster analysis with Transitivity Clustering.

Science.gov (United States)

Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

2011-03-01

Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.
The impact of phenotypic and molecular data on the inference of Colletotrichum diversity associated with Musa.

Science.gov (United States)

Vieira, Willie A S; Lima, Waléria G; Nascimento, Eduardo S; Michereff, Sami J; Câmara, Marcos P S; Doyle, Vinson P

2017-01-01

Developing a comprehensive and reliable taxonomy for the Colletotrichum gloeosporioides species complex will require adopting data standards on the basis of an understanding of how methodological choices impact morphological evaluations and phylogenetic inference. We explored the impact of methodological choices in a morphological and molecular evaluation of Colletotrichum species associated with banana in Brazil. The choice of alignment filtering algorithm has a significant impact on topological inference and the retention of phylogenetically informative sites. Similarly, the choice of phylogenetic marker affects the delimitation of species boundaries, particularly if low phylogenetic signal is confounded with strong discordance, and inference of the species tree from multiple-gene trees. According to both phylogenetic informativeness profiling and Bayesian concordance analyses, the most informative loci are DNA lyase (APN2), intergenic spacer (IGS) between DNA lyase and the mating-type locus MAT1-2-1 (APN2/MAT-IGS), calmodulin (CAL), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), glutamine synthetase (GS), β-tubulin (TUB2), and a new marker, the intergenic spacer between GAPDH and a hypothetical protein (GAP2-IGS). Cornmeal agar minimizes the variance in conidial dimensions compared with potato dextrose agar and synthetic nutrient-poor agar, such that species are more readily distinguishable based on phenotypic differences. We apply these insights to investigate the diversity of Colletotrichum species associated with banana anthracnose in Brazil and report C. musae, C. tropicale, C. theobromicola, and C. siamense in association with banana anthracnose. One lineage did not cluster with any previously described species and is described here as C. chrysophilum.
Cluster-cluster correlations and constraints on the correlation hierarchy

Science.gov (United States)

Hamilton, A. J. S.; Gott, J. R., III

1988-01-01

The hypothesis that galaxies cluster around clusters at least as strongly as they cluster around galaxies imposes constraints on the hierarchy of correlation amplitudes in hierarchical clustering models. The distributions which saturate these constraints are the Rayleigh-Levy random walk fractals proposed by Mandelbrot; for these fractal distributions cluster-cluster correlations are all identically equal to galaxy-galaxy correlations. If correlation amplitudes exceed the constraints, as is observed, then cluster-cluster correlations must exceed galaxy-galaxy correlations, as is observed.
Explanatory Preferences Shape Learning and Inference.

Science.gov (United States)

Lombrozo, Tania

2016-10-01

Explanations play an important role in learning and inference. People often learn by seeking explanations, and they assess the viability of hypotheses by considering how well they explain the data. An emerging body of work reveals that both children and adults have strong and systematic intuitions about what constitutes a good explanation, and that these explanatory preferences have a systematic impact on explanation-based processes. In particular, people favor explanations that are simple and broad, with the consequence that engaging in explanation can shape learning and inference by leading people to seek patterns and favor hypotheses that support broad and simple explanations. Given the prevalence of explanation in everyday cognition, understanding explanation is therefore crucial to understanding learning and inference. Copyright © 2016 Elsevier Ltd. All rights reserved.
Grammatical inference algorithms, routines and applications

CERN Document Server

Wieczorek, Wojciech

2017-01-01

This book focuses on grammatical inference, presenting classic and modern methods of grammatical inference from the perspective of practitioners. To do so, it employs the Python programming language to present all of the methods discussed. Grammatical inference is a field that lies at the intersection of multiple disciplines, with contributions from computational linguistics, pattern recognition, machine learning, computational biology, formal learning theory and many others. Though the book is largely practical, it also includes elements of learning theory, combinatorics on words, the theory of automata and formal languages, plus references to real-world problems. The listings presented here can be directly copied and pasted into other programs, thus making the book a valuable source of ready recipes for students, academic researchers, and programmers alike, as well as an inspiration for their further development.
Molecular dynamics simulations of nucleation and phase transitions in molecular clusters of hexafluorides

International Nuclear Information System (INIS)

Xu, S.

1993-01-01

Molecular dynamics simulations of nucleation and phase transitions in TeF6 and SeF6 clusters containing 100-350 molecules were carried out. Simulations successfully reproduced the crystalline structures observed in electron diffraction studies of large clusters (containing about 10^4 molecules) of the same materials. When the clusters were cooled, they spontaneously underwent the same bcc-to-monoclinic phase transition in simulations as in experiment, despite the million-fold difference in the time scales involved. Other transitions observed included melting and freezing. Several new techniques based on molecular translation and orientation were introduced to identify different condensed phases, to study nucleation and phase transitions, and to define characteristic temperatures of transitions. The solid-state transition temperatures decreased with cluster size in the same way as did the melting temperature, in that the depression of transition temperature was inversely proportional to the cluster radius. Rotational melting temperatures, as inferred from the rotational diffusion of molecules, coincided with those of the solid-state transition. Nucleation in liquid-solid and bcc-monoclinic transitions started in the interior of clusters on cooling, and at the surface on heating. Transition temperatures on cooling were always lower than those on heating due to the barriers to nucleation. Linear growth rates of nuclei in freezing were an order of magnitude lower than those in the bcc-monoclinic transition. Revealing evidence about the molecular behavior associated with phase changes was found. Simulations showed the formation of the actual transition complexes along the transition pathway, i.e., the critical nuclei of the new phase. These nuclei, consisting of a few dozen molecules, were distinguishable in the midst of the surrounding matter.
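The size dependence stated in this abstract, a transition-temperature depression inversely proportional to cluster radius, corresponds to the relation T(R) = T_bulk - c / R. A minimal sketch of how such a relation could be fitted to recover the bulk limit is given below. The data points are synthetic, generated from the model itself with made-up parameters, and are not values from the thesis; the function name is hypothetical.

```python
def fit_inverse_radius(radii, temps):
    """Ordinary least-squares fit of T = t_bulk - c / R; returns (t_bulk, c)."""
    xs = [1.0 / r for r in radii]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(temps) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, temps))
    slope = sxt / sxx                      # slope of T against 1/R equals -c
    t_bulk = mean_t - slope * mean_x       # intercept is the R -> infinity limit
    return t_bulk, -slope


# Synthetic illustration only: arbitrary units, made-up t_bulk = 200 and c = 50.
radii = [1.0, 2.0, 4.0, 8.0]
temps = [200.0 - 50.0 / r for r in radii]
t_bulk, c = fit_inverse_radius(radii, temps)
print(t_bulk, c)
```

Plotting transition temperature against 1/R and reading off the intercept is the usual way to extrapolate small-cluster results to the bulk; the fit above is the numerical equivalent.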
CONSTRAINING CLUSTER PHYSICS WITH THE SHAPE OF X-RAY CLUSTERS: COMPARISON OF LOCAL X-RAY CLUSTERS VERSUS ΛCDM CLUSTERS

International Nuclear Information System (INIS)

Lau, Erwin T.; Nagai, Daisuke; Kravtsov, Andrey V.; Vikhlinin, Alexey; Zentner, Andrew R.

2012-01-01

Recent simulations of cluster formation have demonstrated that condensation of baryons into central galaxies during cluster formation can drive the shape of the gas distribution in galaxy clusters significantly rounder out to their virial radius. These simulations generally predict stellar fractions within cluster virial radii that are ∼2-3 times larger than the stellar masses deduced from observations. In this paper, we compare ellipticity profiles of simulated clusters performed with varying input physics (radiative cooling, star formation, and supernova feedback) to the cluster ellipticity profiles derived from Chandra and ROSAT observations, in an effort to constrain the fraction of gas that cools and condenses into the central galaxies within clusters. We find that local relaxed clusters have an average ellipticity of ε = 0.18 ± 0.05 in the radial range of 0.04 ≤ r/r₅₀₀ ≤ 1. At larger radii, r > 0.1 r₅₀₀, the observed ellipticity profiles agree well with the predictions of non-radiative simulations. In contrast, the ellipticity profiles of simulated clusters that include dissipative gas physics deviate significantly from the observed ellipticity profiles at all radii. The dissipative simulations overpredict (underpredict) ellipticity in the inner (outer) regions of galaxy clusters. By comparing simulations with and without dissipative gas physics, we show that gas cooling causes the gas distribution to be more oblate in the central regions, but makes the outer gas distribution more spherical. We find that late-time gas cooling and star formation are responsible for the significantly oblate gas distributions in cluster cores, but the gas shapes outside of cluster cores are set primarily by baryon dissipation at high redshift (z ≥ 2). Our results indicate that the shapes of X-ray emitting gas in galaxy clusters, especially at large radii, can be used to place constraints on cluster gas physics, making them potential probes of the history of baryonic

Extracting Galaxy Cluster Gas Inhomogeneity from X-Ray Surface Brightness: A Statistical Approach and Application to Abell 3667

Science.gov (United States)

Kawahara, Hajime; Reese, Erik D.; Kitayama, Tetsu; Sasaki, Shin; Suto, Yasushi

2008-11-01

Our previous analysis indicates that small-scale fluctuations in the intracluster medium (ICM) from cosmological hydrodynamic simulations follow the lognormal probability density function. In order to test the lognormal nature of the ICM directly against X-ray observations of galaxy clusters, we develop a method of extracting statistical information about the three-dimensional properties of the fluctuations from the two-dimensional X-ray surface brightness. We first create a set of synthetic clusters with lognormal fluctuations around their mean profile given by spherical isothermal β-models, later considering polytropic temperature profiles as well. Performing mock observations of these synthetic clusters, we find that the resulting X-ray surface brightness fluctuations also follow the lognormal distribution fairly well. Systematic analysis of the synthetic clusters provides an empirical relation between the three-dimensional density fluctuations and the two-dimensional X-ray surface brightness. We analyze Chandra observations of the galaxy cluster Abell 3667, and find that its X-ray surface brightness fluctuations follow the lognormal distribution. While the lognormal model was originally motivated by cosmological hydrodynamic simulations, this is the first observational confirmation of the lognormal signature in a real cluster. Finally we check the synthetic cluster results against clusters from cosmological hydrodynamic simulations. As a result of the complex structure exhibited by simulated clusters, the empirical relation between the two- and three-dimensional fluctuation properties calibrated with synthetic clusters when applied to simulated clusters shows large scatter. Nevertheless we are able to reproduce the true value of the fluctuation amplitude of simulated clusters within a factor of 2 from their two-dimensional X-ray surface brightness alone. Our current methodology combined with existing observational data is useful in describing and inferring the
The ellipticity of galaxy cluster haloes from satellite galaxies and weak lensing

Science.gov (United States)

Shin, Tae-hyeon; Clampitt, Joseph; Jain, Bhuvnesh; Bernstein, Gary; Neil, Andrew; Rozo, Eduardo; Rykoff, Eli

2018-04-01

We study the ellipticity of galaxy cluster haloes as characterized by the distribution of cluster galaxies and as measured with weak lensing. We use Monte Carlo simulations of elliptical cluster density profiles to estimate and correct for Poisson noise bias, edge bias and projection effects. We apply our methodology to 10 428 Sloan Digital Sky Survey clusters identified by the redMaPPer algorithm with richness above 20. We find a mean ellipticity of 0.271 ± 0.002 (stat) ± 0.031 (sys), corresponding to an axis ratio of 0.573 ± 0.002 (stat) ± 0.039 (sys). We compare this ellipticity of the satellites to the halo shape, through a stacked lensing measurement using optimal estimators of the lensing quadrupole based on Clampitt and Jain (2016). We find a best-fitting axis ratio of 0.56 ± 0.09 (stat) ± 0.03 (sys), consistent with the ellipticity of the satellite distribution. Thus, cluster galaxies trace the shape of the dark matter halo to within our estimated uncertainties. Finally, we restack the satellite and lensing ellipticity measurements along the major axis of the cluster central galaxy's light distribution. From the lensing measurements, we infer a misalignment angle with a root-mean-square of 30° ± 10° when stacking on the central galaxy. We discuss applications of halo shape measurements to test the effects of the baryonic gas and active galactic nucleus feedback, as well as dark matter and gravity. The major improvements in signal-to-noise ratio expected with the ongoing Dark Energy Survey and future surveys from the Large Synoptic Survey Telescope, Euclid, and the Wide Field Infrared Survey Telescope will make halo shapes a useful probe of these effects.
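The ellipticity and axis-ratio figures quoted in this record can be related with a short sketch. It assumes the convention e = (1 - q) / (1 + q), where q is the minor-to-major axis ratio; the abstract does not state its definition, but this convention reproduces the quoted pair of values to within rounding. The function names are illustrative only.

```python
def axis_ratio_from_ellipticity(e: float) -> float:
    """Invert e = (1 - q) / (1 + q) for the axis ratio q (assumed convention)."""
    if not 0.0 <= e < 1.0:
        raise ValueError("ellipticity must lie in [0, 1)")
    return (1.0 - e) / (1.0 + e)


def ellipticity_from_axis_ratio(q: float) -> float:
    """Ellipticity e = (1 - q) / (1 + q) for an axis ratio 0 < q <= 1."""
    if not 0.0 < q <= 1.0:
        raise ValueError("axis ratio must lie in (0, 1]")
    return (1.0 - q) / (1.0 + q)


# Satellite-distribution ellipticity quoted in the abstract.
q_satellites = axis_ratio_from_ellipticity(0.271)
# Best-fitting lensing axis ratio quoted in the abstract.
e_lensing = ellipticity_from_axis_ratio(0.56)

print(f"q from satellites: {q_satellites:.3f}")
print(f"e from lensing:    {e_lensing:.3f}")
```

Under this assumed convention the lensing axis ratio of 0.56 corresponds to an ellipticity close to the satellite value, which is the consistency the abstract reports.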
BagReg: Protein inference through machine learning.

Science.gov (United States)

Zhao, Can; Liu, Dao; Teng, Ben; He, Zengyou

2015-08-01

Protein inference from the identified peptides is of primary importance in shotgun proteomics. The goal of protein inference is to determine whether each candidate protein is truly present in the sample. To date, many computational methods have been proposed to solve this problem. However, there is still no method that can fully utilize the information hidden in the input data. In this article, we propose a learning-based method named BagReg for protein inference. The method first extracts five hand-crafted features from the input data, and then chooses each feature in turn as the class feature to build separate models that predict the presence probabilities of proteins. Finally, the weak results from the five prediction models are aggregated to obtain the final result. We test our method on six publicly available data sets. The experimental results show that our method is superior to the state-of-the-art protein inference algorithms. Copyright © 2015 Elsevier Ltd. All rights reserved.
Clinical Outcome Prediction in Aneurysmal Subarachnoid Hemorrhage Using Bayesian Neural Networks with Fuzzy Logic Inferences

Directory of Open Access Journals (Sweden)

Benjamin W. Y. Lo

2013-01-01

Full Text Available Objective. The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH). Methods. The approach of Bayesian neural networks with fuzzy logic inferences was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients). Results. Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odds ratios (ORs). Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique) denoted cut-off points for poor prognosis at greater than 2.5 clusters. Discussion. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication.
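The centroid defuzzification step named in this record can be illustrated with a minimal sketch. The crisp output is the membership-weighted mean of the output axis. The triangular membership function, the 0-5 output scale and the sampled grid are hypothetical choices made for illustration; they are not taken from the study.

```python
def triangle(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership function (hypothetical shape for this example)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def centroid_defuzzify(xs, memberships):
    """Return sum(x * mu(x)) / sum(mu(x)) over a sampled output axis."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("aggregated membership is zero everywhere")
    return sum(x * m for x, m in zip(xs, memberships)) / total


# Sample a hypothetical output axis from 0.0 to 5.0 in steps of 0.5.
xs = [i * 0.5 for i in range(11)]
memberships = [triangle(x, 1.0, 3.0, 5.0) for x in xs]

score = centroid_defuzzify(xs, memberships)
print(f"defuzzified score: {score:.2f}")
```

A crisp score obtained this way could then be compared with a cut-off such as the 2.5 quoted in the abstract, although how the study maps its fuzzy output onto that scale is not stated here.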
Forecasting building energy consumption with hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system

Energy Technology Data Exchange (ETDEWEB)

Li, Kangji [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China); School of Electricity Information Engineering, Jiangsu University, Zhenjiang 212013 (China); Su, Hongye [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China)

2010-11-15

There are several ways to forecast building energy consumption, varying from simple regression to models based on physical principles. In this paper, a new method, namely, the hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system (GA-HANFIS) model is developed. In this model, the hierarchical structure decreases the rule base dimension. Both clustering and rule base parameters are optimized by GAs and neural networks (NNs). The model is applied to predict a hotel's daily air conditioning consumption over a period of more than 3 months. The results obtained by the proposed model are presented and compared with those of a conventional NN method, indicating that the GA-HANFIS model outperforms NNs in forecasting accuracy. (author)
Stochastic processes inference theory

CERN Document Server

Rao, Malempati M

2014-01-01

This is the revised and enlarged 2nd edition of the authors’ original text, which was intended to be a modest complement to Grenander's fundamental memoir on stochastic processes and related inference theory. The present volume gives a substantial account of regression analysis, both for stochastic processes and measures, and includes recent material on Ridge regression with some unexpected applications, for example in econometrics. The first three chapters can be used for a quarter or semester graduate course on inference on stochastic processes. The remaining chapters provide more advanced material on stochastic analysis suitable for graduate seminars and discussions, leading to dissertation or research work. In general, the book will be of interest to researchers in probability theory, mathematical statistics and electrical and information theory.
Russell and Humean Inferences

Directory of Open Access Journals (Sweden)

João Paulo Monteiro

2001-12-01

Full Text Available Russell's The Problems of Philosophy tries to establish a new theory of induction, at the same time that Hume is there accused of "an irrational scepticism about induction". But a careful analysis of the theory of knowledge explicitly acknowledged by Hume reveals that, contrary to the standard interpretation in the twentieth century, possibly influenced by Russell, Hume deals exclusively with causal inference (which he never classifies as "causal induction", although now we are entitled to do so), never with inductive inference in general, mainly generalizations about sensible qualities of objects (whether, e.g., "all crows are black" or not is not among Hume's concerns). Russell's theories are thus only false alternatives to Hume's, in (1912) or in his (1948).
Efficient algorithms for conditional independence inference

Czech Academy of Sciences Publication Activity Database

Bouckaert, R.; Hemmecke, R.; Lindner, S.; Studený, Milan

2010-01-01

Vol. 11, No. 1 (2010), pp. 3453-3479. ISSN 1532-4435. R&D Projects: GA ČR GA201/08/0539; GA MŠk 1M0572. Institutional research plan: CEZ:AV0Z10750506. Keywords: conditional independence inference; linear programming approach. Subject RIV: BA - General Mathematics. Impact factor: 2.949 (2010). http://library.utia.cas.cz/separaty/2010/MTR/studeny-efficient algorithms for conditional independence inference.pdf
Connectivity in the early life history of sandeel inferred from otolith microchemistry

Science.gov (United States)

Gibb, Fiona M.; Régnier, Thomas; Donald, Kirsty; Wright, Peter J.

2017-01-01

Connectivity is a central issue in the development, sustainability and effectiveness of networks of Marine Protected Areas (MPAs). In populations with site attached adults, connectivity is limited to dispersal in the pelagic larval stage. While biophysical models have been widely used to infer early dispersal, empirical evidence through sources such as otolith microchemistry can provide a means of evaluating model predictions. In the present study, connectivity in the lesser sandeel, Ammodytes marinus, was investigated using LA-ICP-MS otolith microchemistry. Otoliths from juveniles (age 0) were examined from four Scottish spawning areas predicted to differ in terms of larval retention rates and connectivity based on past biophysical models. There were significant spatial differences in otolith post-settled juvenile chemistry among locations at a scale of 100-400 km. Differences in near core chemistry pointed to three chemically distinct natal sources, as identified by a cluster analysis, contributing to settlement locations.
State-Space Inference and Learning with Gaussian Processes

OpenAIRE

Turner, R; Deisenroth, MP; Rasmussen, CE

2010-01-01

State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model. C...
Enhancing Transparency and Control When Drawing Data-Driven Inferences About Individuals.

Science.gov (United States)

Chen, Daizhuo; Fraiberger, Samuel P; Moakler, Robert; Provost, Foster

2017-09-01

Recent studies show the remarkable power of fine-grained information disclosed by users on social network sites to infer users' personal characteristics via predictive modeling. Similar fine-grained data are being used successfully in other commercial applications. In response, attention is turning increasingly to the transparency that organizations provide to users as to what inferences are drawn and why, as well as to what sort of control users can be given over inferences that are drawn about them. In this article, we focus on inferences about personal characteristics based on information disclosed by users' online actions. As a use case, we explore personal inferences that are made possible from "Likes" on Facebook. We first present a means for providing transparency into the information responsible for inferences drawn by data-driven models. We then introduce the "cloaking device"-a mechanism for users to inhibit the use of particular pieces of information in inference. Using these analytical tools we ask two main questions: (1) How much information must users cloak to significantly affect inferences about their personal traits? We find that usually users must cloak only a small portion of their actions to inhibit inference. We also find that, encouragingly, false-positive inferences are significantly easier to cloak than true-positive inferences. (2) Can firms change their modeling behavior to make cloaking more difficult? The answer is a definitive yes. We demonstrate a simple modeling change that requires users to cloak substantially more information to affect the inferences drawn. The upshot is that organizations can provide transparency and control even into complicated, predictive model-driven inferences, but they also can make control easier or harder for their users.
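The cloaking idea in this record can be sketched for a simple linear scoring model. The sketch assumes that a trait is inferred when a bias plus the summed weights of a user's Likes exceeds a threshold, and that the user hides Likes in order of decreasing positive weight until the score no longer exceeds it. The Like names, weights, bias and greedy ordering are all hypothetical and are not taken from the article.

```python
def likes_to_cloak(user_likes, weights, bias, threshold=0.0):
    """Greedily hide the most incriminating Likes until score <= threshold.

    Returns the list of cloaked Likes and the final score. Assumes a linear
    model: score = bias + sum of weights of the visible Likes.
    """
    score = bias + sum(weights.get(like, 0.0) for like in user_likes)
    cloaked = []
    # Consider Likes from the largest positive contribution downwards.
    ranked = sorted(user_likes, key=lambda like: weights.get(like, 0.0), reverse=True)
    for like in ranked:
        if score <= threshold:
            break  # the inference is already inhibited
        contribution = weights.get(like, 0.0)
        if contribution <= 0.0:
            break  # hiding this Like would not lower the score
        score -= contribution
        cloaked.append(like)
    return cloaked, score


# Hypothetical model weights and user profile.
weights = {"page_a": 1.2, "page_b": 0.8, "page_c": -0.3, "page_d": 0.1}
user_likes = ["page_a", "page_b", "page_c", "page_d"]

cloaked, final_score = likes_to_cloak(user_likes, weights, bias=-0.5)
print(cloaked, round(final_score, 2))
```

In this toy profile two of the four Likes must be hidden before the score drops below the threshold. A model that spreads weight more evenly across many Likes would require more cloaking, which is the kind of modeling change the abstract alludes to.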
Fused Regression for Multi-source Gene Regulatory Network Inference.

Directory of Open Access Journals (Sweden)

Kari Y Lam

2016-12-01

Full Text Available Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms and single cell types. We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships such as orthology incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross species network inference. Last, we demonstrate our method's utility in learning from data collected on different experimental platforms.
Role of the cluster structure of ⁷Li in the dynamics of fragment capture

Energy Technology Data Exchange (ETDEWEB)

Shrivastava, A., E-mail: aradhana@barc.gov.in [Nuclear Physics Division, Bhabha Atomic Research Centre, Mumbai 400085 (India); Navin, A. [GANIL, CEA/DSM - CNRS/IN2P3, Bd Henri Becquerel, BP 55027, F-14076 Caen Cedex 5 (France); Diaz-Torres, A. [ECT, Villa Tambosi, I-38123 Villazzano, Trento (Italy); Nanal, V. [DNAP, Tata Institute of Fundamental Research, Mumbai 400005 (India); Ramachandran, K. [Nuclear Physics Division, Bhabha Atomic Research Centre, Mumbai 400085 (India); Rejmund, M. [GANIL, CEA/DSM - CNRS/IN2P3, Bd Henri Becquerel, BP 55027, F-14076 Caen Cedex 5 (France); Bhattacharyya, S. [Variable Energy Cyclotron Centre, 1/AF Bidhan Nagar, Kolkata 700064 (India); Chatterjee, A.; Kailas, S. [Nuclear Physics Division, Bhabha Atomic Research Centre, Mumbai 400085 (India); Lemasson, A. [GANIL, CEA/DSM - CNRS/IN2P3, Bd Henri Becquerel, BP 55027, F-14076 Caen Cedex 5 (France); Palit, R. [DNAP, Tata Institute of Fundamental Research, Mumbai 400005 (India); Parkar, V.V. [Nuclear Physics Division, Bhabha Atomic Research Centre, Mumbai 400085 (India); Pillay, R.G. [DNAP, Tata Institute of Fundamental Research, Mumbai 400005 (India); Rout, P.C. [Nuclear Physics Division, Bhabha Atomic Research Centre, Mumbai 400085 (India); Sawant, Y. [DNAP, Tata Institute of Fundamental Research, Mumbai 400005 (India)

2013-01-08

Exclusive measurements of prompt γ-rays from the heavy residues with various light charged particles in the ⁷Li + ¹⁹⁸Pt system, at an energy near the Coulomb barrier (E/V_b ≈ 1.6), are reported. Recent dynamic classical trajectory calculations, constrained by the measured fusion, α- and t-capture cross-sections, have been used to explain the excitation energy dependence of the residue cross-sections. These calculations distinctly illustrate a two-step process, breakup followed by fusion, in the case of the capture of t and α clusters, whereas for ⁶He+p and ⁵He+d configurations, massive transfer is inferred to be the dominant mechanism. The present work clearly demonstrates the role played by the cluster structures of ⁷Li in understanding the reaction dynamics at energies around the Coulomb barrier.
HIERARCHICAL PROBABILISTIC INFERENCE OF COSMIC SHEAR

International Nuclear Information System (INIS)

Schneider, Michael D.; Dawson, William A.; Hogg, David W.; Marshall, Philip J.; Bard, Deborah J.; Meyers, Joshua; Lang, Dustin

2015-01-01

Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probability distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.
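The importance-sampling step mentioned in this record can be illustrated generically. The sketch below is a toy, self-normalized importance-sampling estimate in one dimension: samples drawn once under a broad interim distribution are reweighted to a different target distribution without being redrawn. The Gaussian interim and target densities, the sample size and the seed are assumptions made for illustration and do not represent the authors' shear pipeline.

```python
import math
import random


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def reweighted_mean(samples, interim_pdf, target_pdf):
    """Self-normalized importance-sampling estimate of the target mean."""
    weights = [target_pdf(x) / interim_pdf(x) for x in samples]
    return sum(w * x for w, x in zip(weights, samples)) / sum(weights)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Draw once under a broad interim distribution N(0, 2^2).
samples = [rng.gauss(0.0, 2.0) for _ in range(20000)]

# Reuse the same draws to estimate the mean under the target N(1, 1^2).
estimate = reweighted_mean(
    samples,
    interim_pdf=lambda x: normal_pdf(x, 0.0, 2.0),
    target_pdf=lambda x: normal_pdf(x, 1.0, 1.0),
)
print(f"reweighted mean: {estimate:.3f} (target mean is 1.0)")
```

The point of the design is that the expensive sampling is done once per small region, while the global model only enters through cheap weight ratios.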
Bayesian structural inference for hidden processes

Science.gov (United States)

Strelioff, Christopher C.; Crutchfield, James P.

2014-04-01

We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian structural inference (BSI) relies on a set of candidate unifilar hidden Markov model (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological ɛ-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be ɛ-machines, irrespective of estimated transition probabilities. Properties of ɛ-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite process internal structure being only indirectly present in data. We demonstrate BSI's effectiveness in estimating a process's randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare using the posterior distribution over candidate models and the single, maximum a posteriori model for point estimation and show that the former more accurately reflects uncertainty in estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well as to an out-of-class, infinite-state hidden process.
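The two summary quantities named in this record, the Shannon entropy rate and the statistical complexity, can be computed directly for a known unifilar machine. The sketch below does so for the two-state golden mean process, chosen here as a standard textbook example. It is not the BSI procedure itself: no Bayesian inference over topologies is performed, and the dictionary layout is an illustrative assumption.

```python
from math import log2

# machine[state][symbol] = (probability, next_state). Unifilar: each
# (state, symbol) pair leads to exactly one next state.
GOLDEN_MEAN = {
    "A": {"1": (0.5, "A"), "0": (0.5, "B")},
    "B": {"1": (1.0, "A")},
}


def stationary_distribution(machine, iterations=200):
    """Power-iterate the state-to-state transition probabilities."""
    pi = {s: 1.0 / len(machine) for s in machine}
    for _ in range(iterations):
        nxt = {s: 0.0 for s in machine}
        for state, edges in machine.items():
            for prob, target in edges.values():
                nxt[target] += pi[state] * prob
        pi = nxt
    return pi


def entropy_rate(machine, pi):
    """h = -sum_s pi(s) sum_x p(x|s) log2 p(x|s), valid for unifilar machines."""
    return -sum(
        pi[state] * prob * log2(prob)
        for state, edges in machine.items()
        for prob, _ in edges.values()
        if prob > 0.0
    )


def statistical_complexity(pi):
    """Shannon entropy (bits) of the stationary state distribution."""
    return -sum(p * log2(p) for p in pi.values() if p > 0.0)


pi = stationary_distribution(GOLDEN_MEAN)
h = entropy_rate(GOLDEN_MEAN, pi)
c = statistical_complexity(pi)
print(f"pi = {pi}, entropy rate = {h:.4f} bits, complexity = {c:.4f} bits")
```

In a Bayesian setting, evaluating these functions on samples from a posterior over machines would give posterior distributions for both quantities rather than single point values.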
The Impact of Disablers on Predictive Inference

Science.gov (United States)

Cummins, Denise Dellarosa

2014-01-01

People consider alternative causes when deciding whether a cause is responsible for an effect (diagnostic inference) but appear to neglect them when deciding whether an effect will occur (predictive inference). Five experiments were conducted to test a 2-part explanation of this phenomenon: namely, (a) that people interpret standard predictive…
Automatic physical inference with information maximizing neural networks

Science.gov (United States)

Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.

2018-04-01

Compressing large data sets to a manageable number of summaries that are informative about the underlying parameters vastly simplifies both frequentist and Bayesian inference. When only simulations are available, these summaries are typically chosen heuristically, so they may inadvertently miss important information. We introduce a simulation-based machine learning technique that trains artificial neural networks to find nonlinear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). In test cases where the posterior can be derived exactly, likelihood-free inference based on automatically derived IMNN summaries produces nearly exact posteriors, showing that these summaries are good approximations to sufficient statistics. In a series of numerical examples of increasing complexity and astrophysical relevance we show that IMNNs are robustly capable of automatically finding optimal, nonlinear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima. We anticipate that the automatic physical inference method described in this paper will be essential to obtain both accurate and precise cosmological parameter estimates from complex and large astronomical data sets, including those from LSST and Euclid.
Cluster management.

Science.gov (United States)

Katz, R

1992-11-01

Cluster management is a management model that fosters decentralization of management, develops leadership potential of staff, and creates ownership of unit-based goals. Unlike shared governance models, there is no formal structure created by committees and it is less threatening for managers. There are two parts to the cluster management model. One is the formation of cluster groups, consisting of all staff and facilitated by a cluster leader. The cluster groups function for communication and problem-solving. The second part of the cluster management model is the creation of task forces. These task forces are designed to work on short-term goals, usually in response to solving one of the unit's goals. Sometimes the task forces are used for quality improvement or system problems. Clusters are groups of not more than five or six staff members, facilitated by a cluster leader. A cluster is made up of individuals who work the same shift. For example, people with job titles who work days would be in a cluster. There would be registered nurses, licensed practical nurses, nursing assistants, and unit clerks in the cluster. The cluster leader is chosen by the manager based on certain criteria and is trained for this specialized role. The concept of cluster management, criteria for choosing leaders, training for leaders, using cluster groups to solve quality improvement issues, and the learning process necessary for manager support are described.
Cosmological constraints with clustering-based redshifts

Science.gov (United States)

Kovetz, Ely D.; Raccanelli, Alvise; Rahman, Mubdi

2017-07-01

We demonstrate that observations lacking reliable redshift information, such as photometric and radio continuum surveys, can produce robust measurements of cosmological parameters when empowered by clustering-based redshift estimation. This method infers the redshift distribution based on the spatial clustering of sources, using cross-correlation with a reference data set with known redshifts. Applying this method to the existing Sloan Digital Sky Survey (SDSS) photometric galaxies, and projecting to future radio continuum surveys, we show that sources can be efficiently divided into several redshift bins, increasing their ability to constrain cosmological parameters. We forecast constraints on the dark-energy equation of state and on local non-Gaussianity parameters. We explore several pertinent issues, including the trade-off between including more sources and minimizing the overlap between bins, the shot-noise limitations on binning and the predicted performance of the method at high redshifts, and most importantly pay special attention to possible degeneracies with the galaxy bias. Remarkably, we find that once this technique is implemented, constraints on dynamical dark energy from the SDSS imaging catalogue can be competitive with, or better than, those from the spectroscopic BOSS survey and even future planned experiments. Further, constraints on primordial non-Gaussianity from future large-sky radio-continuum surveys can outperform those from the Planck cosmic microwave background experiment and rival those from future spectroscopic galaxy surveys. The application of this method thus holds tremendous promise for cosmology.
Inference as Prediction

Science.gov (United States)

Watson, Jane

2007-01-01

Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…

Problem solving and inference mechanisms

Energy Technology Data Exchange (ETDEWEB)

Furukawa, K; Nakajima, R; Yonezawa, A; Goto, S; Aoyama, A

1982-01-01

The heart of the fifth generation computer will be powerful mechanisms for problem solving and inference. A deduction-oriented language is to be designed, which will form the core of the whole computing system. The language is based on predicate logic with the extended features of structuring facilities, meta structures and relational data base interfaces. Parallel computation mechanisms and specialized hardware architectures are being investigated to make possible efficient realization of the language features. The project includes research into an intelligent programming system, a knowledge representation language and system, and a meta inference system to be built on the core. 30 references.
Elements of Causal Inference: Foundations and Learning Algorithms

DEFF Research Database (Denmark)

Peters, Jonas Martin; Janzing, Dominik; Schölkopf, Bernhard

A concise and self-contained introduction to causal inference, increasingly important in data science and machine learning...
Bayesian methods for hackers: probabilistic programming and Bayesian inference

CERN Document Server

Davidson-Pilon, Cameron

2016-01-01

Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice–freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples a...
Lifting to cluster-tilting objects in higher cluster categories

OpenAIRE

Liu, Pin

2008-01-01

In this note, we consider the $d$-cluster-tilted algebras, the endomorphism algebras of $d$-cluster-tilting objects in $d$-cluster categories. We show that a tilting module over such an algebra lifts to a $d$-cluster-tilting object in this $d$-cluster category.
Cluster Analysis of Time-Dependent Crystallographic Data: Direct Identification of Time-Independent Structural Intermediates

Science.gov (United States)

Kostov, Konstantin S.; Moffat, Keith

2011-01-01

The initial output of a time-resolved macromolecular crystallography experiment is a time-dependent series of difference electron density maps that displays the time-dependent changes in underlying structure as a reaction progresses. The goal is to interpret such data in terms of a small number of crystallographically refinable, time-independent structures, each associated with a reaction intermediate; to establish the pathways and rate coefficients by which these intermediates interconvert; and thereby to elucidate a chemical kinetic mechanism. One strategy toward achieving this goal is to use cluster analysis, a statistical method that groups objects based on their similarity. If the difference electron density at a particular voxel in the time-dependent difference electron density (TDED) maps is sensitive to the presence of one and only one intermediate, then its temporal evolution will exactly parallel the concentration profile of that intermediate with time. The rationale is therefore to cluster voxels with respect to the shapes of their TDEDs, so that each group or cluster of voxels corresponds to one structural intermediate. Clusters of voxels whose TDEDs reflect the presence of two or more specific intermediates can also be identified. From such groupings one can then infer the number of intermediates, obtain their time-independent difference density characteristics, and refine the structure of each intermediate. We review the principles of cluster analysis and clustering algorithms in a crystallographic context, and describe the application of the method to simulated and experimental time-resolved crystallographic data for the photocycle of photoactive yellow protein. PMID:21244840
Data Clustering

Science.gov (United States)

Wagstaff, Kiri L.

2012-03-01

On obtaining a new data set, the researcher is immediately faced with the challenge of obtaining a high-level understanding from the observations. What does a typical item look like? What are the dominant trends? How many distinct groups are included in the data set, and how is each one characterized? Which observable values are common, and which rarely occur? Which items stand out as anomalies or outliers from the rest of the data? This challenge is exacerbated by the steady growth in data set size [11] as new instruments push into new frontiers of parameter space, via improvements in temporal, spatial, and spectral resolution, or by the desire to "fuse" observations from different modalities and instruments into a larger-picture understanding of the same underlying phenomenon. Data clustering algorithms provide a variety of solutions for this task. They can generate summaries, locate outliers, compress data, identify dense or sparse regions of feature space, and build data models. It is useful to note up front that "clusters" in this context refer to groups of items within some descriptive feature space, not (necessarily) to "galaxy clusters" which are dense regions in physical space. The goal of this chapter is to survey a variety of data clustering methods, with an eye toward their applicability to astronomical data analysis. In addition to improving the individual researcher’s understanding of a given data set, clustering has led directly to scientific advances, such as the discovery of new subclasses of stars [14] and gamma-ray bursts (GRBs) [38]. All clustering algorithms seek to identify groups within a data set that reflect some observed, quantifiable structure. Clustering is traditionally an unsupervised approach to data analysis, in the sense that it operates without any direct guidance about which items should be assigned to which clusters. There has been a recent trend in the clustering literature toward supporting semisupervised or constrained
Causal inference in econometrics

CERN Document Server

Kreinovich, Vladik; Sriboonchitta, Songsak

2016-01-01

This book is devoted to the analysis of causal inference which is one of the most difficult tasks in data analysis: when two phenomena are observed to be related, it is often difficult to decide whether one of them causally influences the other one, or whether these two phenomena have a common cause. This analysis is the main focus of this volume. To get a good understanding of the causal inference, it is important to have models of economic phenomena which are as accurate as possible. Because of this need, this volume also contains papers that use non-traditional economic models, such as fuzzy models and models obtained by using neural networks and data mining techniques. It also contains papers that apply different econometric models to analyze real-life economic dependencies.
Dense Fe cluster-assembled films by energetic cluster deposition

International Nuclear Information System (INIS)

Peng, D.L.; Yamada, H.; Hihara, T.; Uchida, T.; Sumiyama, K.

2004-01-01

High-density Fe cluster-assembled films were produced at room temperature by energetic cluster deposition. Though cluster-assemblies are usually sooty and porous, the present Fe cluster-assembled films are lustrous and dense, revealing a soft magnetic behavior. Size-monodispersed Fe clusters with the mean cluster size d=9 nm were synthesized using a plasma-gas-condensation technique. Ionized clusters are accelerated electrically and deposited onto the substrate together with neutral clusters from the same cluster source. Packing fraction and saturation magnetic flux density increase rapidly and magnetic coercivity decreases remarkably with increasing acceleration voltage. The Fe cluster-assembled film obtained at the acceleration voltage of -20 kV has a packing fraction of 0.86±0.03, saturation magnetic flux density of 1.78±0.05 Wb/m$^2$, and coercivity value smaller than 80 A/m. The resistivity at room temperature is ten times larger than that of bulk Fe metal.
Assessment of network inference methods: how to cope with an underdetermined problem.

Directory of Open Access Journals (Sweden)

Caroline Siegenthaler

Full Text Available The inference of biological networks is an active research area in the field of systems biology. The number of network inference algorithms has grown tremendously in the last decade, underlining the importance of a fair assessment and comparison among these methods. Current assessments of the performance of an inference method typically involve the application of the algorithm to benchmark datasets and the comparison of the network predictions against the gold standard or reference networks. While the network inference problem is often deemed underdetermined, implying that the inference problem does not have a (unique) solution, the consequences of such an attribute have not been rigorously taken into consideration. Here, we propose a new procedure for assessing the performance of gene regulatory network (GRN) inference methods. The procedure takes into account the underdetermined nature of the inference problem, in which gene regulatory interactions that are inferable or non-inferable are determined based on causal inference. The assessment relies on a new definition of the confusion matrix, which excludes errors associated with non-inferable gene regulations. For demonstration purposes, the proposed assessment procedure is applied to the DREAM 4 In Silico Network Challenge. The results show a marked change in the ranking of participating methods when taking network inferability into account.
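As a rough illustration of the assessment idea in this abstract, the sketch below tallies a confusion matrix over only those candidate edges that are marked inferable. The function name `masked_confusion`, the edge-set representation, and the toy network are assumptions made for illustration, not the authors' procedure; in particular, the causal-inference step that decides which edges are inferable is not reproduced and is simply supplied as an input.

```python
from itertools import permutations

def masked_confusion(genes, true_edges, predicted_edges, inferable_edges):
    """Count TP/FP/FN/TN over inferable directed edges only.

    All arguments are hypothetical: sets of (regulator, target) pairs.
    Edges outside `inferable_edges` are skipped, so errors on them are
    not charged to the inference method.
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for edge in permutations(genes, 2):  # every ordered gene pair
        if edge not in inferable_edges:
            continue
        actual = edge in true_edges
        predicted = edge in predicted_edges
        if actual and predicted:
            counts["TP"] += 1
        elif predicted:
            counts["FP"] += 1
        elif actual:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts

genes = ["g1", "g2", "g3"]
true_edges = {("g1", "g2"), ("g2", "g3")}
predicted_edges = {("g1", "g2"), ("g3", "g1")}
# Toy assumption: only these three edges can be decided from the data.
inferable_edges = {("g1", "g2"), ("g2", "g3"), ("g1", "g3")}
result = masked_confusion(genes, true_edges, predicted_edges, inferable_edges)
```

In this toy case the wrongly predicted edge g3 -> g1 is not counted as a false positive because it lies outside the inferable set, which is the kind of exclusion that could change a ranking of methods.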
Cluster Physics with Merging Galaxy Clusters

Directory of Open Access Journals (Sweden)

Sandor M. Molnar

2016-02-01

Full Text Available Collisions between galaxy clusters provide a unique opportunity to study matter in a parameter space which cannot be explored in our laboratories on Earth. In the standard $\Lambda$CDM model, where the total density is dominated by the cosmological constant ($\Lambda$) and the matter density by cold dark matter (CDM), structure formation is hierarchical, and clusters grow mostly by merging. Mergers of two massive clusters are the most energetic events in the universe after the Big Bang, hence they provide a unique laboratory to study cluster physics. The two main mass components in clusters behave differently during collisions: the dark matter is nearly collisionless, responding only to gravity, while the gas is subject to pressure forces and dissipation, and shocks and turbulence are developed during collisions. In the present contribution we review the different methods used to derive the physical properties of merging clusters. Different physical processes leave their signatures on different wavelengths, thus our review is based on a multifrequency analysis. In principle, the best way to analyze multifrequency observations of merging clusters is to model them using N-body/HYDRO numerical simulations. We discuss the results of such detailed analyses. New high spatial and spectral resolution ground and space based telescopes will come online in the near future. Motivated by these new opportunities, we briefly discuss methods which will be feasible in the near future in studying merging clusters.
Probability and Statistical Inference

OpenAIRE

Prosper, Harrison B.

2006-01-01

These lectures introduce key concepts in probability and statistical inference at a level suitable for graduate students in particle physics. Our goal is to paint as vivid a picture as possible of the concepts covered.
Planck intermediate results: XL. The Sunyaev-Zeldovich signal from the Virgo cluster

International Nuclear Information System (INIS)

Ade, P. A. R.; Aghanim, N.; Arnaud, M.; Ashdown, M.; Aumont, J.

2016-01-01

The Virgo cluster is the largest Sunyaev-Zeldovich (SZ) source in the sky, both in terms of angular size and total integrated flux. Planck’s wide angular scale and frequency coverage, together with its high sensitivity, enable a detailed study of this big object through the SZ effect. Virgo is well resolved by Planck, showing an elongated structure that correlates well with the morphology observed from X-rays, but extends beyond the observed X-ray signal. We find good agreement between the SZ signal (or Compton parameter, y_c) observed by Planck and the expected signal inferred from X-ray observations and simple analytical models. Owing to its proximity to us, the gas beyond the virial radius in Virgo can be studied with unprecedented sensitivity by integrating the SZ signal over tens of square degrees. In this paper, we study the signal in the outskirts of Virgo and compare it with analytical models and a constrained simulation of the environment of Virgo. Planck data suggest that significant amounts of low-density plasma surround Virgo, out to twice the virial radius. We find the SZ signal in the outskirts of Virgo to be consistent with a simple model that extrapolates the inferred pressure at lower radii, while assuming that the temperature stays in the keV range beyond the virial radius. The observed signal is also consistent with simulations and points to a shallow pressure profile in the outskirts of the cluster. This reservoir of gas at large radii can be linked with the hottest phase of the elusive warm/hot intergalactic medium. Taking the lack of symmetry of Virgo into account, we find that a prolate model is favoured by the combination of SZ and X-ray data, in agreement with predictions. In conclusion, based on the combination of the same SZ and X-ray data, we constrain the total amount of gas in Virgo. Under the hypothesis that the abundance of baryons in Virgo is representative of the cosmic average, we also infer a distance for Virgo of approximately
Fuzzy logic controller using different inference methods

International Nuclear Information System (INIS)

Liu, Z.; De Keyser, R.

1994-01-01

In this paper the design of fuzzy controllers by using different inference methods is introduced. Configuration of the fuzzy controllers includes a general rule-base which is a collection of fuzzy PI or PD rules, the triangular fuzzy data model and a centre of gravity defuzzification algorithm. The generalized modus ponens (GMP) is used with the minimum operator of the triangular norm. Under the sup-min inference rule, six fuzzy implication operators are employed to calculate the fuzzy look-up tables for each rule base. The performance is tested in simulated systems with MATLAB/SIMULINK. Results show the effects of using the fuzzy controllers with different inference methods when applied to different test processes.
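The ingredients named in this abstract (triangular membership functions, the minimum operator, and centre of gravity defuzzification) can be sketched in a few lines. The example below is a simplified, single-input illustration rather than the paper's PI/PD controllers: the set labels, the three-rule base, the normalised universe [-1, 1], and the function names `tri` and `control` are all assumptions, and only the min implication is shown, not the six operators the paper compares.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical fuzzy sets on a normalised universe [-1, 1].
SETS = {"N": (-2.0, -1.0, 0.0), "Z": (-1.0, 0.0, 1.0), "P": (0.0, 1.0, 2.0)}
# Hypothetical proportional rule base: error label -> control label.
RULES = {"N": "N", "Z": "Z", "P": "P"}

def control(error, steps=201):
    """Min implication, max aggregation, centre-of-gravity defuzzification."""
    grid = [-1.0 + 2.0 * i / (steps - 1) for i in range(steps)]
    num = den = 0.0
    for u in grid:
        # Clip each rule's output set at its firing strength (min),
        # then aggregate the clipped sets with max.
        mu = max(
            min(tri(error, *SETS[e]), tri(u, *SETS[o]))
            for e, o in RULES.items()
        )
        num += u * mu
        den += mu
    return num / den if den else 0.0
```

Evaluating `control` over a grid of error values would give a one-dimensional analogue of the look-up tables mentioned in the abstract; a PI or PD rule base would add a second input (the integral or derivative of the error) and a two-dimensional rule table.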
An algebra-based method for inferring gene regulatory networks.

Science.gov (United States)

Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

2014-03-26

The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the
Statistical inference based on divergence measures

CERN Document Server

Pardo, Leandro

2005-01-01

The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. However, in spite of the fact that divergence statistics have become a very good alternative to the classical likelihood ratio test and the Pearson-type statistic in discrete models, many statisticians remain unaware of this powerful approach. Statistical Inference Based on Divergence Measures explores classical problems of statistical inference, such as estimation and hypothesis testing, on the basis of measures of entropy and divergence. The first two chapters form an overview, from a statistical perspective, of the most important measures of entropy and divergence and study their properties. The author then examines the statistical analysis of discrete multivariate data with emphasis on problems in contingency tables and loglinear models using phi-divergence test statistics as well as minimum phi-divergence estimators. The final chapter looks at testing in general populations, prese...
Active inference, sensory attenuation and illusions.

Science.gov (United States)

Brown, Harriet; Adams, Rick A; Parees, Isabel; Edwards, Mark; Friston, Karl

2013-11-01

Active inference provides a simple and neurobiologically plausible account of how action and perception are coupled in producing (Bayes) optimal behaviour. This can be seen most easily as minimising prediction error: we can either change our predictions to explain sensory input through perception. Alternatively, we can actively change sensory input to fulfil our predictions. In active inference, this action is mediated by classical reflex arcs that minimise proprioceptive prediction error created by descending proprioceptive predictions. However, this creates a conflict between action and perception; in that, self-generated movements require predictions to override the sensory evidence that one is not actually moving. However, ignoring sensory evidence means that externally generated sensations will not be perceived. Conversely, attending to (proprioceptive and somatosensory) sensations enables the detection of externally generated events but precludes generation of actions. This conflict can be resolved by attenuating the precision of sensory evidence during movement or, equivalently, attending away from the consequences of self-made acts. We propose that this Bayes optimal withdrawal of precise sensory evidence during movement is the cause of psychophysical sensory attenuation. Furthermore, it explains the force-matching illusion and reproduces empirical results almost exactly. Finally, if attenuation is removed, the force-matching illusion disappears and false (delusional) inferences about agency emerge. This is important, given the negative correlation between sensory attenuation and delusional beliefs in normal subjects--and the reduction in the magnitude of the illusion in schizophrenia. Active inference therefore links the neuromodulatory optimisation of precision to sensory attenuation and illusory phenomena during the attribution of agency in normal subjects. It also provides a functional account of deficits in syndromes characterised by false inference
Bayesian Inference and Online Learning in Poisson Neuronal Networks.

Science.gov (United States)

Huang, Yanping; Rao, Rajesh P N

2016-08-01

Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.
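The abstract likens the network's computation to Monte Carlo methods such as particle filtering, with each spike acting as a sample of a hidden state. For orientation, here is a minimal generic bootstrap particle filter for a discrete hidden Markov model. It is not the spiking network from the paper; the function name, the two-state model, and all probabilities are assumptions chosen for illustration.

```python
import random

def particle_filter(observations, trans, emit, n_particles=2000, seed=0):
    """Approximate P(state_t | obs_1..t) for a discrete HMM by sampling.

    trans[i][j] = P(next=j | current=i); emit[i][o] = P(obs=o | state=i).
    Returns one list of state frequencies per time step.
    """
    rng = random.Random(seed)
    n_states = len(trans)
    states = list(range(n_states))
    # Start from a uniform prior over hidden states.
    particles = [rng.randrange(n_states) for _ in range(n_particles)]
    posteriors = []
    for obs in observations:
        # Propagate each particle through the transition model.
        particles = [rng.choices(states, weights=trans[p])[0] for p in particles]
        # Weight by the likelihood of the observation, then resample.
        weights = [emit[p][obs] for p in particles]
        particles = rng.choices(particles, weights=weights, k=n_particles)
        posteriors.append([particles.count(s) / n_particles for s in states])
    return posteriors

# Hypothetical two-state model: sticky transitions, noisy observations.
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.8, 0.2], [0.2, 0.8]]
post = particle_filter([1] * 10, trans, emit)
```

The fraction of particles in each state plays the role that population spiking activity plays in the abstract: a sampled approximation of the posterior over hidden states.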
Contingency inferences driven by base rates: Valid by sampling

Directory of Open Access Journals (Sweden)

Florian Kutzner

2011-04-01

Full Text Available Fiedler et al. (2009) reviewed evidence for the utilization of a contingency inference strategy termed pseudocontingencies (PCs). In PCs, the more frequent levels (and, by implication, the less frequent levels) are assumed to be associated. PCs have been obtained using a wide range of task settings and dependent measures. Yet, the readiness with which decision makers rely on PCs is poorly understood. A computer simulation explored two potential sources of subjective validity of PCs. First, PCs are shown to perform above chance level when the task is to infer the sign of moderate to strong population contingencies from a sample of observations. Second, contingency inferences based on PCs and inferences based on cell frequencies are shown to partially agree across samples. Intriguingly, this criterion and convergent validity are by-products of random sampling error, highlighting the inductive nature of contingency inferences.
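The first simulation result described in this abstract can be illustrated with a small sketch. It assumes a particular population that the abstract does not specify: X is a fair coin and Y copies X with probability `p_match`, so the true contingency is positive. The function name and all parameter values are hypothetical, and this is not the authors' simulation code.

```python
import random

def pc_accuracy(p_match=0.9, n=9, trials=2000, seed=1):
    """Fraction of samples in which base rates alone give the right sign.

    Population (an assumption for this sketch): X is a fair coin and Y
    equals X with probability p_match, so the true contingency is
    positive whenever p_match > 0.5. n is odd to rule out tied base rates.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.random() < 0.5 for _ in range(n)]
        ys = [x if rng.random() < p_match else not x for x in xs]
        x_skew = sum(xs) > n / 2  # is level 1 of X the frequent one?
        y_skew = sum(ys) > n / 2  # is level 1 of Y the frequent one?
        # PC rule: frequent levels go together, so equal skews
        # predict a positive contingency, which is the true sign here.
        hits += (x_skew == y_skew)
    return hits / trials

acc = pc_accuracy()
```

Because the population base rates are exactly 50/50 in this setup, any skew in a sample comes from sampling error alone, which is the mechanism the abstract points to.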
Reinforcement and inference in cross-situational word learning.

Science.gov (United States)

Tilles, Paulo F C; Fontanari, José F

2013-01-01

Cross-situational word learning is based on the notion that a learner can determine the referent of a word by finding something in common across many observed uses of that word. Here we propose an adaptive learning algorithm that contains a parameter that controls the strength of the reinforcement applied to associations between concurrent words and referents, and a parameter that regulates inference, which includes built-in biases, such as mutual exclusivity, and information of past learning events. By adjusting these parameters so that the model predictions agree with data from representative experiments on cross-situational word learning, we were able to explain the learning strategies adopted by the participants of those experiments in terms of a trade-off between reinforcement and inference. These strategies can vary wildly depending on the conditions of the experiments. For instance, for fast mapping experiments (i.e., the correct referent could, in principle, be inferred in a single observation) inference is prevalent, whereas for segregated contextual diversity experiments (i.e., the referents are separated in groups and are exhibited with members of their groups only) reinforcement is predominant. Other experiments are explained with more balanced doses of reinforcement and inference.
Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study.

Science.gov (United States)

Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein

2014-11-01

Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions.

Clusters and how to make it work: Cluster Strategy Toolkit

NARCIS (Netherlands)

Manickam, Anu; van Berkel, Karel

2014-01-01

Clusters are the magic answer to regional economic development. Firms in clusters are more innovative; cluster policy dominates EU policy; ‘top-sectors’ and excellence are the choice of national policy makers; clusters are ‘in’. But, clusters are complex, clusters are ‘messy’; there is no clear
Eight challenges in phylodynamic inference

Directory of Open Access Journals (Sweden)

Simon D.W. Frost

2015-03-01

Full Text Available The field of phylodynamics, which attempts to enhance our understanding of infectious disease dynamics using pathogen phylogenies, has made great strides in the past decade. Basic epidemiological and evolutionary models are now well characterized with inferential frameworks in place. However, significant challenges remain in extending phylodynamic inference to more complex systems. These challenges include accounting for evolutionary complexities such as changing mutation rates, selection, reassortment, and recombination, as well as epidemiological complexities such as stochastic population dynamics, host population structure, and different patterns at the within-host and between-host scales. An additional challenge exists in making efficient inferences from an ever increasing corpus of sequence data.
Cluster dynamics at different cluster size and incident laser wavelengths

International Nuclear Information System (INIS)

Desai, Tara; Bernardinello, Andrea

2002-01-01

X-ray emission spectra from aluminum clusters of diameter ∼0.4 μm and gold clusters of diameter ∼1.25 μm are experimentally studied by irradiating the cluster foil targets with a 1.06 μm laser, 10 ns (FWHM), at an intensity of ∼$10^{12}$ W/cm$^2$. Aluminum clusters show different spectra compared to bulk material, whereas gold clusters evolve towards bulk gold. Experimental data are analyzed on the basis of cluster dimension, laser wavelength and pulse duration. PIC simulations are performed to study the behavior of clusters at higher intensity, I ≥ $10^{17}$ W/cm$^2$, for different sizes of clusters irradiated at different laser wavelengths. Results indicate the dependence of cluster dynamics on cluster size and incident laser wavelength.
Text Clustering Algorithm Based on Random Cluster Core

Directory of Open Access Journals (Sweden)

Huang Long-Jun

2016-01-01

Full Text Available Clustering has become a popular text mining technique, but very large data sets place higher demands on the accuracy and performance of text mining. To address the performance bottleneck of traditional text clustering algorithms, this paper proposes a text clustering algorithm with random features. The algorithm is based on text density; using neighboring heuristic rules, it introduces the concept of a random cluster, which effectively reduces the complexity of the distance calculation.
CAF: Cluster algorithm and a-star with fuzzy approach for lifetime enhancement in wireless sensor networks

KAUST Repository

Yuan, Y.; Li, C.; Yang, Y.; Zhang, Xiangliang; Li, L.

2014-01-01

Energy is a major factor in designing wireless sensor networks (WSNs). In particular, in the real world, battery energy is limited; thus improving energy efficiency becomes the key concern of routing protocols. In addition, the sensor nodes are always deployed far away from the base station, and the transmission energy consumption grows exponentially with distance. This paper proposes a new routing method for WSNs to extend the network lifetime using a combination of a clustering algorithm, a fuzzy approach, and an A-star method. The proposal is divided into two steps. Firstly, WSNs are separated into clusters using the Stable Election Protocol (SEP) method. Secondly, the combined methods of fuzzy inference and A-star algorithm are adopted, taking into account factors such as the remaining power, the minimum hops, and the traffic numbers of nodes. Simulation results demonstrate that the proposed method is significantly effective at balancing energy consumption and maximizing the network lifetime, as shown by comparing the performance of the A-star and fuzzy (AF) approach, cluster and fuzzy (CF) method, cluster and A-star (CA) method, A-star method, and SEP algorithm under the same routing criteria.
CAF: Cluster algorithm and a-star with fuzzy approach for lifetime enhancement in wireless sensor networks

KAUST Repository

Yuan, Y.

2014-04-28

Energy is a major factor in designing wireless sensor networks (WSNs). In particular, in the real world, battery energy is limited; thus improving energy efficiency becomes the key concern of routing protocols. In addition, the sensor nodes are always deployed far away from the base station, and the transmission energy consumption grows exponentially with distance. This paper proposes a new routing method for WSNs to extend the network lifetime using a combination of a clustering algorithm, a fuzzy approach, and an A-star method. The proposal is divided into two steps. Firstly, WSNs are separated into clusters using the Stable Election Protocol (SEP) method. Secondly, the combined methods of fuzzy inference and A-star algorithm are adopted, taking into account factors such as the remaining power, the minimum hops, and the traffic numbers of nodes. Simulation results demonstrate that the proposed method is significantly effective at balancing energy consumption and maximizing the network lifetime, as shown by comparing the performance of the A-star and fuzzy (AF) approach, cluster and fuzzy (CF) method, cluster and A-star (CA) method, A-star method, and SEP algorithm under the same routing criteria.
THE ACS FORNAX CLUSTER SURVEY. X. COLOR GRADIENTS OF GLOBULAR CLUSTER SYSTEMS IN EARLY-TYPE GALAXIES

International Nuclear Information System (INIS)

Liu Chengze; Peng, Eric W.; Jordan, Andres; Ferrarese, Laura; Blakeslee, John P.; Cote, Patrick; Mei, Simona

2011-01-01

We use the largest homogeneous sample of globular clusters (GCs), drawn from the ACS Virgo Cluster Survey (ACSVCS) and ACS Fornax Cluster Survey (ACSFCS), to investigate the color gradients of GC systems in 76 early-type galaxies. We find that most GC systems possess an obvious negative gradient in (g-z) color with radius (bluer outward), which is consistent with previous work. For GC systems displaying color bimodality, both metal-rich and metal-poor GC subpopulations present shallower but significant color gradients on average, and the mean color gradients of these two subpopulations are of roughly equal strength. The field of view of ACS mainly restricts us to measuring the inner gradients of the studied GC systems. These gradients, however, can introduce an aperture bias when measuring the mean colors of GC subpopulations from relatively narrow central pointings. Inferred corrections to previous work imply a reduced significance for the relation between the mean color of metal-poor GCs and their host galaxy luminosity. The GC color gradients also show a dependence on host galaxy mass, where the gradients are weakest at the ends of the mass spectrum (in massive galaxies and dwarf galaxies) and strongest in galaxies of intermediate mass, around a stellar mass of $M_* \sim 10^{10}\,M_\odot$. We also measure color gradients for field stars in the host galaxies. We find that GC color gradients are systematically steeper than field star color gradients, but the shape of the gradient-mass relation is the same for both. If gradients are caused by rapid dissipational collapse and weakened by merging, these color gradients support a picture where the inner GC systems of most intermediate-mass and massive galaxies formed early and rapidly with the most massive galaxies having experienced greater merging. The lack of strong gradients in the GC systems of dwarfs, which probably have not experienced many recent major mergers, suggests that low-mass halos were inefficient at retaining
Human Inferences about Sequences: A Minimal Transition Probability Model.

Directory of Open Access Journals (Sweden)

Florent Meyniel

2016-12-01

Full Text Available The brain constantly infers the causes of the inputs it receives and uses these inferences to generate statistical expectations about future observations. Experimental evidence for these expectations and their violations includes explicit reports, sequential effects on reaction times, and mismatch or surprise signals recorded in electrophysiology and functional MRI. Here, we explore the hypothesis that the brain acts as a near-optimal inference device that constantly attempts to infer the time-varying matrix of transition probabilities between the stimuli it receives, even when those stimuli are in fact fully unpredictable. This parsimonious Bayesian model, with a single free parameter, accounts for a broad range of findings on surprise signals, sequential effects and the perception of randomness. Notably, it explains the pervasive asymmetry between repetitions and alternations encountered in those studies. Our analysis suggests that a neural machinery for inferring transition probabilities lies at the core of human sequence knowledge.
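As a rough illustration of what inferring a time-varying transition matrix can look like, the sketch below keeps leaky counts of observed transitions and turns them into probabilities. The exponential forgetting factor `decay` and the flat pseudo-count `prior` are assumptions made for this example; the abstract only states that the model has a single free parameter, not what that parameter is.

```python
def transition_estimates(sequence, decay=1.0, prior=1.0):
    """Estimate P(next | previous) from leaky transition counts.

    decay: hypothetical forgetting factor in (0, 1]; 1.0 means perfect memory.
    prior: pseudo-count added to every transition (a flat prior).
    """
    symbols = sorted(set(sequence))
    counts = {a: {b: 0.0 for b in symbols} for a in symbols}
    for prev, nxt in zip(sequence, sequence[1:]):
        # Discount all past evidence, then record the new transition.
        for a in symbols:
            for b in symbols:
                counts[a][b] *= decay
        counts[prev][nxt] += 1.0
    probs = {}
    for a in symbols:
        total = sum(counts[a].values()) + prior * len(symbols)
        probs[a] = {b: (counts[a][b] + prior) / total for b in symbols}
    return probs


estimates = transition_estimates("AABAB")
```

With `decay` below 1, recent transitions dominate the estimate, which is one simple way to let the inferred matrix drift over time.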
Making Type Inference Practical

DEFF Research Database (Denmark)

Schwartzbach, Michael Ignatieff; Oxhøj, Nicholas; Palsberg, Jens

1992-01-01

We present the implementation of a type inference algorithm for untyped object-oriented programs with inheritance, assignments, and late binding. The algorithm significantly improves our previous one, presented at OOPSLA'91, since it can handle collection classes, such as List, in a useful way. Abo......, the complexity has been dramatically improved, from exponential time to low polynomial time. The implementation uses the techniques of incremental graph construction and constraint template instantiation to avoid representing intermediate results, doing superfluous work, and recomputing type information....... Experiments indicate that the implementation type checks as much as 100 lines per second. This results in a mature product, on which a number of tools can be based, for example a safety tool, an image compression tool, a code optimization tool, and an annotation tool. This may make type inference for object...
THE GRISM LENS-AMPLIFIED SURVEY FROM SPACE (GLASS). V. EXTENT AND SPATIAL DISTRIBUTION OF STAR FORMATION IN z ∼ 0.5 CLUSTER GALAXIES

Energy Technology Data Exchange (ETDEWEB)

Vulcani, Benedetta [Kavli Institute for the Physics and Mathematics of the Universe (WPI), The University of Tokyo Institutes for Advanced Study (UTIAS), the University of Tokyo, Kashiwa, 277-8582 (Japan); Treu, Tommaso; Malkan, Matthew; Abramson, Louis [Department of Physics and Astronomy, University of California, Los Angeles, CA 90095-1547 (United States); Schmidt, Kasper B. [Department of Physics, University of California, Santa Barbara, CA 93106-9530 (United States); Poggianti, Bianca M. [INAF-Astronomical Observatory of Padova (Italy); Dressler, Alan [The Observatories of the Carnegie Institution for Science, 813 Santa Barbara Street, Pasadena, CA 91101 (United States); Fontana, Adriano; Pentericci, Laura [INAF—Osservatorio Astronomico di Roma, Via Frascati 33, 00040 Monte Porzio Catone (Italy); Bradac, Marusa; Hoag, Austin; Huang, Kuan-Han; He, Julie [Department of Physics, University of California, Davis, CA 95616 (United States); Brammer, Gabriel B. [Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218 (United States); Trenti, Michele [School of Physics, University of Melbourne, VIC 3010 (Australia); Linden, Anja von der [Dark Cosmology Centre, Niels Bohr Institute, University of Copenhagen Juliane Maries Vej 30, DK-2100 Copenhagen Ø (Denmark); Morris, Glenn [Kavli Institute for Particle Astrophysics and Cosmology, Stanford University, 452 Lomita Mall, Stanford, CA 94305-4085 (United States)

2015-12-01

We present the first study of the spatial distribution of star formation in z ∼ 0.5 cluster galaxies. The analysis is based on data taken with the Wide Field Camera 3 as part of the Grism Lens-Amplified Survey from Space (GLASS). We illustrate the methodology by focusing on two clusters (MACS 0717.5+3745 and MACS 1423.8+2404) with different morphologies (one relaxed and one merging) and use foreground and background galaxies as a field control sample. The cluster+field sample consists of 42 galaxies with stellar masses in the range 10^8–10^11 M_⊙ and star formation rates in the range 1–20 M_⊙ yr^−1. Both in clusters and in the field, Hα is more extended than the rest-frame UV continuum in 60% of the cases, consistent with diffuse star formation and inside-out growth. In ∼20% of the cases, the Hα emission appears more extended in cluster galaxies than in the field, pointing perhaps to ionized gas being stripped and/or star formation being enhanced at large radii. The peak of the Hα emission and that of the continuum are offset by less than 1 kpc. We investigate trends with the hot gas density as traced by the X-ray emission, and with the surface mass density as inferred from gravitational lens models, and find no conclusive results. The diversity of morphologies and sizes observed in Hα illustrates the complexity of the environmental processes that regulate star formation. Upcoming analysis of the full GLASS data set will increase our sample size by almost an order of magnitude, verifying and strengthening the inference from this initial data set.
THE GRISM LENS-AMPLIFIED SURVEY FROM SPACE (GLASS). V. EXTENT AND SPATIAL DISTRIBUTION OF STAR FORMATION IN z ∼ 0.5 CLUSTER GALAXIES

International Nuclear Information System (INIS)

Vulcani, Benedetta; Treu, Tommaso; Malkan, Matthew; Abramson, Louis; Schmidt, Kasper B.; Poggianti, Bianca M.; Dressler, Alan; Fontana, Adriano; Pentericci, Laura; Bradac, Marusa; Hoag, Austin; Huang, Kuan-Han; He, Julie; Brammer, Gabriel B.; Trenti, Michele; Linden, Anja von der; Morris, Glenn

2015-01-01

We present the first study of the spatial distribution of star formation in z ∼ 0.5 cluster galaxies. The analysis is based on data taken with the Wide Field Camera 3 as part of the Grism Lens-Amplified Survey from Space (GLASS). We illustrate the methodology by focusing on two clusters (MACS 0717.5+3745 and MACS 1423.8+2404) with different morphologies (one relaxed and one merging) and use foreground and background galaxies as a field control sample. The cluster+field sample consists of 42 galaxies with stellar masses in the range 10^8–10^11 M_⊙ and star formation rates in the range 1–20 M_⊙ yr^−1. Both in clusters and in the field, Hα is more extended than the rest-frame UV continuum in 60% of the cases, consistent with diffuse star formation and inside-out growth. In ∼20% of the cases, the Hα emission appears more extended in cluster galaxies than in the field, pointing perhaps to ionized gas being stripped and/or star formation being enhanced at large radii. The peak of the Hα emission and that of the continuum are offset by less than 1 kpc. We investigate trends with the hot gas density as traced by the X-ray emission, and with the surface mass density as inferred from gravitational lens models, and find no conclusive results. The diversity of morphologies and sizes observed in Hα illustrates the complexity of the environmental processes that regulate star formation. Upcoming analysis of the full GLASS data set will increase our sample size by almost an order of magnitude, verifying and strengthening the inference from this initial data set.
Examples in parametric inference with R

CERN Document Server

Dixit, Ulhas Jayram

2016-01-01

This book discusses examples in parametric inference with R. Combining basic theory with modern approaches, it presents the latest developments and trends in statistical inference for students who do not have an advanced mathematical and statistical background. The topics discussed in the book are fundamental and common to many fields of statistical inference and thus serve as a point of departure for in-depth study. The book is divided into eight chapters: Chapter 1 provides an overview of topics on sufficiency and completeness, while Chapter 2 briefly discusses unbiased estimation. Chapter 3 focuses on the study of moments and maximum likelihood estimators, and Chapter 4 presents bounds for the variance. In Chapter 5, topics on consistent estimators are discussed. Chapter 6 discusses Bayes, while Chapter 7 studies some more powerful tests. Lastly, Chapter 8 examines unbiased and other tests. Senior undergraduate and graduate students in statistics and mathematics, and those who have taken an introductory cou...
Causal Effect Inference with Deep Latent-Variable Models

NARCIS (Netherlands)

Louizos, C; Shalit, U.; Mooij, J.; Sontag, D.; Zemel, R.; Welling, M.

2017-01-01

Learning individual-level causal effects from observational data, such as inferring the most effective medication for a specific patient, is a problem of growing importance for policy makers. The most important aspect of inferring causal effects from observational data is the handling of
Causal inference in survival analysis using pseudo-observations

DEFF Research Database (Denmark)

Andersen, Per K; Syriopoulou, Elisavet; Parner, Erik T

2017-01-01

Causal inference for non-censored response variables, such as binary or quantitative outcomes, is often based on either (1) direct standardization ('G-formula') or (2) inverse probability of treatment assignment weights ('propensity score'). To do causal inference in survival analysis, one needs ...
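The two approaches named for non-censored outcomes can be sketched for a binary outcome and a single discrete confounder. This is only an illustration of direct standardization and inverse probability weighting under assumed no unmeasured confounding and positivity; it is not the pseudo-observation method for survival data, and the toy records and function names are invented for the example.

```python
from collections import Counter


def g_formula(data, a):
    """Direct standardization: average E[Y | A=a, Z=z] over the distribution of Z."""
    n = len(data)
    z_counts = Counter(z for z, _, _ in data)
    total = 0.0
    for z, nz in z_counts.items():
        ys = [y for zi, ai, y in data if zi == z and ai == a]
        total += (nz / n) * (sum(ys) / len(ys))  # requires positivity
    return total


def ipw(data, a):
    """Inverse probability weighting with empirical propensities P(A=a | Z=z)."""
    n = len(data)
    z_counts = Counter(z for z, _, _ in data)
    za_counts = Counter((z, ai) for z, ai, _ in data)
    total = 0.0
    for z, ai, y in data:
        if ai == a:
            propensity = za_counts[(z, a)] / z_counts[z]
            total += y / propensity
    return total / n


# Toy records: (confounder z, treatment a, outcome y); values are illustrative.
records = [
    (0, 1, 1), (0, 0, 0), (0, 0, 0), (0, 0, 1),
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0),
]
effect_g = g_formula(records, 1) - g_formula(records, 0)
effect_ipw = ipw(records, 1) - ipw(records, 0)
```

With a fully stratified discrete confounder, as here, the two estimators coincide (both give a risk difference of 2/3), whereas the unadjusted difference in means is 0.5.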
On clusters and clustering from atoms to fractals

CERN Document Server

Reynolds, PJ

1993-01-01

This book attempts to answer why there is so much interest in clusters. Clusters occur on all length scales, and as a result occur in a variety of fields. Clusters are interesting scientifically, but they also have important consequences technologically. The division of the book into three parts roughly separates the field into small, intermediate, and large-scale clusters. Small clusters are the regime of atomic and molecular physics and chemistry. The intermediate regime is the transitional regime, with its characteristics including the onset of bulk-like behavior, growth and aggregation, a
Statistical Inference at Work: Statistical Process Control as an Example

Science.gov (United States)

Bakker, Arthur; Kent, Phillip; Derry, Jan; Noss, Richard; Hoyles, Celia

2008-01-01

To characterise statistical inference in the workplace this paper compares a prototypical type of statistical inference at work, statistical process control (SPC), with a type of statistical inference that is better known in educational settings, hypothesis testing. Although there are some similarities between the reasoning structure involved in…
On quantum statistical inference

NARCIS (Netherlands)

Barndorff-Nielsen, O.E.; Gill, R.D.; Jupp, P.E.

2003-01-01

Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, developments in the theory of quantum measurements have
Are Hox genes ancestrally involved in axial patterning? Evidence from the hydrozoan Clytia hemisphaerica (Cnidaria).

Directory of Open Access Journals (Sweden)

Roxane Chiori

Full Text Available BACKGROUND: The early evolution and diversification of Hox-related genes in eumetazoans has been the subject of conflicting hypotheses concerning the evolutionary conservation of their role in axial patterning and the pre-bilaterian origin of the Hox and ParaHox clusters. The diversification of Hox/ParaHox genes clearly predates the origin of bilaterians. However, the existence of a "Hox code" predating the cnidarian-bilaterian ancestor and supporting the deep homology of axes is more controversial. This assumption was mainly based on the interpretation of Hox expression data from the sea anemone, but growing evidence from other cnidarian taxa calls this hypothesis into question. METHODOLOGY/PRINCIPAL FINDINGS: Hox, ParaHox and Hox-related genes have been investigated here by phylogenetic analysis and in situ hybridisation in Clytia hemisphaerica, a hydrozoan species with medusa and polyp stages alternating in the life cycle. Our phylogenetic analyses do not support an origin of ParaHox and Hox genes by duplication of an ancestral ProtoHox cluster, and reveal a diversification of the cnidarian HOX9-14 genes into three groups called A, B, C. Among the 7 examined genes, only those belonging to the HOX9-14 and the CDX groups exhibit a restricted expression along the oral-aboral axis during development and in the planula larva, while the others are expressed in very specialised areas at the medusa stage. CONCLUSIONS/SIGNIFICANCE: Cross species comparison reveals a strong variability of gene expression along the oral-aboral axis and during the life cycle among cnidarian lineages. The most parsimonious interpretation is that the Hox code, collinearity and conservative role along the antero-posterior axis are bilaterian innovations.
Statistical inference

CERN Document Server

Rohatgi, Vijay K

2003-01-01

Unified treatment of probability and statistics examines and analyzes the relationship between the two fields, exploring inferential issues. Numerous problems, examples, and diagrams--some with solutions--plus clear-cut, highlighted summaries of results. Advanced undergraduate to graduate level. Contents: 1. Introduction. 2. Probability Model. 3. Probability Distributions. 4. Introduction to Statistical Inference. 5. More on Mathematical Expectation. 6. Some Discrete Models. 7. Some Continuous Models. 8. Functions of Random Variables and Random Vectors. 9. Large-Sample Theory. 10. General Meth
The Impact of Contextual Clue Selection on Inference

Directory of Open Access Journals (Sweden)

Leila Barati

2010-05-01

Full Text Available Linguistic information can be conveyed in the form of speech and written text, but it is the content of the message that is ultimately essential for higher-level processes in language comprehension, such as making inferences and associations between text information and knowledge about the world. Linguistically, inference is the shovel that allows receivers to dig meaning out of the text by selecting different embedded contextual clues. Naturally, people with different world experiences infer similar contextual situations differently. Lack of contextual knowledge of the target language can present an obstacle to comprehension (Anderson & Lynch, 2003). This paper investigates how true contextual clue selection from the text can influence listeners’ inference. In the present study, 60 male and female teenagers (13-19) and 60 male and female young adults (20-26) were selected randomly based on the Oxford Placement Test (OPT). During the study, two fiction and two non-fiction passages were read to the participants in the experimental and control groups respectively, and they were given scores according to Lexile’s Score (LS)[1] based on their correct inference and logical thinking ability. In general, the results show that participants’ clue selection, based on their personal schematic references and background knowledge, differs between teenagers and young adults and influences inference and listening comprehension. [1]- This is a framework for reading and listening which matches the appropriate score to each text based on degree of difficulty of text and each text was given a Lexile score from zero to four.

GibbsCluster: unsupervised clustering and alignment of peptide sequences

DEFF Research Database (Denmark)

Andreatta, Massimo; Alvarez, Bruno; Nielsen, Morten

2017-01-01

motif characterizing each cluster. Several parameters are available to customize cluster analysis, including adjustable penalties for small clusters and overlapping groups and a trash cluster to remove outliers. As an example application, we used the server to deconvolute multiple specificities in large......-scale peptidome data generated by mass spectrometry. The server is available at http://www.cbs.dtu.dk/services/GibbsCluster-2.0....
Inferring Demographic History Using Two-Locus Statistics.

Science.gov (United States)

Ragsdale, Aaron P; Gutenkunst, Ryan N

2017-06-01

Population demographic history may be learned from contemporary genetic variation data. Methods based on aggregating the statistics of many single loci into an allele frequency spectrum (AFS) have proven powerful, but such methods ignore potentially informative patterns of linkage disequilibrium (LD) between neighboring loci. To leverage such patterns, we developed a composite-likelihood framework for inferring demographic history from aggregated statistics of pairs of loci. Using this framework, we show that two-locus statistics are more sensitive to demographic history than single-locus statistics such as the AFS. In particular, two-locus statistics escape the notorious confounding of depth and duration of a bottleneck, and they provide a means to estimate effective population size based on the recombination rather than mutation rate. We applied our approach to a Zambian population of Drosophila melanogaster. Notably, using both single- and two-locus statistics, we inferred a substantially lower ancestral effective population size than previous works and did not infer a bottleneck history. Together, our results demonstrate the broad potential for two-locus statistics to enable powerful population genetic inference. Copyright © 2017 by the Genetics Society of America.
Statistical Inference on the Canadian Middle Class

Directory of Open Access Journals (Sweden)

Russell Davidson

2018-03-01

Full Text Available Conventional wisdom says that the middle classes in many developed countries have recently suffered losses, in terms of both the share of the total population belonging to the middle class, and also their share in total income. Here, distribution-free methods are developed for inference on these shares, by deriving expressions for the asymptotic variances of their sample estimates, and for the covariance of the estimates. Asymptotic inference can be undertaken based on asymptotic normality. Bootstrap inference can be expected to be more reliable, and appropriate bootstrap procedures are proposed. As an illustration, samples of individual earnings drawn from Canadian census data are used to test various hypotheses about the middle-class shares, and confidence intervals for them are computed. It is found that, for the earlier censuses, sample sizes are large enough for asymptotic and bootstrap inference to be almost identical, but that, in the twenty-first century, the bootstrap fails on account of a strange phenomenon whereby many presumably different incomes in the data are rounded to one and the same value. Another difference between the centuries is the appearance of heavy right-hand tails in the income distributions of both men and women.
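A minimal sketch of bootstrap inference on the two shares follows. The abstract does not define the middle class, so the band of 50% to 150% of the median income used here is an assumption, as are the function names; the plain percentile bootstrap shown is a generic procedure, not necessarily the one proposed in the paper.

```python
import random
import statistics


def middle_class_shares(incomes, lo=0.5, hi=1.5):
    """Return (population share, income share) of incomes within
    [lo * median, hi * median]; the band is an assumed definition."""
    med = statistics.median(incomes)
    middle = [x for x in incomes if lo * med <= x <= hi * med]
    return len(middle) / len(incomes), sum(middle) / sum(incomes)


def bootstrap_ci(incomes, stat, reps=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for a scalar statistic of the sample."""
    rng = random.Random(seed)
    n = len(incomes)
    draws = sorted(
        stat([rng.choice(incomes) for _ in range(n)]) for _ in range(reps)
    )
    k = int((1 - level) / 2 * reps)
    return draws[k], draws[reps - 1 - k]


sample = [10, 20, 30, 40, 100]  # illustrative incomes, positive by assumption
pop_share, income_share = middle_class_shares(sample)
interval = bootstrap_ci(sample, lambda xs: middle_class_shares(xs)[0])
```

Because the statistic depends on which observations fall inside a band tied to the median, heavily rounded data with many tied values can make resampled shares jump discontinuously, which is worth keeping in mind when reading the abstract's remark about rounded incomes.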
The importance of learning when making inferences

Directory of Open Access Journals (Sweden)

Jorg Rieskamp

2008-03-01

Full Text Available The assumption that people possess a repertoire of strategies to solve the inference problems they face has been made repeatedly. The experimental findings of two previous studies on strategy selection are reexamined from a learning perspective, which argues that people learn to select strategies for making probabilistic inferences. This learning process is modeled with the strategy selection learning (SSL) theory, which assumes that people develop subjective expectancies for the strategies they have. They select strategies proportional to their expectancies, which are updated on the basis of experience. For the study by Newell, Weston, and Shanks (2003), it can be shown that people did not anticipate the success of a strategy from the beginning of the experiment. Instead, the behavior observed at the end of the experiment was the result of a learning process that can be described by the SSL theory. For the second study, by Bröder and Schiffer (2006), the SSL theory is able to provide an explanation for why participants only slowly adapted to new environments in a dynamic inference situation. The reanalysis of the previous studies illustrates the importance of learning for probabilistic inferences.
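The selection rule described above, choosing strategies in proportion to their expectancies and updating those expectancies from experience, can be sketched as follows. The additive update, the restriction to non-negative payoffs, and the strategy names are assumptions for illustration; the full SSL theory may include further parameters not mentioned in the abstract.

```python
import random


def choice_probabilities(expectancies):
    """Probability of selecting each strategy, proportional to its expectancy."""
    total = sum(expectancies.values())
    return {name: value / total for name, value in expectancies.items()}


def choose_strategy(expectancies, rng):
    """Sample one strategy with probability proportional to its expectancy."""
    r = rng.random() * sum(expectancies.values())
    cumulative = 0.0
    for name, value in expectancies.items():
        cumulative += value
        if r < cumulative:
            return name
    return name  # guard against floating-point round-off


def update(expectancies, name, payoff):
    """Reinforce the strategy that was used (non-negative payoff assumed)."""
    expectancies[name] += payoff


# Hypothetical two-strategy repertoire with equal initial expectancies.
expectancies = {"strategy_a": 1.0, "strategy_b": 1.0}
picked = choose_strategy(expectancies, random.Random(0))
update(expectancies, "strategy_a", 2.0)
probs = choice_probabilities(expectancies)
```

After a single payoff of 2 to `strategy_a`, its selection probability rises from 0.5 to 0.75, so preferences shift gradually with experience rather than being fixed from the start.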
Diametrical clustering for identifying anti-correlated gene clusters.

Science.gov (United States)

Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman

2003-09-01

Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, with each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
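The two alternating steps can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' implementation: assigning each gene to the prototype with the largest squared dot product, using power iteration to find the dominant singular vector, and taking caller-supplied initial prototypes are all assumptions of the example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def dominant_direction(rows, start, iters=100):
    """Power iteration for the top eigenvector of sum(x x^T), i.e. the
    dominant right singular vector of the matrix whose rows are `rows`."""
    v = normalize(start)
    for _ in range(iters):
        w = [0.0] * len(v)
        for x in rows:
            c = dot(x, v)
            w = [wi + c * xi for wi, xi in zip(w, x)]
        if dot(w, w) == 0.0:
            break
        v = normalize(w)
    return v


def diametrical_clustering(genes, prototypes, rounds=20):
    genes = [normalize(g) for g in genes]
    protos = [normalize(p) for p in prototypes]
    labels = None
    for _ in range(rounds):
        # (i) Re-partition: squared similarity treats x and -x alike.
        new_labels = [
            max(range(len(protos)), key=lambda j: dot(g, protos[j]) ** 2)
            for g in genes
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # (ii) Recompute each prototype as its cluster's dominant singular vector.
        for j in range(len(protos)):
            members = [g for g, lab in zip(genes, labels) if lab == j]
            if members:
                protos[j] = dominant_direction(members, protos[j])
    return labels, protos


# Illustrative profiles: rows 0/1 and rows 2/3 are anti-correlated pairs.
profiles = [[1, 2, 3], [-1, -2, -3], [1, -2, 1], [-1, 2, -1]]
labels, protos = diametrical_clustering(profiles, [[1, 1, 1], [1, -1, 0]])
```

Squaring the similarity is what places a profile and its mirror image in the same cluster, which a correlation-maximizing method would separate.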
Partitional clustering algorithms

CERN Document Server

2015-01-01

This book summarizes the state-of-the-art in partitional clustering. Clustering, the unsupervised classification of patterns into groups, is one of the most important tasks in exploratory data analysis. Primary goals of clustering include gaining insight into, classifying, and compressing data. Clustering has a long and rich history that spans a variety of scientific disciplines including anthropology, biology, medicine, psychology, statistics, mathematics, engineering, and computer science. As a result, numerous clustering algorithms have been proposed since the early 1950s. Among these algorithms, partitional (nonhierarchical) ones have found many applications, especially in engineering and computer science. This book provides coverage of consensus clustering, constrained clustering, large scale and/or high dimensional clustering, cluster validity, cluster visualization, and applications of clustering. Examines clustering as it applies to large and/or high-dimensional data sets commonly encountered in reali...
Uncovering and testing the fuzzy clusters based on lumped Markov chain in complex network.

Science.gov (United States)

Jing, Fan; Jianbin, Xie; Jinlong, Wang; Jinshuai, Qu

2013-01-01

Identifying clusters, namely groups of nodes with comparatively strong internal connectivity, is a fundamental task for deeply understanding the structure and function of a network. By means of a lumped Markov chain model of a random walker, we propose two novel ways of inferring the lumped Markov transition matrix. Furthermore, some useful results are proposed based on the analysis of the properties of the lumped Markov process. To find the best partition of complex networks, a novel framework including two algorithms for network partition based on the optimal lumped Markovian dynamics is derived. The algorithms are constructed to minimize the objective function under this framework. Simulation experiments demonstrate that our algorithms can efficiently determine the probabilities with which a node belongs to different clusters during the learning process and naturally support the fuzzy partition. Moreover, they are successfully applied to a real-world network, namely the social interactions between members of a karate club.
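To make the idea of a lumped transition matrix concrete, the sketch below aggregates a random walk on an undirected graph over a hard partition, weighting nodes by the stationary distribution. This standard aggregation is an assumption chosen for illustration; the paper proposes its own two inference schemes and handles fuzzy membership, neither of which is reproduced here.

```python
def random_walk_matrix(adj):
    """Row-normalize an adjacency matrix into random-walk transition probabilities."""
    return [[a / sum(row) for a in row] for row in adj]


def stationary(adj):
    """Stationary distribution of a random walk on a connected undirected
    graph: proportional to node degree."""
    degrees = [sum(row) for row in adj]
    total = sum(degrees)
    return [d / total for d in degrees]


def lumped_matrix(adj, labels, k):
    """Cluster-to-cluster transition probabilities under stationarity."""
    P = random_walk_matrix(adj)
    pi = stationary(adj)
    n = len(adj)
    flow = [[0.0] * k for _ in range(k)]
    mass = [0.0] * k
    for i in range(n):
        mass[labels[i]] += pi[i]
        for j in range(n):
            flow[labels[i]][labels[j]] += pi[i] * P[i][j]
    return [[flow[a][b] / mass[a] for b in range(k)] for a in range(k)]


# Illustrative graph: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
lumped = lumped_matrix(adjacency, [0, 0, 0, 1, 1, 1], 2)
```

For this partition the walker leaves its cluster with probability 1/7 per step, so a good partition shows up as a lumped matrix that is strongly diagonal.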
Bayesian inference of substrate properties from film behavior

International Nuclear Information System (INIS)

Aggarwal, R; Demkowicz, M J; Marzouk, Y M

2015-01-01

We demonstrate that by observing the behavior of a film deposited on a substrate, certain features of the substrate may be inferred with quantified uncertainty using Bayesian methods. We carry out this demonstration on an illustrative film/substrate model where the substrate is a Gaussian random field and the film is a two-component mixture that obeys the Cahn–Hilliard equation. We construct a stochastic reduced order model to describe the film/substrate interaction and use it to infer substrate properties from film behavior. This quantitative inference strategy may be adapted to other film/substrate systems. (paper)
Brain Imaging, Forward Inference, and Theories of Reasoning

Science.gov (United States)

Heit, Evan

2015-01-01

This review focuses on the issue of how neuroimaging studies address theoretical accounts of reasoning, through the lens of the method of forward inference (Henson, 2005, 2006). After theories of deductive and inductive reasoning are briefly presented, the method of forward inference for distinguishing between psychological theories based on brain imaging evidence is critically reviewed. Brain imaging studies of reasoning, comparing deductive and inductive arguments, comparing meaningful versus non-meaningful material, investigating hemispheric localization, and comparing conditional and relational arguments, are assessed in light of the method of forward inference. Finally, conclusions are drawn with regard to future research opportunities. PMID:25620926
Brain imaging, forward inference, and theories of reasoning.

Science.gov (United States)

Heit, Evan

2014-01-01

This review focuses on the issue of how neuroimaging studies address theoretical accounts of reasoning, through the lens of the method of forward inference (Henson, 2005, 2006). After theories of deductive and inductive reasoning are briefly presented, the method of forward inference for distinguishing between psychological theories based on brain imaging evidence is critically reviewed. Brain imaging studies of reasoning, comparing deductive and inductive arguments, comparing meaningful versus non-meaningful material, investigating hemispheric localization, and comparing conditional and relational arguments, are assessed in light of the method of forward inference. Finally, conclusions are drawn with regard to future research opportunities.
Statistical inference an integrated approach

CERN Document Server

Migon, Helio S; Louzada, Francisco

2014-01-01

Introduction: Information; The concept of probability; Assessing subjective probabilities; An example; Linear algebra and probability; Notation; Outline of the book. Elements of Inference: Common statistical models; Likelihood-based functions; Bayes theorem; Exchangeability; Sufficiency and exponential family; Parameter elimination. Prior Distribution: Entirely subjective specification; Specification through functional forms; Conjugacy with the exponential family; Non-informative priors; Hierarchical priors. Estimation: Introduction to decision theory; Bayesian point estimation; Classical point estimation; Empirical Bayes estimation; Comparison of estimators; Interval estimation; Estimation in the Normal model. Approximating Methods: The general problem of inference; Optimization techniques; Asymptotic theory; Other analytical approximations; Numerical integration methods; Simulation methods. Hypothesis Testing: Introduction; Classical hypothesis testing; Bayesian hypothesis testing; Hypothesis testing and confidence intervals; Asymptotic tests. Prediction...
Statistical learning and selective inference.

Science.gov (United States)

Taylor, Jonathan; Tibshirani, Robert J

2015-06-23

We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
A TALE OF DWARFS AND GIANTS: USING A z = 1.62 CLUSTER TO UNDERSTAND HOW THE RED SEQUENCE GREW OVER THE LAST 9.5 BILLION YEARS

International Nuclear Information System (INIS)

Rudnick, Gregory H.; Tran, Kim-Vy; Papovich, Casey; Momcheva, Ivelina; Willmer, Christopher

2012-01-01

We study the red sequence in a cluster of galaxies at z = 1.62 and follow its evolution over the intervening 9.5 Gyr to the present day. Using deep YJK_s imaging with the HAWK-I instrument on the Very Large Telescope, we identify a tight red sequence and construct its rest-frame i-band luminosity function (LF). There is a marked deficit of faint red galaxies in the cluster that causes a turnover in the LF. We compare the red-sequence LF to that for clusters at z 0.6. In this model the cluster accretes blue galaxies from the field whose star formation is quenched and who are subsequently allowed to merge. We find that three to four mergers among cluster galaxies during the 4 Gyr between z = 1.62 and z = 0.6 match the observed LF evolution between the two redshifts. The inferred merger rate is consistent with other studies of this cluster. Our result supports the picture that galaxy merging during the major growth phase of massive clusters is an important process in shaping the red-sequence population at all luminosities.
Relativistic protons in the Coma galaxy cluster: first gamma-ray constraints ever on turbulent reacceleration

Science.gov (United States)

Brunetti, G.; Zimmer, S.; Zandanel, F.

2017-12-01

The Fermi-LAT (Large Area Telescope) collaboration recently published deep upper limits to the gamma-ray emission of the Coma cluster, a cluster hosting the prototype of giant radio haloes. In this paper, we extend previous studies and use a formalism that combines particle reacceleration by turbulence and the generation of secondary particles in the intracluster medium to constrain relativistic protons and their role for the origin of the radio halo. We conclude that a pure hadronic origin of the halo is clearly disfavoured as it would require excessively large magnetic fields. However, secondary particles can still generate the observed radio emission if they are reaccelerated. For the first time the deep gamma-ray limits allow us to derive meaningful constraints if the halo is generated during phases of reacceleration of relativistic protons and their secondaries by cluster-scale turbulence. In this paper, we explore a relevant range of parameter space of reacceleration models of secondaries. Within this parameter space, a fraction of model configurations is already ruled out by current gamma-ray limits, including the cases that assume weak magnetic fields in the cluster core, B ≤ 2-3 μG. Interestingly, we also find that the flux predicted by a large fraction of model configurations assuming magnetic fields consistent with Faraday rotation measures (RMs) is not far from the limits. This suggests that a detection of gamma-rays from the cluster might be possible in the near future, provided that the electrons generating the radio halo are secondaries reaccelerated and the magnetic field in the cluster is consistent with that inferred from RM.
Cluster Matters

DEFF Research Database (Denmark)

Gulati, Mukesh; Lund-Thomsen, Peter; Suresh, Sangeetha

2018-01-01

sell their products successfully in international markets, but there is also an increasingly large consumer base within India. Indeed, Indian industrial clusters have contributed to a substantial part of this growth process, and there are several hundred registered clusters within the country...... of this handbook, which focuses on the role of CSR in MSMEs. Hence we contribute to the literature on CSR in industrial clusters and specifically CSR in Indian industrial clusters by investigating the drivers of CSR in India’s industrial clusters....
Weighted Clustering

DEFF Research Database (Denmark)

Ackerman, Margareta; Ben-David, Shai; Branzei, Simina

2012-01-01

We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both...... the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify...
Probing dark energy models with extreme pairwise velocities of galaxy clusters from the DEUS-FUR simulations

Science.gov (United States)

Bouillot, Vincent R.; Alimi, Jean-Michel; Corasaniti, Pier-Stefano; Rasera, Yann

2015-06-01

Observations of colliding galaxy clusters with high relative velocity probe the tail of the halo pairwise velocity distribution with the potential of providing a powerful test of cosmology. As an example, it has been argued that the discovery of the Bullet Cluster challenges standard Λ cold dark matter (ΛCDM) model predictions. Halo catalogues from N-body simulations have been used to estimate the probability of Bullet-like clusters. However, due to simulation volume effects, previous studies had to rely on a Gaussian extrapolation of the pairwise velocity distribution to high velocities. Here, we perform a detailed analysis using the halo catalogues from the Dark Energy Universe Simulation Full Universe Runs (DEUS-FUR), which enables us to resolve the high-velocity tail of the distribution and study its dependence on the halo mass definition, redshift and cosmology. Building upon these results, we estimate the probability of Bullet-like systems in the framework of Extreme Value Statistics. We show that the tail of extreme pairwise velocities significantly deviates from that of a Gaussian; moreover, it carries an imprint of the underlying cosmology. We find the Bullet Cluster probability to be two orders of magnitude larger than previous estimates, thus easing the tension with the ΛCDM model. Finally, the comparison of the inferred probabilities for the different DEUS-FUR cosmologies suggests that observations of extreme interacting clusters can provide constraints on dark energy models complementary to standard cosmological tests.
Not-so-simple stellar populations in the intermediate-age Large Magellanic Cloud star clusters NGC 1831 and NGC 1868

Energy Technology Data Exchange (ETDEWEB)

Li, Chengyuan; De Grijs, Richard [Kavli Institute for Astronomy and Astrophysics and Department of Astronomy, Peking University, Yi He Yuan Lu 5, Hai Dian District, Beijing 100871 (China); Deng, Licai, E-mail: joshuali@pku.edu.cn, E-mail: grijs@pku.edu.cn [Key Laboratory for Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, 20A Datun Road, Chaoyang District, Beijing 100012 (China)

2014-04-01

Using a combination of high-resolution Hubble Space Telescope/Wide-Field and Planetary Camera-2 observations, we explore the physical properties of the stellar populations in two intermediate-age star clusters, NGC 1831 and NGC 1868, in the Large Magellanic Cloud based on their color-magnitude diagrams. We show that both clusters exhibit extended main-sequence turn-offs. To explain the observations, we consider variations in helium abundance, binarity, age dispersions, and the fast rotation of the clusters' member stars. The observed narrow main sequence excludes significant variations in helium abundance in both clusters. We first establish the clusters' main-sequence binary fractions using the bulk of the clusters' main-sequence stellar populations ≳ 1 mag below their turn-offs. The extent of the turn-off regions in color-magnitude space, corrected for the effects of binarity, implies that age spreads of order 300 Myr may be inferred for both clusters if the stellar distributions in color-magnitude space were entirely due to the presence of multiple populations characterized by an age range. Invoking rapid rotation of the population of cluster members characterized by a single age also allows us to match the observed data in detail. However, when taking into account the extent of the red clump in color-magnitude space, we encounter an apparent conflict for NGC 1831 between the age dispersion derived from the extent of the main-sequence turn-off and that implied by the compact red clump. We therefore conclude that, for this cluster, variations in stellar rotation rate are preferred over an age dispersion. For NGC 1868, both models perform equally well.
Support Policies in Clusters: Prioritization of Support Needs by Cluster Members According to Cluster Life Cycle

Directory of Open Access Journals (Sweden)

Gulcin Salıngan

2012-07-01

Full Text Available Economic development has always been a moving target. Both national and local governments face the challenge of implementing effective and efficient economic policies and programs in order to make the best use of their limited resources. One of the recent approaches in this area is cluster-based economic analysis and strategy development. This study reviews key literature and some of the cluster-based economic policies adopted by different governments. Based on this review, it proposes “the cluster life cycle” as a determining factor for identifying the support requirements of clusters. A survey, designed on the basis of a literature review of international cluster support programs, was conducted with 30 participants from 3 clusters at different maturity stages. This paper discusses the results of this study, conducted among cluster members in the Eskişehir-Bilecik-Kütahya Region in Turkey, on the support required to foster the development of related clusters.
Object-Oriented Type Inference

DEFF Research Database (Denmark)

Schwartzbach, Michael Ignatieff; Palsberg, Jens

1991-01-01

We present a new approach to inferring types in untyped object-oriented programs with inheritance, assignments, and late binding. It guarantees that all messages are understood, annotates the program with type information, allows polymorphic methods, and can be used as the basis of an op...

Bayesian investigation of isochrone consistency using the old open cluster NGC 188

Energy Technology Data Exchange (ETDEWEB)

Hills, Shane; Courteau, Stéphane [Department of Physics, Engineering Physics and Astronomy, Queen’s University, Kingston, ON K7L 3N6 Canada (Canada); Von Hippel, Ted [Department of Physical Sciences, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114 (United States); Geller, Aaron M., E-mail: shane.hills@queensu.ca, E-mail: courteau@astro.queensu.ca, E-mail: ted.vonhippel@erau.edu, E-mail: a-geller@northwestern.edu [Center for Interdisciplinary Exploration and Research in Astrophysics (CIERA) and Department of Physics and Astronomy, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208 (United States)

2015-03-01

This paper provides a detailed comparison of the differences in parameters derived for a star cluster from its color–magnitude diagrams (CMDs) depending on the filters and models used. We examine the consistency and reliability of fitting three widely used stellar evolution models to 15 combinations of optical and near-IR photometry for the old open cluster NGC 188. The optical filter response curves match those of theoretical systems and are thus not the source of fit inconsistencies. NGC 188 is ideally suited to this study thanks to a wide variety of high-quality photometry and available proper motions and radial velocities that enable us to remove non-cluster members and many binaries. Our Bayesian fitting technique yields inferred values of age, metallicity, distance modulus, and absorption as a function of the photometric band combinations and stellar models. We show that the historically favored three-band combinations of UBV and VRI can be meaningfully inconsistent with each other and with longer baseline data sets such as UBVRIJHK_S. Differences among model sets can also be substantial. For instance, fitting Yi et al. (2001) and Dotter et al. (2008) models to UBVRIJHK_S photometry for NGC 188 yields the following cluster parameters: age = (5.78 ± 0.03, 6.45 ± 0.04) Gyr, [Fe/H] = (+0.125 ± 0.003, −0.077 ± 0.003) dex, (m−M)_V = (11.441 ± 0.007, 11.525 ± 0.005) mag, and A_V = (0.162 ± 0.003, 0.236 ± 0.003) mag, respectively. Within the formal fitting errors, these two fits are substantially and statistically different. Such differences among fits using different filters and models are a cautionary tale regarding our current ability to fit star cluster CMDs. Additional modeling of this kind, with more models and star clusters, and future Gaia parallaxes are critical for isolating and quantifying the most relevant uncertainties in stellar evolutionary models.
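The claim that the two fits are "substantially and statistically different" can be checked with simple arithmetic on the quoted values. The sketch below is not taken from the paper: it assumes the formal errors are independent and Gaussian, so the separation between two estimates is their difference divided by the quadrature sum of the errors. The function name `sigma_separation` is a hypothetical helper introduced only for this illustration.

```python
import math


def sigma_separation(a: float, err_a: float, b: float, err_b: float) -> float:
    """Separation of two estimates in units of their combined error.

    Assumption (not from the paper): the two formal errors are independent
    and Gaussian, so they add in quadrature.
    """
    return abs(a - b) / math.hypot(err_a, err_b)


# Values quoted in the abstract: (Yi et al. 2001 fit, Dotter et al. 2008 fit).
fits = {
    "age [Gyr]":    (5.78, 0.03, 6.45, 0.04),
    "[Fe/H] [dex]": (0.125, 0.003, -0.077, 0.003),
    "(m-M)_V [mag]": (11.441, 0.007, 11.525, 0.005),
    "A_V [mag]":    (0.162, 0.003, 0.236, 0.003),
}

for name, (a, err_a, b, err_b) in fits.items():
    print(f"{name}: {sigma_separation(a, err_a, b, err_b):.1f} sigma")
# age [Gyr]: 13.4 sigma
# [Fe/H] [dex]: 47.6 sigma
# (m-M)_V [mag]: 9.8 sigma
# A_V [mag]: 17.4 sigma
```

Under this assumption every parameter differs by roughly ten or more combined standard deviations, which is why the formal fitting errors alone cannot account for the disagreement between the model sets.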
The Probabilistic Convolution Tree: Efficient Exact Bayesian Inference for Faster LC-MS/MS Protein Inference

Science.gov (United States)

Serang, Oliver

2014-01-01

Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called “causal independence”). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to and the space to where is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions. PMID:24626234
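The core idea of combining adder inputs pairwise in a tree can be illustrated with a short sketch. This is not the author's implementation: it covers only the forward pass (the exact distribution of a sum of independent, non-negative integer variables), it uses naive quadratic convolution rather than a fast transform, and the names `convolve` and `sum_distribution` are hypothetical helpers introduced for this example.

```python
from typing import List


def convolve(p: List[float], q: List[float]) -> List[float]:
    """Distribution of X + Y for independent X ~ p and Y ~ q.

    Each list is a probability mass function indexed by value (0, 1, 2, ...).
    """
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def sum_distribution(dists: List[List[float]]) -> List[float]:
    """Exact distribution of the sum, built by pairwise (tree-shaped) merging."""
    if not dists:
        return [1.0]  # the empty sum equals 0 with probability 1
    layer = [list(d) for d in dists]
    while len(layer) > 1:
        # Merge neighbours two at a time; an unpaired last node is carried up.
        merged = [convolve(layer[i], layer[i + 1])
                  for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2 == 1:
            merged.append(layer[-1])
        layer = merged
    return layer[0]


# Three independent fair binary indicators: the sum is Binomial(3, 0.5).
print(sum_distribution([[0.5, 0.5]] * 3))
# [0.125, 0.375, 0.375, 0.125]
```

The tree shape keeps the number of merge levels logarithmic in the number of inputs; the intermediate layers are also what a full inference scheme would reuse when passing information back down to the individual inputs, which this sketch does not attempt.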
Clusters and how to make it work : toolkit for cluster strategy

NARCIS (Netherlands)

Manickam, Anu; van Berkel, Karel

2013-01-01

Clusters are the magic answer to regional economic development. Firms in clusters are more innovative; cluster policy dominates EU policy; ‘top-sectors’ and excellence are the choice of national policy makers; clusters are ‘in’. But, clusters are complex, clusters are ‘messy’; there is no clear
ON THE RELIABILITY OF STELLAR AGES AND AGE SPREADS INFERRED FROM PRE-MAIN-SEQUENCE EVOLUTIONARY MODELS

International Nuclear Information System (INIS)

Hosokawa, Takashi; Offner, Stella S. R.; Krumholz, Mark R.

2011-01-01

We revisit the problem of low-mass pre-main-sequence stellar evolution and its observational consequences for where stars fall on the Hertzsprung-Russell diagram (HRD). In contrast to most previous work, our models follow stars as they grow from small masses via accretion, and we perform a systematic study of how the stars' HRD evolution is influenced by their initial radius, by the radiative properties of the accretion flow, and by the accretion history, using both simple idealized accretion histories and histories taken from numerical simulations of star cluster formation. We compare our numerical results to both non-accreting isochrones and to the positions of observed stars in the HRD, with a goal of determining whether both the absolute ages and the age dispersions inferred from non-accreting isochrones are reliable. We show that non-accreting isochrones can sometimes overestimate stellar ages for more massive stars (those with effective temperatures above ∼3500 K), thereby explaining why non-accreting isochrones often suggest a systematic age difference between more and less massive stars in the same cluster. However, we also find the only way to produce a similar overestimate for the ages of cooler stars is if these stars grow from ∼0.01 M_sun seed protostars that are an order of magnitude smaller than predicted by current theoretical models, and if the size of the seed protostar correlates systematically with the final stellar mass at the end of accretion. We therefore conclude that, unless both of these conditions are met, inferred ages and age spreads for cool stars are reliable, at least to the extent that the observed bolometric luminosities and temperatures are accurate. Finally, we note that the time dependence of the mass accretion rate has remarkably little effect on low-mass stars' evolution on the HRD, and that such time dependence may be neglected for all stars except those with effective temperatures above ∼4000 K.
Reward inference by primate prefrontal and striatal neurons.

Science.gov (United States)

Pan, Xiaochuan; Fan, Hongwei; Sawa, Kosuke; Tsuda, Ichiro; Tsukada, Minoru; Sakagami, Masamichi

2014-01-22

The brain contains multiple yet distinct systems involved in reward prediction. To understand the nature of these processes, we recorded single-unit activity from the lateral prefrontal cortex (LPFC) and the striatum in monkeys performing a reward inference task using an asymmetric reward schedule. We found that neurons both in the LPFC and in the striatum predicted reward values for stimuli that had been previously well experienced with set reward quantities in the asymmetric reward task. Importantly, these LPFC neurons could predict the reward value of a stimulus using transitive inference even when the monkeys had not yet learned the stimulus-reward association directly; whereas these striatal neurons did not show such an ability. Nevertheless, because there were two set amounts of reward (large and small), the selected striatal neurons were able to exclusively infer the reward value (e.g., large) of one novel stimulus from a pair after directly experiencing the alternative stimulus with the other reward value (e.g., small). Our results suggest that although neurons that predict reward value for old stimuli in the LPFC could also do so for new stimuli via transitive inference, those in the striatum could only predict reward for new stimuli via exclusive inference. Moreover, the striatum showed more complex functions than was surmised previously for model-free learning.
Implementation of the Adaptive Neuro-Fuzzy Inference System (ANFIS) for Forecasting Water Consumption at the Regional Drinking Water Company Tirta Moedal, Semarang

Directory of Open Access Journals (Sweden)

Ulfatun Hani'ah

2016-06-01

Full Text Available Water consumption for January 2015 through April 2015 can be forecast using mathematical computation with the aid of computer science. The method used is the Adaptive Neuro-Fuzzy Inference System (ANFIS), implemented with the MATLAB software. To test the program, experiments were run with the number of classes set to 2, a maximum of 100 epochs, an error tolerance of 10^-6, learning rates ranging from 0.6 to 0.9, and momentum values ranging from 0.6 to 0.9. The conclusion is that implementing the Adaptive Neuro-Fuzzy Inference System for water consumption forecasting consists of, in order, designing a flowchart, clustering the data with fuzzy C-means, determining the neurons of each layer, finding the parameter values with recursive LSE, computing the error as the sum of squared errors (SSE), and building the water consumption forecasting system in MATLAB. In the experiments, the smallest SSE, 0.0080107, was obtained with a learning rate of 0.9 and a momentum of 0.6. The forecast water consumption is 3,836,138 m³ for January, 3,595,188 m³ for February, 3,596,416 m³ for March, and 3,776,833 m³ for April.
Bootstrap inference when using multiple imputation.

Science.gov (United States)

Schomaker, Michael; Heumann, Christian

2018-04-16

Many modern estimators require bootstrapping to calculate confidence intervals because either no analytic standard error is available or the distribution of the parameter of interest is nonsymmetric. It remains however unclear how to obtain valid bootstrap inference when dealing with multiple imputation to address missing data. We present 4 methods that are intuitively appealing, easy to implement, and combine bootstrap estimation with multiple imputation. We show that 3 of the 4 approaches yield valid inference, but that the performance of the methods varies with respect to the number of imputed data sets and the extent of missingness. Simulation studies reveal the behavior of our approaches in finite samples. A topical analysis from HIV treatment research, which determines the optimal timing of antiretroviral treatment initiation in young children, demonstrates the practical implications of the 4 methods in a sophisticated and realistic setting. This analysis suffers from missing data and uses the g-formula for inference, a method for which no standard errors are available. Copyright © 2018 John Wiley & Sons, Ltd.
Evolutionary inference via the Poisson Indel Process.

Science.gov (United States)

Bouchard-Côté, Alexandre; Jordan, Michael I

2013-01-22

We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.
RUPRECHT 147: THE OLDEST NEARBY OPEN CLUSTER AS A NEW BENCHMARK FOR STELLAR ASTROPHYSICS

Energy Technology Data Exchange (ETDEWEB)

Curtis, Jason L.; Wright, Jason T. [Department of Astronomy and Astrophysics, The Pennsylvania State University, University Park, PA 16802 (United States); Wolfgang, Angie [Department of Astronomy and Astrophysics, University of California, Santa Cruz, CA 95064 (United States); Brewer, John M. [Department of Astronomy, Yale University, New Haven, CT 06511 (United States); Johnson, John Asher, E-mail: jcurtis@psu.edu [Department of Astrophysics, California Institute of Technology, Pasadena, CA 91125 (United States)

2013-05-15

Ruprecht 147 is a hitherto unappreciated open cluster that holds great promise as a standard in fundamental stellar astrophysics. We have conducted a radial velocity survey of astrometric candidates with Lick, Palomar, and MMT observatories and have identified over 100 members, including 5 blue stragglers, 11 red giants, and 5 double-lined spectroscopic binaries (SB2s). We estimate the cluster metallicity from spectroscopic analysis, using Spectroscopy Made Easy (SME), and find it to be [M/H] = +0.07 ± 0.03. We have obtained deep CFHT/MegaCam g'r'i'z' photometry and fit Padova isochrones to the (g' - i') and Two Micron All Sky Survey (J - K_S) color-magnitude diagrams, using the τ² maximum-likelihood procedure of Naylor, and an alternative method using two-dimensional cross-correlations developed in this work. We find best fits for Padova isochrones at age t = 2.5 ± 0.25 Gyr, m - M = 7.35 ± 0.1, and A_V = 0.25 ± 0.05, with additional uncertainty from the unresolved binary population and possibility of differential extinction across this large cluster. The inferred age is heavily dependent on our choice of stellar evolution model: fitting Dartmouth and PARSEC models yields age parameters of 3 Gyr and 3.25 Gyr, respectively. At ∼300 pc and ∼3 Gyr, Ruprecht 147 is by far the oldest nearby star cluster.
RUPRECHT 147: THE OLDEST NEARBY OPEN CLUSTER AS A NEW BENCHMARK FOR STELLAR ASTROPHYSICS

International Nuclear Information System (INIS)

Curtis, Jason L.; Wright, Jason T.; Wolfgang, Angie; Brewer, John M.; Johnson, John Asher

2013-01-01

Ruprecht 147 is a hitherto unappreciated open cluster that holds great promise as a standard in fundamental stellar astrophysics. We have conducted a radial velocity survey of astrometric candidates with Lick, Palomar, and MMT observatories and have identified over 100 members, including 5 blue stragglers, 11 red giants, and 5 double-lined spectroscopic binaries (SB2s). We estimate the cluster metallicity from spectroscopic analysis, using Spectroscopy Made Easy (SME), and find it to be [M/H] = +0.07 ± 0.03. We have obtained deep CFHT/MegaCam g'r'i'z' photometry and fit Padova isochrones to the (g' – i') and Two Micron All Sky Survey (J – K_S) color-magnitude diagrams, using the τ² maximum-likelihood procedure of Naylor, and an alternative method using two-dimensional cross-correlations developed in this work. We find best fits for Padova isochrones at age t = 2.5 ± 0.25 Gyr, m – M = 7.35 ± 0.1, and A_V = 0.25 ± 0.05, with additional uncertainty from the unresolved binary population and possibility of differential extinction across this large cluster. The inferred age is heavily dependent on our choice of stellar evolution model: fitting Dartmouth and PARSEC models yields age parameters of 3 Gyr and 3.25 Gyr, respectively. At ∼300 pc and ∼3 Gyr, Ruprecht 147 is by far the oldest nearby star cluster.
Genetic variations and haplotype diversity of the UGT1 gene cluster in the Chinese population.

Directory of Open Access Journals (Sweden)

Jing Yang

Full Text Available Vertebrates require tremendous molecular diversity to defend against numerous small hydrophobic chemicals. UDP-glucuronosyltransferases (UGTs) are a large family of detoxification enzymes that glucuronidate xenobiotics and endobiotics, facilitating their excretion from the body. The UGT1 gene cluster contains a tandem array of variable first exons, each preceded by a specific promoter, and a common set of downstream constant exons, similar to the genomic organization of the protocadherin (Pcdh), immunoglobulin, and T-cell receptor gene clusters. To assist pharmacogenomics studies in Chinese, we sequenced nine first exons, promoter and intronic regions, and five common exons of the UGT1 gene cluster in a population sample of 253 unrelated Chinese individuals. We identified 101 polymorphisms and found 15 novel SNPs. We then computed allele frequencies for each polymorphism and reconstructed their linkage disequilibrium (LD) map. The UGT1 cluster can be divided into five linkage blocks: Block 9 (UGT1A9), Block 9/7/6 (UGT1A9, UGT1A7, and UGT1A6), Block 5 (UGT1A5), Block 4/3 (UGT1A4 and UGT1A3), and Block 3' UTR. Furthermore, we inferred haplotypes and selected their tagSNPs. Finally, comparing our data with those of three other populations of the HapMap project revealed ethnic specificity of the UGT1 genetic diversity in Chinese. These findings have important implications for future molecular genetic studies of the UGT1 gene cluster as well as for personalized medical therapies in Chinese.
System Support for Forensic Inference

Science.gov (United States)

Gehani, Ashish; Kirchner, Florent; Shankar, Natarajan

Digital evidence is playing an increasingly important role in prosecuting crimes. The reasons are manifold: financially lucrative targets are now connected online, systems are so complex that vulnerabilities abound and strong digital identities are being adopted, making audit trails more useful. If the discoveries of forensic analysts are to hold up to scrutiny in court, they must meet the standard for scientific evidence. Software systems are currently developed without consideration of this fact. This paper argues for the development of a formal framework for constructing “digital artifacts” that can serve as proxies for physical evidence; a system so imbued would facilitate sound digital forensic inference. A case study involving a filesystem augmentation that provides transparent support for forensic inference is described.
Determination of atomic cluster structure with cluster fusion algorithm

DEFF Research Database (Denmark)

Obolensky, Oleg I.; Solov'yov, Ilia; Solov'yov, Andrey V.

2005-01-01

We report an efficient scheme of global optimization, called cluster fusion algorithm, which has proved its reliability and high efficiency in determination of the structure of various atomic clusters.
Large-Scale Multi-Dimensional Document Clustering on GPU Clusters

Energy Technology Data Exchange (ETDEWEB)

Cui, Xiaohui [ORNL; Mueller, Frank [North Carolina State University; Zhang, Yongpeng [ORNL; Potok, Thomas E [ORNL

2010-01-01

Document clustering plays an important role in data mining systems. Recently, a flocking-based document clustering algorithm has been proposed to solve the problem through simulation resembling the flocking behavior of birds in nature. This method is superior to other clustering algorithms, including k-means, in the sense that the outcome is not sensitive to the initial state. One limitation of this approach is that the algorithmic complexity is inherently quadratic in the number of documents. As a result, execution time becomes a bottleneck with a large number of documents. In this paper, we assess the benefits of exploiting the computational power of Beowulf-like clusters equipped with contemporary Graphics Processing Units (GPUs) as a means to significantly reduce the runtime of flocking-based document clustering. Our framework scales up to over one million documents processed simultaneously in a sixteen-node GPU cluster. Results are also compared to a four-node cluster with higher-end GPUs. On these clusters, we observe 30X-50X speedups, which demonstrates the potential of GPU clusters to efficiently solve massive data mining problems. Such speedups combined with the scalability potential and accelerator-based parallelization are unique in the domain of document-based data mining, to the best of our knowledge.
Membership determination of open clusters based on a spectral clustering method

Science.gov (United States)

Gao, Xin-Hua

2018-06-01

We present a spectral clustering (SC) method aimed at segregating reliable members of open clusters in multi-dimensional space. The SC method is a non-parametric clustering technique that performs cluster division using eigenvectors of the similarity matrix; no prior knowledge of the clusters is required. This method is more flexible in dealing with multi-dimensional data compared to other methods of membership determination. We use this method to segregate the cluster members of five open clusters (Hyades, Coma Ber, Pleiades, Praesepe, and NGC 188) in five-dimensional space; fairly clean cluster members are obtained. We find that the SC method can capture a small number of cluster members (weak signal) from a large number of field stars (heavy noise). Based on these cluster members, we compute the mean proper motions and distances for the Hyades, Coma Ber, Pleiades, and Praesepe clusters, and our results are in general quite consistent with the results derived by other authors. The test results indicate that the SC method is highly suitable for segregating cluster members of open clusters based on high-precision multi-dimensional astrometric data such as Gaia data.
DISCOVERY OF THE LARGEST KNOWN LENSED IMAGES FORMED BY A CRITICALLY CONVERGENT LENSING CLUSTER

International Nuclear Information System (INIS)

Zitrin, Adi; Broadhurst, Tom

2009-01-01

We identify the largest known lensed images of a single spiral galaxy, lying close to the center of the distant cluster MACS J1149.5+2223 (z = 0.544). These images cover a total area of ≅150 arcsec² and are magnified ≅200 times. Unusually, there is very little image distortion, implying that the central mass distribution is almost uniform over a wide area (r ≅ 200 kpc) with a surface density equal to the critical density for lensing, corresponding to maximal lens magnification. Many fainter multiply lensed galaxies are also uncovered by our model, outlining a very large tangential critical curve, of radius r ≅ 170 kpc, posing a potential challenge for the standard LCDM cosmology. Because of the uniform central mass distribution, a particularly clean measurement of the mass of the brightest cluster galaxy is possible here, for which we infer stars contribute most of the mass within a limiting radius of ≅30 kpc, with a mass-to-light ratio of M/L_B ≅ 4.5 (M/L)_sun. This cluster with its uniform and central mass distribution acts analogously to a regular magnifying glass, converging light without distorting the images, resulting in the most powerful lens yet discovered for accessing the faint high-z universe.
Geostatistical inference using crosshole ground-penetrating radar

DEFF Research Database (Denmark)

Looms, Majken C; Hansen, Thomas Mejer; Cordua, Knud Skou

2010-01-01

of the subsurface are used to evaluate the uncertainty of the inversion estimate. We have explored the full potential of the geostatistical inference method using several synthetic models of varying correlation structures and have tested the influence of different assumptions concerning the choice of covariance...... reflection profile. Furthermore, the inferred values of the subsurface global variance and the mean velocity have been corroborated with moisturecontent measurements, obtained gravimetrically from samples collected at the field site....
Bayesian Inference for Functional Dynamics Exploring in fMRI Data

Directory of Open Access Journals (Sweden)

Xuan Guo

2016-01-01

Full Text Available This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
Cluster headache

Science.gov (United States)

Histamine headache; Headache - histamine; Migrainous neuralgia; Headache - cluster; Horton's headache; Vascular headache - cluster ... Doctors do not know exactly what causes cluster headaches. They ... (chemical in the body released during an allergic response) or ...
Clustering redshifts: a new window through the Universe

International Nuclear Information System (INIS)

Scottez, Vivien L.

2015-01-01

The main goals of this thesis are to validate, consolidate and develop a new method to measure the redshift distribution of a sample of galaxies. Where current methods - spectroscopic and photometric redshifts - rely on the study of the spectral energy distribution of extragalactic sources, the approach presented here is based on the clustering properties of galaxies. Indeed, the clustering of galaxies caused by gravity gives them a particular spatial - and angular - distribution. In this clustering redshift approach, we use this property between a galaxy sample of unknown redshifts and a reference galaxy sample to reconstruct the redshift distribution of the unknown population. Thus, possible systematics in this approach should be independent of those existing in other methods. This new method responds to a real need from the scientific community in the context of large imaging dark energy experiments such as the Euclid mission of the European Space Agency (ESA). After introducing the general scientific context and highlighting the crucial role of distance measurements in astronomy, I present the statistical tools generally used to study the large scale structure of the Universe as well as their modification to infer redshift distributions. After validating this approach on a particular type of extragalactic object, I generalized its application to all types of galaxies. Then, I explored the precision and some systematic effects by conducting an ideal case study, followed by a real case study. I also pushed this analysis further and found that the reference sample used in the measurement does not need to have the same limiting magnitude as the population of unknown redshift. This property is a great advantage for the use of this approach in the context of large imaging dark energy experiments like the Euclid space mission. Finally, I summarize my main results and present some of my future projects. (author)

Coupled cluster calculations for static and dynamic polarizabilities of C60

Science.gov (United States)

Kowalski, Karol; Hammond, Jeff R.; de Jong, Wibe A.; Sadlej, Andrzej J.

2008-12-01

New theoretical predictions for the static and frequency dependent polarizabilities of C60 are reported. Using the linear response coupled cluster approach with singles and doubles and a basis set especially designed to treat the molecular properties in external electric field, we obtained 82.20 and 83.62 Å³ for static and dynamic (λ = 1064 nm) polarizabilities. These numbers are in good agreement with experimentally inferred data of 76.5±8 and 79±4 Å³ [R. Antoine et al., J. Chem. Phys. 110, 9771 (1999); A. Ballard et al., J. Chem. Phys. 113, 5732 (2000)]. The reported results were obtained with the highest wave function-based level of theory ever applied to the C60 system.
Deep multi-frequency rotation measure tomography of the galaxy cluster A2255

Science.gov (United States)

Pizzo, R. F.; de Bruyn, A. G.; Bernardi, G.; Brentjens, M. A.

2011-01-01

Aims: By studying the polarimetric properties of the radio galaxies and the radio filaments belonging to the galaxy cluster Abell 2255, we aim to unveil their 3-dimensional location within the cluster. Methods: We performed WSRT observations of A2255 at 18, 21, 25, 85, and 200 cm. The polarization images of the cluster were processed through rotation measure (RM) synthesis, producing three final RM cubes. Results: The radio galaxies and the filaments at the edges of the halo are detected in the high-frequency RM cube, obtained by combining the data at 18, 21, and 25 cm. Their Faraday spectra show different levels of complexity. The radio galaxies lying near the cluster center have Faraday spectra with multiple peaks, while those at large distances show only one peak, as do the filaments. Similar RM distributions are observed for the external radio galaxies and for the filaments, with much lower average RM values and RM variance than those found in previous works for the central radio galaxies. The 85 cm RM cube is dominated by the Galactic foreground emission, but it also shows features associated with the cluster. At 2 m, no polarized emission from either A2255 or our Galaxy is detected. Conclusions: The radial trend observed in the RM distributions of the radio galaxies and in the complexity of their Faraday spectra favors the interpretation that the external Faraday screen for all the sources in A2255 is the ICM. Its differential contribution depends on the amount of medium that the radio signal crosses along the line of sight. The filaments should therefore be located at the periphery of the cluster, and their apparent central location comes from projection effects. Their high fractional polarization and morphology suggest that they are relics rather than part of a genuine radio halo. Their inferred large distance from the cluster center and their geometry could argue for an association with large-scale structure (LSS) shocks. The RM cubes in gif format are only
Working memory supports inference learning just like classification learning.

Science.gov (United States)

Craig, Stewart; Lewandowsky, Stephan

2013-08-01

Recent research has found a positive relationship between people's working memory capacity (WMC) and their speed of category learning. To date, only classification-learning tasks have been considered, in which people learn to assign category labels to objects. It is unknown whether learning to make inferences about category features might also be related to WMC. We report data from a study in which 119 participants undertook classification learning and inference learning, and completed a series of WMC tasks. Working memory capacity was positively related to people's classification and inference learning performance.
Single-cluster dynamics for the random-cluster model

NARCIS (Netherlands)

Deng, Y.; Qian, X.; Blöte, H.W.J.

2009-01-01

We formulate a single-cluster Monte Carlo algorithm for the simulation of the random-cluster model. This algorithm is a generalization of the Wolff single-cluster method for the q-state Potts model to noninteger values q>1. Its results for static quantities are in a satisfactory agreement with those
Statistical inference for stochastic processes

National Research Council Canada - National Science Library

Basawa, Ishwar V; Prakasa Rao, B. L. S

1980-01-01

The aim of this monograph is to attempt to reduce the gap between theory and applications in the area of stochastic modelling, by directing the interest of future researchers to the inference aspects...
Inference of Large Phylogenies Using Neighbour-Joining

DEFF Research Database (Denmark)

Simonsen, Martin; Mailund, Thomas; Pedersen, Christian Nørgaard Storm

2011-01-01

The neighbour-joining method is a widely used method for phylogenetic reconstruction which scales to thousands of taxa. However, advances in sequencing technology have made data sets with more than 10,000 related taxa widely available. Inference of such large phylogenies takes hours or days using...... the Neighbour-Joining method on a normal desktop computer because of the O(n^3) running time. RapidNJ is a search heuristic which reduces the running time of the Neighbour-Joining method significantly but at the cost of an increased memory consumption, making inference of large phylogenies infeasible. We present...... two extensions for RapidNJ which reduce the memory requirements and allow phylogenies with more than 50,000 taxa to be inferred efficiently on a desktop computer. Furthermore, an improved version of the search heuristic is presented which reduces the running time of RapidNJ on many data...
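To make the O(n^3) running time mentioned above concrete, here is a minimal sketch of the canonical neighbour-joining loop. It is not RapidNJ or either of the extensions described in the record; the function name `neighbour_joining` and the node labels are illustrative assumptions. Each of the n - 2 iterations scans a full n x n criterion matrix, which is where the cubic cost comes from.

```python
import numpy as np

def neighbour_joining(dist, names):
    """Canonical neighbour-joining; returns a list of (child_a, child_b, parent) joins.

    Illustrative sketch only: branch lengths are not computed.
    """
    d = np.array(dist, dtype=float)
    labels = list(names)
    joins = []
    next_id = 0
    while len(labels) > 2:
        n = len(labels)
        row_sums = d.sum(axis=1)
        # Q(i, j) = (n - 2) * d(i, j) - r_i - r_j; building and scanning it is O(n^2)
        q = (n - 2) * d - row_sums[:, None] - row_sums[None, :]
        np.fill_diagonal(q, np.inf)
        i, j = np.unravel_index(np.argmin(q), q.shape)
        # Distance from the new internal node to every remaining taxon
        new_row = 0.5 * (d[i] + d[j] - d[i, j])
        parent = f"node{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        joins.append((labels[i], labels[j], parent))
        keep = [k for k in range(n) if k not in (i, j)]
        d_new = np.zeros((n - 1, n - 1))
        d_new[:-1, :-1] = d[np.ix_(keep, keep)]
        d_new[-1, :-1] = new_row[keep]
        d_new[:-1, -1] = new_row[keep]
        d = d_new
        labels = [labels[k] for k in keep] + [parent]
    joins.append((labels[0], labels[1], "root"))
    return joins

# Small made-up distance matrix for five taxa
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
joins = neighbour_joining(example, ["a", "b", "c", "d", "e"])
```

The full distance matrix is held in memory here, which is also why memory use becomes a concern for very large numbers of taxa.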
Inference of the oxidative stress network in Anopheles stephensi upon Plasmodium infection.

Science.gov (United States)

Shrinet, Jatin; Nandal, Umesh Kumar; Adak, Tridibes; Bhatnagar, Raj K; Sunil, Sujatha

2014-01-01

Ookinete invasion of the Anopheles midgut is a critical step for malaria transmission; the parasite numbers drop drastically and practically reach a minimum during the parasite's whole life cycle. At this stage, the parasite as well as the vector undergoes immense oxidative stress. Thereafter, the vector undergoes oxidative stress at different time points as the parasite invades its tissues during the parasite development. The present study was undertaken to reconstruct the network of differentially expressed genes involved in oxidative stress in Anopheles stephensi during Plasmodium development and maturation in the midgut. Using high throughput next generation sequencing methods, we generated the transcriptome of the An. stephensi midgut during Plasmodium vinckei petteri oocyst invasion of the midgut epithelium. Further, we utilized large datasets available in the public domain on Anopheles during Plasmodium ookinete invasion and Drosophila datasets and arrived at clusters of genes that may play a role in oxidative stress. Finally, we used support vector machines for the functional prediction of the un-annotated genes of An. stephensi. Integrating the results from all the different data analyses, we identified a total of 516 genes that were involved in oxidative stress in An. stephensi during Plasmodium development. The significantly regulated genes were further extracted from this gene cluster and used to infer an oxidative stress network of An. stephensi. Using systems biology approaches, we have been able to ascertain the role of several putative genes in An. stephensi with respect to oxidative stress. Further experimental validations of these genes are underway.
Statistical causal inferences and their applications in public health research

CERN Document Server

Wu, Pan; Chen, Ding-Geng

2016-01-01

This book compiles and presents new developments in statistical causal inference. The accompanying data and computer programs are publicly available so readers may replicate the model development and data analysis presented in each chapter. In this way, methodology is taught so that readers may implement it directly. The book brings together experts engaged in causal inference research to present and discuss recent issues in causal inference methodological development. This is also a timely look at causal inference applied to scenarios that range from clinical trials to mediation and public health research more broadly. In an academic setting, this book will serve as a reference and guide to a course in causal inference at the graduate level (Master's or Doctorate). It is particularly relevant for students pursuing degrees in Statistics, Biostatistics and Computational Biology. Researchers and data analysts in public health and biomedical research will also find this book to be an important reference.
clusterMaker: a multi-algorithm clustering plugin for Cytoscape

Directory of Open Access Journals (Sweden)

Morris John H

2011-11-01

Full Text Available Abstract Background In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. Conclusions The Cytoscape plugin cluster
The anatomy of choice: active inference and agency

Directory of Open Access Journals (Sweden)

Karl Friston

2013-09-01

Full Text Available This paper considers agency in the setting of embodied or active inference. In brief, we associate a sense of agency with prior beliefs about action and ask what sorts of beliefs underlie optimal behaviour. In particular, we consider prior beliefs that action minimises the Kullback-Leibler divergence between desired states and attainable states in the future. This allows one to formulate bounded rationality as approximate Bayesian inference that optimises a free energy bound on model evidence. We show that constructs like expected utility, exploration bonuses, softmax choice rules and optimism bias emerge as natural consequences of this formulation. Previous accounts of active inference have focused on predictive coding and Bayesian filtering schemes for minimising free energy. Here, we consider variational Bayes as an alternative scheme that provides formal constraints on the computational anatomy of inference and action – constraints that are remarkably consistent with neuroanatomy. Furthermore, this scheme contextualises optimal decision theory and economic (utilitarian) formulations as pure inference problems. For example, expected utility theory emerges as a special case of free energy minimisation, where the sensitivity or inverse temperature (of softmax functions and quantal response equilibria) has a unique and Bayes-optimal solution – that minimises free energy. This sensitivity corresponds to the precision of beliefs about behaviour, such that attainable goals are afforded a higher precision or confidence. In turn, this means that optimal behaviour entails a representation of confidence about outcomes that are under an agent's control.
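The two quantities named in this abstract, a Kullback-Leibler divergence between attainable and desired states and a softmax choice rule with an inverse temperature (precision), can be illustrated with a small numerical sketch. It is not the paper's variational scheme: the state distributions, the policy names, and the choice to score each policy by its negative KL divergence are assumptions made purely for illustration.

```python
import numpy as np

def kl_divergence(p, q):
    """KL[p || q] for discrete distributions; terms with p == 0 contribute nothing."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def softmax_choice(values, precision):
    """Softmax over policy values; `precision` plays the role of inverse temperature."""
    z = precision * np.asarray(values, dtype=float)
    z -= z.max()  # subtract the maximum for numerical stability
    w = np.exp(z)
    return w / w.sum()

# Hypothetical distributions over three future states
desired = np.array([0.1, 0.1, 0.8])
attainable = {
    "policy_a": np.array([0.2, 0.2, 0.6]),
    "policy_b": np.array([0.6, 0.3, 0.1]),
}

# Assumed scoring: a policy is better the closer its attainable states are to the desired ones
values = [-kl_divergence(p, desired) for p in attainable.values()]
probs = softmax_choice(values, precision=4.0)
sharper = softmax_choice(values, precision=8.0)
```

Raising `precision` concentrates probability on the policy with the smaller divergence, which mirrors the abstract's link between sensitivity and confidence.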
The anatomy of choice: active inference and agency.

Science.gov (United States)

Friston, Karl; Schwartenbeck, Philipp; Fitzgerald, Thomas; Moutoussis, Michael; Behrens, Timothy; Dolan, Raymond J

2013-01-01

This paper considers agency in the setting of embodied or active inference. In brief, we associate a sense of agency with prior beliefs about action and ask what sorts of beliefs underlie optimal behavior. In particular, we consider prior beliefs that action minimizes the Kullback-Leibler (KL) divergence between desired states and attainable states in the future. This allows one to formulate bounded rationality as approximate Bayesian inference that optimizes a free energy bound on model evidence. We show that constructs like expected utility, exploration bonuses, softmax choice rules and optimism bias emerge as natural consequences of this formulation. Previous accounts of active inference have focused on predictive coding and Bayesian filtering schemes for minimizing free energy. Here, we consider variational Bayes as an alternative scheme that provides formal constraints on the computational anatomy of inference and action, constraints that are remarkably consistent with neuroanatomy. Furthermore, this scheme contextualizes optimal decision theory and economic (utilitarian) formulations as pure inference problems. For example, expected utility theory emerges as a special case of free energy minimization, where the sensitivity or inverse temperature (of softmax functions and quantal response equilibria) has a unique and Bayes-optimal solution that minimizes free energy. This sensitivity corresponds to the precision of beliefs about behavior, such that attainable goals are afforded a higher precision or confidence. In turn, this means that optimal behavior entails a representation of confidence about outcomes that are under an agent's control.
Universal Darwinism As a Process of Bayesian Inference.

Science.gov (United States)

Campbell, John O

2016-01-01

Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
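The equivalence claimed in the first sentence of this abstract can be checked numerically in its simplest form. The sketch below assumes the discrete replicator equation as the selection framework and treats type frequencies as a prior and fitnesses as likelihoods; the function names and numbers are illustrative and do not come from the article.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """Posterior over hypotheses: prior times likelihood, renormalized."""
    posterior = prior * likelihood
    return posterior / posterior.sum()

def replicator_step(frequencies, fitness):
    """One generation of the discrete replicator equation.

    New frequency of type i = old frequency * fitness_i / mean fitness.
    """
    mean_fitness = np.dot(frequencies, fitness)
    return frequencies * fitness / mean_fitness

# Hypothetical population of three types
frequencies = np.array([0.5, 0.3, 0.2])
fitness = np.array([1.0, 1.5, 0.8])

after_selection = replicator_step(frequencies, fitness)
after_update = bayes_update(frequencies, fitness)
```

Mean fitness takes the place of the normalizing constant (the evidence) in Bayes' theorem, so the two functions return the same vector.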
sick: The Spectroscopic Inference Crank

Science.gov (United States)

Casey, Andrew R.

2016-03-01

There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
SICK: THE SPECTROSCOPIC INFERENCE CRANK

Energy Technology Data Exchange (ETDEWEB)

Casey, Andrew R., E-mail: arc@ast.cam.ac.uk [Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge, CB3 0HA (United Kingdom)

2016-03-15

There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
SICK: THE SPECTROSCOPIC INFERENCE CRANK

International Nuclear Information System (INIS)

Casey, Andrew R.

2016-01-01

There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
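The idea of treating outlier pixels with a Gaussian mixture model, mentioned in the records above, can be sketched as a two-component per-pixel likelihood. This is a generic formulation and not sick's actual implementation or API: the function names, the parameterization of the outlier component, and the toy spectrum are all assumptions.

```python
import numpy as np

def log_normal(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def mixture_log_likelihood(flux, model_flux, var,
                           outlier_frac, outlier_mean, outlier_var):
    """Log-likelihood in which each pixel is either well modeled or an outlier.

    Assumed form: with probability (1 - outlier_frac) a pixel follows the model
    with its reported variance; otherwise it is drawn from a broad Gaussian.
    """
    log_inlier = np.log1p(-outlier_frac) + log_normal(flux, model_flux, var)
    log_outlier = np.log(outlier_frac) + log_normal(
        flux, outlier_mean, var + outlier_var)
    return float(np.sum(np.logaddexp(log_inlier, log_outlier)))

# Toy normalized spectrum with one cosmic-ray-like spike in the last pixel
flux = np.array([1.00, 1.02, 0.98, 5.00])
model_flux = np.ones_like(flux)
var = np.full_like(flux, 0.01)

pure = float(np.sum(log_normal(flux, model_flux, var)))
mixed = mixture_log_likelihood(flux, model_flux, var,
                               outlier_frac=0.05, outlier_mean=1.0,
                               outlier_var=25.0)
```

With a single Gaussian the spike dominates the total likelihood, whereas the mixture assigns it to the broad component and lets the remaining pixels constrain the model.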
Relevant Subspace Clustering

DEFF Research Database (Denmark)

Müller, Emmanuel; Assent, Ira; Günnemann, Stephan

2009-01-01

Subspace clustering aims at detecting clusters in any subspace projection of a high dimensional space. As the number of possible subspace projections is exponential in the number of dimensions, the result is often tremendously large. Recent approaches fail to reduce results to relevant subspace...... clusters. Their results are typically highly redundant, i.e. many clusters are detected multiple times in several projections. In this work, we propose a novel model for relevant subspace clustering (RESCU). We present a global optimization which detects the most interesting non-redundant subspace clusters...... achieves top clustering quality while competing approaches show greatly varying performance....
On principles of inductive inference

OpenAIRE

Kostecki, Ryszard Paweł

2011-01-01

We propose an intersubjective epistemic approach to foundations of probability theory and statistical inference, based on relative entropy and category theory, and aimed to bypass the mathematical and conceptual problems of existing foundational approaches.
Horticultural cluster

OpenAIRE

SHERSTIUK S.V.; POSYLAYEVA K.I.

2013-01-01

The article presents theoretical and methodological approaches to the nature and existence of the cluster, describes how the cluster differs from other kinds of cooperative and integration associations, and develops scientific and practical recommendations for forming a competitive horticultural cluster.
Model averaging, optimal inference and habit formation

Directory of Open Access Journals (Sweden)

Thomas H B FitzGerald

2014-06-01

Full Text Available Postulating that the brain performs approximate Bayesian inference generates principled and empirically testable models of neuronal function – the subject of much current interest in neuroscience and related disciplines. Current formulations address inference and learning under some assumed and particular model. In reality, organisms are often faced with an additional challenge – that of determining which model or models of their environment are the best for guiding behaviour. Bayesian model averaging – which says that an agent should weight the predictions of different models according to their evidence – provides a principled way to solve this problem. Importantly, because model evidence is determined by both the accuracy and complexity of the model, optimal inference requires that these be traded off against one another. This means an agent’s behaviour should show an equivalent balance. We hypothesise that Bayesian model averaging plays an important role in cognition, given that it is both optimal and realisable within a plausible neuronal architecture. We outline model averaging and how it might be implemented, and then explore a number of implications for brain and behaviour. In particular, we propose that model averaging can explain a number of apparently suboptimal phenomena within the framework of approximate (bounded) Bayesian inference, focussing particularly upon the relationship between goal-directed and habitual behaviour.
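The weighting rule stated in this abstract, that predictions of different models are weighted according to their evidence, can be written down in a few lines. The sketch below is a generic illustration rather than the neuronal implementation the paper discusses; the log-evidence values, the predictions, and the function names are assumptions.

```python
import numpy as np

def model_posterior(log_evidence, prior=None):
    """Posterior model probabilities from log evidences (uniform prior by default)."""
    log_evidence = np.asarray(log_evidence, dtype=float)
    if prior is None:
        prior = np.full(log_evidence.shape, 1.0 / log_evidence.size)
    log_post = log_evidence + np.log(prior)
    log_post -= log_post.max()  # stabilize before exponentiating
    weights = np.exp(log_post)
    return weights / weights.sum()

def averaged_prediction(predictions, weights):
    """Bayesian model average: evidence-weighted sum of per-model predictions."""
    return float(np.dot(weights, predictions))

# Two hypothetical models: the first has the higher evidence
weights = model_posterior([-10.0, -12.0])
prediction = averaged_prediction([1.0, 3.0], weights)
```

Because log evidence rewards accuracy and penalizes complexity, a simpler model can receive the larger weight even when a more complex one fits slightly better.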
Voting-based consensus clustering for combining multiple clusterings of chemical structures

Directory of Open Access Journals (Sweden)

Saeed Faisal

2012-12-01

Full Text Available Abstract Background Although many consensus clustering methods have been successfully used for combining multiple classifiers in many areas such as machine learning, applied statistics, pattern recognition and bioinformatics, few consensus clustering methods have been applied for combining multiple clusterings of chemical structures. It is known that no individual clustering method will always give the best results for all types of applications. So, in this paper, three voting and graph-based consensus clusterings were used for combining multiple clusterings of chemical structures to enhance the ability of separating biologically active molecules from inactive ones in each cluster. Results The cumulative voting-based aggregation algorithm (CVAA), cluster-based similarity partitioning algorithm (CSPA) and hyper-graph partitioning algorithm (HGPA) were examined. The F-measure and Quality Partition Index (QPI) method were used to evaluate the clusterings and the results were compared to Ward’s clustering method. The MDL Drug Data Report (MDDR) dataset was used for experiments and was represented by two 2D fingerprints, ALOGP and ECFP_4. The voting-based consensus clustering method outperformed Ward’s method using the F-measure and QPI method for both ALOGP and ECFP_4 fingerprints, while the graph-based consensus clustering methods outperformed Ward’s method only for ALOGP using QPI. The Jaccard and Euclidean distance measures were the methods of choice to generate the ensembles, which give the highest values for both criteria. Conclusions The results of the experiments show that consensus clustering methods can improve the effectiveness of chemical structures clusterings. The cumulative voting-based aggregation algorithm (CVAA) was the method of choice among consensus clustering methods.
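Graph-based consensus methods such as CSPA start from a pairwise similarity between objects derived from the ensemble. A minimal sketch of that first step, the co-association matrix, is given below; the graph partitioning that CSPA applies afterwards is omitted, and the function name and the toy labelings are illustrative assumptions rather than the paper's code or data.

```python
import numpy as np

def co_association(labelings):
    """Fraction of clusterings in which each pair of objects shares a cluster.

    `labelings` has one row per clustering and one column per object.
    """
    labelings = np.asarray(labelings)
    n_clusterings, n_objects = labelings.shape
    similarity = np.zeros((n_objects, n_objects))
    for labels in labelings:
        # Boolean matrix: True where objects i and j carry the same label
        similarity += labels[:, None] == labels[None, :]
    return similarity / n_clusterings

# Three hypothetical clusterings of four molecules (label values are arbitrary)
ensemble = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]
similarity = co_association(ensemble)
```

The matrix is invariant to how each clustering names its clusters, which is why the first two rows of `ensemble` contribute identically.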

Bootstrapping phylogenies inferred from rearrangement data

Directory of Open Access Journals (Sweden)

Lin Yu

2012-08-01

Full Text Available Abstract Background Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. Results We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Conclusions Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its
Bootstrapping phylogenies inferred from rearrangement data.

Science.gov (United States)

Lin, Yu; Rajan, Vaibhav; Moret, Bernard Me

2012-08-29

Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its support values follow a similar scale and its receiver
OBSERVED SCALING RELATIONS FOR STRONG LENSING CLUSTERS: CONSEQUENCES FOR COSMOLOGY AND CLUSTER ASSEMBLY

International Nuclear Information System (INIS)

Comerford, Julia M.; Moustakas, Leonidas A.; Natarajan, Priyamvada

2010-01-01

Scaling relations of observed galaxy cluster properties are useful tools for constraining cosmological parameters as well as cluster formation histories. One of the key cosmological parameters, $\sigma_8$, is constrained using observed clusters of galaxies, although current estimates of $\sigma_8$ from the scaling relations of dynamically relaxed galaxy clusters are limited by the large scatter in the observed cluster mass-temperature (M-T) relation. With a sample of eight strong lensing clusters at 0.3 8 , but combining the cluster concentration-mass relation with the M-T relation enables the inclusion of unrelaxed clusters as well. Thus, the resultant gains in the accuracy of $\sigma_8$ measurements from clusters are twofold: the errors on $\sigma_8$ are reduced and the cluster sample size is increased. Therefore, the statistics on $\sigma_8$ determination from clusters are greatly improved by the inclusion of unrelaxed clusters. Exploring cluster scaling relations further, we find that the correlation between brightest cluster galaxy (BCG) luminosity and cluster mass offers insight into the assembly histories of clusters. We find preliminary evidence for a steeper BCG luminosity-cluster mass relation for strong lensing clusters than the general cluster population, hinting that strong lensing clusters may have had more active merging histories.
Classification versus inference learning contrasted with real-world categories.

Science.gov (United States)

Jones, Erin L; Ross, Brian H

2011-07-01

Categories are learned and used in a variety of ways, but the research focus has been on classification learning. Recent work contrasting classification with inference learning of categories found important later differences in category performance. However, theoretical accounts differ on whether this is due to an inherent difference between the tasks or to the implementation decisions. The inherent-difference explanation argues that inference learners focus on the internal structure of the categories--what each category is like--while classification learners focus on diagnostic information to predict category membership. In two experiments, using real-world categories and controlling for earlier methodological differences, inference learners learned more about what each category was like than did classification learners, as evidenced by higher performance on a novel classification test. These results suggest that there is an inherent difference between learning new categories by classifying an item versus inferring a feature.
Statistical inference via fiducial methods

OpenAIRE

Salomé, Diemer

1998-01-01

In this thesis the attention is restricted to inductive reasoning using a mathematical probability model. A statistical procedure prescribes, for every theoretically possible set of data, the inference about the unknown of interest. ... See: Summary
Cluster Headache

OpenAIRE

Pearce, Iris

1985-01-01

Cluster headache is the most severe primary headache with recurrent pain attacks described as worse than giving birth. The aim of this paper was to make an overview of current knowledge on cluster headache with a focus on pathophysiology and treatment. This paper presents hypotheses of cluster headache pathophysiology, current treatment options and possible future therapy approaches. For years, the hypothalamus was regarded as the key structure in cluster headache, but is now thought to be pa...
Information-Theoretic Inference of Large Transcriptional Regulatory Networks

Directory of Open Access Journals (Sweden)

Meyer Patrick

2007-01-01

Full Text Available The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR), an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting among the least redundant variables the ones that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousands of genes) network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
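The MRMR selection step described in this abstract can be illustrated with a minimal sketch. It assumes the common greedy form of the criterion (relevance to the target minus mean redundancy with the variables already selected), which the abstract does not spell out; the function name `mrmr_rank`, the gene labels and the mutual-information values are all hypothetical.

```python
from typing import Dict, List, Tuple


def mrmr_rank(
    target: str,
    variables: List[str],
    mi: Dict[Tuple[str, str], float],
) -> List[Tuple[str, float]]:
    """Greedy MRMR ranking (assumed form): at each step pick the candidate
    maximising I(candidate; target) minus its mean mutual information with
    the variables selected so far."""

    def info(a: str, b: str) -> float:
        # Mutual information is symmetric, so accept either key order.
        return mi[(a, b)] if (a, b) in mi else mi[(b, a)]

    candidates = [v for v in variables if v != target]
    selected: List[str] = []
    ranked: List[Tuple[str, float]] = []
    while candidates:
        best, best_score = candidates[0], float("-inf")
        for cand in candidates:
            relevance = info(cand, target)
            redundancy = (
                sum(info(cand, s) for s in selected) / len(selected)
                if selected
                else 0.0
            )
            score = relevance - redundancy
            if score > best_score:
                best, best_score = cand, score
        selected.append(best)
        candidates.remove(best)
        ranked.append((best, best_score))
    return ranked


# Hypothetical pairwise mutual information for three genes and a target Y.
example_mi = {
    ("A", "Y"): 0.9, ("B", "Y"): 0.8, ("C", "Y"): 0.5,
    ("A", "B"): 0.7, ("A", "C"): 0.1, ("B", "C"): 0.2,
}
ranking = mrmr_rank("Y", ["A", "B", "C", "Y"], example_mi)
print(ranking)
```

In this toy input, gene B is highly relevant but also highly redundant with A, so the less relevant but more independent gene C is ranked ahead of it. A network-level method in the spirit of the abstract would repeat such a ranking with each gene in turn as the target; how the per-target scores are combined into edges is not described here and is left out of the sketch.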
Information-Theoretic Inference of Large Transcriptional Regulatory Networks

Directory of Open Access Journals (Sweden)

Patrick E. Meyer

2007-06-01

Full Text Available The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR), an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting among the least redundant variables the ones that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousands of genes) network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
IMAGINE: Interstellar MAGnetic field INference Engine

Science.gov (United States)

Steininger, Theo

2018-03-01

IMAGINE (Interstellar MAGnetic field INference Engine) performs inference on generic parametric models of the Galaxy. The modular open source framework uses highly optimized tools and technology such as the MultiNest sampler (ascl:1109.006) and the information field theory framework NIFTy (ascl:1302.013) to create an instance of the Milky Way based on a set of parameters for physical observables, using Bayesian statistics to judge the mismatch between measured data and model prediction. The flexibility of the IMAGINE framework allows for simple refitting for newly available data sets and makes state-of-the-art Bayesian methods easily accessible particularly for random components of the Galactic magnetic field.
Properties of an ionised-cluster beam from a vaporised-cluster ion source

International Nuclear Information System (INIS)

Takagi, T.; Yamada, I.; Sasaki, A.

1978-01-01

A new type of ion source, the vaporised-metal cluster ion source, has been developed for deposition and epitaxy. A cluster consisting of $10^{2}$ to $10^{3}$ atoms coupled loosely together is formed by adiabatic expansion, ejecting the vapour of materials into a high-vacuum region through the nozzle of a heated crucible. The clusters are ionised by electron bombardment and accelerated with neutral clusters toward a substrate. In this paper, mechanisms of cluster formation, experimental results of the cluster size (atoms/cluster) and its distribution, and characteristics of the cluster ion beams are reported. The size is calculated from the kinetic equation $E = \frac{1}{2} m N V_{ej}^{2}$, where $E$ is the cluster beam energy, $V_{ej}$ is the ejection velocity, $m$ is the mass of an atom and $N$ is the cluster size. The energy and the velocity of the cluster are measured by an electrostatic 127° energy analyser and a rotating disc system, respectively. The cluster size obtained for Ag is about $5 \times 10^{2}$ to $2 \times 10^{3}$ atoms. The retarding potential method is used to confirm the results for Ag. The same dependence on cluster size for metals such as Ag, Cu and Pb has been obtained in previous experiments. In the cluster state the cluster ion beam is easily produced by electron bombardment. About 50% of ionised clusters are obtained under typical operation conditions, because of the large ionisation cross sections of the clusters. To obtain a uniform spatial distribution, the ionising electrode system is also discussed. The new techniques are termed ionised-cluster beam deposition (ICBD) and epitaxy (ICBE). (author)
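The kinetic relation quoted in this abstract can be inverted for the cluster size, N = 2E / (m V^2). The sketch below shows that arithmetic only; the function name and the input values (an ejection velocity of 1000 m/s and the matching beam energy) are hypothetical and are not taken from the paper.

```python
AMU_KG = 1.66053906660e-27  # atomic mass unit in kilograms
EV_J = 1.602176634e-19      # one electronvolt in joules


def cluster_size(beam_energy_ev: float, ejection_velocity_m_s: float,
                 atomic_mass_u: float) -> float:
    """Solve E = (1/2) * m * N * V^2 for N, the number of atoms per cluster."""
    energy_j = beam_energy_ev * EV_J
    atom_mass_kg = atomic_mass_u * AMU_KG
    return 2.0 * energy_j / (atom_mass_kg * ejection_velocity_m_s ** 2)


# Hypothetical silver cluster (standard atomic weight of Ag is about 107.87 u):
# build the energy a 1000-atom cluster would carry at 1000 m/s, then recover N.
AG_MASS_U = 107.8682
velocity = 1000.0
energy_ev = 0.5 * AG_MASS_U * AMU_KG * 1000 * velocity ** 2 / EV_J
print(cluster_size(energy_ev, velocity, AG_MASS_U))
```

Because N scales with the inverse square of the velocity, a small error in the measured ejection velocity has a proportionally larger effect on the inferred cluster size than the same relative error in the energy.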
Inferring epidemic network topology from surveillance data.

Directory of Open Access Journals (Sweden)

Xiang Wan

Full Text Available The transmission of infectious diseases can be affected by many factors, some of them hidden, making it difficult to accurately predict when and where outbreaks may emerge. One current approach is to develop and deploy surveillance systems in an effort to detect outbreaks as early as possible. This enables policy makers to modify and implement strategies for the control of the transmission. The accumulated surveillance data, including temporal, spatial, clinical, and demographic information, can provide valuable information with which to infer the underlying epidemic networks. Such networks can be quite informative and insightful as they characterize how infectious diseases transmit from one location to another. The aim of this work is to develop a computational model that allows inferences to be made regarding epidemic network topology in heterogeneous populations. We apply our model to the surveillance data from the 2009 H1N1 pandemic in Hong Kong. The inferred epidemic network displays a significant effect on the propagation of infectious diseases.
A Learning Algorithm for Multimodal Grammar Inference.

Science.gov (United States)

D'Ulizia, A; Ferri, F; Grifoni, P

2011-12-01

The high costs of development and maintenance of multimodal grammars in integrating and understanding input in multimodal interfaces lead to the investigation of novel algorithmic solutions in automating grammar generation and in updating processes. Many algorithms for context-free grammar inference have been developed in the natural language processing literature. An extension of these algorithms toward the inference of multimodal grammars is necessary for multimodal input processing. In this paper, we propose a novel grammar inference mechanism that allows us to learn a multimodal grammar from its positive samples of multimodal sentences. The algorithm first generates the multimodal grammar that is able to parse the positive samples of sentences and, afterward, makes use of two learning operators and the minimum description length metrics in improving the grammar description and in avoiding the over-generalization problem. The experimental results highlight the acceptable performances of the algorithm proposed in this paper since it has a very high probability of parsing valid sentences.
Feasibility Study of Parallel Finite Element Analysis on Cluster-of-Clusters

Science.gov (United States)

Muraoka, Masae; Okuda, Hiroshi

With the rapid growth of WAN infrastructure and development of Grid middleware, it has become a realistic and attractive methodology to connect cluster machines on a wide-area network for the execution of computation-demanding applications. Many existing parallel finite element (FE) applications have been, however, designed and developed with a single computing resource in mind, since such applications require frequent synchronization and communication among processes. There have been few FE applications that can exploit the distributed environment so far. In this study, we explore the feasibility of FE applications on the cluster-of-clusters. First, we classify FE applications into two types, tightly coupled applications (TCA) and loosely coupled applications (LCA), based on their communication pattern. A prototype of each application is implemented on the cluster-of-clusters. We perform numerical experiments executing TCA and LCA on both the cluster-of-clusters and a single cluster. Through these experiments, by comparing the performances and communication cost in each case, we evaluate the feasibility of FEA on the cluster-of-clusters.
Interplay between experiments and calculations for organometallic clusters and caged clusters

International Nuclear Information System (INIS)

Nakajima, Atsushi

2015-01-01

Clusters consisting of 10-1000 atoms exhibit size-dependent electronic and geometric properties. In particular, composite clusters consisting of several elements and/or components provide a promising way for a bottom-up approach for designing functional advanced materials, because the functionality of the composite clusters can be optimized not only by the cluster size but also by their compositions. In the formation of composite clusters, their geometric symmetry and dimensionality are emphasized to control the physical and chemical properties, because selective and anisotropic enhancements for optical, chemical, and magnetic properties can be expected. Organometallic clusters and caged clusters are demonstrated as a representative example of designing the functionality of the composite clusters. Organometallic vanadium-benzene forms a one dimensional sandwich structure showing ferromagnetic behaviors and anomalously large HOMO-LUMO gap differences of two spin orbitals, which can be regarded as spin-filter components for cluster-based spintronic devices. Caged clusters of aluminum (Al) are well stabilized both geometrically and electronically at Al$_{12}$X, behaving as a “superatom”.
Cluster Categories (Categorias Cluster)

OpenAIRE

Queiroz, Dayane Andrade

2015-01-01

In this work we present cluster categories, which were introduced by Aslak Bakke Buan, Robert Marsh, Markus Reineke, Idun Reiten and Gordana Todorov with the aim of categorifying the cluster algebras created in 2002 by Sergey Fomin and Andrei Zelevinsky. In [4], these authors showed that there is a close relationship between cluster algebras and cluster categories for quivers whose underlying graph is a Dynkin diagram. To this end they developed a tilting theory in the triang...
BRIGHTEST CLUSTER GALAXIES AND CORE GAS DENSITY IN REXCESS CLUSTERS

International Nuclear Information System (INIS)

Haarsma, Deborah B.; Leisman, Luke; Donahue, Megan; Bruch, Seth; Voit, G. Mark; Boehringer, Hans; Pratt, Gabriel W.; Pierini, Daniele; Croston, Judith H.; Arnaud, Monique

2010-01-01

We investigate the relationship between brightest cluster galaxies (BCGs) and their host clusters using a sample of nearby galaxy clusters from the Representative XMM-Newton Cluster Structure Survey. The sample was imaged with the Southern Observatory for Astrophysical Research in R band to investigate the mass of the old stellar population. Using a metric radius of $12\,h^{-1}$ kpc, we found that the BCG luminosity depends weakly on overall cluster mass as $L_{\rm BCG} \propto M_{\rm cl}^{0.18 \pm 0.07}$, consistent with previous work. We found that 90% of the BCGs are located within $0.035\,r_{500}$ of the peak of the X-ray emission, including all of the cool core (CC) clusters. We also found an unexpected correlation between the BCG metric luminosity and the core gas density for non-cool-core (non-CC) clusters, following a power law of $n_e \propto L_{\rm BCG}^{2.7 \pm 0.4}$ (where $n_e$ is measured at $0.008\,r_{500}$). The correlation is not easily explained by star formation (which is weak in non-CC clusters) or overall cluster mass (which is not correlated with core gas density). The trend persists even when the BCG is not located near the peak of the X-ray emission, so proximity is not necessary. We suggest that, for non-CC clusters, this correlation implies that the same process that sets the central entropy of the cluster gas also determines the central stellar density of the BCG, and that this underlying physical process is likely to be mergers.
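As a rough reading of the two power laws quoted in this abstract, the central exponents alone (ignoring their stated uncertainties, and assuming each relation holds across the range considered) give the following ratios for a tenfold increase in cluster mass and for a doubling of BCG luminosity:

$$\frac{L_{\rm BCG}(10\,M_{\rm cl})}{L_{\rm BCG}(M_{\rm cl})} = 10^{0.18} \approx 1.5, \qquad \frac{n_e(2\,L_{\rm BCG})}{n_e(L_{\rm BCG})} = 2^{2.7} \approx 6.5 .$$

The first ratio illustrates why the mass dependence is described as weak, while the second shows how steeply the core gas density of non-CC clusters rises with BCG luminosity.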
Bayesian Inference of High-Dimensional Dynamical Ocean Models

Science.gov (United States)

Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.

2015-12-01

This presentation addresses a holistic set of challenges in high-dimension ocean Bayesian nonlinear estimation: i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); ii) assimilate data using Bayes' law with these pdfs; iii) predict the future data that optimally reduce uncertainties; and iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.
HDclassif : An R Package for Model-Based Clustering and Discriminant Analysis of High-Dimensional Data

Directory of Open Access Journals (Sweden)

Laurent Berge

2012-01-01

Full Text Available This paper presents the R package HDclassif, which is devoted to the clustering and the discriminant analysis of high-dimensional data. The classification methods proposed in the package result from a new parametrization of the Gaussian mixture model which combines the idea of dimension reduction and model constraints on the covariance matrices. The supervised classification method using this parametrization is called high dimensional discriminant analysis (HDDA). In a similar manner, the associated clustering method is called high dimensional data clustering (HDDC) and uses the expectation-maximization algorithm for inference. In order to correctly fit the data, both methods estimate the specific subspace and the intrinsic dimension of the groups. Due to the constraints on the covariance matrices, the number of parameters to estimate is significantly lower than in other model-based methods, and this allows the methods to be stable and efficient in high dimensions. Two introductory examples illustrated with R codes allow the user to discover the hdda and hddc functions. Experiments on simulated and real datasets also compare HDDC and HDDA with existing classification methods on high-dimensional datasets. HDclassif is free software distributed under the general public license, as part of the R software project.
Scientific Cluster Deployment and Recovery - Using puppet to simplify cluster management

Science.gov (United States)

Hendrix, Val; Benjamin, Doug; Yao, Yushu

2012-12-01

Deployment, maintenance and recovery of a scientific cluster, which has complex, specialized services, can be a time-consuming task requiring the assistance of Linux system administrators, network engineers as well as domain experts. Universities and small institutions that have a part-time FTE with limited time for and knowledge of the administration of such clusters can be strained by such maintenance tasks. This current work is the result of an effort to maintain a data analysis cluster (DAC) with minimal effort by a local system administrator. The realized benefit is that the scientist, who is the local system administrator, is able to focus on the data analysis instead of the intricacies of managing a cluster. Our work provides a cluster deployment and recovery process (CDRP) based on the Puppet configuration engine, allowing a part-time FTE to easily deploy and recover entire clusters with minimal effort. Puppet is a configuration management system (CMS) used widely in computing centers for the automatic management of resources. Domain experts use Puppet's declarative language to define reusable modules for service configuration and deployment. Our CDRP has three actors: domain experts, a cluster designer and a cluster manager. The domain experts first write the Puppet modules for the cluster services. A cluster designer would then define a cluster. This includes the creation of cluster roles, mapping the services to those roles and determining the relationships between the services. Finally, a cluster manager would acquire the resources (machines, networking), enter the cluster input parameters (hostnames, IP addresses) and automatically generate deployment scripts used by Puppet to configure each machine to act in a designated role. In the event of a machine failure, the originally generated deployment scripts along with Puppet can be used to easily reconfigure a new machine. The cluster definition produced in our CDRP is an integral part of automating cluster deployment
Weak-lensing mass calibration of redMaPPer galaxy clusters in Dark Energy Survey Science Verification data

Energy Technology Data Exchange (ETDEWEB)

Melchior, P.; Gruen, D.; McClintock, T.; Varga, T. N.; Sheldon, E.; Rozo, E.; Amara, A.; Becker, M. R.; Benson, B. A.; Bermeo, A.; Bridle, S. L.; Clampitt, J.; Dietrich, J. P.; Hartley, W. G.; Hollowood, D.; Jain, B.; Jarvis, M.; Jeltema, T.; Kacprzak, T.; MacCrann, N.; Rykoff, E. S.; Saro, A.; Suchyta, E.; Troxel, M. A.; Zuntz, J.; Bonnett, C.; Plazas, A. A.; Abbott, T. M. C.; Abdalla, F. B.; Annis, J.; Benoit-Lévy, A.; Bernstein, G. M.; Bertin, E.; Brooks, D.; Buckley-Geer, E.; Carnero Rosell, A.; Carrasco Kind, M.; Carretero, J.; Cunha, C. E.; D’Andrea, C. B.; da Costa, L. N.; Desai, S.; Eifler, T. F.; Flaugher, B.; Fosalba, P.; García-Bellido, J.; Gaztanaga, E.; Gerdes, D. W.; Gruendl, R. A.; Gschwend, J.; Gutierrez, G.; Honscheid, K.; James, D. J.; Kirk, D.; Krause, E.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Lima, M.; Maia, M. A. G.; March, M.; Martini, P.; Menanteau, F.; Miller, C. J.; Miquel, R.; Mohr, J. J.; Nichol, R. C.; Ogando, R.; Romer, A. K.; Sanchez, E.; Scarpine, V.; Sevilla-Noarbe, I.; Smith, R. C.; Soares-Santos, M.; Sobreira, F.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Walker, A. R.; Weller, J.; Zhang, Y.

2017-05-16

We use weak-lensing shear measurements to determine the mean mass of optically selected galaxy clusters in Dark Energy Survey Science Verification data. In a blinded analysis, we split the sample of more than 8,000 redMaPPer clusters into 15 subsets, spanning ranges in the richness parameter $5 \leq \lambda \leq 180$ and redshift $0.2 \leq z \leq 0.8$, and fit the averaged mass density contrast profiles with a model that accounts for seven distinct sources of systematic uncertainty: shear measurement and photometric redshift errors; cluster-member contamination; miscentering; deviations from the NFW halo profile; halo triaxiality; and line-of-sight projections. We combine the inferred cluster masses to estimate the joint scaling relation between mass, richness and redshift, $\mathcal{M}(\lambda,z) \varpropto M_0 \lambda^{F} (1+z)^{G}$. We find $M_0 \equiv \langle M_{200\mathrm{m}}\,|\,\lambda=30,z=0.5\rangle=\left[ 2.35 \pm 0.22\ \rm{(stat)} \pm 0.12\ \rm{(sys)} \right] \cdot 10^{14}\ M_\odot$, with $F = 1.12\,\pm\,0.20\ \rm{(stat)}\, \pm\, 0.06\ \rm{(sys)}$ and $G = 0.18\,\pm\, 0.75\ \rm{(stat)}\, \pm\, 0.24\ \rm{(sys)}$. The amplitude of the mass-richness relation is in excellent agreement with the weak-lensing calibration of redMaPPer clusters in SDSS by Simet et al. (2016) and with the Saro et al. (2015) calibration based on abundance matching of SPT-detected clusters. Our results extend the redshift range over which the mass-richness relation of redMaPPer clusters has been calibrated with weak lensing from $z\leq 0.3$ to $z\leq0.8$. Calibration uncertainties of shear measurements and photometric redshift estimates dominate our systematic error budget and require substantial improvements for forthcoming studies.
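The scaling relation in this abstract can be evaluated numerically with a short sketch. The abstract gives only a proportionality, so the pivot normalisation used here (richness 30 and redshift 0.5, where the mass equals M_0 by definition) is an assumption inferred from how M_0 is defined; the function name is hypothetical, and only the central values are used, without the quoted statistical or systematic uncertainties.

```python
M0_MSUN = 2.35e14    # central value of M_0 in solar masses, from the abstract
F_SLOPE = 1.12       # richness exponent F (central value)
G_SLOPE = 0.18       # redshift exponent G (central value)
RICHNESS_PIVOT = 30.0  # assumed pivot: the richness at which M_0 is defined
Z_PIVOT = 0.5          # assumed pivot: the redshift at which M_0 is defined


def mean_mass_msun(richness: float, z: float) -> float:
    """Mean cluster mass under the assumed pivot-normalised form
    M = M0 * (richness / 30)^F * ((1 + z) / 1.5)^G."""
    return (
        M0_MSUN
        * (richness / RICHNESS_PIVOT) ** F_SLOPE
        * ((1.0 + z) / (1.0 + Z_PIVOT)) ** G_SLOPE
    )


# At the pivot the relation returns M_0; other inputs are illustrative only.
for richness, z in [(30.0, 0.5), (60.0, 0.5), (30.0, 0.8)]:
    print(richness, z, f"{mean_mass_msun(richness, z):.3e}")
```

Because G is weakly constrained (its quoted uncertainty is much larger than its central value), the redshift dependence produced by this sketch should be read as indicative only.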

Hybrid Optical Inference Machines

Science.gov (United States)

1991-09-27

with labels. Now, events. a set of facts can be generated in the dyadic form "u, R 1,2" Eichmann and Caulfield [19] consider the same type of and can...these encoding schemes. These architectures are based primarily on optical inner 19. G. Eichmann and H. J. Caulfield, "Optical Learning (Inference)
Evaluating Spatial Variability in Sediment and Phosphorus Concentration-Discharge Relationships Using Bayesian Inference and Self-Organizing Maps

Science.gov (United States)

Underwood, Kristen L.; Rizzo, Donna M.; Schroth, Andrew W.; Dewoolkar, Mandar M.

2017-12-01

Given the variable biogeochemical, physical, and hydrological processes driving fluvial sediment and nutrient export, the water science and management communities need data-driven methods to identify regions prone to production and transport under variable hydrometeorological conditions. We use Bayesian analysis to segment concentration-discharge linear regression models for total suspended solids (TSS) and particulate and dissolved phosphorus (PP, DP) using 22 years of monitoring data from 18 Lake Champlain watersheds. Bayesian inference was leveraged to estimate segmented regression model parameters and identify threshold position. The identified threshold positions demonstrated a considerable range below and above the median discharge, which has been used previously as the default breakpoint in segmented regression models to discern differences between pre- and post-threshold export regimes. We then applied a Self-Organizing Map (SOM), which partitioned the watersheds into clusters of TSS, PP, and DP export regimes using watershed characteristics, as well as Bayesian regression intercepts and slopes. A SOM defined two clusters of high-flux basins, one where PP flux was predominantly episodic and hydrologically driven; and another in which the sediment and nutrient sourcing and mobilization were more bimodal, resulting from both hydrologic processes at post-threshold discharges and reactive processes (e.g., nutrient cycling or lateral/vertical exchanges of fine sediment) at pre-threshold discharges. A separate DP SOM defined two high-flux clusters exhibiting a bimodal concentration-discharge response, but driven by differing land use. Our novel framework shows promise as a tool with broad management application that provides insights into landscape drivers of riverine solute and sediment export.
Segmentation of High Angular Resolution Diffusion MRI using Sparse Riemannian Manifold Clustering

Science.gov (United States)

Wright, Margaret J.; Thompson, Paul M.; Vidal, René

2015-01-01

We address the problem of segmenting high angular resolution diffusion imaging (HARDI) data into multiple regions (or fiber tracts) with distinct diffusion properties. We use the orientation distribution function (ODF) to represent HARDI data and cast the problem as a clustering problem in the space of ODFs. Our approach integrates tools from sparse representation theory and Riemannian geometry into a graph theoretic segmentation framework. By exploiting the Riemannian properties of the space of ODFs, we learn a sparse representation for each ODF and infer the segmentation by applying spectral clustering to a similarity matrix built from these representations. In cases where regions with similar (resp. distinct) diffusion properties belong to different (resp. same) fiber tracts, we obtain the segmentation by incorporating spatial and user-specified pairwise relationships into the formulation. Experiments on synthetic data evaluate the sensitivity of our method to image noise and the presence of complex fiber configurations, and show its superior performance compared to alternative segmentation methods. Experiments on phantom and real data demonstrate the accuracy of the proposed method in segmenting simulated fibers, as well as white matter fiber tracts of clinical importance in the human brain. PMID:24108748
A Network Inference Workflow Applied to Virulence-Related Processes in Salmonella typhimurium

Energy Technology Data Exchange (ETDEWEB)

Taylor, Ronald C.; Singhal, Mudita; Weller, Jennifer B.; Khoshnevis, Saeed; Shi, Liang; McDermott, Jason E.

2009-04-20

Inference of the structure of mRNA transcriptional regulatory networks, protein regulatory or interaction networks, and protein activation/inactivation-based signal transduction networks is a critical task in systems biology. In this article we discuss a workflow for the reconstruction of parts of the transcriptional regulatory network of the pathogenic bacterium Salmonella typhimurium based on the information contained in sets of microarray gene expression data now available for that organism, and describe our results obtained by following this workflow. The primary tool is one of the network inference algorithms deployed in the Software Environment for BIological Network Inference (SEBINI). Specifically, we selected the algorithm called Context Likelihood of Relatedness (CLR), which uses the mutual information contained in the gene expression data to infer regulatory connections. The associated analysis pipeline automatically stores the inferred edges from the CLR runs within SEBINI and, upon request, transfers the inferred edges into either Cytoscape or the plug-in Collective Analysis of Biological Interaction Networks (CABIN) tool for further post-analysis of the inferred regulatory edges. The following article presents the outcome of this workflow, as well as the protocols followed for microarray data collection, data cleansing, and network inference. Our analysis revealed several interesting interactions, functional groups, metabolic pathways, and regulons in S. typhimurium.
Cluster-cluster correlations in the two-dimensional stationary Ising-model

International Nuclear Information System (INIS)

Klassmann, A.

1997-01-01

In numerical integration of the Cahn-Hilliard equation, which describes Ostwald ripening in a two-phase matrix, N. Masbaum showed that spatial correlations between clusters scale with respect to the mean cluster size (itself a function of time). T. B. Liverpool showed by Monte Carlo simulations for the Ising model that the analogous correlations have a similar form. Both demonstrated that immediately around each cluster there is a depletion area followed by something like a ring of clusters of the same size as the original one. More precisely, it has been shown that the distribution of clusters around a given cluster looks like a sine curve decaying exponentially with distance to a constant value.
Training Inference Making Skills Using a Situation Model Approach Improves Reading Comprehension

Directory of Open Access Journals (Sweden)

Lisanne Bos

2016-02-01

Full Text Available This study aimed to enhance third and fourth graders’ text comprehension at the situation model level. Therefore, we tested a reading strategy training developed to target inference making skills, which are widely considered to be pivotal to situation model construction. The training was grounded in contemporary literature on situation model-based inference making and addressed the source (text-based versus knowledge-based), type (necessary versus unnecessary for (re-)establishing coherence), and depth of an inference (making single lexical inferences versus combining multiple lexical inferences), as well as the type of searching strategy (forward versus backward). Results indicated that, compared to a control group (n = 51), children who followed the experimental training (n = 67) improved their inference making skills supportive to situation model construction. Importantly, our training also resulted in increased levels of general reading comprehension and motivation. In sum, this study showed that a ‘level of text representation’-approach can provide a useful framework to teach inference making skills to third and fourth graders.
Robust Demographic Inference from Genomic and SNP Data

Science.gov (United States)

Excoffier, Laurent; Dupanloup, Isabelle; Huerta-Sánchez, Emilia; Sousa, Vitor C.; Foll, Matthieu

2013-01-01

We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with , the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets. PMID:24204310
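The site frequency spectrum (SFS) that this framework takes as input can be sketched briefly. The example assumes a hypothetical input layout (one list of 0/1 alleles per site, with 1 marking the derived allele) and only illustrates the summary statistic, not the authors' composite-likelihood inference.

```python
from collections import Counter


def site_frequency_spectrum(sites):
    """Count sites by number of derived alleles.

    sites: list of sites, each a list of 0/1 alleles (1 = derived),
    one entry per sampled chromosome. Returns a list sfs of length
    n + 1 where sfs[k] is the number of sites with k derived copies.
    """
    if not sites:
        return []
    n = len(sites[0])
    if any(len(site) != n for site in sites):
        raise ValueError("all sites must cover the same number of chromosomes")
    counts = Counter(sum(site) for site in sites)
    return [counts.get(k, 0) for k in range(n + 1)]


# Four hypothetical sites observed in four chromosomes.
toy_sites = [
    [0, 0, 1, 0],  # 1 derived copy
    [1, 1, 0, 0],  # 2 derived copies
    [0, 1, 0, 0],  # 1 derived copy
    [1, 1, 1, 0],  # 3 derived copies
]
toy_sfs = site_frequency_spectrum(toy_sites)  # [0, 2, 1, 1, 0]
```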
Universal Darwinism as a process of Bayesian inference

Directory of Open Access Journals (Sweden)

John Oberon Campbell

2016-06-01

Full Text Available Many of the mathematical frameworks describing natural selection are equivalent to Bayes’ Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an ‘experiment’ in the external world environment, and the results of that ‘experiment’ or the ‘surprise’ entailed by predicted and actual outcomes of the ‘experiment’. Minimization of free energy implies that the implicit measure of ‘surprise’ experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
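One way to see the claimed equivalence is to compare a discrete-time replicator update with a Bayesian update. The sketch below assumes the simplest textbook case: type frequencies play the role of prior probabilities and relative fitness plays the role of likelihood. The function names and numbers are illustrative and are not taken from the article.

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses: prior times likelihood, renormalized."""
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    return [p * l / evidence for p, l in zip(prior, likelihood)]


def replicator_step(frequencies, fitness):
    """Next-generation type frequencies under discrete replicator dynamics."""
    mean_fitness = sum(x * w for x, w in zip(frequencies, fitness))
    return [x * w / mean_fitness for x, w in zip(frequencies, fitness)]


# Three hypothetical types with frequencies and fitness values.
toy_frequencies = [0.5, 0.3, 0.2]
toy_fitness = [1.0, 2.0, 4.0]

next_generation = replicator_step(toy_frequencies, toy_fitness)
posterior = bayes_update(toy_frequencies, toy_fitness)
```

The two functions perform the same arithmetic, with mean fitness in the place of the evidence term, so `next_generation` and `posterior` coincide.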
Behavior Intention Derivation of Android Malware Using Ontology Inference

Directory of Open Access Journals (Sweden)

Jian Jiao

2018-01-01

Full Text Available Previous research on Android malware has mainly focused on malware detection, and the evolution of malware means that detection inevitably lags behind. The information presented by these detection results (malice judgment, family classification, and behavior characterization) is limited for analysts. Therefore, a method is needed to restore the intention of malware, which reflects the relation between multiple behaviors of complex malware and its ultimate purpose. This paper proposes a novel description and derivation model of Android malware intention based on the theory of intention and malware reverse engineering. This approach creates an ontology for malware intention to model the semantic relation between behaviors and their objects, and automates the process of intention derivation by using SWRL rules transformed from the intention model and the Jess inference engine. Experiments on 75 typical samples show that the inference system can derive malware intention effectively, and 89.3% of the inference results are consistent with manual analysis, which demonstrates the feasibility and effectiveness of our theory and inference system.
Genealogical and evolutionary inference with the human Y chromosome.

Science.gov (United States)

Stumpf, M P; Goldstein, D B

2001-03-02

Population genetics has emerged as a powerful tool for unraveling human history. In addition to the study of mitochondrial and autosomal DNA, attention has recently focused on Y-chromosome variation. Ambiguities and inaccuracies in data analysis, however, pose an important obstacle to further development of the field. Here we review the methods available for genealogical inference using Y-chromosome data. Approaches can be divided into those that do and those that do not use an explicit population model in genealogical inference. We describe the strengths and weaknesses of these model-based and model-free approaches, as well as difficulties associated with the mutation process that affect both methods. In the case of genealogical inference using microsatellite loci, we use coalescent simulations to show that relatively simple generalizations of the mutation process can greatly increase the accuracy of genealogical inference. Because model-free and model-based approaches have different biases and limitations, we conclude that there is considerable benefit in the continued use of both types of approaches.
SDG multiple fault diagnosis by real-time inverse inference

International Nuclear Information System (INIS)

Zhang Zhaoqian; Wu Chongguang; Zhang Beike; Xia Tao; Li Anfeng

2005-01-01

In the past 20 years, the signed directed graph (SDG), a qualitative simulation technology, has been widely applied in chemical process fault diagnosis. However, many earlier researchers assumed a single fault origin. This assumption leads to combinatorial explosion and has limited the application of SDG to real processes. The main reason is that most earlier researchers used the forward inference engine of commercial expert system software to carry out inverse diagnostic inference on the SDG model, which violates the internal principle of the diagnosis mechanism. In this paper, we present a new SDG multiple-fault diagnosis method based on real-time inverse inference. It is a genuine multiple-fault diagnosis method, and its inference engine uses an inverse mechanism. Finally, we give an example of a diagnosis system for a 65 t/h furnace to demonstrate its applicability and efficiency.
SDG multiple fault diagnosis by real-time inverse inference

Energy Technology Data Exchange (ETDEWEB)

Zhang Zhaoqian; Wu Chongguang; Zhang Beike; Xia Tao; Li Anfeng

2005-02-01

In the past 20 years, the signed directed graph (SDG), a qualitative simulation technology, has been widely applied in chemical process fault diagnosis. However, many earlier researchers assumed a single fault origin. This assumption leads to combinatorial explosion and has limited the application of SDG to real processes. The main reason is that most earlier researchers used the forward inference engine of commercial expert system software to carry out inverse diagnostic inference on the SDG model, which violates the internal principle of the diagnosis mechanism. In this paper, we present a new SDG multiple-fault diagnosis method based on real-time inverse inference. It is a genuine multiple-fault diagnosis method, and its inference engine uses an inverse mechanism. Finally, we give an example of a diagnosis system for a 65 t/h furnace to demonstrate its applicability and efficiency.
A Wide-Field Photometric Survey for Extratidal Tails Around Five Metal-Poor Globular Clusters in the Galactic Halo

Science.gov (United States)

Chun, Sang-Hyun; Kim, Jae-Woo; Sohn, Sangmo T.; Park, Jang-Hyun; Han, Wonyong; Kim, Ho-Il; Lee, Young-Wook; Lee, Myung Gyoon; Lee, Sang-Gak; Sohn, Young-Jong

2010-02-01

Wide-field deep g'r'i' images obtained with the Megacam of the Canada-France-Hawaii Telescope are used to investigate the spatial configuration of stars around five metal-poor globular clusters M15, M30, M53, NGC 5053, and NGC 5466, in a field-of-view ~3°. Applying a mask filtering algorithm to the color-magnitude diagrams of the observed stars, we selected the clusters' member star candidates, which are used to examine the characteristics of the spatial stellar distribution surrounding the target clusters. The smoothed surface density maps and the overlaid isodensity contours indicate that all of the five metal-poor globular clusters exhibit strong evidence of extratidal overdensity features over their tidal radii, in the form of extended tidal tails around the clusters. The orientations of the observed extratidal features show signatures of tidal tails tracing the clusters' orbits, inferred from their proper motions, and effects of dynamical interactions with the Galaxy. Our findings include detections of a tidal bridge-like feature and an envelope structure around the pair of globular clusters M53 and NGC 5053. The observed radial surface density profiles of target clusters have a deviation from theoretical King models, for which the profiles show a break at 0.5-0.7 r_t, extending the overdensity features out to 1.5-2 r_t. Both radial surface density profiles for different angular sections and azimuthal number density profiles confirm the overdensity features of tidal tails around the five metal-poor globular clusters. Our results add further observational evidence that the observed metal-poor halo globular clusters originate from an accreted satellite system, indicative of the merging scenario of the formation of the Galactic halo. Based on observations carried out at the Canada-France-Hawaii Telescope, operated by the National Research Council of Canada, the Centre National de la Recherche Scientifique de France, and the University of Hawaii. This is part of the
HIV-TRACE (Transmission Cluster Engine): a tool for large scale molecular epidemiology of HIV-1 and other rapidly evolving pathogens.

Science.gov (United States)

Kosakovsky Pond, Sergei L; Weaver, Steven; Leigh Brown, Andrew J; Wertheim, Joel O

2018-01-31

In modern applications of molecular epidemiology, genetic sequence data are routinely used to identify clusters of transmission in rapidly evolving pathogens, most notably HIV-1. Traditional 'shoeleather' epidemiology infers transmission clusters by tracing chains of partners sharing epidemiological connections (e.g., sexual contact). Here, we present a computational tool for identifying a molecular transmission analog of such clusters: HIV-TRACE (TRAnsmission Cluster Engine). HIV-TRACE implements an approach inspired by traditional epidemiology, by identifying chains of partners whose viral genetic relatedness implies direct or indirect epidemiological connections. Molecular transmission clusters are constructed using codon-aware pairwise alignment to a reference sequence followed by pairwise genetic distance estimation among all sequences. This approach is computationally tractable and is capable of identifying HIV-1 transmission clusters in large surveillance databases comprising tens or hundreds of thousands of sequences in near real time, i.e., on the order of minutes to hours. HIV-TRACE is available at www.hivtrace.org and from github.com/veg/hivtrace, along with the accompanying result visualization module from github.com/veg/hivtrace-viz. Importantly, the approach underlying HIV-TRACE is not limited to the study of HIV-1 and can be applied to study outbreaks and epidemics of other rapidly evolving pathogens. © The Author 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
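The general idea of distance-based cluster construction can be sketched as follows: link every pair of sequences whose genetic distance falls at or below a threshold, then report the connected components, so that indirect links also place sequences in the same cluster. This is an illustration only and not the HIV-TRACE code. The abstract does not name the distance measure or threshold, so the plain proportion of differing sites (p-distance), the threshold value, and all names and sequences below are assumptions.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_clusters(seqs, threshold):
    """Group sequences into clusters of size two or more.

    seqs: dict mapping a sequence name to an aligned sequence.
    Two sequences are linked when their p-distance is <= threshold;
    clusters are the connected components of the resulting graph.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical aligned sequences of length 10.
toy_seqs = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",  # 1 difference from A
    "C": "ACGTACGTTT",  # 1 difference from B, 2 from A
    "D": "TTTTTTTTTT",  # distant from all others
}
toy_clusters = distance_clusters(toy_seqs, threshold=0.1)
```

Here A and C are not linked directly, but both link to B, so the three form one cluster while D remains unclustered.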
Meaningful Clusters

Energy Technology Data Exchange (ETDEWEB)

Sanfilippo, Antonio P.; Calapristi, Augustin J.; Crow, Vernon L.; Hetzler, Elizabeth G.; Turner, Alan E.

2004-05-26

We present an approach to the disambiguation of cluster labels that capitalizes on the notion of semantic similarity to assign WordNet senses to cluster labels. The approach provides interesting insights on how document clustering can provide the basis for developing a novel approach to word sense disambiguation.
Macroeconomic Dimensions in the Clusterization Processes: Lithuanian Biomass Cluster Case

Directory of Open Access Journals (Sweden)

Navickas Valentinas

2017-03-01

Full Text Available The growing significance of future production systems will demand work organized on a collaborative rather than a competitive basis, with pooled resources and expertise that help to reach a common goal. One form of collaboration among medium-sized business organizations is work in clusters. Clusterization as a phenomenon has been known for a long time, but research has so far addressed its benefits mainly at the micro and medium levels. The evaluation of clusterization processes in macroeconomic dimensions has been comparatively little investigated. In this article, therefore, clusterization processes are analysed with a focus on macroeconomic factors. The authors analyse the influence of clusterization on a country's macroeconomic growth; they apply a structural research methodology to evaluate the macroeconomic influence of clusterization and propose that clusterization processes benefit macroeconomic analysis. The theoretical model of clusterization processes was validated with reference to a biomass cluster case. Because the biomass cluster is a new phenomenon, there are currently no other scientific approaches to it. The authors' research shows that clusterization allows a large positive shift in macroeconomics, leading to the creation of high added value, faster national economic growth, and an improved social situation.
Functional networks inference from rule-based machine learning models.

Science.gov (United States)

Lazzarini, Nicola; Widera, Paweł; Williamson, Stuart; Heer, Rakesh; Krasnogor, Natalio; Bacardit, Jaume

2016-01-01

Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables that are often different from, or complementary to, those captured by similarity-based methods. We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes that genes used together within a rule-based machine learning model to classify the samples might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in a context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. The
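The core assumption stated in this abstract, that genes used together within a rule may be functionally related, can be illustrated with a small co-occurrence sketch. It assumes each rule has already been reduced to the list of gene names it tests. The rule sets, gene names, and the `min_count` filter are hypothetical, and the sketch does not reproduce the FuNeL protocol itself.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(rules, min_count=1):
    """Build weighted edges between genes that appear in the same rule.

    rules: list of rules, each given as a list of gene names.
    Returns a dict mapping a sorted gene pair to the number of rules
    in which both genes occur, keeping pairs seen at least min_count times.
    """
    edges = Counter()
    for genes in rules:
        for pair in combinations(sorted(set(genes)), 2):
            edges[pair] += 1
    return {pair: count for pair, count in edges.items() if count >= min_count}


# Hypothetical rules, each listing the genes it uses.
toy_rules = [
    ["g1", "g2"],
    ["g1", "g2", "g3"],
    ["g3", "g4"],
]
toy_network = cooccurrence_network(toy_rules)
toy_strong_edges = cooccurrence_network(toy_rules, min_count=2)
```

Raising `min_count` keeps only pairs supported by several rules, which is one simple way to limit the network to a chosen size.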
Clustering Dycom

KAUST Repository

Minku, Leandro L.

2017-10-06

Background: Software Effort Estimation (SEE) can be formulated as an online learning problem, where new projects are completed over time and may become available for training. In this scenario, a Cross-Company (CC) SEE approach called Dycom can drastically reduce the number of Within-Company (WC) projects needed for training, saving the high cost of collecting such training projects. However, Dycom relies on splitting CC projects into different subsets in order to create its CC models. Such splitting can have a significant impact on Dycom's predictive performance. Aims: This paper investigates whether clustering methods can be used to help find good CC splits for Dycom. Method: Dycom is extended to use clustering methods for creating the CC subsets. Three different clustering methods are investigated, namely Hierarchical Clustering, K-Means, and Expectation-Maximisation. Clustering Dycom is compared against the original Dycom with CC subsets of different sizes, based on four SEE databases. A baseline WC model is also included in the analysis. Results: Clustering Dycom with K-Means can potentially help to split the CC projects, managing to achieve similar or better predictive performance than Dycom. However, K-Means still requires the number of CC subsets to be pre-defined, and a poor choice can negatively affect predictive performance. EM enables Dycom to automatically set the number of CC subsets while still maintaining or improving predictive performance with respect to the baseline WC model. Clustering Dycom with Hierarchical Clustering did not offer significant advantage in terms of predictive performance. Conclusion: Clustering methods can be an effective way to automatically generate Dycom's CC subsets.
Causal inference based on counterfactuals

Directory of Open Access Journals (Sweden)

Höfler M

2005-09-01

Full Text Available Abstract Background The counterfactual or potential outcome model has become increasingly standard for causal inference in epidemiological and medical studies. Discussion This paper provides an overview on the counterfactual and related approaches. A variety of conceptual as well as practical issues when estimating causal effects are reviewed. These include causal interactions, imperfect experiments, adjustment for confounding, time-varying exposures, competing risks and the probability of causation. It is argued that the counterfactual model of causal effects captures the main aspects of causality in health sciences and relates to many statistical procedures. Summary Counterfactuals are the basis of causal inference in medicine and epidemiology. Nevertheless, the estimation of counterfactual differences poses several difficulties, primarily in observational studies. These problems, however, reflect fundamental barriers only when learning from observations, and this does not invalidate the counterfactual concept.
LMC clusters: young

International Nuclear Information System (INIS)

Freeman, K.C.

1980-01-01

The young globular clusters of the LMC have ages of 10^7-10^8 y. Their masses and structure are similar to those of the smaller galactic globular clusters. Their stellar mass functions (in the mass range 6 solar masses to 1.2 solar masses) vary greatly from cluster to cluster, although the clusters are similar in total mass, age, structure and chemical composition. It would be very interesting to know why these clusters are forming now in the LMC and not in the Galaxy. The author considers the 'young globular' or 'blue populous' clusters of the LMC. The ages of these objects are 10^7 to 10^8 y, and their masses are 10^4 to 10^5 solar masses, so they are populous enough to be really useful for studying the evolution of massive stars. The author concentrates on the structure and stellar content of these young clusters. (Auth.)

Major cluster mergers and the location of the brightest cluster galaxy

International Nuclear Information System (INIS)

Martel, Hugo; Robichaud, Fidèle; Barai, Paramita

2014-01-01

Using a large N-body cosmological simulation combined with a subgrid treatment of galaxy formation, merging, and tidal destruction, we study the formation and evolution of the galaxy and cluster population in a comoving volume (100 Mpc)^3 in a ΛCDM universe. At z = 0, our computational volume contains 1788 clusters with mass M_cl > 1.1 × 10^12 M_☉, including 18 massive clusters with M_cl > 10^14 M_☉. It also contains 1,088,797 galaxies with mass M_gal ≥ 2 × 10^9 M_☉ and luminosity L > 9.5 × 10^5 L_☉. For each cluster, we identified the brightest cluster galaxy (BCG). We then computed two separate statistics: the fraction f_BNC of clusters in which the BCG is not the closest galaxy to the center of the cluster in projection, and the ratio Δv/σ, where Δv is the difference in radial velocity between the BCG and the whole cluster and σ is the radial velocity dispersion of the cluster. We found that f_BNC increases from 0.05 for low-mass clusters (M_cl ∼ 10^12 M_☉) to 0.5 for high-mass clusters (M_cl > 10^14 M_☉) with very little dependence on cluster redshift. Most of this result turns out to be a projection effect and when we consider three-dimensional distances instead of projected distances, f_BNC increases only to 0.2 at high-cluster mass. The values of Δv/σ vary from 0 to 1.8, with median values in the range 0.03-0.15 when considering all clusters, and 0.12-0.31 when considering only massive clusters. These results are consistent with previous observational studies and indicate that the central galaxy paradigm, which states that the BCG should be at rest at the center of the cluster, is usually valid, but exceptions are too common to be ignored. We built merger trees for the 18 most massive clusters in the simulation. Analysis of these trees reveals that 16 of these clusters have experienced 1 or several major or semi-major mergers in the past. These mergers leave each cluster in a non-equilibrium state, but eventually the cluster
Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

Science.gov (United States)

2014-01-01

Background There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. Interim recommendations
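The reduction in power from merging clusters can be illustrated with a widely used effective-sample-size approximation, in which a cluster of size m contributes m / (1 + (m - 1) × ICC) independent observations. The abstract says only that "standard methods of power calculation" were used, so this formula is an assumed stand-in. The sketch also holds the ICC fixed, whereas the abstract notes that the observed ICC can change after merges, and the cluster sizes are hypothetical.

```python
def effective_sample_size(cluster_sizes, icc):
    """Approximate number of independent observations in a clustered sample.

    Each cluster of size m contributes m / (1 + (m - 1) * icc).
    """
    return sum(m / (1 + (m - 1) * icc) for m in cluster_sizes)


# Hypothetical trial arm: ten clusters of 20 participants each.
before_merge = [20] * 10
# Two of the clusters merge into a single cluster of 40.
after_merge = [20] * 8 + [40]

ess_before = effective_sample_size(before_merge, icc=0.05)
ess_after = effective_sample_size(after_merge, icc=0.05)
```

The total number of participants is unchanged, but `ess_after` is smaller than `ess_before` because the merged cluster is larger and cluster sizes are now unequal. With an ICC of zero the two values are identical.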
A Comparative Study of Spatially Clustered Distribution of Jumbo Flying Squid (Dosidicus gigas) Offshore Peru

Institute of Scientific and Technical Information of China (English)

FENG Yongjiu; CUI Li; CHEN Xinjun; LIU Yu

2017-01-01

We examined spatially clustered distribution of jumbo flying squid (Dosidicus gigas) in the offshore waters of Peru bounded by 78°-86°W and 8°-20°S under 0.5°×0.5° fishing grid. The study is based on the catch-per-unit-effort (CPUE) and fishing effort from Chinese mainland squid jigging fleet in 2003-2004 and 2006-2013. The data for all years as well as the eight years (excluding El Niño events) were studied to examine the effect of climate variation on the spatial distribution of D. gigas. Five spatial clusters reflecting the spatial distribution were computed using K-means and Getis-Ord Gi* for a detailed comparative study. Our results showed that clusters identified by the two methods were quite different in terms of their spatial patterns, and K-means was not as accurate as Getis-Ord Gi*, as inferred from the agreement degree and receiver operating characteristic. There were more areas of hot and cold spots in years without the impact of El Niño, suggesting that such large-scale climate variations could reduce the clustering level of D. gigas. The catches also showed that warm El Niño conditions and high water temperature were less favorable for D. gigas offshore Peru. The results suggested that the use of K-means is preferable if the aim is to discover the spatial distribution of each sub-region (cluster) of the study area, while Getis-Ord Gi* is preferable if the aim is to identify statistically significant hot spots that may indicate the central fishing ground.
A comparative study of spatially clustered distribution of jumbo flying squid (Dosidicus gigas) offshore Peru

Science.gov (United States)

Feng, Yongjiu; Cui, Li; Chen, Xinjun; Liu, Yu

2017-06-01

We examined spatially clustered distribution of jumbo flying squid (Dosidicus gigas) in the offshore waters of Peru bounded by 78°-86°W and 8°-20°S under 0.5°×0.5° fishing grid. The study is based on the catch-per-unit-effort (CPUE) and fishing effort from Chinese mainland squid jigging fleet in 2003-2004 and 2006-2013. The data for all years as well as the eight years (excluding El Niño events) were studied to examine the effect of climate variation on the spatial distribution of D. gigas. Five spatial clusters reflecting the spatial distribution were computed using K-means and Getis-Ord Gi* for a detailed comparative study. Our results showed that clusters identified by the two methods were quite different in terms of their spatial patterns, and K-means was not as accurate as Getis-Ord Gi*, as inferred from the agreement degree and receiver operating characteristic. There were more areas of hot and cold spots in years without the impact of El Niño, suggesting that such large-scale climate variations could reduce the clustering level of D. gigas. The catches also showed that warm El Niño conditions and high water temperature were less favorable for D. gigas offshore Peru. The results suggested that the use of K-means is preferable if the aim is to discover the spatial distribution of each sub-region (cluster) of the study area, while Getis-Ord Gi* is preferable if the aim is to identify statistically significant hot spots that may indicate the central fishing ground.
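The Getis-Ord Gi* statistic used in this study for hot-spot detection can be sketched in its standard z-score form with binary weights, where each cell's neighborhood includes the cell itself. The neighborhood definition, the one-dimensional toy transect, and the CPUE-like values are assumptions for illustration, not the study's configuration.

```python
import math


def getis_ord_gi_star(values, neighbors):
    """Gi* z-score for each cell, using binary spatial weights.

    values: list of observations, one per cell.
    neighbors: list of index lists; neighbors[i] holds the cells
    adjacent to cell i (cell i itself is always added).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    z_scores = []
    for i in range(n):
        window = set(neighbors[i]) | {i}
        w_sum = len(window)  # with binary weights, sum(w) equals sum(w^2)
        numerator = sum(values[j] for j in window) - mean * w_sum
        denominator = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        # Undefined when all values are equal or the window spans every cell.
        z_scores.append(numerator / denominator if denominator else 0.0)
    return z_scores


# Hypothetical CPUE-like values along a six-cell transect.
toy_values = [1, 1, 1, 1, 9, 10]
toy_neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
toy_gi = getis_ord_gi_star(toy_values, toy_neighbors)
```

Cells with large positive z-scores are candidate hot spots and cells with large negative z-scores are candidate cold spots. In the toy transect the last cell has the highest score.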
The Role of Cerenkov Radiation in the Pressure Balance of Cool Core Clusters of Galaxies

Energy Technology Data Exchange (ETDEWEB)

Lieu, Richard [Department of Physics, University of Alabama, Huntsville, AL 35899 (United States)

2017-03-20

Despite the substantial progress made recently in understanding the role of AGN feedback and associated non-thermal effects, the precise mechanism that prevents the core of some clusters of galaxies from collapsing catastrophically by radiative cooling remains unidentified. In this Letter, we demonstrate that the evolution of a cluster's cooling core, in terms of its density, temperature, and magnetic field strength, inevitably enables the plasma electrons there to quickly become Cerenkov loss dominated, with emission at the radio frequency of ≲350 Hz, and with a rate considerably exceeding free–free continuum and line emission. However, the same does not apply to the plasma at the cluster's outskirts, which lacks such radiation. Owing to its low frequency, the radiation cannot escape, but because over the relevant scale size of a Cerenkov wavelength the energy of an electron in the gas cannot follow the Boltzmann distribution to the requisite precision to ensure reabsorption always occurs faster than stimulated emission, the emitting gas cools before it reheats. This leaves behind the radiation itself, trapped by the overlying reflective plasma, yet providing enough pressure to maintain quasi-hydrostatic equilibrium. The mass condensation then happens by Rayleigh–Taylor instability, at a rate determined by the outermost radius where Cerenkov radiation can occur. In this way, it is possible to estimate the rate at ≈2 M_⊙ year^−1, consistent with observational inference. Thus, the process appears to provide a natural solution to the longstanding problem of “cooling flow” in clusters; at least it offers another line of defense against cooling and collapse should gas heating by AGN feedback be inadequate in some clusters.
Implementing and analyzing the multi-threaded LP-inference

Science.gov (United States)

Bolotova, S. Yu; Trofimenko, E. V.; Leschinskaya, M. V.

2018-03-01

The logical production equations provide new possibilities for optimizing backward inference in intelligent production-type systems. The strategy of relevant backward inference aims to minimize the number of queries to an external information source (either a database or an interactive user). The method is based on computing the initial set of preimages and searching for the true preimage. Each stage can be executed independently and in parallel, and the actual work at a given stage can also be distributed between parallel computers. This paper is devoted to parallel algorithms for relevant inference based on an advanced “pipeline” scheme of parallel computation, which increases the degree of parallelism. The authors also provide some details of the LP-structures implementation.
International Conference on Trends and Perspectives in Linear Statistical Inference

CERN Document Server

Rosen, Dietrich

2018-01-01

This volume features selected contributions on a variety of topics related to linear statistical inference. The peer-reviewed papers from the International Conference on Trends and Perspectives in Linear Statistical Inference (LinStat 2016) held in Istanbul, Turkey, 22-25 August 2016, cover topics in both theoretical and applied statistics, such as linear models, high-dimensional statistics, computational statistics, the design of experiments, and multivariate analysis. The book is intended for statisticians, Ph.D. students, and professionals who are interested in statistical inference.
Packaging design as communicator of product attributes: Effects on consumers’ attribute inferences

NARCIS (Netherlands)

van Ooijen, I.

2016-01-01

This dissertation will focus on two types of attribute inferences that result from packaging design cues. First, the effects of product packaging design on quality related inferences are investigated. Second, the effects of product packaging design on healthiness related inferences are examined (See
Connections between Star Cluster Populations and Their Host Galaxy Nuclear Rings

Science.gov (United States)

Ma, Chao; de Grijs, Richard; Ho, Luis C.

2018-04-01

Nuclear rings are excellent laboratories for probing diverse phenomena such as the formation and evolution of young massive star clusters and nuclear starbursts, as well as the secular evolution and dynamics of their host galaxies. We have compiled a sample of 17 galaxies with nuclear rings, which are well resolved by high-resolution Hubble and Spitzer Space Telescope imaging. For each nuclear ring, we identified the ring star cluster population, along with their physical properties (ages, masses, and extinction values). We also determined the integrated ring properties, including the average age, total stellar mass, and current star formation rate (SFR). We find that Sb-type galaxies tend to have the highest ring stellar mass fraction with respect to the host galaxy, and this parameter is correlated with the ring’s SFR surface density. The ring SFRs are correlated with their stellar masses, which is reminiscent of the main sequence of star-forming galaxies. There are striking correlations between star-forming properties (i.e., SFR and SFR surface density) and nonaxisymmetric bar parameters, appearing to confirm previous inferences that strongly barred galaxies tend to have lower ring SFRs, although the ring star formation histories turn out to be significantly more complicated. Nuclear rings with higher stellar masses tend to be associated with lower cluster mass fractions, but there is no such relation for the ages of the rings. The two youngest nuclear rings in our sample, NGC 1512 and NGC 4314, which have the most extreme physical properties, represent the young extremity of the nuclear ring age distribution.
Star-forming brightest cluster galaxies at 0.25

Energy Technology Data Exchange (ETDEWEB)

McDonald, M.; Stalder, B.; Bayliss, M.; Allen, S. W.; Applegate, D. E.; Ashby, M. L. N.; Bautz, M.; Benson, B. A.; Bleem, L. E.; Brodwin, M.; Carlstrom, J. E.; Chiu, I.; Desai, S.; Gonzalez, A. H.; Hlavacek-Larrondo, J.; Holzapfel, W. L.; Marrone, D. P.; Miller, E. D.; Reichardt, C. L.; Saliwanchik, B. R.; Saro, A.; Schrabback, T.; Stanford, S. A.; Stark, A. A.; Vieira, J. D.; Zenteno, A.

2016-01-22

We present a multiwavelength study of the 90 brightest cluster galaxies (BCGs) in a sample of galaxy clusters selected via the Sunyaev Zel'dovich effect by the South Pole Telescope, utilizing data from various ground- and space-based facilities. We infer the star-formation rate (SFR) for the BCG in each cluster—based on the UV and IR continuum luminosity, as well as the [O ii]λλ3726,3729 emission line luminosity in cases where spectroscopy is available—and find seven systems with SFR > 100 M⊙ yr⁻¹. We find that the BCG SFR exceeds 10 M⊙ yr⁻¹ in 31 of 90 (34%) cases at 0.25 < z < 1.25, compared to ~1%–5% at z ~ 0 from the literature. At z ≳ 1, this fraction increases to $92^{+6}_{-31}$%, implying a steady decrease in the BCG SFR over the past ~9 Gyr. At low-z, we find that the specific SFR in BCGs is declining more slowly with time than for field or cluster galaxies, which is most likely due to the replenishing fuel from the cooling ICM in relaxed, cool core clusters. At z ≳ 0.6, the correlation between the cluster central entropy and BCG star formation—which is well established at z ~ 0—is not present. Instead, we find that the most star-forming BCGs at high-z are found in the cores of dynamically unrelaxed clusters. We use data from the Hubble Space Telescope to investigate the rest-frame near-UV morphology of a subsample of the most star-forming BCGs, and find complex, highly asymmetric UV morphologies on scales as large as ~50–60 kpc. The high fraction of star-forming BCGs hosted in unrelaxed, non-cool core clusters at early times suggests that the dominant mode of fueling star formation in BCGs may have recently transitioned from galaxy–galaxy interactions to ICM cooling.
On the Merging Cluster Abell 578 and Its Central Radio Galaxy 4C+67.13

Science.gov (United States)

Hagino, K.; Stawarz, Ł.; Siemiginowska, A.; Cheung, C. C.; Kozieł-Wierzbowska, D.; Szostek, A.; Madejski, G.; Harris, D. E.; Simionescu, A.; Takahashi, T.

2015-06-01

Here we analyze radio, optical, and X-ray data for the peculiar cluster Abell 578. This cluster is not fully relaxed and consists of two merging sub-systems. The brightest cluster galaxy (BCG), CGPG 0719.8+6704, is a pair of interacting ellipticals with projected separation ~10 kpc, the brighter of which hosts the radio source 4C+67.13. The Fanaroff-Riley type-II radio morphology of 4C+67.13 is unusual for central radio galaxies in local Abell clusters. Our new optical spectroscopy revealed that both nuclei of the CGPG 0719.8+6704 pair are active, albeit at low accretion rates corresponding to the Eddington ratio ~10⁻⁴ (for the estimated black hole masses of ~3×10⁸ M⊙ and ~10⁹ M⊙). The gathered X-ray (Chandra) data allowed us to confirm and to quantify robustly the previously noted elongation of the gaseous atmosphere in the dominant sub-cluster, as well as a large spatial offset (~60 kpc projected) between the position of the BCG and the cluster center inferred from the modeling of the X-ray surface brightness distribution. Detailed analysis of the brightness profiles and temperature revealed also that the cluster gas in the vicinity of 4C+67.13 is compressed (by a factor of about 1.4) and heated (from ≃2.0 keV up to 2.7 keV), consistent with the presence of a weak shock (Mach number ~1.3) driven by the expanding jet cocoon. This would then require the jet kinetic power of the order of ~10⁴⁵ erg s⁻¹, implying either a very high efficiency of the jet production for the current accretion rate, or a highly modulated jet/accretion activity in the system. Based on service observations made with the WHT operated on the island of La Palma by the Isaac Newton Group in the Spanish Observatorio del Roque de los Muchachos of the Instituto de Astrofísica de Canarias.
Mayaro virus infection in amazonia: a multimodel inference approach to risk factor assessment.

Directory of Open Access Journals (Sweden)

Fernando Abad-Franch

BACKGROUND: Arboviral diseases are major global public health threats. Yet, our understanding of infection risk factors is, with a few exceptions, considerably limited. A crucial shortcoming is the widespread use of analytical methods generally not suited for observational data--particularly null hypothesis-testing (NHT) and step-wise regression (SWR). Using Mayaro virus (MAYV) as a case study, here we compare information theory-based multimodel inference (MMI) with conventional analyses for arboviral infection risk factor assessment. METHODOLOGY/PRINCIPAL FINDINGS: A cross-sectional survey of anti-MAYV antibodies revealed 44% prevalence (n = 270 subjects) in a central Amazon rural settlement. NHT suggested that residents of village-like household clusters and those using closed toilet/latrines were at higher risk, while living in non-village-like areas, using bednets, and owning fowl, pigs or dogs were protective. The "minimum adequate" SWR model retained only residence area and bednet use. Using MMI, we identified relevant covariates, quantified their relative importance, and estimated effect-sizes (β ± SE) on which to base inference. Residence area (β(Village) = 2.93 ± 0.41; β(Upland) = -0.56 ± 0.33; β(Riverbanks) = -2.37 ± 0.55) and bednet use (β = -0.95 ± 0.28) were the most important factors, followed by crop-plot ownership (β = 0.39 ± 0.22) and regular use of a closed toilet/latrine (β = 0.19 ± 0.13); domestic animals had insignificant protective effects and were relatively unimportant. The SWR model ranked fifth among the 128 models in the final MMI set. CONCLUSIONS/SIGNIFICANCE: Our analyses illustrate how MMI can enhance inference on infection risk factors when compared with NHT or SWR. MMI indicates that forest crop-plot workers are likely exposed to typical MAYV cycles maintained by diurnal, forest dwelling vectors; however, MAYV might also be circulating in nocturnal, domestic-peridomestic cycles
Surrogate based approaches to parameter inference in ocean models

KAUST Repository

Knio, Omar

2016-01-06

This talk discusses the inference of physical parameters using model surrogates. Attention is focused on the use of sampling schemes to build suitable representations of the dependence of the model response on uncertain input data. Non-intrusive spectral projections and regularized regressions are used for this purpose. A Bayesian inference formalism is then applied to update the uncertain inputs based on available measurements or observations. To perform the update, we consider two alternative approaches, based on the application of Markov Chain Monte Carlo methods or of adjoint-based optimization techniques. We outline the implementation of these techniques to infer dependence of wind drag, bottom drag, and internal mixing coefficients.
Fast and scalable inference of multi-sample cancer lineages.

KAUST Repository

Popic, Victoria; Salari, Raheleh; Hajirasouliha, Iman; Kashef-Haghighi, Dorna; West, Robert B; Batzoglou, Serafim

2015-01-01

Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee .
Fast and scalable inference of multi-sample cancer lineages.

KAUST Repository

Popic, Victoria

2015-05-06

Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee .
Surrogate based approaches to parameter inference in ocean models

KAUST Repository

Knio, Omar

2016-01-01

This talk discusses the inference of physical parameters using model surrogates. Attention is focused on the use of sampling schemes to build suitable representations of the dependence of the model response on uncertain input data. Non-intrusive spectral projections and regularized regressions are used for this purpose. A Bayesian inference formalism is then applied to update the uncertain inputs based on available measurements or observations. To perform the update, we consider two alternative approaches, based on the application of Markov Chain Monte Carlo methods or of adjoint-based optimization techniques. We outline the implementation of these techniques to infer dependence of wind drag, bottom drag, and internal mixing coefficients.
Inferring causality from noisy time series data

DEFF Research Database (Denmark)

Mønster, Dan; Fusaroli, Riccardo; Tylén, Kristian

2016-01-01

Convergent Cross-Mapping (CCM) has shown high potential to perform causal inference in the absence of models. We assess the strengths and weaknesses of the method by varying coupling strength and noise levels in coupled logistic maps. We find that CCM fails to infer accurate coupling strength...... and even causality direction in synchronized time-series and in the presence of intermediate coupling. We find that the presence of noise deterministically reduces the level of cross-mapping fidelity, while the convergence rate exhibits higher levels of robustness. Finally, we propose that controlled noise...
Cluster-cluster aggregation of Ising dipolar particles under thermal noise

KAUST Repository

Suzuki, Masaru

2009-08-14

The cluster-cluster aggregation processes of Ising dipolar particles under thermal noise are investigated in the dilute condition. As the temperature increases, changes in the typical structures of clusters are observed from chainlike (D ≈ 1) to crystalline (D ≈ 2) through fractal structures (D ≈ 1.45), where D is the fractal dimension. By calculating the bending energy of the chainlike structure, it is found that the transition temperature is associated with the energy gap between the chainlike and crystalline configurations. The aggregation dynamics changes from being dominated by attraction to diffusion involving changes in the dynamic exponent z=0.2 to 0.5. In the region of temperature where the fractal clusters grow, different growth rates are observed between charged and neutral clusters. Using the Smoluchowski equation with a twofold kernel, this hetero-aggregation process is found to result from two types of dynamics: the diffusive motion of neutral clusters and the weak attractive motion between charged clusters. The fact that changes in structures and dynamics take place at the same time suggests that transitions in the structure of clusters involve marked changes in the dynamics of the aggregation processes. © 2009 The American Physical Society.
Correlation between the Total Gravitating Mass of Groups and Clusters and the Supermassive Black Hole Mass of Brightest Galaxies

Science.gov (United States)

Bogdán, Ákos; Lovisari, Lorenzo; Volonteri, Marta; Dubois, Yohan

2018-01-01

Supermassive black holes (BHs) residing in the brightest cluster galaxies are over-massive relative to the stellar bulge mass or central stellar velocity dispersion of their host galaxies. As BHs residing at the bottom of the galaxy cluster’s potential well may undergo physical processes that are driven by the large-scale characteristics of the galaxy clusters, it is possible that the growth of these BHs is (indirectly) governed by the properties of their host clusters. In this work, we explore the connection between the mass of BHs residing in the brightest group/cluster galaxies (BGGs/BCGs) and the virial temperature, and hence total gravitating mass, of galaxy groups/clusters. To this end, we investigate a sample of 17 BGGs/BCGs with dynamical BH mass measurements and utilize XMM-Newton X-ray observations to measure the virial temperatures and infer the $M_{500}$ mass of the galaxy groups/clusters. We find that the $M_{\rm BH}$–$kT$ relation is significantly tighter and exhibits smaller scatter than the $M_{\rm BH}$–$M_{\rm bulge}$ relations. The best-fitting power-law relations are $\log_{10}(M_{\rm BH}/10^{9}\,M_\odot) = 0.20 + 1.74\,\log_{10}(kT/1\,{\rm keV})$ and $\log_{10}(M_{\rm BH}/10^{9}\,M_\odot) = -0.80 + 1.72\,\log_{10}(M_{\rm bulge}/10^{11}\,M_\odot)$. Thus, the BH mass of BGGs/BCGs may be set by physical processes that are governed by the properties of the host galaxy group/cluster. These results are confronted with the Horizon-AGN simulation, which reproduces the observed relations well, although the simulated relations exhibit notably smaller scatter.
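As a rough illustration of how the quoted best-fitting relation between BH mass and virial temperature can be evaluated, the sketch below codes only the central values given in the abstract (intercept 0.20, slope 1.74). The function names are hypothetical, and the scatter and parameter uncertainties of the fit are not modeled, so the result is a point estimate only.

```python
import math


def log10_mbh_from_kt(kt_kev: float) -> float:
    """Return log10(M_BH / 1e9 Msun) for a virial temperature kT in keV.

    Uses the central values of the relation quoted in the abstract:
    log10(M_BH / 1e9 Msun) = 0.20 + 1.74 * log10(kT / 1 keV).
    """
    if kt_kev <= 0:
        raise ValueError("kT must be positive")
    return 0.20 + 1.74 * math.log10(kt_kev)


def mbh_from_kt(kt_kev: float) -> float:
    """Return the point-estimate BH mass in solar masses (hypothetical helper)."""
    return 1e9 * 10 ** log10_mbh_from_kt(kt_kev)


if __name__ == "__main__":
    # Illustrative temperatures only; these are not values from the paper's sample.
    for kt in (0.5, 1.0, 3.0):
        print(f"kT = {kt:.1f} keV -> M_BH ~ {mbh_from_kt(kt):.2e} Msun")
```

At kT = 1 keV the logarithmic term vanishes, so the estimate reduces to the intercept alone.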
Circumstellar Disk Lifetimes In Numerous Galactic Young Stellar Clusters

Science.gov (United States)

Richert, A. J. W.; Getman, K. V.; Feigelson, E. D.; Kuhn, M. A.; Broos, P. S.; Povich, M. S.; Bate, M. R.; Garmire, G. P.

2018-04-01

Photometric detections of dust circumstellar disks around pre-main sequence (PMS) stars, coupled with estimates of stellar ages, provide constraints on the time available for planet formation. Most previous studies on disk longevity, starting with Haisch, Lada & Lada (2001), use star samples from PMS clusters but do not consider datasets with homogeneous photometric sensitivities and/or ages placed on a uniform timescale. Here we conduct the largest study to date of the longevity of inner dust disks using X-ray and 1–8 μm infrared photometry from the MYStIX and SFiNCs projects for 69 young clusters in 32 nearby star-forming regions with ages t ≤ 5 Myr. Cluster ages are derived by combining the empirical AgeJX method with PMS evolutionary models, which treat dynamo-generated magnetic fields in different ways. Leveraging X-ray data to identify disk-free objects, we impose similar stellar mass sensitivity limits for disk-bearing and disk-free YSOs while extending the analysis to stellar masses as low as M ~ 0.1 M⊙. We find that the disk longevity estimates are strongly affected by the choice of PMS evolutionary model. Assuming a disk fraction of 100% at zero age, the inferred disk half-life changes significantly, from t1/2 ~ 1.3–2 Myr to t1/2 ~ 3.5 Myr when switching from non-magnetic to magnetic PMS models. In addition, we find no statistically significant evidence that disk fraction varies with stellar mass within the first few Myr of life for stars with masses <2 M⊙, but our samples may not be complete for more massive stars. The effects of initial disk fraction and star-forming environment are also explored.
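To give a feel for what the quoted half-lives imply, the sketch below assumes a simple exponential decline of the disk fraction from 100% at zero age. The exponential form and the function name are assumptions made for illustration; the abstract states only the half-life values and the 100% starting point, not the functional form used by the authors.

```python
def disk_fraction(age_myr: float, half_life_myr: float) -> float:
    """Disk fraction at a given age, ASSUMING exponential decay from 100% at t = 0.

    f(t) = 0.5 ** (t / t_half); this functional form is an illustrative assumption.
    """
    if age_myr < 0 or half_life_myr <= 0:
        raise ValueError("age must be >= 0 and half-life must be > 0")
    return 0.5 ** (age_myr / half_life_myr)


if __name__ == "__main__":
    # Half-lives quoted in the abstract for non-magnetic and magnetic PMS models.
    for label, t_half in (("non-magnetic, 1.3 Myr", 1.3),
                          ("non-magnetic, 2.0 Myr", 2.0),
                          ("magnetic, 3.5 Myr", 3.5)):
        print(f"{label}: f(2 Myr) = {disk_fraction(2.0, t_half):.2f}")
```

Under this assumed form, a longer half-life leaves a larger share of stars with disks at any fixed age, which is why the choice of PMS model matters for the time available for planet formation.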

Diversity among galaxy clusters

International Nuclear Information System (INIS)

Struble, M.F.; Rood, H.J.

1988-01-01

The classification of galaxy clusters is discussed. Consideration is given to the classification schemes of Abell (1950s), Zwicky (1950s), Morgan, Matthews, and Schmidt (1964), and Morgan-Bautz (1970). Galaxies can be classified based on morphology, chemical composition, spatial distribution, and motion. The correlation between a galaxy's environment and morphology is examined. The classification scheme of Rood-Sastry (1971), which is based on cluster morphology and galaxy population, is described. The six types of clusters they define are: (1) a cD-cluster dominated by a single large galaxy, (2) a cluster dominated by a binary, (3) a core-halo cluster, (4) a cluster dominated by several bright galaxies, (5) a cluster appearing flattened, and (6) an irregularly shaped cluster. Attention is also given to the evolution of cluster structures, which is related to initial density and cluster motion.
Making Inferences in Adulthood: Falling Leaves Mean It's Fall.

Science.gov (United States)

Zandi, Taher; Gregory, Monica E.

1988-01-01

Assessed age differences in making inferences from prose. Older adults correctly answered mean of 10 questions related to implicit information and 8 related to explicit information. Young adults answered mean of 7 implicit and 12 explicit information questions. In spite of poorer recall of factual details, older subjects made inferences to greater…
Cluster-cluster aggregation of Ising dipolar particles under thermal noise

KAUST Repository

Suzuki, Masaru; Kun, Ferenc; Ito, Nobuyasu

2009-01-01

The cluster-cluster aggregation processes of Ising dipolar particles under thermal noise are investigated in the dilute condition. As the temperature increases, changes in the typical structures of clusters are observed from chainlike (D1
Re-estimating sample size in cluster randomized trials with active recruitment within clusters

NARCIS (Netherlands)

van Schie, Sander; Moerbeek, Mirjam

2014-01-01

Often only a limited number of clusters can be obtained in cluster randomised trials, although many potential participants can be recruited within each cluster. Thus, active recruitment is feasible within the clusters. To obtain an efficient sample size in a cluster randomised trial, the cluster
Mixed normal inference on multicointegration

NARCIS (Netherlands)

Boswijk, H.P.

2009-01-01

Asymptotic likelihood analysis of cointegration in I(2) models, see Johansen (1997, 2006), Boswijk (2000) and Paruolo (2000), has shown that inference on most parameters is mixed normal, implying hypothesis test statistics with an asymptotic χ² null distribution. The asymptotic distribution of the
A large sample of shear-selected clusters from the Hyper Suprime-Cam Subaru Strategic Program S16A Wide field mass maps

Science.gov (United States)

Miyazaki, Satoshi; Oguri, Masamune; Hamana, Takashi; Shirasaki, Masato; Koike, Michitaro; Komiyama, Yutaka; Umetsu, Keiichi; Utsumi, Yousuke; Okabe, Nobuhiro; More, Surhud; Medezinski, Elinor; Lin, Yen-Ting; Miyatake, Hironao; Murayama, Hitoshi; Ota, Naomi; Mitsuishi, Ikuyuki

2018-01-01

We present the result of searching for clusters of galaxies based on weak gravitational lensing analysis of the ~160 deg² area surveyed by Hyper Suprime-Cam (HSC) as a Subaru Strategic Program. HSC is a new prime focus optical imager with a 1.5°-diameter field of view on the 8.2 m Subaru telescope. The superb median seeing on the HSC i-band images of 0.56" allows the reconstruction of high angular resolution mass maps via weak lensing, which is crucial for the weak lensing cluster search. We identify 65 mass map peaks with a signal-to-noise (S/N) ratio larger than 4.7, and carefully examine their properties by cross-matching the clusters with optical and X-ray cluster catalogs. We find that all the 39 peaks with S/N > 5.1 have counterparts in the optical cluster catalogs, and only 2 out of the 65 peaks are probably false positives. The upper limits of X-ray luminosities from the ROSAT All Sky Survey (RASS) imply the existence of an X-ray underluminous cluster population. We show that the X-rays from the shear-selected clusters can be statistically detected by stacking the RASS images. The inferred average X-ray luminosity is about half that of the X-ray-selected clusters of the same mass. The radial profile of the dark matter distribution derived from the stacking analysis is well modeled by the Navarro-Frenk-White profile with a small concentration parameter value of c500 ~ 2.5, which suggests that the selection bias on the orientation or the internal structure for our shear-selected cluster sample is not strong.
Galaxy Clustering in Early SDSS Redshift Data

CERN Document Server

Zehavi, I.; Frieman, Joshua A.; Weinberg, David H.; Mo, Houjun J.; Anderson, Scott F.; Strauss, Michael A.; Annis, James; Bahcall, Neta A.; Bernardi, Mariangela; Briggs, John W.; Brinkmann, Jon; Burles, Scott; Carey, Larry; Castander, Francisco J.; Connolly, J.; Csabai, Istvan; Dalcanton, Julianne J.; Dodelson, Scott; Doi, Mamoru; Eisenstein, Daniel; Evans, Michael L.; Finkbeiner, Douglas P.; Friedman, Scott; Fukugita, Masataka; Gunn, James E.; Hennessy, Greg S.; Hindsley, Robert B.; Ivezic, Zeljko; Kent, Stephen; Knapp, Gillian R.; Kron, Richard; Kunszt, Peter; Lamb, Donald; French Leger, R.; Long, Daniel C.; Loveday, Jon; Lupton, Robert H.; McKay, Timothy; Meiksin, Avery; Merrelli, Aronne; Munn, Jeffrey A.; Narayanan, Vijay; Newcomb, Matt; Nichol, Robert C.; Owen, Russell; Peoples, John; Pope, Adrian; Rockosi, Constance M.; Schlegel, David; Schneider, Donald P.; Scoccimarro, Roman; Sheth, Ravi K.; Siegmund, Walter; Smee, Stephen; Snir, Yehuda; Stebbins, Albert; Stoughton, Christopher; SubbaRao, Mark; Szalay, Alexander S.; Szapudi, Istvan; Tegmark, Max; Tucker, Douglas L.; Uomoto, Alan; Vanden Berk, Dan; Vogeley, Michael S.; Waddell, Patrick; Yanny, Brian; York, Donald G.; Zehavi, Idit; Blanton, Michael R.

2002-01-01

We present the first measurements of clustering in the Sloan Digital Sky Survey (SDSS) galaxy redshift survey. Our sample consists of 29,300 galaxies with redshifts 5,700 km/s < cz < 39,000 km/s, distributed in several long but narrow (2.5-5 degree) segments, covering 690 square degrees. For the full, flux-limited sample, the redshift-space correlation length is approximately 8 Mpc/h. The two-dimensional correlation function $\xi(r_p,\pi)$ shows clear signatures of both the small-scale, “fingers-of-God” distortion caused by velocity dispersions in collapsed objects and the large-scale compression caused by coherent flows, though the latter cannot be measured with high precision in the present sample. The inferred real-space correlation function is well described by a power law, $\xi(r) = (r / 6.1 \pm 0.2\ {\rm Mpc}/h)^{-1.75 \pm 0.03}$, for 0.1 Mpc/h < r < 16 Mpc/h. The galaxy pairwise velocity dispersion is $\sigma_{12} \approx 600 \pm 100$ km/s for projected separations 0.15 Mpc/h < r_p < 5 Mpc/h. When we divide the...
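The quoted power law for the real-space correlation function can be evaluated directly. The sketch below uses only the central values from the abstract (correlation length 6.1 Mpc/h, slope 1.75) and restricts itself to the stated range of validity, 0.1 Mpc/h < r < 16 Mpc/h. The function name and the range check are illustrative choices, and the quoted uncertainties are not propagated.

```python
def xi_real_space(r_mpc_h: float, r0: float = 6.1, gamma: float = 1.75) -> float:
    """Power-law correlation function xi(r) = (r / r0) ** (-gamma).

    r and r0 are in Mpc/h. Defaults are the central values quoted in the abstract.
    The fit is stated only for 0.1 < r < 16 Mpc/h, so other separations are rejected.
    """
    if not 0.1 < r_mpc_h < 16.0:
        raise ValueError("r is outside the fitted range 0.1 < r < 16 Mpc/h")
    return (r_mpc_h / r0) ** (-gamma)


if __name__ == "__main__":
    for r in (0.5, 1.0, 6.1, 10.0):
        print(f"r = {r:5.1f} Mpc/h -> xi = {xi_real_space(r):8.3f}")
```

By construction the function equals 1 at r = r0, which is the usual definition of the correlation length.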
Baselines and test data for cross-lingual inference

DEFF Research Database (Denmark)

Agic, Zeljko; Schluter, Natalie

2018-01-01

The recent years have seen a revival of interest in textual entailment, sparked by i) the emergence of powerful deep neural network learners for natural language processing and ii) the timely development of large-scale evaluation datasets such as SNLI. Recast as natural language inference......, the problem now amounts to detecting the relation between pairs of statements: they either contradict or entail one another, or they are mutually neutral. Current research in natural language inference is effectively exclusive to English. In this paper, we propose to advance the research in SNLI-style natural...... language inference toward multilingual evaluation. To that end, we provide test data for four major languages: Arabic, French, Spanish, and Russian. We experiment with a set of baselines. Our systems are based on cross-lingual word embeddings and machine translation. While our best system scores an average...
Bayesian inference with ecological applications

CERN Document Server

Link, William A

2009-01-01

This text is written to provide a mathematically sound but accessible and engaging introduction to Bayesian inference specifically for environmental scientists, ecologists and wildlife biologists. It emphasizes the power and usefulness of Bayesian methods in an ecological context. The advent of fast personal computers and easily available software has simplified the use of Bayesian and hierarchical models. One obstacle remains for ecologists and wildlife biologists, namely the near absence of Bayesian texts written specifically for them. The book includes many relevant examples, is supported by software and examples on a companion website and will become an essential grounding in this approach for students and research ecologists. Engagingly written text specifically designed to demystify a complex subject. Examples drawn from ecology and wildlife research. An essential grounding for graduate and research ecologists in the increasingly prevalent Bayesian approach to inference. Companion website with analyt...
Nonparametric Bayesian inference in biostatistics

CERN Document Server

Müller, Peter

2015-01-01

As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Intracranial EEG correlates of implicit relational inference within the hippocampus.

Science.gov (United States)

Reber, T P; Do Lam, A T A; Axmacher, N; Elger, C E; Helmstaedter, C; Henke, K; Fell, J

2016-01-01

Drawing inferences from past experiences enables adaptive behavior in future situations. Inference has been shown to depend on hippocampal processes. Usually, inference is considered a deliberate and effortful mental act which happens during retrieval, and requires the focus of our awareness. Recent fMRI studies hint at the possibility that some forms of hippocampus-dependent inference can also occur during encoding and possibly also outside of awareness. Here, we sought to further explore the feasibility of hippocampal implicit inference, and specifically address the temporal evolution of implicit inference using intracranial EEG. Presurgical epilepsy patients with hippocampal depth electrodes viewed a sequence of word pairs, and judged the semantic fit between two words in each pair. Some of the word pairs entailed a common word (e.g., "winter-red," "red-cat") such that an indirect relation was established in following word pairs (e.g., "winter-cat"). The behavioral results suggested that drawing inference implicitly from past experience is feasible because indirect relations seemed to foster "fit" judgments while the absence of indirect relations fostered "do not fit" judgments, even though the participants were unaware of the indirect relations. An event-related potential (ERP) difference emerging 400 ms post-stimulus was evident in the hippocampus during encoding, suggesting that indirect relations were already established automatically during encoding of the overlapping word pairs. Further ERP differences emerged later post-stimulus (1,500 ms), were modulated by the participants' responses and were evident during encoding and test. Furthermore, response-locked ERP effects were evident at test. These ERP effects could hence be a correlate of the interaction of implicit memory with decision-making. Together, the data map out a time-course in which the hippocampus automatically integrates memories from discrete but related episodes to implicitly influence future
Estimating mountain basin-mean precipitation from streamflow using Bayesian inference

Science.gov (United States)

Henn, Brian; Clark, Martyn P.; Kavetski, Dmitri; Lundquist, Jessica D.

2015-10-01

Estimating basin-mean precipitation in complex terrain is difficult due to uncertainty in the topographical representativeness of precipitation gauges relative to the basin. To address this issue, we use Bayesian methodology coupled with a multimodel framework to infer basin-mean precipitation from streamflow observations, and we apply this approach to snow-dominated basins in the Sierra Nevada of California. Using streamflow observations, forcing data from lower-elevation stations, the Bayesian Total Error Analysis (BATEA) methodology and the Framework for Understanding Structural Errors (FUSE), we infer basin-mean precipitation, and compare it to basin-mean precipitation estimated using topographically informed interpolation from gauges (PRISM, the Parameter-elevation Regression on Independent Slopes Model). The BATEA-inferred spatial patterns of precipitation show agreement with PRISM in terms of the rank of basins from wet to dry but differ in absolute values. In some of the basins, these differences may reflect biases in PRISM, because some implied PRISM runoff ratios may be inconsistent with the regional climate. We also infer annual time series of basin precipitation using a two-step calibration approach. Assessment of the precision and robustness of the BATEA approach suggests that uncertainty in the BATEA-inferred precipitation is primarily related to uncertainties in hydrologic model structure. Despite these limitations, time series of inferred annual precipitation under different model and parameter assumptions are strongly correlated with one another, suggesting that this approach is capable of resolving year-to-year variability in basin-mean precipitation.
Feature inference with uncertain categorization: Re-assessing Anderson's rational model.

Science.gov (United States)

Konovalova, Elizaveta; Le Mens, Gaël

2017-09-18

A key function of categories is to help predictions about unobserved features of objects. At the same time, humans are often in situations where the categories of the objects they perceive are uncertain. In an influential paper, Anderson (Psychological Review, 98(3), 409-429, 1991) proposed a rational model for feature inferences with uncertain categorization. A crucial feature of this model is the conditional independence assumption: it assumes that the within-category feature correlation is zero. In prior research, this model has been found to provide a poor fit to participants' inferences. This evidence is restricted to task environments inconsistent with the conditional independence assumption. Currently available evidence thus provides little information about how this model would fit participants' inferences in a setting with conditional independence. In four experiments based on a novel paradigm and one experiment based on an existing paradigm, we assess the performance of Anderson's model under conditional independence. We find that this model predicts participants' inferences better than competing models. One model assumes that inferences are based on just the most likely category. The second model is insensitive to categories but sensitive to overall feature correlation. The performance of Anderson's model is evidence that inferences were influenced not only by the more likely category but also by the other candidate category. Our findings suggest that a version of Anderson's model which relaxes the conditional independence assumption will likely perform well in environments characterized by within-category feature correlation.
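The contrast between a multi-category prediction and a most-likely-category prediction can be illustrated with a minimal sketch. It assumes the standard form of a conditionally independent mixture, P(f | x) = Σ_c P(c | x) P(f | c), and all numbers, category names, and function names below are hypothetical and chosen only for illustration; this is not the authors' experimental code.

```python
def posterior_over_categories(priors, likelihoods):
    """Bayes' rule: P(c | x) is proportional to P(c) * P(x | c)."""
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    total = sum(joint.values())
    return {c: joint[c] / total for c in joint}


def multi_category_prediction(posterior, feature_given_category):
    """Weight P(f | c) by the posterior of every candidate category."""
    return sum(posterior[c] * feature_given_category[c] for c in posterior)


def single_category_prediction(posterior, feature_given_category):
    """Use only the most likely category and ignore the others."""
    best = max(posterior, key=posterior.get)
    return feature_given_category[best]


# Hypothetical toy numbers (assumptions, not data from the study).
priors = {"A": 0.5, "B": 0.5}                  # P(c)
likelihoods = {"A": 0.6, "B": 0.2}             # P(observed feature | c)
feature_given_category = {"A": 0.9, "B": 0.1}  # P(target feature | c)

posterior = posterior_over_categories(priors, likelihoods)
multi = multi_category_prediction(posterior, feature_given_category)
single = single_category_prediction(posterior, feature_given_category)
print(posterior, multi, single)
```

With these toy numbers the two predictions differ, because the less likely category still contributes to the multi-category estimate.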
Integrating distributed Bayesian inference and reinforcement learning for sensor management

NARCIS (Netherlands)

Grappiolo, C.; Whiteson, S.; Pavlin, G.; Bakker, B.

2009-01-01

This paper introduces a sensor management approach that integrates distributed Bayesian inference (DBI) and reinforcement learning (RL). DBI is implemented using distributed perception networks (DPNs), a multiagent approach to performing efficient inference, while RL is used to automatically
Reliability of dose volume constraint inference from clinical data

Science.gov (United States)

Lutz, C. M.; Møller, D. S.; Hoffmann, L.; Knap, M. M.; Alber, M.

2017-04-01

Dose volume histogram points (DVHPs) frequently serve as dose constraints in radiotherapy treatment planning. An experiment was designed to investigate the reliability of DVHP inference from clinical data for multiple cohort sizes and complication incidence rates. The experimental background was radiation pneumonitis in non-small cell lung cancer and the DVHP inference method was based on logistic regression. From 102 NSCLC real-life dose distributions and a postulated DVHP model, an ‘ideal’ cohort was generated where the most predictive model was equal to the postulated model. A bootstrap and a Cohort Replication Monte Carlo (CoRepMC) approach were applied to create 1000 equally sized populations each. The cohorts were then analyzed to establish inference frequency distributions. This was applied to nine scenarios for cohort sizes of 102 (1), 500 (2) to 2000 (3) patients (by sampling with replacement) and three postulated DVHP models. The Bootstrap was repeated for a ‘non-ideal’ cohort, where the most predictive model did not coincide with the postulated model. The Bootstrap produced chaotic results for all models of cohort size 1 for both the ideal and non-ideal cohorts. For cohort size 2 and 3, the distributions for all populations were more concentrated around the postulated DVHP. For the CoRepMC, the inference frequency increased with cohort size and incidence rate. Correct inference rates >85 % were only achieved by cohorts with more than 500 patients. Both Bootstrap and CoRepMC indicate that inference of the correct or approximate DVHP for typical cohort sizes is highly uncertain. CoRepMC results were less spurious than Bootstrap results, demonstrating the large influence that randomness in dose-response has on the statistical analysis.
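The bootstrap step described above (resampling a cohort with replacement many times and tallying which model is selected in each resample) might be sketched as follows. The study's selection step is based on logistic regression; the selector here is a deliberately simpler stand-in (the gap in mean predictor value between patients with and without the complication), and the predictor names, the toy cohort generator, and every function name are assumptions made for illustration.

```python
import random
from collections import Counter
from statistics import mean

CANDIDATES = ("dvhp_a", "dvhp_b")  # hypothetical predictor names


def select_by_mean_gap(resample):
    """Stand-in selector (assumption): pick the predictor whose mean differs
    most between patients with and without the complication."""
    events = [p for p in resample if p["event"]]
    controls = [p for p in resample if not p["event"]]
    if not events or not controls:
        return "undetermined"

    def gap(name):
        return abs(mean(p[name] for p in events) - mean(p[name] for p in controls))

    return max(CANDIDATES, key=gap)


def bootstrap_selection_frequencies(cohort, select_model, n_resamples=1000, seed=0):
    """Resample the cohort with replacement and count how often each model wins."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_resamples):
        resample = [rng.choice(cohort) for _ in range(len(cohort))]
        counts[select_model(resample)] += 1
    return {name: n / n_resamples for name, n in counts.items()}


def make_toy_cohort(n, seed=1):
    """Synthetic patients; the complication risk depends on dvhp_a only."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        a, b = rng.random(), rng.random()
        cohort.append({"dvhp_a": a, "dvhp_b": b, "event": rng.random() < 0.1 + 0.3 * a})
    return cohort


frequencies = bootstrap_selection_frequencies(
    make_toy_cohort(102), select_by_mean_gap, n_resamples=200
)
print(frequencies)
```

The returned dictionary is an inference frequency distribution in the sense used above: the share of resamples in which each candidate was selected.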
Inference in "poor" languages

Energy Technology Data Exchange (ETDEWEB)

Petrov, S. [Oak Ridge National Lab., TN (United States)]

1996-12-31

Languages with a solvable implication problem but without complete and consistent systems of inference rules ('poor' languages) are considered. The problem of existence of a finite, complete, and consistent inference rule system for a "poor" language is stated independently of the language or the rule syntax. Several properties of the problem are proved. An application of the results to the language of join dependencies is given.
Inference of beliefs and emotions in patients with Alzheimer's disease.

Science.gov (United States)

Zaitchik, Deborah; Koff, Elissa; Brownell, Hiram; Winner, Ellen; Albert, Marilyn

2006-01-01

The present study compared 20 patients with mild to moderate Alzheimer's disease with 20 older controls (ages 69-94 years) on their ability to make inferences about emotions and beliefs in others. Six tasks tested their ability to make 1st-order and 2nd-order inferences as well as to offer explanations and moral evaluations of human action by appeal to emotions and beliefs. Results showed that the ability to infer emotions and beliefs in 1st-order tasks remains largely intact in patients with mild to moderate Alzheimer's. Patients were able to use mental states in the prediction, explanation, and moral evaluation of behavior. Impairment on 2nd-order tasks involving inference of mental states was equivalent to impairment on control tasks, suggesting that patients' difficulty is secondary to their cognitive impairments. ((c) 2006 APA, all rights reserved).
Inference method using bayesian network for diagnosis of pulmonary nodules

International Nuclear Information System (INIS)

Kawagishi, Masami; Iizuka, Yoshio; Yamamoto, Hiroyuki; Yakami, Masahiro; Kubo, Takeshi; Fujimoto, Koji; Togashi, Kaori

2010-01-01

This report describes the improvements of a naive Bayes model that infers the diagnosis of pulmonary nodules in chest CT images based on the findings obtained when a radiologist interprets the CT images. We have previously introduced an inference model using a naive Bayes classifier and have reported its clinical value based on evaluation using clinical data. In the present report, we introduce the following improvements to the original inference model: the selection of findings based on correlations and the generation of a model using only these findings, and the introduction of classifiers that integrate several simple classifiers each of which is specialized for specific diagnosis. These improvements were found to increase the inference accuracy by 10.4% (p<.01) as compared to the original model in 100 cases (222 nodules) based on leave-one-out evaluation. (author)
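For readers unfamiliar with the baseline, a naive Bayes classifier over categorical findings can be sketched in a few lines. This is a generic textbook version with Laplace smoothing, not the authors' model: the finding names, values, and the four training cases are made up for illustration, and the finding selection and classifier integration described above are not implemented.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(cases, alpha=1.0):
    """cases: list of (findings_dict, diagnosis) pairs with categorical findings."""
    class_counts = Counter(diagnosis for _, diagnosis in cases)
    value_counts = defaultdict(Counter)  # (diagnosis, finding) -> value counts
    values_seen = defaultdict(set)       # finding -> set of observed values
    for findings, diagnosis in cases:
        for name, value in findings.items():
            value_counts[(diagnosis, name)][value] += 1
            values_seen[name].add(value)
    return class_counts, value_counts, values_seen, alpha


def predict(model, findings):
    """Return the diagnosis with the highest smoothed log-posterior."""
    class_counts, value_counts, values_seen, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for diagnosis, n_d in class_counts.items():
        log_p = math.log(n_d / total)
        for name, value in findings.items():
            count = value_counts[(diagnosis, name)][value]
            log_p += math.log((count + alpha) / (n_d + alpha * len(values_seen[name])))
        scores[diagnosis] = log_p
    return max(scores, key=scores.get)


# Made-up toy cases (assumption): two findings, two diagnoses.
cases = [
    ({"margin": "spiculated", "calcification": "absent"}, "malignant"),
    ({"margin": "spiculated", "calcification": "absent"}, "malignant"),
    ({"margin": "smooth", "calcification": "present"}, "benign"),
    ({"margin": "smooth", "calcification": "absent"}, "benign"),
]
model = train_naive_bayes(cases)
prediction = predict(model, {"margin": "spiculated", "calcification": "absent"})
print(prediction)
```

A leave-one-out evaluation of the kind mentioned above would retrain on all cases but one and call `predict` on the held-out case, repeating for every case.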
Multi-Optimisation Consensus Clustering

Science.gov (United States)

Li, Jian; Swift, Stephen; Liu, Xiaohui

Ensemble Clustering has been developed to provide an alternative way of obtaining more stable and accurate clustering results. It aims to avoid the biases of individual clustering algorithms. However, it is still a challenge to develop an efficient and robust method for Ensemble Clustering. Based on an existing ensemble clustering method, Consensus Clustering (CC), this paper introduces an advanced Consensus Clustering algorithm called Multi-Optimisation Consensus Clustering (MOCC), which utilises an optimised Agreement Separation criterion and a Multi-Optimisation framework to improve the performance of CC. Fifteen different data sets are used for evaluating the performance of MOCC. The results reveal that MOCC can generate more accurate clustering results than the original CC algorithm.
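A common building block of ensemble clustering is an agreement (co-association) matrix that records how often two items are placed in the same cluster across the base clusterings. The sketch below shows only that generic step; it does not implement MOCC or its Agreement Separation criterion, and the function name and toy labels are assumptions.

```python
def agreement_matrix(clusterings):
    """Fraction of base clusterings that put items i and j in the same cluster.

    clusterings: list of label lists, one list per base clustering,
    all of the same length (one label per item).
    """
    n_items = len(clusterings[0])
    counts = [[0] * n_items for _ in range(n_items)]
    for labels in clusterings:
        for i in range(n_items):
            for j in range(n_items):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(clusterings) for c in row] for row in counts]


# Three toy base clusterings of four items (label values are arbitrary).
base_clusterings = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
agreement = agreement_matrix(base_clusterings)
for row in agreement:
    print(row)
```

A consensus partition is then typically obtained by clustering or optimising over this matrix; which criterion to optimise is exactly where methods such as the one described above differ.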
Generative inference for cultural evolution.

Science.gov (United States)

Kandler, Anne; Powell, Adam

2018-04-05

One of the major challenges in cultural evolution is to understand why and how various forms of social learning are used in human populations, both now and in the past. To date, much of the theoretical work on social learning has been done in isolation of data, and consequently many insights focus on revealing the learning processes or the distributions of cultural variants that are expected to have evolved in human populations. In population genetics, recent methodological advances have allowed a greater understanding of the explicit demographic and/or selection mechanisms that underlie observed allele frequency distributions across the globe, and their change through time. In particular, generative frameworks, often using coalescent-based simulation coupled with approximate Bayesian computation (ABC), have provided robust inferences on the human past, with no reliance on a priori assumptions of equilibrium. Here, we demonstrate the applicability and utility of generative inference approaches to the field of cultural evolution. The framework advocated here uses observed population-level frequency data directly to establish the likely presence or absence of particular hypothesized learning strategies. In this context, we discuss the problem of equifinality and argue that, in the light of sparse cultural data and the multiplicity of possible social learning processes, the exclusion of those processes inconsistent with the observed data might be the most instructive outcome. Finally, we summarize the findings of generative inference approaches applied to a number of case studies. This article is part of the theme issue 'Bridging cultural gaps: interdisciplinary studies in human cultural evolution'. © 2018 The Author(s).
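The simulate-and-compare logic behind approximate Bayesian computation can be sketched with a rejection sampler. This is a generic toy, not the cultural-evolution models of the article: the simulator (a count of adopters among 100 individuals), the uniform prior, the tolerance, and all names are assumptions made for illustration.

```python
import random


def abc_rejection(observed, simulate, sample_prior, distance, epsilon, n_draws, seed=0):
    """Keep prior draws whose simulated data fall within epsilon of the observation."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        if distance(simulate(theta, rng), observed) <= epsilon:
            accepted.append(theta)
    return accepted


def simulate_adopters(p, rng, n=100):
    """Toy generative model (assumption): each of n individuals adopts with probability p."""
    return sum(rng.random() < p for _ in range(n))


observed_count = 30  # hypothetical observed number of adopters
posterior_sample = abc_rejection(
    observed=observed_count,
    simulate=simulate_adopters,
    sample_prior=lambda rng: rng.random(),  # uniform prior on [0, 1]
    distance=lambda a, b: abs(a - b),
    epsilon=2,
    n_draws=5000,
)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
print(len(posterior_sample), posterior_mean)
```

Parameter values that are never accepted are those inconsistent with the observed data, which mirrors the exclusion-based reading of results discussed above.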
